<<

Copyright 0 1997 by the Genetics Society of America

The Complete DNA Sequence of the Mitochondrial of a “Living ,” the ( chalumnae)

Rafael Zardoya and Axel Meyer

Department of Ecology and and Program in Genetics, State University of New York, Stony Brook, New York 11 794-5245 Manuscript received December 26, 1996 Accepted for publication April 14, 1997

ABSTRACT The complete nucleotide sequenceof the 16,407-bp mitochondrial genome of the coelacanth (Latirne- ria chalumnae) was determined. Thecoelacanth mitochondrial genome orderis identical to the consensus gene orderwhich is also found in all ray-finned , the , and most . Base composition and codon usage also conform to typical vertebrate patterns. The entire mitochondrial genome was PCR-amplified with 24 sets of primers that are expected to amplify homologous regions in other related vertebrate . Analyses of the controlregion of the coelacanth mitochondrial genome revealed the existence of four 22-bp tandem repeats close to its 3’ end. The phylogenetic analyses of a large data set combining genes coding for rRNAs, tRNA, and proteins (16,140 characters) confirmed the phylogenetic position of the coelacanth as a lobe-finned ; it is more closely related to tetrapods than to ray-finned fishes. However, different phylogenetic methods applied to this largest available molecular data set were unable to resolve unambiguously the relationship of the coelacanth to the two other groupsof extant lobe-finned fishes, the and thetetrapods. Maximum parsimony favored a lungfish/coelacanth or a lungfish/ sistergroup relationship depending on which transver- sion:transition weighting is assumed. Neighbor-joining and maximum likelihood supported a lungfish/ ” tetrapod sistergroup relationship.

OELACANTHS were for the first time recognized SMITH1956; THOMSON1966; BEMISand NORTHCUTT C as a distinct taxonomic group >150 ago by 1991; MANCUM 1991; SCHULTZEand CLOUTIER1991; in his renowned book “Recherches sur les TAMAIet al. 1994). Thesefishes are characterized by the Poissons Fossiles” (1844). Since then, many more extinct possession of several distinct morphological, physiologi- coelacanth species have been discovered (CLOUTIER cal, and life history traits, such as lobed paired , an and Fow1991). These range in age from the heterocercal tailwith an apical lobe, an intracranial Early to the Late (HUXLEY1861; joint, a rostral electroceptor organ, a retention WOODWARD1891; STENSIO1921). No fossil mechanism and very large (and give birth to living are known from more recent times, and, hence, this young) (e.g., FOREY1988; LONG1995; MAISEY 1996). group was believed to have gone extinct -70-80 mya. Unlike other sarcopterygians, the group to which they However, in 1938, a living coelacanth (Latim’achalum- belong (see below),coelacanths lack choanae (internal nae) was trawled off the estuary of the Chalumna river, nostril), maxillae and functional . near East London,South (SMITH1939, 1956; Unfortunately, little is known about the coelacanth COURTENAY-LATIMER1979). Since the sensational dis- at a molecular level. To date, only the triosephosphate covery of this , a total of only -200 speci- isomerase (KOLB et al. 1974), the CY and ,B parvalbumin mens have been caught off the Comoro archipelago (PECHEREet al. 1978), and the CY and ,B hemoglobin near the eastern coast of Africa in the . It amino acid sequences (GORRet al. 1991), as well as the proved to be the only known coelacanth population 18s rRNA (STOCKet al. 1991), 28s rRNA (HILLISet al. (SCHLIEWENet al. 1993; FRICKEet al. 1995). 1991; ZARDOYA and MEYER 1996b), cytochrome b Coelacanths are large gray-blue marine fishes whose (MEXERand WILSON1990), 12srRNA (MEYERand WIL- has remained seemingly unchanged dur- SON 1990; MEYERand DOLVEN1992; HEDGESet al. 1993), ing the last400 million years (but see FOREY1991). 16s rRNA (HEDGESet al. 1993), cytochrome oxidase During the last 50 years, detailed anatomic and func- subunit 1 (YOKOBORI et al. 1994), andMhc class I (BETZ tional morphological analysis of the circulatory, repro- et al. 1994) nucleotide sequences have been reported ductive, locomotive, and respiratory systems of the coe- from the coelacanth (reviewed by MEYER 1995). lacanth have been conducted (e.g., MILLOT 1954, 1955; Despite much prior effort, thephylogenetic position of the coelacanth still remains controversial (reviewed by MEYER 1995). It is well established that the coela- Corresponding author: Rafael Zardoya, Department of Ecology and Evolution, 662 Life Sciences, State University of New York, Stony canth, together with lungfishes, and the extinct rhip- Brook, New York 117945245. E-mail: [email protected] idistians form the subclass , the lobe-

Genetics 146 995-1010 (July, 1997) 996 R. Zardoya and A. Meyer

TABLE 1 PCR and sequencing primers used in the analysis of the coelacanth mitochondrial genome

Approximate product PCR primers Sequence (5' "* 3') length (bp) 1. 16SF AGT TCA GAC CGG AGC AAT CCA GG 600 2. LATI NDl R ACT TCG TAG GAA ATA GTT TGT GC 3.LATI ND1 F1 CCT TGA TCG GGG CCC TTC GAG CA 600 4. LATI MET R TCG GGG TAT GGG CCC GAA AGC TT 5. LATI GLN F TGA ACC TAT ACT AAA GAG ATC AA 750 6. LATI ND2 R ATT TTT CGT AGC TGC GTT TGA TT 7. LATI ND2 F GGA CTT ATC CTG TCG ACT TGA CA 700 8. LATI TRP R TTAAAGCTTTGAAGGCTC TTAGT 9. LATI TRP F AAG CAT CAT AAC CCT CAT AGC ACT 700 10. LATI COI R GGC ATC ACTATAAAGAAGATT AT 11. LATI COI F1 GAT GAC CAA ATT TAT AAT GTA GT 700 12. LATI COI R1 ATT GCC ATT ATC GCT CAG ACT AT 13. LATI COI F2 AAG AAA GAA CCA TTC GGG TAT AT 1200 14. H7886 TAN SWY CAR TAY CAY TGR TGN CC 15. LATI COII F TAATTGATGAAGTCGAAAACC CTCA 650 16. LATI AT8 R TTAGGC TCATGGTCAGGTTCA 17. LATI AT8 F AGT GAATGC CTCARY TAAAYC C 1000 18. LATI COIII R TTT TGT ACA GGT AGT GTG TGG TG 19. LATI COIII F ATATAT CAATGATGA CGAGA 600 20. LATI COIII R1 ACATCAACGAAATGT CAGTAC CA 21. LATI COIII F1 TAC CAC TTC ACA TCA AAC CAC CA 1600 22. LATI ND4 R TAG TAG GAC GGC GGC TAG TAC TAT 23. LATI ND4 F1 CCT AAR GCC CAY GTA GC 900 24.LATI LEU R TTTGCACCAAGAGTT TTTGGT TCC TA 25. LATI LEU F CTA AAG GAT AAT AGC TCA TCC ATT 600 26. LATI ND5 R CAY CAG CCR ATT ART ARR AAT GAY AT 27. LATI ND5 F CAR YTA TTY ATC GGN TGR GAR GG 650 28. LATI ND5 R1 CCYATYTTT CKGATRTCYTGYTC 29. LATI ND5 F2 TAA AGC AAT GCT GTT CCT ATG CT 850 30. LATI ND5 R3 AAN AGN GTN AGR TAN GTY TTR AT 31. LATI ND5 F3 ATC AAC AAC TCG CAA CAA GGA C 1100 32. H15149" AAA CTG CAGCCC CTCAGAATGATATTT GTC CTCA 33. LATI WBF TAC TTA CAA AAA GAA ACC TG 850 34. LATI THR R CGG CTT ACA AGG GCC GAT GCT TT 35.LATI THR-F AGC CTT AGT AGC TTA AAC CC 1000 36. LATI12S-R AGC AAG GCT GGG ACC AAA CCT TT 37.L-Phe TAAAGCATAGCACTGAAAATG 850 38.12Sb GAG GGT GAC GGG CGG TGT GT 39.12Sa TGG GAT TAG ATA CCC CAC TAT 800 40. 16s-1H KTAGCT CRC CYAGTTTCGGG 41. 16s-1L AGT ACC GCA AGG GAA ARC TGA AA 800 42. 16s-2H GAT TRY GCT ACC TTY GCA CGG TCA 43.16Sar' CGC CTG TTT ATC AAA AAC AT 500 44. 16Sbr' CCG GTC TGA ACT CAG ATC ACG T Coelacanth Mitochondrial DNA 997

TABLE 1 Continued

Approximate product PCR primers Sequence (5' + 3') length (bp) 45.ND6-FAA LATI AAA ATC TAR AAA CCY CCC ATC 650 32. H15149' AAACTGCAGCCC CTCAGAATGATATTTGTC CTCA 46. LATI 16SF TCCATCAA CAA GTC TTTAGA CCA 450 47. LATI 16SR GTACTG GCT CCT TAA TGA ATA GT Sequencing LATI COI F3 CCCTGC TAT GAG CAC TAG GAT TT LATIND3 F AATGTGGCTTTGATC CTC TAG GA LATIND3 R TTTAGT CGC TCT GTT TGA TTA CC

IUB code: R, A/G: Y. C/T:., W, . A/T: S. G/C:., and K, G/T. "YOKOBOM it al. (1994). ' KOCHERet al. (1989). PALUMBIet al. (1991). finned fishes (ROMER1966; CARROLL 1988; AHLBERG the three extant sarcopterygian groups, ie., lungfishes, 1991; CLOUTIERand AHLBERG 1996) from which tetra- the coelacanth and tetrapods, can be accomplished by pods were originated. The origin of land the evolutionary analyses of molecular data sets and from -bound lobe-finned fishes involvedboth mor- would help to recreate the sequence of evolutionary phological and physiological innovations and was one events that might have preadapted for the extinct sar- of the most significant events in the history of verte- copterygian conquest of land. All three possible com- brates (PANCHENand SMITHSON1987). There is recent peting hypotheses that explain the relationships among strong evidence supporting elpistostegids (or pander- living representatives of sarcopterygians, have been sup- ichthyids), a group of rhipidistians, as the extinct sis- ported at various times by either morphological and/ ter group of the earliest tetrapods (VOROBYEVAand or molecular data: (1) the lungfishes as the closest living SCHULTZE 1991; CLACK 1994; SCHULTZE 1994; AHLBERG relatives of tetrapods ( ROSENet al. 1981; FOREY1987, et al. 1996; CLOUTIERand AHLBERC 1996). 1988; MEYFLRand WILSON1990). (2)The coelacanth as However, there is no general agreement about the the sister group of tetrapods among living vertebrates relationships among the coelacanth,lungfishes and the (FRITZSCH1987; SCHULTZE1987; GORRet al. 1991).And rhipidistian+tetrapod (e.g., SCHULTZE1994; for (3) lungfishes and the coelacanth form a monophyletic review, see MEYER1995). Since molecular phylogenetic group and, hence, are equally closely related to tetra- data can only be collected from extant species, they can pods (NORTHCUTT1987; FOREY1988; CHANG1991; contribute only in limited ways to this debate (MEYER SCHULTZE1994; ZARDOYA and MEYER1996b). The ap and WILSON1990; MEYERand DOLVEN1992; HEDGESet parently rapid successive origin and subsequent diversi- al. 1993; YOKOBORI et al. 1994; ZARDov.4 and MEYER fication of the three remaining lineages of sarcopterygi- 1996b; see MEYER 1995 for a review). However, the es- ans back in the Devonian makes the determination of tablishment of the phylogenetic relationships among their phylogenetic relationships difficult. Many mor-

Latimeria chalumnae mtDNA 16,407 bp

ATPase 8 COIII \ \ Control region ATPase6 ND3 ND4L FIGURE1.-Gene organization, restriction map, and cloning/sequencing strategy for the coelacanth mitochondrial genome. All protein coding genes are encoded by the H strand with the exception of ND6, which is coded by the L strand. Localization and direction of the primers used in the PCR amplification are denoted by arrows (see Table 1 for the primer DNA sequence associated with each number). 998 R. Zardoya and A. Meyer

tRNA-Phe+ - 12s rRNA-+ 1 CAAAGGTTTGGTCCCAGCCTTGCTATCAATT 100

101 TTAACCAGGATTACACATGCAAGCATCAACTCCCCAGTGAGAATGCCCCTGACTTATCCGTCAAAGATAACAGGGAGTAGGTATCAGGCACACAACATTA 200 201 ACGCTAGCCCAAGACACCTTGTCCAGCCACACCCCCAAGGGAACTCAGCAGTGATAGACATTGAATAATAAGTGAAAACTTGACTCAGCCATGGTTACAA 300 301 GGGCCGGTCAACTCCGTGCCAGCCACCGCGGTTACACGGAAGACCCAAAATGATAACACTACCGGCGTAAAGCGTGATTAAAGGACACCCACCATAATGG 400 401 AGCCACAAATAACTAAAGCTGTTATACGCACTTAAAAAAATATGCTCATCACACGAAAGTAACTCCAGCACCCAAAGGAACCCTGAACCCACGAAAGCTA 500

501 AGAAACAAACTGGGATTAGATACCCCACTATGCTCAGCCCTAAACACAAACAATTCAAACACACACTGTTCGCCAGGGGAACTACAAGCGCCAGCTTCAA 600

601 ACCCAAAGGACTTGGCGGCACTTCAAACCCACCTAGAGGAGCCTGTTCTAAAACTGACAACCCCCACCTAACCTCACCATCCCTAGCCATTAAACCAGCC 700

701 TATATACCGCCGTCGCCAGCCCACCCTGTGAAGGAAATACAATGGGCAAAAATAAAAAAATTAAAAACGTCAGGTCGAGGTGTAGCAAATGAGATGGGAA 800 801 GAAATGGGCTACATTTTCTAAATATAGAATATTACGAAAAAATACAGCGAAACCTGTACTTTGAAGGAGGATTTAGCAGTAAAAGGGGAATAGAGAGCCC 900

901 CTCTGAAACCGGCCCTGAAATGCGCACACACCGCCCGTCACTCTCCTCACCCAAAATCGGCCCCATCTTTTAATAAACAAAAAACCAAGCATACAAGTAG 1000 tRNA-Val+ - 1001 AGGAGGCAAGTCGTAACAAGGTAAGTGTACCGGAAGGTGCACTTGGACTAAT~1100 16s rRNA+ 1101 CGAAAGCGGGTCATTTTGWGCTATATAGCTAGGCCAACAAAACCATAACCACTACACCATTAACCAAAAACTTGAACAAACAAAACCTAAACCATTTACC 1200

1201 ATCCAAGTATAGGCGATAGAAAAGACACCAGGCACAATAGTAAAAGTACCGCAAGGGAACGCTGAAAAAGAAATGAAACAACTCGTAAAAGCAATAAAAA 1300

1301 CCAAAGACTAACCCTTGTACCTTTTGCATTATGATCTAGTTAGACCCCACCGGGCAAAATGAATTTAAGTCCAACCCCCCGAAACTAAGTGAGCTACTTC 1400

1401 GAAACAGCCTATGAGGGCAAACCCTTCTCTGTGGCAAAAGAGTGGGAAGATTTCCAAGTAGAGGCGATAAACCTAACGAGCTTAGTGATAGCTGG5TATT 1500

1501 CGAGAAATGATTTTTAGTTCAGCCCCTCGGCCTCCAAACCCTAACAACATAAAACCAATGTGAGCCAGAGAGAGTTATTCAAAGGGGGTACAGCCCCTTT 1600

1601 GAAAAAGGACACAACCTTCTAAGACAGGATAAAGACTATATTAACTCAAGGATCAGCTGCCCAAGTGGACCTAAAAGCAGTCACCTAAAAAAGCGTTAAA 1700

1701 GCTTAAGCAGCACTAAACCAACCTATACTAATAAATTTAAATTTAACCTCATTCCACCATCACTATCGAATTATTCTATATGCATAGAAGAATAAATGCTAGAATT 1800 1801 AGTAACAAGAAGGCCATTAAGCCTTCTCTAACTGCATAAGTGTACATCAGATTAGATTAACCACTGATAATTAACGACTTCAAAGAGAATACTATGACAT 1900

1901 AAAACAAGAAAAGCACACACGCCCACATCGTTAATCCAACACAGGAATGCAACCACGAAAGATTAAAAGAAAGAAAAGGAACTCGGC~CTATAAGCCC 2000

2001 CGCCTGTTTACCAAAAACATCGCCTCCCGCCAACAACAGAAGTATTGGAGGTCCCGCCTGCCCAGTGACAAGATTTAACGGCCGCGGTATCCTGACCGTG 2100 2101 CAAAGGTAGCGAAATCACTTGTCTTTTAAATGAAGACCTGTATGAATGGCACCACGAGGGCTTAACTGTCTCCTCTTTCCAATCAGTAAAATTGATCTGT 2200 2201 CCGTGCAGAAGCGGACTTAACTACATTAGACGAGAAGACCCTGTGGAGCTTCAGACACAAAGCCAACTACAAATAAACACTTAGATAATAAATTGATAAA 2300 2301 ACCAGTAGCCTAATACTGGCCCTATTGTCTTTGGTTGGGGCGACCACGGAGAAAAAACAATCCTCCAAGCCGATTGGTACCACCTGTACTAAAACAAAGG 2400 2401 GTAACACCCCAAAGTAATAAAAATTTTATCGGACATGACCCAGGGACTAAACCTGATCAACGAACCAAGTTACCCCAGGGATAACAGCGCAATCCTTTCC 2500

2501 AAGAGTCCAAATCGACGAAAGGGTTTACGACCTCGATGTTGGATCAGGACACCCCAATGGTGAAGCCGCTATTAAAGGTTTGTTTGTTCAACAATTAAAG 2600 2601 TCCTACGTGATCTGAGTTCAGACCGGAGAAATCCAGGTCAGTTTCTATCTATGATGTTAATTCTCCCAGTACGAAAGGACCGGAGAACTTGAGCCAATGC 2700 tRNA-Leu (UUR)-+ 2701 CACAAGCACGCTCATCTTCAACCTGATGAAAACAACTAAAACAGATAAAGAAGAACACAACCGCTCCCGTAAACAACGGGCAATGCCGGGGTGGCAGAGC 2800 NADH 1-3 - MTKIITHLLNPLAV 2801 TGACAAAAATTATTACACACCTGCTAAACCCACTAGCTG 2900 IIPILLAVAFLTLIERKVLGYMQLRKGPNIVGP 2901 TCATCATTCCAATTTTACTAGCCGTAGCGTTCCTGACACTCATCGAACGAAAAGTACTAGGTTATATGCAACTACGAAAGGGTCCTAATATCGTAGGCCC 3000 YGLLQPLADGLKLFIKEPVRPSTSSPLLFITTP 3001 ATATGGTCTCCTACAACCCCTAGCAGATGGACTAAAACTATTCATTAAGGAGCCAGTACGACCTTCTACATCATCCCCGCTACTCTTCATTACCACTCCA 3100 MLALTMALTLWLPLPLPHPMTNLNLGMLFILAIS 3101 ATACTAGCACTCACCATAGCACTAACTCTATGACTACCACTACCCCTCCCACATCCAATAACAAACCTAAACCTAGGAATATTATTTATCCTAGCAATCT 3200 SLTVYSILGSGWASNLKYALIGALRAVAQTISY 3201 CAAGTCTAACAGTATACTCAATTCTAGGCTCCGGCTGAGCATCAAACTTAAAATATGCCCTAATTGGGGCCCTCCGAGCAGTCGCACAAACAATCTCCTA 3300 EVSLGLILLAMIIFAGGFTLTTFNTSQETIWLL 3301 TGAAGTAAGCCTAGGACTTATCCTACTGGCCATAATCATCTTCGCAGGCGGTTTTACACTAACAACATTTAATACATCACAGGAAACCATTTGACTTCTA 3400 TPGWPLAAMWYISTLAETNRAPFDLTEGESELVS 3401 ACGCCAGGATGACCACTCGCAGCAATATGATACATCTCAACCTTAGCAGAAACCAACCGAGCCCCATTTGACCTCACAGAAGGAGAATCAGAACTTGTAT 3500 GFNVEYAGGPEALFFLAEYANILLMNTLSTILF 3501 CGGGGTTTAATGTGGAATATGCAGGAGGACCATTTGCACTATTCTTTCTAGCAGAATATGC~ATATTCTACTAATAAACACACTATCAACTATTCTATT 3600 MGAMHNPITPELTSINLMIKASALSMLFLWVRA 3601 TATAGGAGCCATACACAACCCAATCACACCAGAACTAACCTCAATTAACCTAATGATTAAAGCCTCCGCACTATCAATACTCTTCCTATGAGTACGAGCC 3700

FIGURE 2.-Complete Lstrand nucleotide sequence of the coelacanth mitochondrial genome. Position 1 corresponds to the first nucleotide of the tRNAphegene. Direction of transcription for each gene is represented by arrows. The deduced amino acid sequence for each gene product is shown above the nucleotide sequence (one-letter amino acid abbreviation is placed above the first nucleotide of each codon). Complete termination codons are indicated (*). tRNA genes are underlined and the corresponding anticodons are overlined. In the control region, four repeats are shown. Coelacanth Mitochondrial DNA 999

SYPRFRYDQLMHLVWKNFLPITLAMILWHTSLPI 3701 TCATACCCACGATTCCGATATGACCAGCTAATACACTTAGTATGAAAAAACTTCTTACCAATCACCCTAGCCATGATCCTATGACACACCTCCCTTCCAA 3800 F T G S L P P Q T * tRNA-Ile+ - 3801 TTTTTACAGGAAGCTTACCACCACAAACCTAA+ 3900 f-tRNA-Gln - tRNA-Met+ 3901 4000 NADH 2+ - MSPYVTMILISSLGLGTTI 4001 TGAGCCCTTACGTAACAATAATCCTTATCTCAAGCCTTGGACTCGGGACAACAATT 4100 TFTSSSWLMAWMGLEINTLAITPLMVKQHHPRAT 4101 ACATTTACAAGCTCATCCTGACTGATAGCTTGAATAGGTCTAGAAATTAATACCCTAGCCATCACCCCCCTAATAGTAAAACAACATCACCCTCGGGCAA 4200 EATTKYFLTQATASGLLLFATLNNAWMTGEWNT 4201 CTGAAGCCACAACAAAATATTTTCTTACCCAGGCAACAGCATCAGGACTGCTATTATTCGCAACCCTTAACAACGCTTGGATAACAGGAGAATGAAACAC 4300 MELSNNLSAPMITMALALKMGVAPMHFWLPEVL 4301 AATAGAACTATCAAACAATTTATCCGCCCCAATAATTACAATAGCCCTCGCACTAAAGATAGGAGTAGCACCAATACACTTCTGATTACCAGAAGTGCTA 4400 QGLPLLTGLILSTWQKLAPFTLLYMTSHELNTTT 4401 CAAGGACTCCCCCTACTAACTGGACTTATCCTGTCGACTTGACAAAAACTAGCCCCCTTCACCCTACTATATATAACATCACATGAATTAAACACAACAA 4500 MTILGLTSTIIGGLGGLNQTQLRKVLAYSSIAH 4501 CAATAACAATCCTAGGATTGACATCAACAATTATCGGTGGCCTTGGTGGATTAAACCAAACTCAACTGCGAAAAGTCCTAGCTTACTCATCAATTGCACA 4600 LGWMVIIIQYSKTLALLNLLLYITMTSTAFLTL 4601 CCTCGGATGAATAGTTATCATTATCCAATACTCCAAAACACTAGCCCTACTAAACCTACTGCTATATATTACAATAACATCAACAGCATTTTTAACACTT 4700 MTLSATKINTLSTKWATTPIATMTAMLALLALGG 4701 ATGACTCTATCAGCCACAAAAATTAATACCCTATCAACAAAATGAGCAACAACCCCTATCGCAACTATAACTGCAATACTAGCTCTACTAGCATTAGGAG 4800 LPPLTGFMPKWLILQELTKQNLPALATLMALSA 4801 GTCTCCCACCACTAACAGGATTTATACCAAAATGACTAATTTTACAAGAACTTACCAAGCAAAATTTACCCGCTCTAGCCACACTAATAGCCCTATCAGC 4900 LLSLFFYLRMCHTMTLTISPNTNNNMITWRKKP 4901 CCTATTAAGCCTCTTCTTCTACCTACGAATATGCCATACAATAACCCTTACAATCTCACCAAACACAAACAATAACATAATTACATGACGAAAGAAACCT 5000 GQKALPLAMLSIMTLMALPTTPTMVAIMN* tRNA-Trp 5001 GGCCAGAAAGCACTACCCCTAGCCATACTAAGCATCATAACCCTCATAGCACTCCCAACAACCCCAACAATAGTAGCCATCATAAACTAATA~ 5100 + - )G) 5101 )G) 5200 - +tRNA-Ala - 5201 C 5300 +tRNA-Asn - CtRNA- 5301 CATCCATCCTAACATTCCAAGAAAAAAAGGAATGTTT- 5400 COI4 CYS - +tRNA-TyrT M M I R W L F 5401 GCTTGGTAAGAGGAGGAATTGAACCTCCTGTGATAATCACTCGTTGACTATTC 5500 STNHKDIGTLYMIFGAWAGMVGTALSLLIRAELS 5501 TCAACCAACCATAAAGACATTGGTACCCTATACATGATCTTCGGTGCCTGAGCTGGAATAGTTGGAACCGCCCTAAGCCTGCTTATTCGAGCTGAACTCA 5600 QPGALLGDDQIYNVVVTAHAFVMIFFMVMPIMI 5601 GCCAACCTGGGGCTCTCCTGGGCGATGACCAAATTTATAATGTAGTCGTTACAGCACATGCATTCGTGATAATCTTCTTTATAGTAATACCGATCATAAT 5700 GGFGNWLIPLMIGAPDMAFPRMNNMSFWLLPPS 5701 CGGCGGGTTTGGCAACTGATTAATTCCCCTGATGATTGGGGCACCCGACATAGCATTTCCACGTATAAACAACATAAGCTTCTGACTACTACCACCCTCA 5800 LLLLLASSGVEAGAGTGWTVYPPLAGNLAHAGAS 5801 CTCCTACTCCTACTAGCATCTTCTGGAGTAGAAGCAGGAGCAGGCACAGGATGGACAGTATACCCTCCACTAGCGGGCAACCTCGCCCATGCAGGAGCAT 5900 VDLTIFSLHLAGVSSILGAINFITTVINMKPPT 5901 CCGTAGATTTAACAATTTTCTCCTTACATCTAGCCGGTGTATCCTCAATCTTAGGGGCCATCAACTTCATCACAACAGTAATCAACATAAAACCCCCAAC 6000 MTQYQTPLFIWSVLVTAVLLLLSLPVLAAGITM 6001 AATAACACAGTATCAGACACCACTATTTATCTGATCAGTCTTAGTGACCGCCGTACTACTCCTACTCTCGCTACCGGTGCTAGCTGCCGGAATTACCATA 6100 LLTDRNLNTTFFDPAGGGDPILYQHLFWFFGHPE 6101 CTACTGACAGATCGAAATCTAAACACAACATTCTTTGACCCTGCTGGAGGAGGAGACCCTATTCTATACCAACACCTAT~CTGATT~T~~~~~~~~CCTG6200 VYILILPGFGMISHIVAYYSGKKEPFGYMGMVW 6201 AAGTATACATCCTAATTTTACCAGGATTTGGTATAATCTCACACATTGTGGCCTACTACTCTGGAAAGAAAGAACCATTCGGGTATATAGGTATAGTATG 6300 AMMAIGLLGFIVWAHHMFTVGMDVDTRAYFTSA 6301 AGCTATAATGGCAATTGGACTTCTAGGCTTCATCGTATGAGCCCATCATATATTTACCGTAGGAATGGATGTTGACACACGAGCATACTTTACATCAGCA 6400 TMIIAIPTGVKVFSWLATLHGGVTKWDTPLLWAL 6401 ACCATAATTATTGCCATCCCAACAGGAGTAAAAGTGTTCAGCTGACTAGCGACACTTCACGGAGGAGTGACCAAATGAGACACACCCCTGCTATGAGCAC 6500 GFIFLFTVGGLTGIVLANSSLDIILHDTYYVVA 6501 TAGGATTTATCTTTCTTTTTACAGTAGGAGGCCTAACAGGCATCGTACTGGCAAACTCATCACTAGACATCATCCTACATGACACTTATTACGTAGTAGC 6600 HFHYVLSMGAVFAIMGGLVHWFPLMTGYTLHNT 6601 ACACTTCCACTATGTCCTATCAATAGGAGCAGTATTTGCAATCATAGGGGGACTCGTGCACTGATTTCCACTAATAACAGGATATACCTTACACAACACA 6700 FIGURE2.- Continued 1000 R. Zardoya and A. Meyer

WTKIHFGVMFTGVNLTFFPQHFLGLAGMPRRYSD 6701 TGAACAAAAATCCACTTTGGTGTAATATTCACAGGAGTAAACCTAACATTTTTCCCACAACACTTCCTCGGACTAGCAGGAATACCACGACGTTACTCAG 6800 YPDAYTLWNTVSSIGSLISLIAVIMFMFILWEA 6801 ACTATCCAGATGCCTATACTTTATGAAACACAGTATCATCAATTGGCTCTCTAATTTCACTAATTGCCGTAATCATATTTATATTTATCCTGTGAGAAGC 6900 FSAKREVLIVEMTTTNVEWLHGCPPPHHTYEEP 6901 TTTCTCTGCCAAACGAGAAGTACTAATTGTAGAAATAACAACAACAAATGTAGAATGGCTGCACGGATGCCCACCACCACACCACACATATGAAGAACCA 7000 AFVQAPR* - +tRNA-Ser (UCN) 7001 GCATTCGTACAAGCTCCTCGATAAAACAC~7100 COII+ tRNA-Asp+ - MAHPSQL 7101 TATTTCCAAATGGCACACCCATCACAGTTA 7200 GLQDAASPVMEELLHFHDHALMIVFLISTLVFYI 7201 GGATTACAAGATGCAGCTTCTCCCGTTATAGAAGAACTCCTCCACTTTCACGATCATGCACTAATAATTGTATTTTTAATTAGCACATTAGTATTTTACA 7300 ILAMMTTKMTDKYILDAQEIEIVWTLLPAIVLI 7301 TTATTCTAGCCATAATAACAACAAAAATAACTGACAAATATATCTTAGACGCACAAGAAATTGAAATTGTGTGAACACTACTCCCAGC~TCGTCCTAAT 7400 LVALPSLRILYLIDEVENPHLTIKAMGHQWYWS 7401 CCTAGTTGCCCTACCCTCGCTACGAATCCTATATCTAATTGATGAAGTCGAAAACCCTCACCTAACAATTAAAGCAATAGGCCACCAATGATACTGAAGC 7500 YEYTDYEELSFDSYMTPLQDLNPGQFRLLETDHR 7501 TATGAGTACACGGACTATGAAGAACTAAGCTTCGACTCATACATAACACCACTACAAGACCTAAACCCGGGCCAATTCCGCTTGCTGGAAACAGACCATC 7600 MVIPMESLIRVLISAEDVLHSWAVPALGVKMDA 7601 GAATGGTTATCCCAATAGAGTCGCTTATCCGAGTACTAATTTCAGCTGAAGACGTACTACACTCATGAGCAGTCCCAGCCCTAGGAGTAAAAATAGATGC 7700 VPGRLNQITFMISRPGLYYGQCSEICGANHSFM 7701 AGTCCCAGGGCGACTCAACCAAATTACATTCATAATTTCCCGACCAGGACTATATTATGGACAATGCTCAGAGATTTGTGGAGCAAACCACAGCTTTATA 7800 PIVLEAIPLDPFEDWSSSMLEEA tRNA-Lys+ 7801 CCCATCGTACTTGAAGCAATCCCACTAGACCCCTTCGAAGACTGATCTTCATCAATGCTGGAAGAAGCCT~ 7900 ATPase 8"f - MPQLNPSPWLLILLFSWLI 7901 CCACCCTCAGTGGCATGCCACAACTAAACCCCTCCCCCTGACTACTAATCCTGCTATTCTCCTGACTCATC 8000 FLTMLPSKTQLHTFPNMPSTQNMCKQEPEPWTWP 8001 TTCTTAACTATACTCCCCTCTAAGACACAATTACACACCTTCCCAAACATGCCATCAACACAAAATATATGCAAACAAGAACCAGAACCATGAACCTGAC 8100 ATPase 6 + MSLNFFDQFMSPTLLGVPLIAVAMMFPWTLLPT WA* 8101 CATGAGCCTAAACTTCTTTGACCAATTTATGAGCCCAACACTATTAGGAGTACCACTCATTGCTGTAGCAATAATATTCCCATGGACCCTATTACCAACA 8200 PTNRWLNNRTLTLQNWFIGR€TNQLLQPLNTGGH 8201 CCAACCAACCGATGACTTAATAACCGAACACTAACACTACAAAACTGATTTATCGGCCGCTTCACTAATCAACTACTACAACCATTAAACACTGGAGGAC 8300 KWAMILMSLNLLGLLPYTFTPTTQLSLNMGLAI 8301 ACAAATGAGCAATAATCTTAATATCACTAAACCTCCTGGGACTTCTACCGTATACATTCACACCAACAACACAACTATCACTAAACATGGGACTTGCTAT 8400 PFWLATVLLGLRNQPTAALGHLLPEGTPTLLIP 8401 TCCATTCTGACTAGCAACAGTATTACTGGGACTGCGTAACCAACCCACTGCCGCGCTAGGACACCTTCTCCCAGAAGGAACACCAACCCTGCTAATCCCA 8500 ILIIIETISLLIRPFALGVRLTANLTAGHLLMQL 8501 ATCCTAATTATTATTGAAACAATCAGCCTACTTATCCGCCCCTTCGCCCTAGGAGTACGACTAACAGCCAATCTCACAGCAGGCCACCTCCTAATACAAT 8600 IATAAFVLLPMMPTVALLTTLVLFLLTLLEIAV 8601 TAATTGCTACCGCCGCCTTCGTACTCCTACCTATAATACCAACAGTAGCATTATTAACAACATTAGTCCTATTCCTCCTGACCCTGCTAGAAATTGCCGT 8700 COIII "f AMIQAYVFVLLLSLYLQENVMAHQAHAYHMVDP 8701 AGCAATAATCCAAGCCTACGTGTTTGTTCTATTACTAAGCCTCTATCTACAAGAAAATGTCTAATGGCCCACCAAGCACACGCATATCACATAGTTGACC 8800 SPWPITGATAALLVTSGLAAWFHFNSMILILMG 8801 CAAGCCCATGACCCATTACAGGGGCCACGGCCGCCCTACTTGTAACCTCAGGCCTAGCAGCGTGATTTCACTTCAACTCAATAATCTTAATTTTAATAGG 8900 LTLLLLTMYQWWRDIIRESTFQGHHTLPVQKSL 8901 ACTAACACTATTGCTACTAACTATGTATCAATGATGACGAGATATTATTCGAGAAAGCACATTCCAAGGTCACCACACACTACCTGTACAAAAAAGCCTA 9000 RYGMILFITSEVFFFLGFFWAFYHSSLAPTPELG 9001 CGATATGGTATAATCCTGTTCATTACATCCGAAGTATTCTTCTTCCTAGGGTTCTTCTGAGCCTTTTACCATTCAAGTCTGGCACCCACTCCTGAACTCG 9100 GLWPPTGITPLDPFEVPLLNTAVLLASGITVTW 9101 GAGGACTCTGACCTCCCACTGGAATTACACCCCTAGATCCATTTGAAGTACCACTATTAAACACAGCAGTTCTACTAGCCTCGGGAATTACAGTCACATG 9200 AHHSLMEGQRKEAIQSLFITVLLGLYFTALQAT 9201 AGCCCATCACAGCCTAATAGAGGGGCAACGAAAAGAGGCTATCCAATCACTATTTATCACAGTTCTGTTAGGACTATACTTTACAGCACTGCAAGCCACA 9300 EYYESPFTIADGAYGSTFFVATGFHGLHVIIGST 9301 GAATACTACGAATCCCCATTTACAATCGCTGACGGAGCCTATGGCTCAACCTTTTTTGTAGCAACCGGATTCCACGGTCTACATGTCATTATTGGCTCTA 9400 FLIVCLVRQTQYHFTSNHHFGFEAAAWYWHFVD 9401 CATTCCTAATCGTATGCCTAGTACGACAAACACAATACCACTTCACATCAAACCACCACTTTGGCTTTGAAGCAGCAGCATGATACTGACATTTCGTAGA 9500 V V W L F L Y V S I Y W W G S * tRNA-Gly+ - 9501 CGTAGTCTGACTATTCTTATACGTATCAATCTACTGATGAGGCTCATAAACCCCTCTTGGT 9600

FIGURE2.- Continued Coelacanth Mitochondrial DNA 1001

NADH 3+ MNLILAGLLIMSILSMILAMIAFWLPN 9601 ~TGAACCTGATTCTAGCGGGCCTACTTATCATAAGCATCCTCTCTATAATTTTAGCTATAATCGCATTCTGACTACCAAAC 9700 MTPDTEKLSPYECGFDPLGSARLPFSLRFFLVAI 9701 ATGACCCCTGATACAGAAAAACTATCTCCCTACGAATGTGGCTTTGATCCTCTAGGATCCGCACGACTCCCATTCTCCCTACGATTTTTCCTAGTAGCAA 9800 LFLLFDLEIALLLPLPWADQLTNPTLALTWTTS 9801 TCCTATTCCTGCTATTTGACCTAGAAATTGCATTATTATTACCCCTACCCTGGGCAGACCAACTAACAAACCCAACACTTGCATTAACCTGGACAACAAG 9900 IIALLTLGLIHEWTQGGLEWAE tRNA-Arg+ - 9901 CATCATCGCCCTACTAACACTAGGACTAATCCACGAATGAACTCAAGGAGGCCTCGAATGGGCAGAAT~ 10000 NADH 4L"f MTPVQLSFNTAFTLGLMGVTF 10001 CATGACCCCAGTACAACTCAGCTTTAACACTGCATTCACACTAGGCTTAATAGGAGTAACATTC 10100 HRAHLLSALLCLEGMMLSLYMGLSLWPMQLESTT 10101 CACCGAGCCCATCTGCTATCAGCATTACTCTGCCTAGAAGGAATAATATTATCCCTGTATATAGGACTGTCCCTATGACC~TGCAACTAGAATCAACTA 10200 YMTTPLLLLAFSACEAGAGLALMVATSRTHGTD 10201 CATACATAACCACACCACTACTACTACTCGCCTTCTCAGCCTGTGAGGCTGGAGCAGGCCTAGCCCTCATAGTAGCAACATCCCGCACACATGGTACGGA 10300 NADH 4 "f MLKVLMPTIMLILTTWLTKPAWLWP HLQNLNLLQC* 10301 CCACCTCCAAAACCTAAACTTACTACAATGCTAAAAGTTTTAATACCAACAATTATGCTTATCTTAACCACATGATTAACAAAACCTGCATGACTCTGAC 10400 TMTTNSLLVATISLTWLKWDSESGWKSLNSSMA 10401 CAACAATAACAACCAATAGCCTACTCGTAGCTACCATCAGCTTAACCTGACTAAAATGGGACTCAGAGTCAGGATGAAAATCTCTCAACAGCTCAATGGC 10500 TDPLSTPLLILTCWLLPLMILASQNHMFMEPLN 10501 TACCGACCCCCTATCTACACCATTACTAATCCTCACATGCTGGCTTCTACCCCTCATAATTCTCGCAAGCCAAAACCACATGTTTATAGAACCACTAAAC 10600 RQRSFISLLISLQTFLIMAFGATEIILFYIMFEA 10601 CGCCAACGATCATTCATCTCCCTACTCATCTCCCTACAAACATTCCTAATTATAGCATTTGGTGCCACTGAAATCATCCTATTTTACATTATATTTGAAG 10700 TLIPTLIIITRWGNQTERLNAGTYFLFYTVMGS 10701 CAACCCTAATCCCAACACTAATTATTATTACCCGATGGGGTAATCAAACAGAGCGACTAAACGCAGGAACATACTTTTTATTTTATACAGTAATAGGGTC 10800 LPLLVALLMTQNNLGTLSMPLIQHMYQMKLHTH 10801 ACTACCACTATTAGTTGCACTTTTAATAACACAAAATAACCTTGGTACCCTATCAATACCGCTCATCCAACACATATACCAAATAAAACTTCATACACAT 10900 GDMMWWTACLLAFLVKMPLYGVHLWLPKAHVERP 10901 GGAGACATGATATGATGAACAGCCTGCCTATTAGCCTTCTTAGTAAAAATACCACTATACGGAGTCCACCTTTGACTCCCAAAAGCCCATGTAGAAGCCC 11000 IAGSMVLAAVLLKLGGYGMMRLIMMLAPMTKTL 11001 CAATTGCAGGATCAATAGTACTAGCCGCCGTCCTACTAAAACTAGGAGGATACGGAATAATACGACTAATCATAATATTAGCTCCAATAACAAAAACCCT 11100 AYPFIILALWGIIMTGSICLRQTDLKSLIAYSS 11101 AGCCTATCCATTCATCATCCTCGCCCTATGAGGAATCATTATAACTGGATCAATCTGCTTACGACAAACAGACCTAAAATCCCTAATCGCCTACTCATCA 11200 VGHMGLVAAGILTQTPWGFTGATVLMIAHGLTSS 11201 GTAGGCCACATAGGACTAGTGGCAGCAGGTATCCTAACACAAACACCATGAGGCTTTACAGGAGCTACTGTTCTAATAATTGCTCACGGTCTTACATCCT 11300 ALFCLANTNYERTHSRTMILARGMQVILPLMTF 11301 CAGCCCTATTCTGTCTAGCAAACACAAACTATGAACGAACCCATAGCCGAACCATGATCCTAGCACGAGGAATGCAAGTTATCCTCCCACTCATGACATT 11400 WWLMMNLANLALPPSTNLMGELLIITMTFNWSN 11401 CTGATGACTTATAATAAATTTAGCTAACTTAGCCTTACCCCCATCCACTAATCTAATAGGAGAACTACTTATTATTACAATAACCTTCAACTGATCAAAC 11500 WTLTMTGLGMLITAIYSLHMFLTTQRGLMTNHII 11501 TGGACACTAACTATAACAGGACTGGGCATACTAATCACAGCTATCTACTCATTACACATGTTCCTCACAACACAACGAGGCCTGATAACAAACCACATTA 11600 SIEPSHTREHLLMTMHALPMLLLILKPELIWGW 11601 TCTCAATTGAACCCTCCCACACCCGAGAACACCTACTAATAACAATACATGCCCTCCCAATACTATTGCTAATCCTTAAACCAGAACTGATCTGAGGCTG 11700 S Y tRNA-His -+ - (AGY) tRNA-Ser "f 11701 ATCCTACT~TGTGGCT~TCCTCTT~11800

- (CUN) tRNA-Leu 4 - 11801 GCCC> 11900 NADH 5 -i MYTTLIFNSTLMTMFTILTAPILTTLNP 1901 AACTCCAAGTAGCGGCCATGTACACAACATTAATTTTTAACTCAACACTTATAACCATATTTACCATCCTAACAGCCCCAATCTTAACCACCCTTAACCC 12000 IKPNKKWTELWVKTAVQLAFFTSLIPLFIYLDQ 2001 TATCAAACCCAACAAAAAGTGAACAGAATTATGAGTAAAAACCGCCGTACAGCTAGCCTTCTTCACAAGTCTAATCCCGCTCTTCATCTATCTTGACCAA 12100 GIETITTNWQWMNTNTFNINISFKFDQYSIVFI~ 2101 GGAATCGAGACCATTACAACAAACTGACAATGAATAAACACCAACACATTTAACATTAACATCAGCTTTAAATTTGACCAATACTCAATTGTATTCATCC 12200 IALYVTWSILEFANWYMHQDPKMNQFFKYLLLF 12201 CAATCGCACTGTATGTCACATGATCAATTCTAGAATTCGCCAACTGATACATGCACCAAGACCCAAAAATAAACCAATTTTTCAAATACCTGCTCCTATT 12300 LITMMILVTANNMFQLFIGWEGVGIMSFLLIGW 12301 CCTAATCACCATAATAATCTTAGTTACAGCAAACAACATGTTCCAACTATTTATCGGCTGAGAAGGAGTAGGAATTATATCATTCCTATTAATTGGCTGA 12400 WHGRANANTAALQAVIYNRVGDIGLTLSMVWFAI 12401 TGACATGGCCGAGCTAACGCCAATACCGCAGCCCTACAAGCAGTCATCTACAATCGAGTAGGAGACATTGGGTTAACCCTAAGCATGGTCTGATTTGCAA 12500 NLNTWEMQQMFIMSYNTDMTIPLLGLTLAAAGK 12501 TCAACCTAAATACATGAGAAATACAACAAATATTTATCATATCCTACAACACCGACATAACCATCCCCCTACTAGGCCTAACTCTGGCAGCAGCAGGAAA 12600 FIGURE2.- Continued 1002 R. Zardoya and A. Meyer

SAQFGLHPWLPAAMEGPTPVSALLHSSTMVVAG 12601 ATCAGCCCAATTTGGATTACACCCATGACTACCAGCAGCTATAGAAGGTCCAACACCGGTCTCTGCCCTACTACACTCAAGTACCATAGTGGTTGCCGGA 12700 IFLLIRLHPLMDNNKLILTTCLCLGALTTLFTAA 12701 ATCTTCCTGCTTATCCGACTACACCCCCTCATAGACAATAATAAACTAATCCTCACTACCTGCCTCTGCCTAGGAGCACTAACCACCCTATTTACTGCCG 12800 CALTQNDIKKIIAFSTSSQLGLMMVAIGLNQPQ 12801 CATGCGCACTCACCCAAAACGATATTAAAAAAATTATTGCATTTTCAACATCAAGTCAACTAGGACTAATAATGGTAGCAATTGGACTAAATCAACCACA 12900 LAFLHICTHAFFKAMLFLCSGSIIHSLNDEQDI 12901 ACTAGCATTCCTCCACATCTGCACCCACGCTTTCTTTAAAGCAATGCTGTTCCTATGCTCCGGATCAATTATTCACAGTCTAAATGATGAACAAGATATC 13000 RKMGGLHNMLPTTSSCTMVSSMALTGMPFLAGFF 13001 CGAAAAATAGGTGGCTTACACAACATGCTCCCAACAACAAGCTCCTGTACAATAGTTAGCAGCATAGCCCTAACAGGAATGCCATTCCTAGCAGGCTTCT 13100 SKDAIIESLNSSHLNAWALTTTLMATSFTAVYS 13101 TCTCAAAAGACGCAATCATCGAATCACTAAACTCCTCTCACCTAAACGCCTGAGCCCTA~CTACTACACTAATAGCCACATCATTCACCGCAGTCTATAG 13200 LRIIYLVMMNYPRFQSLSPIDENYTKTRNPIKR 13201 CCTCCGAATTATTTACCTTGTAATGATAAACTATCCACGATTTCAAAGCCTGTCCCCCATCGATGAAAATTATACAAAAACACGTAACCCAATTAAACGA 13300 LAWGSVIAGMMIFLNILPTKSQTMTMPTHMKTAA 13301 CTAGCATGAGGAAGCGTAATCGCAGGAATAATGATCTTCCTAAACATCCTACCAACTAAATCACAAACAATGACCATACCCACCCATATAAAAACAGCCG 13400 IMVTILGVLTAMELTKLTSTQLKITPNVNMHNS 13401 CAATCATAGTTACAATCCTAGGAGTCCTCACCGCTATAGAACTAACAAAACTTACAAGCACACAACTAAAAATCACCCCAAACGTCAACATACACAACTC 13500 SNMLGYFPNIIHRLLPQTNLYLGQKMATHLTDQ 13501 ATCCAACATACTAGGATACTTCCCAAACATCATTCACCGACTTCTACCACAAACAAACTTATATCTAGGACAAAAAATAGCAACACACCTAACAGATCAA 13600 TWFEKIGPKGILALQIPTTKMINNSQQGLIKTYL 13601 ACATGATTTGAAAAAATCGGGCCAAAAGGAATTTTAGCCCTACAAATCCCTACAACTAAAATAATCAACAACTCGCAACAAGGACTAATCAAAACATACC 13700 *VARLAGRSLGRTIELV TLFFLTTVLFTTMTMI* 13701 TGACCCTATTTTTTCTAACAACAGTTCTATTTACTACAATAACCATAATCTAACTGCTCGAAGCGCACCACGGGACAACCCACGAGTAATCTCCAAAACC 13800 VFLALLLVWGTVLLLFGGSSYVMPVGSVDGSLV 13801 ACAAACAAAGCCAACAATAACACCCACCCAGTCACCAACAACAAAAATCCCCCAGAAGAATAAACTATGGGCACACCACTGACATCCCCTGACAAAACTG 13900 SMEGFVDLSMWYFDYWWGYCYYGVLLFGFLYVLI 13901 ATATCTCACCAAAAACATCCAATGATATTCAATAAAAATCATACCACCACCCATAACAATAATACCCAACAAGTAAAAAGCCAAACAAATAAACCAAAAT 14000 YLFVSWSGWSEPYPEAALAASYAFVVLMGGLYI 14001 ATAAAGAAAAACCGACCAACTCCCCCAAGACTCAGGATAAGGCTCTGCTGCTAAAGCAGCAGAATAAGCAAAAACAACCAATATCCCCCCCAAATAAATC 14100 LFLVLSLFSGGYGVLFGCGVVASFVLGLAAFYP 14101 AAAAACAGTACTAAGGACAAAAAAGAGCCACCGTACCCAACCAAAAAGCCACAGCCAACAACTGCAGAAAATACTAACCCAAGAGCAGCAAAATAAGGAG 14200 +NADH 6 APNSAVAMLGVVLGMLVVFVFYIM 14201 CTGGATTAGACGCAACTGCCATCAAACCAACCACGAGTCCCATCAACACAACAAAAACAAAATAAATCAT 14300 Cyt b+ - +tRNA-GluMTNIRKTHPLIKIINETIID 14301 TGACTTGAAAAACCACCCCATGACAAACATCCGAAAGACACACCCGCTAATTAAAATTATCAACGAAACCATCATCG 14400 LPTPSNISIWWNFGSLLGICLITQIVTGLFLAM 14401 ACCTCCCCACACCATCAAACATCTCAATCTGATGAAATTTTGGGTCACTACTAGGAATTTGTTTAATTACACAAATCGTAACAGGCCTATTCTTAGCAAT 14500 HYTADITTAFSSVAHICRDVNYGWLIRSTHANG 14501 ACACTACACAGCTGACATTACAACAGCATTCTCATCAGTAGCCCACATCTGCCGAGATGTAAACTATGGATGACTAATCCGAAGTACCCATGCCAACGGA 14600 ASLFFICIYLHVARGLYYGSYLQKETWNIGVILL 14601 GCCTCTCTATTCTTCATCTGCATCTACCTACATGTAGCACGTGGACTCTACTATGGGTCATACTTACAAAAAGAAACCTGAAACATCGGAGTTATCCTCC 14700 MLVMITAFVGYVLPWGQMSFWGATVITNLLSAV 14701 TCATGCTAGTTATGATTACTGCCTTCGTGGGATATGTCCTCCCCTGAGGCCAAATATCATTCTGAGGGGCAACCGTCATCACAAACCTCCTGTCAGCAGT 14800 PYIGDTLVQWIWGGFSVDNATLTRFFAFHFLLP 14801 ACCCTATATTGGAGATACACTAGTTCAATGAATCTGAGGAGGTTTTTCCGTAGACAACGCCACACTCACGCGATTCTTTGCCTTCCACTTCCTCCTGCCA 14900 FVISGASIIHLLFLHETGSNNPTGLNSDADKVTF 14901 TTCGTAATCTCAGGAGCCTCGATTATTCACCTGCTCTTCCTTCATGAGACAGGATCCAACAACCCAACTGGCCTTAACTCTGACGCAGATAAAGTAACCT 15000 HPYFSYKDMLGFLITLTTLAFLTLFTPNLLGDP 15001 TCCACCCGTACTTCTCGTACAAAGACATGTTAGGATTCCTAATTACACTAACAACACTAGCATTCCTAACCCTATTCACTCCAAACCTGTTAGGAGACCC 15100 ENFTPANPLTTPPHIKPEWYFLFAYAILRSIPN 15101 CGAAAACTTCACACCAGCAAACCCATTAACCACCCCGCCACACATCAAACCAGAATGATACTTCCTATTCGCCTATGCAATTCTACGATCCATCCCCAAC 15200 KLGGVLALVSSILVLLLVPILHTSKQRGNTFRPI 15201 AAACTAGGAGGAGTTCTAGCTCTAGTCTCCTCTATTCTAGTACTACTACTAGTGCCAATCTTACACACCTCAAAACAACGAGGAAACACCTTCCGCCCAA 15300 TQMLFWALVADMLILTWIGGQPVEYPFMTIGQI 15301 TTACCCAAATACTATTCTGAGCCCTTGTAGCAGACATGCTAATCCTAACCTGAATCGGAGGCCAACCAGTAGAATATCCATTCATGACAATCGGACAAAT 15400 ASITYFSLFLILIPMTGWLENKAMNWN tRNA-Thr+ 15401 TGCTTCAATCACCTACTTCAGCCTATTCCTCATCCTAATCCCAATAACTGGATGACTAGAAAACAAGGCCATAAACTGAAACTA~ 15500 - - T: 15501 T: 15600 +tRNA-Procontrol region+ 15601 TATTCTCTGTTCCCAAGCTCTGCCCATCCACTCAATAACCCCCCTCCTTCTAGTACGTACTATCTCTATGTATATCGAACATT 15700

FIGURE2.-Continued Coelacanth Mitochondrial DNA 1003

15701 CGTTACTCTCAAGTACATTATACATTACATGTTAATTTACCATTAGAACTGTCACACCACATTTAGGGCGAGAAACCAGCAATACTTGCATAAATGATTA 15800

15801 ATTAATTTACGATGTGGTTAGACACTTATTTCTTAATCTAAACTTGGCTTTATCCATTACTGGCCACTGGTACTGTGCGATGGAGAATAATAGAAAGATT 15900 15901 TATATGATTATAAATTATCTATTACTGGCATCTGGTTTTGGGTTTAGTGAGGGGAAGGGCTTTTTAACCCGTAACTCAGTATCACTTTACTATACTGGCC 16000 16001 AGCCGTTATTTTTTTGGGGGGGGGGAACGTGAAGTTAGACAGCACTTAACTTGCGTCAAACGTCGTTTATGATTGGACTTTTAGTCGACACTCAAGTACT 16100

tl I Re.nf-+ 7 / RePat 3 16101 TTTGGATCTATGACAAAGGATAATCAGTTAATGATAGTTAATGATAGATAGACATATAATGAATGATAGATAGACATAT~TGAATGATAAATAGACATATAATGAATGA 16200

I t4 16201 TAGTTAGATATATAGATTAATGTAAGAAAGATAATTGAACCCATGACAGAGGACATACTTTTAATGATTTCAGGACATAAACACCATGCACATCAGATTG 16300 16301 TATCAACTAAACCCCTCCCAACCCCTAAAACCCAGGACTCGCTAAACACATCAACCCATATATTTTTTCAGTATATACATTGTTATACATCAAATAAATA 16400 16401 ATGTAAT 16407 FIGURE2.- Continued phological and molecular characters that could have 1 have successfully amplified in fishes and , unpub- evolved at the time of this rapid radiation have been lished data). Thirty-five cycles of PCR (denaturing at 94" for 60 sec, annealing at 45-50" for 60 sec and extending at 72" obliterated by subsequent changes and back mutations for 60-105 sec) were performed in 25-pl reactions containing during thelast 400 million years. Basedon thecurrently 67 mM Tris-HC1, pH 8.3, 1.5 mM MgC12, 0.4 mM of each dNTP, available data, it had remained difficult to distinguish 2.5 ,UM of each primer, template DNA (10-100 ng), andAm- between competing phylogenetic hypotheses, and pliTaq DNA polymerase (1 unit, Perkin-Elmer-Cetus) . larger molecular data sets are needed to address this PCR products were cloned using the pGEM-T vector (Pro- mega) and sequencedusing the FSTaq Dye Deoxy Termina- important phylogenetic issue. tor cycle-sequencing kit (Applied Biosystems Inc.) with an Mitochondrial DNA has been widely and successfully automated DNA sequencer (Applied Biosystems 373A) aspre- used in the past to infer phylogenetic relationships viously described (ZARDOYA and MEYER 1996a). DNA se- among many different species and might therefore be quences were obtained using both M13 universal sequencing expected tobe a suitable candidate for theevolutionary primers, and threespecific oligonucleotide primers for those question at hand. However, several authors (e.g., CUM- clones containing inserts >lo00 bp (see Table 1). Typically, only one clone per PCR product was sequenced. No differ- MINGS et al. 1995; Russo et al. 1996; ZARDOYAand MEYER ences were found in any of the overlapping sequences be- 1996c) recently pointed out the high risk of not recov- tween contiguous clones. The fidelity of Taq polymerase is ering the exact evolutionary relationships of the taxa -5 X errorsper nucleotide incorporatedper cycle (GEL under study when individual mitochondrial genes are FAND and WHITE1990) (one errorin 16,400 nucleotides after analyzed and have suggested that larger or more hetero- 35 cycles). However, recently XU and ARNASON (1996) have reported that the sequence of the gorilla mitochondrial ge- geneousdata sets, ie., complete mitochondrial ge- nome obtainedby direct isolation from mitochondria differed nomes, are neededto confidently resolve some phyloge- outside the control region by 49 nucleotides with respect to netic questions. To address the question of the relation- the corresponding mitochondrial genome obtained partially ships among living sarcopterygians and to study the byPCR (HORAIet aL, 1995). Obviously, not all these differ- evolution of the mitochondrial genome in vertebrates, ences are due to Taq polymerase error during amplification we previously sequenced the complete mitochondrial but also are due tosequencing errors, so we expect in amore realistic estimate that our sequence might differ maximally genome of an African lungfish (ZARDOYA and MEYER (outside the control region) by <20 nucleotides with respect 1996a). Here, we present the complete nucleotide se- to the hypothetical sequence from a genome obtained by quence of the mitochondrial genomeof the coelacanth, direct isolation from mitochondria. Based on these calcula- L. chalumnae. tions, we can submit that the sequence reportedis representa- tive of the coelacanth mitochondrial genome and that, al- though we recommend traditional methods for obtainingmi- MATERIALSAND METHODS tochondrial genomesif possible, our PCR approach to obtain DNA extraction, PCR amplification, cloning and sequenc- the complete mitochondrial genome sequence will be useful ing: The scarcity of coelacanth samples (BRUTONand Cou- for other rare or endangeredspecies. TOUVIDIS 1991; FRICKE1992; FMCKE et al. 1995) andtheir Molecular and phylogenetic analyses: Sequence data were generally poor quality did notallow us to isolate mitochondria analyzed with the GCG program package (DEVEREUXet aL directly from tissue and to obtain intact mitochondrial ge- 1984), MacCladeversion 3.06 (~DISONand WDISON nomes. Therefore, a total DNA extraction was performed 1992) and PAUP* version d54 (SWOFFORD1997). DNA se- (TOWNER1991) and the isolated DNA was cleaned through quences were aligned using CLUSTAL W (THOMPSONet aL a Sephadex G50 column. A combinationof 24sets of versatile 1994) followed by refinement by . Ambiguous alignments, primers was designed (Table 1) based on highly conserved mainly in 5' and 3' ends of proteincoding genes, in tRNA vertebrate mtDNA regions. They were used to amplify via gene sequences corresponding to the tRNA DHU and T@C PCR, contiguous and overlapping fragments (averaging -800 arms, and in several highly variable regions of the rRNA genes, bp) which covered the entire coelacanth mtDNA molecule were excludedfrom the phylogenetic analyses (aligned se- (Figure 1). Some of these primers are expectedto successfully quences are available from the authors upon request). Third amplify mitochondrial DNA fragments in other related verte- codon positions were also excluded fromall phylogenetic anal- brate species (e.g., the primerpairs 19-20 and 27-28 of Table yses. Transitions and transversions were given equal weight 1004 R. Zardoya and A. Meyer

TABLE 2 Localization of features in the mitochondrial genome of the coelacanth

Codon Feature From Feature To Size (bp) Start stop tRNA-Phe 1 69 69 12s rRNA 70 1052 983 tRNA-Val 1053 1119 67 16s rRNA 1120 2784 1665 tRNA-Leu (UUR) 2785 2860 76 NADH 1 2861 3832 972 ATG TAA tRNA-Ile 3833 3906 74 tRNA-Gln 3974 3904 69 (L) tRNA-Met 3974 4043 70 NADH 2 4044 5090 1047 ATG TAA tRNA-Trp 5093 5164 72 tRNA-Ala 5235 5167 67 (L) tRNA-Asn 5310 5238 71 (L) tRNA-Cys 5403 5338 65 (L) tRNA-Tyr 5475 5404 69 (L) co I 5477 7024 1548 GTG TAA tRNA-Ser (UCN) 7100 7030 69 (L) tRNA-Asp 7105 7174 70 co I1 7180 7870 691 ATG T" tRNA-Lys 7871 7943 73 ATPase 8 7944 8111 168 ATG TAA ATPase 6 8102 8763 662 ATG TA- co I11 8764 9549 786 ATG TAA tRNA-Gly 9550 9619 70 NADH 3 9620 9968 349 ATG T" tRNA-Arg 9969 10,036 68 NADH 4L 10,038 10,334 297 ATG TAA NADH 4 10,328 11,708 1381 ATG T" tRNA-His 11,709 11,777 69 tRNA-Ser (A0 11,778 11,844 67 tRNA-Leu (CUN) 11,845 11,917 73 NADH 5 11,918 13,753 1836 ATG TAA NADH 6 14,270 13,750 519 (L) ATG AGA tRNA-Glu 14,340 14,271 70 (L) Cyt b 14,343 15,484 1142 ATG TA- tRNA-Thr 15,485 15,556 72 tRNA-Pro 15,626 15,558 67 (L) Control region 15,627 16,407 78 1

(for exceptions see below). Overlapping positions (two open PHYLIP (version 3.55) (F84 model, FELSENSTEIN 1989), MOL- reading frames) in several genes (ATPaset?/ATPaseG, ND4L/ PHY version 2.2 (ADACHI and HASEGAWA 1992), and PAUP* ND4, ND5/NDb) were duplicated inthe analyses. Gaps resulting version d54 (SWOFFORD1997). Robustness of theinferred from the alignment were treated as missing data. The control was tested by bootstrapping (FELSENSTEIN 1985) (as im- region was also excluded from the analyses due to its fast rate plemented in PAUP version 3.1.1., PAUP*, and PHYLIP with of evolution which prevented reliable alignment and made it 100 pseudoreplications each). not appropriate for this phylogenetic question. Statistical methods: The statistical confidence of MP analy- Three data sets (protein, rRNA, and tRNA coding genes) ses was evaluated by calculating the standard deviation of the were combined and subjected to the maximum-parsimony difference in number of steps between the resulting most (MP) method (PAUP version 3.1.1, SWOFTORD1993; and parsimonious trees and the two alternative hypotheses using PAUP* version D54; SWOFFORD1997), usingheuristic the method of TEMPLETON(1983) as implemented in PHY- searches (TBR branch swapping; MULPARS option in effect) LIP. Similarly, the statistical confidence of the resulting best with 10 random stepwise additions of taxa to find the most of the ML analysis with respect to competinghypotheses parsimonious trees. Neighbor-joining (NJ) (SAITOU and NEI was assessed by calculating the standard deviation of the dif- 1987) (based on Kimura-corrected distance matrices, jumble ference in log-likelihood between the resulting best tree and option in effect) and maximumlikelihood (ML) (transver- the alternative hypotheses using the formula of KISHINO and sions were given double the weight of transitions; empirical HASEGAWA (1989) as implemented in PHYLIP, PAUP*, and base frequencies and five random stepwise addition of taxa MOLPHY. In both cases, competing trees were declared sig- were used) analyses of the sequences were performed with nificantly different if the difference in number of steps or Coelacanth Mitochondrial DNA 1005

TABLE 3 canth mitochondrial genome (16,407 bp) is depicted Base composition of vertebrate mitochondrial in Figure 2. The organization of the coelacanth mito- chondrialgenome conforms to the consensus verte- A C A C T brate mitochondrial geneorder (Figures 1 and 2, Table 2).As in other vertebrates, two rRNAs, 22 tRNAs and 13 Proteins proteins are encoded by the coelacanth mitochondrial Tetrapods 1 30.7 25 21.2 23.1 genome. The overall base composition of the L strand 2 19.3 27.1 12.4 41.2 is A 34%; T: 24%; C: 27%; and G 15%. As in other 3 41 31.9 4.8 22.2 vertebrate mitochondrial genomes, guanine is the rar- Total 30.3 28 12.8 28.9 est nucleotide whereas adenine is the most frequent Lungfish (MEYER 1993). A more detailed analysis of the base 1 27.6 25.4 23.6 23.4 composition was performed by considering the rRNA, 2 18.4 27.2 13.3 41.1 the tRNA and the protein codinggenes separately (Ta- 3 34.7 28.5 8.4 28.4 Total 26.9 27 15.1 31 ble 3). In coelacanth protein-coding genes, there is an Coelacanth anti-G bias in third codon positions and pyrimidines 1 30.2 26 23.2 20.6 are overrepresented in second codon positions, as was 2 18.4 27.3 13.3 41 noted before for other vertebrate mitochondrial ge- 3 47.7 28.3 6.9 17.1 nomes (NAYLORet al. 1995). Coelacanth tRNAs are A+T Total 32.4 27.1 14.5 26 rich (57%) whereas the rRNAshave a high adenine content (Table 3). The noncoding intergenic spacer 1 26.2 26.6 26.2 21 2 18.5 27.4 13.8 40.3 regions (22 bp), which are likely not subjected to strong 3 37.8 33.2 8.1 20.9 selection, showed an A+C bias (73%). This indicates Total 27.5 29.1 16 27.4 that, as is the case in other vertebrate mitochondrial genomes, an asymmetrical directional mutation pres- 1 29.9 23.7 23 23.4 sure may be operating in the coelacanth genome (JER- 2 18.9 27 12.8 41.3 MIIN et al. 1995). 3 42.7 27.8 4.4 25.1 Noncodingsequences: The control region in the Total 30.5 26.2 13.4 29.9 coelacanth mitochondrial genome is 781 bp long and 1 30.4 22.9 22.6 24.1 65% A+T rich (Figure 2). The coelacanth mitochon- 2 19 26.5 12.9 41.6 drial control region is characterized by the presence of 3 41.3 21.5 3.8 33.4 four 22-bp tandem repeats in the right domain, close Total 30.2 23.7 13.1 33 to the 3' end. Of these, threeare perfect repeats whereas one is imperfect (six nucleotides are different). tRNAs Even with the presence of these repeats, the coelacanth Tetrapods 31.9 18.6 19.8 29.7 Lungfish 28.3 20.9 23.6 27.2 mitochondrial control region is, so far,the shortest Coelacanth 30 20.8 21.8 27.4 among vertebrates (excluding the atypical control re- Teleosts 28.2 21.3 23.5 27 gion of the lamprey) (LEEand KOCHER1995). Neither Bichir 30.7 18.8 20.8 29.7 conserved sequence blocks (CSBs, WALBERGand CLAY- Lamprey 30.4 18.8 20.4 30.4 TON 1981; SOUTHERN et al. 1988; DILLONand WRIGHT 1993) nor termination associated sequences (TASs, rRNAs DODA et al. 1981; FORANet al., 1988), which are com- Tetrapods 35.4 23.7 17.7 23.2 monly found in other vertebrate mitochondrial control Lungfish 33 22.7 20.1 24.2 Coelacanth 36.7 24.9 18.6 19.8 regions, could unambiguously be identified in the coe- Teleosts 33.4 25.4 21.5 19.7 lacanth mitochondrial control region. Putative CSB-I1 Bichir 34.5 22 19.9 23.6 and -111 motifs, sharing limited sequence similarity to Lamprey 36.1 22.6 17.7 23.6 the human and mouse consensus sequence (WALBERG Tetrapods: human, Bluewhale, , chicken, and and CLAY~~N1981), were tentatively identified at posi- frog; teleosts: Rainbow trout, , and loach. tions 16,308 and 16,324, respectively (see Figure 2). The origin of light strand replication (0,) of the Pag-likelihoods where found to be >1.96 times the standard coelacanth mitochondrial genome is, as in most verte- deviations (FELSENSTEIN 1989). brates, located in a cluster of five tRNA genes (WANCY The complete mtDNA sequence of the coelacanth has been deposited at the EMBL/GenBank data libraries under acces- region) (but see SEUTINet al. 1994) (Figure 2). This sion no. U82228. region is 26 nucleotides in length and has the potential to fold into a stem-loop secondary structure. The fold- RESULTS ANDDISCUSSION ing of the OL does not require the use of part of the Genomeorganization and basecomposition: The adjacent tRNAm as has been described for the lungfish complete L-strand nucleotide sequence of the coela- mitochondrial genome (ZARDOYA and MEYER 1996a). 1006 and R. Zardoya A. Meyer

TABLE 4 Amino acid composition of vertebrate mitochondrial proteins

T etrapods Lungfish Coelacanth TeleostCoelacanth Lungfish Tetrapods Bichir AverageLamprey Ala 260(GCC) 312 (GCC) 286 (GCA) 344 (GCC) 307 (GCC) 297(GCC) 301 (GCC) Arg 67 (CGA) 71 (CGA) 73 (CGA) 78 (CGA) 74 (CGA) 68 (CGA) 72 CGA) Asn 151 (AAC) 143 (AAY) 147 (AAC) 117 (AAC) 147 (AAC) 143 (AAT) 141 AAC ) ASP 67 (GAC) 71 (GAC) 71 (GAC) 76 (GAC) 73 (GAC) 67 (GAC) 71 GAC ) CY s 27(TGC) 28 (TGY) 25 (TGC) 26 (TGC) 29 (TGC) 40 (TGT) 29 TGC) Gln 93 (CAA) 100 (CAA) 103 (CAA) 100 (CAA) 98 (CAA) 98 (CAA) 99 CAA) Glu 92 (GAA) 93 (GAA) 102 (GAA) 102 (GAA) 94 (GAA) 93 (GAA) 96 GAA ) GlY 215 (GGA) 246 (GGA) 239 (GGA) 249 (GGA) 226 (GGA) 213 (GGA) 231 GGA ) His 100 (CAC) 93 (CAC) 103 (CAC) 105 (CAC) 98 (CAY) 106 (CAC) 101 ( CAC) Ile333(ATY) 318 (ATT) 285 (ATC) 281 (ATY) 356 (ATT) 334 (ATT) 318 ( ATT) Leu 626(CTA) 650 (CTW) 622 (CTA) 637 (CTA) 611 (CTA) 610 (YTA) 626 (CTA) LYS 93 (AAA) 72 (AAA) 84 (AAA) 70 (AAA) 82 (AAA) 99 (AAA) 83 (AAA) Met 210 (ATA)210Met 176 (ATA) 224 (ATA) 164 (ATA) 212 (ATA) 217 (ATA) 200 (ATA) Phe 226(TTC) 242 (TTT) 214 (TTC) 225 (TTY) 214 (TTY) 235 (TTT) 223 (TTY) Pro 211(CCM) 212 (CCA) 206 (CCA) 209 (CCM) 203 (CCA) 202 (CCA) 207 (CCA) Ser288(TCM) 258 (TCM) 226 (TCA) 235 (TCM) 256 (TCA) 263 (TCW) 254 (TCM) Thr 326 (ACM) 295 (ACA) 359 (ACA) 305 (ACC) 299 (ACA) 306 (ACA) 315 (ACA) TrP (TGA)107 116 (TGA) 121 (TGA) 120 (TGA) 116 (TGA) 109 (TGA) 114 (TGA) TY r 125(TAY) 120 (TAY) 117 (TAY) 114 (TAC) 118 (TAY) 108 (TAT) 117 (TAY) Val 169 (GTA) 172 (GTA) 183 (GTA) 222 (GTA) 174 (GTA) 192 (GTW) 185 (GTA) stop 13 (TAA) 13 (TAA) 13 (TAA) 13 (TAA) 13 (TAA) 13 (AGA/TAA) 13 (TAA) Total 3799 3801 3803 3792 3800 3813 3801 Tetrapods: human, Blue whale, opossum, chicken, and frog; : Rainbow trout, carp, and loach. The most commonly used codon for each amino acid is shown in parentheses.

The coelacanth 012loop, unlike other fish which have size variability in their DHU and T+C arms when com- a polypyrimidine tract, but similar to tetrapods, con- pared with other vertebrate mitochondrial tRNAs. tains a stretch of thymines that is needed for the initia- The coelacanth mitochondrial genome contains 13 tion of L-strand synthesis (WONGand CLAWON1985). protein-coding genes (Figures 1 and 2); in two cases Coding sequences: The coelacanth 12s and 16s there is a reading-frame overlap on the same strand rRNA genes are 983 and 1665 nucleotides long, respec- (ATPases 8 and 6 share 10 nucleotides; ND4L and ND4 tively (Table 2). The secondary structure of both rRNA overlap by seven nucleotides). All coelacanth mitochon- genes appears to be reasonably conserved in respect drial proteincoding genes begin with a ATG start co- to those of other vertebrates. Most of our rRNA gene don except COI, which starts with GTG (Table 2). This sequence showed only minor differences (99.58% simi- initiation codon usage is also found in all other fish larity) to that previously reported (HEDGESet al. 1993) mitochondrial genomes that have been completely se- with the exception of the last 332 bp of the 16s rRNA quenced so far (TZENGet al. 1992; CHANGet al. 1994; gene sequence, which differed by 20%. These 332 bp LEEand KOCHER 1995; ZARDOYAet aL 1995; JOHANSEN correspond to the last PCR fragment amplified by and BAKKE1996; NOACKet al. 1996; ZARDOYA and MEYER HEDGESet al. (1993) and a search in GenBank revealed 1996a). Most coelacanth ORFs end withTAA (NDl, that the sequenceof this PCR fragment had96.7% simi- ND2, COI, ATPase 8, COIII, ND4L and ND5), one ends larity to that sequence of the 16s rRNA gene with AGA (AD@, and the rest have incomplete sequence(HEDGES 1994). This andother reported stop codons, either T (COII, ND3, ND4) or TA (C't 6) cases of apparently scrambled sequences during data (Table 2). preparation can be recognized and prevented by indi- The codon usage of the coelacanth is similar to that vidually checking PCR amplified fragments for compati- of lamprey (LEE and KOCHER 1995), bichir (NOACKet bility in recovering congruent phylogenetic trees (see al. 1996),carp, trout and loach (TZENGet al. 1992; EDWARDSand ARCTANDER, 1997). CHANGet al. 1994; ZARDOYA et al. 1995), lungfish (ZAR- All 22 coelacanth tRNA gene sequences can be folded DOYA and MEYER 1996a), and tetrapods (e.g., et into acanonical cloverleaf secondary structure with the al. 1985; DESJARDINSand MOMS 1990; ARNASON and exception of the tRNAser'AGU)(data not shown). Two GULLBERG1993; HORAIet al. 1995) (Table 4). A total mismatched base pairs in the stems of these putative of 3803 amino acids are encoded by the coelacanth cloverleaf secondary structures were detected on aver- mitochondrial genome. As in other vertebrates, the age for each tRNA. The coelacanth tWAs ranged in most abundant aminoacid residue encoded by the coe- size from 65 to 76 nucleotides (Table 2) and showed lacanth mitochondrial genome is leucine, whereas the Coelacanth Mitochondrial DNA 1007

A B

100 100

100 Opossum -100 Opossum 100 -96 Chicken - Chicken 97 Frog - Frog

" Lungfish

Coelacanth

Loach Loach

Carp &carp

Rainbow trout Rainbow trout FIGURE3.-Phylogenetic position of the coelacanth. A data set combining all (proteincoding, rRNA and tRNA) mitochondrial genes was analyzed with maximum parsimony (A), neighbor-joining and maximum likelihood (B) phylogenetic methods. Num- bers shown above branches represent bootstrap values from 100 pseudoreplicates for MP (A) and NJ (B). Rainbow trout, carp, and loach were used as outgroup taxa. Third codon positions and transitions in first codon positions of protein coding genes were excluded from the MP phylogenetic analysis. The NJ analysis was performed using Kimura two-parameter distances, an a = 0.53 (gamma distribution), and excluding third codon positions. The ML analysis was performed with a transition: transversion ratio of 21, using the F84 model implemented in PHYLIP version 3.55 (FEUENSTEIN 1989), which accounts for different base frequencies, and excluding third codon positions. rarest is cysteine (Table 4). Adenines and cytosines are fied as the sister group of tetrapods (Figure 3A). This preferentially used in third codon positions of the coe- is the same hypothesis as that which is favored by the lacanth mitochondrial protein codinggenes, indicating largest available nuclear DNA data set, the complete that codon usage is probably influenced by strand-spe- 28s rRNA gene sequences ( ZARDOYAand MEYER199613). cificbase composition mutational bias (FLOOKet al. The node grouping thecoelacanth/lungfish clade with 1995;JERMIIN et al. 1994; LEEand KOCHER1995). Alter- tetrapods was supported by a 56% bootstrap value (Fig- natively, third codonposition usage may be determined ure 3A). However, if the same data set was analyzed by availability of ribonucleotides in the mitochondria with MP and transitions in first positions of the protein- (XIA1996). coding genes were included in the analysis or a transver- Phylogenetic positionof the coelacanth: To asses the sion:transition weighting of 2:l was assumed for the position of the coelacanth with respect to lungfishes whole data set, a lungfish/tetrapod clade was favored and tetrapods, a combined data set comprising rRNA, with low or moderate bootstrap support (51 and 6876, tRNA and protein coding nucleotide sequences of the respectively). human (HORAIet al. 1995), blue whale (ARNMON and For the NJ analysis, to account for the variation of GULLBERG1993), opossum UANKE et al. 1994), chicken substitution rates among sites, the CY shape parameterof (DESJARDINSand MORAIS 1990), frog (ROEet al. 1985), the gamma distribution of rate variation was estimated carp (CHANGet al. 1994), loach (TZENGet al. 1992), based on the MP tree (Figure 3A)by the method of rainbow trout (ZARDOYA et al. 1995), African lungfish YANG and KUMAR (1996). A NJ analysis of the same (ZARDOYA and MEYER 1996a), and coelacanth mito- data set, excluding third codonpositions of the protein- chondrial genomes was constructed; it is composed of coding genes, with teleosts as outgroup taxa, using Ki- a total of 16,140 characters. This data set was analyzed mura 2-parameter distances, and an CY = 0.53, arrived with the three most commonly used methods of phylo- at a tree in which lungfishes are placed as the sister genetic inference, i.e., maximum parsimony (MP; FITCH group of tetrapods(97% bootstrap value for the 1971), neighbor-joining (NJ; SAITOUand NEI 1987), lungfish+tetrapod node) (Figure 3B). The same topol- and maximun likelihood (ML; FELSENSTEIN1989). ogy was recovered when a ML analysis (F84 model, When third codon positions and transitions in first 2: 1 transition: transversion ratio; Ln likelihood = codon positions of the protein-coding gene sequences -61,846.08) was performed excluding third codon po- were excluded from the analysis and teleosts (trout, sitions of the protein-coding genes and including tele- carp and loach) were included as outgroup taxa, MP osts as outgroup taxa. In this case, all branch lengths recovered one most parsimonious tree (8468 steps, C.I. were found to be significantly greater than zero (P < = 0.65) in which a coelacanth/lungf%h clade is identi- 0.01). 1008 R. Zardoya and A. Meyer

The disagreement between methods of phylogenetic tionships, with special reference to the . Zool. 1. Linn. Soc. 103: 241-288. inference was investigated with KISHINO and HASEGAWA AHLBERC, P. E., J. A. CLACK and E. LUKSEVICS,1996 Rapid braincase (1989) and TEMPLETON(1983) tests. None of the three evolution between and the earliest tetrapods. Na- possible hypotheses, i.e., lungfish as sister group of tetra- ture 381: 61 -63. ARNASON,U., and A. GULLBERG,1993 Comparison between the pods (Ln likelihood = -61,846.08, best ML tree; 10,812 complete mtDNA sequences of the blue and the whale, two steps, best MP tree), coelacanth as sister group of tetra- species that can hybridize in . J. Mol. Evol. 37: 312-322. pods [Ln likelihood = -61,863.43, ALnL = 17.35 2 BEMIS,W. E., and R. G. NORTHCUTT,1991 Innervation of the basi- cranial muscle of Latimeria chalumnue. Environ. Biol. Fishes 32 22.77 (SD); 10,832 steps, Asteps = 20 2 15.431, or 147-158. coelacanth as sister group of lungfish (Ln likelihood = BET%,U. A., W. E. MAWR and J. KLEIN, 1994 Major histocompatibil- -61,859.78, ALnL = 13.70 2 22.69; 10,813 steps, ity complex class I genes of the coelacanth Latimm'n rhalumnap. Proc. Natl. Acad. Sci. USA 91: 11065-11069. Asteps = 1 2 16.03), could be statistically rejected BRUTON,M. N., and S. E. COUTOUVIDIS,1991 An inventory of all based on the mitochondrial DNA data set when third known specimens of the coelacanth 1,atimm'a rhalumnue, with codon positions of mitochondrial protein-coding genes commentson trends in the catches.Environ. Biol. Fishes 32: 371 -390. were excluded from the analyses and teleosts were used CARROLL,R. L., 1988 Vertebrate Paleontology and Evolution. Freeman, as outgroup taxa. In a separate publication, we will re- New York. port in more detail on the conflicting phylogenetic sig- CHANG,M. M., 1991 Rhipidistians, pp. 1-28 in Origins of thr Higher Cfloup.r of Tetrupods. Controversy and Consensus, edited by H. P. nal that is contained in the tRNA, rRNA, and protein SC:H~U.'IXEand L. TRUEB. CornellUniversity Press, Ithaca, NY. mitochondrial data sets and that may be one of the CHANG,Y. S., F. L. HUANC:and T. B. LO, 1994 The completenucleo- major causes that explain why none of the threehypoth- tide sequence and gene organization of carp (Cypn'nxs rarpio) mitochondrial genome. J. Mol. Evol. 38: 138-155. eses can be statistically ruled out (R. ZARDOYA, Y. CAo, CIACK,J.A,, 1994 Earliest known tetrapod braincase and the evolu- M. HASEGAWAand A. MEmR,unpublished data). tion of the stapes and fenestra ovalis. Nature 369 392-394. Our evolutionary analyses confirmedthe phyloge- CLOLJTIER,R., and P.E. AHLBERG,1996 Interrelationships of sarcopterygians, pp. 445-479 in Intmelationships ofFishes, edited netic position of the coelacanth as a closer relative to by M. I,. J. STIASSNY,L. R. PARENTIand G. D. JOHNSON.Academic tetrapods and other sarcopterygians than to the ray-fin Press, San Diego. fishes (). These results indicate that the CLOUTIER,R., and P. L. FOREY,1991 Diversity of extinct and living actinistian fishes (Sarcopterygii). Environ. Biol. Fishes 32: 59- rate of evolution of the mitochondrial genomeis appro- 74. priate for resolving relationships even among ancient COUKTENAY-LATIMER,M., 1979 My story of the first coelacanth. Oc- lineages (at least up to the Devonian). However, it cas. Pap. Calif. Acad. Sci. 134: 6-10. CUMMINGS,M. P., S. P. Om0 and J. wAI(El.EY, 1995 Sampling prop- seems to fall short of recovering and strongly support- erties of DNA sequence data in phylogenetic analysis. Mol. Biol. ing the relationships among the extantlineage of lobe- EvoI. 12: 814-822. finned fishes that originated within a narrow window DES~RDINS,P., and R. MOMS, 1990 Sequence and gene organiza- tion of the chicken mitochondrial genome. J. Mol.Biol. 212: in time -400 mya. The rapid origin of the various lobe- 599-634. finned lineages and their rapid radiation before the DEVEREUX,J., P. HAEBERIJ and0. SMITHIES,1984 A comprehensive origin of tetrapodscontinues to make it difficult to set of sequence analysis programs for the VAX. Nucleic Acids Res. 12: 387-395. resolve their relationships with confidence. DII.I.ON,M. C., and J. M. WRIGHT, 1993 Nucleotide sequence of the D-Loop Region of the sperm whale (Physeter macrocephalus) mito- We thank and an anonymous reviewer for provid- Scorn EDWARDS chondrial genome. Mol. Biol. Evol. 10: 296-305. ing helpful suggestions on the manuscript. The coelacanth tissue was DODA,J. N., C. T. WRIGHTand D. A. CLAYTON,1981 Elongation of kindly provided by ROBERTMURPHY (). P.J. displacement loop strands in human and mouse mitochondrial PERI.assisted in the cloning and automated sequencing.DAVID SWOF- DNAis arrested near specific template sequences. Proc. Nd. FORD kindly granted permission to publish resulo based on the test Acad. Sci. USA 78: 6116-6120. version d.54 of his PAUP* program. R.Z. was sponsored by a postdoc- EDWARDS,S. V., and P. ARCTANDER, 1997 Congruence and phyloge- toral grant of the Ministerio de Educacion y Ciencia of Spain. This netic re-analysis of perching cytochromr b sequences. Mol. work received partial financial support fromgrants from the National Phyl. Evol. 7: 266-271. FEISEKS-KIN,J., 1985 Confidence limits on phylogenies: an ap Science Foundation (BSR-9107838, BSR-9119867,and DEB-9615178) proach using the bootstrap. Evolution 39: 783-791. and from a collaboration grant with the Max-Planck-Institut fur Biol- FELSENSTEIN,J., 1989 PHYLIP-phylogeny inference package (Ver- ogy in Tubingen from the Max-Planck Society, Germany. This publica- sion 3.4.). 5: 164-166. tion was prepared during A.M.'S tenure as Miller Visiting Research FITCH,W.M., 1971 Toward defining the course of evolution: mini- Professor and a Guggenheim Fellow at the University of California mal change for a specific tree topology. Syst. Biol. 20: 406-416. at Berkeley. The support of the John Simon Guggenheim Founda- FI.OOK,P. K., C. H. F. RowEu and G. GELIJNSEN,1995 The se- tion, and the Miller Institute and the hospitality of the Museum of quence organization and evolution of the Locusta migratmiu mito- Vertebrate Zoology and the Departments of Integrative Biology and chondrial genome. J. Mol. Evol. 41: 928-941. FORAN,D. R., J. E. HIXSONand W. M. BROWN,1988 Comparisons Molecular and Cell Biolom is gratefully acknowledged. of ape and human sequences that regulate mitochondrial DNA transcription and D-loop DNA synthesis. Nucleic Acids Res. 16: LITERATURE CITED 5841-5861. FOREY,P. L., 1987 Relationships of lungfishes. J. Morphol. 1 Suppl.: ADACHI, J.,and M. HASEGAWA,1992 MOLPHY: Programs for Molecular 75-91. Phyhgenetics I-PROTML: Maximum Likelihood Inference of Protein FORF;Y,P. L., 1988 Golden jubilee for the coelacanth Latimrria cha- Phylogeny Institute of Statistical Mathematics, Tokyo. lumnue. Nature 336 727-732. AGASSIZ,L., 1844 Rerherches sur Zes Poissons Fossiles, Imprimerie de FOREY, P. L., 1991 1,atirnm'a chalumnae and its pedigree. Environ. Petitpierre, Neuchatel. Biol. Fishes 32: 75-97. AHI.BER(:,P. E., 1991 A re-examination of sarcopterygian interrela- FRICKL,H., 1992 Coelacanth tissue bank. Nature 357: 105. Coelacanth Mitochondrial DNA 1009

FRICKE,H., K. HISSMANN,J. SCHAUER and R. PLANTE, 1995Yet more NAYLOR,G. J., T. M. COLLINSand W.M. BROWN,1995 Hydropho- danger for coelacanths. Nature 374: 314. bicity and phylogeny. Nature 373: 555-556. FRITZSCH,B., 1987 Inner ear of the coelacanth fish Latim'a has NOACK,K., R. ZARDOYA and A. MEYER,1996 Thecomplete mito- tetrapod affinities. Nature 327: 153-154. chondrial DNA sequence of the bichir (Polypterus matipinnis), GELFAND,D. H. and WHITE,T.J. 1990 Thermostable DNA polymer- a basal ray-finned fish: ancient establishment of the consensus ases, pp. 129-141 in PCR Protocols: A Guide to Methods and Applica- vertebrate gene . Genetics 144: 1165-1180. tions, edited by M. A. INNIS, D. H. GELFAND,J. J. SNISNKYand T. J. NORTHCUTT,R. G., 1987 Lungfish neural characters and their bear- WHITE.Academic Press, San Diego. ing on sarcopterygian phylogeny. J. Morphol. 1 (Suppl.): 277- GORR, T., T. KLEINSCHMIDT andH. FRICKE,1991 Close tetrapod 297. relationships of the coelacanthLatimeria indicated by haemoglo- PALUMBI,S.R., A. MARTIN, S. ROMANO,W. 0. MCMILLAN,L. STICEel bin sequences. Nature 351: 394-397. al., 1991 The Simple Fool's Guide to PCR Department of Zoology, HEDGES,S. B., 1994 Molecular evidence for theorigin ofbirds. Proc. University of Hawaii, Honolulu. Natl. Acad. Sci. USA 91: 2621-2624. PANCHEN,A. L., and T. R. SMITHSON,1987 Character diagnosis, fos- HEDGES,S. B., C. A. HASS and L. R. MAXSON, 1993 Relations of fish sils and the origin of tetrapods. Biol. Rev. 62: 341-438. and tetrapods. Nature 363: 501-502. PECHERE,J. F., H. ROCHATand C. FEW, 1978 Pawalbumins from HILLIS,D. M., M. T. DIXONand L. K. AMMERMAN, 1991 The relation- coelacanth muscle. I1 Amino acid sequence of the two less acidic ships of the coelacanth Latimen'a chalumnae: evidence from se- components. Biochim. Biophys. Acta 536: 269-274. quences of vertebrate 28s ribosomal RNA genes. Environ. Biol. ROE, B. A,, M. DIN-POW,R. K. WIISONand J. F. WONG, 1985 The Fishes 32: 119-130. complete nucleotide sequence of the Xenopus lamis mitochon- HORN, S., K. HAYASAKA, R.KONDO, K. TSUGANEand N. TAKAHATA, drial genome. J. Biol. Chem. 260: 9759-9774. 1995 Recent African origin of modernhumans revealed by ROMER,A. S., 1966 VertebratePaleontology. University of completesequences of hominoid mitochondrial . Proc. Press, Chicago. Natl. Acad. Sci. USA 92:532-536. ROSEN,D. E., P. L. FOREY,B. G. GARDINER andC. PATTERSON,1981 HUXLEY,T. H., 1861 Preliminary essay upon the systematic arrange- Lungfishes, tetrapods, paleontolgy, and plesiomorphy. Bull. Am. ment of the fishes of the Devonian epoch. Figures and descrip Natl. Mus. Nat. Hist. 167: 159-276. tions illustrative of British organic remains. Mem. Geol. Surv. RUSSO,C. A. M., N. TAKEZAK~and M. NEI, 1996 Efficiencies of differ- U.K. Dec. 10: 1-40. ent genes and different tree-building methods in recovering a JANKE,A,, G. FEIDMAIER-FUCHS,K. THOMAS, A. VON HAESELERand known vertebrate phylogeny. Mol. Biol. Evol. 13 525-536. S. PMRO, 1994 The marsupial mitochondrial genome and the SAITOU,N., and M. NEI, 1987 The neighbor-joining method: a new evolution of placental . Genetics 137: 243-256. method for reconstructing phylogenetic trees. Mol. Biol. Evol. JERMIIN, L. S.,D. GRAUR,R. M. LOWEand R. H. CROZIER,1994 Analy- 4: 406-425. sis of directional mutation pressure and nucleotide content in SCHIJEWEN,U., H. FRICKE, M. SCHARTL,J. T. EPPLENand S. PMno, mitochondrial qtochrome b genes. J. Mol. Evol. 39: 160-173. 1993 Which home for the coelacanth! Nature 363: 406. JERMIIN,L., D. GRAURand R. H. CROZIER,1995 Evidence from analy- SCHLKTLE,H. P., 1987 Dipnoans as sarcopterygians. J. Morphol. 1 ses of intergenic regions for strand-specific directional mutation (Suppl.): 39-74. pressure in metazoan mtDNA. Mol. Biol. Evol. 12 558-563. SCHULTZE,H. P., 1994 Comparison of hypotheses on the relation- JOHANSEN,S., and I. BAKKE,1996 The complete mitochondrial DNA ships of sarcopterygians. Syst. Biol. 43: 155-173. sequence of Atlantic (Gadus morhua):relevance to taxonomic SCHULTZE,H. P., and R. CLOUTIER,1991 Computed tomography studies among codfishes. Mol. Mar. Biol. Biotech. 5: 203-214. and magnetic resonance imaging studies of Latimen'a chalumnae. KISHINO, H., and M. HASEGAWA,1989 Evaluation of the maximum Environ. Biol. Fishes 32: 159-181. likelihood estimate of the evolutionary tree topologies from DNA SEUTIN,G., B. F. LANG, D. P. MINDEI'I.and R. MORAIS, 1994 Evolu- sequence data, and the branching orderin Hominoidea. J. Mol. tion of the WANLY region in mitochondrial DNA. Mol. EvoI. 29: 170-179. Biol. Evol. 11: 329-340. KOCHER,T. D., W. K. THOMAS,A. MEYER,S. V. EDWARDS, S. PUBO et SMITH,J. L. B., 1939 A living fish of type. Nature 143: 455- al., 1989 Dynamics of mitochondrial DNA evolution in . 456. Proc. Natl. Acad. Sci. USA 86: 6196-6200. SMITH, J.L. B., 1956 Old Fourlegst The Stoly of the Coelacanth. Long- KOLB,E., J. I. HARRISand J. BRIDGEN,1974 Triose phosphate iso- mans, London. merase EC-5.3.1.1 from the coelacanth: an approach to the rapid SOUTHERN,S. O., P. J. SOUTHERNand A. E. DIZON,1988 Molecular determination of an amino-acid sequence with small amounts of characterization of a cloned dolphin mitochondrial genome. J. material. Biochem. J. 137: 185-197. Mol. Evol. 28 32-42. LEE, W. J., and T. D. KOCHER,1995 Complete sequence of a sea STENSIO,E. A,, 1921 Fishesfiom Spitrbergen. Holzhausen, Vi- lamprey (Petromyzon marinus) mitochondrial genome: early estab enna. lishment of the vertebrate genome organization. Genetics 139: STOCK,D. W., K. D. MORERG,L. R. MAXSON and G. S. WHITT, 1991 873-887. A phylogenetic analysis of the 18s ribosomal RNA sequence of LONG,J. A., 1995 The Rise of Fishes: 500 Million Years of Euolution. coelacanth Latimeria rhalumnae. Environ. Biol. Fishes 32: 99- 117. The John Hopkins University Press, Baltimore. SWOFFORD,D. L., 1993 PAUP: Phylogenetic Analysis Using Parsimony. IMADDISON,W. P., and D. R. MADDISON,1992 MacClade: Analysis of Illinois Natural History Survey, Champaign, IL. Phylogeny and CharacterEuolution. Sinauer Associates, Sunderland, SWOFFORD, D.L., 1997 PAUP*: Phylogenetic Analysis Using Parsimony MA. (*and other mpthods), version 4.0. Sinauer Associates, Sunderland, MAISEY, J.G., 1996 Discovering Fossil Fishes. Holt, New York. MA. MANGUM,C. P., 1991 Urea and chloride sensitivities of coelacanth TAMAI,Y., H. KOJIMAand K. TAKAYW-ABE,1994 Lipids and myelin hemoglobin. Environ. Biol. Fishes 32: 219-222. proteins in the brains of coelacanth (Latimeria chalumnae), - MEYER,A,, 1993 Evolution of mitochondrial DNA in fishes, pp. 1- fish (Lrpidosirenparadoxa and Protoptm aetiopicus), bichir 38 in Biochemistly and Molecular Biology ofFishes, edited by P. W. (Polyptms senegalus), and (Aripenser ruthenus) (Osteich- HOCHACHKAand. T. P. MOMMSEN.Elsevier Science, New York. thyes): phylogenetic implications. Can. J. Fish. Aquat. Sci. 51: MEER, A., 1995 Molecular evidence on the origin of tetrapods and 1265-1272. the relationships of the coelacanth. Trends Ecol. Evol. 10: 111 - TEMPLETON,A. R., 1983 Phylogenetic inference from restriction en- 116. donuclease cleavage site maps with particular reference to the MEYER,A,, and S. I. DOLVEN,1992 Molecules, fossils and the origin evolution of human and the apes. Evolution 37: 221 -244. of tetrapods. J. Mol. Evol. 35: 102-113. THOMPSON,J. D., D. G. HICXXNSand T. J. GIBSON,1994 CLUSTAL MEYER,A. A,, and A. C. WIISON,1990 Origin of tetrapods inferred W improving the sensitivity of progressive multiple sequence from their mitochondrial DNA affiliation to lungfish. J. Mol. alignmentthrough sequence weighting, position specific gap EvoI. 31: 359-364. penalties and weight matrix choice. Nucleic Acids Res. 22: 4673- MILLO'r, J., 1954 New facts about coelacanths. Nature 174 426- 4680. 427. THOMSON,K. S., 1966 Intercranial mobility in the coelacanth. Sci- MILL.OT,J., 1955 The coelacanth. Sci. Am. 193: 34-39. ence 153 999-1000. 1010 R. Zardoya and A. Meyer

TOWER,P., 1991 Purification of DNA, pp. 47-68 in EssentialMolecu- YANG,Z., and S. KUMAR,1996 Approximate methods for estimating lar Biology. A Practical Approach, edited by T. A. BROWN,Oxford the pattern of nucleotide substitution and thevariation of substi- University Press, Oxford. tution rates among sites. Mol. Biol. Evol. 13 650-659. TZENG,C. S., C. F. HUI, S. C. SHEN and P.C. HUANG,1992 The YOKOBORI,A. I., M. HASEGAWA,T. UEDA,N. OKADA,K. NISHIKAWA complete nucleotide sequence of the Crossostoma lacustre mito- et al., 1994 Relationship among coelacanths, lungfishes, and chondrialgenome: conservation and variations among verte- tetrapods: a phylogenetic analysis based on mitochondrial brates. Nucleic Acids Res. 20: 4853-4858. cytochrome oxidase I gene sequences.J. Mol. Evol. 38: 602- VOROBYEVA,E., and H. P. SCHULTZE,1991 Description and systemat- 609. ics of panderichthyid fishes with comments ontheir relationship ZARDOYA,R., and A. MEYER, 1996a Thecomplete nucleotide se- to tetrapods, pp. 68-109 in Origins OftheMajor Groups of Tetrapods: quence of the mitochondrial genome of the lungfish (Protoptms Controversies and Consensus, edited by H. P. SCHULTZE,and L. dolloi), supports its phylogenetic position as a close relative of TRUEB,Cornel1 Univ. Press, Ithaca, NY. land vertebrates. Genetics 142: 1249-1263. WALBERG,M. W. and D. A. CLAYTON,1981 Sequence and properties of the human KB cell and mouse L cell D-Loop regions of mito- ZARDOYA,R., and A. MEYER,1996b Evolutionary relationships of the chondrial DNA. Nucleic Acids Res. 9: 541 1-5421. coelacanth, lungfishes, and tetrapodsbased on the 28sribosomal WONG,T. W., and D.A. CLAYTON, 1985 In vitro replication of hu- RNA gene. Proc. Natl. Acad. Sci. USA 93: 5449-5454. manmitochondrial DNA accurateinitiation atthe origin of ZARDOYA, R., and A.MEYER, 1996c Phylogenetic performance of light-strand synthesis. Cell 42: 951-958. mitochondrial protein coding genes in resolving relationships WOODWARD,A. S., 1891 Catalogue of Fossil Fishes in the British Museum among vertebrates. Mol. Biol. Evol. 13: 933-942. (Natural Histmy). British Museum (Natural History), London. ZARDOYA,R., A. GARRIDO-PERTIERRAand J. M. BAUTISTA,1995 The XIA, X., 1996 Maximizing transcription efficiency causes codon us- complete nucleotide sequence of the mitochondrial DNA ge- age bias. Genetics 144: 1309-1320. nome of the Rainbow trout, Oncurhynchus mykiss. J. Mol. Evol. 41: Xu, X. and U. ARNASON,1996 A complete sequence of the mito- 942-951. chondrial genome of the Western lowland gorilla. Mol.Biol. Evol. 13: 691-698. Communicating editor: N. TAKAHATA