
Biologia 64/4: 811–818, 2009
Section Cellular and Molecular Biology
DOI: 10.2478/s11756-009-0121-8

Bayesian inference of cetacean phylogeny based on mitochondrial genomes

Xiao-Guang Yang1,2

1 Department of Physics and Astronomy, McMaster University, Hamilton, Ontario, Canada, L8S 4M1
2 Current address: Department of Biological Engineering, University of Missouri, Columbia, MO 65211 USA, e-mail: [email protected]

Abstract: The phylogeny of Cetacea (whales, dolphins, porpoises) has long attracted the interest of biologists and has been investigated by many researchers using different datasets. However, some phylogenetic relationships within Cetacea remain controversial. In this study, Bayesian analyses were performed for the first time to infer the phylogeny of 25 representative cetacean species from their mitochondrial genomes. The analyses recovered the clades resolved by previous studies and strongly supported most of the current cetacean classifications, such as the monophyly of Odontoceti (toothed whales) and Mysticeti (baleen whales). The analyses provided a reliable and comprehensive phylogeny of Cetacea that can serve as a foundation for further exploration of cetacean ecology, conservation and biology. The results also showed that: (i) the mitochondrial genomes were very informative for inferring the phylogeny of Cetacea; and (ii) the Bayesian analyses outperformed the neighbor-joining and maximum parsimony methods in inferring the mitochondrial genome-based phylogeny of Cetacea.

Key words: Cetacea; Odontoceti; Mysticeti; cetacean phylogeny; Bayesian analysis; mitochondrial genomes.

Abbreviations: mt, mitochondrial; aa, amino acid; nt, nucleotide; NJ, neighbor-joining; MP, maximum parsimony; ML, maximum likelihood; BP, bootstrap percentage; PP, Bayesian posterior probabilities; MCMC, Markov chain Monte Carlo; GTR4, general time-reversible four-state model; GTR2, general time-reversible two-state model; PHASE, Phylogenetics And Sequence Evolution.

© 2009 Institute of Molecular Biology, Slovak Academy of Sciences

Introduction

Cetaceans (whales, dolphins, porpoises) are large mammals that are widely distributed on Earth. They are important species in the ecosystem. Many of them were commercially harvested for a long time, which made the classification of cetaceans based on their morphological characteristics possible. Some of them, such as the Yangtze River dolphin, are endangered species. The cetacean phylogeny is important for understanding cetacean ecology and protecting endangered species. The order Cetacea includes two extant suborders: Odontoceti (toothed whales) and Mysticeti (baleen whales). Odontoceti is composed of four major groups: Physeteroidea (sperm whales), Ziphiidae (beaked whales), Delphinoidea and the polyphyletic river dolphins, while Mysticeti divides into four groups: Eschrichtiidae (gray whales), Neobalaenidae (pygmy right whales), Balaenidae (right whales) and Balaenopteridae (rorquals and humpback whales). The phylogeny of cetaceans has long attracted the interest of evolutionary biologists and has been investigated by many researchers based on different datasets (e.g. morphology, nuclear and mitochondrial DNA, RNA and protein sequences) (Milinkovitch et al. 1993; Rosel et al. 1995; Montgelard et al. 1997; Messenger & McGuire 1998; Rychel et al. 2004; Yan et al. 2005; Nishida et al. 2007; Caballero et al. 2008). Some of the issues have been resolved: there is general agreement that Cetacea, the sister group of Hippopotamidae, is within Artiodactyla (Hasegawa & Adachi 1996; Montgelard et al. 1997; Nikaido et al. 1999; Lum et al. 2000; Thewissen et al. 2001); the monophyly and sister relationship of Odontoceti and Mysticeti were supported by morphological, nuclear and mitochondrial data (Messenger & McGuire 1998; O’Leary & Geisler 1999; Nikaido et al. 2001; Rychel et al. 2004; Yan et al. 2005; May-Collado & Agnarsson 2006; Nishida et al. 2007). It has been widely accepted that Balaenidae is the basal group of Mysticeti (Arnason et al. 1992; Milinkovitch et al. 1994; Arnason & Gullberg 1996; Rychel et al. 2004; Sasaki et al. 2005; May-Collado & Agnarsson 2006; Nishida et al. 2007). However, some issues remain controversial: the relationships among the groups within Odontoceti and Mysticeti (Adegoke et al. 1993; Milinkovitch et al. 1993; Rosel et al. 1995; Arnason & Gullberg 1996; Waddell et al. 2000; Hamilton et al. 2001; Nikaido et al. 2001; Pichler et al. 2001), and the relationships among several subgroups within Balaenopteridae (Arnason et al. 1992; Adegoke et al. 1993; Arnason & Gullberg 1994, 1996; Messenger & McGuire 1998).

Mitochondrial (mt) genomes have become popular in phylogenetic analysis since the mid 1990s for several reasons. The mt-genomes are relatively short and simple compared to the nuclear genomes, and they are easy to amplify and sequence; animal mt-DNA has a relatively fast evolutionary rate (Brown et al. 1979) and is free of recombination (Olivo et al. 1983). In some previous studies, only 12 mt-genomes were included in the phylogenetic analysis of Mysticeti (Sasaki et al. 2005), or 16 mt-genomes in the analysis of Cetacea (Arnason et al. 2004). The methods used in these studies were all standard statistical methods, such as neighbor-joining (NJ), maximum parsimony (MP) and maximum likelihood (ML). In other previous studies, a large number of species were included and Bayesian analyses were performed, but only mt-Cytb (the gene encoding mt-cytochrome b) or several other genes were employed (Rychel et al. 2004; May-Collado & Agnarsson 2006).

In this study, Bayesian analyses were performed for the first time to recover the phylogenetic relationships of 25 representative species within Cetacea based on their mt-genomes. Bayesian analysis has been widely used by researchers for inferring phylogeny (Huelsenbeck et al. 2001; Holder & Lewis 2003). The method has clear computational advantages over standard statistical methods and can be used to explore a complex model space in reasonable time (Huelsenbeck et al. 2001; Holder & Lewis 2003). The clades recovered in the previous studies, such as the monophyly and sister relationship of Odontoceti and Mysticeti, were investigated. The controversial relationships among these suborders were also investigated and discussed. NJ and MP analyses were performed in order to compare the results with those from the Bayesian analyses. The reliability of the phylogeny was judged based on the recovery of the previously resolved and less controversial clades (Gatesy et al. 1996; Gatesy 1997; Ursing & Arnason 1998; Nikaido et al. 1999; Cassens et al. 2000; Nikaido et al. 2001; Rychel et al. 2004; Sasaki et al. 2005). It was also judged by the bootstrap percentage (BP) for the NJ and MP methods and the Bayesian posterior probabilities (PP) for the Bayesian analyses (Suzuki et al. 2002; Erixon et al. 2003; Simmons et al. 2004).

Material and methods

Dataset

The mt-genomes of 25 cetacean species were downloaded from GenBank (Benson et al. 2008) at the Genomes website: http://www.ncbi.nlm.nih.gov/genomes/static/euk_o.html. These species with completed mt-genomes are good representatives of the extant families within Cetacea. The mt-genomes of 7 non-cetacean species, including the hippopotamus, were employed as outgroups. Names and accession numbers of cetacean and non-cetacean species are listed in Table 1 and Table 2, respectively.

Table 1. Cetacean species list.

Suborder/family    Scientific name             English name                  Mitochondrion Acc. No.

Mysticeti
  Balaenidae       Balaena mysticetus          bowhead whale                 NC_005268
                   Eubalaena australis         Southern right whale          NC_006930
                   Eubalaena japonica          Northern right whale          NC_006931
  Balaenopteridae  Balaenoptera acutorostrata  North Atlantic minke whale    NC_005271
                   Balaenoptera bonaerensis    Antarctic minke whale         NC_006926
                   Balaenoptera borealis       sei whale                     NC_006929
                   Balaenoptera brydei         Bryde’s whale                 NC_006928
                   Balaenoptera edeni          pygmy Bryde’s whale           NC_007938
                   Balaenoptera musculus       blue whale                    NC_001601
                   Balaenoptera omurai         Omura’s baleen whale          NC_007937
                   Balaenoptera physalus       fin whale                     NC_001321
                   Megaptera novaeangliae      humpback whale                NC_006927
  Eschrichtiidae   Eschrichtius robustus       gray whale                    NC_005270
  Neobalaenidae    Caperea marginata           pygmy right whale             NC_005269
Odontoceti
  Delphinidae      Lagenorhynchus albirostris  white-beaked dolphin          NC_005278
  Iniidae          Inia geoffrensis            Boutu (Amazon river dolphin)  NC_005276
  Kogiidae         Kogia breviceps             pygmy sperm whale             NC_005272
  Lipotidae        Lipotes vexillifer          Yangtze River dolphin         NC_007629
  Monodontidae     Monodon monoceros           narwhal                       NC_005279
  Phocoenidae      Phocoena phocoena           harbor porpoise               NC_005280
  Physeteridae     Physeter catodon            sperm whale                   NC_002503
  Platanistidae    Platanista minor            Indus River dolphin           NC_005275
  Pontoporiidae    Pontoporia blainvillei      Franciscana dolphin           NC_005277
  Ziphiidae        Berardius bairdii           Baird’s beaked whale          NC_005274
                   Hyperoodon ampullatus       Northern bottlenose whale     NC_005273

Phylogenetic analyses

The phylogenetic analyses were performed at both nucleotide (nt) and amino acid (aa) levels. The 12 protein-coding genes on the same strand of mt-DNA were used. The NADH6 gene was excluded because it is on the opposite strand of mt-DNA and has a different base composition from the other genes. The alignments of all these sequences were obtained using Clustal X (Thompson et al. 1997) and modified carefully by eye using GeneDoc (Nicholas et al. 1997). All positions with ambiguous alignment were excluded. The number of remaining codons of the 12 concatenated genes was 3589.
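As a minimal sketch of the data preparation implied here (not the author's pipeline, and not part of PHASE or PHYLIP), the Python below splits one in-frame row of a concatenated codon alignment into first+second and third codon positions, and recodes third positions as purine (R) or pyrimidine (Y), the two-state coding that a two-state model such as GTR2 operates on. The function names `split_codon_positions` and `ry_recode` and the toy sequence are hypothetical. With 3589 retained codons, each aligned sequence would have 3 × 3589 = 10,767 nucleotide columns, of which 7178 are first or second positions and 3589 are third positions.

```python
# Hypothetical helpers for illustration only; these names do not come from
# PHASE, PHYLIP, or any other tool cited in the paper.

# Map each nucleotide to its purine (R) or pyrimidine (Y) class.
RY_MAP = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y"})


def split_codon_positions(seq):
    """Split an in-frame sequence into (positions 1+2, position 3)."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    # Take the first two bases of every codon, in order.
    pos12 = "".join(seq[i:i + 2] for i in range(0, len(seq), 3))
    # Take every third base, starting from the third.
    pos3 = seq[2::3]
    return pos12, pos3


def ry_recode(seq):
    """Recode A/G as R and C/T as Y; gaps and other symbols are left as-is."""
    return seq.upper().translate(RY_MAP)


if __name__ == "__main__":
    toy = "ATGGCACTT"  # three codons: ATG GCA CTT
    pos12, pos3 = split_codon_positions(toy)
    print(pos12)            # ATGCCT
    print(ry_recode(pos3))  # RRY
```

After recoding, only changes between R and Y (transversions) remain visible at third positions, which is why the two partitions can be given different substitution models.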

Table 2. Non-cetacean species as outgroups.

Family           Scientific name          English name      Mitochondrion Accession No.

Canidae          Canis familiaris         dog               NC_002008
Equidae          Equus caballus           horse             NC_001640
                 Equus asinus             ass               NC_001788
Tapiridae        Tapirus terrestris       Brazilian tapir   NC_005130
Bovidae          Bos taurus               cow               NC_006853
                 Ovis aries               sheep             NC_001941
Hippopotamidae   Hippopotamus amphibius   hippopotamus      NC_000889

NJ and MP analyses were performed using PHYLIP (Felsenstein 2005). The models of substitution for the amino acid and nucleotide sequences were mtREV24 (Adachi & Hasegawa 1996) and HKY85 (Hasegawa et al. 1985), respectively. The shape parameter (α) for the discrete Γ distribution (six categories), the proportions of invariant sites and the substitution rate parameters of the models were optimized based on the sequences. The bootstrapping was carried out with 500 replicates.

The Markov chain Monte Carlo (MCMC) Bayesian inference was performed using PHASE (Phylogenetics And Sequence Evolution; the documentation to the PHASE package is available at http://www.bioinf.man.ac.uk/resources/phase/) (Jow et al. 2002). This package provides a number of substitution models for amino acid, nucleotide and RNA base-pair sequences. The model used for amino acid sequences was mtREV24. For protein-encoding genes, each codon position was treated separately because the substitution rate varies among codon positions. The substitution rate of the third position was five-fold higher than those of the first two positions, which was confirmed by an internal program in PHASE. The models were the most general time-reversible four-state model (GTR4) (Yang 1994) for the first two codon positions and the most general time-reversible two-state model (GTR2) (Phillips & Penny 2003; Gibson et al. 2005) for the third position. In GTR2, A and G were lumped into a single state R (purine) and, similarly, C and T were lumped into Y (pyrimidine). When GTR2 was used, only transversions were considered. The GTR2 model helped to reduce the noise induced by the much higher substitution rate in the third codon position due to the weaker selective pressure. It was implemented in PHASE recently (Gibson et al. 2005). The discrete Γ distribution for site heterogeneity (Yang 1994) was adopted with six categories. The proportions of invariant sites were optimized during the analyses. The MCMC runs were launched with random starting trees. The burn-in period was 200,000 generations, which was sufficient for the likelihood and the substitution model parameters to reach equilibrium. After the burn-in period, 10,000 trees were sampled over 2,000,000 generations, one every 200 generations. For each experiment in this study, four independent MCMC runs were performed to ensure that the Bayesian analyses were not trapped in local optima. The Bayesian posterior probabilities (PP) for the nodes in the final consensus trees were averaged over the four independent trees.

Furthermore, it is well known that compensatory substitutions occur in the paired regions of RNA helices (Higgs 1998; Savill et al. 2001; Jow et al. 2002). This means that the substitutions occurring on one side of a pair are correlated with the substitutions on the other side (Higgs 1998; Savill et al. 2001; Jow et al. 2002). Most phylogenetic programs assume that each site in a molecule evolves independently of the others, and this assumption is not valid for RNA genes (Higgs 1998; Savill et al. 2001; Jow et al. 2002). In this study, the secondary structure of mt-rRNA was considered not only in modifying the alignment, but also in the construction of the phylogenetic relationship. The RNA6A model was used for the compensatory substitutions in the RNA helices in Bayesian analyses based on mt-rRNA sequences. This model considers pairs of sites rather than only single sites and was implemented in the PHASE package recently. The GTR4 model was used for loop regions in RNA molecules. The 12S and 16S mt-rRNA sequences were aligned and the alignments were modified by taking the secondary structure into account (Jow et al. 2002). The number of remaining sites of 12S+16S mt-rRNA was 2434.

Results and discussion

General results

Figure 1 shows the phylogeny reconstructed by the Bayesian analysis at the nt level using the GTR4 and GTR2 models for the first two and the third codon positions, respectively. The BP and PP values for the nodes in the tree obtained by the different methods and at different levels are all listed in Table 3. As shown in Figure 1, the Bayesian inference recovered the deep clades resolved in the previous studies based on other datasets (i.e., Mysticeti, Odontoceti, Cetancodonta (Cetacea+Hippopotamidae) and Perissodactyla (Tapiridae+Equidae)) (Messenger & McGuire 1998; Gatesy et al. 1999; Nikaido et al. 1999; Cassens et al. 2000; Arnason et al. 2004; Rychel et al. 2004; Yan et al. 2005; May-Collado & Agnarsson 2006; Nishida et al. 2007). These resolved clades were 1, 5, 6, 7 and 17 (shown in Figure 1), which are marked by summing junctions. The PP values were high (100%) for these clades compared to recent analyses using cytochrome b (Arnason et al. 2004; May-Collado & Agnarsson 2006). The Bayesian analysis also recovered most of the clades supported by the previous studies (Gatesy 1997; Messenger & McGuire 1998; Gatesy et al. 1999; Cassens et al. 2000; Nikaido et al. 2001; Rychel et al. 2004; Sasaki et al. 2005; Yan et al. 2005; May-Collado & Agnarsson 2006; Nishida et al. 2007) at subfamily, family and superfamily levels with relatively high PP values compared to the NJ and MP analyses. The 25 cetacean species with completed mt-genomes represented all the extant families within Cetacea. Thus, the Bayesian analysis provided a reliable and comprehensive cetacean phylogeny which will help further understanding of cetacean biology. These results also showed that the mt-genomes were very informative for inferring the phylogeny of Cetacea, even though they were very short (only 12 protein-coding genes were used in this study) compared to nuclear genomes. Nishida et al. (2007) used a 1.7-kbp fragment of the non-recombining Y chromosome to infer the cetacean phylogeny and recovered the major clades as this study did. The informativeness is possibly due to the absence of recombination, small effective population size and low homoplasy (Nishida et al. 2007). Although it could not resolve the relationship among clades with short branches, it may be combined with mt-DNA to provide better phylogenies.

Fig. 1. The cetacean phylogeny reconstructed by the Bayesian analysis at the nucleotide level using the 12 concatenated protein-encoding genes. The GTR4 and GTR2 models implemented in the PHASE package were used for the first two codon positions and the third codon position, respectively. The horizontal length of each branch is proportional to the estimated number of nucleotide substitutions. The clades resolved by previous studies are indicated by summing junctions. The nodes with relatively low posterior probability are indicated by arrows. For bootstrap percentage and posterior probability support values, see Table 3. Nodes are labeled to identify values in Table 3. Major groups are labeled according to the recent literature.

The result of the Bayesian analysis at the aa level is shown in Figure 2 with the PP values for every node. The Bayesian analysis at the aa level did not recover the monophyly of Odontoceti; the PP values for the nodes indicated by stars were relatively low (88% and 77%, respectively) and the branches on the nodes were very short. But it recovered most of the clades that the nt level analysis did. This suggested that the aa level analysis still provided some useful information to help understand the cetacean phylogeny, even though some information was lost when only amino acids were considered instead of nucleotides.

The Bayesian analysis based on mt-rRNA sequences did not perform as well as expected, even though it still recovered many clades supported by the analyses based on nt and aa sequences (data not shown). The evolutionary mechanisms of 12S and 16S mt-rRNA sequences were correlated with the secondary structure of RNA helices and the relatively higher evolutionary rate in loop regions. Even though the RNA6A model, which considers paired sites of RNA helices, was used in this study, the model was evidently not sufficient to describe the phylogeny of mt-rRNA sequences. The loop regions, which had a higher substitution rate, might have caused noise. Further studies are needed to understand the underlying mechanisms and to improve models for mt-rRNA phylogenies.

The NJ and MP analyses did not recover the monophyly of Odontoceti, neither did they recover many other clades supported by the previous studies (Messenger & McGuire 1998; Nikaido et al. 1999, 2001; Cassens et al. 2000; Arnason et al. 2004; Rychel et al. 2004;

Table 3. Bootstrap percentage (BP) and Bayesian posterior probabilities (PP) (%).a

Node  aanj  ntnj  aamp  ntmp  aabaye  ntbaye  ntRYbaye  ssRNAbaye

1     100   100   100   99    100     100     100       100
2     <50   100   100   100   100     100     100       100
3     100   100   100   100   100     100     100       100
4     53    100   100   100   100     100     100       100
5     94    96    87    88    100     100     100       No
6     100   100   100   100   100     100     100       100
7     No    No    <50   <50   <50     100     100       No
8     98    100   100   100   100     100     100       100
9     No    No    <50   No    77      100     100       No
10    66    92    <50   56    94      65      68        <50
11    97    100   100   100   100     100     100       100
12    85    100   100   85    100     100     100       99
13    52    100   100   100   100     100     100       100
14    62    98    <50   <50   94      100     100       87
15    No    <50   54    <50   100     64      73        No
16    91    100   94    99    100     100     100       100
17    No    100   100   100   100     100     100       100
18    No    100   100   100   100     100     100       100
19    No    100   100   100   100     100     100       100
20    No    97    <50   55    100     100     100       No
21    No    100   75    100   100     100     100       80
22    No    No    76    No    99      100     82        <50
23    No    No    <50   No    <50     No      80        No
24    <50   100   94    100   100     100     100       100
25    <50   100   100   100   100     100     100       100
26    No    99    60    96    100     100     100       100
27    <50   88    <50   93    No      100     99        No
28    68    100   100   100   100     100     100       100
29    65    100   80    100   100     100     100       99

a Explanation of abbreviations: aanj: neighbor-joining method at amino acid (aa) level; ntnj: neighbor-joining method at nucleotide (nt) level; aamp: maximum parsimony method at aa level; ntmp: maximum parsimony method at nt level; aabaye: Bayesian analysis at aa level; ntbaye: Bayesian analysis at nt level with GTR4 model at three codon positions; ntRYbaye: Bayesian analysis at nt level with GTR4 model at first two codon positions and GTR2 model at the third codon position; ssRNAbaye: Bayesian analysis at mt-rRNA level considering the secondary structure; No: clade was not recovered; the support values for nodes 1, 5, 6, 7 and 17 are in italic; the PP values for the ntRYbaye are in bold.

Sasaki et al. 2005; May-Collado & Agnarsson 2006; Nishida et al. 2007), while the Bayesian analyses at nt and aa levels recovered most of the clades supported by the previous studies, as shown in Figure 1. In some clades recovered by the NJ and MP methods, the support values were very low (<50%), as shown in Table 3. Even considering that the PP values might overestimate support for certain nodes (Suzuki et al. 2002; Erixon et al. 2003; Simmons et al. 2004), these results suggested that the Bayesian analyses at nt and aa levels outperformed the NJ and MP analyses on cetacean phylogeny based on mt-genomes in this study. The discussion in the following sections focuses on the Bayesian results at nt and aa levels.

Cetancodonta (Cetacea + Hippopotamidae)

As shown in Figures 1 and 2, the hippopotamus is a closer relative of the monophyletic cetacean species than are the cow and the sheep. The relationship was resolved in the phylogenetic analysis of the mt-Cytb gene (Arnason & Gullberg 1996). It was subsequently supported and confirmed by studies based on different datasets (Gatesy et al. 1996; Gatesy 1997; Ursing & Arnason 1998; Nikaido et al. 1999, 2001; Cassens et al. 2000). In this study, not only did the Bayesian analyses at nt and aa levels support the relationship with high PP values (100%), but so did the NJ and MP methods with high BP values (>85%).

Mysticeti and Odontoceti

The previous studies of the relationship of Mysticeti and Odontoceti agreed on the monophyly of Mysticeti (Arnason et al. 2004; Rychel et al. 2004; Sasaki et al. 2005; Yan et al. 2005; May-Collado & Agnarsson 2006; Nishida et al. 2007). But the position of Mysticeti varied among studies. The Bayesian analysis at the aa level in this study put Mysticeti as sister to Odontoceti except the families Physeteridae and Kogiidae, with a relatively low PP value (88%). Some phylogenies based on mt data (Milinkovitch et al. 1993, 1994; Arnason & Gullberg 1994; Verma et al. 2004), such as mt-Cytb genes, placed Mysticeti as sister to different clades in Odontoceti, disrupting the monophyly of Odontoceti, which contradicted the studies based on nuclear and morphological data (Messenger & McGuire 1998; Nikaido et al. 2001; Nishida et al. 2007). The Bayesian result at the nt level in this study supported the monophyly of Mysticeti and Odontoceti and the sister relationship between these two groups with high PP values (100%). These results were strongly supported by the previous studies based on nuclear and morphological data (Messenger & McGuire 1998; Nikaido et al. 2001; Nishida et al. 2007) and several new studies based on mt data (Arnason et al. 2004; Sasaki et al. 2005; Yan et al. 2005).

Fig. 2. The cetacean phylogeny reconstructed by the Bayesian analysis at the amino acid level using the 12 concatenated protein-encoding genes. The mtREV24 model implemented in the PHASE package was used. The horizontal length of each branch is proportional to the estimated number of amino acid substitutions. Numbers indicate posterior probability support values. The nodes with relatively low posterior probability are indicated by stars.

Physeteroidea and Ziphiidae

The Bayesian analyses at the nt level supported the basal position of the superfamily Physeteroidea, which is composed of the families Kogiidae and Physeteridae. This was consistent with the studies by Arnason et al. (2004), Yan et al. (2005), May-Collado & Agnarsson (2006) and Nishida et al. (2007).

Platanistidae includes Platanista gangetica minor (Indus River dolphin) and Platanista gangetica gangetica (Ganges River dolphin) (Rice 1998). In this study, the Indus River dolphin was used as the representative of Platanistidae. Ziphiidae was placed as a sister group to the Indus River dolphin at nt (PP = 68%) and aa (PP = 94%) levels, which contradicted the studies by Arnason et al. (2004) and May-Collado & Agnarsson (2006). Their studies put Ziphiidae as the second basal branch in Odontoceti, even though the support values were very low. Yan et al. (2005) put it with Platanistidae in the Bayesian analysis based on the nt sequence data set with a relatively high PP value (92%), but their result based on aa sequences agreed with the works of Arnason et al. (2004) and May-Collado & Agnarsson (2006), even though the PP value was not very high. In other words, the placement of Ziphiidae as a sister group to Platanistidae was supported by the Bayesian analyses in this study at both nt and aa levels, but more characters and molecular data may be needed to provide supplementary information supporting its placement within Odontoceti.

Delphinoidea and river dolphins

The Bayesian analyses at nt and aa levels produced the same phylogeny of Delphinoidea and river dolphins. The results strongly supported the monophyly of Delphinoidea and the relationship ((Phocoenidae + Monodontidae) + Delphinidae), with PP values of 100%. Waddell et al. (2000), Nishida et al. (2007) and May-Collado & Agnarsson (2006) obtained the same phylogeny based on nuclear genes and mt-Cytb genes, respectively.

River dolphins were polyphyletic, which was supported not only by this study but by most morphological and molecular studies as well. Figures 1 and 2 show that the river dolphins have two separate lineages: (Lipotidae + (Pontoporiidae + Iniidae)) and Platanistidae. The first lineage was the sister group to Delphinoidea. The monophyly and placement of the first lineage and the arrangement within it were strongly supported by molecular data (Rosel et al. 1995; Messenger & McGuire 1998; Waddell et al. 2000; Yan et al. 2005; May-Collado & Agnarsson 2006). Platanistidae and Ziphiidae formed a clade which diverged after Physeteroidea in this study, as supported by Cassens et al. (2000). Another position suggested by other studies (Messenger & McGuire 1998; Arnason et al. 2004; Yan et al. 2005; May-Collado & Agnarsson 2006) was that Platanistidae diverged after the divergence of Physeteroidea and Ziphiidae (in this order), with no sister relationship between the latter two. These ambiguous phylogenies and short branches may be partly caused by a rapid split in a short time window (Nikaido et al. 2001; Arnason et al. 2004; Yan et al. 2005). However, in all the phylogenetic trees, Platanistidae was always more basal than the other river dolphins, having no sister relationship with them, which suggested two or more shifts to riverine habitats (Arnason & Gullberg 1996; Cassens et al. 2000; Hamilton et al. 2001; Nikaido et al. 2001; Arnason et al. 2004; Yan et al. 2005; May-Collado & Agnarsson 2006).

Balaenidae and Neobalaenidae

The most basal position of Balaenidae and the arrangement within it were supported by the Bayesian analyses at both nt and aa levels with high PP values (100%). The clade and its position within Mysticeti have been established and confirmed by the previous studies (Arnason et al. 1992; Adegoke et al. 1993; Arnason & Gullberg 1994, 1996; Nishida et al. 2007).

Some morphological analyses placed Neobalaenidae as a sister group to Balaenidae. Some nuclear and mt data analyses showed that Neobalaenidae diverged immediately after the basal divergence of Balaenidae. The latter was strongly supported by the Bayesian analyses in this study, as shown in Figures 1 and 2, with a high PP value (100%).

Eschrichtiidae and Balaenopteridae

The Bayesian analyses at nt and aa levels strongly supported (PP = 100%) the monophyly of Eschrichtiidae + Balaenopteridae and the sister relationship with Neobalaenidae. The analyses also recovered the same four principal lineages with high PP values (100%) as Sasaki et al. (2005) did: (i) lineage I (Antarctic minke and North Atlantic minke whales); (ii) lineage II (fin and humpback whales); (iii) lineage III (blue, Omura’s baleen, sei, pygmy Bryde’s and Bryde’s whales); and (iv) lineage IV (gray whale). All the lineages were well supported (PP = 100%), as shown in Figures 1 and 2. However, the relationship among the four lineages still remains ambiguous. There has been much debate about the positions of these lineages (Arnason et al. 1991; Arnason & Gullberg 1993, 1994, 1996; Messenger & McGuire 1998; Sasaki et al. 2005). The nodes joining lineages I, II and III had relatively low PP values (80% and 82%), which are indicated by arrows (Fig. 1). As shown in Figure 2, the support for the node joining lineages II and III was also relatively low (PP = 46%). In the meantime, the short branches joining the four lineages indicated that the divergence happened in a very short time window, which made the phylogenetic analyses even more difficult. This also explained why the PP values on the nodes joining those short branches were relatively low. Determining the controversial relationship among these four lineages needs help from independent information, such as nuclear genomes and fossil records.

Acknowledgements

The author thanks two anonymous referees for their critical reading of the manuscript and constructive comments. The author is also thankful to Dr. Vivek Gowri-Shankar and Dr. Paul G. Higgs for their technical support and valuable discussions and suggestions.

References

Adachi J. & Hasegawa M. 1996. Model of amino acid substitution in proteins encoded by mitochondrial DNA. J. Mol. Evol. 42: 459–468.
Adegoke J.A., Arnason U. & Widegren B. 1993. Sequence organization and evolution, in all extant whalebone whales, of a DNA satellite with terminal chromosome localization. Chromosoma 102: 382–388.
Arnason U., Gretarsdottir S. & Widegren B. 1992. Mysticete (baleen whale) relationships based upon the sequence of the common cetacean DNA satellite. Mol. Biol. Evol. 9: 1018–1028.
Arnason U. & Gullberg A. 1993. Comparison between the complete mtDNA sequences of the blue and the fin whale, two species that can hybridize in nature. J. Mol. Evol. 37: 312–322.
Arnason U. & Gullberg A. 1994. Relationship of baleen whales established by cytochrome b gene sequence comparison. Nature 367: 726–728.
Arnason U. & Gullberg A. 1996. Cytochrome b nucleotide sequences and the identification of five primary lineages of extant cetaceans. Mol. Biol. Evol. 13: 407–417.
Arnason U., Gullberg A. & Janke A. 2004. Mitogenomic analyses provide new insights into cetacean origin and evolution. Gene 333: 27–34.
Arnason U., Gullberg A. & Widegren B. 1991. The complete nucleotide sequence of the mitochondrial DNA of the fin whale, Balaenoptera physalus. J. Mol. Evol. 33: 556–568.
Benson D.A., Karsch-Mizrachi I., Lipman D.J., Ostell J. & Wheeler D.L. 2008. GenBank. Nucleic Acids Res. 36 (Database issue): D25–D30.
Brown W.M., George M., Jr. & Wilson A.C. 1979. Rapid evolution of animal mitochondrial DNA. Proc. Natl. Acad. Sci. USA 76: 1967–1971.
Caballero S., Jackson J., Mignucci-Giannoni A.A., Barrios-Garrido H., Beltran-Pedreros S., Montiel-Villalobos M.A., Robertson K.M. & Baker C.S. 2008. Molecular systematics of South American dolphins Sotalia: sister taxa determination and phylogenetic relationships, with insights into a multi-locus phylogeny of the Delphinidae. Mol. Phylogenet. Evol. 46: 252–268.
Cassens I., Vicario S., Waddell V.G., Balchowsky H., Van Belle D., Ding W., Fan C., Mohan R.S., Simoes-Lopes P.C., Bastida R., Meyer A., Stanhope M.J. & Milinkovitch M.C. 2000. Independent adaptation to riverine habitats allowed survival of ancient cetacean lineages. Proc. Natl. Acad. Sci. USA 97: 11343–11347.
Erixon P., Svennblad B., Britton T. & Oxelman B. 2003. Reliability of Bayesian posterior probabilities and bootstrap frequencies in phylogenetics. Syst. Biol. 52: 665–673.

Felsenstein J. 2005. PHYLIP (version 3.6). http://evolution.gs.washington.edu/phylip.html.
Gatesy J. 1997. More DNA support for a Cetacea/Hippopotamidae clade: the blood-clotting protein gene gamma-fibrinogen. Mol. Biol. Evol. 14: 537–543.
Gatesy J., Hayashi C., Cronin M.A. & Arctander P. 1996. Evidence from milk casein genes that cetaceans are close relatives of hippopotamid artiodactyls. Mol. Biol. Evol. 13: 954–963.
Gatesy J., Milinkovitch M., Waddell V. & Stanhope M. 1999. Stability of cladistic relationships between Cetacea and higher-level artiodactyl taxa. Syst. Biol. 48: 6–20.
Gibson A., Gowri-Shankar V., Higgs P.G. & Rattray M. 2005. A comprehensive analysis of mammalian mitochondrial genome base composition and improved phylogenetic methods. Mol. Biol. Evol. 22: 251–264.
Hamilton H., Caballero S., Collins A.G. & Brownell R.L., Jr. 2001. Evolution of river dolphins. Proc. Biol. Sci. 268: 549–556.
Hasegawa M. & Adachi J. 1996. Phylogenetic position of cetaceans relative to artiodactyls: reanalysis of mitochondrial and nuclear sequences. Mol. Biol. Evol. 13: 710–717.
Hasegawa M., Kishino H. & Yano T. 1985. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22: 160–174.
Higgs P.G. 1998. Compensatory neutral mutations and the evolution of RNA. Genetica 102–103: 91–101.
Holder M. & Lewis P.O. 2003. Phylogeny estimation: traditional and Bayesian approaches. Nat. Rev. Genet. 4: 275–284.
Huelsenbeck J.P., Ronquist F., Nielsen R. & Bollback J.P. 2001. Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294: 2310–2314.
Jow H., Hudelot C., Rattray M. & Higgs P.G. 2002. Bayesian phylogenetics using an RNA substitution model applied to early mammalian evolution. Mol. Biol. Evol. 19: 1591–1601.
Lum J.K., Nikaido M., Shimamura M., Shimodaira H., Shedlock A.M., Okada N. & Hasegawa M. 2000. Consistency of SINE insertion topology and flanking sequence tree: quantifying relationships among cetartiodactyls. Mol. Biol. Evol. 17: 1417–1424.
May-Collado L. & Agnarsson I. 2006. Cytochrome b and Bayesian inference of whale phylogeny. Mol. Phylogenet. Evol. 38: 344–354.
Messenger S.L. & McGuire J.A. 1998. Morphology, molecules, and the phylogenetics of cetaceans. Syst. Biol. 47: 90–124.
Milinkovitch M.C., Meyer A. & Powell J.R. 1994. Phylogeny of all major groups of cetaceans based on DNA sequences from three mitochondrial genes. Mol. Biol. Evol. 11: 939–948.
Milinkovitch M.C., Orti G. & Meyer A. 1993. Revised phylogeny of whales suggested by mitochondrial ribosomal DNA sequences. Nature 361: 346–348.
Montgelard C., Catzeflis F.M. & Douzery E. 1997. Phylogenetic relationships of artiodactyls and cetaceans as deduced from the comparison of cytochrome b and 12S rRNA mitochondrial sequences. Mol. Biol. Evol. 14: 550–559.
Nicholas K.B., Nicholas H.B.J. & Deerfield D.W.I. 1997. GeneDoc. http://www.psc.edu/biomed/genedoc
Nikaido M., Matsuno F., Hamilton H., Brownell R.L., Jr., Cao Y., Ding W., Zuoyan Z., Shedlock A.M., Fordyce R.E., Hasegawa M. & Okada N. 2001. Retroposon analysis of major cetacean lineages: the monophyly of toothed whales and the of river dolphins. Proc. Natl. Acad. Sci. USA 98: 7384–7389.
Nikaido M., Rooney A.P. & Okada N. 1999. Phylogenetic relationships among cetartiodactyls based on insertions of short and long interspersed elements: hippopotamuses are the closest extant relatives of whales. Proc. Natl. Acad. Sci. USA 96: 10261–10266.
Nishida S., Goto M., Pastene L.A., Kanda N. & Koike H. 2007. Phylogenetic relationships among cetaceans revealed by Y-chromosome sequences. Zoolog. Sci. 24: 723–732.
O’Leary M.A. & Geisler J.H. 1999. The position of Cetacea within Mammalia: phylogenetic analysis of morphological data from extinct and extant taxa. Syst. Biol. 48: 455–490.
Olivo P.D., Van de Walle M.J., Laipis P.J. & Hauswirth W.W. 1983. Nucleotide sequence evidence for rapid genotypic shifts in the bovine mitochondrial DNA D-loop. Nature 306: 400–402.
Phillips M.J. & Penny D. 2003. The root of the mammalian tree inferred from whole mitochondrial genomes. Mol. Phylogenet. Evol. 28: 171–185.
Pichler F.B., Robineau D., Goodall R.N., Meyer M.A., Olivarria C. & Baker C.S. 2001. Origin and radiation of Southern Hemisphere coastal dolphins ( ). Mol. Ecol. 10: 2215–2223.
Rice D.W. 1998. Marine of the World: Systematics and Distribution. The Society of Marine Mammalogy, Lawrence, Kansas, 231 pp.
Rosel P.E., Haygood M.G. & Perrin W.F. 1995. Phylogenetic relationships among the true porpoises (Cetacea: Phocoenidae). Mol. Phylogenet. Evol. 4: 463–474.
Rychel A.L., Reeder T.W. & Berta A. 2004. Phylogeny of mysticete whales based on mitochondrial and nuclear data. Mol. Phylogenet. Evol. 32: 892–901.
Sasaki T., Nikaido M., Hamilton H., Goto M., Kato H., Kanda N., Pastene L., Cao Y., Fordyce R., Hasegawa M. & Okada N. 2005. Mitochondrial phylogenetics and evolution of mysticete whales. Syst. Biol. 54: 77–90.
Savill N.J., Hoyle D.C. & Higgs P.G. 2001. RNA sequence evolution with secondary structure constraints: comparison of substitution rate models using maximum-likelihood methods. Genetics 157: 399–411.
Simmons M.P., Pickett K.M. & Miya M. 2004. How meaningful are Bayesian support values? Mol. Biol. Evol. 21: 188–199.
Suzuki Y., Glazko G.V. & Nei M. 2002. Overcredibility of molecular phylogenies obtained by Bayesian phylogenetics. Proc. Natl. Acad. Sci. USA 99: 16138–16143.
Thewissen J.G., Williams E.M., Roe L.J. & Hussain S.T. 2001. Skeletons of terrestrial cetaceans and the relationship of whales to artiodactyls. Nature 413: 277–281.
Thompson J.D., Gibson T.J., Plewniak F., Jeanmougin F. & Higgins D.G. 1997. The CLUSTAL X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25: 4876–4882.
Ursing B.M. & Arnason U. 1998. Analyses of mitochondrial genomes strongly support a hippopotamus-whale clade. Proc. Biol. Sci. 265: 2251–2255.
Verma S.K., Sinha R.K. & Singh L. 2004. Phylogenetic position of Platanista gangetica: insights from the mitochondrial cytochrome b and nuclear interphotoreceptor retinoid-binding protein gene sequences. Mol. Phylogenet. Evol. 33: 280–288.
Waddell V.G., Milinkovitch M.C., Berube M. & Stanhope M.J. 2000. Molecular phylogenetic examination of the Delphinoidea trichotomy: congruent evidence from three nuclear loci indicates that porpoises (Phocoenidae) share a more recent common ancestry with white whales (Monodontidae) than they do with true dolphins (Delphinidae). Mol. Phylogenet. Evol. 15: 314–318.
Yan J., Zhou K. & Yang G. 2005. Molecular phylogenetics of ‘river dolphins’ and the mitochondrial genome. Mol. Phylogenet. Evol. 37: 743–750.
Yang Z. 1994. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J. Mol. Evol. 39: 306–314.

Received July 29, 2008 Accepted January 2, 2009