Resolving Deep Phylogenetic Relationships in Salamanders: Analyses of Mitochondrial and Nuclear Genomic Data
Total Page:16
File Type:pdf, Size:1020Kb
Syst. Biol. 54(5):758–777, 2005 Copyright c Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 10.1080/10635150500234641 Resolving Deep Phylogenetic Relationships in Salamanders: Analyses of Mitochondrial and Nuclear Genomic Data DAVID W. WEISROCK,1,2 LUKE J. HARMON,1 AND ALLAN LARSON1 1Department of Biology, Campus Box 1137, Washington University, St. Louis, Missouri 63130, USA 2Present Address: Department of Biology, University of Kentucky, Lexington, Kentucky 40506, USA; E-mail: [email protected] Abstract.— Phylogenetic relationships among salamander families illustrate analytical challenges inherent to inferring phy- logenies in which terminal branches are temporally very long relative to internal branches. We present new mitochondrial DNA sequences, approximately 2100 base pairs from the genes encoding ND1, ND2, COI, and the intervening tRNA genes for 34 species representing all 10 salamander families, to examine these relationships. Parsimony analysis of these mtDNA sequences supports monophyly of all families except Proteidae, but yields a tree largely unresolved with respect to in- terfamilial relationships and the phylogenetic positions of the proteid genera Necturus and Proteus.Incontrast, Bayesian and maximum-likelihood analyses of the mtDNA data produce a topology concordant with phylogenetic results from nuclear-encoded rRNA sequences, and they statistically reject monophyly of the internally fertilizing salamanders, subor- der Salamandroidea. Phylogenetic simulations based on our mitochondrial DNA sequences reveal that Bayesian analyses outperform parsimony in reconstructing short branches located deep in the phylogenetic history of a taxon. However, phylogenetic conflicts between our results and a recent analysis of nuclear RAG-1 gene sequences suggest that statistical rejection of a monophyletic Salamandroidea by Bayesian analyses of our mitochondrial genomic data is probably erroneous. Bayesian and likelihood-based analyses may overestimate phylogenetic precision when estimating short branches located deep in a phylogeny from data showing substitutional saturation; an analysis of nucleotide substitutions indicates that these methods may be overly sensitive to a relatively small number of sites that show substitutions judged uncommon by the favored evolutionary model. [Bayesian; DNA simulation; internal fertilization; mitochondrial DNA; parsimony; ribosomal RNA; salamander.] Phylogenetic relationships among taxonomic families are unlikely to be secondarily lost. External fertilization, of salamanders are difficult to resolve because branches as observed in suborder Cryptobranchoidea (families grouping multiple families are likely very short rela- Cryptobranchidae and Hynobiidae) and strongly in- tive to the subsequent evolutionary history of the group ferred for suborder Sirenoidea (family Sirenidae; Sever (Larson et al., 2003). Parsimony-based analyses of mor- et al., 1996), is considered ancestral for salamanders and phological characters and nuclear-encoded rRNA se- anuran amphibians. quences produce different phylogenetic topologies with To reexamine these relationships and the phylo- statistically significant conflict between these data sets genetic-information content of molecular characters, we (Larson and Dimmick, 1993; Larson, 1998). Analyses of present new mitochondrial DNA (mtDNA) sequences, an expanded set of morphological characters and se- approximately 2100 base pairs from the genes encod- quences from the nuclear RAG-1 gene likewise show ing ND1, ND2, COI, and the intervening tRNA genes. that resolution of deep phylogenetic relationships varies Our phylogenetic analyses apply parsimony, maximum- among data partitions and phylogenetic methods (Wiens likelihood, and Bayesian criteria (Huelsenbeck et al., et al., 2005). 2001; Swofford et al., 1996) to test alternative phyloge- Analyses of combined morphological and molec- netic hypotheses using both new mtDNA data and previ- ular data support monophyly of internally fertiliz- ously published morphological and nuclear rRNA data ing salamanders, suborder Salamandroidea (families (Larson, 1991; Larson and Dimmick, 1993). We use a va- Ambystomatidae, Amphiumidae, Dicamptodontidae, riety of statistical tests to evaluate alternative phyloge- Plethodontidae, Proteidae, Rhyacotritonidae, and Sala- netic hypotheses derived from both a priori information mandridae; Larson and Dimmick, 1993; Wiens et al., (e.g., monophyly of suborder Salamandroidea) as well 2005), although this grouping is rejected statistically as a posteriori hypotheses generated from our new phy- by two independent molecular data sets, the nuclear- logenetic results. We identify and describe the molec- encoded rRNA sequences and the mitochondrial ge- ular characters that make large contributions to sta- nomic sequences reported here. tistically significant phylogenetic results obtained from Nonmonophyly of Salamandroidea requires homo- maximum-likelihood and Bayesian methods. plastic evolution of internal fertilization, a complex syn- Because the evolutionary age of salamanders is at least drome of characters involving morphology of male and 150 My (Gao and Shubin, 2001) and fossil evidence dates female cloacal glands, life history and behavior (see Sever some extant families to the upper Cretaceous (Gao and and Brizzi, 1998) or secondary loss of this trait. Indepen- Shubin, 2003), branches grouping two or more families dent evolution of internal fertilization by multiple lin- are probably short relative to their phylogenetic depth. eages seems unlikely (Sever and Brizzi, 1998). Absence Previous attempts at using mtDNA sequence (12S and of evolutionary loss of this syndrome within families in- 16S rRNA) data to resolve salamander-family phylogeny dicates that it is indispensable to the species that have produced a tree with extremely long terminal branches it and therefore highly “burdened” using the terminol- and short, poorly supported interfamilial branches (Hay ogy of Riedl (1978) and Donoghue (1989); such characters et al., 1995), indicating that family-level cladogenesis 758 2005 WEISROCK ET AL.—SALAMANDER FAMILY PHYLOGENY 759 occurred over a relatively short time interval. In such lusitanica, Notophthalmus viridescens, and Salamandra sala- cases, misleading phylogenetic information generated mandra, were taken from GenBank (Table 1). by parallel substitutions among branches within tax- Taxon sampling in combined phylogenetic analyses onomic families may overwhelm the signal generated followed two approaches. Because not all data were by substitutions on a lineage ancestral to two or more available for all taxa across data sets, we assembled families. Phylogenetic signal is eroded further by mul- pruned data sets that minimized the amount of miss- tiple substitutions occurring at the sites containing the ing data. Wiens (2003) showed that accuracy of phylo- original phylogenetic signal. Phylogenetic reconstruc- genetic analyses of incomplete data may be high if each tion of such evolutionary histories can be misled by taxon sampled contains sufficient characters; therefore, long-branch attraction (Felsenstein, 1978; Huelsenbeck, we also assembled combined data sets using the full com- 1997; Swofford et al., 2001). In extreme cases, a bifur- plement of taxa from the mtDNA data set with rRNA and cating tree may be analytically indistinguishable from morphological data missing for some taxa. a polytomy (see discussion by Donoghue and Sander- son [1992], Fishbein et al. [2001], Jackman et al. [1999], DNA Extraction and Sequencing Misof et al. [2001], Poe and Chubb [2004], Slowinski DNA extraction, polymerase chain reaction, and se- [2001], and Walsh et al. [1999]). We explore the lim- quencing methods were as described by Weisrock et its of deep phylogenetic reconstruction in the evolu- al. (2001) except that some sequencing reactions were tionary history of salamanders using simulations based performed using a Big-Dye Terminator Ready-Reaction on the inferred evolutionary models of our mtDNA Kit (Perkin-Elmer) and run on an ABI (PE Applied sequences. Biosystems, Inc.) 373A automated DNA sequencer. We sequenced approximately 2100 contiguous bases of mito- MATERIALS AND METHODS chondrial DNA including the genes encoding ND1 (sub- Taxon Sampling unit one of NADH dehydrogenase), tRNAIle, tRNAGln, Met This study used the published morphological and tRNA , ND2 (subunit two of NADH dehydrogenase), nuclear rRNA sequence data matrices of Larson and tRNATrp , tRNAAla, tRNAAsn, tRNACys, tRNATyr , and COI Dimmick (1993). The morphological data used by Larson (subunit I of cytochrome c oxidase), plus the origin for Asn and Dimmick (1993) are compiled from other sources light-strand replication (OL) between the tRNA and representing head and trunk morphology (Duellman tRNACys genes. This block of sequence also contains a and Trueb, 1986) and cloacal characters related to fer- small number of bases of intergenic sequence separat- tilization (Sever 1991a, 1991b; 1994). The complete data ing some tRNA genes. Primers used for sequencing are matrix containing all taxa and morphological characters listed in Table 2. from Larson and Dimmick (1993) was used in combined analyses with molecular characters. Sequence Alignments Nuclear rRNA sequence data were taken from Larson Alignment of the nuclear rRNA sequence data was and Dimmick (1993) and Larson (1991) and included similar to the original alignments of Larson (1991) and the anuran outgroup Xenopus laevis, the caecilian out-