A Nuclear Gene for Higher Level Phylogenetics: Phosphoenolpyruvate Carboxykinase Tracks Mesozoic-Age Divergences Within Lepidoptera (Insecta)
Timothy P. Friedlander,* Jerome C. Regier,* Charles Mitter,† and David L. Wagner‡
*Center for Agricultural Biotechnology, University of Maryland; †Department of Entomology, University of Maryland; and ‡Department of Ecology and Evolutionary Biology, University of Connecticut

The sequence of phosphoenolpyruvate carboxykinase (PEPCK) has been previously identified as a promising candidate for reconstructing Mesozoic-age divergences (Friedlander, Regier, and Mitter 1992, 1994). To test this hypothesis more rigorously, 597 nucleotides of aligned PEPCK coding sequence (~30% of the coding region) were generated from 18 species representing Mesozoic-age lineages of moths (Insecta: Lepidoptera) and outgroup taxa. Relationships among basal Lepidoptera are well established by morphological analysis, providing a strong test for the utility of a gene which has not previously been used in systematics. Parsimony and other phylogenetic analyses were conducted on nucleotides by codon positions (nt1, nt2, nt3) separately and in combination, and on amino acids, for comparison to the test phylogeny. The highest concordance was achieved with nt1 + nt2, for which one of two most-parsimonious trees was identical to the test phylogeny, and with all nucleotides when nt3 was downweighted sevenfold or higher, for which a single most-parsimonious tree identical to the test phylogeny resulted. Substitutions in nt3 approached saturation in many, but not all, pairwise comparisons and their exclusion or severe downweighting greatly increased the degree of concordance with the test phylogeny. Neighbor-joining analysis confirms this finding.
The utility of PEPCK for phylogenetics is demonstrated over a time span for which few other suitable genes are currently available.

Key words: concordance study, Lepidoptera, Mesozoic, molecular systematics, nuclear gene, phosphoenolpyruvate carboxykinase, phylogenetics, sequence character partitions, PEPCK.

Address for correspondence and reprints: Timothy Friedlander, Center for Agricultural Biotechnology, 2113 Agriculture/Life Sciences Surge Building, University of Maryland, College Park, Maryland 20742-3351. E-mail: [email protected].

Mol. Biol. Evol. 13(4):594-604. 1996
© 1996 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

Introduction

Concordance among independent data sets is a principal criterion for robustness of phylogenetic hypotheses (e.g., Penny and Hendy 1986; Miyamoto and Cracraft 1991; Hillis 1995; Miyamoto and Fitch 1995). Therefore, molecular systematists require access to multiple unlinked gene sequences. Currently, only organellar and nuclear ribosomal DNAs, for which "universal" PCR primers are available, have been widely applied to systematic questions. Additional sequences are needed to address relationships on which organellar or ribosomal sequences prove uninformative or misleading.

Our laboratory has been systematically searching for protein-encoding nuclear gene sequences that will be phylogenetically useful at a variety of taxonomic levels in animals. Fourteen candidates were identified in an initial screening based on criteria of gene size, structure, copy number, and conservation (Friedlander, Regier, and Mitter 1992). The phylogenetic information content of five of these for which enough metazoan sequences were available was tested by the criterion of concordance, that is, their ability to recover groupings securely established by previous evidence (Friedlander, Regier, and Mitter 1994). These studies confirmed that all five genes carry phylogenetic information and suggested further that they had utility spanning an enormous temporal range (<10 MYA to >500 MYA). For example, amino acid sequences of the highly conserved protein elongation factor-1α (EF-1α) have been used to reconstruct kingdom- and phylum-level phylogenies, dating to the Paleozoic and earlier (Creti et al. 1991; Rivera and Lake 1992). By contrast, synonymous nucleotide changes within the same gene have recently proven useful for inferring species- and genus-level relationships (Cho et al. 1995). Our studies suggest that analysis of synonymous changes in many protein-encoding genes should prove useful for Tertiary-age divergences, with genes of highly conserved protein sequence having the additional advantages of unambiguous sequence alignment and relative ease of PCR primer definition.

Resolving Mesozoic-age systematic questions presents additional challenges because highly conserved protein sequences such as EF-1α may be insufficiently variable, while most synonymous changes in the same gene may be multiply substituted and uninterpretable, particularly without extensive taxon sampling. What is needed are genes in which nonsynonymous characters evolve more rapidly than in highly conserved sequences such as EF-1α, but considerably more slowly than synonymous changes in such genes. Even when such genes are found, their application raises challenges absent from the study of recent divergences. Primer definition will be less straightforward because the sequence is more variable. Furthermore, the issues of character weighting and data set partitioning will be relevant to analyzing these anciently diverged sequences. For example, if synonymous changes are indeed saturated, then their downweighting, or even removal from the data set, may be justified.

Our earlier study identified two nuclear genes, not previously exploited for systematics, which are likely to be informative about Mesozoic-age divergences: phosphoenolpyruvate carboxykinase (PEPCK; E.C. 4.1.1.32), the subject of this report, with approximately 1,941 bp coding sequence, and dopa decarboxylase (Friedlander, Regier, and Mitter 1994; Fang et al., in prep.). PEPCK catalyzes the first step of gluconeogenesis, interconverting oxaloacetate and phosphoenolpyruvate. Gene copy number is low, possibly single copy in Lepidoptera and Diptera, based on genomic Southern hybridizations with a Drosophila probe (Friedlander, Regier, and Mitter 1992). There are similarly low estimates of copy number in vertebrates (Yoo-Warren et al. 1983; Hod, Yoo-Warren, and Hanson 1984), although a second, quite divergent, paralogous PEPCK sequence has been identified that is specifically targeted for export to the mitochondria (Weldon et al. 1990). PEPCK's potential as a phylogenetic marker was supported by its recovery of expected relationships among six published animal sequences: two nematodes, an insect, a bird, rat, and human. However, our limited sampling of taxa did not permit a precise identification of the time frame over which PEPCK would be useful.

FIG. 1. "Test phylogeny" of the Lepidoptera and the other mecopteroid orders (Trichoptera, Mecoptera, Siphonaptera, Diptera) as sampled in this study, consisting of groupings strongly supported by morphology (see text). Numbers of species sampled in each lineage and the approximate numbers of extant species in that clade are listed in parentheses. For clades in which more than one species was sampled, average pairwise divergence values by codon position (nt1/nt2/nt3) are displayed above each branch, and average pairwise divergence values for amino acids are displayed below each branch. Numbers of morphological synapomorphies supporting major lepidopteran clades (Kristensen 1984) are indicated with arrows. Clade labels: Heteroneura, Neolepidoptera, Glossata, Lepidoptera (moths and butterflies), Amphiesmenoptera, Mecopterida, Antliophora. Terminal taxa: Ditrysia, "Monotrysia" (Tischeriidae), Exoporia (Hepialidae), "Dacnonypha" (Eriocraniidae), Heterobathmiina, Aglossata, Zeugloptera, Trichoptera (caddisflies), Mecoptera (scorpionflies), Siphonaptera (fleas), Diptera (flies).

To gauge the phylogenetic utility of PEPCK more fully, we have applied this gene to a test case consisting of basal divergences within the insect order Lepidoptera, which are Mesozoic in age (fig. 1; Kukalova-Peck 1991; Ross and Jarzembowski 1993; Labandeira et al. 1994).

[...] debated aspects of lepidopteran phylogeny and to other Mesozoic-age systematic questions. Availability of the "known" phylogeny also allows objective judgment of alternative approaches to phylogenetic analysis of this gene (Miyamoto and Fitch 1995), some of which are explored here.

Materials and Methods

Specimens

The species names, number of individuals sampled, life history stage, and geographical source are listed in table 1, along with GenBank accession numbers for their PEPCK sequences. Field-collected, live moths were temporarily stored dry at the temperature of liquid nitrogen or in 100% ethanol at 0°C for up to 3 days, followed by long-term storage at -80°C. Storage at -20°C in 100% ethanol at least up to 1 year also yielded satisfactory templates for this study. Specimens from each of the collections are vouchered in freezers at the University of Maryland, and all are authoritatively identi-