Mitochondrial Substitution Rates Are Extraordinarily Elevated and Variable in a Genus of Flowering Plants
Total Page:16
File Type:pdf, Size:1020Kb
Mitochondrial substitution rates are extraordinarily elevated and variable in a genus of flowering plants Yangrae Cho†‡, Jeffrey P. Mower†, Yin-Long Qiu§, and Jeffrey D. Palmer¶ Department of Biology, Indiana University, Bloomington, IN 47405-3700 Contributed by Jeffrey D. Palmer, November 8, 2004 Plant mitochondrial (mt) genomes have long been known to evolve We now report unprecedented levels and variation in mt RS, slowly in sequence. Here we show remarkable departure from this much of which is confined to a single genus of flowering plants pattern of conservative evolution in a genus of flowering plants. (Plantago, a large group of cosmopolitan weeds). Moreover, Substitution rates at synonymous sites vary substantially among unlike previous cases of (modest) rate variation in plant mito- lineages within Plantago. At the extreme, rates in Plantago exceed chondria, rate variation in Plantago is specific to the mt genome. those in exceptionally slow plant lineages by Ϸ4,000-fold. The Radical and reversible changes in the underlying mt mutation fastest Plantago lineages set a new benchmark for rapid evolution rate are probably responsible for this rate variation. in a DNA genome, exceeding even the fastest animal mt genome by an order of magnitude. All six mt genes examined show Materials and Methods similarly elevated divergence in Plantago, implying that substitu- Most Plantago DNAs were provided by the Royal Botanic tion rates are highly accelerated throughout the genome. In Gardens (Kew, U.K.). All other DNAs are as described in ref. 24. contrast, substitution rates show little or no elevation in Plantago PCR products were generated, purified, and sequenced as for each of four chloroplast and three nuclear genes examined. described in refs. 25 and 26. Taxon names and GenBank These results, combined with relatively modest elevations in rates accession numbers for all sequences used in this study are listed of nonsynonymous substitutions in Plantago mt genes, indicate in Tables 2–4, which are published as supporting information on EVOLUTION that major, reversible changes in the mt mutation rate probably the PNAS web site. PCR primer sequences and aligned data sets underlie the extensive variation in synonymous substitution rates. are available upon request. These rate changes could be caused by major changes in any Phylogenetic relationships within Plantaginaceae (see Fig. 2A) number of factors that control the mt mutation rate, from the were determined from a 4,730-nt data set consisting of portions of production and detoxification of oxygen free radicals in the mito- four chloroplast regions (ndhF, rbcL, and intergenic spacers atpB͞ chondrion to the efficacy of mt DNA replication and͞or repair. rbcL and trnL͞trnF). Relationships within Plantago subgenus Plan- tago (see Fig. 2B) were analyzed from a 9,845-nt data set containing genome evolution ͉ plant mitochondria ͉ Plantago ͉ rate two additional chloroplast regions (intergenic spacers psaA͞trnS variation ͉ synonymous substitution rates and trnC͞trnD). Maximum likelihood (ML) trees were constructed with PAUP* by using the general time-reversible model, a gamma n a pioneering study 25 years ago, Brown et al. (1) discovered that distribution with four rate categories, and an estimate of the Iprimate mitochondrial (mt) DNA evolves rapidly at the sequence proportion of invariant sites. The rate matrix, base frequencies, level compared with nuclear DNA. With rare exception (2), most shape of the gamma distribution, and proportion of invariant sites animal mt DNAs have been found to evolve rapidly in sequence were estimated before the ML analysis from a neighbor-joining tree (3–6). Rapid mt evolution may be the rule in other groups of constructed from the data. eukaryotes, although this conclusion must be tempered by the Divergence times outside Plantaginaceae were taken from ref. scanty data and distant comparisons available for most groups (7, 27. Those within the family were calculated by using a penalized 8). Plants are the most glaring exception to the general rule that mt likelihood approach (28) as implemented in the R8S program (29) DNA evolves rapidly in sequence. In 1987, Wolfe et al. (9) showed and a time constraint of 48 million years (27) for the Antirrhinum͞ that rates of synonymous substitution in angiosperm mt genes are Plantago split. The ML tree shown in Fig. 2A was used as the anomalously low, a few-fold lower than in chloroplast genes, Ϸ10- starting tree for the divergence time analysis shown in Fig. 2C. For to 20-fold lower than in nuclear genes of both angiosperms and Fig. 2D, the starting tree was constructed by first constraining the mammals, and Ϸ50- to 100-fold lower than in mammalian mt genes. taxa in the 4,730-nt data set to incorporate the alternative relation- A year later, Palmer and Herbon (10) extended the inference of low ships within subgenus Plantago from Fig. 2B and then estimating rates of sequence change to the entire plant mt genome (most of branch lengths for this topology in PAUP*. A smoothing factor of which is noncoding) and showed that rates of sequence and structural evolution are dramatically uncoupled in plant mt DNA. All subsequent studies have confirmed that nucleotide substitu- Abbreviations: mt, mitochondrial; ML, maximum likelihood; SSB, substitutions per site per tion rates are in general quite low in land plant mt genomes (11, 12). billion years; dN, substitutions per nonsynonymous site; dS, substitutions per synonymous site; R , synonymous substitution rate; SSU, small subunit; rDNA, rRNA-encoding DNA. At the same time, moderate variation in synonymous substitution S Data deposition: The nucleotide sequences reported in this paper have been deposited in rates (RS) (up to 7-fold) has been found in comparing several groups the GenBank database (accession nos. AJ389588–AJ389621 and AY818897–AY818951). of plants (13–16). In most cases, correlated rate changes are seen for †Y.C. and J.P.M. contributed equally to this work. chloroplast and͞or nuclear genes. Forces operating across the two ‡Present address: Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State organelle genomes or all three genomes, such as paternal trans- University, Blacksburg, VA 24061. mission of organelles or generation-time effects, respectively, have §Present address: Department of Ecology and Evolutionary Biology, University of Michigan, been invoked to explain these patterns, whereas a shift to hetero- Ann Arbor, MI 48109-1048. trophy has been invoked to explain correlated increases in rRNA ¶To whom correspondence should be addressed at: Department of Biology, Indiana Uni- substitution rates in parasitic plants (17). Phylogenetic studies versity, Jordan Hall 142, 1001 East Third Street, Bloomington, IN 47405-3700. E-mail: suggest rate variation in other groups of land plants (18–23), [email protected]. although these studies have lacked a quantitative focus. © 2004 by The National Academy of Sciences of the USA www.pnas.org͞cgi͞doi͞10.1073͞pnas.0408302101 PNAS ͉ December 21, 2004 ͉ vol. 101 ͉ no. 51 ͉ 17741–17746 Downloaded by guest on September 24, 2021 three was determined by using the R8S cross-validation procedure. substitution rates were calculated for a given tree branch by Different starting points of initial age estimates and reanalysis after dividing its synonymous branch length by the estimated diver- perturbation of the final age estimates had no effect on the results. gence time along that branch. Divergence times within Plantagi- Standard errors for each time were calculated by rerunning the naceae were calculated by using two different phylogenetic divergence time analyses on 500 bootstrapped data sets as described estimates for the group owing to the uncertain placement of P. in ref. 29. media, which shows the greatest mt divergence (Fig. 1). The first Branch lengths for the gene trees in Fig. 1 and Fig. 3, which is estimate was based on a 4,730-nt chloroplast data set. Apart from published as supporting information on the PNAS web site, were Plantago subgenus Plantago, and especially the placement of P. estimated by using PAML 3.13D (30) and topologically constrained media, relationships are well supported in this tree (Fig. 2A) and trees. Relationships within Plantaginaceae were constrained ac- in good agreement with previous molecular studies of the family cording to the analyses shown in Fig. 2 A and B. For all other taxa, (34, 35). To achieve more robust placement of P. media, the topologies were constrained according to ref. 31. Branch lengths, number of chloroplast characters was doubled (to 9,845 nt) for representing the number of substitutions per synonymous site (dS) subgenus Plantago and two outgroups. This increase in chloro- or number of substitutions per nonsynonymous site (dN), were plast characters gave a topology (Fig. 2B) in which P. media is determined for protein genes by using a simplified Goldman–Yang sister to P. rigida but with bootstrap support that is equally as low (GY94) codon model (32) with separate dN͞dS ratios () for each (53% versus 52%) as its placement in Fig. 2A as sister to the branch. Codon frequencies were computed by using the F3 ϫ 4 other three species of subgenus Plantago. The failure to resolve method (32). The transition͞transversion rate ratio () and dN͞dS relationships within subgenus Plantago by using nearly 10,000 nt, ratios were estimated during the analysis with initial values of 2 and together with the short internodes within the subgenus, implies 0.4, respectively. Standard errors for total branch lengths were a rapid radiation for these four species (see Fig. 4 for the reported by PAML, and these values were propagated to calculate single-gene chloroplast trees). standard errors for their corresponding dS and dN branch lengths. All estimates of Plantaginaceae divergence times and most Branch lengths representing total substitutions per site were esti- estimates of substitution rates are similar between the two mated for rRNA genes by using the general time-reversible nucle- analyses (Fig.