A Comparative Analysis of Whole Plastid Genomes from the : Expansion and Contraction of the Inverted Repeat, Mitochondrial to Plastid Transfer of DNA, and Identification of Highly Divergent Noncoding Regions Source: Systematic Botany, 40(1):336-351. Published By: The American Society of Taxonomists URL: http://www.bioone.org/doi/full/10.1600/036364415X686620

BioOne (www.bioone.org) is a nonprofit, online aggregation of core research in the biological, ecological, and environmental sciences. BioOne provides a sustainable online platform for over 170 journals and books published by nonprofit societies, associations, museums, institutions, and presses. Your use of this PDF, the BioOne Web site, and all posted and associated content indicates your acceptance of BioOne’s Terms of Use, available at www.bioone.org/page/terms_of_use. Usage of BioOne content is strictly limited to personal, educational, and non-commercial use. Commercial inquiries or rights and permissions requests should be directed to the individual publisher as copyright holder.

BioOne sees sustainable scholarly publishing as an inherently collaborative enterprise connecting authors, nonprofit publishers, academic institutions, research libraries, and research funders in the common goal of maximizing access to critical research. Systematic Botany (2015), 40(1): pp. 336–351 © Copyright 2015 by the American Society of Plant Taxonomists DOI 10.1600/036364415X686620 Date of publication February 12, 2015 A Comparative Analysis of Whole Plastid Genomes from the Apiales: Expansion and Contraction of the Inverted Repeat, Mitochondrial to Plastid Transfer of DNA, and Identification of Highly Divergent Noncoding Regions

Stephen R. Downie1,4 and Robert K. Jansen2,3 1Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, U. S. A. 2Department of Integrative Biology, The University of at Austin, Austin, Texas 78712, U. S. A. 3Department of Biological Sciences, Faculty of Science, King Abdulaziz University, P. O. Box 80141, Jeddah 21589, 4Author for correspondence ([email protected])

Communicating Editor: Mark P. Simmons

Abstract—Previous mapping studies have revealed that the frequency and large size of inverted repeat junction shifts in plastomes are unusual among angiosperms. To further examine plastome structural organization and inverted repeat evolution in the Apiales (Apiaceae + ), we have determined the complete plastid genome sequences of five taxa, namely Anthriscus cerefolium (154,719 base pairs), Crithmum maritimum (158,355 base pairs), verticillata (153,207 base pairs), Petroselinum crispum (152,890 base pairs), and filiformis subsp. greenmanii (154,737 base pairs), and compared the results obtained to previously published plastomes of Daucus carota subsp. sativus and Panax schin-seng. We also compared the five Apiaceae plastomes to identify highly variable noncoding loci for future molecular evolutionary and systematic studies at low taxonomic levels. With the exceptions of Crithmum and Petroselinum, which each demonstrate a ~1.5 kilobase shift of its LSC-IRB junction (JLB), all plastomes are typical of most other non-monocot angiosperm plastid DNAs in their structural organization, gene arrangement, and gene content. Crithmum and Petroselinum also incorporate novel DNA in the LSC region adjacent to the LSC-IRA junction (JLA). These insertions (of 1,463 and 345 base pairs, respectively) show no sequence similarity to any other region of their plastid genomes, and BLAST searches of the Petroselinum insert resulted in multiple hits to angiosperm mitochondrial genome sequences, indicative of a mitochondrial to plastid transfer of DNA. A comparison of pairwise sequence divergence values and numbers of variable and parsimony-informative alignment positions (among other sequence characteristics) across all introns and intergenic spacers >150 base pairs in the five Apiaceae plastomes revealed that the rpl32-trnL, trnE-trnT, ndhF-rpl32, 50rps16-trnQ,andtrnT-psbD intergenic spacers are among the most fast-evolving loci, with the trnD-trnY-trnE-trnT combined region presenting the greatest number of potentially informative characters overall. These regions are therefore likely to be the best choices for molecular evolutionary and systematic studies at low taxonomic levels. Repeat analysis revealed direct and inverted dispersed repeats of 30 base pairs or more that may be useful in population-level studies. These structural and sequence analyses contribute to a better understanding of plastid genome evolution in the Apiales and provide valuable new information on the phylogenetic utility of plastid noncoding loci, enabling further molecular evolutionary and phylogenetic studies on this economically, ecologically, and medicinally important group of flowering .

Keywords—Apiaceae, Araliaceae, intergenic spacer regions, intracellular transfer, plastid DNA.

The plastid genomes of the majority of photosynthetic closely related (Goulding et al. 1996; Plunkett and angiosperms are highly conserved in structural organization, Downie 2000). Large IR expansions (>1,000 bp) occur less gene arrangement, and gene content (Palmer 1991; Raubeson frequently and outnumber large contractions (Plunkett and and Jansen 2005; Wicke et al. 2011; Jansen and Ruhlman 2012; Downie 2000; Raubeson and Jansen 2005; Hansen et al. 2007). Ruhlman and Jansen 2014). Their hallmark is the presence of Because major changes in position of IR junctions can accom- two large duplicate regions in reverse orientation known as pany structural rearrangements elsewhere in the plastid the inverted repeat (IR), which separate the remainder of the genome (Palmer 1991; Boudreau and Turmel 1995; Aii et al. genome into large single copy (LSC) and small single copy 1997; Cosner et al. 1997; Perry et al. 2002; Chumley et al. 2006; (SSC) regions. Variation in size of this molecule is due most Haberle et al. 2008; Guisinger et al. 2011; Wicke et al. 2011), typically to the expansion or contraction of the IR into or out the availability of sequence data for these genomes can help of adjacent single-copy regions and/or changes in sequence elucidate the mechanisms responsible for these large-scale IR complexity due to insertions or deletions of novel sequences. junction shifts and other major genomic changes. Of the two equimolar structural isomers existing for plastid Previous ptDNA restriction site mapping studies have DNA (ptDNA; Palmer 1983), the structure most commonly shown that Apiaceae (Umbelliferae) exhibit unprecedented presented follows the convention used for Nicotiana tabacum variation in position of their LSC/IR boundary regions L. (tobacco) in which one copy of the IR is flanked by genes (Palmer 1985a; Plunkett and Downie 1999, 2000). While most ycf1 and trnH-GUG (and is designated as IRA) and the other umbellifer species surveyed possess a JLB indistinguishable copy is flanked by genes rps19 and ndhF (and is designated as from that of Nicotiana tabacum and the vast majority of other IRB; Shinozaki et al. 1986; Yukawa et al. 2005). The junctions , at least one expansion and seven different contrac- between the LSC region and each of these IR copies are des- tions of the IR relative to the N. tabacum JLB were detected in ignated as JLA (LSC/IRA) and JLB (LSC/IRB), and the junc- 55 species, each ranging in size from ~1–16 kilobase pairs tions flanking the SSC region are designated as JSA (SSC/IRA) (kb; Plunkett and Downie 2000). As examples, all examined and JSB (SSC/IRB; Shinozaki et al. 1986). In most non-monocot members of the “Aegopodium group,” representatives of angiosperm ptDNAs, JLB lies within the ribosomal protein tribes Careae and Pyramidoptereae (Downie et al. 2010), S10 operon in a more or less fixed position within or near have a ~1.1 kb expansion of JLB relative to that of N. tabacum. the rps19 gene and JLA lies just downstream of LSC gene Careae and Pyramidoptereae are monophyletic sister groups trnH-GUG. The IRs of angiosperm plastomes can fluctuate (Banasiak et al. 2013), indicating that IR junction shifts have greatly in size, although small contractions and expansions the potential to demarcate major clades within the family. of <100 base pairs (bp) are most frequent, and the positions Coriandrum sativum L. and Bifora radians M. Bieb., both of tribe of all four IR/single-copy junctions can vary even among Coriandreae, have the most contracted IRs, approximately 336 2015] DOWNIE AND JANSEN: COMPARISON OF APIALES PLASTID GENOMES 337

16.1 kb smaller than that of N. tabacum. Coriandrum also has a represent tribe Scandiceae subtribes Scandicinae and Daucinae, ~5.7 kb insertion of unknown composition in the vicinity of respectively, Crithmum is classified in tribe Pyramidoptereae, the 16S rRNA gene near the terminus of the IR (Plunkett Tiedemannia represents tribe Oenantheae (Feist et al. 2012), and and Downie 2000), a region subsequently identified through Petroselinum is placed in tribe Apieae (Downie et al. 2010). entire plastome sequencing as a duplication of genes trnH- Crithmum and Petroselinum are members of the “apioid GUG and psbA (Peery et al. 2006). Several of these junction superclade” (Downie et al. 2010), and Hydrocotyle and Panax shifts in Apiaceae are parsimony-informative and all are are classified in the closely related family Araliaceae (Nicolas restricted to the “apioid superclade” of Apiaceae subfamily and Plunkett 2009). To identify the most variable noncod- Apioideae, a large, morphologically heterogeneous group of ing loci, we compare the plastomes of the Apiaceae genera umbellifers comprising 14 tribes and other major clades of Anthriscus, Crithmum, Daucus, Petroselinum,andTiedemannia. dubious relationship (Plunkett and Downie 1999, 2000; The seven species considered herein represent disparate Downie et al. 2010). Plastid genome sequencing has also clades throughout the Apiales; they also represent plastomes revealed unique, noncoding DNA in the region bounded exhibiting major IR expansions and contractions (as assessed by JLA and trnH-GUG in several related species of Apiaceae previously through restriction site mapping studies), as well that does not show significant similarity to any other known as those possessing IR junctions in positions very similar to ptDNA sequence (Peery et al. 2007, 2011), and BLAST those of many other unrearranged angiosperm plastid genomes. searches of these sequences suggested a mitochondrial to This information on Apiales plastome structural organization, plastid transfer of DNA (Peery et al. 2011). Subsequent stud- IR junction shifts, and noncoding loci variability will help ies have confirmed other instances of mitochondrial DNA guide subsequent systematic and evolutionary studies of this transfer into the plastome of Apiaceae (Goremykin et al. economically, ecologically, and medicinally important group 2009; Iorizzo et al. 2012a,b). Thus, the Apiaceae provide a of flowering plants. model system in which to study the mechanisms leading to large-scale expansions and contractions of the IR, with con- Materials and Methods comitant novel insertions near JLA either the cause or conse- quence of these major IR junction shifts. Plastid Isolation, Amplification, and Sequencing—Fresh of The Apiaceae have been the subject of numerous phyloge- Anthriscus and Petroselinum were obtained at a local grocery store, and living plants of Crithmum (UCONN 198501242), Hydrocotyle (UCONN netic studies using ptDNA and while these studies have con- 198501441), and Tiedemannia (as greenmanii Mathias & Con- tributed greatly to a broad understanding of its evolutionary stance; UCONN 200202456) were obtained from the Ecology and Evolu- history, uncertainties remain with regard to many species- tionary Biology Greenhouse Collections at the University of Connecticut. level relationships, particularly those of the largest genera Leaves from multiple individual plants were combined to provide enough material for the plastid isolations. Vouchers are deposited at the within the “apioid superclade” (reviewed in Downie et al. University of Illinois herbarium (ILL). 2001, 2010). These uncertainties are the result of a paucity of Approximately 5–10 g of fresh, young tissue was used in each variable DNA sequence characters obtained from the rela- plastid genome isolation. Plastids were isolated using the sucrose step- tively few plastid loci examined. The earliest of these studies gradient centrifugation method (Palmer 1986) and the extraction proce- focused on gene (matK exons, rbcL) and intron (rpl16, rps16, dures outlined in Jansen et al. (2005). The plastid pellet was resuspended in wash buffer to a final volume of 2 ml. Whole plastid genome amplifi- rpoC1) sequences (Downie et al. 1996, 2000; Plunkett et al. cation was performed with rolling circle amplification (RCA) using a 1996a,b; Downie and Katz-Downie 1999), whereas subse- REPLI-g midi kit (Qiagen Inc., Valencia, California). The manufacturer’s quent studies examined primarily the five noncoding regions protocol for amplification of genomic DNA from cells was followed, with between genes trnK and psbI, and intergenic spacer regions the following modification to improve ptDNA amplification (M. Guisinger, unpubl.): 1.5 ml of alkaline lysis solution (10 ml of solution A in Jansen et al. trnH-psbA, trnD-trnT, and trnF-trnL-trnT (Calvin˜o and Downie 2005, activated with 1.0 ml1MDTT)wasaddedto1.0ml of isolated plastids 2007; Downie et al. 2008; Degtjareva et al. 2009; Nicolas and and 4.0 ml of PBS and incubated on ice for 10 min. Subsequently, 3.5 mlof Plunkett 2009; Sun and Downie 2010). Rates of nucleotide stop solution was added to this mixture. Each RCA reaction consisted of 0.5–1.0 ml of lysate, 23.0–23.5 ml of REPLI-g midi reaction buffer, and 1.0 ml change between these regions, as well as between other plastid of REPLI-g midi DNA polymerase. The reaction was incubated at 30 Cfor loci, can vary tremendously within a given taxonomic group 16 hr and terminated at 65 for 3 min. After confirmation that the RCA was (Shaw et al. 2005); therefore, it is useful to investigate the successful by running out 2 ml of the product on a minigel, another 2 mlof relative utility of previously unexplored or underutilized non- RCA product was then digested with 20 units of EcoRI in a 20-ml reaction, coding regions among Apiaceae plastomes for their potential run out on a 1% agarose gel with a low DNA mass ladder, and stained and visualized with ethidium bromide. The bright, discrete banding patterns in resolving phylogenetic relationships at low taxonomic relative to the nuclear DNA background ensured that the amplification levels within the family. The most variable noncoding regions product contained sufficient quality and quantity of ptDNA to proceed uncovered can also be considered for intraspecific phylogeo- with genome sequencing. graphic and population genetic studies. The amplified genomic DNA was sent to the W. M. Keck Center for Comparative and Functional Genomics at the University of Illinois at To characterize plastome structural organization and IR Urbana-Champaign, sheared by nebulization, subjected to 454 library evolution in the Apiales, we report the complete genome preparation, and then sequenced using the Roche/454 Genome sequences of Anthriscus cerefolium (L.) Hoffm. (), Sequencer (GS) FLX platform (454 Life Sciences Corp., Branford, Connect- Crithmum maritimum L. (sea samphire), Hydrocotyle verticillata icut), following the manufacturer’s protocol. For two genomes that were Thunb. (whorled marshpennywort), Petroselinum crispum sequenced at two later dates (Hydrocotyle, Tiedemannia), GS FLX Titanium series reagents and plates were used. (Mill.) Fuss (parsley), and Tiedemannia filiformis (Walter) Feist Genome Assembly, Finishing, and Annotation—The obtained nucleo- & S. R. Downie subsp. greenmanii (Mathias & Constance) Feist tide sequence reads were assembled de novo into contiguous sequences & S. R. Downie (giant water cowbane), and compare the (“contigs”) using Newbler 2.3, a DNA sequence assembly software pack- results obtained to previously published plastomes of Daucus age designed specifically for the 454 GS FLX sequencing platform, using default settings (90% minimum overlap identity). To identify contigs that carota L. subsp. sativus Schu¨bl. & G. Martens (domesticated were plastid sequences, all contigs were searched against NCBI’s Nucle- carrot; Ruhlman et al. 2006) and Panax schin-seng T. Nees otide Collection database using BLAST’s megablast option (Altschul et al. (Korean ginseng; Kim and Lee 2004). Anthriscus and Daucus 1990). Putative gene identifications in each plastid contig were then made 338 SYSTEMATIC BOTANY [Volume 40 using the annotation program DOGMA (Dual Organellar GenoMe Anno- amplified and sequenced together (i.e. trnD-trnY-trnE-trnT; trnS-psbZ- tator; Wyman et al. 2004). These contigs were then compared with previ- trnG-trnfM; psbB-psbT-psbN-psbH; rpl36-infA-rps8-rpl14). Sequences from ously published reference sequences from the complete plastid genomes the five Apiaceae genomes were aligned using ClustalX and, if necessary, of Daucus carota subsp. sativus (GenBank accession number DQ898156) manually adjusted using PAUP* 4.0b10 (Swofford 2002) to produce an and Panax schin-seng (AY582139) and assembled using Sequencher 4.7 alignment with the fewest number of changes (indels or nucleotide sub- (Gene Codes Corp., Ann Arbor, Michigan). The lack of genome rear- stitutions). The comparison of these 74 regions included numbers of rangements, low frequency of repeated sequences, and availability of constant, uninformative, parsimony-informative, and variable alignment gene mapping data for three of the five Apiales plastid genomes positions, range of sequence divergence in pairwise comparisons (uncor- (Plunkett and Downie 1999; Lee and Downie 2000) greatly facilitated rected “p” distance), and numbers of total and parsimony-informative assemblies. Gaps between contigs were filled in (or the contigs deter- indels. The number of nucleotide substitutions (uninformative + informa- mined to abut, because there was no missing sequence between adjoining tive alignment positions) and indels for each ptDNA region was tallied; contigs) by identifying sequences at both ends of adjacent contigs and this number divided by the total aligned length resulted in the proportion using these regions to pull out high-quality individual sequence reads of observed mutational events (or percent variability) for each region from the 454 FASTA-formatted output. A minimum of 20 overlapping compared (Shaw et al. 2005). sequence reads in each direction was aligned using ClustalX (Jeanmougin et al. 1998) and their consensus sequence used to link the contigs. This same method was used to identify IR-single copy junctions. Sequences Results between a dozen pairs of contigs were confirmed by designing PCR primer pairs (18–20 bp in size) flanking gaps and Sanger sequencing Plastome Assemblies—Newbler assemblies of the nucle- using an ABI Prism BigDye Terminator v3.1 ready reaction cycle sequenc- otide sequence reads resulted in 54 (Anthriscus)to333 ing kit (Applied Biosystems, Foster City, California) and an ABI 3730xl (Hydrocotyle) non-redundant contigs, a portion of each of DNA Analyzer. Because breaks between contigs generally occurred at or which blasted to published plastid genomes. Average read near IR-single copy junctions and in noncoding regions characterized by homopolymer runs, this Sanger sequencing verified IR boundaries and lengths varied from about 225 bp (Anthriscus, Crithmum, confirmed homopolymeric repeats in these regions. Genomic data for Petroselinum) to 336 and 381 bp (in Hydrocotyle and Tiedemannia, three taxa (Anthriscus, Petroselinum, Crithmum) were also assembled into respectively). Average sequencing depth ranged from about contigs using MIRA (Mimicking Intelligent Read Assembly), under 90 + (Anthriscus, Tiedemannia) to 350 + (Hydrocotyle). A compar- slightly less stringent alignment settings (Chevreux et al. 2000). DOGMA ison of assemblies of the five Apiales plastid genomes is pre- was implemented to assist in annotating all genes and to identify coding sequences, tRNAs, and rRNAs using the plastid/bacterial genetic code; sented in Table 1. start and stop codons were added manually, as were small exons. Exon- Plastid contigs assembled using MIRA were substantially intron boundaries were determined by detailed comparison with other longer than the Newbler assemblies and, in most cases, con- annotated genomes and individual gene sequences, especially that of firmed the consensus sequences that were assembled manu- N. tabacum (Shinozaki et al. 1986). The identification of tRNAs was con- ally from the individual 454 reads. As one example, a MIRA firmed using tRNAscan-SE (Lowe and Eddy 1997; Schattner et al. 2005). Following convention and for annotation purposes, numbering begins re-assembly of Crithmum under the “accurate setting” with the first base of the LSC region following JLA and proceeds counter- resulted in one 68-kb contig spanning many AT-rich clockwise around the circular genome. intergenic spacers, whereas the Newbler-generated assembly Plastome and Sequence Comparisons—Plastid genome sequences resulted in 18 smaller contigs covering the same ptDNA (minus one copy of the IR) of Anthriscus, Crithmum, Petroselinum,and Tiedemannia were compared to the reference Daucus using MultiPipMaker region. In general, Newbler generated more contigs by break- (Schwartz et al. 2003). The options “Search both strands” and “Show all ing at homopolymer runs, even though high-quality reads matches” were chosen. REPuter (Kurtz and Schleiermacher 1999) was existed at the ends of all contigs. In only a few instances did used to identify and locate forward (direct) and inverted (palindromic) the MIRA assemblies differ from those constructed using repeat sequences. Searches of plastomes (upon the removal of one copy of Newbler and these were all single nucleotide differences in the IR) were carried out with minimal repeat size ³30 bp and a Hamming distance of 3. Because REPuter overestimates the number of repetitive regions of poly-A/T uncertainties in intergenic spacers. elements in a given sequence by recognizing nested or overlapping Plastome Structural Characteristics—With the exceptions repeats within a given region containing multiple repeats (Chumley of Crithmum and Petroselinum, where JLB and JLA differ and et al. 2006; Timme et al. 2007), all redundant repeats were identified unique DNA has been incorporated into the LSC region adja- manually and excluded. Circular genomic maps were constructed using OGDraw (Organellar Genome Draw; Lohse et al. 2007). cent to the JLA boundary, all plastomes possess the ancestral The “Extract Sequences” option of DOGMA was used to extract angiosperm plastid genome organization. These genomes intergenic spacers and introns from the annotated sequences in order to ranged in size from 152,890 bp (Petroselinum) to 158,355 bp construct data matrices to identify highly divergent regions. The more (Crithmum), with size extremes caused by contraction or divergent Araliaceae representatives, Panax and Hydrocotyle, were expansion of the IR at J (Table 2). All plastomes except excluded from these analyses. A total of 70 noncoding ptDNA regions LB were compared, representing all intergenic spacers >150 bp in Daucus and Crithmum and Petroselinum share identical complements of introns from both LSC and SSC regions. Coding regions were excluded coding genes, each containing 30 unique tRNA genes, four because they tend to provide fewer variable characters than noncoding unique rRNA genes, and 79 unique protein-coding genes. regions; similarly, all regions contained in the IR were also excluded Allowing for duplication of genes in the IR, there are a total because many of them diverge at slower rates compared to sequences located in their adjacent single-copy regions (Wolfe et al. 1987; Perry and of 130 complete, predicted coding regions in these plastomes. Wolfe 2002; Kim and Lee 2004). Four regions of combined noncoding and Eighteen genes have introns, 16 of which contain a single coding loci were also compared, as these loci are small enough to be PCR- intron, while two (ycf3, clpP) have two introns. All maintain

Table 1. Comparison of assemblies of five Apiales plastid genomes.

Feature Anthriscus cerefolium Crithmum maritimum Hydrocotyle verticillata Petroselinum crispum Tiedemannia filiformis subsp. greenamnii Total no. of non-redundant contigs 54 142 333 162 219 No. of plastid contigs 13 50 63 14 8 Plastid contig size (bp) 241–68,210 120–19,171 110–26,905 242–75,583 3,130–72,953 Average plastid contig size (bp) 9,891 2,596 2,049 9,144 16,019 No. of reads assembled 53,255 129,820 131,886 65,608 29,867 2015] DOWNIE AND JANSEN: COMPARISON OF APIALES PLASTID GENOMES 339

Table 2. Comparison of major structural features of seven Apiales plastid genomes. Features of Panax schin-seng are provided by Kim and Lee (2004) and those of Daucus carota are provided by Ruhlman et al. (2006).

Panax Hydrocotyle Daucus carota Anthriscus Tiedemania filiformis Petroselinum Crithmum Feature schin-seng verticllata subsp. sativus cerefolium subsp. greenmanii crispum maritimum GenBank accession number AY582139 HM596070 DQ898156 GU456628 HM596071 HM596073 HM596072 Entire plastid genome size (bp) 156,318 153,207 155,911 154,719 154,737 152,890 158,355 Large single copy region size (bp) 86,106 84,352 84,242 84,774 84,585 86,116 85,230 Small single copy region size (bp) 18,070 18,739 17,567 17,551 17,140 17,508 17,139 Inverted repeat size (bp) 26,071 25,058 27,051 26,197 26,506 24,633 27,993 A + T content (%) 61.9 62.4 62.3 62.6 62.7 62.2 62.4 Number of repeats ³ 30 bp 11 (57) 9 (41) 14 (70) 12 (41) 20 (39) 18 (41) 29 (71) (maximum size in bp) Number of homopolymers 270 (244; 26) 283 (257; 26) 303 (281; 22) 320 (293; 27) 291 (262; 29) 298 (271; 27) 326 (300; 26) ³ 7bp(AorT;GorC) Homopolymer length (bp) 7–13 (7–13; 7–19 (7–19; 7–17 (7–17; 7–21 (7–21; 7–14 (7–14; 7–16 (7–15; 7–20 (7–20; (A or T; G or C) 7–11) 7–11) 7–11) 7–11) 7–11) 7–16) 7–12)

conserved intron boundaries, and with the exception of the Tiedemannia,JLB occurs within rps19 resulting in the duplica- sole group I intron found in trnL-UAA, all are group II tion of a portion of this gene (56 bp) in IRA, and there are 2 bp introns (Michel et al. 1989). The intron-containing gene rps12 of noncoding sequence between JLA and trnH-GUG. The is trans-spliced; its 50 end exon (50rps12) occurs in the LSC Hydrocotyle verticillata plastome contains the same complement region and its second and third exons (30rps12) reside some of uniquely occurring and duplicated genes as in Anthriscus 27 kb away in IRB separated by an intron (an additional copy and Tiedemannia.Here,JLB extends into rps19 resulting in the 0 of 3 rps12 with intron exists in IRA). Similar to other angio- duplication of a portion of this gene (62 bp) in IRA, and there sperm plastid genomes, Apiales plastomes are AT-rich. are 6 bp of noncoding sequence between JLA and trnH-GUG. A comparison of the major structural features of each of Relative to the previous three plastomes, the IR of Petroselinum these five plastid genomes, along with comparable data for crispum has contracted ~1.5 kb such that all of rps19 and most the previously published Daucus and Panax plastomes, is pro- of rpl2 are now single-copy. JLB is contained within rpl2 and a vided (Table 2, Fig. 1). In the Anthriscus cerefolium plastome small portion of this gene (37 bp) is duplicated at the end of JLB occurs within gene rps19, resulting in the duplication of a IRA.InPetroselinum, there are now only four intron-containing 0 portion of this gene (99 bp) in IRA at JLA. There are 2 bp of genes duplicated in the IR, not five. Between JLA and the 3 end 0 noncoding sequence between JLA and the 3 end of gene trnH- of trnH-GUG in the LSC region there are 345 bp of non- GUG in the LSC region. The Tiedemannia filiformis subsp. coding sequence. Unlike the previous four genomes, the IR of greenmanii plastid genome contains the same complement of Crithmum maritimum has expanded outwards ~1.5 kb such uniquely occurring and duplicated genes as in Anthriscus.In that all of rps19 and previously single-copy genes rpl22 and

Fig. 1. Comparison of IR-single copy border positions across seven Apiales plastid genomes (vertical lines), with sizes of each of the four major plastid genome components indicated (LSC, IRB, SSC, IRA). The various lengths of complete genes (rps19, rpl2, ycf1), pseudogenes denoted by asterisks (ycf1*, rps19*, rpl2*), and intergenic spacer regions (JSB-ndhF,JLA-trnH) adjacent to JLB,JSB,JSA,andJLA are also indicated. In Crithmum,JLB occurs between 0 genes rpl16 and rps3, whereas in Petroselinum,JLB occurs in the rpl2 5 exon. In all other Apiales examined, JLB occurs at different positions within rps19.In Hydrocotyle and Crithmum, the final 32 and 7 nucleotides, respectively, of IRB adjacent to JSB are shared by the genes ycf1*andndhF, in opposite transcriptional orientations. Crithmum and Petroselinum have incorporated large regions of novel DNA (of 1,463 and 345 bp, respectively) into JLA-trnH. 340 SYSTEMATIC BOTANY [Volume 40 rps3 are now contained within the IR. Nine protein-coding identity of ³90% among the five newly sequenced plastomes genes are contained entirely within the IR, not six as in the (Table 2). These repeated sequences were detected in coding other plastomes, and between JLA and trnH-GUG there are regions (primarily ycf2 and the three serine transfer-RNA 1,463 bp of noncoding sequence. The circular plastome gene [trnS] genes that recognize different codons, but also in psaA map of Crithmum can be viewed as online supplementary data and psaB), introns (ycf3, petB, petD, ndhB, and ndhA), and most (supplementary Fig. S1). commonly, within 21 intergenic spacer regions scattered Junctions JSB and JSA are in the same relative gene positions throughout the entire plastomes. The majority of these repeti- 0 in all seven Apiales plastomes (Fig. 1). JSB occurs within the 3 tive elements were direct repeats. Maximum repeat sizes end of ndhF (Hydrocotyle, Crithmum) or between genes ndhF ranged from 39 bp (Tiedemannia)to71bp(Crithmum). Several and ycf1*, the latter existing as a pseudogene in IRB (Panax, of these repeats occurred in the same locations and were Daucus, Anthriscus, Tiedemannia, Petroselinum). In Hydrocotyle shared by all five Apiaceae plastomes, either within genes, and Crithmum, the final 32 and 7 nucleotides, respectively, of introns, or intergenic spacers. IRB are shared by genes ycf1* and ndhF, albeit transcribed in Repeat analysis of the 345-bp region between JLA and trnH opposite directions (Fig. 1). Ycf1 varies in size from 5,382– of Petroselinum failed to detect any direct or inverted repeats 5,760 bp, and possesses numerous indels. In all seven ³30 bp. In contrast, a series of direct repeats characterize the plastomes, JSA occurs in ycf1, but at different positions within 1,463-bp Crithmum JLA-trnH region, with the four largest of this large and variably sized gene. As examples, in the these being perfect or nearly perfect repeats of 71, 67, 49, and Tiedemannia plastome, 2,069 bp of ycf1 are duplicated in the 39-bp. Many smaller repeats were also identified through IR, whereas in Hydrocotyle only 768 bp of ycf1 are duplicated. manual examination, with the largest of these being 27- and Gene rps19, at 279 bp in size in all plastomes, overlaps JLB in 21-bp in size, plus an 18-bp repeating unit occurring three Panax, Hydrocotyle, Daucus, Anthriscus, and Tiedemannia, and times in tandem. All of these repeats were restricted to posi- as a result various lengths (51–99 bp) of rps19 pseudogenes tions 4–943 of the Crithmum plastome. ³ are located adjacent to JLA in IRA (Fig. 1). The number of homopolymers 7 bp in each of the five An examination of DNA sequences flanking JSB and JSA newly sequenced plastid genomes ranged between 283 and across all seven species reveals that, other than Daucus and 326, with A or T polymers outnumbering G or C polymers by Anthriscus (both Apiaceae tribe Scandiceae) whose IR-SSC ratios of 9.0–11.5 to 1 (Table 2). Generally, the largest homo- junctions are in identical positions, there are no significant polymers were composed of A’s or T’s, the largest being sequence motifs or repetitive elements in common at or near 21-bp in size (Table 2). In each genome, the majority of homo- these junctions which may have instigated IR boundary polymers was 7 bp long (173–186), followed by 8 bp long changes. Alignment of DNA sequences flanking JLA across (42–65), then 9 bp long (28–40). As the length of the homo- Anthriscus, Daucus, Hydrocotyle, Tiedemannia, Panax,and polymer increased, fewer numbers were identified. Homo- Araliaceae species Tetrapanax papyrifer (Wang et al. 2008) and polymer sequences of >13 bp occurred infrequently in each Eleutherococcus senticosus (Yi et al. 2012), also revealed no com- genome (0–6). Among the largest homopolymers, the single monsequenceoneithersideofJLA. Differences among these 21-bp poly-A homopolymer (Anthriscus) occurred between seven species are due to length rather than point mutations, IR genes rpl2 and rps19*; the single 20-bp poly-A homopoly- and variable poly(A) tracts are apparent between genes rpl2 mer (Crithmum) occurred between LSC genes trnT-GGU and and rps19*, ranging in length from A8 in Eleutherococcus, Panax, psbD; the single 19-bp poly-T homopolymer (Hydrocotyle) and Tetrapanax,toA21 in Anthriscus. The sequences flanking occurred between LSC genes rps8 and rpl14; and the single JLA in Crithmum and Petroselinum plastomes also each appear 18-bp poly-A homopolymer (Hydrocotyle)occurredinSSC to be unique, although in these species novel, noncoding DNA gene ccsA. The vast majority of homopolymers, and all but 0 is apparent between JLA and 3 trnH-GUG. one of the largest ones (>13 bp), occurred in intergenic spacers; The positions and percent identities of gap-free segments several others occurred in the variable and large hypothetical of pairwise alignments of Anthriscus, Crithmum, Tiedemannia, coding frames ycf1 and ycf2. and Petroselinum with Daucus are shown as a MultiPip (Fig. 2). Novel DNA—Both Petroselinum and Crithmum contain The positions of the genes shown on the MultiPip are relative novel DNA sequences between JLA and LSC gene trnH-GUG to the Daucus plastome coordinates, the reference sequence. of 345 and 1,463 bp, respectively (Fig. 1). In contrast, the Most of the Daucus sequence aligns with that of the other four other three newly sequenced plastomes are similar to Daucus species of Apiaceae. Coding regions are represented as mostly and Panax in having only 2–6 bp of noncoding sequence in unbroken aligning segments of relatively high percent identity this region. These two large insertions show no sequence (95–98%); noncoding regions, on the other hand, show more similarity to any other portions of their respective plastid mismatches and indels, with percent identity values much genomes (by comparing these insert sequences to their trun- lower. Highly variable noncoding regions are scattered cated plastid genomes under varying degrees of stringency), throughout the single-copy regions and include intergenic nor do they match any ptDNA data currently available in spacers ndhF-rpl32 and rpl32-trnL in the SSC region and GenBank. A BLAST search querying the 345-bp Petroselinum 50trnK-30rps16,50rps16-trnQ,andtrnD-trnY-trnE-trnT in the sequence resulted in multiple hits to several angiosperm LSC region. A large segment of the Daucus plastome is not mitochondrial genome sequences. The best alignment score, present in any of the secondary sequences. This region occurs with 35% query coverage (positions 224–345, numbering 0 in the IR between genes 3 rps12 and trnV-GAC at Daucus coor- beginning the first bp in the LSC region following JLA and dinates ~99,300–100,800 and has been identified previously as proceeding counter-clockwise), showed 91% sequence simi- a putative mitochondrion to plastid transfer region (Goremy- larity to a mitochondrial DNA (mtDNA) intergenic spacer kin et al. 2009; Iorizzo et al. 2012a,b). between genes cytochrome b (cob) and ORF25 in several Repeat Analysis—Repeat analysis identified 9–29 direct Daucus carota subsp. sativus cultivars and breeding lines (Bach and inverted repeats of 30 bp or longer with a sequence et al. 2002). The same high match was obtained against the 2015] DOWNIE AND JANSEN: COMPARISON OF APIALES PLASTID GENOMES 341

Fig. 2. Multiple percent identity plots (MultiPip) comparing the Daucus plastid genome (as the reference sequence) to Anthriscus, Tiedemannia, Petroselinum,andCrithmum plastid genomes using MultipPipMaker (Schwartz et al. 2003). IRA is excluded. The top line is a gene map of Daucus (Ruhlman et al. 2006); introns are depicted by white boxes. Sequence similarity of aligned regions in Anthriscus, Tiedemannia, Petroselinum,andCrithmum is shown as horizontal bars in each plot indicating average percent identity between 50–100% (vertical axis). The horizontal axis represents the coordinates in the Daucus plastid genome. Parallel lines, as in psaA and psaB, indicate repeated sequence. 342 SYSTEMATIC BOTANY [Volume 40

Fig. 2. continued. complete Daucus carota subsp. sativus mitochondrial genome in Table 3. Aligned lengths for each locus ranged from 172 sequence, at positions 193,593–193,712 (Iorizzo et al. 2012b) (trnfM-rps14) to 1,545 bp (50rps16-trnQ) and the number of representing an intergenic spacer also bounded by genes cob parsimony-informative positions ranged from two (for sev- and atp4 (ORF25). A BLAST search querying the 1,463-bp eral loci) to 73 (trnE-trnT). The number of variable alignment novel sequence of Crithmum resulted in no significant align- positions ranged from seven (trnfM-rps14) to 337 (rpl32-trnL). ments; however, 39 bp of this sequence immediately adjacent There was no obvious relationship between the length of the to trnH-GUG showed 87% sequence similarity to a portion aligned sequence and the number of variable or parsimony- of the 345-bp Petroselinum insert and to the Daucus carota informative positions, a relationship consistent with other mitochondrial genome fragment. The Crithmum trnH-psbA plastome sequence comparisons (Shaw et al. 2007). Pairwise region also contains novel DNA and this region was unalign- sequence divergence values for each locus ranged from 0– able with the other Apiales species examined; a BLAST 3.1% for trnfM-rps14 to 8.5–26.2% for rpl32-trnL. The number search of this sequence resulted in a single hit to the same of indels in each region ranged from two (rpoC2-rpoC1)to spacer region in another accession of Crithmum maritimum 55 (ndhF-rpl32) and the number of parsimony-informative (Degtjareva et al. 2009). indels ranged from zero (for multiple loci) to 10 (50rps16- Identification of Divergent Regions—A comparison of trnQ). Percent variability ranged from 6.4 (trnfM-rps14)to sequence features across five Apiaceae plastid genomes and 38.7% (rpl32-trnL). Of the four combined regions exam- 70 noncoding ptDNA loci (>150 bp in Daucus) is presented ined, trnD-trnY-trnE-trnT was the most variable, with 30.5% 2015] DOWNIE AND JANSEN: COMPARISON OF APIALES PLASTID GENOMES 343

Table 3. Comparison of 74 noncoding loci from five plastid genomes of Apiaceae (Daucus, Anthriscus, Tiedemannia, Petroselinum, and Crithmum). For the individual intergenic spacers and introns, the loci are listed as they occur on the Crithmum ptDNA map (Fig. S1), starting at JLA, proceeding counter- clockwise, and excluding the two IR regions. These noncoding loci represent all intergenic spacers >150 bp and introns from both LSC and SSC regions. The trnH-psbA locus excludes Crithmum because of alignment ambiguities.

Parsimony- Parsimony- Aligned length Constant Uninformative informative Variable Sequence Total informative Percent Locus Length (bp) (bp) positions positions positions positions divergence (%) indels indels variability trnH-psbA 180–191 197 165 29 3 32 5.5–12.5 10 1 21.3 psbA-30trnK 204–259 265 226 32 7 39 5.4–12.6 8 0 17.7 30trnK-matK 221–283 284 268 13 3 16 1.8–4.4 3 0 6.7 matK-50trnK 716–736 760 709 44 7 51 1.4–3.7 16 2 8.8 50trnK-30rps16 508–751 817 641 139 37 176 9.1–16.2 22 1 24.2 rps16 intron 837–875 902 809 83 10 93 2.3–6.7 16 3 12.1 50rps16-trnQ 1,087–1,432 1,545 1,307 204 34 238 3.8–11.9 40 10 18.0 trnQ-psbK 343–362 363 341 14 8 22 0.9–4.0 6 0 7.7 psbK-psbI 316–412 434 361 59 14 73 3.6–12.8 15 3 20.3 trnS-50trnG 510–557 605 528 63 14 77 3.8–10.0 18 3 15.7 trnG intron 698–712 726 654 50 22 72 3.9–6.6 12 2 11.6 atpF intron 721–741 770 682 79 9 88 1.1–8.4 19 5 13.9 50atpF-atpH 275 – 412 420 361 49 10 59 3.0–9.6 8 1 16.0 atpH-atpI 742–1169 1,209 1,028 154 27 181 3.5–9.5 28 4 17.3 atpI-rps2 264–290 338 320 16 2 18 1.9–4.3 10 2 8.3 rps2-rpoC2 209–230 257 234 18 5 23 0.5–7.3 8 1 12.1 rpoC2-rpoC1 211–214 216 202 10 4 14 1.9–4.3 2 0 7.4 rpoC1 intron 742–751 775 716 47 12 59 2.2–4.2 15 2 9.5 rpoB-trnC 1,189–1,272 1,360 1,155 178 27 205 4.3–10.5 42 6 18.2 trnC-petN 475–690 755 662 82 11 93 3.7–8.8 25 4 15.6 petN-psbM 1,073–1,131 1,213 1,031 150 32 182 3.8–9.0 33 6 17.7 psbM-trnD 613 – 689 704 603 86 15 101 4.4–9.9 12 1 16.1 trnE-trnT 599–794 850 575 202 73 275 13.9–23.9 42 3 37.3 trnT-psbD 1,235–1,465 1,526 1,298 187 41 228 3.7–10.0 35 7 17.2 psbC-trnS 229–254 262 242 18 2 20 2.1–5.3 7 2 10.3 trnS-psbZ 340–394 400 357 33 10 43 3.7–6.6 12 2 13.8 psbZ-trnG 276–299 317 285 28 4 32 2.1–6.4 12 1 13.9 trnfM-rps14 159–166 172 165 5 2 7 0–3.1 4 2 6.4 psaA-ycf3 exon3 697–721 749 683 56 10 66 2.1–5.6 15 0 10.8 ycf3 intron2 735–784 794 730 55 9 64 1.3–5.3 11 1 9.4 ycf3 intron1 697–731 755 695 53 7 60 2.1–4.3 13 3 9.7 ycf3 exon1-trnS 693–877 903 759 122 22 144 3.1–11.2 28 1 19.0 trnS-rps4 186–320 328 278 45 5 50 4.9–8.9 9 2 18.0 rps4-trnT 353–382 408 347 43 18 61 4.1–10.5 16 3 18.9 trnT-50trnL 766–803 863 730 107 26 133 3.7–11.1 29 3 18.8 trnL intron 496–518 522 492 28 2 30 0.8–4.8 5 1 6.7 30trnL-trnF 310–375 377 334 39 4 43 2.9–7.7 12 2 14.6 trnF-ndhJ 385–399 427 365 48 14 62 3.1–9.9 12 2 17.3 ndhC-30trnV 830–1,102 1,152 942 181 29 210 2.3–16.7 43 7 22.0 trnV intron 558–568 570 527 33 10 43 1.4–4.1 8 1 8.9 trnV-trnM 173–183 183 167 12 4 16 2.4–5.2 3 0 10.4 atpB-rbcL 742–770 839 742 86 11 97 1.3–8.1 22 0 14.2 rbcL-accD 582– 605 631 573 45 13 58 2.9–5.9 15 3 11.6 accD-psaI 370– 479 525 443 73 9 82 4.2–14.1 26 3 20.6 psaI-ycf4 386– 422 449 397 43 9 52 3.6–7.1 18 1 15.6 ycf4-cemA 682–863 926 837 75 14 89 2.4–7.2 14 1 11.1 cemA-petA 229–267 275 253 20 2 22 1.2–6.3 7 0 10.5 petA-psbJ 733–967 1049 888 144 17 161 5.7–9.4 41 4 19.3 psbE-petL 916 –1,160 1,215 1,068 124 23 147 3.2–7.5 27 1 14.3 trnP-psaJ 378–384 386 338 39 9 48 3.2–8.2 4 1 13.5 psaJ-rpl33 437–462 486 403 75 8 83 3.6–10.3 18 4 20.8 rps18-rpl20 202–230 235 193 35 7 42 1.8–10.8 9 4 21.7 rpl20–50rps12 769–785 807 738 58 11 69 2.2–4.7 13 2 10.2 50rps12–30clpP 145–172 177 157 17 3 20 3.4–10.8 5 0 14.1 clpP intron2 619–660 670 610 53 7 60 2.9–4.8 27 3 13.0 clpP intron1 839–850 874 789 75 10 85 1.6–6.4 22 3 12.2 50clpP-psbB 432–495 534 490 37 7 44 2.2–6.6 14 2 10.9 psbB-psbT 172–206 213 194 13 6 19 1.0–6.5 9 0 13.1 petB intron 748–756 757 693 54 10 64 1.7–4.9 5 0 9.1 petD intron 747–773 786 721 54 11 65 2.1–4.8 13 1 9.9 rps8-rpl14 194–203 206 179 23 4 27 1.0–10.9 5 0 15.5 rpl16 intron 883–952 1,000 883 100 17 117 2.4–7.4 32 5 14.9 ndhF-rpl32 843 –1,086 1,202 945 213 44 257 7.0–15.0 55 5 26.0 rpl32-trnL 634–975 1,006 669 296 41 337 8.5–26.2 52 2 38.7 ccsA-ndhD 205–224 228 172 44 12 56 4.9–17.5 13 2 30.3 psaC-ndhE 260–286 294 249 33 12 45 3.3–10.4 5 0 17.0

(Continued) 344 SYSTEMATIC BOTANY [Volume 40

TABLE 3. (CONTINUED).

Parsimony- Parsimony- Aligned length Constant Uninformative informative Variable Sequence Total informative Percent Locus Length (bp) (bp) positions positions positions positions divergence (%) indels indels variability ndhE-ndhG 209–230 235 197 34 4 38 2.4–10.9 5 0 18.3 ndhG-ndhI 301–362 377 319 50 8 58 4.0–9.4 14 1 19.1 ndhA intron 1,055–1,104 1,142 1,005 111 26 137 2.8–7.6 30 1 14.6 rps15-ycf1 289– 402 427 358 58 11 69 3.5–11.4 17 0 20.1 Combined Regions trnD-trnY-trnE-trnT 945 –1,150 1,219 898 227 94 321 7.6–19.3 51 6 30.5 trnS-psbZ-trnG-trnfM 1,065–1,133 1,206 1,099 90 17 107 2.5–5.0 34 5 11.7 psbB-psbT-psbN-psbH 594– 626 638 611 19 8 27 1.3–2.7 12 1 6.1 rpl36-infA-rps8-rpl14 1,066–1,079 1,083 994 76 13 89 2.1–5.0 8 0 9.0 variability, 7.6–19.3% sequence divergence, 321 variable the highest percent variability of all loci examined (14.3– alignment positions, and 51 indels. 38.7%). They also had among the highest pairwise sequence A comparison of the 25 most divergent noncoding divergence values (19.3% for trnD-trnY-trnE-trnT, 23.9% for plastome regions, as represented by numbers of variable trnE-trnT, and 26.2% for rpl32-trnL). No intron was included and parsimony-informative alignment positions, is presented in these top 13 most divergent loci, but of those included in in Fig. 3. Among the most variable loci include rpl32-trnL, the analysis four (ndhA, rpl16, rps16, and atpF) had the trnD-trnY-trnE-trnT, trnE-trnT, ndhF-rpl32, 50rps16-trnQ, trnT- greatest numbers of variable alignment positions (88–137) psbD, ndhC-30trnV, rpoB-trnC, petN-psbM, atpH-atpI, 5-trnK- and among the highest sequence divergence and % variabil- 30rps16, petA-psbJ, and psbE-petL. These 13 regions had among ity values (Table 3).

Fig. 3. Histogram showing the numbers of variable (dark boxes) and parsimony-informative (white boxes) positions in 25 of the most variable noncoding loci (intergenic spacers and introns) identified among the 74 regions compared from five Apiaceae plastid genomes. 2015] DOWNIE AND JANSEN: COMPARISON OF APIALES PLASTID GENOMES 345

Discussion sperms, it may not actually contain the exact ancestral IR boundaries (Raubeson et al. 2007). In many instances (like Plastome Comparisons—The plastomes of Anthriscus, those of some Apiales), an rps19 pseudogene of varying Crithmum, Hydrocotyle, Petroselinum, and Tiedemannia are length remains in IR immediately adjacent to the LSC highly conserved in size, structure, and gene order and con- A region; in three other instances, rps19 is full-length (and tent compared to the previously published plastomes of therefore exists as two complete copies in the IR) and J Panax (Kim and Lee 2004) and Daucus (Ruhlman et al. 2006). LA occurs between rps19 and trnH-GUG. The majority of these They are also similar to the plastome of Eleutherococcus plastomes (94) have between 0 and 30 nucleotides of noncod- senticosus (Rupr. & Maxim.) Maxim. (Siberian ginseng), the 0 ing sequence between JLA and the 3 end of trnH-GUG. most recent member of Araliaceae to have its plastid genome Twelve of the remaining 14 plastomes have between 34 and sequenced (Yi et al. 2012). Each of these Apiales plastomes 71 bp of noncoding sequence in this region, one species of shares characteristics of gene content and organization with Gossypium L. has 121 bp of noncoding sequence, and Hevea the ancestral angiosperm plastid genome, as most frequently brasiliensis (Willd. ex A. Juss.) Mu¨ ll. Arg. has 198 bp of non- represented by N. tabacum (Raubeson et al. 2007). The coding sequence in this region. Therefore, the large insertions ptDNAs of Anthriscus and Daucus, both Apiaceae tribe of novel, noncoding sequence between JLA and trnH-GUG of Scandiceae, are colinear, although the plastid genome map 345 nucleotides (Petroselinum)and1,463nucleotides(Crithmum) presented in Ruhlman et al. (2006) for Daucus indicates an are highly unusual among angiosperms. Similar large inser- additional open reading frame (ORF80) in the IR. Addition- tions of novel DNA (of between 214 and 393 nucleotides) have ally, Daucus contains two copies of trnG-GCC in the LSC been reported within this same region in other Apiaceae spe- region, whereas in all other Apiales plastomes considered cies and their pattern of distribution suggests that these inser- herein, one of these is replaced by trnG-UCC (between trnS- tions may be restricted to the “apioid superclade” of Apiaceae GCU and trnR-UCU). The sizes of these Apiales plastomes subfamily Apioideae (Peery et al. 2007). In most monocot are within the known range for most angiosperms possessing plastomes, the gene trnH is located between rpl2 and rps19 an IR; however, the size of the Crithmum IR, at 27,993 bp, is at in the IR (Mardanov et al. 2008; Wang et al. 2008; Wang the higher end of its 20–30 kb average size (Plunkett and and Messing 2011), and between JLA and the first gene in the Downie 2000; Cai et al. 2006; Chumley et al. 2006; Ruhlman LSC region, psbA, there are 82–148 nucleotides of noncod- et al. 2006; Haberle et al. 2008; Wicke et al. 2011). ing sequence. These sequences are also unique within the The IR boundaries of Anthriscus, Daucus, Hydrocotyle, plastome (Hansen et al. 2007), but to the best of our knowledge Tiedemannia, Panax,andEleutherococcus are also in the same these regions do not show any similarity to angiosperm mito- relative positions as in many other unrearranged non-monocot chondrial DNA. angiosperm plastid genomes. In these Apiales taxa, JLB occurs Mitochondrial DNA Transfer into the Plastid Genome— within rps19, resulting in the duplication of a portion of this Angiosperm mitochondrial genomes readily accept DNA of gene at the end of IRA adjacent to JLA.JLA occurs down- plastid and nuclear origin (Knoop 2004; Kubo and Mikami stream from gene trnH-GUG in the adjacent LSC region. In 2007; Richardson and Palmer 2007; Goremykin et al. 2009; Petroselinum, however, the IR has contracted ~1.5 kb at the Iorizzo et al. 2012b), as well as sequences of horizontal origin IRB-LSC boundary, such that JLB is contained within the rpl2 from foreign genomes (Bergthorsson et al. 2003, 2004; Mower 0 5 exon, and in Crithmum, the IR has expanded ~1.5 kb at the et al. 2010). In contrast, intracellular or horizontal gene trans- IRB-LSC boundary to duplicate genes rpl22 and rps3.These fer to the plastid genome is extremely rare, with the few large IR junction shifts (>1,000 bp) are unusual, for they occur reported cases restricted to some red and green algal plastids less frequently in angiosperm plastomes than do IR con- (Sheveleva and Hallick 2004; Rice and Palmer 2006; Kleine tractions and expansions of <100 bp (Goulding et al. 1996). et al. 2009). Until recently, no convincing evidence of DNA Another major difference between the plastomes of Petroselinum transfer from either the nucleus or the mitochondrion into and Crithmum and those of most other angiosperms is the the angiosperm plastome has been documented (Palmer presence of a large insertion of novel, noncoding DNA in the 1985b, 1990; Rice and Palmer 2006; Richardson and Palmer 0 region between JLA and the 3 end of trnH-GUG, with portions 2007; Bock and Timmis 2008; Bock 2010); in fact, such a of these inserts showing high sequence similarity to mitochon- transfer has generally been considered highly unlikely, pos- drial DNA. sibly reflecting the lack of a DNA uptake system in plastids Complete sequence data for 199 angiosperm plastid (Richardson and Palmer 2007; Kleine et al. 2009; Smith 2011). genomes (representing 134 genera) are publicly available in Several recent papers, however, have identified DNA NCBI’s Organelle Genome Resources database (http://www sequences of putative mitochondrial origin in angiosperm .ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?taxid=2759 plastomes or invoked intracellular or horizontal transfer to &opt=plastid; accessed February 1, 2013). Excluding most explain the origin of novel sequences within plastid genomes. monocots and other angiosperm species whose plastid In the Pelargonium + hortorum L. H. Bailey (Geraniaceae) genomes are rearranged relative to the ancestral angiosperm plastome, for example, ORF56 and ORF42 in the trnA-UGC plastome gene order (as exemplified by N. tabacum) and intron showed high sequence similarities to the mitochondrial those species whose plastomes have lost one copy of the ACR-toxin sensitivity (ACRS) gene of Citrus jambhiri Lush. IR (Downie and Palmer 1992; Raubeson and Jansen 2005; (Rutaceae) and to the mitochondrial pvs-trnA gene of Phaseolus Guisinger et al. 2011; Jansen and Ruhlman 2012), the JLA L. (Fabaceae), respectively (Chumley et al. 2006). ORF56 is boundaries for the remaining 108 plastomes (73 genera) typ- also shared by both plastid and mitochondrial genomes in ically occur between rpl2 and trnH-GUG, with the latter pres- other taxa, suggesting transfer from the mitochondrion into ent in its entirety only in the LSC region (Wang et al. 2008). the plastid (Chumley et al. 2006). The plastid genome of Trifo- While the N. tabacum plastome is frequently inferred to rep- lium subterraneum L. (Fabaceae) contains 19.5 kb of unique resent the ancestral plastid genome organization for angio- DNA distributed among 160 fragments ranging in size from 346 SYSTEMATIC BOTANY [Volume 40

30–494 bp, some of which was hypothesized to represent searches of the 1,463-bp Crithmum insert resulted in no signif- instances of horizontal transfer from bacterial genomes (Cai icant hits, except for a small (39-bp) region that also matched a et al. 2008). Other legume plastid genomes contain much repet- portion of the Petroselinum insert and the D. carota mitochon- itive sequences and unique DNA too, and speculation on their drialgenomefragment.Theoriginofthelargenumberofrepet- origins included intracellular transfers from the mitochon- itive elements within the Crithmum insert remains unclear. We drion or nucleus, or horizontal transfers from other genomes, interpret these results cautiously, for we have not firmly dem- possibly pathogenic bacteria (Cai et al. 2008). Another example onstrated the mitochondrial provenance of this novel DNA, of transfer of mitochondrial DNA to the plastid genome has and suggest that the sequencing of whole mitochondrial been identified in Apocynaceae tribe Asclepiadeae (Ku et al. genomes from the Apiaceae, specifically from Petroselinum, 2013; Straub et al. 2013). In this case, it was suggested that may eventually shed light on its origin. homologous recombination is responsible for the transfer of Repeated Sequences—For the five newly sequenced the mitochondrial sequence into the plastome. Thus, it appears plastomes, repeat analysis identified 9–29 dispersed repeats the transfer of mitochondrial DNA to the plastome is more of 30 bp or longer with a sequence identity of 90% or greater. common than previously thought. In the previously published Daucus plastome, repeat analysis Goremykin et al. (2009) uncovered two small sequences (of identified 12 direct and two inverted repeats of ³30 bp, four 74 and 126 bp) of the Vitis vinifera L. (Vitaceae) mitochondrial of which were direct repeats (of 30–70 bp) occurring in ycf2 genome that were highly similar to regions of the Daucus (Ruhlman et al. 2006). In Panax, nine direct and two inverted carota plastome. BLAST analysis of the larger sequence repeats ³30 bp were identified, the largest of these being four revealed a high similarity to the coding region of the mito- tandem repeats of 57-bp in ycf1 (Kim and Lee 2004). In the chondrial cytochrome c oxidase subunit 1 gene (cox1), prompt- Eleutherococcus plastome, a total of 23 repeats ³30 bp were ing the authors to suggest that its presence in the Daucus located, the largest being 79 bp in size and also occurring in plastome might possibly represent a rare transfer of DNA ycf2 (Yi et al. 2012). Repeated sequences, other than the IR, from the mitochondrion into the plastid. These two sequences are considered to be uncommon in plastid genomes and are are contained within a large 1,439-bp fragment of the D. carota more prevalent in those plastomes that have undergone IR (positions 99,309–100,747 and 139,407–140,845 in Ruhlman major changes in genome organization (Palmer 1985a, 1991; et al. 2006) that is a part of the 30rps12-trnV-GAC intergenic Chumley et al. 2006; Haberle et al. 2008; Weng et al. 2014). spacer region; this fragment, however, has no similarity to any Considering all Apiales plastomes investigated to date, repeats other published plastid nucleotide region (Goremykin et al. occurred most often in five introns and numerous intergenic 2009). The MultiPip (Fig. 2) shows that this large fragment of spacer regions scattered throughout the genome; within coding the Daucus plastome is not present in any other Apiaceae regions, repeats were prevalent in genes ycf2, trnS (3 genes), plastome sequenced herein, including its closest ally Anthriscus psaA, and psaB. The repeated elements identified in Apiales (both Apiaceae tribe Scandiceae). Subsequently, Iorizzo et al. are similar in number, size, and location to those of other (2012a,b) confirmed the presence of this unique 1,439-bp frag- angiosperms whose plastomes are structurally unrearranged ment (identified as 1,452 bp in their study) in the Daucus carota (Ruhlman et al. 2006; Yi et al. 2012). The origin of these repeats plastome, while also discovering that it is present as three non- is not known, although replication slippage could be responsi- contiguous sequences in the D. carota mitochondrial genome. ble for generating direct repeats (Palmer 1991). Further, the authors documented consistent conservation of a Similarly, homopolymer sequences of >13 bp occurred large portion of this region across all mitochondrial genomes of infrequently in each plastid genome. In Panax, 18 homopoly- the diverse Apiaceae species they examined (representing seven mers ³10 bp were identified previously (the largest being additional species of Daucus and six other genera of Apiaceae, 13-bp in size), the majority being composed of multiple A’s the latter including five species from the “apioid superclade” or T’s (Kim and Lee 2004). In Eleutherococcus, 27 homopoly- and planum L. of subfamily Saniculoideae). In the mers ³10 bp were identified, of which 22 were composed of plastid genome, this fragment, or a large portion thereof, multiple A or T bases and the largest were also of 13 bp in was present only in Daucus and Cuminum L., both of tribe size (Yi et al. 2012). These repetitive sequences, of which Scandiceae subtribe Daucinae (Lee et al. 2001). Iorizzo et al. there are many in the Apiales, may be potentially useful for (2012a) concluded that their results provided strong evidence population-level genetic studies, as sequence length poly- of a mitochondrial to plastid transfer of DNA, and the pres- morphisms can be surveyed among multiple accessions of a ence of this putative mitochondrial to plastid transfer region in species revealing haplotype variation (Yi et al. 2012, and ref- Scandiceae subtribe Daucinae but not in subtribe Scandicinae erences cited therein). Homopolymer runs are difficult to (Anthriscus) suggests this transfer event occurred sometime resolve using the Roche/454 platform (and other sequencing during the early evolution of the Daucinae clade. The authors methods) and contribute to sequencing errors (Moore et al. also suggested that a retrotransposon-like event might have 2006). In our study, Sanger sequencing was used for indepen- been responsible for transferring this sequence into the plastid dent confirmation of only a few homopolymeric repeats at or genome (Iorizzo et al. 2012b). near IR-single copy junctions. However, with the low depth Crithmum and Petroselinum also incorporate novel DNA into of coverage at some contig ends and the fact that very few their plastomes. BLAST searches of the 345-bp Petroselinum homopolymer runs were confirmed by Sanger sequencing, unique insert resulted in 122 bp of this region having multiple we caution that sequencing errors are likely present, espe- hits to angiosperm mitochondrial DNAs from a variety of cially within the largest homopolymeric regions. angiosperms, with the best alignment score matching a por- All major shifts in JLB position documented in Apiaceae are tion of the cob-atp4(ORF25) intergenic spacer in the Daucus restricted to members of the “apioid superclade” (Plunkett carota mitochondrial genome. This high sequence similarity is and Downie 1999, 2000). These include both major IR expan- suggestive of a possible intracellular transfer of DNA into the sions and contractions, such as those presented herein for plastid from the Petroselinum mitochondrial genome. BLAST Crithmum and Petroselinum. The prevalence of IR junction 2015] DOWNIE AND JANSEN: COMPARISON OF APIALES PLASTID GENOMES 347 shifts in this one (albeit large) clade suggests that some com- tribe Apieae (the “Apium group”), as represented herein by mon mechanism is responsible for flux in position of JLB and Petroselinum,butalsobyAmmi L., Anethum L., Apium L., that this mechanism likely originated in the immediate com- Foeniculum Hill., and Ridolfia Moris. in Plunkett and Downie mon ancestor of the group (Plunkett and Downie 2000). The (2000), are all characterized by a ~1.5 kb JLB contraction. underlying mechanisms responsible for causing such large L. (Conium clade) and Pimpinella L. (tribe Pimpinelleae) IR junction shifts are not yet known, but may involve homol- also possess a similar-sized IR contraction as that found in ogous recombination between small, dispersed repeated tribe Apieae (Plunkett and Downie 2000), but whether this sequences, adjacency to tRNA genes, or double-stranded contraction serves as a synapomorphy uniting these three breakage followed by recombination between poly(A) tracts major clades of taxa remains to be seen. In the absence of (Palmer 1985a; Goulding et al. 1996; Plunkett and Downie sequence data flanking their IR junctions, ambiguity remains < 2000; Wang et al. 2008). Small IR expansions ( 100 bp), such in assessing the precise location of JLB. as those found in many Nicotiana species, are likely caused by To date, all major IR expansions and contractions uncov- gene conversion (Goulding et al. 1996; Wang et al. 2008). ered in the Apiales are restricted to the “apioid superclade,” Comparisons of DNA sequences flanking IR-single copy a large, distally branching group within Apiaceae subfamily regions in Apiales reveal that there are no dispersed repeats Apioideae whose higher-level relationships are unclear at or near these junctions, although this doesn’t preclude (Plunkett and Downie 2000). Resolution of relationships such repetitive elements from instigating IR boundary among its constituent 14 tribes and major clades (Downie changes in other apioid taxa. However, the absence of any et al. 2010), as well as the formal recognition of these major significant sequence motifs or repetitive elements at or near clades, must await supporting data, such as that provided by IR-single copy junctions in other angiosperms exhibiting IR the distribution of well-characterized, coincident IR junction boundary fluxes suggests that these types of sequences may locations. Such rearrangements are rare enough that they not be the typical cause of IR spreading (Palmer 1985b; have the potential to resolve with confidence a particular Goulding et al. 1996), although Wang et al. (2008) observed branching point in a phylogeny and define monophyletic that IR-LSC junctions are indeed found at either poly(A) groups (Downie and Palmer 1992). We are currently survey- tracts or in A-rich regions. In the “apioid superclade,” the ing for, characterizing, and circumscribing the distribution of presence of both large-scale expansions and contractions major IR junction shifts in the “apioid superclade” to more of the IR at JLB and large, novel insertions in the LSC region rigorously evaluate the extent of IR junction flux in the group adjacent to JLA may be coincidental, or the latter may be (R. Peery et al. unpubl. data). Even though these rearrange- mediating, in some way, these IR boundary changes, or ments alone are unlikely to provide a comprehensive frame- vice versa. work of relationships, when used in conjunction with other Expansion/Contraction of the IR as a Phylogenetic molecular and traditional approaches, they have the power Marker—Because of their infrequent occurrence, major struc- to illuminate phylogeny or to provide additional support for tural rearrangements of the plastid genome usually can pro- otherwise weakly-supported clades. Of course, there is also vide strong evidence of monophyly (Downie and Palmer 1992; the possibility that IR fluxes are completely random, as they Raubeson and Jansen 2005; but see Downie et al. 1991, for an appear to be among closely related Nicotiana species (Goulding exception). These rare, large-scale mutational changes provide et al. 1996), or successive expansion-contraction events have complementary markers that can be used alongside nucleotide occurred, as has been suggested in adzuki bean (Perry et al. substitutions in molecular systematic studies. Variation in 2002) and Pelargonium + hortorum (Chumley et al. 2006), overall size of the IR is common in angiosperms, but such a which would confound interpretation of phylogeny. character cannot readily be used in a phylogenetic analysis Noncoding Regions as a Source of Phylogenetic Information because length mutations can occur anywhere within the IR, in Apiaceae—At present, nrDNA ITS sequences comprise the making homologous size variants difficult to assess. Specific most comprehensive database for Apiaceae phylogenetic expansions and contractions of border positions of the IR, study (Downie et al. 2010). These data have often been sup- however, can be used to demarcate monophyletic groups, plemented with sequences from a variety of ptDNA introns and these markers have been used previously to delimit major and intergenic spacers, yet for those taxa where radiations clades in Ranunculaceae (Johansson and Jansen 1993; Hoot have been relatively recent it has been difficult to generate and Palmer 1994; Johansson 1998), Berberidaceae (Kim and sufficient phylogenetic signal because of the relatively slow Jansen 1994), Nicotiana (Goulding et al. 1996), Campanulales rate of evolution of the various plastid loci being compared. (Knox and Palmer 1999), Poaceae (Guisinger et al. 2010), and Within the “apioid superclade,” phylogenetic resolution is monocots (Wang et al. 2008). Minor changes in IR junction particularly poor within several groups of genera, such as positions (<100 bp) are more common (Goulding et al. 1996), the clade of perennial, endemic Apiaceae subfamily Apioideae but this increased frequency suggests that they would not of western North America (Sun et al. 2004; Sun and Downie make very reliable phylogenetic markers. 2010), Bunium (Degtjareva et al. 2009), Conium (Cordes 2009), In Apiaceae, the IR has both expanded and contracted Pimpinella (Magee et al. 2010), the Arracacia clade (Danderson significantly, with successive contractions identified within 2011), Heracleum (Yu et al. 2011), and Angelica (Liao et al. the “apioid superclade,” five of which are parsimony- 2013). Each of these groups contains unresolved terminal informative (Plunkett and Downie 2000). Members of Apiaceae clades of presumably closely related species, which would tribes Careae and Pyramidoptereae, both previously attribut- benefit considerably by the inclusion of additional phyloge- able to the “Aegopodium group,” share a large expansion of JLB netically informative characters. relative to the Nicotiana plastome and are monophyletic sister The development of genetic markers to infer evolutionary groups based on a nuclear ribosomal DNA internal tran- relationships among recently diverged taxa remains an impor- scribed spacer (nrDNA ITS) sequence phylogeny (Plunkett tant challenge in molecular systematic studies and, to this end, and Downie 2000; Banasiak et al. 2013). Members of Apiaceae considerable advances have been made in the identification 348 SYSTEMATIC BOTANY [Volume 40 and development of markers from the angiosperm plastid we are unaware of any studies of the family incorporating genome (Shaw et al. 2005, 2007). Despite these efforts, a plastid ndhA or atpF intron sequences. Group II introns of the plastid region identified as highly variable in one group may not genomes of land plants, which includes the aforementioned actually be phylogenetically informative in another. As an four introns, show a strong relationship between the func- example, the Crithmum trnH-psbA intergenic spacer region tional importance of its secondary structural features and the shows extensive sequence divergence when compared with likelihood of mutational change, with those domains and the other four Apiaceae species and could not be aligned subdomains essential for intron-associated functions most unambiguously with these sequences. This region is substan- conserved evolutionarily (Downie et al. 1996, 2000; Kelchner tially shorter in Crithmum ptDNA (127 bp) than that of the 2002). Intergenic spacers are less functionally constrained than other species (180–191 bp); it also contains novel DNA, many introns and therefore more likely to vary, thereby providing small, repeated sequences in direct orientation, and is AT-rich more variable and phylogenetically informative characters for (73%). This spacer region is highly variable in angiosperms phylogenetic analyses. The greater variability of intergenic (Aldrich et al. 1988; Shaw et al. 2007) and has been proposed spacer regions over introns in plastome comparisons has been for DNA barcoding (Kress et al. 2005), yet in our study this reported previously (Shaw et al. 2007). In Apiaceae, intron locus ranked 59th out of 74 in its number of variable alignment sequences do not provide enough informative characters to positions. Furthermore, in Heracleum and its allies (Apiaceae resolve relationships below generic levels (Downie et al. 1996, tribe Tordylieae), the locus is quite conserved, with one small 2000, 2008; Downie and Katz-Downie 1999). internal region characterized by an inversion and duplication In conclusion, a comparative analysis of seven complete (Logacheva et al. 2008). Repeats, large deletions, and small plastid genomes from the Apiales, five of which were inversions in trnH-psbA have been reported for other species sequenced during the course of this investigation, has pro- of Apiales, severely confounding the use of this locus in phy- vided a plethora of information on their structural organiza- logenetic analyses (Degtjareva et al. 2012). tion and sequence evolution. The features of five of these Practical information to emerge from our study is the iden- genomes are consistent structurally with typical plastomes of tification of divergent plastid regions for enhancing resolu- other non-monocot angiosperm species, while two, Crithmum tion of low-level phylogenetic relationships in Apiaceae. A and Petroselinum, each demonstrate a ~1.5 kb expansion or comparison of levels of DNA sequence divergence across five contraction of its IR. These two species also incorporate novel Apiaceae plastomes, representing species from four dispa- DNA adjacent to JLA, indicative of a mitochondrial to plastid rate lineages within the family (tribes Apieae, Oenantheae, transfer of DNA. A comparison of 74 loci, representing all Pyramidoptereae, and Scandiceae), revealed that intergenic noncoding regions >150 bp from throughout the single-copy spacer regions rpl32-trnL, trnE-trnT, ndhF-rpl32, 50rps16-trnQ, portions of five Apiaceae plastomes, demonstrated a wide and trnT-psbD are among the fastest-evolving loci, as mea- range of sequence divergence in different regions. Several sured by numbers of variable and parsimony-informative regions were identified that yielded greater numbers of vari- characters, and also by total indels, pairwise sequence diver- able and parsimony-informative characters than many of gence estimates, and percent variability. Aligned lengths of those regions heretofore commonly employed in Apiaceae each of these regions ranged from 850–1,545 positions, and molecular systematic studies, providing new information on these sizes are of sufficient length for ease of PCR-amplifica- the potential phylogenetic utility of plastid noncoding loci. tion and DNA sequencing. These five spacers appear to be Therefore, these highly variable loci should be more useful in good candidates for resolving recent divergences. Consider- future interspecific phylogenetic, intraspecific phylogeo- ation of two spacers simultaneously because of gene adja- graphic, and population-level genetic studies of these plants. cency and co-amplification as a single contiguous unit (i.e. The distribution of specific IR junction shifts, on the other ndhF-rpl32 and rpl32-trnL, and trnE-trnT and trnT-psbD) hand, has the potential to demarcate major clades within the offers even better regions for phylogenetic analysis. The family, particularly those within the “apioid superclade” trnD-trnY-trnE-trnT combined region is substantially longer whose higher-level relationships remain elusive. and presents the greatest number of parsimony-informative The frequency and large size of IR junction shifts in characters overall. Shaw et al. (2007) identified nine noncod- Apiaceae plastomes appear to be unusual in angiosperm ing regions as the best possible choices for low-level phylo- families whose plastid genomes are otherwise unrearranged, genetic studies of angiosperms. All nine of these regions are indicating that the group represents a model system in which within our top 13 most divergent loci. To these nine, we also to study the mechanisms leading to large-scale expansions include trnE-trnT, rpoB-trnC, petN(ycf6)-psbM, and the trnD- and contractions of the IR. Concomitant with these IR shifts trnY-trnE-trnT combined region; these regions, too, were also is the incorporation of novel DNA in the LSC region adjacent noted as highly variable or in the “Tier 1” category of Shaw to JLA, some of which represents transfer of DNA from the et al. (2005, 2007). Although we acknowledge that a plastid mitochondrial genome into the plastid. With other reports of region identified as highly variable in one group may not be a putative mitochondrial to plastid transfer region elsewhere as useful phylogenetically in another group, it is interesting in the plastomes of Daucus and Cuminum (Goremykin et al. that the Shaw et al. (2005, 2007) studies identified the same 2009; Iorizzo et al. 2012a) it will be well worthwhile to examine highly variable loci for several disparate angiosperm lineages additional Apiaceae plastomes for instances of intracellular as we did for the Apiaceae. The resolution of recent diver- transfer of DNA. To improve understanding of the mecha- gences in Apiaceae would benefit considerably by the inclu- nisms causing IR junction shifts and to further characterize sion of any or all of these highly variable loci. and circumscribe the putative mitochondrial DNA insert, the No intron was included in the top 13 most divergent non- sequences spanning the four IR-single-copy junctions are coding loci, although four (ndhA, rpl16, rps16 and atpF) made being compared in additional species of Apiaceae (R. Peery the top 25. Of these, the rpl16 and rps16 introns have been et al. unpubl. data). By combining these data and analyzing used extensively in Apiaceae phylogenetic studies, whereas them within the context of a ptDNA-derived phylogeny of the 2015] DOWNIE AND JANSEN: COMPARISON OF APIALES PLASTID GENOMES 349 group, it will be possible to determine when the first IR junc- Cordes,J.M.2009.A systematic study of poison hemlock (Conium, tion shift occurred and the extent of IR junction mobility dur- Apiaceae). M. S. thesis. Urbana, Illinois: University of Illinois at Urbana-Champaign. ing evolution of the group. Complete plastome sequences of Cosner, M. E., R. K. Jansen, J. D. Palmer, and S. R. Downie. 1997. The other members of the “apioid superclade” will further advance highly rearranged chloroplast genome of Trachelium caeruleum understanding of the evolutionary events in the plastome that (Campanulaceae): multiple inversions, inverted repeat expansion accompanied umbellifer diversification. and contraction, transposition, insertions/deletions, and several repeat families. Current Genetics 31: 419–429. Acknowledgments. The authors thank Clinton Morse of the Univer- Danderson, C. A. 2011. A phylogenetic study of the Arracacia clade sity of Connecticut Plant Growth Facilities for providing plant material, (Apiaceae). Ph. D. thesis. Urbana, Illinois: University of Illinois at Murray Henwood of the University of Sydney for identifying the Urbana-Champaign. Hydrocotyle species, J. Chris Blazier and Mary M. Guisinger for providing Degtjareva, G. V., E. V. Kljuykov, T. H. Samigullin, C. M. Valiejo-Roman, technical assistance, Rhiannon M. Peery for discussions, and two anony- and M. G. Pimenov. 2009. Molecular appraisal of Bunium and some mous reviewers for constructive comments. This work was initiated related arid and subarid geophilic Apiaceae-Apioideae taxa of the while SRD was on sabbatical leave in the laboratory of RKJ. This research ancient Mediterranean. Botanical Journal of the Linnean Society 160: was supported by a grant to RKJ from the National Science Foundation 149–170. (IOS-1027259). Degtjareva, G. V., M. D. Logacheva, T. H. Samigullin, E. I. Terentieva, and C. M. Valiejo-Roman. 2012. Organization of chloroplast psbA-trnH intergenic spacer in dicotyledonous angiosperms of the family Literature Cited Umbelliferae. Biochemistry (Moscow) 77: 1056–1064. Downie, S. R. and D. S. Katz-Downie. 1999. Phylogenetic analysis of Aii, J., Y. Kishima, T. Mikami, and T. Adachi. 1997. Expansion of the IR in chloroplast rps16 intron sequences reveals relationships within the the chloroplast genomes of buckwheat species is due to incorpora- woody southern African Apiaceae subfamily Apioideae. Canadian tion of an SSC sequence that could be mediated by an inversion. Journal of Botany 77: 1120–1135. Current Genetics 31: 276–279. Downie, S. R. and J. D. Palmer. 1992. Use of chloroplast DNA rearrange- Aldrich, J., B. W. Cherney, E. Merlin, and L. Christopherson. 1988. The ments in reconstructing plant phylogeny. Pp. 14–35 in Molecular role of insertions/deletions in the evolution of the intergenic region Systematics of Plants, eds. P. S. Soltis, D. E. Soltis, and J. J. Doyle. New between psbA and trnH in the chloroplast genome. Current Genetics York: Chapman and Hall. 14: 137–146. Downie, S. R., D. S. Katz-Downie, and K.-J. Cho. 1996. Phylogenetic anal- Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. ysis of Apiaceae subfamily Apioideae using nucleotide sequences Basic local alignment search tool. Journal of Molecular Biology 215: from the chloroplast rpoC1 intron. Molecular Phylogenetics and Evolu- 403–410. tion 6: 1–18. Bach, I. C., A. Olesen, and P. W. Simon. 2002. PCR-based markers to Downie, S. R., D. S. Katz-Downie, and M. F. Watson. 2000. A phylog- differentiate the mitochondrial genomes of petaloid and male fertile eny of the family Apiaceae based on chloroplast carrot (Daucus carota L.). Euphytica 127: 353–365. DNA rpl16 and rpoC1 intron sequences: towards a suprageneric Banasiak, Ł., M. Piwczynski, T. Ulinski, S. R. Downie, M. F. Watson, classification of subfamily Apioideae. American Journal of Botany B. Shakya, and K. Spalik. 2013. Dispersal patterns in space and time: 87: 273–292. a case study of Apiaceae subfamily Apioideae. Journal of Biogeogra- Downie, S. R., D. S. Katz-Downie, F.-J. Sun, and C.-S. Lee. 2008. Phylog- phy 40: 1324–1335. eny and biogeography of Apiaceae tribe Oenantheae inferred from 0 Bergthorsson, U., K. L. Adams, B. Thomason, and J. D. Palmer. 2003. nuclear rDNA ITS and cpDNA psbI-5 trnK(UUU) sequences, with empha- Widespread horizontal transfer of mitochondrial genes in flowering sisontheNorthAmericanendemicsclade.Botany 86: 1039–1064. plants. Nature 424: 197–201. Downie, S. R., G. M. Plunkett, M. F. Watson, K. Spalik, D. S. Katz-Downie, Bergthorsson,U.,A.O.Richardson,G.J.Young,L.R.Goertzen,and C. M. Valiejo-Roman, E. I. Terentieva, A. V. Troitsky, B.-Y. Lee, J. D. Palmer. 2004. Massive horizontal transfer of mitochondrial J. Lahham, and A. El-Oqlah. 2001. Tribes and clades within Apiaceae genes from diverse land plant donors to the basal angiosperm subfamily Apioideae: the contribution of molecular data. Edinburgh Amborella. Proceedings of the National Academy of Sciences USA 101: Journal of Botany 58: 301–330. 17747–17752. Downie, S. R., R. G. Olmstead, G. Zurawski, D. E. Soltis, P. S. Soltis, J. C. Bock, R. 2010. The give-and-take of DNA: horizontal gene transfer in Watson, and J. D. Palmer. 1991. Six independent losses of the chloro- plants. Trends in Plant Science 15: 11–22. plast DNA rpl2 intron in dicotyledons: molecular and phylogenetic Bock, R. and J. N. Timmis. 2008. Reconstructing evolution: gene transfer implications. Evolution 45: 1245–1259. from plastids to the nucleus. BioEssays 30: 556–566. Downie, S. R., K. Spalik, D. S. Katz-Downie, and J.-P. Reduron. 2010. Boudreau, E. and M. Turmel. 1995. Gene rearrangements in Chlamydomonas Major clades within Apiaceae subfamily Apioideae as inferred by chloroplast DNAs are accounted for by inversions and by the phylogenetic analysis of nrDNA ITS sequences. Plant Diversity and expansion/contraction of the inverted repeat. Plant Molecular Biology Evolution 128: 111–136. 27: 351–364. Feist, M. A. E., S. R. Downie, A. R. Magee, and M. Liu. 2012. Revised Cai, Z., M. Guisinger, H. G. Kim, E. Ruck, J. C. Blazier, V. McMurtry, generic delimitations for Oxypolis and Ptilimnium (Apiaceae) based J. V. Kuehl, J. Boore, and R. K. Jansen. 2008. Extensive reorganization on leaf morphology, comparative fruit anatomy, and phylogenetic of the plastid genome of Trifolium subterraneum (Fabaceae) is associ- analysis of nuclear rDNA ITS and cpDNA trnQ-trnK intergeneric ated with numerous repeated sequences and novel DNA insertions. spacer sequence data. Taxon 61: 402–418. Journal of Molecular Evolution 67: 696–704. Goulding, S. E., R. G. Olmstead, C. W. Morden, and K. H. Wolfe. 1996. Cai, Z., C. Penaflor, J. V. Kuehl, J. Leebens-Mack, J. E. Carlson, C. W. Ebb and flow of the chloroplast inverted repeat. Molecular & General dePamphilis, J. L. Boore, and R. K. Jansen. 2006. Complete plastid Genetics 252: 195–206. genome sequences of Drimys, Liriodendron,andPiper: implications Goremykin, V. V., F. Salamini, R. Velasco, and R. Viola. 2009. Mitochon- for the phylogenetic relationships of magnoliids. BMC Evolutionary drial DNA of Vitis vinifera and the issue of rampant horizontal gene Biology 6: 77. transfer. Molecular Biology and Evolution 26: 99–110. Calvin˜o, C. I. and S. R. Downie. 2007. Circumscription and phylogeny of Guisinger, M. M., T. W. Chumley, J. V. Kuehl, J. L. Boore, and R. K. Apiaceae subfamily Saniculoideae based on chloroplast DNA Jansen. 2010. Implications of the plastid genome sequence of Typha sequences. Molecular Phylogenetics and Evolution 44: 175–191. (Typhaceae, Poales) for understanding genome evolution in Chevreux, B., T. Pfisterer, and S. Suhai. 2000. Automatic assembly Poaceae. Journal of Molecular Evolution 70: 149–166. and editing of genomic sequences. In: Suhai S. (ed), Genomics and Guisinger, M. M., J. V. Kuehl, J. L. Boore, and R. K. Jansen. 2011. Extreme proteomics—functional and computational aspects. New York: Kluwer reconfiguration of plastid genomes in the angiosperm family Academic/Plenum Publishers. Geraniaceae: rearrangements, repeats, and codon usage. Molecular Chumley, T. W., J. D. Palmer, J. P. Mower, H. M. Fourcade, P. J. Calie, Biology and Evolution 28: 583–600. J. L. Boore, and R. K. Jansen. 2006. The complete chloroplast genome Haberle, R. C., H. M. Fourcade, J. L. Boore, and R. K. Jansen. 2008. sequence of Pelargonium + hortorum: organization and evolution of Extensive rearrangements in the chloroplast genome of Trachelium the largest and most highly rearranged chloroplast genome of land caeruleum are associated with repeats and tRNA genes. Journal of plants. Molecular Biology and Evolution 23: 2175–2190. Molecular Evolution 66: 350–361. 350 SYSTEMATIC BOTANY [Volume 40

Hansen, D. R., S. G. Dastidar, Z. Cai, C. Penaflor, J. V. Kuehl, J. L. Boore, emphasis on East Asian species, inferred from nrDNA, cpDNA, and and R. K. Jansen. 2007. Phylogenetic and evolutionary implications morphological evidence. Systematic Botany 38: 266–281. of complete genome sequences of four early-diverging angio- Logacheva, M. D., C. M. Valiejo-Roman, and M. G. Pimenov. 2008. ITS sperms: Buxus (Buxaceae), Chloranthus (Chloranthaceae), Dioscorea phylogeny of West Asian Heracleum species and related taxa of (Dioscoreaceae), and Illicium (Schisandraceae). Molecular Phyloge- Umbelliferae—Tordylieae W. D. J. Koch, with notes on evolution netics and Evolution 45: 547–563. of their psbA-trnH sequences. Plant Systematics and Evolution 270: Hoot, S. B. and J. D. Palmer. 1994. Structural rearrangements, including 139–157. parallel inversions, within the chloroplast genome of Anemone and Lohse, M., O. Drechsel, and R. Bock. 2007. OrganellarGenomeDRAW related genera. Journal of Molecular Evolution 38: 274–281. (OGDRAW): a tool for the easy generation of high-quality custom Iorizzo, M., D. Grzebelus, D. Senalik, M. Szklarczyk, D. Spooner, and graphical maps of plastid and mitochondrial genomes. Current P. Simon. 2012a. Against the traffic: The first evidence for mitochon- Genetics 52: 267–274. drial DNA transfer into the plastid genome. Mobile Genetic Elements Lowe, T. M. and S. R. Eddy. 1997. tRNAscan-SE: a program for improved 2: 261–266. detection of transfer RNA genes in genomic sequence. Nucleic Acids Iorizzo, M., D. Senalik, M. Szklarczyk, D. Grzebelus, D. Spooner, and Research 25: 955–964. P. Simon. 2012b. De novo assembly of the carrot mitochondrial Magee, A. R., B.-E. van Wyk, P. M. Tilney, and S. R. Downie. 2010. genome using next generation sequencing of whole genomic DNA Phylogenetic position of African and Malagasy Pimpinella species provides first evidence of DNA transfer into an angiosperm plastid and related genera (Apiaceae, Pimpinelleae). Plant Systematics and genome. BMC Plant Biology 12: 61. Evolution 288: 201–211. Jansen, R. K., L. A. Raubeson, J. L. Boore, C. W. dePamphilis, T. W. Mardanov, A. V., N. V. Ravin, B. B. Kuznetsov, T. H. Samigullin, A. S. Chumley, R. C. Haberle, S. K. Wyman, A. J. Alverson, R. Peery, Antonov, T. V. Kolganova, and K. G. Skyabin. 2008. Complete S. J. Herman, H. M. Fourcade, J. V. Kuehl, J. R. McNeal, J. Leebens- sequence of the duckweed (Lemna minor) chloroplast genome: struc- Mack, and L. Cui. 2005. Methods for obtaining and analyzing whole tural organization and phylogenetic relationships to other angio- chloroplast genome sequences. Methods in Enzymology 395: 348–384. sperms. Journal of Molecular Evolution 66: 555–564. Jansen, R. K. and T. A. Ruhlman. 2012. Plastid genomes of seed plants. Michel, F., K. Umesono, and H. Ozeki. 1989. Comparative and functional Pp. 103–126 in Genomics of chloroplasts and mitochondria, advances anatomy of group II catalytic introns—a review. Gene 82: 5–30. in photosynthesis and respiration, eds. R. Bock and V. Knoop. Moore, M. J., A. Dhingra, P. S. Soltis, R. Shaw, W. G. Farmerie, K. M. Dordrecht: Springer. Folta, and D. E. Soltis. 2006. Rapid and accurate pyrosequencing of Jeanmougin, F., J. D. Thompson, M. Gouy, D. G. Higgins, and T. J. Gibson. angiosperm plastid genomes. BMC Plant Biology 6: 17. 1998. Multiple sequence alignment with ClustalX. Trends in Biochemi- Mower, J. P., S. Stefanovic, W. Hao, J. S. Gummow, K. Jain, D. Ahmed, cal Sciences 23: 403–405. and J. D. Palmer. 2010. Horizontal acquisition of multiple mitochon- Johansson, J. T. 1998. Chloroplast DNA restriction site mapping and the drial genes from a parasitic plant followed by gene conversion with phylogeny of Ranunculus (Ranunculaceae). Plant Systematics and Evo- host mitochondrial genes. BMC Biology 8: 150. lution 213: 1–19. Nicolas, A. N. and G. M. Plunkett. 2009. The demise of subfamily Johansson, J. T. and R. K. Jansen. 1993. Chloroplast DNA variation and Hydrocotyloideae (Apiaceae) and the re-alignment of its genera phylogeny of the Ranunculaceae. Plant Systematics and Evolution 187: across the entire order Apiales. Molecular Phylogenetics and Evolution 29–49. 53: 134–151. Kelchner, S. A. 2002. Group II introns as phylogenetic tools: structure, Palmer, J. D. 1983. Chloroplast DNA exists in two orientations. Nature function, and evolutionary constraints. American Journal of Botany 301: 92–93. 89: 1651–1669. Palmer, J. D. 1985a. Comparative organization of chloroplast genomes. Kim, Y.-D. and R. K. Jansen. 1994. Characterization and phylogenetic Annual Review of Genetics 19: 325–354. distribution of a chloroplast DNA rearrangement in the Berberidaceae. Palmer, J. D. 1985b. Evolution of chloroplast and mitochondrial DNA in Plant Systematics and Evolution 193: 107–114. plants and algae. Pp. 131–240 in Monographs in evolutionary biology: Kim, K.-J. and H.-L. Lee. 2004. Complete chloroplast genome sequences Molecular evolutionary genetics, Chapter 3, ed. R. J. MacIntyre. from Korean ginseng (Panax schinseng Nees) and comparative analy- New York: Plenum Publishing Corporation. sis of sequence evolution among 17 vascular plants. DNA Research Palmer, J. D. 1986. Isolation and structural analysis of chloroplast DNA. 11: 247–261. Methods in Enzymology 118: 167–186. Kleine, T., U. G. Maier, and D. Leister. 2009. DNA transfer from organ- Palmer, J. D. 1990. Contrasting modes and tempos of genome evolution in elles to the nucleus: the idiosyncratic genetics of endosymbiosis. land plant organelles. Trends in Genetics 6: 115–120. Annual Review of Plant Biology 60: 115–138. Palmer, J. D. 1991. Plastid chromosomes: structure and evolution. Knoop, V. 2004. The mitochondrial DNA of land plants: peculiarities in Pp. 5–53 in Cell culture and somatic cell genetics of plants,Vol.7A,eds. phylogenetic perspective. Current Genetics 46: 123–139. L. Bogorad and I. K. Vasil. San Diego: Academic Press. Knox, E. B. and J. D. Palmer. 1999. The chloroplast genome arrangement Perry, A. S. and K. H. Wolfe. 2002. Nucleotide substitution rates in of Lobelia thuliniana (Lobeliaceae): expansion of the inverted repeat legume chloroplast DNA depend on the presence of the inverted in an ancestor of the Campanulales. Plant Systematics and Evolution repeat. Journal of Molecular Evolution 55: 501–508. 214: 49–64. Perry, A. S., S. Brennan, D. J. Murphy, T. A. Kavanagh, and K. H. Wolfe. Kress, W. J., K. J. Wurdack, E. A. Zimmer, L. A. Weigt, and D. H. Janzen. 2002. Evolutionary re-organization of a large operon in adzuki bean 2005. Use of DNA barcodes to identify flowering plants. Proceedings chloroplast DNA caused by inverted repeat movement. DNA of the National Academy of Sciences USA 102: 8369–8374. Research 9: 157–162. Ku, C., W.-C. Chung, L.-L. Chen, and C.-H. Kuo. 2013. The complete Peery, R., J. V. Kuehl, J. L. Boore, and L. Raubeson. 2006. Comparisons of plastid genome sequence of Madagascar periwinkle Catharanthus three Apiaceae chloroplast genomes—, dill and fennel. roseus (L.) G. Don: plastid genome evolution, molecular marker iden- Abstract, Botany 2006 Meeting. tification, and phylogenetic implications in . PLoS ONE Peery, R., S. R. Downie, J. V. Kuehl, J. L. Boore, and L. Raubeson. 2007. 8: e68518. Chloroplast genome evolution in Apiaceae. Abstract, Botany 2007 Kubo, T. and T. Mikami. 2007. Organization and variation of angiosperm Meeting. mitochondrial genome. Physiologia Plantarum 129: 6–13. Peery, R., S. R. Downie, R. K. Jansen, and L. A. Raubeson. 2011. Apiaceae Kurtz, S. and C. Schleiermacher. 1999. REPuter: fast computation of max- organellar genomes. XVIII International Botanical Congress Abstract imal repeats in complete genomes. Bioinformatics 15: 426–427. Book, p. 481. Lee, B.-Y. and S. R. Downie. 2000. Phylogenetic analysis of cpDNA Plunkett, G. M., D. E. Soltis, and P. S. Soltis. 1996a. Higher level relation- restriction sites and rps16 intron sequences reveals relationships ships of Apiales (Apiaceae and Araliaceae) based on phylogenetic among Apiaceae tribes Caucalideae, Scandiceae and related taxa. analysis of rbcL sequences. American Journal of Botany 83: 499–515. Plant Systematics and Evolution 221: 35–60. Plunkett, G. M., D. E. Soltis, and P. S. Soltis. 1996b. Evolutionary patterns Lee, B.-Y., G. A. Levin, and S. R. Downie. 2001. Relationships within the in Apiaceae: inferences based on matK sequence data. Systematic spiny-fruited umbellifers (Scandiceae subtribes Daucinae and Botany 21: 477–495. Torilidinae) as assessed by phylogenetic analysis of morphological Plunkett, G. M. and S. R. Downie. 1999. Major lineages within Apiaceae characters. Systematic Botany 26: 622–642. subfamily Apioideae: a comparison of chloroplast restriction Liao, C., S. R. Downie, Q. Li, Y. Yu, X. He, and B. Zhou. 2013. New site and DNA sequence data. American Journal of Botany 86: insights into the phylogeny of Angelica and its allies (Apiaceae) with 1014–1026. 2015] DOWNIE AND JANSEN: COMPARISON OF APIALES PLASTID GENOMES 351

Plunkett, G. M. and S. R. Downie. 2000. Expansion and contraction of the Straub, S. C. K., R. C. Cronn, C. Edwards, M. Fishbein, and A. Liston. chloroplast inverted repeat in Apiaceae subfamily Apioideae. Sys- 2013. Horizontal transfer of DNA from the mitochondrial to the tematic Botany 25: 648–667. plastid genome and its subsequent evolution in milkweeds Raubeson, L. A. and R. K. Jansen. 2005. Chloroplast genomes of plants. (Apocynaceae). Genome Biology and Evolution 5: 1872–1885. Pp. 45–68 in Plant diversity and evolution: Genotypic and phenotypic Sun, F.-J., S. R. Downie, and R. L. Hartman. 2004. An ITS-based phyloge- variation in higher plants, ed. R. J. Henry. London: CAB International. netic analysis of the perennial, endemic Apiaceae subfamily Raubeson, L. A., R. Peery, T. W. Chumley, C. Dziubek, H. M. Fourcade, Apioideae of western North America. Systematic Botany 29: 419–431. J. L. Boore, and R. K. Jansen. 2007. Comparative chloroplast geno- Sun, F.-J. and S. R. Downie. 2010. Phylogenetic relationships among the mics: analyses including new sequences from the angiosperms perennial, endemic Apiaceae subfamily Apioideae of western North Nuphar advena and Ranunculus macranthus. BMC Genomics 8: 174. America: Additional data from the cpDNA trnF-trnL-trnT region Rice, D. W. and J. D. Palmer. 2006. An exceptional horizontal gene trans- continue to support a highly polyphyletic Cymopterus. Plant Diversity fer in plastids: gene replacement by a distant bacterial paralog and and Evolution 128: 151–172. evidence that haptophyte and cryptophyte plastids are sisters. BMC Swofford, D. L. 2002. PAUP*: Phylogenetic analysis using parsimony Biology 4: 31. (* and other methods), v. 4.0 beta 10. Sunderland: Sinauer Associates. Richardson, A. O. and J. D. Palmer. 2007. Horizontal gene transfer in Timme, R. E., J. V. Kuehl, J. L. Boore, and R. K. Jansen. 2007. A compara- plants. Journal of Experimental Botany 58: 1–9. tive analysis of the Lactuca and Helianthus (Asteraceae) plastid Ruhlman, T. A. and R. K. Jansen. 2014. The plastid genomes of flowering genomes: identification of divergent regions and categorization of plants. Pp. 3–38 in Chloroplast Biotechnology: Methods and Protocols, shared repeats. American Journal of Botany 94: 302–312. ed. P. Maliga. New York: Springer. Wang, R. J., C. L. Cheng, C. C. Chang, C. L. Wu, T. M. Su, and S. M. Chaw. Ruhlman, T., S.-B. Lee, R. K. Jansen, J. B. Hostetler, L. J. Tallon, C. D. 2008. Dynamics and evolution of the inverted repeat-large single Town, and H. Daniell. 2006. Complete plastid genome sequence of copy junctions in the chloroplast genomes of monocots. BMC Evolu- Daucus carota: implications for biotechnology and phylogeny of tionary Biology 8: 36. angiosperms. BMC Genomics 7: 222. Wang, W. and J. Messing. 2011. High-throughput sequencing of three Schattner, P., A. N. Brooks, and T. M. Lowe. 2005. The tRNAscan-SE, Lemnoideae (duckweeds) chloroplast genomes from total DNA. snoscan and snoGPS web servers for the detection of tRNAs and PLoS ONE 6: e24670. snoRNAs. Nucleic Acids Research 33: W686–W689. Weng, M.-L., J. C. Blazier, M. Govindu, and R. K. Jansen. 2014. Recon- Schwartz, S., L. Elnitski, M. Li, M. Weirauch, C. Riemer, A. Smit, NISC struction of the ancestral plastid genome in Geraniaceae reveals a Comparative Sequencing Program, E. D. Green, R. C. Hardison, and correlation between genome rearrangements, repeats and nucleotide W. Miller. 2003. MultiPipMaker and supporting tools: alignments substitution rates. Molecular Biology and Evolution, doi: 10.1093/ and analysis of multiple genomic DNA sequences. Nucleic Acids molbev/mst257. Research 31: 3518–3524. Wicke, S., G. M. Schneeweiss, C. W. dePamphilis, K. F. Mu¨ ller, and D. Shaw, J., E. B. Lickey, J. T. Beck, S. B. Farmer, W. Liu, J. Miller, K. C. Quandt. 2011. The evolution of the plastid chromosome in land Siripun, C. T. Winder, E. E. Schilling, and R. L. Small. 2005. The plants: gene content, gene order, gene function. Plant Molecular Biol- tortoise and the hare II: relative utility of 21 noncoding chloroplast ogy 76: 273–297. DNA sequences for phylogenetic analysis. American Journal of Botany Wolfe, K. H., W.-H. Li, and P. M. Sharp. 1987. Rates of nucleotide substi- 92: 142–166. tution vary greatly among plant mitochondrial, chloroplast, and Shaw, J., E. B. Lickey, E. E. Schilling, and R. L. Small. 2007. Comparison of nuclear DNAs. Proceedings of the National Academy of Sciences USA whole chloroplast genome sequences to choose noncoding regions 84: 9054–9058. for phylogenetic studies in angiosperms: the tortoise and the hare III. Wyman,S.K.,R.K.Jansen,andJ.L.Boore.2004.Automaticannota- American Journal of Botany 94: 275–288. tion of organellar genomes with DOGMA. Bioinformatics 20: Sheveleva, E. V. and R. B. Hallick. 2004. Recent horizontal intron transfer 3252–3255. to a chloroplast genome. Nucleic Acids Research 32: 803–810. Yi, D.-K., H.-L. Lee, B.-Y. Sun, M.-Y. Chung, and K.-J. Kim. 2012. The Shinozaki, K., M. Ohme, M. Tanaka, T. Wakasugi, N. Hayshida, T. complete chloroplast DNA sequence of Eleutherococcus senticosus Matsubayashi, N. Zaita, J. Chunwongse, J. Obokata, K. Yamaguchi- (Araliaceae); Comparative evolutionary analyses with other three Shinozaki, C. Ohto, K. Torazawa, B. Y. Meng, M. Sugita, H. Deno, asterids. Molecules and Cells 33: 497–508. T. Kamogashira, K. Yamada, J. Kusuda, F. Takaiwa, A. Kato, Yu, Y., S. R. Downie, X. He, X. Deng, and L. Yan. 2011. Phylogeny and N. Tohdoh, H. Shimada, and M. Sugiura. 1986. The complete nucle- biogeography of Chinese Heracleum (Apiaceae tribe Tordylieae) with otide sequence of the tobacco chloroplast genome: its gene organiza- comments on their fruit morphology. Plant Systematics and Evolution tion and expression. The EMBO Journal 5: 2043–2049. 296: 179–203. Smith, D. R. 2011. Extending the limited transfer window hypothesis to Yukawa, M., T. Tsudzuki, and M. Sugiura. 2005. The 2005 version of the inter-organelle DNA migration. Genome Biology and Evolution 3: chloroplast DNA sequence from tobacco (Nicotiana tabacum). Plant 743–748. Molecular Biology Reporter 23: 359–365.