
RESEARCH ARTICLE
https://doi.org/10.12972/kjhst.20170081

Characterization of Two Complete Chloroplast Genomes in the Tribe Gnaphalieae (Asteraceae): Gene Loss or Pseudogenization of trnT-GGU and Implications for Phylogenetic Relationships

Dong-Hyuk Lee1, Won-Bum Cho2, Byoung-Hee Choi3, and Jung-Hyun Lee2*

1Baekdudaegan Biodiversity Conservation Division, Baekdudaegan National Arboretum, Bonghwa-gun 36209, Korea
2Department of Biology Education, Chonnam National University, Gwangju 61186, Korea
3Department of Biological Sciences, Inha University, Incheon 22212, Korea

*Corresponding author: [email protected]

Abstract

The tribe Gnaphalieae is one of the largest and most complex in the Asteraceae family. The taxonomic delimitations of the tribe Gnaphalieae have been inferred by researchers; however, the exact relationships between it and other tribes remain in dispute, likely because of complicated evolutionary histories. To improve our understanding of this tribe, we investigated the complete chloroplast (cp) genomes of two Asian Gnaphalieae taxa, namely Anaphalis sinica and Leontopodium leiolepis, and used this information to infer their phylogenetic relationships and evolution. The cp genomes of A. sinica and L. leiolepis were found to be 152,718 bp and 151,072 bp long, respectively, with gene contents and orders that resembled those of other Asteraceae species. However, we determined that the trnT-GGU genes within the two Gnaphalieae cp genomes were either pseudogenized or lost. Because these gene features have also been identified in other Gnaphalieae genera, but not among other Asteraceae species, they reinforce the taxonomic boundary of the tribe Gnaphalieae as an independent group. Moreover, based on the trnT-GGU gene features in members of the HAP clade, we speculate that the parental lineages of the genera Anaphalis and Pseudognaphalium have an allopolyploid origin. Thus, mutations in the trnT-GGU gene may be used in future studies as indicators of generic and/or tribal relationships. Using 52 protein-coding genes, we performed a phylogenetic analysis which further indicated that, alongside the tribes Anthemideae and Astereae, the tribe Gnaphalieae is a monophyletic taxon. We also identified 63 and 81 SSRs in L. leiolepis and A. sinica cp genomic sequences, respectively. The results described here will be useful for inferring the evolutionary history of the tribe Gnaphalieae, and specific genomic information can be applied in conservation strategies for endangered edelweiss (Leontopodium R. Br. ex Cass.) and pearly everlasting (Anaphalis DC.) species.

Additional key words: Anaphalis sinica, evolution, HAP clade, Leontopodium leiolepis, plastid

Received: June 12, 2017
Revised: July 19, 2017
Accepted: July 24, 2017

OPEN ACCESS

HORTICULTURAL SCIENCE and TECHNOLOGY 35(6):769-783, 2017
URL: http://www.kjhst.org
pISSN: 1226-8763
eISSN: 2465-8588

This is an Open-Access article distributed under the terms of the Creative Commons Attribution NonCommercial License which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Copyrightⓒ2017 Korean Society for Horticultural Science.

This work was supported by grants from the Scientific Research fund (KNA1-2-13, 14-2) of the Korea National Arboretum. The authors thank their colleagues I.S. Choi and D.P. Jin at the Systematics Laboratory of Inha University; and Prof. J.H. Kim, Dr. H.T. Kim, and Dr. J.S. Kim in the Department of Life Science, Gachon University, for help with our experiments and other research activities. We also thank Dr. D.K. Yi of the Korea National Arboretum for assistance with the statistical analysis and for meaningful discussions.

Introduction

The tribe Gnaphalieae (Cass.) Lecoq & Juillet (i.e., everlasting or paper daisies) comprises 185 genera


and 1,240 species and is part of the subfamily Asteroideae in Asteraceae (Ward et al., 2009). In particular, species within the genus Leontopodium are commonly known as ‘edelweiss’. These ornamentals are grown in gardens and parks (Harrison et al., 2010) and serve as national symbols for various alpine countries, such as Switzerland and Austria. Wild edelweiss plants are protected in several European and Asian countries, and hence they are now being cultivated to conserve their genetic resources (Zheljazkov and Craker, 2016). Therefore, clear taxonomic delimitations are essential for effective breeding and cultivation of these species. The tribe Gnaphalieae, one of the largest and most complex in Asteraceae, was originally regarded as an element of the tribe Inuleae (Merxmüller et al., 1977), but is now considered independent and a sister clade to Anthemideae and Astereae (Panero and Funk, 2008). This reclassification was robustly supported by morphological (Anderberg, 1989; Karis, 1993) and molecular data (Kim and Jansen, 1995; Bayer and Starr, 1998; Wagstaff and Breitwieser, 2002). Despite numerous analyses of the tribe Gnaphalieae, intergeneric relationships remain unclear and widely disputed, which is attributed to factors such as morphological similarity and a complex evolutionary history (e.g., recent rapid radiation and allopolyploidy). Indeed, several elements, such as Helichrysum Mill., Pseudognaphalium Kirp., and Philyrophyllum O. Hoffmann & Prantl, have been transferred or split into other genera and/or new tribes (Bentley et al., 2015; Nie et al., 2015). Phylogenomics is an effective means of addressing complex phylogeny (Bock et al., 2014). Nuclear DNA (nrDNA) has been used extensively to study phylogeny in the tribe Gnaphalieae, whereas chloroplast DNA (cpDNA) analyses based on the characterization of a few loci have been less useful due to their low resolution, especially in the crown radiation group that comprises most of the tribe Gnaphalieae species.
However, recent advances in next-generation sequencing technologies have allowed researchers to obtain numerous complete chloroplast (cp) genome sequences. This rich source of sequence data describing cp genomic structure and gene contents has provided a novel means of studying phylogenetic relationships as well as determining the functional evolution of cp genomes in land plants. Such data have been especially applicable for groups such as Asteraceae that have complex evolutionary histories. For example, the earliest chloroplast-mapping studies revealed two large inversions (22.8 kb and 3.3 kb in the large single copy, or LSC, region) that demonstrated an ancient evolutionary split within Asteraceae (Kim et al., 2005). Several recent comparative analyses have also detailed some structural changes in Asteraceae (e.g., inversion of the small single copy, or SSC, region; Liu et al., 2013; Choi and Park, 2015). Furthermore, such changes in cp genomes appear to be related to abiotic and/or biotic factors, including life cycle (e.g., parasite vs. non-parasite; Wicke et al., 2013) and habitat (e.g., alpine vs. lowland; Hu et al., 2015). Studies of Asteraceae have served as an excellent opportunity, on a global scale, to understand adaptations that have occurred in the recent radiation of a plant group (Panero and Funk, 2008). Chloroplast and nuclear DNA analyses indicate that several clades exist within the tribe Gnaphalieae, namely the South African basal clade (e.g., Relhania L'Hér. and Cass.) and its derived lineages, including Australian clades (Ewartia Beauverd and Euchiton Cass.), the HAP clade (i.e., Helichrysum, Anaphalis DC., and Pseudognaphalium), the FLAG clade (Filago L., Leontopodium R. Br. ex Cass., Antennaria Gaertn., and Gamochaeta Wedd., as well as several small genera), and Gnaphalium L.
These clades have consistently been grouped as individual monophyletic lineages even within the crown radiation group, which exhibits very uneven rates of nucleotide substitution and a recent and rapid evolutionary radiation event. Some correlations have been found for the phylogeny and biogeographic patterns of these clades (e.g., taxa of the Australian clade or the distinct distribution range of Helichrysum species). This suggests that more comprehensive knowledge of such correlations is essential if we are to address complex intergeneric relationships. Asian taxa of the tribe Gnaphalieae, such as Anaphalis and Leontopodium, are regarded as indispensable elements within a


transitional lineage extending from the South African basal clade to its derived clades on other continents, even though the tribe has few species (Ward et al., 2009). Because of their position between other clades in a transitional lineage, more detailed information about Anaphalis and Leontopodium is needed for a greater understanding of the phylogeny and evolutionary history of the tribe Gnaphalieae (Ward et al., 2009; Nie et al., 2013). These two genera belong to a distinct lineage, and the HAP and FLAG clades occupy nearly half of the crown radiation group. However, previous studies have primarily focused on clades in more diverse regions, e.g., Africa and Australia, or on specific genera, such as Helichrysum and Pseudognaphalium. Thus, more comprehensive phylogenetic information is needed with regard to various regions, habitats, and taxonomic positions. In this study, we investigated the fully sequenced cp genomes of two representative Anaphalis and Leontopodium species, A. sinica Hance and L. leiolepis Nakai. We expect that a comparative analysis of cp genomes from these genera will provide new insight concerning the complex evolutionary histories that extant members of the tribe Gnaphalieae have experienced during their recent, rapid radiation from southern Africa to various new habitats. We believe that this work represents the first comparison between the cp genomes of Asian taxa and those of other Asteraceae and Gnaphalieae species. Our results will be useful for inferring the inter- and intra-evolutionary histories of the tribe Gnaphalieae, and the genomic information obtained here can be used for efficient propagation of edelweiss species.

Materials and Methods

Plant Materials, Genome Amplification, and Sequencing

Fresh, young leaf material of Anaphalis sinica and Leontopodium leiolepis was collected from natural populations at Mt. Geumsung and Mt. Seolak in South Korea, and voucher specimens were deposited in the herbarium of Inha University (IUI: Geum014 and 96004, respectively). Total genomic DNA was extracted using a DNeasy Plant Mini Kit (Qiagen, Seoul, Korea), and was pre-washed with STE buffer to remove any inhibitory chemicals (Takakura and Nishio, 2012). The concentration of each extract was measured with a NanoDrop Spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA) before high-quality DNA was sequenced using the Illumina MiSeq platform (BML, Daejeon, Korea).

Genome Assembly, Annotation, and Analysis

Prior to genome assembly, all raw sequence reads of the two species produced by Illumina were checked by FastQC (Andrews, 2010) and trimmed by Trimmomatic 0.32 (Bolger et al., 2014). These trimmed reads were assembled by a de novo method using Geneious R7 (Drummond et al., 2011). In total, 10,582,958 and 20,246,780 paired-end reads with average read lengths of 282.1 and 283.6 bp were generated for A. sinica and L. leiolepis, respectively. Each cp genome sequence was mapped with reference to that of Artemisia frigida (JX293720), using Geneious R7. To avoid assembly errors from homopolymer runs, and to identify the gaps and borders of the LSC, SSC, and Inverted Repeat (IR) regions, we designed specific primers based on contig sequences and verified those sequences through PCR and Sanger sequencing methods. Each genome was annotated using the online program DOGMA (Wyman et al., 2004), and corrections were made manually for start and stop codons and for intron/exon boundaries. Both tRNA and rRNA genes were identified by BLASTN with the same database and by tRNAscan-SE (Schattner et al., 2005), with default settings. A circular map of each cp genome was produced by Organellar Genome DRAW


(Lohse et al., 2007), followed by manual modifications. The repeating sequences of cpDNA were identified by the SSR Extractor (http://www.aridolan.com/ssr/ssr.aspx). To compare their synonymous (Ks) and non-synonymous (Ka) nucleotide substitutions, we aligned the functional protein-coding genes of these two Asian taxa using Geneious R7 and analyzed them with DnaSP (Librado and Rozas, 2009). We also used specific primer pairs to verify the pseudogenization (or gene loss) of trnT-GGU with related plant materials, including a representative of the genus Gnaphalium: G. uliginosum; the Australian clade: Euchiton japonicum; members of the FLAG clade: Gamochaeta pensylvanica, L. japonicum, and L. leontopodioides; and members of the HAP clade: A. sinica var. morii, A. margaritacea, Helichrysum cymosum, H. orientale, and Pseudognaphalium affine.
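As an illustration of how perfect simple sequence repeats (SSRs) can be located in a sequence, the following Python sketch scans for tandem repeats with a regular expression. It is not the SSR Extractor tool used here, and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for the example, because the settings applied in this study are not stated. The toy sequence is hypothetical.

```python
import re

# Minimum repeat counts per motif length. These thresholds are assumptions
# for illustration; the settings used in the study are not stated.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA");
            # those runs are reported by the pass for the shorter unit.
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Hypothetical toy sequence containing A(12), (AT)6, and (TTC)4.
toy = "GGC" + "A" * 12 + "CGG" + "AT" * 6 + "GCG" + "TTC" * 4 + "G"
ssrs = find_ssrs(toy)
for start, motif, count in ssrs:
    print(f"position {start}: ({motif}){count}")
```

Start positions are 0-based. Counts obtained this way depend entirely on the chosen thresholds, so they are not expected to reproduce the SSR numbers reported in this article.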

Phylogenetic Analysis

Our phylogenetic analysis covered 13 complete cp DNA sequences that included 11 Asteraceae species and two outgroup species of Apiaceae and Araliaceae that were downloaded from the NCBI database (http://www.ncbi.nlm.nih.gov/genomes). From those species, 52 protein-encoding genes were aligned using MUSCLE (Edgar, 2004), which is part of the Geneious program. We then performed a Maximum Likelihood (ML) analysis that utilized RAxML with the GTR + G model (Stamatakis, 2006; Stamatakis et al., 2008).
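Multi-gene ML analyses are commonly run on a concatenated alignment (supermatrix). The text does not state explicitly how the 52 gene alignments were combined, so the following Python sketch is only an illustration under that assumption. The function name, the taxon labels, and the toy sequences are hypothetical.

```python
def concatenate_alignments(gene_alignments, taxa):
    """Join per-gene alignments into one supermatrix row per taxon.

    gene_alignments maps a gene name to {taxon: aligned sequence}.
    A taxon missing from a gene is padded with '-' so that all rows
    keep the same length.
    """
    rows = {taxon: [] for taxon in taxa}
    partitions = []  # (gene, start, end) in 1-based inclusive coordinates
    offset = 0
    for gene, alignment in gene_alignments.items():
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        width = lengths.pop()
        for taxon in taxa:
            rows[taxon].append(alignment.get(taxon, "-" * width))
        partitions.append((gene, offset + 1, offset + width))
        offset += width
    return {taxon: "".join(parts) for taxon, parts in rows.items()}, partitions


# Hypothetical toy data; real alignments would come from MUSCLE output.
toy_alignments = {
    "rbcL": {"A_sinica": "ATGTCA", "L_leiolepis": "ATGTCG", "outgroup": "ATGACA"},
    "matK": {"A_sinica": "ATG-TT", "L_leiolepis": "ATGCTT"},
}
taxa = ["A_sinica", "L_leiolepis", "outgroup"]
matrix, partitions = concatenate_alignments(toy_alignments, taxa)
for taxon in taxa:
    print(taxon, matrix[taxon])
```

The returned partition coordinates can be written to a partition file if per-gene models are wanted; a single GTR + G model over the whole matrix, as described above, does not need them.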

Results and Discussion

Characteristics of the Leontopodium leiolepis and Anaphalis sinica cp Genomes, and Comparison with Other Asteraceae cp Genomes

The average paired-end read lengths were 301 bp for both L. leiolepis and A. sinica. These reads were mapped to the reference cp genome sequence of Artemisia frigida, which belongs to the closely related tribe Anthemideae. For L. leiolepis, 662,243 reads were assembled (mean coverage of 1,041.5). After the low-assembly coverage of gap regions was confirmed, we obtained 151,072 bp for the complete cp genome sequence of L. leiolepis (GenBank Accession Number NC_027835). Its quadripartite structure includes an LSC region of 83,308 bp, an SSC region of 18,052 bp, and IR regions of 24,856 bp each (Fig. 1). Approximately 59.4% of the L. leiolepis cp genomic sequence (average GC content of 37.3%) consists of protein-coding genes, with the remaining 40.6% being non-coding regions such as introns and intergenic spacers. Of the 132 total genes (Table 1), 85 are protein-coding, 37 are for transfer RNA, eight are for ribosomal RNA, and two are partial genes (ycf1 and rps19) located on the boundaries of the IR regions. The IR regions contain seven protein-coding genes, seven tRNAs, and four rRNAs that are duplicated. The LSC region contains 62 protein-coding genes and 22 tRNA genes, whereas the SSC region has 11 protein-coding genes and one tRNA. For A. sinica, 139,015 reads were assembled (mean coverage of 210). After the low-assembly coverage of gap regions was confirmed, we obtained 152,718 bp for the complete cp genome sequence (GenBank Accession Number KX148081). Its quadripartite structure includes an LSC region of 84,546 bp, an SSC region of 18,488 bp, and IR regions of 24,842 bp each (Fig. 2). Approximately 59.7% of the A. sinica cp genomic sequence (average GC content of 37.1%) consists of protein-coding genes, with the remaining 40.3% being non-coding regions such as introns and intergenic spacers. Among the 131 genes (Table 1), 85 are protein-coding, 36 are for transfer RNA, eight are for ribosomal RNA, and two are partial genes (ycf1 and rps19)


Fig. 1. Map of the Leontopodium leiolepis chloroplast genome (151,072 bp), showing the LSC, SSC, IRA, and IRB regions. Legend categories: photosystem I; photosystem II; cytochrome b/f complex; ATP synthase; NADH dehydrogenase; RubisCO large subunit; RNA polymerase; ribosomal proteins (SSU); ribosomal proteins (LSU); clpP, matK; other genes; hypothetical chloroplast reading frames (ycf); transfer RNAs; ribosomal RNAs.

located on the boundaries of the IR regions, which is similar to the genomic contents of L. leiolepis. The IR regions contain seven protein-coding genes, seven tRNAs, and four rRNAs that are duplicated. The LSC region contains 62 protein-coding genes and 21 tRNA genes, whereas the SSC region has 11 protein-coding genes and one tRNA. Among the protein-coding genes of these two cp genomes, 11 have single introns and three (clpP, rps12, and ycf3) have two introns each. In the case of rps12, the 5′-end exon is located in the LSC region, whereas two copies of the 3′-end exon and intron are located in the IR regions. Similar observations have been reported for other Asteraceae cp genomes (Nie et al., 2012; Liu et al., 2013). Our comparisons of nucleotide substitution rates among coding genes revealed that ndhJ had the highest synonymous nucleotide substitution rate (Ks: 0.2777) whereas rpl16 had the highest non-synonymous nucleotide substitution rate (Ka: 0.0238) (Table 2). In contrast, 19 and 32 genes showed no changes in Ks and Ka, respectively. The cp genomes characterized here did not differ significantly from other completed Asteraceae cp genomes. As noted for other angiosperm lineages, the size variation, approximately 2 kb, was primarily attributed to differences in lengths of the LSC region, which ranges from 81.9 kb in Aster spathulifolius to 84.3 kb in Parthenium argentatum (Table 3 and Fig. 3). Other characteristics were similar between these two cp genomes of the tribe Gnaphalieae and those from other members of Asteraceae, including the presence of two pseudogenes, ycf1 and rps19, at the IR boundaries. In fact, previously described cp


Table 1. Genes identified in the chloroplast genomes of Leontopodium leiolepis and Anaphalis sinica

Group of gene (No. in L. leiolepis / A. sinica): Name of gene

RNA genes
  Ribosomal RNAs (8 / 8): rrn4.5**, rrn5**, rrn16**, rrn23**
  Transfer RNAs (37 / 36): trnA-UGC**, trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnfM-CAU, trnG-GCC, trnG-UCC, trnH-GUG, trnI-CAU**, trnI-GAU**, trnK-UUU, trnL-CAA**, trnL-UAA, trnL-UAG, trnM-CAU, trnN-GUU**, trnP-UGG, trnQ-UUG, trnR-ACG**, trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU ѱ, trnT-UGU, trnV-GAC**, trnV-UAC, trnW-CCA, trnY-GUA

Protein genes
  Photosystem I (5 / 5): psaA, psaB, psaC, psaI, psaJ
  Photosystem II (15 / 15): psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ
  Cytochrome (6 / 6): petA, petB, petD, petG, petL, petN
  ATP synthase (6 / 6): atpA, atpB, atpE, atpF, atpH, atpI
  Rubisco (1 / 1): rbcL
  NADH dehydrogenase (12 / 12): ndhA, ndhB**, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
  ATP-dependent protease subunit P (1 / 1): clpP
  Chloroplast envelope membrane protein (1 / 1): cemA
  Ribosomal proteins, large units (11 / 11): rpl2**, rpl14, rpl16, rpl20, rpl22, rpl23**, rpl32, rpl33, rpl36
  Ribosomal proteins, small units (15 / 15): rps2, rps3, rps4, rps7**, rps8, rps11, rps12**, rps14, rps15, rps16, rps18, rps19**
  Transcription / Translation, RNA polymerase (4 / 4): rpoA, rpoB, rpoC1, rpoC2
  Transcription / Translation, initiation factor (1 / 1): infA
  Miscellaneous proteins (3 / 3): accD, ccsA, matK
  Hypothetical proteins & conserved reading frames (6 / 6): ycf1**, ycf2**, ycf3, ycf4

Total (132 / 131)

Notes: *, genes with introns; **, duplicated genes; ѱ, gene loss in A. sinica

genomes of Asteraceae have revealed nearly identical gene contents and structures (Timme et al., 2007; Kumar et al., 2009; Dempewolf et al., 2010; Doorduin et al., 2011; Nie et al., 2012; Zhang et al., 2014; Choi and Park, 2015). This is in contrast to the earliest cp-mapping studies, which revealed two large inversions, specifically 22.8 kb and 3.3 kb, in the LSC region (Jansen and Palmer, 1987a, b; Kim et al., 2005). Although several structural changes, e.g., inversion of the SSC region, have been described (Liu et al., 2013; Choi and Park, 2015), those events might simply represent an alternative state in individual plants rather than characteristics unique to a specific lineage. In comparison, the two previously reported large inversions described above demonstrate an ancient evolutionary split within Asteraceae (Jansen and Palmer, 1987a, b; Kim et al., 2005). Indeed, inversion events in the SSC regions may have easily been promoted by their loop structure located between the two IRs. We also found two alternative states for SSC regions that were assembled in exactly the same way except for their direction. Unfortunately, we could not explicitly verify the orientation of the SSC region because our two featured species had short reads (301 bp). However, our assumption is strengthened by a recent report of SSC inversions in Asteraceae (Walker et al., 2015). Therefore, we believe that the events previously described for the SSC region cannot be used to infer the functional evolution among cp genomes and phylogenetic relationships. Moreover, we propose that the cp genomes in the family Asteraceae are remarkably conserved when compared with those of other large plant groups, such as Fabaceae or Orchidaceae.
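The difficulty of resolving SSC orientation with short reads can be illustrated with a small Python sketch. It builds the two alternative states of a quadripartite genome from toy LSC, IR, and SSC segments and shows that, for windows much shorter than the IR, both states contain the same set of strand-independent k-mers. The segment sequences, function names, and the window size are hypothetical and serve only as an illustration; this is not the assembly procedure used in this study.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_isomers(lsc, ir, ssc):
    """Return the two SSC orientations of a genome laid out as
    LSC, IRb, SSC, IRa, where IRa is the reverse complement of IRb."""
    ira = reverse_complement(ir)
    forward = lsc + ir + ssc + ira
    flipped = lsc + ir + reverse_complement(ssc) + ira
    return forward, flipped


def canonical_kmers(seq, k):
    """Collect k-mers, treating a k-mer and its reverse complement as equal,
    because sequencing reads can originate from either strand."""
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


# Hypothetical toy segments (6 bp LSC, 7 bp IR, 5 bp SSC).
forward, flipped = build_isomers("ATGGCC", "GATTACA", "CCGTA")
same_content = canonical_kmers(forward, 5) == canonical_kmers(flipped, 5)
print(forward)
print(flipped)
print("identical 5-mer content:", same_content)
```

Because the IRb-SSC-IRa block of one state is the reverse complement of the same block in the other state, only reads or read pairs that span an entire IR copy can tell the two orientations apart.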


Fig. 2. Map of the Anaphalis sinica chloroplast genome (152,718 bp), showing the LSC, SSC, IRA, and IRB regions. Legend categories: photosystem I; photosystem II; cytochrome b/f complex; ATP synthase; NADH dehydrogenase; RubisCO large subunit; RNA polymerase; ribosomal proteins (SSU); ribosomal proteins (LSU); clpP, matK; other genes; hypothetical chloroplast reading frames (ycf); transfer RNAs; ribosomal RNAs.

Pseudogenization and Gene Loss of trnT-GGU in Gnaphalieae

The gene contents of the two Gnaphalieae cp genomes are similar to those of other Asteraceae species, with the exception of the gene trnT-GGU. Although trnT-GGU exhibits several base substitutions (1 to 2 bp), it is present and completely functional in members of all other tribes in Asteraceae. In comparison, the trnT-GGU gene in L. leiolepis has undoubtedly been pseudogenized, with two indels (6 bp and 8 bp) that can modify the structure of that tRNA. Moreover, we determined that complete trnT-GGU gene loss occurred in the cp genome of A. sinica (Fig. 4). In addition to large structural rearrangements, relatively small-scale genomic changes, such as gene loss and transformation to pseudogenes, can serve as useful phylogenetic markers when addressing the phylogeny and evolutionary history of land plants. Representative cases have been reported for parasitic plants, including Cuscuta of Convolvulaceae (McNeal et al., 2007), Rhizanthella of Orchidaceae (Delannoy et al., 2011), and genera within Orobanchaceae (Li et al., 2013). These parasites exhibit gene loss or pseudogenization of the majority of chloroplast genes involved in photosynthesis (e.g., atpB, ndhF) because their host plant–dependent life cycle no longer requires autotrophic growth. Although such drastic sequence variation is uncommon in plant cp genomes, events of this kind that reflect phylogeny are increasingly being reported as comparative chloroplast sequence analyses become more widely used (Lee et al., 2007; Schmickl et al., 2009; Do et al., 2014).
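Indels such as the 6 bp and 8 bp events described for L. leiolepis can be read directly from a pairwise alignment. The Python sketch below lists the gap runs in two aligned sequences. The function name is hypothetical and the sequences are invented toy strings, not the real trnT-GGU sequences, so the positions shown carry no biological meaning.

```python
def gap_runs(aligned_ref, aligned_query):
    """List (start, length, kind) for each indel in a pairwise alignment.

    kind is 'deletion' when the query carries the gap and 'insertion'
    when the reference carries it. Start positions are 0-based
    alignment columns.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    current = None  # [start, length, kind] of the run being extended
    for i, (r, q) in enumerate(zip(aligned_ref, aligned_query)):
        if q == "-" and r != "-":
            kind = "deletion"
        elif r == "-" and q != "-":
            kind = "insertion"
        else:
            kind = None
        if kind is not None and current is not None and current[2] == kind:
            current[1] += 1
        else:
            if current is not None:
                runs.append(tuple(current))
            current = [i, 1, kind] if kind else None
    if current is not None:
        runs.append(tuple(current))
    return runs


# Hypothetical toy alignment with a 6 bp and an 8 bp deletion in the query.
ref = "GCCCTTTTAACTCAGTGGTAGAGTAATC"
query = "GCCC------CTCAGTGG--------TC"
indels = gap_runs(ref, query)
for start, length, kind in indels:
    print(f"{kind} of {length} bp at alignment column {start}")
```

Whether a given indel actually disrupts the tRNA would still need to be checked against the predicted secondary structure, for example with tRNAscan-SE as used for annotation in this study.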


Table 2. Comparison of synonymous (Ks) and non-synonymous (Ka) nucleotide substitution rates in the chloroplast genomes of Leontopodium leiolepis and Anaphalis sinica

          Synonymous                 Non-synonymous
Gene      No. of mutations  Ks       No. of mutations  Ka

Small subunit of ribosome
rps2      5                 0.0311   0                 0.0000
rps3      3                 0.0203   0                 0.0000
rps4      3                 0.0203   0                 0.0000
rps7      1                 0.0085   0                 0.0000
rps8      1                 0.0099   0                 0.0000
rps11     0                 0.0000   1                 0.0033
rps12     2                 0.0209   0                 0.0000
rps14     1                 0.0141   0                 0.0000
rps15     1                 0.0155   1                 0.0047
rps16     3                 0.0131   15                0.0175
rps18     0                 0.0000   0                 0.0000
rps19     0                 0.0000   1                 0.0047

Large subunit of ribosome
rpl2      0                 0.0000   1                 0.0009
rpl14     0                 0.0000   2                 0.0073
rpl16     5.5               0.0206   25.5              0.0238
rpl20     3                 0.0331   3                 0.0105
rpl22     1                 0.0094   0                 0.0000
rpl23     0.5               0.0073   1.5               0.0071
rpl32     0                 0.0000   0                 0.0000
rpl33     0                 0.0000   1                 0.0066
rpl36     1                 0.0333   0                 0.0000

RNA polymerase subunit
rpoA      4                 0.0181   1                 0.0013
rpoB      20                0.0272   6                 0.0025
rpoC1     16                0.0256   16                0.0074
rpoC2     19.5              0.0205   24.5              0.0078

ATP synthase gene
atpA      7                 0.0186   5                 0.0044
atpB      7                 0.0194   7                 0.0063
atpE      2                 0.0208   2                 0.0066
atpF      5                 0.0197   6                 0.0060
atpH      0                 0.0000   0                 0.0000
atpI      5                 0.0274   3                 0.0054

NADH dehydrogenase
ndhA      10                0.0221   30                0.0184
ndhB      2                 0.0041   2                 0.0021
ndhC      1                 0.0116   0                 0.0000
ndhD      7                 0.0196   3                 0.0026
ndhE      2                 0.0275   0                 0.0000
ndhF      20.5              0.0413   17.5              0.0103
ndhG      1                 0.0076   3                 0.0076
ndhH      11                0.0421   2                 0.0022
ndhI      3                 0.0262   0                 0.0000
ndhJ      3                 0.2777   2                 0.0055
ndhK      3                 0.0182   2                 0.0039


Table 2. Continued

          Synonymous                 Non-synonymous
Gene      No. of mutations  Ks       No. of mutations  Ka

Cytochrome b/f complex
petA      9                 0.0405   2                 0.0028
petB      8                 0.0253   13                0.0123
petD      1                 0.0040   11                0.0120
petG      0                 0.0000   0                 0.0000
petL      0                 0.0000   1                 0.0151
petN      1                 0.0432   0                 0.0000

Photosystem I
psaA      11                0.0209   0                 0.0000
psaB      13                0.0262   2                 0.0012
psaC      2                 0.0346   0                 0.0000
psaI      0                 0.0000   0                 0.0000
psaJ      0                 0.0000   0                 0.0000

Photosystem II
psbA      7                 0.0281   1                 0.0012
psbB      9                 0.0251   2                 0.0017
psbC      11                0.0322   0                 0.0000
psbD      8                 0.0324   0                 0.0000
psbE      0                 0.0000   0                 0.0000
psbF      0                 0.0000   0                 0.0000
psbH      1                 0.0173   0                 0.0000
psbI      0                 0.0000   0                 0.0000
psbJ      1                 0.0277   0                 0.0000
psbK      0                 0.0000   0                 0.0000
psbL      0                 0.0000   1                 0.0113
psbM      1                 0.0375   0                 0.0000
psbN      0                 0.0000   0                 0.0000
psbT      1                 0.0394   1                 0.0133
psbZ      0                 0.0000   0                 0.0000

Large chain of rubisco
rbcL      4                 0.0115   5                 0.0045

Other genes
infA      3                 0.0544   0                 0.0000
clpP      8.17              0.0189   29.83             0.0196
ccsA      3                 0.0138   2                 0.0028
accD      6                 0.0192   5                 0.0042
cemA      3                 0.0201   5                 0.0093
matK      11                0.0327   14                0.0121

Unknown functions
ycf1      33                0.0323   65                0.0162
ycf2      4                 0.0027   10                0.0019
ycf3      10                0.0239   10                0.0067

Accordingly, we identified similar features in other genera of the tribe Gnaphalieae. For example, the trnT-GGU genes in the FLAG, Gnaphalium, and Australian clades have been pseudogenized and, with the exception of Pseudognaphalium affine, those in members of the HAP clade have been completely lost. However, because of our limited taxon sampling, we were unable to ascertain whether trnT-GGU pseudogenization or gene loss represents an autapomorphy of the entire tribe. Nevertheless, the nearly consistent trnT-GGU mutations clearly contrast with the gene contents of other tribes in Asteraceae.


Table 3. Comparison of sequence characteristics among nine chloroplast genomes of Asteraceae species

                       Anaphalis  Leontopodium  Aster           Artemisia  Lactuca  Helianthus  Jacobaea  Guizotia    Parthenium
Species                sinica     leiolepis     spathulifolius  frigida    sativa   annuus      vulgaris  abyssinica  argentatum
Length (bp)            152,718    151,072       149,510         151,076    152,765  151,104     150,689   151,762     152,803
LSC (bp)               84,546     83,308        81,998          82,740     84,105   83,530      82,855    83,636      84,335
SSC (bp)               18,488     18,052        17,973          18,394     18,599   18,308      18,277    18,277      19,390
IRs (bp)               24,842     24,856        24,751          24,971     25,034   24,633      24,777    24,950      24,424
AT contents (%)        62.9       62.7          62.3            62.5       62.5     62.4        62.7      62.4        62.4
GC contents (%)        37.1       37.3          37.7            37.5       37.5     37.6        37.3      37.6        37.6
Total number of genes  131        132           132             134        128      138         132       132         106
Protein coding genes   85         85            87              87         84       85          87        85          56
tRNAs                  27         28            29              30         20       27          29        27          27
rRNAs                  4          4             4               4          4        4           4         4           4
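For the two genomes sequenced in this study, the region lengths in Table 3 can be checked against the quadripartite structure, in which the total length equals LSC + SSC + 2 × IR. The short Python sketch below performs that arithmetic; the dictionary layout and function name are only illustrative.

```python
# Region lengths (bp) for the two genomes sequenced in this study (Table 3).
GENOMES = {
    "Anaphalis sinica": {"total": 152_718, "LSC": 84_546, "SSC": 18_488, "IR": 24_842},
    "Leontopodium leiolepis": {"total": 151_072, "LSC": 83_308, "SSC": 18_052, "IR": 24_856},
}


def quadripartite_length(lsc, ssc, ir):
    """Total length of a genome with one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


computed = {
    name: quadripartite_length(g["LSC"], g["SSC"], g["IR"])
    for name, g in GENOMES.items()
}
for name, length in computed.items():
    print(f"{name}: {length} bp computed, {GENOMES[name]['total']} bp reported")
```

Both sums reproduce the reported totals (84,546 + 18,488 + 2 × 24,842 = 152,718 bp and 83,308 + 18,052 + 2 × 24,856 = 151,072 bp).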

Fig. 3. Comparison of Inverted Repeats (IRs) and chloroplast genome structure among Asteraceae species. The diagram shows the LSC, IRb, SSC, and IRa regions and the genes at their junctions (rps19, ycf1, ndhF, Ψrps19, and trnH) for A. sinica, L. leiolepis, A. spathulifolius, A. frigida, L. sativa, and H. annuus.

Therefore, the trnT-GGU gene sequence may be a useful phylogenetic marker for investigating the complicated phylogeny and evolutionary history of the tribe Gnaphalieae. In particular, pseudogenized trnT-GGU in P. affine and complete trnT-GGU gene loss in Helichrysum and Anaphalis within the HAP clade are informative about their evolutionary history. In general, members of the HAP clade are thought to have originated via allopolyploidy. This conclusion is robustly supported by various cytological, morphological, and molecular features (Smissen et al., 2011; Galbany-Casals et al., 2014). Phylogenetic analyses of Helichrysum and related genera have provided further evidence of such origins, even though the potential parental species are not certain. Nevertheless, our current results give additional information about their lineage. If one considers the maternal inheritance and highly conserved structures of cpDNA, then the presence of the pseudogenized trnT-GGU in P. affine, unlike that in other members of the HAP clade, suggests that neither Helichrysum nor Anaphalis, which both exhibit complete trnT-GGU gene loss, forms the maternal lineage for the genus Pseudognaphalium. Instead, one of the genera within the FLAG clade, which exhibits a pseudogenized version of trnT-GGU, would be a more appropriate maternal lineage candidate. Likewise, we can hypothesize that the genus Anaphalis (x = 14), which features complete trnT-GGU gene loss, originated by allopolyploidy between Helichrysum (x = 7) as a maternal lineage and Pseudognaphalium (x = 7) as a paternal lineage. Consequently, our cpDNA data, including gene contents and structure, serve as useful phylogenetic markers for studies of the phylogeny and evolutionary history of the tribe Gnaphalieae. Examination of further members of the tribe Gnaphalieae will facilitate a more comprehensive understanding of its dynamic evolutionary history.


[Figure 4 image: alignment of the trnT-GGU region (positions 1-148) with Indel 1 and Indel 2 marked. Sequences are grouped as other tribes versus Gnaphalieae, and Gnaphalieae is divided into the FLAG clade (including Leontopodium and Gnaphalium) and the HAP clade (P. affine, H. orientale, A. sinica, A. sinica var. morii, and A. margaritacea).]

Fig. 4. Comparison of trnT-GGU sequences among Asteraceae species.

Phylogenetic Analysis of Asteraceae cp Genomes

We used 52 protein-coding genes to reconstruct the phylogeny of Asteraceae. The analysis included the cp genomes of all examined species, with two Apiales species, Anthriscus cerefolium and Kalopanax septemlobus, as outgroup taxa. It recovered three clades that corresponded to the subfamilies Carduoideae, Cichorioideae, and Asteroideae. Within Asteraceae, Centaurea diffusa of Carduoideae occupied the basal position, whereas Lactuca sativa of Cichorioideae formed a sister group with the species in Asteroideae (Fig. 5). The subfamily Asteroideae formed two groups: Senecionodae/Helianthodae and Asterodae. When further protein-coding genes were incorporated into the reconstruction (data not shown), Jacobaea vulgaris of Senecionodae grouped with the Asterodae. However, this phylogenetic relationship may reflect a biased topology caused by a lack of gene annotations for some taxa. Within Asterodae, the monophyletic group of A. sinica and L. leiolepis was a sister group to the other species within Astereae (Aster spathulifolius) and Anthemideae (C. indicum, Artemisia frigida, and A. montana). These topologies are consistent with previously reported phylogenetic relationships based on chloroplast genes (Panero and Funk, 2008). Moreover, the relationship recovered for the two Asian genera in Gnaphalieae further supports our conclusion that the tribe Gnaphalieae, albeit independent, is closely related to Anthemideae and Astereae rather than to the tribe Inuleae in Helianthodae, which was previously regarded as a sister group.

[Figure 5 image: maximum likelihood tree with bootstrap values at nodes and a scale bar of 0.3. Taxa shown: Anthriscus cerefolium and Kalopanax septemlobus (outgroups); Centaurea diffusa (Cynareae, Carduoideae); Lactuca sativa (Cichorioideae); and, within Asteroideae, Jacobaea vulgaris (Senecionodae), A. adenophora, Guizotia abyssinica, Helianthus annuus, and Parthenium argentatum (Helianthodae), and Aster spathulifolius (Astereae), Anaphalis sinica and Leontopodium leiolepis (Gnaphalieae), and Artemisia montana and Artemisia frigida (Anthemideae) in Asterodae.]

Fig. 5. Phylogenetic analysis using 52 chloroplast protein genes and Maximum Likelihood method.
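A multi-gene analysis of this kind usually starts by joining the per-gene alignments into a single supermatrix before tree inference. The sketch below shows that step only. The function name concatenate_alignments, the toy gene names, and the toy sequences are placeholders chosen for illustration; this section does not describe the exact pipeline used for the 52 genes.

```python
def concatenate_alignments(gene_alignments):
    """Join per-gene alignments into one supermatrix.

    gene_alignments maps gene name -> {taxon: aligned sequence}.
    A taxon lacking a gene is padded with '-' so every row keeps the
    same length. Returns the supermatrix and a list of partitions as
    (gene, start, end) with 1-based inclusive coordinates.
    """
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    rows = {t: [] for t in taxa}
    partitions = []
    position = 0
    for gene, aln in gene_alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        length = lengths.pop()
        for t in taxa:
            rows[t].append(aln.get(t, "-" * length))
        partitions.append((gene, position + 1, position + length))
        position += length
    supermatrix = {t: "".join(parts) for t, parts in rows.items()}
    return supermatrix, partitions


# Toy input: two short made-up alignments, one gene missing in one taxon.
toy = {
    "geneA": {"Anaphalis_sinica": "ATGTCA", "Leontopodium_leiolepis": "ATGTCC"},
    "geneB": {"Anaphalis_sinica": "ATGG"},
}
supermatrix, partitions = concatenate_alignments(toy)
for taxon, seq in supermatrix.items():
    print(taxon, seq)
print(partitions)
```

Padding missing genes with gaps keeps every taxon in the matrix, but heavily padded rows can distort the topology, which is the kind of annotation-related bias noted above for Jacobaea vulgaris.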

Repeat Sequence Analysis

Because of the high level of polymorphism that exists even among plants of the same species, microsatellites, also known as simple sequence repeats (SSRs), are widely used as genetic markers for assessing population genetic diversity, especially in conservation genetics studies of endangered and rare species (Kramer and Havens, 2009). L. leiolepis and its relatives, commonly known as "edelweiss", are primarily found in the upper regions of high mountains and are considered endangered. Thus, a number of edelweiss species require suitable management strategies in many countries. Here, we found 63 and 81 SSRs, including 38 and 44 mono-, 7 and 12 di-, 9 and 9 tri-, and 9 and 16 tetranucleotide repeats, in the cp genomes of L. leiolepis and A. sinica, respectively (Table 4). The mononucleotide repeats exclusively contained the nucleotide A (n = 16 and 22, respectively) or T (n = 22 and 22, respectively), similar to those in other angiosperm species. This A/T-rich tendency was consistent throughout almost all SSR nucleotide categories. These SSR data provide a powerful tool for evaluating the population genetics of these plant species and will help in making informed decisions about suitable conservation management strategies.

In conclusion, this is the first study to analyze the complete chloroplast genomes of two species in the tribe Gnaphalieae within Asteraceae, namely those of A. sinica and L. leiolepis. Their cp genome characteristics, such as copy number and gene content/order, closely resemble those of previously reported cp genomes of other Asteraceae species. However, the observed pseudogenization or gene loss of trnT-GGU in the studied species reinforces the taxonomic boundary of the tribe Gnaphalieae as an independent group, and can be used as an informative molecular marker for further research into generic and/or tribal relationships. Finally, the SSRs identified here provide valuable species-specific genetic information that may assist in conservation efforts.

Table 4. Distribution of simple sequence repeats (SSRs) in the chloroplast genomes of Anaphalis sinica and Leontopodium leiolepis

Base | Length (bp) | No. of SSRs (A. sinica / L. leiolepis) | Location (A. sinica / L. leiolepis)
A | 10 | 9 / 9 | rpoB, clpP, ycf1, IGS / rpoB, rpoC1, IGS, ycf1
A | 11 | 8 / 3 | clpP, ycf3, IGS / IGS
A | 12 | 3 / 1 | trnI-GAU / IGS / clpP
A | 13 | 2 / 1 | IGS / IGS
A | 15 | – / 1 | – / IGS
A | 16 | – / 1 | – / ycf1
T | 10 | 9 / 9 | rps16, ycf3, trnK-UUU, IGS / rpoC1, rpoC2, IGS
T | 11 | 7 / 5 | ycf3, IGS / IGS
T | 12 | 1 / 3 | trnI-GAU / IGS, clpP
T | 13 | 2 / 3 | clpP, IGS / IGS
T | 14 | 1 / – | IGS / –
T | 15 | 1 / – | IGS / –
T | 16 | – / 1 | – / IGS
T | 19 | 1 / – | IGS / –
T | 22 | – / 1 | – / IGS
AT | 10 | 4 / 3 | rpoC2, IGS / rpoC2, IGS
AT | 16 | 1 / – | petD / –
TA | 10 | 4 / 3 | rpoC1, petD, IGS / rpoC1, IGS
TA | 12 | 2 / 1 | IGS / IGS
TA | 14 | 1 / – | clpP / –
AAT | 12 | 2 / – | petB, IGS / –
AAG | 12 | – / 1 | – / IGS
ATA | 12 | 1 / – | IGS / –
ATA | 15 | 1 / – | IGS / –
ATT | 12 | 2 / 1 | IGS / IGS
TAA | 12 | – / 2 | – / IGS
TAA | 15 | 1 / 1 | clpP / IGS
TAT | 12 | – / 1 | – / psbC
TTA | 12 | 2 / 2 | IGS / IGS
TTC | 12 | – / 1 | – / IGS
AAAT | 12 | 2 / – | IGS / –
AATA | 12 | 1 / 1 | IGS / IGS
ATAG | 12 | 1 / 1 | IGS / IGS
ATCA | 12 | 1 / – | IGS / –
ATTT | 12 | 2 / 1 | trnK-UUU, IGS / IGS
GATT | 12 | 1 / 1 | clpP / IGS
TAAT | 12 | 1 / – | IGS / –
TATT | 12 | 2 / 2 | ndhD, IGS / clpP, IGS
TCAA | 12 | 1 / – | IGS / –
TTAA | 12 | 1 / – | IGS / –
TTTA | 12 | 1 / 1 | IGS / IGS
TTTC | 12 | 2 / 2 | rpl16, IGS / rpl16, ndhD
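An SSR survey such as the one summarized in Table 4 can be sketched as a scan for tandem copies of short motifs. The code below is a minimal illustration, not the tool used in this study. The minimum copy numbers are an assumption inferred from the shortest lengths in Table 4 (10 bp for mono- and dinucleotides, 12 bp for tri- and tetranucleotides), and the names find_ssrs, is_primitive, and MIN_REPEATS are hypothetical.

```python
# Minimum number of tandem copies per motif length. ASSUMED from the
# shortest repeats listed in Table 4; not stated in this section.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit."""
    k = len(motif)
    return not any(
        motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0
    )


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (motif, 1-based start, length in bp) for each SSR found."""
    seq = sequence.upper()
    hits = []
    for k, n_min in sorted(min_repeats.items()):
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            # Skip ambiguous bases and motifs such as "AA" or "ATAT",
            # which belong to a shorter motif class.
            if not set(motif) <= set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == motif:
                copies += 1
            if copies >= n_min:
                hits.append((motif, i + 1, copies * k))
                i += copies * k
            else:
                i += 1
    return hits


# Toy sequence: an A run, an AT repeat, and a TTC repeat separated by G.
demo = "G" + "A" * 10 + "G" + "AT" * 5 + "G" + "TTC" * 4 + "G"
ssrs = find_ssrs(demo)
print(ssrs)
```

Mapping each hit to a gene or intergenic spacer (IGS), as in the last column of Table 4, would additionally require the genome annotation coordinates.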

Literature Cited

Anderberg AA (1989) Phylogeny and reclassification of the tribe Inuleae (Asteraceae). Can J Bot 67:2277-2296. doi:10.1139/b89-292
Andrews S (2010) FastQC: a quality control tool for high throughput sequence data. Retrieved from http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc
Bayer RJ, Starr JR (1998) Tribal phylogeny of the Asteraceae based on two non-coding chloroplast sequences, the trnL intron and trnL/trnF intergenic spacer. Ann Missouri Bot Gard 85:242-256. doi:10.2307/2992008
Bentley J, Klaassen ES, Bergh NG (2015) Philyrophyllum (Asteraceae) transferred from Gnaphalieae to Athroismeae based on phylogenetic analysis of nuclear and plastid DNA sequence data. Taxon 64:975-986. doi:10.12705/645.7
Bock DG, Kane NC, Ebert DP, Rieseberg LH (2014) Genome skimming reveals the origin of the Jerusalem artichoke tuber crop species: neither from Jerusalem nor an artichoke. New Phytol 201:1021-1030. doi:10.1111/nph.12560
Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114-2120. doi:10.1093/bioinformatics/btu170
Choi KS, Park SJ (2015) The complete chloroplast genome sequence of Aster spathulifolius (Asteraceae); genomic features and relationship with Asteraceae. Gene 572:214-221. doi:10.1016/j.gene.2015.07.020
Delannoy E, Fujii S, des Francs-Small CC, Brundrett M, Small I (2011) Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol 28:2077-2086. doi:10.1093/molbev/msr028
Dempewolf H, Kane NC, Ostevik KL, Geleta M, Barker MS, Lai Z, Stewart ML, Bekele E, Engels JMM, et al (2010) Establishing genomic tools and resources for Guizotia abyssinica (L.f.) Cass. – the development of a library of expressed sequence tags, microsatellite loci, and the sequencing of its chloroplast genome. Mol Ecol Resour 10:1048-1058. doi:10.1111/j.1755-0998.2010.02859.x
Do HDK, Kim JS, Kim JH (2014) A trnI_CAU triplication event in the complete chloroplast genome of Paris verticillata M. Bieb. (Melanthiaceae, Liliales). Genome Biol Evol 6:1699-1706. doi:10.1093/gbe/evu138


Doorduin L, Gravendeel B, Lammers Y, Ariyurek Y, Chin-A-Woeng T, Vrieling K (2011) The complete chloroplast genome of 17 individuals of pest species Jacobaea vulgaris: SNPs, microsatellites and barcoding markers for population and phylogenetic studies. DNA Res 18:93-105. doi:10.1093/dnares/dsr002
Drummond AJ, Ashton B, Buxton S, Cheung M, Cooper A, Duran C, Field M, Heled J, Kearse M, et al (2011) GENEIOUS Pro v 7. Retrieved from http://www.geneious.com
Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792-1797. doi:10.1093/nar/gkh340
Galbany-Casals M, Unwin M, Garcia-Jacas N, Smissen RD, Susanna A, Bayer RJ (2014) Phylogenetic relationships in Helichrysum (Compositae: Gnaphalieae) and related genera: incongruence between nuclear and plastid phylogenies, biogeographic and morphological patterns, and implications for generic delimitation. Taxon 63:608-624. doi:10.12705/633.8
Harrison PA, Vandewalle M, Sykes MT, Berry PM, Bugter R, de Bello F, Feld CK, Grandin U, Harrington R, et al (2010) Identifying and prioritising services in European terrestrial and freshwater ecosystems. Biodivers Conserv 19:2791-2821. doi:10.1007/s10531-010-9789-x
Hu S, Sablok G, Wang B, Qu D, Barbaro E, Viola R, Li M, Varotto C (2015) Plastome organization and evolution of chloroplast genes in Cardamine species adapted to contrasting habitats. BMC Genomics 16:306. doi:10.1186/s12864-015-1498-0
Jansen RK, Palmer JD (1987a) A chloroplast DNA inversion marks an ancient evolutionary split in the sunflower family (Asteraceae). Proc Natl Acad Sci USA 84:5818-5822. doi:10.1073/pnas.84.16.5818
Jansen RK, Palmer JD (1987b) Chloroplast DNA from lettuce and Barnadesia (Asteraceae): structure, gene localization, and characterization of a large inversion. Curr Genet 11:553-564. doi:10.1007/BF00384619
Karis PO (1993) Morphological phylogenetics of the Asteraceae-Asteroideae, with notes on character evolution. Plant Syst Evol 186:69-93. doi:10.1007/BF00937714
Kim KJ, Jansen RK (1995) ndhF sequence evolution and the major clades in the sunflower family. Proc Natl Acad Sci USA 92:10379-10383. doi:10.1073/pnas.92.22.10379
Kim KJ, Choi KS, Jansen RK (2005) Two chloroplast DNA inversions originated simultaneously during the early evolution of the sunflower family (Asteraceae). Mol Biol Evol 22:1783-1792. doi:10.1093/molbev/msi174
Kramer AT, Havens K (2009) Plant conservation genetics in a changing world. Trends Plant Sci 14:599-607. doi:10.1016/j.tplants.2009.08.005
Kumar S, Hahn FM, McMahan CM, Cornish K, Whalen MC (2009) Comparative analysis of the complete sequence of the plastid genome of Parthenium argentatum and identification of DNA barcodes to differentiate Parthenium species and lines. BMC Plant Biol 9:131. doi:10.1186/1471-2229-9-131
Lee HL, Jansen RK, Chumley TW, Kim KJ (2007) Gene relocations within chloroplast genomes of Jasminum and Menodora (Oleaceae) are due to multiple, overlapping inversions. Mol Biol Evol 24:1161-1180. doi:10.1093/molbev/msm036
Li X, Zhang TC, Qiao Q, Ren Z, Zhao J, Yonezawa T, Hasegawa M, Crabbe MJC, Li J, et al (2013) Complete chloroplast genome sequence of holoparasite Cistanche deserticola (Orobanchaceae) reveals gene loss and horizontal gene transfer from its host Haloxylon ammodendron (Chenopodiaceae). PLoS ONE 8:e58747. doi:10.1371/journal.pone.0058747
Librado P, Rozas J (2009) DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451-1452. doi:10.1093/bioinformatics/btp187
Liu Y, Huo N, Dong L, Wang Y, Zhang S, Young HA, Feng X, Gu YQ (2013) Complete chloroplast genome sequences of Mongolia medicine Artemisia frigida and phylogenetic relationships with other plants. PLoS ONE 8:e57533. doi:10.1371/journal.pone.0057533
Lohse M, Drechsel O, Bock R (2007) Organellar genome DRAW (OGDRAW): a tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes. Curr Genet 52:267-274. doi:10.1007/s00294-007-0161-y
McNeal JR, Kuehl JV, Boore JL, de Pamphilis CW (2007) Complete plastid genome sequences suggest strong selection for retention of photosynthetic genes in the parasitic plant genus Cuscuta. BMC Plant Biol 7:57. doi:10.1186/1471-2229-7-57
Merxmüller H, Leins P, Roessler H (1977) Inuleae – systematic review. In VH Heywood, JB Harborne, BL Turner, eds, The Biology and Chemistry of the Compositae, Vol I. Academic Press, London, UK, pp 577-602
Nie X, Lv S, Zhang Y, Du X, Wang L, Biradar SS, Tan X, Wan F, Weining S (2012) Complete chloroplast genome sequence of a major invasive species, crofton weed (Ageratina adenophora). PLoS ONE 7:e36869. doi:10.1371/journal.pone.0036869
Nie ZL, Funk VA, Sun H, Deng T, Meng Y, Wen J (2013) Molecular phylogeny of Anaphalis (Asteraceae, Gnaphalieae) with biogeographic implications in the Northern Hemisphere. J Plant Res 126:17-32. doi:10.1007/s10265-012-0506-6
Nie ZL, Funk VA, Meng Y, Deng T, Sun H, Wen J (2015) Recent assembly of the global herbaceous flora: evidence from the paper daisies (Asteraceae: Gnaphalieae). New Phytol 209:1795-1806. doi:10.1111/nph.13740
Panero JL, Funk VA (2008) The value of sampling anomalous taxa in phylogenetic studies: major clades of the Asteraceae revealed. Mol Phylogenet Evol 47:757-782. doi:10.1016/j.ympev.2008.02.011
Schattner P, Brooks AN, Lowe TM (2005) The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Res 33:686-689. doi:10.1093/nar/gki366
Schmickl R, Kiefer C, Dobe C, Koch MA (2009) Evolution of trnF (GAA) pseudogenes in cruciferous plants. Plant Syst Evol 282:229-240. doi:10.1007/s00606-008-0030-2
Smissen RD, Galbany-Casals M, Breitwieser I (2011) Ancient allopolyploidy in the everlasting daisies (Asteraceae: Gnaphalieae): complex relationships among extant clades. Taxon 60:649-662


Stamatakis A (2006) RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688-2690. doi:10.1093/bioinformatics/btl446
Stamatakis A, Hoover P, Rougemont J (2008) A rapid bootstrap algorithm for the RAxML web servers. Syst Biol 57:758-771. doi:10.1080/10635150802429642
Takakura KI, Nishio T (2012) Safer DNA extraction from plant tissues using sucrose buffer and glass fiber filter. J Plant Res 125:805-807. doi:10.1007/s10265-012-0502-x
Timme RE, Kuehl JV, Boore JL, Jansen RK (2007) A comparative analysis of the Lactuca and Helianthus (Asteraceae) plastid genomes: identification of divergent regions and categorization of shared repeats. Am J Bot 94:302-312. doi:10.3732/ajb.94.5.27
Wagstaff SJ, Breitwieser I (2002) Phylogenetic relationships of New Zealand Asteraceae inferred from ITS sequences. Plant Syst Evol 231:203-224. doi:10.1007/s006060200020
Walker JF, Zanis MJ, Emery NC (2015) Correction to "Comparative analysis of complete chloroplast genome sequence and inversion variation in burkei (, Asteraceae)". Am J Bot 102:1008. doi:10.3732/ajb.1500990
Ward JM, Bayer RJ, Breitwieser I, Smissen RD, Galbany-Casals M, Unwin M (2009) Gnaphalieae – Systematic and phylogenetic review. In VA Funk, A Susanna, TF Stuessy, RJ Bayer, eds, Systematics, Evolution, and Biogeography of Compositae. IAPT, Vienna, Austria, pp 537-585
Wicke S, Müller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM (2013) Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711-3725. doi:10.1105/tpc.113.113373
Wyman SK, Jansen RK, Boore JL (2004) Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252-3255. doi:10.1093/bioinformatics/bth352
Zhang Y, Li L, Yan TL, Liu Q (2014) Complete chloroplast genome sequences of ( catarium Veldkamp), an important invasive species. Gene 549:58-69. doi:10.1016/j.gene.2014.07.041
Zheljazkov VD, Craker LE (2016) Overview of medicinal and aromatic crops. In VD Jeliazkov, CL Cantrell, eds, Medicinal and Aromatic Crops: Production, Phytochemistry, and Utilization. Oxford University Press, Oxford, UK, pp 1-12. doi:10.1021/bk-2016-1218.ch001
