molecules

Article The Complete Chloroplast Genome of a Key Ancestor of Modern , Rosa chinensis var. spontanea, and a Comparison with Congeneric Species

Hong-Ying Jian 1,*,† ID , Yong-Hong Zhang 2,†, Hui-Jun Yan 1, Xian-Qin Qiu 1, Qi-Gang Wang 1, Shu-Bin Li 1 and Shu-Dong Zhang 3,*

1 National Engineering Research Center for Ornamental Horticulture/Flower Research Institute, Yunnan Academy of Agricultural Sciences, Kunming 650205, China; [email protected] (H.-J.Y.); [email protected] (X.-Q.Q.); [email protected] (Q.-G.W.); [email protected] (S.-B.L.) 2 School of Life Sciences, Yunnan Normal University, Kunming 650500, China; [email protected] 3 School of Biological Sciences and Technology, Liupanshui Normal University, Liupanshui 553004, China * Correspondence: ynwildfl[email protected] (H.-Y.J.); [email protected] (S.-D.Z.); Tel.: +86-871-6589-2602 (H.-Y.J.) † These authors contributed equally to this work.

Received: 19 January 2018; Accepted: 7 February 2018; Published: 12 February 2018

Abstract: Rosa chinensis var. spontanea, an endemic and endangered of China, is one of the key ancestors of modern roses and a source for famous traditional Chinese medicines against female diseases, such as irregular menses and dysmenorrhea. In this study, the complete chloroplast (cp) genome of R. chinensis var. spontanea was sequenced, analyzed, and compared to congeneric species. The cp genome of R. chinensis var. spontanea is a typical quadripartite circular molecule of 156,590 bp in length, including one large single copy (LSC) region of 85,910 bp and one small single copy (SSC) region of 18,762 bp, separated by two inverted repeat (IR) regions of 25,959 bp. The GC content of the whole genome is 37.2%, while that of LSC, SSC, and IR is 42.8%, 35.2% and 31.2%, respectively. The genome encodes 129 genes, including 84 protein-coding genes (PCGs), 37 transfer RNA (tRNA) genes, and eight ribosomal RNA (rRNA) genes. Seventeen genes in the IR regions were found to be duplicated. Thirty-three forward and five inverted repeats were detected in the cp genome of R. chinensis var. spontanea. The genome is rich in SSRs. In total, 85 SSRs were detected. A genome comparison revealed that IR contraction might be the reason for the relatively smaller cp genome size of R. chinensis var. spontanea compared to other congeneric species. Sequence analysis revealed that the LSC and SSC regions were more divergent than the IR regions within the Rosa and that a higher divergence occurred in non-coding regions than in coding regions. A phylogenetic analysis showed that the sampled species of the genus Rosa formed a monophyletic clade and that R. chinensis var. spontanea shared a more recent ancestor with R. lichiangensis of the section Synstylae than with R. odorata var. gigantea of the section Chinenses. This information will be useful for the conservation genetics of R. chinensis var. spontanea and for the phylogenetic study of the genus Rosa, and it might also facilitate the genetics and breeding of modern roses.

Keywords: Rosa chinensis var. spontanea; chloroplast genome; repeats; SSRs; genome comparison; phylogeny

1. Introduction Molecular data have suggested that Rosa chinensis Jacq. var. spontanea (Rehder et. Wilson) Yü et Ku is the maternal parent of R. chinensis and the possible paternal parent of R. odorata (Andrews) Sweet [1], which gave characters of recurrent flowering, tea scent, and multiple floral colors to modern roses [2,3]. As one of the key ancestors of modern roses, R. chinensis var. spontanea is not only a precious germplasm

Molecules 2018, 23, 389; doi:10.3390/molecules23020389 www.mdpi.com/journal/molecules Molecules 2018, 23, 389 2 of 13 resource for improving modern roses but also valuable plant material for the genetic research of recurrent flowering and the study of the biosynthesis of flower scent components. Furthermore, with the effect of “promoting blood circulation for removing blood stasis” and “subdhing swelling and detoxicating”, R. chinensis var. spontanea is also a source for famous traditional Chinese medicines that treat female diseases such as irregular menses and dysmenorrhea [4]. Rosa chinensis var. spontanea originates from China and is endemic to the Hubei, Sichuan, Chongqing, and Guizhou provinces [5,6]. It has been overharvested by local people and pharmaceutical companies because of its medicinal usefulness and has become rare in its wild habitats. It was uncertain whether it still existed as a wild-living species because investigators failed to collect samples of this species in the field [1]. It has been listed as an endangered (EN) species in a recent biodiversity report [7]. Fortunately, during systematic and integrative field investigations focusing on this species, we recently found several populations in the wild. It is important to mention that little information is available about R. chinensis var. spontanea except the fact that it is a diploid plant [8] and that it emits 1,3,5-trimethoxybenzene, together with methyleugenol and isomethyleugenol, as minor floral scent compounds [9], resulting from O-methytransferas genes [10]. The chloroplasts (cps) play important functional roles in the photosynthesis, biosynthesis, and metabolism of starch and fatty acids throughout the plant’s life cycle [11]. Typically, the cp genomes of angiosperm are circulars, with a characteristic quadripartite structure that is comprised of two inverted repeats (IRs) and two single copy regions: a large single copy region (LSC) and a small single copy region (SSC). The genetic composition of cp genome in angiosperm is more or less conserved, containing 110 to 120 genes including protein-coding genes (PCGs), transfer RNA (tRNA) genes and ribosomal RNA (rRNA) genes. In spite of the generally high conservation of gene order and gene content, cp genomes in angiosperm have undergone size changes, structure rearrangement, contraction and expansion of IRs, and even pseudogenization due to adaptations, even within genera, to the host ’ environments [12]. Here we report the sequence and structural analyses of the complete cp genome of R. chinensis var. spontanea, including analyses of the repeats and SSRs. Furthermore, we carried out comparative sequence analysis studies of cp sequences in the genus Rosa. This information will be useful for the conservation genetics of R. chinensis var. spontanea, as well as for the phylogenetic study of the genus Rosa. It might also benefit the genetics and breeding of modern roses.

2. Results and Discussion

2.1. Characteristics of Chloroplast Genome of R. chinensis var. spontanea The complete cp genome of R. chinensis var. spontanea represents a typical quadripartite circular molecule that is 156,590 bp in length. It is composed by a LSC region of 85,910 bp and a SSC region of 18,762 bp, separated by two IR regions of 25,959 bp (Table1 and Figure1). The GC content of the total cp DNA sequence is 37.2%, similar to that of R. odorata (Andr.) Sweet var. gigantea (Crép) Rehd. et Wils.(KF753637) [13], R. praelucens Byhouwer (MG450565) [14] and R. roxburghii Tratt. (KX768420). The GC content of the IR regions is 42.8%, while the LSC and SSC regions exhibit lower GC content (35.2% and 31.2%, respectively) (Table1). The complete cp genome includes 57.8% coding sequences (50.2% PCGs, 1.8% tRNAs, and 5.8% rRNAs) and 42.2% non-coding sequences (11.8% introns and 30.4% intergenic spacers). Among PCGs, the AT content of the first, second, and third positions is 54.7%, 62.5%, and 69.7%, respectively (Table1). This kind of bias towards a higher AT content at the third position of the codons is used to discriminate cp DNA from nuclear and mitochondrial DNA [15] and has been widely reported in other plant cp genomes [16–18]. The cp genome of R. chinensis var. spontanea contains 129 genes, including 84 PCGs, 37 tRNAs, and eight rRNAs (Table S1). Six PCGs (ndhB, rpl2, rpl23, rps7, rps12 and ycf2), four rRNAs (rrn16, rrn23, rrn 4.5 and rrn5) and seven tRNAs (trnA-UGC, trnI-CAU, trnI-GAU, trnL-CAA, trnN-GUU, trnR-ACG, and trnV-GAC) within the IR regions are completely duplicated. The LSC region contains 62 PCGs Molecules 2018, 23, 389 3 of 13

andMolecules 22 tRNAs. 2018, 23, x The FOR SSCPEERregion REVIEW contains one tRNA and 12 PCGs. Additionally, 14 genes, namely3 of 14 trnK-UUU, rps16, trnG-GCC, rpoC1, trnL-UAA, trnV-UAC, petB, rpl16, rp12, ndhB, trnI-GAU, trnA-UGC, ndhAtrnA,-UGC and ,petD ndhA,, contain and petD one, contain intron, one whereas intron, the whereasycf3, rps12 the ycf3and, rps12clpP genesand clpP contain genestwo contain introns. two Despiteintrons.that, Despite there that, are 17–20there groupare 17–20 II introns group within II introns tRNA within and protein-coding tRNA and protein-coding genes in land plantgenes cpin genomesland plant [19 cp], sogenomes far only [19], the intron so far of onlytrnL thehas intron been characterized of trnL has been as a groupcharacterized I intron inaschloroplasts a group I intron [20]. Thus,in chloroplasts all theseintrons [20]. Thus, of R. all chinensis these intronsvar. spontanea of R. chinensis, except var. the spontaneatrnL-UAA, exceptintron, the might trnL-UAA be group intron, II introns.might be The grouprps12 II geneintrons. is trans-spliced The rps12 gene in theis trans-spliced cp genome ofin R.the chinensis cp genomevar. ofspontanea R. chinensis. C-terminal var. spontanea exon. 2C-terminal and 3 of rps12 exonare 2 and located 3 of inrps12 the are IR regions.located in Exon the IR 1 is regions. 28,259 bpExon downstream 1 is 28,259 ofbpthe downstream nearest copy of the of exonsnearest 2 andcopy 3 of while exons 72,017 2 and bp 3 awaywhile from72,017 the bp distal away copy from of the exons distal 2 andcopy 3 of (Table exons S1). 2 and The 3trnK-UUU (Table S1). geneThe hadtrnK-UUU the largest gene intron had withthe largest a 2498 bpintron length, with in whicha 2498 thebpmatK length,gene in waswhich located. the matK The matK genegene was encodeslocated. MatK,The matK the maturasegene encodes which MatK, is derived the maturase from reverse which transcriptase is derived andfrom has reverse been provedtranscriptase to be anand essential has been splice proved factor to forbe an both essential the group splice I and factor group for IIboth introns the group [20,21 ].I and group II introns [20,21].

Figure 1. Chloroplast genome map of Rosa chinensis var. spontanea. Genes inside the circle are transcribed Figure 1. Chloroplast genome map of Rosa chinensis var. spontanea. Genes inside the circle are clockwise, and those outside are counterclockwise. Genes of different functions are color-coded. The darker transcribed clockwise, and those outside are counterclockwise. Genes of different functions are gray in the inner circle shows the GC content, while the lighter gray shows the AT content. color-coded. The darker gray in the inner circle shows the GC content, while the lighter gray shows the AT content. Based on the sequences of PCGs and tRNAs, the frequency of codon usage of the cp genome of R. chinensisBased var.on thespontanea sequenceswas of estimated PCGs and (Table tRNAs,2). In the total, frequency 27,525 codons of codon were usage found of the in all cp the genome coding of sequences.R. chinensis Among var. spontanea these, leucine was estimated is the most (Table frequent 2). In amino total, acid, 27,525 representing codons were 10.4% found (2,871) in all of thethe totalcoding codons, sequences. while Among cysteine these, is the leucine least frequent is the most one withfrequent 1.2% amino (320) of acid, the representing codons. A- and 10.4% U-ending (2,871) codonsof the total are common. codons, while Except cysteine for trnL-CAA is the, leasttrnS-GGA frequentand one a stop with codon 1.2% (UAG), (320) of all the types codons. of preferred A- and synonymousU-ending codons codons are (RSCU common. > 1) Except ended for with trnL-CAA A or U. , trnS-GGA and a stop codon (UAG), all types of preferred synonymous codons (RSCU > 1) ended with A or U.

Molecules 2018, 23, 389 4 of 13

Table 1. Base composition in the chloroplast genome of Rosa chinensis var. spontanea.

Region A T (U) G C Length LSC 31.7 33.1 17.2 18.0 85,910 SSC 34.4 34.3 15.1 16.3 18,762 IRB 28.7 28.5 22.2 20.6 25,959 IRA 28.7 28.5 22.2 20.6 25,959 Total 31.0 31.8 18.6 18.7 156,590 PCGs 30.6 31.4 20.3 17.7 79,773 1st position 30.7 24 26.9 18.7 26,591 2nd position 29.5 33 17.7 20.2 26,591 3rd position 31.7 38 16.4 14.1 26,591 PCGs: protein-coding genes.

Table 2. Condon-anticodon recognition patterns and codon usage of the Rosa chinensis var. spontanea chloroplast genome.

Amino Acid Codon Count RSCU tRNA Amino Acid Codon Count RSCU tRNA Phe UUU 1015 1.3 Ser UCU 580 1.62 Phe UUC 545 0.7 trnF-GAA Ser UCC 370 1.03 trnS-GGA Leu UUA 897 1.87 Ser UCA 406 1.13 trnS-UGA Leu UUG 580 1.21 trnL-CAA Ser UCG 222 0.62 Leu CUU 595 1.24 Pro CCU 424 1.45 Leu CUC 217 0.45 Pro CCC 241 0.82 Leu CUA 380 0.79 Pro CCA 320 1.09 trnP-UGG Leu CUG 202 0.42 Pro CCG 187 0.64 Ile AUU 1136 1.48 Thr ACU 542 1.55 Ile AUC 451 0.59 trnI-CAU Thr ACC 269 0.77 trnT-GGU Ile AUA 716 0.93 Thr ACA 418 1.19 trnT-UGU Met AUG 635 1 trnM-CAU Thr ACG 171 0.49 Val GUU 550 1.44 Ala GCU 645 1.76 Val GUC 193 0.5 trnV-GAC Ala GCC 244 0.67 Val GUA 567 1.48 Ala GCA 391 1.07 Val GUG 223 0.58 Ala GCG 183 0.5 Tyr UAU 798 1.6 Cys UGU 237 1.48 Tyr UAC 198 0.4 trnY-GUA Cys UGC 83 0.52 trnC-GCA stop UAA 59 1.38 stop UGA 21 0.49 stop UAG 48 1.13 Trp UGG 484 1 trnW-CCA His CAU 476 1.49 Arg CGU 362 1.28 trnR-ACG His CAC 161 0.51 trnH-GUG Arg CGC 120 0.43 Gln CAA 734 1.51 trnQ-UUG Arg CGA 385 1.36 Gln CAG 236 0.49 Arg CGG 144 0.51 Asn AAU 1003 1.52 Ser AGU 420 1.17 Asn AAC 317 0.48 Ser AGC 156 0.43 trnS-GCU Lys AAA 1082 1.48 Arg AGA 488 1.73 trnR-UCU Lys AAG 385 0.52 Arg AGG 194 0.69 Asp GAU 890 1.62 Gly GGU 612 1.3 Asp GAC 207 0.38 trnD-GUC Gly GGC 209 0.44 trnG-GCC Glu GAA 1052 1.46 trnE-UUC Gly GGA 694 1.48 Glu GAG 390 0.54 Gly GGG 365 0.78 RSCU: Relative Synonymous Codon Usage.

2.2. Repeat and SSR Analysis For the repeat structure analysis, 33 forward and five inverted repeats with a minimal repeat size of 20 bp were detected in the cp genome of R. chinensis var. spontanea (Table3). Most of these repeats are between 20 and 30 bp. The longest forward repeat is 41 bp in length, located in the intergenic region between the genes psbE and petL. Most of the repeats were found in the LSC region. Among them, repeat No. 5 is related to trnS-GCU and trnS-UGA (Table3). Repeat No. 7 is related to trnG-GCU and trnG-UCC. Repeat No. 13 is associated with psa genes. Six forward repeats were located in IR regions, including two repeats associated with ycf2 genes and one repeat related to the ndhB gene. In addition, there were several repeat pairs with either repeated sequence located in a distinct region, e.g., each of the two sequences of repeat No. 16, 25, and 26 are located in the gene introns of LSC and SSC, respectively. Molecules 2018, 23, 389 5 of 13

Table 3. Repeat sequences in the Rosa chinensis var. spontanea chloroplast genome.

Size Mismatch ID Repeat Start 1 Type Repeat Start 2 E-Value Gene Region (bp) (bp) 1 4426 F 29 45,071 −2 8.74 × 10−5 IGS; ycf3(intron) LSC 2 4427 F 30 4428 −3 6.56 × 10−4 IGS LSC 3 4428 F 28 45,072 −3 8.47 × 10−3 IGS LSC 4 4432 F 26 45,072 −2 4.48 × 10−3 IGS LSC 5 8329 F 29 36,077 −2 8.74 × 10−5 trnS-GCU; trnS-UGA LSC 6 8873 F 20 8895 0 6.27 × 10−3 IGS LSC 7 9804 F 27 37,135 −1 3.10 × 10−5 trnG-GCU; trnG-UCC LSC LSC; 8 13,510 F 20 89,606 0 6.27 × 10−3 IGS; ycf2 IRa 9 14,236 F 20 29,560 0 6.27 × 10−3 IGS LSC 10 27,619 F 24 27,643 0 2.45 × 10−5 IGS LSC 11 29,555 F 24 29,556 −1 1.76 × 10−3 IGS LSC 12 33,157 F 20 33,177 0 6.27 × 10−3 IGS LSC 13 39,390 F 30 41,614 −3 6.56 × 10−4 psaB; psaA LSC LSC; 14 42,625 F 25 147,248 −1 4.59 × 10−4 IGS IRb LSC; 15 44,406 F 39 100,262 0 2.28 × 10−14 ycf3(intron); IGS IRa LSC; 16 44,406 F 38 122,332 0 9.13 × 10−14 ycf3(intron); ndhA(intron) SSC LSC; 17 45,075 F 24 142,008 −1 1.76 × 10−3 ycf3(intron); IGS IRb 18 47,622 F 25 47,645 0 6.13 × 10−6 IGS LSC 19 58,656 F 34 58,687 0 2.34 × 10−11 IGS LSC 20 66,712 F 41 66,752 0 1.43 × 10−15 IGS LSC 21 66,939 F 20 66,958 0 6.27 × 10−3 IGS LSC 22 68,033 F 21 68,052 0 1.57 × 10−3 IGS LSC 23 71,232 F 20 84,928 0 6.27 × 10−3 IGS LSC 24 80,953 F 27 80,966 −2 1.21 × 10−3 IGS LSC 25 83,166 F 29 122,320 −3 2.36 × 10−3 rpl16(intron); ndhA(intron) LSC;SSC 26 83,172 F 28 122,326 −3 8.47 × 10−3 rpl16(intron); ndhA(intron) LSC;SSC 27 90,610 F 29 90,631 −2 8.74 × 10−5 ycf2 IRa IRa; 28 97,630 F 31 144,839 −3 1.81 × 10−4 ndhB(intron) IRb IRa; 29 100,260 F 40 122,330 0 5.70 × 10−15 IGS; ndhA(intron) SSC 30 101,012 F 23 101,033 0 9.80 × 10−5 IGS IRa 31 141,437 F 30 141,458 −2 2.34 × 10−5 IGS IRb 32 141,444 F 23 141,465 0 9.80 × 10−5 IGS IRb 33 151,840 F 29 151,861 −2 8.74 × 10−5 ycf2 IRb 34 6406 I 20 71,231 0 6.27 × 10−3 IGS LSC 35 6408 I 24 71,228 −1 1.76 × 10−3 IGS LSC 36 8622 I 26 45,073 −2 4.48 × 10−3 IGS LSC 37 8625 I 23 45,077 −1 6.76 × 10−3 IGS LSC 38 71,232 I 20 84,930 0 6.27 × 10−3 IGS LSC F: Forward; I: Inverted; IGS: intergenic space.

As chloroplast-specific SSRs are uniparentally inherited and exhibit a high level of intraspecific polymorphism, they are widely used in population genetics, species identification, evolutionary processes research of wild plants [22,23], and as markers for linkage map construction and the breeding of crop plants [24,25]. In total, 85 SSRs were identified in the cp genome of R. chinensis var. spontanea, most of which were detected in the LSC region (Table4). Among them, 55 (64.7%) are mononucleotide SSRs, ten (11.8%) are dinucleotide SSRs, seven (8.2%) are trinucleotide SSRs, 10 (11.8%) are tetranucleotide SSRs, one (1.2%) is a pentanucleotide SSR, and two (2.4%) are hexanucleotide SSRs. Only 22 SSRs are located in genes and the others are in the intergenic regions. Fifty two (94.5%) of the mononucleotide SSRs belong to the A/T type, which is consistent with the hypothesis that cp SSRs are generally composed of short polyadenine (poly A) or polythymine (poly T) repeats and rarely contain tandem guanine (G) or cytosine (C) repeats. These cp SSR markers can be used in the conservation genetics of R. chinensis var. spontanea, as well as and in both the linkage map construction and molecular-marker-assisted selection of modern roses. Molecules 2018, 23, 389 6 of 13

Table 4. Simple sequence repeats (SSRs) in the Rosa chinensis var. spontanea chloroplast genome.

ID Repeat Motif Length (bp) Start End Region Gene ID Repeat Motif Length (bp) Start End Region Gene 1 (A)11 11 279 289 LSC 44 (TTTA)4 12 50,468 50,479 LSC 2 (T)11 11 4108 4118 LSC 45 (TA)5 10 52,742 52,751 LSC 3 (A)19 19 4428 4446 LSC 46 (T)10 10 55,811 55,820 LSC atpB 4 (A)10 10 4449 4458 LSC 47 (AAAT)3 12 55,911 55,922 LSC 5 (A)10 10 4887 4896 LSC 48 (TAAT)3 12 58,366 58,377 LSC 6 (T)10 10 5023 5032 LSC 49 (T)14 14 60,810 60,823 LSC 7 (TATAT)3 15 6102 6116 LSC rps16 50 (TC)5 10 62,280 62,289 LSC cemA 8 (T)17 17 6407 6423 LSC 51 (T)11 11 64,513 64,523 LSC 9 (AATA)3 12 6525 6536 LSC 52 (T)10 10 69,689 69,698 LSC 10 (AG)5 10 6755 6764 LSC 53 (A)16 16 69,739 69,754 LSC 11 (A)11 11 6945 6955 LSC 54 (T)18 18 71,235 71,252 LSC 12 (TAA)4 12 8257 8268 LSC 55 (T)15 15 71,933 71,947 LSC clpP 13 (A)10 10 8639 8648 LSC 56 (A)10 10 72,733 72,742 LSC clpP 14 (AT)6 12 10,093 10,104 LSC 57 (AT)6 12 73,632 73,643 LSC 15 (TAT)4 12 10,343 10,354 LSC 58 (A)12 12 79,231 79,242 LSC 16 (T)11 11 12,157 12,167 LSC 59 (A)14 14 79,393 79,406 LSC 17 (T)10 10 12,915 12,924 LSC 60 (T)10 10 79,429 79,438 LSC rpoA 18 (A)10 10 13,184 13,193 LSC 61 (ATGT)3 12 79,529 79,540 LSC rpoA 19 (C)10 10 14,237 14,246 LSC 62 (T)11 11 81,586 81,596 LSC 20 (T)10 10 14,247 14,256 LSC 63 (A)10 10 82,641 82,650 LSC 21 (T)11 11 18,361 18,371 LSC rpoC2 64 (A)12 12 83,422 83,433 LSC rpl16 22 (TA)5 10 19,730 19,739 LSC rpoC2 65 (A)11 11 83,498 83,508 LSC rpl16 23 (T)10 10 26,080 26,089 LSC rpoB 66 (T)18 18 84,931 84,948 LSC 24 (T)12 12 28,925 28,936 LSC 67 (TAT)4 12 86,619 86,630 IRa rpl2 25 (C)15 15 29,556 29,570 LSC 68 (TAGAAG)3 18 93,987 94,004 IRa ycf2 26 (T)10 10 29,571 29,580 LSC 69 (T)11 11 101,618 101,628 IRa 27 (AAT)4 12 30,504 30,515 LSC 70 (AGGT)3 12 107,843 107,854 IRa rrn23 28 (T)14 14 30,519 30,532 LSC 71 (TATT)3 12 110,028 110,039 IRa 29 (A)10 10 30,666 30,675 LSC 72 (TGT)4 12 111,869 111,880 SSC 30 (TA)5 10 36,313 36,322 LSC 73 (T)10 10 115,507 115,516 SSC 31 (T)11 11 36,473 36,483 LSC 74 (TAA)4 12 115,558 115,569 SSC 32 (AT)5 12 37,070 37,079 LSC 75 (A)13 13 115,612 115,624 SSC 33 (C)13 13 37,303 37,315 LSC 76 (T)10 10 120,845 120,854 SSC 34 (A)11 11 37,316 37,326 LSC 77 (AT)7 14 121,678 121,691 SSC 35 (AT)5 10 43,682 43,691 LSC ycf3 78 (A)16 16 122,551 122,566 SSC ndhA 36 (A)15 15 45,073 45,087 LSC ycf3 79 (T)15 15 122,804 122,818 SSC ndhA 37 (A)10 10 45,392 45,401 LSC 80 (T)10 10 129,830 129,839 SSC ycf1 38 (T)10 10 45,931 45,940 LSC 81 (ATAA)3 12 132,463 132,474 IRb 39 (A)11 11 47,296 47,306 LSC 82 (CTAC)3 12 134,645 134,656 IRb rrn23 40 (TAAT)3 12 48,112 48,123 LSC 83 (A)11 11 140,873 140,883 IRb 41 (T)14 14 48,306 48,319 LSC 84 (CTTCTA)3 18 148,497 148,514 IRb ycf2 42 (A)12 12 48,420 48,431 LSC 85 (ATA)4 12 155,871 155,882 IRb 43 (TA)5 10 48,500 48,509 LSC Molecules 2018, 23, x FOR PEER REVIEW 8 of 14 Molecules 2018, 23, 389 7 of 13

2.3. Comparative Analysis of the Chloroplast Genomes of the Genus Rosa 2.3. Comparative Analysis of the Chloroplast Genomes of the Genus Rosa The complete cp genome sequence of R. chinensis var. spontanea was compared to that of R. odorataThe complete var. gigantea cp genome [13], sequence R. roxburghii of R. chinensis (KX768420)var. spontanea and wasR. praelucens compared to(MG450565) that of R. odorata [14]. var.Rosa giganteachinensis[ var.13], spontaneaR. roxburghii has the(KX768420) smallest cp and genomeR. praelucens with the(MG450565) smallest IR [region14]. Rosa (25,959 chinensis bp), whilevar. spontaneaR. praelucenshas thehas smallest the largest cp genomecp genome with with the smallest the largest IR region LSC, at (25,959 86,313 bp), bp while(TableR. S2). praelucens No significanthas the largestdifferences cp genome were found with the in the largest sequence LSC, at lengths 86,313 of bp SSC (Table among S2). Nothe significantfour species. differences The main were reason found for inthe the length sequence differences lengths in of cp SSC genomes among of the different four species. species The main is the reason size forvariation the length of the differences LSC and inIR cpregions genomes (Table of differentS2). rose species is the size variation of the LSC and IR regions (Table S2). SequenceSequence comparisonscomparisons revealed that thethe LSCLSC andand thethe SSCSSC regionsregions werewere moremore divergentdivergent thanthan thethe IRIR regions,regions, andand thatthat higherhigher divergencedivergence couldcould bebe foundfound inin non-codingnon-coding regionsregions thanthan inin codingcoding regionsregions (Figure(Figure2 ).2). Significant Significant variations variations could could be be found found in in coding coding regions regions of of some some genes genes including including rps19rps19and andycf1 ycf1.. TheThe highesthighest divergencedivergence inin non-codingnon-coding regionsregions waswas foundfound inin the the intergenic intergenic regionsregions ofof thethetrnK-rps16 trnK-rps16, rps16-trnQ, rps16-trnQ, trnS-trnG, trnS-trnG, trnR-atpA, trnR-atpA, atpF-atpH, atpF-atpH, rps2-rpoC2, rps2-rpoC2, rpoB-trnC, ,rpoB-trnCtrnC-petN, , trnT-psbDtrnC-petN,, psbZ-trnGtrnT-psbD,,rps4-trnT psbZ-trnG, psbE-petL, rps4-trnT, trnP-psaJ, psbE-petL, ndhF-rpl32, trnP-psaJ,, andndhF-rpl32ccsA-ndhD., andThe ccsA-ndhD. introns ofTherpl2, introns rps16, of ndhA, rpl2, trnV,rps16, clpP ndhA,, and trnV,ndhA clpPwere, and relatively ndhA were highly relatively divergent, highly too. divergent, These resultstoo. These might results indicate might that indicate these regionsthat these evolve regions rapidly evolve in therapidly genus inRosa the ,genus as well Rosa as, in as other well Rosaceaeas in other plants [26,27 plants]. [26,27].

Figure 2. Complete chloroplast genome comparison of four rose species using the chloroplast genome Figure 2. Complete chloroplast genome comparison of four rose species using the chloroplast genome ofofR. R. chinensischinensis var.var.spontanea spontanea asas aareference. reference. TheThe greygrey arrowsarrowsand andthick thickblack blacklines linesabove above the the alignment alignment indicateindicate the the gene gene orientation. orientation. TheThey y-axis-axis represents represents the the identity identity from from 50% 50% to to 100%. 100%.

2.4. IR Contraction in the Chloroplast Genome of R. chinensis var. spontanea 2.4. IR Contraction in the Chloroplast Genome of R. chinensis var. spontanea Although IRs are the most conserved regions of the cp genomes, contraction and expansion at Although IRs are the most conserved regions of the cp genomes, contraction and expansion at the the borders of IR regions are common evolutionary events, and are hypothesized to be the main borders of IR regions are common evolutionary events, and are hypothesized to be the main reason for reason for size differences between cp genomes [28]. Detailed comparisons of the IR-SSC and IR-LSC

Molecules 2018, 23, x FOR PEER REVIEW 9 of 14 boundaries among the cp genomes of the above four rose species were presented in Figure 3. IR regions are relatively highly conserved in the genus Rosa, but compared to other congeneric species, some position changes occurred in the IR/LSC regions of R. chinensis var. Spontanea. The rpl2 gene in the cp genome shifted by 31 bp from IRb to LSC at the LSC/IRb border, and that gene also shifted by 31 bp from IRa to LSC at the IRa/LSC border, indicating the IR contraction in the cp genome of this species. This contraction is mainly caused by the fragment deletions in the intergenic regions of theMolecules rps12-trnV2018, 23,, 389rrn4.5-rrn5, and trnR-trnN genes, and leads to the relatively smaller size of its8 of IR 13 regions and consequently a smaller size of the cp genome (Figure 3, Table S2). Generally, the IRa/LSC border is located between the rpl2 and trnH genes in the rose family withsize differences rpl2 in IRa betweenand trnH cp in genomes LSC [27], [28 like]. Detailed in R. roxburghii comparisons and R. of odorata the IR-SSC var. gigantea and IR-LSC. The boundaries trnH gene ofamong R. praelucens the cpgenomes extends only of the one above bp from four LSC rose to species IRa, but were its LSC presented region in was Figure much3. larger IR regions than that are ofrelatively other species highly (Table conserved S2). One in the 505 genus bp insertRosa,ion but in compared the intergenic to other region congeneric between species, the genes some psbM position and trnDchanges was occurreddetected according in the IR/LSC to the regions result of of theR. MAFFT chinensis alignment.var. Spontanea. This largeThe insertionrpl2 gene leads in theto the cp largestgenome LSC shifted region by of 31 R. bp praelucens from IRb and to LSCthus atthe the largest LSC/IRb cp genome border, among and that these gene four also rose shifted species. by The31 bp extraction from IRa and to LSC contraction at the IRa/LSC of the IR border, region indicating at the IR-SSC the IR boundaries contraction among in the these cp genome species of were this notspecies. significant. This contraction Accordingly, is mainlythe extension caused and by thecontra fragmentction of deletions IR regions in theat the intergenic IR/LSC borders, regions ofalong the withrps12-trnV the large, rrn4.5-rrn5 insertion/deletion, and trnR-trnN in the LSCgenes, region, and leadsmight to be the the relatively main reason smaller for sizethe ofcp itsgenome IR regions size variationand consequently in the genus a smaller Rosa. size of the cp genome (Figure3, Table S2).

Figure 3. Comparison of the LSC, SSC and IR regions in chloroplast genomes of four species. Ψ: Figure 3. Comparison of the LSC, SSC and IR regions in chloroplast genomes of four species. pseudogenes, →: distance from the edge. Ψ: pseudogenes, →: distance from the edge.

2.5. Phylogenetic Analysis Generally, the IRa/LSC border is located between the rpl2 and trnH genes in the rose family with rpl2 inThere IRa have and trnH beenin many LSC attempts [27], like to in reconstructR. roxburghii theand phylogenyR. odorata ofvar. thegigantea genus Rosa. The. MosttrnH ofgene them of R.suggested praelucens thatextends the extant only oneclassifica bp fromtion LSC system to IRa, was but artificial its LSC [29,30] region and was that much interspecies larger than relationships that of other withinspecies the (Table genus S2). remained One 505 bpambiguous. insertion inThe the specific intergenic relationships region between within the the genes sectionspsbM Chinensesand trnD wasand Synstylaedetected accordingwere still toobscure the result due of to the limited MAFFT sampling, alignment. low This genetic large insertionvariation leadsof molecular to the largest markers, LSC andregion complex of R. praelucens evolutionaryand thushistories the largest [31]. The cp genomeavailability among of the these complete four rose cp species.genomes The will extraction provide additionaland contraction informative of the IR data region for atthe the reconstruction IR-SSC boundaries of a amongrobust phylogenetic these species weremodel not for significant. the rose Accordingly,species. The thephylogenetic extension andtree contraction(Figure 4) ofbased IR regions on the at LSC, the IR/LSC SSC and borders, one-IR along regions with in the the large cp genomesinsertion/deletion of 22 species in the from LSC Rosaceae region, might showed be the that main species reason from for theRosaceae cp genome were sizemonophyletic variation in and the thatgenus theRosa intra-family. relationships were almost in compliance with that found by Zhang et al. [32]. Species from the genus Rosa formed a monophyletic clade with 100% support. The representative of 2.5. Phylogenetic Analysis There have been many attempts to reconstruct the phylogeny of the genus Rosa. Most of them suggested that the extant classification system was artificial [29,30] and that interspecies relationships within the genus remained ambiguous. The specific relationships within the sections Chinenses and Synstylae were still obscure due to limited sampling, low genetic variation of molecular markers, and complex evolutionary histories [31]. The availability of the complete cp genomes will provide Molecules 2018, 23, 389 9 of 13 additional informative data for the reconstruction of a robust phylogenetic model for the rose species. The phylogenetic tree (Figure4) based on the LSC, SSC and one-IR regions in the cp genomes of 22 species from Rosaceae showed that species from Rosaceae were monophyletic and that the intra-familyMolecules 2018, relationships23, x FOR PEER wereREVIEW almost in compliance with that found by Zhang et al. [32]. Species10 from of 14 the genus Rosa formed a monophyletic clade with 100% support. The representative of the subgenus Hulthemiathe subgenus, R. persicaHulthemiaMichx., R. [persica33,34], Michx.was a sister [33,34] to, thewas clade a sister composed to the clade by the composed other five by rose the species, other supportingfive rose species, the subgenus supporting position the subgenus of Hulthemia position. In the of subgenus HulthemiaRosa. In, theR. chinensis subgenusvar. Rosaspontanea, R. chinensisfrom sectionvar. spontaneaChinenses fromwas section sister toChinensesR. lichiangensis was sisterfrom to section R. lichiangensisSynstylae ,from and thensection clustered Synstylae with, and another then speciesclustered from with section anotherChinenses species, R.from odorata sectionvar. Chinensesgigantean,, R. confirming odorata var. that giganteanR. sections, confirmingChinenses andthat SynstylaeR. sections, defined Chinenses in the and traditional Synstylae, taxonomic defined in system, the traditional shared a moretaxonomic recent system, ancestor shared and could a more be mergedrecent ancestor as onesection and could in the be genusmergedRosa as one[30]. section in the genus Rosa [30].

Figure 4.4.Phylogeny Phylogeny of of 22 22 species species within within Rosaceae Rosaceae based ba onsed the on ML the analysis ML analysis of the chloroplast of the chloroplast genome’s LSC,genome’s SSC, andLSC, one-IR SSC, regionsand one-IR with regionsBerchemiella with wilsonii Berchemiella() wilsonii (Rhamnaceae) as the outgroup. as The the positionoutgroup. of R.The chinensis positionvar. of R.spontanea chinensisis var. shown spontanea in block is letters.shown in block letters.

3. Materials and Methods

3.1. DNA Sequencing and Chloroplast Genome Assembly ◦ 0 ◦ 0 Dry leaves of R. chinensis var. spontanea collected from Yichang ofof HubeiHubei (111(111°1010′ E, 3030°4747′ N, 400 m) were used to extract the total genomic DNA.DNA. A shotgunshotgun library was prepared andand sequencedsequenced using the IlluminaIllumina HiseqHiseq 20002000 (Illumina,(Illumina, CA,CA, USA)USA) atat NovogeneNovogene (Beijing,(Beijing, China).China). ApproximatelyApproximately 3.68 Gb rawraw datadata ofof 150150 bpbp paired-endpaired-end readsreads werewere generated.generated. The rawraw readsreads werewere filteredfiltered toto obtainobtain high-quality clean clean reads reads by by using using NGS NGS QC QCToolkit Toolkit v2.3.3 v2.3.3 with withdefault default parameters parameters [35]. The [35 cp]. genome The cp was de novo assembled using the GetOrganelle pipeline (https://github.com/Kinggerm/GetOrganelle).

3.2. Gene Annotation and Sequence Analysis The genome was automatically annotated by using the CpGAVAS pipeline [36]. The annotation was adjusted and confirmed by Geneious 8.1 [37]. Sequence data was deposited into GenBank under the accession number MG523859. The circular cp map of R. chinensis var. spontanea was generated

Molecules 2018, 23, 389 10 of 13 genome was de novo assembled using the GetOrganelle pipeline (https://github.com/Kinggerm/ GetOrganelle).

3.2. Gene Annotation and Sequence Analysis The genome was automatically annotated by using the CpGAVAS pipeline [36]. The annotation was adjusted and confirmed by Geneious 8.1 [37]. Sequence data was deposited into GenBank under the accession number MG523859. The circular cp map of R. chinensis var. spontanea was generated by OGDRAW [38]. Codon usage analysis, calculation of relative synonymous codon usage values (RSCU), and measurement of AT content were carried out by using MEGA 6.06 [39].

3.3. Genome Comparison MUMer [40] was used to perform pairwise sequence alignments of cp genomes. The mVISTA [41] program was applied to compare the complete cp genome of R. chinensis var. spontanea to the other published cp genomes of its congeneric species, i.e., R. odorata var. gigantea, R. roxburghii and R. praelucens, using the shuffle-LAGAN mode [42] and using the annotation of R. chinensis var. spontanea as reference.

3.4. Repeats and Simple Sequence Repeats (SSRs) REPuter [43] was used to find forward and inverted tandem repeats ≥ 20 bp with a minimum alignment score and maximum period size of 100 and 500, respectively. The minimum identity of repeats was limited to 85% (Hamming distance of 3). IMEx [44] was used to identify SSRs with the minimum repeat number set to 10, 5, 4, 3, 3 and 3 for mono-, di-, tri-, tetra-, penta- and hexanucleotides, respectively.

3.5. Phylogenetic Analysis To identify the phylogenetic position of R. chinensis var. spontanea in Rosa, 21 published cp genomes of Rosaceae were used to construct a phylogeny tree, using Berchemiella wilsonii (C. K. Schneid.) Nakai (Rhamnaceae) as the outgroup. The LSC, SSC, and one-IR regions of the total 23 cp genomes were all aligned using MAFFT 7.308 [45]. The maximum likelihood (ML) tree was reconstructed by RAxML 8.2.11 [46] with the nucleotide substitution model of GTR + Gamma; node support was conducted by means of a bootstrap analysis with 1000 replicates.

4. Conclusions In this study, we report and analyze the first complete cp genome of R. chinensis var. spontanea, one of the key ancestors of modern roses and a source for famous traditional Chinese medicines against female diseases. Compared to the cp genomes of other rose species, the cp genome of R. chinensis var. spontanea is the smallest, most likely due to the contraction of IR regions by 31 bps on each IR/LSC border. The cp genome of R. chinensis var. spontanea is rich in SSRs, which are valuable sources for developing new molecular markers. Our phylogenetic analysis showed that sampled species of the genus Rosa formed a monophyletic clade. Rosa chinensis var. spontanea shared a more recent ancestor with R. lichiangensis of the section Synstylae than with R. odorata var. gigantea of the section Chinenses. This supported the hypothesis that, in the traditional taxonomic system, Rosa sections Chinenses and Synstylae were closely related and could be merged to a single section within the genus Rosa. This information will be useful for the conservation genetics of R. chinensis var. spontanea and the phylogenetic study of genus Rosa, and might also facilitate the genetics and breeding of modern roses.

Supplementary Materials: Supplementary materials are available online. Acknowledgments: This study was supported by the National Natural Scientific Foundation of China (Grant 31760087), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant XDPB0201), and the Academic and Technical Talent Training Project of Yunnan Province, China (Grant 2013HB092). Molecules 2018, 23, 389 11 of 13

Author Contributions: Hong-Ying Jian and Shu-Dong Zhang conceived and designed the research framework; Hong-Ying Jian wrote the paper; Shu-Dong Zhang and Yong-Hong Zhang assembled and annotated the genome; Hong-Ying Jian, Yong-Hong Zhang, and Hui-Jun Yan analyzed the data; Hong-Ying Jian, Hui-Jun Yan, and Xian-Qin Qiu performed the experiments. Hong-Ying Jian, Yong-Hong Zhang, Qi-Gang Wang, Shu-Bin Li, and Shu-Dong Zhang collected the samples and made revisions to the final manuscript. All authors have read and approved the final manuscript. Conflicts of Interest: The authors declare no conflict of interest.

References

1. Meng, J.; Fougère-Danezan, M.; Zhang, L.B.; Li, D.Z.; Yi, T.S. Untangling the hybrid origin of the Chinese tea roses: Evidence from DNA sequences of single-copy nuclear and chloroplast genes. Plant Syst. Evol. 2011, 297, 157–170. [CrossRef] 2. Wylie, A. The history of . J. R. Hortic. Soc. 1954, 79, 555–571. 3. Rix, M. Rosa chinensis f. spontanea. Curtis’s Bot. Mag. 2005, 22, 214–219. [CrossRef] 4. Ye, J.Q. Modern Practical Herb; China Press of Traditional Chinese Medicine: Beijing, China, 2015; pp. 129–130, ISBN 9787513223744. 5. Ku, T.C. Rosa. In Flora Reipublicae Popularis Sinicae; Editorial Board of the Flora Republicae Popularis Sinicae, Ed.; Science Press: Beijing, China, 1985; Volume 37, pp. 360–455. ISBN 7030040171. 6. Ku, T.C.; Robertson, K.R. Rosa (Rosaceae). In Flora of China; Wu, Z.Y., Raven, P.H., Hong, D.Y., Eds.; Science Press: Beijing, China; Missouri Botanical Garden Press: St. Louis, MO, USA, 2003; Volume 9, pp. 339–381. ISBN 9787030130440. 7. Qin, H.N.; Yang, Y.; Dong, S.Y.; He, Q.; Jia, Y.; Zhao, L.N.; Yu, S.X.; Liu, H.Y.; Liu, B.; Yan, Y.H.; et al. Threatened species list of China’s higher plants. Biodivers. Sci. 2017, 25, 696–744. [CrossRef] 8. Akasaka, M.; Ueda, Y.; Koba, T. Karyotype analyses of five wild rose species belonging to septet A by fluorescence in situ hybridization. Chromosome Sci. 2002, 6, 17–26. 9. Yomogida, K. Scent of modern roses. Kouryo 1992, 175, 65–89. 10. Wu, S.; Watanabe, N.; Mita, S.; Ueda, Y.; Shibuya, M.; Ebizuka, Y. Two O-methytransferases isolated from flower petals of Rosa chinensis var. spontanea involved in scent biosynthesis. J. Biosci. Bioeng. 2003, 96, 119–128. [CrossRef] 11. Wicke, S.; Schneeweiss, G.M.; de Pamphilis, C.W.; Muller, K.F.; Quandt, D. The evolution of the plastid chromosome in land plants: Gene content, gene order, gene function. Plant Mol. Biol. 2011, 76, 273–297. [CrossRef][PubMed] 12. Daniell, H.; Lin, C.S.; Yu, M.; Chang, W.J. Chloroplast genomes: Diversity, evolution, and applications in genetic engineering. Genome Biol. 2016, 17, 134. [CrossRef][PubMed] 13. Yang, J.B.; Li, D.Z.; Li, H.T. Highly effective sequencing whole chloroplast genomes of angiosperms by nine novel universal primer pairs. Mol. Ecol. Resour. 2014, 14, 1024–1031. [CrossRef][PubMed] 14. Jian, H.Y.; Zhang, S.D.; Zhang, T.; Qiu, X.Q.; Yan, H.J.; Wang, Q.G.; Tang, K.X. Characterization of the complete chloroplast genome of a critically endangered decaploid rose species, Rosa praelucens (Rosaceae). Conserv. Genet. Resour. 2017.[CrossRef] 15. Clegg, M.T.; Gaut, B.S.; Learn, G.H.; Morton, B.R. Rates and patterns of chloroplast DNA evolution. Proc. Natl. Acad. Sci. USA 1994, 91, 6795–6801. [CrossRef][PubMed] 16. Shen, X.F.; Wu, M.L.; Liao, B.S.; Liu, Z.X.; Bai, R.; Xiao, S.M.; Li, X.W.; Zhang, B.L.; Xu, J.; Chen, S.L. Complete chloroplast genome sequence and phylogenetic analysis of the medicinal plant Artemisia annua. Molecules 2017, 22, 1330. [CrossRef][PubMed] 17. Xiang, B.; Li, X.; Qian, J.; Wang, L.; Ma, L.; Tian, X.; Wang, Y. The complete chloroplast genome sequence of the medicinal plant Swertia mussotii. Using the PacBio RS II platform. Molecules 2016, 21, 1029. [CrossRef] [PubMed] 18. He, L.; Qian, J.; Sun, Z.Y.; Xu, X.L.; Chen, S.L. Complete chloroplast genome of medicinal plant Lonicera japonica: Genome rearrangement, intron gain and loss, and implications for phylogenetic studies. Molecules 2017, 22, 249. [CrossRef][PubMed] 19. Daniell, H.; Wurdack, K.J.; Kanagaraj, A.; Lee, S.B.; Saski, C.; Jansen, R.K. The complete nucleotide sequence of the cassava (Manihot esculenta) chloroplast genome and the evolution of atpF in Malpighiales: RNA editing and multiple losses of a group II Intron. Theor. Appl. Genet. 2008, 116, 723–737. [CrossRef][PubMed] Molecules 2018, 23, 389 12 of 13

20. Liu, C.H.; Zhu, H.T.; Xing, Y.; Tan, J.J.; Chen, X.H.; Zhang, J.J.; Peng, H.F.; Xie, Q.J.; Zhang, Z.M. Albino leaf 2 is involved in the splicing of chloroplast group I and II Introns in rice. J. Exp. Bot. 2016.[CrossRef][PubMed] 21. Vogel, J.; Hübschmann, T.; Börner, T.; Hess, W.R. Splicing and intron-internal RNA editing of trnK-matK transcript in barley plastids: Support for MatK as an essential splice factor 1. J. Mol. Biol. 1997, 270, 179–187. [CrossRef][PubMed] 22. Provan, J. Novel chloroplast microsatellites reveal cytoplasmic variation in Arabidopsis thaliana. Mol. Ecol. 2000, 9, 2183–2185. [CrossRef][PubMed] 23. Flannery, M.L.; Mitchell, F.J.; Coyne, S.; Kavanagh, T.A.; Burke, J.I.; Salamin, N.; Dowding, P.; Hodkinson, T.R. Plastid genome characterisation in Brassica and Brassicaceae using a new set of nine SSRs. Theor. Appl. Genet. 2006, 113, 1221–1231. [CrossRef][PubMed] 24. Powell, W.; Morgante, M.; McDevitt, R.; Vendramin, G.G.; Rafalski, J.A. Polymorphic simple sequence repeat regions in chloroplast genomes: Applications to the population genetics of pines. Proc. Natl. Acad. Sci. USA 1995, 92, 7759–7763. [CrossRef][PubMed] 25. Xue, J.; Wang, S.; Zhou, S.L. Polymorphic chloroplast microsatellite loci in Nelumbo (Nelumbonaceae). Am. J. Bot. 2012, 99, 240–244. [CrossRef][PubMed] 26. Shen, L.; Guan, Q.; Amin, A.; Wei, Z.; Li, M.; Li, X.; Lin, Z.; Tian, J. Complete plastid genome of Eriobotrya japonica (Thunb.) Lindl and comparative analysis in Rosaceae. SpringerPlus 2016, 5, 2036. [CrossRef] [PubMed] 27. Cheng, H.; Li, J.F.; Zhang, H.; Cai, B.H.; Gao, Z.H.; Qiao, Y.S.; Mi, L. The complete chloroplast genome sequence of strawberry (Fragaria × ananassa Duch.) and comparison with related species of Rosaceae. PeerJ 2017, 5, e3919. [CrossRef][PubMed] 28. Raubeson, L.A.; Peery, R.; Chumley, T.W.; Dziubek, C.; Fourcade, H.M.; Boorem, J.L.; Jansen, R.K. Comparative chloroplast genomics: Analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus. BMC Genom. 2007, 8, 174–201. [CrossRef][PubMed] 29. Bruneau, A.; Starr, J.R.; Joly, S. Phylogenetic relationships in the genus Rosa: New evidence from chloroplast DNA sequences and an appraisal of current knowledge. Syst. Bot. 2007, 32, 366–378. [CrossRef] 30. Fougère-Danezan, M.; Joly, S.; Bruneau, A.; Gao, X.F.; Zhang, L.B. Phylogeny and biogeography of wild roses with specific attention to polyploids. Ann. Bot. 2015, 115, 275–291. [CrossRef][PubMed] 31. Zhu, Z.M.; Gao, X.F.; Fougère-Danezan, M. Phylogeny of Rosa sections Chinenses and Synstylae (Rosaceae) based on chloroplast and nuclear markers. Mol. Phylogenet. Evol. 2016, 87, 50–64. [CrossRef][PubMed] 32. Zhang, S.D.; Jin, J.J.; Chen, S.Y.; Chase, M.W.; Sotis, D.E.; Li, H.T.; Yang, J.B.; Li, D.Z.; Yi, T.S. Diversification of Rosaceae since the Late Cretaceous based on plastid phylogenomics. New Phytol. 2017, 214, 1355–1367. [CrossRef][PubMed] 33. Rehder, A. Manual of Cultivated Trees and Hardy in North America Exclusive of the Subtropical and Warmed Temperate Regions; Macmillan: New York, NY, USA, 1940. 34. Wissemann, V. Conventional (wild roses). In Encyclopedia of Rose Science; Roberts, A.V., Debener, T., Gudin, S., Eds.; Elsevier: Amsterdam, The Netherlands, 2003; Volume 1, pp. 111–117. ISBN 0-12-227620-5. 35. Patel, R.K.; Jain, M. NGS QC toolkit: A toolkit for quality control of next generation sequencing data. PLoS ONE 2017, 7, e30619. [CrossRef][PubMed] 36. Liu, C.; Shi, L.; Zhu, Y.; Chen, H.; Zhang, J.; Lin, X.; Guan, X. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences. BMC Genom. 2012, 13, 715. [CrossRef][PubMed] 37. Kearse, M.; Moir, R.; Wilson, A.; Stones-Havas, S.; Cheung, M.; Sturrock, S. Geneious basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 2012, 28, 1647–1649. [CrossRef][PubMed] 38. Lohse, M.; Drechsel, O.; Kahlau, S.; Bock, R. Organellar Genome-DRAW—A suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucleic Acids Res. 2013, 41, 575–581. [CrossRef][PubMed] 39. Tamura, K.; Stecher, G.; Peterson, D.; Filipski, A.; Kumar, S. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol. Biol. Evol. 2013, 30, 2725–2733. [CrossRef][PubMed] 40. Kurtz, S.; Phillippy, A.; Delcher, A.L.; Smoot, M.; Shumway, M.; Antonescu, C.; Salzberg, S.L. Versatile and open software for comparing large genomes. Genome Biol. 2004, 5, R12. [CrossRef][PubMed] Molecules 2018, 23, 389 13 of 13

41. Mayor, C.; Brudno, M.; Schwartz, J.R.; Poliakov, A.; Rubin, E.M.; Frazer, K.A.; Pachter, L.S.; Dubchak, I. VISTA: Visualizing global DNA sequence alignments of arbitrary length. Bioinformatics 2000, 16, 1046–1047. [CrossRef][PubMed] 42. Frazer, K.A.; Pachter, L.; Poliakov, A.; Rubin, E.M.; Dubchak, I. VISTA: Computational tools for comparative genomics. Nucleic Acids Res. 2004, 32, 273–279. [CrossRef][PubMed] 43. Mudunuri, S.B.; Nagarajaram, H.A. IMEx: Imperfect Microsatellite Extractor. Bioinformatics 2007, 23, 1181–1187. [CrossRef][PubMed] 44. Kurtz, S.; Choudhuri, J.V.; Ohlebusch, E.; Schleiermacher, C.; Stoye, J.; Giegerich, R. REPuter: The manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001, 29, 4633–4642. [CrossRef] [PubMed] 45. Katoh, K.; Standley, D.M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 2013, 30, 772–780. [CrossRef][PubMed] 46. Stamatakis, A.; Hoover, P.; Rougemont, J. A rapid bootstrap algorithm for the RAxML web servers. Syst. Biol. 2008, 57, 758–771. [CrossRef][PubMed]

Sample Availability: Sequence data of Rosa chinensis var. spontanea has been deposited into GenBank and are available from the authors.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).