Genome
Complete Plastome Sequences of Picea asperata Mast., P. crassifolia Kom. and Comparative Analyses with P. abies (L.) Karst. and P. morrisonicola Hayata
Journal: Genome
Manuscript ID gen-2018-0195.R1
Manuscript Type: Article
Date Submitted by the Author: 19-Mar-2019
Complete List of Authors:
Ouyang, Fangqun; State Key Laboratory of Forest Genetics and Tree Breeding
Hu, Jiwen; Research Institute of Forestry, Chinese Academy of Forestry
Wang, Junchen; Northwest Agriculture & Forestry University
Ling, Juanjuan; Research Institute of Forestry, Chinese Academy of Forestry
Wang, Zhi; Research Institute of Forestry, Chinese Academy of Forestry
Wang, Nan; Research Institute of Forestry, Chinese Academy of Forestry
Ma, Jianwei; Research Institute of Forestry of Xiaolong Mountain
Zhang, Hanguo; State Key Laboratory of Tree Genetics and Breeding
Mao, Jianfeng; Beijing Forestry University
Wang, Junhui; Chinese Academy of Forestry
Keyword: Picea crassifolia, Picea asperata, plastome, ycf1, highly variable regions
Is the invited manuscript for consideration in a Special Issue?: Not applicable (regular submission)
https://mc06.manuscriptcentral.com/genome-pubs Page 1 of 41 Genome
1 Complete Plastome Sequences of Picea asperata Mast., P. crassifolia Kom. and
2 Comparative Analyses with P. abies (L.) Karst. and P. morrisonicola Hayata
3 Fangqun OuYang1, Jiwen Hu1, Junchen Wang1,2, Juanjuan Ling1, Zhi Wang1, Nan
4 Wang1, Jianwei Ma3, Hanguo Zhang4, Jian-Feng Mao5, Junhui Wang1*
5 1 State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree
6 Breeding and Cultivation of State Forestry Administration, Research Institute of
7 Forestry, Chinese Academy of Forestry, Beijing, PR China.
8 2 Northwest Agriculture & Forestry University, Xi’an, P. R. China.
9 3 Research Institute of Forestry of Xiaolong Mountain, Gansu Provincial Key
10 Laboratory of Secondary Forest Cultivation, Gansu, P. R. China.
11 4 State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University,
12 Harbin, People’s Republic of China
13 5 National Engineering Laboratory for Forest Tree Breeding, Key Laboratory for
14 Genetics and Breeding of Forest Trees and Ornamental Plant of Ministry of
15 Education, College of Biological Science and Technology, Beijing Forestry
16 University, Beijing, 100083, PR China.
17 *Corresponding author: Junhui Wang, Dongxiaofu 1#, Xiangshan East Road, Haidian
18 District, Beijing, PR China, [email protected]
1 Abstract
2 Picea asperata and P. crassifolia have sympatric ranges and are closely related, but
3 the differences between these species at the plastome level are unknown. To better
4 understand the patterns of variation among Picea plastomes, the complete plastomes
5 of P. asperata and P. crassifolia were sequenced. Then, the plastomes were compared
6 with the complete plastomes of P. abies and P. morrisonicola, which are closely and
7 distantly related to the focal species, respectively. We also used these sequences to
8 construct phylogenetic trees to determine the relationships among and between the
9 four species as well as additional taxa from Pinaceae and other gymnosperms.
10 Analysis of our sequencing data allowed us to identify 438 single nucleotide
11 polymorphisms (SNPs), 95 indel events, four inversion events,
12 and seven highly variable regions, including six gene spacer regions (psbJ-petA,
13 trnT-psaM, trnS-trnD, trnL-rps4, psaC-ccsA, and rps7-trnL) and one gene (ycf1). The
14 highly variable regions are appropriate targets for future use in the phylogenetic
15 reconstructions of closely related, sympatric Picea species as well as Pinaceae in
16 general.
17
18 Keywords Picea crassifolia, Picea asperata, plastome, ycf1, highly variable regions
19 Introduction
20 Spruces (Picea, Pinaceae) are important constituents of forests throughout the
21 Northern Hemisphere. The genus Picea comprises 34 species worldwide, and its
22 center of distribution and differentiation is in Asia (Farjon 2001). Most species occur
23 in China, including seven species endemic to the country (Farjon 2001; Fu et al.
24 1999). Picea species are generally morphologically similar and show incomplete
25 lineage sorting and interspecific introgression, which have resulted in a complicated
26 phylogeny and have caused researchers to experience problems with species
27 identification (Bouillé et al. 2011; Lockwood et al. 2013; Ran et al. 2006; Sullivan et
28 al. 2017). The significant topological incongruence among chloroplast DNA loci is
29 suggestive of recombination, but incongruence between Picea mitochondrial DNA
30 and chloroplast DNA is suggestive of introgression (Bouillé et al. 2011; Sullivan et
31 al. 2017). Growing evidence suggests that plastomes can be widely applied for
32 species taxonomy and identification at both intra- and interspecific levels (Barrett et
33 al. 2016; Huang et al. 2014; Wu et al. 2013). Unlike in most angiosperms, plastid
34 inheritance in conifers is predominantly paternal (Neale & Sederoff 1989). Unbroken
35 uniparental inheritance is one of the key assumptions for plant evolution studies (Wolfe &
36 Randle 2004). Compared to nuclear genomes, plastomes have lower mutation rates,
37 smaller genome sizes, and smaller effective population sizes, which could provide a
38 larger basis for comparative studies in Picea species (Birol et al. 2013; Nystedt et al.
39 2013; Sullivan et al. 2017; Wolfe et al. 1987).
40 Picea asperata Masters and P. crassifolia Komarov are endemic to China, with a
41 natural range in the eastern Qinghai-Tibetan Plateau (QTP). The two species are not
42 only important afforestation trees in northwestern China but also have great economic
43 and environmental significance. These species have parapatric ranges (Whittle and
44 Johnston 2002), but P. crassifolia extends to a higher northern latitude (Bi et al.
45 2016). Eckenwalder (2009) recognizes P. crassifolia as a variety of P. schrenkiana
46 Fisch. et Mey., which is not supported by molecular evidence placing P. crassifolia
47 in the P. asperata complex (Lockwood et al. 2013). Picea asperata
48 and P. crassifolia are closely related and no derived chloroplast or mitochondrial
49 mutations are species-specific (Du et al. 2009). Nonetheless, the two species have
50 obvious phenotypic differences. Compared to P. crassifolia, the leaf apex of P.
51 asperata is acute and the leaf length/width ratio is greater (Bi et al. 2016). In addition,
52 unlike P. crassifolia, the winter buds of P. asperata are resinous (Fu et al. 1999).
53 Population bottlenecks during the late Pleistocene period may have reduced
54 population size, restricted interspecific gene flow, and further promoted the
55 divergence between P. asperata and P. crassifolia by accelerating the fixation of
56 different adaptive alleles in the diverging lineages (Bi et al. 2016; Räsänen and
57 Hendry 2010). A number of spruce species have the ability to produce viable artificial
58 hybrids, including with parapatric and allopatric taxa (OECD 1999). Picea abies (L.)
59 Karst, native to northern and central Europe, can be easily hybridized with P.
60 asperata (Zhao et al. 2015) and P. crassifolia, as we previously verified in field trials
61 (Table S1). Crossability is generally high between species with close genetic
62 relationships (Eckenwalder and Press 2009); P. crassifolia, P. asperata, and P. abies
63 are genetically closely related, belonging to the same clade in the phylogenetic trees
64 estimated by chloroplast data (Lockwood et al. 2013; Ran et al. 2006; Sullivan et al.
65 2017). Phylogenetic comparisons of Picea species showed that the chloroplast DNA
66 and nuclear DNA trees were similar, suggesting that P. asperata and P. crassifolia are
67 closely related species. In contrast, significant topological differences were found in
68 the mitochondrial DNA tree, where P. abies was found to be closely related to P.
69 asperata and P. crassifolia (Ran et al. 2015). Previous work based on mitochondrial
70 DNA variation showed that most of the variation between P. asperata and P.
71 crassifolia is caused by variation within species (80%), with negligible variation
72 between species (Du et al. 2009). However, some differentiation of chloroplast DNA
73 sequences has been found between the two species. For example, one of nine
74 chlorotypes is restricted to P. asperata (Du et al. 2009). Sullivan et al. (2017)
75 analyzed the entire plastome of 65 accessions of Picea to test for deviations from
76 canonical plastome evolution, but no study has compared the plastome sequences
77 of the closely related P. asperata and P. crassifolia.
78 In this study, the complete plastomes of P. asperata and P. crassifolia were
79 sequenced and compared to the plastome sequences of P. abies (Birol et al. 2013) and
80 P. morrisonicola Hayata (Zou et al. 2013). Furthermore, using these plastome
81 sequences, we investigated the broader-scale phylogenetic relationships of Picea and
82 taxa from Pinaceae and other gymnosperms to verify the usefulness of plastome
83 sequences for studying phylogenetic relationships. This study aimed to (1) evaluate
84 the characteristics of the P. asperata and P. crassifolia plastomes, (2) identify the
85 variation in the plastomes of P. asperata, P. crassifolia, P. abies, and P.
86 morrisonicola, (3) better understand the patterns of variation of Picea plastomes, and (4)
87 determine the relationships among the four spruces as well as among these four
88 species and other Pinaceae and gymnosperm taxa.
89
90 Materials and Methods
91 Plant materials, DNA extraction, and DNA sequencing
92 On October 10, 2015, P. asperata and P. crassifolia needles were collected from
93 30-year-old plants growing at the improved tree seedling base at the Research
94 Institute of Forestry of Xiaolong Mountain, Tianshui, Gansu Province, China (34°07′
95 N, 105°24′ E). Total DNA was extracted from 1 g of fresh needles following the
96 cetyltrimethylammonium bromide (CTAB) method (Li et al. 2013). DNA fragments
97 were purified through agarose gel electrophoresis, and a 500-base pair (bp)-long
98 library was produced using NEBNext (New England Biolabs, Ipswich, Massachusetts,
99 USA) for sequencing analysis using a HiSeq 4000 platform (PE150).
100 Plastome assembly and annotation
101 Clean reads were generated by removing adapter sequences as well as reads with too
102 many (>10%) unknown base calls (N), low complexity, and low-quality bases (>50%
103 of the bases with a quality score <5). High quality paired-end reads were assembled
104 using SPAdes 3.6.1 (Bankevich et al. 2012) with the parameter kmer = 95. Contigs
105 from the plastome were then filtered using BLASTN (Altschul et al. 1997) and
106 aligned to the P. abies reference plastome (http://congenie.org/; Birol et al. 2013)
107 with Sequencher 4.10 (Gene Codes Corporation, Ann Arbor, Michigan, USA). To
108 further verify the contigs, Geneious 8.1 (Kearse et al. 2012) was used to map all reads
109 to the assembled plastome sequence. The consensus sequences were produced using
110 mapped reads in Geneious.
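The read-cleaning thresholds stated above (more than 10% unknown bases, or more than 50% of bases with a quality score below 5) can be illustrated with a minimal Python sketch. The function name `passes_filter`, the Phred+33 quality encoding, and the example reads are assumptions made for illustration only; adapter trimming and the low-complexity filter are not covered, and this is not the pipeline used in the study.

```python
def passes_filter(seq, qual, max_n_frac=0.10, min_q=5,
                  max_lowq_frac=0.50, phred_offset=33):
    """Return True if a read passes the N-content and low-quality thresholds.

    seq  -- read sequence as a string
    qual -- FASTQ quality string of the same length (Phred+33 assumed)
    """
    if not seq or len(seq) != len(qual):
        return False
    # Reject reads in which more than 10% of the base calls are unknown (N).
    n_frac = seq.upper().count("N") / len(seq)
    if n_frac > max_n_frac:
        return False
    # Reject reads in which more than 50% of the bases have quality < 5.
    low_q = sum(1 for ch in qual if ord(ch) - phred_offset < min_q)
    if low_q / len(qual) > max_lowq_frac:
        return False
    return True


# Hypothetical reads: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
reads = [
    ("ACGTACGTAC", "IIIIIIIIII"),  # clean read
    ("NNACGTACGT", "IIIIIIIIII"),  # 20% N
    ("ACGTACGTAC", "######IIII"),  # 60% low-quality bases
]
kept = [r for r in reads if passes_filter(*r)]
```

In this toy input only the first read is retained; the thresholds are exposed as keyword arguments so that other cut-offs can be tried.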
111 The plastomes were annotated using the Dual Organellar GenoMe Annotator
112 (DOGMA) (Wyman et al. 2004). All annotations were manually checked to ensure
113 accurate identification of genes, especially for the genes unannotated by DOGMA,
114 including rps16, petB, and petD. All coding genes and the locations of RNA genes
115 were identified with BLASTX and BLASTN, respectively. The plastome maps of P.
116 asperata and P. crassifolia were generated using Organellar Genome DRAW
117 (http://ogdraw.mpimp-golm.mpg.de/index.shtml) (Lohse et al. 2013) and edited using
118 Adobe Illustrator CS5 (Adobe, San Jose, USA).
119 Comparative analysis of the complete plastomes of the four spruce species
120 The plastome sequences of P. abies (GenBank accession HF937082) and P.
121 morrisonicola (GenBank accession AB480556) were extracted from the National
122 Center for Biotechnology Information (NCBI) database. The plastome sequences of P.
123 asperata, P. crassifolia, P. abies, and P. morrisonicola were aligned using MAFFT
124 (Katoh and Standley 2013) and adjusted using Se-al software (Rambaut 2002).
125 Microstructural mutations and sequence polymorphisms among the four spruces were
126 analyzed using DnaSP 5.0 (Librado and Rozas 2009) employing the sliding window
127 method to screen highly variable regions. Plastomes were then analyzed using a 600
128 bp window length and a 25 bp step size. Microstructural mutations were analyzed and
129 classified into simple sequence repeat (SSR) and non-SSR indels and inversions. We
130 tested for structural rearrangements in de novo scaffolds of the complete plastomes of
131 four spruce species using the software package MUMmer v. 3.0 (Kurtz et al. 2004).
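The sliding-window screen described above (600 bp windows, 25 bp steps) can be sketched as follows. This is a simplified illustration and not a reimplementation of DnaSP: it assumes the sequences are already aligned, upper-case, and of equal length, it skips any alignment column containing a gap or ambiguous base, and the function names `window_pi` and `sliding_pi` are hypothetical.

```python
from itertools import combinations


def window_pi(seqs, start, width):
    """Nucleotide diversity (mean pairwise differences per site) in one window."""
    # Keep only alignment columns in which every sequence has A, C, G or T.
    cols = [col for col in zip(*(s[start:start + width] for s in seqs))
            if all(base in "ACGT" for base in col)]
    if not cols:
        return 0.0
    pairs = list(combinations(range(len(seqs)), 2))
    diffs = sum(col[i] != col[j] for col in cols for i, j in pairs)
    return diffs / (len(pairs) * len(cols))


def sliding_pi(seqs, width=600, step=25):
    """Return (window start, pi) tuples along the alignment."""
    length = len(seqs[0])
    last_start = max(length - width, 0)
    return [(start, window_pi(seqs, start, width))
            for start in range(0, last_start + 1, step)]


# Toy alignment of four sequences; real input would be the MAFFT alignment.
toy = ["ACGT", "ACGA", "ACGT", "ACGT"]
profile = sliding_pi(toy, width=2, step=1)
```

Windows whose value exceeds a chosen cut-off can then be flagged as candidate highly variable regions.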
132 In order to better understand the patterns of variation of Picea plastomes, we also
133 compared sequence identity of the two new accessions (Picea asperata and P.
134 crassifolia) and the other three Picea plastomes extracted from the NCBI database, P.
135 sitchensis (GenBank accession EU998739), P. glauca (GenBank accession
136 KT634228), and P. jezoensis (GenBank accession KT337318) with P. abies as a
137 reference using mVISTA (Dubchak 2007).
138 Phylogenetic analysis
139 The 68 common conserved genes from the plastid genomes of 20 species from the
140 Pinaceae and 12 outgroup species (Table S3), from the Araucariaceae, Podocarpaceae,
141 Taxaceae, and Cupressaceae, were aligned using MAFFT and adjusted using Se-al
142 software (Rambaut 2002). We used these 32 species because at the time when we
143 sequenced Picea asperata and P. crassifolia the chloroplast sequences of these 32
144 species were already available. Two phylogenetic-inference methods, allowing for
145 different mutation rates for different genes and codon positions, were employed to
146 infer trees from these 68 concatenated regions. We fit general time reversible
147 (GTR+G) models to different genes and codon positions using the BIC criterion
148 implemented in PartitionFinder (Lanfear et al. 2012). Clade support (as a percentage)
149 was evaluated with 1,000 bootstrap replicates for both the maximum parsimony (MP)
150 and maximum likelihood (ML) methods. MP analysis was implemented in PAUP1.0
151 b10 (Simmons 2004) and ML analysis in RAxML 7.04 (Stamatakis 2006).
152
153 Results
154 High-throughput sequencing analysis, general features of the P. asperata and P.
155 crassifolia plastomes, and comparisons with P. abies and P. morrisonicola
156 Using an Illumina HiSeq 4000 (PE150) system, the plastomes of Picea asperata
157 and P. crassifolia were sequenced, generating 14,912,912 and
158 8,506,474 total paired-end raw reads, respectively. Among these reads, 115,828 and
159 55,312 plastome reads, corresponding to 140× and 67× coverage, were identified by
160 BLASTN against the plastome sequence of P. abies for the P. asperata (deposited in
161 GenBank: KY204451) and P. crassifolia plastomes (deposited in GenBank:
162 KY204450), respectively. The percentage of plastome reads of P. asperata was
163 0.78%, and the percentage of plastome reads of P. crassifolia was 0.65% of total raw
164 reads.
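The coverage and read-percentage figures in this paragraph can be checked with a short calculation. The sketch below assumes that coverage was estimated as (number of plastome reads × read length) / assembled plastome length, with 150 bp reads and the assembly sizes given in the Results; the manuscript does not state the formula, so this is an assumption, and the function names are hypothetical.

```python
def mean_coverage(n_reads, read_len, genome_len):
    """Average sequencing depth: total sequenced bases divided by genome length."""
    return n_reads * read_len / genome_len


def percent_of_total(n_plastome_reads, n_total_reads):
    """Share of raw reads assigned to the plastome, in percent."""
    return 100 * n_plastome_reads / n_total_reads


# Read counts and assembly sizes as reported in the text (150 bp reads assumed).
cov_asperata = mean_coverage(115_828, 150, 124_145)
cov_crassifolia = mean_coverage(55_312, 150, 124_126)
pct_asperata = percent_of_total(115_828, 14_912_912)
pct_crassifolia = percent_of_total(55_312, 8_506_474)
```

Under these assumptions the values round to 140× and 67× coverage and to 0.78% and 0.65% of raw reads, matching the figures reported above.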
165 The details of the P. asperata plastome assembly were as follows: the total
166 assembled genome size was 124,145 bp, which was divided over eight contigs with an
167 N50 of 20,773 bp and a maximum contig size of 39,232 bp. The details of the P.
168 crassifolia plastome assembly were as follows: the total assembled genome size was
169 124,126 bp, which was divided over 10 contigs with an N50 of 35,630 bp and a
170 maximum contig size of 39,556 bp.
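For readers unfamiliar with the N50 statistic quoted above, a minimal sketch of its usual definition follows: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The contig lengths in the example are hypothetical, since the individual contig sizes are not reported here.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths."""
    total = sum(lengths)
    running = 0
    # Walk through contigs from longest to shortest until half the
    # total assembly length has been accumulated.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs supplied")


# Hypothetical contig lengths (bp), for illustration only.
example_n50 = n50([10, 20, 30, 40])
```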
171 The complete plastomes of P. asperata and P. crassifolia (Fig. 1) were similar to
172 those of P. abies (124,084 bp) and P. morrisonicola (124,168 bp) (Table 1). The GC
173 content of the protein-coding regions of the P. asperata and P. crassifolia plastomes
174 was the same (38.71%), and this content was similar to those of P. abies (38.72%)
175 and P. morrisonicola (38.79%) (Table 1). The IR regions of the P. asperata and P.
176 crassifolia plastomes were highly reduced and exclusively contained repeated short
177 gene sequences of trnH-GUG (74 bp), trnI-CAU (73 bp), and ycf12 (101 bp) (Fig 1).
178 The plastomes of the four spruces displayed the same number of encoded
179 genes, comprising 108 different functional genes (Fig. 1, Table S2) including 72
180 protein-coding genes, 32 tRNA genes, and four rRNA genes (16S, 23S, 5S, and 4.5S).
181 Among the 108 genes, trnH-GUG, trnI-CAU, trnT-GGU, rps12, psbI and ycf12
182 were duplicated, existing as inverted repeat sequences. Eight protein-coding genes
183 and seven tRNA genes each presented one intron, whereas ycf3 contained two. In
184 addition, rps12 was identified as a trans-splicing gene. The 108 genes were classified
185 into three classes. The first class contained genes related to transcription and
186 translation, mainly encoding RNA polymerase subunits, rRNAs, ribosomal protein
187 products, and tRNAs. The second class contained genes related to photosynthesis,
188 particularly the Rubisco large subunit gene and components of the photosynthetic
189 electron transport chain. The third class contained genes related to the biosynthesis of
190 amino acids and fatty acids, protein translocation as well as some genes of unknown
191 function, including ycf2 and ycf12.
192 The aligned length of the four spruce plastomes was 124,517 bp and included 438 single
193 nucleotide polymorphisms (SNPs). The mean nucleotide diversity (π) was 0.0182.
194 Pairwise comparison of SNPs and π among the spruce plastomes revealed minimal
195 differences between P. asperata and P. crassifolia. Only seven SNPs were observed
196 and were located in psbI-trnE-UUC (3 SNPs), rpoC2 (1 SNP in coding sequence),
197 trnP-GGG-rpl32 (1 SNP), trnL-CAA-ycf2 (1 SNP), and trnV-UAC-trnH-GUG (1
198 SNP). Picea abies presented moderate differences from P. asperata and P. crassifolia
199 (69 and 71 SNPs, respectively). However, larger differences existed between P.
200 morrisonicola and the other three spruces, in which 396–408 SNPs were observed. It
201 is worth noting that we did not find structural rearrangements of P. asperata and P.
202 crassifolia compared with P. abies and P. morrisonicola after sequence alignment.
203 Sullivan et al. (2017) also found no evidence of rearrangements within scaffolds.
204 Divergence hotspots for the plastomes of P. asperata, P. crassifolia, P. abies, and
205 P. morrisonicola
206 The number of SNPs ranged from 0 to 16, and π ranged from 0 to 0.01361 for the
207 plastome sequences. Seven highly variable regions were identified employing π =
208 0.006 as the dividing value; these included six gene spacer regions (psbJ-petA,
209 trnT-psaM, trnS-trnD, trnL-rps4, psaC-ccsA, and rps7-trnL) and one gene (ycf1). In
210 addition, rps7-trnL and ycf1 presented the greatest variation (Fig. 2).
211 Numbers and patterns of SNP mutations
212 The patterns of the 438 SNP mutations are presented in Fig. 3. The frequency of each
213 substitution type differed. The numbers of C to T and G to A mutations were high
214 (145), whereas the numbers of C to G and G to C mutations were low (21). In the
215 plastomes of the four spruce species, 236 transitions and 202 transversions were
216 detected, giving a transition-to-transversion ratio of 1.17:1 (236/202).
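The classification of substitutions into transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine or the reverse) can be sketched as below. The function names and the short list of example SNPs are hypothetical; only the totals of 236 transitions and 202 transversions come from the text.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for A<->G or C<->T changes; other substitutions are transversions."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_counts(snps):
    """Count transitions and transversions in a list of (ref, alt) base pairs."""
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts
    return ts, tv


# Hypothetical SNPs for illustration: two transitions and one transversion.
ts, tv = ts_tv_counts([("C", "T"), ("G", "A"), ("C", "G")])

# Ratio implied by the totals reported in the text.
reported_ratio = 236 / 202
```

With the reported totals, transitions divided by transversions is 236/202, or about 1.17.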
217 Among all of the identified base mutation events, 144, 35, and 258 occurred in
218 coding regions, introns, and intergenic regions, respectively. The 144 mutations
219 occurring in coding regions were located in 43 genes (Table S4), including 20
220 photosynthetic apparatus genes, seven ribosomal protein genes, four
221 transcription-related genes, three chlorophyll biosynthesis genes, and nine other
222 genes. The mutations were 30-fold more likely to occur in the ycf1 and ycf2 genes.
223 Notably, ycf1 and ycf2 together contain 48% of all coding sequence polymorphisms.
224 Only one base pair in rpoC2 showed a mutation between P. asperata and P.
225 crassifolia.
226 Numbers and forms of microstructural mutations
227 Various types of SSR loci are present in the genome, and different SSR loci display
228 different numbers of repetitions. Therefore, many indels may exist in the genome. In
229 this study, these indels are referred to as SSR indels, whereas other indels are referred
230 to as non-SSR indels.
231 Forty-eight SSR indels were detected among the four spruce plastomes, and
232 seven occurred in intronic regions. The largest SSR indel was 9 bp long. More
233 than half of the SSR indels (26, 54.17%) were single-base indels. The SSRs causing
234 indels were mainly single-base repeats (mainly A and T), whereas two-base repeat
235 indels only occurred three times. The indel located at the 3'-end of rps12 was adjacent
236 to two SSR repeats (repeating bases A and G) and was regarded as a single indel
237 event according to the principles of sequence alignment (Table S5).
238 Forty-seven non-SSR indels were detected in the four spruce plastomes. Six were
239 located in gene-coding regions, nine in intronic regions, and 32 in intergenic spacer
240 regions. The non-SSR indels occurred eight times in the gene encoding ycf1. The
241 sizes of the non-SSR indels ranged from 1 to 60 bp, and they were larger than the SSR
242 indels. The largest indel (60 bp) was located in ycf1 (Table S6).
243 Four inversion events were detected in the four spruce plastomes; these were
244 located in the psbA-trnK, trnH-trnT, trnI-trnF, and ycf3 introns. The sizes of the four
245 inversion fragments were 4, 3, 4, and 2 bp, respectively, for psbA-trnK, trnH-trnT,
246 trnI-trnF, and ycf3 introns, and the lengths of the repetitive sequences at both ends
247 were 13, 9, 19, and 2 bp, respectively (Table 2).
248 The direction of the four inversion events was analyzed further. Using P.
249 asperata as a reference (G), the inversions located in psbA-trnK and ycf3 occurred
250 only in P. morrisonicola; however, the inversions located in trnH-trnT and trnI-trnF
251 occurred in both P. abies and P. morrisonicola (Table 2).
252 Comparison of plastome sequences with three published Picea plastomes
253 We also compared the two new accessions of Picea asperata and P. crassifolia with
254 the plastomes of three other Picea species (P. sitchensis, P. glauca, and P. jezoensis)
255 retrieved from NCBI. We found that spruce plastomes were highly
256 similar in pair-wise comparisons (Fig. S1). The structures of these plastomes were
257 generally conserved, and neither translocations nor inversions were detected among
258 the sequences. As expected, coding regions were more highly conserved than
259 noncoding regions. More concretely, most highly polymorphic regions were located
260 in intergenic regions. In addition to the six intergenic regions mentioned above, we
261 also found others (such as accD-psaL, ycf12-psbB, ycf2-trnL, and psbE-petL). These
262 regions may be undergoing more rapid nucleotide substitution at the species level,
263 which suggests that they could serve as molecular markers for phylogenetic analyses
264 and plant identification in Picea.
265 Reconstruction of phylogenetic relationships based on plastome genes
266 We performed phylogenetic analysis on 32 species whose plastomes were publicly
267 available. The trees inferred using maximum likelihood (ML)
268 and maximum parsimony (MP) methods from 68 shared plastome genes, protein-coding genes, and
269 conserved genes were almost identical and recovered five major clades: i) Pinaceae (including
270 Pinoideae, Piceoideae, Laricoideae, and Abietoideae), ii) Araucariaceae, iii)
271 Podocarpaceae, iv) Taxaceae, and v) Cupressaceae (Fig. 4, Fig. S2 and S3).
272 Relationships within most clades were strongly supported (>90%) with the notable
273 exception of Pinoideae and Cathaya argyrophylla, in which the nodes had less than
274 60% support. Pinaceae was sister to the other conifers, among which Taxaceae was
275 sister to Cupressaceae, and Araucariaceae was sister to Podocarpaceae. Piceoideae
276 was sister to Pinoideae, and together these two were sister to Laricoideae. Larix decidua
277 was found to have a closer relationship with Pseudotsuga sinensis (Laricoideae) than
278 to the Pinoideae and Piceoideae. With one exception, Cathaya argyrophylla
279 (Laricoideae) was moderately supported as a sister group to Pinoideae. Abies koreana,
280 Keteleeria davidiana, and Cedrus deodara in Abietoideae were found to be more
281 closely related to each other than to other Pinaceae species.
282 In the Piceoideae, P. asperata and P. crassifolia were found to be more closely
283 related to each other than to the seven other species included in the subfamily. Picea
284 abies was sister to P. asperata and P. crassifolia, with 100% support. The genetic
285 relationships of P. glauca and P. jezoensis could be collapsed into a polytomy
286 because the support was less than 80%, whereas P. morrisonicola and P. sitchensis
287 had 100% support as a clade that was sister to the P. glauca–P. jezoensis polytomy
288 and the P. abies, P. asperata, and P. crassifolia clade.
289
290 Discussion
291 In this study, the complete plastomes of Picea asperata and P. crassifolia were
292 sequenced, and a comparative analysis with the plastomes of two other spruces (P.
293 abies and P. morrisonicola) was performed to assess genome-wide mutational
294 dynamics within the genus Picea. The sizes of the four spruce plastomes were similar,
295 in the range of 124,084 bp to 124,168 bp, which is similar to that of P. jezoensis
296 (124,146 bp) (Yang et al. 2015) and slightly longer than that of P. glauca (123,266
297 bp) (Jackman et al. 2016). The four spruces evaluated in the present study displayed
298 the same number of encoded genes (108 genes), and six genes (i.e. rps12, trnT-GGU,
299 trnH-GUG, trnI-CAU, psbI, and ycf12) were duplicated. The gene content of the
300 Picea plastome was conserved and comprised 74 protein-coding genes, 36 tRNA
301 genes, and four rRNAs (Jackman et al. 2016; Sullivan et al. 2017). The ndh genes (11
302 in total) were lost in all four spruces, and only non-functional plastid ndh gene
303 fragments were present. Several Pinaceae species were also found to have lost the ndh
304 genes in their plastid genome (Wakasugi et al. 1994). The fate of ndh genes in Picea
305 involved a complex and dynamic scenario that cannot be considered to be a single
306 evolutionary loss (Lin et al. 2010). Ranade et al. (2016) demonstrated that ndh genes
307 were transferred to the nucleus during chloroplast evolution. In addition to the loss of
308 ndh genes, Pinaceae shared other synapomorphic plastome features, such as the
309 common loss of rps16 genes and expansion of IRs to the 3’ region of the psbA gene.
310 The rps16 gene was also absent in the P. asperata and P. crassifolia plastomes (Fig 1),
311 which is similar to a previous finding that the rps16 gene was absent from the Pinus
312 thunbergii plastome (Tsudzuki et al. 1992).
313 Phylogenetic inference among closely related species within subsections presents
314 a challenge. A growing body of evidence points to the presence of highly variable
315 markers in the plastome (Dong et al. 2012). These markers have been widely used to
316 study plant phylogenetics on lower taxonomic levels. In the present study,
317 comparisons of the four spruce plastomes suggested that ycf1, which encodes an
318 essential component of the plastid protein import apparatus, has the most variable
319 regions (Kikuchi et al. 2013). In addition, the largest indel (60 bp) is located within
320 the ycf1 sequence. ycf1, with two noncontiguous sections (i.e. ycf1a and ycf1b), is the
321 most promising plastid DNA barcode in land plants (Dong et al. 2015). Using the
322 distance method, ycf1b exhibited the highest discrimination success compared with
323 markers such as matK, rbcLb, and trnH-psbA among Pinus, Calycanthaceae, Iris,
324 Armeniaca, Paeonia, and Quercus species (Dong et al. 2015). However, ycf1 has a
325 higher ratio of non-synonymous to synonymous substitutions, which strongly
326 supports the hypothesis that this gene has been positively selected in Pinus (Parks et
327 al. 2009) and Picea (Sullivan et al. 2017). After excluding repetitive sequences and
328 poorly aligned regions, we found that the ycf1 gene had 21 positively selected codons,
329 indicating that this gene may be under very high selection pressure in Picea (Sullivan
330 et al. 2017). We also found six gene spacer regions (i.e. psbJ-petA, trnT-psaM,
331 trnS-trnD, trnL-rps4, psaC-ccsA, and rps7-trnL) that showed high variability based on
332 comparisons of four spruce plastomes. Among these spacer regions, only psbJ-petA
333 has been used in phylogenetic reconstructions—i.e. low-resolution reconstructions of
334 Osmanthus Lour. (Oleaceae) (Guo et al. 2011) and other angiosperm species
335 (Jaramillo et al. 2008).
336 Nonetheless, there are other issues to consider when using plastomes for
337 phylogenetic and/or population genetic studies. For instance, it is important to note
338 that the plastome is effectively a single-gene estimate. Incomplete lineage sorting can
339 cause the plastome tree to differ markedly from the true species tree (Doyle 1992), and
340 introgression can cause further discordance between plastome and species history
341 (Maddison 1997). There is also growing awareness of the importance of considering
342 the causes of phylogenetic discord to account for disagreement between phylogenetic
343 analyses given that plastomes are capable of recombination (Sullivan et al. 2017; Zhu
344 2018). Thus, reconstructing evolutionary relationships among closely-related species
345 should involve the use of multiple individuals and high-resolution loci (and even
346 whole plastomes) (Knowles and Carstens 2007; Sullivan et al. 2017; Syring et al.
347 2007).
348 In the present study, the topologies of ML and MP phylogenetic trees inferred
349 from 68 shared plastome genes, protein-coding genes, and codon positions were
350 almost identical (Fig. 4, Fig. S2 and S3). Pinaceae was sister to the other conifers,
351 among which Taxaceae was sister to Cupressaceae, and Araucariaceae was sister to
352 Podocarpaceae, which is in accordance with a previous phylogenetic analysis of 80
353 plastid protein-coding genes that uncovered two subclades within
354 cupressophytes (Wu and Chaw 2016). One comprises the Northern Hemisphere species,
355 consisting of Cupressaceae and Taxaceae, and the other comprises the Southern Hemisphere
356 species, containing Araucariaceae and Podocarpaceae. Our results also revealed an
357 ambiguous placement for Cathaya argyrophylla, a relative of Pinus and Picea; our
358 results suggested that C. argyrophylla was more closely related to Picea than to Pinus
359 (Lin et al. 2010; Wang et al. 2000). In addition, we found that Cedrus deodara
360 formed a clade with Keteleeria davidiana and Abies koreana, which agrees with
361 previous phylogenetic analyses that used comparative chloroplast genomics to
362 categorize Pinaceae genera and subfamilies (Lin et al. 2010).
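The statement that topologies are "almost identical" can be made quantitative by counting the clades that two trees do not share, in the spirit of a Robinson-Foulds distance for rooted trees. The following is a minimal sketch under stated assumptions: trees are written as nested tuples of taxon names, the function names are hypothetical, and the second topology is an invented alternative used only for contrast, not a result of this study.

```python
def clades(tree):
    """Return (all leaves, set of non-trivial clades) for a rooted tree
    given as nested tuples of taxon-name strings."""
    found = set()

    def walk(node):
        if isinstance(node, str):          # a leaf
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)                  # record the clade under this node
        return leaves

    all_leaves = walk(tree)
    found.discard(all_leaves)              # the root clade is always shared
    return all_leaves, found


def clade_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two rooted trees."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must contain the same taxa")
    return len(clades_a ^ clades_b)


# Family-level backbone as described in the text.
reported = ("Pinaceae", (("Taxaceae", "Cupressaceae"),
                         ("Araucariaceae", "Podocarpaceae")))
# Hypothetical alternative topology, for illustration only.
alternative = ("Pinaceae", ((("Taxaceae", "Cupressaceae"),
                             "Araucariaceae"), "Podocarpaceae"))

if __name__ == "__main__":
    print(clade_distance(reported, reported))
    print(clade_distance(reported, alternative))
```

A distance of zero means the two rooted topologies are identical; larger values count the clades on which they disagree.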
363 With respect to the phylogenetic distribution of Picea species, our results
364 suggested that seven spruce species are sister to the Piceoideae (Fig. 4, Fig. S2 and
365 S3). The topologies of our phylogenies of these seven spruce species are similar to
366 those reported by Sullivan et al. (2017). Picea abies, P. asperata,
367 and P. crassifolia were found to be more closely related to each other than to other
368 Picea species. Similar results were also found by previous phylogenetic analyses that
369 used plastid, mitochondrial, and nuclear markers (Lockwood et al. 2013; Ran et al.
370 2006; Sullivan et al. 2017). However, several nodes with P. abies and predominantly
371 northeast Asian taxa, including P. asperata and P. crassifolia, had less than 50%
372 support in Sullivan et al. (2017). The uncertainty of this topology is likely due to a
373 rapid and recent radiation given the very low interspecific genetic divergence and
374 monomorphic mitotypes (Ran et al. 2006; Sullivan et al. 2017). The plastomes of P.
375 abies, P. asperata, and P. crassifolia presented minimal differences; for example, P. asperata
376 and P. crassifolia differed by only seven SNPs in the present study. A previous
377 study of chloroplast and mitochondrial DNA variation between P. asperata and P.
378 crassifolia also showed that none of the derived mutations are species specific (Du et
379 al. 2009). A recent divergence (approximately 127,000 years ago) between these two
380 species and a lack of fixed variation indicate that they are at the initial stage of
381 speciation (Bi et al. 2016). At this stage, there has been insufficient time to
382 accumulate genetic differentiation (Nielsen and Wakeley 2001). Moreover,
383 incomplete lineage sorting and gene flow result in extensive genetic sharing between
384 the two lineages (Bi et al. 2016).
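For readers who wish to reproduce this kind of pairwise comparison, the sketch below shows one simple way to count single-nucleotide differences between two aligned sequences while ignoring gaps and ambiguous bases. It is an illustrative assumption about how such a count can be made, not the pipeline used in this study, and the two short sequences are invented toy data.

```python
def count_snps(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where both sequences carry an unambiguous
    base (A, C, G or T) and the bases differ. Gaps and ambiguity codes
    are skipped, so indels are not counted as SNPs."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not real Picea sequence).
    fragment_1 = "ACGTTGCA-ACGT"
    fragment_2 = "ACGATGCAAACCT"
    print(count_snps(fragment_1, fragment_2))
```

Applied to a whole-plastome alignment, a count of this kind is what underlies statements such as "only seven SNPs" between two species.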
385 Picea jezoensis is a widespread species found in cold-temperate and boreal
386 forests in eastern Siberia, northeast China, Korea, and Japan. Here, we found poor
387 support for the relationships among P. jezoensis, P. glauca, and P. abies on the
388 whole-plastome level. Similarly, previous phylogenetic studies based on individual
389 chloroplast DNA regions showed that these species are distinct (Lockwood et al.
390 2013; Ran et al. 2006). Sullivan et al. (2017) inferred that introgression between a P.
391 jezoensis-like species and the most recent common ancestor of P. abies may have
392 been present in northeastern Russia during the Quaternary glaciation, prior to the
393 clade’s diversification and colonization of Eurasia. In addition, P. glauca and P.
394 sitchensis, which are distributed in western North America (Sutton et al. 1991), were
395 clustered into distantly related clades rather than in a monophyletic group (Fig. 4).
396 This finding is consistent with phylogenetic analyses conducted using the trnC-trnD
397 and trnT-trnF regions (Ran et al. 2006) as well as whole plastome alignments
398 (Sullivan et al. 2017). P. sitchensis and P. glauca, two distantly-related species, have
399 been found to readily hybridize (Hamilton and Aitken 2013). However, a previous
400 study showed that P. glauca, P. engelmannii, and P. sitchensis were sister to each
401 other with strong support (Lockwood et al. 2013). The phylogeographical types
402 found at different sampling locations and the different plastome sequences within
403 them may also have caused discordance among Picea phylogenies (Aizawa et al.
404 2007; Lockwood et al. 2013; Ran et al. 2006).
405 Other than P. sitchensis, P. morrisonicola was found to be the most distantly
406 related to the other six Picea species (Fig. 4). Picea morrisonicola is a vulnerable
407 spruce species endemic to the island of Taiwan (Bodare et al. 2013); chloroplast,
408 mitochondrial and nuclear marker data suggest that this species is rather distantly
409 related to P. abies, P. crassifolia, and P. asperata (Ran et al. 2015). The population
410 of P. morrisonicola is small and geographically isolated (Bodare et al. 2013);
411 therefore, species-specific mutations in the chloroplast DNA would likely have
412 accumulated more rapidly in this species (Bouillé et al. 2011; Ran et al. 2006;
413 Sullivan et al. 2017). Thus, P. morrisonicola exhibits greater differences compared
414 with the other three spruce species examined in this study. Several studies have
415 investigated the phylogenetic relationships among spruces (Bouillé et al. 2011; Ran
416 et al. 2006; Sullivan et al. 2017); however, no concordant topologies have been
417 generated using plastid data (Bouillé et al. 2011; Ran et al. 2006; Sullivan et al.
418 2017). This can be attributed to interspecific plastome recombination and ancient
419 reticulate evolution in Picea (Bouillé et al. 2011; Sullivan et al. 2017).
420
421 Conclusions
422 In this study, by comparing the plastomes of P. asperata, P. crassifolia, P. abies, and
423 P. morrisonicola, we identified 438 SNPs, 95 indel events, four inversion events, and
424 seven highly variable regions, comprising six gene spacer regions (psbJ-petA, trnT-psaM,
425 trnS-trnD, trnL-rps4, psaC-ccsA, and rps7-trnL) and one gene (ycf1). These identified
426 regions and mutations may be used as molecular markers for further phylogenetic
427 analyses of Picea and other Pinaceae species. Furthermore, phylogenetic analysis
428 inferred from shared plastome genes may be more precise than analyses using a single
429 gene from the plastome. However, because incomplete lineage sorting and
430 interspecific introgression can lead to inconsistent tree topologies, we suggest that
431 additional analyses be conducted to confirm the relationships among Picea
432 species, especially analyses based on larger genomic datasets.
433
434 Acknowledgements
435 This work was financially supported by the National Natural Science Foundation of
436 China (NSFC 31271265 and 31600541) and the China Postdoctoral Science Foundation
437 (CPSF 2016M591053). We acknowledge TopEdit LLC for linguistic editing and
438 proofreading during the preparation of this manuscript. We also
439 thank the editor-in-chief, associate editor, and the expert reviewer for their valuable
440 comments and suggestions.
441
442 References
443 Aizawa, M., Yoshimaru, H., Saito, H., Katsuki, T., Kawahara, T., Kitamura, K., Shi,
444 F., and Kaji, M. 2007. Phylogeography of a northeast Asian spruce, Picea
445 jezoensis, inferred from genetic variation observed in organelle DNA markers.
446 Molecular Ecology 16(16): 3393-3405.
447 doi:10.1111/j.1365-294X.2007.03391.x.
448 Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., and
449 Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: a new generation of
450 protein database search programs. Nucleic Acids Research 25(17): 3389-3402.
451 doi: 10.1093/nar/25.17.3389.
452 Bankevich, A., Nurk, S., Antipov, D., Gurevich, A.A., Dvorkin, M., Kulikov, A.S.,
453 Lesin, V.M., Nikolenko, S.I., Pham, S., Prjibelski, A.D., Pyshkin, A.V.,
454 Sirotkin, A.V., Vyahhi, N., Tesler, G., Alekseyev, M.A., and Pevzner, P.A. 2012.
455 SPAdes: A New Genome Assembly Algorithm and Its Applications to
456 Single-Cell Sequencing. Journal of Computational Biology 19: 455-477.
457 doi:10.1089/cmb.2012.0021.
458 Barrett, C.F., Baker, W.J., Comer, J.R., Conran, J.G., Lahmeyer, S.C., Leebens‐Mack,
459 J.H. and Li, J., et al. 2016. Plastid genomes reveal support for deep
460 phylogenetic relationships and extensive rate variation among palms and other
461 commelinid monocots. New Phytologist 209(2): 855-870.
462 doi:10.1111/nph.13617.
463 Bi, H., Yue, W., Wang, X., Zou, J., Li, L., Liu, J., and Sun, Y. 2016. Late Pleistocene
464 climate change promoted divergence between Picea asperata and
465 P. crassifolia on the Qinghai–Tibet Plateau through recent bottlenecks.
466 Ecology & Evolution 6(13): 4435-4444. doi: 10.1002/ece3.2230.
467 Birol, I., Raymond, A., Jackman, S.D., Pleasance, S., Coope, R., Taylor, G.A. and
468 Yuen, S.M.M., et al. 2013. Assembling the 20 Gb white spruce (Picea glauca)
469 genome from whole-genome shotgun sequencing data. Bioinformatics 29(12):
470 1492-1497. doi: 10.1093/bioinformatics/btt178.
471 Bodare, S., Stocks, M., Yang, J.C., and Lascoux, M. 2013. Origin and demographic
472 history of the endemic Taiwan spruce (Picea morrisonicola). Ecology &
473 Evolution 3(10): 3320-3333. doi: 10.1002/ece3.698.
474 Bouillé, M., Senneville, S., and Bousquet, J. 2011. Discordant mtDNA and cpDNA
475 phylogenies indicate geographic speciation and reticulation as driving factors
476 for the diversification of the genus Picea. Tree Genetics & Genomes 7(3):
477 469-484. doi:10.1007/s11295-010-0349-z.
478 Dong, W., Liu, J., Yu, J., Wang, L., and Zhou, S. 2012. Highly variable chloroplast
479 markers for evaluating plant phylogeny at low taxonomic levels and for DNA
480 barcoding. PLoS One 7: e35071. doi:10.1371/journal.pone.0035071.
481 Dong, W., Xu, C., Li, C., Sun, J., Zuo, Y., Shi, S. and Cheng, T., et al., 2015. ycf1, the
482 most promising plastid DNA barcode of land plants. Scientific Reports 5:
483 8348. doi:10.1038/srep08348.
484 Doyle, J.J. 1992. Gene trees and species trees: molecular systematics as one-character
485 taxonomy. Systematic Botany 17(1): 144-163.
486 Du, F.K., Petit, R.J., and Liu, J.Q. 2009. More introgression with less gene flow:
487 chloroplast vs. mitochondrial DNA in the Picea asperata complex in China,
488 and comparison with other Conifers. Molecular Ecology 18(7): 1396-1407.
489 doi: 10.1111/j.1365-294X.2009.04107.x.
490 Dubchak, I. 2007. Comparative Analysis and Visualization of Genomic Sequences
491 Using VISTA Browser and Associated Computational Tools. Methods in
492 Molecular Biology 395:3-16.
493 Eckenwalder, J.E., and Press, T. 2009. Conifers of the world. Timber Press.
494 Farjon, A. 2001. World checklist and bibliography of conifers. Royal Botanic
495 Gardens.
496 Fu, L., Li, N., and Mill, R. 1999. Pinaceae. In: Flora of China. Science Press, Beijing
497 and Missouri Botanical Garden Press, St. Louis. 11-52.
498 Guo, S.-Q., Xiong, M., Ji, C.-F., Zhang, Z.-R., Li, D.-Z., and Zhang, Z.-Y. 2011.
499 Molecular phylogenetic reconstruction of Osmanthus Lour. (Oleaceae) and
500 related genera based on three chloroplast intergenic spacers. Plant Systematics
501 and Evolution 294(1-2): 57-64. doi:10.1007/s00606-011-0445-z.
502 Hamilton, J.A., Aitken, S.N. 2013. Genetic and morphological structure of a spruce
503 hybrid (Picea sitchensis x P. glauca) zone along a climatic gradient. American
504 Journal of Botany. 100:1651–1662.
505 Huang, D.I., Hefer, C.A., Kolosova, N., Douglas, C.J., and Cronk, Q.C.B. 2014.
506 Whole plastome sequencing reveals deep plastid divergence and cytonuclear
507 discordance between closely related balsam poplars, Populus balsamifera and
508 P. trichocarpa (Salicaceae). New Phytologist 204(3): 693-703. doi:
509 10.1111/nph.12956.
510 Jackman, S.D., Warren, R.L., Gibb, E.A., Vandervalk, B.P., Mohamadi, H., Chu, J.
511 and Raymond, A., et al. 2016. Organellar Genomes of White Spruce (Picea
512 glauca): Assembly and Annotation. Genome Biology & Evolution 8(1):
513 29-41. doi:10.1093/gbe/evv244.
514 Katoh, K., and Standley, D.M. 2013. MAFFT multiple sequence alignment software
515 version 7: improvements in performance and usability. Molecular Biology &
516 Evolution 30: 772-780. doi:10.1093/molbev/mst010.
517 Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S. and
518 Buxton, S., et al. 2012. Geneious Basic: An integrated and extendable desktop
519 software platform for the organization and analysis of sequence data.
520 Bioinformatics 28(12): 1647-1649. doi:10.1093/bioinformatics/bts199.
521 Knowles, L.L., and Carstens, B.C. 2007. Estimating a geographically explicit model
522 of population divergence. Evolution 61(3): 477-493.
523 doi:10.1111/j.1558-5646.2007.00043.x.
524 Kurtz, S., Phillippy, A., Delcher, A.L., Smoot, M., Shumway, M., Antonescu, C., and
525 Salzberg, S.L. 2004. Versatile and open software for comparing large
526 genomes. Genome Biology 5(2): R12. doi:10.1186/gb-2004-5-2-r12.
527 Lanfear, R., Calcott, B., Ho, S.Y.W., and Guindon, S. 2012. PartitionFinder:
528 combined selection of partitioning schemes and substitution models for
529 phylogenetic analyses. Molecular Biology & Evolution 29(6): 1695-1701.
530 doi:10.1093/molbev/mss020.
531 Li, J., Wang, S., Yu, J., Wang, L., and Zhou, S. 2013. A modified CTAB protocol for
532 plant DNA extraction. Chinese Bulletin of Botany 48(1): 72-78. doi:
533 10.3724/SP.J.1259.2013.00072.
534 Librado, P., and Rozas, J. 2009. DnaSP v5: a software for comprehensive analysis of
535 DNA polymorphism data. Bioinformatics 25(11): 1451-1452. doi:
536 10.1093/bioinformatics/btp187.
537 Lin, C.P., Huang, J.P., Wu, C.S., Hsu, C.Y., and Chaw, S.M. 2010. Comparative
538 Chloroplast Genomics Reveals the Evolution of Pinaceae Genera and
539 Subfamilies. Genome Biology & Evolution 2(1): 504-517.
540 doi:10.1093/gbe/evq036.
541 Lockwood, J.D., Aleksić, J.M., Zou, J., Wang, J., Liu, J., and Renner, S.S. 2013. A
542 new phylogeny for the genus Picea from plastid, mitochondrial, and nuclear
543 sequences. Molecular Phylogenetics & Evolution 69(3): 717-727. doi:
544 10.1016/j.ympev.2013.07.004.
545 Lohse, M., Drechsel, O., Kahlau, S., and Bock, R. 2013. OrganellarGenomeDRAW--a
546 suite of tools for generating physical maps of plastid and mitochondrial
547 genomes and visualizing expression data sets. Nucleic Acids Research
548 41(Web Server issue): 575-581. doi: 10.1093/nar/gkt289.
549 Maddison, W.P. 1997. Gene Trees in Species Trees. Systematic Biology 46(3):
550 523-536. doi: 10.1093/sysbio/46.3.523.
551 Neale, D.B., and Sederoff, R.R. 1989. Paternal inheritance of chloroplast DNA and maternal
552 inheritance of mitochondrial DNA in loblolly pine. Theoretical and Applied
553 Genetics 77:212–216.
554 Nielsen, R., and Wakeley, J. 2001. Distinguishing Migration From Isolation: A
555 Markov Chain Monte Carlo Approach. Genetics 158(2): 885-896.
556 Nystedt, B., Street, N.R., Wetterbom, A., Zuccolo, A., Lin, Y.-C., Scofield, D.G. and
557 Vezzi, F. et al., 2013. The Norway spruce genome sequence and conifer
558 genome evolution. Nature. 497:579-584. doi:10.1038/nature12211.
559 OECD. 1999. Consensus document on the biology of Picea glauca (Moench) Voss
560 (White Spruce). Series on Harmonization of Regulatory Oversight in
561 Biotechnology No. 13, eds. Joint Meet. Chemicals Committee and Working
562 Party on Chemicals (Environ. Direct., OECD Environ. Health and Safety
563 Publ.), Paris, France.
564 Parks, M., Cronn, R., and Liston, A. 2009. Increasing phylogenetic resolution at low
565 taxonomic levels using massively parallel sequencing of chloroplast genomes.
566 BMC Biology 7(84): 1-17. doi:10.1186/1741-7007-7-84.
567 Rambaut, A. 2002. Se-Al: Sequence Alignment Editor v2.0 a11. University of Oxford
568 UK.
569 Ran, J.H., Wei, X.X., and Wang, X.Q. 2006. Molecular phylogeny and biogeography
570 of Picea (Pinaceae): implications for phylogeographical studies using
571 cytoplasmic haplotypes. Molecular Phylogenetics & Evolution 41(2):
572 405-419. doi:10.1016/j.ympev.2006.05.039.
573 Ran, J.H., Shen, T.T., Liu, W.J., Wang, P.P., and Wang, X.Q. 2015. Mitochondrial
574 introgression and complex biogeographic history of the genus Picea.
575 Molecular Phylogenetics & Evolution 93: 63-76.
576 doi:10.1016/j.ympev.2015.07.020.
577 Ranade, S.S., García-Gil, M.R., and Rosselló, J.A. 2016. Non-functional plastid ndh
578 gene fragments are present in the nuclear genome of Norway spruce (Picea
579 abies L. Karsch): insights from in silico analysis of nuclear and organellar
580 genomes. Molecular Genetics & Genomics 291(2): 935-941.
581 doi:10.1007/s00438-015-1159-7.
582 Räsänen, K., and Hendry, A.P. 2010. Disentangling interactions between adaptive
583 divergence and gene flow when ecology drives diversification. Ecology
584 Letters 11(6): 624-636. doi: 10.1111/j.1461-0248.2008.01176.x.
585 Kikuchi, S., Bédard, J., Hirano, M., Hirabayashi, Y., Oishi, M., Imai, M., and Takase, M.
586 2013. Uncovering the protein translocon at the chloroplast inner envelope
587 membrane. Science (New York, N.Y.) 339(6119): 571-574.
588 doi:10.1126/science.1229262.
589 Simmons, M.P. 2004. Independence of alignment and tree search. Molecular
590 Phylogenetics & Evolution 31(3): 874-879. doi:10.1016/j.ympev.2003.10.008.
591 Stamatakis, A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic
592 analyses with thousands of taxa and mixed models. Bioinformatics 22(21):
593 2688-2690. doi: 10.1093/bioinformatics/btl446.
594 Sullivan, A.R., Schiffthaler, B., Thompson, S.L., Street, N.R., and Wang, X.R. 2017.
595 Interspecific Plastome Recombination Reflects Ancient Reticulate Evolution
596 in Picea (Pinaceae). Molecular Biology & Evolution 34(7): 1689-1701.
597 Sutton, B.C., Flanagan, D.J., Gawley, J.R., Newton, C.H., Lester, D.T., and
598 El-Kassaby, Y.A. 1991. Inheritance of chloroplast and mitochondrial DNA in
599 Picea and composition of hybrids from introgression zones. Theoretical &
600 Applied Genetics 82(2): 242-248. doi:10.1007/bf00226220.
601 Syring, J., Farrell, K., Businsky, R., Cronn, R., and Liston, A. 2007. Widespread
602 genealogical nonmonophyly in species of Pinus subgenus Strobus. Systematic
603 Biology 56(2): 1-19. doi:10.1080/10635150701258787.
604 Tsudzuki, J., Nakashima, K., Tsudzuki, T., Hiratsuka, J., Shibata, M., Wakasugi, T.,
605 and Sugiura, M. 1992. Chloroplast DNA of black pine retains a residual
606 inverted repeat lacking rRNA genes: nucleotide sequences of trnQ, trnK,
607 psbA, trnI and trnH and the absence of rps16. Molecular & General Genetics
608 232(2): 206-214.
609 Wakasugi, T., Tsudzuki, J., Ito, S., Nakashima, K., Tsudzuki, T., and Sugiura, M.
610 1994. Loss of all ndh genes as determined by sequencing the entire chloroplast
611 genome of the black pine Pinus thunbergii. Proceedings of the National
612 Academy of Sciences of the United States of America 91(21): 9794-9798.
613 Wang, X.Q., Tank, D.C., and Sang, T. 2000. Phylogeny and divergence times in
614 Pinaceae: evidence from three genomes. Molecular Biology & Evolution 17:
615 773-781. doi:10.1212/01.wnl.0000210464.94122.e1
616 Whittle, C.A., and Johnston, M.O. 2002. Male-Driven Evolution of Mitochondrial and
617 Chloroplastidial DNA Sequences in Plants. Molecular Biology & Evolution
618 19(6): 938-949. doi: 10.1093/oxfordjournals.molbev.a004151.
619 Wolfe, A.D., and Randle, C.P. 2004. Recombination, heteroplasmy, haplotype
620 polymorphism, and paralogy in plastid genes: Implications for plant molecular
621 systematics. Systematic Botany 29:1011–1020.
622 Wolfe, K.H., Li, W.H., and Sharp, P.M. 1987. Rates of nucleotide substitution vary
623 greatly among plant mitochondrial, chloroplast, and nuclear DNAs.
624 Proceedings of the National Academy of Sciences of the United States of
625 America 84(24): 9054-9058.
626 Wu, C.S., Chaw, S.M., and Huang, Y.Y. 2013. Chloroplast phylogenomics indicates
627 that Ginkgo biloba is sister to Cycads. Genome biology and evolution 5(1):
628 243-254. doi:10.1093/gbe/evt001.
629 Wu, C.S., Chaw, S.M. 2016. Large-Scale Comparative Analysis Reveals the
630 Mechanisms Driving Plastomic Compaction, Reduction, and Inversions in
631 Conifers II (Cupressophytes). Genome biology and evolution 8:3740-3750
632 doi:10.1093/gbe/evw278.
633 Wyman, S.K., Jansen, R.K., and Boore, J.L. 2004. Automatic annotation of organellar
634 genomes with DOGMA. Bioinformatics 20(17): 3252-3255. doi:
635 10.1093/bioinformatics/bth352.
636 Yang, J.C., Joo, M., So, S., Yi, D.K., Shin, C.H., Lee, Y.M., and Choi, K. 2015. The
637 complete plastid genome sequence of Picea jezoensis (Pinaceae: Piceoideae).
638 Mitochondrial DNA Part A DNA Mapping Sequencing & Analysis 27: 3761.
639 doi: 10.3109/19401736.2015.1079894.
640 Zhao, W., Jiang, M., Ma, J., Xu, N., and Wang, J. 2015. Interspecific hybridization of
641 Picea and genetic testing of growth traits in F_1 seedlings. Forest Science and
642 Technology (In Chinese) 9: 40-43.
643 Zhu, A., Fan, W., Adams, R.P., Mower, J.P. 2018. Phylogenomic evidence for ancient
644 recombination between plastid genomes of the
645 Cupressus-Juniperus-Xanthocyparis complex (Cupressaceae). BMC
646 Evolutionary Biology 18:137 doi:10.1186/s12862-018-1258-2.
647 Zou, J., Sun, Y., Li, L., Wang, G., Wei, Y., Lu, Z., Wang, Q., and Liu, J. 2013.
648 Population genetic evidence for speciation pattern and gene flow between
649 Picea wilsonii, P. morrisonicola and P. neoveitchii. Annals of Botany 112(9):
650 1829-1844. doi:10.1093/aob/mct241.
651 Figure legends
652 Figure 1. Gene map of the P. asperata and P. crassifolia plastomes. Genes are
653 indicated by boxes on the inside (clockwise transcription) and outside
654 (counterclockwise transcription) of the outermost circle. Genes belonging to different
655 functional groups are color-coded. The dashed area in the inner circle indicates the
656 GC content of the plastome.
657
658 Figure 2. Sequence identity plot comparing the complete plastomes of P. asperata, P.
659 crassifolia, P. abies, and P. morrisonicola with P. asperata as a reference using
660 mVISTA. Gray arrows and thick black lines above the alignment indicate genes with
661 their orientation and the position of the IRs, respectively. A cutoff of 70% identity
662 was used for the plots, and the Y-scale represents the percent identity, and ranges
663 from 50 to 100%.
664
665 Figure 3. Patterns of nucleotide substitutions among the P. asperata, P. crassifolia, P.
666 abies and P. morrisonicola plastomes.
667
668 Figure 4. Phylogenetic tree inferred via maximum likelihood and parsimony using 68
669 shared protein-coding genes among 32 plastid genomes (20 from the Pinaceae and 12
670 outgroups from the Araucariaceae, Podocarpaceae, Taxaceae, and Cupressaceae).
671 Support values estimated from 1,000 bootstrap replicates through maximum
672 likelihood (ML) and parsimony (MP) are presented along the branches (MP/ML).
673
674 Supplementary materials
675 Table S1. Interspecific hybridization among Picea crassifolia and P. abies, P.
676 wilsonii, P. asperata, P. koraiensis, and P. glauca.
677
678 Table S2. Genes identified in the Picea asperata, P. crassifolia, P. abies, and P.
679 morrisonicola plastomes.
680
681 Table S3. List of 32 species whose cp sequences are available in GenBank.
682
683 Table S4. Base mutation events in gene coding regions of the plastomes of Picea
684 asperata, P. abies, and P. morrisonicola.
685
686 Table S5. Simple sequence repeat (SSR) indels in the plastomes of Picea asperata, P.
687 crassifolia, P. abies, and P. morrisonicola.
688
689 Table S6. Simple indels in the plastomes of Picea asperata, P. crassifolia, P. abies,
690 and P. morrisonicola.
691
692 Figure S1. Sequence identity plot comparing the complete plastomes of P. asperata,
693 P. crassifolia, P. abies, P. morrisonicola, P. sitchensis, P. glauca, and P. jezoensis
694 with P. abies as a reference using mVISTA. Gray arrows and thick black lines above
695 the alignment indicate genes with their orientation and the position of the IRs,
696 respectively. Genome regions are color-coded as exons and conserved non-coding
697 sequences (CNS). A cutoff of 50% identity was used for the plots, and the Y-scale
698 represents the percent identity, and ranges from 50 to 100%.
699
700 Figure S2. Phylogenetic tree inferred via maximum likelihood and parsimony using
701 68 shared genes among 32 plastid genomes (20 from the Pinaceae and 12 outgroups
702 from the Araucariaceae, Podocarpaceae, Taxaceae and Cupressaceae). Support
703 values estimated from 1,000 bootstrap replicates using maximum likelihood (ML).
704
705 Figure S3. Phylogenetic tree inferred via maximum likelihood and parsimony using
706 68 shared codon positions among 32 plastid genomes (20 from the Pinaceae and 12
707 outgroups from the Araucariaceae, Podocarpaceae, Taxaceae and Cupressaceae).
708 Support values estimated from 1,000 bootstrap replicates using maximum
709 likelihood (ML).
710
711
Table 1. Summary of the four complete Picea plastomes: Picea asperata, P. crassifolia, P. abies, and P. morrisonicola

Category                          P. asperata   P. crassifolia   P. morrisonicola   P. abies
Accession number                  KY204451      KY204450         AB480556           HF937082
Size (bp)                         124,145       124,126          124,168            124,084
% GC content                      38.71         38.71            38.79              38.72
Number of protein-coding genes    72            72               72                 72
Number of tRNA genes              32            32               32                 32
Number of rRNA genes              4             4                4                  4
Number of genes with introns      14            14               14                 14
Table 2. Inversion events and inversion directions in the plastomes of Picea asperata, P. crassifolia, P. abies, and P. morrisonicola

Gene fragment   Size of inversion (bp)   Repeat sequence length (bp)   P. asperata   P. crassifolia   P. abies   P. morrisonicola
psbA-trnK       4                        13                            G             G                G          A
trnH-trnT       3                        9                             G             A                A          A
trnI-trnF       4                        19                            G             G                A          A
ycf3 intron     2                        2                             G             G                G          A

A and G indicate different directions. The direction of the plastome of P. asperata (G) was used as a reference.
[Figure 1 image: circular gene map of the Picea asperata (124,145 bp) and Picea crassifolia (124,126 bp) plastomes. Legend categories: photosystem I; photosystem II; cytochrome b/f complex; ATP synthase; RubisCO large subunit; RNA polymerase; ribosomal proteins (SSU); ribosomal proteins (LSU); clpP, matK; other genes; hypothetical chloroplast reading frames (ycf); transfer RNAs; ribosomal RNAs; introns.]

[Figure 2 image: mVISTA sequence identity plot with tracks for P. crassifolia, P. morrisonicola, and P. abies. Y-axis: percent identity, 50 to 100%. X-axis: alignment position, 0k to 123k. Annotation key: gene, exon, intron, CNS.]