Genome

Molecular identification of Physalis species (Solanaceae) using a candidate DNA barcode: the chloroplast psbA–trnH intergenic region

Journal: Genome

Manuscript ID gen-2017-0115.R2

Manuscript Type: Article

Date Submitted by the Author: 07-Sep-2017

Complete List of Authors:
Feng, Shangguo; Hangzhou Normal University, College of Life and Environmental Sciences
Jiao, Kaili; Hangzhou Normal University
Zhu, Yujia; Hangzhou Normal University
Wang, Hongfen; Shandong Xiajin First Middle School
Jiang, Mengying; Hangzhou Normal University
Wang, Huizhong; Hangzhou Normal University, College of Life and Environmental Sciences

Is the invited manuscript for consideration in a Special Issue?: This submission is not invited

Keyword: Physalis, DNA barcoding, psbA–trnH region, molecular identification

https://mc06.manuscriptcentral.com/genome-pubs Page 1 of 20 Genome

Molecular identification of Physalis species (Solanaceae) using a candidate DNA barcode: the chloroplast psbA–trnH intergenic region

Shangguo Feng 1,2, Kaili Jiao 2, Yujia Zhu 2, Hongfen Wang 3, Mengying Jiang 2, Huizhong Wang 2,*

1 College of Bioscience & Biotechnology, Hunan Agricultural University, Changsha 410128, China

2 Zhejiang Provincial Key Laboratory for Genetic Improvement and Quality Control of Medicinal Plants, College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou 310036, China

3 Shandong Xiajin First Middle School, Xiajin 253200, China

*Corresponding author

Huizhong Wang: [email protected]

The email addresses of the other coauthors: [email protected] (Shangguo Feng), [email protected] (Kaili Jiao), [email protected] (Yujia Zhu), [email protected] (Hongfen Wang), [email protected] (Mengying Jiang).


Abstract:

Physalis L., an important genus of the family Solanaceae, includes many commercially important edible and medicinal species. Traditionally, species identification is based on morphological traits; however, the highly similar morphological traits among Physalis species make this approach difficult. In this study, we evaluated the feasibility of using a popular DNA barcode, the chloroplast psbA–trnH intergenic region, in the identification of Physalis species. Thirty-six psbA–trnH regions of Physalis species and of the closely related Nicandra physalodes were analyzed. The success rates of PCR amplification and sequencing of the psbA–trnH region were 100%. MEGA V6.0 was used to align the psbA–trnH sequences and to compute genetic distances. The results show an apparent barcoding gap between intra- and interspecific variation. Results of both the BLAST1 and nearest-distance methods show that the psbA–trnH region can be used to identify all species examined in the present study. In addition, phylogenetic analysis using psbA–trnH data revealed a distinct boundary between species. It also confirmed the relationship between Physalis species and closely related species, as established by previous studies. In conclusion, the psbA–trnH intergenic region can be used as an efficient DNA barcode for the identification of Physalis species.

Keywords: Physalis; DNA barcoding; psbA–trnH region; molecular identification


Introduction

Physalis is one of the important genera of the plant family Solanaceae, comprising 75 to 120 species (Maggie 2005; Martinez 1998; Chinese Academy of Sciences 1978; Wei et al. 2012). Most Physalis species are distributed in tropical and temperate regions of the Americas, and a few species are native to Eurasia and Southeast Asia (Feng et al. 2016; Maggie 2005; Martinez 1998; Chinese Academy of Sciences 1978; Wei et al. 2012). In China, there are five Physalis species (Physalis alkekengi, P. angulata, P. peruviana, P. pubescens, and P. minima) and two varieties (P. alkekengi var. franchetii and P. angulata var. villosa), which are mainly distributed in the eastern, central, southern, and southwestern regions (Chinese Academy of Sciences 1978). Because of their high content of vitamins, minerals, and antioxidants, many Physalis species possess potential medicinal properties such as antibacterial, anti-inflammatory, and anticancer activity (Hong et al. 2015; Ji et al. 2012; Wei et al. 2012). For example, P. alkekengi var. franchetii, a standard medicinal plant recorded in the Pharmacopoeia of the People’s Republic of China, is used in traditional Chinese medicine for the treatment and prevention of sore throat, tumors, cough, leishmaniasis, eczema, hepatitis, and urinary problems (Chinese Pharmacopoeia Editorial Committee 2015; Yang et al. 2016). Moreover, some Physalis species such as P. philadelphica, P. peruviana, and P. pubescens are extensively cultivated for their edible fruits in many regions of the world (Sang-Ngern et al. 2016; Zamora-Tavares et al. 2015; Zhang et al. 2016).

Because of their highly similar morphological characteristics, misidentification of Physalis species is common (Fig. 1). For example, P. minima can be confused with P. angulata or P. pubescens in traditional Chinese medicine applications (Feng et al. 2016; Chinese Academy of Sciences 1978). In addition, some species of other genera, such as Nicandra physalodes, are often mistaken for Physalis species (Feng et al. 2016) (Fig. 1). The amounts of biologically active compounds and their application value differ among Physalis species and their related species. Thus, incorrect identification of Physalis species can lead to their inappropriate usage and to failures in the conservation of their genetic resources. The traditional morphological authentication approach is often affected by heritable variation and growth environment (Maggie 2005; Vargas-Ponce et al. 2011). Hence, a rapid and reliable identification method for Physalis is essential.

DNA barcoding, a technique based on sequence diversity within short and standardized nuclear or chloroplast DNA (cpDNA) regions, is often used in species authentication (Chen et al. 2010; Feng et al. 2016; Feng et al. 2015; Hebert et al. 2004). cpDNA sequences (atpF–atpH, psbA–trnH, matK, rbcL) have been widely used as tools in studies on plant phylogenetics and species identification. In particular, the psbA–trnH intergenic region has been found to be ubiquitous in plants; it has thus emerged as one of the popular DNA barcodes (CBOL Plant Working Group 2009; Gao et al. 2013; Kress and Erickson 2007; Kress et al. 2005; Yao et al. 2009). However, few studies have reported the use of the psbA–trnH intergenic region in the barcoding of Physalis species. In the present study, we examined the feasibility of using the psbA–trnH intergenic region as a DNA barcode in the identification of eight Physalis species, including four collected from China and four obtained from GenBank (Clark et al. 2016).

Materials and Methods

Plant Materials

In this study, 32 samples belonging to eight Physalis species and four samples of N. physalodes (often mistaken for Physalis species because of its similar morphological characteristics) were examined (Table 1, Fig. 1). Among the specimens, 31 were collected from the main distribution regions in China for sequencing (Table 1). The other psbA–trnH sequences of Physalis species were obtained from GenBank (Clark et al. 2016). All collected samples were verified against the specimens stored in the Chinese Virtual Herbarium (http://www.cvh.ac.cn/).

DNA Extraction, Amplification, and Sequencing

Fresh young leaves of the collected samples were used for genomic DNA isolation, as previously reported (Feng et al. 2016; Feng et al. 2013). The pair of universal primers used to amplify chloroplast psbA–trnH sequences consisted of psbA F (5′-GTTATGCATGAACGTAATGCTC-3′) and trnH R (5′-CGCGCATGGTGGATTCACAAATC-3′) (Sang et al. 1997). PCR amplification was conducted in 25 µL volumes containing 1× PCR buffer with MgCl2, 0.4 mM dNTPs, 0.4 µM of each primer (synthesized by Sangon Biotech Co., Ltd., Shanghai, China), and 1.0 U Taq DNA polymerase (TaKaRa Bio., Kyoto, Japan). PCR amplification was conducted in a Mastercycler Nexus gradient thermocycler (Eppendorf AG, Hamburg, Germany) with the following parameter settings: holding at 94 °C for 5 min, followed by 32 cycles of 94 °C for 50 s, 55 °C for 50 s, and 72 °C for 1.5 min, and a final extension at 72 °C for 10 min. The amplification products were sequenced in both directions by Sunny Biotechnology Co., Ltd., Shanghai, China.
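As a quick sanity check on the cycling programme described above, the following minimal sketch adds up the programmed hold times. The function and parameter names are illustrative assumptions, and ramp times between temperatures are not given in the text, so they are ignored; the real run time on a thermocycler would be somewhat longer.

```python
def total_hold_time_s(initial_denaturation_s=5 * 60,
                      cycles=32,
                      cycle_steps_s=(50, 50, 90),  # 94 C, 55 C, 72 C holds
                      final_extension_s=10 * 60):
    """Sum the programmed hold times of a PCR protocol, in seconds.

    Defaults follow the settings stated in the text; ramp times between
    temperatures are not reported there and are not included.
    """
    return initial_denaturation_s + cycles * sum(cycle_steps_s) + final_extension_s


if __name__ == "__main__":
    seconds = total_hold_time_s()
    print(f"{seconds} s (about {seconds / 60:.1f} min of programmed holds)")
```

Under these assumptions the programmed holds add up to 6980 s, a little under two hours.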

Data Analysis

The program Clustal W (Thompson et al. 2002) was used to align the chloroplast psbA–trnH regions obtained from all samples, followed by manual editing. MEGA 6.0 was used to calculate the genetic distances with the Kimura two-parameter (K2P) model (Tamura et al. 2013). Interspecific divergences were evaluated by average interspecific distance, minimum interspecific distance, and average theta prime calculated using the K2P model, while the intraspecific variation was determined from the metrics average intraspecific distance, coalescent depth, and theta based on the K2P model (Chen et al. 2010; Meyer and Paulay 2005). The distributions of intraspecific variability were compared against those of the interspecific variability using DNA barcoding gaps (Chen et al. 2010; Meyer and Paulay 2005). Wilcoxon two-sample tests were carried out as described previously (Chen et al. 2010). The BLAST1 and nearest-distance methods were used to assess the discriminatory efficacy of psbA–trnH sequences for the collected species (Feng et al. 2015; Slabbinck et al. 2008). For the BLAST1 method, the reference database of all psbA–trnH regions was searched using the BLAST program (http://blast.ncbi.nlm.nih.gov/Blast.cgi). For the nearest-distance method, identification was based on all pairwise genetic distances calculated between each query and each of the reference sequences, as well as among the reference sequences. Phylogenetic analysis was conducted according to the neighbor-joining (NJ) and maximum likelihood (ML) methods using MEGA 6.0 (Tamura et al. 2013), with the following parameters: 1000 bootstrap replicates, Tamura-Nei model, and complete elimination of all positions containing gaps and missing data.
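To make the K2P distance used above concrete, the sketch below computes it for one pair of aligned sequences from the proportions of transitions (P) and transversions (Q), using the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. It is an illustration only, not the MEGA implementation; the function name is hypothetical, and skipping gapped or ambiguous sites pairwise is an assumption made here for simplicity.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (pairwise deletion; an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    if transitions == 0 and transversions == 0:
        return 0.0
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Toy 10-bp alignment with a single A<->G transition.
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 6))
```

Identical sequences give a distance of 0.000, which is how most intraspecific comparisons in a low-variation marker would appear.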

Results

PCR Efficiency, Sequencing, and Characteristics of the psbA–trnH regions

The success rates of PCR amplification and sequencing of the psbA–trnH regions from the sampled specimens were 100%. The lengths of the psbA–trnH regions ranged from 492 to 553 bp, with an average of 515 bp (Table 1). The GC content ranged from 26.95% to 29.47%, with an average of 28.28%. Both the sequence length and the GC content of the psbA–trnH regions therefore varied among the samples. The GenBank accession number of each psbA–trnH region is given in Table 1.
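The length and GC-content figures above are simple per-sequence statistics. A minimal sketch of how they could be computed is shown below; the function name and the toy sequence are hypothetical, and ignoring gaps and ambiguous bases is an assumption of this sketch rather than a documented choice of the study.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Hypothetical 504-bp sequence with 147 G/C bases (not a real sample).
    toy = "G" * 147 + "A" * 357
    print(len(toy), round(gc_content(toy), 2))
```

For this toy composition the result rounds to 29.17%, which shows how a value of that size arises from roughly 147 G/C bases in a 504-bp region.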

Genetic Divergence within and between Species

Genetic divergences were calculated using MEGA 6.0. The six metrics used to evaluate the interspecific versus intraspecific variation are shown in Table 2. The genetic distances of the three intraspecific metrics (average intraspecific distance, coalescent depth, and theta) are far smaller than the genetic distances of the three interspecific metrics (average interspecific distance, minimum interspecific distance, and theta prime) (Table 2). Significant differences between the interspecific and intraspecific divergences were also demonstrated by Wilcoxon two-sample tests (Table 3).

Assessment of the Barcoding Gap

The distributions of the K2P genetic distances of all tested samples, for intra- versus interspecific variation, were analyzed on a scale of 0.004 distance units (Fig. 2). We observed a barcoding gap between intraspecific and interspecific variation (Fig. 2). The intraspecific genetic distance was in the range of 0.000–0.008, and the proportion of intraspecific genetic distances equal to zero reached 89.89%. The interspecific genetic distance ranged from 0.002 to 0.922, and the proportion of interspecific genetic distances of ≥0.011 was 82.33% (Fig. 2).
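The distribution summary described above can be sketched as follows: distances are grouped into bins of 0.004 units, and the share of values at or above a threshold is reported. The function names and the toy distance lists are hypothetical; this is an illustration of the binning, not the procedure used to draw Fig. 2.

```python
import math
from collections import Counter


def bin_distances(distances, bin_width=0.004):
    """Map bin index -> proportion of distances falling in that bin.

    Bin k covers [k * bin_width, (k + 1) * bin_width); a tiny epsilon guards
    against floating-point error at bin edges.
    """
    counts = Counter(math.floor(d / bin_width + 1e-9) for d in distances)
    total = len(distances)
    return {k: n / total for k, n in sorted(counts.items())}


def fraction_at_or_above(distances, threshold):
    """Proportion of distances greater than or equal to a threshold."""
    return sum(1 for d in distances if d >= threshold) / len(distances)


if __name__ == "__main__":
    intra = [0.0] * 8 + [0.004, 0.008]   # toy intraspecific distances
    inter = [0.002, 0.02, 0.3, 0.5]      # toy interspecific distances
    print(bin_distances(intra))
    print(fraction_at_or_above(inter, 0.011))
```

Plotting the two binned distributions side by side gives the kind of relative-frequency comparison that a barcoding-gap figure shows.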

Species Discriminability of psbA–trnH

The results of both the BLAST1 and nearest-distance methods indicate that the identification rates at the species level using psbA–trnH are 100.0% accurate (Table 4) and that psbA–trnH therefore possesses high discriminability for the eight Physalis species and N. physalodes collected in the study.
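The nearest-distance idea can be sketched in a few lines: a query is assigned to the species of the reference sequence at the smallest genetic distance, and the result is treated as ambiguous when the closest references belong to more than one species. The function names, the tuple layout of the references, and the use of an uncorrected p-distance as a stand-in for K2P are all assumptions of this sketch.

```python
def p_distance(a, b):
    """Uncorrected proportion of differing sites (stand-in for K2P here)."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(1 for x, y in pairs if x != y) / len(pairs)


def nearest_distance_id(query, references, distance=p_distance):
    """Return (species, distance) for the nearest reference sequence.

    references: list of (species_name, aligned_sequence) tuples.
    Returns (None, distance) when the nearest references are tied across
    different species, i.e. the identification is ambiguous.
    """
    scored = [(distance(query, seq), species) for species, seq in references]
    best = min(d for d, _ in scored)
    species_at_best = {s for d, s in scored if d == best}
    if len(species_at_best) == 1:
        return species_at_best.pop(), best
    return None, best


if __name__ == "__main__":
    refs = [("species_A", "ACGTACGT"),
            ("species_A", "ACGTACGA"),
            ("species_B", "TTGTACCC")]
    print(nearest_distance_id("ACGTACGT", refs))
```

Passing a K2P function as the `distance` argument would bring the sketch closer to the distances reported in the text.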

Both the NJ and ML phylogenetic trees constructed from the psbA–trnH regions suggest that all of the tested samples can be grouped into three main clusters (Fig. 3). Cluster I is the most complex and contains ungrouped samples of P. angulata and four species clades, (I1) P. pubescens, (I2) P. gracilis + P. pruinosa, (I3) P. minima, and (I4) P. heterophylla + P. peruviana, with strong support (BS = 99 for the NJ method and 96 for the ML method). Cluster II, which has strong support (BS = 100 for both NJ and ML methods), contains six samples of P. alkekengi var. franchetii. All samples of N. physalodes, from the genus Nicandra, are distant from any Physalis species and constitute cluster III with strong support (BS = 100 for both NJ and ML methods). Both the NJ and ML trees also show that multiple samples from the same species are grouped into one branch (Fig. 3, A and B).

Discussion

The commercial value of many Physalis species has received increasing attention because of their edible fruits, medicinal value, and ornamental value (Feng et al. 2016; Valdivia-Mares et al. 2016; Zhang and Tong 2016). Therefore, their accurate identification is highly important. However, identification based on morphological characteristics, which are similar among species, is extremely difficult. The psbA–trnH region has been used as a DNA barcode to authenticate various plants with similar morphological traits (Chen et al. 2010; Gao et al. 2013; Ma et al. 2010; Yang et al. 2011; Yao et al. 2009). In the present work, we used the chloroplast psbA–trnH region for barcoding Physalis species for the first time.

Many researchers have reported that the intraspecific variation of the psbA–trnH region is very low while the interspecific divergence is very large in a wide variety of plants (Gao et al. 2013; Ma et al. 2010; Yang et al. 2011; Yao et al. 2009). Similar results were found in our study. We found that the genetic distance between most samples from the same species in this study is 0.000 (P. angulata, P. pubescens, P. alkekengi var. franchetii, and P. peruviana), which means that most of the Physalis species have a unique psbA–trnH sequence. This feature is thus useful for identifying different Physalis species and related species.

According to the results of the BLAST1 and nearest-distance methods, psbA–trnH provides high species discriminability (100.00% identification success rates for both methods). Interestingly, P. pubescens and P. pruinosa have highly similar morphological traits. Hence, they cannot be easily distinguished from each other by the traditional morphological approach or by use of nuclear ribosomal ITS2 sequences (Feng et al. 2016), but they can be accurately identified on the basis of their psbA–trnH regions. The results strongly suggest that the psbA–trnH region can be used as a complementary barcode (Chen et al. 2010).

The NJ and ML phylogenetic trees constructed with the psbA–trnH regions indicate that different samples from the same species can be grouped together. As reported in previous studies (Feng et al. 2016; Maggie 2005), P. alkekengi var. franchetii is distant from any other Physalis species and constitutes a separate cluster (II) with strong support (Fig. 3, BS = 100 for both NJ and ML methods). This result supports the inclusion of P. alkekengi var. franchetii in the Chinese Pharmacopoeia as the source of Herba Physalis (Chinese Pharmacopoeia Editorial Committee 2015). In addition, N. physalodes, a species often mistaken for Physalis species because of its similar morphological traits, is separated from all Physalis species by the psbA–trnH regions (Fig. 3, BS = 100 for both NJ and ML methods). Therefore, the psbA–trnH region is useful for species identification and also contributes to the phylogenetic analysis of Physalis and its closely related species.

Conclusions

Our study demonstrates that the chloroplast psbA–trnH intergenic region possesses high species discriminability and that it could be an ideal DNA barcode for the Physalis species tested in this study. However, more Physalis species should be collected in the future to verify whether the psbA–trnH region can be used to identify all species of Physalis. In addition, the NJ and ML tree analyses provided solid evidence that the psbA–trnH region has potential use in the phylogenetic analysis of Physalis and closely related plants.


Acknowledgments

This study was supported in part by the National Natural Science Foundation of China (31470407), the Zhejiang Provincial Public Welfare Technology Applied Research Foundation of China (2014C32090), the Hangzhou Scientific and Technological Program (20150932H04), and the Hangzhou Scientific and Technological Program (20150932H03).

References

CBOL Plant Working Group. 2009. A DNA barcode for land plants. Proc Natl Acad Sci U S A 106(31): 12794–12797. doi: 10.1073/pnas.0905845106.

Chen, S., Yao, H., Han, J., Liu, C., Song, J., Shi, L., Zhu, Y., Ma, X., Gao, T., Pang, X., Luo, K., Li, Y., Li, X., Jia, X., Lin, Y., and Leon, C. 2010. Validation of the ITS2 region as a novel DNA barcode for identifying medicinal plant species. PLoS One 5(1): e8613. doi: 10.1371/journal.pone.0008613.

Chinese Academy of Sciences. 1978. Flora of China. Science Press, Beijing, China 67: 50.

Chinese Pharmacopoeia Editorial Committee. 2015. Pharmacopoeia of the People's Republic of China. Chemical Industry Press, Beijing, China Part I: 360–361.

Clark, K., Karsch-Mizrachi, I., Lipman, D.J., Ostell, J., and Sayers, E.W. 2016. GenBank. Nucleic Acids Res 44(D1): D67–72. doi: 10.1093/nar/gkv1276.

Feng, S., Jiang, M., Shi, Y., Jiao, K., Shen, C., Lu, J., Ying, Q., and Wang, H. 2016. Application of the Ribosomal DNA ITS2 Region of Physalis (Solanaceae): DNA Barcoding and Phylogenetic Study. Front Plant Sci 7: 1047. doi: 10.3389/fpls.2016.01047.

Feng, S., Jiang, Y., Wang, S., Jiang, M., Chen, Z., Ying, Q., and Wang, H. 2015. Molecular Identification of Dendrobium Species (Orchidaceae) Based on the DNA Barcode ITS2 Region and Its Application for Phylogenetic Study. Int J Mol Sci 16(9): 21975–21988. doi: 10.3390/ijms160921975.

Feng, S., Zhao, H., Lu, J., Liu, J., Shen, B., and Wang, H. 2013. Preliminary genetic linkage maps of Chinese herb Dendrobium nobile and D. moniliforme. J Genet 92(2): 205–212. doi: 10.1007/s12041-013-0246-y.

Gao, T., Ma, X., and Zhu, X. 2013. Use of the psbA-trnH region to authenticate medicinal species of Fabaceae. Biol Pharm Bull 36(12): 1975–1979. doi: 10.1248/bpb.b13-00611.

Hebert, P.D.N., Penton, E.H., Burns, J.M., Janzen, D.H., and Hallwachs, W. 2004. Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. P Natl Acad Sci USA 101(41): 14812–14817. doi: 10.1073/pnas.0406166101.

Hong, J.M., Kwon, O.K., Shin, I.S., Song, H.H., Shin, N.R., Jeon, C.M., Oh, S.R., Han, S.B., and Ahn, K.S. 2015. Anti-inflammatory activities of Physalis alkekengi var. franchetii extract through the inhibition of MMP-9 and AP-1 activation. Immunobiology 220(1): 1–9. doi: 10.1016/j.imbio.2014.10.004.

Ji, L., Yuan, Y.L., Luo, L.P., Chen, Z., Ma, X.Q., Ma, Z.J., and Cheng, L. 2012. Physalins with anti-inflammatory activity are present in Physalis alkekengi var. franchetii and can function as Michael reaction acceptors. Steroids 77(5): 441–447. doi: 10.1016/j.steroids.2011.11.016.

Kress, W.J., and Erickson, D.L. 2007. A two-locus global DNA barcode for land plants: the coding rbcL gene complements the non-coding trnH-psbA spacer region. PLoS One 2(6): e508. doi: 10.1371/journal.pone.0000508.

Kress, W.J., Wurdack, K.J., Zimmer, E.A., Weigt, L.A., and Janzen, D.H. 2005. Use of DNA barcodes to identify flowering plants. P Natl Acad Sci USA 102(23): 8369–8374. doi: 10.1073/pnas.0503123102.

Ma, X.Y., Xie, C.X., Liu, C., Song, J.Y., Yao, H., Luo, K., Zhu, Y.J., Gao, T., Pang, X.H., Qian, J., and Chen, S.L. 2010. Species identification of medicinal pteridophytes by a DNA barcode marker, the chloroplast psbA-trnH intergenic region. Biol Pharm Bull 33(11): 1919–1924. doi: JST.JSTAGE/bpb/33.1919 [pii].

Maggie, W.P.S., M. 2005. Untangling Physalis (Solanaceae) from the Physaloids: A Two-Gene Phylogeny of the Physalinae. Systematic Botany 30(1): 216–230. doi: http://dx.doi.org/10.1600/0363644053661841.

Martinez, M. 1998. Revision of Physalis section Epeteiorhiza (Solanaceae). Ann Ins Biol Bot 69: 71–117. doi: 10.1002/ece3.102.

Meyer, C.P., and Paulay, G. 2005. DNA barcoding: error rates based on comprehensive sampling. PLoS Biology 3(12): e422. doi: 10.1371/journal.pbio.0030422.

Sang-Ngern, M., Youn, U.J., Park, E.J., Kondratyuk, T.P., Simmons, C.J., Wall, M.M., Ruf, M., Lorch, S.E., Leong, E., Pezzuto, J.M., and Chang, L.C. 2016. Withanolides derived from Physalis peruviana (Poha) with potential anti-inflammatory activity. Bioorg Med Chem Lett 26(12): 2755–2759. doi: 10.1016/j.bmcl.2016.04.077.

Sang, T., Crawford, D., and Stuessy, T. 1997. Chloroplast DNA phylogeny, reticulate evolution, and biogeography of Paeonia (Paeoniaceae). Am J Bot 84(8): 1120. doi: 10.2307/2446155.

Slabbinck, B., Dawyndt, P., Martens, M., De Vos, P., and De Baets, B. 2008. TaxonGap: a visualization tool for intra- and inter-species variation among individual biomarkers. Bioinformatics 24(6): 866–867. doi: 10.1093/bioinformatics/btn031.

Tamura, K., Stecher, G., Peterson, D., Filipski, A., and Kumar, S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol 30(12): 2725–2729. doi: 10.1093/molbev/mst197.

Thompson, J.D., Gibson, T., and Higgins, D.G. 2002. Multiple sequence alignment using ClustalW and ClustalX. Current Protocols in Bioinformatics: 2.3.1–2.3.22. doi: 10.1002/0471250953.bi0203s00.

Valdivia-Mares, L.E., Zaragoza, F.A.R., Gonzalez, J.J.S., and Vargas-Ponce, O. 2016. Phenology, agronomic and nutritional potential of three wild husk tomato species (Physalis, Solanaceae) from Mexico. Sci Hortic-Amsterdam 200: 83–94. doi: 10.1016/j.scienta.2016.01.005.

Vargas-Ponce, O., Perez-Alvarez, L.F., Zamora-Tavares, P., and Rodriguez, A. 2011. Assessing Genetic Diversity in Mexican Husk Tomato Species. Plant Mol Biol Rep 29(3): 733–738. doi: 10.1007/s11105-010-0258-1.

Wei, J.L., Hu, X.R., Yang, J.J., and Yang, W.C. 2012. Identification of Single-Copy Orthologous Genes between Physalis and Solanum lycopersicum and Analysis of Genetic Diversity in Physalis Using Molecular Markers. PLoS One 7(11). doi: 10.1371/journal.pone.0050164.

Yang, Y., Zhai, Y., Liu, T., Zhang, F., and Ji, Y. 2011. Detection of Valeriana jatamansi as an adulterant of medicinal Paris by length variation of chloroplast psbA-trnH region. Planta Med 77(1): 87–91. doi: 10.1055/s-0030-1250072.

Yang, Y.K., Xie, S.D., Xu, W.X., Nian, Y., Liu, X.L., Peng, X.R., Ding, Z.T., and Qiu, M.H. 2016. Six new physalins from Physalis alkekengi var. franchetii and their cytotoxicity and antibacterial activity. Fitoterapia 112: 144–152. doi: 10.1016/j.fitote.2016.05.010.

Yao, H., Song, J.Y., Ma, X.Y., Liu, C., Li, Y., Xu, H.X., Han, J.P., Duan, L.S., and Chen, S.L. 2009. Identification of Dendrobium species by a candidate DNA barcode sequence: the chloroplast psbA-trnH intergenic region. Planta Med 75(6): 667–669. doi: 10.1055/s-0029-1185385.

Zamora-Tavares, P., Vargas-Ponce, O., Sanchez-Martinez, J., and Cabrera-Toledo, D. 2015. Diversity and genetic structure of the husk tomato (Physalis philadelphica Lam.) in Western Mexico. Genet Resour Crop Ev 62(1): 141–153. doi: 10.1007/s10722-014-0163-9.

Zhang, C.R., Khan, W., Bakht, J., and Nair, M.G. 2016. New antiinflammatory sucrose esters in the natural sticky coating of tomatillo (Physalis philadelphica), an important culinary fruit. Food Chem 196: 726–732. doi: 10.1016/j.foodchem.2015.10.007.

Zhang, W.N., and Tong, W.Y. 2016. Chemical Constituents and Biological Activities of Plants from the Genus Physalis. Chemistry & Biodiversity 13(1): 48–65. doi: 10.1002/cbdv.201400435.


Table 1 Voucher information, GenBank accession numbers, sequence lengths and GC content of the psbA–trnH sequences for all samples examined in the study

Species Name | Locality information | Longitude (E) | Latitude (N) | Voucher No. | GenBank Accession No. | Sequence length (bp) | GC content (%)
Physalis angulata L. | Xiaoshan, Hangzhou, Zhejiang, China | 120°15′ | 30°11′ | PHZ0001 | KY263828 | 504 | 29.17
P. angulata | Lin’an, Hangzhou, Zhejiang, China | 119°43′ | 30°14′ | PHZ0002 | KY263829 | 504 | 29.17
P. angulata | Pujiang, Jinhua, Zhejiang, China | 121°30′ | 31°04′ | PHZ0003 | KY263830 | 504 | 29.17
P. angulata | Yueqing, Wenzhou, Zhejiang, China | 120°58′ | 28°06′ | PHZ0004 | KY263831 | 504 | 29.17
P. angulata | Luotian, Huanggang, Hubei, China | 115°23′ | 30°47′ | PHZ0005 | KY263832 | 504 | 29.17
P. angulata | Xiajin, Dezhou, Shandong, China | 116°00′ | 36°57′ | PHZ0006 | KY263833 | 504 | 29.17
P. angulata | Baohua, Honghe, Yunnan, China | 102°20′ | 23°17′ | PHZ0007 | KY263834 | 504 | 29.17
P. alkekengi var. franchetii (Mast.) Makino | Nong’an, Changchun, Jilin, China | 125°10′ | 44°25′ | PHZ4001 | KY263848 | 501 | 26.95
P. alkekengi var. franchetii | Faku, Shenyang, Liaoning, China | 123°24′ | 42°30′ | PHZ4002 | KY263850 | 501 | 26.95
P. alkekengi var. franchetii | Donggang, Dandong, Liaoning, China | 124°08′ | 39°51′ | PHZ4003 | KY263851 | 501 | 26.95
P. alkekengi var. franchetii | Donggang, Dandong, Liaoning, China | 124°08′ | 39°51′ | PHZ4004 | KY263852 | 501 | 26.95
P. alkekengi var. franchetii | Zoucheng, Jinan, Shandong, China | 116°59′ | 35°24′ | PHZ4005 | KY263853 | 501 | 26.95
P. alkekengi var. franchetii | Zoucheng, Jinan, Shandong, China | 116°59′ | 35°24′ | PHZ4006 | KY263854 | 501 | 26.95
P. gracilis Miers | GenBank | – | – | – | HG963529 | 509 | 28.09
P. heterophylla Nees | GenBank | – | – | – | HQ596787 | 492 | 29.47
P. minima L. | Tangshan, Hebei, China | 118°10′ | 39°37′ | PHZ3001 | KY263844 | 506 | 28.66
P. minima | Pingdingshan, Henan, China | 113°11′ | 33°46′ | PHZ3002 | KY263845 | 506 | 28.66
P. minima | Heze, Shandong, China | 115°28′ | 35°14′ | PHZ3003 | KY263846 | 506 | 28.66
P. minima | Lishui, Zhejiang, China | 119°55′ | 28°28′ | PHZ3004 | KY263847 | 506 | 28.85
P. minima | Lou’An, Anhui, China | 116°31′ | 31°44′ | PHZ3005 | KY263848 | 506 | 28.66
P. peruviana L. | GenBank | – | – | – | HQ216179 | 509 | 28.88
P. peruviana | GenBank | – | – | – | HQ216178 | 509 | 28.88
P. pruinosa (Waterf.) M. Martinez | GenBank | – | – | – | HG963524 | 509 | 28.49
P. pubescens L. | Faku, Shenyang, Liaoning, China | 123°24′ | 42°30′ | PHZ2001 | KY263835 | 525 | 28.38
P. pubescens | Guta, Jinzhou, Liaoning, China | 121°07′ | 41°06′ | PHZ2002 | KY263835 | 525 | 28.38
P. pubescens | Changhai, Dalian, Liaoning, China | 122°35′ | 39°16′ | PHZ2003 | KY263837 | 525 | 28.38
P. pubescens | Chaoyang, Zhaodong, Heilongjiang, China | 126°15′ | 45°52′ | PHZ2004 | KY263838 | 525 | 28.38
P. pubescens | Baiquan, Qiqiha’er, Heilongjiang, China | 126°05′ | 47°35′ | PHZ2005 | KY263839 | 525 | 28.38
P. pubescens | Aihui, Heihe, Heilongjiang, China | 127°29′ | 50°14′ | PHZ2006 | KY263840 | 525 | 28.38
P. pubescens | Nong’an, Changchun, Jilin, China | 125°10′ | 44°25′ | PHZ2007 | KY263841 | 525 | 28.38
P. pubescens | Nong’an, Changchun, Jilin, China | 125°10′ | 44°25′ | PHZ2008 | KY263842 | 525 | 28.38
P. pubescens | Tonghua, Changchun, Jilin, China | 125°45′ | 41°40′ | PHZ2009 | KY263843 | 525 | 28.38
Nicandra physalodes (L.) Gaertn. | Yiwu, Jinhua, Zhejiang, China | 120°04′ | 29°18′ | NHZ0001 | KY263855 | 552 | 27.54
N. physalodes | Jiujiang, Jiangxi, China | 115°59′ | 29°42′ | NHZ0002 | KY263856 | 552 | 27.72
N. physalodes | Changsha, Hunan, China | 112°56′ | 28°13′ | NHZ0003 | KY263857 | 553 | 27.12
N. physalodes | Xiaoshan, Hangzhou, Zhejiang, China | 120°15′ | 30°11′ | NHZ0004 | KY263858 | 553 | 27.12


Table 2 Analyses of interspecific divergence and intraspecific variation of the chloroplast psbA–trnH regions

Measurement | K2P value
All interspecific distance | 0.280 ± 0.027
Theta prime | 0.246 ± 0.024
Minimum interspecific distance | 0.243 ± 0.024
All intraspecific distance | 0.000 ± 0.000
Theta | 0.001 ± 0.001
Coalescent depth | 0.002 ± 0.001


Table 3 Results of Wilcoxon two-sample tests for the distribution of intra- vs. interspecific divergences

No. of interspecific distances | No. of intraspecific distances | Wilcoxon W | P value
533 | 89 | 4357 | 4.06e-50


Table 4 Comparison of efficiencies of authentication of chloroplast psbA–trnH region using different methods

Methods of identification | No. of samples | No. of species | Correct identification | Incorrect identification | Ambiguous identification
BLAST1 | 36 | 9 | 100% | 0 | 0
Nearest-distance | 36 | 9 | 100% | 0 | 0


Figure captions

Fig. 1 Plant morphology of P. angulata, P. alkekengi var. franchetii, P. minima, P. pubescens, and Nicandra physalodes.

Fig. 2 Relative distributions of interspecific divergence and intraspecific variation of the psbA–trnH region based on K2P genetic distance.

Fig. 3 Neighbor-joining (NJ) and maximum likelihood (ML) trees based on psbA–trnH sequences. Numbers above branches indicate bootstrap support (BS) values (BS ≥ 50). (A): NJ tree and (B): ML tree.
