
www.nature.com/scientificreports

OPEN

Forensic characterization of 15 autosomal STRs in four populations from , , and genetic relationships with neighboring populations

Xiaoni Zhan1, Atif Adnan1, Yuzhang Zhou1, Amjad Khan1, Kadirya Kasim1 & Dennis McNevin2

Received: 15 December 2017
Accepted: 5 March 2018
Published: xx xx xxxx

The Xinjiang Uyghur Autonomous Region of China (XUARC) harbors 47 ethnic groups including the Manchu (MCH: 0.11%), Mongol (MGL: 0.81%), Kyrgyz (KGZ: 0.86%) and Uzbek (UZK: 0.066%). To establish DNA databases for these populations, allele frequency distributions for 15 autosomal short tandem repeat (STR) loci were determined using the AmpFlSTR Identifiler PCR amplification kit. There was no evidence of departures from Hardy–Weinberg equilibrium (HWE) in any of the four populations and minimal departure from linkage equilibrium (LE) for a very small number of pairwise combinations of loci. The probabilities of identity for the different populations ranged from 1 in 1.51 × 10^17 (MCH) to 1 in 9.94 × 10^18 (MGL), the combined powers of discrimination ranged from 0.99999999999999999824 (UZK) to 0.9999999999999999848 (MCH) and the combined probabilities of paternal exclusion ranged from 0.9999979323 (UZK) to 0.9999994839 (MCH). Genetic distances, a phylogenetic tree and principal component analysis (PCA) revealed that the MCH, KGZ and UZK are genetically closer to the Han population of and the Mongol population of Mongolia while the MGL are closer to Han, Japanese, Korean, Malaysian, Han and living in China.

Xinjiang is a multi-ethnic region and has played an important role in connecting eastern Eurasia and western Eurasia. It was crossed by the famous Silk Road, which linked trade between East Asia, Central Asia, and Europe1. Many ethnic groups, including the Manchu (MCH), Mongols (MGL), Kirgiz (KGZ) and Uzbek (UZK) have lived there for hundreds of years2. The Manchu founded two Chinese Dynasties on the country’s inner plains: the Jin Dynasty, founded by the Nvzhen people, and the , founded by Huang Taijin in 1635. The history of the Manchu can be traced back 6000–7000 years (6–7 kya). Although the Manchu can be found all over China3, they represent only 0.11% of the Xinjiang population4. The Mongols came from the area around the east bank of the ancient Wangjian River (present-day Eerguna River) in . “Mengwu” is the earliest Chinese name for “Mongolia”. It first appeared in the Tang dynasty (618–907). “Mongol” was initially the name for one of the Mongolian tribes. At the beginning of the 13th century, the Mongolian tribe headed by Genghis Khan unified the other tribes in the region and gradually formed a new ethnic community. Therefore, “Mongolia” became the name for a nationality instead of a tribe5. As well as Mongolia, Mongols currently live mainly in the Inner Mongolia Autonomous Region and some prefectures of Xinjiang Uygur Autonomous Region like Bayingolin (South East) and Bortala (North West). They represent 0.81% of the Xinjiang population4. The Kyrgyz (or Kirgiz) live mainly in the southwest of Xinjiang, especially in the Kezhilesu Kyrgyz autonomous state. They have a long history and have been known in China by many names. In the Han dynasty, they were called “Gekun” or “Jiankun”. Later they were called “Qigu” in the Jin dynasty; “Jiankun”, “Jikasi” or “Qiliqisi”

1Department of Forensic Genetics, School of Forensic Medicine, China Medical University, , 110122, P.R. China. 2National Centre for Forensic Studies, Faculty of Science & Technology, University of Canberra, Canberra, . Xiaoni Zhan and Atif Adnan contributed equally to this work. Correspondence and requests for materials should be addressed to A.A. (email: [email protected])

Scientific Reports | (2018) 8:4673 | DOI:10.1038/s41598-018-22975-6

in the Tang and Song dynasty; and “Jirjisi” or “Qirjisi” in the Yuan and Ming periods. All these names were based on “Kyrgyz”, which has had different Chinese translations at different times. The etymology of “Kyrgyz” is thought to be “40 tribes” or “40 girls”2. While the Kyrgyz are primarily located in Kyrgyzstan, they represent only 0.86% of the Xinjiang population4. The name “Uzbek” originated with Uzbek Khan, a local ruler in the Mongol Empire in the 14th century. The Uzbeks are an ancient Iranian people that intermingled with nomadic Mongol and Turkic tribes that invaded Central Asia between the 11th and 15th centuries. The Uzbeks that live in China live mostly in Xinjiang near the border with and the former Soviet Central Asian republics. Uzbeks have been trading in western China for centuries. In the 16th century, they began to settle in cities in Xinjiang. Most Uzbeks in China still live in the cities and are engaged in trading or business1. They represent 0.066% of the Xinjiang population4. Short tandem repeat (STR) loci, also referred to as microsatellites or simple sequence repeats (SSRs), are DNA sequences that contain a repeat motif of 2–6 bp and are characterized by a high level of relatively stable polymorphisms, a dense, uniform chromosomal distribution as well as short sequence lengths, which facilitate detection and analysis by PCR and sequencing6,7. All these features render STRs powerful genetic markers for inter-population studies8 and for the reconstruction of recent human evolutionary history9. In view of their high level of variability, autosomal STRs have been the most common genetic markers used in forensic applications, including personal identification and paternity testing10. Most forensic laboratories use commercially available kits for multiple STR genotyping11. There have been previous studies of STR genotypes in the Uighur12 and Kazak13 populations of Xinjiang but the Manchu, Mongol, Kyrgyz and Uzbek populations remain uncharacterised. 
In the present study, the 15 autosomal STRs in the AmpFLSTR Identifiler kit (Applied Biosystems, Foster City, CA, USA) were examined in the MCH, MGL, KGZ and UZK minorities of the Xinjiang Uyghur Autonomous Region (XUAR).

Results and Discussion

Forensic parameters. The distribution of allele frequencies and forensic statistical parameters in the four Xinjiang ethnic minorities are available as Supplementary Data (Supplementary Tables 1, 2, 3 and 4). Totals of 152, 165, 153 and 168 unique alleles were found in the Manchu, Mongol, Kyrgyz and Uzbek populations, respectively. The combined powers of discrimination (CPDs) for the 15 STR loci were 0.999 999 999 999 999 984 833, 0.999 999 999 999 999 990 057, 0.999 999 999 999 999 996 333 and 0.999 999 999 999 999 998 244, respectively. The combined powers of exclusion (CPEs) for the 15 STR loci were 0.999 999 416, 0.999 999 483, 0.999 997 932 and 0.999 998 973, respectively. The probabilities of identity for the different populations were 1/1.51 × 10^17, 1/1.75 × 10^18, 1/3.66 × 10^18 and 1/9.94 × 10^18, respectively. D2S1338 had the highest heterozygosities and powers of discrimination (PDs) in all four populations. FGA was the most polymorphic locus in the Mongol (20 unique alleles) and Uzbek (19 unique alleles) populations. D18S51 was most polymorphic in the Manchu population (18 unique alleles) while D18S51, D21S11 and FGA all had 15 unique alleles in the Kyrgyz population. Informativeness can be quantitatively measured by the polymorphism information content (PIC). Theoretically, PIC values can range from 0 to 1. At a PIC of 0, the marker has only one allele. At a PIC of 1, the marker would have an infinite number of alleles. A PIC value of greater than 0.7 is considered to be highly informative. Clearly, markers with greater numbers of alleles tend to have higher PIC values and thus are more informative14. The Manchu and Mongol populations have four loci with PIC < 0.7 while the Uzbek and Kyrgyz populations have only two loci with PIC < 0.7. Therefore, most loci exhibited high informativeness, showing the potential of the Identifiler panel for differentiation of individuals and for paternity testing in the four ethnic minority populations in the Xinjiang Uyghur Autonomous Region of China.
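The per-locus parameters discussed above can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions (expected heterozygosity as one minus the sum of squared allele frequencies, PIC after Botstein and colleagues, and a match probability derived from Hardy–Weinberg genotype frequencies); PowerStats works from observed genotype counts, so its values may differ slightly. All function names and the toy allele frequencies are hypothetical and are not taken from the Supplementary Tables.

```python
from itertools import combinations
from math import prod


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for one locus."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 for one locus."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


def match_probability(freqs):
    """Sum of squared genotype frequencies, assuming HWE (assumption of this sketch)."""
    homozygotes = sum((p * p) ** 2 for p in freqs)
    heterozygotes = sum((2.0 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def combined_match_probability(per_locus_mp):
    """Product of per-locus match probabilities; the combined PD is 1 minus this."""
    return prod(per_locus_mp)


# Hypothetical locus with four equally frequent alleles.
toy_locus = [0.25, 0.25, 0.25, 0.25]
toy_pic = pic(toy_locus)
toy_he = expected_heterozygosity(toy_locus)
```

With four equally frequent alleles the PIC only just exceeds the 0.7 threshold mentioned above, which shows why loci with many alleles are favoured. The sketch returns the combined match probability rather than the combined PD because combined PD values as close to 1 as those reported here cannot be distinguished from 1 in double-precision arithmetic.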

Hardy-Weinberg equilibrium (HWE). All of the loci were in Hardy-Weinberg Equilibrium (HWE) in the Kyrgyz population (p > 0.05), while one STR locus was out of HWE for Manchu (D7S820), two loci for Mongol (CSF1PO, D19S433) and four loci for Uzbek (D18S51, D2S1338, D7S820 and FGA). However, when a sequential Bonferroni correction15 was applied to mitigate the so-called “multiple comparison problem” (where, at a significance threshold of 0.05, 5% of tests are expected to appear significant by chance), no loci in any of the four populations were found to be out of HWE (Supplementary Table 5).
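The effect of the correction can be sketched in a few lines of Python, assuming that "sequential Bonferroni" refers to Holm's step-down procedure; the exact procedure applied in the study is not described in the text. The function name and the p-values below are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm step-down test: return a list of booleans, True where H0 is rejected."""
    m = len(p_values)
    # Visit the tests from the smallest to the largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The threshold relaxes from alpha/m to alpha/1 as tests are rejected.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test is retained, all larger p-values are retained too
    return reject


# Hypothetical example: 15 loci, one of which has an uncorrected p-value of 0.02.
toy_p_values = [0.02] + [0.5] * 14
toy_decisions = holm_bonferroni(toy_p_values)
```

In this toy case the smallest p-value (0.02) is compared with 0.05/15 ≈ 0.0033, so a locus that looks significant on its own is no longer flagged after correction, which mirrors the pattern reported for the four populations.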

Linkage equilibrium (LE). Linkage disequilibrium (LD) can be caused by association between adjacent alleles co-inherited from single, ancestral chromosomes but may also be a result of selection, random genetic drift, the rate of mutation or recombination, nonrandom mating, founder effects, sampling effects, recent admixture, and population substructure16. Exact tests for linkage equilibrium (LE) showed that the p-values of 50 pairwise combinations of STR loci (11 each in Mongol and Manchu, 13 in Kyrgyz and 15 in Uzbek) were below 0.05, thus indicating LD (Supplementary Tables 6, 7, 8 and 9). After a sequential Bonferroni correction15, only five pairs were out of LE (Supplementary Table 10). These were TH01/D8S1179 and D18S51/D13S317 in the Manchu population, vWA/D21S11 and D2S1338/D19S433 in the Uzbek population and FGA/D13S317 in the Kyrgyz population. All pairwise combinations of loci were in LE in the Mongol population. Therefore, of the 105 pairwise LE tests in each population, a maximum of two were out of LE in any population. Application of the “product rule” for calculation of random match probabilities across multiple loci is fully justified in the Mongol population and is unlikely to produce significant errors in the other three populations.
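As an illustration of the product rule and of where the figure of 105 tests comes from, the following Python sketch multiplies Hardy–Weinberg genotype frequencies across loci. It assumes independence between loci (that is, LE) and HWE within each locus; the function names and allele frequencies are hypothetical.

```python
from math import comb, prod


def genotype_frequency(p, q=None):
    """HWE genotype frequency: p^2 for a homozygote, 2pq for a heterozygote."""
    return p * p if q is None else 2.0 * p * q


def random_match_probability(genotype_freqs):
    """Product rule: multiply per-locus genotype frequencies (assumes LE between loci)."""
    return prod(genotype_freqs)


# Number of distinct locus pairs tested for LE among 15 loci.
n_locus_pairs = comb(15, 2)

# Hypothetical two-locus profile: homozygous at locus 1, heterozygous at locus 2.
toy_profile = [genotype_frequency(0.2), genotype_frequency(0.1, 0.3)]
toy_rmp = random_match_probability(toy_profile)
```

If two loci were in strong LD, multiplying their genotype frequencies would misstate the joint frequency, which is why the small number of locus pairs out of LE matters for the validity of the product rule.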

Cluster analysis with STRUCTURE. STRUCTURE analysis of the four populations from Xinjiang provided no evidence of population structure for any repetition at any value of K. That is, each repetition yielded ancestry proportions for each individual that were approximately equally distributed between each ancestral cluster and were no different between the four populations. STRs for forensic identity testing, such as those included in the Identifiler panel, are selected for high heterozygosity and minimal allele frequency differences


Figure 1. Neighbour-joining tree of the Manchu, Mongol, Kyrgyz and Uzbek populations from Xinjiang in relation to other regional populations.

between populations and so they generally make poor ancestry informative markers (AIMs), which require large allele frequency differences between populations. Further, pairwise FST values between the four populations were generally < 0.03 except for Mongols at D5S818, D13S317, D16S539, D18S51, D19S433, FGA, TPOX and vWA. While we may have expected Mongols to exhibit some differentiation from the other three populations, it is not surprising that Manchus, Kyrgyz and Uzbeks are not differentiated by the STRs in the Identifiler panel using STRUCTURE.
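To make the scale of FST < 0.03 concrete, the sketch below computes a simple heterozygosity-based FST, (H_T − H_S)/H_T, for one locus in two populations. This is a simplified estimator that assumes equal sample sizes and ignores sampling error; it is not the AMOVA-based estimator implemented in Arlequin, and the function names and frequencies are hypothetical.

```python
def heterozygosity(freqs):
    """Expected heterozygosity from a dict mapping allele -> frequency."""
    return 1.0 - sum(p * p for p in freqs.values())


def fst_two_populations(freqs_a, freqs_b):
    """Simplified FST = (H_T - H_S) / H_T for one locus and two equally weighted populations."""
    alleles = set(freqs_a) | set(freqs_b)
    # Allele frequencies in the pooled (total) population.
    pooled = {a: 0.5 * (freqs_a.get(a, 0.0) + freqs_b.get(a, 0.0)) for a in alleles}
    h_s = 0.5 * (heterozygosity(freqs_a) + heterozygosity(freqs_b))  # mean within-population
    h_t = heterozygosity(pooled)                                     # total
    return 0.0 if h_t == 0.0 else (h_t - h_s) / h_t


# Hypothetical allele frequencies at a single locus.
pop_a = {"10": 0.5, "11": 0.5}
pop_b = {"10": 0.9, "11": 0.1}
toy_fst = fst_two_populations(pop_a, pop_b)
```

Under this simplified estimator, even the fairly large frequency shift in the toy example gives an FST of about 0.19, so values below 0.03 correspond to allele frequency profiles that are very similar, consistent with the lack of clustering in STRUCTURE.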

Comparison with other populations. An AMOVA was utilized for comparison between the four populations in this study and previously published population studies employing the same 15 STR loci. Genetic distances (FST) and associated p-values for each locus are presented in Supplementary Tables 11, 12, 13 and 14. The largest genetic distances in the Manchu, Mongol, Kyrgyz and Uzbek populations were observed at vWA, D19S433, FGA and TPOX, respectively, while the lowest distances were observed at D8S1179 in the Manchu population and at CSF1PO in the Mongol, Kyrgyz and Uzbek populations. Genetic distances between populations based on Nei’s formula17 are documented in Supplementary Table 15. These were used to construct a neighbor-joining tree of the four populations from Xinjiang and the other populations (Fig. 1). The Manchu and Kyrgyz are most closely related and they share a most recent common ancestor with the Uzbeks and a second most recent common ancestor with Mongols (from Mongolia) and ethnic Han from Liaoning province. Mongols from Xinjiang were most closely related to Russians in China, Hui from Qinghai, Manchu from Liaoning and Salar from Qinghai. PCA was applied to normalized allele frequencies at the 15 STR loci in the Manchu, Mongol, Kyrgyz and Uzbek populations of Xinjiang (Fig. 2A), in other populations from Xinjiang (Uyghur and Kazakh: Fig. 2B), in other populations from China (Fig. 2C) and in other populations from neighboring countries (Fig. 2D). In


Figure 2. (A) Principal component analysis (PCA) based on the 15 autosomal STR loci of the four populations from Xinjiang in this study. (B) Principal component analysis (PCA) based on the 15 autosomal STR loci of the four populations from Xinjiang in this study and two other Xinjiang populations from previous studies (Uyghur and Kazakhs). (C) Principal component analysis (PCA) based on the 15 autosomal STR loci of the four populations from Xinjiang in this study and other Chinese populations from previous studies. (D) Principal component analysis (PCA) based on the 15 autosomal STR loci of the four populations from Xinjiang in this study and other populations from neighboring countries.

Fig. 2A, the Manchu and Kyrgyz are clustered in the lower right quadrant closer to each other than to Uzbeks. Mongols appear in the upper right quadrant, away from Manchu, Kyrgyz and Uzbeks. These proximities are consistent with the phylogenetic relationships observed in the neighbor-joining tree (Fig. 1). In Fig. 2B, the Manchu, Kyrgyz, Uzbeks and Kazakhs are clustered in the lower right quadrant while the Uyghurs and Mongols are clustered in the upper right. In Fig. 2C, the Manchu, Kyrgyz, Uzbek, Kazakh and Mongols of East Mongolia (China) are clustered in the lower right quadrant while the Miao, Dong, Bouyei, Mongols from Xinjiang, Hui from Qinghai, Dongxiang from Qinghai, Salar from Qinghai, Russians in China, Han and Manchu of Liaoning are clustered in the upper right. Finally, in Fig. 2D, the Manchu, Kyrgyz, Uzbek and Kazakhs cluster with the Mongols from Mongolia, away from other populations. At all resolutions, PCA supports the genetic proximity of Manchu, Kyrgyz, Uzbek and Kazakhs in Xinjiang while Mongols in Xinjiang display greater genetic distance from these populations as well as from other Mongols in Mongolia and China. This interpretation is also consistent with Fig. 1.
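The idea behind the PCA plots can be sketched as follows: each population is a row of allele frequencies, the columns are centered, and populations are projected onto the direction of greatest variance. The Python sketch below finds only the first principal component by power iteration on the covariance matrix; it is an illustrative stand-in for the MVSP analysis, not a reproduction of it, and the function name and toy matrix are hypothetical.

```python
from math import sqrt


def first_principal_component(rows, iterations=200):
    """Return (axis, scores) for PC1 of a populations x allele-frequency matrix."""
    n, m = len(rows), len(rows[0])
    # Center each allele-frequency column on its mean across populations.
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    x = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Sample covariance matrix between allele-frequency columns.
    cov = [[sum(x[k][i] * x[k][j] for k in range(n)) / (n - 1) for j in range(m)]
           for i in range(m)]
    # Power iteration from a deterministic, non-uniform unit start vector.
    v = [1.0 / (j + 1) for j in range(m)]
    start_norm = sqrt(sum(c * c for c in v))
    v = [c / start_norm for c in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = sqrt(sum(c * c for c in w))
        if norm == 0.0:  # no variance along the current vector; stop early
            break
        v = [c / norm for c in w]
    # Project each centered population onto the leading axis.
    scores = [sum(x[k][j] * v[j] for j in range(m)) for k in range(n)]
    return v, scores


# Hypothetical frequencies of two alleles at one locus in three populations.
toy_matrix = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]
toy_axis, toy_scores = first_principal_component(toy_matrix)
```

The sign of a principal axis is arbitrary, so only the relative positions of the populations along it carry meaning, which is also how the quadrants in Fig. 2 should be read. Power iteration can fail if the start vector happens to be orthogonal to the leading axis; a full analysis would use an eigen-decomposition routine from a numerical library instead.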

Concluding remarks. In this study, forensic characterization of 15 autosomal STR loci in the Manchu, Mongol, Kyrgyz and Uzbek minority populations of Xinjiang was performed. The AmpFlSTR Identifiler panel was found to be appropriate for forensic identity testing and paternity testing in these populations with a high power of discrimination, no significant departures from HWE at any loci and minimal departure from LE for a very small number of pairwise combinations of loci. Population genetic analyses indicated that the Manchu, Kyrgyz and Uzbek were closely related while the Mongols of Xinjiang had a closer genetic relationship with Russians in China, Hui from Qinghai, Manchu from Liaoning and Salar from Qinghai. Surprisingly, Mongols from Mongolia and China were more closely related to Manchu, Kyrgyz and Uzbek than to Mongols in Xinjiang, perhaps suggesting an ancient divergence when Mongols originally migrated to present day Xinjiang.

Materials and Methods

Samples and DNA extraction. Blood samples were collected from a total of 1842 unrelated healthy individuals from the XUAR (1157 males, 685 females), including 306 Manchu (208 males, 98 females), 507 Mongols (275 males, 232 females), 550 Kyrgyz (329 males, 221 females) and 479 Uzbek (345 males, 134 females). All participants gave their informed consent, either in writing or orally with a thumb print (in case they could not write), after the study aims and procedures were carefully explained to them in their own language. The study was approved by the ethical review board of the China Medical University, Shenyang, Liaoning Province, People’s


Republic of China and in accordance with the standards of the Declaration of Helsinki. All blood samples were stored at −20 °C before DNA extraction. Genomic DNA was extracted from blood stains using the TIANamp Blood Spots DNA Kit (TIANGEN BIOTECH CO., LTD) according to the manufacturer’s instructions and the concentration of DNA was quantified by absorption at 260 nm using an ultraviolet spectrophotometer (UV-2800AH, UNICO).

PCR amplification. PCR co-amplification of fifteen autosomal STR loci (D18S51, D21S11, TH01, D3S1358, FGA, TPOX, D8S1179, vWA, CSF1PO, D16S539, D7S820, D13S317, D2S1338, D19S433, and D5S818) was performed in a fluorescence-based multiplex reaction using the AmpFLSTR Identifiler kit (Applied Biosystems, Foster City, CA, USA). From 1 to 2 ng of the target DNA was amplified according to the manufacturer’s recommended protocol. Thermal cycling was conducted under the following conditions: 95 °C for 11 min; 28 cycles of 94 °C for 60 s, 59 °C for 60 s, 72 °C for 60 s; and a final extension of 60 °C for 45 min. All loci were amplified in a GeneAmp PCR System 9700 thermal cycler (Applied Biosystems, Foster City, CA).
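The cycling conditions above can be written down as a small data structure, which makes the total programmed hold time easy to check. This Python sketch simply encodes the temperatures and durations stated in the text; the variable and function names are hypothetical, and ramping time between temperatures is not included.

```python
# (temperature in °C, hold time in seconds), as stated in the protocol above.
INITIAL_DENATURATION = (95, 11 * 60)
CYCLE_STEPS = [(94, 60), (59, 60), (72, 60)]  # denaturation, annealing, extension
N_CYCLES = 28
FINAL_EXTENSION = (60, 45 * 60)


def total_hold_seconds():
    """Sum of all programmed hold times, excluding ramping between temperatures."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]


total_minutes = total_hold_seconds() / 60
```

The programmed holds add up to 11 + 28 × 3 + 45 = 140 minutes, so a full run takes somewhat longer than that once ramp times are added.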

Genotyping. Amplified products were analyzed with reference to ABI GeneScan 500 LIZ internal size standard (Life Technologies) and AmpFlSTR Identifiler Allelic Ladder using an ABI 3130xl genetic analyzer (Applied Biosystems, Foster City, CA) according to the AmpFLSTR Identifiler standard protocol. Analysis of data obtained from the genetic analyzer was performed using GeneMapper software v3.5.

Quality control. Negative (autoclaved deionized H2O) and positive (AmpFlSTR Control DNA 9947A) controls were employed for DNA extraction, DNA quantitation, PCR amplification and capillary electrophoresis. All negative controls displayed an absence of amplified product while positive controls were consistent with known genotypes.

Statistical analysis. Allelic frequencies and important forensic parameters, such as match probability (MP), power of discrimination (PD), power of exclusion (PE) and polymorphism information content (PIC) were calculated using PowerStats V1.218. Observed heterozygosity (Ho), expected heterozygosity (He), pairwise FST and exact tests for Hardy–Weinberg equilibrium (HWE) and linkage equilibrium (LE) between pairwise combinations of loci were performed using Arlequin v3.5 based on a likelihood ratio test for unknown gametic phase19. Empirical distributions were obtained from 10,000 permutations. Principal components analysis was performed with MVSP 3.1 (http://www.kovcomp.com) based on allelic frequencies of the 15 autosomal STR loci. Nei’s standard genetic distances between currently studied and previously published populations (Russian20, Saraki Pakistan21, Korean22, Punjabi23, Indian24, Morocco25, Eastern Turkey26, Hong Kong27, Japanese28, Interior Sindh (unpublished), Hungarian29, South Iran30, Azerbaijan31, Turkish Cypriot32, Afghanistan33, Bangladesh34, Malaysia35, Kadazan Malaysia36, Sindh Pakistan37, Iraq38, Pashtuns Afghanistan39, Tajik Afghanistan39, Uzbek Afghanistan39, Turkmen Afghanistan39, Mongols of Mongolia40, Hazara Afghanistan39, Kuala Lumpur Malaysia41, Miao42, East Mongolia of China43, Dong44, Bouyei45, Han Liaoning46, Manchu Liaoning47, Hui Qinghai48, Uyghur China49, Russian in China50, Dongxiang Qinghai51, Salar Qinghai51, Kazakh China13) were generated using the Phylip 3.69 package52 and visualized with Mega7 software53.
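Nei's standard genetic distance, used above as input for the neighbor-joining tree, can be sketched in Python as D = −ln(J_XY / √(J_X · J_Y)), where J_X and J_Y are the summed squared allele frequencies in each population and J_XY is the summed product of frequencies across populations. This is the textbook form without small-sample corrections, and it is not a reproduction of the PHYLIP implementation; the function name and toy frequencies are hypothetical.

```python
from math import inf, log, sqrt


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard distance from dicts mapping locus -> {allele: frequency}."""
    j_x = j_y = j_xy = 0.0
    # Only loci typed in both populations contribute.
    for locus in set(pop_x) & set(pop_y):
        fx, fy = pop_x[locus], pop_y[locus]
        for allele in set(fx) | set(fy):
            px, py = fx.get(allele, 0.0), fy.get(allele, 0.0)
            j_x += px * px
            j_y += py * py
            j_xy += px * py
    if j_xy == 0.0:
        return inf  # no shared alleles at any locus
    # Dividing each sum by the number of loci would cancel in this ratio.
    return -log(j_xy / sqrt(j_x * j_y))


# Hypothetical single-locus allele frequencies in two populations.
pop_x = {"TH01": {"6": 0.5, "9": 0.5}}
pop_y = {"TH01": {"6": 0.9, "9": 0.1}}
toy_distance = nei_standard_distance(pop_x, pop_y)
```

Identical allele frequency profiles give a distance of zero, and the distance grows without bound as the populations share fewer alleles, which is the property the neighbor-joining tree relies on.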

Cluster analysis using STRUCTURE. STRUCTURE (version 2.2)54 was used to determine if there was any population structure within and between the Manchu, Mongol, Kyrgyz and Uzbek populations from Xinjiang. Raw genotypes are included in spreadsheet format in Supplementary File 1. The Admixture model with correlated allele frequencies was employed without prior population information (USEPOPINFO = 0). The number of inferred clusters (K) was varied from 2 to 10 with 10 repetitions of each K value and a total of 10,000 burn-in steps and 10,000 Markov chain Monte Carlo (MCMC) simulations for each repetition.

References

1. Central Asia and China: The Oxford History of Islam. In The Oxford history of Islam (ed. Esposito, J. L.) 433 (Oxford University Press, 1999).
2. Millward, J. A. Eurasian crossroads: a history of Xinjiang. (Columbia University Press, 2007).
3. Sun, L. Writing an empire: an analysis of the manchu origin myth and the dynamics of manchu identity. J. Chin. Hist. 1, 93–109 (2017).
4. China et al. 2000 nian ren kou pu cha Zhong guo min zu ren kou zi liao = Tabulation on nationalities of 2000 population census of China. (Min zu chu ban she, 2003).
5. Weatherford, J. M. Genghis Khan and the making of the modern world (Three Rivers Press, 2012).
6. Hammond, H. A., Jin, L., Zhong, Y., Caskey, C. T. & Chakraborty, R. Evaluation of 13 short tandem repeat loci for use in personal identification applications. Am. J. Hum. Genet. 55, 175–189 (1994).
7. Sánchez-Diz, P. et al. Population data on 15 autosomal STRs in a sample from Colombia. Forensic Sci. Int. Genet. 3, e81–82 (2009).
8. Bowcock, A. M. et al. High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368, 455–457 (1994).
9. Rowold, D. J. & Herrera, R. J. Inferring recent human phylogenies using forensic STR technology. Forensic Sci. Int. 133, 260–265 (2003).
10. Meng, H.-T. et al. Genetic diversities of 20 novel autosomal STRs in Chinese Xibe ethnic group and its genetic relationships with neighboring populations. Gene 557, 222–228 (2015).
11. Butler, J. M. Short tandem repeat typing technologies used in human identity testing. BioTechniques 43, ii–v (2007).
12. Yuan, L. et al. Genetics analysis of 38 STR loci in Uygur population from Southern Xinjiang of China. Int. J. Legal Med. 130, 687–688 (2016).
13. Zhang, H. et al. Population genetic analysis of the GlobalFiler STR loci in 748 individuals from the Kazakh population of Xinjiang in . Int. J. Legal Med. 130, 1187–1189 (2016).
14. Hildebrand, C. E. & Torney, D. Informativeness of Polymorphic DNA Marker. Los Alamos Sci. 20, 100–102.
15. Benjamini, Y. & Hochberg, Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society 57, 289–300 (1995).
16. Chakravarti, A. Population genetics: making sense out of sequence. Nat. Genet. 21, 56–60 (1999).


17. Takezaki, N. & Nei, M. Genetic distances and reconstruction of phylogenetic trees from microsatellite DNA. Genetics 144, 389–399 (1996).
18. Tereba, A. Powerstats version 1.2, Tools for Analysis of Population Statistics. Promega corporation website, http://www.promega.com/geneticidtools/powerstats. Profiles DNA 14–16 (1999).
19. Excoffier, L., Laval, G. & Schneider, S. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol. Bioinforma. Online 1, 47–50 (2005).
20. Stepanov, V. A. et al. Genetic variability of 15 autosomal STR loci in Russian populations. Leg. Med. 12, 256–258 (2010).
21. Shafique, M. et al. Genetic diversity of 15 autosomal STR loci in the population of Southern Punjab Pakistan. Forensic Sci. Int. Genet. 19, e1–e2 (2015).
22. Yoo, S. Y. et al. A large population genetic study of 15 autosomal short tandem repeat loci for establishment of Korean DNA Profile Database. Mol. Cells 32, 15–19 (2011).
23. Shan, M. A. et al. Genetic distribution of 15 autosomal STR markers in the Punjabi population of Pakistan. Int. J. Legal Med. 130, 1487–1488 (2016).
24. Shrivastava, P., Jain, T. & Trivedi, V. B. Genetic polymorphism study at 15 autosomal locus in central Indian population. SpringerPlus 4 (2015).
25. Bentayebi, K., Abada, F., Ihzmad, H. & Amzazi, S. Genetic ancestry of a Moroccan population as inferred from autosomal STRs. Meta Gene 2, 427–438 (2014).
26. Tokdemir, M., Tunçez, F. T. & Vicdanli, N. H. Population Genetic data for 15 Autosomal STR markers in Eastern Turkey. Gene 586, 36–40 (2016).
27. Law, M. et al. STR data for the PowerPlex 16 loci for the Chinese population in Hong Kong. Forensic Sci. Int. 129, 64–67 (2002).
28. Tie, J., Wang, X. & Oxida, S. Genetic Polymorphisms of 15 STR Loci in a Japanese Population. J. Forensic Sci. 51, 188–189 (2006).
29. Demeter, S. J., Kelemen, B., Székely, G. & Popescu, O. Genetic Variation at 15 Polymorphic, Autosomal, Short Tandem Repeat Loci of Two Hungarian Populations in Transylvania, Romania. Croat. Med. J. 51, 515–523 (2010).
30. Hedjazi, A., Nikbakht, A., Hosseini, M., Hoseinzadeh, A. & Hosseini, S. M. V. Allele frequencies for 15 autosomal STR loci in Fars province population, southwest of Iran. Leg. Med. 15, 226–228 (2013).
31. Nasibov, E. et al. Allele frequencies of 15 STR loci in Azerbaijan population. Forensic Sci. Int. Genet. 7, e99–e100 (2013).
32. Gurkan, C., Demirdov, D. K., Yamaci, R. F. & Sevay, H. Population genetic data for 15 autosomal STR markers in Turkish Cypriots from Cyprus. Forensic Sci. Int. Genet. 14, e1–e3 (2015).
33. Berti, A. et al. Autosomal STR frequencies in Afghanistan population. J. Forensic Sci. 50, 1494–1496 (2005).
34. Hossain, T. et al. Population genetic data on 15 autosomal STR loci in Bangladeshi population. Forensic Sci. Int. Genet. 13, e4–e5 (2014).
35. Nakamura, Y., Samejima, M., Minaguchi, K. & Nambiar, P. Population Genetics of Identifiler System in Malaysia. Bull. Tokyo Dent. Coll. 57, 233–239 (2016).
36. Kee, B. P., Lian, L. H., Lee, P. C., Lai, T. X. & Chua, K. H. Genetic data for 15 STR loci in a Kadazan-Dusun population from East Malaysia. Genet. Mol. Res. 10, 739–743 (2011).
37. Perveen, R., Shahid, A. A., Shafique, M., Shahzad, M. & Husnain, T. Genetic variations of 15 autosomal and 17 Y-STR markers in Sindhi population of Pakistan. Int. J. Legal Med. https://doi.org/10.1007/s00414-017-1544-3 (2017).
38. Barni, F. et al. Allele frequencies of 15 autosomal STR loci in the Iraq population with comparisons to other populations from the middle-eastern region. Forensic Sci. Int. 167, 87–92 (2007).
39. Di Cristofaro, J., Buhler, S., Temori, S. A. & Chiaroni, J. Genetic data of 15 STR loci in five populations from Afghanistan. Forensic Sci. Int. Genet. 6, e44–e45 (2012).
40. Choi, E.-J. et al. Forensic and population genetic analyses of the GlobalFiler STR loci in the Mongolian population. Genes Genomics 39, 423–431 (2017).
41. Maruyama, S., Minaguchi, K., Takezaki, N. & Nambiar, P. Population data on 15 STR loci using AmpFlSTR Identifiler kit in a Malay population living in and around Kuala Lumpur, Malaysia. Leg. Med. 10, 160–162 (2008).
42. Zhang, L., Zhao, Y., Guo, F., Liu, Y. & Wang, B. Population data for 15 autosomal STR loci in the Miao ethnic minority from Province. . Forensic Sci. Int. Genet. 16, e3–e4 (2015).
43. Du, Q., Wang, J. & Huang, Y. A genetic study of 15 STR loci in Chinese East Mongolian population. Fa Yi Xue Za Zhi 20, 164–166 (2004).
44. Zhang, L. Population data for 15 autosomal STR loci in the Dong ethnic minority from Guizhou Province, Southwest China. Forensic Sci. Int. Genet. 16, 237–238 (2015).
45. Zhang, L. Population data for 15 autosomal STR loci in the Bouyei ethnic minority from Guizhou Province, Southwest China. Forensic Sci. Int. Genet. 17, 108–109 (2015).
46. Yao, J. & Wang, B. Genetic Variation of 25 Y-Chromosomal and 15 Autosomal STR Loci in the Population of Liaoning Province, . PLOS ONE 11, e0160415 (2016).
47. Xing, J. et al. Genetic polymorphism of 15 STR loci in a Manchu population in Northeast China. Forensic Sci. Int. Genet. 5, e93–e95 (2011).
48. Deng, Y. et al. Genetic polymorphism analysis of 15 STR loci in Chinese Hui ethnic group residing in Qinghai province of China. Mol. Biol. Rep. 38, 2315–2322 (2011).
49. Chen, J. et al. Population genetic data of 15 autosomal STR loci in Uygur ethnic group of China. Forensic Sci. Int. Genet. 6, e178–e179 (2012).
50. Zhu, B. et al. Population genetic analysis of 15 autosomal STR loci in the Russian population of northeastern Inner-Mongolia, China. Mol. Biol. Rep. 37, 3889–3895 (2010).
51. Deng, Y. et al. Genetic polymorphisms of 15 STR loci of Chinese Dongxiang and Salar ethnic minority living in Qinghai Province of China. Leg. Med. 9, 38–42 (2007).
52. Felsenstein, J. PHYLIP (Phylogeny Inference Package) Version 3.69. (Department of Genome Sciences, University of Washington, 2009).
53. Kumar, S., Stecher, G. & Tamura, K. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Mol. Biol. Evol. 33, 1870–1874 (2016).
54. Pritchard, J. K., Stephens, M. & Donnelly, P. Inference of population structure using multilocus genotype data. Genetics 155, 945–959 (2000).

Acknowledgements

We are very grateful to the volunteers in our study. This project is supported by the National Natural Science Foundation of P. R. China (NSFC, No. 81471826), Ministry of Finance, P. R. China. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.


Author Contributions

A.A. wrote the manuscript; X.Z., A.A., K.K., Y.Z. and A.K. conducted the experiment; A.A., D.M., X.Z., Y.Z. and K.K. analyzed the results and modified the manuscript. X.Z., K.K., Y.Z. and A.K. collected the samples. All authors reviewed the manuscript.

Additional Information

Supplementary information accompanies this paper at https://doi.org/10.1038/s41598-018-22975-6.

Competing Interests: The authors declare no competing interests.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2018
