ZOOLOGICAL RESEARCH

Exploring European ancestry among the Kalash population: a mitogenomic perspective

Received: 14 February 2020; Accepted: 15 June 2020; Online: 24 June 2020
Foundation items: This work was supported by the Strategic Priority Research Program (XDA20040102), Second Tibetan Plateau Scientific Expedition and Research (STEP) (2019QZKK0607), National Natural Science Foundation of China (31620103907), Chinese Academy of Sciences (QYZDB-SSW-SMC020), and Yunnan Applied Basic Research Project (2017FB044)
DOI: 10.24272/j.issn.2095-8137.2020.052
Open Access: This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright ©2020 Editorial Office of Zoological Research, Kunming Institute of Zoology, Chinese Academy of Sciences
Zoological Research 41(5): 552−556, 2020. Science Press. www.zoores.ac.cn

DEAR EDITOR,

With a population of around 4 000 individuals, the Kalash people have been living in the Hindu-Kush mountain valleys of present-day northern Pakistan for centuries. Due to their mysterious origin and fairer European complexion, the genetic history of this ethnic group has been investigated previously using different markers. To date, however, the maternal genetic architecture has not been systematically dissected based on high-resolution complete mitochondrial genomes (mitogenomes), leaving their maternal genetic history, especially their genetic connection with Europeans from a matrilineal perspective, unclear. To resolve this issue, we analyzed mitogenome data of 34 Kalash samples together with 6 075 individuals from across Eurasia. Our results indicated an exclusively western Eurasian origin of the Kalash people, represented by eight haplogroups. Among these haplogroups, J2b1a7a and R0a5a (accounting for ~50% of the Kalash gene pool) displayed in situ differentiation in the Kalash and could be traced to the Mediterranean region. Age estimations suggested these haplogroups arose in the Kalash population ~2.26 and 3.01 thousand years ago (kya), a time frame consistent with the invasion of the region by Alexander III of Macedon. One possible explanation for the maternal genetic contribution from Europeans to the Kalash people would be the involvement of women in foreign campaigns of ancient Greek warfare, followed by a founder effect. Our study thus sheds important light on the genetic origin of the Kalash community of Pakistan.

The Kalash or Kalasha people are an ancient Indo-European speaking indigenous group with unique culture and traditions, living restrictively in the Hindu-Kush mountain range of present-day northern Pakistan. The enigmatic origin of the Kalash and their distinct European complexion, e.g., lighter skin tone and blue eyes, in addition to certain customs and beliefs, have so far reinforced their claim to be of Greek descent following the invasion of the region by Alexander III of Macedon (Cacopardo, 2011). In the past several decades, various genetic studies have been carried out to investigate the genetic structure and history of the Kalash people, in particular their genetic connection with western Eurasians. For example, several studies have indicated that this ethnic group originated from either the Middle East or Europe, followed by a population bottleneck (Qamar et al., 2002; Rosenberg et al., 2002). It is also widely debated whether the Kalash were genetically isolated for more than 10 kya (Ayub et al., 2015) or received genetic admixture from western [...] between 990 and 210 BCE (Hellenthal et al., 2014). Moreover, the possible genetic connection between [...] and the Kalash remains controversial (Cacopardo, 2011; Firasat et al., 2007; Mansoor et al., 2004; Qamar et al., 2002).

Many previous genetic studies have been based on nuclear genome or [...] data, while the maternal genetic structure of the Kalash had only been dissected based on mitochondrial DNA (mtDNA) restriction fragment length polymorphism (RFLP) and control region variations (Quintana-Murci et al., 2004), thus greatly limiting our understanding of the maternal genetic landscape of this ethnic group. Therefore, whether there is a substantial maternal genetic contribution from Europeans to the Kalash, and when this genetic contact was established, remain unclear.

To provide more insight into the genetic history of the Kalash from a matrilineal perspective, we collected and analyzed available complete mitochondrial genome (mitogenome) data of 34 Kalash individuals (25 from the CEPH Human Genome Diversity Project (HGDP) panel (Cann et al., 2002) and nine from this work), as well as 6 075 individuals sampled from Europe and Asia (Figure 1A; Supplementary Table S1). As shown in our results, a total of eight mtDNA haplogroups were identified in the Kalash, including R0a, U4a1, J2b1a, U2e1h, H2a1a, U4b1a4, T2a1a, and U2e2a1, all of which arise exclusively from the Eurasian macro haplogroup R, an observation in agreement with a previous study (Quintana-Murci et al., 2004). Comparison of the maternal composition between Kalash and other Eurasian populations (Supplementary Table S1) showed that most of the identified haplogroups in the Kalash were substantially shared with the neighboring Dardic group (Kho), as well as being ubiquitous in other western Eurasians (Figure 1B), indicating a western origin of this ethnic group. This is consistent with previous studies that were based on both uniparental markers and whole-genome data (Hellenthal et al., 2014; Qamar et al., 2002; Quintana-Murci et al., 2004).

Phylogeographic analysis based on all available complete mitogenomes retrieved from the online platform MitoTool (http://mitotool.kiz.ac.cn/) (Fan & Yao, 2011) as well as from published literature further suggested that most haplogroups identified in Kalash, like R0a, U2e1h, U4a1, H2a1a, T2a1a, and U2e2a1, had sub-branches (e.g., R0a5a, U2e1h1, U4a1f, H2a1a3, etc.) distributed restrictively in northern Pakistan and shared by the Kalash and other Indo-European-speaking populations in the area (Supplementary Figure S1; Supplementary Table S2). Interestingly, the Kalash individuals were distributed sporadically in the terminal positions of the sub-branches, strongly suggesting traces of recent gene flow from other groups into the Kalash (Supplementary Figure S1). Moreover, these haplogroups also showed prevalence in the Mediterranean region (e.g., U2e2a1, J2b1a1, and R0a) or in the Eurasian Steppe (e.g., H2a1a, T2a1a, U2e1h, U4a1, and U4b1a4), and thus possibly reached the Hindu-Kush region in different periods and further introgressed into the Kalash by recent gene flow.

Different from the above lineages, in which the Kalash samples were distributed sporadically in different branches, haplogroup J2b1a had a sub-branch (defined by a non-synonymous transition at position 11204 and tentatively named J2b1a7a) occupied by six Kalash and two Pashtun individuals, the latter being a neighboring group previously shown to have had a limited European connection based on a Y chromosome study (Firasat et al., 2007). Further phylogeographic analysis showed that the root types of J2b1a7a were predominantly found in Kalash, whereas a Pashtun individual was positioned in one terminal branch, indicating an in situ differentiation of this lineage in the region and further spread into the [...].

Importantly, J2b1a7a shared substitution 16274 with its sister haplogroup (defined by substitutions 15319 and 16213 and tentatively named J2b1a7b) from Europe (nine Sardinians) (Figure 1C; Supplementary Table S3), indicating a close genetic connection between the Kalash and Europeans. Together with the relatively high proportion of J2b1a7a in the Kalash samples (17.6%), this haplogroup sheds important light on the European ancestry of this ethnic group.

Figure 1 Sample locations, distribution of haplogroups identified in Kalash people, and phylogeographic structure of haplogroup J2b1a
A: Geographic locations of populations with complete mitogenome sequences in Pakistan and surrounding countries are shown in the inset. Number of mitogenomes available from each region is proportional to color intensity in respective regions defined in figure legends (comprising 6 075 complete mitogenome sequences; see Supplementary Table S1). B: Schematic tree showing eight west Eurasian haplogroups identified in Kalash and their frequency in other local and west Eurasian populations. C: Phylogeographic reconstruction using median-joining network for haplogroup J2b1a from complete mitogenomes (comprising 106 complete mitogenome sequences belonging to haplogroup J2b1a and five ancient mitogenomes belonging to J2b1; see Supplementary Table S3). Each circle represents one individual sample, unless represented by a number in the circle. Dotted line shows case in which J2b1a7a and J2b1a7b emanate from root of J2b1a independently, with position 16274 (in italics) serving as a parallel mutation on both branches. Mutated positions are shown on branches with different colors for each type of mutation, as seen in legend. Specific clade shared between Sardinians and Kalash is enclosed in red circle; Kalash and Pashtun samples are shown in italics on different branches of node. Geographic affiliations of samples are shown in different colors, as defined in legends. Red circles represent ancient mitogenomes included in network construction. R or Y indicate heteroplasmic states. @ represents reverse mutation, < represents parallel mutation on branches.

Moreover, considering that the shared position 16274 between the Kalash and Sardinians is hypervariable, it is also probable that the two lineages J2b1a7a and J2b1a7b were derived from the root of J2b1a independently, with 16274 serving as a parallel mutation on both branches. We therefore turned our attention to the ancestral node, J2b1a. Coincidentally, the majority (74%) of J2b1a samples, as well as its ancient root type J2b1, were found in Europe, especially in Sardinia (Figure 1C; Supplementary Table S3). This evidence therefore implies an origin of J2b1a in Europe (probably around the Mediterranean region), in agreement with a previous study (Pala et al., 2012). Additional support comes from the observation of haplogroup J2b1a in bones of ancient Europeans (Figure 1C; Supplementary Table S3). Further age estimations using the mitogenome rate (Soares et al., 2009) revealed that the major haplogroup J2b1a can be traced back to 10.59±1.28 kya, a timeframe within the Neolithization and Bronze Age processes in the Mediterranean region (Marcus et al., 2020), with the Kalash branch (J2b1a7a), dated to 2.26±1.44 kya, reflecting a recent split from its European counterpart, followed by independent differentiation in the Hindu-Kush region. Similarly, haplogroup R0a5a, with root types found around the Mediterranean region and a coalescent age of ~3.01±1.5 kya in the Kalash, would also have been introduced into the Kalash gene pool during these recent times. Taken together, ~50% of the Kalash maternal genetic components were derived from haplogroups J2b1a7a and R0a5a, thus documenting recent genetic introgression (likely from the ancestors of modern Sardinians) to the Kalash, around the time when migration to Sardinia was active from the northern and eastern Mediterranean regions (starting ~1 000 BCE) (Fernandes et al., 2020).

Interestingly, this genetic connection echoes the close genetic affinity found between Sardinians and Kalash in studies based on eye-color informative single nucleotide polymorphisms (SNPs) (Walsh et al., 2011), thus probably underlying the similarities in physical features, e.g., lighter complexion of Kalash and Europeans. Moreover, given that the age of J2b1a7a fell within the Macedonian advancement towards northern Pakistan (327 BCE) (Olivieri et al., 2019), and the existence of J2b1c, J2b1a1, and J2b1a3 (sister and sub-type lineages of J2b1a) in ancient and modern Greeks (Lazaridis et al., 2017; Pala et al., 2012), including evidence of eastern Mediterranean immigrants in India (Harney et al., 2019), it is also probable that this genetic connection was mediated by the Greeks. In fact, according to historical records, a limited number of women participated in foreign campaigns of ancient Greek warfare (Loman, 2004), making it likely that women also took part in this occupation, thus contributing to the Kalash gene pool. This scenario is further supported by evidence of human mobility towards mainland Greece and islands like Sardinia, especially from the Mediterranean, via both sea and land routes during the Mesolithic and even more recent times (Demand, 2012; Fernandes et al., 2020; Marcus et al., 2020). However, the absence of J2b1a in other regions that had been occupied by Alexander’s ancient empire (especially Greece), as well as its prevalence in Sardinian and Kalash people, should not be ignored. One probable explanation would be limited female migration along with Alexander’s siege into other regions, or genetic dilution by later demographic events. Additionally, genetic isolation, followed by bottlenecks in both Sardinians (Di Gaetano et al., 2014) and Kalash (Ayub et al., 2015), likely played further roles in the increase of this lineage in these two regions. Moreover, the limited number of reported mitogenome sequences available from Greece so far could also result in this observation. More studies will be carried out to explain whether this maternal genetic connection between the Kalash and Sardinians was mediated by Greek expansion.

In summary, our analysis observed a genetic ancestry from Europe (probably around the Mediterranean) within the Kalash people from about 3.01±1.5 and 2.26±1.4 kya. This recent genetic contribution from Europe, as revealed in this study, accounts for a significant proportion (~50%) of the Kalash, thus playing an important role in the formation of the maternal gene pool of this ethnic group. Thus, our study sheds important light on the genetic history of the Kalash people of northern Pakistan.

DATA AVAILABILITY

The mitogenome sequences of nine Kalash individuals were obtained from our unpublished dataset (GenBank accession Nos. MN595835-MN595843). Additionally, the 11 mitogenome sequences from northern Pakistan were retrieved from our unpublished work (GenBank accession Nos. MN595685, MN595706, MN595718, MN595749, MN595751, MN595765, MN595769, MN595807, MN595818, MN595820, and MN595890).

SUPPLEMENTARY DATA

Supplementary data to this article can be found online.

COMPETING INTERESTS

The authors declare that they have no competing interests.

AUTHORS’ CONTRIBUTIONS

Q.P.K., Y.C.L., and Z.U.R. designed the research; Z.U.R. collected samples; Z.U.R. and J.Y.T. collected and analyzed the data; Z.U.R., Y.C.L., and Q.P.K. wrote the paper. All authors read and approved the final version of the manuscript.

Zia Ur Rahman1,2,3, Yu-Chun Li1,*, Jiao-Yang Tian1, Qing-Peng Kong1,*
1 State Key Laboratory of Genetic Resources and Evolution/Key Laboratory of Healthy Aging Research of Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
3 Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan 650204, China
*Corresponding authors, E-mail: [email protected]; [email protected]

REFERENCES

Ayub Q, Mezzavilla M, Pagani L, Haber M, Mohyuddin A, Khaliq S, et al. 2015. The Kalash genetic isolate: ancient divergence, drift, and selection. The American Journal of Human Genetics, 96(5): 775−783.
Cacopardo AS. 2011. Are the Kalasha really of Greek origin? The Legend of [...] and the Pre-Islamic World of the Hindu Kush. Acta Orientalia, 72: 47−92.
Cann HM, de Toma C, Cazes L, Legrand MF, Morel V, Piouffre L, et al. 2002. A human genome diversity cell line panel. Science, 296(5566): 261−262.
Demand NH. 2012. The Mediterranean Context of Early Greek History. New York: John Wiley & Sons.
Di Gaetano C, Fiorito G, Ortu MF, Rosa F, Guarrera S, Pardini B, et al. 2014. Sardinians genetic background explained by runs of Homozygosity and genomic regions under positive selection. PLoS One, 9(3): e91237.
Fan L, Yao YG. 2011. MitoTool: a web server for the analysis and retrieval of human mitochondrial DNA sequence variations. Mitochondrion, 11(2): 351−356.
Fernandes DM, Mittnik A, Olalde I, Lazaridis I, Cheronet O, Rohland N, et al. 2020. The spread of steppe and Iranian-related ancestry in the islands of the western Mediterranean. Nature Ecology & Evolution, 4(3): 334−345.
Firasat S, Khaliq S, Mohyuddin A, Papaioannou M, Tyler-Smith C, Underhill PA, et al. 2007. Y-chromosomal evidence for a limited Greek contribution to the Pathan population of Pakistan. European Journal of Human Genetics, 15(1): 121−126.
Harney É, Nayak A, Patterson N, Joglekar P, Mushrif-Tripathy V, Mallick S, et al. 2019. Ancient DNA from the skeletons of Roopkund Lake reveals Mediterranean migrants in India. Nature Communications, 10(1): 3670.
Hellenthal G, Busby GBJ, Band G, Wilson JF, Capelli C, Falush D, et al. 2014. A genetic atlas of human admixture history. Science, 343(6172): 747−751.
Lazaridis I, Mittnik A, Patterson N, Mallick S, Rohland N, Pfrengle S, et al. 2017. Genetic origins of the Minoans and Mycenaeans. Nature, 548(7666): 214−218.
Loman P. 2004. No woman no war: women’s participation in ancient Greek warfare. Greece & Rome, 51(1): 34−54.
Mansoor A, Mazhar K, Khaliq S, Hameed A, Rehman S, Siddiqi S, et al. 2004. Investigation of the Greek ancestry of populations from northern Pakistan. Human Genetics, 114(5): 484−490.
Marcus JH, Posth C, Ringbauer H, Lai L, Skeates R, Sidore C, et al. 2020. Genetic history from the Middle Neolithic to present on the Mediterranean island of Sardinia. Nature Communications, 11(1): 939.
Olivieri LM, Marzaioli F, Passariello I, Iori E, Micheli R, Terrasi F, et al. 2019. A new revised chronology and cultural sequence of the Swat valley, [...] (Pakistan) in the light of current excavations at Barikot (Bir-kot-ghwandai). Nuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms, 456: 148−156.
Pala M, Olivieri A, Achilli A, Accetturo M, Metspalu E, Reidla M, et al. 2012. Mitochondrial DNA signals of late glacial recolonization of Europe from near eastern refugia. The American Journal of Human Genetics, 90(5): 915−924.
Qamar R, Ayub Q, Mohyuddin A, Helgason A, Mazhar K, Mansoor A, et al. 2002. Y-chromosomal DNA variation in Pakistan. The American Journal of Human Genetics, 70(5): 1107−1124.
Quintana-Murci L, Chaix R, Wells RS, Behar DM, Sayar H, Scozzari R, et al. 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. The American Journal of Human Genetics, 74(5): 827−845.
Rosenberg NA, Pritchard JK, Weber JL, Cann HM, Kidd KK, Zhivotovsky LA, et al. 2002. Genetic structure of human populations. Science, 298(5602): 2381−2385.
Soares P, Ermini L, Thomson N, Mormina M, Rito T, Röhl A, et al. 2009. Correcting for purifying selection: an improved human mitochondrial molecular clock. The American Journal of Human Genetics, 84(6): 740−759.
Walsh S, Liu F, Ballantyne KN, van Oven M, Lao O, Kayser M. 2011. IrisPlex: a sensitive DNA tool for accurate prediction of blue and brown eye colour in the absence of ancestry information. Forensic Science International: Genetics, 5(3): 170−180.

SUPPLEMENTARY INFORMATION

Supplementary Materials and Methods

Study area and sampling

The Kalash are an enigmatic community living in the [...] District in Khyber Pakhtunkhwa Province of present-day northern Pakistan. They are a non-Muslim minority and reside in three nearby villages known as the Kalash valleys ([...], Rumbur, and Birir) (E71–40° longitude and N35–40° latitude). Samples from nine maternally unrelated Kalash individuals (Supplementary Figure S2), as well as 11 samples from neighboring populations (one Pashtun and 10 Kho samples), were obtained, as shown in the Supplementary Material as part of our unpublished work. This study was approved by the Ethics Committee at Kunming Institute of Zoology, Chinese Academy of Sciences, China. All procedures were handled in accordance with the Declaration of Helsinki, with informed consent obtained from each participant of the study.

Data analysis

Phylogeographic trees of mitogenomes were reconstructed manually and reconfirmed through mt-Phyl software (https://sites.google.com/site/mtphyl/home). Variants in the mitogenome sequences were scored relative to the revised Cambridge Reference Sequence (rCRS) (Andrews et al., 1999). The variable lengths in homopolymeric C-stretches in regions 303–315 and AC indels at 515–522, 16182C, 16183C, 16193.1C(C), and 16519 were disregarded in all phylogenetic reconstructions. The type of each variant was checked by mtDNA-GeneSyn software (Pereira et al., 2009). To estimate coalescent age using rho (ρ) and standard error sigma (σ) statistics (Forster et al., 1996; Saillard et al., 2000), a complete mitogenome rate (Soares et al., 2009) was adopted. New sub-haplogroups were named following the criteria of PhyloTree Build 17 (http://phylotree.org/) (van Oven & Kayser, 2009). A reduced median network was constructed based on complete mitogenomes using NETWORK 5.0 (www.fluxus-engineering.com/sharenet.htm) (Bandelt et al., 1999).
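As an illustration of the data analysis steps described above, the following is a minimal sketch rather than the pipeline actually used in this study. It assumes variants are written as simple strings such as "73G" or "309.1C" (a hypothetical format), drops every variant in the listed hotspot ranges (a simplification of "variable lengths" and "AC indels"), and computes ρ as the mean number of mutations from the clade root to each sample, with σ² equal to the sum over branches of (mutations × descendants²) divided by n². The function names, the toy tree, and the `years_per_mutation` parameter are all assumptions introduced here; in particular, the calibration of Soares et al. (2009) is not a simple linear rate and is not reproduced.

```python
import math
import re

# Hotspots named in the methods (rCRS numbering); the string format is assumed.
EXCLUDED_RANGES = [(303, 315), (515, 522)]
EXCLUDED_POSITIONS = {16519}
EXCLUDED_VARIANTS = {"16182C", "16183C"}


def is_disregarded(variant):
    """Return True if a variant string falls in one of the excluded hotspots."""
    match = re.match(r"\d+", variant)
    if match is None:
        raise ValueError(f"cannot parse a position from {variant!r}")
    position = int(match.group())
    if variant in EXCLUDED_VARIANTS or position in EXCLUDED_POSITIONS:
        return True
    if variant.startswith("16193."):  # insertions after 16193, e.g. 16193.1C
        return True
    return any(low <= position <= high for low, high in EXCLUDED_RANGES)


def rho_sigma(branches, n_samples):
    """Compute rho and sigma for a clade.

    branches: iterable of (mutations_on_branch, n_descendant_samples) pairs,
              one pair per branch below the clade root.
    n_samples: total number of sampled sequences in the clade.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    branches = list(branches)
    rho = sum(m * d for m, d in branches) / n_samples
    sigma = math.sqrt(sum(m * d * d for m, d in branches)) / n_samples
    return rho, sigma


def linear_age(rho, sigma, years_per_mutation):
    """Convert rho and sigma to years with a hypothetical linear rate."""
    return rho * years_per_mutation, sigma * years_per_mutation


# Toy data only; these are not variants or trees from the study.
toy_variants = ["73G", "309.1C", "8860G", "16182C", "16193.1C", "16519C"]
kept = [v for v in toy_variants if not is_disregarded(v)]

# Toy clade of 4 samples: one branch with 1 mutation leading to 3 samples,
# one private mutation on one of those samples, and a fourth sample
# separated from the root by 2 mutations.
toy_branches = [(1, 3), (1, 1), (2, 1)]
rho, sigma = rho_sigma(toy_branches, n_samples=4)
```

In this toy clade the four samples lie 1, 1, 2, and 2 mutations from the root, so ρ is their mean. Any conversion to years depends entirely on the rate supplied by the user.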

REFERENCES

Andrews RM, Kubacka I, Chinnery PF, Lightowlers RN, Turnbull DM, Howell N. 1999. Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nature Genetics, 23(2): 147.
Bandelt HJ, Forster P, Röhl A. 1999. Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution, 16(1): 37–48.
Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of Native American mtDNA variation: a reappraisal. American Journal of Human Genetics, 59(4): 935–945.
Pereira L, Freitas F, Fernandes V, Pereira JB, Costa MD, Costa S, et al. 2009. The diversity present in 5140 human mitochondrial genomes. The American Journal of Human Genetics, 84(5): 628–640.
Saillard J, Forster P, Lynnerup N, Bandelt HJ, Nørby S. 2000. mtDNA variation among Greenland Eskimos: the edge of the Beringian expansion. American Journal of Human Genetics, 67(3): 718–726.
Soares P, Ermini L, Thomson N, Mormina M, Rito T, Röhl A, et al. 2009. Correcting for purifying selection: an improved human mitochondrial molecular clock. The American Journal of Human Genetics, 84(6): 740–759.
van Oven M, Kayser M. 2009. Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Human Mutation, 30(2): E386–E394.

Supplementary Tables

Supplementary Table S1 Whole mitogenome sequence data from previous literature and our new unpublished data included in present study

Nation | Geographic Region | n | References
Adygei | Eastern European & Siberia | 17 | HGDP (Zheng et al., 2014)
Armenian | Middle East & Caucasus | 30 | Schönberg et al., 2011
Azeri | Middle East & Caucasus | 30 | Schönberg et al., 2011
Balochi | South Asian | 25 | HGDP (Zheng et al., 2014)
Basque | European | 20 | Batini et al., 2017
Bavarian | European | 20 | Batini et al., 2017
Bedouin | Middle East & Caucasus | 48 | HGDP (Zheng et al., 2014)
Bengali | South Asian | 86 | Sudmant et al., 2015
Brahui | South Asian | 25 | HGDP (Zheng et al., 2014)
British | European | 91 | Sudmant et al., 2015
Burusho | South Asian | 24 | HGDP (Zheng et al., 2014)
Cambodians | East/South East Asian | 8 | HGDP (Zheng et al., 2014)
Chinese Dai | East/South East Asian | 98 | Sudmant et al., 2015
Chinese Han | East/South East Asian | 103 | Sudmant et al., 2015
Kho | South Asian | 148 | Our data (Rahman et al., unpublished)
Danish | European | 20 | Batini et al., 2017
Daur | East/South East Asian | 10 | HGDP (Zheng et al., 2014)
[...] | Middle East & Caucasus | 47 | HGDP (Zheng et al., 2014)
East Pamir Kyrgyz | Central Asian | 68 | Peng et al., 2018
English | European | 20 | Batini et al., 2017
Finnish | European | 99 | Sudmant et al., 2015
French | European | 28 | HGDP (Zheng et al., 2014)
Frisian | European | 20 | Batini et al., 2017
Georgian | Middle East & Caucasus | 28 | Schönberg et al., 2011
Greeks | Eastern European & Siberia | 20 | Batini et al., 2017
Gujarati | South Asian | 106 | Sudmant et al., 2015
Hazara | South Asian | 25 | HGDP (Zheng et al., 2014)
Hungarian | Eastern European & Siberia | 20 | Batini et al., 2017
Iberian | European | 107 | Sudmant et al., 2015
Iranian | Middle East & Caucasus | 30 | Schönberg et al., 2011
Iranian-Azeri | Middle East & Caucasus | 22 | Derenko et al., 2013
Iranian-Bakhtiari | Middle East & Caucasus | 2 | Derenko et al., 2013
Iranian-Gilak | Middle East & Caucasus | 2 | Derenko et al., 2013
Iranian-Indian | Middle East & Caucasus | 1 | Derenko et al., 2013
Iranian-Khalaj | Middle East & Caucasus | 2 | Derenko et al., 2013
Iranian-Khorasani | Middle East & Caucasus | 4 | Derenko et al., 2013
Iranian-Kurd | Middle East & Caucasus | 5 | Derenko et al., 2013
Iranian-Lur | Middle East & Caucasus | 5 | Derenko et al., 2013
Iranian-Mazandarani | Middle East & Caucasus | 4 | Derenko et al., 2013
Iranian-Persian | Middle East & Caucasus | 181 | Derenko et al., 2013
Iranian-Qashqai | Middle East & Caucasus | 112 | Derenko et al., 2013
Iranian-Turkmen | Middle East & Caucasus | 2 | Derenko et al., 2013
Iranian-Armenian | Middle East & Caucasus | 10 | Derenko et al., 2013
Irish | European | 20 | Batini et al., 2017
Japanese | East/South East Asian | 104 | Sudmant et al., 2015
Kalash | South Asian | 9 | Our data (Rahman et al., unpublished)
Kalash | South Asian | 25 | HGDP (Zheng et al., 2014)
Kashmiri | South Asian | 78 | Sharma et al., 2018
Kinh | East/South East Asian | 99 | Sudmant et al., 2015
Lipka Tatars | Eastern European & Siberia | 11 | Pankratov et al., 2016
Lowland Kyrgyz | Central Asian | 54 | Peng et al., 2018
Lowland Tajik | Central Asian | 28 | Peng et al., 2018
Lowland Uygur | East/South East Asian | 27 | Peng et al., 2018
Makrani | South Asian | 24 | HGDP (Zheng et al., 2014)
Mongols | East/South East Asian | 10 | HGDP (Zheng et al., 2014)
Norwegian | European | 20 | Batini et al., 2017
Orcadian | European | 16 | HGDP (Zheng et al., 2014)
Orcadian | European | 20 | Batini et al., 2017
Palestinian | Middle East & Caucasus | 50 | HGDP (Zheng et al., 2014)
Palestinian | Middle East & Caucasus | 20 | Batini et al., 2017
Pamiri Tajik | Central Asian | 50 | Peng et al., 2018
Pathan | South Asian | 52 | Our data (Rahman et al., unpublished)
Pathan | South Asian | 21 | HGDP (Zheng et al., 2014)
Polish | Eastern European & Siberia | 100 | Malyarchuk et al., 2017
Punjabi-Pakistan | South Asian | 96 | Sudmant et al., 2015
Russian | Eastern European & Siberia | 41 | Malyarchuk et al., 2017
Russian | Eastern European & Siberia | 46 | Malyarchuk et al., 2017
Russian | Eastern European & Siberia | 45 | Malyarchuk et al., 2017
Russian | Eastern European & Siberia | 51 | Malyarchuk et al., 2017
Russian | Eastern European & Siberia | 45 | Malyarchuk et al., 2017
Russian | Eastern European & Siberia | 61 | Malyarchuk et al., 2017
Russian | Eastern European & Siberia | 25 | HGDP (Zheng et al., 2014)
Saami | European | 20 | Batini et al., 2017
Sardinian | European | 2092 | Olivieri et al., 2017
Sarikoli Tajik | Central Asian | 86 | Peng et al., 2018
Serbian | European | 20 | Batini et al., 2017
Sherpa | East/South East Asian | 76 | Kang et al., 2013
Sindhi | South Asian | 25 | HGDP (Zheng et al., 2014)
Southern Han Chinese | East/South East Asian | 104 | Sudmant et al., 2015
Spanish | European | 20 | Batini et al., 2017
Sri Lankan Tamil | South Asian | 102 | Sudmant et al., 2015
Telugu | South Asian | 103 | Sudmant et al., 2015
Toscani | European | 107 | Sudmant et al., 2015
Turkish | Middle East & Caucasus | 29 | Schönberg et al., 2011
Turkish | Middle East & Caucasus | 20 | Batini et al., 2017
Utah | European | 61 | Sudmant et al., 2015
Uyghur | East/South East Asian | 9 | HGDP (Zheng et al., 2014)
Wakhi Tajik | Central Asian | 66 | Peng et al., 2018
West Pamir Kyrgyz | Central Asian | 3 | Peng et al., 2018
Xibo | East/South East Asian | 8 | HGDP (Zheng et al., 2014)
Yakut | Eastern European & Siberia | 24 | HGDP (Zheng et al., 2014)
Yemeni | Middle East & Caucasus | 113 | Vyas et al., 2016

REFERENCES

Batini C, Hallast P, Vågene ÅJ, Zadik D, Eriksen HA, Pamjav H, et al. 2017. Population resequencing of European mitochondrial genomes highlights sex-bias in Bronze Age demographic expansions. Scientific Reports, 7(1): 12086.
Derenko M, Malyarchuk B, Bahmanimehr A, Denisova G, Perkova M, Farjadian S, et al. 2013. Complete mitochondrial DNA diversity in Iranians. PLoS One, 8(11): e80673.
Kang LL, Zheng HX, Chen F, Yan S, Liu K, Qin ZD, et al. 2013. mtDNA lineage expansions in Sherpa population suggest adaptive evolution in Tibetan highlands. Molecular Biology and Evolution, 30(12): 2579–2587.
Malyarchuk B, Litvinov A, Derenko M, Skonieczna K, Grzybowski T, Grosheva A, et al. 2017. Mitogenomic diversity in Russians and Poles. Forensic Science International: Genetics, 30: 51–56.
Olivieri A, Sidore C, Achilli A, Angius A, Posth C, Furtwängler A, et al. 2017. Mitogenome diversity in Sardinians: a genetic window onto an Island’s Past. Molecular Biology and Evolution, 34(5): 1230–1239.
Pankratov V, Litvinov S, Kassian A, Shulhin D, Tchebotarev L, Yunusbayev B, et al. 2016. East Eurasian ancestry in the middle of Europe: genetic footprints of Steppe nomads in the genomes of Belarusian Lipka Tatars. Scientific Reports, 6(1): 30197.
Peng MS, Xu WF, Song JJ, Chen X, Sulaiman X, Cai LH, et al. 2018. Mitochondrial genomes uncover the maternal history of the Pamir populations. European Journal of Human Genetics, 26(1): 124–136.
Schönberg A, Theunert C, Li MK, Stoneking M, Nasidze I. 2011. High-throughput sequencing of complete human mtDNA genomes from the Caucasus and West Asia: high diversity and demographic inferences. European Journal of Human Genetics, 19(9): 988–994.
Sharma I, Sharma V, Khan A, Kumar P, Rai E, Bamezai RNK, et al. 2018. Ancient human migrations to and through Jammu Kashmir-India were not of males exclusively. Scientific Reports, 8(1): 851.
Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, et al. 2015. An integrated map of structural variation in 2,504 human genomes. Nature, 526(7571): 75–81.
Vyas DN, Kitchen A, Miró-Herrans AT, Pearson LN, Al-Meeri A, Mulligan CJ. 2016. Bayesian analyses of Yemeni mitochondrial genomes suggest multiple migration events with Africa and Western Eurasia. American Journal of Physical Anthropology, 159(3): 382–393.
Zheng HX, Qin ZD, Jin L, Jin L. 2014. The mitochondrial DNA diversity of HGDP populations. Direct Submission.

Supplementary Table S2 Accession numbers and information of mitogenome sequences used for phylogeographic reconstruction of haplogroups in Kalash population

Code  Accession number/ID  Geographic origin           References
52    MN595769/CHM192      South Asia                  Our data (Rahman et al., unpublished)
49    MN595685/CHF101      South Asia                  Our data (Rahman et al., unpublished)
55    MN595749/CHF85       South Asia                  Our data (Rahman et al., unpublished)
54    MN595765/CHM120      South Asia                  Our data (Rahman et al., unpublished)
56    MN595818/CHM90       South Asia                  Our data (Rahman et al., unpublished)
2     MN595718/CHF184      South Asia                  Our data (Rahman et al., unpublished)
97    MN595807/CHM71       South Asia                  Our data (Rahman et al., unpublished)
93    MN595706/CHF160      South Asia                  Our data (Rahman et al., unpublished)
141   MN595751/CHF96       South Asia                  Our data (Rahman et al., unpublished)
138   MN595890/PAM317      South Asia                  Our data (Rahman et al., unpublished)
132   MN595820/CHM92       South Asia                  Our data (Rahman et al., unpublished)
127   JX462724             South Asia                  Khan et al., 2012
136   KF450841             South Asia                  Lippold et al., 2014
73    AY714022.1           South Asia                  Palanichamy et al., 2004
108   KF450905.1           South Asia                  Lippold et al., 2014
53    KAM4                 South Asia                  Our data (Rahman et al., unpublished)
7     KAM7                 South Asia                  Our data (Rahman et al., unpublished)
8     KAM9                 South Asia                  Our data (Rahman et al., unpublished)
117   KAM6                 South Asia                  Our data (Rahman et al., unpublished)
145   KAM8                 South Asia                  Our data (Rahman et al., unpublished)
146   KAM5                 South Asia                  Our data (Rahman et al., unpublished)
163   KAF12                South Asia                  Our data (Rahman et al., unpublished)
57    KF450963             South Asia                  Lippold et al., 2014
12    KF450971.1           South Asia                  Lippold et al., 2014
9     KF450972.1           South Asia                  Lippold et al., 2014
10    KF450974.1           South Asia                  Lippold et al., 2014
11    KF450982.1           South Asia                  Lippold et al., 2014
14    EU597493.1           South Asia                  Hartmann et al., 2009
20    KF450968.1           South Asia                  Lippold et al., 2014
19    KF450986.1           South Asia                  Lippold et al., 2014
16    KF450967.1           South Asia                  Lippold et al., 2014
15    KF450980.1           South Asia                  Lippold et al., 2014
83    KF450975.1           South Asia                  Lippold et al., 2014
95    KF450979.1           South Asia                  Lippold et al., 2014
94    KF450984.1           South Asia                  Lippold et al., 2014
92    KF450983.1           South Asia                  Lippold et al., 2014
142   KF450977             South Asia                  Lippold et al., 2014
143   KF450978             South Asia                  Lippold et al., 2014
144   EU597564             South Asia                  Hartmann et al., 2009
147   KF450973             South Asia                  Lippold et al., 2014
140   KF450976             South Asia                  Lippold et al., 2014
133   EU597575             South Asia                  Hartmann et al., 2009
162   KF450970.1           South Asia                  Lippold et al., 2014
113   JX462708             South Asia                  Khan et al., 2012
126   KC990673             South Asia                  Ramanan et al., 2013
59    MF523044             Central Asia                Peng et al., 2018
60    MF523045             Central Asia                Peng et al., 2018
61    MF523046             Central Asia                Peng et al., 2018
58    MF523052             Central Asia                Peng et al., 2018
68    MF523033             Central Asia                Peng et al., 2018
100   MF523063.1           Central Asia                Peng et al., 2018
128   KX675279             Central Asia                Marchi et al., 2017
155   MF522979.1           Central Asia                Peng et al., 2018
98    MF523059             Central Asia                Peng et al., 2018
99    MF523064.1           Central Asia                Peng et al., 2018
96    MF523030             Central Asia                Peng et al., 2018
112   MF523127.1           East Asia                   Peng et al., 2018
21    MF522857             East Asia                   Peng et al., 2018
22    MF523082             East Asia                   Peng et al., 2018
23    MF523123             East Asia                   Peng et al., 2018
24    MF523124             East Asia                   Peng et al., 2018
25    MF523132             East Asia                   Peng et al., 2018
26    MF523135             East Asia                   Peng et al., 2018
27    MF523182             East Asia                   Peng et al., 2018
28    MF523202             East Asia                   Peng et al., 2018
50    MF523165             East Asia                   Peng et al., 2018
51    MF523217             East Asia                   Peng et al., 2018
69    MF522925             East Asia                   Peng et al., 2018
29    MF523180             East Asia                   Peng et al., 2018
30    MF523206             East Asia                   Peng et al., 2018
17    MF523176.1           East Asia                   Peng et al., 2018
18    MF523220.1           East Asia                   Peng et al., 2018
91    MF522922.1           East Asia                   Peng et al., 2018
107   KF849954.1           East Asia                   Jiang et al., 2014
110   KU683218.1           East Asia                   Zheng et al., 2017
111   KU683284.1           East Asia                   Zheng et al., 2017
101   MF523188.1           East Asia                   Peng et al., 2018
102   MF523134.1           East Asia                   Peng et al., 2018
137   MF522903             East Asia                   Peng et al., 2018
134   MF523077             East Asia                   Peng et al., 2018
135   MF523119             East Asia                   Peng et al., 2018
152   MG952837.1           East Asia                   Malyarchuk et al., 2018
158   KU682886.1           East Asia                   Zheng et al., 2017
159   KU682905.1           East Asia                   Zheng et al., 2017
160   KU682920             East Asia                   Zheng et al., 2017
161   KU682931             East Asia                   Zheng et al., 2017
37    HM852854             Middle East and Caucasus    Schönberg et al., 2011
5     KX499665.1           Middle East and Caucasus    Greenspan, FTDNA
4     KP407079.1           Middle East and Caucasus    Gandini et al., 2016
13    KP407080.1           Middle East and Caucasus    Gandini et al., 2016
1     KJ716336.1           Middle East and Caucasus    Greenspan, FTDNA
89    JQ798052.1           Middle East and Caucasus    Pala et al., 2012
84    AY495304.1           Middle East and Caucasus    Coble et al., 2004
131   HM852776             Middle East and Caucasus    Schönberg et al., 2011
157   KC911400.1           Middle East and Caucasus    Derenko et al., 2013
72    AY495298.1           Middle East and Caucasus    Coble et al., 2004
64    KY670917             Eastern Europe and Siberia  Malyarchuk et al., 2017
31    KJ856689             Eastern Europe and Siberia  Derenko et al., 2014
32    KJ856717             Eastern Europe and Siberia  Derenko et al., 2014
33    KJ856745             Eastern Europe and Siberia  Derenko et al., 2014
82    FJ656215.1           Eastern Europe and Siberia  Rogaev et al., 2009
109   FJ493504.1           Eastern Europe and Siberia  Sukernik et al., 2012
139   KJ856729             Eastern Europe and Siberia  Derenko et al., 2014
118   MG597495             Eastern Europe and Siberia  Greenspan, FTDNA
151   FJ147313.1           Eastern Europe and Siberia  Sukernik et al., 2012
156   FJ147316.1           Eastern Europe and Siberia  Sukernik et al., 2012
153   FJ147314.1           Eastern Europe and Siberia  Sukernik et al., 2012
63    KF162320             Europe                      Li et al., 2014
48    KY646099             Europe                      Greenspan, FTDNA
36    KF162914             Europe                      Li et al., 2014
35    KY399164             Europe                      Olivieri et al., 2017
43    KM016488             Europe                      Greenspan, FTDNA
34    AY339428             Europe                      Moilanen et al., 2003
3     KP407078.1           Europe                      Gandini et al., 2016
6     KF055865.1           Europe                      Gómez-Carballa et al., 2013
90    JQ798053.1           Europe                      Pala et al., 2012
88    JX297195.1           Europe                      Cardoso et al., 2013
87    JX297194.1           Europe                      Cardoso et al., 2013
77    KF161904.1           Europe                      Li et al., 2014
106   EF660957.1           Europe                      Gasparre et al., 2007
105   KF161734.1           Europe                      Li et al., 2014
104   JN581653.1           Europe                      Bertolin et al., 2011
103   KJ856797.1           Europe                      Derenko et al., 2014
115   KF161136.1           Europe                      Li et al., 2014
116   MG952848.1           Europe                      Malyarchuk et al., 2018
130   KF161730             Europe                      Li et al., 2014
129   KF161522             Europe                      Li et al., 2014
124   KF161073             Europe                      Li et al., 2014
123   KF161338             Europe                      Li et al., 2014
122   KF162533             Europe                      Li et al., 2014
121   KF451618             Europe                      Lippold et al., 2014
125   KP733895             Europe                      Greenspan, FTDNA
120   MG669471             Europe                      Greenspan, FTDNA
119   KF161335             Europe                      Li et al., 2014
154   MG952787.1           Europe                      Malyarchuk et al., 2018
70    JN383991.1           Europe                      Greenspan, FTDNA
71    KC137255.1           Europe                      Greenspan, FTDNA
67    HQ667351             Out of Eurasia              Greenspan, FTDNA
42    FJ668389             Out of Eurasia              Greenspan, FTDNA
66    JQ702059             unknown                     Behar et al., 2012
65    JQ703017             unknown                     Behar et al., 2012
62    JQ703213             unknown                     Behar et al., 2012
45    MF497500             unknown                     Järviaho et al., 2018
44    JQ702397             unknown                     Behar et al., 2012
47    MF497516             unknown                     Järviaho et al., 2018
46    MF497515             unknown                     Järviaho et al., 2018
40    JQ703144             unknown                     Behar et al., 2012
41    JQ705852             unknown                     Behar et al., 2012
39    JQ704579             unknown                     Behar et al., 2012
38    KJ801443             unknown                     Lang et al., 2015
86    JQ702379.1           unknown                     Behar et al., 2012
85    JQ702594.1           unknown                     Behar et al., 2012
81    JQ702611.1           unknown                     Behar et al., 2012
80    JQ705057.1           unknown                     Behar et al., 2012
79    JQ705133.1           unknown                     Behar et al., 2012
78    JQ703890.1           unknown                     Behar et al., 2012
74    JQ702969.1           unknown                     Behar et al., 2012
76    JQ706008.1           unknown                     Behar et al., 2012
75    JQ703088.1           unknown                     Behar et al., 2012
149   JQ704215             unknown                     Behar et al., 2012
150   JQ703947             unknown                     Behar et al., 2012
148   JQ703903             unknown                     Behar et al., 2012
114   JQ705900             unknown                     Behar et al., 2012

REFERENCES
Behar DM, van Oven M, Rosset S, Metspalu M, Loogväli EL, Silva NM, et al. 2012. A “Copernican” reassessment of the human mitochondrial DNA tree from its root. The American Journal of Human Genetics, 90(4): 675–684.
Bertolin C, Magri C, Barlati S, Vettori A, Perini GI, Peruzzi P, et al. 2011. Analysis of complete mitochondrial genomes of patients with schizophrenia and bipolar disorder. Journal of Human Genetics, 56(12): 869–872.
Cardoso S, Valverde L, Alfonso-Sánchez MA, Palencia-Madrid L, Elcoroaristizabal X, Algorta J, et al. 2013. The expanded mtDNA phylogeny of the Franco-Cantabrian region upholds the pre-neolithic genetic substrate of Basques. PLoS One, 8(7): e67835.
Coble MD, Just RS, O’Callaghan JE, Letmanyi IH, Peterson CT, Irwin JA, et al. 2004. Single nucleotide polymorphisms over the entire mtDNA genome that increase the power of forensic testing in Caucasians. International Journal of Legal Medicine, 118(3): 137–146.
Derenko M, Malyarchuk B, Bahmanimehr A, Denisova G, Perkova M, Farjadian S, et al. 2013. Complete mitochondrial DNA diversity in Iranians. PLoS One, 8(11): e80673.
Derenko M, Malyarchuk B, Denisova G, Perkova M, Litvinov A, Grzybowski T, et al. 2014. Western Eurasian ancestry in modern Siberians based on mitogenomic data. BMC Evolutionary Biology, 14(1): 217.

Gandini F, Achilli A, Pala M, Bodner M, Brandini S, Huber G, et al. 2016. Mapping human dispersals into the Horn of Africa from Arabian Ice Age refugia using mitogenomes. Scientific Reports, 6(1): 25472.
Gasparre G, Porcelli AM, Bonora E, Pennisi LF, Toller M, Iommarini L, et al. 2007. Disruptive mitochondrial DNA mutations in complex I subunits are markers of oncocytic phenotype in thyroid tumors. Proceedings of the National Academy of Sciences of the United States of America, 104(21): 9001–9006.
Gómez-Carballa A, Pardo-Seco J, Fachal L, Vega A, Cebey M, Martinón-Torres N, et al. 2013. Indian Signatures in the westernmost edge of the European Romani diaspora: new insight from Mitogenomes. PLoS One, 8(10): e75397.
Greenspan B. Direct submission. Family Tree DNA. (http://www.ncbi.nlm.nih.gov/)
Hartmann A, Thieme M, Nanduri LK, Stempfl T, Moehle C, Kivisild T, et al. 2009. Validation of microarray-based resequencing of 93 worldwide mitochondrial genomes. Human Mutation, 30(1): 115–122.
Järviaho T, Hurme-Niiranen A, Soini HK, Niinimäki R, Möttönen M, Savolainen ER, et al. 2018. Novel non-neutral mitochondrial DNA mutations found in childhood acute lymphoblastic leukemia. Clinical Genetics, 93(2): 275–285.
Jiang CH, Cui JH, Liu FY, Gao L, Luo YJ, Li P, et al. 2014. Mitochondrial DNA 10609T promotes hypoxia-induced increase of intracellular ROS and is a risk factor of high altitude polycythemia. PLoS One, 9(1): e87775.
Khan NA, Govindaraj P, Thangaraj K. 2012. Occurence of G11778A LHON mutation in diverse mitochondrial haplogroups in Indian population. (Unpublished).
Lang M, Vocke CD, Merino MJ, Schmidt LS, Linehan WM. 2015. Mitochondrial DNA mutations distinguish bilateral multifocal renal oncocytomas from familial Birt-Hogg-Dubé tumors. Modern Pathology, 28(11): 1458–1469.
Li ST, Besenbacher S, Li YR, Kristiansen K, Grarup N, Albrechtsen A, et al. 2014. Variation and association to diabetes in 2000 full mtDNA sequences mined from an exome study in a Danish population. European Journal of Human Genetics, 22(8): 1040–1045.
Lippold S, Xu HY, Ko A, Li MK, Renaud G, Butthof A, et al. 2014. Human paternal and maternal demographic histories: insights from high-resolution Y chromosome and mtDNA sequences. Investigative Genetics, 5(1): 13.
Malyarchuk B, Litvinov A, Derenko M, Skonieczna K, Grzybowski T, Grosheva A, et al. 2017. Mitogenomic diversity in Russians and Poles. Forensic Science International: Genetics, 30: 51–56.
Malyarchuk B, Derenko M, Denisova G, Litvinov A, Rogalla U, Skonieczna K, et al. 2018. Whole mitochondrial genome diversity in two Hungarian populations. Molecular Genetics and Genomics, 293(5): 1255–1263.
Marchi N, Hegay T, Mennecier P, Georges M, Laurent R, Whitten M, et al. 2017. Sex-specific genetic diversity is shaped by cultural factors in Inner Asian human populations. American Journal of Physical Anthropology, 162(4): 627–640.
Moilanen JS, Finnilä S, Majamaa K. 2003. Lineage-specific selection in human mtDNA: lack of polymorphisms in a segment of MTND5 gene in Haplogroup J. Molecular Biology and Evolution, 20(12): 2132–2142.
Olivieri A, Sidore C, Achilli A, Angius A, Posth C, Furtwängler A, et al. 2017. Mitogenome diversity in Sardinians: a genetic window onto an Island’s past. Molecular Biology and Evolution, 34(5): 1230–1239.
Pala M, Olivieri A, Achilli A, Accetturo M, Metspalu E, Reidla M, et al. 2012. Mitochondrial DNA signals of late glacial recolonization of Europe from Near Eastern refugia. The American Journal of Human Genetics, 90(5): 915–924.
Palanichamy MG, Sun C, Agrawal S, Bandelt HJ, Kong QP, Khan F, et al. 2004. Phylogeny of mitochondrial DNA macrohaplogroup N in India, based on complete sequencing: implications for the peopling of South Asia. The American Journal of Human Genetics, 75(6): 966–978.
Peng MS, Xu WF, Song JJ, Chen X, Sulaiman X, Cai LH, et al. 2018. Mitochondrial genomes uncover the maternal history of the Pamir populations. European Journal of Human Genetics, 26(1): 124–136.
Ramanan D, Rani B, Govindaraj P, Sharma R, Rastogi B, Thangaraj K, et al. 2013. Spectrum of mitochondrial mutations in Non-syndromic hearing loss in North Indian Cohort. (Unpublished).
Rogaev EI, Grigorenko AP, Moliaka YK, Faskhutdinova G, Goltsov A, Lahti A, et al. 2009. Genomic identification in the historical case of the Nicholas II royal family. Proceedings of the National Academy of Sciences of the United States of America, 106(13): 5258–5263.
Schönberg A, Theunert C, Li MK, Stoneking M, Nasidze I. 2011. High-throughput sequencing of complete human mtDNA genomes from the Caucasus and West Asia: high diversity and demographic inferences. European Journal of Human Genetics, 19(9): 988–994.
Sukernik RI, Volodko NV, Mazunin IO, Eltsov NP, Dryomov SV, Starikovskaya EB. 2012. Mitochondrial genome diversity in the Tubalar, Even, and Ulchi: contribution to prehistory of native Siberians and their affinities to Native Americans. American Journal of Physical Anthropology, 148(1): 123–138.
Zheng HX, Li L, Jiang XY, Yan S, Qin ZD, Wang XF, et al. 2017. MtDNA genomes reveal a relaxation of selective constraints in low-BMI individuals in a Uyghur population. Human Genetics, 136(10): 1353–1362.

Supplementary Table S3 GenBank accession Nos. and information of mitogenome sequences used for phylogeographic reconstruction of J2b1a using reduced median-joining network. Accession numbers of ancient mitogenomes are in italics

GenBank accession No./ID  Haplogroup  Country               Geographic region           References
KAF1                      J2b1a       North Pakistan        South Asia                  Our data (Rahman et al., unpublished)
KAM2                      J2b1a       North Pakistan        South Asia                  Our data (Rahman et al., unpublished)
DQ341090.1                J2b1a       Italy                 Europe                      Carelli et al., 2006
DQ523671.1                J2b1a       Sardinia              Europe                      Fraumene et al., 2006
EU597520.1                J2b1a       Sardinia              Europe                      Hartmann et al., 2009
EU673448.1                J2b1a       NA                    NA                          Greenspan, FTDNA
JF915700.1                J2b1a       Ireland               Europe                      Greenspan, FTDNA
JN635304.1                J2b1a       NA                    NA                          Pacheu-Grau et al., 2013
JQ702196.1                J2b1a       NA                    NA                          Behar et al., 2012
JQ702199.1                J2b1a       NA                    NA                          Behar et al., 2012
JQ702442.1                J2b1a       NA                    NA                          Behar et al., 2012
JQ702477.1                J2b1a       NA                    NA                          Behar et al., 2012
JQ702553.1                J2b1a       NA                    NA                          Behar et al., 2012
JQ702593.1                J2b1a       NA                    NA                          Behar et al., 2012
JQ702758.1                J2b1a       NA                    NA                          Behar et al., 2012
JQ703517.1                J2b1a       NA                    NA                          Behar et al., 2012
JQ705925.1                J2b1a       NA                    NA                          Behar et al., 2012
JQ797939.1                J2b1a       Italy                 Europe                      Pala et al., 2012
JQ797942.1                J2b1a       Morocco               Others                      Pala et al., 2012
JQ797947.1                J2b1a       Italy                 Europe                      Pala et al., 2012
JQ797948.1                J2b1a       Italy                 Europe                      Pala et al., 2012
JX152794.1                J2b1a       Denmark               Europe                      Raule et al., 2014
JX153193.1                J2b1a       Finland               Europe                      Raule et al., 2014
JX153750.1                J2b1a       Denmark               Europe                      Raule et al., 2014
KF161080.1                J2b1a       Denmark               Europe                      Li et al., 2014
KF161168.1                J2b1a       Denmark               Europe                      Li et al., 2014
KF161362.1                J2b1a       Denmark               Europe                      Li et al., 2014
KF161505.1                J2b1a       Denmark               Europe                      Li et al., 2014
KF161527.1                J2b1a       Denmark               Europe                      Li et al., 2014
KF450954.1                J2b1a       North Pakistan        South Asia                  Lippold et al., 2014
KF450956.1                J2b1a       North Pakistan        South Asia                  Lippold et al., 2014
KF450962.1                J2b1a       North Pakistan        South Asia                  Lippold et al., 2014
KF450969.1                J2b1a       North Pakistan        South Asia                  Lippold et al., 2014
KF450981.1                J2b1a       North Pakistan        South Asia                  Lippold et al., 2014
KF450985.1                J2b1a       North Pakistan        South Asia                  Lippold et al., 2014
KF451585.1                J2b1a       Sardinia              Europe                      Lippold et al., 2014
KM101817.1                J2b1a       America               Others                      Just et al., 2015
KM101927.1                J2b1a       America               Others                      Just et al., 2015
KM101985.1                J2b1a       America               Others                      Just et al., 2015
KX440268.1                J2b1a       Morocco               Others                      Pereira et al., 2017
KX440269.1                J2b1a       Tunisia               Others                      Pereira et al., 2017
KX440270.1                J2b1a       Morocco               Others                      Pereira et al., 2017
KY399150.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY399160.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408236.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408253.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408272.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408304.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408322.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408330.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408334.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408342.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408345.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408364.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408367.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408382.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408391.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408414.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408425.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408441.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408449.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408469.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408507.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408562.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408599.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408728.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408803.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408858.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408875.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408958.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY408969.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409023.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409032.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409051.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409062.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409064.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409077.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409095.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409114.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409118.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409139.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409149.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409202.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409316.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409330.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409435.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409470.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409521.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409688.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409706.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409769.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409828.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409831.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY409967.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY410016.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY410037.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY410114.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY410121.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY410151.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY410222.1                J2b1a       Sardinia              Europe                      Olivieri et al., 2017
KY670914.1                J2b1a       Russia                Eastern Europe and Siberia  Malyarchuk et al., 2017
MF498701.1                J2b1a       Germany               Europe                      Knipper et al., 2017
MF991442.1                J2b1a       Spain                 Europe                      Fregel et al., 2018
MF991447.1                J2b1a       Spain                 Europe                      Fregel et al., 2018
MK239957.1                J2b1a       Ireland               Europe                      Greenspan, FTDNA
HG00106                   J2b1a       England and Scotland  Europe                      Sudmant et al., 2015
I1109                     J2b1        Bulgaria              Eastern Europe and Siberia  Mathieson et al., 2018
I2216                     J2b1        Bulgaria              Eastern Europe and Siberia  Mathieson et al., 2018
KX354973.1                J2b1        Italy                 Europe                      Modi et al., 2017
LHSZ5                     J2b1        Hungary               Europe                      Vai et al., 2019
SZ5                       J2b1        Hungary               Europe                      Amorim et al., 2018

REFERENCES
Amorim CEG, Vai S, Posth C, Modi A, Koncz I, Hakenbeck S, et al. 2018. Understanding 6th-century barbarian social organization and migration through paleogenomics. Nature Communications, 9(1): 3547.
Behar DM, van Oven M, Rosset S, Metspalu M, Loogväli EL, Silva NM, et al. 2012. A “Copernican” reassessment of the human mitochondrial DNA tree from its root. The American Journal of Human Genetics, 90(4): 675–684.
Carelli V, Achilli A, Valentino ML, Rengo C, Semino O, Pala M, et al. 2006. Haplogroup effects and recombination of mitochondrial DNA: novel clues from the analysis of Leber hereditary optic neuropathy pedigrees. The American Journal of Human Genetics, 78(4): 564–574.
Fraumene C, Belle EMS, Castri L, Sanna S, Mancosu G, Cosso M, et al. 2006. High resolution analysis and phylogenetic network construction using complete mtDNA sequences in Sardinian genetic isolates. Molecular Biology and Evolution, 23(11): 2101–2111.
Fregel R, Méndez FL, Bokbot Y, Martín-Socas D, Camalich-Massieu MD, Santana J, et al. 2018. Ancient genomes from North Africa evidence prehistoric migrations to the Maghreb from both the Levant and Europe. Proceedings of the National Academy of Sciences of the United States of America, 115(26): 6774–6779.
Greenspan B. Direct submission. Family Tree DNA. http://www.ncbi.nlm.nih.gov/.
Hartmann A, Thieme M, Nanduri LK, Stempfl T, Moehle C, Kivisild T, et al. 2009. Validation of microarray-based resequencing of 93 worldwide mitochondrial genomes. Human Mutation, 30(1): 115–122.
Just RS, Scheible MK, Fast SA, Sturk-Andreaggi K, Röck AW, Bush JM, et al. 2015. Full mtGenome reference data: Development and characterization of 588 forensic-quality haplotypes representing three U.S. populations. Forensic Science International: Genetics, 14: 141–155.
Knipper C, Mittnik A, Massy K, Kociumaka C, Kucukkalipci I, Maus M, et al. 2017. Female exogamy and gene pool diversification at the transition from the Final Neolithic to the Early Bronze Age in central Europe. Proceedings of the National Academy of Sciences of the United States of America, 114(38): 10083–10088.
Li ST, Besenbacher S, Li YR, Kristiansen K, Grarup N, Albrechtsen A, et al. 2014. Variation and association to diabetes in 2000 full mtDNA sequences mined from an exome study in a Danish population. European Journal of Human Genetics, 22(8): 1040–1045.
Lippold S, Xu HY, Ko A, Li MK, Renaud G, Butthof A, et al. 2014. Human paternal and maternal demographic histories: insights from high-resolution Y chromosome and mtDNA sequences. Investigative Genetics, 5(1): 13.
Malyarchuk B, Litvinov A, Derenko M, Skonieczna K, Grzybowski T, Grosheva A, et al. 2017. Mitogenomic diversity in Russians and Poles. Forensic Science International: Genetics, 30: 51–56.
Mathieson I, Alpaslan-Roodenberg S, Posth C, Szécsényi-Nagy A, Rohland N, Mallick S, et al. 2018. The genomic history of southeastern Europe. Nature, 555(7695): 197–203.
Modi A, Tassi F, Susca RR, Vai S, Rizzi E, De Bellis G, et al. 2017. Complete mitochondrial sequences from Mesolithic Sardinia. Scientific Reports, 7(1): 42869.
Pacheu-Grau D, Gómez-Durán A, Iglesias E, López-Gallardo E, Montoya J, Ruiz-Pesini E. 2013. Mitochondrial antibiograms in personalized medicine. Human Molecular Genetics, 22(6): 1132–1139.
Pala M, Olivieri A, Achilli A, Accetturo M, Metspalu E, Reidla M, et al. 2012. Mitochondrial DNA signals of late glacial recolonization of Europe from Near Eastern refugia. The American Journal of Human Genetics, 90(5): 915–924.
Pereira JB, Costa MD, Vieira D, Pala M, Bamford L, Harich N, et al. 2017. Reconciling evidence from ancient and contemporary genomes: a major source for the European Neolithic within Mediterranean Europe. Proceedings of the Royal Society B: Biological Sciences, 284(1851): 20161976.
Raule N, Sevini F, Li ST, Barbieri A, Tallaro F, Lomartire L, et al. 2014. The co-occurrence of mtDNA mutations on different oxidative phosphorylation subunits, not detected by haplogroup analysis, affects human longevity and is population specific. Aging Cell, 13(3): 401–407.
Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, et al. 2015. An integrated map of structural variation in 2,504 human genomes. Nature, 526(7571): 75–81.
Vai S, Brunelli A, Modi A, Tassi F, Vergata C, Pilli E, et al. 2019. A genetic perspective on Longobard-Era migrations. European Journal of Human Genetics, 27(4): 647–656.

Supplementary Figures

Supplementary Figure S1 Maximum-parsimony phylogenetic reconstruction of haplogroups identified in the Kalash population together with available mitogenome data from our unpublished work and the literature. Geographic affiliation of sequences is represented by different colors, as shown in the legend. Mutations are shown on tree branches relative to the rCRS, and recurrent mutations are shown with underlined numbers. Each square block corresponds to a mitogenome sequence, with codes corresponding to those in Supplementary Table S2; Kalash sequences are enclosed in red squares. Newly defined haplogroups in our study are in bold. Sequences from outside Eurasia are labeled as “Others” in brown; sequences of unknown origin are labeled as “Unknown” in yellow. Types of variants are shown in different colors, as defined in the figure legend. All transversions are shown with their respective nucleotide types. Branches with incomplete mitogenome sequence(s) with gap(s) are shown with dotted lines, and their positions in the tree are uncertain.

Supplementary Figure S2 Sampling locations in the Kalash valleys of northern Pakistan, pinned on the map. The upper right inset shows a Kalash child in traditional Kalash dress (photo reproduced from https://www.flickr.com/ with author’s permission).