
Article
Genomic insights into the formation of human populations in East Asia

https://doi.org/10.1038/s41586-021-03336-2
Received: 19 March 2020
Accepted: 5 February 2021
Published online: 22 February 2021

The deep population history of East Asia remains poorly understood owing to a lack of ancient DNA data and sparse sampling of present-day people1,2. Here we report genome-wide data from 166 East Asian individuals dating to between 6000 bc and ad 1000 and 46 present-day groups. Hunter-gatherers from Japan, the Amur River Basin, and people of Neolithic and Iron Age Taiwan and the Tibetan Plateau are linked by a deeply splitting lineage that probably reflects a coastal migration during the Late Pleistocene epoch. We also follow expansions during the subsequent Holocene epoch from four regions. First, hunter-gatherers from Mongolia and the Amur River Basin have ancestry shared by individuals who speak Mongolic and Tungusic languages, but do not carry ancestry characteristic of farmers from the West Liao River region (around 3000 bc), which contradicts theories that the expansion of these farmers spread the Mongolic and Tungusic proto-languages. Second, farmers from the Yellow River Basin (around 3000 bc) probably spread Sino-Tibetan languages, as their ancestry dispersed both to Tibet, where it forms approximately 84% of the gene pool in some groups, and to the Central Plain, where it has contributed around 59–84% to modern Han Chinese groups. Third, people from Taiwan from around 1300 bc to ad 800 derived approximately 75% of their ancestry from a lineage that is widespread in modern individuals who speak Austronesian, Tai–Kadai and Austroasiatic languages, and that we hypothesize derives from farmers of the Yangtze River valley. Ancient people from Taiwan also derived about 25% of their ancestry from a northern lineage that is related to, but different from, farmers of the Yellow River Basin, which suggests an additional north-to-south expansion. Fourth, ancestry from Yamnaya Steppe pastoralists arrived in western Mongolia after around 3000 bc but was displaced by previously established lineages even while it persisted in western China, as would be expected if this ancestry was associated with the spread of proto-Tocharian Indo-European languages. Two later gene flows affected western Mongolia: migrants after around 2000 bc with Yamnaya and European farmer ancestry, and episodic influences of later groups with ancestry from Turan.

East Asia was one of the earliest centres of animal and plant domestication, and harbours an extraordinary diversity of language families including Sino-Tibetan, Tai–Kadai, Austronesian, Austroasiatic, Hmong–Mien, Indo-European, Mongolic, Turkic, Tungusic, Koreanic, Japonic, Yukaghiric and Chukotko–Kamchatkan1. Our current understanding of the human population history in the region remains poor because of the minimal sampling of genetic diversity of present-day people on the Tibetan Plateau and southern China2, and a paucity of ancient DNA data compared with West Eurasia3–6.

We collected DNA from 383 people from 46 populations from China (n = 337) and Nepal (n = 46) who provided informed consent for broad studies of population history; we carried out community consultation with minority group leaders as an integral part of the consent process (see Methods, ‘Ethics statement’). We genotyped DNA using the Affymetrix Human Origins array at about 600,000 single-nucleotide polymorphisms (SNPs) (Extended Data Table 1 and Supplementary Information section 1).

For ancient individuals, we obtained permission for analysis from sample custodians, following protocols to minimize damage to skeletal material and including members of local minority groups as part of our study team when there was a plausible cultural connection between modern communities and ancient individuals (see Methods, ‘Ethics statement’). We prepared powder from bones and teeth, extracted DNA, and prepared double- or single-stranded libraries for sequencing on Illumina instruments (Methods). For most samples, we enriched the DNA for a set of about 1.2 million SNPs3,7; for the Chinese samples, we used exome enrichment (Methods, Supplementary Information section 1 and Supplementary Table 1). We sequenced the DNA and processed the data using one of two nearly identical bioinformatics procedures (Methods and Supplementary Table 2), for which we found indistinguishable results from the perspective of analyses of population history (Supplementary Table 3). We considered samples to fail screening if they had fewer than 5,000 of the targeted SNPs covered at least once; if they had a too-low rate of

A list of authors and their affiliations appears online.

Nature | Vol 591 | 18 March 2021 | 413

cytosine-to-thymine substitution in the terminal nucleotide; or if they showed evidence of major contamination based on polymorphisms in mitochondrial DNA sequences8 or the X chromosome in male individuals9 or a ratio of Y-to-X chromosomes that would be unexpected for a male or female individual (Supplementary Tables 1, 2). We newly report data from 166 individuals (Fig. 1 and Supplementary Table 1): 82 individuals from Mongolia from between around 5700 bc and ad 1400, 11 individuals from the Chinese Mainland from a site dating to approximately 3000 bc in the Yellow River Basin, 7 individuals from Japan comprising Jomon hunter-gatherers dating to around 2500–800 bc, 18 individuals from the Russian Far East interred in the Boisman-2 cemetery dating to 5400–3600 bc as well as an individual dating to around 900 bc and another dating to around ad 1100, and 46 individuals from 2 sites in Taiwan dating to between 1300 bc and ad 800 (Supplementary Table 1). For analysis we focused on 130 individuals after excluding 16 individuals with evidence of low but non-zero contamination, 10 individuals in which 5,000–15,000 SNPs were covered and 11 individuals who were close relatives of another higher-coverage individual in the dataset (Extended Data Table 2). We merged our dataset with published data: 1,079 ancient individuals reported in 30 publications (Supplementary Table 4a) and 3,265 present-day individuals reported in 16 publications (Supplementary Table 4b). We grouped individuals by geography, time (aided by 108 newly reported direct dates; Supplementary Table 5), archaeological context and genetic cluster (Supplementary Table 1).

We carried out principal component analysis10, projecting ancient individuals onto axes computed using present-day people. The population structure is correlated with geography (R² = 0.261; P < 0.0001) and language (R² = 0.087; P < 0.0001) (Supplementary Table 6), with some exceptions. Groups in northwest China, Nepal and Siberia deviate towards West Eurasian populations (Supplementary Information section 2), reflecting admixture that occurred, on average, between 5 and 70 generations ago11 (Supplementary Tables 7, 8). Differentiation was much higher in East Asian individuals living in the early Holocene (fixation index (FST) = 0.067) compared to present-day populations (FST = 0.013) (Supplementary Table 9), reflecting mixture between deep East Asian lineages. Present-day East Asian individuals with minimal West-Eurasian-related ancestry grade between three poles. The ‘Amur Basin cluster’ correlates with ancient and present-day people in the Amur River Basin, and linguistically with speakers of Tungusic languages and the Nivkh. The ‘Tibetan Plateau cluster’ is most strongly represented in ancient people from Nepal and Indigenous Tibetan peoples. The ‘Southeast Asian cluster’ is maximized in ancient Taiwan and in East Asian individuals speaking Tai–Kadai, Austroasiatic and Austronesian languages (Extended Data Figs. 1–3). Automated clustering12 provides similar results (Extended Data Fig. 4 and Supplementary Information section 2).

We organize our findings around themes. First, we considered deep time and determined the early lineages contributing to East Asian populations. Then, we shed light on how population structure came to be how it is today by testing three hypotheses about language expansions and their possible connection to farming spreads. Finally, we document how West and East Eurasian groups mixed along their geographical contact zone.

A Late Pleistocene coastal expansion

Only two pre-Ice Age genomes are available from East Asia: the approximately 40,000-year-old individual from Tianyuan Cave in northern China13 and the around 35,000-year-old Salkhit individual from Mongolia14. Nevertheless, important insights can be gleaned from analysis of post-Ice Age genomes. One question concerns the extent to which the peopling of East Asia by modern humans occurred via a coastal or interior route. Suggestive genetic evidence for a coastal route comes from Y chromosome data, as Tibetan populations have a high frequency (around 50%) of the deeply branching haplogroup D-M174, which is shared with modern Japanese groups (and ancient Jomon hunter-gatherers of Japan) along with Indigenous Andaman islanders of the Bay of Bengal15.

We used qpGraph16 to explore scenarios of population splits and gene flow that are consistent with the data and thereby to identify a parsimonious working model for the deep history of key lineages that contribute to ancestry extremes in our principal component analysis (Extended Data Fig. 5 and Supplementary Information section 3). Our fit (Fig. 2 and Extended Data Fig. 6) suggests that much of the ancestry of East Asian individuals can be derived from mixtures in different proportions of two ancient populations: one from the same lineage as the approximately 40,000-year-old Tianyuan individual10,13 and the other from the same lineage as Indigenous Andaman Islanders (Onge). We infer that a Tianyuan-related lineage with a northern geographical distribution contributed 98% of the ancestry of Neolithic people from Mongolia and 90% to Neolithic farmers from the Upper Yellow River. (The Upper Yellow River farmer lineage then mixed with an Onge-related branch, which we speculate is related to Tibetan hunter-gatherers, to form modern Tibetan populations.) We infer that another Tianyuan-related lineage with a more southern geographical distribution contributed 73% of the ancestry of a hunter-gatherer from the Liangdao site on an island off the southeast coast of China17 and 56% to Jomon hunter-gatherers from Japan. Japan was occupied by humans before and after the Ice Age and southern and northern Jomon were morphologically distinct18, which may relate to the admixture that we detect. The northerly Tianyuan-related lineage also contributed to farmers from the West Liao River (67%) and from Taiwan (25%), with the rest of the ancestry of these latter groups being related to Liangdao southern hunter-gatherers. The fact that this northern Tianyuan-related lineage is different from (albeit related to) the lineage that contributed to farmers from the Upper Yellow River suggests that there was probably an expansion of northern farmers to Taiwan that was not linked to the expansion of Yellow River farmers.

The contributions of the Onge-related lineage are concentrated in coastal groups: we estimate 100% in Andamanese, 44% in Jomon and 20% in ancient Taiwan farmers, consistent with the coastal route expansion hypothesized based on the Y-chromosomal haplogroup D-M174 that is found in both Andamanese and Japanese populations15. Although Tibet is not coastal, the relatively high inferred contribution of this lineage to ancient Tibetan populations (16%) and the presence of D-M174 with a frequency of around 50% in modern Tibetan individuals provide a link between this Y-chromosomal haplogroup and Onge-related ancestry. We hypothesize that Tibetan hunter-gatherers represent an early splitting branch of this Late Pleistocene coastal expansion that spread inland and occupied the high plateau.

Refining the trans-Eurasian hypothesis

The farming-and-language-dispersal hypothesis19 suggests that increases in population densities in and around centres of domestication were important in propelling movements of people that spread languages. However, in East Asia there have been limited data available to test this theory. We searched for genetic correlates of the ‘trans-Eurasian hypothesis’20, which proposes a macrofamily that includes Mongolic, Turkic, Tungusic, Koreanic and Japonic languages based on reconstructed features including shared agricultural terms. The trans-Eurasian hypothesis proposes that languages of these families descend from a proto-language that was associated with the expansion of early millet farmers around the West Liao River in northeast China who spread west towards Mongolia, north towards Siberia and east towards Korea and Japan.

To obtain insight into possible genetic correlates of this language spread, we studied our time transect in the Amur River Basin21. From the early Neolithic individuals (around 5500 bc) and Boisman individuals (about 5000 bc) until the Iron Age Yankovsky culture (around 900 bc) and Xianbei culture (ad 50–250), individuals from the Amur

Fig. 1 | Overview. a, Locations, sample size (in brackets) and temporal distribution of newly reported ancient individuals, plotted using the ‘Google Map Layer’ from ArcGIS Online Basemaps (map data ©2020 Google). b, Plot of the first and second principal components (PCs) defined in an analysis of East Asian individuals with minimal West Eurasian-related mixture. cal.ad, calibrated years ad; cal.bc, calibrated years bc. N, Neolithic; BA, Bronze Age; IA, Iron Age; E, Early; M, Middle; L, Late.
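Panel b of Fig. 1 rests on projecting ancient individuals onto principal axes that were computed from present-day people only. The following is a minimal NumPy sketch of that idea under simplifying assumptions, not the software used in the study. The toy genotype matrices are randomly generated and hypothetical, genotypes are assumed to be complete (real ancient data have many missing sites, which this sketch ignores), and the function names `fit_pca` and `project` are illustrative.

```python
import numpy as np


def fit_pca(present_day, n_components=2):
    """Compute principal axes from present-day genotypes only."""
    mean = present_day.mean(axis=0)
    centered = present_day - mean
    # Rows of vt are orthonormal principal axes in SNP space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(samples, mean, axes):
    """Place samples on axes that were fitted without them."""
    return (samples - mean) @ axes.T


rng = np.random.default_rng(0)
# Hypothetical toy data: rows are individuals, columns are SNPs,
# entries are genotype counts coded 0, 1 or 2.
present_day = rng.integers(0, 3, size=(20, 50)).astype(float)
ancient = rng.integers(0, 3, size=(3, 50)).astype(float)

mean, axes = fit_pca(present_day)
ancient_pcs = project(ancient, mean, axes)
print(ancient_pcs.shape)
```

The usual motivation for fitting the axes on present-day data alone is that sparse ancient genotypes then cannot shape the axes; the ancient individuals are only placed on them afterwards.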

River Basin are consistent with being a clade according to qpWave (Supplementary Table 10). This locally continuous population also contributed to later populations, as reflected in the Y-chromosomal haplogroup C2b-F1396 and mitochondrial haplogroups D4 and C5 of Boisman individuals (which are predominant in present-day speakers of Tungusic, Mongolic and some Turkic languages) and in an individual from the Heishui Mohe culture (around ad 1100) who had an estimated 43 ± 15% ancestry from the Amur River Basin lineage (the remaining ancestry was well-modelled as Han Chinese ancestry, which could be expected if there was an immigration from the south in historical times) (Supplementary Table 10). This anciently established Amur River Basin lineage was part of a cline of more Jomon-relatedness in the east and more Mongolian Neolithic-related ancestry in the west. We infer 77–94% Mongolian Neolithic-related ancestry in Baikal hunter-gatherers5 (the remainder from Ancient North Eurasian populations comprising a deeply splitting West Eurasian-related lineage that was established in the Baikal region during the Ice Age) (Supplementary Table 11). We infer around 87% Mongolian Neolithic-related ancestry in Amur River Basin hunter-gatherers such as Boisman (the remaining ancestry is Jomon-related). Native American individuals share more alleles with Boisman and the Mongolian Neolithic individuals than with most other East Asian populations, suggesting that an early branch of this lineage, reflecting the northern distribution of the Tianyuan-related branch in Fig. 2, was the source for the East-Asian-related ancestry in Native American peoples (Supplementary Table 12).

The trans-Eurasian hypothesis is that the Mongolic, Turkic, Tungusic, Koreanic and Japonic proto-languages were spread by agriculturalists from the West Liao River region, who had a mixture of ancestries related to individuals from the Upper Yellow River (around 67%) and Liangdao (around 33%) (Fig. 2). Notably, we observe that this characteristic mixture of ancestries is absent from the time transects of Mongolia and the Amur River Basin in our study (Fig. 3), which is not what is expected on the basis of the hypothesis that expansions of West Liao River farmers spread Mongolic and Tungusic languages. By contrast, the ancestry of West Liao River farmers did plausibly have an influence further east. For example, we can model present-day Japanese populations as two-way mixtures of around 92% West Liao River farmer-related ancestry from the Bronze Age and about 8% Jomon-related ancestry, with a negligible contribution from sources related to Yellow River farmers. We confirmed this by including the Yellow River farming groups in the outgroup set of the qpAdm analysis of Japanese populations and finding that the models continued to fit (Supplementary Tables 13, 14). The West Liao River ancestry is consistent with having been transmitted through Korea, as Japanese populations can be modelled as deriving from Korean (91%) and Jomon (9%) groups (Supplementary Tables 13, 14). None of the six Jomon individuals reported here carried the derived allele in the gene encoding the EDAR(V370A) variant of the human ectodysplasin A receptor, which affects hair, sweat and mammary glands (Supplementary Table 15). This variant has been estimated to have arisen in mainland East Asia around 30,000 years ago22 and then reached a high frequency in nearly all Holocene individuals from mainland East Asia and the Americas. The fact that it is nearly absent from the Jomon people highlights the genetic distinctiveness of this population compared with mainland groups.

Northern origin of Sino-Tibetan languages

The Tibetan Plateau has been occupied by modern humans since 40,000–30,000 years ago23, but it is only since around 1600 bc, with the advent of agriculture, that there is evidence for permanent occupation24. Indigenous Tibetan peoples speak Sino-Tibetan languages linked to languages in the coastal plain of China. The northern origins hypothesis for the origin of these closely related languages suggests that farmers who cultivated foxtail millet in the Upper and Middle Yellow River Basin expanded southwest to the Tibetan Plateau and spread present-day Tibeto-Burman languages, and east and south to the Central Plains and eastern coast, spreading Sinitic languages including the linguistic ancestor of Han Chinese25. The southern origins hypothesis suggests that the proto-language arose in the Tibetan–Yi Corridor connecting the highlands to the lowlands, and then expanded in the early Holocene26.

To shed light on Tibetan ancestry, we grouped 17 present-day populations into three genetic clusters (Extended Data Fig. 7): ‘Core Tibetan individuals’; ‘northern Tibetan individuals’ who are admixed between lineages related to Core Tibetan and West Eurasian individuals; and ‘Tibeto–Yi Corridor’ populations who we estimate using qpAdm3,16 have 30–70% ancestry related to Southeast Asian populations (Supplementary Table 16) and include not only speakers of Tibetan languages but also speakers of Qiang and Lolo-Burmese languages. Ancient farmers


from the Yellow River and present-day Han and Qiang individuals share the most drift with Core Tibetan individuals (Supplementary Table 17), consistent with the hypothesis that Tibetan, Han and Qiang peoples all harbour ancestry from a population related to Neolithic farmers from the Yellow River. We confirm large-scale admixture related to Yellow River farmers (minimum 22% but plausibly a much higher percentage, which is consistent with the 84% estimate in Fig. 2) in Core Tibetan individuals through the decay of admixture linkage disequilibrium11. This provides independent evidence that Core Tibetan populations and their genetically almost indistinguishable relatives in ancient Nepal are unlikely to represent continuous descendants of Tibetan hunter-gatherers75. We estimate that mixture occurred between, on average, around 290 bc and ad 270 using models of a single pulse of admixture (Supplementary Table 18). The start of admixture could plausibly be as long ago as around 1600 bc, the inferred date for the spread of agriculture onto the Tibetan plateau.

Han Chinese populations are characterized by a north–south genetic cline27,28. Farmers from the Upper and Middle Yellow River and Tibetan individuals share more alleles with Han Chinese populations compared with the Southeast Asian cluster, whereas the Southeast Asian cluster groups share more alleles with most Han Chinese groups when compared with Yellow River farmers (Supplementary Tables 19, 20). Using qpWave3,29, we determined that two sources are consistent with contributing all of the ancestry of most Han Chinese individuals (Supplementary Table 21), with the exception of the northern Han populations

Fig. 2 | Model of deep population relationships. We started with a skeleton tree with one admixture event that fits the data for Denisovan, Mbuti, Onge, Tianyuan and Loschbour according to qpGraph. We grafted on Mongolia East Neolithic (E Neo), Late Neolithic farmers from the Upper Yellow River, Liangdao 2, Japan Jomon, Nepal Chokhopani, Taiwan Hanben and Late Neolithic farmers from the West Liao River, adding them consecutively to all possible edges and retaining only graphs in which every difference between fitted and estimated statistics had |Z| < 3 (maximum |Z| = 2.95 here). We used relative population split time estimates from the multiple sequentially Markovian coalescent (MSMC) and MSMC2 analyses48,49 to constrain models. a, We colour lineages modelled as derived from the hypothesized coastal expansion (green), interior southern expansion (red) or interior northern expansion (blue), and populations according to ancestry proportions. Dashed lines represent admixture (proportions are indicated). The grey circles represent sampled populations and white circles represent unsampled hypothesized nodes. b, Locations and dates of East Asian individuals used in model fitting, with colours indicating the majority ancestry source, are plotted using the ‘Google Map Layer’ from ArcGIS Online Basemaps (map data ©2020 Google).

Fig. 3 | Estimates of mixture proportions using qpAdm. a, qpAdm modelling of ancestry related to Yellow River farmers (blue) and Liangdao (orange) in present-day East Asian populations. Proportions are described in Supplementary Table 22 and the map was plotted using the ‘Google Map Layer’ from ArcGIS Online Basemaps (map data ©2020 Google). CHB, Han Chinese in Beijing; CHS, Han Chinese South; Upper_YR_LN, Upper Yellow River Late Neolithic. b, Mongolian and Xinjiang populations. As sources we explored all possible subsets of Mongolia_East_N, Afanasievo, west Siberian hunter-gatherers (WSHG), Sintashta_MLBA, Turkmenistan_Gonur_BA_1 and Han Chinese individuals, adding all groups to the reference set when not used as sources, and identifying parsimonious models (smallest numbers of sources) that fit at P > 0.05 based on the Hotelling T² test implemented in qpAdm (Supplementary Table 25). These P values do not incorporate any correction for multiple-hypothesis testing. *Parsimonious models pass at only P > 0.01. **Multiple equally parsimonious models pass at P > 0.05, so we cannot determine whether the West-Eurasian-related source was Afanasievo, west Siberian hunter-gatherers or Sintashta_MLBA (we plot the model with the largest P value). Bars show ancestry proportions, and time spans are unions of all samples. We do not visualize results from singleton outliers. N, Neolithic; BA, Bronze Age; IA, Iron Age; E, Early; M, Middle; L, Late.
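The model search described in the Fig. 3 caption (try all subsets of candidate sources and keep the models with the fewest sources that pass a P-value threshold) can be sketched as follows. This is an illustrative Python outline, not qpAdm itself: `p_value` is a hypothetical callback standing in for a real qpAdm run (which would also move unused candidates into the reference set), the source labels are taken from the caption, and the P values in `toy_p` are invented purely for demonstration.

```python
from itertools import combinations


def parsimonious_models(sources, p_value, threshold=0.05):
    """Return every passing model that uses the smallest number of sources."""
    for k in range(1, len(sources) + 1):
        passing = [set(combo) for combo in combinations(sources, k)
                   if p_value(frozenset(combo)) > threshold]
        if passing:
            # Stop at the first size with any passing model; ties are all kept.
            return passing
    return []


SOURCES = ["Mongolia_East_N", "Afanasievo", "WSHG",
           "Sintashta_MLBA", "Turkmenistan_Gonur_BA_1", "Han"]

# Invented P values for a made-up target group (assumption, not study data).
toy_p = {
    frozenset({"Mongolia_East_N"}): 1e-6,
    frozenset({"Mongolia_East_N", "Afanasievo"}): 0.21,
    frozenset({"Mongolia_East_N", "Sintashta_MLBA"}): 0.03,
}


def lookup(model):
    """Stand-in for a qpAdm run: unknown models are treated as rejected."""
    return toy_p.get(model, 0.0)


best = parsimonious_models(SOURCES, lookup)
relaxed = parsimonious_models(SOURCES, lookup, threshold=0.01)
print(len(best), len(relaxed))
```

In this toy case the stricter threshold leaves one two-source model, while relaxing it to 0.01 yields two equally parsimonious models, which mirrors the situations the caption marks with one and two asterisks.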

for whom we infer West-Eurasian-related admixture of 2–4% (Supplementary Tables 7, 8). We estimate this mixture occurred, on average, 32–45 generations ago, which overlaps the Tang (ad 618–907) and Song (ad 960–1279) dynasties for which historical records of integration of Han Chinese and western ethnic groups are available. For all other Han Chinese groups, we estimate that 59–84% of ancestry is related to farmers from the Upper and Middle Yellow River, and the remainder from a population related to the ancient Liangdao hunter-gatherers. This latter group possibly corresponds to rice farmers of the Yangtze River Basin, an inference that gains strength from the fact that it comprises the primary ancestry of many Austronesian speakers, Tai–Kadai speakers on Hainan Island (Li, around 66%), Southeast Asian individuals from the Bronze Age and around two-thirds of the ancestry of some Austroasiatic speakers30,31 (Fig. 3 and Supplementary Table 22).

Our results support the northern origins hypothesis for Sino-Tibetan languages, as we detect a specific link between present-day individuals who speak Sino-Tibetan languages and farmers from the Upper and Middle Yellow River. A timing that coincides with the archaeologically attested expansions of farming from this region is also supported by the Y-chromosome evidence of a shared haplogroup (Oα-F5) between Han Chinese and Tibetan peoples that derives from a single male ancestor around 3800 bc32. The cline of increasing Liangdao-related ancestry in present-day southern Han Chinese people is plausibly due to expanded mixing of Han Chinese individuals with southern groups as they spread into southern China as recorded in the historical literature33. However, this was not the first southward migration, as southern Chinese populations are genetically closer to Late Neolithic farmers from the Yellow River than to earlier Middle Neolithic ones34 and because we also observe about 25% northern ancestry in ancient farmers from Taiwan (Fig. 2).

Rice farming expansions linked by shared ancestry

Previous ancient DNA analysis in Southeast Asia has shown that the earliest farmers of Southeast Asia had about two-thirds ancestry from East Asian populations that were plausibly related to southern Chinese agriculturalists, and about one-third ancestry from a deeply diverged hunter-gatherer lineage, a pattern that is most strongly evident in Austroasiatic speakers, which suggests that there is an association with the spread of these languages30,31. By capitalizing on our time series, which spans about 2,000 years from ancient Taiwan, we confirm that this was part of a broader pattern. The ancient individuals from Taiwan show strong genetic links to modern Austronesian speakers, a connection that is further supported by the fact that the dominant haplogroups in these ancient individuals are the Y-chromosome lineage O3a2c2-N6 and maternal mitochondrial DNA lineages E1a, B4a1a, F3b1 and F4b35,36. These Y-chromosome and mitochondrial lineages are shared by modern Indigenous Taiwanese peoples, and mitochondrial lineages are also present in individuals of the Lapita culture from Vanuatu who were plausibly part of the first spread of Austronesian languages into the southwest Pacific region37 (Supplementary Table 12). Ancient Taiwan groups and modern Indigenous Taiwanese peoples who speak Austronesian languages share significantly more alleles with speakers of Tai–Kadai languages in southern Chinese Mainland and in Hainan Island38 than with other East Asian populations (Supplementary Table 12), which is consistent with the hypothesis that ancient populations that were related to present-day speakers of Tai–Kadai languages and descended more anciently from farmers of the Yangtze River (for whom ancient DNA samples have not yet been analysed) spread agriculture to Taiwan around 3000 bc39. A surprising finding is our observation that ancient North Chinese individuals are more closely related to

Nature | Vol 591 | 18 March 2021 | 417

ancient individuals of our Taiwan time transect than to early Holocene hunter-gatherers on the mainland side of the Straits of Taiwan (Supplementary Table 23). This suggests gene flow from Neolithic northern Chinese Mainland into Taiwan, which we estimate to be around 25% if we model it as derived from one of the two source lineages of Yellow River farmers (Fig. 2). This ancestry does not fit as coming from Yellow River farmers themselves, suggesting a north-to-south migration that is not associated with expansions of these farmers. A speculative possibility is that this ancestry was carried by cultivators of foxtail millet, which was domesticated in the north by around 8000 bc40 and which, in the south, appears relatively early in the Neolithic Tapenkeng culture (around 3000–2500 bc) of Taiwan.

Admixture of West and East Eurasian populations
Mongolia falls near the eastern extreme of the Eurasian Steppe, and archaeological evidence shows that throughout the Holocene this region was a conduit for cultural exchanges between East and West . For example, the Afanasievo culture, an eastward extension of the Yamnaya steppe pastoralist culture, brought the first dairying to the region41 and had a cultural influence on subsequent phenomena such as Chemurchek.

Our Mongolian time transect overwhelmingly derives ancestry from four sources from 6000 to 600 bc. The earliest-established source (and the only source that is primarily East-Asian-associated) is represented at essentially 100% frequency in the two East Mongolian hunter-gatherer individuals from the Neolithic (6000–5000 bc) who are some of the earliest individuals in our dataset (Fig. 3 and Supplementary Tables 24, 25). The second source appears the earliest in seven Neolithic hunter-gatherers from northern Mongolia from 5700 to 5400 bc who can be modelled as having around 5% of ancestry related to previously reported west Siberian hunter-gatherers6 (Supplementary Table 25). The third source appears the earliest in individuals from the Afanasievo culture (around 3100 bc), who are genetically extremely similar to Yamnaya steppe pastoralists, which is consistent with the pattern in individuals of the Afanasievo culture from Russia4,6. The fourth source appears by around 1400 bc and is well-modelled as deriving from people with ancestry similar to the pastoralists of the Sintashta culture who derive from a mixture of the Yamnaya culture (around two-thirds) and European farmers (approximately one-third).

To quantify the admixture history in Mongolia, we used qpAdm3,16 (Supplementary Table 25). Many eastern Mongolian individuals can be modelled as simple two-way admixtures of Neolithic eastern Mongolian populations as one source (65–100%) and the remainder of the ancestry deriving from west Siberian hunter-gatherers (Fig. 3). The individuals who fit this model were not only from Neolithic groups (0–5% west Siberian hunter-gatherers), but also a child from the Early Bronze Age from the Afanasievo Kurgak govi site (15%), the Ulgii group (21%), the main grouping from the Middle Bronze Age Munkhkhairkhan culture (31–36%) and, in the Late Bronze Age, a combined group from the Centre–West region (24–31%), as well as individuals of the Mongun Taiga type (35%). The fact that the child from Kurgak govi has no evidence of Yamnaya-related ancestry despite his clear Afanasievo cultural association and chronology makes him the first case of an individual buried with Afanasievo traditions who has no evidence of Yamnaya ancestry. The legacy of the spread during the Yamnaya era into Mongolia continued in two individuals from the Chemurchek culture whose ancestry can only be modelled using Yamnaya–Afanasievo ancestry as a source (around 33–51%) (Supplementary Table 25). This fits even when ancient European farmers are included in the outgroups, providing no evidence for the theory that long-distance movement of people spread West European megalithic cultural traditions to people of the Chemurchek culture42.

The one instance before 600 bc for which our four source model does not fit occurs in a Chemurchek individual (P = 3.7 × 10⁻⁴ from qpAdm), but we can successfully model the ancestry of this individual by adding 15% additional ancestry from populations related to the Turan region far to the south (Fig. 3). A parallel study43 models a Chemurchek-associated individual as a mixture of Turan and early pastoralists from the site of Botai, without any of the other three ancestries that we detect in all Chemurchek individuals in our study. As our best-fit model passes when Botai is in the reference set (P > 0.63) (Supplementary Table 25), the two findings would indicate an extremely complex origin for Chemurchek if both were correct, with one migration stream carrying Botai-related ancestry and the other not carrying it.

From the Middle Bronze Age, there is no compelling evidence in the Mongolian time transect data for the persistence of the Yamnaya-derived lineages that spread with the Afanasievo culture. Instead, the Yamnaya-related ancestry can only be modelled as deriving from a later spread related to people of the Sintashta and Andronovo horizons of the Middle to Late Bronze Age who were themselves a mixture of around two-thirds Yamnaya-related and one-third European farmer-related ancestry4–6. The Sintashta-related ancestry is detected in proportions of 0–57% in groups from this time onward, with substantial proportions of Sintashta-related ancestry only in western Mongolia (Fig. 3 and Supplementary Table 25). For all of these groups, qpAdm ancestry models pass with Afanasievo groups in the outgroups whereas models with the Afanasievo-associated peoples as the source and Sintashta-related groups in the outgroups are all rejected (Fig. 3 and Supplementary Table 25).

New ancestry began reaching Mongolia in large proportions starting in the Late Bronze Age, with qpAdm models failing when using Neolithic eastern Mongolian populations as a single East Asian source in some individuals from the Late Bronze Age of Khovsgol, Ulaanzukh and the Centre–West region, two individuals from the Early Iron Age associated with Slab Grave culture, and for , Xianbei and Mongol peoples. However, when we include Han Chinese populations as a source, we estimate Han-related ancestry proportions of 9–80% in the aforementioned individuals (Supplementary Table 25). Turan-derived ancestry spread into the region again by the sixth to fourth century bc as we detect it in multiple individuals from the Iron Age Sagly culture. We find that alleles with two polymorphisms (rs1426654 and rs16891982) that are associated with light skin pigmentation and one (rs12913832) associated with blue eyes in European individuals occur frequently in the Sagly samples, but that the rs4988235 allele associated with lactose tolerance is nearly absent in all East Asian individuals that we analysed (Supplementary Table 15).

Although the Yamnaya–Afanasievo-associated lineages are consistent with having largely disappeared in Mongolia by the Middle to Late Bronze Age, we confirm and strengthen previous ancient DNA analysis that suggested that the legacy of this expansion persisted in into the time of the Iron Age Shirenzigou culture (410–190 bc)44. Considering many of the Shirenzigou individuals separately as well as three of the five genetically homogeneous subclusters, the only parsimonious models derive all of their West-Eurasian-related ancestry from groups related to the Afanasievo culture, confirming that Afanasievo ancestry without the characteristic European farmer-related mixture, which appeared later in and Mongolia, persisted in Xinjiang. For example, for the two individuals with the most West-Eurasian-related ancestry (Xinjiang_EIA_Shirenzigou_1C), all three-way models that fit include Russian Afanasievo ancestry (71–77%) (Fig. 3 and Supplementary Table 25). Moreover, the total ancestry from the two other West-Eurasian-related groups that can fit in small proportions in such models is always less than 9% (Supplementary Table 25). In pre-state societies, languages are thought to spread primarily through the movements of people45, and these results therefore add weight to the theory that the Tocharian languages of the Tarim Basin spread through the migration of Yamnaya descendants to the Altai Mountains and Mongolia (in the guise of the Afanasievo culture), from

whence they spread further to Xinjiang4–6,44,46,47. These results are important for theories of the diversification of Indo-European languages, as they increase the evidence in favour of the hypothesis that the split of the second-oldest branch in the Indo-European language tree occurred at the end of the fourth millennium bc44,46,47.

Conclusion
While this study marks considerable progress in understanding the population , the findings raise as many questions as answers, motivating the collection of additional ancient DNA data. A particular priority should be to generate an ancient DNA time transect through southern China, including early farmers of the Yangtze River region (the putative source for the ancestry prevalent in the Southeast Asian Cluster of present-day groups), which would make it possible to test and extend the model presented in this study, and to better understand how dispersals of languages in Southeast Asia do or do not correlate to ancient movements of people. Another priority should be to generate data on many additional pre-Ice Age individuals from East Asia, which will make it possible to test the model of deep population relationships presented in Fig. 2 and to better understand the origins, migrations, and mixtures of the diverse modern human populations that have lived in East Asia for more than 40,000 years.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-021-03336-2.

1. Cavalli-Sforza, L. L. The Chinese human genome diversity project. Proc. Natl Acad. Sci. USA 95, 11501–11503 (1998).
2. HUGO Pan-Asian SNP Consortium. Mapping human genetic diversity in Asia. Science 326, 1541–1545 (2009).
3. Haak, W. et al. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207–211 (2015).
4. Allentoft, M. E. et al. Population genomics of Bronze Age Eurasia. Nature 522, 167–172 (2015).
5. Damgaard, P. B. et al. 137 ancient human genomes from across the Eurasian . Nature 557, 369–374 (2018).
6. Narasimhan, V. M. et al. The formation of human populations in South and Central Asia. Science 365, eaat7487 (2019).
7. Fu, Q. et al. An from Romania with a recent ancestor. Nature 524, 216–219 (2015).
8. Fu, Q. et al. DNA analysis of an early modern human from Tianyuan Cave, China. Proc. Natl Acad. Sci. USA 110, 2223–2227 (2013).
9. Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: analysis of next generation sequencing data. BMC Bioinformatics 15, 356 (2014).
10. Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).
11. Loh, P. R. et al. Inferring admixture of human populations using linkage disequilibrium. Genetics 193, 1233–1254 (2013).
12. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
13. Yang, M. A. et al. 40,000-year-old individual from Asia provides insight into early population structure in Eurasia. Curr. Biol. 27, 3202–3208 (2017).
14. Massilani, D. et al. Denisovan ancestry and population history of early East Asians. Science 370, 579–583 (2020).
15. Wang, C. C. & Li, H. Inferring in East Asia from Y chromosomes. Investig. Genet. 4, 11 (2013).
16. Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
17. Yang, M. A. et al. Ancient DNA indicates human population shifts and admixture in northern and southern China. Science 369, 282–288 (2020).
18. Nakashima, A., Ishida, H., Shigematsu, M., Goto, M. & Hanihara, T. Nonmetric cranial variation of Jomon Japan: implications for the evolution of eastern Asian diversity. Am. J. Hum. Biol. 22, 782–790 (2010).
19. Bellwood, P. & Renfrew, C. Examining the Farming/Language Dispersal Hypothesis (McDonald Institute for Archaeological Research, 2002).
20. Robbeets, M. & Savelyev, A. The Oxford Guide to the Transeurasian Languages (Oxford Univ. Press, 2020).
21. Siska, V. et al. Genome-wide data from two early Neolithic East Asian individuals dating to 7700 years ago. Sci. Adv. 3, e1601877 (2017).
22. Kamberov, Y. G. et al. Modeling recent human evolution in mice by expression of a selected EDAR variant. Cell 152, 691–702 (2013).
23. Zhang, X. L. et al. The earliest human occupation of the high-altitude Tibetan Plateau 40 thousand to 30 thousand years ago. Science 362, 1049–1051 (2018).
24. , F. H. et al. Agriculture facilitated permanent human occupation of the Tibetan Plateau after 3600 B.P. Science 347, 248–250 (2015).
25. Zhang, M., , S., Pan, W. & , L. Phylogenetic evidence for Sino-Tibetan origin in northern China in the Late Neolithic. Nature 569, 112–115 (2019).
26. van Driem, G. in The Peopling of East Asia: Putting Together Archaeology, and Genetics (eds Sagart, L. et al.) 81–106 (Routledge, 2005).
27. Liu, S. et al. Genomic analyses from non-invasive prenatal testing reveal genetic associations, patterns of viral infections, and Chinese population history. Cell 175, 347–359 (2018).
28. Chiang, C. W. K., Mangul, S., Robles, C. & Sankararaman, S. A comprehensive map of genetic variation in the world’s largest ethnic group—Han Chinese. Mol. Biol. Evol. 35, 2736–2750 (2018).
29. Reich, D. et al. Reconstructing Native American population history. Nature 488, 370–374 (2012).
30. Lipson, M. et al. Ancient genomes document multiple waves of migration in Southeast Asian prehistory. Science 361, 92–95 (2018).
31. McColl, H. et al. The prehistoric peopling of Southeast Asia. Science 361, 88–92 (2018).
32. Wang, L. X. et al. Reconstruction of Y-chromosome phylogeny reveals two neolithic expansions of Tibeto-Burman populations. Mol. Genet. Genomics 293, 1293–1300 (2018).
33. Ge, J. X., , S. D. & Chao, S. J. Zhongguo yimin shi (The Migration ) (Fujian People’s Publishing House, 1997).
34. Ning, C. et al. Ancient genomes from northern China suggest links between subsistence changes and human migration. Nat. Commun. 11, 2700 (2020).
35. , L. H. et al. Phylogeography of Y-chromosome haplogroup O3a2b2-N6 reveals patrilineal traces of Austronesian populations on the eastern coastal regions of Asia. PLoS ONE 12, e0175080 (2017).
36. Ko, A. M. et al. Early Austronesians: into and out of Taiwan. Am. J. Hum. Genet. 94, 426–436 (2014).
37. Skoglund, P. et al. Genomic insights into the peopling of the Southwest Pacific. Nature 538, 510–513 (2016).
38. Lipson, M. et al. Reconstructing Austronesian population history in island Southeast Asia. Nat. Commun. 5, 4689 (2014).
39. Bellwood, P. The checkered prehistory of rice movement southwards as a domesticated cereal—from the Yangzi to the equator. Rice 4, 93–103 (2011).
40. Yang, X. et al. Early millet use in northern China. Proc. Natl Acad. Sci. USA 109, 3726–3730 (2012).
41. Wilkin, S. et al. Dairy pastoralism sustained eastern Eurasian steppe populations for 5,000 years. Nat. Ecol. Evol. 4, 346–355 (2020).
42. Kovalev, A. The great migration of the Chemurchek people from France to the Altai in the early 3rd millennium bce. Int. J. Eurasian Stud. 1, 1–58 (2011).
43. Jeong, C. et al. A dynamic 6,000-year genetic history of Eurasia’s Eastern Steppe. Cell 183, 890–904 (2020).
44. Ning, C. et al. Ancient genomes reveal Yamnaya-related ancestry and a potential source of Indo-European speakers in Iron Age Tianshan. Curr. Biol. 29, 2526–2532 (2019).
45. Bellwood, P. in The Encyclopedia of Global Human Migration (Wiley-Blackwell, 2013).
46. Mallory, J. P. In Search of the Indo-Europeans: Language, Archaeology and Myth (Thames & Hudson, 1991).
47. Anthony, D. The Horse, the , and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World (Princeton Univ. Press, 2007).
48. Schiffels, S. & Durbin, R. Inferring human population size and separation history from multiple genome sequences. Nat. Genet. 46, 919–925 (2014).
49. Wang, K., Mathieson, I., O’Connell, J. & Schiffels, S. Tracking human population structure through time from whole genome sequences. PLoS Genet. 16, e1008552 (2020).
75. Jeong, C. et al. Long-term genetic stability and a high-altitude East Asian origin for the peoples of the high valleys of the Himalayan arc. Proc. Natl Acad. Sci. USA. 113, 7485–7490 (2016).

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© The Author(s), under exclusive licence to Springer Nature Limited 2021


Chuan-Chao Wang1,2,3,4,44✉, Hui-Yuan Yeh5,44, Alexander N. Popov6,44, Hu-Qin Zhang7,44, Hirofumi Matsumura8, Kendra Sirak2,9, Olivia Cheronet10, Alexey Kovalev11, Nadin Rohland2, Alexander M. Kim2,12, Swapan Mallick2,9,13,14, Rebecca Bernardos2, Dashtseveg Tumen15, Jing Zhao7, Yi-Chang Liu16, Jiun-Yu Liu17, Matthew Mah2,13,14, Ke Wang3, Zhao Zhang2, Nicole Adamski2,14, Nasreen Broomandkhoshbacht2,14, Kimberly Callan2,14, Francesca Candilio10, Kellie Sara Duffett Carlson10, Brendan J. Culleton18, Laurie Eccles19, Suzanne Freilich10, Denise Keating10, Ann Marie Lawson2,14, Kirsten Mandl10, Megan Michel2,14, Jonas Oppenheimer2,14, Kadir Toykan Özdoğan10, Kristin Stewardson2,14, Shaoqing Wen20, Shi Yan21, Fatma Zalzala2,14, Richard Chuang16, Ching-Jung Huang16, Hana Looh22, Chung-Ching Shiung16, Yuri G. Nikitin23, Andrei V. Tabarev24, Alexey A. Tishkin25, Song Lin7, Zhou-Yong Sun26, Xiao-Ming Wu7, Tie-Lin Yang7, Xi Hu7, Liang Chen27, Hua Du28, Jamsranjav Bayarsaikhan29, Enkhbayar Mijiddorj30, Diimaajav Erdenebaatar30, Tumur-Ochir Iderkhangai30, Erdene Myagmar15, Hideaki Kanzawa-Kiriyama31, Masato Nishino32, Ken-ichi Shinoda31, Olga A. Shubina33, Jianxin Guo1, Wangwei Cai34, Qiongying Deng35, Longli Kang36, Dawei Li37, Dongna Li38, Rong Lin38, Nini36, Rukesh Shrestha4, Ling-Xiang Wang4, Lanhai Wei1, Guangmao Xie39,40, Hongbing Yao41, Manfei Zhang4, Guanglin He1, Xiaomin Yang1, Rong Hu1, Martine Robbeets42, Stephan Schiffels3, Douglas J. Kennett43, Li Jin4, Hui Li4, Johannes Krause3✉, Ron Pinhasi10✉ & David Reich2,9,13,14✉

1Department of Anthropology and Ethnology, Institute of Anthropology, State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Xiamen University, Xiamen, China. 2Department of Genetics, Harvard Medical School, Boston, MA, USA. 3Department of Archaeogenetics, Max Planck Institute for the Science of Human History, Jena, Germany. 4MOE Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Sciences, Fudan University, Shanghai, China. 5School of Humanities, Nanyang Technological University, Nanyang, Singapore. 6Scientific Museum, Far Eastern Federal University, Vladivostok, Russia. 7Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, China. 8School of Health Science, Sapporo Medical University, Sapporo, Japan. 9Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA. 10Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria. 11Institute of Archaeology, Russian Academy of Sciences, Moscow, Russia. 12Department of Anthropology, Harvard University, Cambridge, MA, USA. 13Broad Institute of Harvard and MIT, Cambridge, MA, USA. 14Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA. 15Department of Anthropology and Archaeology, National University of Mongolia, Ulaanbaatar, Mongolia. 16Institute of Archaeology, National Cheng Kung University, Tainan, Taiwan. 17Department of Anthropology, University of Washington, Seattle, WA, USA. 18Institutes of Energy and the Environment, The Pennsylvania State University, University Park, PA, USA. 19Department of Anthropology, Pennsylvania State University, University Park, PA, USA. 20Institute of Archaeological Science, Fudan University, Shanghai, China. 21School of Ethnology and Sociology, Minzu University of China, Beijing, China. 22Institute of History and Philology, Institute of History and Philology, Academia Sinica, Taipei, Taiwan. 23Museum of Archaeology and Ethnology, Institute of History, Archaeology and Ethnology, Far Eastern Branch of the Russian Academy of Sciences, Vladivostok, Russia. 24Institute of Archaeology and Ethnography, Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russia. 25Department of Archeology, Ethnography and Museology, Altai State University, Barnaul, Russia. 26Shaanxi Provincial Institute of Archaeology, Xi’an, China. 27School of Cultural Heritage, Northwest University, Xi’an, China. 28Xi’an AMS Center, Institute of Earth Environment, Chinese Academy of Sciences, Xi’an, China. 29Research Center at the National Museum of Mongolia, Ulaanbaatar, Mongolia. 30Department of Archaeology, Ulaanbaatar State University, Ulaanbaatar, Mongolia. 31Department of Anthropology, National Museum of Nature and Science, Tsukuba, Japan. 32Archaeological Center of Chiba City, Chiba, Japan. 33Department of Archeology, Sakhalin Regional Museum, Yuzhno-Sakhalinsk, Russia. 34Department of Biochemistry and Molecular Biology, Hainan Medical University, Haikou, China. 35Department of Human Anatomy and Center for Genomics and Personalized Medicine, Medical University, Nanning, China. 36Key Laboratory for Molecular Genetic Mechanisms and Intervention Research on High Altitude Disease of Tibet Autonomous Region, Ministry of Education, School of Medicine, Xizang Minzu University (Tibet University for Nationalities), Xianyang, China. 37Institute for History and Culture of Science & Technology, Guangxi University for Nationalities, Nanning, China. 38Department of Biology, Hainan Medical University, Haikou, China. 39College of History, Culture and Tourism, Guangxi Normal University, Guilin, China. 40Guangxi Institute of Cultural Relics Protection and Archaeology, Nanning, China. 41Belt and Road Research Center for Forensic Molecular Anthropology, Key Laboratory of Evidence Science of Province, Gansu Institute of Political Science and Law, , China. 42Eurasia3angle Research group, Max Planck Institute for the Science of Human History, Jena, Germany. 43Department of Anthropology, University of California Santa Barbara, Santa Barbara, CA, USA. 44These authors contributed equally: Chuan-Chao Wang, Hui-Yuan Yeh, Alexander N. Popov, Hu-Qin Zhang.
✉e-mail: [email protected]; [email protected]; [email protected]; [email protected]

Methods

Ethics statement
The collection of modern samples was carried out in 2014 in strict accordance with the ethical research principles of The Ministry of Science and Technology of the People’s Republic of China (Interim Measures for the Administration of Human Genetic Resources, 10 June 1998). Our sample collection and genotyping protocol was further reviewed and approved by the Ethics Committee of the School of Life Sciences, Fudan University (22 October 2014). Study staff informed potential participants about the goals of the project, and individuals who chose to participate gave informed consent consistent with broad studies of population history and human variation and public posting of anonymized data. There were no rewards for participating and no negative consequences for not participating; all participants signed or affixed a thumbprint to the consent form reviewed by Fudan University. An important principle of our study was to ensure not only that the research was underpinned by individual informed consent, but also that it had support from community representatives sensitive to local perspectives, and we therefore carried out community consultation with minority group leaders or village leaders as an integral part of the consent process. For each minority group, community representatives affirmed community support for the study through a signature or thumbprint on a form summarizing the community consultation process (these forms were completed between 10 November 2014 and 10 December 2014). Co-authors of the manuscript who were culturally Indigenous and in some cases were legally registered as members of minority groups specifically reviewed the discussions of population history in this manuscript to increase sensitivity to local perspectives. Specifically, L.W. is a Tai–Kadai-speaking Zhuang person from Guangxi in southwest China; R.S. is from Nepal; and L.K. and N. are based at the Tibet University for Nationalities, and N. is an Indigenous Tibetan. We emphasize that Indigenous and community narratives co-exist with scientific ones and may or may not align with them. Indigenous ancestry should not be confused with identity, which is about self-perception and culture and cannot be defined by genetics alone.

The ancient samples newly reported in this study were collected with the permission of the custodians of the samples, who are the archaeologists or museums in each of the countries for which we analysed the data. We applied a case-by-case approach to obtaining permissions for each set of samples depending on the local expectations as these vary by region and cultural context. Every newly reported ancient sample in this study has permission for analysis from custodians of the samples who are co-authors and who affirm that ancient DNA analysis of these samples is appropriate. For most samples, we prepared formal collaboration agreements to explicitly list the ancient DNA work being performed by our team. In other instances, sample custodians who are co-authors determined that the generation and publication of ancient DNA data was covered under their existing permissions for sample analysis, and so new sampling agreements were not required. Going beyond what was formally required, we also sought to make the presentation of the scientific findings sensitive to local perspectives from the regions from which the skeletons were excavated. For some regions for which we obtained DNA such as the southern islands of Japan and the sites we are not aware of modern communities with traditions of biological or cultural connection to the ancient remains. For other regions, such as the Upper Yellow River Chinese or Mongolian regions, the modern nation-states in which the ancient individuals lived are modern inheritors of the cultural and genetic heritage of the ancient groups. In Taiwan, in addition to obtaining formal permission for sampling from government institutions, we sought to ensure that the presentation of our results was sensitive to the perspectives of Indigenous Taiwanese groups who plausibly descend thousands of years ago from groups related to those individuals whose data we report. The existence of at least 16 non-Han Chinese Indigenous groups in Taiwan makes it difficult to connect particular sites to specific modern ethnic groups for prehistoric sites older than 400 years, and it is rare for local communities to express connections with prehistoric sites. Nevertheless, two co-authors with Indigenous Taiwanese ancestry or cultural affiliation to these groups specifically reviewed the discussion of the results of Taiwanese groups to increase the sensitivity of our study to the perspectives of Indigenous groups. H.-Y.Y., who is co-first author of the study, has ancestry from the Paiwan Indigenous group. H.L. was the excavation leader for the Bilhun Hanben site and is the local community leader for the Ami group, whose present-day culture shows some similarities to the material culture of the site.

Ancient DNA laboratory work
All samples except those from Wuzhuangguoliang were prepared in dedicated clean room facilities at the Harvard Medical School, Boston, USA and in some cases also the University of Vienna, Vienna, Austria. Supplementary Table 2 lists experimental settings for each sample and library included in the dataset. Skeletal samples were surface-cleaned and drilled or sandblasted and milled to produce a fine powder for DNA extraction50,51. We either followed a previously published extraction protocol52 replacing the extender-MinElute-column assembly with the columns from the Roche High Pure Viral Nucleic Acid Large Volume Kit53 (manual extraction) or, for samples prepared later, used a DNA-extraction protocol based on silica beads instead of spin columns (and Dabney buffer) to enable automated DNA purification54 (robotic extraction). We prepared individually barcoded double-stranded libraries for most samples using a protocol that included a DNA repair step with uracil-DNA glycosylase (UDG) to cut molecules at locations containing ancient DNA damage but that is inefficient at the terminal positions of DNA molecules, allowing the rate of damage at the final nucleotide to be used as a measure of authenticity (Supplementary Table 1, UDG: ‘half’)55. We also prepared some libraries without UDG pre-treatment (double-stranded minus). For a few extracts, single-stranded DNA libraries56 were prepared with USER (NEB) addition in the dephosphorylation step, which results in inefficient uracil removal at the 5′ end of the DNA molecules, and does not affect deamination rates at the terminal 3′ end57. We performed target enrichment via hybridization with previously reported protocols8. We either enriched for the mitochondrial genome and 1.2 million SNPs in two separate experiments or together in a single experiment. If split over two experiments, the first enrichment was for sequences aligning to mitochondrial DNA (mtDNA)55,58 with some baits overlapping nuclear targets spiked-in to screen libraries for nuclear DNA content. The second enrichment was for a targeted set of 1,237,207 SNPs that comprises a merge of two previously reported sets of 394,577 SNPs3 and 842,630 SNPs7. We sequenced the enriched libraries on an Illumina NextSeq500 instrument for 2 × 76 cycles (and both indices) or on HiSeq X10 instruments at the Broad Institute of MIT and Harvard for 2 × 101 cycles. We also shotgun-sequenced a few hundred thousand molecules from each library to assess the fraction of human DNA.

Extractions of the Wuzhuangguoliang samples were performed in the clean room at the Xi’an Jiaotong University and Xiamen University following a previously published protocol59. Each extract was converted into double-stranded Illumina libraries following the manufacturer’s protocol (Fast Library Prep Kit, iGeneTech). Sample-specific indexing barcodes were added to both sides of the fragments via amplification. Nuclear DNA capture was performed with the AIExome Enrichment Kit V1 (iGeneTech) according to the manufacturer’s protocol and sequenced on an Illumina NovaSeq instrument with 150-base-pair paired-end reads.

Bioinformatics processing
We de-multiplexed the data and assigned sequences to samples based on the barcodes and/or indices, allowing up to one mismatch per barcode or index. We trimmed adapters and restricted to fragments for which the two reads overlapped by at least 15 nucleotides.
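The de-multiplexing rule stated above (assigning a sequence to a sample while allowing up to one mismatch per barcode or index) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names `hamming` and `assign_sample`, the example barcodes and the decision to discard reads that match more than one sample are all assumptions made for illustration.

```python
from typing import Dict, Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(observed: str,
                  barcodes: Dict[str, str],
                  max_mismatch: int = 1) -> Optional[str]:
    """Return the sample whose barcode is within max_mismatch of the
    observed barcode, or None if no barcode or several barcodes match
    (hypothetical policy: ambiguous reads are left unassigned)."""
    hits = [sample for barcode, sample in barcodes.items()
            if len(barcode) == len(observed)
            and hamming(observed, barcode) <= max_mismatch]
    return hits[0] if len(hits) == 1 else None


# Hypothetical barcode table, for illustration only.
example_barcodes = {"ACGTACG": "S1", "TTGCAAT": "S2"}
exact = assign_sample("ACGTACG", example_barcodes)       # exact match
one_off = assign_sample("ACGTACC", example_barcodes)     # one mismatch
two_off = assign_sample("ACGAACC", example_barcodes)     # two mismatches
ambiguous = assign_sample("AAAC", {"AAAA": "X", "AAAT": "Y"})
```

Under these assumptions a read with an exact or one-mismatch barcode is assigned to its sample, while a read with two mismatches, or one that is equally close to two barcodes, is left unassigned. A real pipeline would apply the same test to each barcode and index separately.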
We merged sequences (allowing up to one mismatch), choosing bases in the merged region based on highest quality in case of a conflict, using either a modified version of Seqprep⁶⁰ (if we were using bioinformatics processing pipeline 1 as specified in Supplementary Table 2) or custom software (if we were using bioinformatics processing pipeline 2; https://github.com/DReichLab/ADNA-Tools). We aligned the merged sequences using bwa (v.0.6.1 for pipeline 1 and v.0.7.15 for pipeline 2)⁶¹ to the mitochondrial genome RSRS⁶² and to the human genome (GRCh37, https://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.13/). We removed duplicate sequences with the same orientation, start and stop positions and barcodes. We determined haplogroups using HaploGrep2⁶³. To assess authenticity, we estimated the rate of cytosine to thymine substitution in the final nucleotide, which is expected to be at least 3% at cytosines in libraries prepared with a partial UDG treatment protocol and at least 10% for untreated libraries (minus) and single-stranded libraries; all libraries that we analysed met this threshold. We also assessed authenticity using contamMix (v.1.0.9 for pipeline 1 and v.1.0.12 for pipeline 2)⁸ to determine the fraction of mtDNA sequences in an ancient sample that matches the endogenous majority consensus more closely than a comparison set of 311 worldwide present-day human mtDNAs. For whole-genome analysis, we randomly selected a single sequence covering every SNP position of interest (‘pseudo-haploid’ data) using custom software, only using nucleotides that were a minimum distance from the ends of the sequences to avoid deamination artefacts (https://github.com/DReichLab/adna-workflow). The coverages and numbers of SNPs covered at least once on the autosomes (chromosomes 1–22) are included in Supplementary Table 1 for a merge of data from all libraries for each sample. Supplementary Table 2 shows the results by library.

To evaluate whether there was evidence that ancient DNA data processed using the same bioinformatics pipeline were artefactually biased to appear similar to each other in f-statistic analysis, we computed statistics of the form f4(Group1Pipeline1, Group1Pipeline2; Group2Pipeline1, Group2Pipeline2) for all groups for which we had individuals in our main analysis dataset processed by both pipelines (Mongolia_EIA_Sagly_4, Mongolia_EIA_SlabGrave_1, Mongolia_LBA_CenterWest_4, Mongolia_LBA_MongunTaiga_3, Russia_MN_Boisman and Taiwan_Hanben). For all 15 possible pairwise comparisons, the Z-scores for deviation from zero, as computed based on a block jackknife standard error, had a magnitude of less than 2.7, which is not significant after correcting for the 15 tests that we performed (P = 0.11 after applying a Bonferroni correction) (Supplementary Table 3). Although these analyses reduce concerns about systematic differences in population genetic analysis driven by changes over time in the software that we used to carry out our bioinformatics processing steps, we caution that there are other inhomogeneities in our ancient DNA dataset that have the potential to affect inferences. Other sources of inhomogeneity include systematic differences in the chemical properties and preservation conditions of DNA from different archaeological sites, differences in wet laboratory protocols including differences between data from in-solution enrichment and direct shotgun sequencing, and differences in wet laboratory and bioinformatics processing protocols across research groups that published the various datasets co-analysed in our study. The fact that we can obtain fitting models of population history through admixture graph analysis (Fig. 2) even in the presence of these differences, that the admixture graph model also fits when restricting to transversion polymorphisms (Supplementary Information section 3), and finally that our f4-symmetry tests reveal no significant differences between data generated for this study using wet laboratory and bioinformatics protocols that changed over time (Supplementary Table 3), increases confidence that our inferences are valid even in the presence of inhomogeneities⁶⁴.

Customized damage restriction to address contamination in Wuzhuangguoliang
We explored authenticity metrics for different filtering strategies for the data from the Wuzhuangguoliang individuals: restricting only to damaged sequences, and merging damaged sequences with sequences that do not show damage in the final nucleotides but that are short (requiring a minimum of 30 bp, and increasing in 10-bp increments from there up to 180 bp). We considered data from an individual usable for analysis if it consisted of a minimum of 5,000 SNPs, if the lower bound of its ANGSD⁹ 95% confidence interval was less than 0.01, and if the upper bound of its contamMix 95% confidence interval was more than 0.98. We chose the version of each sample that had the most SNPs covered as long as it met the criteria above (Supplementary Table 26).

Accelerator mass spectrometry radiocarbon dating
We generated 108 direct accelerator mass spectrometry (AMS) radiocarbon (¹⁴C) dates: 70 at the Pennsylvania State University (PSUAMS), 32 through a collaboration of Pennsylvania State University (PSUAMS) and the University of California Irvine (UCIAMS), and 6 at Poznan Radiocarbon Laboratory. The methods used at Poznan are available at https://radiocarbon.pl/en/ and here we summarize the methods used for the samples measured at PSUAMS and UCIAMS. Bone collagen from petrous, phalanx or tooth (dentine) samples was extracted and purified using a modified Longin method with ultrafiltration (>30 kDa gelatin)⁶⁵. If bone collagen was poorly preserved or contaminated, we hydrolysed the collagen and purified the amino acids using solid-phase extraction columns (XAD amino acids)⁶⁶. Before extraction, we sequentially sonicated all samples in ACS-grade methanol, acetone and dichloromethane (30 min each) at room temperature to remove conservants or adhesives possibly used during curation. Extracted collagen or amino acid preservation was evaluated using crude gelatin yields (%wt), %C, %N and C/N ratios. Stable carbon and nitrogen isotopes were measured on a Thermo DeltaPlus instrument with a Costech elemental analyser at Yale University. C/N ratios between 3.06 and 3.45 indicate that all radiocarbon-dated samples are well-preserved. All samples were combusted and graphitized at PSUAMS and UCIAMS using methods described elsewhere⁶⁵. ¹⁴C measurements were made on a modified National Electronics Corporation 1.5SDH-1 compact accelerator mass spectrometer at either the PSUAMS facility or the Keck-Carbon Cycle AMS Facility at the University of California Irvine. All dates were calibrated using the IntCal20 curve⁶⁷ in OxCal v.4.4.2⁶⁸ and are presented in calibrated calendar years BC or AD.

Y-chromosomal haplogroup analysis
We determined Y-chromosomal haplogroups by examining the state of SNPs in ISOGG v.15.56 (https://isogg.org/tree/index.html) (Supplementary Information section 4).

X-chromosome contamination estimates
We performed an X-chromosomal contamination test for the male individuals following a previously described approach⁶⁹ that is implemented in the ANGSD software⁹. We used the ‘method of moments’ estimates. The estimates for some male individuals are not informative because of the limited number of X-chromosomal SNPs covered by at least two sequences (we only report results for individuals with at least 200 SNPs covered at least twice).

Procedure for combining new Affymetrix Human Origins genotyping data on modern individuals with previously published data
We merged the newly generated data with previously published datasets genotyped on Affymetrix Human Origins arrays¹⁶, restricting to present-day individuals with more than 95% genotyping completeness. We manually curated the data using ADMIXTURE¹² and principal component analysis as implemented in EIGENSOFT¹⁰ to identify individuals that were outliers compared with others from their own populations in cases in which a main cluster was identifiable. We removed seven present-day individuals as outliers from subsequent analysis; the population identifiers for these individuals are prefixed by the string “Ignore_” in the dataset that we released (for analyses of ancient individuals, we do not remove outliers).

that we are evaluating whether the test populations are consistent with descending from as few as N + 1 sources of ancestry.
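As a rough check on the pipeline-comparison test described under ‘Bioinformatics processing’, the arithmetic behind a Bonferroni correction over 15 comparisons can be sketched in Python. The sketch assumes a two-sided test against a standard normal distribution, which the text does not state explicitly; the function names are illustrative, and the exact maximum |Z| (reported in Supplementary Table 3) is replaced here by the quoted bound of 2.7.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided P value for a Z-score under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def bonferroni(p: float, n_tests: int) -> float:
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p * n_tests)


n_groups = 6                       # groups processed by both pipelines
n_tests = math.comb(n_groups, 2)   # number of distinct pairs of groups
z_bound = 2.7                      # quoted upper bound on |Z| (assumed input)

p_raw = two_sided_p(z_bound)
p_corrected = bonferroni(p_raw, n_tests)
print(f"{n_tests} tests: raw P = {p_raw:.4f}, corrected P = {p_corrected:.2f}")
```

Under these assumptions the corrected P value at |Z| = 2.7 is roughly 0.10, of the same order as the reported P = 0.11; the small difference would be expected if the largest observed |Z| is slightly below 2.7.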

Principal component analysis
We used the smartpca program of EIGENSOFT¹⁰, using default parameters and the lsqproject: YES and numoutlieriter: 0 options.

ADMIXTURE analysis
We carried out ADMIXTURE analysis in unsupervised mode¹² after pruning for linkage disequilibrium in PLINK⁷⁰ with parameters --indep-pairwise 200 25 0.4, which retained 256,427 SNPs. We ran ADMIXTURE with default fivefold cross-validation (--cv=5), varying the number of ancestral populations between K = 2 and K = 18 in 100 bootstraps with different random seeds.

Clustering of ancient individuals
We clustered ancient individuals based on chronology and archaeological association, and then further based on both qualitative similarity (in principal component analysis (PCA), ADMIXTURE and outgroup f3-statistics) and quantitative homogeneity (based on f4-statistics and qpAdm results). In general, group names have the format ‘__

Inferring mixture proportions without an explicit phylogeny
We used qpAdm³,²⁹ as implemented in ADMIXTOOLS¹⁶ to estimate mixture proportions for a Test population as a combination of N ‘reference’ populations by exploiting (but not explicitly modelling) shared genetic drift with a set of ‘Outgroup’ populations. We computed standard errors with a block jackknife and a P value for fit using a single hypothesis Hotelling T² test.

Weighted linkage disequilibrium analysis
Linkage disequilibrium decay was calculated using ALDER¹¹ to infer admixture parameters including dates and mixture proportions, with a standard error computed as a block jackknife over chromosomes.

MSMC and MSMC2
We used MSMC⁴⁸ as previously described⁷² to infer cross-coalescence rates and population sizes among Ami and Atayal, Tibetan and Ulchi. We also ran MSMC2 as described previously⁴⁹.
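Several of the analyses above report standard errors from a block jackknife. The idea can be sketched in Python as a delete-one-block jackknife with equal block weights. This is a simplification for illustration only: the `block_jackknife` function, the toy data and the equal weighting are assumptions of this sketch, whereas production tools may weight blocks by their size (compare the delete-m jackknife of ref. 71).

```python
import math
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def block_jackknife(
    blocks: Sequence[Sequence[float]],
    estimator: Callable[[List[float]], float],
) -> Tuple[float, float]:
    """Delete-one-block jackknife: return (full-data estimate, standard error)."""
    g = len(blocks)
    if g < 2:
        raise ValueError("need at least two blocks")
    # Estimate on all data pooled across blocks.
    full = estimator([x for block in blocks for x in block])
    # Re-estimate with each block left out in turn.
    leave_one_out = [
        estimator([x for j, block in enumerate(blocks) if j != i for x in block])
        for i in range(g)
    ]
    centre = mean(leave_one_out)
    variance = (g - 1) / g * sum((t - centre) ** 2 for t in leave_one_out)
    return full, math.sqrt(variance)


# Hypothetical per-SNP values of some statistic, grouped into four blocks
# (for example, contiguous genomic segments or whole chromosomes).
BLOCKS = [[0.10, 0.12], [0.08, 0.09], [0.15, 0.11], [0.07, 0.13]]
estimate, se = block_jackknife(BLOCKS, mean)
z_score = estimate / se  # deviation from zero in units of standard error
```

Leaving out whole blocks rather than single SNPs is what makes the standard error robust to correlation between neighbouring SNPs, which is why Z-scores for f-statistics are conventionally derived this way.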

If we detect a significantly non-zero value of a statistic of the form f4(A, B; C, D) we can be confident that populations A and B (or C and D) are not consistent with being descended from a homogeneous ancestral population that split earlier in time from the ancestors of the other two groups. A significantly positive value of an f4-statistic of the form f4(A, B; C, D) implies an excess of allele sharing between populations A and C or B and D, while a negative value implies sharing between populations B and C, or A and D.

FST computation
We estimated FST using the smartpca program of EIGENSOFT¹⁰ with default parameters and fstonly: YES and inbreed: YES. The populations and groupings used in this analysis are shown in Supplementary Table 9.

Admixture graph modelling
We modelled population relationships and admixture with qpGraph in ADMIXTOOLS¹⁶ using Mbuti as an outgroup. We computed f2-, f3- and f4-statistics measuring allele sharing of two, three or four sets of populations and reported the maximum |Z|-score between predicted and observed values. We ranked models that passed according to this metric based on relative likelihood (Supplementary Information section 3).

Determining a minimum number of streams of ancestry
We used qpWave³,²⁹ as implemented in ADMIXTOOLS¹⁶ to test whether a set of test populations is consistent with being related via N streams of ancestry from a set of outgroup populations. In qpWave, a test for rank N, implemented as a single hypothesis Hotelling T² test, means

Data availability
The aligned sequences are available through the European Nucleotide Archive under accession number PRJEB42781. The newly generated genotype data of 383 modern East Asian individuals have been deposited in Zenodo (https://doi.org/10.5281/zenodo.4058532). The previously published data co-analysed with our newly reported data can be obtained as described in the original publications, which are all referenced in Supplementary Table 4; a compiled dataset that includes the merged genotypes used in this paper is available as the Allen Ancient DNA Resource at https://reich.hms.harvard.edu/allen-ancient-dna-resource-aadr-downloadable-genotypes-present-day-and-ancient-dna-data. Any other relevant data are available from the corresponding authors upon reasonable request.

50. Pinhasi, R., Fernandes, D. M., Sirak, K. & Cheronet, O. Isolating the human cochlea to generate bone powder for ancient DNA analysis. Nat. Protocols 14, 1194–1205 (2019).
51. Sirak, K. A. et al. A minimally-invasive method for sampling human petrous bones from the cranial base for ancient DNA analysis. Biotechniques 62, 283–289 (2017).
52. Dabney, J. et al. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments. Proc. Natl Acad. Sci. USA 110, 15758–15763 (2013).
53. Korlević, P. et al. Reducing microbial and human contamination in DNA extractions from ancient bones and teeth. Biotechniques 59, 87–93 (2015).
54. Rohland, N., Glocke, I., Aximu-Petri, A. & Meyer, M. Extraction of highly degraded DNA from ancient bones, teeth and sediments for high-throughput sequencing. Nat. Protocols 13, 2447–2461 (2018).
55. Rohland, N., Harney, E., Mallick, S., Nordenfelt, S. & Reich, D. Partial uracil-DNA-glycosylase treatment for screening of ancient DNA. Phil. Trans. R. Soc. Lond. B 370, 20130624 (2015).
56. Gansauge, M. T. & Meyer, M. Selective enrichment of damaged DNA molecules for ancient genome sequencing. Genome Res. 24, 1543–1549 (2014).
57. Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012).
58. Maricic, T., Whitten, M. & Pääbo, S. Multiplexed DNA sequence capture of mitochondrial genomes using PCR products. PLoS ONE 5, e14004 (2010).

59. Rohland, N. & Hofreiter, M. Ancient DNA extraction from bones and teeth. Nat. Protocols 2, 1756–1762 (2007).
60. John, J. S. SeqPrep. GitHub https://github.com/jstjohn/SeqPrep (2011).
61. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
62. Behar, D. M. et al. A “Copernican” reassessment of the human mitochondrial DNA tree from its root. Am. J. Hum. Genet. 90, 675–684 (2012).
63. Weissensteiner, H. et al. HaploGrep 2: mitochondrial haplogroup classification in the era of high-throughput sequencing. Nucleic Acids Res. 44, W58–W63 (2016).
64. Günther, T. & Nettelblad, C. The presence and impact of reference bias on population genomic studies of prehistoric human populations. PLoS Genet. 15, e1008302 (2019).
65. Kennett, D. J. et al. Archaeogenomic evidence reveals prehistoric matrilineal dynasty. Nat. Commun. 8, 14115 (2017).
66. Lohse, J. C., Madsen, D. B., Culleton, B. J. & Kennett, D. J. Isotope paleoecology of episodic mid-to-late Holocene bison population expansions in the southern Plains, U.S.A. Quat. Sci. Rev. 102, 14–26 (2014).
67. Reimer, P. J. et al. The IntCal20 radiocarbon age calibration curve (0–55 cal kBP). Radiocarbon 62, 725–757 (2020).
68. Bronk Ramsey, C. Bayesian analysis of radiocarbon dates. Radiocarbon 51, 337–360 (2009).
69. Rasmussen, M. et al. An Aboriginal Australian genome reveals separate human dispersals into Asia. Science 334, 94–98 (2011).
70. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
71. Busing, F. T. A., Meijer, E. & van der Leeden, R. Delete-m jackknife for unequal m. Stat. Comput. 9, 3–8 (1999).
72. Mallick, S. et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016).
73. Monroy Kuhn, J. M., Jakobsson, M. & Günther, T. Estimating genetic kin relationships in prehistoric populations. PLoS ONE 13, e0195491 (2018).
74. Ringbauer, H., Novembre, J. & Steinruecken, M. Human parental relatedness through time - detecting runs of homozygosity in ancient DNA. Preprint at bioRxiv https://doi.org/10.1101/2020.05.31.126912 (2020).

Acknowledgements We thank D. Anthony, O. Bar-Yosef, K. Brunson, R. Flad, P. Flegontov, Q. Fu, W. Haak, I. Lazaridis, M. Lipson, I. Mathieson, R. Meadow, I. Olalde, N. Patterson, P. Skoglund, D. Xu, P. Bellwood and C. Chiang for comments; N. Saitou and the Asian DNA Repository Consortium for sharing genotype data from present-day Japanese groups; T. Nishimoto and T. Fujisawa from the Rebun Town Board of Education for sharing the Funadomari Jomon samples, and H. Tanaka and W. Nagahara from the Archeological Center of Chiba City, who are excavators of the Rokutsu Jomon site. The excavations at the Boisman-2 site (Boisman culture), the Pospelovo-1 site (Yankovsky culture) and the Roshino-4 site (Heishui Mohe culture) were funded by the Far Eastern Federal University and the Institute of History, Archaeology and Ethnology Far Eastern Branch of the Russian Academy of Sciences; research on Pospelovo-1 is funded by RFBR project number 18-09-40101. C.-C.W. was funded by the Max Planck Society, the National Natural Science Foundation of China (NSFC 31801040), the Nanqiang Outstanding Young Talents Program of Xiamen University (X2123302), the Major project of National Social Science Foundation of China (20&ZD248), a European Research Council (ERC) grant to D. (ERC-2019-ADG-883700-TRAM) and Fundamental Research Funds for the Central Universities (ZK1144). H.M. was supported by grant JSPS 16H02527. M.R. and C.-C.W. received funding from the ERC under the European Union’s Horizon 2020 research and innovation program (grant no. 646612) to M.R. H. Li was funded by NSFC (91731303, 31671297) and the B&R International Joint Laboratory of Eurasian Anthropology (18490750300). J.K. was funded by DFG grant KR 4015/1-1, the Baden Württemberg Foundation and the Max Planck Institute. Accelerator Mass Spectrometry radiocarbon dating work was supported by the National Science Foundation (NSF) (BCS-1460369) to D.J.K. and B.J.C. D.R. was funded by NSF grant BCS-1032255, NIH (NIGMS) grant GM100233, the Paul M. Allen Frontiers Group, John Templeton Foundation grant 61220, a gift from J.-F. Clin and the Howard Hughes Medical Institute.

Author contributions C.-C.W., H.-Y.Y., A.N.P., H.M., A.M.K., L.J., H. Li, J.K., R.P. and D.R. conceptualized the study. C.-C.W., R.B., M. Mah, S.M., Z.Z., B.J.C. and D.R. carried out the formal analysis. C.-C.W., K. Sirak, O.C., A.K., N.R., A.M.K., M. Mah, S.M., K.W., N.A., N.B., K.C., F.C., K.S.D.C., B.J.C., L.E., S.F., D.K., A.M.L., K.M., M. Michel, J.O., K.T.O., K. Stewardson, S.W., S.Y., F.Z., J.G., Q.D., L.K., Dawei Li, Dongna Li, R.L., W.C., N., R.S., L.-X.W., L.W., G.X., H.Y., M.Z., G.H., X.Y., R.H., S.S., D.J.K., L.J., H. Li, J.K., R.P. and D.R. carried out the investigation. H.-Y.Y., A.N.P., R.B., D.T., J.Z., Y.-C.L., J.-Y.L., M. Mah, S.M., Z.Z., R.C., H. Looh, C.-J.H., C.-C.S., Y.G.N., A.V.T., A.A.T., S.L., Z.-Y.S., X.-M.W., T.-L.Y., X.H., L.C., H.D., J.B., E. Mijiddorj, D.E., T.-O.I., E. Myagmar, H.K.-K., M.N., K.-i.S., O.A.S., D.J.K., R.P. and D.R. provided resources. C.-C.W., K. Sirak, O.C., A.K., N.R., R.B., M. Mah, S.M., B.J.C., L.E., A.A.T. and D.R. curated the data. C.-C.W., H.-Y.Y., A.N.P., H.M., A.K. and D.R. wrote the paper. C.-C.W., H.-Q.Z., N.R., M.R., S.S., D.J.K., L.J., H. Li, J.K., R.P. and D.R. supervised the study.

Competing interests The authors declare no competing interests.

Additional information
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41586-021-03336-2.
Correspondence and requests for materials should be addressed to C.-C.W., J.K., R.P. or D.R.
Peer review information Nature thanks Peter Bellwood, Charleston Chiang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Reprints and permissions information is available at http://www.nature.com/reprints.

Extended Data Fig. 1 | PCA of ancient samples. Projection of ancient samples onto PCA dimensions 1 and 2 defined by East Asian, European, Siberian and Native American populations.

Extended Data Fig. 2 | PCA of present-day samples. a, PCA dimensions 1 and 2 defined by present-day East Asian, European, Siberian and Native American populations. b, PCA dimensions 1 and 2 defined by present-day East Asian groups with little West Eurasian mixture.

Extended Data Fig. 3 | Neighbour-joining tree of present-day East Eurasian individuals using the Human Origins dataset. a, Neighbour-joining tree of present-day East Eurasian individuals based on FST distances using the Human Origins dataset. The branch length is shown in FST distance. b, Neighbour-joining tree of present-day East Eurasian individuals in which internal branches are all shown with the same branch length for better visualization.

Extended Data Fig. 4 | Admixture plot at K = 15 using the Human Origins dataset. a–f, We grouped the populations roughly into six groups based on geographical and genetic affinity. a, Populations mainly from Africa (yellow), America (magenta), West Eurasia (dark green and light brown) and (light magenta). b, Populations mainly from Mongolia (blue) and Siberia (purple). c, Populations mainly from southern China and Southeast Asia (light blue). d, Populations mainly from the Tibetan Plateau (olive) and Neolithic Yellow River Basin (red). e, Mainly Han Chinese groups from China (light blue and red). f, Populations mainly from the Amur River Basin (blue and red) and .

Extended Data Fig. 5 | Estimates of population split times. a, Cross-coalescence rates for selected population pairs. We ran MSMC for four pairs of populations: Tibetan–Ami, Tibetan–Atayal, Tibetan–Ulchi and Tibetan–Mixe. We used one individual from each population in this analysis. The modern genomic data for those individuals are from the Simons Genome Diversity Project. The times are calculated based on the mutation rate and generation time specified on the x axis. b, Cross-coalescence rates for selected population pairs. The same analysis as shown in a but using MSMC2 instead of MSMC, and using two individuals per population except for the Tibetan–Atayal pair, for which we used only one.

Extended Data Fig. 6 | Admixture graph model. This figure is the same as Fig. 2 except we show the fitted genetic drifts on each lineage. We used all available sites in the dataset comprising 1,237,207 SNPs, and also restricted to transversions only to confirm that the same model fits (Supplementary Information section 3). We started with a skeleton tree that fits the data for Denisovan, Mbuti, Onge, Tianyuan and Luxembourg Loschbour and one admixture event. We grafted on Mongolia East Neolithic, Late Neolithic farmers from the Upper Yellow River, Liangdao 2, Japan Jomon, Nepal Chokhopani, Taiwan Hanben and Late Neolithic farmers from the West Liao River in turn, adding them consecutively to all possible edges in the tree and retaining only graph solutions for which all differences between fitted and estimated statistics had |Z| < 3 (maximum |Z| = 2.95 here). We used the MSMC and MSMC2 relative population split time estimates to constrain models. Deep splits are not well constrained because of the minimal availability of data on East Asian populations from the Upper Paleolithic. a, Locations and dates of the East Asian individuals used in model fitting, with colours indicating whether the majority ancestry is from the hypothesized coastal expansion (green), interior expansion south (red) and interior expansion north (blue). The map is based on the ‘Google Map Layer’ from ArcGIS Online Basemaps (map data ©2020 Google). The grey circles represent sampled populations and white circles represent unsampled hypothesized nodes. b, In the model visualization, we colour lineages modelled as deriving entirely from one of these expansions, and also colour populations according to ancestry proportions. Dashed lines represent admixture (proportions are marked), and we show the amount of genetic drift on each lineage in units of FST × 1,000.

Extended Data Fig. 7 | Shared genetic drift among Tibetan groups, measured by f3(X, Y; Mbuti). Lighter colours indicate more shared drift. Lahu groups with the Southeast Asian cluster, probably owing to substantial admixture. The Tibetan_Yajiang are geographically in the Tibeto-Burman Corridor but group with Core Tibetan individuals, presumably reflecting less genetic admixture from people of the Southeast Asian cluster.

Extended Data Table 1 | Population information for newly genotyped present-day individuals

Extended Data Table 2 | Kinship detected between pairs of individuals

A4(65)146'1BC4567%&8(9@j40klWmnml 

   

D')%&6203E5FF4&C     

G465&'D'('4&H7I2(7'(6%2F)&%P'67'&')&%15H2B2Q26C%R67'I%&S6746I')5BQ2(7TU72(R%&F)&%P21'((6&5H65&'R%&H%0(2(6'0HC4016&40()4&'0HC  

20&')%&6203TV%&R5&67'&20R%&F462%0%0G465&'D'('4&H7)%Q2H2'(W(''%5&X126%&24QY%Q2H2'(40167'X126%&24QY%Q2HC$7'HSQ2(6T     E6462(62H( V%&4QQ(6462(62H4Q404QC('(WH%0R2&F674667'R%QQ%I20326'F(4&')&'('062067'R235&'Q'3'01W64BQ'Q'3'01WF4206'`6W%&a'67%1(('H62%0T

0b4 $%0R2&F'1

o U7''`4H6(4F)Q'(2c'8d9R%&'4H7'`)'&2F'064Q3&%5)bH%01262%0W32P'04(412(H&'6'05FB'&4015026%RF'4(5&'F'06

o e(646'F'06%0I7'67'&F'4(5&'F'06(I'&'64S'0R&%F12(620H6(4F)Q'(%&I7'67'&67'(4F'(4F)Q'I4(F'4(5&'1&')'46'1QC U7'(6462(62H4Q6'(68(95('1eGfI7'67'&67'C4&'%0'g%&6I%g(21'1 o hdipqrsttsdquvwuwqwxsyi€qvq€vwr‚ƒv€qwsivipqpqd„tv q€vwr‚ƒvqts‚vqrst†iv‡quvrxdƒˆyvwqƒdquxvq‰vuxs€wqwvruƒsd

o e1'(H&2)62%0%R4QQH%P4&246'(6'(6'1

o e1'(H&2)62%0%R40C4((5F)62%0(%&H%&&'H62%0(W(5H74(6'(6(%R0%&F4Q26C40141‘5(6F'06R%&F5Q62)Q'H%F)4&2(%0( eR5QQ1'(H&2)62%0%R67'(6462(62H4Q)4&4F'6'&(20HQ51203H'06&4Q6'01'0HC8'T3TF'40(9%&%67'&B4(2H'(62F46'(8'T3T&'3&'((2%0H%'RR2H2'069 o eGfP4&2462%08'T3T(64014&11'P2462%09%&4((%H246'1'(62F46'(%R50H'&64206C8'T3TH%0R21'0H'206'&P4Q(9

V%&05QQ7C)%67'(2(6'(6203W67'6'(6(6462(62H8'T3T’WuW‚9I267H%0R21'0H'206'&P4Q(W'RR'H6(2c'(W1'3&''(%RR&''1%F401“P4Q5'0%6'1 o ”ƒ•vq“q•„iyvwq„wqv‡„ruq•„iyvwq–xvdv•v‚qwyƒu„iv

o V%&—4C'(240404QC(2(W20R%&F462%0%067'H7%2H'%R)&2%&(401a4&S%PH7420a%06'$4&Q%('66203(

o V%&72'&4&H72H4Q401H%F)Q'`1'(230(W21'062R2H462%0%R67'4))&%)&246'Q'P'QR%&6'(6(401R5QQ&')%&6203%R%56H%F'(

o X(62F46'(%R'RR'H6(2c'(8'T3T$%7'0˜(€WY'4&(%0˜(‚9W2012H462037%I67'CI'&'H4QH5Q46'1

hy‚q–vqrsiivruƒsdqsdqwu„uƒwuƒrwq™s‚qƒsisdƒwuwqrsdu„ƒdwq„‚uƒrivwqsdqt„dpqs™quxvq†sƒduwq„s•v E%R6I4&'401H%1' Y%Q2HC20R%&F462%04B%564P42Q4B2Q26C%RH%F)56'&H%1' f464H%QQ'H62%0 0b4

f464404QC(2( e'5('1efapqUrrAEWefapqUsDXWXpfXGErVUWeAfXDWeGfEfWgeYArfDXYmW$rGUeaapqWEXtYDXYW401—eeW401&'R'&'0H'67'(' )4HS43'(2067'a'67%1(T V%&F405(H&2)6(562Q2c203H5(6%F4Q3%&267F(%&(%R6I4&'67464&'H'06&4Q6%67'&'('4&H7B560%6C'61'(H&2B'120)5BQ2(7'1Q26'&465&'W(%R6I4&'F5(6B'F41'4P42Q4BQ'6%'126%&(401 &'P2'I'&(Te'(6&%03QC'0H%5&43'H%1'1')%(262%0204H%FF5026C&')%(26%&C8'T3Tf26g5B9TE''67'G465&'D'('4&H73521'Q20'(R%&(5BF266203H%1'h(%R6I4&'R%&R5&67'&20R%&F462%0T f464 Y%Q2HC20R%&F462%04B%564P42Q4B2Q26C%R1464 eQQF405(H&2)6(F5(620HQ51'414644P42Q4B2Q26C(646'F'06U72((646'F'06(7%5Q1)&%P21'67'R%QQ%I20320R%&F462%0WI7'&'4))Q2H4BQ'@ geHH'((2%0H%1'(W502i5'21'062R2'&(W%&I'BQ20S(R%&)5BQ2HQC4P42Q4BQ'1464('6(

geQ2(6%RR235&'(674674P'4((%H246'1&4I1464   

ge1'(H&2)62%0%R40C&'(6&2H62%0(%014644P42Q4B2Q26C ! " # "

U7'14644P42Q4B2Q26C(646'F'062(0%IH%F)Q'6'401&'41(4(R%QQ%I(TeQQ1464I2QQB'R5QQC4P42Q4BQ'4667'()'H2R2'1Q%H462%0(BC67'62F'%R)5BQ2H462%0T #

uU7'4Q230'1('i5'0H'(4&'4P42Q4BQ'67&%53767'X5&%)'40G5HQ'%621'e&H72P'501'&4HH'((2%005FB'&YDjX—vmwxlTU7'0'IQC3'0'&46'13'0%6C)'1464%Rkxk F%1'&0X4(6e(2402012P2154Q(74P'B''01')%(26'120y'0%1%8766)(@bb1%2T%&3blnTzmxlbc'0%1%Tvnzxzkm9TU7')&'P2%5(QC)5BQ2(7'11464H%g404QCc'1I267%5&0'IQC &')%&6'11464H40B'%B6420'14(1'(H&2B'12067'%&23204Q)5BQ2H462%0(I72H74&'4QQ'`)Q2H26QC&'R'&'0H'120r0Q20'U4BQ'k{4H%F)2Q'11464('6674620HQ51'(67' F'&3'13'0%6C)'(5('120672()4)'&2(4P42Q4BQ'4(67'eQQ'0e0H2'06fGeD'(%5&H'46766)(@bb&'2H7T7F(T74&P4&1T'15b4QQ'0g40H2'06g104g&'(%5&H'g441&g 1%I0Q%414BQ'g3'0%6C)'(g)&'('06g14Cg401g40H2'06g104g1464Te0C%67'&&'Q'P40614644&'4P42Q4BQ'R&%F67'H%&&'()%012034567%&(5)%0&'4(%04BQ'&'i5'(6Tu

   

V2'Q1g()'H2R2H&')%&6203    YQ'4('('Q'H667'%0'B'Q%I67462(67'B'(6R26R%&C%5&&'('4&H7TpRC%54&'0%6(5&'W&'4167'4))&%)&246'('H62%0(B'R%&'F4S203C%5&('Q'H62%0T    

o A2R'(H2'0H'( —'74P2%5&4Qh(%H24Q(H2'0H'( XH%Q%32H4QW'P%Q562%04&Ch'0P2&%0F'064Q(H2'0H'( 

V%&4&'R'&'0H'H%)C%R67'1%H5F'06I2674QQ('H62%0(W(''0465&'TH%Fb1%H5F'06(b0&g&')%&6203g(5FF4&CgRQ46T)1R        

A2R'(H2'0H'((651C1'(230  

eQQ(6512'(F5(612(HQ%('%067'(')%206('P'0I7'067'12(HQ%(5&'2(0'3462P'T   E4F)Q'(2c' e'&')%&60'I40H2'06fGe1464R&%FFCH%06'`6(R&%FI72H740H2'06fGe74(0%6)&'P2%5(QCB''0&')%&6'1T—'H45('%R672(W'P'0(F4QQ 05FB'&(%R2012P2154Q(4&'4BQ'6%)&%P21'H%0(21'&4BQ'0'I20(2376(4B%56)%)5Q462%072(6%&C{40C(4F)Q'(20672(H%06'`6)&%P21'(4  F'40203R5Q(H2'062R2H41P40H'Tp0411262%0W26(7%5Q1B'0%6'16746%5&40H2'06fGe(4F)Q'((2c'(I72Q'%R6'020HQ51203%0QC4R'I12(620H6  2012P2154Q()'&(26'W20R4H6'RR'H62P'QC&')&'('064F5H7Q4&3'&05FB'&%R(4F)Q'(R&%F67')'&()'H62P'%R)%)5Q462%03'0'62H404QC(2(Te 3'0%F'H%06420(F40C(6462(62H4QQC50Q20S'1(6&'6H7'(%RfGe'4H7%RI72H7)&%P21'(201')'01'0620R%&F462%04B%5667')4(6{7'0H''P'04 (F4QQ05FB'&%R2012P2154Q(2('RR'H62P'QC4P'&CQ4&3'05FB'&R&%F67')%206%RP2'I%RF4S20320R'&'0H'(4B%5640H'(6&C40141F2`65&'T U7&%537%5667'F405(H&2)6WI'0%6'67')&'H2(2%0I267I72H7I'4&'4BQ'6%F4S'20R'&'0H'(5(203—Q%HSj4HSS02R'(64014&1'&&%&(T

f464'`HQ5(2%0( e'1'(H&2B'2067'F4206'`67%II''`HQ51'1k}%R67'l}}0'IQC&')%&6'1(4F)Q'(R&%F67'F420404QC('(@uV%&404QC(2(I'R%H5('1%0 lkn2012P2154Q(4R6'&'`HQ51203l}I267'P21'0H'%RQ%IB560%0gc'&%H%064F20462%0WlnI267znnnglznnnEGY(H%P'&'1W401ll67464&'HQ%(' &'Q462P'(%R40%67'&7237'&H%P'&43'2012P2154Q2067'1464('6Tug237QC1'642Q'120R%&F462%032P20367'&'4(%0R%&'`HQ51203)4&62H5Q4& 2012P2154Q(2(32P'020r0Q20'U4BQ'lT

D')Q2H462%0 D')Q2H462%02(0%6)%((2BQ'20'P%Q562%04&C404QC(2(B'H45('I'4&''`4F20203%0QC4(203Q'72(6%&2H4Q)&%H'((@I'H400%6&')'4667'72(6%&C%R 67'Q4(6znWnnnC'4&(20X4(6e(244(4&')Q2H462%0'`)'&2F'06T

D401%F2c462%0 D401%F2c462%02(0%6&'Q'P40620'P%Q562%04&C404QC(2(B'H45('I'4&'H%0R&%06'1I267%0QC4(203Q''`)'&2F'06%R0465&'6746I'0''16% (651Cgg67'72(6%&C%R67'Q4(6znWnnnC'4&(20X4(6e(24gg401I'H400%6&')'46672('`)'&2F'06&401%F2c20312RR'&'06P4&24BQ'(4RR'H620367' 72(6%&CT

—Q201203 —Q2012032(0%6&'Q'P4066%672((651CB'H45('67'3'%3&4)72H40172(6%&2H4QH%06'`6R%&67')%)5Q462%072(6%&C%R'4H7&'32%06746I'4&' 404QCc2032('(('0624QR%&20P'(62346%&(6%S0%I4B%56I7'0F4S20320R'&'0H'(T

D')%&6203R%&()'H2R2HF46'&24Q(W(C(6'F(401F'67%1( e'&'i52&'20R%&F462%0R&%F4567%&(4B%56(%F'6C)'(%RF46'&24Q(W'`)'&2F'064Q(C(6'F(401F'67%1(5('120F40C(6512'(Tg'&'W2012H46'I7'67'&'4H7F46'&24QW (C(6'F%&F'67%1Q2(6'12(&'Q'P4066%C%5&(651CTpRC%54&'0%6(5&'2R4Q2(626'F4))Q2'(6%C%5&&'('4&H7W&'4167'4))&%)&246'('H62%0B'R%&'('Q'H62034&'()%0('T a46'&24Q(h'`)'&2F'064Q(C(6'F( a'67%1( 0b4 p0P%QP'12067'(651C 0b4 p0P%QP'12067'(651C o e062B%12'( o $7pYg('i

o X5S4&C%62HH'QQQ20'( o VQ%IHC6%F'6&C

o Y4Q4'%06%Q%3C4014&H74'%Q%3C o aDpgB4('10'5&%2F43203

o e02F4Q(401%67'&%&3402(F(

o g5F40&'('4&H7)4&62H2)406(

o $Q202H4Q1464

o f54Q5('&'('4&H7%RH%0H'&0   

! " # " #

| 

Y4Q4'%06%Q%3C401e&H74'%Q%3C    

E)'H2F'0)&%P'040H' U7'40H2'06(4F)Q'(0'IQC&')%&6'120672((651CI'&'H%QQ'H6'1I26767')'&F2((2%0%R67'H5(6%1240(%R67'(4F)Q'(WI7%4&'67'  

4&H74'%Q%32(6(%&F5('5F(20'4H7%R67'H%506&2'(R%&I72H7I'404QCc'167'1464Te'4))Q2'14H4('gBCgH4('4))&%4H76%  

%B6420203)'&F2((2%0(R%&'4H7('6%R(4F)Q'(1')'01203%067'Q%H4Q'`)'H6462%0(4(67'('P4&CBC&'32%0401H5Q65&4QH%06'`6TXP'&C  

0'IQC&')%&6'140H2'06(4F)Q'20672((651C74()'&F2((2%0R%&404QC(2(R&%FH5(6%1240(%R67'(4F)Q'(I7%4&'H%g4567%&(401I7% 

4RR2&F674640H2'06fGe404QC(2(%R67'('(4F)Q'(2(4))&%)&246'TV%&F%(6(4F)Q'(WI')&')4&'1R%&F4QH%QQ4B%&462%043&''F'06(  I26767'20(626562%0I7'&'67'(4F)Q'(I'&'H5&46'16%'`)Q2H26QCQ2(667'40H2'06fGeI%&SB'203)'&R%&F'1BC%5&6'4FW40120'4H7 

%R67'('H4('(W&')&'('06462P'4RR2Q246'(%R67'20(626562%04&'H%g4567%&(Tp0%67'&20(640H'(W(4F)Q'H5(6%1240(I7%4&'H%g4567%&( 

1'6'&F20'167463'0'&462%0401)5BQ2H462%0%R40H2'06fGe1464I4(H%P'&'1501'&67'2&'`2(6203)'&F2((2%0(R%&(4F)Q'404QC(2(W  

401(%0'I(4F)Q20343&''F'06(I'&'0%6&'i52&'1T      e'1'(H&2B'67')&%P'040H'%R'4H7401'P'&C4&H74'%Q%32H4Q(4F)Q'201'642Q20E5))Q'F'064&Cp0R%&F462%0('H62%0l4014Q(%20 r0Q20'U4BQ'lI7'&'I'2012H46'uH%g4567%&(4((%H246'1I267404QCc203672((4F)Q'uI72H74QI4C(20HQ51'14&')&'('06462P'(4F)Q'   H5(6%1240T 

E)'H2F'01')%(262%0 U7'()'H2F'0(4&'501'&67'H5(6%1240(72)%R67'4&H74'%Q%32(6(401H5Q65&4Q20(626562%0(R&%FI72H767'CI'&'(4F)Q'1TU7'CH40  B'&'g'`4F20'15)%0&'i5'(66%67'4&H74'%Q%32(6(T

f46203F'67%1( e'1'(H&2B'67'F'67%1%Q%3CI'5('R%&146203401H4Q2B&462%02067'a'67%1(('H62%0%0ueHH'Q'&46%&a4((E)'H6&%F'6&C D412%H4&B%0f46203u401)&'('0667'R5QQ1'642Q(%R67'146'(20r0Q20'U4BQ'zTp067'a'67%1(('H62%0I'I&26'@ue'3'0'&46'1lnx 12&'H6eaE8eHH'Q'&46%&a4((E)'H6&%F'6&C9&412%H4&B%08lv$9146'({wn4667'Y'00(CQP4024E646's02P'&(26C8YEs9Wkm67&%5374 H%QQ4B%&462%0%RY'00(CQP4024E646's02P'&(26C40167's02P'&(26C%R$4Q2R%&024p&P20'8s$peaE9W401}46Y%c040D412%H4&B%0 A4B%&46%&CTU7'F'67%1(5('146B%67Q4B%&46%&2'(4&')5BQ2(7'1W4017'&'I'(5FF4&2c'67'F'67%1(R&%FYEsT—%0'H%QQ43'0 R&%F)'6&%5(W)74Q40`W%&6%%6781'0620'9(4F)Q'(I4('`6&4H6'1401)5&2R2'15(2034F%12R2'1A%0320F'67%1I2675Q6&4R2Q6&462%0 8knSf43'Q46209}zTpRB%0'H%QQ43'0I4()%%&QC)&'('&P'1%&H%064F2046'1I'7C1&%QC('167'H%QQ43'0401)5&2R2'167'4F20%4H21( 5(203(%Q21)74(''`6&4H62%0H%Q5F0(8qef4F20%4H21(9}}TY&2%&6%'`6&4H62%0I'('i5'0624QQC(%02H46'14QQ(4F)Q'(20e$E3&41' F'6740%QW4H'6%0'W40112H7Q%&%F'6740'8knF2056'('4H7946&%%F6'F)'&465&'6%&'F%P'H%0('&P406(%&417'(2P'()%((2BQC5('1 15&203H5&462%0TX`6&4H6'1H%QQ43'0%&4F20%4H21)&'('&P462%0I4('P4Q546'15(203H&51'3'Q4620C2'Q1(8€I69W€$W€G401$bG &462%(TE64BQ'H4&B%0401026&%3'02(%6%)'(I'&'F'4(5&'1%04U7'&F%f'Q64YQ5(20(6&5F'06I2674$%(6'H7'Q'F'064Q404QC('&46 4Q's02P'&(26CT$bG&462%(B'6I''0kTn}401kTvz2012H46'67464QQ&412%H4&B%0146'1(4F)Q'(4&'I'QQ)&'('&P'1TeQQ(4F)Q'(I'&' H%FB5(6'14013&4)7262c'146YEs5(203F'67%1(1'(H&2B'120‚'00'66'64QTmnlw}zTlv$F'4(5&'F'06(I'&'F41'%04F%12R2'1 G462%04QXQ'H6&%02H($%&)%&462%0lTzEfgglH%F)4H64HH'Q'&46%&F4((()'H6&%F'6'&46'267'&67'YEseaER4H2Q26C%&67'‚'HSg$4&B%0 $CHQ'eaEV4H2Q26C4667's02P'&(26C%R$4Q2R%&024p&P20'TeQQ146'(I'&'H4Q2B&46'15(20367'p06$4QmnH5&P'}w20r`$4QPvTvTm}x401 4&')&'('06'120H4Q'014&C'4&(—$Xb$XTu

Tick this box to confirm that the raw and calibrated dates are available in the paper or in Supplementary Information.
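The C/N screening mentioned in the dating methods can be illustrated with a minimal sketch. It assumes that the reported C/N values are atomic ratios derived from the %C and %N weight percentages, and that a commonly cited acceptance window of roughly 2.9 to 3.6 is used for collagen. Neither assumption is stated in the summary above. The function names, the window, and the sample values are hypothetical and serve only as an illustration.

```python
# Minimal sketch of a collagen C/N quality screen.
# ASSUMPTIONS (not from the reporting summary): C/N is an atomic ratio,
# and the acceptance window is 2.9-3.6; sample values are invented.

ATOMIC_MASS_C = 12.011  # g/mol
ATOMIC_MASS_N = 14.007  # g/mol


def atomic_cn_ratio(pct_c: float, pct_n: float) -> float:
    """Convert carbon and nitrogen weight percentages to an atomic C/N ratio."""
    if pct_c <= 0 or pct_n <= 0:
        raise ValueError("weight percentages must be positive")
    return (pct_c / ATOMIC_MASS_C) / (pct_n / ATOMIC_MASS_N)


def is_well_preserved(cn_ratio: float, low: float = 2.9, high: float = 3.6) -> bool:
    """Return True when the ratio falls inside the assumed acceptance window."""
    return low <= cn_ratio <= high


if __name__ == "__main__":
    # Hypothetical (%C, %N) measurements for two samples.
    samples = {"sample_A": (42.0, 15.3), "sample_B": (30.0, 8.0)}
    for name, (pct_c, pct_n) in samples.items():
        ratio = atomic_cn_ratio(pct_c, pct_n)
        status = "accepted" if is_well_preserved(ratio) else "flagged"
        print(f"{name}: C/N = {ratio:.2f} ({status})")
```

Under these assumptions, the range of 3.06 to 3.45 reported above would pass the screen at both ends.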

Ethics oversight
We include an Ethics statement as follows.

"The ancient samples newly reported in this study were collected with the permission of the custodians of the samples, who are the archaeologists or museums in each of the countries for which we analyzed the data. We applied a case-by-case approach to obtaining permissions for each set of samples depending on the local expectations as these vary by region and cultural context. Every newly reported ancient sample in this study has permission for analysis from custodians of the samples who are co-authors and who affirm that ancient DNA analysis of these samples is appropriate. For most samples, we prepared formal collaboration agreements with the institution where the samples were curated to explicitly list the ancient DNA work being performed by our team, and in each of these cases, representative affiliates of the institution are co-authors. In other instances, sample custodians who are co-authors determined that generation and publication of ancient DNA data was covered under their existing permissions for sample analysis, and so new sampling agreements were not required. Going beyond what was formally required, we also ensured that the presentation of the scientific findings was sensitive to local perspectives from the regions from which the skeletons were excavated. For some regions for which we obtained DNA such as the southern islands of Japan and the Russian Far East sites we are not aware of modern communities with traditions of biological or cultural connection to the ancient remains. For other regions such as the Upper Yellow River Chinese or Mongolia the modern nation-states in which the ancient individuals lived are modern inheritors of the cultural and genetic heritage of the ancient groups. In Taiwan, in addition to obtaining formal permission for sampling from government institutions, we sought to ensure that the presentation of our results was sensitive to the perspectives of Indigenous Taiwanese who plausibly descend thousands of years ago from groups related to those from which we report data. The existence of at least sixteen non-Han Chinese Indigenous groups in Taiwan makes it difficult to connect particular sites to specific modern ethnic groups for prehistoric sites older than four hundred years, and it is rare for local communities to express connections with prehistoric sites. Nevertheless, two co-authors with Indigenous Taiwanese ancestry or cultural affiliation to these groups specifically reviewed the discussion of the Taiwan results to increase the sensitivity of our study to Indigenous group perspectives. H.-Y.Y. who is co-first author of the study has ancestry from the Paiwan Indigenous group. H.L. was the excavation leader for the Bilhun-Hanben site and is the local community leader for the Ami group, whose present-day culture shows some similarities to the material culture of the site."

Note that full information on the approval of the study protocol must also be provided in the manuscript.


Human research participants
Policy information about studies involving human research participants

Population characteristics
Individuals from diverse human populations in China and Nepal were sampled with the goal of representing local ancestry variation. For this study we were not studying phenomena affected by biological sex, age, or health status, and hence we did not track this information.



Recruitment
Study staff informed potential participants about the goals of the project, and individuals who chose to participate gave informed consent consistent with broad studies of population history and human variation and public posting of anonymized data. There were no rewards for participating and no negative consequences for not participating; all participants signed or affixed a thumbprint to the consent form reviewed by Fudan University. An important principle of our study was to ensure that the research was underpinned not only by individual informed consent, but also support from community representatives sensitive to local perspectives, and thus we carried out community consultation with minority group leaders or village leaders as an integral part of the consent process. For each minority group, community representatives affirmed community support for the study through a signature or thumbprint on a form summarizing the Community Consultation process (these forms were completed between November 10, 2014 and December 10, 2014).

Because recruitment was voluntary, it is likely that volunteers represent a non-random subset of the local populations we analyzed with respect to sex, age, and health status. However, since our focus here is on ancestry rather than any of these traits, we do not expect self-selection to bias inferences.

Ethics oversight
We include an Ethics statement as follows:

The modern sample collection was carried out in 2014 in strict accordance with the ethical research principles of The Ministry of Science and Technology of the People's Republic of China (Interim Measures for the Administration of Human Genetic Resources, June 10, 1998). Our sample collection and genotyping was further reviewed and approved by the Ethics Committee of the School of Life Sciences, Fudan University (October 22, 2014). Study staff informed potential participants about the goals of the project, and individuals who chose to participate gave informed consent consistent with broad studies of population history and human variation and public posting of anonymized data. There were no rewards for participating and no negative consequences for not participating; all participants signed or affixed a thumbprint to the consent form reviewed by Fudan University. An important principle of our study was to ensure that the research was underpinned not only by individual informed consent, but also support from community representatives sensitive to local perspectives, and thus we carried out community consultation with minority group leaders or village leaders as an integral part of the consent process. For each minority group, community representatives affirmed community support for the study through a signature or thumbprint on a form summarizing the Community Consultation process (these forms were completed between November 10, 2014 and December 10, 2014). Co-authors of the manuscript who were culturally Indigenous and in some cases were legally registered as members of minority groups specifically reviewed the manuscript's discussion of population history to increase sensitivity to local perspectives. Specifically, co-author L.W. is a Tai-Kadai speaking Zhuang person from Guangxi in southwest China; R.S. is from Nepal; and L.K. and N. are based at the Tibet University for Nationalities, and N. is an Indigenous Tibetan. We emphasize that Indigenous and community narratives co-exist with scientific ones and may or may not align with them. Indigenous ancestry should not be confused with identity, which is about self-perception and culture and cannot be defined by genetics alone.

Note that full information on the approval of the study protocol must also be provided in the manuscript.
