<<

Manuscript

1 Survival of Late Hunter- 2 Gatherer Ancestry in the Iberian Peninsula 3 4 5 Vanessa Villalba-Mouco1,2, Marieke S. van de Loosdrecht1, Cosimo Posth1, Rafael Mora3, Jorge 6 Martínez-Moreno3, Manuel Rojo-Guerra4, Domingo C. Salazar-García5, José I. Royo-Guillén6, Michael 7 Kunst7, Hélène Rougier8, Isabelle Crevecoeur9, Héctor Arcusa-Magallón10, Cristina Tejedor- 8 Rodríguez11, Iñigo García-Martínez de Lagran12, Rafael Garrido-Pena13, Kurt W. Alt14, 15, Choongwon 9 Jeong1, Stephan Schiffels1, Pilar Utrilla2, Johannes Krause1, Wolfgang Haak1,16 10 11 12 1 Department of Archaeogenetics, Max Planck Institute for the Science of Human History, Kahlaische 13 Straße 10, 07745 Jena, . 14 2 Departamento de Ciencias de la Antigüedad, Grupo Primeros Pobladores del Valle del Ebro (PPVE), 15 Instituto de Investigación en Ciencias Ambientales (IUCA), Universidad de Zaragoza, Pedro Cerbuna, 16 50009, Zaragoza, . 17 3 Centre d'Estudis del Patrimoni Arqueològic de la Prehistòria (CEPAP), Facultat de Lletres, 18 Universitat Autònoma Barcelona, 08190 Bellaterra, Spain. 19 4 Department of , University of Valladolid, Plaza del campus, 47011 Valladolid, Spain. 20 5 Grupo de Investigación en Prehistoria IT-622-13 (UPV-EHU)/IKERBASQUE-Basque Foundation for 21 Science, Euskal Herriko Unibertsitatea, Francisco Tomas y Valiente s/n., 01006 Vitoria, Spain. 22 6 Dirección General de Cultura y Patrimonio. Gobierno de Aragón, Avenida de Ranillas, 5 D. 50018 23 Zaragoza, Spain. 24 7 Deutsches Archäologisches Institut, Abteilung Madrid, E-28002 Madrid, Spain. 25 8 Department of Anthropology, California State University Northridge, Northridge, California 91330- 26 8244, USA. 27 9 Université de Bordeaux, CNRS, UMR 5199-PACEA, 33615 Pessac Cedex, . 28 10 Professional Archaeologist. 29 11 Postdoctoral researcher. Juan de la Cierva-Formación Program. Institute of Heritage Sciences. 30 INCIPIT-CSIC, Av. de Vigo, 15705 Santiago de Compostela, Spain. 31 12 Postdoctoral researcher. Juan de la Cierva-Incorporación Program. Department of Prehistory. 32 Valladolid University, Plaza del campus, 47011 Valladolid, Spain. 33 13 Department of Prehistory. Universidad Autónoma de Madrid, Campus de Cantoblanco, 28049, 34 Madrid, Spain. 35 14 Center of Natural and Cultural Human History, Danube Private University, Steiner Landstr. 124, A- 36 3500 Krems, . 37 15 Department of Biomedical Engineering, University of Basel, , Gewerbestrasse 14-16, 38 CH-4123 Allschwil 39 16 Corresponding author 40 41 42 43 44 45 46 47 48 49 50 Correspondence 51 [email protected] (W.H.) 52 53 Summary 54 55 The Iberian Peninsula in southwestern represents an important test case for the study 56 of human population movements during prehistoric periods. During the Last Glacial Maximum 57 (LGM) the peninsula formed a periglacial refugium [1] for hunter-gatherers (HG) and thus 58 served as a potential source for the re-peopling of northern latitudes [2]. The post-LGM genetic 59 signature was previously described as a cline from Western HG (WHG) to Eastern HG (EHG), 60 further shaped by later expansions from the and the North Pontic steppes 61 [3–9]. Western and central Europe were dominated by ancestry associated with the ~14,000- 62 -old individual from Villabruna, , which had largely replaced earlier genetic ancestry, 63 represented by 19,000-15,000-year-old individuals associated with the culture 64 [2]. However, little is known about the genetic diversity in southern European refugia, the 65 presence of distinct genetic clusters and correspondance with geography. Here, we report new 66 genome-wide data from eleven HG and individuals that highlight the late survival of 67 ancestry in Iberia, reported previously in Magdalenian-associated individuals. We 68 show that all Iberian HG, including the oldest ~19,000-year-old individual from El Mirón in 69 Spain, carry dual ancestry from both Villabruna and the Magdalenian-related individuals. Thus, 70 our results suggest an early connection between two potential refugia resulting in a genetic 71 ancestry that survived in later Iberian HG. Our new genomic data from Iberian Early and Middle 72 Neolithic individuals show that the dual Iberian HG genomic legacy pertains in the peninsula, 73 suggesting that expanding farmers mixed with local HG. 74 75 76 Keywords 77 78 Paleolithic, , Neolithic, ancient DNA, human, Last Glacial Maximum, Iberia, 79 Europe, genome, ancestry 80 81 Results and Discussion 82 83 84 We successfully generated autosomal genome-wide data and mitochondrial genomes 85 of ten new individuals from key sites in the Iberian Peninsula ranging from ~13,000–6,000 yrs 86 cal BP: Late (n=2), Mesolithic, (n=1), Early Neolithic (n=4) and Middle 87 Neolithic (n=3) (STAR Methods, Data S1, Figure 1). We furthermore improved the sequencing 88 depth of one Upper Paleolithic individual from the Troisième caverne of Goyet (), 89 dated to ~15,000 yrs cal BP and associated with the Magdalenian culture [2]. For each 90 individual we generated multiple double-stranded DNA libraries with unique index pairs [10,11] 91 for next generation sequencing from ancient DNA (aDNA) extracted from teeth and bones [12] 92 (STAR Methods, Data S2). These were subsequently enriched using targeted in-solution 93 capture for ~1240K informative nuclear single nucleotide polymorphisms (SNPs) [13], and 94 independently for the complete mitogenome [14], and sequenced on an Illumina HiSeq4000 95 platform. All libraries contained short DNA fragments (46-65 base pair length on average) with 96 post-mortem deamination patterns characteristic for aDNA (4-16% for UDG-half and 19-31% 97 for non-UDG at the first base pair position; STAR Methods, Data S3 and S5). We estimate 98 contamination rates from nuclear DNA in males to be 1.0-3.4% for final merged libraries using 99 ANGSD method 2 [15], and for mitogenomes in both sexes to be 0.14-2.20% using ContamMix 100 [16] (STAR Methods, Data S6). After quality filtering we obtained an endogenous DNA content 101 of 1.3-29.5% on the targeted 1240K SNPs (STAR Methods, Data S3). We called pseudo- 102 haploid genotypes for each individual (merged libraries) by randomly choosing a single base 103 per site and intersected our data with a set of global modern populations genotyped for ~1240K 104 nuclear SNP positions [17], including published ancient individuals from [2,5,7–9,13,14,18,21– 105 26]. The final dataset from the newly reported individuals contained 19,269-814,072 covered 106 SNPs (STAR Methods, Data S3, S7). For principal component analysis (PCA) we intersected 107 our new data and published ancient individuals with a panel of worldwide modern populations 108 genotyped on the Affymetrix Human Origins (HO) array [27]. This data set includes ~600K 109 intersecting autosomal SNPs for which our individuals covered 10,602-447,287 SNPs. 110 111 112 Genetic structure in Iberian hunter-gatherers 113 To characterise the genetic differentiation between HG individuals, we calculated 114 genetic distances, defined as 1 – f3, where f3 denotes the f3-outgroup statistics [27], for 115 pairwise comparisons among all published and newly generated HG, and visualized the results 116 using Multidimensional Scaling (MDS) (STAR Methods, Figure 2A). The HG individuals form 117 distinguishable clusters on the MDS plot, supported by f4-statistics and clustering analysis 118 (Figure S1), which we label in line with Fu et al. [2] as Villabruna, Věstonice, Satsurblia, and 119 Mal’ta clusters, respectively. Henceforth we present genetic clusters in Italics and individuals 120 in normal print. We introduce the GoyetQ2 cluster (based on the highest genomic coverage) 121 representing the Magdalenian-associated individuals Goyet Q-2, 49, Rigney 1, and 122 Burkhardtshöhle. With more data available, we notice that Iberian HG form a cline between 123 the GoyetQ2 and Villabruna clusters. This cline also includes El Mirón, which had previously 124 been considered representing its own El Mirón cluster together with all individuals of the 125 GoyetQ2 cluster (yellow symbols in Figure 2A) [2]. Here, Canes 1 and La Braña 1 (Mesolithic 126 individuals from Cantabria in northern Iberia) are falling closer to the Villabruna cluster, whilst 127 Chan (northwestern Iberia) and our newly reported individuals from Moita do Sebastião 128 (Portuguese Atlantic coast) and Balma Guilanya (Pre-Pyrenean region, northeastern Iberia) 129 are closer to El Mirón, which is in turn closer to the Magdalenian GoyetQ2 cluster 130 (Supplemental Information 2.1). 131 132 This observation is confirmed by f4-statistics of the form f4(GoyetQ2, Villabruna; test, Mbuti), 133 which measures whether a test population shares more genetic drift with Goyet Q-2 than with 134 the Villabruna individual. Three Iberian HG (Chan, Moita do Sebastião, and El Mirón) as 135 as Hohle Fels 49 and Goyet Q-116-1 show significantly positive f4-values, indicating that these 136 individuals shared more ancestry with Goyet Q-2 than with Villabruna (Figure 2B). This 137 heterogeneity in Iberian HG cannot be explained by genetic drift alone (to which this type of F- 138 statistics is robust against), but only by admixture between two sources related to Goyet Q-2 139 and Villabruna, respectively. We visualise this admixture cline using contrasting f3-outgroup 140 statistics of the form f3(GoyetQ2; test, Mbuti) and f3(Villabruna; test, Mbuti) (Figure 3A). The 141 individuals from the Villabruna cluster deviate from the symmetry line x=y towards the y-axis, 142 expectedly, indicating excess genetic drift shared with Villabruna. In contrast, individuals of the 143 GoyetQ2 cluster deviate from the symmetry line x=y towards the x-axis, indicating excess 144 genetic drift with Goyet Q-2. Iberian HG fall between the two clusters, which is inconsistent 145 with them forming a clade with either group, but can only be explained by admixture. Here, 146 Iberian HG Canes 1 and La Braña 1 share more Villabruna-like ancestry while El Mirón, Moita 147 do Sebastião and Chan share more Goyet Q-2-like ancestry. 148 149 To further confirm the potential admixture of El Mirón, we used f4(Goyet-Q-2 cluster, ElMirón; 150 Villabruna, Mbuti) to test if Magdalenian-associated individuals were cladal with El Mirón. Here, 151 we obtained significantly negative Z-scores for Hohle Fels 49, Goyet Q-2 and Burkhardshöhle. 152 Among these, Goyet Q-2 has the highest data quality and most negative Z-score, and thus 153 represents the best proxy for the non-Villabruna-like ancestry proportion in individuals such as 154 El Mirón (Z= -6.82) (Data S8). Based on this observation, we used the test f4(Goyet Q-2, Goyet 155 Q-2 cluster; Villabruna, Mbuti), for which El Mirón is significantly negative (Figure 3B, Data 156 S8), confirming shared ancestry with Villabruna. 157 158 To show that the affinity of El Mirón with the Villabruna individual cannot be explained by El 159 Mirón representing a basal split from Villabruna and the other Goyet Q-2 individuals we used 160 the test f4(Goyet Q-2 cluster, Villabruna; El Mirón, Mbuti). Here, all individuals of the Goyet Q- 161 2 cluster are significantly positive (El Mirón Z-score = 11.25), indicating that this cluster does 162 not represent a sister branch of Villabruna and that El Mirón is not an outgroup to both the 163 Villabruna and GoyetQ2 clusters (Figure 3C, Data S8). The mixed ancestry of El Mirón could 164 also explain its reduced affinity to the 35,000-year-old Goyet Q116-1 individual when 165 compared to the younger GoyetQ2 cluster in the test f4(‘GoyetQ2 cluster’, Goyet Q-2; Goyet 166 Q116-1, Mbuti) (Data S8). 167 168 Having established two potential Paleolithic source populations surviving in Iberia from 169 ~19,000 yrs BP onwards, we used the admixture modelling programs qpWave and qpAdm 170 (ADMIXTOOLS; STAR Methods, Figure 3C) to explore the dual ancestry in all Iberian HGs. 171 We used Villabruna and Goyet Q-2 as ultimate sources to model the dual ancestry in European 172 HGs relative to outgroups that can distinguish these two sources from shared deeper 173 ancestries (STAR Methods). Our two-source admixture model provides a good fit for the 174 genetic profiles of most European HG and is consistent with the cline between Villabruna and 175 Goyet Q-2-like ancestries described above (Figure 3D, STAR methods). Here, Villabruna-like 176 ancestry is the dominant component (69.8 ± 4.3 -100%) in individuals of the Villabruna cluster, 177 and Goyet Q-2-like ancestry the dominant component (61.9 ± 6.3 - 94.3 ± 5.7 %) in GoyetQ2 178 cluster individuals (Figure 3C, Data S8). These results underline the power of our outgroups 179 and choice of proxies to differentiate Goyet Q-2-, from Villabruna-like ancestry within our test 180 individuals (STAR methods). Congruent with the pattern observed in MDS (Figure 1A), the F- 181 statistic based tests (Figure 1B, 2A), and the biplot of f3-outgroup tests (Figure 3B), the two- 182 source admixture model assigns a higher proportion of Goyet Q-2-like ancestry to Iberian HG 183 (ranging from 23.7 to 75.3%) than to contemporaneous WHG outside of Iberia. Balma 184 Guilanyà, La Braña 1, and Canes 1 (Figure 3B, Data S9) show elevated Villabruna admixture 185 proportions, but also higher Goyet Q-2 proportions than non-Iberian HG. 186 187 188 Dual hunter-gatherer genetic legacy in Iberian Neolithic individuals 189 During the Neolithic transition ~7600 ago, human expansions reached the Iberian 190 Peninsula relatively swiftly via expanding early farmers from western Anatolia [3–5]. The rapid 191 expansion of Early Neolithic (EN) individuals associated with farming practises across Europe 192 resulted in a relatively low genetic variability in the reported Neolithic genomes, which makes 193 it difficult to distinguish between the Mediterranean and Danubian routes of expansion of 194 Neolithic lifeways [28,29]. However, Olalde et al. [9] noted subtle regional differences between 195 WHG individuals and used the proportion of HG ancestry from La Braña 1 in Neolithic Iberians 196 to trace the expansion from southwestern Europe along the Atlantic Coast to Britain. This 197 movement corresponds well with the Megalithic practices of these regions observed in 198 the archaeological record [30,31]. Under the assumption that these proportions reflect one, or 199 potentially more, local admixture events along the routes of expansion, it is thus possible to 200 distinguish Neolithic groups by their varying autochthonous HGs signatures [18]. 201 202 Given the presence of two ancestral lineages in Iberian HG, we aimed to explore this potential 203 genetic legacy in our newly generated Early and Middle Neolithic (MN) individuals. We first 204 used principal component analysis (PCA) to assess the genetic affinities qualitatively (Figure 205 4A). Here, the new Neolithic Iberian individuals spread along a cline from Neolithic Anatolia to 206 WHG, on which the new individuals cluster with contemporaneous Iberian Neolithic individuals 207 [5,7,8,13,19,26]. As shown before, MN individuals are shifted towards WHG individuals [3], 208 including the newly reported MN individuals from Cova de Els Trocs and Cueva de Chaves. 209 210 Olalde et al. [8] have shown that Cova Bonica and other published EN individuals received 211 either all or the majority of their HG ancestry en route from western Anatolia to Iberia, with only 212 marginal additional ancestry from inside Iberia [8,9]. We thus tested our new Iberian HG 213 individuals, who differ subtly from those previously reported. Using qpAdm models consistent 214 with those above (Supplementary Figure S3), we aimed to trace and quantify the proportion of 215 GoyetQ-2-, and Villabruna-like HG ancestry in EN and MN groups from Iberia and 216 western/central Europe as a mixture of three ancestral sources: Anatolian Neolithic, GoyetQ- 217 2 and Villabruna, respectively. We show that EN Iberians shared a higher proportion of 218 GoyetQ-2-like ancestry than EN individuals from outside Iberia (Figure 4B, Data S10). GoyetQ- 219 2-like ancestry is higher in EN from southern Iberia (Andalusia) suggesting additional 220 admixture with local Iberian HG, who carried mixed Upper Paleolithic ancestry. 221 222 Goyet Q-2 ancestry is continuously detectable in all Iberian MN individuals including broadly 223 contemporaneous individuals from Scotland, Wales, Ireland, and France, but not in Neolithic 224 England, for which the qpAdm model with three sources fails (p-value 6.91e-05) in favour of 225 two sources, despite being poorly supported (p-value 0.001) (Data S10). GoyetQ-2-like 226 ancestry is highest in all Iberian MN (except Chaves MN) when compared with other MN 227 populations that share a similar overall amount of HG ancestry (Figure 4B and 4C). We note 228 the presence of Goyet Q-2 ancestry in MN Trocs, where this ancestry was not observed during 229 the EN, but importantly also in MN individuals from France and Globular Amphora from . 230 231 Olalde and colleagues (2018) reported an elevated signal of La Braña 1 ancestry in Neolithic 232 individuals from Wales and England (using KO1 HG from and Anatolian Neolithic as 233 the other two sources), and thus argued for an Iberian contribution to the Neolithic in Britain 234 [9]. We replicated these findings by using similar sources (El Mirón instead of Goyet Q-2, Data 235 S10), showing that these results are sensitive to the source populations used. However, our 236 models with Goyet Q-2 as ultimate source highlight not only the admixed of La Braña 1 237 and El Mirón, but Goyet Q-2-like ancestry in NM individuals outside Iberia also hints at Iberia 238 as one possible but not the exclusive source of the Neolithic in Britain. Further sampling from 239 regions in today’s France, the Netherlands, Belgium, and Germany is needed to 240 answer this question. 241 242 243 Conclusions 244 Our results highlight the unique genetic structure observed in Iberian HG individuals, 245 which results from admixture of individuals related to the GoyetQ2 and Villabruna clusters. 246 This suggests a survival of two lineages of Upper Paleolithic/Late Pleistocene ancestry in 247 Holocene western Europe, in particular the Iberian Peninsula, whereas HG ancestry in most 248 other regions was largely replaced by Villabruna-like ancestry. With an age estimate of 249 ~18,700 yrs cal BP for the El Mirón individual, the oldest representative of this mixed ancestry, 250 the timing of this admixture suggests an early connection (terminus ante quem) between 251 putative ancestries from LGM refugia. It is possible that GoyetQ2 ancestry could have existed 252 in Iberia in unadmixed form, where it was complemented by Villabruna ancestry as early as 253 ~18,700 years ago. Alternatively, both Magdalenian-associated Goyet-Q2 and Villabruna 254 ancestries originated in regions outside Iberia and arrived in Iberia independently, where both 255 lineages admixed, or had already existed in admixed form outside Iberia. Interestingly, the dual 256 Upper Paleolithic ancestry was also found in EN individuals from the Iberian Peninsula, 257 supporting the hypothesis of additional local admixture with resident HG in Iberia during the 258 time of the Mesolithic-Neolithic transition. 259 260 261 Acknowledgements 262 We thank the members of the Archaeogenetics Department of the Max Planck Institute for the 263 Science of Human History, especially Maite Rivollat, Thiseas Lamnidis, Cody Parker, Rodrigo 264 Barquera, Stephen Clayton, Aditya Kumar and all technicians. We thank Iñigo Olalde for 265 valuable comments on the manuscript. We are indebted to the Museo de Huesca, Patrick 266 Semal and the Royal Belgian Institute of Natural Sciences, and all archaeologists involved in 267 the excavations. The genetic research was funded by the Max Planck Society and the 268 European Research Council ERC-CoG 771234 PALEoRIDER (WH). VVM was funded by a 269 predoctoral scholarship of the Gobierno de Aragón and the Fondo Social Europeo 270 (BOA20150701025) and a 3-months research stay grant (CH 76/16) by Programa CAI-Ibercaja 271 de Estancias de Investigación. VVM and PU are members of the Spanish project HAR2014- 272 59042-P (Transiciones climáticas y adaptaciones sociales en la prehistoria de la Cuenca del 273 Ebro), and of the regional government of Aragon PPVE research group (H-07: Primeros 274 Pobladores del Valle del Ebro). The Goyet project was funded by the Wenner-Gren Foundation 275 (Gr. 7837 to HR), the College of Social and Behavioral Sciences of CSUN, and the CSUN 276 Competition for Research, Scholarship and Creative Activity Awards. 277 278 Author Contributions 279 V.V.-M., J.K., W.H. conceived the study, R.M., J.M.-M, M.R.-G., D.C. S.-G., J.I. R.-G., M. K., 280 H.R., I.C., H. A-M., C. T-R., I. G-M. d.L., R. G-P., K.W. A. and P.U. assembled samples and 281 provided archaeological context, V.V-M., M.v.d.L. and C.P. performed aDNA lab work and 282 sequencing, V.V.-M., C.P., M.v.d.L., S.S., C.J., and W.H. analysed data, V.V.-M., C.P., 283 M.v.d.L., and W.H. wrote the manuscript with input from all co-authors. 284 285 Declaration of Interests 286 The authors declare no competing interests. 287 288 289 References 290 291 1. Stewart, J.R., and Stringer, C.B. (2012). Human Evolution Out of : The Role of 292 Refugia and Climate Change. Science (80-. ). 335, 1317 LP-1321. 293 2. Fu, Q., Posth, C., Hajdinjak, M., Petr, M., Mallick, S., Fernandes, D., Furtwängler, A., 294 Haak, W., Meyer, M., Mittnik, A., et al. (2016). The genetic history of Ice Age Europe. 295 Nature 534, 200-205. 296 3. Haak, W., Lazaridis, I., Patterson, N., Rohland, N., Mallick, S., Llamas, B., Brandt, G., 297 Nordenfelt, S., Harney, E., Stewardson, K., et al. (2015). Massive migration from the 298 steppe was a source for Indo-European languages in Europe. Nature 522, 207-211. 299 4. Günther, T., and Jakobsson, M. (2016). Genes migrations and cultures in 300 prehistoric Europe—a population genomic perspective. Curr. Opin. Genet. Dev. 41, 301 115–123. 302 5. Valdiosera, C., Günther, T., Vera-Rodríguez, J.C., Ureña, I., Iriarte, E., Rodríguez- 303 Varela, R., Simões, L.G., Martínez-Sánchez, R.M., Svensson, E.M., Malmström, H., et 304 al. (2018). Four millennia of Iberian biomolecular prehistory illustrate the impact of 305 prehistoric migrations at the far end of Eurasia. Proc. Natl. Acad. Sci. 306 http://doi.org.10.1073.pnas.1717762115 307 6. Allentoft, M.E., Sikora, M., Sjögren, K.-G., Rasmussen, S., Rasmussen, M., 308 Stenderup, J., Damgaard, P.B., Schroeder, H., Ahlström, T., Vinner, L., et al. (2015). 309 Population genomics of Bronze Age Eurasia. Nature 522, 167-172. 310 7. Martiniano, R., Cassidy, L.M., Ó’Maoldúin, R., McLaughlin, R., Silva, N.M., Manco, L., 311 Fidalgo, D., Pereira, T., Coelho, M.J., Serra, M., et al. (2017). The population 312 genomics of archaeological transition in west Iberia: Investigation of ancient 313 substructure using imputation and haplotype-based methods. PLOS Genet. 13, 314 e1006852. 315 8. Olalde, I., Schroeder, H., Sandoval-Velasco, M., Vinner, L., Lobón, I., Ramirez, O., 316 Civit, S., García Borja, P., Salazar-García, D.C., Talamo, S., et al. (2015). A Common 317 Genetic Origin for Early Farmers from Mediterranean Cardial and Central European 318 LBK Cultures. Mol. Biol. Evol. 32, 3132–3142. 319 9. Olalde, I., Brace, S., Allentoft, M.E., Armit, I., Kristiansen, K., Booth, T., Rohland, N., 320 Mallick, S., Szécsényi-Nagy, A., Mittnik, A., et al. (2018). The Beaker phenomenon 321 and the genomic transformation of northwest Europe. Nature 555, 190-196. 322 10. Meyer, M., and Kircher, M. (2010). Illumina Sequencing Library Preparation for Highly 323 Multiplexed Target Capture and Sequencing. Cold Spring Harb. Protoc. 2010, 324 pdb.prot5448. 325 11. Kircher, M., Sawyer, S., and Meyer, M. (2012). Double indexing overcomes 326 inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. 40, 327 e3. 328 12. Dabney, J., Knapp, M., Glocke, I., Gansauge, M.-T., Weihmann, A., Nickel, B., 329 Valdiosera, C., García, N., Pääbo, S., Arsuaga, J.-L., et al. (2013). Complete 330 mitochondrial genome sequence of a Middle Pleistocene reconstructed 331 from ultrashort DNA fragments. Proc. Natl. Acad. Sci. U. S. A. 110, 15758–63. 332 13. Mathieson, I., Lazaridis, I., Rohland, N., Mallick, S., Patterson, N., Roodenberg, S.A., 333 Harney, E., Stewardson, K., Fernandes, D., Novak, M., et al. (2015). Genome-wide 334 patterns of selection in 230 ancient Eurasians. Nature 528, 499-503. 335 14. Mittnik, A., Wang, C.-C., Pfrengle, S., Daubaras, M., Zariņa, G., Hallgren, F., Allmäe, 336 R., Khartanovich, V., Moiseyev, V., Tõrv, M., et al. (2018). The genetic prehistory of 337 the Baltic Sea region. Nat. Commun. 9, 442. https://doi.org/10.1038/s41467-018- 338 02825-9. 339 15. Korneliussen, T.S., Albrechtsen, A., and Nielsen, R. (2014). ANGSD: Analysis of Next 340 Generation Sequencing Data. BMC Bioinformatics 15, 356. 341 https://doi.org/10.1186/s12859-014-0356-4. 342 16. Fu, Q., Mittnik, A., Johnson, P.L.F., Bos, K., Lari, M., Bollongino, R., Sun, C., 343 Giemsch, L., Schmitz, R., Burger, J., et al. (2013). A Revised Timescale for Human 344 Evolution Based on Ancient Mitochondrial Genomes. Curr. Biol. 23, 553–559. 345 17. Mallick, S., Li, H., Lipson, M., Mathieson, I., Gymrek, M., Racimo, F., Zhao, M., 346 Chennagiri, N., Nordenfelt, S., Tandon, A., et al. (2016). The Simons Genome 347 Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201-206. 348 18. Lipson, M., Szécsényi-Nagy, A., Mallick, S., Pósa, A., Stégmár, B., Keerl, V., Rohland, 349 N., Stewardson, K., Ferry, M., Michel, M., et al. (2017). Parallel palaeogenomic 350 transects reveal complex genetic history of early European farmers. Nature 551, 368- 351 372. 352 19. Fregel, R., Méndez, F.L., Bokbot, Y., Martín-Socas, D., Camalich-Massieu, M.D., 353 Santana, J., Morales, J., Ávila-Arcos, M.C., Underhill, P.A., Shapiro, B., et al. (2018). 354 Ancient genomes from evidence prehistoric migrations to the 355 from both the and Europe. Proc. Natl. Acad. Sci. 356 https://doi.org/10.1073/pnas.1800851115 357 20. Lazaridis, I., Nadel, D., Rollefson, G., Merrett, D.C., Rohland, N., Mallick, S., 358 Fernandes, D., Novak, M., Gamarra, B., Sirak, K., et al. (2016). Genomic insights into 359 the origin of farming in the ancient Near East. Nature 536, 419-424. 360 21. Mathieson, I., Alpaslan-Roodenberg, S., Posth, C., Szécsényi-Nagy, A., Rohland, N., 361 Mallick, S., Olalde, I., Broomandkhoshbacht, N., Candilio, F., Cheronet, O., et al. 362 (2018). The genomic history of southeastern Europe. Nature 555, 197-203. 363 22. Cassidy, L.M., Martiniano, R., Murphy, E.M., Teasdale, M.D., Mallory, J., Hartwell, B., 364 and Bradley, D.G. (2015). Neolithic and Bronze Age migration to Ireland and 365 establishment of the insular Atlantic genome. Proc. Natl. Acad. Sci. Available at: 366 http://doi.org/10.1073/pnas.1518445113 367 23. Sikora, M., Seguin-Orlando, A., Sousa, V.C., Albrechtsen, A., Korneliussen, T., Ko, A., 368 Rasmussen, S., Dupanloup, I., Nigst, P.R., Bosch, M.D., et al. (2017). Ancient 369 genomes show social and reproductive behavior of early Upper Paleolithic foragers. 370 Science. http://doi.10.1126/science.aao1807. 371 24. Van de Loosdrecht, M., Bouzouggar, A., Humphrey, L., Posth, C., Barton, N., Aximu- 372 Petri, A., Nickel, B., Nagel, S., Talbi, E.H., El Hajraoui, M.A., et al. (2018). Pleistocene 373 North African genomes link Near Eastern and sub-Saharan African human 374 populations. Science (80-. ). 360, 548 LP-552. 375 25. Jones, E.R., Gonzalez-Fortes, G., Connell, S., Siska, V., Eriksson, A., Martiniano, R., 376 McLaughlin, R.L., Gallego Llorente, M., Cassidy, L.M., Gamba, C., et al. (2015). Upper 377 Palaeolithic genomes reveal deep roots of modern Eurasians. Nat. Commun. 6, 8912. 378 26. González-Fortes, G., Jones, E.R., Lightfoot, E., Bonsall, C., Lazar, C., Grandal- 379 d’Anglade, A., Garralda, M.D., Drak, L., Siska, V., Simalcsik, A., et al. (2017). 380 Paleogenomic Evidence for Multi-generational Mixing between Neolithic Farmers and 381 Mesolithic Hunter-Gatherers in the Lower Danube Basin. Curr. Biol. 27, 1801– 382 1810.e10. 383 27. Patterson, N., Moorjani, P., Luo, Y., Mallick, S., Rohland, N., Zhan, Y., Genschoreck, 384 T., Webster, T., and Reich, D. (2012). Ancient Admixture in Human History. Genetics 385 192, 1065–1093. 386 28. Manning, K., Timpson, A., Colledge, S., Crema, E., Edinborough, K., Kerig, T., and 387 Shennan, S. (2014). The chronology of culture: a comparative assessment of 388 European Neolithic dating approaches. Antiquity 88, 1065–1080. 389 29. Perrin, T., Manen, C., Valdeyron, N., and Guilaine, J. (2018). Beyond the sea… The 390 Neolithic transition in the southwest of France. Quat. Int. 470, 318–332. 391 30. Sherratt, A. (1995). Instruments of conversion? The role of in the 392 mesolithic/neolithic transition in Northwest Europe. Oxford J. Archaeol. 14, 245–260. 393 31. Masset, C. (1993). Les : Sociétés néolithiques, practiques funéraires. Ed. 394 Errance, Paris. 395 32. Terradas, X., Pallarés, M., Mora, R., and Moreno, J.M. (1993). Estudi preliminar de les 396 ocupacions humanes de la balma de Guilanyà (Navès, Solsonès). Rev. d’arqueologia 397 Ponent, 231–248. 398 33. Martínez-Moreno, J., Mora, R., and Casanova, J. (2006). Balma Guilanyà y la 399 ocupación de la vertiente sur del Prepirineo del Noreste de la Península Ibérica 400 durante el Tardiglaciar. In La cuenca mediterránea durante el paleolítico superior: 401 38.000-10.000 años, pp. 444–457. 402 34. Martínez-Moreno, J., Mora, R., and Casanova, J. (2007). El contexto cronométrico y 403 tecno-tipológico durante el Tardiglaciar y Postglaciar de la viertiente sur de los 404 Pirineos orientales. Rev. d’arqueologia Ponent, 7–44. 405 35. Martínez-Moreno, J., and Mora, R. (2009). Balma Guilanyà (Prepirineo de Lleida) y el 406 Aziliense en el noreste de la Península Ibérica. Trab. Prehist. 66, 45–60. 407 36. Garcia-Guixé, E., Martínez-Moreno, J., Mora, R., Núñez, M., and Richards, M.P. 408 (2009). Stable isotope analysis of human and animal remains from the Late Upper 409 Palaeolithic site of Balma Guilanyà, southeastern Pre-Pyrenees, Spain. J. Archaeol. 410 Sci. 36, 1018–1026. 411 37. Ruiz, J., García-Sívoli, C., Martínez-Moreno, J. and Subirá, M.E. (2006). Los restos 412 humanos del Tardiglaciar de Balma Guilanyà. In La cuenca mediterránea durante el 413 paleolítico superior: 38.000-10.000 años, pp. 458–467. 414 38. Straus, L.G. (2015). Chronostratigraphy of the Pleistocene/Holocene boundary: the 415 problem in the Franco-Cantabrian region. Palaeohistoria 27, 89–122. 416 39. Martzluff, M., Martínez-Moreno, J., Guilaine, J., Mora, R., and Casanova, J. (2012). 417 Transformaciones culturales y cambios climáticos en los Pirineos catalanes entre el 418 Tardiglaciar y Holoceno antiguo: Aziliense y Sauveterriense en Balma de la 419 Margineda y Balma Guilanyà. Cuaternario y Geomorfol. 26, 61–78. 420 40. Szécsényi-Nagy, A., Roth, C., Brandt, G., Rihuete-Herrada, C., Tejedor-Rodríguez, C., 421 Held, P., García-Martínez-de-Lagrán, Í., Arcusa Magallón, H., Zesch, S., Knipper, C., 422 et al. (2017). The maternal genetic make-up of the Iberian Peninsula between the 423 Neolithic and the Early Bronze Age. Sci. Rep. 7, 15644. 424 https://doi.org/10.1038/s41598-017-15480-9. 425 41. Lubell, D., Jackes, M., Schwarcz, H., Knyf, M., and Meiklejohn, C. (1994). The 426 Mesolithic-Neolithic transition in : isotopic and dental evidence of diet. J. 427 Archaeol. Sci. 21, 201-216. 428 42. Jackes, M., and Alvim, P. (1999). Reconstructing Moita do Sebastião, the first step. In 429 Do Epipaleolítico ao Calcolítico na Península Iberica, Actas do IV Congresso de 430 Arqueologia Peninsular, pp 13-25. 431 43. Bicho, N.F. (1994). The End of the Paleolithic and the Mesolithic in Portugal. Curr. 432 Anthropol. 35, 664–674. 433 44. Utrilla, P., Laborda, R. (2018). La Cueva de Chaves (Bastaras, Huesca): 15000 años 434 de ocupación prehistórica. Trab. Prehist., 248–269. 435 https://doi.org/10.3989/tp.2018.12214 436 45. Castaños, P.M. (2004). Estudio arqueozoológico de los macromamíferos del Neolítico 437 de la Cueva de Chaves (Huesca). Saldvie Estud. Prehist. y Arqueol., 125–172. 438 46. Baldellou, V. (2011). La Cueva de Chaves (Bastarás-Casbas, Huesca). SAGVNTVM 439 Extra 12, 141–144. 440 47. Utrilla, P., and Baldellou, V. (2001). Cantos pintados neolíticos de la Cueva de Chaves 441 (Bastarás, Huesca). Saldvie Estud. Prehist. y Arqueol., 45–126. 442 48. Utrilla, P., and Baldellou, V. (2007). Les galets peints de la Grotte de Chaves. Bull. la 443 Société préhistorique Ariège-Pyrénées 62, 73–88. 444 49. Zilhão, J. (2001). Radiocarbon evidence for maritime pioneer colonization at the 445 origins of farming in west Mediterranean Europe. Proc. Natl. Acad. Sci. 98, 14180 LP- 446 14185. 447 50. Isern, N., Zilhão, J., Fort, J., and Ammerman, A.J. (2017). Modeling the role of 448 voyaging in the coastal spread of the Early Neolithic in the West Mediterranean. Proc. 449 Natl. Acad. Sci. 114, 897 LP-902. 450 51. Bernabeu, J., Balaguer, L.M., Esquembre-Bebiá, M.A., Pérez, J.R.O., and Soler, J. de 451 D.B. (2009). La cerámica impresa mediterránea en el origen del Neolítico de la 452 península Ibérica. In De Méditerranée et d’ailleurs...: mélanges offerts à Jean 453 Guilaine, pp. 83–96. 454 52. Martins, H., Oms, F.X., Pereira, L., Pike, A.W.G., Rowsell, K., and Zilhão, J. (2015). 455 the beginning of the Neolithic in Iberia: new results, new 456 problems. J. Mediterr. Archaeol. 28, 105–131. 457 53. Villalba-Mouco, V., Martínez-labarga, C., Utrilla, P., Laborda, R., Ignacio, J., and 458 Salazar-garcía, D.C. (2018). Reconstruction of human subsistence and husbandry 459 strategies from the Iberian Early Neolithic : A stable isotope approach. Am. J. Phys. 460 Anthropol. 167, 257–271. 461 54. Cuenca-Romero, M. del C.A., Ballestero, E.C., Blanco, S.P., Díez, G.M., and Pastor, 462 C.D. (2011). El “campo de hoyos” calcolítico de Fuente Celada (Burgos): datos 463 preliminares y perspectivas. Complutum 22, 47–69. 464 55. Rojo-Guerra, M., Peña-Chocarro, L., Royo-Guillén, J.I., Tejedor, C., García-Martínez 465 de Lagrán, I., Arcusa, H., Garrido Pena, R., Moreno, M., Mazzucco, N., and Gibaja, 466 J.F. et al. (2013). Pastores trashumantes del Neolítico Antiguo en un entorno de alta 467 montaña: secuencia crono-cultural de la Cova de Els Trocs (San Feliú de Veri, 468 Huesca). BSAA Arqueol., 9–55. 469 56. Posth, C., Nägele, K., Colleran, H., Valentin, F., Bedford, S., Kami, K.W., Shing, R., 470 Buckley, H., Kinaston, R., Walworth, M., et al. (2018). Language continuity despite 471 population replacement in Remote Oceania. Nat. Ecol. Evol. 2, 731–740. 472 57. Rohland, N., Harney, E., Mallick, S., Nordenfelt, S., and Reich, D. (2015). Partial 473 uracil–DNA–glycosylase treatment for screening of ancient DNA. Philos. Trans. R. 474 Soc. B Biol. Sci. 370. http://doi.10.1098/rstb.2013.0624. 475 58. Peltzer, A., Jäger, G., Herbig, A., Seitz, A., Kniep, C., Krause, J., and Nieselt, K. 476 (2016). EAGER: efficient ancient genome reconstruction. Genome Biol. 17, 60. 477 https://doi.org/10.1186/s13059-016-0918-z. 478 59. Maricic, T., Whitten, M., and Pääbo, S. (2010). Multiplexed DNA Sequence Capture of 479 Mitochondrial Genomes Using PCR Products. PLoS One 5, e14004. 480 60. Fu, Q., Hajdinjak, M., Moldovan, O.T., Constantin, S., Mallick, S., Skoglund, P., 481 Patterson, N., Rohland, N., Lazaridis, I., Nickel, B., et al. (2015). An early modern 482 human from with a recent ancestor. Nature 524, 216-219. 483 61. Schubert, M., Lindgreen, S., and Orlando, L. (2016). AdapterRemoval v2: rapid 484 adapter trimming, identification, and read merging. BMC Res. Notes 9, 88. 485 https://doi.org/10.1186/s13104-016-1900-2. 486 62. Li, H., and Durbin, R. (2009). Fast and accurate short read alignment with Burrows– 487 Wheeler transform. Bioinformatics 25, 1754–1760. 488 63. Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, 489 S., Cooper, A., Markowitz, S., Duran, C., et al. (2012). Geneious Basic: An integrated 490 and extendable desktop software platform for the organization and analysis of 491 sequence data. Bioinformatics 28, 1647–1649. 492 64. Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., 493 Abecasis, G., Durbin, R., and Subgroup, 1000 Genome Project Data Processing 494 (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078– 495 2079. 496 65. Monroy Kuhn, J.M., Jakobsson, M., and Günther, T. (2018). Estimating genetic kin 497 relationships in prehistoric populations. PLoS One 13, e0195491. 498 https://doi.org/10.1371/journal.pone.0195491. 499 66. Weissensteiner, H., Pacher, D., Kloss-Brandstätter, A., Forer, L., Specht, G., Bandelt, 500 H.-J., Kronenberg, F., Salas, A., and Schönherr, S. (2016). HaploGrep 2: 501 mitochondrial classification in the era of high-throughput sequencing. 502 Nucleic Acids Res. 44, 58–63. 503 67. Poznik, G.D. (2016). Identifying Y-chromosome in arbitrarily large 504 samples of sequenced or genotyped men. bioRxiv. Available at: 505 http://biorxiv.org/content/early/2016/11/19/088716.abstract. 506 68. Patterson, N., Price, A.L., and Reich, D. (2006). Population Structure and 507 Eigenanalysis. PLOS Genet. 2, e190. 508 509 510 Main text figure legends 511 512 513 Figure 1. Geo-chronological location of ancient individuals from the Iberian Peninsula 514 (A) Map showing the geographical location of the new individuals and sites included in this 515 study (black outlines) and relevant published data for HG and EN/MN Neolithic individuals from 516 the Iberian Peninsula (no outlines). (B) Radiocarbon dates of newly reported individuals in 517 calibrated years BP (error bars indicate the 2-sigma range). 518 519 520 Figure 2. Genetics distances between European HG and key f4-statistics 521 (A) MDS plot of genetic distances between Eurasian HG individuals (>30,000 SNPs). The main 522 genetic clusters defined previously [2] are: Vestonice (dark red), Mal’ta (orange: Mal’ta 1 [MA1] 523 and Afontova Gora 3), Satsurblia (light pink: Kotias and Satsurblia), Villabruna (blue: Koros 524 EN-HG. Berry-au-Bac 1, Rochedane, Villabruna, Chaurdardes 1, Ranchot 88, LaBraña 1 and 525 Loschbour), and GoyetQ2 (yellow; newly defined). Iberian HG are shown as green symbols. 526 (B) f4-statistics highlighting the excess affinity to Goyet Q-2 in Iberian and European HG 527 (>20,000 SNPs; error bars indicate ±3 standard errors; Z-scores >3 (green)). 528 529 530 Figure 3. Key f3-outgroup tests, f4-statistics, and qpAdm results (A) Biplot of f3-outgroup 531 tests illustrating the Villabruna-like and GoyetQ2-like genetic ancestries in European HG. The 532 x=y axis marks full symmetry between GoyetQ2-, and Villabruna-like ancestries and deviations 533 mark excess ancestry shared with GoyetQ2 (yellow) or Villabruna (blue). (B) Results of f4- 534 statistics highlighting the shared genetic drift between El Mirón and Villabruna individuals using 535 20,000 intersecting SNPs. (C) Results of f4-statistic showing that El Mirón is not a sister clade 536 of Villabruna. (D) Modelling European HG as a two-way admixture of Villabruna-, and 537 GoyetQ2-like ancestry using Villabruna and Goyet Q-2 as proxies, respectively. 538 539 540 Figure 4. PCA results and qpAdm admixture models. (A) PCA analysis calculated with 777 541 present-day West Eurasians on which published ancient individuals from western Europe and 542 newly reported individuals (stars) were projected. (B) Modelling EN and MN populations from 543 Iberian and western Europe as admixture of three ancestral sources: Anatolian Neolithic, 544 Goyet Q-2 and Villabruna (Data S10). (C) f4(Iberian MN, Neolithic British Isles, test, Mbuti), 545 where test is Goyet-Q2 (yellow) or Villabruna (blue), highlighting the excess of Goyet Q-2 546 ancestry in Iberian MN compared to Neolithic England and Scotland. The affinity to Villabruna 547 is shown for comparison to avoid potential bias created by unspecified HG attraction. 548 549 Star Methods: 550 Detailed methods are provided in the online version of this paper and include the following: 551 552 KEY RESOURCES TABLE 553 CONTACT FOR REAGENT AND RESOURCE SHARING 554 EXPERIMENTAL MODEL AND SUBJECT DETAILS 555 ARCHAEOLOGICAL SITES AND SAMPLE DESCRIPTION 556 ANCIENT DNA PROCESSING AND QUALITY CONTROL 557 Sampling of ancient human remains 558 DNA extraction 559 Library preparation 560 Shotgun screening and in-solution enrichment of nuclear DNA (1240k capture) and 561 mtDNA capture 562 QUANTIFICATION AND STATISTICAL ANALYSIS 563 Read processing and assessment of ancient DNA authenticity 564 Contamination tests 565 DNA damage 566 Contamination based on the match rate to the mtDNA dataset (ContamMix) 567 Sex determination and X-contamination 568 Genotyping and merging with data set. 569 Kinship relatedness and individual assessment. 570 Phenotypic traits analysis 571 Y and mtDNA haplogroups 572 Population genetic analysis 573 Labelling population groups 574 Principal Component Analysis 575 F-statistics 576 qpAdm and qpWave 577 Multi-Dimensional Scale analysis (MDS) 578 DATA AVAILABILITY 579 580 581 Supplemental Information: 582 Supplementary Data include a text file and four supplementary figures detailing additional 583 population genomic analysis and archaeological implications of the main findings. 584 585 586 STAR METHODS 587 588 KEY RESOURCES TABLE 589 REAGENT OR RESOURCE SOURCE IDENTIFIER Biological samples Ancient individual This study/ Troisième GoyetQ2 caverne of Goyet archaeological site Ancient individual This study/ Balma BAL001/ E1206 Guilanyà Shown to be identical with BAL005 archaeological site Ancient individual This study/ Balma BAL005/ BG E 3214 Guilanyà Shown to be identical with BAL001 archaeological site Ancient individual This study/ Balma BAL003/ E9605 Guilanyà archaeological site Ancient individual This study/ Moita do CMS001/ 22 Sebastião archaeological site Ancient individual This study/ Cueva de CHA001/ 84C Chaves archaeological site Ancient individual This study/ Cueva de CHA002/ CH.NIG.11559 Chaves archaeological site Ancient individual This study/ Cueva de CHA003/ CH.NIG.11558 Chaves archaeological site Ancient individual This study/ Cueva de CHA004/ Ch.Banda13 Chaves archaeological site Ancient individual This study/ Fuente FUC003/ H62 UE 622 Celada archaeological site Ancient individual This study/ Cova de ELT002/ UE 69 C: 589 S:7 Nº Inv: 14227 Els Trocs archaeological site Ancient individual This study/ Cova de ELT006/ UE:1 C: 650 S:1 Nº Inv: 22404 Els Trocs archaeological site Chemicals, Peptides, and Recombinant Proteins 0.5 M EDTA pH 8.0 Life AM9261 1x Tris-EDTA pH 8.0 AppliChem A8569,0500 Proteinase K Sigma-Aldrich P2308-100MG Guanidine hydrochloride Sigma-Aldrich G3272-500g 3M Sodium Acetate pH 5,2 Sigma-Aldrich S7899-500ML Tween 20 Sigma-Aldrich P9416-50ML Water Chromasolv Plus Sigma-Aldrich 34877-2.5L Ethanol Merck 1009832511 Isopropanol Merck 1070222511 Buffer Tango Life Technologies BY5 T4 DNA Polymerase NEB M0203 L T4 Polynucleotide Kinase NEB M0201 L User Enzyme NEB M5505 L Uracil Glycosylase inhibitor NEB (UGI) M0281 S Bst 2.0 DNA Polymerase NEB M0537 S Quick Ligation Kit NEB M2200 L BSA 20mg/ml NEB B9000 S ATP NEB P0756 S DyNAmo Flash SYBR Green qPCR Kit Life Technologies F-415L dNTPs 25 mM Thermo Scientific R1121 Tween 20 Sigma-Aldrich P9416-50ML Ethanol Merck 1009832511 D1000 ScreenTapes Agilent Technologies 5067-5582 D1000 Reagents Agilent Technologies 5067-5583 Pfu Turbo Cx Hotstart DNA 600412 Polymerase Agilent Technologies Herculase II Fusion DNA Polymerase Agilent Technologies 600679 Oligo aCGH/Chip-on-Chip Hybridization Kit Agilent Technologies 5188-5220 1x TE-Puffer pH 8,0 low EDTA AppliChem A8569,0500 Sodiumhydroxide Pellets Fisher Scientific 10306200 GE Healthcare 65152105050250 Sera-Mag Speed CM Lifescience Dynabeads MyOne Streptavidin T1 Life Technologies 65601 DyNAmo Flash SYBR Green qPCR Kit Life Technologies F-415L 0.5 M EDTA pH 8.0 Life Technologies AM9261 GeneRuler Ultra Low Range DNA Ladder Life Technologies SM1211 10x GeneAmp PCR Gold Buffer and MgCl2 Life Technologies 4379874 Human Cot-1 DNA Life Technologies 15279011 1M Tris-HCI pH 8.0 Life Technologies 15568025 20x SCC Buffer Life Technologies AM9770 UltraPure™ Salmon Sperm DNA Solution Life Technologies 15632011 Ethanol Merck 1009831011 MinElute PCR Purification Kit Qiagen 28006 PEG 8000 Powder, Molecular Biology Grade Promega V3011 20% SDS Solution Serva 39575.01 3M Sodium Acetate buffer Sigma-Aldrich solution pH 5,2 S7899-500ML 5 M Sodium chloride solution Sigma-Aldrich S5150-1L Denhardt’s solution Sigma-Aldrich D9905-5Ml Tween 20 Sigma-Aldrich P9416-50ML Critical Commercial Assays High Pure Viral Nucleic Acid Roche 5114403001 Large Volume Kit MinElute PCR Purification Kit Qiagen 28006 NextSeq 500/550 High Output NextSeq 500/550 High NextSeq 500/550 High Output Kit v2 Kit v2 Output Kit v2 Deposited Data Raw and analyzed data This paper ENA: XXXXXXXXXX Software and Algorithms Samtools http://samtools.sourceforge.net/ BWA http://bio-bwa.sourceforge.net/ ADMIXTOOLS https://github.com/DReichLab/AdmixTools smartpca https://www.hsph.harvard.edu/alkes- price/software/ ADMIXTURE http://www.genetics.ucla.edu/software/admixture/ Haplogrep 2 http://haplogrep.uibk.ac.at/ ANGSD http://popgen.dk/angsd/index.php/ Contamination EAGER http://eager.readthedocs.io/en/latest/ 590 591 CONTACT FOR REAGENT AND RESOURCE SHARING 592 Further information and requests for resources and reagents should be directed to Wolfgang 593 Haak ([email protected]). 594 595 EXPERIMENTAL MODEL AND SUBJECT DETAILS 596 We generated new genome-wide data from skeletal remains of eleven prehistoric individuals 597 from the Belgian Upper Paleolithic and Iberian Late Upper Paleolithic, Mesolithic, Early and 598 Middle Neolithic. 599 600 601 Archaeological sites and sample description 602 603 Troisième caverne of Goyet (Upper Paleolithic) 604 605 The Troisième caverne of Goyet (Belgium) is a cave with an extensive Paleolithic record, from 606 the Middle to the Upper Paleolithic periods (Aurigniacian, and Magdalenian). The 607 site was previously described and eight individuals were analysed by Fu et al [2]. For this study 608 we have generated deeper sequencing data from individual Goyet Q-2 who is attributed to the 609 Magdalenian period. 610 • Goyet Q-2, juvenile individual (12,650 ± 50 BP [GrA-46168], 15,232–14,778 yrs cal 611 BP [2-sigma value]) [2]. 612 613 Balma Guilanyà (Late Upper Paleolithic) 614 615 Balma Guilanyà is a located in Northeastern Iberia, at 1,150 m.a.s.l. (metres above 616 sea level) in the Serra de Busa Pre-Pyrenean range (Navés, Lleida). After an initial test pit 617 where Late Upper Paleolithic remains were recovered [32], the site was excavated between 618 2001 to 2008 [33,34]. Two main chrono-cultural phases were defined. The oldest dates back 619 to the Late Upper Paleolithic (15,000–11,000 yrs cal BP) and the youngest corresponds to the 620 Early Mesolithic (11,000 – 9,500 yrs cal BP). The two chrono-cultural units are separated by a 621 big fallen boulder which sealed the Late Upper Paleolithic levels [35]. A set of human skeletal 622 remains were found under this big stone block without any evidence of funerary structures. 623 Direct radiocarbon dates from two human remains (one human tooth and one human bone 624 fragment) recovered from the same context dated to 13,380–12,660 yrs cal BP (Ua-34297) 625 and 12,830–10,990 yrs cal BP (Ua- 34298) [36]. These dates fall inside the Bølling/Allerød 626 interstadial and Younger Dryas stadial, which correspond to the Late Glacial. The Minimum 627 Number of Individuals (MNI) was estimated to be three based on dental morphology: two adults 628 and one immature individual [37]. The stable carbon and nitrogen isotope analyses performed 629 on human bone collagen suggested a diet based on terrestrial herbivores, without any 630 evidence of marine or freshwater resources [36]. The material cultural artifacts recovered from 631 the same level as the human remains have been attributed to the Azilian Culture [35]. However, 632 in general this culture is considered to be more common in Vasco-Cantabrian northern Iberia 633 and on the other side of the Pyrenees [38]. Balma Guilanyà shows clear technical parallels 634 with the near Azilian site Balma Marguineda [39]. Here, we report the genome-wide data from 635 two individuals from this site: 636 • BAL0051, adult individual 637 • BAL003, adult individual 638 639 640 Moita do Sebastião (Mesolithic) 641 642 This site was previously described in Szécsényi-Nagy et al. [40]. Moita do Sebastiāo is a Late 643 Mesolithic shell site located in the Muge region (Salvaterra de Magos, Portugal) on the 644 Atlantic coastline of Portugal. The Muge and Sado regions were very fertile estuaries and 645 marshes during the Mesolithic, which were exploited by hunter-gatherers to obtain marine 646 resources [41]. Although Mesolithic groups are not considered fully sedentary, Moita do 647 Sebastião presents some cultural characteristics that suggest permanence at the site: the 648 presence of post holes associated with building, and a big burial space [42]. These features 649 have been interpreted as a systematic occupation of the estuarine areas, which is also 650 reflected in the shell midden conformation. The lithic assemblage is characterized by 651 microburin technique and geometrics [44]. The Moita do Sebastião site was excavated by 652 different archaeologists since the last century. The total MNI is unknown, but it could reach up 653 to 100 individuals when summarizing the different campaigns [42]. In this study, we genetically 654 analyse one Mesolithic individual from this site: 655 • CMS001, adult individual, (7,240 ± 70 BP [To-131], 8,185–7,941 yrs cal BP [2-sigma 656 value]) [40]. 657 658 659 Cueva de Chaves (Early Neolithic) 660 661 Cueva de Chaves is located in Northeastern Iberia, at 663 m.a.s.l. in the Pre-Pyrenean 662 mountain range of Sierra de Guara (Bastarás, Huesca). The site was excavated under the 663 direction of Pilar Utrilla and Vicente Baldellou in between 1984 and 2007. Cueva de Chaves 664 was occupied during the Paleolithic, Neolithic, and sporadically during the Bronze Age and 665 Late Roman periods. Neolithic deposits were divided in two archaeological levels and dated to 666 the Early Neolithic period (Ia: 5,600–5,300 yrs cal BCE; Ib 5,300-5,000 yrs cal BCE) [44]. Both 667 levels show a full Neolithic package consisting of domestic fauna [45], Cardial [46] and 668 schematic painted on pebbles [47,48]. The earliest Neolithic sites in the Iberian 669 Peninsula are located in coastal areas [49,50]. A long-standing hypothesis in archaeology to 670 explain this is that the first arrival of the Neolithic in the Iberian Peninsula resulted from a 671 Cardial expansion by a maritime route. In this context, Cueva de Chaves represents an 672 interesting case study, because radiocarbon dates for the occupation of this cave overlap in 673 time with other Cardial Early Neolithic sites in coastal Iberia [51,52]. The pottery style, together 674 with the radiocarbon dates, suggest an early expansion of the first farmers from coastal to the 675 inland areas following the Ebro Basin [44]. An MNI of four individuals, directly radiocarbon 676 dated, were recovered from this Early Neolithic context (although one radiocarbon date points 677 back to the early Middle Neolithic). One of individuals was in a complete anatomical 678 articulation. A human isotopic dietary study shows a high animal protein intake consumed by 679 all individuals [53]. This was related to the existence of a specialized animal husbandry 680 management community in which agriculture was not intensively developed. We included four 681 Neolithic individuals for genetic analyses in this study: 682 • CHA001, adult (6,230 ± 45 BP [GrA-26912], 7,257–7,006 yrs cal BP [2-sigma value]) 683 [46]. 684 • CHA002, adult (6,227 ± 28 BP [MAMS 29127], 7,250–7,018 yrs cal BP [2-sigma value]) 685 [53]. 686 • CHA003, infant (6,180 ± 54 BP [D-AMS 015821], 7,245–6,947 yrs cal BP [2-sigma 687 value]) [53]. 688 • CHA004, adult (5,645 ± 31 BP [MAMS 28128], 6,494–6,321 yrs cal BP [2-sigma value]) 689 [53]. 690 691 692 Fuente Celada (Early Neolithic) 693 694 This site was described in Szécsényi-Nagy et al. [40]. Fuente Celada is an open-air settlement 695 located in the northern Iberian Central Plateau (Quintanaduenas, Burgos). All the 696 archaeological materials are from a rescue excavation carried out in 2008 by Alameda 697 Cuenca-Romero et al. [54]. The site presents many negative structures, most of them from the 698 period, suggesting a habitat settlement. Some of these negative structures contain 699 Chalcolithic humans remains. One of these gave an older date corresponding to the 700 Early Neolithic. This burial contained an individual in a flexed position, with three bone rings 701 close to the cervical vertebrae, which was interpreted as a necklace [54]. Here we include this 702 Early Neolithic individual in our genetic analyses: 703 • FUC003, adult (6,120 ± 30 BP [UGA-7565], 7,157–6,910 yrs cal BP [2-sigma value]) 704 [54]. 705 706 707 Cova de Els Trocs (Middle Neolithic) 708 709 This site was also described in Szécsényi-Nagy et al. [40]. Cova de Els Trocs is a cave located 710 in Northeastern Iberia, at 1,564 m.a.s.l. in San Feliú de Veri (Bisauri, Huesca) in the South of 711 the Axial Pyrenees [55]. The excavation of the site is ongoing and lead by Manuel Rojo Guerra 712 and José Ignacio Royo Guillén. Within the large stratigraphic sequence three different 713 occupation phases are discerned that are supported by more than twenty radiocarbon dates 714 [55]. The first phase corresponds to the Early Neolithic (ca. 5,300–4,800 yrs cal BCE.) for 715 which genomic data has been published in Haak et al. [3] for seven individuals. The second 716 phase dates back to the Middle Neolithic (ca. 4,500–4,300 yrs cal BCE.) when the cave was 717 possibly used by animals and no human remains have been retrieved. During the third phase 718 (ca. 4,000–3,700; 3,350–2,900 yrs cal BCE) the cave was used as a burial site. From the third 719 phase, we have included two individuals in the present genomic study: 720 • ELT002, adult (5,008 ± 23 BP [MAMS-16160], 5,882–5,658 yrs cal BP [2-sigma value]) 721 [55]. 722 • ELT006, adult (5,035 ± 23 BP [MAMS-16165], 5,895–5,716 yrs cal BP [2-sigma value]) 723 [55]. 724

725 Ancient DNA processing and quality control 726 727 Sampling of ancient human remains 728 729 For the ancient individuals analysed in this study, we sampled various bones (a humerus, 730 phalanges, metacarpals, mandibles and a cranial fragment) and teeth (molars) in the clean 731 room of the Max Planck Institute for the Science of Human History (MPI-SHH) in Jena, 732 Germany, and in the Institute of Anthropology, Johannes Gutenberg University, Mainz (Data 733 S2). Prior to sampling, samples were irradiated with UV-light for 30 min at all sides. Different 734 sampling methods were used for different bones types, including sandblasting, grinding with 735 , and cutting and drilling in the denser regions (Data S2). Teeth surfaces 736 were cleaned with a low concentration bleach solution (10%). For the teeth sampled at the 737 MPI-SHH, the crown was separated from the root by cutting with a hand saw along the 738 cementum/enamel junction followed by drilling inside the pulp chamber [56]. For the teeth 739 sampled in Mainz the complete tooth was grinded down using a mixer mill [40]. 740 741 DNA extraction 742 743 DNA extraction was done following a modified version of the Dabney protocol [12], with an 744 initial amount of 50-100 mg of bone or tooth powder. Samples were digested with extraction 745 buffer (EDTA, UV H2O and Proteinase K) during 16-24h in a rotator at 37 °C. The suspension 746 was centrifuged and the supernatant transferred into binding buffer (GuHCl, UV H2O and 747 Isopropanol) and then into silica columns (High Pure Viral Nucleic Acid Kit; Roche). The 748 columns were first washed with wash buffer (High Pure Viral Nucleic Acid Kit; Roche) and then 749 eluted in 100 µL TET (TE-buffer with 0.05% Tween). We included one or two extraction blanks 750 in each extraction series to check for cross-contamination between samples and background 751 contamination from the lab. 752 753 Library preparation 754 755 A total of 27 double strand (ds) libraries were created from 25 µL DNA template extract at the 756 MPI-SHH, following a protocol by Meyer & Kircher [10] with unique index pairs [11]. We used 757 a partial Uracil DNA Glycosylase treatment (UDG-half) that repairs damaged nucleotides by 758 removing deaminated cytosines except for the final nucleotides at the 5´ and 3´ read ends to 759 retain a damage pattern characteristic for ancient DNA [57]. The libraries generated from 760 Goyet Q-2 were ds-non-UDG and ss-UDG-half treated. We repaired the terminal ends of the 761 DNA fragments using T4 DNA Polymerase (NBE) and joined the Illumina adaptors using the 762 Quick Ligation Kit (NBE). We also added one or two library blanks per batch. One aliquot of 763 each library was used to quantify the DNA copy number with IS7/IS8 primers [10] outside the 764 clean room using DyNAmo SYBP Green qPCR Kit (Thermo Fisher Scientific) on the 765 LightCycler 480 (Roche). Libraries were double indexed with unique index combinations [11] 766 before doing PCR amplifications outside the cleanroom with PfuTurbo DNA Polymerase 767 (Agilent). After amplification, the indexed products were purified with MinElute columns 768 (Qiagen) and eluted in 50 µL TET buffer and quantified with IS5/IS6 primers using the DyNAmo 769 SYBP Green qPCR Kit (Thermo Fisher Scientific) on the LightCycler 480 (Roche) [10]. We 770 used Herculase II Fusion DNA Polymerase (Agilent) with the same IS5/IS6 primers for the 771 further amplification of the indexed products up to a copy number of 10 e-13 molecules/µL. 772 After another purification round, we quantified the indexed libraries on a TapeStation 773 (TapeStation Nucleic Acid System, Agilent 4200) and made a 10nM equimolar pool. Data S2 774 shows an overview of the extracts and libraries generated for each ancient individual. 775 776 Shotgun screening and in-solution enrichment of nuclear DNA (1240k capture) and 777 mtDNA (mitocapture) 778 779 The pooled double indexed libraries were sequenced on an Illumina HiSeq2500 for a depth of 780 ~5 million read cycles, using either a single (1x75bp reads) or double end (2x50bp reads) 781 configuration. Reads were analyzed with EAGER 1.92.32 [58] to check the quality and quantity 782 of endogenous human DNA in each library. We selected samples for targeted in-solution 783 capture enrichment that showed a damage pattern characteristic for ancient DNA and with 784 >0.2% endogenous DNA. We further amplified these libraries with the IS5/IS6 primer set to a 785 concentration of 200-400 ng/µL. After that, the libraries were hybridized in-solution to different 786 oligonucleotide probe sets synthetized by Agilent Technologies to enrich for the complete 787 mitogenome (mtDNA capture, [59]) and for 1,196,358 informative nuclear SNP markers 788 (1240K capture, [60]). 789 790 791 Quantification and statistical analysis 792 793 Read processing and assessment of ancient DNA authenticity 794 795 We demultiplexed the sequenced libraries according to expected read indexes, allowing for 796 one mismatch. We clipped adapters with AdapterRemoval v2.2.0 [61]. For paired end reads, 797 we restricted to merged fragments with an overlap of at least 30 bp. Single end reads shorter 798 than 30 bp were discarded. We mapped fragments to the Human Reference Genome Hs37d5 799 using the Burrows-Wheeler Aligner (BWA, v0.7.12-r1039) aln and samse commands (-l 16500, 800 -n0.01, -q30) [62] and removed duplicate reads using DeDup v0.12.1. We excluded reads with 801 a mapping quality phred score <30. A summary of quality statistics is given for 1240K SNP 802 captured libraries in Data S3 and for mtDNA captured libraries in Data S4. 803 804 Contamination tests 805 Prior to genotype calling we assessed the level of contamination in the mitochondrial and 806 nuclear genome using several methods. 807 808 DNA damage 809 We determined and plotted the deamination rate pattern in our UDG-half libraries using 810 MapDamage v.2.0.6 from EAGER 1.92.32 [58]. Although damage rates at the terminal read 811 ends vary (5.3-14.8%) in libraries for individuals from different sites, all libraries show 812 deamination patterns expected for ancient DNA (Data S3). Then we trimmed the reads for 2 813 bp at both terminal ends of the UDG-half libraries to reduce the bias of deamination from our 814 genotype calls. Non-UDG libraries generated from Goyet Q-2 were trimmed for 10 bp. 815 816 Contamination based on the match rate to the mtDNA dataset (ContamMix) 817 We used ContamMix 1.0.10 to estimate the mitochondrial contamination levels in our mito- 818 captured libraries taking a worldwide mitochondrial dataset to compare as a potential 819 contamination source [16] (Table S3). We find contamination rates below 2.2% for all libraries 820 (Data S6). We visualized the mitochondrial read alignment with Geneious R8.1.974 [63] and 821 manually checked for heterozygous calls to confirm the ContamMix estimates. We found a 822 substantial mitogenome heterozygosity level for BAL003_MT in the manual check, contrasting 823 its respective ContamMix estimate of 2.2%, and therefore excluded this library from further 824 genome analyses. 825 826 Sex determination and X-contamination 827 We determined genetic sex by calculating the X-ratio (targeted X-Chromosome SNPs/ targeted 828 autosomal SNPs) and Y-ratio (targeted Y-Chromosome SNPs/ targeted autosomal SNPs) 829 (Data S3). For uncontaminated libraries, we expect an X ratio ~1 and Y ratio ~0 for females 830 and X and Y ratio of 0.5 in males [2]. Potential individuals that fall in an intermediate position 831 could indicate the presence of DNA contamination. 832 Method 2 of the ANGSD package was used on merged and unmerged libraries from male 833 individuals to test the heterozygosity of polymorphic sites on the X chromosome [15]. For low 834 coverage libraries from the same individual with < 200 SNPs on the X chromosome we merged 835 them into a single BAM file using samtools v0.1.19 [64] to facilitate contamination estimation 836 of the merged libraries (Data S3). Finally, merged libraries with less than 3.3% contamination 837 were selected for population genetic analysis. 838 839 840 Genotyping and merging with data set. 841 842 After trimming of potentially damaged terminal ends bamfiles were genotyped with pileupCaller 843 (https://github.com/stschiff/sequenceTools/tree/master/srcpileupCaller), which call one SNP 844 per position considering the human genome as pseudo-haploid genome. Genotyped data were 845 merged with Human Origins panel (~600K SNPs) [27] and 1240K panel [17]. For Goyet Q-2 846 the genotyping was applied to clipped and unclipped bamfiles, calling only transversions in the 847 latter to avoid residual ancient DNA damage and merging these extra SNPs in the final 848 genotype. The number of SNPs covered per individual is shown in Data S4. 849 850 851 Kinship relatedness and individual assessment. 852 853 We used Relationship Estimation from Ancient DNA (READ) to estimate the degree of genetic 854 kinship relatedness among individuals [65]. This method can determine first and second 855 degree relatedness among individuals and can also be used to test for potential cross- 856 contamination among libraries from the same batch of sampling processing. We calculated the 857 proportion of non-matching alleles and normalized the results separating Neolithic from HG 858 individuals taking into account the potential genetic diversity within each group. We found that 859 individuals BAL001 and BAL005 were identical and consequently merged. All downstream 860 analyses were performed with the merged version BAL051. 861 862 Phenotypic traits analysis 863 864 We called specifically some SNPs positions associated with some phenotypic traits from the 865 captured SNPs (e.g. , pigmentation, eye colours) (Figure S4). We 866 calculated the genotype likelihood based on number of reads of each specific position to 867 determinate the presence of the ancestral or derived alleles in homozygous or heterozygous 868 fashion [24]. Results are shown in Figure S4. 869 870 Y-chromosomal and mitochondrial haplogroups 871 872 We used samtools v1.3.1 to extract reads from mitocapture data [64] and mapped them to the 873 rCRS and called the consensus sequences using Geneious R8.1.974 [63]. We downloaded 874 these consensus sequences in fasta format and they were used to determinate mitochondrial 875 haplotypes using Haplogrep 2 [66] (Data S6). 876 For Y haplogroup determination, we first called the Y chromosome SNPs of the 1240K SNP 877 panel from all male individuals using pileupCaller with MajorityCalling mode, 878 (https://github.com/stschiff/sequenceTools/tree/master/srcpileupCaller), and mapping quality 879 ≥30 and base quality ≥30 (Data S5). Y haplogroups were determined using the program Yhaplo 880 [67]. 881 882 Population genetic analysis 883 884 Labelling population groups 885 For the Paleolithic individuals, we adopted the labels from Fu et al [2] and used the improved 886 genotype calls from Mathieson et al. [21]. We included the HG with more than 15,000 SNPs 887 covered in the PCA (Supplemental Item 1). If the hunter-gatherer individuals from the same 888 site or chrono-cultural context clustered in the PCA analysis, we grouped them using the same 889 label for the following population genetic analysis [14,21,23]. Neolithic individuals from Iberia 890 were grouped by sites and by chrono-cultural context. For Neolithic individuals from outside of 891 Iberia, we kept the label names from the respective initial publications [9,13,20–22,26]. 892 893 Principal Component Analysis 894 PCA analysis was run with the Human Origins data set using smartpca v10210 (EIGENSOFT) 895 with the option SHRINKMODE [68] using 777 modern populations to calculate eigenvectors 896 on which aDNA samples were projected [20]. PC1 was multiplied by -1 (–PC1) in order to 897 mirror geography. 898 899 F-statistics 900 D-statistics and F-statistics were calculated with qpDstat from ADMIXTOOLS 901 (https://github.com/DReichLab). We used the 1240K panel to increase the number of SNPs 902 covered by the ancient individuals and get more resolution in the statistic tests. Standard errors 903 were calculated with the default block jacknife. We report and plot three standard errors in all 904 F-statistics. 905 906 qpAdm and qpWave 907 We used qpWave and qpAdm from the ADMIXTOOLS package 908 (https://github.com/DReichLab) to estimate admixture proportions. We used this framework to 909 model and quantify the ancestry proportions of HG individuals in- and outside of Iberia (we 910 only use HG or groups of HG with more than 30,000 SNPs). First, we tested whether Goyet 911 Q-2 and Villabruna formed a clade with respect to the following set outgroups: Mota, Ust’- 912 Ishim, Mal’ta 1 (MA1), Koros EN-HG, Goyet Q-116-1, Mbuti, Papuan, Onge, Han, Karitiana 913 and Natufian extending the set used by Olalde et al. [9]. The resulting qpWave model showed 914 an extremely poor fit (p-value= 3.07705115e-91), which means that our set of outgroups can 915 be used to differentiate between GoyetQ2 and Villabruna-related ancestry. We then modelled 916 the ancestry in the HG groups as a mixture of GoyetQ2 and Villabruna. Alternatively, we also 917 used El Miron and Loschbour as potential source populations and the same ten outgroups, 918 after checking that El Mirón and Loschbour are not equally related to the outgroups (p-value= 919 5.41365181e-68). 920 We also used qpWave and qpAdm to explore the HG admixture in the Neolithic populations. 921 In this case, the sources (left populations) were Anatolia Neolithic, Goyet Q-2 and Villabruna. 922 We chose the same set of outgroups, and first checked that Anatolia Neolithic, Goyet Q-2 and 923 Villabruna were not equally related to the outgroups (p-value of 8.63552043e-92). 924 925 The results of all qpWave and qpAdm models are reported in tables S8 and S9. The criteria to 926 report these values were as follows: If the resulting p-values were higher than 0.05 we report 927 the three-sources model. If the p-value were lower than 0.05 we show the best p-value 928 obtained from the three- or two-sources model (i.e. the nested model). In case of negative 929 values for some of the sources, we report the two-sources model (nested model) and the 930 respective p-value of these models. We applied the same criteria to two-sources models. 931 932 Multi-Dimensional Scaling analysis (MDS) 933 We computed Multi-Dimensional Scaling (MDS) analysis using the R package cmdscale to 934 measure the genetic dissimilarity among hunter-gatherers (HG), and then used the inverted 935 [1-f3(X; X, Mbuti)] pairwise values among all the combinations [2]. 936 937 938 Data Availability 939 will be released on ENA at the time of publication. 940 941 942 Legends of supplemental figures and data files 943 944 Figure S1. Heat plot showing the genetic distances between Eurasian HG. Genetic 945 distances were calculated using f3-outgroup statistics of the form f3(X;Y, Mbuti), with X and Y 946 being Eurasian hunter-gatherers in all possible pairwise comparisons. The analysis has been 947 restricted to samples with more than 30,000 SNPs, following Fu et al. [1]. 948 949 Figure S2. PCA analysis calculated with 777 present day West Eurasians [3] with option 950 shrinkmode:YES on which HG individuals were projected. 951 952 Figure S3. A) Modelling European HG as admixture of Villabruna-like and El Mirón-like 953 ancestry using Loschbour and El Mirón as proximal sources, respectively. B) Map showing the 954 distribution of the main sites related to the Azilian techno-complex, adapted from Strauss [4]. 955 Note that Iberian HG restricted to the Cantabrian and the Pre-Pyrenean regions carry more 956 Loschbour-like ancestry (e.g. Balma Guilanyà, La Braña 1, and Canes 1) than Iberian HG from 957 outside this area (Chan and Moita do Sebastião individuals). The increase of Villabruna 958 ancestry is first attested with the Balma Guilanyà individual, recovered from an Azilian 959 archaeological level. C) f4-statistics showing no affinity between Geometric Mesolithic Moita 960 do Sebastião from Portugal and HG from , , North Africa. D) 961 Correlation between Villabruna-like ancestry (dark blue; Figure 3D) and Loschbour-like 962 ancestry (light blue; Figure S3A) and time. Both models result in a fit of R=0.99 for individuals 963 from north and northeast of Iberia Peninsula (to the exclusion of Chan and Moita do Sebastião), 964 where we observe an increase of WHG-like ancestry similar to other parts of Europe. 965 966 Figure S4. Summary of genotypes of phenotypic and functional SNPs. Colours indicate 967 the homozygous ancestral/derived or heterozygous state of the SNPs reported in the left-hand 968 column. Numbers in cells indicate the number of reads matching the ancestral/derived allele.

969

970 Supplemental Data 971 Excel spreadsheet containing 972 Data S1. Chrono-cultural information 973 Data S2. Sampling method and library treatment 974 Data S3. Ancient DNA information for 1240K captured libraries UDG-half treated. DNA 975 Capture (%) represents the percentage of overlapped reads with the targeted 976 1240K capture SNPs. 977 Data S4. Ancient DNA information for mtDNA captured in UDG-half treated libraries. 978 DNA Capture (%) represents the percentage of overlapped reads with the 979 targeted mitochondrial capture SNPs. 980 Data S5. Sex determination and nuclear contamination estimation on X Chromosome in 981 male samples. 982 Data S6. Mitochondrial contamination and haplogroup determination 983 Data S7. Number of SNPs called and % SNPs covered in 1240K and HO panel. 984 Data S8. F4-statistics 985 Data S9. qpAdm models for HG individuals 986 Data S10. qpAdm models for Neolithic individuals Figure 1 A

Balma Guilanyà (Late Upper Paleolithic) B Years cal BP 10,000 12,000 13,000 14,000 11,000 9,000 5,000 6,000 7,000 8,000 Moita do Sebastião (Mesolithic)

Cueva de Chaves (Early Neolithic)

Fuente Celada (Early Neolithic) Cueva de Chaves (Middle Neolithic)

Cova de Els Trocs (Middle Neolithic) Figure 2 0.35 A B Hohle Fels 49 Goyet Q-116-1 El Mirón 0.30 Moita do Sebastião Chan Paglicci 133 0.25 Balma Guilanyà Kostenki 12 MA1 HG 0.20 Vestonice 13 El Mirón Kostenki 14 Ust' Ishim 0.15 Moita do Sebastião Oase 1 Chan Vestonice 16 Sunghir 0.10 Balma Guilanyà Taforalt Vestonice 43

y Afontova Gora 3 0.05 Ostuni 1 Popovo HG La Braña 1 Kotias Natufian 0 Satsurblia Canes 1 Muierii 2 Krems WA3 −0.05 EHG Pavlov 1 Karelia HG −0.10 La Braña 1 Ukraine Mesolithic Motala HG −0.15 Canes 1 Iboussieres 25-1 Iboussieres 31-2 −0.20 Latvia HG −0.20 −0.15 −0.10 −0.05 0 0.05 0.10 0.15 0.20 Chaudardes x Ranchot 88 Bichon Oase1 Vestonice 13 Goyet Q-116-1 Goyet Q-2 Iboussieres 25-1 Motala HG Falkenstein Loschbour Ust' Ishim Vestonice 16 Paglicci 133 El Mirón Iboussieres 31-2 Chaudardes Berry-Au-Bac Kotias Russia EHG Kostenki 14 Chan Koros EN-HG Ranchot 88 Iron Gates HG Satsurblia Popovo HG Kostenki 12 Moita do Sebastião Berry-Au-Bac Bichon Rochedane Z > 3 Koros EN-HG Pavlov 1 MA1 HG Muierii 2 Balma Guilanyà Rochedane Loschbour Z < 3 OrienteC HG Vestonice 43 Sunghir Hohle Fels 49 La Braña 1 Villabruna Taforalt −0.02 −0.01 0 0.01 0.02 0.03 Ostuni 1 Karelia HG Rigney 1 Canes 1 OrienteC HG Latvia HG f4 (Goyet Q-2,Villabruna;test,Mbuti) Krems WA3 Afontova Gora 3 Burkhardtshohle Falkenstein Iron Gates HG Natufian Ukraine Mesolithic Figure 3

A 0.40

Oase 1 Goyet Q-116 Iboussieres 31-2

0.38 Ust Ishim Paglicci 133 Koros EN-HG Kotias Kostenki 14 Berry-Au-Bac Satsurblia Kostenki 12 Rochedane 0.36 Pavlov 1 Muierii 2 OrienteC HG y=x Vestonice 43 Hohle Fels 49 HG 0.34 Canes 1 Ostuni 1 Rigney 1 Iron Gates HG La Braña 1Balma Guilanyà Krems WA3 Burkhardtshohle Motala HG Moita do Sebastião 0.32 Chan Vestonice 13 El Mirón Chaudardes Vestonice 16 Chan Ranchot 88 El Mirón 0.30 Russia EHG Moita do Sebastião Bichon Popovo HG Balma Guilanyà Loschbour

0.28 MA1 HG La Braña 1 Taforalt Sunghir Canes 1 Latvia HG

f3(Villabruna; test, Mbuti) test, f3(Villabruna; Karelia HG Falkenstein Natufian 0.26 Afontova Gora 3 Iboussieres 25-1 Ukraine Mesolithic

0.24 B El Mirón f4-value = 0 El Mirón f4-value < 0 0.22

0.20

0.18 0.20 0.25 0.30 0.35 0.40 0.45 f3(Goyet Q-2; test, Mbuti) Hohle Fels 49

D Burkhardtshohle Goyet Q-2 Villabruna Hohle Fels 49 (p=0.949) Rigney 1 Burkhardtshohle (p=0.772) |Z| < 3 El Mirón (p=0.066) El Mirón |Z| > 3

Rigney 1 (p=0.968) −0.010 −0.005 0 0.005 Chan (p=0.136) f4 (Goyet Q-2,GoyetQ2 cluster;Villabruna,Mbuti) Moita do Sebastião (p=0.168) Balma Guilanyà (p=0.872) C La Braña 1 (p=0.001)* El Mirón f4-value = 0 El Mirón f4-value > 0 Canes 1 (p=0.016)*

Iboussieres 25-1 (p=0.532) Goyet Q-2 cluster Chaudardes (p=0.930) and Iboussieres 31-2 (p=0.806) El Mirón OrienteC HG (p=0.603) Burkhardtshohle Ranchot 88 (p=0.025)* Rochedane (p=0.118) Rigney 1 Falkenstein (p=0.268)

Bichon (p=0.063) Hohle Fels 49 Berry-Au-Bac (p=0.072) Z > 3 Loschbour (p=0.008)* Goyet Q-2 Z < 3 Iron Gates HG (p=0.472) 0 0.005 0.010 0.015 0.020 0 0.5 1.0 f4 (GoyetQ2 cluster,Villabruna;El Mirón,Mbuti) Figure 4 A C PC2 (0.356 %) −0.012 −0.010 −0.008 −0.006 −0.004 −0.002

f4 (Iberian MN, England Neo; test, Mbuti) 0.002 0.004 0.006 0.008 0.010 − − − 0.002 0.004 0.006 0.006 0.004 0.002 Chaves MN Chaves EN Starčevo CHG Sebastião do Moita 0

Chaves MN 0

LaMina MN

LugarCanto MN −0.010

Trocs MN

Chaves MN Trocs MN N Anatolia SHG Guilanyà Balma

LaMina MN

LugarCanto MN −0.005

Trocs MN PC1 (0.832 %) (0.832 PC1 France MN France EN WHG HG Iberian 0 f4 (Iberian MN, Wales Neo; test, Mbuti) − − − 0.002 0.004 0.006 0.006 0.004 0.002

Chaves MN 0

LaMina MN 0.005 LugarCanto MN England MN England LBK EN FuenteCelada Natufian

Trocs MN

Chaves MN

LaMina MN 0.010 LugarCanto MN

Trocs MN MN Iberia EN Iberia EN Chaves EHG 0.015

f4 (Iberian MN, Ireland Neo; test, Mbuti) B − 0.005 0.005

Chaves MN 0 Canes 1 Canes Sebastião do Moita 1 Braña La Chan Guilanyà Balma El Mirón EN Bonica Cova EN Portalon EN FuenteCelada Trocs EN EN Pancorbo EN Chaves EN Murcielagos Toro EN

LaMina MN 1.0

LugarCanto MN 0.0 0.8 Trocs MN

Chaves MN Goyet Q-2-like ancestry

LaMina MN 0.6

LugarCanto MN 0.2 Chaves MN Chaves MN LugarCanto Trocs MN MN LaMina 0.4 Trocs MN 0.2 Anatolia-like ancestry 0.4

f4 (Iberian MN, Scotland Neo; test, Mbuti) 0.0 − − − 0.002 0.004 0.006 0.006 0.004 0.002 Serbia Starcevo EN Starcevo Serbia Iron Gates EN EN Romania EN Koros Hungary EN Impressa Croatia LBK EN Austria Neo Cardial Croatia LBK EN Germany 1.0

Chaves MN 0

LaMina MN 0.6

LugarCanto MN 0.8

Trocs MN 0.6

Chaves MN Villabruna-like ancestry Villabruna-like Ireland MN Ireland Neolithic Neolithic England MN Sopot Croatia Neolithic Neolithic Scotland Wales Neolithic Globular Poland Amphora MN France Neolithic Serbia

LaMina MN 0.8

LugarCanto MN 0.4

Trocs MN 0.2 1.0 0.0 Supplemental Information & Figures

S1. Exploring the genetic structure of Iberian hunter-gatherers

The Iberian Peninsula in southwestern Europe is understood as a periglacial refugium for Pleistocene hunter-gatherers (HG) during the Last Glacial Maximum (LGM). The post-LGM genetic signature in western and central Europe was dominated by ancestry similar to the Villabruna individual, commonly described as WHG ancestry or ‘Villabruna’ cluster [1]. This Villabruna cluster had largely replaced the earlier El Mirón genetic cluster, comprised of 19,000-15,000-year-old individuals from central and western Europe associated with the Magdalenian culture [1]. Interestingly, González-Fortes et al. [2] reported two new Mesolithic individuals from Iberia, named after the respective sites Chan (from Chan do Lindeiro) and Canes 1 (from Los Canes), noting that the Chan individual differed from the predominant WHG ancestry profile without exploring this observation further.

By generating new genome-wide data from Belgian HG, Iberian HG and Neolithic individuals we aimed to further refine the HG genetic structure in the Iberian Peninsula during the Upper Paleolithic and Mesolithic, and to characterise the HG ancestry sources that contributed genetically to Neolithic groups. We hypothesize that (i) admixture events of different Upper Paleolithic HG ancestries resulted in genetic structure among various Iberian HG groups, which can be observed as asymmetric genetic affinities to the Villabruna and El Mirón cluster or another potential source, respectively, (ii) this structure during the Early Holocene is stronger in the Iberian Peninsula than in Central Europe (iii) the genetic structure was inherited by Neolithic individuals through mixture with local HG ancestry, especially in regions that initially were more affected by expanding farmers, and (iv) Neolithic groups that followed different routes of expansion can be identified by the structure in their respective HG ancestry profiles.

We performed f3-outgroup statistics of the form f3(HG1, HG2, Mbuti) to measure the shared genetic drift among HG individuals in our dataset. In doing so, we obtained the same clustering pattern (Satsurblia, Mal’ta, Věstonice, El Mirón and Villabruna) concordant with Fu et al. [1] (Figure S1). Most Iberian HG are grouped either with the previously defined El Mirón cluster or fall within the wider GoyetQ2 cluster, which formed the rationale for additional testing of the robustness of this cluster (e.g. our newly reported Moita do Sebastião [~8ky cal BP], Balma Guilanyà [~12ky cal BP], and Chan [~9ky cal BP], whereas La Braña 1 and Canes 1 cluster with Villabruna. Formal statistics to test the clustering of Iberian HG to the GoyetQ2 cluster are shown in the main manuscript. The positions of the Iberian HG on the PCA plot (Figure S2) corroborates this pattern, with the Iberian HG forming a cline between WHG (i.e. Villabruna) and the GoyetQ2 cluster.

S2 Modelling admixture proportions in Iberian HG and WHG with qpWave/qpAdm

As shown by analyses presented in the main manuscript El Mirón is best explained as being admixed between individuals related to the Villabruna and GoyetQ2 clusters (Figure 3), despite being chronologically older than the best representative of each cluster, respectively, at least given the currently available fossil genetic record. Here we explore the implications of our genetic findings in view of the archaeological record in southwestern Europe.

S2.1 Iberian HG population structure and its correlation with the Azilian techno-complex

We also repeated the same admixture models as presented in the main manuscript (Figure 3D) using El Mirón instead of Goyet Q-2 as a geographically closer proxy for the non-Villabruna ancestry and Loschbour as a source of Villabruna-like ancestry (Figure S3A). Most of these models are statistically well supported, which suggests that El Mirón can serve as a local proxy for Iberian HG and explains why Chan and Moita do Sebastião are genetically similar to El Mirón (Supplementary section S2.1, Figure S3A). This also implies that the admixture between the GoyetQ2 and Villabruna clusters must have happened earlier, given the chronologically older date of El Mirón. We also observe a subtle El Mirón-related ancestry contribution outside the Iberian Peninsula in France (e.g. Rochedane 9.8 ± 8.3 %, and Iboussieres 25-1 20.7 ± 9.1 %) (Supplementary section S3.1, Figure S3A, Data S9), whereas Goyet Q-2-like ancestry is present in a higher number of individuals outside Iberia (Figure 3D). Inside Iberia, we notice an additional contribution of Villabruna-like ancestry visible in the 12,000-year-old Balma Guilanyà individual, which could reflect the transition to the Holocene and ‘Azilian’ techno- complexes in northeastern Iberia.

The so-called ‘Azilian’ is the dominant techno-complex in Vasco-Cantabria [4] and the Pre- Pyrenean region of the Iberian Peninsula [5,6], and in central and southern France [7,8] during the Pleistocene-Holocene transition around 14,000 cal BP. This geographic region formed a natural corridor, which connected Iberia and France and where the Azilian techno-complex was shared, but at the same time was absent in the rest of the Iberian Peninsula (Figure S3B). Traditionally, the transition to the Azilian has been viewed as a direct result of the climatic amelioration, which took place during the Bølling/Allerød warming phase (14,700 -12,700 BP) [9] when HG started to develop a broader subsistence spectrum, specializing in small prey hunting and combining terrestrial and aquatic resources; a trend, which then continued during the Holocene [10]. It is also in this period that the naturalistic art paintings, very common in the region, disappeared, and were replaced by geometric motifs mostly found in mobile art [11]. The material assemblage also presents changes with respect to the preceding Magdalenian culture. Classically, the Azilian techno-complex was defined by the presence of flat, asymmetrical , painted cobles, thumb-end scrapers, and curve-backed ‘Azilian’ points. This set has been re-defined, incorporating new elements and criteria such as a generalized simplification of the knapping method combined with a microlithization/miniaturization effect, in particular of backed points and end scrapers [12–14]. However, in many of these Azilian sites the chrono-stratigraphic continuity between the Magdalenian and Azilian levels makes it difficult to establish a clear rupture between both phases (Figure S3B).

The genomic data available for Iberian HG up to this time supports a gradual replacement of the genomic structure present in northern/northeastern regions of Iberia during the Magdalenian period (represented by the El Mirón individual). Indeed, the Iberian HG individuals with more Villabruna-like ancestry are from the Vasco-Cantabrian and Pre- Pyrenean regions (Iberia) and the increasing shift to Villabruna-like ancestry could be associated geographically and chronologically with the dispersion of new HG technologies along the Cantabrian corridor during Azilian and Mesolithic times, exemplified by individuals such as Azilian-associated Balma Guilanyà. Villabruna-like ancestry becomes even stronger during the Mesolithic in Cantabria (La Braña 1 and Canes 1) suggesting additional HG flux into north/northeastern Iberia, which again must have had a higher impact in this region. Mesolithic HG outside this region retain more ancestry shared with the GoyetQ2 cluster and have a higher, significant amount of El Mirón-like-ancestry (Chan from Galicia or Moita do Sebastião in Portugal) when modelled with El Mirón as a local source (Figure S3A and S3D).

S2.2 No African-Iberian connections during Mesolithic times

Another question that is heavily debated among archaeologists and that can be addressed based on our results is the potential African-Iberian connection during Mesolithic times, or more precisely during the period of the Geometric Mesolithic. The main characteristic technological change during the Late Mesolithic is the shift to the and Trapeze , in Iberia commonly grouped as Mesolithic Geometric (Trapezes and Triangles phases). Although the distribution of this phenomenon is spread widely along eastern, western and North Africa, its origin is still debated [15]. Based on a chronological gradient an African origin was suggested, from where it spread into Europe through the South of Italy (Sicily) and from where it followed a Mediterranean expansion to Iberia [16]. Other authors posit Crimea as the place of origin for the Blade and Trapeze industries [17]. Inside Iberia this cultural phenomenon is distributed along both the Mediterranean [18] and Atlantic [19] coastlines. This contrasts with the situation in northern Iberia where Mesolithic Geometric flint tools are found at significantly lower numbers [20] with the exception of the Ebro Valley, a natural corridor between the Mediterranean region and Cantabria [21]. Interestingly, this is the geographic region and the time period where and when the Villabruna/Loschbour-related ancesty increases (e.g .shown in individuals from La Braña 1 and Canes 1), in contrast with the individual Moita do Sebastião from Portugal where the Geometric Mesolithic was predominant [22] and which shows a similar genetic ancestry to El Mirón. Importantly, our genomic results from Moita do Sebastião, attributed to the Geometric Mesolithic, show a strong affinity with El Mirón but lack any genetic evidence in support of the hypothesis of an African origin for the Mesolithic Geometric (Figure S3C). Here, the f4- statistics of the form f4(Moita do Sebastião, WHG; Taforalt, Mbuti) in Figure S3C show no extra genetic affinity of Moita do Sebastião to Pleistocene Iberomaurusian HG from Taforalt (North Africa) [23]. Taforalt individuals are considered a good proxy to test the African-Iberian connections due to the genetic continuity attested in North Africa from the Late Pleistocene to the Holocene (Early Neolithic) despite their chronologically older age [24]. The fact that the genomic structure at the Atlantic coast has not significantly changed since the Magdalenian currently supports the idea of cultural diffusion or analogous technological changes rather than a direct gene-flow. However, more data, especially from the Mediterranean Mesolithic Geometric, is needed to address this question in more detail.

S3. Phenotypic traits

We extracted a list of 36 SNPs of functional importance or related to known phenotypic traits [23] and estimated the probability of carrying the ancestral or derived alleles based on the number of reads after using a quality filter q30. We checked different SNPs positions on the gene OCA2 related to light eye colour. We obtain the ancestral allele (rs12913832, 3 reads) in the CHA001 and ELT002 individuals, and heterozygous values for ELT006, suggesting dark colour eyes for all of them. The SNP coverage for the remaining individuals was not sufficient. Another allele from the same gene (rs1800404) related to eye pigmentation could support the darker pigmentation in ELT006 than in ELT002. We also checked different SNPs positions in the gene SLC45A2. We obtain the ancestral alleles (rs1426654, 2 reads) in BAL0051 and (rs16891982, 4 reads) in CHA001 which suggest a darker skin colour than ELT002 and ELT006, who are heterozygous or homozygous for the derived allele. The coverage in the other individuals is very low to allow comparisons. The allele rs3827760 of the EDAR gene, related to straight and thick hair, is ancestral in all individuals, albeit with variable coverage (CHA001, 4 reads; CHA003, 3 reads; ELT002, 14 reads and; ELT006, 8 reads). Also, as reported before for pre-farming and Neolithic individuals [25], none of our newly typed individuals show evidence for Lactase persistence. Individuals from the Neolithic times (ELT002, ELT006 and FUC003) show different combinations of derived and ancestral alleles of the gene rs174546, which is related to the capacity of regulation of the production of long-chain polyunsaturated fatty acids (FADS1/FADS2).

S4. Kinship analysis.

We used the published tool ‘READ’ (Relationship Estimation from Ancient DNA) [26] to estimate the degree of genetic kinship relatedness among individuals (STAR Methods). We split our HG and Neolithic individuals to calculate the proportions of non-matching alleles (P0). After the normalization of both groups, P0 of HG ranged between 0.964-1.029 and between 0.974-1.074 for Neolithic individuals, which in both cases is higher than 0.90625, the top value for second-degree related individuals. In sum, there are no first- or second-degree relatives among our newly reported ancient Iberian individuals. S5. Y-chromosomal and mitochondrial haplogroups

Using an in-house mtDNA capture assay, we could recover complete mitochondrial genomes from individuals CHA002, CHA003, CHA004, ELT002, ELT006, and FUC003. The coverage of the mtDNA genome for the rest or the samples ranges from 88.86-99.99% (Data S4). Iberian HG individuals from Balma Guilanyà and Moita do Sebastião belong to the haplogroup U, together with the two Middle Neolithic (MN) individuals CHA004 and ELT006 (Data S6). Individual BAL003 could be assigned to U2’3’4’7’8’9’, also found in the Paglicci 108 (~27,000 yrs cal BP, Italy), Rigney 1 (~15,500 yrs cal BP, France) [1], and Grotta d’Oriente C HG (~14,000 yrs cal BP, Italy) [27]. Individual BAL0051 belongs to U5b2a, also found in Neolithic Scotland [28]. Moita do Sebastião (CMS001) carries haplogroup U5b1, which was also reported from Middle Neolithic, Bell Beaker and Middle Bronze Age individuals from Portugal and Spain [28,29], and in the Ranchot 88 HG (~10,000 yrs cal BP, France). Early Neolithic individuals from Cueva de Chaves present non-U haplogroups. Individual CHA001 carries haplogroup HV0+195, but was previously reported as K based on PCR-based results [30]. This haplogroup was previously found in MN Ireland [31], as well as MN Scotland and Bell Beaker individuals from England [28]. Individual CHA002 was assigned to K1a2a, which is common in Early Neolithic Iberia, e.g. Cova Bonica [32], Cova de Els Trocs [33] and Cueva del Toro [24], but also in Chalcolithic and Bell Beaker individuals from Iberia and Italy [25,28]. CHA003 was assigned to K1a3a, so far reported from Neolithic Anatolia [25], Neolithic and Chalcolithic Scotland, and Bell Beaker individuals from Sicily [28]. Middle Neolithic individual CHA004 carried haplogroup U4a2f, found in HG from the Iron Gates, Romania, and Lithuania [27,34]. The Middle Neolithic ELT002 individual carries haplogroup J1c1b, present in the Koros Early Neolithic [35], Neolithic from Scotland [28], Iberian Late Neolithic- Chalcolithic [29], and Bronze Age from Italy and Germany [36]. ELT006 was assigned to haplogroup U3a1, which has been reported from Middle Neolithic France and Germany [27,28], and Chalcolithic Iberia [25]. Early Neolithic Fuente Celada carries haplogroup X2b+226, found in MN Hungary [37] and Middle Bronze Age Iberia [29]. X2b was also found in Iberian Late Neolithic [38], Chalcolithic [35] and Bell-Beaker individuals [28], Neolithic England [28] and Greece [39], and the Early and Late Neolithic from Morocco [24].

Y chromosome haplogroups were called from the list of Y-SNPs including in the 1240K capture assay using the script yhaplo [40]. BAL0051 could be assigned to haplogroup I1, while BAL003 carries the C1a1a haplogroup. To the limits of our typing resolution, Early and Middle Neolithic individuals CHA001, CHA003, ELT002 and ELT006 share haplogroup I2a1b, which was also reported for the Loschbour [41] and Motala HG [25], and other Late Neolithic and Chalcolithic individuals from Iberia [28,29], as well as Neolithic Scotland, France, England [28], and Lithuania [34]. Both C1 and I1/ I2 are considered typical male HG lineages prior to the arrival of farming in Europe. Interestingly, CHA002 was assigned to haplogroup R1b-M343, which together with an Early Neolithic individual from Cova de Els Trocs (R1b1a) confirms the presence of R1b in Western Europe prior to the expansion of steppe pastoralists that established a related male lineage in Bronze Age Europe [25,27,28,33,36]. The geographical vicinity and contemporaneity of these two sites led us to run genomic kinship analysis in order to rule out any first or second degree of relatedness. Early Neolithic individual FUC003 carries the Y haplogroup G2a2a1, commonly found in other Early Neolithic males from Neolithic Anatolia [25], Starçevo, LBK Hungary [35], Impressa from Croatia and Serbia Neolithic [27] and Czech Neolithic [28], but also in Middle Neolithic Croatia [27] and Chalcolithic Iberia [28]. Color Key and Histogram 500 Count 200 0

0.2 0.3 0.4 Value

Villabruna Rochedane Iboussieres25_1 Iboussieres31_2 Brana1 Canes1 Iron_Gates_HG Latvia_HG KorosEN_HG Loschbour Ranchot88 Bichon BerryAuBac Falkenstein Chaudardes OrienteC_HG BalmaGuilanya Chan MoitaSebastiao ElMiron Rigney1 GoyetQ2 Burkhardtshohle HohleFels49 Vestonice13 Vestonice16 KremsWA3 Vestonice43 Ostuni1 Pavlov1 Sunghir Kostenki14 Muierii2 Kostenki12 Paglicci133 GoyetQ116_1 Karelia_HG Russia_EHG Popovo_HG Motala_HG Ukraine_Mesolithic Kotias Satsurblia Natufian MA1_HG AfontovaGora3 Ust_Ishim Oase1 Taforalt Chan Kotias Oase1 Bichon Brana1 Taforalt ElMiron Sunghir Ostuni1 Canes1 Pavlov1 Muierii2 Rigney1 Natufian GoyetQ2 MA1_HG Satsurblia Ust_Ishim Villabruna Loschbour Latvia_HG Ranchot88 Kostenki12 Kostenki14 KremsWA3 Falkenstein Motala_HG Paglicci133 Rochedane Karelia_HG BerryAuBac Popovo_HG Vestonice43 Vestonice16 Vestonice13 Chaudardes HohleFels49 Russia_EHG KorosEN_HG OrienteC_HG GoyetQ116_1 AfontovaGora3 BalmaGuilanya MoitaSebastiao Iron_Gates_HG Burkhardtshohle Iboussieres31_2 Iboussieres25_1 Ukraine_Mesolithic

Figure S1. Heat plot showing the genetic distances between Eurasian HG. Genetic distances were calculated using f3-outgroup statistics of the form f3(X;Y, Mbuti), with X and Y being Eurasian hunter- gatherers in all possible pairwise comparisons. The analysis has been restricted to samples with more than 30,000 SNPs, following Fu et al. [1].

0.010

0.008

0.006

0.004

0.002

0

−0.002 PC2(0.356 %)

−0.004

−0.006

−0.008

−0.010

−0.012

−0.010 −0.005 0 0.005 0.010 0.015 PC1 (0.832 %)

Rigney 1 Paglicci 133 Afontova Gora 3 MA1 Vestonice 16 Ostuni 1 Satsurblia

Ranchot 88 Iron Gates HG Villabruna KorosEN HG Canes 1 Balma Guilanyà El Mirón

Chan Burkhardtshohle Hohle Fels 49 Sunghir Russia EHG Krems WA3 Pavlov 1

Kotias Bichon Motala HG OrienteC HG Berry-Au-Bac Falkenstein La Braña 1

Iboussieres 31-2 Iboussieres 25-1 MoitaSebastiao Karelia HG Popovo HG Vestonice 13 Vestonice 43

Natufian Taforal Latvia HG Goyet Q-2 Loschbour Chaudardes Rochedane

Ukraine Mesolithic Muierii 2 Kostenki 12 Kostenki 14 Ust' Ishim Oase 1 Goyet Q-116-1

Figure S2. PCA analysis calculated with 777 present day West Eurasians [3] with option shrinkmode:YES on which HG individuals were projected.

A Loschbour El Mirón B Moita do Sebastião (p=0.547)

Chan (p=0.301)

Balma Guilanyà (p=0.940)

La Braña 1 (p=0.047)*

Iboussieres 25-1 (p=0.506)

Canes 1 (p=0.077)

Rochedane (p=0.096)

Falkenstein (p=0.750)

Berry-Au-Bac (p=0.875)

Ranchot 88 (p=0.833)

Chaudardes (p=0.369) C Loschbour Iboussieres 31-2 (p=0.600) Villabruna OrienteC HG (p=0.171)

Bichon (p=0.025)* Koros EN-HG

Villabruna (p=0.033)* −0.002 −0.001 0 0.001 0.002 f4 (Moita do Sebastião,WHG;Taforalt,Mbuti) 0 0.5 1.0 D

80 Villabruna-like-ancestry Loschbour-like-ancestry 60 Fit of ancestry % vs time Fit of ancestry % vs time 40

20

WHG-like-ancestry(%) 0

−20,000 −15,000 −10,000 Radiocarbon date (Cal BP)

El Mirón Balma Guilanyà Chan La Braña 1 Moita do Sebastião Canes 1

Figure S3. A) Modelling European HG as admixture of Villabruna-like and El Mirón-like ancestry using Loschbour and El Mirón as proximal sources, respectively. B) Map showing the distribution of the main sites related to the Azilian techno-complex, adapted from Strauss [4]. Note that Iberian HG restricted to the Cantabrian and the Pre-Pyrenean regions carry more Loschbour-like ancestry (e.g. Balma Guilanyà, La Braña 1, and Canes 1) than Iberian HG from outside this area (Chan and Moita do Sebastião individuals). The increase of Villabruna ancestry is first attested with the Balma Guilanyà individual, recovered from an Azilian archaeological level. C) f4-statistics showing no affinity between Geometric Mesolithic Moita do Sebastião from Portugal and Iberomaurusian HG from Taforalt, Morocco, North Africa. D) Correlation between Villabruna-like ancestry (dark blue; Figure 3D) and Loschbour-like ancestry (light blue; Figure S3A) and time. Both models result in a fit of R=0.99 for individuals from north and northeast of Iberia Peninsula (to the exclusion of Chan and Moita do Sebastião), where we observe an increase of WHG-like ancestry similar to other parts of Europe.

BAL0051all CHA001 CHA002 CHA003 CHA004.A0102 CMS001 ELT002 ELT006 FUC003 Hom.Anc Het Hom.Der rs3827760 (EDAR) 1/0 4/0 0/0 3/0 1/0 0/0 14/0 8/0 0/0 >= 0.999 snp_2_136608574 (LCT) 1/0 0/0 0/0 0/0 0/0 0/0 0/0 4/0 1/0 >= 0.990 snp_2_136608642 (LCT) 1/0 3/0 0/0 2/0 0/0 0/0 18/0 14/0 1/0 >= 0.950 rs41525747 (LCT) 1/0 3/0 0/0 2/0 0/0 0/0 18/0 15/0 1/0 >= 0.900 rs4988236 (LCT) 1/0 3/0 0/0 2/0 0/0 0/0 18/0 14/0 1/0 >= 0.750 rs4988235 (LCT) 1/0 3/0 0/0 2/0 0/0 0/0 18/0 16/0 1/0 >= 0.500 rs41456145 (LCT) 1/0 3/0 0/0 2/0 0/0 0/0 19/0 15/0 1/0 < 0.500 rs41380347 (LCT) 1/0 1/0 0/0 2/0 0/0 0/0 21/0 13/0 1/0 rs145946881 (LCT) 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 1/0 rs4833103 (TLR1/TLR6/TLR10) 2/0 0/0 0/0 1/0 0/0 0/0 1/0 1/0 1/0 rs1229984 (ADH1B) 0/0 0/0 0/0 0/0 0/0 0/0 4/0 4/0 0/0 rs3811801 (ADH1B) 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 rs35395 (SLC45A2) 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 rs16891982 (SLC45A2) 0/0 4/0 0/0 0/0 0/0 0/0 5/8 4/3 2/0 rs272872 (SLC22A4) 0/2 0/0 1/0 0/1 0/0 0/0 5/8 0/4 1/0 rs2269424 (MHC) 0/1 0/0 0/0 0/1 0/0 0/0 7/6 12/0 0/1 rs2733831 (TYRP1) 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 rs174546 (FADS1/FADS2) 1/2 2/0 0/0 1/0 0/0 0/0 0/11 11/0 1/1 rs7944926 (DHCR7) 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 rs7119749 (GRM5) 1/0 0/0 0/0 1/0 0/1 0/3 0/6 2/4 3/0 rs10831496 (TYR) 0/0 0/0 0/0 0/0 0/0 0/0 0/2 0/0 1/0 rs1042602 (TYR) 2/0 0/0 0/1 1/0 0/0 0/0 2/4 5/1 0/0 rs642742 (KITLG) 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 rs653178 (ATXN2/SH2B3) 0/0 0/4 0/0 0/0 0/0 0/0 4/0 3/0 2/0 rs1979866 (NA) 0/0 1/0 0/0 0/0 0/0 0/0 2/0 3/0 2/0 rs2031526 (DCT) 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 rs1800414 (OCA2) 0/0 0/0 0/0 0/0 0/0 0/0 5/0 2/0 0/0 rs1800404 (OCA2) 0/0 0/0 0/0 0/3 0/0 1/0 0/3 2/2 1/0 rs12913832 (OCA2) 1/0 3/0 0/0 1/0 0/0 0/0 3/0 1/2 0/0 rs4424881 (APBA2) 2/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 rs1426654 (SLC24A5) 2/0 0/0 0/1 0/0 0/0 0/0 0/0 0/2 0/0 rs2470102 (SLC24A5) 0/0 0/0 0/3 0/1 0/0 0/0 0/1 0/2 0/0 rs2228479 (MC1R) 0/0 1/0 0/0 2/0 0/0 0/0 13/0 15/0 1/0 rs2228478 (MC1R) 0/0 0/0 0/0 0/1 0/0 0/0 0/7 0/3 0/0 rs6058017 (ASIP) 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/0 ELT002 ELT006 FUC003 CHA001 CHA002 CHA003 CMS001 BAL0051all

CHA004.A0102

Figure S4. Summary of genotypes of phenotypic and functional SNPs. Colours indicate the homozygous ancestral/derived or heterozygous state of the SNPs reported in the left-hand column. Numbers in cells indicate the number of reads matching the ancestral/derived allele.

Supplementary References

1. Fu, Q., Posth, C., Hajdinjak, M., Petr, M., Mallick, S., Fernandes, D., Furtwängler, A., Haak, W., Meyer, M., Mittnik, A., et al. (2016). The genetic history of Ice Age Europe. Nature 534, 200-205.

2. González-Fortes, G., Jones, E.R., Lightfoot, E., Bonsall, C., Lazar, C., Grandal- d’Anglade, A., Garralda, M.D., Drak, L., Siska, V., Simalcsik, A., et al. (2017). Paleogenomic Evidence for Multi-generational Mixing between Neolithic Farmers and Mesolithic Hunter-Gatherers in the Lower Danube Basin. Curr. Biol. 27, 1801– 1810.e10.

3. Lazaridis, I., Nadel, D., Rollefson, G., Merrett, D.C., Rohland, N., Mallick, S., Fernandes, D., Novak, M., Gamarra, B., Sirak, K., et al. (2016). Genomic insights into the origin of farming in the ancient Near East. Nature 536, 419-424.

4. Straus, L.G., and González Morales, M. (2003). El Mirón Cave and the 14C Chronology of Cantabrian Spain. Radiocarbon 45, 41–58.

5. Barandiarán, I., and Cava, A. (1994). Zatoya, sitio magdaleniense de caza en medio pirenaico. Homen. al Dr. J. Gonzalez Echegaray, Mus. y Cent. Investig. Altamira, 71– 85.

6. Martínez-Moreno, J., and Torcal, R.M. (2009). Balma Guilanyà (Prepirineo de Lleida) y el Aziliense en el noreste de la Península Ibérica. Trab. Prehist. 66, 45–60.

7. Clottes, J. (1979). Midi-Pyrénées. Gall. Préhistoire 22, 629–671.

8. Clottes, J. (1983). Midi-Pyrénées. Gall. préhistoire 26, 465–510.

9. Heiri, O., Brooks, S.J., Renssen, H., Bedford, A., Hazekamp, M., Ilyashuk, B., Jeffers, E.S., Lang, B., Kirilova, E., Kuiper, S., et al. (2014). Validation of climate model- inferred regional temperature change for late-glacial Europe. Nat. Commun. 5, 4914.

10. Straus, L.G. (2015). Chronostratigraphy of the Pleistocene/Holocene boundary: the Azilian problem in the Franco-Cantabrian region. Palaeohistoria 27, 89–122.

11. Sieveking, A. (1987). L’Art azilien, origine-survivance. Antiq. J. 67, 394–395.

12. Alonso, D.Á. (2008). La cronología del tránsito Magdaleniense/Aziliense en la región cantábrica/The chronology of the Magdalenian/Azilian transition in the Cantabrian region. Complutum 19, 67–78.

13. Fernández-Tresguerres.J.A., and Junceda Quintana, F. (1994). Los arpones azilienses de la cueva de los Azules (Cangas de Onís, Asturias). In Homenaje al Dr. Joaquín González Echegaray (Dirección General de Bellas Artes y Archivos), pp. 87– 96.

14. Martzluff, M., Martínez-Moreno, J.M., Guilaine, J., Mora, R., and Casanova, J. (2012). Transformaciones culturales y cambios climáticos en los Pirineos catalanes entre el Tardiglaciar y Holoceno antiguo: Aziliense y Sauveterriense en Balma de la Margineda y Balma Guilanyà. Cuaternario y Geomorfol. 26, 61–78. 15. Gronenborn, D. (2017). Migrations before the Neolithic? The Late Mesolithic blade- and-trapeze horizon in Central Europe and beyond. In Migration and Integration from Prehistory to the Middle Ages, LandesMuseum fur Vorgechichte, Halle, Germany, pp. 113–122.

16. Perrin, T., Marchand, G., Allard, P., Binder, D. (2009). Le second Mesolithique d’Europe occidentale : origines et gradient chronologique. In Fondation Fyssen, Annales 24, pp. 162–172.

17. Biagi, P. (2016). The Last Hunter-Gatherers of the Northern Coast of the and their Role in the Mesolithic of Europe. Souteast Eur. Before Neolit., 113-129.

18. García Puchol, O., Molina, L., Aura Tortosa, J., and Bernabeu, J. (2009). From the Mesolithic to the Neolithic on the Mediterranean Coast of the Iberian Peninsula. J. Anthropol. Res. 65, 237–251.

19. Bicho, N., Cascalheira, J., Gonçalves, C., Umbelino, C., Rivero, D.G., and André, L. (2017). Resilience, replacement and acculturation in the Mesolithic/Neolithic transition: The case of Muge, central Portugal. Quat. Int. 446, 31–42.

20. Arias, P., and Fano, M.A. (2009). ¿ Mesolítico geométrico o Mesolítico con geométricos?: el caso de la región cantábrica. In El mesolítico geométrico en la Península Ibérica (Departamento de Ciencias de la Antigüedad), pp. 69–92.

21. Utrilla, P., Montes, L., Mazo, C., Bea, M., and Domingo, R. (2009). El Mesolítico geométrico en Aragón. In El mesolítico geométrico en la Península Ibérica (Departamento de Ciencias de la Antigüedad), pp. 131–190.

22. Bicho, N., Cascalheira, J., Marreiros, J., Gonçalves, C., Pereira, T., and Dias, R. (2013). Chronology of the Mesolithic occupation of the Muge valley, central Portugal: The case of Cabeço da Amoreira. Quat. Int. 308–309, 130–139.

23. Van de Loosdrecht, M., Bouzouggar, A., Humphrey, L., Posth, C., Barton, N., Aximu- Petri, A., Nickel, B., Nagel, S., Talbi, E.H., El Hajraoui, M.A., et al. (2018). Pleistocene North African genomes link Near Eastern and sub-Saharan African human populations. Science (80-. ). 360, 548-552.

24. Fregel, R., Méndez, F.L., Bokbot, Y., Martín-Socas, D., Camalich-Massieu, M.D., Santana, J., Morales, J., Ávila-Arcos, M.C., Underhill, P.A., Shapiro, B., et al. (2018). Ancient genomes from North Africa evidence prehistoric migrations to the Maghreb from both the Levant and Europe. Proc. Natl. Acad. Sci. https://doi.org/10.1073/pnas.1800851115

25. Mathieson, I., Lazaridis, I., Rohland, N., Mallick, S., Patterson, N., Roodenberg, S.A., Harney, E., Stewardson, K., Fernandes, D., Novak, M., et al. (2015). Genome-wide patterns of selection in 230 ancient Eurasians. Nature 528, 499-503.

26. Monroy Kuhn, J.M., Jakobsson, M., and Günther, T. (2018). Estimating genetic kin relationships in prehistoric populations. PLoS One 13, e0195491. https://doi.org/10.1371/journal.pone.0195491.

27. Mathieson, I., Alpaslan-Roodenberg, S., Posth, C., Szécsényi-Nagy, A., Rohland, N., Mallick, S., Olalde, I., Broomandkhoshbacht, N., Candilio, F., Cheronet, O., et al. (2018). The genomic history of southeastern Europe. Nature 555, 197-203.

28. Olalde, I., Brace, S., Allentoft, M.E., Armit, I., Kristiansen, K., Booth, T., Rohland, N., Mallick, S., Szécsényi-Nagy, A., Mittnik, A., et al. (2018). The Beaker phenomenon and the genomic transformation of northwest Europe. Nature 555, 190-196.

29. Martiniano, R., Cassidy, L.M., Ó’Maoldúin, R., McLaughlin, R., Silva, N.M., Manco, L., Fidalgo, D., Pereira, T., Coelho, M.J., Serra, M., et al. (2017). The population genomics of archaeological transition in west Iberia: Investigation of ancient substructure using imputation and haplotype-based methods. PLOS Genet. 13, e1006852.

30. Gamba, C., Fernández, E., Tirado, M., Deguilloux, M.F., Pemonge, M.H., Utrilla, P., Edo, M., Molist, M., Rasteiro, R., Chikhi, L., et al. (2011). Ancient DNA from an Early Neolithic Iberian population supports a pioneer colonization by first farmers. Mol. Ecol. 21, 45–56.

31. Cassidy, L.M., Martiniano, R., Murphy, E.M., Teasdale, M.D., Mallory, J., Hartwell, B., and Bradley, D.G. (2015). Neolithic and Bronze Age migration to Ireland and establishment of the insular Atlantic genome. Proc. Natl. Acad. Sci. https://doi.org/10.1073/pnas.1518445113

32. Olalde, I., Schroeder, H., Sandoval-Velasco, M., Vinner, L., Lobón, I., Ramirez, O., Civit, S., García Borja, P., Salazar-García, D.C., Talamo, S., et al. (2015). A Common Genetic Origin for Early Farmers from Mediterranean Cardial and Central European LBK Cultures. Mol. Biol. Evol. 32, 3132–3142.

33. Haak, W., Lazaridis, I., Patterson, N., Rohland, N., Mallick, S., Llamas, B., Brandt, G., Nordenfelt, S., Harney, E., Stewardson, K., et al. (2015). Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207-211.

34. Mittnik, A., Wang, C.-C., Pfrengle, S., Daubaras, M., Zariņa, G., Hallgren, F., Allmäe, R., Khartanovich, V., Moiseyev, V., Tõrv, M., et al. (2018). The genetic prehistory of the Baltic Sea region. Nat. Commun. 9, 442. https://doi.org/10.1038/s41467-018- 02825-9.

35. Lipson, M., Szécsényi-Nagy, A., Mallick, S., Pósa, A., Stégmár, B., Keerl, V., Rohland, N., Stewardson, K., Ferry, M., Michel, M., et al. (2017). Parallel palaeogenomic transects reveal complex genetic history of early European farmers. Nature 551, 368- 372.

36. Allentoft, M.E., Sikora, M., Sjögren, K.-G., Rasmussen, S., Rasmussen, M., Stenderup, J., Damgaard, P.B., Schroeder, H., Ahlström, T., Vinner, L., et al. (2015). Population genomics of Bronze Age Eurasia. Nature 522, 167-172.

37. Gamba, C., Jones, E.R., Teasdale, M.D., McLaughlin, R.L., Gonzalez-Fortes, G., Mattiangeli, V., Domboróczki, L., Kővári, I., Pap, I., Anders, A., et al. (2014). Genome flux and stasis in a five millennium transect of European prehistory. Nat. Commun. 5, 5257.

38. Valdiosera, C., Günther, T., Vera-Rodríguez, J.C., Ureña, I., Iriarte, E., Rodríguez- Varela, R., Simões, L.G., Martínez-Sánchez, R.M., Svensson, E.M., Malmström, H., et al. (2018). Four millennia of Iberian biomolecular prehistory illustrate the impact of prehistoric migrations at the far end of Eurasia. Proc. Natl. Acad. Sci. http://doi.org.10.1073.pnas.1717762115

39. Hofmanová, Z., Kreutzer, S., Hellenthal, G., Sell, C., Diekmann, Y., Díez-del-Molino, D., van Dorp, L., López, S., Kousathanas, A., Link, V., et al. (2016). Early farmers from across Europe directly descended from Neolithic Aegeans. Proc. Natl. Acad. Sci. 113, 6886-6891.

40. Poznik, G.D. (2016). Identifying Y-chromosome haplogroups in arbitrarily large samples of sequenced or genotyped men. bioRxiv. http://biorxiv.org/content/early/2016/11/19/088716.abstract.

41. Lazaridis, I., Patterson, N., Mittnik, A., Renaud, G., Mallick, S., Kirsanow, K., Sudmant, P.H., Schraiber, J.G., Castellano, S., Lipson, M., et al. (2014). Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513, 409-413.