SUPPLEMENTARY INFORMATION Doi:10.1038/Nature12736
Total Page:16
File Type:pdf, Size:1020Kb
SUPPLEMENTARY INFORMATION doi:10.1038/nature12736 TABLE OF CONTENTS SI 1. THE ARCHAEOLOGICAL SITES OF MAL’TA AND AFONTOVA GORA-2 .............................................................................................................................................1 SI 1.1 MAL’TA ..................................................................................................................1 SI 1.2 AFONTOVA GORA-2 ...............................................................................................2 SI 2. RADIOCARBON DATING.....................................................................................6 SI 3. EXTRACTIONS, LIBRARIES AND SEQUENCING..........................................9 SI 3.1 ANCIENT SAMPLES: MA-1 AND AG-2.....................................................................9 SI 3.1.1 Extractions .................................................................................................9 SI 3.1.2 Library preparation and sequencing ..........................................................9 SI 3.2 MODERN INDIVIDUALS: TAJIK, AVAR, MARI, INDIAN ...........................................10 SI 3.2.1 Sample background and collection..........................................................10 SI 3.2.2 Extractions ...............................................................................................11 SI 3.2.3 Library preparation and sequencing ........................................................11 SI 4. PROCESSING AND MAPPING OF RAW SEQUENCE DATA FROM ANCIENT AND MODERN INDIVIDUALS ................................................................13 SI 4.1 ANCIENT SAMPLES: MA-1 AND AG-2...................................................................13 SI 4.2 SIXTEEN PRESENT-DAY HUMANS, DENISOVA, TIANYUAN......................................13 SI 4.2.1 Sample overview .....................................................................................13 SI 4.2.2 Read processing of the Mari, Avar, Tajik and Indian individuals...........13 SI 4.2.3 Read processing of the present-day human genomes, Denisova and Tianyuan genomes.................................................................................................14 SI 4.2.4 Genotyping ..............................................................................................14 SI 4.3 BIOLOGICAL SEX OF MA-1 AND AG-2..................................................................15 SI 5. DNA CONTAMINATION ESTIMATES.............................................................22 SI 5.1 MITOCHONDRIAL DNA CONTAMINATION ESTIMATES ...........................................22 SI 5.1.1 Description of the method .......................................................................22 SI 5.1.2 Data filtering............................................................................................22 SI 5.1.3 Results......................................................................................................22 SI 5.2 X CHROMOSOME-BASED DNA CONTAMINATION ESTIMATES.................................23 SI 5.2.1 Description of the method .......................................................................23 SI 5.2.2 Data..........................................................................................................23 SI 5.2.3 Results......................................................................................................24 SI 5.3 CONCLUSIONS: LIBRARIES FOR DOWNSTREAM ANALYSES....................................24 SI 6. ERROR RATE ESTIMATION AND DNA DAMAGE.......................................29 SI 6.1 ERROR RATE ESTIMATION .....................................................................................29 SI 6.1.1 Description of the method .......................................................................29 SI 6.1.2 Data .........................................................................................................29 SI 6.1.3 Results......................................................................................................29 SI 6.2 DNA DAMAGE IN MA-1 .......................................................................................29 SI 7. MTDNA HAPLOGROUP OF MA-1 ....................................................................33 ! WWW.NATURE.COM/NATURE | 1 doi:10.1038/nature12736 RESEARCH SUPPLEMENTARY INFORMATION SI 8. Y CHROMOSOME HAPLOGROUP OF MA-1.................................................43 SI 9. ANCESTRY PROPORTIONS IN MA-1..............................................................49 SI 9.1 ADMIXTURE ANALYSIS.....................................................................................49 SI 9.1.1 Method.....................................................................................................49 SI 9.1.2 Data .........................................................................................................49 SI 9.1.3 Results......................................................................................................50 SI 9.2 NGSADMIX ANALYSIS ..........................................................................................50 SI 9.2.1 Method.....................................................................................................50 SI 9.2.2 Data .........................................................................................................50 SI 9.2.3 Data filtering and preparation..................................................................50 SI 9.2.4 Results......................................................................................................51 SI 10. PRINCIPAL COMPONENT ANALYSIS..........................................................59 SI 11. TREEMIX ANALYSIS .........................................................................................64 SI 11.1 DATA FILTERING.................................................................................................64 SI 11.2 ADMIXTURE GRAPH FITTING ...............................................................................64 SI 11.3 RESULTS .............................................................................................................64 SI 11.4 ADMIXTURE GRAPHS FOR SUBSETS OF THE 17 GENOMES .....................................65 SI 11.5 ADMIXTURE GRAPHS FOR SIMULATED DATA .......................................................65 SI 12. MIXMAPPER ANALYSIS ...................................................................................74 SI 13. D-STATISTIC TESTS BASED ON SEQUENCING DATA (DSEQ) ................80 SI 13.1 NOTATION AND DESCRIPTION OF THE TEST .........................................................80 SI 13.2 DATA..................................................................................................................80 SI 13.3 DATA FILTERING.................................................................................................80 SI 13.4 TEST RESULTS ....................................................................................................81 SI 13.5 CONTAMINATION CORRECTION ...........................................................................81 SI 14. ANALYSES USING F-STATISTICS AND D-STATISTICS BASED ON ALLELE FREQUENCY DATA (DFREQ).......................................................................87 SI 14.1 METHODS AND DATA..........................................................................................87 SI 14.2 ESTIMATES OF SHARED DRIFT USING F3-STATISTICS.............................................87 SI 14.3 D-STATISTIC TESTS (DFREQ) ON DATA WITH VARIOUS ASCERTAINMENTS DOCUMENT GENE FLOW BETWEEN THE MA-1 LINEAGE AND NATIVE AMERICAN ANCESTORS ............88 SI 14.4 NO DIFFERENCE BETWEEN NATIVE AMERICAN POPULATIONS IN THEIR AFFINITY TO MA-1..............................................................................................................................88 SI 14.5 GENETIC AFFINITIES BETWEEN NATIVE AMERICANS AND ALL WESTERN EURASIANS .....................................................................................................................89 SI 14.6 EQUAL AFFINITY OF MA-1 TO HAN AND PAPUANS .............................................90 SI 14.7 EVIDENCE FOR BASAL EAST ASIAN GENE FLOW INTO MA-1 ...............................91 SI 14.8 NO EUROPEAN POPULATION IS AS CLOSE TO NATIVE AMERICANS AS IS MA-1....91 SI 15. ANALYSIS OF AG-2 USING NUCLEOTIDE MISINCORPORATION PATTERNS ....................................................................................................................103 SI 16. PHENOTYPE RECONSTRUCTION...............................................................107 ! WWW.NATURE.COM/NATURE | 2 doi:10.1038/nature12736 RESEARCH SUPPLEMENTARY INFORMATION SI 1. The archaeological sites of Mal’ta and Afontova Gora-2 S1 1.1 Mal’ta Mal’ta is a multi-component site located along the Belaya River, tributary of the Angara River in the Pre-Baikal region of southern Siberia, about 86 km northwest of Irkutsk (Figure 1a). It is most noted for its rich middle Upper Palaeolithic (MUP) archaeological materials, including the famous double-child burial and “Venus” figurines. The site was originally excavated during the early 20th century (1928-1958) by Gerasimov1,2, and later during the 1990s by Medvedev and colleagues3. Original excavations were extensive, covering 1348 m2, where Gerasimov reported a single cultural layer for the Upper Palaeolithic found 1.5 m below the surface1. Recent geoarchaeological investigations have shown site stratigraphy to be more complex than originally thought3.