[Figure S1 plot. Axes: Dim1 (18.9%), Dim2 (14.2%). Points: populations and mtDNA sub-haplogroups. Legend (river or area of settlement): Meta, Guaviare, Inirida-Atabapo, Upper-Vaupes, Putumayo, Miriti-Parana, Caqueta, Amazonas, Andean, Interfluvial.]

Supporting information Figure S1. Correspondence analysis based on the sub-haplogroup frequencies by population. Populations are color-coded by river or area of settlement.

[Figure S2 plot. X-axis: Pairwise differences. Y-axis: Sequence pairs. Legend entries recovered: Sikuani, Guayabero, Tanimuka.]

Supporting information Figure S2. Mismatch distributions for the four populations with high positive values of Tajima’s D.

[Figure S3 plot, panels a to d. X-axis: Time (years ago). Y-axis: population size x generation time (log scale).]

Supporting information Figure S3. Four main trajectories observed in the BSP plots per population, exemplified by: a) Yucuna, showing population increase; b) Coreguaje, showing population stability; c) Nukak, showing population decrease; and d) Sikuani, showing extreme population contraction.

[Figure S4 plot. Heatmap with shared haplotypes (H_ identifiers) as rows and populations as columns. Color scale: Abs.freq, 0.0 to 12.5.]

Supporting information Figure S4. Absolute frequency of shared haplotypes. Populations are color-coded by linguistic affiliation as in Figure 2.

[Figure S5A plot. Haplotype network with node counts; Clusters I to III are labeled. Legend (language family): Arawakan, East-Tukanoan, Tikunan, Tupi, Peba-Yaguan, Quechuan, Maku-Puinave, Guahiban, Piaroa-Saliba, West-Tukanoan, Kamentsa, Pasto, Huitoto, Caribe, Barbacoan, Nasa.]

Supporting information Figure S5A. Networks of sequences, color-coded by linguistic affiliation. Haplogroup A2.

[Figure S5B plot. Haplotype network with node counts; Clusters I and II are labeled. Legend (language family): East-Tukanoan, Arawakan, Tupi, Peba-Yaguan, Maku-Puinave, Caribe, Guahiban, Piaroa-Saliba, West-Tukanoan, Kamentsa, Pasto, Quechuan, Huitoto, Cofan.]

Supporting information Figure S5B. Networks of sequences, color-coded by linguistic affiliation. Haplogroup B2.

[Figure S5C plot. Haplotype network with node counts; Clusters I to VI are labeled. Legend (language family): Arawakan, Huitoto, East-Tukanoan, Tupi, Tikunan, Peba-Yaguan, Maku-Puinave, Guahiban, Caribe, Piaroa-Saliba, Kamentsa, Pasto, Quechuan, Cofan, West-Tukanoan.]

Supporting information Figure S5C. Networks of sequences, color-coded by linguistic affiliation. Haplogroup C1.

[Figure S5D plot. Haplotype network with node counts; Clusters I and II are labeled. Legend (language family): Arawakan, Huitoto, East-Tukanoan, Tupi, Tikunan, Peba-Yaguan, Maku-Puinave, Guahiban, Caribe, Piaroa-Saliba, Kamentsa, Pasto, Quechuan, Cofan, West-Tukanoan.]

Supporting information Figure S5D. Networks of sequences, color-coded by linguistic affiliation. Haplogroup D1.

[Figure S6 plot. Heatmap of pairwise PhiST values between populations (color scale: PhiST value, 0.0 to 0.4). Rows and columns are populations, annotated by language family.]

Supporting information Figure S6. Matrix of pairwise ΦST values. The stars indicate significance (p-value < 0.05) after Benjamini-Hochberg correction for multiple tests.

[Figure S7 plot. Three-dimensional scatter of populations, color-coded by language family. Axes: Dimension 1, Dimension 2, Dimension 3. Stress = 7.85.]

Supporting information Figure S7. Three-dimensional non-metric multidimensional scaling plot. Stress value is given in percentage.

[Figure S8 plot. X-axis: great circle distance (km), 0 to 1000. Y-axis: genetic distance (PhiST).]

Supporting information Figure S8. Correlation between geographic distances and genetic distances. The blue line corresponds to the regression line of the entire dataset of 24 populations, the red line corresponds to the regression line excluding the outlier populations (Sikuani, Siona, Nukak), and the black line corresponds to the regression line of the pairwise comparisons involving the outlier populations.

[Figure S9 plot, panels A to D. X-axis: Time (years ago). Y-axis: population size x generation time (log scale).]

Supporting information Figure S9. Bayesian Skyline Plots by haplogroup. A, haplogroup A2; B, haplogroup B2; C, haplogroup C1; D, haplogroup D1.

[Figure S10 plot. Y-axis: Nuc.div, 0 to 0.004. Legend (geographic region): Africa, ISEA, NW_Amazonia, East_Asia, NorthEastAsia, Oceania.]

Supporting information Figure S10. Nucleotide diversity observed in complete mitochondrial genomes of different human populations from around the world. Data from: Barbieri et al. 2012; Barbieri et al. 2013; Barbieri et al. 2014a; Barbieri et al. 2014b; Duggan et al. 2013; Duggan et al. 2014; Jinam et al. 2012; Ko et al. 2014; Delfin et al. 2014; present study. Populations are ordered on the X-axis by geographic region.
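
The nucleotide diversity shown in Figure S10 is conventionally defined as the average number of pairwise differences per site among the sequences of a population. The following is a minimal Python sketch of that definition, assuming equal-length aligned sequences and treating every mismatching character as a difference (gaps and ambiguous bases are not handled). The function name `nucleotide_diversity` is hypothetical, and this is not necessarily the procedure or software used to produce the figure.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi).

    Assumes `seqs` is a list of aligned sequences of equal length.
    Simplification: any mismatching character counts as a difference.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    total_diffs = 0
    n_pairs = 0
    # Compare every unordered pair of sequences once.
    for a, b in combinations(seqs, 2):
        total_diffs += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1

    # Mean pairwise differences, normalized by alignment length.
    return total_diffs / (n_pairs * length)


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT"]  # illustrative data only
    print(nucleotide_diversity(toy_alignment))
```

For whole mitochondrial genomes, dedicated population-genetics software would normally be used instead, since it handles missing data and alignment gaps.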