Online-Only Tables and Figures


Table S1: Samples and their geographic/ethnic origin

Region            No.   Country/population
Africa (n=162)    25    African Americans
                  19    Biaka Pygmies
                  14    M'Buti Pygmies
                  14    Black South Africans
                  14    Ethiopians
                  25    Ghanaians
                  14    Gabonese
                  14    Khoisans
                  13    Mossi
                  10    Rimaibe
ME (n=110)        13    Ashkenazim
                  22    Bedouins
                  22    Druze
                  21    Palestinians
                  18    Iranian Jews
                  14    Yemenite Jews
Europe (n=87)     12    Basque
                  15    Urban French
                  13    Cajuns
                  21    Germans
                  16    Polish
                  10    Bretons
Americas (n=96)   20    Cree's
                  23    Navajo's
                  37    Chipewyans
                  16    Karitiana
ECAsia (n=87)     19    Japanese
                  18    Chinese
                  22    Mongolians Turkic (Kazakh, Khoton)
                  28    Mongolians Oriat (Olet, Mingat, Derbet, Bait)
IndPNG (n=58)     18    Papuans of New Guinea
                   9    Roti
                  17    Ternate
                  14    Timor

Table S2. Sequence motifs within GT-tracts in the human genome (UCSC genome assembly from June 2002)

Sequence motif                              Count
(GT)x>12                                    50268
(GT)x>12 (GA)y>12                             265
(GT)x>12 ATGT (GA)y>12                          2
(GT)i>3 (GG)j>0 (GT)x>6                       340
(GT)i>3 GG (GT)x>7                            299
(GT)4 GG (GT)x                                 92
(GT)5 GG (GT)x                                 65
(GT)i>3 (GG)2 (GT)x>6                          34
(GT)4 (GG)2 (GT)x                              17
(GT)5 (GG)2 (GT)x                               7
(GT)i>3 (GG)>2 (GT)x>6                          7
(GT)i>3 (GG)j>0 (GT)k (GG)l>0 (GT)x            49
(GT)5 GG (GT)3 GG (GT)x                         2
(GT)5 GG (GT)5 GG (GT)x                         1


Table S3. Haplotypes by population and continental groups

Counts per group or population

Columns: X, 0 or 2 (as in the X0Y/X2Y designation), Y, total count, then the counts per population in the order of Table S1, each continental group (Africa, ME, Europe, Americas, ECAsia, IndPNG) followed by its subtotal. Only nonzero counts are shown; haplogroup labels are interleaved with the rows.

20 0 14 2 1 1 2 21 0 14 2 2 2 21 0 15 1 1 1 22 0 13 1 1 1 0000x0y 22 0 15 1 1 1 22 0 18 1 1 1 24 0 14 1 1 1 25 0 14 1 1 1 7 0 14 1 1 1 14 0 15 4 1 2 3 1 1 15 0 14 8 6 2 8 15 0 15 8 3 2 3 8 16 0 13 2 2 2 4100x0y 16 0 14 1 1 1 16 0 15 2 1 1 2 17 0 15 5 4 1 5 18 0 14 1 1 1 18 0 15 6 3 1 2 6 19 0 14 2 2 2 15 0 15 1 1 1 4200x0y 18 0 16 1 1 1 16 0 15 1 1 1 5100x0y 18 0 15 1 1 1 19 0 15 1 1 1 11 0 14 2 1 1 1 1 5131x0y 11 0 15 3 3 3 12 0 14 6 1 1 3 5 1 1 5132x0y 11 0 14 9 3 1 1 3 1 9 5300x0y 14 0 14 1 1 1 0000x2y 13 2 24 1 1 1 17 2 21 2 1 1 1 1 18 2 21 4 1 1 1 1 2 1 1 18 2 22 1 1 1 18 2 23 1 1 1 18 2 25 2 2 2 19 2 12 1 1 1 19 2 17 1 1 1


19 2 20 2 1 1 1 1 19 2 21 26 1 1 2 1 5 3 9 2 4 3 1 2 12 2 2 1 1 19 2 22 5 1 1 2 1 1 1 1 2 19 2 23 3 1 2 3 19 2 28 3 2 1 3 20 2 20 1 1 1 20 2 21 3 1 1 2 1 1 20 2 22 3 1 1 1 1 2 21 2 21 4 1 1 1 1 1 1 1 1 21 2 22 1 1 1 22 2 18 6 2 1 1 1 1 6 24 2 20 1 1 1 25 2 15 1 1 1 26 2 16 1 1 1 25 2 18 1 1 1 18 2 20 1 1 1 4300x2y 19 2 21 1 1 1 5100x2y 11 2 18 1 1 1 13 2 21 4 1 1 2 1 1 1 1 14 2 18 2 2 2 15 2 15 31 1 1 1 2 1 2 1 9 1 1 1 3 2 1 1 2 6 3 3 6 3 1 3 7 15 2 18 2 1 1 2 16 2 15 1 1 1 17 2 15 12 1 1 2 1 1 1 1 4 1 1 2 2 2 2 2 17 2 18 16 1 1 1 1 1 3 1 1 3 5 2 2 4 1 2 3 17 2 20 2 1 1 2 17 2 21 2 1 1 2 17 2 22 3 1 1 1 3 18 2 11 1 1 1 18 2 12 1 1 1 18 2 15 5 2 2 4 1 1 18 2 17 1 1 1 18 2 18 27 1 1 3 6 2 3 2 16 1 3 1 2 1 8 1 1 1 1 18 2 19 1 1 1 18 2 20 2 1 1 2 18 2 22 1 1 1 19 2 11 4 1 1 2 4 19 2 12 10 2 1 2 5 1 2 2 5 19 2 13 3 1 1 1 1 2 19 2 14 1 1 1 19 2 15 41 2 1 3 1 5 1 2 2 17 1 2 3 1 7 1 1 1 2 5 2 1 3 1 1 1 3 6 2 1 3 19 2 16 8 1 1 1 1 1 1 2 6 1 1 19 2 17 3 1 1 1 1 1 1 19 2 18 79 1 1 1 3 3 3 3 3 4 2 18 1 1 5 6 5 1 19 4 2 4 10 5 2 3 4 14 4 7 4 15 19 2 19 3 1 1 1 1 1 1 19 2 21 2 1 1 1 1 20 2 11 2 2 2


20 2 12 1 1 1 20 2 15 4 1 1 1 3 1 1 20 2 16 1 1 1 20 2 17 1 1 1 20 2 18 15 3 2 1 1 7 1 1 2 4 4 1 1 2 20 2 21 1 1 1 21 2 17 1 1 1 21 2 18 40 1 1 6 4 20 9 39 21 2 19 2 1 1 1 1 22 2 11 1 1 1 22 2 15 1 1 1 23 2 15 1 1 1 19 2 22 1 1 1 14 2 15 1 1 1 19 2 20 1 1 1 15 2 13 56 1 1 2 1 1 5 3 10 17 35 10 2 3 3 18 15 2 14 1 1 1 5131x2y 15 2 15 1 1 1 16 2 13 1 1 1 17 2 13 1 1 1 5151x2y 10 2 18 1 1 1 18 2 15 1 1 1 18 2 18 5 2 2 1 2 3 18 2 20 2 2 2 5200x2y 18 2 21 1 1 1 18 2 22 31 2 2 1 3 4 1 3 1 1 6 4 9 6 19 18 2 24 2 1 1 1 1 19 2 22 2 1 1 2 600 25 19 14 14 14 25 14 14 13 10 162 13 22 22 21 18 14 110 12 15 13 21 16 10 87 20 23 37 16 96 19 18 22 28 87 18 9 17 14 58


Table S4. Summary statistics of DXS1238 haplotypes (pw, ps and p represent probabilities evaluated according to Watterson (1978), Slatkin (1994) and Chakraborty (1990), respectively)

Group          n     k (HT)   G (HT)   F      ΘF     Θk     Fexp    kexp   (pw; ps)          (p)
All            600   112      0.95     0.05   18.6   40.3   0.025   66     (0.998; 1.00)     (0.000)
Africa         162   66       0.97     0.03   33.6   41.0   0.03    60     (ns)              (ns)
X0Y            72    28       0.95     0.07   15.9   16.4   0.07    28     (ns)              (ns)
X2Y            90    38       0.95     0.07   15.6   24.3   0.05    30     (0.951; ns)       (0.040)
Non-Africans   438   65       0.93     0.07   11.1   20.9   0.05    42     (0.97; 1.00)      (0.000)
ME             110   38       0.93     0.07   12.9   20.1   0.05    30     (ns; 0.988)       (0.031)
Europe         87    25       0.91     0.09   9.2    11.4   0.09    22     (ns)              (ns)
Americas       96    18       0.78     0.22   2.9    6.3    0.14    11     (0.0951; 0.961)   (0.009)
ECAsia         87    21       0.80     0.20   3.3    8.5    0.11    11     (0.983; 0.992)    (0.001)
IndPNG         58    14       0.82     0.19   3.7    5.5    0.16    11     (ns)              (ns)


Table S5. Variance in the repeat length and the proportion of the major length allele

                                         GT-tract                                        GA-tract
Haplogroup      Population     n (total)  S2    Major X  n of major X  -lnP   -S2/lnP   S2     Major Y  n of major Y  -lnP   -S2/lnP   S2GA/S2GT   (S2GA/lnPGA)/(S2GT/lnPGT)   lnPGT/lnPGA
0.0.0.0.X.2.Y   All            75         3.5   19       41            0.60   5.8       6.0    21       39            0.65   9.2       1.7         1.6                         0.92
0.0.0.0.X.2.Y   Africa         23         6.8   19       7             1.19   5.7       12.3   18       6             1.34   9.2       1.8         1.6                         0.89
0.0.0.0.X.2.Y   Non-Africans   52         1.9   19       34            0.42   4.5       3.2    21       35            0.40   8.1       1.7         1.8                         1.07
5.1.0.0.X.2.Y   All            343        3.4   19       156           0.79   4.3       4.6    18       182           0.63   7.3       1.4         1.7                         1.24
5.1.0.0.X.2.Y   Africa         64         4.0   19       27            0.86   4.7       7.9    15       37            0.55   14.4      2.0         3.1                         1.58
5.1.0.0.X.2.Y   Non-Africans   279        3.2   19       129           0.77   4.1       3.4    18       177           0.46   7.5       1.1         1.8                         1.70
5.1.3.1.X.2.Y   Non-Africans   60         0.1   15       58            0.03   3.0       0.1    13       57            0.05   2.0       1.0         0.7                         0.66
5.2.0.0.X.2.Y   Non-Africans   42         0.1   18       40            0.05   2.1       3.0    22       31            0.30   9.9       30.0        4.8                         0.16
5.1.3.1.X.0.Y   Africa         9          0.3   12       5             0.59   n.a.      0.3    14       6             0.41   n.a.      1.0         1.4                         1.45
4.1.0.0.X.0.Y   Africa         39         4.2   15       16            0.89   n.a.      0.4    15       25            0.44   n.a.      0.1         0.2                         2.00
5.1.3.2.X.0.Y   Africa         9          -     11       9             0.00   -         -      14       9             0.0    -         -           -                           -
0.0.0.0.X.0.Y   Africa         9          2.9   21       3             1.10   n.a.      1.8    14       6             0.41   n.a.      0.6         1.7                         2.71


Footnote to Table S5:

Within each haplogroup the histories of both repeated tracts are linked; thus, independently of the population demography (see Eqs. 3 and 4 in Materials and Methods), the ratio $S^2_{GA}/S^2_{GT} = \mu_{GA}\sigma^2_{GA}/(\mu_{GT}\sigma^2_{GT})$, where $\sigma^2$ is the mean square mutation size (mutation size corresponds to the number of added or deleted repeat units, i.e. $\sigma^2 = 1$ for single-unit mutations). The relative mean square mutation size $\sigma^2_{GA}/\sigma^2_{GT}$ can be evaluated from the product $(S^2_{GA}/S^2_{GT}) \times (\ln P_{GT}/\ln P_{GA})$ (Eqs. 1, 3 and 4 in Materials and Methods), whereas under population expansion $\sigma^2$ can be obtained directly from the ratio $-S^2/\ln P = \sigma^2$ (Eqs. 1 and 4), which is the quantity tabulated in the $-S^2/\ln P$ columns. To evaluate P, the major length allele was always assumed to be the ancestral one.

As shown in this table, there is more variance within the GT tracts than on the GA side in the Africa-only X0Y lineages. Interestingly, the introduction of the ATGT interruption between the GT and GA repeats resulted not only in more length variation in all haplogroups of the new X2Y branch, but also in a greater variance in the repeat length within the GA tract (Fig. 3). Using the data from non-Africans for the 0000X2Y and 5100X2Y haplogroups, we evaluated the average mean square mutation size at 7.8 for the GA tract ($\sigma^2_{GA}$) and at 4.3 for the GT tract ($\sigma^2_{GT}$), a ratio of about 1.8. The African data give similar results; yet, because the haplotypes are older in Africa, the ancestral haplotype becomes relatively less frequent. Over time, recurrent mutations increase the possibility of homoplasies (non-identity by descent of identical alleles), which would affect the population frequency of the presumed ancestral allele and, by the same token, the resulting estimates. By lowering P, homoplasies would increase the estimate of $\sigma^2$, a tendency that is actually observed in Africans as compared to non-Africans. In the 5131X2Y and 5200X2Y haplogroups, given the number of mutations observed, only $\sigma^2_{GA}$ of 5200X2Y can be reliably evaluated; at 9.9, its estimated value is in excellent agreement with the value of 7.8 above, especially considering that the inherent variance in $S^2$ estimates is very large (Zhivotovsky and Feldman 1995).

The relative mutation rates of the GT and GA repeats can be estimated from Eq. 1, as well as from Eq. 4 if the mean square mutation size $\sigma^2$ is known. The ratio $\mu_{GT}/\mu_{GA} = \ln P_{GT}/\ln P_{GA}$, obtained from the persistence of the ancestral alleles P within each of the repeats, provides information about the partition of the rates $\mu_{GT}$ and $\mu_{GA}$ in their cumulative rate estimate ($\mu_{GT} + \mu_{GA}$). In the haplogroup 0000X2Y, $\ln P_{GT}/\ln P_{GA} \approx 1$ and both repeats appear to have mutated at the same rate, whereas in other X2Y haplogroups the ratio differs from 1. In 5100X2Y, the GT tract seems to have mutated 1.7-fold faster than the GA tract, whereas in 5200X2Y the effect is opposite, suggesting a relative reduction in the mutation rate of its GT tract, also reflected in the $S^2$ values.
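The rate ratios discussed in this paragraph can be checked against the allele counts in Table S5. The sketch below is a hedged illustration, not the authors' procedure: the dictionary layout and the helper name `rate_ratio` are hypothetical, and P is again assumed to be the frequency of the major length allele in the non-African rows.

```python
import math

# (n total, n of major X [GT tract], n of major Y [GA tract]),
# copied from the non-African rows of Table S5
non_african_counts = {
    "0000X2Y": (52, 34, 35),
    "5100X2Y": (279, 129, 177),
    "5200X2Y": (42, 40, 31),
}

def rate_ratio(n_total, n_major_gt, n_major_ga):
    """Estimate mu_GT / mu_GA as lnP_GT / lnP_GA.

    Assumption: P for each tract is the frequency of its major
    length allele, taken to be the ancestral one.
    """
    ln_p_gt = math.log(n_major_gt / n_total)
    ln_p_ga = math.log(n_major_ga / n_total)
    return ln_p_gt / ln_p_ga

ratios = {name: round(rate_ratio(*counts), 2)
          for name, counts in non_african_counts.items()}

for name, value in ratios.items():
    print(f"{name}: mu_GT/mu_GA is about {value}")
```

The three ratios come out close to 1 for 0000X2Y, about 1.7 for 5100X2Y and well below 1 for 5200X2Y, in line with the last column of Table S5.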

REFERENCES

Zhivotovsky LA, Feldman MW (1995) Microsatellite variability and genetic distances. Proc Natl Acad Sci U S A 92:11549-52


Figure S3: PCA based on haplogroup frequencies.
