History and Geography of Human Y-Chromosome in Europe: a SNP Perspective
Total Page:16
File Type:pdf, Size:1020Kb
JASs Invited Reviews Journal of Anthropological Sciences Vol. 86 (2008), pp. 59-89 History and geography of human Y-chromosome in Europe: a SNP perspective Paolo Francalacci & Daria Sanna Dipartimento di Zoologia e Genetica Evoluzionistica, Università di Sassari, via Muroni 25 – 07100 Sassari, Italy e-mail: [email protected] Summary - e genetic variation observed in the modern European populations can be used to reconstruct the history of the human peopling of the continent. In recent times, a great importance has been given to uniparental markers such as the Y-chromosome. is chromosome, which is passed from father to son, does not have a counterpart subject to recombination and the only possible source of variation is mutation. e nucleotide changes accumulate over time in the molecule, with no rearrangement among lineages. Lately, the D-HPLC technique, which allows the eff ective detection of single nucleotide polymorphisms (SNPs), was used to boost the number of available polymorphisms on the Y-chromosome. Since the year 2000, a number of studies were aimed both at the reconstruction of Y-chromosome phylogeny and the geographic distribution of Y-chromosome variation in Europe. e distribution of distinctive Y-chromosome lineages can also display a correspondence with geography, thus providing patterns of affi nity and clues concerning past human movements. It is therefore possible to recognize the eff ect of the colonization of Europe following the Last Glacial Maximum, both from the western Iberian and the eastern Balkan refuges. Other lineages show a migratory wave from the Near East, consistent with the demic diff usion model of agriculture. A minor east-west genetic cline was proposed as a signal of an expansion from north of the Black Sea, related with the diff usion of people speaking languages of the Indo-European family. Keywords - Phylogeography, Single Nucleotide Polymorphisms, Non-Recombining portion, Y-chromo- some, Population genetics. Introduction their uniqueness in the human genome, being the only two haploid segments of the human genome, Since the publication of “ e history and geog- not subject to the rearrangement of recombina- raphy of human genes” by Cavalli-Sforza and col- tion, and consequently inherited linearly identi- laborators (1994), a pivotal study for human evo- cal through generations. Mutation is the only lutionary genetics focused on divergence among possible source of variation for these two genetic human populations using many autosomal inher- systems and nucleotide changes accumulate over ited traits (e.g. blood types, HLA factors, pro- time in the molecule. All the mtDNAs and NRYs teins), an increasing attention has been given to of living humans descend by coalescence from a DNA markers at uniparental transmission, such single female and male respectively and mutations as mitochondrial DNA (mtDNA) (Richards & from the common ancestors accumulated along Macaulay, 2001), and the non recombining por- lineages can be used for tracking back their evo- tion of Y-chromosome (NRY) (Jobling & Tyler- lutionary history. However, the haploidy makes Smith, 2003). e reason of this interest resides in mtDNA and Y chromosome more sensitive to the the JASs is published by the Istituto Italiano di Antropologia www.isita-org.com 60 Phylogeography of Y-chromosome in Europe eff ect of genetic drift which may signifi cantly skew introduced unwanted source of misunderstand- the haplogroup frequencies. is is particularly ing. To overcome this inconvenient, a strictly cla- true for the Y chromosome, considering the dif- distic nomenclature, paralleled to the one in use ferences between male and female eff ective popu- for mtDNA, has been proposed and now com- lation sizes (Jorde et al., 2000). Furthermore, the monly accepted (Y Chromosome Consortium, NRY is far larger than the mitochondrial genome 2002). e new system is based on capital letters (about 60 Mb in respect to 16.569 bp), giving to in order to identify the broader clades and a suc- this system huge potentiality for reconstructing cession of numbers and letters for lower hierar- human evolution, at least from the paternal side. chical levels, thus fl exible enough to allow the Recent advances in molecular biology, namely, unambiguous naming of haplogroups defi ned by the development of Denaturing-High Performance newly discovered downstream markers. However, Liquid Chromatography (D-HPLC) (Underhill et the internal nodes are highly sensitive to changes al., 1997), improved the knowledge of the phy- in tree topology and to the addition of new SNPs logeny of the Y-chromosome at an extraordinary which resolve paraphyletic into monophyletic detail, and the number of known polymorphisms groups. is occurrence may require the peri- increased dramatically in the past decade from odical update of the nomenclature (Karafet et al., just a handful to over 600 (Karafet et al., 2008). 2008), especially for the commonest haplogroups is technique made easy, fast and inexpensive such as E, I or J, and creating disorder when com- the detection of new nucleotide changes in a given paring data between papers published in a dif- stretch of DNA, and it was possible to identify ferent time. Moreover, a lineage showing a root a large number of biallelic markers, called single SNP, but not presenting any of the derived ones, nucleotide polymorphisms (SNPs) all over the is identifi ed by a * (star) symbol, but it should molecule. e mutational event that introduces be distinguished by lineages where derived SNP a SNP in the NRY is so rare as to be considered were not tested, and this is obviously the case if unique and recurrent independent mutations are the derived SNP were not yet discovered. For this practically absent (Underhill et al., 2000). is reason, a “star” haplogroup may not have the same feature allows a hierarchical approach in recon- meaning in diff erent papers. Moreover, to distin- structing the molecular phylogeny, so when diff er- guish haplogroups where downstream markers ent haplotypes share a SNP are considered derived were assayed but not detected, at the haplogroup from a common ancestor and clustered together name is usually added, in brackets, the symbol in a haplogroup. Moreover, fast evolving multi- “x” followed by the name of the excluded haplo- allelic markers, such as Short Tandem Repeats groups. So, to overcome possible ambiguities, it (STRs), are also present in the Y-chromosome and is strongly advisable (after a statement of all the their variability, considering an average mutation SNPs assayed, including those not detected) to rate of about 6,9 x 10-4 per generation (25 years) identify a given lineage adding to the haplogroup (Zhivotovski et al., 2004), can be used to date the name the last downstream SNP observed. approximate age of a haplogroup. In addition, e distribution of haplogroups in a given the combined use of fast evolving STRs can also territory is not random, refl ecting the spread of provide further helpful information to dissect the human beings in the past. Phylogeographic anal- evolutionary history of lineages where no down- ysis confi rms the African origin of Homo sapiens stream SNP markers are available. (Underhill et al., 2001) and also provides clues Early papers used diff erent nomenclatures concerning past human movements at continen- to defi ne haplogroups (Underhill et al., 2000; tal and even regional level. Geographical clines Semino et al., 2000; Hammer et al., 2001: of Y-chromosome haplogroups in Europe have Rosser et al., 2000) and this fact, together with been reported (Semino et al., 2000; Rosser et al., the contemporary existence of a STR-based 2000) and their distribution can be related to classifi cation (Jobling & Tyler-Smith, 2000), archaeological, linguistic and other non genetic P. Francalacci & D. Sanna 61 data to infer the past events of the peopling of Haplogroup A the continent. In this work we review papers Defi ned by the mutation M91, this hap- dealing with Y-chromosome variability based on logroup is the fi rst branch of the evolutionary SNPs and focused on Europe and neighboring tree of mankind and it is diff used in sub-Saha- areas (Anatolia, the Caucasus, the Levant and ran Africa with higher frequencies in Khoisan North Africa). In order to homogenize the data populations (Underhill et al., 2001). is hap- from many studies carried out at various depth logroup can be sporadically found in south- of analysis, the haplogroup frequencies for all the ern European populations such as Greeks (Di population studied are reported in Appendix 1 in Giacomo et al., 2003; Capelli et al., 2006) and a summarized form, tracing back into the main Portuguese (Goncalves et al., 2005), although haplogroups the sub-clades not analyzed in the in these studies no downstream markers were whole set of the considered papers. e com- assayed and European Y-chromosomes belong- plete data set is available online at the Journal of ing to haplogroup A may not be closely related Anthropological Science website as supplemen- with the southern African lineages. An early tary material. Some papers (Weale et al., 2002; divergence originated the East African sub- Di Giacomo et al., 2003; Dupuy et al., 2006), clade A3b2-M13, (Semino et al., 2002) found using STRs and/or a limited set of SNPs, could at high percentage in Falasha (Ethiopian Jews) not provide the necessary level of resolution to (Cruciani et al., 2002; Shen et al., 2004) and be included in the overall analysis but they were consistently observed, although at very low taken into account for the general discussion. frequencies, in Sardinia (Semino et al., 2000; Other papers focused on the phylogeny of specifi c Passarino et al., 2001; Contu et al., 2008). haplogroups (Di Giacomo et al., 2003; Rootsi et However, the extreme rarity of this haplogroup al., 2004; Semino et al., 2004; Cruciani et al., does not off er clues about its possible appear- 2004) have been used for the drawing of the fre- ance in Europe.