<<

Commentary

Genome, diversity, and origins: The Y chromosome as a storyteller Jaume Bertranpetit* Facultat de Cie`ncies de la Salut i de la Vida, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain

nalysis of genome variation concerning the accuracy of our knowledge tion. The new analysis (2) on the same Amay focus on one of two possible goals: of genome dynamics, worries concerning chromosome gives only one-third of the understanding the genome region under the ability and power to detect specific pro- time. It is thus an interesting new proposal study or solving historical and evolutionary cesses and disentangle cases where more not only in human evolution but also in questions specific to the population(s) ana- than one mechanism may have produced human evolutionary genetics. It could mean lyzed. Understanding of variation of a given similar genetic patterns, and worries con- that a population with modern characteris- genome region has a genetic interest be- cerning the appropriateness of evolutionary tics had to exist in Africa 50,000 ya and cause it is a consequence of the dynamics of models needed for the inference. And fi- spread later to Eurasia and the rest of the the genome and thus the evolutionary forces nally there have been worries from anthro- world. Besides the concordance of this pro- (mutation, selection in its varieties, drift, pologists who do not perceive the interface posal with archaeological evidence, there recombination, . . . ) may be understood. It between the evolutionary biology of a spe- are more strictly genetic issues to be dis- is thus a way to understand the mechanisms cies and that of tiny fragments of DNA, cussed, namely the limitations of the theory that produce variation in the genome. On usually in noncoding regions, worries sur- and of the genetic data, the interpretation of the other hand, when genetic variation is rounding a fast-developing field, heir to the variation, and possible specific proper- being analyzed, random individuals from classical population genetics, with brilliant ties unique to the Y chromosome. specific populations offer the possibility to novelties but also eager to get headlines. New ideas coming from genetics in hu- trace back their origin beyond the limitation In this context, two related papers in man evolution have had very different fates, of the genome region sampled. The goal this issue (1, 2) analyze an ample set of and some cases are remembered for having then may be the evolutionary reconstitution sequences of the Y chromosome to ad- proposed a paradigm shift strongly attacked of specific populations or, for a global sam- dress the issue of human origins. Their by paleoanthropologists but later shown to pling, the origin of our species. The under- main objective is to propose a novelty: the be correct. This was the case for the time

lying idea is very simple: if we are able to common origin of present is not depth of the hominid branch, as a separate COMMENTARY trace back the coalescence of genomic re- as old as previously believed, but should group from our close relatives, the chim- gions from an ample worldwide sample, we be much more recent, around 50,000 years panzee and the bonobo. In most cases the can infer the phylogeny of humans. The ago (ya), the age of coalescence of the Y process either to acceptance or rejection rationale can be sketched as follows: chromosomes surveyed (2) and in congru- goes with a lively debate in which scholars (i) The evolutionary dynamics of the ge- ence with an onset of expansion of around from very different disciplines enter the fray nome region is known in pattern and tempo. 30,000 ya (1). Moreover, both studies cor- with non-mutually intelligible languages. A (ii) From the extant variation the past roborate the existence of a substantial (or proposal like the present one (1, 2), stem- can be inferred and the coalescence pro- exponential) and an ming from genetics, has implications in sev- cess reconstructed; in some cases just part African origin for modern humans, in eral fields, from which specific clarifications of the genetic information is being used, accord with most other genetic data. may be asked: and analyses do not fully exploit the in- The newly proposed age is much younger (i) Evolutionary genetics: Are conclu- formation contained in the data. than dates consistent with nuclear and mi- sions about population fully supported or (iii) Translate the genetic process into a tochondrial DNA (mtDNA) (see references may there be a bias due to the genomic process of individuals that reproduce (the in ref. 2), where the minimum coalescent region under study? Does all of the ge- population), and the time and place of the age was obtained for mtDNA at around nome tell the same story? ancestral individual (carrying the ances- 150,000 years and much older for nuclear (ii) Mathematical genetics: May the tral genome) may be recognized. genes, except for the present case. Similar models used be safely applied to the real (iv) If the geographic distribution of results have been obtained through other world, apart from their theoretical ele- derivative genetic stages is known, the genetic approaches, such as short tandem gance? What are the implications of vio- expansion process may be dissected as repeat (STR, also called microsatellite) vari- lation of the theoretical assumptions? migratory waves and events. ation (3), where a figure of 156,000 years for (iii) Human paleontology: Is the evolu- (v) The structure of the genetic variation the deepest split in human phylogeny was tionary history of humans as interpreted may reveal demographic characteristics such obtained, or Alu insertions (4), with a pro- from the hard evidence (the fossils) com- as population size and subdivision as well as posed age of 137,000 years for the time of patible with the new genetic proposals? ancient dynamics, such as expansions. separation of African versus non-African (iv) Archaeology and Paleodemogra- This simple pathway is not easy. Infer- populations. phy: Since cultural innovations are at the ences from molecules to populations are not What seems more surprising is the dis- base of population expansions, are time straightforward, and there have been recur- crepancy with the results of nine diallelic and mode of cultural change correlated rent worries on what was being analyzed, polymorphic sites on the Y chromosome either the genes or genomic regions on one (5), where the analysis with the same meth- hand or the individuals, populations, or spe- ods as those in ref. 2 gave a figure of around See companion articles on pages 7354 and 7360. cies on the other. There have been worries 150,000 ya for the coalescence of the varia- *E-mail: [email protected].

PNAS ͉ June 20, 2000 ͉ vol. 97 ͉ no. 13 ͉ 6927–6929 Downloaded by guest on September 26, 2021 Table 1. Observed nucleotide diversity in coding and noncoding regions of autosomes that it is a footprint of a population expan- and X and Y chromosomes and values predicted for the Y chromosome by correcting sion, other phenomena could be involved, the values for autosomes and the X chromosome for the population size, assuming that such as a selective sweep or an uneven male and female effective population sizes are equal mutation rate along the nucleotides. Thus Diversity ϫ 10Ϫ4 genomic factors and population factors mimic each other in shaping genetic varia- Observed Predicted for Y tion, and only accurate and comparative Regions Autosomes X Y By autosomes By X analyses may solve ambiguities. There is still another alternative that Coding 2.91 1.96 0.52 0.73 0.65 could account for a discordance of results Noncoding 6.18 3.92 1.11 1.54 1.31 between Y chromosome age and other ge- netic results, as the history of the Y chro- mosome may be different from the history with their demographic consequences re- selection has not been important in hu- of the autosomes and mtDNA. Conflicting trieved through genetic data? man history, what is at the base of the patterns in the structure of genetic variation differentiation of populations is drift. To have been found between Y chromosome To Understand the Genome or the Popula- what extent can drift alone explain levels and mtDNA, that is, between paternal and tion? Take two good papers on evolution- of differentiation found among Y chro- maternal lineages (8, 9), which have been ary genetics, one on Drosophila and the mosomes in humans? For STRs (6) it was interpreted as a sex-specific migratory pat- other on humans. As an average, there is shown that differentiation between popu- tern, with higher mobility for females than a striking difference: while the Drosophila lations (as measured by Fst, a measure of males. Could a sex-specific characteristic paper tends to focus on the understanding genetic distance) and gene diversity within account for discrepancies in coalescence? A of the genomic region and the forces act- populations were comparable to those of possible explanation could be a lower effec- tive size for Y chromosomes than for ing on it, keeping demography and history autosomal STRs when corrections for the mtDNA and for autosomes. The latter is at a second level, the human paper tends smaller effective population size of the Y much less likely, as differences should be to address specific population questions, chromosome were taken into account. dramatic to show their effect, as autosomes considering that the mechanisms of ge- Nothing else was needed to explain the are in both males and females. But even for nome change are sufficiently well known 4-fold difference in the value of Fst be- mtDNA, differences do not seem to have for the purpose. As a whole, the ap- tween Y and autosomes and the 2-fold been important. Known causes of reduced proaches are very different; perhaps too difference in gene diversity. Thus there effective number of males, such as polygyny much so if we consider that the scope of was no need to invoke selection, as drift or higher male prereproductive mortality the problem is the same. Human studies may be a complete explanation for STR variation. caused by hunting or warfare (ref. 9 and have ignored until recently most of the references therein), seem to have had little strong development of molecular evolu- With the new sequence data (1) the sit- uation is similar. Under the infinite-site impact. Thus it does not seem plausible that tionary theory, and the incorporation of factors tied to male specificity of the Y scholars and methodologies into the study model and with random mating, it has been shown (7) that the mean of ␲ (nucleotide chromosome could be important in explain- of humans has improved the field and is ing the data. giving interesting new views. diversity or average number of nucleotide To what extent do the data support pop- differences between two sequences ran- ␪ Use of Complex Models. Coalescence theory ulation-related results rather than being a domly chosen from the population) is (a function of effective population size), and is providing extremely useful tools for the result of genomic properties of the region ␪ understanding of molecular variation. under analysis? Let us focus on the Y chro- thus the comparison between values should take into consideration the values of Nonetheless, some of the models are very mosome. Recombination adds complexity complex, and most geneticists do not feel to autosomal studies but it does not affect N for autosomes, Y chromosomes, and X chromosomes according to comfortable with black boxes in which to the nonrecombining region of the Y chro- pour data and extract powerful results, such mosome (NRY). The lack of recombination ϭ ͑ ͞ ͒ ϭ ͑ ͞ ͒ NY 1 4 Nau and NY 1 3 NX. as precise estimates of ages of each mutation has a drawback: wherever and however se- in the coalescent tree, including the age of lection acts, it will affect the whole NRY, as Results are given in Table 1. the most recent common ancestor. This is a result of positive selection (hitchhiking The differences between the values 0.73 the case of the interesting approach devel- effect or selective sweep) or of negative and 0.65 in relation to 0.52 on one hand and oped by R. C. Griffiths in the program selection (background selection). In both between 1.54 and 1.31 in relation to 1.11 on GENETREE (see references in ref. 2), for cases there is a reduction of nucleotide the other may be a footprint of selection, as which the robustness under violation of the- variation in linked sites, a result that could it may also be the sole fact of having a lower oretical assumptions should be better clari- also be achieved by a reduction of effective nucleotide diversity for coding regions. Al- fied. The problem may not be whether hu- population size. It affects the estimation of though present results do not allow measur- man populations have been of constant size the coalescent age. In fact, until recently the ing past selection in the Y chromosome, it or have followed a fixed exponential model, Y chromosome was thought to be extremely seems likely to have been of small impor- but heterogeneity in time and space of poor in genetic variation, and an easy ex- tance; explanations based on population growth patterns. planation could be postulated: positive se- structure and history seem to explain better The historical demography of humans is lection in a single gene may have created a the historical shaping of genetic diversity. much more complex than what all popula- dramatic genetic sweep, setting to zero the Perhaps some background selection will be tion genetics models assume; this is obvious amount of variation accumulated. The lack demonstrated, but it will have little impact and it does not add any value to the discus- of recombination makes the situation easily on the bulk of the structure of the genetic sion. What is more difficult to understand is affected by the action of positive selection. variation. the robustness of the methods to the real Exploring population size may help us A similar reasoning could be done with demographic history, with likely exponen- in recognizing the importance of selection the distribution of pairwise differences be- tial increases and stasis (or even decreases) when explaining genetic diversity levels. If tween sequences. While it is mainly assumed in a highly irregular pattern. Undoubtedly

6928 ͉ www.pnas.org Bertranpetit Downloaded by guest on September 26, 2021 both a global and a continental approach to and time fit the new hypothesis of a recent East at about the same time as in the the past are extreme oversimplifications. common origin. earliest sites in , around 40,000 ya, The well-demonstrated heterogeneity in The earliest well-characterized anatomi- but there is controversy about whether mtDNA composition of African popula- cally modern humans have been found both they have a common origin and where it tions is a clear example of the complex in Africa and in the Middle East, and they could be. The simple model of an origin in historical patterns. Genomic issues should are significantly older than findings from the Middle East and a spread to Europe also be clarified, as should the importance of other parts of the . Ages around has not been clearly demonstrated, even if uneven mutation rates along the nucleotides 100,000 ya seem well documented for mod- it could be true, as a general trend, for the or, for autosomal loci, the impact of recom- ern morphologies in Africa (Omo Kibish in expansion within Europe, mostly in its bination or gene conversion along the gene Ethiopia; Klasies River Mouth in South southern part. Genetic evidence support- genealogy. Africa; Border Cave, South Africa) and in ing this view was provided by mtDNA Another question is related to population the Middle East (Skhul and Qafzeh). For analysis (12, 13). The East evidence subdivision. No doubt that human popula- the rest of Eurasia dates are more recent: is too scarce to be considered here. tions, which for most of the past millennia less than 65,000 ya in China (Liujiang) and The evidence for Africa is puzzling. Al- have had extremely low densities and have much more recent in Southeast Asia, where though the classical view established both a occupied vast territories, have experienced seems to have persisted chronological and a technological parallel- extraordinary barriers to gene flow, and longer. In Europe, the first appearance is ism between cultural phases, Middle to subdivision at many levels has been an im- dated around 40,000 ya, and modern mor- Later Stone Ages and Middle to Upper portant genetic force. The distortion that phology seems to expand from east to west, Paleolithic, recent evidence points to more this may produce in coalescence estimations with earlier morphologies (i.e., Neander- elaborate technologies well within the Mid- should be analyzed from a theoretical per- thal) persisting in southern Iberia until dle Stone Age with chronologies compatible spective considering what is known or in- 27,000 ya (ref. 10 and references therein). with the oldest anatomically modern hu- ferred about real past populations. In summary, morphological evidence mans. Nonetheless, the correlation between The discrepancy with the analysis by M. F. supports the hypothesis of a single and Af- morphology and culture is not found in the Hammer et al. (5) is also puzzling. There is rican origin of modern humans, but at a date earliest modern Africans. We are still far a methodological difference that may be much older than the proposed 50,000 ya. from having a clear picture of the making of modern behavior or even which are the important: whereas in sequence analysis (2) But it is clear that, in a large part, the spread landmarks that would allow recognizing the variable positions appear while sequencing, of modern humans, first to Asia and Aus- initial steps. in a survey on polymorphisms defined be- tralia, later to Europe and America, took Cultural evidence tends to show that signs forehand there may be a lack of alleles at low place much later than the initial appearance of modernity appeared first in Africa, and frequencies (like singletons), which may af- of the morphology. Present morphological later developed in Eurasia (10), with a fect GENETREE behavior. In fact, its use with data do not exclude a later migration out of tempo and mode of dispersion that are clear previously defined polymorphisms may dis- Africa (except for the very special case of only for Europe, beginning 40,000 ya and tort the estimates. This should also be a the Middle East), but they do not support a likely to be associated with morphologically point of clarification. very recent origin. Perhaps multiple disper-

modern humans. A recent date for the or- COMMENTARY I would like to call upon theoretical sions from Africa took place at different igin of humans as proposed (2) does not population geneticists to test the robust- times and from different places (11); com- contradict the scarce existing evidence un- ness of the GENETREE estimates in the real plex models, as usual, could better accom- less the old tool industries in Africa are genetic and demographic world to allow modate the existing data. taken as evidence of modern behavior. The many users (more comfortable with mo- timing of the population expansion (1) at lecular than with theoretical challenges) Biology and Culture in Human Evolution. The 30,000 ya may fit the archaeological evi- to feel more secure in their inferences and evolutionary framework stemming from bi- dence, mainly for Europe. interpretations. ology (be it from genetic or morphological It has taken years to assemble some evidence) should fit into the archaeological pieces of the puzzle of human origins, within Do Bones and Genes Agree? The answer evidence. Modern human behavior seems the substitution model (with some variants) should be positive, as there is a single human to be associated with the shift from the and an African origin. A consensus between history. But controversy has often arisen Middle Paleolithic to the Upper Paleolithic genetic, morphological, and maybe even ar- between the fossil hard evidence and the (equivalent for most of Africa to the shift chaeological sources placed the origin of genetic evidence. There are still competing from Middle Stone Age to Later Stone modern humans at before 100,000 ya. A explanations for the origin of modern hu- Age), which does not seem to have taken much more recent date, as proposed (2) mans, even if the replacement hypothesis place before 40,000 ya. Some archeologists does not fit with the present interpretation with an out of Africa migration is accepted believe that this is a dramatic behavioral of the fossil evidence, but it could fit the by a wide majority of the scientific commu- shift and may be called a major revolution in cultural evidence. The detection of popula- nity. The paleontological data fit the model, human history. This cultural change could tion growth is well supported by the cultural although some cases (especially in Asia) account for a demographic expansion, likely evidence, and the timing of population ex- remain debatable. What seems clear is that related to the spread of modern humans in pansion at 30,000 ya could be related to the we should look for the earliest modern some regions. Upper Paleolithic transition and, in Europe, morphologies and see whether their place This transition took place in the Middle to the expansion of modern humans.

1. Shen, P., Wang, F., Underhill, P. A., Franco, C., Yang, Arcot, S. S., Saha, N., Jenkins, T., Tahir, M. A., Deininger, 9. Pe´rez-Lezaun, A., Calafell, F., Comas, D., Mateu, E., Bosch, E. W.-H., Roxas, A., Sung, R., Lin, A. A., Hyman, R. W., P. L. & Batzer, M. A. (1997) Genome Res. 7, 1061–1071. Martı´nez-Arias, R., Clarimon, J., Fiori, G., Luiselli, D., Facchini, Vollrath, D., et al. (2000) Proc. Natl. Acad. Sci. USA 97, 5. Hammer, M. F., Karafet, T., Rasanayagam, A., Wood, E. T., F., Pettener, D. & Bertranpetit, J. (1999) Am. J. Hum. Genet. 65, 7354–7359. Altheide, T. K., Jenkins, T., Griffiths, R. C., Templeton, A. R. & 208–219. 2. Thomson, R., Pritchard, J. K., Shen, P., Oefner, P. J. & Feldman, Zegura, S. L. (1998) Mol. Biol. Evol. 15, 427–441. 10. Lewin, R. (1999) Human Evolution. An Illustrated Intro- duction (Blackwell, Oxford). Proc. Natl. Acad. Sci. USA 97, M. W. (2000) 7360–7365. 6. Pe´rez-Lezaun, A., Calafell, F., Seielstad, M., Mateu, E., Comas, 11. Lahr, M. M. & Foley, R. (1994) Evol. Anthropol. 3, 48–60. 3. Goldstein, D. B., Ruiz Linares, A., Cavalli-Sforza, L. L. & D., Bosch, E. & Bertranpetit, J. (1997) J. Mol. Evol. 45, 265–270. 12. Comas, D., Calafell, F., Mateu, E., Pe´rez-Lezaun, A., Bosch, E. Feldman, M. W. (1995) Proc. Natl. Acad. Sci. USA 92, 6723– 7. Watterson, G. A. (1975) Theor. Pop. Biol. 7, 256–276. & Bertranpetit, J. (1997) Hum. Genet. 99, 443–449. 6727. 8. Seielstad, M. T., Minch, E. & Cavalli-Sforza, L. L. (1998) Nat. 13. Simoni, L., Calafell, F., Pettener, D., Bertranpetit, J. & Barbujani, 4. Stoneking, M., Fontius, J. J., Clifford, S. L., Soodyall, H., Genet. 20, 278–280. G. (2000) Am. J. Hum. Genet. 66, 262–278.

Bertranpetit PNAS ͉ June 20, 2000 ͉ vol. 97 ͉ no. 13 ͉ 6929 Downloaded by guest on September 26, 2021