VIEW Open Access Evaluating the Y Chromosomal Timescale in Human Demographic and Lineage Dating Chuan-Chao Wang1, M Thomas P Gilbert2, Li Jin1,3 and Hui Li1*
Total Page:16
File Type:pdf, Size:1020Kb
Evaluating the Y chromosomal timescale in human demographic and lineage dating Wang et al. Wang et al. Investigative Genetics 2014, 5:12 Wang et al. Investigative Genetics 2014, 5:12 http://www.investigativegenetics.com/content/5/1/12 REVIEW Open Access Evaluating the Y chromosomal timescale in human demographic and lineage dating Chuan-Chao Wang1, M Thomas P Gilbert2, Li Jin1,3 and Hui Li1* Abstract Y chromosome is a superb tool for inferring human evolution and recent demographic history from a paternal perspective. However, Y chromosomal substitution rates obtained using different modes of calibration vary considerably, and have produced disparate reconstructions of human history. Here, we discuss how substitution rate and date estimates are affected by the choice of different calibration points. We argue that most Y chromosomal substitution rates calculated to date have shortcomings, including a reliance on the ambiguous human-chimpanzee divergence time, insufficient sampling of deep-rooting pedigrees, and using inappropriate founding migrations, although the rates obtained from a single pedigree or calibrated with the peopling of the Americas seem plausible. We highlight the need for using more deep-rooting pedigrees and ancient genomes with reliable dates to improve the rate estimation. Keywords: Y chromosome, Substitution rate, Demographic history, Lineage dating Introduction from human-chimpanzee comparisons [6,7], the genea- The paternally inherited Y chromosome has been widely logical rate observed in a deep-rooting pedigree [8], the applied in anthropology and population genetics to better rate adjusted from autosomal mutation rates [9], and describe the demographic history of human populations the rates based on archaeological evidence of founding [1]. In particular, Y chromosomal single nucleotide poly- migrations [10,11]. The choice of which kind of mutation morphisms (SNP) have been demonstrated one of the rate to be used in Y chromosome dating is controversial, useful markers, thus have been widely used in genetic since different rates can result in temporal estimates that diversity studies over the last two decades [1]. One of deviate several-fold. To address the above concern, we the most important links between genetic diversity and review how substitution rate and date estimates are human history is time, for instance, the time when a affected by the choice of different calibration points. lineage originated or expanded, or when a population split from another and migrated. In this regard, molecular clock theory has provided an approach to build bridges Review between genetics and history. Specifically, under the as- Y chromosome base-substitution rate measured from sumption of substitution rate among lineages is constant, human-chimpanzee comparisons et al. Y chromosomal molecular clocks have been used to esti- In 2000, Thomson screened three Y chromosome SMCY DFFRY DBY mate divergence times between lineages or populations genes ( , ,and ) for sequence variation in a [2-4]. Although this approach is widely accepted and used, worldwide sample set, using denaturing high-performance there is still ongoing debate about the most suitable sub- liquid chromatography (DHPLC) [6]. In order to infer the stitution rate for demographic and lineage dating [5]. In ages of major events in the phylogenetic trees, they had to particular, there are several popularly used Y chromosomal first estimate the Y chromosome base-substitution rate. substitution rates, such as the evolutionary rates measured This they obtained by dividing the number of substitutions differences between a chimpanzee and human sequence * Correspondence: [email protected] over the relevant regions, by twice an estimated human- 1 State Key Laboratory of Genetic Engineering and MOE Key Laboratory of chimpanzee split time (5 million years) resulting in a substi- Contemporary Anthropology, School of Life Sciences, Fudan University, −9 Shanghai 200438, China tution rate of 1.24 × 10 per site per year (95% confidence Full list of author information is available at the end of the article interval (CI) was not given in [6]). Using this rate, they were © 2014 Wang et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Wang et al. Investigative Genetics 2014, 5:12 Page 2 of 7 http://www.investigativegenetics.com/content/5/1/12 subsequently able to calculate the time of the Y chromo- under the assumption that the generation time is 30 years. somal spread out of Africa to approximately 50 thousand It is notable that this pedigree-based estimate overlaps with years ago (kya) [6]. A weakness of this approach was the evolutionary rates estimated from human and chim- that the sum of the lengths of the three genes was rela- panzee comparisons. For pedigree-based substitution rate tively small - at 64,120 base pairs (bp) it represented estimation, there are at least two criteria to be taken into just a fraction of the total Y chromosome. Kuroki et al. careful consideration. First, thepedigreemustbebiologic- attempted to address this in 2006, by sequencing nearly ally true and the generation information validated. The 13 Mb (more than 20% of the whole chromosome) of the pedigree used by Xue et al. is a Chinese family carrying the male-specific region of the chimpanzee Y chromosome. DFNY1 Y-linked hearing-impairment mutation. The same − Their analysis yielded a slightly higher rate, at 1.5 × 10 9 Y-linked disease-related mutation has validated the authen- (assuming that the generation time is 30 years, 95% CI: ticity of their genealogy. Second, the detected mutations − − 7.67 × 10 10-2.10 × 10 9), despite also using a chimpanzee- must be true. In this regard, Xue et al. used a variety of human calibration time that was 20% older than the methods to verify the candidate mutations, thus validity of previous study (6 million years) [7]. the rate: The Y chromosomes of the two individuals were What is hopefully clear from the above, is that although sequenced to an average depth of 11× or 20×, respectively, direct comparisons of human and chimpanzee Y chromo- thus mitigating the possibility of sequencing and assem- somes offer us a powerful means to better understand the bling errors; they also reexamined the candidate mutations evolutionary process in our sex chromosomes during using capillary sequencing. the past 5 to 6 million years, the process is clearly sus- This pedigree-based rate has been widely used in Y ceptible to a number of assumptions that need to be chromosome demographic and lineage dating. Cruciani made. First, there is uncertainty over the exact timing et al. [2] applied this rate to get an estimate of 142 kya to of the human-chimpanzee divergence, as fossil records the coalescence time of the Y chromosomal tree (including and genetic evidence have given a range of 4.2 to 12.5 haplogroup A0). Wei et al. [3] also used this substitution million years ago [12]. Second, extreme structural diver- rate to estimate the time to the most recent common gence between the human Y chromosome and that of the ancestor (TMRCA) of human Y chromosomes (haplogroups chimpanzee makes it difficult to do precise alignment. A1b1b2b-M219 to R) as 101 to 115 kya, and dated the The possible ascertainment bias and reference bias in data lineages found outside Africa to 57 to 74 kya. Rootsi et al. analysis might affect rate estimation. Third, it is not even [4] used this rate to estimate the age of R1a-M582 as 1.2 clear that the human and chimpanzee Y chromosomes to 4 kya, suggesting the Near Eastern rather than Eastern are even evolving under the same selective pressures. European origin of Ashkenazi Levites. Specifically, the chimpanzee Y chromosome might be Although this pedigree-based substitution rate is widely subject to more powerful selection driven by fierce sperm accepted, some concerns have also been raised. First, the competition since the split of human and chimpanzee [13], mutation process of Y chromosome is highly stochastic, which will accelerate the mutation rate in the chimpanzee and the rate based on a single pedigree and only four lineage. Therefore, some concerns have been raised over mutations might not be suitable for all the situations. whether the evolutionary rate based on human-chimpanzee For instance, the haplogroup of the pedigree used in rate divergence is consistent with the rate measured within estimation of Xue et al. is O3a; however, other haplogroups human species or whether it can be used in human probably have experienced very different demographic population demographic and paternal lineage dating. history and selection process, and might have different Given the above, a variety of other methods have been substitution rates as compared with haplogroup O3a. proposed, including Y chromosome base-substitution rate Second, the substitution rate was estimated using two measured in a deep-rooting pedigree, adjusted from auto- individuals separated only 13 generations, thus, the ques- somal mutation rates, and based on archaeological evidence tion is whether the substitution rate estimated at relatively of founding migrations. We address each of these in turn. short time spans could be used in long-term human popu- lation demographic analysis without considering natural Y chromosome base-substitution rate measured in a selection and genetic drift. Actually, many studies have deep-rooting pedigree noted that molecular rates observed on genealogical In 2009, Xue et al. [8] sequenced Y chromosomes of timescales are greater than those measured in long- two individuals separated by 13 generations using second term evolution scales [14]. generation paired-end sequencing methodology.