Whole-Genome Sequencing of Giant Pandas Provides Insights Into Demographic History and Local Adaptation
LETTERS

Shancen Zhao1,2,10, Pingping Zheng1,3,10, Shanshan Dong2,10, Xiangjiang Zhan1,10, Qi Wu1,10, Xiaosen Guo2, Yibo Hu1, Weiming He2, Shanning Zhang4, Wei Fan2, Lifeng Zhu1, Dong Li2, Xuemei Zhang2, Quan Chen2, Hemin Zhang5, Zhihe Zhang6, Xuelin Jin7, Jinguo Zhang8, Huanming Yang2, Jian Wang2, Jun Wang2,9 & Fuwen Wei1

1Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China. 2Shenzhen Key Laboratory of Transomics Biotechnologies, BGI-Shenzhen, Shenzhen, China. 3College of Life Sciences, University of the Chinese Academy of Sciences, Beijing, China. 4China Wildlife Conservation Society, Beijing, China. 5China Conservation and Research Center for the Giant Panda, Wolong, China. 6Chengdu Research Base of Giant Panda Breeding, Chengdu, China. 7Shaanxi Wild Animal Rescue and Research Center, Louguantai, Xi’an, China. 8Beijing Zoo, Beijing, China. 9Department of Biology, University of Copenhagen, Copenhagen, Denmark. 10These authors contributed equally to this work. Correspondence should be addressed to F.W. ([email protected]) or Jun Wang ([email protected]).

Received 13 March; accepted 14 November; published online 16 December 2012; doi:10.1038/ng.2494

NATURE GENETICS ADVANCE ONLINE PUBLICATION

© 2012 Nature America, Inc. All rights reserved.

The panda lineage dates back to the late Miocene1 and ultimately leads to only one extant species, the giant panda (Ailuropoda melanoleuca). Although global climate change and anthropogenic disturbances are recognized to shape animal population demography2,3, their contribution to panda population dynamics remains largely unknown. We sequenced the whole genomes of 34 pandas at an average 4.7-fold coverage and used this data set together with the previously deep-sequenced panda genome4 to reconstruct a continuous demographic history of pandas from their origin to the present. We identify two population expansions, two bottlenecks and two divergences. Evidence indicated that, whereas global changes in climate were the primary drivers of population fluctuation for millions of years, human activities likely underlie recent population divergence and serious decline. We identified three distinct panda populations that show genetic adaptation to their environments. However, in all three populations, anthropogenic activities have negatively affected pandas for 3,000 years.

We carried out whole-genome resequencing of 34 wild giant pandas (Fig. 1a and Supplementary Table 1). This sample constitutes ~2% of the current estimates of the entire wild panda population5, the highest percentage of individuals assessed for existing animal population genomics studies. Genome alignment indicated an average of 91.5% sequencing coverage and 4.7-fold depth for each individual relative to the panda’s 2.25-Gb genome4. To improve SNP inference quality, we estimated the probabilities of individual genotypes and population allele frequencies for each site6 and identified a total of 13,020,055 SNPs with ≥99% probability of being variable over the panda population.

We inferred three distinct genetic clusters, Qinling (QIN), Minshan (MIN) and Qionglai-Daxiangling-Xiaoxiangling-Liangshan (QXL), among the current panda population using frappe7, Admixture8 and an allele-shared matrix (Online Methods) (Fig. 1 and Supplementary Fig. 1). Previous studies only showed a distinct QIN cluster9; our larger study revealed that the MIN and QXL populations were also genetically distinct. We found no population substructure present in the QIN or MIN population but detected two subpopulations within the QXL population (K = 4; Fig. 1b and Supplementary Fig. 1): one comprising Xiaoxiangling and some Qionglai individuals and the other comprising Daxiangling, Liangshan and the remaining Qionglai individuals. The fixation index (FST)10 strongly supported this three-population stratification (Supplementary Table 2). Principal-components analysis (PCA)11 provided additional corroborative evidence. The first eigenvector separated these three genetic populations (P < 0.05) (Fig. 1c and Supplementary Fig. 2). The second eigenvector indicated that the Liangshan population was separate from the other populations, but this assignment was ambiguous because of limited sampling in the Liangshan population (n = 2 individuals). Overall, the three populations showed similar genetic diversity (1.04–1.30 × 10^−3 for Watterson’s estimator (θw) and 1.13–1.37 × 10^−3 for the average pairwise diversity within populations (θπ); Supplementary Table 3) as humans12, confirming the results from a study using ten microsatellite loci that indicated that the panda has substantial genetic variability9.

To reconstruct the demographic history of the giant panda, we used the pairwise sequentially Markovian coalescent (PSMC) model13 to examine changes in the local density of heterozygotes across the panda genome4. PSMC analysis showed a well-defined demographic history from 8 million to 20,000 years ago (Fig. 2a), a period covering the chronological distribution of three fossil panda species or subspecies (primal panda Ailurarctos lufengensis, pygmy panda Ailuropoda microta and baconi panda Ailuropoda melanoleuca baconi)1,14. Considering the time since the origin of the panda, demography showed population peaks at ~1 million years ago and ~40,000 years ago and population bottlenecks at ~0.2 million years ago and ~20,000 years ago (Fig. 2a). Notably, we found that these fluctuations in effective population size (Ne) were significantly negatively correlated with changes in the amount of atmospheric dust, as inferred by the mass accumulation rate (MAR) of Chinese loess15 (Pearson’s correlation R = −0.30, P < 0.05), an index indicating cold and dry or warm and wet climatic periods in China.

The first population expansion coincided with a dietary switch to bamboo ~3 million years ago when pygmy pandas emerged16. Fossil evidence indicates that the earliest (primal) pandas were omnivores or carnivores, living in swamp habitats lacking bamboo1, whereas pygmy pandas mainly ate bamboo, as indicated by their specialized cranial and dental adaptations16,17. This hypothesis is supported at the molecular level by the concurrent pseudogenization of the umami taste gene Tas1r1 associated with the pandas’ decreased reliance on meat18. The low levels of MAR during that time (Fig. 2a) indicate warm and wet weather conditions, which were ideal for the spread of bamboo forests.

The panda population declined around 0.7 million years ago, and the first bottleneck occurred about 0.2 million years ago (Fig. 2a), around the same time as the two largest Pleistocene glaciations in China, the Naynayxungla Glaciation (0.78–0.50 million years ago) and the Penultimate Glaciation (0.30–0.13 million years ago)19. Additionally, fossil evidence indicated that, from ~0.75 million years ago, the pygmy panda had been replaced by the subspecies A. melanoleuca baconi, which has the largest body size of all the panda species14.

[Figure 1 image: (a) map of the Qinling, Minshan, Qionglai, Daxiangling, Xiaoxiangling and Liangshan Mountains with study regions, sampling sites, rivers and a scale in kilometers; (b) ancestry plots for K = 2 to K = 7 across individual pandas (IDs prefixed GP), grouped as QIN, MIN and QXL; (c) principal component 1 plotted against principal component 2; (d) tree of individual pandas with the polar bear as outgroup and a scale bar of 0.05.]

Figure 1 Current geographic populations of the giant panda and inferred genetic populations. (a) Sampling sites and genetic structure detected by frappe analysis (K = 3 populations) were mapped using ArcGIS v9.2 on the basis of the proportion of an individual’s ancestry attributed to a given population. The genetic QIN population is shown in red, the MIN population is shown in yellow, and the QXL population is shown in green. Inset, the shaded area represents current panda habitats. (b) Genetic populations of the studied pandas inferred by frappe analysis. The number of populations (K) was predefined from 2 to 7. Symbols following each panda ID indicate where sampling occurred. (c) Results obtained from PCA using autosomal SNPs. Principal components 1 and 2 are shown. (d) A rooted neighbor-joining tree constructed from the allele-shared matrix of SNPs among the wild pandas, with the polar bear as an outgroup. The scale bar represents the p distance.

Figure 2 Demographic history of the giant panda reconstructed from the reference and population resequencing genomes.
[Figure 2a plot: effective population size (× 10^4) and MAR of Chinese loess against years before the present (10^4 to 10^7).]
(a) PSMC result showing demographic history from the panda’s origin to 10,000 years ago. The red line represents the estimated effective population size (Ne), and the 100 thin blue curves represent the PSMC estimates for 100 sequences randomly resampled from the original sequence. The brown line shows the MAR of Chinese loess15. Generation time (g) = 12 years, and neutral mutation rate per generation (µ) = 1.29 × 10^−8. The approximate chronological ranges of three fossil panda species or subspecies (primal, pygmy and baconi panda) are shaded in pink, orange and blue, respectively. Note that PSMC simulation cannot detect population changes more recent than 20,000 years ago. (b) ∂a∂i result showing the demographic history of the panda from ~300,000 years ago to the present.
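
The generation time (g) and mutation rate (µ) quoted for Figure 2a are the values that convert PSMC's scaled output into calendar years and effective population sizes. The sketch below illustrates that conversion under the conventional PSMC scaling (N0 = θ0 / (4µs), years = 2·N0·t·g, Ne = N0·λ). The bin size s = 100 bp is an assumption, since the text does not state it. The function name and the θ0, t and λ inputs are hypothetical illustration values, not results from the paper.

```python
import math

G_YEARS = 12.0    # generation time (g) from the Figure 2 caption
MU = 1.29e-8      # neutral mutation rate per generation (µ) from the caption
BIN_SIZE = 100    # ASSUMPTION: PSMC bin size in bp; not stated in the text


def rescale_psmc(theta0, scaled_times, scaled_sizes,
                 mu=MU, g=G_YEARS, bin_size=BIN_SIZE):
    """Convert scaled PSMC output to years and effective population size.

    theta0       -- per-bin scaled mutation rate reported by PSMC
    scaled_times -- interval boundaries in units of 2*N0 generations
    scaled_sizes -- relative population sizes (lambda) per interval
    """
    # Reference effective size implied by theta0 = 4 * N0 * mu * bin_size.
    n0 = theta0 / (4.0 * mu * bin_size)
    # Scaled time counts 2*N0 generations; multiply by g to get years.
    years = [2.0 * n0 * t * g for t in scaled_times]
    # Each lambda is a population size relative to N0.
    ne = [n0 * lam for lam in scaled_sizes]
    return n0, years, ne


if __name__ == "__main__":
    # Made-up inputs chosen only so the arithmetic is easy to follow.
    n0, years, ne = rescale_psmc(0.0516, [0.1, 1.0], [2.0, 0.5])
    for y, n in zip(years, ne):
        print(f"{y:,.0f} years ago: Ne ~ {n:,.0f}")
```

Because both axes scale inversely with µ, and time also scales linearly with g, a different choice of either parameter shifts the whole curve rather than changing its shape. This is why the caption reports both values alongside the plot.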