A Genomic View of the Peopling and Population Structure of India
Total Page:16
File Type:pdf, Size:1020Kb
Downloaded from http://cshperspectives.cshlp.org/ on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press A Genomic View of the Peopling and Population Structure of India Partha P. Majumder and Analabha Basu National Institute of Biomedical Genomics, Kalyani 741251, India Correspondence: [email protected] Recent advances in molecular and statistical genetics have enabled the reconstruction of human history by studying living humans. The ability to sequence and study DNA by cali- brating the rate of accumulation of changes with evolutionary time has enabled robust inferences about how humans have evolved. These data indicate that modern humans evolved in Africa about 150,000 years ago and, consistent with paleontological evidence, migrated out of Africa. And through a series of settlements, demographic expansions, and further migrations, they populated the entire world. One of the first waves of migration from Africa was into India. Subsequent, more recent, waves of migration from other parts of the world have resulted in India being a genetic melting pot. Contemporary India has a rich tapestry of cultures and ecologies. There are about 400 tribal groups and more than 4000 groups of castes and subcastes, speaking dialects of 22 recognized languages belonging to four major language families. The contemporary social structure of Indian populations is characterized by endogamy with different degrees of porosity. The social structure, possibly coupled with large ecological heterogeneity, has resulted in considerable genetic diversity and local genetic differences within India. In this essay, we provide genetic evidence of how India may have been peopled, the nature and extent of its genetic diversity, and genetic structure among the extant populations of India. OUR DNA IS A PALIMPSEST OF about the evolutionary past. Considerable hu- OUR HISTORY man DNA sequence data have been obtained to date. These have been used to study our diver- odern biology has provided powerful tools sity and investigate how humans who reside Mfor reconstructing the history of the earth in different geographical regions, or belong to and its inhabitants, including humans. Central distinct cultural groups, are genetically related. to this development has been the ability to study This is zoology and anthropology at the molec- the genetic material of organisms, DNA. Scien- ular level, and then the biologist turns into an tists can extract DNA from microbes, plants, historian. animals, and humans, including their fossil- The data generated by the human genome ized remains. We can sequence DNA and ex- project indicate that two humans, selected ran- tract valuable information from their sequences domly, are genetically 99.9% identical in their Editor: Aravinda Chakravarti Additional Perspectives on Human Variation available at www.cshperspectives.org Copyright # 2014 Cold Spring Harbor Laboratory Press; all rights reserved Advanced Online Article. Cite this article as Cold Spring Harb Perspect Biol doi: 10.1101/cshperspect.a008540 1 Downloaded from http://cshperspectives.cshlp.org/ on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press P.P. Majumder and A. Basu DNA sequences. Geneticists who study human abundant to form new subpopulations. Because genomic diversity therefore intensively focus these subgroups may have been small, each their studies on the tiny fraction (0.1%) of subgroup carried with them not the complete our genome in which we differ. This tiny frac- catalog, but only a fractional sample of the ge- tion contains very rich information pertaining nomes present in the original population. This to our origins and diversity. Because the human creates genomic diversity among the subpopu- genome comprises about three billion nucleo- lations. Furthermore, when subgroup sizes are tides, this tiny fraction (0.1%) corresponds to small, demographic bottlenecks are created that about three million nucleotide differences. would have exacerbated genomicdiversityamong Normally, only a fraction of the genetic var- the subpopulations. With the passage of time, iation of a large population of individuals is subpopulations of an ancestral population di- represented in a subset of those individuals. In verge from one another. Despite these pressures other words, any subset of individuals of a larg- and processes that enhance genomic differences er set is genetically more homogeneous than the among subpopulations of an ancestral popu- larger set. Residents of a restricted geographi- lation, some core genomic “signatures” are re- cal region are usually descendants of a small tained in the subpopulations. Thus, by study- ancestral group and, therefore, often have lim- ing the genome of individuals of contemporary ited genetic variation among them. However, subpopulations, it is possible to reconstruct their genetic variation between two ancestral groups ancestries and ancestral relationships among the is larger. Thus, descendants of a single ancestral subpopulations. group are genetically more similar than descen- After subpopulations emerge from an an- dants from different ancestral groups. There- cestral population, if they remain isolated, that fore, by studying the patterns of variation in is, if no mate-exchange or admixture takes place the genomes of contemporary individuals, res- among them, they evolve independently. New ident in one or more geographical regions, it is mutations arise in each subpopulation. These possible to reconstruct their ancestral affilia- mutations remain confined, because of lack of tions. The differences in DNA sequence in the admixture, to the subpopulations in which they tiny (0.1%) “variable” fraction of our genome have arisen. Therefore, over time, genomic di- also holds the key to why some individuals are versity is even more enhanced among the sub- susceptible to, whereas others are protected populations. from, a specific disease. However, these disease- Admixture, or the exchange of genes, allows associated sequence differences, like those indic- a new genetic variation introduced by mutations ative of our historical origins, are not clustered, to move from one subpopulation to another. but spread across the entire genome. Thus, admixture increases genetic similarities Inferring human population history rests on or affinities among subpopulations. The pri- a simple reality. New population groups arise mary barriers to admixture are cultural dif- or evolve from pre-existing groups. A population ferences, linguistic differences, and geographi- splits into subpopulations (population “fis- cal distance. In general, unless there has been sion”) because of various cultural and demo- large-scale admixture on a continued basis, the graphic reasons and forces. In the past, when longer two populations have been separated, we were predominantly dependent on natural the larger the genetic distance between them. resources for our survival, increase of the nu- Genetic distance, therefore, is a useful “clock” merical size of a population implied that there by which we can date evolutionary history. It was pressure on natural resources. This pressure must, however, be emphasized that various fac- impinged on the survival of members of that tors, especially natural selection, can, over time, group. Therefore, members of an expanding increase or decrease the speed at which the clock population would possibly have formed sub- ticks, thereby causing significant differences be- groups and moved away to new geographical tween the actual and estimated times of evolu- locations in which natural resources were more tionary events (Barreiro et al. 2008). 2 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Biol doi: 10.1101/cshperspect.a008540 Downloaded from http://cshperspectives.cshlp.org/ on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press Genomic View of Peopling and Population of India Anatomically, modern humans (Homo sa- DNA changes is younger than another with a piens sapiens) evolved in Africa about 150,000 larger number of accumulated changes. If the years ago and moved to other geographical re- rate of accumulation of change per year remains gions between 125,000 and 60,000 years ago. approximately constant, then one can estimate There is now abundant evidence that human the time of accumulation from the observed genomic diversity is greater in Africa than other number of changes. (Usually, such estimates global regions, indicating that Africa is likely to are quite rough because of random fluctuations have been the source of subsequent human dis- in accumulation of changes in different regions persals (Campbell and Tishkoff 2008). of the genome. Therefore, only a broad range of Migration is not always symmetric with re- time can usually be ascribed.) The extant vari- spect to gender; men appear to be more mi- ation in our genomes thus carries footprints of gratory than women. The genetic consequences the history of our species. Additionally, the pat- of such gender asymmetry of migration can be tern with which these mutations and polymor- revealed using genomic variations that are sole- phisms (mutations that increase to high fre- ly transmitted by either fathers or mothers, or quencies in a population) accumulate in our transmitted to daughters or sons. Accounting genomes is indicative of the dynamics of our for such gender differences is crucial because population history (Nordborg 1997; Kingman the dates of evolutionary events, such as migra- 2000; Rosenberg and