Detecting Natural Selection in Genomic Data

Joseph J. Vitti,1,2 Sharon R. Grossman,2,3,4 and Pardis C. Sabeti1,2

1Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138; email: [email protected], [email protected]
2Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142
3Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02115
4Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Annu. Rev. Genet. 2013. 47:97–120

The Annual Review of Genetics is online at genet.annualreviews.org

This article’s doi: 10.1146/annurev-genet-111212-133526

Copyright © 2013 by Annual Reviews. All rights reserved

Keywords

population genetics, adaptation, selective sweeps, genome scans, evolutionary genomics

Abstract

The past fifty years have seen the development and application of numerous statistical methods to identify genomic regions that appear to be shaped by natural selection. These methods have been used to investigate the macro- and microevolution of a broad range of organisms, including humans. Here, we provide a comprehensive outline of these methods, explaining their conceptual motivations and statistical interpretations. We highlight areas of recent and future development in evolutionary genomics methods and discuss ongoing challenges for researchers employing such tests. In particular, we emphasize the importance of functional follow-up studies to characterize putative selected alleles and the use of selection scans as hypothesis-generating tools for investigating evolutionary histories.

INTRODUCTION

As humans and other organisms moved to inhabit every part of the world, they were exposed to myriad new environments, diets, and pathogens, and forced to adapt, leading to the great diversity we observe today. Uncovering the mechanism of this diversification has for years fascinated scientists and nonscientists alike. In 1858, Darwin and Wallace gave grounds for species evolution when they articulated the principle of natural selection, the idea that beneficial traits (those that improve an individual’s chances to survive and reproduce) tend to become more frequent in populations over time.

Scientists have continued to search for evidence of evolution and for the specific adaptations that underlie it. Animal and plant breeders were some of the first to identify traits that are evolving, as they witnessed dramatic changes in their stock through artificial selection. Haldane uncovered the first adaptive trait in humans when he observed that many diseases of red blood cells seemed to be distributed in regions where malaria was endemic (48). Haldane’s malaria hypothesis was confirmed by Allison a few years later, when he demonstrated that the sickle cell mutation in the Hemoglobin-B gene (HBB) was the target of selection for malaria resistance (4).

The ability to assess evidence for selection at the genetic level represented a breakthrough for this pursuit. Computational analysis of population genetic data sets provides a statistically rigorous way to infer the action of natural selection; in this way, the field of evolutionary genetics represents an antidote to the preponderance of speculative just-so stories that some biologists have lamented (42). Moreover, it demonstrates the full realization of the modern synthesis: Darwinian concepts of selection have been rendered quantitative and measurable in real populations, thanks to methodological and technological advances (1).

Through evolutionary genetics, many adaptive traits have been elucidated, from lactase persistence and skin pigmentation in humans (90, 125) to coat color in field mice (81) to armored plates in stickleback fish (64). These instances were all identified using a forward genetics approach, in which a phenotype was first hypothesized to be adaptive and the underlying loci were then identified. With ongoing advancements in genomic technology, we can now go further, from testing evidence for selection on putative adaptive traits to uncovering candidate genetic regions through genome scans. This transition from hypothesis-testing to hypothesis-generating science has been made possible both by the new data (e.g., genome sequences from increasing numbers of species and genome-wide variation data) and by increasingly sophisticated tools that allow us to make sense of this deluge of data and to fine-map evidence of selection to individual candidate variants.

Identifying such candidates is significant not only because they demonstrate evolution and shed light on species histories but also because they represent biologically meaningful variation. Given that selection operates at the level of the phenotype, alleles showing evidence of selection are likely to be of functional relevance. Thus, alleles implicated in selection studies are often linked either to resistance to infectious diseases, as pathogens are believed to represent one of the strongest selective pressures acting on humans (40), or to noninfectious genetic diseases, such as those associated with autoimmune diseases or metabolic disorders (54).

Further breakthroughs in genomic annotation, genome manipulation technology, and high-throughput molecular biology are beginning to allow researchers to progress from candidate variants to functionally elucidated instances of evolution. Taken together, all of these advancements present a path to realizing the full potential of evolutionary genomics in shedding light on species histories and uncovering biologically meaningful variation.

Modes of Selection

Natural selection is based on the simple observation that fitness-enhancing traits, i.e., those that improve an organism’s chance of survival or reproductive success in its environment, are more likely to be passed on to that organism’s offspring and therefore increase in prevalence in the population over time. In the genomic era, selection refers to any nonrandom, differential propagation of an allele as a consequence of its phenotypic effect. There are many specific modes of selection that have been described, some of which share conceptual overlap, and some of which are referred to by multiple names. In this section, we briefly define the different modes of selection that we employ in our discussion (85).

Most simply, selection may act in a directional manner, in which an allele is favored and so propagated (positive selection) or disfavored (negative selection, also called purifying selection). Random mutations are more likely to be deleterious than beneficial, so many novel alleles are immediately subject to negative selection and become removed from the gene pool before they can achieve detectable frequency within the population. This ongoing removal of deleterious mutations is a form of negative selection referred to as background selection. In genetic regions under strong background selection, mutations are quickly removed from the gene pool, resulting in highly conserved stretches of the genome (i.e., regions where variation is not observed).

More subtle configurations of positive and negative selection give rise [...] trend is often further described as diversifying or disruptive selection. By contrast, when intermediate phenotypic values are favored, whether by balancing selection of codominant alleles or by positive selection of alleles that underlie intermediate phenotypes, the trend is called stabilizing selection.

This diversity of modes of selection notwithstanding, much research in recent years has focused on the development of genomic methods to identify positive selection. One reason for this emphasis on positive selection is practical: Whereas negative selection is primarily observable in highly conserved regions and balancing selection’s effect on the genome is often subtle, positive selection leaves a more conspicuous footprint on the genome that can be detected using a number of different approaches. Another reason for the interest in positive selection is theoretical: Positive selection is understood to be the primary mechanism of adaptation (i.e., the genesis of phenotypes that are apt for a specific environment or niche), which in turn poses great theoretical interest to researchers (1).

Here, we discuss the various approaches that have been used to identify positive selection while also indicating the ways that these methods may be used to detect and classify instances of other modes of selection (Table 1). These approaches typically use summary statistics to compare observed data with expectations [...]

Heterozygote advantage: a trend in which the fitness of a heterozygote is greater than that of either homozygote. Also referred to as overdominance.

Frequency-dependent selection: a trend in which the fitness of a given genotype is correlated with its prevalence in the population (e.g., if an allele is advantageous when it is rare).

Codominance: condition in which multiple alleles are dominant; the heterozygote expresses phenotypes associated with both alleles.
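The modes of selection defined above (positive selection, negative selection, and heterozygote advantage) can be illustrated with a small numerical sketch. The Python example below is not taken from the article. It uses the standard textbook one-locus, two-allele recursion and assumes an infinitely large, randomly mating population with no drift, mutation, or migration. The function names (`next_frequency`, `trajectory`) and all fitness values are hypothetical and chosen only for illustration. It shows how an allele's frequency responds to each fitness scheme; it is not a method for detecting selection in data.

```python
def next_frequency(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of selection.

    Assumes genotypes are in Hardy-Weinberg proportions before selection
    (random mating, infinite population), so drift is ignored.
    """
    q = 1.0 - p
    # Mean fitness of the population, weighted by genotype frequencies.
    mean_fitness = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    # Allele A is carried by AA homozygotes and by half of the heterozygotes.
    return p * (p * w_AA + q * w_Aa) / mean_fitness


def trajectory(p0, w_AA, w_Aa, w_aa, generations):
    """Return the list of allele A frequencies from generation 0 onward."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_AA, w_Aa, w_aa))
    return freqs


if __name__ == "__main__":
    # Illustrative fitness values (w_AA, w_Aa, w_aa); not from the article.
    scenarios = {
        "positive selection on A": (0.01, (1.10, 1.05, 1.00)),
        "negative selection on A": (0.50, (0.90, 0.95, 1.00)),
        "heterozygote advantage": (0.01, (0.90, 1.00, 0.80)),
    }
    for name, (p0, fitness) in scenarios.items():
        final = trajectory(p0, *fitness, generations=500)[-1]
        print(f"{name}: p0={p0:.2f} -> p={final:.4f}")
```

Under these assumptions, a favored allele rises toward fixation, a disfavored allele is driven toward loss, and heterozygote advantage holds both alleles at an intermediate equilibrium. That retention of variation is one reason the genomic footprint of balancing selection differs from that of a directional sweep.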
