Population Genetics Models of Common Diseases Anna Di Rienzo

The number and frequency of susceptibility alleles for common diseases are important factors to consider in the efficient design of disease association studies. These quantities are the results of the joint effects of mutation, genetic drift and selection. Hence, population genetics models, informed by empirical knowledge about patterns of disease variation, can be used to make predictions about the allelic architecture of common disease susceptibility and to gain an overall understanding about the evolutionary origins of such diseases. Equilibrium models and empirical studies suggest a role for both rare and common variants. In addition, increasing evidence points to changes in selective pressures on susceptibility genes for common diseases; these findings are likely to form the basis for further modeling studies.

Addresses
Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA

Corresponding author: Di Rienzo, Anna ([email protected])

Current Opinion in Genetics & Development 2006, 16:630–636

This review comes from a themed issue on Genomes and evolution
Edited by Chris Tyler-Smith and Molly Przeworski

Available online 19th October 2006

0959-437X/$ – see front matter
© 2006 Elsevier Ltd. All rights reserved.

DOI 10.1016/j.gde.2006.10.002

Introduction
Alleles that contribute to the genetic susceptibility to human diseases are the focus of intense attention in human genetics. They can be studied by a variety of approaches (e.g. linkage analysis and case-control studies) that aim to identify their role in the etiology of clinically defined phenotypes. Disease alleles are also a specific subset of all genetic variation present in human populations. As such, their fate is influenced jointly by the history and strength of the selective pressures acting in human populations as well as by demographic history, including population size changes and geographic structure. Therefore, the properties of disease alleles can also be investigated by population genetics modeling based on evolutionary principles. An important difference between disease mapping and population genetics studies is that the latter focus on the effects of susceptibility alleles on evolutionary fitness rather than focusing on parameters of clinical severity. Fitness is only loosely correlated with clinical severity. This is because the relevant fitness effects are those that acted in ancestral human populations, whereas clinical phenotypes are assessed in contemporary populations. In addition, clinical severity does not necessarily imply effects on reproduction, viability and survival, especially for diseases with late age of onset.

Disease mapping strategies rely, implicitly or explicitly, on population genetics quantities, such as allelic heterogeneity, allele frequency spectrum and inter-ethnic differentiation; for example, standard association methods are most powerful for detecting common susceptibility variants with limited heterogeneity. Likewise, the evidence for association is most likely to be replicated in additional populations if the susceptibility variants are the same and similarly common across populations. Because these quantities are functions of the underlying population genetics model, modeling studies might help to optimize disease-mapping strategies to the outcomes of specific evolutionary scenarios.

In general, modeling studies aim at analyzing a complex biological process (e.g. the evolution of disease susceptibility) by reducing it to more basic components (e.g. mutation rate or strength of selection etc.). The ultimate goal is to understand the properties of the process and to make quantitative predictions about its outcomes. Even though most models are too simple to be realistic, they can still have practical utility. For example, they can generate hypotheses that can be tested by empirical studies of genetic variation or they can point to new approaches to disease mapping. Conversely, as we learn more about genetic susceptibility variants, modeling studies might use the available empirical information to fill in the gaps of knowledge regarding the genetic architecture of common diseases and how it was shaped by natural selection. Hence, population genetics modeling informed by empirical studies of risk variation (and the reverse) will be necessary if a comprehensive understanding of the genetic bases of common diseases is to be obtained.

Here, I review the results of modeling studies performed to date, in light of the available data on susceptibility variants for common diseases. Several hypotheses have also been formulated about how changes in selective pressures acted on susceptibility genes for common diseases. These hypotheses are providing a framework for further empirical and modeling studies.

Mutation–selection balance: models and data
Different types of population genetics models are likely to apply to Mendelian versus common diseases. Mendelian diseases, which are usually due to highly penetrant and deleterious alleles that segregate in families, probably fit relatively simple equilibrium models of mutation–selection balance in which disease alleles are removed by purifying selection and continually replenished through the mutation process. Mutation–selection balance predicts high allelic heterogeneity and low disease prevalence, which are both common features of Mendelian diseases. Models of positive selection also apply to a minority of Mendelian disease alleles, most notably sickle cell anemia, G6PD deficiency and the thalassemias, which are caused by the pleiotropic effects of disease alleles on resistance to malaria. For other diseases (e.g. cystic fibrosis), a selective advantage has been invoked, but generating definitive evidence has met with considerable difficulties [1–4].

When it comes to common diseases with complex patterns of inheritance, however, the formulation of evolutionary models is more complicated. Owing to incomplete penetrance, late age of onset, gene-by-environment interactions and polygenic inheritance, the connection between common disease susceptibility and fitness is not clear. Moreover, although most recessive Mendelian diseases are due to loss-of-function alleles, this expectation might not apply to common diseases, thus adding to the uncertainty regarding the rate at which susceptibility variants are generated by the mutation process. Perhaps more importantly, part of the selective pressures acting on susceptibility alleles for common diseases might have changed dramatically during human history, as a result of major cultural and/or environmental transitions, implying that non-equilibrium models might be needed.

The first models of common disease susceptibility based on population genetics principles were formulated within the mutation–selection balance framework. Pritchard [1] modeled the evolution of a disease susceptibility locus in a randomly mating population of constant size, incorporating the effects of mutation, genetic drift and selection [5]. An important aspect of this analysis is that the total frequency […]

In a related study, Reich and Lander [2] also modeled susceptibility alleles as being (slightly) deleterious and showed that, conditional on a specified total frequency of the susceptibility allele class (20%) at a single locus and for a ‘typical’ disease mutation rate, there is a single predominant allele within the susceptibility class [6]. Importantly, this study focused on the role of population growth on the dynamics of disease alleles. More specifically, the authors showed that, although the number of disease alleles increases very quickly if the frequency of the susceptibility class before the onset of growth is low, the allelic heterogeneity for loci with intermediate frequencies also increases, but much more slowly. For plausible times for the onset of population growth, the increase in allelic heterogeneity for loci with intermediate equilibrium frequency is expected to be negligible. Hence, if indeed common diseases are due to susceptibility loci with intermediate equilibrium frequencies, the allelic heterogeneity is predicted to be relatively low.

Owing to the different parameterization, the conclusions of these two studies cannot be directly compared. Nevertheless, they highlight important issues in modeling the evolution of common disease susceptibility, and they indicate new directions for further investigation. One important issue is the choice of parameter values. For example, an obvious difference between the two studies concerns the assumptions about the rate at which new susceptibility alleles are created by the mutation process. Pritchard [1] considered a broad range of mutation rates (i.e. 2.5 × 10⁻⁶ to 1.25 × 10⁻⁴ per gene per generation), whereas Reich and Lander [2] considered a single and low mutation rate (i.e. 3.2 × 10⁻⁶). The rate at which susceptibility alleles are generated and the rate at which they are repaired or compensated for depend on the functional effects of mutations causing common diseases, which in turn depend on a variety of factors, including the disease pathophysiology and the biological function of the gene. For example, the mutation rate for variants leading to loss of function is likely to be higher than the rate for variants resulting in gain of function. Moreover, partial or com- […]
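The mutation–selection balance framework discussed above can be illustrated with a small numerical sketch. The code below is not the model used in either of the cited studies: it is a deliberately simplified, deterministic haploid recursion that ignores genetic drift and population growth, with one-way mutation into the susceptibility class. The function names and the selection coefficient `s = 0.01` are assumptions chosen for illustration; only the three mutation rates are taken from the text. Under these assumptions the recursion settles at an equilibrium frequency of mu/s, which the script compares against the simulated value.

```python
def next_frequency(q, mu, s):
    """Advance the susceptibility-class frequency q by one generation.

    Assumed toy model: haploid selection (fitness 1 - s for the
    susceptibility class), followed by one-way mutation at rate mu
    from the normal class into the susceptibility class.
    """
    q_sel = q * (1.0 - s) / (1.0 - s * q)  # purifying selection removes alleles
    return q_sel + mu * (1.0 - q_sel)      # mutation replenishes them


def equilibrium_frequency(mu, s, generations=5000, q0=0.0):
    """Iterate the recursion from q0 and return the final frequency."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, mu, s)
    return q


if __name__ == "__main__":
    s = 0.01  # hypothetical selection coefficient, not taken from the review
    # Mutation rates quoted in the text (per gene per generation).
    for mu in (2.5e-6, 3.2e-6, 1.25e-4):
        q = equilibrium_frequency(mu, s)
        print(f"mu={mu:.2e}  simulated q={q:.6e}  mu/s={mu / s:.6e}")
```

In this toy setting the equilibrium frequency scales linearly with the mutation rate, which is one way to see why the choice of mutation rate matters so much when comparing modeling studies. Drift, dominance and demographic change, all of which the cited models treat in more detail, would alter these values.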