Basic Principles and Laboratory Analysis of Genetic Variation Jesus Gonzalez-Bosquet and Stephen J. Chanock

UNIT 2. BIOMARKERS: PRACTICAL ASPECTS

CHAPTER 6. Basic principles and laboratory analysis of genetic variation

Jesus Gonzalez-Bosquet and Stephen J. Chanock

Summary

With the draft of the human genome and advances in technology, the approach toward mapping complex diseases and traits has changed. Human genetics has evolved into the study of the genome as a complex structure harbouring clues for multifaceted disease risk with the majority still unknown. The discovery of new candidate regions by genome-wide association studies (GWAS) has changed strategies for the study of genetic predisposition. More genome-wide, “agnostic” approaches, with increasing numbers of participants from high-quality epidemiological studies are for the first time replicating results in different settings. However, newfound regions (which become the new candidate “genes”) require extensive follow-up and investigation of their functional significance. Understanding the true effect of genetic variability on the risk of complex diseases is paramount. The importance of designing high-quality studies to assess environmental contributions, as well as the interactions between genes and exposures, cannot be stressed enough. This chapter will address the basic issues of genetic variation, including population genetics, as well as analytical platforms and tools needed to investigate the contribution of genetics to human diseases and traits.

Introduction

New advances in microchip technologies and informatics allow geneticists to look across the genome agnostically using dense data sets with billions of data points. These developments have transformed the field, moving it away from the pursuit of hypothesis-driven, limited candidate studies to large-scale scans across the genome. Together these developments have spurred a dramatic increase in the discovery of genetic variants associated with or linked to human diseases and traits, many through genome-wide association studies (GWAS) (1). Already over 7400 novel regions of the genome have been associated with more than 75 human diseases or traits in large-scale GWAS (2). Each region now represents a new candidate “region” that harbours putative genes, which will require extensive mapping of the variants to explore the genomic architecture of the region and its contribution to human diseases and traits. The return to exploring candidate regions differs from the old approach of nominating favoured genes, because it is driven by findings that reach conclusive thresholds based on more rigorous statistical considerations.

While there is ample opportunity to survey thousands of genetic variants, often well chosen and based on an emerging understanding of the structure of genetic variation and its patterns of inheritance, the ability to analyse the interaction between genetic variants and the environment has lagged. This is mainly because the measurement tools for the latter have not undergone the transformative shift observed in assessing genetic variation. The integration of environmental exposure with genetic factors should provide insights into disease mechanisms and outcomes. Eventually these insights will be applied to treatment or preventive measures that are best suited for the individual (known as personalized medicine). Individualization of treatments based on the greatest likelihood for efficacy, while minimizing (or avoiding) deleterious toxicities, represents a long-term goal, but one that is in the distant future. While the opportunity to begin to develop evidence-based individualized therapeutics, also known as pharmacogenomics, is promising, its realization will require a nuanced understanding of the contribution of genetic variation to complex diseases.

This chapter will address the basic issues of genetic variation, including population genetics as well as analytical platforms and tools needed to investigate the contribution of genetics to human diseases and traits.

The scope of genetic variation

The spectrum of human genetic variation is enormous with respect to both the types of genetic variation and the sheer magnitude of the number of variants in any given genome. Even though two genomes are estimated to differ by less than 0.5%, there are still several million differences; the majority are vestigial, but a small proportion probably contribute to disease risk. The most common type of variation is a single nucleotide base change, followed by small insertions or deletions in sequence. Progressively larger structural alterations and copy number variants are fewer in absolute number, but perhaps affect more bases (Figure 6.1). So far, available technologies have accelerated the discovery and characterization of diversity in the human genome. In the first wave of annotation, common variants have been described, many of which are universal to all populations. The ability to ascertain estimates for lower frequency variants is dependent upon the number of subjects surveyed, as well as the population genetic history of the subjects used for discovery. New sequencing technologies, referred to as next-generation sequencing, allow for the ability to catalogue variants with lower frequencies and will certainly shift the paradigms further. Generally, the interrogation of genetic variation continues to reveal greater complexity in different human populations, which manifests as differences in frequencies of variants.

Figure 6.1. Genetic variant frequencies and estimated effect size for genetic contribution

Single-nucleotide polymorphisms (SNPs)

The most common sequence variation in the genome, the single-nucleotide polymorphism (SNP), is the stable substitution of a single base, which by definition is observed in at least 1% of a population. Though this definition has been useful for cataloging genetic variation, the advent of next-generation sequencing technology has revealed the sheer breadth of variations in different populations with estimated frequencies well below 1%. Still, for the purpose of current applications of genetic variation, the SNP is the most commonly annotated variant. The minor allele frequency (MAF) is designated for the lower allele frequency observed at a locus in one particular population, but often there can be major differences in estimated MAFs between populations with distinct histories. The literature suggests that there are more than perhaps 15 million SNPs with a MAF greater than 1% (3–5), and 10 million SNPs with a MAF greater than 10% (3,6,7); however recent large-scale sequencing efforts, such as the 1000 Genomes Project, indicate these estimates are low (http://www.1000genomes.org/). There are estimated to be a greater number of SNPs with lower MAFs and, unlike common SNPs, the majority may be population-specific (Figure 6.2). The majority of common SNPs, with a MAF greater than 15–20%, are widespread in human populations (8,9). Only a small subset of high-frequency SNPs (less than 10%) appear to be found in a single population, again suggesting the universal ancestry of common SNPs (9).

Figure 6.2. Estimated number of SNPs in the human genome in relation with their minor allele frequency (MAF). Source: (5). Reprinted by permission from Macmillan Publishers Ltd: Nature Genetics, copyright (2003).

Previously in candidate gene approach studies, SNPs in coding regions were often selected on the basis of an in silico predicted effect, but with little supporting biological evidence. The attempt to classify coding variants, known as a coding SNP (cSNP), has focused on the predicted effect on the actual coding sequence. The majority of cSNPs do not alter the predicted amino acid and are known as synonymous SNPs. However, a subset of variants are predicted to shift the amino acid and are known as non-synonymous coding SNPs. Though this subset was initially of great interest, very few non-synonymous coding SNPs have actually been conclusively associated with human diseases or traits, and even fewer have corroborative biological data to provide plausibility for the association (10,11). Nonetheless, the analysis of synonymous and non-synonymous SNPs has been quite informative for evolutionary studies (12,13).

There has been considerable effort to calculate [...]

[...] scale surveys of cell lines, as well as laboratory data.

Nearly half of the more than 10 million human SNPs in the international public database for SNPs, or dbSNP (http://www.ncbi.nih.gov/SNP/), have been validated with genotyping assays by the SNP Consortium and the International HapMap Project (8,20). Until recently, only a small percentage had been verified by sequencing, but with the advent of the 1000 Genomes Project, nearly all common (MAF >10%) and uncommon (MAF between 1 and 10%) variants should be confirmed by next generation sequence technology (21,22). In the current build, roughly one [...]

[...] are catalogued in a public database, the Online Mendelian Inheritance in Man (OMIM) (http://www.ncbi.nlm.nih.gov/omim/).

The correlation of common genetic variants

Most SNPs are not inherited independently but in blocks, resulting in sets of SNPs being transmitted together between generations (4,28,29). These blocks are defined by linkage disequilibrium (LD), which estimates the correlation between SNPs on shared chromosomes passed down from ancestral chromosomes. LD is defined as the non-random association of alleles
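
The two quantities defined in this chapter, the minor allele frequency at a locus and linkage disequilibrium between two SNPs, can be illustrated with a short sketch in Python. This is not taken from the chapter: the function names, the allele labels and the toy samples are hypothetical, and because the text breaks off before naming an LD measure, the coefficient D and the squared correlation r² used here are common choices assumed for illustration only.

```python
from collections import Counter

# Assumption: threshold taken from the chapter's definition of a SNP
# ("observed in at least 1% of a population").
SNP_MAF_THRESHOLD = 0.01


def minor_allele_frequency(alleles):
    """Frequency of the less common allele at a biallelic locus.

    `alleles` is a flat list of observed alleles (two per diploid person)
    sampled from ONE population; MAF can differ between populations.
    """
    counts = Counter(alleles)
    if len(counts) > 2:
        raise ValueError("expected at most two alleles at the locus")
    if len(counts) < 2:
        return 0.0  # empty or monomorphic sample: no minor allele observed
    return min(counts.values()) / len(alleles)


def linkage_disequilibrium(haplotypes, a="A", b="B"):
    """Return (D, r_squared) for two biallelic loci.

    `haplotypes` is a list of (allele_at_locus_1, allele_at_locus_2) pairs,
    one pair per phased chromosome. `a` and `b` are the alleles tracked at
    each locus (hypothetical labels).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(1 for x, _ in haplotypes if x == a) / n
    p_b = sum(1 for _, y in haplotypes if y == b) / n
    p_ab = sum(1 for x, y in haplotypes if x == a and y == b) / n
    # D is the excess of the observed A-B haplotype frequency over the
    # frequency expected if the two loci were inherited independently.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


if __name__ == "__main__":
    # Toy samples: the same locus typed in two hypothetical populations.
    pop1 = ["C"] * 17 + ["T"] * 3
    pop2 = ["C"] * 11 + ["T"] * 9
    for name, sample in (("pop1", pop1), ("pop2", pop2)):
        maf = minor_allele_frequency(sample)
        print(name, "MAF:", maf, "meets SNP threshold:", maf >= SNP_MAF_THRESHOLD)

    # Ten phased chromosomes in which A tends to travel with B.
    haps = [("A", "B")] * 4 + [("A", "b")] + [("a", "B")] + [("a", "b")] * 4
    print("D, r^2:", linkage_disequilibrium(haps))
```

The two toy populations give different MAFs for the same locus, which is the point the chapter makes about populations with distinct histories. In the haplotype sample the A and B alleles occur together more often than independence would predict, so D is positive; four equally frequent haplotypes would give D = 0.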
