Online Resources for SNP Analysisreview 65
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Biology of Human Variation Fall 2014
Anthropology 2110 Biology of Human Variation Fall 2014 Professor: Dr. Tamara Varney Location: AT1010 Lecture time: Tues 7-10 Phone: 807-343-8204 Office: Braun Building 2001D Email: [email protected] Office Hours: Thurs 9-11am am until Oct 15 then by appointment**(if its more convenient feel free to set an appointment time) OR you can phone, email or drop in anytime to see if I am free – just be prepared to come at another time if I am busy with something I cannot interrupt I cannot guarantee that I will respond in a timely fashion if messages are left on my voice mail rather than email and students should NOT expect less than 24 hour turnaround time to email messages. Please DO NOT assume that your message, voice or email, has been received unless you receive an acknowledgement. Course Description: This course focuses on human microevolution. Topics include evolutionary theory, the genetic background of human variation, the distribution of human variation, human adaptability, and the role of disease and other influences on human evolution. Required Textbook: (available at the University Bookstore) Mielke JH, Konigsberg LW and Relethford JH. 2011. Human Biological Variation. 2nd Edition. Oxford University Press. ISBN 13: 978-0-19-538740-7 Also see the course Desire2Learn (D2L) site (look in MyCourseLinks on the LU website) for additional resources. Evaluation: Your final grade will be based on: Term Test 1 25% Oct 27 Term Test 2 30% Nov 18 Final Exam 45% Final examination period – Dec 9-19 **Note exam contingency date is Dec 20. Please note that Fri, November 6, 2015 is the last date for withdrawal without academic penalty from this course. -
Genetic and Environmental Exposures Constrain Epigenetic Drift Over the Human Life Course
Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press Research Genetic and environmental exposures constrain epigenetic drift over the human life course Sonia Shah,1,9 Allan F. McRae,1,9 Riccardo E. Marioni,1,2,3,9 Sarah E. Harris,2,3 Jude Gibson,4 Anjali K. Henders,5 Paul Redmond,6 Simon R. Cox,3,6 Alison Pattie,6 Janie Corley,6 Lee Murphy,4 Nicholas G. Martin,5 Grant W. Montgomery,5 John M. Starr,3,7 Naomi R. Wray,1,10 Ian J. Deary,3,6,10 and Peter M. Visscher1,3,8,10 1Queensland Brain Institute, The University of Queensland, Brisbane, 4072, Queensland, Australia; 2Medical Genetics Section, Centre for Genomic and Experimental Medicine, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XU, United Kingdom; 3Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, EH8 9JZ, United Kingdom; 4Wellcome Trust Clinical Research Facility, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, United Kingdom; 5QIMR Berghofer Medical Research Institute, Brisbane, 4029, Queensland, Australia; 6Department of Psychology, University of Edinburgh, Edinburgh, EH8 9JZ, United Kingdom; 7Alzheimer Scotland Dementia Research Centre, University of Edinburgh, Edinburgh, EH8 9JZ, United Kingdom; 8University of Queensland Diamantina Institute, Translational Research Institute, The University of Queensland, Brisbane, 4072, Queensland, Australia Epigenetic mechanisms such as DNA methylation (DNAm) are essential for regulation of gene expression. DNAm is dy- namic, influenced by both environmental and genetic factors. Epigenetic drift is the divergence of the epigenome as a function of age due to stochastic changes in methylation. -
Small Variants Frequently Asked Questions (FAQ) Updated September 2011
Small Variants Frequently Asked Questions (FAQ) Updated September 2011 Summary Information for each Genome .......................................................................................................... 3 How does Complete Genomics map reads and call variations? ........................................................................... 3 How do I assess the quality of a genome produced by Complete Genomics?................................................ 4 What is the difference between “Gross mapping yield” and “Both arms mapped yield” in the summary file? ............................................................................................................................................................................. 5 What are the definitions for Fully Called, Partially Called, Half-Called and No-Called?............................ 5 In the summary-[ASM-ID].tsv file, how is the number of homozygous SNPs calculated? ......................... 5 In the summary-[ASM-ID].tsv file, how is the number of heterozygous SNPs calculated? ....................... 5 In the summary-[ASM-ID].tsv file, how is the total number of SNPs calculated? .......................................... 5 In the summary-[ASM-ID].tsv file, what regions of the genome are included in the “exome”? .............. 6 In the summary-[ASM-ID].tsv file, how is the number of SNPs in the exome calculated? ......................... 6 In the summary-[ASM-ID].tsv file, how are variations in potentially redundant regions of the genome counted? ..................................................................................................................................................................... -
A Pilot Analysis of DNA Methylation in Candidate Genes in Brain
cells Article Epigenetic Study in Parkinson’s Disease: A Pilot Analysis of DNA Methylation in Candidate Genes in Brain Luis Navarro-Sánchez 1 , Beatriz Águeda-Gómez 1, Silvia Aparicio 1 and Jordi Pérez-Tur 1,2,3,* 1 Unitat de Genètica Molecular, Instituto de Biomedicina de Valencia, CSIC, 46010 València, Spain; [email protected] (L.N.-S.); [email protected] (B.Á.-G.); [email protected] (S.A.) 2 Centro de Investigación Biomédica en Red en Enfermedades Neurodegenerativas (CIBERNED), 46010 València, Spain 3 Unidad Mixta de Genética y Neurología, Instituto de Investigación Sanitaria La Fe, 46026 València, Spain * Correspondence: [email protected]; Tel.: +34-96-339-1755 Received: 25 July 2018; Accepted: 21 September 2018; Published: 26 September 2018 Abstract: Efforts have been made to understand the pathophysiology of Parkinson’s disease (PD). A significant number of studies have focused on genetics, despite the fact that the described pathogenic mutations have been observed only in around 10% of patients; this observation supports the fact that PD is a multifactorial disorder. Lately, differences in miRNA expression, histone modification, and DNA methylation levels have been described, highlighting the importance of epigenetic factors in PD etiology. Taking all this into consideration, we hypothesized that an alteration in the level of methylation in PD-related genes could be related to disease pathogenesis, possibly due to alterations in gene expression. After analysing promoter regions of five PD-related genes in three brain regions by pyrosequencing, we observed some differences in DNA methylation levels (hypo and hypermethylation) in substantia nigra in some CpG dinucleotides that, possibly through an alteration in Sp1 binding, could alter their expression. -
Expressed Sequence Tag Analysis of the Response of Apple
Physiologia Plantarum 133: 298–317. 2008 Copyright ª Physiologia Plantarum 2008, ISSN 0031-9317 Expressed sequence tag analysis of the response of apple (Malus x domestica ‘Royal Gala’) to low temperature and water deficit Michael Wisniewskia,*, Carole Bassetta,*, John Norellia, Dumitru Macarisina, Timothy Artlipa, Ksenija Gasicb and Schuyler Korbanb aUnited States Department of Agriculture – Agricultural Research Service (USDA-ARS), Appalachian Fruit Research Station, 2217 Wiltshire Road, Kearneysville, WV 25430, USA bDepartment of Natural Resources and Environmental Sciences, University of Illinois, 310 ERML, 1201 W. Gregory Drive, Urbana, IL 61801, USA Correspondence Leaf, bark, xylem and root tissues were used to make nine cDNA libraries from *Corresponding author, non-stressed (control) ‘Royal Gala’ apple trees, and from ‘Royal Gala’ trees e-mail: [email protected]; exposed to either low temperature (5°C for 24 h) or water deficit (45% of [email protected] saturated pot mass for 2 weeks). Over 22 600 clones from the nine libraries # Received 26 September 2007; revised 3 were subjected to 5 single-pass sequencing, clustered and annotated using January 2008 BLASTX. The number of clusters in the libraries ranged from 170 to 1430. Regarding annotation of the sequences, BLASTX analysis indicated that within doi: 10.1111/j.1399-3054.2008.01063.x the libraries 65–72% of the clones had a high similarity to known function genes, 6–15% had no functional assignment and 15–26% were completely novel. The expressed sequence tags were combined into three classes (control, low-temperature and water deficit) and the annotated genes in each class were placed into 1 of 10 different functional categories. -
Dbsnp Resources in the UCSC Database Today We Will Discuss
dbSNP resources in the UCSC database Today we will discuss some of the variation data from dbSNP as displayed on the UCSC Genome Browser. We will start at genome.ucsc.edu at the main Browser page and we will ‘reset all user settings’, which brings us to the most recent genome assembly, hg38. [0:28 Set up GB to hg18 defaults] So to begin, let’s go down to hg18, one of the earlier version of the genome, and we will hit the go button. At the Browser graphic page, we will hit the ‘hide all’ button to turn all the data tracks off. [0:45 Turn on SNPs (130) track] Scrolling to the bottom of the screen, to the Variation and Repeats bluebar group, we’ll draw your attention to the dbSNP track, SNPs version 130. In addition, we have earlier versions: 129, 128 and 126. At the time of version 130, before moving to hg19, dbSNP was mapping all new SNPs to the hg18 genome assembly. If we click into the link above the pulldown menu, we can see that this table was last updated in 2009. We’re going to turn the track on to ‘dense’ and then ‘Submit’. This shows us all the SNPs in version 130, and let’s turn on the gene track, UCSC Genes to ‘pack’so we will see the scale. We have 310,000 bases and we have the default gene GABRA3. The region is annotated by many SNPs. Now let’s zoom into one of the exons here in the middle of the screen, and you can that at this resolution, around 5,000 basepairs, there are a dozen or so SNPs in view. -
Sequence Variations in the Public Human Genome Data Reflect a Bottlenecked Population History
Sequence variations in the public human genome data reflect a bottlenecked population history Gabor Marth*†, Greg Schuler*, Raymond Yeh‡, Ruth Davenport§, Richa Agarwala*, Deanna Church*, Sarah Wheelan*¶, Jonathan Baker*, Ming Ward*, Michael Kholodov*, Lon Phan*, Eva Czabarka*, Janos Murvai*, David Cutlerʈ, Stephen Wooding**, Alan Rogers**, Aravinda Chakravartiʈ, Henry C. Harpending**, Pui-Yan Kwok†,††, and Stephen T. Sherry*† *National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894; ‡Department of Genetics, Washington University School of Medicine, St. Louis, MO 63130; §Division of Internal Medicine, Washington University School of Medicine, St. Louis, MO 63130; ¶Department of Molecular Biology and Genetics and ʈMcKusick–Nathans Institute of Genetic Medicine, The Johns Hopkins University School of Medicine, Baltimore, MD 21205; **Department of Anthropology, University of Utah, Salt Lake City, UT 84112; and ††Cardiovascular Research Institute and Department of Dermatology, University of California, San Francisco, CA 94143 Contributed by Henry C. Harpending, November 5, 2002 Single-nucleotide polymorphisms (SNPs) constitute the great ma- (density) distribution of genomic sequence variations. To this jority of variations in the human genome, and as heritable variable end, we built a set of reagents (pairwise sequence alignments landmarks they are useful markers for disease mapping and re- and their corresponding sets of variation) by analyzing the solving population structure. Redundant coverage in overlaps of overlapping regions of large-insert clones sequenced as part of large-insert genomic clones, sequenced as part of the Human the human genome project. These data provided marker Genome Project, comprises a quarter of the genome, and it is density observations grouped by overlap fragment length. -
Methods to Detect Selection in Populations with Applications to the Human
P1: FQP/VEN July 6, 2000 12:24 Annual Reviews AR104-19 Annu. Rev. Genomics Hum. Genet. 2000. 01:539–59 Copyright c 2000 by Annual Reviews. All rights reserved METHODS TO DETECT SELECTION IN POPULATIONS WITH APPLICATIONS TO THE HUMAN Martin Kreitman Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637; e-mail: [email protected] Key Words natural selection, genetic drift, polymorphism, molecular evolution ■ Abstract The development of statistical tests of natural selection at the DNA level in population samples has been ongoing for the past 13 years. The current state of the field is reviewed, and the available tests of selection are described. All tests use predictions from the theory of neutrally evolving sites as a null hypothesis. Departures from equilibrium-neutral expectations can indicate the presence of natural selection acting either at one or more of the sites under investigation or at a sufficiently tightly linked site. Complications can arise in the interpretation of departures from neutrality if populations are not at equilibrium for mutation and genetic drift or if populations are subdivided, both of which are likely scenarios for humans. Attempts to understand the nonequilibrium configuration of silent polymorphism in human mitochondrial DNA illustrate the difficulty of distinguishing between selection and alternative demographic hypotheses. The range of plausible alternatives to selection will become better defined, however, as additional population genetic data sets become available, allowing better null models to be constructed. INTRODUCTION Kimura’s immensely influential formulation of the neutral theory of molecular evolution, which came in 1968, was based primarily on an argument about the magnitude of the genetic load that would be imposed by positive selection if it were the only driving force in protein evolution (56). -
1 a Primer on Molecular Biology
1APrimer on Molecular Biology Alexander Zien Modern molecular biology provides a rich source of challenging machine learning problems. This tutorial chapter aims to provide the necessary biological background knowledge required to communicate with biologists and to understand and properly formalize a number of most interesting problems in this application domain. The largest part of the chapter (its first section) is devoted to the cell as the basic unit of life. Four aspects of cells are reviewed in sequence: (1) the molecules that cells make use of (above all, proteins, RNA, and DNA); (2) the spatial organization of cells (“compartmentalization”); (3) the way cells produce proteins (“protein expression”); and (4) cellular communication and evolution (of cells and organisms). In the second section, an overview is provided of the most frequent measurement technologies, data types, and data sources. Finally, important open problems in the analysis of these data (bioinformatics challenges) are briefly outlined. 1.1 The Cell The basic unit of all (biological) life is the cell. A cell is basically a watery solution of certain molecules, surrounded by a lipid (fat) membrane. Typical sizes of cells range from 1 µm (bacteria) to 100 µm (plant cells). The most important properties Life of a living cell (and, in fact, of life itself) are the following: It consists of a set of molecules that is separated from the exterior (as a human being is separated from his or her surroundings). It has a metabolism, that is, it can take up nutrients and convert them into other molecules and usable energy. The cell uses nutrients to renew its constituents, to grow, and to drive its actions (just like a human does). -
The Genetics of Bipolar Disorder
Molecular Psychiatry (2008) 13, 742–771 & 2008 Nature Publishing Group All rights reserved 1359-4184/08 $30.00 www.nature.com/mp FEATURE REVIEW The genetics of bipolar disorder: genome ‘hot regions,’ genes, new potential candidates and future directions A Serretti and L Mandelli Institute of Psychiatry, University of Bologna, Bologna, Italy Bipolar disorder (BP) is a complex disorder caused by a number of liability genes interacting with the environment. In recent years, a large number of linkage and association studies have been conducted producing an extremely large number of findings often not replicated or partially replicated. Further, results from linkage and association studies are not always easily comparable. Unfortunately, at present a comprehensive coverage of available evidence is still lacking. In the present paper, we summarized results obtained from both linkage and association studies in BP. Further, we indicated new potential interesting genes, located in genome ‘hot regions’ for BP and being expressed in the brain. We reviewed published studies on the subject till December 2007. We precisely localized regions where positive linkage has been found, by the NCBI Map viewer (http://www.ncbi.nlm.nih.gov/mapview/); further, we identified genes located in interesting areas and expressed in the brain, by the Entrez gene, Unigene databases (http://www.ncbi.nlm.nih.gov/entrez/) and Human Protein Reference Database (http://www.hprd.org); these genes could be of interest in future investigations. The review of association studies gave interesting results, as a number of genes seem to be definitively involved in BP, such as SLC6A4, TPH2, DRD4, SLC6A3, DAOA, DTNBP1, NRG1, DISC1 and BDNF. -
Human DNA Sequences: More Variation and Less Race
AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY 139:23–34 (2009) Human DNA Sequences: More Variation and Less Race Jeffrey C. Long,1* Jie Li,1 and Meghan E. Healy2 1Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618 2Department of Anthropology, University of New Mexico, Albuquerque, NM 87131 KEY WORDS race; DNA sequence; short tandem repeat; diversity; hierachical models ABSTRACT Interest in genetic diversity within and sity is one of nested subsets, such that the diversity in between human populations as a way to answer questions non-Sub-Saharan African populations is essentially a sub- about race has intensified in light of recent advances in set of the diversity found in Sub-Saharan African popula- genome technology. The purpose of this article is to apply tions. The actual pattern of DNA diversity creates some a method of generalized hierarchical modeling to two unsettling problems for using race as meaningful genetic DNA data sets. The first data set consists of a small sam- categories. For example, the pattern of DNA diversity ple of individuals (n 5 32 total, from eight populations) implies that some populations belong to more than one who have been fully resequenced for 63 loci that encode a race (e.g., Europeans), whereas other populations do not total of 38,534 base pairs. The second data set consists of belong to any race at all (e.g., Sub-Saharan Africans). As a large sample of individuals (n 5 928 total, from 46 popu- Frank Livingstone noted long ago, the Linnean classifica- lations) who have been genotyped at 580 loci that encode tion system cannot accommodate this pattern because short tandem repeats. -
Dbsnp Columns
dbSNP Columns RS dbSNP ID (i.e. rs number) RSPOS Chr position reported in dbSNP RV RS orientation is reversed Variation Property. Documentation is at VP ftp://ftp.ncbi.nlm.nih.gov/snp/specs/dbSNP_BitField_latest.pdf Pairs each of gene symbol:gene id. The gene symbol and id are delimited by a GENEINFO colon (:) and each pair is delimited by a vertical bar (|) dbSNPBuildID First dbSNP Build for RS SAO Variant Allele Origin: 0 - unspecified, 1 - Germline, 2 - Somatic, 3 - Both Variant Suspect Reason Codes (may be more than one value added together) 0 - unspecified, 1 - Paralog, 2 - byEST, 4 - oldAlign, 8 - Para_EST, 16 - SSR 1kg_failed, 1024 - other WGT Weight, 00 - unmapped, 1 - weight 1, 2 - weight 2, 3 - weight 3 or more VC Variation Class PM Variant is Precious(Clinical,Pubmed Cited) Provisional Third Party Annotation(TPA) (currently rs from PHARMGKB who will TPA give phenotype data) PMC Links exist to PubMed Central article S3D Has 3D structure - SNP3D table SLO Has SubmitterLinkOut - From SNP->SubSNP->Batch.link_out Has non-synonymous frameshift A coding region variation where one allele in NSF the set changes all downstream amino acids. FxnClass = 44 Has non-synonymous missense A coding region variation where one allele in NSM the set changes protein peptide. FxnClass = 42 Has non-synonymous nonsense A coding region variation where one allele in NSN the set changes to STOP codon (TER). FxnClass = 41 Has reference A coding region variation where one allele in the set is identical REF to the reference sequence. FxnCode = 8 Has synonymous A coding region variation where one allele in the set does not SYN change the encoded amino acid.