An Ontology Based Query Engine for Querying Biological Sequences


Faculty of Bioscience Engineering
Academic year 2015-2016

Jim Clauwaert
Promotor: Prof. Dr. ir. Wim Van Criekinge
Tutor: Martijn Devisscher

Master's dissertation submitted in fulfilment of the requirements for the degree of Master of Science in Bioscience Engineering: Cell and Gene Biotechnology

Foreword

This thesis came about during the academic year 2015-2016, as the final project for my master's degree in bio-engineering. In many ways, this year has been very demanding. Even though working on my thesis often meant not working on something else I should have been working on, I have enjoyed researching the subject and am satisfied when looking back at the work invested. This feeling is due to the many external influences that have guided and supported me. I wish to extend my gratitude to the people who have stood by my side throughout the last year.

First, I would like to thank Martijn Devisscher, my tutor and the spiritual father of boinq. The rich experience obtained through my work on boinq and The Semantic Web is mainly attributable to the positive working environment he created. I have been given both the responsibility and the trust to handle important parts of the boinq program. This gave me not only the opportunity, but also the ability to think for myself and introduce solutions when these presented themselves. Through weekly appointments, I was able to follow up on and discuss my work, and to get directions when no path was obvious. Through these elements I feel that I was able to contribute to the creation of boinq, and that my input was of value. This has been both my strongest motivation and the most fulfilling aspect of my thesis.

I also extend my gratitude to the BioBix group. Specifically, to my promotor, Prof. Wim Van Criekinge, for helping make this thesis possible, and to Prof. Tim de Meyer and dr.
Gerben Menschaert, for helping me define a use case and assisting me throughout.

I want to thank my family for supporting me all these years. I want to thank my friends for being awesome in general. A special thanks to Meaghan Blanchard, for being the first helping hand when correcting and revising my work, and for being there for whatever reason.

Gent, 2016
Jim Clauwaert

Table of Contents

Foreword
1 Abstract
2 Introduction
3 The Semantic Web
  3.1 Introduction
  3.2 What is The Semantic Web?
  3.3 RDF
    3.3.1 Structure of RDF
    3.3.2 Vocabularies of RDF
  3.4 Linked Data
    3.4.1 Inference
    3.4.2 Linked databases
  3.5 RDF data management
    3.5.1 RDF formats
    3.5.2 Triplestores
  3.6 SPARQL
    3.6.1 SPARQL syntax
4 Boinq
  4.1 Introduction
  4.2 Design
    4.2.1 Data unification
    4.2.2 Data organization
  4.3 Comparison to other frameworks
    4.3.1 Biological query building
    4.3.2 Semantic access to sequence information
  4.4 Material and methods
5 Genomic Data Implementation
  5.1 Introduction
  5.2 Genomic Data
    5.2.1 Browser Extensible Data format
    5.2.2 Generic Feature Format
    5.2.3 Variant Call Format
    5.2.4 Sequence Alignment/Map format
  5.3 Data integration into The Semantic Web
    5.3.1 Overview
    5.3.2 Basic data model
    5.3.3 Vocabularies
    5.3.4 Data models
    5.3.5 Metadata
    5.3.6 Practical implementation
  5.4 Evaluation
    5.4.1 sparql-bed and sparql-vcf
    5.4.2 Big data files
    5.4.3 JBrowse
6 Biological research in RDF
  6.1 Introduction
  6.2 A biomarker for colon cancer
    6.2.1 Introduction
    6.2.2 Material and methods
    6.2.3 Results
  6.3 Discussion
    6.3.1 Methods
    6.3.2 Results
7 Conclusion and Future Prospects
A Code Examples
B Tables
C Figures

List of Acronyms

Boinq: Bio ontology integrated query platform
BED: Browser Extensible Data
CDS: Coding DNA Sequence
CNV: Copy Number Variations
CIMP: CpG Island Methylator Phenotype
CRC: Colorectal Cancer
CTD: Comparative Toxicogenomics Database
DBMS: Database Management Systems
DDBJ: DNA Databank of Japan
DKO: Double Knock-Out
EBI-EMBL: The European Bioinformatics Institute
GDA: Gene Disease Association
GFF/GFF3: General Feature Format
GFVO: Genomic Feature and Variation Ontology
GMOD: Generic Model Organism Database
GRC: Genome Reference Consortium
GTF: Gene Transfer Format
IRI: Internationalized Resource Identifier
JSON-LD: JavaScript Object Notation for Linked Data
MeSH: Medical Subject Headings
NCBI: National Center for Biotechnology Information
NCI: National Cancer Institute
NHGRI: National Human Genome Research Institute
OWL: Web Ontology Language
RDF: Resource Description Framework
RDFS: Resource Description Framework Schema
SKOS: Simple Knowledge Organization System
SNP: Single Nucleotide Polymorphism
SO: Sequence Ontology
SPARQL: SPARQL Protocol and RDF Query Language
STS: Spring Tool Suite
TCGA: The Cancer Genome Atlas
UniProt: The Universal Protein Resource
URI: Uniform Resource Identifier
URL: Uniform Resource Locator
VCF: Variant Call Format
W3C: World Wide Web Consortium
WT: Wild Type
WWW: World Wide Web
XML: Extensible Markup Language
XSD: XML Schema Definition

1 Abstract

English version

The Semantic Web is an enhancement of the World Wide Web with a focus on providing a standardized framework for exchanging data. This allows for a web of data not limited by applications and data formats. Technologies created for The Semantic Web have increasingly been adopted by public databases. Boinq is a web platform that aims to connect the researcher to biological databases based upon semantic web technologies.
One design goal is the ability to manage and implement custom data in the data framework of The Semantic Web. Data integration of four different data formats has been realized with the creation of custom data structures and converters. The integrated data covers varying levels of high-throughput sequencing data, represented in the BED, GFF, VCF, and SAM formats. It has been shown that the use of The Semantic Web offers a fast way to select and combine data from public databases. Obstacles preventing widespread use of the technology still exist, including the level of knowledge needed about The Semantic Web and the databases used, a lack of tools to manage and analyze data from a semantic environment, and the incomplete state of several public databases.

Dutch version

The Semantic Web is an advanced version of the World Wide Web with a focus on creating a standardized environment for distributing data. This realizes a web of data that is not limited by the variety of applications and data formats. Technologies created for The Semantic Web are being adopted by public databases with increasing interest. Boinq is a web application that strives to make biological databases built on semantic technologies more accessible to the researcher. One of the goals of the project is to create functionality for importing and managing custom data in a semantic environment. The integration of four different data formats has been made possible by the creation of custom data structures and converters. The integrated data spans various levels of high-throughput sequencing data, as found in the BED, GFF, VCF, and SAM formats. It has been shown that using The Semantic Web offers a fast option for selecting and combining data from public databases. However, obstacles to widespread use of The Semantic Web remain, including the high level of knowledge required about The Semantic Web and the datasets used, the applications for managing and analyzing data, and the incomplete state of public datasets.

2 Introduction

Since Tim Berners-Lee invented the World Wide Web in 1989, he has continuously worked on defining and improving its construction [79]. In 1994, he founded the World Wide Web Consortium (W3C), an organization focused on generating specifications, guidelines, software and tools to improve the Web. In 2004, W3C defined the specifications of the Resource Description Framework (RDF) in its first iteration. RDF was created as a guideline and framework to optimize data interchange throughout an ever-growing web, a first step towards The Semantic Web. The specifications for RDF 1.1, the second iteration, followed in 2014 [76].

In 1965, the first nucleic acid sequence was determined by Robert W. Holley and his colleagues. It was the catalyst for a boom in genetic sequence data that has continued to grow exponentially since 1995. In 2007, cost reductions in genome sequencing allowed for another significant boost in new data generation. The vast influx of genomic data has given rise to many different databases and formats, which hinders cataloging, processing and researching data across different sources.

As the technologies created for The Semantic Web have developed and matured, investment in adopting them for genomic databases has increased. Although only some databases have adopted semantic web integration, major bioinformatics institutes, including EMBL-EBI, are making a conscious effort to expand the use of this technology. Boinq [25] is a web platform that aims to serve as a connection between the researcher and The Semantic Web.
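As a rough illustration of what expressing genomic data in RDF can look like, the following minimal sketch converts one line of a BED file into N-Triples statements. The `http://example.org/` namespace, the property names (`chromosome`, `start`, `end`) and the function `bed_line_to_ntriples` are hypothetical and chosen for illustration only; they are not the vocabularies or converters used by boinq.

```python
from urllib.parse import quote

# Datatype IRI for integer literals, as defined by XML Schema.
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def bed_line_to_ntriples(line, base="http://example.org/"):
    """Turn one tab-separated BED line into a list of N-Triples strings.

    Only the first four BED columns are used (chrom, start, end, name).
    The namespace and property names are hypothetical placeholders.
    Chromosome names are assumed to contain no characters that need
    escaping inside an N-Triples string literal.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, start and end")

    chrom = fields[0]
    # BED coordinates are 0-based and half-open; they are kept unchanged here.
    start, end = int(fields[1]), int(fields[2])
    # Fall back to a name built from the coordinates when column 4 is absent.
    name = fields[3] if len(fields) > 3 else f"{chrom}_{start}_{end}"

    # Percent-encode the name so that it is safe to embed in an IRI.
    subject = f"<{base}feature/{quote(name, safe='')}>"
    return [
        f'{subject} <{base}vocab/chromosome> "{chrom}" .',
        f'{subject} <{base}vocab/start> "{start}"^^<{XSD_INTEGER}> .',
        f'{subject} <{base}vocab/end> "{end}"^^<{XSD_INTEGER}> .',
    ]


# Example: one feature becomes three subject-predicate-object statements.
for triple in bed_line_to_ntriples("chr1\t100\t200\tgeneA"):
    print(triple)
```

Each feature becomes a subject IRI with one statement per property, which is the shape of data a triplestore can load and a SPARQL query can select from. A real converter would use established vocabularies rather than the placeholder properties above.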