Genomic Data Mining for Functional Annotation of Human Long Noncoding Rnas Brian L

Total Page:16

File Type:pdf, Size:1020Kb

Genomic Data Mining for Functional Annotation of Human Long Noncoding Rnas Brian L Clemson University TigerPrints All Dissertations Dissertations 5-2018 Genomic Data Mining for Functional Annotation of Human Long Noncoding RNAs Brian L. Gudenas Clemson University, [email protected] Follow this and additional works at: https://tigerprints.clemson.edu/all_dissertations Recommended Citation Gudenas, Brian L., "Genomic Data Mining for Functional Annotation of Human Long Noncoding RNAs" (2018). All Dissertations. 2146. https://tigerprints.clemson.edu/all_dissertations/2146 This Dissertation is brought to you for free and open access by the Dissertations at TigerPrints. It has been accepted for inclusion in All Dissertations by an authorized administrator of TigerPrints. For more information, please contact [email protected]. GENOMIC DATA MINING FOR FUNCTIONAL ANNOTATION OF HUMAN LONG NONCODING RNAs A Dissertation Presented to the Graduate School of Clemson University In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy Genetics by Brian L. Gudenas May 2018 Accepted by: Dr. Liangjiang Wang, Committee Chair Dr. Hong Luo Dr. Anand Srivastava Dr. Rajandeep Sekhon ABSTRACT Life may have begun in an RNA world, which is supported by the increasingly vital role that RNA has been shown to perform in biological systems. To understand how the genome encodes life, one must look to the transcriptome, the set of all RNA molecules in a cell. The transcriptome illustrates which RNA transcripts are expressed at what times and this orchestrated network of gene expression is responsible for multicellular development. In humans, most genes are noncoding RNAs, meaning that they do not encode proteins. The largest class of noncoding genes are long noncoding RNAs (lncRNAs), RNA transcripts greater in length than 200 nucleotides which lack protein-coding capacity. Some lncRNAs have been shown to be key regulators; however, most lncRNAs are uncharacterized. Therefore, we developed genomic data mining methodologies for lncRNA functional annotation. Many lncRNAs are brain-specific and their dysregulation is suspected to be involved in neurodevelopmental disorders. Two prevalent brain disorders are intellectual disability (ID) and autism spectrum disorder (ASD), which are genetically heterogeneous with unidentified genetic risk factors. In this study, we created brain developmental gene coexpression networks, for ID and ASD, to identify lncRNAs associated with known disease genes. We found lncRNAs highly co-expressed with ID genes which harbored ID- associated copy number variants (CNVs). To find ASD-associated lncRNAs we identified lncRNAs differentially expressed in the ASD brain and then refined these candidates by filtering for associations with ASD risk genes in a human brain developmental coexpression network. These candidate-ASD associated lncRNAs were associated with the ii synaptic transmission and immune response pathways, in addition to residing within ASD- associated CNVs at a high frequency. The mechanism by which lncRNAs function is partly determined by functional motifs in the RNA transcript sequence. To identify lncRNA motifs, we developed a genetic algorithm capable of finding long motifs and found a motif associated with lncRNA nuclear localization. LncRNA functions are compartmentalized within the cell; therefore, knowledge of lncRNA subcellular localization provides insight into their biological function. We developed a deep learning model that predicts lncRNA subcellular localization from lncRNA transcript sequences. This model obtained high prediction accuracy on lncRNAs with known localizations suggesting that sequence motifs are involved in subcellular localization. In summary, we developed genomic data mining methods for the functional characterization of lncRNAs based on their expression patterns and transcript sequences. iii DEDICATION For my mom and dad, without their lifelong love and support I would have never made it this far. iv ACKNOWLEDGMENTS I want to sincerely thank my research advisor Dr. Wang for his excellent mentorship on bioinformatics, genetics and science. I would not be the scientist I am today without his guidance. I also want to extend gratitude to my graduate committee members, Dr. Luo, Dr. Srivastava and Dr. Sekhon. They have been incredible resources of scientific wisdom which have greatly helped my academic development. v TABLE OF CONTENTS Page TITLE PAGE .................................................................................................................... i ABSTRACT ..................................................................................................................... ii DEDICATION ................................................................................................................ iv ACKNOWLEDGMENTS ............................................................................................... v LIST OF TABLES ........................................................................................................ viii LIST OF FIGURES ........................................................................................................ ix CHAPTER I. LITERATURE REVIEW OF THE FUNCTIONAL ANNOTATION OF LONG NONCODING RNAS THROUGH GENOMIC DATA MINING ... 1 Introduction .............................................................................................. 1 Biological function of lncRNAs .............................................................. 3 Subcellular localization of lncRNAs ....................................................... 5 lncRNAs in human disease ...................................................................... 8 Genomic data mining ............................................................................. 10 Expression-based lncRNA functional annotation .................................. 14 Sequence-based lncRNA functional annotation .................................... 17 Concluding remarks ............................................................................... 21 II. GENE COEXPRESSION NETWORKS IN HUMAN BRAIN DEVELOPMENTAL TRANSCRIPTOMES IMPLICATE THE ASSOCIATION OF LONG NONCODING RNAS WITH INTELLECTUAL DISABILITY ............................................................................................... 22 Introduction ............................................................................................ 23 Methods.................................................................................................. 25 Results and Discussion .......................................................................... 29 Conclusion ............................................................................................. 38 III. INTEGRATIVE GENOMIC ANALYSES FOR IDENTIFICATION AND PRIORITIZATION OF LONG NONCODING RNAS ASSOCIATED WITH AUTISM ...................................................................................................... 40 vi Introduction ............................................................................................ 41 Results and Discussion .......................................................................... 44 Conclusions ............................................................................................ 55 Methods.................................................................................................. 56 IV. A GENETIC ALGORITHM FOR FINDING DISCRIMINATIVE FUNCTIONAL MOTIFS IN LONG NONCODING RNA ........................ 62 Introduction ............................................................................................ 63 Methods.................................................................................................. 65 Results .................................................................................................... 68 Conclusion ............................................................................................. 71 V. PREDICTION OF LNCRNA SUBCELLULAR LOCALIZATION WITH DEEP LEARNING FROM SEQUENCE FEATURES ............................... 73 Introduction ............................................................................................ 74 Methods.................................................................................................. 77 Results .................................................................................................... 83 Conclusion ............................................................................................. 91 V. CONCLUSION ............................................................................................ 93 APPENDICES ............................................................................................................... 98 A: Additional files............................................................................................. 99 B: Supplementary figures ............................................................................... 100 REFERENCES ............................................................................................................ 103 vii LIST OF TABLES Table Page 2.1 Prioritized list of ID-associated lncRNA candidates ................................... 37 5.1 Performance metrics on the test set ............................................................. 86 viii LIST OF FIGURES Figure Page 1.1 Functional themes of lncRNAs ...................................................................... 5 1.2 LncRNA cellular functions ............................................................................ 7 1.3 Overview of coexpression network analysis...............................................
Recommended publications
  • ARTICLE Doi:10.1038/Nature10523
    ARTICLE doi:10.1038/nature10523 Spatio-temporal transcriptome of the human brain Hyo Jung Kang1*, Yuka Imamura Kawasawa1*, Feng Cheng1*, Ying Zhu1*, Xuming Xu1*, Mingfeng Li1*, Andre´ M. M. Sousa1,2, Mihovil Pletikos1,3, Kyle A. Meyer1, Goran Sedmak1,3, Tobias Guennel4, Yurae Shin1, Matthew B. Johnson1,Zˇeljka Krsnik1, Simone Mayer1,5, Sofia Fertuzinhos1, Sheila Umlauf6, Steven N. Lisgo7, Alexander Vortmeyer8, Daniel R. Weinberger9, Shrikant Mane6, Thomas M. Hyde9,10, Anita Huttner8, Mark Reimers4, Joel E. Kleinman9 & Nenad Sˇestan1 Brain development and function depend on the precise regulation of gene expression. However, our understanding of the complexity and dynamics of the transcriptome of the human brain is incomplete. Here we report the generation and analysis of exon-level transcriptome and associated genotyping data, representing males and females of different ethnicities, from multiple brain regions and neocortical areas of developing and adult post-mortem human brains. We found that 86 per cent of the genes analysed were expressed, and that 90 per cent of these were differentially regulated at the whole-transcript or exon level across brain regions and/or time. The majority of these spatio-temporal differences were detected before birth, with subsequent increases in the similarity among regional transcriptomes. The transcriptome is organized into distinct co-expression networks, and shows sex-biased gene expression and exon usage. We also profiled trajectories of genes associated with neurobiological categories and diseases, and identified associations between single nucleotide polymorphisms and gene expression. This study provides a comprehensive data set on the human brain transcriptome and insights into the transcriptional foundations of human neurodevelopment.
    [Show full text]
  • Experimental Eye Research 129 (2014) 93E106
    Experimental Eye Research 129 (2014) 93e106 Contents lists available at ScienceDirect Experimental Eye Research journal homepage: www.elsevier.com/locate/yexer Transcriptomic analysis across nasal, temporal, and macular regions of human neural retina and RPE/choroid by RNA-Seq S. Scott Whitmore a, b, Alex H. Wagner a, c, Adam P. DeLuca a, b, Arlene V. Drack a, b, Edwin M. Stone a, b, Budd A. Tucker a, b, Shemin Zeng a, b, Terry A. Braun a, b, c, * Robert F. Mullins a, b, Todd E. Scheetz a, b, c, a Stephen A. Wynn Institute for Vision Research, The University of Iowa, Iowa City, IA, USA b Department of Ophthalmology and Visual Sciences, Carver College of Medicine, The University of Iowa, Iowa City, IA, USA c Department of Biomedical Engineering, College of Engineering, The University of Iowa, Iowa City, IA, USA article info abstract Article history: Proper spatial differentiation of retinal cell types is necessary for normal human vision. Many retinal Received 14 September 2014 diseases, such as Best disease and male germ cell associated kinase (MAK)-associated retinitis pigmen- Received in revised form tosa, preferentially affect distinct topographic regions of the retina. While much is known about the 31 October 2014 distribution of cell types in the retina, the distribution of molecular components across the posterior pole Accepted in revised form 4 November 2014 of the eye has not been well-studied. To investigate regional difference in molecular composition of Available online 5 November 2014 ocular tissues, we assessed differential gene expression across the temporal, macular, and nasal retina and retinal pigment epithelium (RPE)/choroid of human eyes using RNA-Seq.
    [Show full text]
  • Impact of Disseminated Neuroblastoma Cells on the Identification of the Relapse-Seeding Clone
    Author Manuscript Published OnlineFirst on February 22, 2017; DOI: 10.1158/1078-0432.CCR-16-2082 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Impact of disseminated neuroblastoma cells on the identification of the relapse-seeding clone M. Reza Abbasi1, Fikret Rifatbegovic1, Clemens Brunner1, Georg Mann2, Andrea Ziegler1, Ulrike Pötschger1, Roman Crazzolara3, Marek Ussowicz4, Martin Benesch5, Georg Ebetsberger-Dachs6, Godfrey C.F. Chan7, Neil Jones8, Ruth Ladenstein1,9, Inge M. Ambros1, Peter F. Ambros1,9 1CCRI, Children’s Cancer Research Institute, Vienna, Austria; 2St. Anna Children’s Hospital, Vienna, Austria; 3Department of Pediatrics, Medical University of Innsbruck, Innsbruck, Austria; 4Department of Pediatric Hematology and Oncology, Wroclaw Medical University, Wroclaw, Poland; 5Department of Pediatrics and Adolescent Medicine, Medical University of Graz, Graz, Austria; 6Department of Pediatrics, Kepler University Clinic Linz, Linz, Austria; 7Department of Pediatrics and Adolescent Medicine, University of Hong Kong, Hong Kong; 8Department of Pediatrics and Adolescent Medicine, Paracelsus Medical University, Salzburg, Austria; 9Department of Pediatrics, Medical University of Vienna, Vienna, Austria. Running title: Genomics of disseminated neuroblastoma cells Keywords: Neuroblastoma, disseminated tumor cells, bone marrow, tumor evolution, tumor heterogeneity Financial Support: St. Anna Kinderkrebsforschung; Austrian National Bank (ÖNB), Grant Nos. 15114, 16611; Austrian Science
    [Show full text]
  • L1 Retrotransposition Is a Common Feature of Mammalian Hepatocarcinogenesis
    Downloaded from genome.cshlp.org on October 6, 2021 - Published by Cold Spring Harbor Laboratory Press Research L1 retrotransposition is a common feature of mammalian hepatocarcinogenesis Stephanie N. Schauer,1,12 Patricia E. Carreira,1,12 Ruchi Shukla,2,12 Daniel J. Gerhardt,1,3 Patricia Gerdes,1 Francisco J. Sanchez-Luque,1,4 Paola Nicoli,5 Michaela Kindlova,1 Serena Ghisletti,6 Alexandre Dos Santos,7,8 Delphine Rapoud,7,8 Didier Samuel,7,8 Jamila Faivre,7,8,9 Adam D. Ewing,1 Sandra R. Richardson,1 and Geoffrey J. Faulkner1,10,11 1Mater Research Institute–University of Queensland, Woolloongabba, QLD 4102, Australia; 2Northern Institute for Cancer Research, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom; 3Invenra, Incorporated, Madison, Wisconsin 53719, USA; 4Department of Genomic Medicine, GENYO, Centre for Genomics and Oncological Research: Pfizer-University of Granada- Andalusian Regional Government, PTS Granada, 18016 Granada, Spain; 5Department of Experimental Oncology, European Institute of Oncology, 20146 Milan, Italy; 6Humanitas Clinical and Research Center, 20089 Milan, Italy; 7INSERM, U1193, Paul- Brousse University Hospital, Hepatobiliary Centre, Villejuif 94800, France; 8Université Paris-Sud, Faculté de Médecine, Villejuif 94800, France; 9Assistance Publique-Hôpitaux de Paris (AP-HP), Pôle de Biologie Médicale, Paul-Brousse University Hospital, Villejuif 94800, France; 10School of Biomedical Sciences, University of Queensland, Brisbane, QLD 4072, Australia; 11Queensland Brain Institute, University of Queensland, Brisbane, QLD 4072, Australia The retrotransposon Long Interspersed Element 1 (LINE-1 or L1) is a continuing source of germline and somatic mutagenesis in mammals. Deregulated L1 activity is a hallmark of cancer, and L1 mutagenesis has been described in numerous human malignancies.
    [Show full text]
  • A High Throughput, Functional Screen of Human Body Mass Index GWAS Loci Using Tissue-Specific Rnai Drosophila Melanogaster Crosses Thomas J
    Washington University School of Medicine Digital Commons@Becker Open Access Publications 2018 A high throughput, functional screen of human Body Mass Index GWAS loci using tissue-specific RNAi Drosophila melanogaster crosses Thomas J. Baranski Washington University School of Medicine in St. Louis Aldi T. Kraja Washington University School of Medicine in St. Louis Jill L. Fink Washington University School of Medicine in St. Louis Mary Feitosa Washington University School of Medicine in St. Louis Petra A. Lenzini Washington University School of Medicine in St. Louis See next page for additional authors Follow this and additional works at: https://digitalcommons.wustl.edu/open_access_pubs Recommended Citation Baranski, Thomas J.; Kraja, Aldi T.; Fink, Jill L.; Feitosa, Mary; Lenzini, Petra A.; Borecki, Ingrid B.; Liu, Ching-Ti; Cupples, L. Adrienne; North, Kari E.; and Province, Michael A., ,"A high throughput, functional screen of human Body Mass Index GWAS loci using tissue-specific RNAi Drosophila melanogaster crosses." PLoS Genetics.14,4. e1007222. (2018). https://digitalcommons.wustl.edu/open_access_pubs/6820 This Open Access Publication is brought to you for free and open access by Digital Commons@Becker. It has been accepted for inclusion in Open Access Publications by an authorized administrator of Digital Commons@Becker. For more information, please contact [email protected]. Authors Thomas J. Baranski, Aldi T. Kraja, Jill L. Fink, Mary Feitosa, Petra A. Lenzini, Ingrid B. Borecki, Ching-Ti Liu, L. Adrienne Cupples, Kari E. North, and Michael A. Province This open access publication is available at Digital Commons@Becker: https://digitalcommons.wustl.edu/open_access_pubs/6820 RESEARCH ARTICLE A high throughput, functional screen of human Body Mass Index GWAS loci using tissue-specific RNAi Drosophila melanogaster crosses Thomas J.
    [Show full text]
  • Characterizing Genomic Duplication in Autism Spectrum Disorder by Edward James Higginbotham a Thesis Submitted in Conformity
    Characterizing Genomic Duplication in Autism Spectrum Disorder by Edward James Higginbotham A thesis submitted in conformity with the requirements for the degree of Master of Science Graduate Department of Molecular Genetics University of Toronto © Copyright by Edward James Higginbotham 2020 i Abstract Characterizing Genomic Duplication in Autism Spectrum Disorder Edward James Higginbotham Master of Science Graduate Department of Molecular Genetics University of Toronto 2020 Duplication, the gain of additional copies of genomic material relative to its ancestral diploid state is yet to achieve full appreciation for its role in human traits and disease. Challenges include accurately genotyping, annotating, and characterizing the properties of duplications, and resolving duplication mechanisms. Whole genome sequencing, in principle, should enable accurate detection of duplications in a single experiment. This thesis makes use of the technology to catalogue disease relevant duplications in the genomes of 2,739 individuals with Autism Spectrum Disorder (ASD) who enrolled in the Autism Speaks MSSNG Project. Fine-mapping the breakpoint junctions of 259 ASD-relevant duplications identified 34 (13.1%) variants with complex genomic structures as well as tandem (193/259, 74.5%) and NAHR- mediated (6/259, 2.3%) duplications. As whole genome sequencing-based studies expand in scale and reach, a continued focus on generating high-quality, standardized duplication data will be prerequisite to addressing their associated biological mechanisms. ii Acknowledgements I thank Dr. Stephen Scherer for his leadership par excellence, his generosity, and for giving me a chance. I am grateful for his investment and the opportunities afforded me, from which I have learned and benefited. I would next thank Drs.
    [Show full text]
  • Variation in Protein Coding Genes Identifies Information Flow
    bioRxiv preprint doi: https://doi.org/10.1101/679456; this version posted June 21, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Animal complexity and information flow 1 1 2 3 4 5 Variation in protein coding genes identifies information flow as a contributor to 6 animal complexity 7 8 Jack Dean, Daniela Lopes Cardoso and Colin Sharpe* 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 Institute of Biological and Biomedical Sciences 25 School of Biological Science 26 University of Portsmouth, 27 Portsmouth, UK 28 PO16 7YH 29 30 * Author for correspondence 31 [email protected] 32 33 Orcid numbers: 34 DLC: 0000-0003-2683-1745 35 CS: 0000-0002-5022-0840 36 37 38 39 40 41 42 43 44 45 46 47 48 49 Abstract bioRxiv preprint doi: https://doi.org/10.1101/679456; this version posted June 21, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Animal complexity and information flow 2 1 Across the metazoans there is a trend towards greater organismal complexity. How 2 complexity is generated, however, is uncertain. Since C.elegans and humans have 3 approximately the same number of genes, the explanation will depend on how genes are 4 used, rather than their absolute number.
    [Show full text]
  • The Role of Lamin Associated Domains in Global Chromatin Organization and Nuclear Architecture
    THE ROLE OF LAMIN ASSOCIATED DOMAINS IN GLOBAL CHROMATIN ORGANIZATION AND NUCLEAR ARCHITECTURE By Teresa Romeo Luperchio A dissertation submitted to The Johns Hopkins University in conformity with the requirements for the degree of Doctor of Philosophy Baltimore, Maryland March 2016 © 2016 Teresa Romeo Luperchio All Rights Reserved ABSTRACT Nuclear structure and scaffolding have been implicated in expression and regulation of the genome (Elcock and Bridger 2010; Fedorova and Zink 2008; Ferrai et al. 2010; Li and Reinberg 2011; Austin and Bellini 2010). Discrete domains of chromatin exist within the nuclear volume, and are suggested to be organized by patterns of gene activity (Zhao, Bodnar, and Spector 2009). The nuclear periphery, which consists of the inner nuclear membrane and associated proteins, forms a sub- nuclear compartment that is mostly associated with transcriptionally repressed chromatin and low gene expression (Guelen et al. 2008). Previous studies from our lab and others have shown that repositioning genes to the nuclear periphery is sufficient to induce transcriptional repression (K L Reddy et al. 2008; Finlan et al. 2008). In addition, a number of studies have provided evidence that many tissue types, including muscle, brain and blood, use the nuclear periphery as a compartment during development to regulate expression of lineage specific genes (Meister et al. 2010; Szczerbal, Foster, and Bridger 2009; Yao et al. 2011; Kosak et al. 2002; Peric-Hupkes et al. 2010). These large regions of chromatin that come in molecular contact with the nuclear periphery are called Lamin Associated Domains (LADs). The studies described in this dissertation have furthered our understanding of maintenance and establishment of LADs as well as the relationship of LADs with the epigenome and other factors that influence three-dimensional chromatin structure.
    [Show full text]
  • Table S1. 103 Ferroptosis-Related Genes Retrieved from the Genecards
    Table S1. 103 ferroptosis-related genes retrieved from the GeneCards. Gene Symbol Description Category GPX4 Glutathione Peroxidase 4 Protein Coding AIFM2 Apoptosis Inducing Factor Mitochondria Associated 2 Protein Coding TP53 Tumor Protein P53 Protein Coding ACSL4 Acyl-CoA Synthetase Long Chain Family Member 4 Protein Coding SLC7A11 Solute Carrier Family 7 Member 11 Protein Coding VDAC2 Voltage Dependent Anion Channel 2 Protein Coding VDAC3 Voltage Dependent Anion Channel 3 Protein Coding ATG5 Autophagy Related 5 Protein Coding ATG7 Autophagy Related 7 Protein Coding NCOA4 Nuclear Receptor Coactivator 4 Protein Coding HMOX1 Heme Oxygenase 1 Protein Coding SLC3A2 Solute Carrier Family 3 Member 2 Protein Coding ALOX15 Arachidonate 15-Lipoxygenase Protein Coding BECN1 Beclin 1 Protein Coding PRKAA1 Protein Kinase AMP-Activated Catalytic Subunit Alpha 1 Protein Coding SAT1 Spermidine/Spermine N1-Acetyltransferase 1 Protein Coding NF2 Neurofibromin 2 Protein Coding YAP1 Yes1 Associated Transcriptional Regulator Protein Coding FTH1 Ferritin Heavy Chain 1 Protein Coding TF Transferrin Protein Coding TFRC Transferrin Receptor Protein Coding FTL Ferritin Light Chain Protein Coding CYBB Cytochrome B-245 Beta Chain Protein Coding GSS Glutathione Synthetase Protein Coding CP Ceruloplasmin Protein Coding PRNP Prion Protein Protein Coding SLC11A2 Solute Carrier Family 11 Member 2 Protein Coding SLC40A1 Solute Carrier Family 40 Member 1 Protein Coding STEAP3 STEAP3 Metalloreductase Protein Coding ACSL1 Acyl-CoA Synthetase Long Chain Family Member 1 Protein
    [Show full text]
  • Molecular Sciences High-Resolution Chromosome Ideogram Representation of Currently Recognized Genes for Autism Spectrum Disorder
    Int. J. Mol. Sci. 2015, 16, 6464-6495; doi:10.3390/ijms16036464 OPEN ACCESS International Journal of Molecular Sciences ISSN 1422-0067 www.mdpi.com/journal/ijms Article High-Resolution Chromosome Ideogram Representation of Currently Recognized Genes for Autism Spectrum Disorders Merlin G. Butler *, Syed K. Rafi † and Ann M. Manzardo † Departments of Psychiatry & Behavioral Sciences and Pediatrics, University of Kansas Medical Center, Kansas City, KS 66160, USA; E-Mails: [email protected] (S.K.R.); [email protected] (A.M.M.) † These authors contributed to this work equally. * Author to whom correspondence should be addressed; E-Mail: [email protected]; Tel.: +1-913-588-1873; Fax: +1-913-588-1305. Academic Editor: William Chi-shing Cho Received: 23 January 2015 / Accepted: 16 March 2015 / Published: 20 March 2015 Abstract: Recently, autism-related research has focused on the identification of various genes and disturbed pathways causing the genetically heterogeneous group of autism spectrum disorders (ASD). The list of autism-related genes has significantly increased due to better awareness with advances in genetic technology and expanding searchable genomic databases. We compiled a master list of known and clinically relevant autism spectrum disorder genes identified with supporting evidence from peer-reviewed medical literature sources by searching key words related to autism and genetics and from authoritative autism-related public access websites, such as the Simons Foundation Autism Research Institute autism genomic database dedicated to gene discovery and characterization. Our list consists of 792 genes arranged in alphabetical order in tabular form with gene symbols placed on high-resolution human chromosome ideograms, thereby enabling clinical and laboratory geneticists and genetic counsellors to access convenient visual images of the location and distribution of ASD genes.
    [Show full text]
  • L1 Retrotransposition Is a Common Feature of Mammalian Hepatocarcinogenesis
    Downloaded from genome.cshlp.org on October 4, 2021 - Published by Cold Spring Harbor Laboratory Press Research L1 retrotransposition is a common feature of mammalian hepatocarcinogenesis Stephanie N. Schauer,1,12 Patricia E. Carreira,1,12 Ruchi Shukla,2,12 Daniel J. Gerhardt,1,3 Patricia Gerdes,1 Francisco J. Sanchez-Luque,1,4 Paola Nicoli,5 Michaela Kindlova,1 Serena Ghisletti,6 Alexandre Dos Santos,7,8 Delphine Rapoud,7,8 Didier Samuel,7,8 Jamila Faivre,7,8,9 Adam D. Ewing,1 Sandra R. Richardson,1 and Geoffrey J. Faulkner1,10,11 1Mater Research Institute–University of Queensland, Woolloongabba, QLD 4102, Australia; 2Northern Institute for Cancer Research, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom; 3Invenra, Incorporated, Madison, Wisconsin 53719, USA; 4Department of Genomic Medicine, GENYO, Centre for Genomics and Oncological Research: Pfizer-University of Granada- Andalusian Regional Government, PTS Granada, 18016 Granada, Spain; 5Department of Experimental Oncology, European Institute of Oncology, 20146 Milan, Italy; 6Humanitas Clinical and Research Center, 20089 Milan, Italy; 7INSERM, U1193, Paul- Brousse University Hospital, Hepatobiliary Centre, Villejuif 94800, France; 8Université Paris-Sud, Faculté de Médecine, Villejuif 94800, France; 9Assistance Publique-Hôpitaux de Paris (AP-HP), Pôle de Biologie Médicale, Paul-Brousse University Hospital, Villejuif 94800, France; 10School of Biomedical Sciences, University of Queensland, Brisbane, QLD 4072, Australia; 11Queensland Brain Institute, University of Queensland, Brisbane, QLD 4072, Australia The retrotransposon Long Interspersed Element 1 (LINE-1 or L1) is a continuing source of germline and somatic mutagenesis in mammals. Deregulated L1 activity is a hallmark of cancer, and L1 mutagenesis has been described in numerous human malignancies.
    [Show full text]
  • Supplementary Table 1. Loci Associated with Heifer Conception Rate at First Breeding BTA1 UMD 3.1 Position2 SNP ID3,4 P-Value5 M
    Supplementary Materials: Supplementary Table 1. Loci Associated with Heifer Conception Rate at First Breeding BTA1 UMD 3.1 SNP ID3,4 p-value5 Models6 Positional Proportion Position2 Candidate of Variance Genes7 Explained8 -10 ǂ 1 3,546,098 rs136767715 7.56 X 10 dominant TIAM1 0.02 1 7,862,063 rs135020930 6.11 X 10-15 additive 0.03 -15 6.11 X 10 dominant 0.03 1 11,250,683 rs110647268* 9.77 X 10-16 additive 0.03 9.77 X 10-16 dominant 0.03 -9 1 11,812,309 rs109447046 8.50 X 10 dominant 0.02 1 16,446,932 rs110075902* 1.29 X 10-8 additive 0.02 -8 3.07 X 10 dominant 0.02 1 29,618,772 rs43224096 2.35 X 10-16 additive 0.03 1.67 X 10-18 dominant 0.04 1 32,700,496 rs43226666* 1.09 X 10-8 additive CADM2ǂǂ 0.02 1.55 X 10-11 dominant 0.02 -12 ǂ 1 45,719,982 rs43682172 5.72 X 10 dominant ABI3BP 0.02 1 62,056,267 rs42760220 3.48 X 10-9 dominant 0.02 1 62,158,499 rs42675527 5.50 X 10-10 additive LOC538060 0.02 6.96 X 10-13 dominant 0.03 1 64,591,504 rs43652271 2.57 X 10-10 additive UPK1Bǂ 0.02 4.01 X 10-12 dominant 0.02 -11 ǂ 1 64,894,430 rs42911131 2.48 X 10 dominant CD80 0.02 1 86,830,692 rs110457263 1.40 X 10-8 additive 0.02 1 94,300,486 rs41606324* 6.28 X 10-24 additive NLGN1ǂ 0.05 3.65 X 10-24 dominant 0.05 -8 ǂ 1 94,998,936 rs135160436 1.17 X 10 dominant SPATA16 0.02 1 99,584,226 rs110848318 1.05 X 10-11 dominant 0.02 1 112,961,101 rs110435444 4.04 X 10-10 additive PLCH1ǂ 0.02 -12 1.09 X 10 dominant 0.03 1 117,359,373 rs134846486 1.78 X 10-9 dominant 0.02 -8 ǂ 1 122,822,402 rs110884711* 1.82 X 10 dominant PLSCR5 0.02 1 123,956,245 rs132851673 6.18
    [Show full text]