A Comprehensive Database of A-To-I RNA Editing Events in Humans Ernesto Picardi1,2,3,*, Anna Maria D’Erchia1,2, Claudio Lo Giudice1 and Graziano Pesole1,2,3,*


REDIportal: a comprehensive database of A-to-I RNA editing events in humans

Ernesto Picardi1,2,3,*, Anna Maria D’Erchia1,2, Claudio Lo Giudice1 and Graziano Pesole1,2,3,*

1Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari, Via Orabona 4, 70126 Bari, Italy, 2Institute of Biomembranes and Bioenergetics, National Research Council, Via Amendola 165/A, 70126 Bari, Italy and 3National Institute of Biostructures and Biosystems (INBB), 00136 Roma, Italy

Nucleic Acids Research, 2016, doi: 10.1093/nar/gkw767. Nucleic Acids Research Advance Access published September 1, 2016.

Received August 01, 2016; Revised August 19, 2016; Accepted August 22, 2016

*To whom correspondence should be addressed. Tel: +39 0805443588; Fax: +39 0805443317; Email: [email protected]
Correspondence may also be addressed to Ernesto Picardi. Tel: +39 0805443308; Fax: +39 0805443317; Email: [email protected]

© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

ABSTRACT

RNA editing by A-to-I deamination is the prominent co-/post-transcriptional modification in humans. It is carried out by ADAR enzymes and contributes to both transcriptomic and proteomic expansion. RNA editing has pivotal cellular effects and its deregulation has been linked to a variety of human disorders including neurological and neurodegenerative diseases and cancer. Despite its biological relevance, many physiological and functional aspects of RNA editing remain elusive. Here, we present REDIportal, available online at http://srv00.recas.ba.infn.it/atlas/, the largest and most comprehensive collection of RNA editing in humans, including more than 4.5 million A-to-I events detected in 55 body sites from thousands of RNAseq experiments. REDIportal embeds the RADAR database and represents the first editing resource designed to answer functional questions, enabling the inspection and browsing of editing levels in a variety of human samples, tissues and body sites. In contrast with previous RNA editing databases, REDIportal comprises its own browser (JBrowse) that allows users to explore A-to-I changes in their genomic context, emphasizing repetitive elements in which RNA editing is prominent.

INTRODUCTION

A growing literature describes RNA editing as an essential co-/post-transcriptional process, whereby a genetic message is modified from the corresponding DNA template by means of substitutions, insertions and/or deletions (1). The deamination of adenosines (As) to inosines (Is) by the family of ADAR enzymes acting on double-stranded RNA is the prominent RNA editing event occurring in humans (2). A-to-I changes are pivotal for cellular homeostasis, as attested by the association between RNA editing dysregulation and human disorders such as neurological/neurodegenerative diseases and cancer (3–5). RNA editing by A-to-I modification contributes to transcriptome and proteome expansion (6) and has several functional and regulatory implications, altering codon identity, creating or destroying splice sites and affecting base-pairing interactions in secondary and tertiary RNA structures (7,8).

The advent of high-throughput sequencing technologies has greatly improved the computational detection of RNA editing events at genomic scale (9), revealing its pervasive nature in the human transcriptome. Recently, we profiled RNA editing in six human tissues (brain, lung, muscle, heart, kidney and liver) from three individuals using high coverage directional RNAseq and whole genome sequencing (WGS) data (6). Through this large survey, we identified more than 3 million A-to-I events differently distributed across the six tissues, thus producing the first RNA editing atlas in humans (6). Despite these findings, many functional aspects of RNA editing are still unknown and further investigations are needed to elucidate the dynamic regulation of editing sites. To shed light on potential functional roles of RNA editing, we have developed an ad hoc bioinformatics resource named REDIportal, comprising the largest non-redundant collection of RNA editing events across 55 human body sites grouped in 30 tissues.

Currently, RNA editing events are annotated in three main databases: DARNED (http://darned.ucc.ie/) (10), RADAR (http://rnaedit.com/) (11) and REDIdb (http://srv00.recas.ba.infn.it/py_script/REDIdb/) (12,13). While the last is devoted to organellar RNA editing, DARNED provides information on A-to-I changes for human, mouse and fruit fly. However, it has not been updated since 2013 and does not provide editing level information. RADAR, instead, annotates A-to-I events in human, mouse and fly like DARNED, but incorporates editing levels for only 38% of stored positions, because it is based on a limited number of RNAseq samples, mainly from LCL cell lines, which may not be optimal for RNA editing studies (6).

In contrast, REDIportal includes more than 4.5 million A-to-I changes obtained by merging RNA editing positions from our Inosinome ATLAS (6) and the RADAR database (11). This large and non-redundant collection of RNA editing sites has been employed to interrogate more than 2,500 GTEx RNAseq experiments from 55 body sites of 147 individuals for which WGS data are available. REDIportal allows the study of the dynamic regulation of RNA editing, contributing to elucidate its biological roles in physiological and pathological conditions. REDIportal also annotates a plethora of additional information and embeds a specific genome browser (JBrowse) to explore RNA editing events in their genomic context.

Our portal has been conceived to collect RNA editing events/levels from a huge amount of RNAseq data in order to create the largest repository and bioinformatics infrastructure for RNA editing.

Hereafter, we describe the main REDIportal features, including database architecture and content as well as the source data for calling A-to-I events.

RNAseq DATA COLLECTION

RNA editing data stored in REDIportal derive from 2,660 RNAseq experiments in 150 human individuals. Of these, 2,642 RNAseq experiments originate from the Genotype-Tissue Expression (GTEx) project, the largest collection of high-throughput genomic data for studying gene expression in different normal tissues obtained from hundreds of individuals. The remaining 18 RNAseq experiments were produced in our lab and used to create the first RNA editing atlas in humans (6). Although the current GTEx release (v6) includes more than 8,500 RNAseq datasets, we selected only the 2,642 experiments for which matched RNAseq and WGS data were available, allowing more reliable RNA editing calls.

RNAseq data used in REDIportal encompass 55 human body sites from 30 different tissues, with an over-representation of brain, skin, blood and esophagus (Table 1). On average, there are 18 RNAseq datasets per individual and 50 million reads per experiment. The majority of RNAseq data derive from unstranded libraries of polyA enriched RNA (2×76 bp), while only 72 experiments are from libraries of total RNA preserving strand orientation (2×100 and 2×150 bp).

GTEx datasets were downloaded from the database of Genotypes and Phenotypes (dbGaP) with accession phs000424.v6.p1 in sra format and converted to standard fastq by means of the fastq-dump program that is part of the [...]

[...] combination of two computational strategies on strand oriented RNAseq reads (6). Initially, RNA editing events were detected using REDItools (15) and stringent filters, especially for positions falling in non-repetitive regions, for which RNA editing detection is challenging (6). Editing candidates not supported by homozygous genomic DNA, obtained by whole genome resequencing of the same individual, were excluded (6). Then, unaligned RNA reads were rescued through the pipeline described in Porath et al. (16) in order to detect RNA editing sites in hyper-edited reads.

Merging ATLAS and RADAR positions yielded a comprehensive and non-redundant RNA editing catalogue comprising 4,668,508 sites. This collection was used to interrogate aligned RNAseq reads from the GTEx project through REDItools, employing a large computational farm at the Italian National Institute for Nuclear Physics (INFN) that includes about 10,000 cores. An ad hoc script was finally applied to add genomic support from GTEx WGS data in order to exclude SNPs resembling editing events at transcript level (Figure 1). The number of detected RNA editing events per tissue group, as well as the number of tissue exclusive A-to-I changes, are reported in Table 1 and graphically displayed in Figure 2A.

RNA editing sites were annotated using the ANNOVAR (17) tool and the following databases: (i) RepeatMasker for repetitive elements; (ii) dbSNP (version 142) for genomic single nucleotide polymorphisms; (iii) Gencode (v19), Refseq and UCSC for gene and transcript annotations; (iv) PhastCons for conservation scores across 46 species and (v) RADAR and DARNED for known A-to-I changes. All repositories but RADAR and DARNED were downloaded from the UCSC genome browser. RADAR and DARNED positions were obtained from the corresponding web sites.

DATABASE CONTENT AND ARCHITECTURE

REDIportal collects 4,668,508 A-to-I editing sites in two main MySQL tables. The first table stores basic information including genomic positions, strand, genes and transcripts, SNP accessions and RepeatMasker elements. This table also comprises a binary string for a fast search of the tissues and body sites in which each RNA editing position has been observed. The second table, instead, includes RNA editing levels per tissue and body site as well as RNAseq and WGS support.
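The binary string stored in the first table can be illustrated with a minimal Python sketch. It assumes one character per tissue in a fixed order, with '1' meaning the site was observed there. The tissue list, its order and all function names below are hypothetical, since the text does not describe the actual layout used by REDIportal.

```python
# Hypothetical tissue order: the text does not give the real column layout,
# so six tissues are used here purely for illustration.
TISSUES = ["brain", "lung", "muscle", "heart", "kidney", "liver"]


def encode_tissues(observed):
    """Return one character per tissue: '1' if the site was observed there."""
    observed = set(observed)
    unknown = observed - set(TISSUES)
    if unknown:
        raise ValueError(f"unknown tissues: {sorted(unknown)}")
    return "".join("1" if tissue in observed else "0" for tissue in TISSUES)


def decode_tissues(bits):
    """Return the tissue names flagged in a binary string."""
    if len(bits) != len(TISSUES):
        raise ValueError("bit string length does not match the tissue list")
    return [tissue for tissue, bit in zip(TISSUES, bits) if bit == "1"]


def observed_in(bits, tissue):
    """Check a single tissue by position, without decoding the whole string."""
    return bits[TISSUES.index(tissue)] == "1"


def is_tissue_exclusive(bits):
    """A site is tissue exclusive when exactly one flag is set."""
    return bits.count("1") == 1
```

With such an encoding, a query like "sites seen in liver" reduces to a single character comparison at a known position, and tissue exclusive sites are those whose string contains exactly one '1'.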
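The editing levels and the RNAseq/WGS support kept in the second table can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: it assumes the commonly used definition of editing level as the fraction of G-supporting reads among A plus G reads at a site (inosine is read as G), and it treats a site as supported by homozygous genomic DNA when the WGS reads show only A. The class, the function names and the coverage thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    rna_a: int      # RNAseq reads supporting A (unedited)
    rna_g: int      # RNAseq reads supporting G (edited, inosine read as G)
    dna_a: int      # WGS reads supporting the reference A
    dna_other: int  # WGS reads supporting any other base


def editing_level(counts):
    """Fraction of edited reads, or None when the site has no A/G coverage."""
    covered = counts.rna_a + counts.rna_g
    return counts.rna_g / covered if covered else None


def has_homozygous_dna_support(counts, min_dna_cov=10):
    """Assumed rule: enough WGS coverage and no non-A reads in the DNA."""
    return counts.dna_a >= min_dna_cov and counts.dna_other == 0


def keep_candidate(counts, min_rna_cov=10, min_dna_cov=10):
    """Keep edited sites with enough RNA coverage and homozygous DNA support."""
    level = editing_level(counts)
    return (
        level is not None
        and level > 0
        and counts.rna_a + counts.rna_g >= min_rna_cov
        and has_homozygous_dna_support(counts, min_dna_cov)
    )
```

A site where the genomic reads are split between A and G is rejected by this sketch, which mirrors the idea of excluding SNPs that resemble editing events at transcript level.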