Deriving and Prioritizing Gene Lists from Pubmed

Total Page:16

File Type:pdf, Size:1020Kb

Deriving and Prioritizing Gene Lists from Pubmed Jourquin et al. BMC Genomics 2012, 13(Suppl 8):S20 http://www.biomedcentral.com/1471-2164/13/S8/S20 RESEARCH Open Access GLAD4U: deriving and prioritizing gene lists from PubMed literature Jérôme Jourquin1,2, Dexter Duncan1, Zhiao Shi3,4, Bing Zhang1,2* From The International Conference on Intelligent Biology and Medicine (ICIBM) Nashville, TN, USA. 22-24 April 2012 Abstract Background: Answering questions such as “Which genes are related to breast cancer?” usually requires retrieving relevant publications through the PubMed search engine, reading these publications, and creating gene lists. This process is not only time-consuming, but also prone to errors. Results: We report GLAD4U (Gene List Automatically Derived For You), a new, free web-based gene retrieval and prioritization tool. GLAD4U takes advantage of existing resources of the NCBI to ensure computational efficiency. The quality of gene lists created by GLAD4U for three Gene Ontology (GO) terms and three disease terms was assessed using corresponding “gold standard” lists curated in public databases. For all queries, GLAD4U gene lists showed very high recall but low precision, leading to low F-measure. As a comparison, EBIMed’s recall was consistently lower than GLAD4U, but its precision was higher. To present the most relevant genes at the top of a list, we studied two prioritization methods based on publication count and the hypergeometric test, and compared the ranked lists and those generated by EBIMed to the gold standards. Both GLAD4U methods outperformed EBIMed for all queries based on a variety of quality metrics. Moreover, the hypergeometric method allowed for a better performance by thresholding genes with low scores. In addition, manual examination suggests that many false-positives could be explained by the incompleteness of the gold standards. The GLAD4U user interface accepts any valid queries for PubMed, and its output page displays the ranked gene list and information associated with each gene, chronologically-ordered supporting publications, along with a summary of the run and links for file export and functional enrichment and protein interaction network analysis. Conclusions: GLAD4U has a high overall recall. Although precision is generally low, the prioritization methods successfully rank truly relevant genes at the top of the lists to facilitate efficient browsing. GLAD4U is simple to use, and its interface can be found at: http://bioinfo.vanderbilt.edu/glad4u. Background The current answers to these questions are primarily The physical development and phenotype of organisms contained in the articles indexed in the MEDLINE data- can be thought of as a product of genes interacting with base. Traditionally, answering these questions requires each other and with the environment. Therefore, it is individuals to retrieve relevant publications through the common for a scientist to ask questions like “Which PubMed search engine and then to create gene lists by genes are related to breast cancer?”, “Which genes are manually extracting gene-centered information from involved in embryonic development?”, and “Which genes retrieved literature. This process is not only time-consum- are functionally related to TP53?” ing, but also prone to errors. First, it is difficult to ascertain that all relevant literature is processed. Second, it is unli- kely that all relationships in a publication will be detected. * Correspondence: [email protected] Third, individual researchers tend to extrapolate based on 1Department of Biomedical Informatics, Vanderbilt University School of Medicine, 400 Eskind Biomedical Library, 2209 Garland Avenue, Nashville, TN domain knowledge. 37232, USA Full list of author information is available at the end of the article © 2012 Jourquin et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Jourquin et al. BMC Genomics 2012, 13(Suppl 8):S20 Page 2 of 12 http://www.biomedcentral.com/1471-2164/13/S8/S20 Over the past decade, bioinformatics approaches have Results been developed to address this issue. One of the most suc- Overall quality of the retrieved gene lists cessful projects in this area is the Gene Ontology (GO) GLAD4U relies on the NCBI eSearch API to find publi- project [1]. GO produces a structured, precisely defined, cations related to a user query and on the gene-to-publi- and controlled vocabulary (i.e., GO terms) for describing cation link table to identify genes from the retrieved the roles of genes and gene products in different species. publications. We used three GO biological process terms Genes are associated with GO terms through manual (apoptosis, cell adhesion and DNA repair) and three dis- curation as well as computational inference. A researcher ease terms (hypertension, obesity and schizophrenia) as can now go to the GO website [2] to get a list of genes queries to evaluate the overall quality of the retrieved related to a GO term of interest. However, as the GO gene lists. For each query, using a corresponding gene list vocabulary only describes gene products in terms of their curated by GO or GAD/OMIM as a gold standard, we associated biological processes, cellular components and calculated the precision, recall and F-measure of the molecular functions, users are limited by questions linked retrieved gene list. As shown in Table 1, gene lists to this limited vocabulary. Moreover, processes, functions retrieved for all queries showed very high recall (0.90 or components that are unique to diseases, such as onco- ±0.03 for GO terms and 0.96±0.05 for disease terms). In genesis, are not included in GO because causing cancer is contrast to the high recall, the precision was generally not the normal function of any gene. low (0.16±0.04 for GO terms and 0.06±0.02 for disease A useful resource specifically designed for disease stu- terms), leading to low F-measures (0.27±0.05 for GO dies is the Online Mendelian Inheritance in Man (OMIM terms and 0.12±0.03 for disease terms). EBIMed’s recall [3]) project. OMIM is a comprehensive, authoritative, is consistently lower than GLAD4U (0.47±0.15 for GO and timely compendium of human genes and genetic terms and 0.44±0.11 for disease terms). However, its pre- phenotypes. It contains information on all known Men- cision is higher than GLAD4U (0.20±0.05 for GO terms delian disorders. However, information on complex dis- and 0.16±0.04 for disease terms), resulting in better eases such as cancer and diabetes is lacking in OMIM. F-measures (0.27±0.03 for GO terms and 0.23±0.04 for In addition to manual curation, text mining tools have disease terms). beendevelopedtoassistgenelistcreation[4].Asan The low precision of GLAD4U may be partially attrib- example, EBIMed [5,6] combines text mining with co- uted to the incompleteness of the annotation in GO and occurrence-based analysis to generate a prioritized list of GAD/OMIM. However, it is likely that the original gene genes for a user-provided query. Specifically, EBIMed lists include many irrelevant genes. In this case, a priori- collects MEDLINE records and available full text docu- tization step that ranks truly relevant genes at the top ments for a user-provided query, identifies protein of a list would certainly facilitate efficient browsing. names, drugs, species, or GO terms in the documents, and prioritizes genes/proteins based on the number of Performance of the prioritization methods co-occurrences of the different pairs (protein/protein, We studied the performance of two methods to prioritize protein/drug, protein/species, protein/GO term) in the the gene lists. The first, “GLAD4U Counts”, is based solely sentences of the documents in which they appear. on the number of supporting publications as commonly EBIMed and similar tools, such as FACTA [7] and SciMi- implemented in other software [10,11]. The second, ner [8], provide more flexible ways to create gene lists “GLAD4U Hypergeometric”,isproposedinthisstudy, that are not limited to certain aspects of biology. Never- which is based on the Hypergeometric test (see the Meth- theless, they usually require heavy computation, and the ods section for details). We used the above mentioned relevance of the resulted gene lists to the input queries three GO terms and three disease terms as queries to eval- has not been systematically evaluated. uate the performance of our prioritization methods. We Here, we report GLAD4U (Gene List Automatically also included the prioritized gene lists returned by Derived For You), a new web-based gene retrieval and EBIMed for comparison. prioritization tool. GLAD4U takes advantage of existing Figure 1 depicts the precision/recall curves from this resources at the National Center for Biotechnology Infor- comparative evaluation. For all queries, based on manual mation (NCBI) to ensure computational efficiency. It inspection of the curves, both GLAD4U Counts and provides a simple user interface that facilitates intuitive GLAD4U Hypergeometric outperformed EBIMed, espe- usage and interpretation of results. The quality of gene cially at the high precision range. Between the two lists created by GLAD4U is assessed using corresponding GLAD4U methods, the Hypergeometric method per- “gold standard” lists curated in GO, GAD (Genetic Asso- formed better than the Counts method for GO term ciation
Recommended publications
  • Recombinant Rabbit Anti-Human RAB8A Monoclonal Antibody, Clone NKG-S33-80-4 (CABT- Z616R) This Product Is for Research Use Only and Is Not Intended for Diagnostic Use
    Recombinant Rabbit Anti-Human RAB8A Monoclonal Antibody, clone NKG-S33-80-4 (CABT- Z616R) This product is for research use only and is not intended for diagnostic use. PRODUCT INFORMATION Immunogen Recombinant full length protein within Human RAB8A aa 1 to the C-terminus. Isotype IgG Source/Host Rabbit Species Reactivity Human Clone NKG-S33-80-4 Purification Protein A purified Conjugate Unconjugated Applications WB, IHC-P, ICC/IF, FC, IP Recommended dilution: WB: 1:1000 IHC-P: 1:1000 ICC/IF: 1:500 FC: 1:600 IP: 1:30 Positive Control WB: HEK-293T, HCT116 and HeLa whole cell lysates. IHC-P: Human bladder cancer and breast tissues. ICC/IF: A549 and HeLa cells. Flow Cyt: A549 cells. IP: A549 whole cell lysate. Format Liquid Concentration Lot specific Size 100 μl Buffer PBS, 40% Glycerol (glycerin, glycerine), 0.05% BSA, pH 7.2. Preservative 0.01% Sodium azide Storage Store at +4℃ short term (1-2 weeks). Upon delivery aliquot. Store at -20℃. Stable for 12 months at -20℃. 45-1 Ramsey Road, Shirley, NY 11967, USA Email: [email protected] Tel: 1-631-624-4882 Fax: 1-631-938-8221 1 © Creative Diagnostics All Rights Reserved Ship Wet ice Warnings This product is for research use only and is not intended for diagnostic use. BACKGROUND Introduction RAB8A may be involved in vesicular trafficking and neurotransmitter release. Together with RAB11A, RAB3IP, the exocyst complex, PARD3, PRKCI, ANXA2, CDC42 and DNMBP promotes transcytosis of PODXL to the apical membrane initiation sites (AMIS), apical surface formation and lumenogenesis. Together with MYO5B and RAB11A participates in epithelial cell polarization.
    [Show full text]
  • Computational Analyses of Small Molecules Activity from Phenotypic Screens
    Computational analyses of small molecules activity from phenotypic screens Azedine Zoufir Hughes Hall This dissertation is submitted for the degree of Doctor of Philosophy July 2018 Declaration This thesis is submitted as the result of my own work and includes nothing which is the outcome of work done in collaboration except where specifically indicated in the text. It is not substantially the same as any that I have submitted, or, is being concurrently submitted for a degree or diploma or other qualification at the University of Cambridge or any other University or similar institution except as declared in the preface and specified in the text. I further state that no substantial part of my dissertation has already been submitted, or, is being concurrently submitted for any such degree, diploma or other qualification at the University of Cambridge or any other University or similar institution except as declared in the Preface and specified in the text. This dissertation does not exceed the word limit of 60,000 words. Azedine Zoufir July 2018 Summary Title: Computational analyses of small molecules activity from phenotypic screens Author: Azedine Zoufir Drug discovery is no longer relying on the one gene-one disease paradigm nor on target-based screening alone to discover new drugs. Phenotypic-based screening is regaining momentum to discover new compounds since those assays provide an environment closer to the physiological state of the disease and allow to better anticipate off-target effects and other factors that can limit the efficacy of the drugs. However, uncovering the mechanism of action of the compounds active in those assays relies on in vitro techniques that are expensive and time- consuming.
    [Show full text]
  • Original Article Identification of Differentially Expressed Genes Between Male and Female Patients with Acute Myocardial Infarction Based on Microarray Data
    Int J Clin Exp Med 2019;12(3):2456-2467 www.ijcem.com /ISSN:1940-5901/IJCEM0080626 Original Article Identification of differentially expressed genes between male and female patients with acute myocardial infarction based on microarray data Huaqiang Zhou1,2*, Kaibin Yang2*, Shaowei Gao1, Yuanzhe Zhang2, Xiaoyue Wei2, Zeting Qiu1, Si Li2, Qinchang Chen2, Yiyan Song2, Wulin Tan1#, Zhongxing Wang1# 1Department of Anesthesiology, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China; 2Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China. *Equal contributors and co-first au- thors. #Equal contributors. Received May 31, 2018; Accepted August 4, 2018; Epub March 15, 2019; Published March 30, 2019 Abstract: Background: Coronary artery disease has been the most common cause of death and the prognosis still needs further improving. Differences in the incidence and prognosis of male and female patients with coronary artery disease have been observed. We constructed this study hoping to understand those differences at the level of gene expression and to help establish gender-specific therapies. Methods: We downloaded the series matrix file of GSE34198 from the Gene Expression Omnibus database and identified differentially expressed genes between male and female patients. Gene ontology, Kyoto Encyclopedia of Genes and Genomes pathway enrichment analy- sis, and GSEA analysis of differentially expressed genes were performed. The protein-protein interaction network was constructed of the differentially expressed genes and the hub genes were identified. Results: A total of 215 up-regulated genes and 353 down-regulated genes were identified. The differentially expressed pathways were mainly related to the function of ribosomes, virus, and related immune response as well as the cell growth and proliferation.
    [Show full text]
  • A Trafficome-Wide Rnai Screen Reveals Deployment of Early and Late Secretory Host Proteins and the Entire Late Endo-/Lysosomal V
    bioRxiv preprint doi: https://doi.org/10.1101/848549; this version posted November 19, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. 1 A trafficome-wide RNAi screen reveals deployment of early and late 2 secretory host proteins and the entire late endo-/lysosomal vesicle fusion 3 machinery by intracellular Salmonella 4 5 Alexander Kehl1,4, Vera Göser1, Tatjana Reuter1, Viktoria Liss1, Maximilian Franke1, 6 Christopher John1, Christian P. Richter2, Jörg Deiwick1 and Michael Hensel1, 7 8 1Division of Microbiology, University of Osnabrück, Osnabrück, Germany; 2Division of Biophysics, University 9 of Osnabrück, Osnabrück, Germany, 3CellNanOs – Center for Cellular Nanoanalytics, Fachbereich 10 Biologie/Chemie, Universität Osnabrück, Osnabrück, Germany; 4current address: Institute for Hygiene, 11 University of Münster, Münster, Germany 12 13 Running title: Host factors for SIF formation 14 Keywords: siRNA knockdown, live cell imaging, Salmonella-containing vacuole, Salmonella- 15 induced filaments 16 17 Address for correspondence: 18 Alexander Kehl 19 Institute for Hygiene 20 University of Münster 21 Robert-Koch-Str. 4148149 Münster, Germany 22 Tel.: +49(0)251/83-55233 23 E-mail: [email protected] 24 25 or bioRxiv preprint doi: https://doi.org/10.1101/848549; this version posted November 19, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.
    [Show full text]
  • README Version Is Dedicated to the Human Genome
    AnnotSV Manual Version 3.0.7 AnnotSV is a program for annotating and ranking structural variations from genomes of several organisms. This README version is dedicated to the human genome. https://lbgi.fr/AnnotSV/ Copyright (C) 2017-2021 GEOFFROY Véronique Please feel free to contact me for any suggestions or bug reports email: [email protected] AnnotSV documentation 2021/04/26 1 LEXIQUE 1000g: 1000 Genomes Project (phase 3) ACMG: American College of Medical Genetics and Genomics BED: Browser Extensible Data bp: base pair CDS: CoDing Sequence CNV: Copy Number Variation DDD: Deciphering Developmental Disorders DECIPHER: DatabasE of genomic varIation and Phenotype in Humans using Ensembl Resources DEL: Deletion DGV: Database of Genomic Variants DNA: DesoxyriboNucleic Acid DUP: Duplication ENCODE: Encyclopedia of DNA Elements ExAC: Exome Aggregation Consortium GH: GeneHancer GRCh37: Genome Reference Consortium Human Build 37 GRCh38: Genome Reference Consortium Human Build 38 HI: Haploinsufficiency hom: homozygous htz: heterozygous ID: Identifier indel: Insertion/deletion INS: Insertion INV: Inversion LoF: Loss of Function MCNV: multiallelic CNV MEI: Mobile Element Insertion misZ = Z scoreindicating gene intolerance to missense variation NAHR: Non-Allelic Homologous Recombination OMIM: Online Mendelian Inheritance in Man pLI: score indicating gene intolerance to a loss of function variation SNV: Single Nucleotide Variation SV: Structural Variations synZ = Z score indicating gene intolerance to synonymous variation TAD: Topologically
    [Show full text]
  • A Different View on DNA Amplifications Indicates Frequent, Highly Complex, and Stable Amplicons on 12Q13-21 in Glioma
    A Different View on DNA Amplifications Indicates Frequent, Highly Complex, and Stable Amplicons on 12q13-21 in Glioma Ulrike Fischer,1 Andreas Keller,3 Petra Leidinger,1 Stephanie Deutscher,1 Sabrina Heisel,1 Steffi Urbschat,2 Hans-Peter Lenhof,3 and Eckart Meese1 Departments of 1Human Genetics and 2Neurosurgery, Saarland University, Homburg/Saar, Germany and 3Center for Bioinformatics, Saarland University, Saarbru¨cken, Germany Abstract Introduction To further understand the biological significance of DNA amplification does not occur in normal human cells amplifications for glioma development and recurrencies, but in multidrug-resistant cells and in tumor cells. Numerous we characterized amplicon frequency and size in studies described gene amplification in various human tumors low-grade glioma and amplicon stability in vivo in by cytogenetic and molecular genetic means. The breakage- recurring glioblastoma. We developed a 12q13-21 fusion-bridge cycle (1) is the most popular model to explain amplicon–specific genomic microarray and a intrachromosomal amplifications and many amplified structures bioinformatics amplification prediction tool to analyze like mixed ladders that were found in homogeneously staining amplicon frequency, size, and maintenance in 40 glioma regions (2). The ‘‘episome model’’ proposes that episomes samples including 16 glioblastoma, 10 anaplastic result from excision of small circular DNA that enlarges by astrocytoma, 7 astrocytoma WHO grade 2, and 7 overreplication or recombination until it becomes cytogeneti- pilocytic astrocytoma. Whereas previous studies cally visible as double minute chromosomes (3). Recent results reported two amplified subregions, we found a more of Tanaka et al. (4) showthat large palindromic sequences were complex situation with many amplified subregions. present in human cancers and the location of palindromes in the Analyzing 40 glioma, we found that all analyzed cancer genome serves as a structural platform to support glioblastoma and the majority of pilocytic astrocytoma, subsequent gene amplification.
    [Show full text]
  • The Impact of Chromosomal Translocation Locus and Fusion Oncogene Coding Sequence in Synovial Sarcomagenesis
    HHS Public Access Author manuscript Author ManuscriptAuthor Manuscript Author Oncogene Manuscript Author . Author manuscript; Manuscript Author available in PMC 2016 September 23. Published in final edited form as: Oncogene. 2016 September 22; 35(38): 5021–5032. doi:10.1038/onc.2016.38. The impact of chromosomal translocation locus and fusion oncogene coding sequence in synovial sarcomagenesis Kevin B. Jones1,2,3,*, Jared J. Barrott1,2,3, Mingchao Xie4, Malay Haldar5, Huifeng Jin1,2,3, Ju-Fen Zhu1,2,3, Michael J. Monument1,3, Tim L. Mosbruger3,6, Ellen M. Langer5, R. Lor Randall1,3, Richard K. Wilson4,7,8,9, Bradley R. Cairns2,3,10, Li Ding4,7,8,9, and Mario R. Capecchi5 1Department of Orthopaedics, University of Utah, Salt Lake City, Utah 84112, USA 2Department of Oncological Sciences, University of Utah, Salt Lake City, Utah 84112, USA 3Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah 84112, USA 4Department of Medicine, Washington University, St. Louis, Missouri 63108, USA 5Department of Human Genetics, University of Utah, Salt Lake City, Utah, USA 6Deparment of Bioinformatics, University of Utah, Salt Lake City, Utah, 84112, USA 7McDonnell Genome Institute, Washington University, St. Louis, Missouri 63108, USA 8Department of Genetics, Washington University, St. Louis, Missouri 63108, USA 9Siteman Cancer Center, Washington University, St. Louis, Missouri 63108, USA 10Howard Hughes Medical Institute, University of Utah, Salt Lake City, UT 84112, USA Abstract Synovial sarcomas are aggressive soft-tissue malignancies that express chromosomal translocation-generated fusion genes, SS18-SSX1 or SS18-SSX2 in most cases. Here, we report a mouse sarcoma model expressing SS18-SSX1, complementing our prior model expressing SS18- SSX2.
    [Show full text]
  • 393LN V 393P 344SQ V 393P Probe Set Entrez Gene
    393LN v 393P 344SQ v 393P Entrez fold fold probe set Gene Gene Symbol Gene cluster Gene Title p-value change p-value change chemokine (C-C motif) ligand 21b /// chemokine (C-C motif) ligand 21a /// chemokine (C-C motif) ligand 21c 1419426_s_at 18829 /// Ccl21b /// Ccl2 1 - up 393 LN only (leucine) 0.0047 9.199837 0.45212 6.847887 nuclear factor of activated T-cells, cytoplasmic, calcineurin- 1447085_s_at 18018 Nfatc1 1 - up 393 LN only dependent 1 0.009048 12.065 0.13718 4.81 RIKEN cDNA 1453647_at 78668 9530059J11Rik1 - up 393 LN only 9530059J11 gene 0.002208 5.482897 0.27642 3.45171 transient receptor potential cation channel, subfamily 1457164_at 277328 Trpa1 1 - up 393 LN only A, member 1 0.000111 9.180344 0.01771 3.048114 regulating synaptic membrane 1422809_at 116838 Rims2 1 - up 393 LN only exocytosis 2 0.001891 8.560424 0.13159 2.980501 glial cell line derived neurotrophic factor family receptor alpha 1433716_x_at 14586 Gfra2 1 - up 393 LN only 2 0.006868 30.88736 0.01066 2.811211 1446936_at --- --- 1 - up 393 LN only --- 0.007695 6.373955 0.11733 2.480287 zinc finger protein 1438742_at 320683 Zfp629 1 - up 393 LN only 629 0.002644 5.231855 0.38124 2.377016 phospholipase A2, 1426019_at 18786 Plaa 1 - up 393 LN only activating protein 0.008657 6.2364 0.12336 2.262117 1445314_at 14009 Etv1 1 - up 393 LN only ets variant gene 1 0.007224 3.643646 0.36434 2.01989 ciliary rootlet coiled- 1427338_at 230872 Crocc 1 - up 393 LN only coil, rootletin 0.002482 7.783242 0.49977 1.794171 expressed sequence 1436585_at 99463 BB182297 1 - up 393
    [Show full text]
  • Annotsv Manual
    AnnotSV Manual Version 2.1 AnnotSV is a program for annotating and ranking structural variations from the human genome. https://lbgi.fr/AnnotSV/ Copyright (C) 2017-2019 GEOFFROY Véronique Please feel free to contact me for any suggestions or bug reports email: [email protected] AnnotSV documentation v2.1 2019/04/18 1 LEXIQUE 1000g: 1000 Genomes Project (phase 3) ACMG: American College of Medical Genetics and Genomics BED: Browser Extensible Data bp: base pair CDS: CoDing Sequence CNV: Copy Number Variation DDD: Deciphering Developmental Disorders DECIPHER: DatabasE of genomiC varIation and Phenotype in Humans using Ensembl Resources DEL: Deletion DGV: Database of Genomic Variants DNA: DesoxyriboNucleic Acid DUP: Duplication ENCODE: Encyclopedia of DNA Elements ExAC: Exome Aggregation Consortium GH: GeneHancer GRCh37: Genome Reference Consortium Human Build 37 GRCh38: Genome Reference Consortium Human Build 38 HI: Haploinsufficiency hom: homozygous htz: heterozygous ID: Identifier indel: Insertion/deletion INS: Insertion INV: Inversion LoF: Loss of Function MCNV: multiallelic CNV MEI: Mobile Element Insertion misZ = Z score indicating gene intolerance to missense variation NAHR: Non-Allelic Homologous Recombination NM: RefSeq identifiers OMIM: Online Mendelian Inheritance in Man pLI: score computed by the ExAc consortium to indicate gene intolerance to a loss of function variation SNV: Single Nucleotide Variation SV: Structural Variations synZ = Z score indicating gene intolerance to synonymous variation TAD: Topologically Associating
    [Show full text]
  • Powerful Gene-Based Testing by Integrating Long-Range Chromatin Interactions and Knockoff Genotypes
    medRxiv preprint doi: https://doi.org/10.1101/2021.07.14.21260405; this version posted July 18, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license . Powerful gene-based testing by integrating long-range chromatin interactions and knockoff genotypes Shiyang Ma1, James L. Dalgleish1, Justin Lee2, Chen Wang1, Linxi Liu3, Richard Gill4;5, Joseph D. Buxbaum6, Wendy Chung7, Hugues Aschard8, Edwin K. Silverman9, Michael H. Cho9, Zihuai He2;10, Iuliana Ionita-Laza1;# 1 Department of Biostatistics, Columbia University, New York, NY, 10032, USA 2 Quantitative Sciences Unit, Department of Medicine, Stanford University, Stanford, CA, 94305, USA 3 Department of Statistics, University of Pittsburgh, Pittsburgh, PA, 15260, USA 4 Department of Human Genetics, Genentech, South San Francisco, CA, 94080, USA 5 Department of Epidemiology, Columbia University, New York, NY, 10032, USA 6 Departments of Psychiatry, Neuroscience, and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA 7 Department of Pediatrics and Medicine, Herbert Irving Comprehensive Cancer Center, Columbia Uni- versity Irving Medical Center, New York, NY, 10032, USA 8 Department of Computational Biology, Institut Pasteur, Paris, France 9 Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA 10 Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA 94305, USA # e-mail: [email protected] Abstract Gene-based tests are valuable techniques for identifying genetic factors in complex traits.
    [Show full text]
  • The Architecture and Evolution of Cancer Neochromosomes
    Cancer Cell Article The Architecture and Evolution of Cancer Neochromosomes Dale W. Garsed,1,8,9,15 Owen J. Marshall,5,15,17 Vincent D.A. Corbin,2,3,13,15 Arthur Hsu,2,15 Leon Di Stefano,2 Jan Schro¨ der,2,3 Jason Li,1 Zhi-Ping Feng,2,3 Bo W. Kim,5 Mark Kowarsky,2 Ben Lansdell,2 Ross Brookwell,6 Ola Myklebost,10 Leonardo Meza-Zepeda,10 Andrew J. Holloway,1 Florence Pedeutour,7 K.H. Andy Choo,5 Michael A. Damore,11 Andrew J. Deans,12 Anthony T. Papenfuss,2,3,4,8,13,16,* and David M. Thomas1,8,14,16,* 1Cancer Genomics, Peter MacCallum Cancer Centre, East Melbourne, VIC 3002, Australia 2Bioinformatics Division, The Walter & Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia 3Department of Medical Biology, University of Melbourne, VIC 3010, Australia 4Department of Mathematics and Statistics, University of Melbourne, VIC, 3010, Australia 5Chromosome Research, Murdoch Childrens Research Institute, and Department of Paediatrics, Royal Children’s Hospital, University of Melbourne, Parkville, VIC 3052, Australia 6Sullivan Nicolaides Pathology, Indooroopilly, QLD 4068, Australia 7Laboratory of Solid Tumors Genetics, Nice University Hospital, Nice 06107, France 8Sir Peter MacCallum Department of Oncology, University of Melbourne, VIC 3010, Australia 9Department of Pathology, University of Melbourne, VIC 3010, Australia 10Department of Tumor Biology, Oslo University Hospital, Norwegian Radium Hospital, Oslo 0424, Norway 11Amgen, Thousand Oaks, CA 91320, USA 12St Vincent’s Institute, Fitzroy, VIC 3065, Australia 13Bioinformatics
    [Show full text]
  • Rabbit Anti-RAB8A/FITC Conjugated Antibody
    SunLong Biotech Co.,LTD Tel: 0086-571- 56623320 Fax:0086-571- 56623318 E-mail:[email protected] www.sunlongbiotech.com Rabbit Anti-RAB8A/FITC Conjugated antibody SL6176R-FITC Product Name: Anti-RAB8A/FITC Chinese Name: FITC标记的ras癌基因家族Rab8蛋白/原癌基因c MEL抗体 MEL; Mel transforming oncogene; Oncogene c mel; Oncogene c-mel; RAB8; RAB8A; Alias: RAB8A member RAS oncogene family; RAB8A_HUMAN; Ras associated protein RAB8; Ras related protein Rab 8A; Ras-related protein Rab-8A. Organism Species: Rabbit Clonality: Polyclonal React Species: Human,Mouse,Rat,Chicken,Cow,Sheep, IF=1:50-200 Applications: not yet tested in other applications. optimal dilutions/concentrations should be determined by the end user. Molecular weight: 23kDa Cellular localization: The cell membrane Form: Lyophilized or Liquid Concentration: 1mg/ml immunogen: KLH conjugated synthetic peptide derived from human RAB8A Lsotype: IgG Purification: affinitywww.sunlongbiotech.com purified by Protein A Storage Buffer: 0.01M TBS(pH7.4) with 1% BSA, 0.03% Proclin300 and 50% Glycerol. Store at -20 °C for one year. Avoid repeated freeze/thaw cycles. The lyophilized antibody is stable at room temperature for at least one month and for greater than a year Storage: when kept at -20°C. When reconstituted in sterile pH 7.4 0.01M PBS or diluent of antibody the antibody is stable for at least two weeks at 2-4 °C. background: The protein encoded by this gene is a member of the RAS superfamily which are small GTP/GDP-binding proteins with an average size of 200 amino acids. The RAS-related Product Detail: proteins of the RAB/YPT family may play a role in the transport of proteins from the endoplasmic reticulum to the Golgi and the plasma membrane.
    [Show full text]