NIH Public Access Author Manuscript Mol Cell
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Analysis of Gene Expression Data for Gene Ontology
ANALYSIS OF GENE EXPRESSION DATA FOR GENE ONTOLOGY BASED PROTEIN FUNCTION PREDICTION A Thesis Presented to The Graduate Faculty of The University of Akron In Partial Fulfillment of the Requirements for the Degree Master of Science Robert Daniel Macholan May 2011 ANALYSIS OF GENE EXPRESSION DATA FOR GENE ONTOLOGY BASED PROTEIN FUNCTION PREDICTION Robert Daniel Macholan Thesis Approved: Accepted: _______________________________ _______________________________ Advisor Department Chair Dr. Zhong-Hui Duan Dr. Chien-Chung Chan _______________________________ _______________________________ Committee Member Dean of the College Dr. Chien-Chung Chan Dr. Chand K. Midha _______________________________ _______________________________ Committee Member Dean of the Graduate School Dr. Yingcai Xiao Dr. George R. Newkome _______________________________ Date ii ABSTRACT A tremendous increase in genomic data has encouraged biologists to turn to bioinformatics in order to assist in its interpretation and processing. One of the present challenges that need to be overcome in order to understand this data more completely is the development of a reliable method to accurately predict the function of a protein from its genomic information. This study focuses on developing an effective algorithm for protein function prediction. The algorithm is based on proteins that have similar expression patterns. The similarity of the expression data is determined using a novel measure, the slope matrix. The slope matrix introduces a normalized method for the comparison of expression levels throughout a proteome. The algorithm is tested using real microarray gene expression data. Their functions are characterized using gene ontology annotations. The results of the case study indicate the protein function prediction algorithm developed is comparable to the prediction algorithms that are based on the annotations of homologous proteins. -
Trait Locus Chr Position P-Value Effect Effect SE Ref MA MAF Gene Gene Function FP Rs137787931 14 1880378 4.27E-07
Table S1. The SNPs which reached significance in single-locus association for fat percentages at suggestive threshold (P < 2.22 × 10-5). Trait Locus Chr Position P-Value Effect Effect SE Ref MA MAF Gene Gene function FP rs137787931 14 1,880,378 4.27E-07 -0.06 0.01 T C 0.42 MROH1 - rs134432442 14 1,736,599 8.69E-07 0.06 0.01 C T 0.49 CPSF1 mRNA polyadenylation FP = fat percentage, Chr = chromosome, Ref = reference allele, MA = minor allele, MAF = minor allele frequency, MROH1 = maestro heat like repeat family member 1, CPSF1 = cleavage and polyadenylation specific factor 1. Table S2. The 1-Mb SNP windows surpass suggestive significance level that is proportion of genetic variance (PVE) at 0.19 % and their window posterior probability of association (WPPA) for percentages of milk fat (FP) and crude protein (CPP), milk urea (MU) and efficiency of crude protein utilization (ECPU). Trait Window Chr Start-end window (Mb) Start SNP End SNP No. of SNP PVE (%) WPPA Gene Gene function FP 1849 18 15-16 15,039,844 15,954,290 21 0.31 0.22 GPT2 Regulation of biosynthesis CPP 323 3 26-27 26,020,004 26,925,312 18 0.36 0.22 TRIM45 Protein ubiquitination 1567 14 62-63 62,081,472 62,960,995 17 0.34 0.12 UBR5 Protein ubiquitination MU 563 5 23-24 23,019,369 23,949,571 19 0.34 0.08 UBE2N Protein ubiquitination 1084 9 75-76 75,026,578 75,935,285 19 0.33 0.14 TNFAIP3 Protein ubiquitination 1677 16 1-2 1,033,239 1,972,109 24 0.32 0.14 ATP2B4, Urinary bladder smooth muscle contraction, REN Kidney development 4 1 3-4 3,079,342 3,987,104 19 0.31 0.18 UBR1 Protein -
Seq2pathway Vignette
seq2pathway Vignette Bin Wang, Xinan Holly Yang, Arjun Kinstlick May 19, 2021 Contents 1 Abstract 1 2 Package Installation 2 3 runseq2pathway 2 4 Two main functions 3 4.1 seq2gene . .3 4.1.1 seq2gene flowchart . .3 4.1.2 runseq2gene inputs/parameters . .5 4.1.3 runseq2gene outputs . .8 4.2 gene2pathway . 10 4.2.1 gene2pathway flowchart . 11 4.2.2 gene2pathway test inputs/parameters . 11 4.2.3 gene2pathway test outputs . 12 5 Examples 13 5.1 ChIP-seq data analysis . 13 5.1.1 Map ChIP-seq enriched peaks to genes using runseq2gene .................... 13 5.1.2 Discover enriched GO terms using gene2pathway_test with gene scores . 15 5.1.3 Discover enriched GO terms using Fisher's Exact test without gene scores . 17 5.1.4 Add description for genes . 20 5.2 RNA-seq data analysis . 20 6 R environment session 23 1 Abstract Seq2pathway is a novel computational tool to analyze functional gene-sets (including signaling pathways) using variable next-generation sequencing data[1]. Integral to this tool are the \seq2gene" and \gene2pathway" components in series that infer a quantitative pathway-level profile for each sample. The seq2gene function assigns phenotype-associated significance of genomic regions to gene-level scores, where the significance could be p-values of SNPs or point mutations, protein-binding affinity, or transcriptional expression level. The seq2gene function has the feasibility to assign non-exon regions to a range of neighboring genes besides the nearest one, thus facilitating the study of functional non-coding elements[2]. Then the gene2pathway summarizes gene-level measurements to pathway-level scores, comparing the quantity of significance for gene members within a pathway with those outside a pathway. -
Genetic and Genomic Analysis of Hyperlipidemia, Obesity and Diabetes Using (C57BL/6J × TALLYHO/Jngj) F2 Mice
University of Tennessee, Knoxville TRACE: Tennessee Research and Creative Exchange Nutrition Publications and Other Works Nutrition 12-19-2010 Genetic and genomic analysis of hyperlipidemia, obesity and diabetes using (C57BL/6J × TALLYHO/JngJ) F2 mice Taryn P. Stewart Marshall University Hyoung Y. Kim University of Tennessee - Knoxville, [email protected] Arnold M. Saxton University of Tennessee - Knoxville, [email protected] Jung H. Kim Marshall University Follow this and additional works at: https://trace.tennessee.edu/utk_nutrpubs Part of the Animal Sciences Commons, and the Nutrition Commons Recommended Citation BMC Genomics 2010, 11:713 doi:10.1186/1471-2164-11-713 This Article is brought to you for free and open access by the Nutrition at TRACE: Tennessee Research and Creative Exchange. It has been accepted for inclusion in Nutrition Publications and Other Works by an authorized administrator of TRACE: Tennessee Research and Creative Exchange. For more information, please contact [email protected]. Stewart et al. BMC Genomics 2010, 11:713 http://www.biomedcentral.com/1471-2164/11/713 RESEARCH ARTICLE Open Access Genetic and genomic analysis of hyperlipidemia, obesity and diabetes using (C57BL/6J × TALLYHO/JngJ) F2 mice Taryn P Stewart1, Hyoung Yon Kim2, Arnold M Saxton3, Jung Han Kim1* Abstract Background: Type 2 diabetes (T2D) is the most common form of diabetes in humans and is closely associated with dyslipidemia and obesity that magnifies the mortality and morbidity related to T2D. The genetic contribution to human T2D and related metabolic disorders is evident, and mostly follows polygenic inheritance. The TALLYHO/ JngJ (TH) mice are a polygenic model for T2D characterized by obesity, hyperinsulinemia, impaired glucose uptake and tolerance, hyperlipidemia, and hyperglycemia. -
Essential Genes and Their Role in Autism Spectrum Disorder
University of Pennsylvania ScholarlyCommons Publicly Accessible Penn Dissertations 2017 Essential Genes And Their Role In Autism Spectrum Disorder Xiao Ji University of Pennsylvania, [email protected] Follow this and additional works at: https://repository.upenn.edu/edissertations Part of the Bioinformatics Commons, and the Genetics Commons Recommended Citation Ji, Xiao, "Essential Genes And Their Role In Autism Spectrum Disorder" (2017). Publicly Accessible Penn Dissertations. 2369. https://repository.upenn.edu/edissertations/2369 This paper is posted at ScholarlyCommons. https://repository.upenn.edu/edissertations/2369 For more information, please contact [email protected]. Essential Genes And Their Role In Autism Spectrum Disorder Abstract Essential genes (EGs) play central roles in fundamental cellular processes and are required for the survival of an organism. EGs are enriched for human disease genes and are under strong purifying selection. This intolerance to deleterious mutations, commonly observed haploinsufficiency and the importance of EGs in pre- and postnatal development suggests a possible cumulative effect of deleterious variants in EGs on complex neurodevelopmental disorders. Autism spectrum disorder (ASD) is a heterogeneous, highly heritable neurodevelopmental syndrome characterized by impaired social interaction, communication and repetitive behavior. More and more genetic evidence points to a polygenic model of ASD and it is estimated that hundreds of genes contribute to ASD. The central question addressed in this dissertation is whether genes with a strong effect on survival and fitness (i.e. EGs) play a specific oler in ASD risk. I compiled a comprehensive catalog of 3,915 mammalian EGs by combining human orthologs of lethal genes in knockout mice and genes responsible for cell-based essentiality. -
Chromosomal Rearrangements Are Commonly Post-Transcriptionally Attenuated in Cancer
bioRxiv preprint doi: https://doi.org/10.1101/093369; this version posted February 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. Chromosomal rearrangements are commonly post-transcriptionally attenuated in cancer 1 3 1 3, 4, 5 Emanuel Gonçalves , Athanassios Fragoulis , Luz Garcia-Alonso , Thorsten Cramer , 1,2# 1# Julio Saez-Rodriguez , Pedro Beltrao 1 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK 2 RWTH Aachen University, Faculty of Medicine, Joint Research Centre for Computational Biomedicine, Aachen 52057, Germany 3 Molecular Tumor Biology, Department of General, Visceral and Transplantation Surgery, RWTH University Hospital, Pauwelsstraße 30, 52074 Aachen, Germany 4 NUTRIM School of Nutrition and Translational Research in Metabolism, Maastricht University, Maastricht, The Netherlands 5 ESCAM – European Surgery Center Aachen Maastricht, Germany and The Netherlands # co-last authors: [email protected]; [email protected] Running title: Chromosomal rearrangement attenuation in cancer Keywords: Cancer; Gene dosage; Proteomics; Copy-number variation; Protein complexes 1 bioRxiv preprint doi: https://doi.org/10.1101/093369; this version posted February 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. Abstract Chromosomal rearrangements, despite being detrimental, are ubiquitous in cancer and often act as driver events. -
Large-Scale Rnai Screening Uncovers New Therapeutic Targets in the Human Parasite
bioRxiv preprint doi: https://doi.org/10.1101/2020.02.05.935833; this version posted February 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. 1 Large-scale RNAi screening uncovers new therapeutic targets in the human parasite 2 Schistosoma mansoni 3 4 Jipeng Wang1*, Carlos Paz1*, Gilda Padalino2, Avril Coghlan3, Zhigang Lu3, Irina Gradinaru1, 5 Julie N.R. Collins1, Matthew Berriman3, Karl F. Hoffmann2, James J. Collins III1†. 6 7 8 1Department of Pharmacology, UT Southwestern Medical Center, Dallas, Texas 75390 9 2Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, 10 Aberystwyth, Wales, UK. 11 3Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK 12 *Equal Contribution 13 14 15 16 17 18 19 20 21 22 23 †To whom correspondence should be addressed 24 [email protected] 25 UT Southwestern Medical Center 26 Department of Pharmacology 27 6001 Forest Park. Rd. 28 Dallas, TX 75390 29 United States of America bioRxiv preprint doi: https://doi.org/10.1101/2020.02.05.935833; this version posted February 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. 30 ABSTRACT 31 Schistosomes kill 250,000 people every year and are responsible for serious morbidity in 240 32 million of the world’s poorest people. -
Aneuploidy: Using Genetic Instability to Preserve a Haploid Genome?
Health Science Campus FINAL APPROVAL OF DISSERTATION Doctor of Philosophy in Biomedical Science (Cancer Biology) Aneuploidy: Using genetic instability to preserve a haploid genome? Submitted by: Ramona Ramdath In partial fulfillment of the requirements for the degree of Doctor of Philosophy in Biomedical Science Examination Committee Signature/Date Major Advisor: David Allison, M.D., Ph.D. Academic James Trempe, Ph.D. Advisory Committee: David Giovanucci, Ph.D. Randall Ruch, Ph.D. Ronald Mellgren, Ph.D. Senior Associate Dean College of Graduate Studies Michael S. Bisesi, Ph.D. Date of Defense: April 10, 2009 Aneuploidy: Using genetic instability to preserve a haploid genome? Ramona Ramdath University of Toledo, Health Science Campus 2009 Dedication I dedicate this dissertation to my grandfather who died of lung cancer two years ago, but who always instilled in us the value and importance of education. And to my mom and sister, both of whom have been pillars of support and stimulating conversations. To my sister, Rehanna, especially- I hope this inspires you to achieve all that you want to in life, academically and otherwise. ii Acknowledgements As we go through these academic journeys, there are so many along the way that make an impact not only on our work, but on our lives as well, and I would like to say a heartfelt thank you to all of those people: My Committee members- Dr. James Trempe, Dr. David Giovanucchi, Dr. Ronald Mellgren and Dr. Randall Ruch for their guidance, suggestions, support and confidence in me. My major advisor- Dr. David Allison, for his constructive criticism and positive reinforcement. -
A Compendium of Co-Regulated Protein Complexes in Breast Cancer Reveals Collateral Loss Events
bioRxiv preprint doi: https://doi.org/10.1101/155333; this version posted June 26, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. A compendium of co-regulated protein complexes in breast cancer reveals collateral loss events Colm J. Ryan*1, Susan Kennedy1, Ilirjana Bajrami2, David Matallanas1, Christopher J. Lord2 1Systems Biology Ireland, School of Medicine, University College Dublin, Dublin 4, Ireland 2The Breast Cancer Now Toby Robins Breast Cancer Research Centre and CRUK Gene Function Laboratory, The Institute of Cancer Research, London, SW3 6JB, United Kingdom. * Correspondence: [email protected] Summary Protein complexes are responsible for the bulk of activities within the cell, but how their behavior and composition varies across tumors remains poorly understood. By combining proteomic profiles of breast tumors with a large-scale protein-protein interaction network, we have identified a set of 258 high-confidence protein complexes whose subunits have highly correlated protein abundance across tumor samples. We used this set to identify complexes that are reproducibly under- or over- expressed in specific breast cancer subtypes. We found that mutation or deletion of one subunit of a complex was often associated with a collateral reduction in protein expression of additional complex members. This collateral loss phenomenon was evident from proteomic, but not transcriptomic, profiles suggesting post- transcriptional control. Mutation of the tumor suppressor E-cadherin (CDH1) was associated with a collateral loss of members of the adherens junction complex, an effect we validated using an engineered model of E-cadherin loss. -
Supplementary Materials
Supplementary materials Supplementary Table S1: MGNC compound library Ingredien Molecule Caco- Mol ID MW AlogP OB (%) BBB DL FASA- HL t Name Name 2 shengdi MOL012254 campesterol 400.8 7.63 37.58 1.34 0.98 0.7 0.21 20.2 shengdi MOL000519 coniferin 314.4 3.16 31.11 0.42 -0.2 0.3 0.27 74.6 beta- shengdi MOL000359 414.8 8.08 36.91 1.32 0.99 0.8 0.23 20.2 sitosterol pachymic shengdi MOL000289 528.9 6.54 33.63 0.1 -0.6 0.8 0 9.27 acid Poricoic acid shengdi MOL000291 484.7 5.64 30.52 -0.08 -0.9 0.8 0 8.67 B Chrysanthem shengdi MOL004492 585 8.24 38.72 0.51 -1 0.6 0.3 17.5 axanthin 20- shengdi MOL011455 Hexadecano 418.6 1.91 32.7 -0.24 -0.4 0.7 0.29 104 ylingenol huanglian MOL001454 berberine 336.4 3.45 36.86 1.24 0.57 0.8 0.19 6.57 huanglian MOL013352 Obacunone 454.6 2.68 43.29 0.01 -0.4 0.8 0.31 -13 huanglian MOL002894 berberrubine 322.4 3.2 35.74 1.07 0.17 0.7 0.24 6.46 huanglian MOL002897 epiberberine 336.4 3.45 43.09 1.17 0.4 0.8 0.19 6.1 huanglian MOL002903 (R)-Canadine 339.4 3.4 55.37 1.04 0.57 0.8 0.2 6.41 huanglian MOL002904 Berlambine 351.4 2.49 36.68 0.97 0.17 0.8 0.28 7.33 Corchorosid huanglian MOL002907 404.6 1.34 105 -0.91 -1.3 0.8 0.29 6.68 e A_qt Magnogrand huanglian MOL000622 266.4 1.18 63.71 0.02 -0.2 0.2 0.3 3.17 iolide huanglian MOL000762 Palmidin A 510.5 4.52 35.36 -0.38 -1.5 0.7 0.39 33.2 huanglian MOL000785 palmatine 352.4 3.65 64.6 1.33 0.37 0.7 0.13 2.25 huanglian MOL000098 quercetin 302.3 1.5 46.43 0.05 -0.8 0.3 0.38 14.4 huanglian MOL001458 coptisine 320.3 3.25 30.67 1.21 0.32 0.9 0.26 9.33 huanglian MOL002668 Worenine -
Whole-Genome Cancer Analysis As an Approach to Deeper Understanding of Tumour Biology
British Journal of Cancer (2010) 102, 243 – 248 & 2010 Cancer Research UK All rights reserved 0007 – 0920/10 $32.00 www.bjcancer.com Minireview Whole-genome cancer analysis as an approach to deeper understanding of tumour biology ,1 2 RL Strausberg* and AJG Simpson 1J Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850, USA; 2Ludwig Institute for Cancer Research, 605 Third Avenue, New York, NY 10158, USA Recent advances in DNA sequencing technology are providing unprecedented opportunities for comprehensive analysis of cancer genomes, exomes, transcriptomes, as well as epigenomic components. The integration of these data sets with well-annotated phenotypic and clinical data will expedite improved interventions based on the individual genomics of the patient and the specific disease. British Journal of Cancer (2010) 102, 243–248. doi:10.1038/sj.bjc.6605497 www.bjcancer.com Published online 22 December 2009 & 2010 Cancer Research UK Keywords: genome; transcriptome; exome; chromosome; sequencing The family of diseases that we refer to as cancer represents a field of tumourigenesis, on the basis of knowledge of gene families and of application for genomics of truly special importance and signal transduction networks. More recently, technological changes opportunity. It is perhaps the first area in which not only will in DNA sequencing, including the recently introduced ‘NextGen’ genomics continue to make major contributions to the under- instruments and associated molecular technologies, have enabled standing of the disease through holistic discovery of causal both higher-throughput and more sensitive assays that provide genome-wide perturbations but will also be the first field in which important new opportunities for basic discovery and clinical whole-genome analysis is used in clinical applications such as application (Mardis, 2009). -
Open Data for Differential Network Analysis in Glioma
International Journal of Molecular Sciences Article Open Data for Differential Network Analysis in Glioma , Claire Jean-Quartier * y , Fleur Jeanquartier y and Andreas Holzinger Holzinger Group HCI-KDD, Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, Auenbruggerplatz 2/V, 8036 Graz, Austria; [email protected] (F.J.); [email protected] (A.H.) * Correspondence: [email protected] These authors contributed equally to this work. y Received: 27 October 2019; Accepted: 3 January 2020; Published: 15 January 2020 Abstract: The complexity of cancer diseases demands bioinformatic techniques and translational research based on big data and personalized medicine. Open data enables researchers to accelerate cancer studies, save resources and foster collaboration. Several tools and programming approaches are available for analyzing data, including annotation, clustering, comparison and extrapolation, merging, enrichment, functional association and statistics. We exploit openly available data via cancer gene expression analysis, we apply refinement as well as enrichment analysis via gene ontology and conclude with graph-based visualization of involved protein interaction networks as a basis for signaling. The different databases allowed for the construction of huge networks or specified ones consisting of high-confidence interactions only. Several genes associated to glioma were isolated via a network analysis from top hub nodes as well as from an outlier analysis. The latter approach highlights a mitogen-activated protein kinase next to a member of histondeacetylases and a protein phosphatase as genes uncommonly associated with glioma. Cluster analysis from top hub nodes lists several identified glioma-associated gene products to function within protein complexes, including epidermal growth factors as well as cell cycle proteins or RAS proto-oncogenes.