Biological Pathway Analysis: Trends and Applications

Pathway resources, data access, and pathway standards
BIME 591, 2017.1.11
Lucy Lu Wang

Uses for pathway resources
• Functional analysis
• General information retrieval
• Biosimulation
• Visualizing relationships
• Disease modeling

What information should be represented?
• Entities and their relationships
• Biological entities (chemical species, genes, proteins/enzymes, cofactors)
• Cellular entities (locality, type of cell, compartments)
• Organism
• Function
• Rates?
• Reactions
• Reference to a published resource
• Synonyms/aliases
• Related pathways
• Analogous pathways from different species (ortholog?)
• Curation status
• Meta-information for how to draw the pathway diagram

Strengths and weaknesses of pathway representation
Strengths:
• Useful abstraction for human interpretation
• Clarifies relationships between genes and molecules
• Connects genes to function
Weaknesses:
• Pathways are not independent
• Representations may disagree between resources
• Human curation is slow and expensive
• Lack of consistency in naming, classification, search, and download
• Pathway boundaries are arbitrary

Today’s topics
• Pathway resources and available data
• Data access channels
• Relevant linked databases

Pathway Resources
Different ways to view pathways:
• Pathway ontologies
• Pathway diagrams
• Pathway representations
• Protein-protein interaction networks
• Gene sets

Pathway Ontologies
These describe the organizational hierarchy of pathways, i.e. pathway classes. Actual pathway representations are instances of those classes.
Pathway ontologies:
• BioPortal’s Pathway Ontology
• INOH (Integrating Network Objects w/ Hierarchies): defunct, data available through PathwayCommons
Compare:
• http://purl.bioontology.org/ontology/PW (ontology only)
• http://reactome.org/PathwayBrowser/ (ontology w/ pathway instances)

Pathway Diagrams
Visual display of pathway information (entities and relationships)
Some are backed by pathway representations, e.g. the Reactome “Glycolysis” pathway

Pathway Representations
  <?xml version="1.0" encoding="UTF-8" standalone="no"?>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:bp="http://www.biopax.org/release/biopax-level3.owl#"
           xmlns:owl="http://www.w3.org/2002/07/owl#"
           xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
           xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
           xml:base="http://www.reactome.org/biopax/59/71387#">
    <owl:Ontology rdf:about="">
      <owl:imports rdf:resource="http://www.biopax.org/release/biopax-level3.owl"/>
      <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
        BioPAX pathway converted from "Metabolism of carbohydrates" in the Reactome database.
      </rdfs:comment>
    </owl:Ontology>
    <bp:Pathway rdf:ID="Pathway1">
      <bp:pathwayComponent rdf:resource="#Pathway2"/>
      <bp:pathwayComponent rdf:resource="#BiochemicalReaction12"/>
      <bp:pathwayComponent rdf:resource="#BiochemicalReaction13"/>
      <bp:pathwayComponent rdf:resource="#Pathway3"/>
      <bp:pathwayComponent rdf:resource="#Pathway6"/>

Protein-Protein Interaction Network
An example network of proteins involved in glycolysis
From Krishna et al. (2014), "Systems genomics evaluation of the SH-SY5Y neuroblastoma cell line as a model for Parkinson’s disease"

Gene Set
An example glycolysis gene set:
ALDOA, ALDOB, ALDOC, BPGM, ENO1, ENO2, ENO3, GALM, GCK, GPI, HK2, HK3, PFKL, PGAM2, PGK1, PGK2, PGM1, PGM2, PGM3, PKLR, TPI1

Databases of pathway representations
Pathguide: http://pathguide.org/

Pathway Data Standards
• BioPAX (Biological pathway exchange)
• SBML (Systems biology markup language)
Other:
• PSI-MI (Proteomics Standards Initiative molecular interactions)
• KGML (KEGG markup language)
• GPML (Graphical pathway markup language)
• SBGN (Systems biology graphical notation)

BioPAX
• Triple-based (RDF/OWL) format with many pathway-specific keywords
• BioPAX2 and BioPAX3 specifications are NOT interoperable!
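The BioPAX excerpt under "Pathway Representations" can be read with any XML or RDF tool. The sketch below is illustrative only: it uses Python's standard-library XML parser rather than a dedicated RDF library such as rdflib or Paxtools. The snippet is a shortened copy of the slide excerpt, with closing tags added here so that it parses. The function name `pathway_components` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespaces copied from the BioPAX excerpt on the slide.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BP = "http://www.biopax.org/release/biopax-level3.owl#"

# Shortened version of the slide excerpt. The closing tags are an
# assumption added for illustration; the slide shows a truncated file.
BIOPAX_SNIPPET = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:bp="{BP}">
  <bp:Pathway rdf:ID="Pathway1">
    <bp:pathwayComponent rdf:resource="#Pathway2"/>
    <bp:pathwayComponent rdf:resource="#BiochemicalReaction12"/>
    <bp:pathwayComponent rdf:resource="#BiochemicalReaction13"/>
    <bp:pathwayComponent rdf:resource="#Pathway3"/>
    <bp:pathwayComponent rdf:resource="#Pathway6"/>
  </bp:Pathway>
</rdf:RDF>
"""


def pathway_components(biopax_xml):
    """Map each top-level bp:Pathway ID to the IDs of its pathwayComponent references."""
    root = ET.fromstring(biopax_xml)
    components = {}
    for pathway in root.findall(f"{{{BP}}}Pathway"):
        pathway_id = pathway.get(f"{{{RDF}}}ID")
        refs = [
            child.get(f"{{{RDF}}}resource").lstrip("#")
            for child in pathway.findall(f"{{{BP}}}pathwayComponent")
        ]
        components[pathway_id] = refs
    return components


components = pathway_components(BIOPAX_SNIPPET)
print(components)
```

Plain XML parsing only sees the nesting of elements. Following references across a full BioPAX file (for example from `#Pathway2` to its own components) is where a real RDF library becomes the more natural choice.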
BioPAX resources:
• http://www.biopax.org/webprotege/
• http://www.biopax.org/mediawiki/index.php/Specification

SBML
XML encoding designed for describing biological models (entities, interactions, rates, etc.), e.g. given a set of reactions and initial conditions, how does the system proceed?
References: libSBML SBML parser python libAntimony
http://sbml.org/Documents/Specifications

Pathway resource reference database stack
Most entities in pathway databases are cross-referenced to identifiers in the following linked databases:
• Pathway/Process: Gene Ontology
• Cellular location: Gene Ontology
• Reaction: EC Number (Enzyme Commission)
• Gene: Ensembl, Entrez-gene, GeneCards, HUGO Gene, OMIM
• Protein/Enzyme: EC Number, Entrez-protein, MOPED, UniProt
• Small molecule: CAS, ChEBI, ChemSpider, HMDB, KEGG, PubChem
[Diagram labels: Cellular location, Pathway, Reaction, Protein or Enzyme, Small molecule, DNA or RNA]

And now, some popular pathway databases…

Public databases
• Reactome: central curation; supports BioPAX and SBML
• BioCyc (HumanCyc): focuses on metabolic pathways; supports BioPAX and SBML
• WikiPathways: community-based curation; uses GPML

Public databases: other
• NCI Pathway Interaction Database: data available at NDex or through PathwayCommons
• PANTHER pathways: primarily signaling pathways
• NetPath: focuses on signal transduction pathways
• SMPDB (Small Molecule Pathway Database): focuses on human small molecule pathways
• SignaLink: signaling pathways, cross-talks
Pathway Diagrams:
• BioCarta: available as gene sets from MSigDB; formed the basis of NCI PID
• KEGG pathway diagrams: available at PathwayCommons, converted by BioModels
• PharmGKB: available in BioPAX and GPML

A history of pathway resources
[Timeline figure, axis 1995 to 2015. Resources in the order shown: KEGG (available, then subscription only), BioCarta diagrams, HumanCyc, Reactome, PANTHER, NCI PID, WikiPathways, NetPath, SMPDB]

Subscription databases
• KEGG (Kyoto Encyclopedia of Genes and Genomes). Last public version: 2011
• IPA (Ingenuity Pathway Analysis)
• MetaCore
• TRANSFAC Professional. Last public version: 2005

Size of resources?
• Canonical versus species-specific pathways
• Pathway uniqueness
• Pathway overlap
Hard to judge…

Data access
• APIs/web services
• SPARQL endpoints
• Raw data

APIs
Quick questions might be best answered through API calls.
Documentation: BioCyc, Reactome, Pathway Commons, KEGG, WikiPathways
Libraries available in various programming languages, e.g. python bioservices

SPARQL endpoints
SPARQL is a query language for RDF.
Some resources have online SPARQL endpoints: Reactome, WikiPathways, Pathway Commons
Or you can host your own: Stardog, Virtuoso

Raw data
Download directly from sites with open data: Reactome, HumanCyc, Pathway Commons
Interact with data through any programming language:
• RDF libraries: rdflib (python), rrdf (R), Jena (Java), Redland (C)
• Paxtools (BioPAX: Java and R)
• libSBML (SBML: C/C++, Matlab, Java, Python)

Let’s see some examples

Assignment:
• Find a partner (NOTE: at least one person per pair should have taken KR)
• Pick a disease with a complex genetic component
• Before next class, spend some time on Google Scholar or PubMed and find some of the genes that show association with your disease
• We will use this next time

Next class: 2017.1.18
Experimenting with APIs, SPARQL, and RDF libraries
Think about:
• When might you want to access data through APIs versus SPARQL versus directly?
Read:
• http://www.pathwaycommons.org/pc2/
• http://www.dataversity.net/introduction-to-sparql/
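For the "Think about" question, it may help to see what a SPARQL request looks like at the HTTP level. The sketch below is a minimal illustration, not any resource's documented client. The endpoint URL is a placeholder and not a real service, the function names are hypothetical, and using `bp:displayName` to label pathways assumes the data follow BioPAX Level 3. The code only builds the request URL (the SPARQL protocol carries the query text in a `query` parameter) and does not send anything.

```python
from urllib.parse import urlencode

# BioPAX Level 3 namespace, as used in the slide's XML excerpt.
BP = "http://www.biopax.org/release/biopax-level3.owl#"

# Placeholder endpoint (assumption): replace with the URL given in the
# documentation of the resource you want to query.
ENDPOINT = "https://example.org/sparql"


def pathway_query(limit=10):
    """Build a SPARQL query listing pathways and their display names."""
    return (
        f"PREFIX bp: <{BP}>\n"
        "SELECT ?pathway ?name WHERE {\n"
        "  ?pathway a bp:Pathway ;\n"
        "           bp:displayName ?name .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def sparql_get_url(endpoint, query):
    """Encode the query as the 'query' parameter of a GET request URL."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = sparql_get_url(ENDPOINT, pathway_query(limit=5))
print(url)
```

The same question asked through a resource-specific API would usually be a single documented call, while the SPARQL version leaves the shape of the question to the person writing the query. That trade-off is one way into the discussion question.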