Using Ontology and Semantic Web Services to Support Modeling in Systems Biology


Centre for Mathematics & Physics in the Life Sciences and EXperimental Biology
University College London

Zhouyang Sun

Submitted for the degree of Doctor of Philosophy at University College London
December 2008
Revised for the final submission 2009

I, Zhouyang Sun, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis.

Signature: Date:

Abstract

This thesis addresses the problem of collaboration among experimental biologists and modelers in the study of systems biology by using ontology and Semantic Web Services techniques. Modeling in systems biology is concerned with using experimental information and mathematical methods to build quantitative models across different biological scales. This requires interoperation among various knowledge sources and services. Ontology and Semantic Web Services potentially provide an infrastructure to meet this requirement.

In our study, we propose an ontology-centered framework within the Semantic Web infrastructure that aims at standardizing the various areas of knowledge involved in biological modeling processes. In this framework, we first specify an ontology-based meta-model for building biological models. This meta-model supports using shared biological ontologies to annotate biological entities in the models, allows semantic queries and automatic discovery, enables easy model reuse and composition, and serves as a basis for embedding external knowledge. We also develop means of transforming biological data sources and data analysis methods into Web Services. These Web Services can then be composed to perform parameterization in biological modeling. Knowledge of the decision-making and workflow in parameterization processes is then recorded in the semantic descriptions of these Web Services and embedded in model instances built on our proposed meta-model.

We use three cases of biological modeling to evaluate our framework. By examining our ontology-centered framework in practice, we conclude that using ontology to represent biological models and using Semantic Web Services to standardize knowledge components in modeling processes yields greater capabilities for knowledge sharing, reuse, and collaboration. We also conclude that ontology-based biological models with formal semantics are essential to standardizing knowledge in compliance with the Semantic Web vision.

TABLE OF CONTENTS

CHAPTER 1: INTRODUCTION
  1.1 Background
  1.2 Methods, Contributions, and Originality
  1.3 Thesis Outline

CHAPTER 2: MOTIVATION
  2.1 What is Systems Biology?
  2.2 What is Involved in Modeling in Systems Biology?
  2.3 Semantic Web for Modeling in Systems Biology
  2.4 Chapter Summary

CHAPTER 3: REVIEW OF TECHNIQUES AND RELATED WORK
  3.1 Ontology
    3.1.1 What is Ontology?
    3.1.2 Ontology Representation Levels
    3.1.3 Ontology Languages
    3.1.4 Current Development of Ontology in Life Sciences
  3.2 Agent-based Systems and Web Service Infrastructure
    3.2.1 Agent-based Systems
    3.2.2 Web Service Infrastructure in the Life Sciences
  3.3 Summary

CHAPTER 4: CASES OF BIOLOGICAL MODELING
  4.1 Hodgkin & Huxley Case
    4.1.1 Biological background
    4.1.2 Mathematical modeling
  4.2 Lewis & Hudspeth Case
    4.2.1 Biological Background
    4.2.2 Experimental Data Acquisition
    4.2.3 Mathematical Modeling
    4.2.4 Computational Simulation
  4.3 Case of Hormone-induced Calcium Oscillation Composite Model
    4.3.1 Background
    4.3.2 Understand Intracellular Calcium Oscillation by Model Integration
    4.3.3 Mathematical Modelling
  4.4 Chapter Summary

CHAPTER 5: USING SEMANTIC WEB TECHNOLOGIES TO SUPPORT BIOLOGICAL MODELING
  5.1 Workflow of Modeling Processes
  5.2 Typology of Modeling Knowledge
  5.3 From Modeling Knowledge to Semantic Web Components
  5.4 Our Approach
    5.4.1 Create abstract biological models by using ontology
    5.4.2 From experimental Data to Database Web Services
    5.4.3 From Analysing Methods to Web Services
    5.4.4 Use OWL-S to specify Parameterisation in Computational Models
    5.4.5 Outcome
  5.5 Chapter Summary

CHAPTER 6: DESCRIPTION OF THE FRAMEWORK FOR BIOLOGICAL MODELING
  6.1 Build Biological Models in OWL
    6.1.1 Using OWL format for the Meta-model
    6.1.2 Meta-model for the Crucial Modeling Components
    6.1.3 Meta-model Uses Shared Biological Ontologies for Instantiation
    6.1.4 Use Meta-model to generate computational simulations
  6.2 Transforming Experimental Data into Semantic Web Services
    6.2.1 Transform Data Source into Relational Database
    6.2.2 Generate Java Entity Classes from Relational Database
    6.2.3 Generic Java Methods for Database Control
    6.2.4 Semantic Description for Data Web Service
  6.3 Transforming Analysis Methods to Semantic Web Services
  6.4 Web Service Composition
  6.5 Generate Simulation by using Web Service Composition Models
  6.6 Summary

CHAPTER 7: FRAMEWORK EVALUATION BY CASE STUDIES
  7.1 Model Discovery
  7.2 Model Reuse
  7.3 Automatic generation of simulations
  7.4 Model Composition
  7.5 Model Configuration
  7.6 Summary