Standards and Tools for Model Exchange and Analysis in Systems Biology

Standards and Tools for Model Exchange and Analysis in Systems Biology Ralph Gauges Dissertation submitted to the Combined Faculties for the Natural Sciences and for Mathematics of the Ruperto-Carola University of Heidelberg, Germany for the degree of Doctor of Natural Sciences presented by Diplom-Biochemiker Ralph Gauges born in: Sigmaringen, Germany Oral-examination: 07/11/2011 Standards and Tools for Model Exchange and Analysis in Systems Biology Referees: Prof. Dr. Ursula Kummer Dr. Rebecca Wade Contents Zusammenfassung vii Summary x Abbreviations xvii 1 Introduction 1 2 Materials & Methods 19 2.1 Operating Systems . 19 2.2 Programming Languages . 20 2.3 Unit Testing . 24 2.4 Debugging & Profiling Tools . 26 2.5 Libraries & Standards . 29 3 SBML Layout & Render Extension 46 3.1 SBML & Diagrams . 46 3.2 Alternative Diagram Formats . 47 3.3 Design & History . 49 3.4 The SBML Layout Extension Specification . 51 3.5 Implementation Of The Layout Extension . 57 3.6 The SBML Render Extension . 61 3.7 Third Party Implementations . 86 3.8 The SBML Layout And Render Extension In NF-κB Modeling 87 4 Standards In COPASI 93 4.1 SBML Support In COPASI . 93 4.2 Layout And Render Information In COPASI . 112 4.3 Graphical Display Of Time Course Simulation Data . 115 4.4 Graphical Display Of Elementary Modes . 116 4.5 COPASI Language Bindings . 117 4.6 NF-κB Modeling with COPASI . 120 i ii CONTENTS 4.7 Work Contributions . 128 5 Expression Normalization 130 5.1 Normal Form Classes . 131 5.2 Expression Tree Classes . 137 5.3 Normalization Algorithm . 142 5.4 Testing . 150 5.5 Normalizing BioModels Expressions . 150 5.6 Finding Kinetic Laws For The NF-κB Model . 153 6 Discussion 157 6.1 Layout And Render Information In SBML . 157 6.2 Standards In COPASI . 160 6.3 Expression Normalization . 164 Acknowledgments 171 List of Figures 1.1 example of a simple reaction network . .6 1.2 alternative model specifications . .8 1.3 reaction display and differential equation display in COPASI . 10 1.4 electrical circuit diagram vs. SBGN diagram . 14 2.1 XML structure example . 33 2.2 Resource Description Framework example . 35 2.3 SBML Document structure . 39 2.4 Scalable Vector Graphics example . 40 2.5 SVG rendering example . 41 3.1 diagram examples . 49 3.2 SBML XML structure example . 52 3.3 model structure vs. layout structure . 53 3.4 example of role attribute usage . 54 3.5 bounding box example . 55 3.6 line segment example . 55 3.7 cubic bézier example . 55 3.8 curve definition example . 56 3.9 layout rendered with different styles . 57 3.10 SBML layout classes . 59 3.11 embedding render information in SBML documents . 63 3.12 RGB and RGBA value specification . 64 3.13 color definition example . 65 3.14 linear gradient example . 65 3.15 radial gradient example . 66 3.16 example for association by type . 67 3.17 example for association by role . 68 3.18 example for association by id . 69 3.19 render primitives . 71 3.20 render curve example . 71 iii iv LIST OF FIGURES 3.21 stroke example . 72 3.22 line ending example . 73 3.23 application of line ending example . 74 3.24 example for enableRotationalMapping attribute . 75 3.25 polygon example . 76 3.26 fill style example . 76 3.27 rectangle example . 77 3.28 rectangle with rounded corners . 77 3.29 ellipse example . 77 3.30 circle example . 78 3.31 text element example . 78 3.32 bitmap example . 79 3.33 complex style example . 80 3.34 coordinate schema . 81 3.35 relative vs. absolute coordinates . 81 3.36 transformation matrix example . 82 3.37 render extension classes . 83 3.38 render extension implementations . 85 3.39 CellDesigner diagram for NF-κB activation model . 89 3.40 display of imported CellDesigner layout information . 91 4.1 SBML data structure vs. COPASI data structure . 98 4.2 SBML test suite results . 106 4.3 SBML online test suite results . 107 4.4 simulator comparison website . 109 4.5 stochastic test suite results . 110 4.6 COPASI screen shot for MIRIAM annotation . 111 4.7 layout structure in COPASI files . 113 4.8 render classes in COPASI . 113 4.9 COPASI layout rendering . 114 4.10 time course animation . 116 4.11 graphical elementary mode display . 117 4.12 Results of elementary mode analysis of initial NF-κB model . 123 4.13 diagram of extended NF-κB model . 124 4.14 Results of elementary mode analysis of extended NF-κB model 125 4.15 time course data plot from COPASI . 126 4.16 several animation frames for a time series of the NF-κB model 127 5.1 structure of normalized expression . 132 5.2 normal form data structure classes . 133 5.3 CNormalFraction structure . 133 LIST OF FIGURES v 5.4 CNormalSum structure . 134 5.5 CNormalProduct structure . 134 5.6 CNormalItemPower structure . 135 5.7 CNormalGeneralPower structure . 135 5.8 CNormalChoice structure . 136 5.9 CNormalLogical structure . 137 5.10 CNormalChoiceLogical structure . 137 5.11 expression tree node classes . 138 5.12 expression tree example . 139 5.13 choice node example . 141 5.14 function expansion example . 143 5.15 normalization process . 144 5.16 operations on numbers . 146 5.17 example for the canceling operation . 146 5.18 conversion of CNormalGeneralPower .............. 148 5.19 result of BioModels analysis . 152 6.1 ambiguous rate law example . 166 6.2 comparison result output . 167 6.3 SBO term SBO:0000054 (screen shot) . 169 List of Tables 1.1 Definition of "Omics"-terms . .4 2.1 SWIG language support . 44 3.1 role attribute values . 54 3.2 predefined glyph types . 67 4.1 supported SBML versions . 95 4.2 language binding versions . 119 5.1 types for constant nodes . 139 5.2 functions represented by function nodes . 140 5.3 logical node types . 141 vi Zusammenfassung Die Systembiologie hat sich in den letzten Jahren zu einem wichtigen Werk- zeug der biochemischen Forschung entwickelt. Systembiologie beschäftigt sich mit der Simulation und Analyse biologischer und biochemischer Vorgänge mit Hilfe von Computern. Diese Vorgänge werden in rechnerischen Modellen beschrieben, wobei verschiedene mathematische Formalismen zum Einsatz kommen. Durch die Simulation solcher Modelle ist es meist möglich, Vorher- sagen über das Verhalten der repräsentierten Systeme zu machen und so gezielt Experimente zu planen, um diese Vorhersagen zu bestätigen. Dadurch besteht die Möglichkeit, ein besseres Verständnis über das zu untersuchende System zu erlangen bzw. existierende Hypothesen bezüglich des Systems zu überprüfen. Oft kann auch aufgrund der gezielteren Planung von Experi- menten durch den Einsatz systembiologischer Methoden kostbare Zeit im Labor eingespart werden. Für die Analyse und Simulation solcher Modelle wird geeignete Software benötigt und viele Forschungsgruppen auf der ganzen Welt sowie eine Reihe von Firmen arbeiten an der Entwicklung solcher Programme. Die meisten dieser Programme widmen sich gezielt einem bestimmten Aspekt der systembiologischen Forschung. So ermöglichen manche Programme die Simulation von Modellen, während andere Programme Forscher bei der Erstellung der mathematischen Repräsentation unterstützen und wieder andere Programme sich auf die Visualisierung der verschiedenen Analyseergebnisse spezialisieren. Da es vermutlich kein Programm gibt, welches sämtliche Arbeitsschritte unterstützt, die zur Erstellung, Simulation und umfassenden Analyse dieser Modelle nötig sind, sind Systembiologen bei ihrer Arbeit oft auf eine ganze Reihe von Programmen angewiesen. Um sinnvoll mit dieser Vielzahl an Pro- grammen arbeiten zu können ist es notwendig, dass die einzelnen Programme den Austausch von Daten und insbesondere Modellen unterstützen. Dies war zu Beginn dieser Arbeit keinesfalls selbstverständlich, sondern eher die Aus- vii viii ZUSAMMENFASSUNG nahme als die Regel. Erst die Entwicklung bestimmter Standards auf dem Gebiet der System- biologie änderte die Situation grundlegend. Mit Hilfe dieser Standards ist es nun möglich, Daten und Modelle einmal zu erstellen, um sie anschließend in verschiedenen Programmen zu simulieren, zu analysieren oder zur Visuali- sierung zu verwenden. Ein grosser Teil dieser Arbeit beschäftigt sich dementsprechend auch mit der Entwicklung, der Erweiterung und der Implementierung solcher Stan- dards in Form verschiedener Computerprogramme. Ein solches Programm, welches Forscher bei der Erstellung und Analyse von Modellen biochemischer Reaktionsnetzwerke unterstützt, ist das von unserer Gruppe mitent- wickelte Programm COPASI. Teile dieser Arbeit beschreiben somit auch die Implementierung einiger wichtiger Standards im Rahmen der Entwick- lung von COPASI. Besonders die von mir entwickelte Erweiterung der Sys- tem Biology Markup Language (SBML) um die Möglichkeit grafische Dar- stellungen von Modellen zusammen mit der mathematischen Beschreibung abzuspeichern bzw. auszutauschen nimmt hierbei eine zentrale Rolle ein. Die vorliegende Arbeit beschreibt sowohl die Entwicklung dieser grafischen Erweiterung als auch die Implementierung des zugrundeliegenden SBML Formats sowie der Erweiterung in Form verschiedener Werkzeuge für die systembiologische Forschung. Eine weitere wichtige Rolle spielt die Ver- wendung der grafischen Erweiterung zur Visualisierung von Simulations- und Analyseergebnissen und die Erstellung entsprechender Visualisierungswerk- zeuge in COPASI. Der dritte Teil dieser Arbeit beschäftigt sich ferner mit der Analyse mathematischer Ausdrücke wie sie oft im Zusammenhang mit Modellen biochemischer Reaktionsnetzwerke verwendet werden. Die beschriebene Ana- lysemethode dient dazu, mathematische Ausdrücke umzuformen (normali- sieren),.

Standards and Tools for Model Exchange and Analysis in Systems Biology

Web-Link Name Reference DRSC Flockhart I, Booker M, Kiger A, Et Al.: Flyrnai: the Drosophila Rnai Screening Center Database

Mendel® SBML Editor User Guide Table of Contents

A Bayesian Inference Transcription Factor Activity Model for the Analysis of Single-Cell Transcriptomes

Data Management in Systems Biology I

Tellurium: a Python Based Modeling and Reproducibility Platform for Systems Biology

COPASI Documentation VERSION 4.3 (BUILD 25) 1 / 113

A Modular Mathematical Model of Exercise-Induced Changes In

CYCLONET—An Integrated Database on Cell Cycle Regulation And

View

Comprehensive Modeling Platform for Photosynthetic Organisms

Synthetic Animal: Trends in Animal Breeding and Genetics

COPASI Documentation I