Comprehensive Modeling Platform for Photosynthetic Organisms


Masaryk University
Faculty of Informatics

Thesis

Matej Klement
2012

Contents

1 Introduction
  1.1 Objectives
2 State of the art
  2.1 Data Exchange formats
    2.1.1 SBML
    2.1.2 CellML
    2.1.3 BioPAX
    2.1.4 PSI-MI
    2.1.5 SBGN
    2.1.6 Format of Matlab
    2.1.7 Format Octave
  2.2 Data Exchange and modeling tools
    2.2.1 Biomodels.net
    2.2.2 CellML.org
    2.2.3 Copasi
    2.2.4 Vcell
    2.2.5 E-cell
    2.2.6 ProMot
    2.2.7 PaxTools
    2.2.8 Matlab
    2.2.9 Octave
    2.2.10 Scilab
    2.2.11 BioUML
  2.3 Annotation ontology
    2.3.1 Gene Ontology
    2.3.2 KEGG
    2.3.3 SBO
  2.4 Photosynthesis modeling
3 Aims
  3.1 Theoretical aims
  3.2 Practical aims
  3.3 Methodology
  3.4 Progression schedule
  3.5 Expected Outputs
4 Results
  4.1 Design and specification
    4.1.1 Ontology tree
    4.1.2 Model structure
    4.1.3 Connecting ontology and model
    4.1.4 Annotation database
    4.1.5 Ontology and model annotation
  4.2 Implemented system
  4.3 Conclusion
5 Publications

Chapter 1
Introduction

In recent decades, a great number of computer-driven sciences have emerged as a result of the rapid development of microchip technology. One of them is systems biology, a new field of biology that aims at a system-level understanding of biological systems [14]. Molecular biology initially studied biological systems and made remarkable progress, but it has more recently focused on identifying genes and the functions of their products, which are the components of those systems. The next major task is to understand, at the system level, the components that molecular biology has revealed. Systems biology was established to pursue this long-term task.

While systems biology covers all aspects of analyzing the behavior of system models, computational systems biology addresses a narrower part of this research: it seeks a system-level understanding of biological systems by analyzing biological data with computational techniques [16]. Recent large advances in genome sequencing projects, microarrays, proteomics and metabolomics have moved the field forward, providing more powerful tools and knowledge for discovering relations and behavior in the data. With systems biology in mind, new sophisticated computational methods are being developed to analyze the data generated by these technologies systematically and to decipher the complex, networked biological processes and phenomena taking place in cells, tissues and organisms.
Thanks to recent developments in information technology, cheap and accessible computing power, global networks and databases have become widely available for the mathematical modeling and simulation of complex biological systems. Simulation and modeling combine different system analysis tools, such as discrete mathematics, stochastics, differential equations and complex system simulation, with model-database integration architectures. Creating and testing quantitative models that unravel the hierarchical and non-linear character of cellular systems will be feasible through the cooperation of theoretical and experimental biologists with system analysts, computer scientists, mathematicians, engineers and physicists. These long-term efforts demand comprehensive tools for sharing knowledge and data among the participating parties.

Because large teams of scientists around the world are now able to cooperate, the field is moving away from extremely reduced models and analyses, and large amounts of simulated data are becoming available for thousands of components such as mRNAs or proteins. Connecting these simulated data yields compact blocks of cellular machinery in action, from which dynamic models describing these processes can be built. Such comprehensive models explicitly represent a large number of biochemical reactions at a relatively high level of detail. These dynamic models, however, pose another challenge, which arises from transcribing non-linear systems into models: the estimation of numerical parameters. This problem can be solved in an inverse fashion, in which simulated data are compared with experiments by sophisticated software that searches for local and global minima in a multidimensional space.

The last decade was fruitful for systems biology and for the formats, languages and tools that handle them. Thanks to this development, many tools were created and remain in use today.
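
The inverse approach to parameter estimation mentioned in this chapter can be illustrated with a minimal sketch. It is not taken from the thesis: the one-parameter decay model dx/dt = -k x, the function names, the search interval and the value k = 0.7 are all assumptions chosen for illustration. Real models have many parameters and need a multidimensional optimizer.

```python
import math

def simulate(k, times, x0=1.0):
    """Closed-form solution of dx/dt = -k * x (hypothetical stand-in for a dynamic model)."""
    return [x0 * math.exp(-k * t) for t in times]

def sum_of_squares(k, times, observed):
    """Misfit between simulated and observed data for parameter value k."""
    return sum((s - o) ** 2 for s, o in zip(simulate(k, times), observed))

def estimate_k(times, observed, lo=0.0, hi=5.0, grid=50, tol=1e-8):
    """Coarse grid scan (global step) followed by golden-section search (local step)."""
    step = (hi - lo) / grid
    candidates = [lo + i * step for i in range(grid + 1)]
    best = min(candidates, key=lambda k: sum_of_squares(k, times, observed))
    # Refine inside the grid cell(s) around the best coarse candidate.
    a, b = max(lo, best - step), min(hi, best + step)
    inv_phi = (math.sqrt(5) - 1) / 2
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sum_of_squares(c, times, observed) < sum_of_squares(d, times, observed):
            b = d
        else:
            a = c
    return (a + b) / 2

# Synthetic "experiment": noise-free data generated with k = 0.7 (assumed value).
times = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
observed = simulate(0.7, times)
k_hat = estimate_k(times, observed)
print(round(k_hat, 4))
```

The grid scan stands in for the global part of the search and the golden-section step for the local part. With noisy measurements the recovered value would only approximate the true parameter.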
Most of the tools in this field are of a general nature; none is dedicated to photosynthesis, its modeling and its research. Although photosynthesis research may lead to renewable fuels or artificial oxygen production, current biology concentrates mainly on DNA, mRNA, proteins and similar subjects. Another drawback is that, owing to the character of several of its reactions, photosynthesis also belongs to another field, physics. This led to the formation of the CyanoTeam project, which aims to solve problems of photosynthesis. The project is carried out in cooperation with the PSI company and the Global Change Research Centre AS CR, v.v.i.

The second chapter describes the current state of systems biology, more precisely the ways of handling biological models, the tools that handle these models, and the annotations that place them in a broader context. The third chapter deals with the aims and objectives to be reached and with the steps necessary to reach them. The fourth chapter presents the current results and the state of the work, together with a description of the implementation. The fifth chapter lists the publications achieved.

1.1 Objectives

The main objective of this work is to create a tool and a methodology that give the parties participating in photosynthesis research a place to exchange and maintain dynamic models and knowledge about this process. Nedbal et al. created a concept called Comprehensive Modeling Space [18], which describes the fundamental ideas of specifying photosynthesis models.
This work should propose solutions for the Comprehensive Modeling Space concept and introduce a methodology that provides a set of rules for the correct encoding of modular models and for data composition and that suggests a naming convention for individual model components (called Comprehensive Modeling Platform). It should also include practical output in the form of an implemented application covering the visualization, sharing, exploration, maintenance, annotation and dynamic analysis of models on a generally available platform running in the web environment. The models should support both top-down and bottom-up modeling strategies. Moreover, the solution should support communication with commonly available tools and formats. The main benefit of the whole concept should be its domain-specific focus, which should make it possible to describe and understand the given area of interest better than general tools and approaches allow.

Chapter 2
State of the art

In the last decade, systems biology has improved greatly [14] thanks to overall progress in information technology, which facilitates better and faster sharing of information about the processes under study. Because systems biology is a new science, the areas researched first are those of common interest, mostly DNA, gene profiling and protein-protein interactions. Computational systems biology is concerned with a subgroup of the problems addressed by systems biology and puts its emphasis on systematic data analysis, which stems from the advance of new technologies.

It is necessary to be able to exchange this newly discovered knowledge among specialists. Several languages for systems biology were created with this intention, to share knowledge and models. Systems biology models are mostly dynamic, meaning that they describe the dynamics of the modeled system in time, and they mainly target the population development of the examined process.
The best-known and most common languages are SBML [7] and CellML [21], both of which are based on XML [8]. The advantage of these formats is that they preserve the structure and clarity of models, which is necessary for cooperation among teams and for passing on knowledge. There are other, similar formats, such as BioPAX [5], PSI-MI [11] and SBGN [25], which are aimed at a narrower part of the scope. BioPAX was developed for