University of Trento Modelling and Inference Strategies For

PhD Dissertation International Doctorate School in Information and Communication Technologies DISI - University of Trento Modelling and Inference strategies for Biological Systems Alida Palmisano Advisor: Prof. Corrado Priami Universitàdegli Studi di Trento Co-Advisor: Dott. Paola Lecca The Microsoft Research - University of Trento Centre for Computational Systems Biology December 2010 \Would you tell me, please, which way I ought to go from here?" \That depends a good deal on where you want to get to", said the Cat. \I don't much care where {" said Alice. \Then it doesn't matter which way you go", said the Cat \{so long as I get somewhere" , Alice added as an explanation. \Oh, you're sure to do that" , said the Cat, \if you only walk long enough." [Charles Lutwidge Dodgson] Abstract For many years, computers have played an important role in helping scientists to store, manipulate, and analyze data coming from many different disciplines. In recent years, however, new technological capabilities and new ways of thinking about the usefulness of computer science is extending the reach of computers from simple analysis of collected data to hypothesis generation. The aim of this work is to provide a contribution in the Computational Systems Biology field. The main purpose of this recent discipline is to enhance the intertwined relationship connecting Biology and Computer Science, by developing tools and theoretical frameworks able to formally and quantitatively investigate the interactions among the components of biological systems. The final goal of these efforts is to assemble the different pieces into a working model of a living, responding, reproducing cell; a model that can be used for performing in-silico tests and simulations in order to understand and predict possible emergent properties. In this thesis we present the application to real biological case studies of a specific concur- rent modelling language (derived by the metaphors of \molecules-as-object" {introduced by Fontana{ and \cells-as-computations" {introduced by Regev and Shapiro{ at the end of last century) and the development and implementation of a tool for inferring knowledge from experimental data in order to link the numerical aspects of a model to real wet-lab data. Keywords [systems biology, stochastic modelling, process calculi, kinetic inference, cell cycle] Contents 1 Introduction 1 1.1 The context . .1 1.2 The problem . .4 1.3 The contribution . .6 1.4 Structure of the Thesis . .8 1.5 Publications . .8 2 Preliminaries 11 2.1 Modelling of Biological Systems . 11 2.2 Inference of Biological Systems . 17 2.3 Cell cycle models . 22 I The modelling framework 29 3 Modelling with BlenX 31 3.1 The language . 31 3.2 Translation from ODEs to BlenX ....................... 38 3.3 Issue: different abstraction levels . 41 4 Budding Yeast Cell Cycle 45 4.1 Introduction . 45 4.1.1 Physiology of the cell cycle . 46 4.1.2 Molecular mechanism of cell cycle control . 48 4.1.3 Saccharomyces cerevisiae regulatory network . 51 4.2 Irreversibility study . 53 4.3 Viability study . 59 4.4 Stochastic analysis . 64 4.5 Stochastic pedigree analysis . 74 i 4.6 Preliminary conclusions . 81 II The inference framework 85 5 KInfer 87 5.1 The model for inference . 88 5.1.1 Variance of the estimated parameters . 91 5.1.2 Parameter space restriction . 92 5.1.3 Parameter inference in not - mass action models . 94 5.2 The inference tool: KInfer . 96 5.3 Some concluding considerations . 101 6 Case studies 103 6.1 Gene transcriptional regulation . 103 6.2 Didactic biochemical network . 106 6.3 Genetic regulatory network . 107 6.4 Real case studies . 109 6.4.1 Glucose metabolisms of Lactococcus lactis . 110 6.4.2 Binding affinity of IκB kinase . 112 6.4.3 A transcription control of cell cycle . 115 6.5 Other existing inference tools . 120 7 Discussions and Future work 123 8 Conclusions 129 Bibliography 137 A Appendix 153 A.1 BlenX model of Budding Yeast Cell Cycle . 153 A.1.1 .prog file . 153 A.1.2 .types file . 158 A.1.3 .func file . 158 A.2 Stochastic Simulation Algorithm . 163 A.3 Continuous Stochastic Logic and PRISM . 164 A.4 Deterministic data (Chen model) . 165 ii List of Tables 2.1 A concise picture of the mapping between biology and process calculi . 13 3.1 BlenX input files of the Budding Yeast Cell Cycle (running example) . 36 4.1 Stochastic characterization of all the available mutants described in the deterministic framework . 69 6.1 Estimated parameter values: gene transcriptional regulation (model 1) . 104 6.2 Estimated parameter values: gene transcriptional regulation (model 2) . 105 6.3 Estimated parameter values: didactical biochemical network . 106 6.5 Estimated parameter values: gene regulatory network . 109 6.6 Estimated parameter values: regulation of glycolysis in L. lactis . 110 6.7 Estimated parameter values: binding affinity of IκB kinase . 114 6.9 Estimated parameter values: Cdc20 transcription sub-network . 117 6.11 Estimates of the parameters of the Gauss error function approximating the Hill function . 119 iii List of Figures 1.1 Modelling cycle of a biological system . .5 2.1 Structure of a cell . 23 2.2 Budding yeast cells. 24 3.1 Graphical representation of a BlenX entity. 32 3.2 Intuitive behavior of some BlenX primitives. 34 3.3 Model, used as a running example for explaining the translation procedure, of the antagonism between Cdk/CycB complex and the Cdh1/APC that drives cell cycle oscillations. 35 3.4 Example of the inverse problem causing inconsistencies between the deterministic and the stochastic behaviour of a chemical reaction system. 40 3.5 Inconsistencies between different ways of unpacking the same complex regulatory mechinism. 43 4.1 Phases of the cell cycle. The evolution in time of the budding yeast case is shown. 47 4.2 A seesaw metaphor for the antagonism between Cdk/Cyclin and Cdh1/APC and Sic1 proteins that drive the cell cycle oscillations . 49 4.3 Protein waves during the cell cycle . 50 4.4 Regulatory network of cell cycle . 51 4.5 Network of the antagonistic interactions between Cdk/CycB and Cdh1/APC 54 4.6 Visual representation of noise-free oscillations of Cdk/CycB and Cdh1/APC 56 4.7 Probability plots of the behavior of Cdk/CycB and Cdh1/APC with respect to the value of a parameter of the cell cycle model . 57 4.8 Sample simulations of the wild type model . 58 4.9 Probability of a single Cdh1/APC peak between two cell divisions . 59 4.10 Graphical representation of cell cycle engine . 60 4.11 Equations for the activation/inactivation of the Cdc20 protein and the rate law for the growing of the mass . 60 v 4.12 BlenX code resulting from the translation of a subset of mathematical equations . 61 4.13 Statistics on properties of a simplified model of cell cycle . 62 4.14 Stochastic simulations for a nutritional sensitive mutant . 63 4.15 BlenX code with templates . 64 4.16 Rules that define stages of the cell cycle and arrest/error type . 66 4.17 Statistics on properties of a complete model of cell cycle . 67 4.18 For all the viable mutants, a comparison of the statistics collected from the deterministic solution of the model and stochastic runs of the same parameter sets . 68 4.19 For all the inviable mutants, a comparison of the arrest/error type deduced from the deterministic solution of the model and stochastic runs of the same parameter sets . 72 4.20 Example of an oscillating but inviable mutant . 73 4.21 Graphical representation of the idea behind spPeAnBY............ 75 4.22 Graphical representation of spPeAnBYarchitecture. 76 4.23 Pedigree analysis of the stochastic version of the Chen model . 78 4.24 Pedigree tree with dead branches . 78 4.25 Statistics on the output of the pedigree analysis of the stochastic version of the Chen model . 79 4.26 Histograms of the genealogical classification of a colony of budding yeast cells . 80 4.27 Histograms of the synchronicity of a colony of budding yeast cells . 80 5.1 Screenshots of the inference tool: KInfer . 96 5.2 Graphical representation of the general architecture of KInfer . 97 6.1 Gene transcriptional regulation (model 1) . 104 6.2 Time series of gene transcription case study (model 1) . 104 6.3 Gene transcriptional regulation (model 2) . 105 6.4 Time series of gene transcription case study (model 2) . 105 6.5 Didactical example of a biochemical network . 106 6.6 Time series of the didactical biochemical network case study . 107 6.7 Model of a gene regulatory network . 108 6.8 Time series of the gene regulatory network model . 109 6.9 Model of the pathway of regulation of glycolysis in L. lactis . 111 6.10 Time series of the pathway of regulation of glycolysis in L. lactis . 112 6.11 Model of the binding affinity of IκB kinase . 113 vi 6.13 Time series of GST-IκB-P . 114 6.14 A simplified model for the transcription of the gene Cdc20 in budding yeast 115 6.16 Time course of the Cdc20 mRNA in a simplified model for the transcription of the Cdc20 gene . 118 6.17 Solutions of the model including Cdc20 transcription process with the rate coefficients inferred from the different experimental time courses . 118 6.18 Fit of Gauss error function to Hill function . 120 vii Chapter 1 Introduction 1.1 The context For many years, computers have played an important role in helping scientists to store, manipulate, and analyze data coming from many different disciplines. In recent years, however, new technological capabilities and new ways of thinking about the usefulness of computer science is extending the reach of computers from simple analysis of collected data to hypothesis generation.

Load more