Stochastic Algorithms for Biochemical Processes
UNIVERSITÀ DEGLI STUDI DI MILANO BICOCCA
Facoltà di Scienze MM.FF.NN.
PhD Programme in Computer Science (Dottorato di Ricerca in Informatica), XXII Cycle

Stochastic algorithms for biochemical processes

Paolo Cazzaniga

Supervisor: Prof. Giancarlo Mauri
Tutor: Prof. Paola Bonizzoni
PhD Coordinator: Prof. Stefania Bandini

Academic Year 2008–2009

Acknowledgments

I would like to express my sincere gratitude to Professor Giancarlo Mauri who has been my supervisor. He gave me many helpful suggestions and important advice during the course of my PhD.

Special thanks are due to Daniela Besozzi for her constant encouragement, her useful help (on many occasions) and her friendship.

I would also like to thank Dario Pescini for his assistance with writing code, his great effort to explain things and for all the last minute rushes to airports and train stations because we were always late.

A particular thanks to Prof. Stephen Gilmore and Jane Hillston who hosted me for 6 months in Edinburgh. And I should also say thank you to my flatmates, the guys working at the LFCS and all the other people I met there.

I wish to thank the people working at DISCo for their friendship, help and for the parties we had.

I am grateful to the people who helped me during my PhD, and to my friends for the fun and laughter we shared.

Indeed, I owe a special thanks to my family for its support during this period and to Laura who believed in me and supported me throughout my PhD.

Contents

Introduction
1 Stochastic algorithms for the simulation of biochemical systems
  1.1 Gillespie's stochastic simulation algorithm
  1.2 The next reaction method
  1.3 The tau-leaping algorithm
  1.4 The next subvolume method
  1.5 The binomial tau-leap spatial stochastic simulation algorithm
  1.6 Other algorithms
2 Optimization algorithms
  2.1 Genetic algorithms
  2.2 Particle swarm optimizer
3 Membrane systems
  3.1 Basic notions of membrane systems
  3.2 Dynamical probabilistic P systems
  3.3 A DPP application: metapopulation systems
    3.3.1 The modelling framework
    3.3.2 Simulations and results
    3.3.3 Discussion
4 τ-DPP
  4.1 Tau-leaping procedure in DPPs
  4.2 τ-DPP algorithm
  4.3 A test case for τ-DPP
  4.4 τ-DPP with size associated to volumes and objects
  4.5 A test case for Sτ-DPP
  4.6 Implementation of the algorithms
  4.7 Discussion and future developments
5 Modelling chemical and biological systems
  5.1 The Ras/cAMP/PKA signalling pathway in the yeast Saccharomyces cerevisiae
    5.1.1 The Ras/cAMP/PKA pathway description
    5.1.2 The stochastic model
    5.1.3 The Ras2•GTP generation module
    5.1.4 Generation of cAMP, coupling with PKA and feedback loops
    5.1.5 Discussion
  5.2 The repressilator: a genetic oscillators coupled with a quorum sensing mechanism
    5.2.1 A multivolume model for coupled genetic oscillators
    5.2.2 Results of simulations
    5.2.3 Discussion and future developments
  5.3 First steps toward a wet implementation for τ-DPP
    5.3.1 Chemical computing
    5.3.2 Definition and simulation of component reaction networks using τ-DPP
      5.3.2.1 The NAND and XOR logic circuits
      5.3.2.2 Fredkin gates and circuits
      5.3.2.3 The SUB instruction
      5.3.2.4 The SUBADD module
    5.3.3 Discussion and open problems
  5.4 A study on the combined interplay between stochastic fluctuations and the number of flagella in bacterial chemotaxis
    5.4.1 The modeling of bacterial chemotaxis
    5.4.2 Stochastic simulations of chemotactic response regulator
    5.4.3 The interplay between stochastic fluctuations and the number of bacterial flagella
    5.4.4 Discussion
6 The role of parameters in chemical and biological systems
  6.1 Parameter estimation of biochemical systems
  6.2 Parameter sweep application
  6.3 Fitness function
  6.4 A comparison of GAs and PSO for parameter estimation in stochastic biochemical systems
    6.4.1 Systems of biochemical reactions
    6.4.2 GAs and PSO settings
    6.4.3 Experimental results
    6.4.4 Discussion
  6.5 Stochastic simulations on a grid framework for parameter sweep applications in biological models
    6.5.1 The EGEE grid platform
    6.5.2 PSA over the grid
    6.5.3 Bacterial chemotaxis: a case study
    6.5.4 Results
    6.5.5 Performance discussion
    6.5.6 Conclusion
7 Conclusion and future work
Bibliography

Introduction

Aims and motivations

After the completion of the human genome sequencing (and of many other genomes), the main challenge for modern biology is to understand complex biological processes such as metabolic pathways, gene regulatory networks and cell signalling pathways, which are the basis of the functioning of living cells. This goal can only be achieved by using mathematical modelling tools and computer simulation techniques to integrate experimental data and to make predictions about the system behaviour that will then be checked experimentally, so as to gain insights into the workings and the general principles of organization of biological systems.

In fact, the formal modelling of biological systems allows the development of simulators, which can be used to understand how the described system behaves in normal conditions, and how it reacts to (simulated) changes in the environment or to alterations of some of its components. Simulations present many advantages over conventional experimental biology in terms of cost, ease of use and speed.
For instance, some experiments that are infeasible in vivo can be conducted in silico; e.g., it is possible to knock out many vital genes from the cells and monitor their individual and collective impact on cellular metabolism. Evidently, such experiments cannot be done in vivo because the cell may not survive. Therefore, the development of predictive in silico models offers opportunities for unprecedented control over the system.

In the last few years a wide variety of models of cellular processes have been proposed, based on different formalisms. For example, chemical kinetic models attempt to represent a cellular process as a system of distinct chemical reactions. In this case, the network state is defined by the instantaneous quantity (or concentration) of each molecular species of interest in the cell, and different molecular species may interact via one or more reactions. Usually, reactions are represented by a system of coupled differential equations that relate the quantity of reactants to the quantity of products, according to a reaction rate and other parameters.

Recently, it has been pointed out that transcription, translation and other cellular processes may not behave deterministically but instead are better modelled as random events [123]. Several models address this concern by abandoning differential equations in favour of stochastic relations that describe each chemical reaction in terms of molecular collisions [6, 74]. The use of stochastic methods for the study of biological systems is motivated by the fact that these systems are usually composed of many chemical interactions among a large number of chemical species, whereby the molecular quantities involved can be small (a few tens of molecules). In systems having these characteristics, noise plays a major role in the system's dynamics [128].

Two different kinds of noise can be identified in biological systems: extrinsic and intrinsic [59, 176].
The first one is related to the experimental conditions; for instance, variations of temperature, pressure and light, or fluctuations of other cellular factors, are all sources of extrinsic noise. On the other hand, there are stochastic events occurring during the processes of gene expression, at the level of transcription, translation and protein degradation, which result in intrinsic noise.

The role of noise in biological systems has been studied and quantified [59, 176, 192], proving that the classic deterministic and continuous approach (like, for instance, ordinary differential equations) is unsuitable for the modelling, simulation and analysis of phenomena like cellular pathways, especially when the gene transcription and translation machinery is involved.

The deterministic approach is based on the law of mass action, an empirical law which provides a simple relation between reaction rates and molecular species concentrations. Given the initial molecular concentrations, by using the law of mass action, a temporal description of the component concentrations can be obtained. The law of mass action considers chemical reactions to be macroscopic, continuous and deterministic.

However, in the study of "small" systems, the law of mass action becomes inadequate, and it is more suitable to apply stochastic approaches, since (1) they take into account the discreteness of the quantity of the molecular species and the inherently random character of the phenomena; (2) they are in accordance with the theories of thermodynamics and stochastic processes; and (3) they are appropriate for the description of systems characterised by instability phenomena [198].

On the other hand, one major problem related to stochastic methods is that they are difficult to treat analytically; hence, they are implemented by means of numerical simulations, which are usually computationally very expensive.
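
The contrast between the mass-action description and a stochastic one can be illustrated with a minimal sketch. It is not the implementation developed in this thesis: it assumes a single hypothetical first-order reaction A -> (nothing) with rate constant k, and it samples trajectories with Gillespie's direct method. The function names and the parameter values (20 molecules, k = 1, observation time 1) are chosen only for illustration.

```python
import math
import random


def mass_action_decay(n0, k, t):
    """Deterministic, continuous prediction for A -> (nothing): dn/dt = -k * n."""
    return n0 * math.exp(-k * t)


def ssa_decay(n0, k, t_end, rng):
    """One stochastic, discrete trajectory of A -> (nothing), sampled with
    Gillespie's direct method; returns the molecule count at time t_end."""
    t, n = 0.0, n0
    while n > 0:
        propensity = k * n                # probability per unit time that some molecule reacts
        t += rng.expovariate(propensity)  # exponentially distributed waiting time
        if t > t_end:
            break                         # the next reaction would fire after the observation time
        n -= 1                            # one molecule of A is consumed
    return n


if __name__ == "__main__":
    rng = random.Random(1)
    n0, k, t_end = 20, 1.0, 1.0  # illustrative values only (assumption)
    runs = [ssa_decay(n0, k, t_end, rng) for _ in range(10)]
    print("mass action:", mass_action_decay(n0, k, t_end))
    print("stochastic runs:", runs)
```

For this first-order reaction the average over many stochastic runs agrees with the mass-action curve, while individual runs scatter around it in whole-molecule steps. The relative size of that scatter grows as the initial number of molecules shrinks, which is the regime of "small" systems discussed above.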
In this thesis we provide a discrete and stochastic framework for the modelling, simulation and analysis of biological and chemical systems. To this aim, we will start from the description of other techniques and methods that are present in the literature, as the basis on which to build our approach and against which to compare it.