Computational Methodology for Enhanced Sensitivity Analysis of Gene Regulatory Networks
A dissertation presented by Mattia Petroni to The Faculty of Computer and Information Science in partial fulfilment of the requirements for the degree of Doctor of Philosophy in the subject of Computer and Information Science

Ljubljana,

The thesis was supported by the national postgraduate programme Higher Education National Scheme (Inovativna shema za sofinanciranje doktorskega študija za spodbujanje sodelovanja z gospodarstvom in reševanja aktualnih družbenih izzivov - generacija Univerza v Ljubljani), financed by the European Union (EU), University of Ljubljana and Slovenian Ministry of Higher Education, Science and Technology.

APPROVAL

I hereby declare that this submission is my own work and that, to the best of my knowledge and belief, it contains no material previously published or written by another person nor material which to a substantial extent has been accepted for the award of any other degree or diploma of the university or other institute of higher learning, except where due acknowledgement has been made in the text.

Mattia Petroni, April

The submission has been approved by

dr. Miha Moškon, Assistant Professor of Computer and Information Science, advisor
dr. Roman Jerala, Professor of Molecular Biology and Biochemistry, external examiner
dr. Miha Mraz, Professor of Computer and Information Science, examiner
dr. Bojan Orel, Professor of Mathematics, examiner
dr. Nikolaj Zimic, Professor of Computer and Information Science, examiner

PREVIOUS PUBLICATION

I hereby declare that the research reported herein was previously published/submitted for publication in peer reviewed journals or publicly presented at the following occasions:

[] M. Petroni, N. Zimic, M. Mraz and M. Moškon. Stochastic simulation algorithm for gene regulatory networks with multiple binding sites. Journal of Computational Biology, volume , number , pages –, , doi: ./cmb.., Mary Ann Liebert Inc.
[] M. Moškon, J. Bordon, M. Mraz, N. Zimic and M. Petroni. Computational approaches in synthetic and systems biology. Chapter in Recent Advances in Systems Biology Research, Pages –, Editors: André X. C. N. Valente, Abhijit Sarkar and Yuan Gao. , ISBN: ----, Nova Science Publishers.
[] M. Petroni, M. Moškon. A nested stochastic simulation algorithm for the gene regulatory network of the Epstein-Barr virus. Abstract at th Workshop on Algorithms in Bioinformatics, September , , Wroclaw, Poland.
[] M. Petroni, N. Zimic, M. Mraz and M. Moškon. Multiscale stochastic simulation algorithm for complex gene regulatory networks. Abstract in th CFGBC Symposium From arrays and sequencing to understanding diseases: book of abstracts, Pages . Ljubljana,
[] M. Petroni, N. Zimic, M. Mraz and M. Moškon. A parallel algorithm for stochastic multiscale simulation of gene regulatory networks with multiple binding sites. Book of abstracts at th CFGBC Symposium, pages . Ljubljana, June - July ,

I certify that I have obtained written permission from the copyright owner(s) to include the above published material(s) in my thesis. I certify that the above material describes work completed during my registration as a graduate student at the University of Ljubljana.

University of Ljubljana
Faculty of Computer and Information Science
Mattia Petroni
Computational methodology for enhanced sensitivity analysis of gene regulatory networks

ABSTRACT

Biological computing is heading towards a new era of processing platforms based on the bio-logical computer structures that lie at the heart of biological systems with information processing capabilities. These bio-logical computer structures are mostly based on gene regulatory networks, mainly because their dynamics resemble the functioning of computer logic structures. The use of these bio-structures is still in its early days, since they are for the time being far less effective than their silicon counterparts.
However, they can already be exploited for a wide range of applications, including pharmacological, medical and industrial ones. To develop such applications, a precise design based on computational modelling is vital to their implementation.

Gene regulatory networks can be described as chemically reacting systems. The dynamics of such systems is defined at the molecular level by a set of interacting reactions. The stochastic simulation algorithm can be used to generate the time evolution trajectory of each chemical species by firing each reaction according to a Monte Carlo experiment. The main shortcoming of this approach is its computational complexity, which increases linearly with the total number of reactions that have to be simulated. When the number of reactions becomes too high, the stochastic simulation algorithm becomes impracticable. This is the case for certain gene regulatory networks, which can either be found in nature or be artificially constructed. An additional problem is that reactions in such networks often occur at different time scales, which can differ by many orders of magnitude. Such a scenario occurs when gene regulatory networks contain multiple cis-regulatory binding sites, to which different transcription factors are able to bind non-cooperatively. Transcription factor binding occurs much faster than the average reaction in gene expression, so this time-scale gap needs to be accounted for in the simulation. Moreover, transcription control can be affected by specific dispositions of the bound transcription factors, which can only be simulated if all the reactions that can produce these dispositions are defined. The number of such reactions increases exponentially with the number of binding sites.
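The stochastic simulation algorithm described above can be illustrated with a minimal sketch of Gillespie's direct method. It is not the implementation used in the thesis: the function name `gillespie_direct`, the single-species birth-death model and the rate constants `K_PROD` and `K_DEG` are assumptions chosen only for illustration. Each step draws an exponentially distributed waiting time from the total propensity and then selects one reaction with probability proportional to its propensity. The linear scan over the reactions is the part whose cost grows with the number of reactions.

```python
import math
import random


def gillespie_direct(x0, stoich, propensities, t_end, rng):
    """Direct-method SSA; returns the event times and the visited states."""
    t, x = 0.0, list(x0)
    times, states = [t], [tuple(x)]
    while True:
        a = propensities(x)          # one propensity per reaction
        a0 = sum(a)                  # total propensity
        if a0 <= 0.0:                # no reaction can fire any more
            break
        # Waiting time to the next reaction is exponential with rate a0.
        tau = -math.log(1.0 - rng.random()) / a0
        if t + tau > t_end:
            break
        t += tau
        # Select reaction j with probability a[j] / a0 (linear scan).
        threshold, acc, j = rng.random() * a0, 0.0, 0
        for j, aj in enumerate(a):
            acc += aj
            if acc > threshold:
                break
        # Apply the state change of the selected reaction.
        x = [xi + d for xi, d in zip(x, stoich[j])]
        times.append(t)
        states.append(tuple(x))
    return times, states


# Hypothetical birth-death model of one protein species (illustrative values):
#   production:   0 -> X   with propensity K_PROD
#   degradation:  X -> 0   with propensity K_DEG * X
K_PROD, K_DEG = 10.0, 0.5
STOICH = [(+1,), (-1,)]


def birth_death_propensities(x):
    return [K_PROD, K_DEG * x[0]]


times, states = gillespie_direct(
    [0], STOICH, birth_death_propensities, 50.0, random.Random(1)
)
```

A single trajectory is noisy, so in practice several independent runs with different seeds would be averaged before drawing conclusions.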
In order to decrease the time complexity of the stochastic simulation algorithm for such gene regulatory networks, an alternative algorithm called the dynamic multi-scale stochastic algorithm (DMSSA) is proposed, in which the reactions involved in transcription regulation can be simulated independently by performing the stochastic simulation algorithm in a nested fashion. This requires that the set of reactions describing the gene regulatory network can be divided into two subsets, i.e. a set of "fast" reactions, which occur frequently on a short time scale, and a set of "slow" reactions, which occur less frequently on longer time scales. This thesis demonstrates the equivalence between this approach and the standard stochastic simulation algorithm and shows its capabilities on two gene regulatory models that are commonly used as examples in systems and synthetic biology.

The thesis focuses on how to identify the input parameters of multi-scale models that affect the system the most. This is common practice during the design of bio-logical structures and can be achieved with sensitivity analysis. Such an analysis may be difficult to carry out for complex reaction networks exhibiting different time scales. To cope with this issue, an alternative computation of the elementary effects in the Morris screening method is proposed, which is able to sort all the model parameters in order of importance, independently of their structural or time scale definitions, i.e. to determine which parameter has the largest influence on the response of the model.

To ease the use of the simulation algorithm and to perform the sensitivity analysis, the thesis presents ParMSSA, an OpenCL based engine for performing parallel stochastic simulations on multi-core architectures. ParMSSA aims to accelerate the simulations performed with the proposed approach.
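The Morris screening method referred to in this abstract ranks parameters by their elementary effects, i.e. the change in the model response when one parameter is moved by a fixed step while the others are held constant. The following is a minimal sketch of the standard one-at-a-time formulation, not the alternative computation proposed in the thesis. The names `morris_screening` and `toy_model` are hypothetical, the parameters are assumed to be scaled to the unit interval, and the model is assumed to be deterministic. For a stochastic model, each evaluation would presumably be an average over several simulation runs.

```python
import random
import statistics


def morris_screening(model, n_params, n_traj=20, levels=4, rng=None):
    """Return (mu_star, sigma) of the elementary effects of each parameter."""
    rng = rng or random.Random(0)
    # Common choice of step size for a grid with an even number of levels.
    delta = levels / (2.0 * (levels - 1))
    # Base levels are restricted so that x + delta stays inside [0, 1].
    base_levels = [i / (levels - 1) for i in range(levels // 2)]
    effects = [[] for _ in range(n_params)]
    for _ in range(n_traj):
        x = [rng.choice(base_levels) for _ in range(n_params)]
        y = model(x)
        order = list(range(n_params))
        rng.shuffle(order)           # perturb parameters in random order
        for i in order:
            x_next = list(x)
            x_next[i] += delta       # move one parameter at a time
            y_next = model(x_next)
            effects[i].append((y_next - y) / delta)
            x, y = x_next, y_next    # continue the trajectory from here
    # mu_star: mean absolute effect (importance);
    # sigma: spread of the effects (non-linearity or interactions).
    mu_star = [statistics.fmean([abs(e) for e in ee]) for ee in effects]
    sigma = [statistics.stdev(ee) for ee in effects]
    return mu_star, sigma


def toy_model(x):
    # Hypothetical linear response: parameter 0 matters most, parameter 1 not at all.
    return 3.0 * x[0] + 0.0 * x[1] + 1.0 * x[2]


mu_star, sigma = morris_screening(toy_model, n_params=3)
ranking = sorted(range(3), key=lambda i: mu_star[i], reverse=True)
```

Sorting the parameters by `mu_star` gives the order of importance. A large `sigma` for a parameter would indicate that its effect depends on the values of the other parameters.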
ParMSSA can run multiple instances of DMSSA concurrently, which is usually needed to reduce the noise in the results of stochastic simulations. ParMSSA also provides a framework for performing the Morris screening experiment on reaction networks, which allows users to carry out the sensitivity analysis of observed systems. The simulation results provided by ParMSSA can be easily interpreted and can be used to assess the robustness of the bio-logical computer structures.

The proposed algorithms and the proposed simulation engine were applied to two case studies, i.e. the Epstein-Barr virus genetic switch and the synthetic repressilator with multiple transcription factor binding sites. The results of the sensitivity analysis of the repressilator revealed that larger numbers of binding sites increase the robustness of the system and thus the robustness of the oscillatory behaviour.

Key words: stochastic modelling, multi-scale modelling, sensitivity analysis, systems biology, synthetic biology, stochastic simulation algorithm, multiple transcription factor binding sites

University of Ljubljana
Faculty of Computer and Information Science
Mattia Petroni
Računalniško podprta metodologija za analizo občutljivosti večnivojskih stohastičnih modelov bioloških preklopnih gradnikov (Computer-aided methodology for the sensitivity analysis of multi-scale stochastic models of biological switching structures)

POVZETEK (ABSTRACT)

Biological computing is based on processing platforms that derive from biological switching structures capable of processing information. These structures are mostly based on gene regulatory networks (GRNs). Their dynamics resemble the operation of computer switching elements. The use of biological switching structures is currently still in its infancy, since their efficiency is far lower than that of their silicon equivalents. Nevertheless, their applications already