Parallel Bifurcation Analysis Parametrized
Masaryk University
Faculty of Informatics

Jakub Kadlecaj

Parallel Bifurcation Analysis in Parametrized Boolean Networks

Master's thesis

Brno, Spring 2020

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Jakub Kadlecaj

Acknowledgements

I thank the supervisor of this thesis, Luboš Brim, for his guidance over the last two years. I thank Samo Pastva for his consultations, not only regarding this thesis. I thank David Šafránek, Nikola Beneš, and the other members of Sybila for teaching me so much. I also thank my family for their everlasting support.

Abstract

Boolean networks provide a useful modelling tool for various phenomena from science and engineering. Any long-term behavior of a Boolean network eventually converges to a so-called attractor. Depending on various logical parameters, the structure and quality of attractors can undergo a significant change, known as a bifurcation. In this thesis I present Aeon, a tool for automatic analysis of attractor bifurcations in parametrized Boolean networks with asynchronous semantics.
Keywords

Boolean networks • attractors • bifurcation analysis

Contents

1 Introduction
  1.1 Historical background
  1.2 Novelty of the approach
2 Preliminaries
  2.1 Non-parametrized Boolean networks
  2.2 Parametrized Boolean networks
  2.3 Problem statement
3 The algorithm
  3.1 Semi-symbolic representation
  3.2 Computing valid parametrizations
  3.3 Constructing the parametrized state transition graph
  3.4 Attractor detection
  3.5 Attractor classification
4 The tool and its features
  4.1 Getting the tool running
  4.2 Creating and editing models
  4.3 File import and export
  4.4 Model analysis
  4.5 The help panel and the manual
5 Implementation
  5.1 Client
  5.2 Compute engine
  5.3 The server and its interface
6 Evaluation
7 Conclusion
A Aeon file format specification
B The Aeon manual

1 Introduction

The Boolean network is a relatively simple and intuitive tool useful in the mathematical modelling of a variety of phenomena from science and engineering. Nevertheless, even though it consists of just a set of Boolean variables and functions operating on them, the Boolean network can exhibit very complex behavior.

The goal of this thesis is to design and implement a parallel software tool that is able to discover and classify the attractors of Boolean networks depending on various logical parameters. These attractors are sets of states toward which the network evolves, and so they represent the long-term behavior of the Boolean network. The first three chapters of this work deal with the theoretical aspects of Boolean networks along with the computational methods used; the remaining four chapters describe the tool's usage, the details of its implementation, and an evaluation of its performance.
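
To make the notions of a Boolean network and its attractors concrete, the following minimal Python sketch enumerates the attractors of a tiny, non-parametrized network by brute force. It is an illustration only and not the algorithm used by Aeon: the function names and the two example networks are hypothetical, the asynchronous semantics is assumed to mean that each step updates exactly one variable, an attractor is taken to be a terminal strongly connected component of the resulting state transition graph, and the labels "stability" and "oscillation" are applied coarsely by attractor size.

```python
from itertools import product


def successors(state, update_fns):
    """Asynchronous step (assumed semantics): update exactly one variable."""
    result = set()
    for i, f in enumerate(update_fns):
        new_value = bool(f(state))
        if new_value != state[i]:
            result.add(state[:i] + (new_value,) + state[i + 1:])
    return result


def reachable(start, update_fns):
    """All states reachable from `start`, including `start` itself."""
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for nxt in successors(current, update_fns):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(seen)


def attractors(update_fns):
    """Terminal strongly connected components of the state transition graph."""
    states = list(product((False, True), repeat=len(update_fns)))
    reach = {s: reachable(s, update_fns) for s in states}
    found = set()
    for s in states:
        # s lies in an attractor iff every state reachable from s can return to s
        if all(s in reach[t] for t in reach[s]):
            found.add(reach[s])
    return found


def classify(attractor):
    """Coarse label assumed for this sketch: one state vs. several states."""
    return "stability" if len(attractor) == 1 else "oscillation"


# Hypothetical network 1: x0 is inhibited by x1, x1 is activated by x0.
oscillating = [lambda s: not s[1], lambda s: s[0]]
# Hypothetical network 2: x0 = x0 OR x1, x1 = x0 AND x1.
stable = [lambda s: s[0] or s[1], lambda s: s[0] and s[1]]

for name, network in (("oscillating", oscillating), ("stable", stable)):
    for attractor in attractors(network):
        print(name, classify(attractor), sorted(attractor))
```

Explicit enumeration visits all 2^n states, so it is feasible only for very small networks; it is meant to fix the terminology, not to suggest a practical method.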
1.1 Historical background

The concept of the Boolean network can be viewed as an amalgamation of various, often historically unrelated, formalisms, such as neural networks, cellular automata, and gene regulatory networks. In 1943, Warren McCulloch and Walter Pitts came up with their landmark computational calculus for the mathematical modelling of neural activity, which garnered a great deal of attention and is known today as the neural network. Active research in this area led Stephen Kleene to investigate the now ubiquitous finite-state automata. [5, 12]

These ideas then led Kauffman and Thomas independently to use Boolean networks in the context of biological modelling in the late 1960s. [5, 10] Their interpretations are sometimes referred to as Kauffman networks and Thomas networks, respectively. Kauffman's gene regulatory networks describe interactions between genes. A gene can be either turned on or off, and its value is updated by a logical function that takes the values of the other genes as input. Kauffman realized that "because the net has a finite number of states, as it proceeds through a sequence of states, it must be trapped in a re-entrant cycle of states," [10] and so exactly described the concept of attractors. However, such simplified modelling has been known since at least 1961, when Jacob and Monod modeled genes as binary devices. [1, 9]

The theory of cellular automata comes from another perspective. In the late 1940s, John von Neumann became interested in formal self-replicating machines. Stanislaw Ulam, his friend and then-colleague at the Manhattan Project, suggested using a discrete grid of finite states to better formulate the problem. Von Neumann succeeded in his endeavor by constructing a self-replicating, two-dimensional cellular automaton with 29 states.
Regrettably, his work was published only posthumously, in 1966, by Arthur Burks. [17]

Later, in the 1980s, Stephen Wolfram conducted his seminal research on cellular automata in their modern form: a typically infinite n-dimensional grid of binary states called cells, which evolves by updating each cell according to some rule, that is, a function of the cell's state and its neighborhood. The similarities between this interpretation of cellular automata and Boolean networks are obvious; the main difference is the finiteness of Boolean networks. Moreover, Wolfram introduced a classification of cellular automata, based on their behavior, into four informally defined classes of increasing complexity: [22, p. 231] stability, oscillation, chaos, and complexity. The behavior of Boolean networks can be classified into quite similar classes.

Attractors. In the study of dynamical systems, an attractor is a set of values or states towards which a given system evolves or converges. Essentially, for a given dynamical system, finding the attractors amounts to answering the question: what will happen eventually? The notion of an attractor started to appear around the 1960s in the study of flows, [13] and its precise definition naturally depends on the studied class of systems. In the context of systems biology, attractors are often called phenotypes, for they represent the long-term behavior.

In the same way that Wolfram used classes of behavior for cellular automata, the attractors of a Boolean network can be classified into categories by their behavior. The most common and intuitive classes of attractors are stability, where the attractor consists of a single fixed-point equilibrium state, and oscillation, where a number of states repeat themselves in a predictable fashion. Because this predictability is a rather vague notion, it is not simple to determine the degree to which a system is predictable.
For this reason, in this work, only the simplest of the oscillating attractors is considered: a cycle.

Figure 1: A visualization of the well-known Lorenz attractor that represents a set of solutions of the Lorenz system of ordinary differential equations. (Wikimedia Commons)

Bifurcation analysis. In the context of dynamical systems, a bifurcation (the word literally means forking, or branching in two) is a qualitative change of a system caused by only a small change in a parameter value of that system. The term was first used by Henri Poincaré in 1885 when describing such changes in the dynamics of a fluid. Bifurcation analysis traditionally applies to dynamical systems that are parametrized by continuous variables. However, in the context of discrete systems such as the Boolean network, there is no clear-cut way of ordering the parametrizations, so it is not easy to determine how close two parameter settings are, that is, how small a change between them is.

Parameters. The ability to introduce parameters into Boolean network models becomes very convenient in situations where the update functions governing each variable are not entirely known. To illustrate such a situation, the diagram below describes a simple environmental model that shows relationships between a number of entities. The green arrows represent a positive effect, while the red arrows represent a negative effect. In this case, a negative effect could represent a reduction of the affected population. This model may have been constructed by observing only the individual relationships, so the behavior on higher levels, perhaps intricate environmental interactions, may still be unknown.

[Diagram: entities of the environmental model, including sunlight, flowers, seeds, and cats, connected by green (positive) and red (negative) arrows.]

When studying its long-term behavior, that is, searching for its attractors, this model has many possible interpretations.
One of the possible behaviors is the following cycle of events that repeats itself indefinitely: at first, there is a negligible cat population, which leads to an overpopulation of mice. The overpopulation of mice then leads to an overabundance of cats as well. The overpopulation of cats results in a decline of the mouse population, which in turn reduces the cat population itself. Another possible long-term behavior of this model is an equilibrium in which there is plenty of sunshine, seeds, mice, and cats. The parametrization of the model enables studying its behavior even without exact knowledge of the governing mechanisms; for this reason, the ability to study parametrized models can be of great practical importance.

1.2 Novelty of the approach

This work builds on an existing algorithm to create an accessible piece of software with strong potential to help in understanding many phenomena modelled using Boolean networks. There are a number of software tools that deal with Boolean network models, for instance:

• GINsim [16] - a desktop GUI application for simulation and parameter synthesis of parametrized Boolean networks,
• pyBoolNet [11] - a Python package for the generation, modification and analysis of Boolean networks,
• BoolNet [14] - an R language library for analysis of Boolean networks, including attractor detection of non-parametrized Boolean networks, and
• The Cell Collective - a web-based framework for biological modelling and analysis that supports various classes of formalisms, including Boolean networks.