A Tutorial on Analysis and Simulation of Boolean Gene Regulatory Network Models

Current Genomics, 2009, 10, 511-525 511 A Tutorial on Analysis and Simulation of Boolean Gene Regulatory Network Models Yufei Xiao*,1,2 1Dept. of Epidemiology & Biostatistics; 2Greehey Children's Cancer Research Institute, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA Abstract: Driven by the desire to understand genomic functions through the interactions among genes and gene products, the research in gene regulatory networks has become a heated area in genomic signal processing. Among the most studied mathematical models are Boolean networks and probabilistic Boolean networks, which are rule-based dynamic systems. This tutorial provides an introduction to the essential concepts of these two Boolean models, and presents the up-to-date analysis and simulation methods developed for them. In the Analysis section, we will show that Boolean models are Markov chains, based on which we present a Markovian steady-state analysis on attractors, and also reveal the relationship between probabilistic Boolean networks and dynamic Bayesian networks (another popular genetic network model), again via Markov analysis; we dedicate the last subsection to structural analysis, which opens a door to other topics such as network control. The Simulation section will start from the basic tasks of creating state transition diagrams and finding attractors, proceed to the simulation of network dynamics and obtaining the steady-state distributions, and finally come to an algorithm of generating artificial Boolean networks with prescribed attractors. The contents are arranged in a roughly logical order, such that the Markov chain analysis lays the basis for the most part of Analysis section, and also prepares the readers to the topics in Simulation section. Received on: December 04, 2008 - Revised on: May 11, 2009 - Accepted on: May 11, 2009 1. INTRODUCTION The roles of mathematical models for gene regulatory networks include: In most living organisms, genome carries the hereditary information that governs their life, death, and reproduction. • Describing genetic regulations at a system level; Central to genomic functions are the coordinated interactions • Enabling artificial simulation of network behavior; between genes (both the protein-coding DNA sequences and regulatory non-coding DNA sequences), RNAs and proteins, • Predicting new structures and relationships; forming the so called gene regulatory networks (or genetic • Making it possible to analyze or intervene in the regulatory networks). network through signal processing methods. The urgency of understanding gene regulations from Among various mathematical endeavors are two Boolean systems level has increased tremendously ever since the models, Boolean networks (BNs) [1] and probabilistic early stage of genomics research. A driving force is that, if Boolean networks (PBNs) [2], in which each node (gene) we can build good gene regulatory network models and takes on two possible values, ON or OFF (or 1 and 0), and apply intervention techniques to control the genes, we may the way genes interact with each other is formulated by find better treatment for diseases resulting from aberrant standard logic functions. They constitute an important class gene regulations, such as cancer. In the past decade, the of models for gene regulatory networks, in that they capture invention of high throughput technologies has made it some fundamental characteristics of gene regulations, are possible to harvest large quantities of data efficiently, which conceptually simple, and their rule-based structures bear is turning the quantitative study of gene regulatory networks physical and biological meanings. Moreover, Boolean into a reality. Such study requires the application of signal models can be physically implemented by electronic circuits, processing techniques and fast computing algorithms to and demonstrate rich dynamics that can be studied using process the data and interpret the results. These needs in turn mathematical and signal processing theory (for instance, have fueled the development of genomic signal processing Markov chains [2, 3]). and the use of mathematical models to describe the complex interactions between genes. In practice, Boolean models have been successfully applied to describe real gene regulatory relations (for instance, the drosophila segment polarity network [4]), and the attractors of BNs and PBNs have been associated with *Address correspondence to this author at the Computational Biology and cellular phenotypes in the living organisms [5]. The Bioinformatics Division, Greehey Children's Cancer Research Institute, association of network attractors and actual phenotypes has University of Texas Health Science Center at San Antonio, San Antonio, inspired the development of control strategy [6] to increase TX 78229, USA; Tel: 210-562-9080; Fax: 210-562-9135; the possibility of reaching desirable attractors (“good” E-mail: [email protected] 1389-2029/09 $55.00+.00 ©2009 Bentham Science Publishers Ltd. 512 Current Genomics, 2009, Vol. 10, No. 7 Yufei Xiao phenotypes) and decrease the likelihood of undesirable Probabilistic Boolean networks were introduced to address attractors (“bad” phenotypes such as cancer). The effort of this issue [2, 7], such that they are composed of a family of applying control theory to Boolean models is especially Boolean networks, each of which is considered a context [8]. appealing in the medical community, as it holds potential to At any given time, gene regulations are governed by one guide the effective intervention and treatment in cancer. component Boolean network, and network switchings are The author would like to bring the fundamentals of possible such that at a later time instant, genes can interact Boolean models to a wider audience in light of their under a different context. In this sense, probabilistic Boolean theoretical value and pragmatic utility. This tutorial will networks are more flexible in modeling and interpreting introduce the basic concepts of Boolean networks and biological data. probabilistic Boolean networks, present the mathematical Definition 2 [2, 3, 7] A probabilistic Boolean network is essentials, and discuss some analyses developed for the defined on V = {x , , x }, x {0,1}, and consists of r models and the common simulation issues. It is written for 1 ! n i ! researchers in the genomic signal processing area, as well as Boolean networks (V,f ), , (V,f ), with associated !1 1 ! !r r researchers with general mathematics, statistics, engineering, network selection probabilities c1,!,cr such that or computer science backgrounds who are interested in this r c = 1. The network function of the j -th BN is topic. It intends to provide a quick reference to the ! j=1 j fundamentals of Boolean models, allowing the readers to f = ( f (1) , , f (n) ) . At any time, genes are regulated by one apply those techniques to their own studies. Formal j j ! j definitions and mathematical foundations will be laid out of the BNs, and at the next time instant, there is a probability concisely, with some in-depth mathematical details left to q (switching probability) to change network; once a change the references. is decided upon, we choose a BN randomly (from r BNs) by the selection probabilities. Let p be the rate of random gene 2. PRELIMINARIES perturbation (flipping a gene value from 0 to 1 or 1 to 0), the In Boolean models, each variable (known as a node) can state transition of PBN at t (assuming operation under ! ) take two possible values, 1 (ON) and 0 (OFF). A node can j represent a gene, RNA sequence, or protein, and its value (1 is probabilistic, namely [3], or 0) indicates its measured abundance (expressed or $f (x(t)), with probability (1! p)n , unexpressed; high or low). In this paper, we use “node” and j (2) x(t + 1) = % n “gene” interchangeably. &x(t) " # , with probability 1! (1! p) , A state in Boolean models is a binary vector of all the where ! is bit-wise modulo-2 addition, ! = (! 1,!,! n ) is a gene values measured at the same time, and is also called the random vector with Pr{! = 1} = p , and x(t) ! " denotes a gene activity (or expression) profile (GAP). The state space i of a Boolean model consists of all the possible states, and its random perturbation on the state x(t) (one or more genes size will be 2n for a model with n nodes. are flipped). Let the set of network functions be F = {f1,!,fr }, and we denote the PBN by G(V, F, c, p) (see Definition 1 [2, 7] A Boolean network is defined on a set Remark 1). of n binary-valued nodes (genes) V = {x1,!, xn}, xi !{0,1}, Alternatively, the PBN can be represented as where each node xi has ki parent nodes (regulators) chosen G(V, !,", p) , with ! = {!1,!,!n} and ! = {!1,!,! n} . from V , and its value at time t + 1 is determined by its In this representation, each node xi is regarded as being parent nodes at t through a Boolean function fi , regulated by a set of l(i) Boolean functions (1) xi (t +1) = fi (xi1(t), xi2 (t),..., xik (t)), {i1,!,iki} ! {1,!,n}. (i) (i) i "i = {! 1 ,!,! l(i)} with the corresponding set of function (i) (i) l(i) (i) ki is called the connectivity of xi , and fi is the regulatory selection probabilities ! = {! ,!,! } ( ! = 1). i 1 l(i) " j=1 j function. Defining network function , we f = ( f1,!, fn ) The two representations are related such that any network denote the Boolean network as !(V, f ) . Let the network function f j is a realization of the regulatory functions of n state at time t be , the state transition x(t) = (x1(t),!, xn (t)) genes by choosing one function from the function set !i for x(t) x(t 1) is governed by f , written as ! + each gene xi , and we can write x(t + 1) = f(x(t)) . (1) (n) (3) f j = (" j ,!," j ), ji !{1,!,l(i)}. In Boolean networks, genetic interactions and regulations 1 n are hard-wired with the assumption of biological Moreover, if it is an independent PBN, namely determinism.

A Tutorial on Analysis and Simulation of Boolean Gene Regulatory Network Models

Markov Model Checking of Probabilistic Boolean Networks Representations of Genes

Online Spectral Clustering on Network Streams Yi

Inference of a Probabilistic Boolean Network from a Single Observed Temporal Sequence

Boolean Network Control with Ideals

Random Boolean Networks As a Toy Model for the Brain

A Random Boolean Network Model and Deterministic Chaos

Reward-Driven Training of Random Boolean Network Reservoirs for Model-Free Environments

Solving Satisfiability Problem by Quantum Annealing

A Multi-Agent System for Environmental Monitoring Using Boolean Networks and Reinforcement Learning

List of Glossary Terms

Introduction to Natural Computation Lecture 18 Random Boolean

UCLA Samueli Engineering Annoucemente 2019-20