Parameter Identification of Canalyzing Boolean Functions with Ternary Vectors for Gene Networks

Annika Eichler¹ and Gerwald Lichtenberg²
¹Automatic Control Laboratory, ETH Zurich, Physikstrasse, Zurich, Switzerland
²Faculty Life Sciences, Hamburg University of Applied Sciences, Ulmenliet, Hamburg, Germany

Keywords: Parameter Identification, Networks, Gene Dynamics, Systems Biology, Boolean Functions, Ternary Logic.

Abstract: In gene dynamics modeling, parameters of Boolean networks are identified from continuous data under various assumptions expressed by logical constraints. These constraints may restrict the dynamics of the network to the subclass of canalyzing functions, which are known to be appropriate for genetic networks. This paper introduces a high-performance algorithm which solves the parameter identification problem by so-called Zhegalkin identification and exploits the restriction to canalyzing functions, resulting in reduced calculation time. The canalyzing constraint is formulated in terms of orthogonal ternary vector lists, which are intrinsically used in a Branch-and-Cut algorithm obeying this constraint. The algorithm is applied to mRNA microarray data from mice under different contaminant conditions, and good correspondence to a known apoptotic pathway can be shown.

Eichler, A. and Lichtenberg, G. Parameter Identification of Canalyzing Boolean Functions with Ternary Vectors for Gene Networks. DOI: 10.5220/0005978701100118. In Proceedings of the 6th International Conference on Simulation and Modeling Methodologies, Technologies and Applications (SIMULTECH 2016), pages 110-118. ISBN: 978-989-758-199-1. Copyright © 2016 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved.

1 INTRODUCTION

A current field of research in systems biology is gene dynamics modeling, since understanding the dynamics of the genetic model could help the therapeutic process (Lin and Khatri, 2013). Canalyzing Boolean functions have been shown to be appropriate for modeling genetic networks because of their common characteristics, such as periodicity, global complexity and self-organization (Kauffman, 1993). In genetic networks, canalization is the ability of a genotype to produce the same phenotype regardless of environmental variability (Jarrah et al., 2007). Thus, due to their stabilizing effect on the discrete dynamical behavior, canalyzing functions turned out to describe the highly ordered dynamics of gene networks better than other Boolean models (Kauffman et al., 2003).

A successful approach to identify parameters of Boolean functions from continuous-valued signals like microarray data uses Zhegalkin polynomials to represent these functions, see Lichtenberg et al. (2005); Faisal et al. (2010); Veliz-Cuba et al. (2010); Breindl et al. (2013). The Zhegalkin identification problem is a Mixed Integer Quadratic Program (MIQP) which can in principle be solved with standard tools like CPLEX or Xpress, where Branch-and-Cut algorithms are used. One major problem of Boolean identification is the exponential growth of the cardinality of the solution set with the number of interacting genes. Thus, those methods are applicable up to a model order of $n = 10$, where very large runtimes of hours or days already occur (Faisal, 2008).

Furthermore, a clustering problem has to be solved to determine groups of genes of unknown cardinality (denoted connectivity degree) which affect each other. Combining the clustering and the Zhegalkin identification problem leads to a discrete optimization problem of even higher complexity. First approximations for the solution of this combined problem have been found by a preprocessing step based on the Pearson Correlation Coefficient in Faisal (2008). Next, exploiting efficient representations of Zhegalkin polynomials as orthogonal ternary vector lists (OTVLs) (Bochmann and Steinbach, 1991) and adapting tensor decomposition techniques from Kolda and Bader (2009) allows integration of both steps, as reported in Lichtenberg and Eichler (2011). Moreover, the solution set of the identification algorithm can be reduced by fixing the maximum number of rows of the OTVL representing the solution. This leads to highly efficient computation with a controllable degree of accuracy, because optimality of the solution is guaranteed by a Branch-and-Cut algorithm used for the reduced solution set.

In this paper, the latter method is restricted to the subclass of canalyzing functions due to their interesting properties. This introduces additional constraints for the optimization problem, as already reported in Faisal et al. (2006) and Breindl et al. (2013), but the reduced solution set is not efficiently exploited therein. This work shows how to incorporate those constraints in the identification algorithm by expressing canalyzing functions as OTVLs. The proposed algorithm for the identification of canalyzing functions is more efficient by orders of magnitude, since the search space is considerably reduced, as is obvious from Table 1. The adapted identification is applied to gene expression data from mRNA extracted from mouse liver cells.

Table 1: Number of all Boolean functions and the canalyzing ones (CFs).

n    Boolean functions         CFs
1    4                         4
2    16                        14
3    256                       120
4    65536                     3514
5    $4.2950 \cdot 10^{9}$     1292276
6    $1.8447 \cdot 10^{19}$    $1.0307 \cdot 10^{11}$

This work is organized as follows. Section 2 introduces fundamentals of Boolean functions, Zhegalkin polynomials and OTVLs. In Section 3, the Branch-and-Cut Boolean identification algorithm from Lichtenberg and Eichler (2011) is described. Section 4 presents how to express canalyzing functions as OTVLs and how to adapt the identification accordingly. The results of an application to real data are shown in Section 5. Finally, conclusions are drawn in Section 6.

2 FUNDAMENTALS

The set $B = \{0, 1\}$ denotes the set of logicals, $U = [0, 1]$ the unit interval. Negation of Booleans is denoted by $\neg z = \bar{z}$; for real values, $\bar{x} = 1 - x$ holds. The Kronecker product is denoted by $\otimes$.

2.1 Boolean Functions and Zhegalkin Polynomials

A Boolean function $b : B^n \to B$ can be represented by its truth vector $\mathbf{b} = (b_1, \dots, b_{2^n})' \in B^{2^n}$, i.e. the last column of the truth table as shown in Table 2.

Table 2: Truth table.

y_n  ...  y_2  y_1    b(y_1, ..., y_n)
0    ...  0    0      b_1
0    ...  0    1      b_2
0    ...  1    0      b_3
0    ...  1    1      b_4
...
1    ...  1    1      b_{2^n}

Example 1. Consider the Boolean function

$$b(y_1, y_2) = \neg (y_1 \wedge y_2), \tag{1}$$

which is given by the truth table

$$\begin{array}{cc|c} y_2 & y_1 & b(y_1, y_2) \\ \hline 0 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{array} \tag{2}$$

with its truth vector $\mathbf{b} = (1 \; 1 \; 1 \; 0)'$.

Definition 1. A Zhegalkin polynomial $p(y) = l(y)' \mathbf{b}$ is a multilinear polynomial with $\mathbf{b} \in B^{2^n}$ being a truth vector and $l(y)$ the so-called literal vector, given by Lichtenberg and Eichler (2011) as

$$l(y) = \begin{pmatrix} \bar{y}_n \\ y_n \end{pmatrix} \otimes \cdots \otimes \begin{pmatrix} \bar{y}_1 \\ y_1 \end{pmatrix} \in U^{2^n}. \tag{3}$$

Proposition 1 (Zhegalkin (1928)). A Zhegalkin polynomial evaluated at Boolean values $y \in B^n$ gives the same (Boolean) result as the Boolean function represented by the truth vector $\mathbf{b}$.

Thus the Zhegalkin polynomials can be seen as the bridge between the Boolean set $B$ and the real set $U$: if $y \in U^n$, then $p(y) \in U$ as well; if, however, $y \in B^n$, then $p(y) \in B$.

Example 1. (continued) To illustrate this for the Boolean function (1), the corresponding Zhegalkin polynomial is calculated as

$$l'(y)\,\mathbf{b} = \begin{pmatrix} (1 - y_1)(1 - y_2) \\ y_1 (1 - y_2) \\ (1 - y_1)\, y_2 \\ y_1 y_2 \end{pmatrix}' \begin{pmatrix} 1 \\ 1 \\ 1 \\ 0 \end{pmatrix} = 1 - y_1 y_2. \tag{4}$$

It can easily be seen that if $y_1, y_2 \in B$, then the Zhegalkin polynomial leads to the same solution as the Boolean function (1), as stated in Proposition 1.

2.2 Ternary Vector Lists

Ternary Vector Lists (TVLs) are a common concept in Boolean algebra because of their outstanding advantages for large-scale problems (Bochmann and Steinbach, 1991). A TVL of a Boolean function represents all elements of the Boolean space $B^n$ where the function is 1 by ternary vectors (TVs). A TV $t$ has the structure

$$t \in T^n = \{0, 1, -\}^n. \tag{5}$$

A zero element '0' in the TV describes that the corresponding variable appears negated, a one element '1' that it appears not negated. The symbol '-' is the don't care symbol, which can stand for either '1' or '0'.

A TVL with $k$ lines is of the form

$$T = \begin{pmatrix} t_1 \\ \vdots \\ t_k \end{pmatrix}.$$

Taking all lines of the truth table with ones always leads to a valid TVL of a Boolean function. TVLs with a smaller number of lines might be possible by using '-'.

Example 1. (continued) With the truth table in (2), valid TVLs for the Boolean function (1) of the running example are

$$T_1 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad T_2 = \begin{bmatrix} 0 & 1 \\ - & 0 \end{bmatrix}, \tag{6}$$

$$T_3 = \begin{bmatrix} - & 0 \\ 0 & 1 \end{bmatrix}, \quad T_4 = \begin{bmatrix} - & 0 \\ 0 & - \end{bmatrix}. \tag{7}$$

This can easily be checked by replacing '-' with both '0' and '1'.

This example shows that TVLs are not unique, i.e., there exist different TVLs for the same Boolean function.

Table 3: Graphical representation of operands for TVLs, Bochmann and Steinbach (1991). Entries: $T_A$, $T_B$, $\mathrm{CPL}(T_A)$, $\mathrm{DIF}(T_A, T_B)$.

Lemma 1. An OTVL $T$ is orthogonal to its complement $\bar{T}$.

Proof. With Definition 2, two TVLs are orthogonal if they do not have any BVs in common. The complement of an OTVL $T$ contains all BVs that are not in $T$ and is thus orthogonal to $T$.

Proposition 2. For an OTVL $T$ with $k$ lines, the number of ones in the corresponding truth vector $\mathbf{b}$ is

$$N_1 = \mathbf{b}' \mathbf{1} = \sum_{i=1}^{k} 2^{N_i},$$

where $N_i$ is the number of '-'s in the $i$-th line of $T$.

Proof. The number of ones in $\mathbf{b}$ is equivalent to the number of BVs in $T$. A TV with no '-'s represents a single BV, and since a '-' stands for either 1 or 0, a ...
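The running example can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: it evaluates the Zhegalkin polynomial via the Kronecker-ordered literal vector of (3), expands ternary vectors into the binary vectors they cover, and compares the count of Proposition 2 with the actual number of ones. All function names are hypothetical, TVLs are written as strings in the column order $y_2 y_1$ of the truth table, and a TVL is assumed here to be orthogonal when no two of its lines cover a common binary vector.

```python
from itertools import product


def literal_vector(y):
    """Literal vector (1-y_n, y_n) x ... x (1-y_1, y_1) for y = (y_1, ..., y_n)."""
    vec = [1.0]
    for yi in y:
        # each new variable becomes the outer (more significant) Kronecker factor
        vec = [f * v for f in (1 - yi, yi) for v in vec]
    return vec


def zhegalkin(y, b):
    """Evaluate p(y) = l(y)' b for a truth vector b."""
    return sum(l * bi for l, bi in zip(literal_vector(y), b))


def expand(tv):
    """Binary vectors covered by one ternary vector, e.g. '-0' -> {'00', '10'}."""
    choices = ["01" if c == "-" else c for c in tv]
    return {"".join(bits) for bits in product(*choices)}


def covered(tvl):
    """All binary vectors covered by any line of the TVL."""
    return set().union(*(expand(tv) for tv in tvl))


def is_orthogonal(tvl):
    """Assumed criterion: lines are pairwise disjoint, so line sizes add up."""
    return sum(len(expand(tv)) for tv in tvl) == len(covered(tvl))


def truth_vector(tvl, n):
    """Truth vector in truth-table order (y_n ... y_1 counted upwards in binary)."""
    ones = covered(tvl)
    return [int("".join(bits) in ones) for bits in product("01", repeat=n)]


def ones_by_proposition_2(tvl):
    """Sum of 2^(number of '-') over all lines; exact only for orthogonal TVLs."""
    return sum(2 ** tv.count("-") for tv in tvl)


# Four TVLs for b(y1, y2) = not (y1 and y2), columns ordered y2 y1
T1 = ["00", "01", "10"]
T2 = ["01", "-0"]
T3 = ["-0", "01"]
T4 = ["-0", "0-"]  # both lines cover '00', so this list is not orthogonal

b = truth_vector(T1, 2)
for name, tvl in [("T1", T1), ("T2", T2), ("T3", T3), ("T4", T4)]:
    print(name, truth_vector(tvl, 2), is_orthogonal(tvl), ones_by_proposition_2(tvl))
print(zhegalkin((1, 1), b), zhegalkin((0.5, 0.5), b))
```

In this sketch all four lists yield the same truth vector, while the count from Proposition 2 agrees with the number of ones only for the lists that pass the orthogonality check, which illustrates why the proposition is stated for OTVLs. Evaluating the polynomial at a non-Boolean point such as (0.5, 0.5) shows the extension to the unit interval.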
