Analysis and Synthesis of Boolean Networks

MING LIU

Licentiate Thesis in Electronic and Computer Systems
Stockholm, Sweden 2015

TRITA-ICT 2015:23
ISBN 978-91-7595-770-8

KTH School of Information and Communication Technology
SE-164 40 Kista, Stockholm
SWEDEN

Academic thesis which, with the permission of KTH Royal Institute of Technology (Kungl Tekniska högskolan), is submitted for public examination for the degree of Licentiate in Electronic and Computer Systems on Friday, 18 December 2015, at 09:00 in Sal B, Electrum, KTH Royal Institute of Technology, Kista 16440, Stockholm.

© Ming Liu, November 2015

Printed by: Universitetsservice US AB

Abstract

In this thesis, we present techniques and algorithms for analysis and synthesis of synchronous Boolean and multiple-valued networks. Synchronous Boolean and multiple-valued networks are a discrete-space discrete-time model of gene regulatory networks. Their cycles of states, called attractors, are believed to give a good indication of the possible functional modes of the system. This motivates research on algorithms for finding attractors. Existing decision diagram-based approaches have limited capacity due to the excessive memory requirements of decision diagrams. Simulation-based approaches can be applied to large networks; however, their results are incomplete.

In the first part of this thesis, we present an algorithm which uses a SAT-based bounded model checking approach to find all attractors in a multiple-valued network. The efficiency of the presented algorithm is evaluated by analysing 30 network models of real biological processes as well as 35 000 randomly generated 4-valued networks. The results show that our algorithm has the potential to handle models an order of magnitude larger than currently possible.

One of the characteristic features of genetic regulatory networks is their inherent robustness, that is, their ability to retain functionality in spite of the introduction of random faults. In the second part of this thesis, we focus on the robustness of a special kind of Boolean networks called Balanced Boolean Networks (BBNs). We formalize the notion of robustness and introduce a method to construct BBNs for 2-singleton attractors Boolean networks. The experimental results show that BBNs are capable of tolerating single stuck-at faults. Our method improves the robustness of random Boolean networks by at least 13% on average and, in some special cases, by up to 61%.

In the third part of this thesis, we focus on a special type of synchronous Boolean networks, namely Feedback Shift Registers (FSRs). FSR-based filter generators are used as a basic building block in many cryptographic systems, e.g. stream ciphers. Filter generators are popular because their well-defined mathematical description enables a detailed formal security analysis. We show how to modify a filter generator into a nonlinear FSR, which is faster, but slightly larger, than the original filter generator. For example, the propagation delay can be reduced 1.54 times at the expense of 1.27% extra area. The presented method might be important for applications that require very high data rates, e.g. 5G mobile communication technology.

In the fourth part of this thesis, we present a new method for detecting and correcting transient faults in FSRs based on duplication and parity checking. Periodic fault detection of functional circuits is very important for cryptographic systems because a random hardware fault can compromise their security. The presented method is more reliable than Triple Modular Redundancy (TMR) for large FSRs, while the area overhead of the two approaches is comparable. The presented approach might be important for cryptographic systems using large FSRs.

Sammanfattning (Summary)

In this thesis, we present methods and algorithms for the analysis and synthesis of synchronous Boolean and multiple-valued networks. Synchronous Boolean and multiple-valued networks are a discrete-space, discrete-time model of gene regulatory networks. Their state cycles, called attractors, are believed to give a good indication of the possible functional modes of the system. This motivates research on algorithms for finding attractors. Existing decision-diagram-based methods have limited capacity because of their excessive memory requirements. Simulation-based methods can be applied to large networks, but give incomplete results. In the first part of this thesis, we present an algorithm that uses SAT-based bounded model checking to find all attractors in multiple-valued networks. The efficiency of the presented algorithm is evaluated by analysing 30 network models of real biological processes as well as 35 000 randomly generated 4-valued networks. The results show that our algorithm has the potential to handle models an order of magnitude larger than is currently possible.

A characteristic property of gene regulatory networks is their inherent robustness, that is, their ability to retain functionality despite the introduction of random faults. In the second part of this thesis, we focus on the robustness of a special type of Boolean networks called Balanced Boolean Networks (BBNs). We formalize the notion of robustness and introduce a method for constructing BBNs for 2-singleton attractors Boolean networks. The experimental results show that BBNs are able to tolerate single faults. Our method improves the robustness of randomly generated Boolean networks by at least 13% on average and, in some special cases, by up to 61%.

In the third part of this thesis, we focus on a special type of synchronous Boolean networks, namely Feedback Shift Registers (FSRs). FSR-based filter generators are used as a basic building block in many cryptographic systems, e.g. stream ciphers. Filter generators are popular because their well-defined mathematical description enables a detailed formal security analysis. We show how a filter generator can be modified into a nonlinear FSR that is faster, but slightly larger, than the original filter generator. For example, the propagation delay can be reduced 1.54 times at the expense of 1.27% extra area. The presented methods may be important for applications that require very high data rates, e.g. 5G mobile communication technology.

In the fourth part of this thesis, we present a new method for detecting and correcting transient faults in FSRs using duplication and parity checking. Periodic fault detection of functional circuits is very important for cryptographic systems because random hardware faults can compromise their security. This method is more reliable than Triple Modular Redundancy (TMR) for large FSRs, with comparable area. The presented approach may be important for cryptographic systems that use large FSRs.

Acknowledgements

First of all, I would like to thank my supervisor, Professor Elena Dubrova, for inspiring me with so many topics and ideas, and especially for all the patience and encouragement she offered me when I was in trouble. I would like to thank Professor Zhonghai Lu for reviewing this thesis. I would like to thank Professor Fredrik Jonsson for his help with the Swedish abstract. I would like to thank Ms. Alina Munteanu for her help with everything at school. I would like to thank all the professors and scholars who taught me, encouraged me and enlightened my own research. I would like to thank all my colleagues and friends, Shaoteng Liu, Li Xie, Jue Shen, Pei Liu, Shuo Li, Yuan Yao, Xueqian Zhao, Fan Pan, Shao Tao, Nan Li and Shohreh Sharif Mansouri, for the discussions as well as the happiness we shared together. Finally, I would like to thank my parents for their love and support.

Thank you!

Ming Liu
Stockholm, November 2015

Contents

List of Figures
List of Tables
List of Abbreviations

1 Introduction
  1.1 Previous Work
  1.2 Overview and Contributions of the Author

2 Background
  2.1 Notation
  2.2 Boolean Networks
  2.3 Feedback Shift Registers
  2.4 Relationship between Boolean Networks and FSRs

3 Multiple-valued Networks
  3.1 Multiple-valued Networks
  3.2 Computation of Attractors
  3.3 Converting GINML Format to CNET Format
  3.4 Conclusion

4 Balanced Boolean Networks
  4.1 Computational Scheme Based on Boolean Networks
  4.2 Single Stuck-at Fault Model
  4.3 Robustness Evaluation
  4.4 Balanced Boolean Networks
  4.5 Conclusion

5 Design of Secure FSRs
  5.1 Filter Generators
  5.2 Main Ideas of the Presented Approach
  5.3 The Fibonacci to Galois Transformation
  5.4 Example
  5.5 Conclusion

6 Design of Reliable FSRs
  6.1 Stream Ciphers
  6.2 Transient Faults and Triple Modular Redundancy
  6.3 The Duplication and Parity Checking Approach
  6.4 Conclusion

7 Conclusion and Future Work

Bibliography

Publications
  A Finding Attractors in Synchronous Multiple-Valued Networks
  B The Robustness of Balanced Boolean Networks
  C A Faster Shift Register Alternative to Filter Generators
  D A New Approach to Reliable FSRs Design

List of Figures

1.1 The structure of the thesis

2.1 An example of a directed graph
2.2 An example of a 3-node Boolean network
2.3 The general structure of an n-stage FSR
2.4 The relationship between a Boolean network and an FSR

3.1 A 2-node 3-valued network and its STG
3.2 The flowchart for the conversion procedure

4.1 An example of a 2-input Boolean logic OR
4.2 Examples of stuck-at faults in a Boolean network
4.3 An example of constructing a balanced Boolean network

5.1 A filter generator composed of an LFSR and a filtering function
5.2 An n-stage NLFSR constructed based on a filter generator
5.3 A 4-stage NLFSR in the Fibonacci configuration
5.4 An example of the shifting operation
5.5 An NLFSR constructed from a 256-stage LFSR with f_L(x) and f_N(x)

6.1 Encryption and decryption using a stream cipher
6.2 The pseudo-random sequence generator of stream ciphers A5/1, E0, WG-7 and Grain
6.3 The basic configuration of TMR
6.4 The block diagram for an n-stage FSR using TMR
6.5 The block diagram of the duplication and parity checking approach
6.6 An example of error detection
6.7 An example of error correction
6.8 The error-correcting circuit C_ec
6.9 The reliability of our approach and TMR
6.10 The area overhead of our approach and TMR

List of Tables

2.1 The truth table of a Boolean function f(x) = x1 · x2

3.1 A description of the GINML format
3.2 An example of a multiple-valued network in CNET format

4.1 Evaluation of robustness of GRNs and (16, 4)-random Boolean networks

5.1 The statistical tests report for the output sequence Out_N

6.1 Area approximation of gates and flip-flops

List of Abbreviations

AES Advanced Encryption Standard.
ANF Algebraic Normal Form.

BBN Balanced Boolean Network.
BLIF Berkeley Logic Interchange Format.

DD Decision Diagram.
DNA Deoxyribonucleic Acid.

FSR Feedback Shift Register.

GRN Gene Regulatory Network.

IC Integrated Circuit.

LFSR Linear Feedback Shift Register.

NLFSR Non-Linear Feedback Shift Register.

PRNG Pseudo-Random Number Generator.
PRSG Pseudo-Random Sequence Generator.

RAM Random Access Memory.
RBN Random Boolean Network.
RFID Radio-Frequency Identification.
RNA Ribonucleic Acid.

STG State Transition Graph.

TMR Triple Modular Redundancy.

XML eXtensible Markup Language.

Chapter 1

Introduction

Studying biological systems is important for natural sciences. Biological systems are created by a complex evolutionary process of nature. The unique properties of biological systems, such as robustness and adaptability, give us tremendous opportunities for mimicking the mechanisms of natural evolution in an attempt to generate software and hardware systems with characteristics comparable to those of biological systems. More than 40 years ago, systems biologists began to focus their research on genetic and cellular systems [10, 26]. This research not only promoted the understanding and development of biological systems, but also resulted in many fundamental algorithms and tools for analysis and synthesis of complex networks in many areas, including Kauffman's NK model for evolutionary biology [53], self-organization for social networks and computer networks [9, 63], etc.

1.1 Previous Work

In the late 1960s, Kauffman proposed Boolean networks as a model of Gene Regulatory Networks (GRNs) [52]. Since then, this model has attracted a lot of attention in systems biology [49], evolution [7] and self-organization of complex systems [53].

Gene regulatory networks represent the fundamental cellular processes in molecular entities of living systems, such as proliferation, differentiation and apoptosis, which are controlled by a large number of molecular actors interacting through a variety of regulatory mechanisms. A GRN is a collection of DNA segments in a cell, called genes, which interact with each other [5]. Each gene contains information that determines what the gene does and when the gene is active, or expressed. When a gene is active, a process called transcription takes place, producing a RiboNucleic Acid (RNA) copy of the gene's information. This piece of RNA then directs the synthesis of proteins. RNA or protein molecules resulting from the transcription process are known as gene products. In a Boolean network model, a GRN is represented by a graph composed of nodes (corresponding to genes, proteins and/or metabolites) and edges (representing molecular interactions such as protein-DNA and protein-protein, or indirect relationships between genes).

In single-celled organisms, GRNs respond to the external environment, optimising the cell at a given time for survival in this environment. For example, a yeast cell, finding itself in a sugar solution, will turn on genes to make enzymes that process the sugar to alcohol. By modelling this process using a Boolean network [55], researchers were able to find that a yeast cell makes its living by gaining energy to multiply, which under normal circumstances enhances its survival prospects. Different Boolean networks of a yeast cell have been built based on experiments [24, 40].

For multi-cellular insects and animals, the same principle has been applied to gene cascades that control body shape [25]. By modelling the gene segmentations and cell cycles using Boolean networks, researchers were able to get a better understanding of the mechanism of epigenetics [11], by which chromatin modification may provide cellular memory by blocking or allowing transcription, finally resulting in a specific type of cell.

Some evidence [51] suggests that cell types (or cell fates) in a GRN of humans and other organisms can be modelled by attractors¹ of a Boolean network. As a consequence, the analysis of Boolean networks became popular in research related to genetic diseases [4, 22, 42]. Several tools for Boolean network analysis have been developed, including GINsim [47], the random Boolean networks toolbox [71] for Matlab and DDLab [79].

Boolean networks have also been used to investigate the behaviour of complex systems and the phenomenon of self-organization. Instead of exploring a specific gene regulatory system, the study of complex systems explores high-level properties of random Boolean networks. A complex system is defined as a system composed of multiple elements that interact to produce the system's characteristics and behaviour [53, 20]. For example, a system may have long attractors (with a large number of states), long transients² and be unstable, or it may have short attractors, short transients and be stable. Such regimes are known as behavioural regimes. It was demonstrated [78] that the behaviour of any complex system falls into one of the following three regimes:

• Ordered: In this behavioural regime, a system tends to have single-point or very short-length attractors, short transients, and dynamic behaviour which is insensitive to the initial states;

• Complex: In this behavioural regime, a system tends to have short or medium-length attractors, medium-length transients, and dynamic behaviour which is relatively insensitive to the initial states;

• Chaotic: In this behavioural regime, a system tends to have infinite-length attractors, long transients, and dynamic behaviour which is very sensitive to the initial states.

¹ In a Boolean network, an attractor is a cycle in the state transition graph of the network. The formal definition of attractors will be given in Chapter 2.
² The transient length is the time (number of steps) a system needs to reach an attractor.

Apart from the number and length of attractors, several other parameters have been found to influence the system dynamics, including system connectivity, properties of the functions associated with the network's nodes and the structure of the network [6, 61]. Kauffman has analysed the influence of the number of immediate predecessors k and the type of Boolean functions associated with the nodes on the dynamics of random Boolean networks [52, 54]. He has shown that, if Boolean functions are assigned to the nodes at random from the set of all possible Boolean functions of k variables, then for k ∼ 2 the network is at the borderline between chaos and order, while for large k (of the order of the total number of the network's nodes) the system falls into the chaotic regime. Kauffman hypothesized that the regulatory structures at the edge of chaos (k ∼ 2) ensure both stability and evolutionary improvements, and that these provide the background conditions for an evolution of genetic systems.
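The dependence of the dynamics on k can be explored numerically. The following is a minimal sketch, not Kauffman's original procedure: it draws a random network in which every node has k randomly chosen predecessors and a random truth table, then measures the transient and attractor lengths reached from one initial state. The function names, the network size and the seed are illustrative assumptions.

```python
import random

def random_network(n, k, rng):
    """Assumed construction: k distinct random predecessors and a random
    k-input truth table for each of the n nodes."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronous update: every node reads its predecessors at once."""
    nxt = []
    for ins, table in zip(inputs, tables):
        idx = 0
        for j in ins:                      # pack predecessor values into an index
            idx = (idx << 1) | state[j]
        nxt.append(table[idx])
    return tuple(nxt)

def transient_and_attractor_length(state, inputs, tables):
    """Iterate until a state repeats; return (transient length, attractor length)."""
    first_seen = {}
    t = 0
    while state not in first_seen:
        first_seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return first_seen[state], t - first_seen[state]

rng = random.Random(0)                     # fixed seed, chosen arbitrarily
inputs, tables = random_network(n=12, k=2, rng=rng)
start = tuple(rng.randint(0, 1) for _ in range(12))
transient, length = transient_and_attractor_length(start, inputs, tables)
print(transient, length)
```

Repeating the experiment over many networks and several values of k would give a rough picture of how attractor and transient lengths grow with connectivity; a single run, as here, only illustrates the measurement.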

1.2 Overview and Contributions of the Author

This thesis is a collection of papers focusing on topics related to Boolean networks. The following papers are included in this thesis:

Paper A Elena Dubrova, Ming Liu, and Maxim Teslenko. Finding attractors in synchronous multiple-valued networks using SAT-based bounded model checking. Journal of Multiple-Valued Logic and Soft Computing, 19(1-3):109–131, 2012.

Paper B Ming Liu and Elena Dubrova. The robustness of balanced Boolean networks. In Ronaldo Menezes, Alexandre Evsukoff, and Marta C. González, editors, Complex Networks, volume 424 of Studies in Computational Intelligence, pages 19–30. Springer Berlin Heidelberg, 2013. ISBN 978-3-642-30286-2.

Paper C Ming Liu, S.S. Mansouri, and E. Dubrova. A faster shift register alternative to filter generators. In Proceedings of 2013 Euromicro Conference on Digital System Design (DSD), pages 713–718, Sept 2013.

Paper D Ming Liu and Elena Dubrova. A new approach to reliable FSRs design. In Proceedings of 32nd Nordic Microelectronics Conference NORCHIP, Oct 2014.

The structure of the thesis is shown in Figure 1.1. The results can be divided into two parts.

The first part is about Boolean network analysis. It includes Chapter 3 and Chapter 4. In Chapter 3, we calculate and analyse attractors in multiple-valued networks. In Chapter 4, we analyse the robustness of a special type of Boolean networks, called balanced Boolean networks.

The second part is about synthesis of a special class of Boolean networks, namely Feedback Shift Registers (FSRs). It includes Chapter 5 and Chapter 6. In Chapter 5, we address the problem of generating cryptographically strong pseudo-random sequences using FSRs. In Chapter 6, we show how to design reliable FSRs based on a duplication and parity checking approach.

Figure 1.1: The structure of the thesis. Within Boolean networks, network analysis comprises Ch. 3 (Multiple-valued Networks) and Ch. 4 (Balanced Boolean Networks); network synthesis comprises Ch. 5 (Design of Secure FSRs) and Ch. 6 (Design of Reliable FSRs).

The rest of the thesis is organized as follows. Chapter 2 provides the necessary background on Boolean networks and feedback shift registers. Chapters 3-6 introduce Papers A, B, C and D, respectively.

Chapter 3 introduces Paper A. In Paper A, we present a SAT-based bounded model checking [13] algorithm for computing attractors in multiple-valued networks. We first extend SAT-based bounded model checking to the multiple-valued case. Then, we convert network models from GINML format (used as an input to the GINsim tool [47]) to CNET format (used as an input to our tool) and run the experiments on multiple-valued network models of real cells. The author's contribution to this paper includes the format conversion, experiments and analysis of results.

Chapter 4 introduces Paper B. In Paper B, we introduce a special type of Boolean networks, called balanced Boolean networks. We first propose a new measure of robustness of Boolean networks. Then, we run the robustness test for random Boolean networks and Boolean networks modelling GRNs of some real cells. Finally, we present an algorithm to transform an arbitrary Boolean network into a balanced Boolean network. The author's contribution to this paper includes the definition of robustness, the algorithm, the implementation, experiments, analysis of results and writing the paper.

Chapter 5 introduces Paper C. In Paper C, we present a method to transform a filter generator, a structure commonly used in cryptographic applications for generating pseudo-random sequences, into a Non-Linear Feedback Shift Register (NLFSR). We then use the Fibonacci to Galois transformation of FSRs to reduce the propagation delay. We show that the output sequences are statistically random via the NIST tests [69]. The author's contribution to this paper includes NLFSR simulation, experiments, analysis of results and writing the paper.

Chapter 6 introduces Paper D and complements it with new results of reliability evaluation. In Paper D, we show how to design reliable FSRs using duplication and parity checking. We demonstrate that the presented approach is more reliable than Triple Modular Redundancy (TMR) for large FSRs, while its area overhead is smaller compared to TMR. The author's contribution to this paper includes the duplication and parity checking approach, experiments, analysis of results and writing the paper.

Chapter 7 summarizes the thesis and suggests directions for future work.

Chapter 2

Background

This chapter presents the basic definitions and properties of Boolean networks and feedback shift registers.

2.1 Notation

Throughout the thesis we use “·” and “∧” for the minimum operation (also called MIN or, for the two-valued case, AND); “+” and “∨” for the maximum operation (MAX or, for the two-valued case, OR); “⊕” for the addition modulo m operation (XOR for the two-valued case); a prime “′” and an overline for the complement operation (NOT); and “↔” for the logic equivalence operation. We let M := {0, 1, . . . , m − 1} be a finite set of values. We use lower-case letters a, b, c, etc. to denote elements of M, and lower-case letters f, g, h, etc. to denote functions. We use x1, x2, . . . , xn to denote variables of the functions and N = {1, 2, . . . , n} to denote the set of indices of these variables. We use bold lower-case letters a, b, x, y, etc. for vectors.
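As a small illustration of this notation, the operators can be written as functions over M = {0, . . . , m − 1}. This is a sketch only: the helper names are hypothetical, and the complement is assumed to be the common m-valued form (m − 1) − a, which reduces to NOT when m = 2.

```python
# Hypothetical helper names; the thesis defines only the operators themselves.

def mv_and(a, b):
    """'·' / '∧': minimum (AND when m = 2)."""
    return min(a, b)

def mv_or(a, b):
    """'+' / '∨': maximum (OR when m = 2)."""
    return max(a, b)

def mv_add(a, b, m):
    """'⊕': addition modulo m (XOR when m = 2)."""
    return (a + b) % m

def mv_not(a, m):
    """Complement. Assumption: taken as (m - 1) - a."""
    return (m - 1) - a

m = 4
print(mv_and(1, 3), mv_or(1, 3), mv_add(3, 2, m), mv_not(1, m))
```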

Boolean Function

Let B denote the Boolean domain: B := {0, 1}.

Definition 1 A Boolean function is a mapping of the form $f : B^n \to B$, where n is the number of variables.

Let x := (x1, x2, . . . , xn) be a vector of n Boolean variables.

Definition 2 A Boolean function f(x) is linear if there is a vector $w \in B^n$ such that $f(x) = (w, x) = x_1 w_1 \oplus x_2 w_2 \oplus \ldots \oplus x_n w_n$.

The set of all linear functions on $B^n$ is denoted by $L_n$.

Definition 3 The truth table of a Boolean function f(x) is an exhaustive list of pairs (x, f(x)).

Table 2.1: The truth table of a Boolean function f(x) = x1 · x2.

x1  x2  f(x)
0   0   0
0   1   0
1   0   0
1   1   1
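The truth table of Definition 3 can be generated by enumerating all input vectors. The sketch below does this for f(x) = x1 · x2; the function names `truth_table` and `f_and` are illustrative.

```python
from itertools import product

def truth_table(f, n):
    """Return the exhaustive list of pairs (x, f(x)) for all x in B^n."""
    return [(x, f(x)) for x in product((0, 1), repeat=n)]

def f_and(x):
    """f(x) = x1 · x2, with x given as the tuple (x1, x2)."""
    return x[0] & x[1]

table = truth_table(f_and, 2)
for x, fx in table:
    print(*x, fx)
```

The rows come out in the same order as in Table 2.1, because `product` varies the last variable fastest.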

Boolean Functions in Algebraic Normal Form

Throughout the thesis, Boolean functions $B^n \to B$ are often represented using the Algebraic Normal Form (ANF), which is a polynomial of type

$$f(x) = \sum_{i=0}^{2^n-1} c_i \cdot x_1^{i_1} \cdot x_2^{i_2} \cdot \ldots \cdot x_n^{i_n},$$

where $c_i \in \{0, 1\}$ and $(i_1 i_2 \ldots i_n)$ is the binary expansion of i [57]. We treat all Boolean functions as n-variable functions of the same variables x1, x2, . . . , xn. Some functions may not depend on all n variables.
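The ANF coefficients can be computed from a truth table with the binary Möbius transform, with the sum in the ANF taken modulo 2. The following is a sketch of that standard technique rather than a procedure described in the thesis; the names `anf_coefficients` and `monomial` are illustrative, and the truth table is assumed to be indexed by the integer whose binary expansion is (x1 x2 . . . xn).

```python
def anf_coefficients(tt):
    """Binary Moebius transform: truth table -> ANF coefficients c_i.

    tt[i] is f(x) for the input whose binary expansion (x1 ... xn) equals i.
    """
    c = list(tt)
    n = len(c).bit_length() - 1
    assert len(c) == 1 << n, "truth table length must be a power of two"
    for b in range(n):
        bit = 1 << b
        for idx in range(len(c)):
            if idx & bit:
                c[idx] ^= c[idx ^ bit]   # XOR in the entry with this bit cleared
    return c

def monomial(i, n):
    """Render the monomial selected by index i, e.g. i = 3, n = 2 -> 'x1·x2'."""
    bits = format(i, f"0{n}b")
    names = [f"x{k + 1}" for k, bit in enumerate(bits) if bit == "1"]
    return "·".join(names) if names else "1"

# f(x) = x1 + x2 (OR) has truth table 0, 1, 1, 1.
coeffs = anf_coefficients([0, 1, 1, 1])
print(" ⊕ ".join(monomial(i, 2) for i, ci in enumerate(coeffs) if ci))
```

A function is linear in the sense of Definition 2 exactly when only coefficients of single-variable monomials are non-zero, so the OR function above is not linear, whereas the truth table 0, 1, 1, 0 (XOR) is.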

Definition 4 The dependency set of a function f(x), denoted by dep(f), contains all variables on which the function actually depends, i.e.

$$dep(f) = \{\, i \mid f(x)|_{x_i=0} \neq f(x)|_{x_i=1} \,\},$$

where $f(x)|_{x_i=j} = f(x_1, \ldots, x_{i-1}, j, x_{i+1}, \ldots, x_n)$ for $j \in \{0, 1\}$.
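Definition 4 translates directly into a brute-force check: variable i belongs to dep(f) if flipping xi changes the function value for at least one assignment of the other variables. The sketch below assumes f takes a tuple (x1, . . . , xn); the name `dep` follows the definition, and the example functions are illustrative.

```python
from itertools import product

def dep(f, n):
    """Return the set of indices i in {1, ..., n} on which f actually depends."""
    result = set()
    for i in range(n):
        for x in product((0, 1), repeat=n):
            if x[i] == 0:
                flipped = x[:i] + (1,) + x[i + 1:]   # same point with x_i = 1
                if f(x) != f(flipped):
                    result.add(i + 1)                # indices are 1-based
                    break
    return result

# x1 · x2 viewed as a function of three variables does not depend on x3.
print(dep(lambda x: x[0] & x[1], 3))
```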

Graphs

Definition 5 The pairs (a1, b1) and (a2, b2) are ordered pairs, if

(a1, b1) = (a2, b2) if and only if a1 = a2 and b1 = b2.

Definition 6 A graph is an ordered pair G = (V, E), where V denotes a set of vertices and E denotes a set of edges. An edge represents a connection between a pair of vertices. A graph is directed if its edges are ordered pairs.

Definition 7 The immediate predecessor and immediate successor sets of a vertex v ∈ V in a directed graph are defined as ipred(v) = {u | (u, v) ∈ E} and isucc(v) = {u | (v, u) ∈ E}, respectively.

Definition 8 The predecessor set pred(v) of a vertex v ∈ V is a subset of V containing all vertices from which v is reachable. Similarly, the successor set succ(v) of a vertex v ∈ V is a subset of V containing all vertices reachable from v.

Figure 2.1: An example of a directed graph. The predecessor and successor sets of vertex e are pred(e) = {a, b, c, d} and succ(e) = {f, g, h}. The immediate predecessor and immediate successor sets of vertex e are ipred(e) = {d} and isucc(e) = {f}.
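Definitions 7 and 8 can be sketched with adjacency sets and a simple graph search. The edge list below is hypothetical: it is chosen only so that the resulting sets agree with the caption of Figure 2.1, and it is not claimed to be the graph drawn in the figure. The function names are illustrative.

```python
from collections import defaultdict

def build(edges):
    """Return (ipred, isucc) adjacency maps of a directed graph."""
    ipred, isucc = defaultdict(set), defaultdict(set)
    for u, v in edges:
        isucc[u].add(v)
        ipred[v].add(u)
    return ipred, isucc

def reachable(start, adjacency):
    """All vertices reachable from start by following adjacency one or more times."""
    seen, stack = set(), [start]
    while stack:
        u = stack.pop()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

# Hypothetical edges consistent with the sets stated in the caption.
edges = [("a", "d"), ("b", "d"), ("c", "d"), ("d", "e"),
         ("e", "f"), ("f", "g"), ("f", "h")]
ipred, isucc = build(edges)

print(sorted(ipred["e"]), sorted(isucc["e"]))
print(sorted(reachable("e", ipred)), sorted(reachable("e", isucc)))
```

Following immediate predecessors transitively gives pred(e), and following immediate successors gives succ(e).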

2.2 Boolean Networks

Boolean networks were proposed by Kauffman in the late 1960s as a network model of genetic regulatory networks [52]. In this model, genes are represented by nodes and the regulatory relationships between the genes are represented by edges. The gene activation is Boolean and other regulatory elements are assumed to be either active or inactive at a given point in time.

Definition 9 A Boolean network is a directed graph consisting of n vertices $v_i$, $i \in \{1, 2, \ldots, n\}$. Each $v_i$ has a state variable $x_i \in B$ which represents the current state of $v_i$. The value of each $x_i$ is updated according to an updating function of type $f_i : B^{k_i} \to B$, where $k_i$ is the number of immediate predecessors of $v_i$.

Boolean networks can be updated synchronously or asynchronously [43]. In this thesis, we only deal with synchronous Boolean networks.

State Transition Graph

At any given point of time t, the state s of a Boolean network is a vector of values of its state variables x1, x2, . . . , xn. Time is viewed as proceeding in discrete steps. For the synchronous type of update, at every time step, the next state of a network, $s^+$, is determined from the current state, s, by updating the values of the state variables of all vertices simultaneously to the values of the corresponding updating functions $f_i$:

$$x_i^+ = f_i(x_{i_1}, x_{i_2}, \ldots, x_{i_{k_i}}),$$

where $x_i^+$ stands for the next value of $x_i$ and $x_{i_1}, x_{i_2}, \ldots, x_{i_{k_i}}$ are the state variables associated with the immediate predecessors of $v_i$.

For all $2^n$ states of an n-node network, there are $2^n$ transitions $s \to s^+$ defined by the updating functions. The State Transition Graph (STG) can be used to illustrate these state transitions. In an STG, vertices represent the $2^n$ possible network states and edges represent the transitions $s \to s^+$ between the states.

A state transition graph can be described by a transition relation. The characteristic formula for the transition relation of a Boolean network is given by [19]:

$$T(s, s^+) = \bigwedge_{i=1}^{n} \left( x_i^+ \leftrightarrow f_i(x_{i_1}, x_{i_2}, \ldots, x_{i_{k_i}}) \right),$$

where $f_i$ is the updating function associated with $v_i$ and $x_{i_1}, x_{i_2}, \ldots, x_{i_{k_i}}$ are the state variables associated with the immediate predecessors of $v_i$.

Attractors

Since a Boolean network is deterministic and finite, any sequence of its consecutive states eventually converges to either a single state or a cycle of states, called an attractor.

Definition 10 An attractor is a set of states of a cycle in the STG of a Boolean network.

The length of an attractor is the number of states in the attractor. Attractors of length one are called single-point attractors. The set of states that converge to an attractor over time is called the basins of attraction of this attractor¹. The larger the basins, the more likely it is that a network starting at a randomly chosen state will end up in the associated attractor.

Definition 11 The basins of attraction of an attractor are the predecessor set of the attractor states, excluding the attractor states themselves.

¹ In some papers, basins of attraction are defined as a set including the attractor states as well [6, 28].

Figure 2.2: An example of a 3-node Boolean network. (a) A 3-node Boolean network. (b) Its STG. The states are ordered as (x1x2x3).

An example of a 3-node Boolean network and its STG are shown in Figure 2.2(a) and Figure 2.2(b), respectively. In Figure 2.2(a), the 3 nodes of the Boolean network are v1, v2 and v3, and the updating functions of the 3 nodes are $f_1 = \overline{x_1 x_2}$, $f_2 = x_2 + x_3$ and $f_3 = x_1 x_2 + x_3$. The arrows indicate the predecessor and successor relations. Figure 2.2(b) shows its STG, where the 8 vertices represent all $2^3$ network states and the edges show the state transitions. Two attractors A1 and A2 for this Boolean network are shown in yellow (also filled with slashes in a light color) and the corresponding basins are shown in blue. The attractor A1 = {100} is a single-point attractor and its basins are {000}. A2 = {111, 011} is a cyclic attractor of length 2 and its basins are {001, 101, 010, 110}. The transition relation of this network is given by:

$$T(s, s^+) = (x_1^+ \leftrightarrow \overline{x_1 x_2})\,(x_2^+ \leftrightarrow x_2 + x_3)\,(x_3^+ \leftrightarrow x_1 x_2 + x_3).$$
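The attractors and basins of this example can be checked by exhaustive simulation. The sketch below takes f1 as the complement of x1 · x2, which is the reading under which A1 = {100} is a single-point attractor with basin {000}; the function names are illustrative, and the simple enumeration used here is not the SAT-based method of the thesis.

```python
from itertools import product

def step(s):
    """Synchronous update of the 3-node example; s = (x1, x2, x3)."""
    x1, x2, x3 = s
    f1 = 1 - (x1 & x2)          # assumed: complement of x1 · x2
    f2 = x2 | x3
    f3 = (x1 & x2) | x3
    return (f1, f2, f3)

def find_attractor(s, n):
    """Return the attractor (as a frozenset of states) reached from s."""
    for _ in range(2 ** n):     # after 2^n steps the state must lie on a cycle
        s = step(s)
    cycle, t = [s], step(s)
    while t != s:
        cycle.append(t)
        t = step(t)
    return frozenset(cycle)

# Map each attractor to the states that converge to it (attractor states excluded).
basins = {}
for s in product((0, 1), repeat=3):
    attractor = find_attractor(s, 3)
    basins.setdefault(attractor, set())
    if s not in attractor:
        basins[attractor].add(s)

for attractor, basin in basins.items():
    print(sorted(attractor), sorted(basin))
```

Enumerating all 2^n states is only feasible for very small networks, which is the limitation that motivates the SAT-based approach of Chapter 3.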

2.3 Feedback Shift Registers

An n-stage FSR is a synchronous device consisting of n binary registers, called stages, connected in a chain [46]. Each stage has a single input and a single output, and it is able to store one bit of information. There are two configurations of FSRs: the Fibonacci configuration and the Galois configuration. Let xi, xi ∈ B, i ∈ {1, 2, . . . , n} be the state variable representing the value of the ith stage in an FSR and fi(x) be the updating function of the ith stage.

Figure 2.3: The general structure of an n-stage FSR in the Fibonacci and the Galois configurations. (a) An FSR in the Fibonacci configuration. (b) An FSR in the Galois configuration.

Definition 12 An n-stage FSR in the Fibonacci configuration is defined by:

$$\begin{aligned}
f_1(x) &= x_2, \\
f_2(x) &= x_3, \\
&\;\;\vdots \\
f_{n-1}(x) &= x_n, \\
f_n(x) &= g(x_1, \ldots, x_n),
\end{aligned}$$

where g is the updating function (also called feedback function).
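One clock cycle of a Fibonacci FSR under Definition 12 shifts every stage by one position and loads the feedback value into stage n. The sketch below illustrates this with a hypothetical 4-stage register whose feedback function g = x1 ⊕ x2 is an assumption made for the example, not a function taken from the thesis; the helper names are likewise illustrative.

```python
def fibonacci_step(state, g):
    """Next state of a Fibonacci FSR; state = (x1, ..., xn).

    Stage i takes the value of stage i + 1, and stage n takes g(state).
    """
    return state[1:] + (g(state),)

def g(x):
    """Hypothetical linear feedback function g = x1 XOR x2."""
    return x[0] ^ x[1]

def cycle_length(state, step):
    """Number of steps until `state` recurs; assumes `state` lies on a cycle."""
    current, steps = step(state), 1
    while current != state:
        current, steps = step(current), steps + 1
    return steps

start = (1, 0, 0, 0)
print(fibonacci_step(start, g))
print(cycle_length(start, lambda s: fibonacci_step(s, g)))
```

For this choice of g, the state (1, 0, 0, 0) recurs after 15 steps, so the register visits every non-zero state of a 4-stage FSR.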

Definition 13 An n-stage FSR in the Galois configuration is defined by:

$$\begin{aligned}
f_1(x) &= g_1(x_1, x_2), \\
f_2(x) &= g_2(x_1, x_2, x_3), \\
&\;\;\vdots \\
f_{n-1}(x) &= g_{n-1}(x_1, x_2, \ldots, x_n), \\
f_n(x) &= g_n(x_1, \ldots, x_n),
\end{aligned}$$

where $g_i$, $i \in \{1, 2, \ldots, n\}$, is the updating function for the ith stage.

Definition 14 If the functions $f_1(x), f_2(x), \ldots, f_n(x)$ are linear, $f_i(x) \in L_n$, the FSR is called a Linear Feedback Shift Register (LFSR); otherwise, it is called an NLFSR.

Properties of FSRs

Let an FSR consist of n stage registers. Each stage i, i ∈ {1, . . . , n}, has a state variable $x_i$ and an updating function $f_i : B^n \to B$. A state of an FSR is a vector of values of its state variables. At each clock cycle, the next state of an FSR is determined from its current state by simultaneously updating the value of each stage i to the value of the corresponding updating function $f_i$, ∀i ∈ {1, . . . , n}.

The period of an FSR is the length of the longest cyclic output sequence it produces.

The updating functions of a Galois FSR induce the mapping $F : B^n \to B^n$ of type

$$(x_1, x_2, \ldots, x_n) \to (f_1(x), f_2(x), \ldots, f_n(x)),$$

where dep(f1) = {1, 2}, dep(f2) = {1, 2, 3}, . . . , dep(fn) = {1, . . . , n}. If an FSR is branchless, i.e. its state transition graph consists of pure loops, then the mapping F is invertible [41]. Note that a branchless FSR can be considered as a special case of invertible multivariate transformations [73].
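For small n, whether an FSR is branchless can be checked by testing whether the mapping F is a bijection on B^n: on a finite state set, every state has exactly one predecessor precisely when no two states share an image. The sketch below uses two hypothetical 3-stage Galois FSRs as examples; the function names and the updating functions are assumptions made for illustration.

```python
from itertools import product

def is_branchless(F, n):
    """True if F: B^n -> B^n is a bijection, i.e. the STG consists of pure loops."""
    images = {F(x) for x in product((0, 1), repeat=n)}
    return len(images) == 2 ** n

def F_invertible(x):
    """Hypothetical Galois FSR: f1 = x2, f2 = x1 XOR x3, f3 = x1."""
    x1, x2, x3 = x
    return (x2, x1 ^ x3, x1)

def F_branching(x):
    """Hypothetical Galois FSR: f1 = x2, f2 = x1 XOR x3, f3 = x1 AND x2."""
    x1, x2, x3 = x
    return (x2, x1 ^ x3, x1 & x2)

print(is_branchless(F_invertible, 3), is_branchless(F_branching, 3))
```

In the second example the states (0, 0, 0) and (1, 0, 1) both map to (0, 0, 0), so the state transition graph contains a branch.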

2.4 Relationship between Boolean Networks and FSRs

We can see a clear similarity between synchronous Boolean networks and FSRs in Figure 2.4. Figure 2.4(a) shows an n-node Boolean network and its corresponding n-stage FSR in the Galois configuration. Figure 2.4(b) shows an example of a 4-node Boolean network and a 4-stage FSR. They have the same STG, which is shown in Figure 2.4(c).

Figure 2.4: The relationship between a Boolean network and an FSR. (a) A Boolean network and its corresponding FSR in the Galois configuration. (b) An example of a 4-node Boolean network and an FSR with the same STG. (c) The common STG (x1x2x3x4) of the example in (b).

Chapter 3

Multiple-valued Networks

The Boolean network model is useful in the context of molecular networks [47], the chaos and order analysis of random Boolean networks [14], and the robustness analysis of canalizing Boolean functions for random Boolean networks [54]. However, a Boolean network is not sufficient for describing some more complex biological problems. A Boolean network that models a GRN can only represent gene expression at two levels: active (logic 1) or repressed (logic 0). In some cases, using more than two values for modelling gene interactions can be more appropriate [25, 47].

In this chapter, we present a SAT-based bounded model checking algorithm for finding all attractors in synchronous multiple-valued networks. We first give a formal definition of multiple-valued networks, then a brief explanation of the SAT-based bounded model checking algorithm, and finally a procedure for converting network models from GINML format to CNET format. More details about the computation of attractors and experimental results can be found in Paper A.

3.1 Multiple-valued Networks

There are many mathematical models of GRNs, including Boolean and multiple-valued networks, ordinary and partial differential equations, Petri nets, Bayesian networks, stochastic equations, and process calculi [70]. The Boolean and multiple-valued network models have been shown to be useful for exploring GRNs in the context of cellular differentiation, cell cycle regulation, immune response, and evolution [8].

A multiple-valued network is a directed graph in which each vertex vi has an associated state variable xi and an associated multiple-valued function fi [31]. The state variable xi ∈ M, M = {0, 1, . . . , m − 1}, represents the current value of vi. The multiple-valued function fi : $M^{k_i}$ → M, where ki is the number of immediate predecessors of vi, determines how the value of vi is updated.

Similarly to the Boolean case, the state s of a multiple-valued network is a vector of values of its state variables x1, x2, . . . , xn. Time is viewed as proceeding in discrete steps. For the synchronous type of update, at every time step, the next state of a network, s+, is determined from the current state, s, by updating the values of the state variables of all vertices simultaneously to the values of the corresponding updating functions fi:

\[
x_i^{+} = f_i(x_{i_1}, x_{i_2}, \ldots, x_{i_{k_i}}), \tag{3.1}
\]

where $x_i^{+}$ stands for the next value of $x_i$ and $x_{i_1}, x_{i_2}, \ldots, x_{i_{k_i}}$ are the state variables associated with the immediate predecessors of vi.

Figure 3.1 shows an example of a 2-node 3-valued network and its STG. The node v1 is 3-valued and the node v2 is Boolean. The updating functions are f1 = x1 + x2 and f2 = x2. As the STG in Figure 3.1(b) shows, this network has three attractors: A1 = {10}, A2 = {00, 20} and A3 = {11}.

Figure 3.1: A 2-node 3-valued network and its STG. (a) A 2-node 3-valued network. (b) Its STG (x1x2).

3.2 Computation of Attractors Using SAT-based Bounded Model Checking

Synchronous multiple-valued networks can be considered as a class of deterministic finite state machines. Any sequence of consecutive states of a network eventually converges to an attractor. Attractors represent the pattern of gene expressions in the corresponding cell types of the organism being modelled [53, 51]. When the effect of a disease or a mutation on an organism is studied, attractors have to be recomputed every time a fault is injected into a model.

All algorithms for finding attractors in Boolean and multiple-valued networks face a state-space explosion problem that must be addressed to handle large networks. Even if the length of attractors is restricted to one, the problem is NP-hard [3]. The best known algorithm for finding all attractors of length one in a Boolean network with n vertices has the complexity $O(1.757^n)$ [76].

Decision Diagrams (DDs) have traditionally been used for representing and manipulating Boolean and multiple-valued functions in network simulation and analysis [2, 38, 33]. Once a DD is constructed for a network, calculations can be done very efficiently. However, a well-known weakness of DDs is the unpredictability of their memory requirements. In contrast, SAT-based bounded model checking [13] can avoid the exponential space blow-up problem by finding an attractor without searching through the entire state space. The algorithm presented in Paper A performs a search within the STG of a network without explicitly representing the entire STG as a DD.

In SAT-based bounded model checking, we first generate a multiple-valued propositional formula F representing the transition relation $T(s, s^{+})$ of the network. By applying a SAT-solver to the propositional formula F, a satisfying assignment can be found, which corresponds to a valid path in the STG. Since each state of the STG of a multiple-valued network has a unique next state, once a path reaches a loop, it never leaves it. Therefore, we can determine the presence of a loop simply by checking whether the last state of a path occurs at least twice. Clearly, all states between any two occurrences of the last state belong to a loop. A loop corresponds to an attractor.

If a path of length k exists and it is loop-free, we multiply k by m (where m is the depth of the unfolding operation) and continue the search for a path of length k · m. This is called the unfolding operation in Paper A. For example, $T(s, s^{+})$ and $T(s^{+}, s^{++})$ are transition relations representing two paths of length k = 1. We can unfold the path to length k = 2 by

\[
T(s, s^{++}) = T(s, s^{+})\,T(s^{+}, s^{++}),
\]

where the depth of the unfolding operation is m = 2.

As an example, consider the 3-valued network in Figure 3.1. We have $s = (x_1, x_2)$ and $s^{+} = (x_1^{+}, x_2^{+})$. The transition relation is given by:

\[
T(s, s^{+}) = (x_1^{+} \leftrightarrow x_1 + x_2)(x_2^{+} \leftrightarrow x_2).
\]

In the algorithm, transition relations are represented by multiple-valued propositional formulas [30] using multiple-valued literals ${}^{i}x$, i ∈ {0, 1, . . . , m − 1}. For example, $x_1^{+} \leftrightarrow x_1 + x_2$ is represented as:

\[
\left({}^{1}x_1^{+} \leftrightarrow {}^{0}x_1\,{}^{1}x_2 + {}^{1}x_1\,{}^{0}x_2 + {}^{1}x_1\,{}^{1}x_2\right)\left({}^{2}x_1^{+} \leftrightarrow {}^{0}x_1\,{}^{0}x_2\right).
\]

For instance, for the example in Figure 3.1, the unfolding by two steps is computed as follows:

\[
\begin{aligned}
T(s, s^{+}) &= (x_1^{+} \leftrightarrow x_1 + x_2)(x_2^{+} \leftrightarrow x_2),\\
T(s^{+}, s^{++}) &= (x_1^{++} \leftrightarrow x_1^{+} + x_2^{+})(x_2^{++} \leftrightarrow x_2^{+}),
\end{aligned}
\]

so the unfolding $T(s, s^{+})\,T(s^{+}, s^{++})$ is:

\[
(x_1^{+} \leftrightarrow x_1 + x_2)(x_2^{+} \leftrightarrow x_2)(x_1^{++} \leftrightarrow x_1^{+} + x_2^{+})(x_2^{++} \leftrightarrow x_2^{+}).
\]

A possible satisfying assignment for the above expression is s = (00), s+ = (20), and s++ = (00). This assignment corresponds to the path 00 → 20 → 00. As the state 00 shows up twice, the loop represents the attractor A2 = {00, 20} in Figure 3.1(b).
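The loop check and the unfolding of the path length can be made concrete on the network of Figure 3.1. The sketch below is not the algorithm of Paper A: it replaces the SAT solver by explicit path following, so it only illustrates the "last state occurs twice" criterion and the multiplication of k by the unfolding depth. The function names are hypothetical, and the next value of x1 is taken from the literal formula in the text under the assumption that a literal is true exactly when the variable has the indicated value (x1+ = 1 for (x1, x2) ∈ {01, 10, 11}, x1+ = 2 for 00, and 0 otherwise).

```python
def step(state):
    """Next state of the 2-node network of Figure 3.1 (assumed reading)."""
    x1, x2 = state
    if (x1, x2) in {(0, 1), (1, 0), (1, 1)}:
        next_x1 = 1
    elif (x1, x2) == (0, 0):
        next_x1 = 2
    else:
        next_x1 = 0
    return (next_x1, x2)          # f2 = x2

def find_loop(start, k=1, depth=2):
    """Follow a path of length k; if it is loop-free, retry with k * depth."""
    while True:
        path = [start]
        for _ in range(k):
            path.append(step(path[-1]))
        last = path[-1]
        first = path.index(last)
        if first < len(path) - 1:  # the last state occurs at least twice
            # all states between two occurrences of the last state form the loop
            return frozenset(path[first:-1])
        k *= depth

def all_attractors():
    """Attractors reached from every state (3 values for x1, 2 for x2)."""
    states = [(a, b) for a in range(3) for b in range(2)]
    return {find_loop(s) for s in states}
```

Under these assumptions `all_attractors()` returns the three attractors listed for Figure 3.1, and `find_loop((0, 0))` reproduces the path 00 → 20 → 00 discussed above.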

3.3 Converting GINML Format to CNET Format

In order to test our tool on GRN models of real cells, we converted network models from the GINML format (a common input format for specifying Boolean and multiple-valued networks for GINsim [47]) to the CNET format of our tool.

In GINML format, a network is expressed in terms of node, edge and parameter elements using the eXtensible Markup Language (XML) [17]. The element node represents a vertex in the network and includes several attributes and subelements such as basevalue (the “base level of expression”), maxvalue and parameter (the user-defined updating rules). The element edge represents the interaction of nodes in the network and is related to the element parameter. All these are summarized in Table 3.1.

Table 3.1: A description of the GINML format.

Element     Attribute               Description
graph       nodeorder               the vertex order for display and simulation
node        id                      the name of the vertex
node        basevalue               the base level of expression without input, the default value 0
node        maxvalue                the maximum level of expression
edge        id                      the name of the edge
edge        from                    the external vertex of this interaction
edge        to                      the affected vertex of this interaction
parameter   idActiveInteractions    the name of the parameter
parameter   val                     the value of the parameter

On the other hand, our SAT-based bounded model checking tool reads a network in CNET format, which is similar to the Berkeley Logic Interchange Format (BLIF) [12] commonly used in synthesis and verification tools. In CNET format, the line starting with “.v” specifies the total number of vertices; a line starting with “.n” describes the vertex information and is followed by the updating function of this vertex; and lines starting with “.f” and “.m” are used to specify the forbidden states and the multiple-valued vertices of a multiple-valued network. An example of a network in CNET format is shown in Table 3.2.

Table 3.2: An example of a multiple-valued network in CNET format.

Example:

    .v 3
    .f 3
    11
    .m 3
    1 2
    .n 1 3 1 2 3
    000 1
    1-- 0
    -1- 0
    --1 0
    ...

Label   Description
.v      The model has 3 vertices.
.f      The state “11” is forbidden for computation.
.m      Vertices ‘1’ and ‘2’ encode and represent one 3-valued node.
.n      Vertex ‘1’ has 3 immediate predecessors, which are vertices ‘1’, ‘2’ and ‘3’.

The conversion of a network model from GINML format to CNET format can be done as follows:

• Step 1: Read the GINML file, including the vertex information and the updating functions;
• Step 2: Explore the network vertex by vertex; if the current vertex is Boolean, jump to step 4;
• Step 3: Encode the state variable of a multiple-valued vertex using ⌈log2 m⌉ binary bits;
• Step 4: Compute the updating functions and convert them into Boolean expressions; if all vertices are processed, go to step 5; otherwise jump to step 2 to process the next vertex;
• Step 5: Optimise the updating functions of every vertex;
• Step 6: Write down the model in CNET format.

As we can see, the conversion procedure works for both Boolean and multiple-valued cases. Its flowchart is shown in Figure 3.2.
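Step 3 of the procedure above can be sketched as follows. The thesis does not state which binary code words are assigned to the m values, so the natural binary code used here is an assumption, and the function names are hypothetical. The sketch also lists the code words that remain unused when m is not a power of two; such words are natural candidates for the forbidden states declared with “.f”.

```python
def bits_needed(m):
    """Number of binary variables for an m-valued variable: ceil(log2(m))."""
    if m < 2:
        raise ValueError("a state variable needs at least two values")
    return (m - 1).bit_length()   # exact integer form of ceil(log2(m))

def encode(value, m):
    """Natural binary code word of `value` (assumed encoding)."""
    if not 0 <= value < m:
        raise ValueError("value outside {0, ..., m-1}")
    return format(value, "0{}b".format(bits_needed(m)))

def unused_codes(m):
    """Code words that do not correspond to any of the m values."""
    width = bits_needed(m)
    return [format(c, "0{}b".format(width)) for c in range(m, 2 ** width)]

print([encode(v, 3) for v in range(3)])  # ['00', '01', '10']
print(unused_codes(3))                   # ['11']
```

For a 3-valued vertex encoded by two Boolean vertices, the only unused word under this assumed encoding is 11, which is consistent with the forbidden state shown in the CNET example.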

Figure 3.2: The flowchart of the procedure for converting GINML format to CNET format.

3.4 Conclusion

This chapter presents an algorithm for finding all attractors in synchronous multiple-valued networks using SAT-based bounded model checking. By analysing 30 network models of real cells, as well as 35 000 randomly generated 4-valued networks with sizes between 50 and 300 vertices, we demonstrated that our approach has the potential to handle networks an order of magnitude larger than those possible with existing DD-based algorithms.

Chapter 4

Balanced Boolean Networks

Boolean networks have promoted some new concepts and methods related to circuit design, including on-line circuit adaptation and fault-tolerant and evolvable hardware. Our interest in Boolean networks is due to their attractive fault-tolerant features. The parameters of a network can be tuned so that it exhibits robust behaviour, in which a minimal change in the network’s connections, values of state variables, or associated functions typically causes no variation in the network’s dynamics.

In this chapter, we describe a computational scheme based on Boolean networks. We also define robustness in the context of Boolean networks and apply the single stuck-at fault model for the robustness evaluation. Finally, we introduce Balanced Boolean Networks (BBNs), which are more robust than random Boolean networks. More details about constructing BBNs and the experiments can be found in Paper B.

4.1 Computational Scheme Based on Boolean Networks

The following computational scheme based on Boolean networks was introduced by Dubrova in [39].

Suppose that we have a Boolean network G with n vertices v1, v2, . . . , vn and m attractors A0, A1, . . . , Am−1. The basins of attraction of the Ai’s partition the Boolean space $B^n$ into m connected components via a dynamic process. Attractors constitute stable equilibrium points. We assign a value i, i ∈ {0, 1, . . . , m − 1}, to the attractor Ai and assume that the set of points of the Boolean space corresponding to the states in the basin of attraction of Ai is mapped to i.

Under these assumptions, G defines a function f : $B^n$ → {0, 1, . . . , m − 1} of variables x1, x2, . . . , xn, where the value of the variable xi corresponds to the state variable of vertex vi. The mapping is unique up to the permutation of the m values of f. If m = 2, then G represents a Boolean function; otherwise, it represents a binary-valued input m-valued output function.

A more formal definition is given below. Let B(Ai) denote the basin of attraction of Ai.

Definition 15 For a Boolean network G with n vertices and m single-point attractors A0, A1, . . . , Am−1, the function fG : $B^n$ → {0, 1, . . . , m − 1} represented by G is defined as follows:

\[
f_G(x_1, \ldots, x_n) = i \quad \text{if and only if} \quad (x_1, \ldots, x_n) \in B(A_i) \cup A_i,
\]

for all (x1, . . . , xn) ∈ $B^n$ and all i ∈ {0, 1, . . . , m − 1}.

An example of a Boolean network and its STG are shown in Figure 4.1(a) and 4.1(b), respectively. This Boolean network has two vertices v1 and v2 with updating functions $f_1 = x_1 x_2'$ and $f_2 = x_1 + x_2$, respectively. It has two attractors, A0 = {00} and A1 = {01}. The initial state 00 terminates in the attractor A0 and the states {11, 01, 10} terminate in the attractor A1. If we associate A0 with logic 0 and A1 with logic 1, the truth table of the resulting Boolean function represented by this Boolean network is shown in Figure 4.1(c). We see that this Boolean function is a 2-input Boolean logic OR.

Figure 4.1: An example of a Boolean network representing a 2-input Boolean logic OR. (a) A Boolean network. (b) Its STG (x1x2). (c) The truth table of the function fG represented by the network in (a):

    x1  x2   Ai   fG(x1, x2)
    0   0    A0   0
    0   1    A1   1
    1   0    A1   1
    1   1    A1   1
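The function represented by the network of Figure 4.1 can be recomputed by following every state to its attractor, as in Definition 15. This is a minimal sketch with hypothetical function names. It assumes that the updating functions are f1 = x1 AND NOT x2 and f2 = x1 OR x2, which is the reading consistent with the attractors 00 and 01 stated in the text, and it relies on all attractors being single points, as Definition 15 requires.

```python
from itertools import product

def step(state):
    """Synchronous update of the 2-node network (assumed f1 and f2)."""
    x1, x2 = state
    return (x1 & (1 - x2), x1 | x2)

def attractor_of(state):
    """Follow the trajectory until a fixed point is reached.

    Only valid when every attractor is a single state.
    """
    while step(state) != state:
        state = step(state)
    return state

def represented_function():
    """Map every state to the index of the attractor it terminates in."""
    states = list(product((0, 1), repeat=2))
    attractors = sorted({attractor_of(s) for s in states})
    index = {a: i for i, a in enumerate(attractors)}
    return {s: index[attractor_of(s)] for s in states}

for (x1, x2), value in sorted(represented_function().items()):
    print(x1, x2, value)
```

With the attractors sorted so that 00 receives index 0 and 01 receives index 1, the resulting table coincides with the 2-input OR of Figure 4.1(c).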

As can be seen in the above example, the computational scheme based on Boolean networks uses attractors to represent the value of a logic function. As a result, such a computational scheme inherits the attractive fault-tolerant features of Boolean networks. Many experimental results confirm that Boolean network models of real cells are tolerant to faults, i.e. typically the number and length of attractors are not affected by small changes [8].

4.2 Single Stuck-at Fault Model

In this section, we introduce the single stuck-at fault model, which is the most widely used model in testing of digital circuits [48,1].

Definition 16 A single stuck-at fault is a fault which results in a line in a logic circuit being permanently stuck at a logic “0” or “1”.

The stuck-at faults are typically abbreviated as “s-a-0” or “s-a-1”. In a similar way, we can define a single stuck-at fault in a Boolean network as a fault, which causes an edge in a Boolean network to be permanently stuck at a logic value of 0 or 1. It is assumed that the basic functionality of the functions associated to the nodes of the Boolean network is not changed by the fault.

Figure 4.2: Examples of stuck-at faults in a Boolean network. (a) A stuck-at-0 fault on edge e12. (b) A stuck-at-0 fault on edge e21.

For example, in Figure 4.2 the fault s-a-0 on edge e12 changes the state transition graph from Figure 4.1(b) to the state transition graph in Figure 4.2(a). The number of attractors is reduced from 2 to 1. As another example, in Figure 4.2(b), the fault s-a-0 on edge e21 only changes the structure of the basins, while the two attractors remain the same. So, we can say that this fault is tolerated.

4.3 Robustness Evaluation

The capability of a Boolean network to keep its attractors unchanged under the perturbations caused by single stuck-at faults can be quantitatively described as follows.

Let G be a Boolean network with n vertices v1, v2, . . . , vn and updating functions fi : $B^n$ → B, i ∈ {1, . . . , n}. In G, every edge eij carries the value of the state variable xi of vertex vi, which is also one of the input variables of the Boolean function fj(x1, . . . , xi, . . . , xn) associated with vertex vj. Suppose that G has m attractors, A0, A1, . . . , Am−1.

Definition 17 If a single stuck-at fault s-a-0 or s-a-1 occurs on edge eij, it results in xi ≡ 0 or xi ≡ 1 for the Boolean function fj(x1, . . . , xi, . . . , xn). If A0, A1, . . . , Am−1 are still attractors of the Boolean network with this fault, we say that the Boolean network G tolerates the fault s-a-0 or s-a-1 on eij.

The robustness R of a Boolean network G can be defined as the ratio of tolerated faults to the total number of faults:

\[
R = \frac{Num_{FT}}{Num_{fault}},
\]

where $Num_{FT}$ is the number of faults tolerated by G and $Num_{fault}$ is the total number of faults in G.

We ran experiments to evaluate the robustness of GRN models of real cells and of randomly generated Boolean network models. The results are shown in Table 4.1.

Table 4.1: Evaluation of robustness of GRNs and (16, 4)-random Boolean networks.

GRNs                                          (16, 4)-random models
Name                  nodes  faults  RST      Name    nodes  faults  RST
Ap-1                  10     32      0.656    Rdm1    16     128     0.016
Arabidopsis           15     88      0.352    Rdm2    16     128     0.016
MammalianCell         10     78      0.372    Rdm3    16     128     0
BuddingYeast2009      18     120     0.375    Rdm4    16     128     0.078
BuddingYeast2004      12     74      0.257    Rdm5    16     128     0.016
BY2004Modified        11     58      0.603    Rdm6    16     128     0.023
BuddingYeast2008      9      38      0.263    Rdm7    16     128     0
DrosophilaCellCycle   14     84      0.655    Rdm8    16     128     0.336
ERBB2                 20     102     0.784    Rdm9    16     128     0
FissionYeast          10     54      0.167    Rdm10   16     128     0
T-cellReceptor        10     78      0.372    Rdm11   16     128     0
ThBoolean             40     116     0.379    Rdm12   16     128     0.25

All 12 GRNs are models of real cells taken from [36]. The (16, 4)-random Boolean networks, which are generated by our program, have 16 nodes, each with 4 immediate predecessors selected at random. As the results show, GRN models of real cells are more robust than random Boolean networks. Next we show how to improve the robustness of random Boolean networks.

4.4 Balanced Boolean Networks

Balancedness is a useful and important feature of many systems. For example, in circuits, a balanced design means smaller power consumption, smaller propagation delay and convenient circuit testing [21, 74, 77]. In a Boolean network, “balancedness” means that all attractors have basins of nearly the same size in the STG.

As presented in Paper B, an n-node Boolean network can be made balanced by adding one additional node $v_{n+1}$. The updating functions fi, i ∈ {1, 2, . . . , n}, of the original network are changed to $f_i^{\star}$ according to the equation:

\[
f_i^{\star} =
\begin{cases}
f_i, & \text{if } f_i = 0 \text{ or } f_i = 1,\\
x_{n+1} \oplus f_{n+1}^{\star} \oplus f_i, & \text{otherwise,}
\end{cases}
\tag{4.1}
\]

where 0 and 1 mean constant 0 and constant 1, respectively, and $f_{n+1}^{\star}$ is the updating function of the additional node $v_{n+1}$, selected from the set $f_{n+1}^{\star} \in \{0, 1, x_{n+1} \oplus f_G, x_{n+1} \oplus f_G'\}$, where $f_G$ is the function represented by the original Boolean network according to Definition 15.

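Figure 4.3: An example of constructing a balanced Boolean network. The original network has two vertices with updating functions x1x2 and 1; the balanced network, obtained with $f_3^{\star} = 0$, has three vertices with updating functions x1x2 ⊕ x3, 1 and 0.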

A balanced Boolean network can be constructed as shown in Figure 4.3. The updating function of the additional node is $f_3^{\star} = 0$. The balancing of the basins improves robustness from 0.25 to 0.333, i.e. by 33.3%. More results on the evaluation of robustness of balanced Boolean networks can be found in Paper B.
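The robustness values quoted for Figure 4.3 can be recomputed with a short exhaustive check of Definition 17. The sketch below is an illustration and not the program used for the experiments: the function names are hypothetical, only single-point attractors are handled (which is sufficient for both networks of the figure), and the balanced network is written out by hand from equation (4.1) with $f_3^{\star} = 0$.

```python
from itertools import product

def fixed_points(funcs, n):
    """Single-point attractors: states that are mapped to themselves."""
    return {s for s in product((0, 1), repeat=n)
            if tuple(f(s) for f in funcs) == s}

def stuck(f, i, value):
    """Copy of f whose input x_i is permanently stuck at `value`."""
    return lambda s: f(s[:i] + (value,) + s[i + 1:])

def robustness(funcs, deps):
    """Ratio of tolerated single stuck-at faults to all such faults.

    deps[j] lists the (0-based) inputs of f_j; each input is one edge and
    contributes one s-a-0 and one s-a-1 fault.
    """
    n = len(funcs)
    reference = fixed_points(funcs, n)
    tolerated = total = 0
    for j, inputs in enumerate(deps):
        for i in inputs:
            for value in (0, 1):
                faulty = list(funcs)
                faulty[j] = stuck(funcs[j], i, value)
                total += 1
                # tolerated if all original attractors are still attractors
                tolerated += reference <= fixed_points(faulty, n)
    return tolerated / total

# Original network of Figure 4.3: f1 = x1 x2, f2 = 1.
original = [lambda s: s[0] & s[1], lambda s: 1]
# Balanced network: f1* = x1 x2 XOR x3, f2* = 1, f3* = 0.
balanced = [lambda s: (s[0] & s[1]) ^ s[2], lambda s: 1, lambda s: 0]

print(robustness(original, [(0, 1), ()]))                  # 0.25
print(round(robustness(balanced, [(0, 1, 2), (), ()]), 3)) # 0.333
```

Constant functions have no incoming edges and therefore contribute no faults, so the original network has 4 faults and the balanced one has 6.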

4.5 Conclusion

This chapter describes a computational scheme based on Boolean networks and defines robustness in the context of Boolean networks. We introduce a special type of Boolean networks, called balanced Boolean networks, and present an algorithm to transform an arbitrary Boolean network into a BBN. The experimental results show that balanced Boolean networks have higher robustness than random Boolean networks.

Chapter 5

Design of Cryptographically Strong FSR-based Pseudo-random Sequence Generators

Cryptographically strong pseudo-random sequences play an important role in cryptography. FSR-based filter generators [23] are a popular primitive for generating pseudo-random sequences, which are used as a basic building block in many stream ciphers. Filter generators are popular because their well-defined mathematical description enables a detailed formal security analysis.

In this chapter, we show how to reduce the propagation delay of a filter generator without compromising its security by converting the filter generator into an NLFSR, which is faster, but slightly larger, than the original filter generator. We show that the NLFSR generates a set of output sequences equivalent to that of the filter generator and that its period is $2^n - 1$, where n is the size of the NLFSR state. The propagation delay is considerably reduced while the output sequence retains the required statistical properties.

5.1 Filter Generators

Filter generators have been a popular cryptographic primitive ever since the weakness of LFSRs was discovered in the 1970s [64]. As the name suggests, a filter generator uses a function to “filter” an LFSR sequence or several LFSR sequences, resulting in a cryptographically stronger pseudo-random sequence. Figure 5.1 shows a filter generator composed of an LFSR and a filtering function.

Figure 5.1: A filter generator composed of an LFSR and a filtering function.

It is known how to make design choices for the size of the internal state, the LFSR polynomial, and the Boolean function (number and position of inputs, nonlinearity, resiliency, algebraic degree, etc.) so that the resulting filter generator resists known attacks (e.g. distinguishing, correlation, algebraic, etc.) with a sufficient security margin [44, 16]. The filtering function is typically constructed as:

\[
f = f_N(x) \oplus f_L(x),
\]

where $f_N(x)$ is a nonlinear part, which adds resistance to linear approximation attacks and algebraic attacks, and $f_L(x)$ is a linear part, which adds high resiliency.

Bent Functions

Bent functions, which have the highest nonlinearity $2^{n-1} - 2^{n/2-1}$, were introduced by Rothaus in [68].

Definition 18 For an n-variable Boolean function f(x), the Walsh-Hadamard transformation $\hat{F}_f : B^n \to R$ of f is defined as

\[
\hat{F}_f(w) = \sum_{x \in B^n} f(x) \cdot (-1)^{(w, x)},
\]

where R is the set of all real numbers and w is a vector w ∈ $B^n$.

Definition 19 A Boolean function f is called a bent function if

\[
\hat{F}_f(w) = 2^{n/2}
\]

holds for any w ∈ $B^n$.

In the approach presented in Paper C, we use bent functions with dep(fB) = k = 2m, of type

\[
f_N(x) = x_{i_1} x_{i_{m+1}} \oplus \ldots \oplus x_{i_m} x_{i_k} \oplus f_A(x_{i_1}, x_{i_2}, \ldots, x_{i_m}), \tag{5.1}
\]

where $i_j$ ∈ {1, 2, . . . , n}, ∀j ∈ {1, 2, . . . , k}, and $f_A$ is an AND of p arbitrary variables from the set $\{x_{i_1}, x_{i_2}, \ldots, x_{i_m}\}$ for 3 ≤ p ≤ m. A proof that functions of type (5.1) are bent can be found in [65].
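Bentness of a function of type (5.1) can be checked numerically for small k. The sketch below is an illustration under stated assumptions rather than part of the approach of Paper C: it reads f in its ±1 form, i.e. it transforms $(-1)^{f(x)}$, and tests that every Walsh coefficient has magnitude $2^{n/2}$, which is the usual statement of the bentness condition. The function names are hypothetical, and `f_n` is a 12-variable instance with m = 6 and p = 3 whose variables are renumbered 0 to 11, with the AND term taken over the first, third and fifth variable of the first half.

```python
def truth_table(f, n):
    """Values f(x) for x = 0 .. 2^n - 1; bit j of x is variable j."""
    return [f([(x >> j) & 1 for j in range(n)]) for x in range(1 << n)]

def walsh_spectrum(tt):
    """Fast Walsh-Hadamard transform of the sequence (-1)^f(x)."""
    w = [1 - 2 * v for v in tt]
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for k in range(start, start + h):
                a, b = w[k], w[k + h]
                w[k], w[k + h] = a + b, a - b
        h *= 2
    return w

def is_bent(f, n):
    """True if every Walsh coefficient has magnitude 2^(n/2) (n even)."""
    target = 2 ** (n // 2)
    return all(abs(c) == target for c in walsh_spectrum(truth_table(f, n)))

def f_n(v):
    """Instance of type (5.1): six products a_i b_i plus one AND of three a's."""
    a, b = v[:6], v[6:]
    value = 0
    for ai, bi in zip(a, b):
        value ^= ai & bi
    return value ^ (a[0] & a[2] & a[4])

print(is_bent(f_n, 12))                # True
print(is_bent(lambda v: v[0], 2))      # False: a linear function is not bent
```

The fast transform needs about n · 2^n additions, so the check is practical only for functions with a moderate number of variables, such as the 12-variable nonlinear part used later in this chapter.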

Figure 5.2: An n-stage NLFSR generating the same set of output sequences as a filter generator based on an n-stage LFSR with the feedback function $f_L(X)$ and the filtering function $f_N(X) \oplus f_L(X)$; $f_N^{s}(X)$ is a shifted copy of $f_N(X)$.

Linear Functions

A linear function $f_L(x)$ with dep(fL) = r is of type:

\[
f_L(x) = x_{j_1} \oplus x_{j_2} \oplus \ldots \oplus x_{j_r},
\]

where $j_i$ ∈ {0, 1, . . . , n − 1}, ∀i ∈ {1, 2, . . . , r}, such that the polynomial of degree n over GF(2) of type

\[
g(x) = x^{j_1} + x^{j_2} + \ldots + x^{j_r} + x^{n}
\]

is primitive¹.

To summarize, the filtering functions in our approach are of the form:

\[
f(x) = x_{i_1} x_{i_{m+1}} \oplus \ldots \oplus x_{i_m} x_{i_k} \oplus x_{j_1} \oplus x_{j_2} \oplus \ldots \oplus x_{j_r} \oplus f_A(x_{i_1}, x_{i_2}, \ldots, x_{i_m}).
\]

5.2 Main Ideas of the Presented Approach

The main ideas behind the presented approach are:

• Structural change from LFSR to NLFSR
Inspired by [35], we construct an n-stage NLFSR with a guaranteed large period of $2^n - 1$ by adding to a maximum-period LFSR two copies of a non-linear function, $f_N$ and $f_N^{s}$, as shown in Figure 5.2.

• Reducing the propagation delay
As shown in Figure 5.2, the propagation delay of the NLFSR is determined by the LFSR function $f_L$ and the two non-linear Boolean functions $f_N$ and $f_N^{s}$. To reduce the propagation delay, we apply the Fibonacci to Galois transformation [32].

• Cryptographic security
Several factors can influence the cryptographic strength of pseudo-random sequences. We need to carefully select the LFSR polynomial and the parameters of the NLFSR updating function, such as nonlinearity, resiliency and algebraic degree [16]. The details are presented in Paper C.

¹ An irreducible polynomial of degree n is called primitive if the smallest m for which it divides $x^m + 1$ is equal to $2^n - 1$ [46].

5.3 The Fibonacci to Galois Transformation

As mentioned in Chapter 2, there are two ways to implement a feedback shift register: in the Fibonacci configuration or in the Galois configuration. In the Fibonacci configuration, all updating functions except $f_{n-1}$ are of type $f_i(x) = x_{i+1}$, for i ∈ {0, . . . , n − 2}². In other words, the feedback connections are only applied to the input stage of the shift register. In the Galois configuration, the feedback connections are potentially applied to every stage. As a result, the Galois configuration is faster than the Fibonacci configuration, because the feedback computations are performed in parallel. Figure 5.3 shows a 4-stage NLFSR in the Fibonacci configuration with the feedback function $f_3 = x_0 \oplus x_1 \oplus x_2 \oplus x_1 x_2$.

Figure 5.3: A 4-stage NLFSR in the Fibonacci configuration.

The transformation of an FSR from the Fibonacci to an equivalent Galois configuration can be done by successively applying shifting operations [32]. The updating function f should be in the algebraic normal form (see Chapter 2 for the definition of Boolean functions in ANF). An n-stage NLFSR is called uniform [32] if, for some 0 ≤ τ ≤ n − 1:

\[
\begin{aligned}
&\forall i \in \{0, 1, \ldots, \tau - 1\}: \; f_i(x) = x_{i+1},\\
&\forall i \in \{\tau, \tau + 1, \ldots, n - 1\}: \; f_i(x) = x_{i \oplus 1} + h_i(x),
\end{aligned}
\]

where $dep(h_i) \subseteq \{0, 1, \ldots, \tau\} - \{x_{i \oplus 1}\}$ and “⊕” denotes addition modulo n. The stage τ is called the terminal stage of the NLFSR. Two NLFSRs are considered equivalent if the sets of their output sequences are equal.

² In this chapter, the indices of stages of FSRs start from 0 to keep agreement with the notations and definitions in paper [32].

Theorem 1 Given a uniform NLFSR with the terminal stage τ, a shifting $f_\tau \xrightarrow{M} f_{\tau'}$ results in an equivalent NLFSR if the transformed NLFSR is uniform as well.

Let $f_a(x)$ and $f_b(x)$ be the updating functions of the stages a and b of an n-stage NLFSR, respectively, such that a > b.

Definition 20 The operation shifting, denoted by $f_a \xrightarrow{P} f_b$, moves a set of product-terms P from the ANF of $f_a$ to the ANF of $f_b$. The index of each variable $x_i$ of each product-term in P is changed to $x_{(i-a+b) \bmod n}$.

An example of the shifting operation $f_3 \xrightarrow{x_1 x_2} f_2$ for the 4-stage NLFSR in the Galois configuration is shown in Figure 5.4.

Figure 5.4: An example of the shifting operation. Before the shifting: f3 = x0 ⊕ x1 ⊕ x2 ⊕ x1x2 and f2 = x3. Shifting operation: the set {x1x2} is moved from f3 to f2. After the shifting: f3 = x0 ⊕ x1 ⊕ x2 and f2 = x3 ⊕ x0x1.
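The shifting operation of Definition 20 and the equivalence claimed for the example of Figure 5.4 can be checked with a small simulation. In this sketch an ANF is represented as a set of product-terms, each term being a frozenset of variable indices. The helper names are hypothetical, and taking the output from stage 0 is an assumption of the sketch.

```python
from itertools import product

def shift_terms(anf, a, b, terms, n):
    """Move product-terms `terms` from the ANF of f_a to the ANF of f_b,
    renaming each variable x_i to x_((i - a + b) mod n)."""
    new = {stage: set(f) for stage, f in anf.items()}
    for term in terms:
        new[a].remove(term)
        moved = frozenset((i - a + b) % n for i in term)
        new[b] ^= {moved}          # XOR semantics: equal terms cancel
    return new

def evaluate(f, state):
    """Value of an ANF (XOR of ANDs) in the given state."""
    value = 0
    for term in f:
        value ^= int(all(state[i] for i in term))
    return value

def output_sequences(anf, n, length):
    """Set of output sequences (stage 0) over all 2^n initial states."""
    sequences = set()
    for state in product((0, 1), repeat=n):
        out = []
        for _ in range(length):
            out.append(state[0])
            state = tuple(evaluate(anf[i], state) for i in range(n))
        sequences.add(tuple(out))
    return sequences

T = frozenset
# 4-stage NLFSR of Figure 5.3: f3 = x0 + x1 + x2 + x1x2, f_i = x_(i+1) otherwise.
fibonacci = {0: {T({1})}, 1: {T({2})}, 2: {T({3})},
             3: {T({0}), T({1}), T({2}), T({1, 2})}}
galois = shift_terms(fibonacci, 3, 2, [T({1, 2})], 4)

print(galois[2] == {T({3}), T({0, 1})})   # True: f2 = x3 + x0x1
print(output_sequences(fibonacci, 4, 20) == output_sequences(galois, 4, 20))  # True
```

Comparing the sets of output sequences over all initial states is a direct, if brute-force, reading of the equivalence notion used in this section; it does not scale beyond small registers.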

5.4 Example

Consider an NLFSR constructed from a 256-stage LFSR and a bent function. Following the recommendations of [16], we select k = 12, p = 3 and r = 6. A possible choice of $f_L(x)$ and $f_N(x)$ which satisfies the above parameters is:

\[
f_L(x) = x_0 + x_{12} + x_{48} + x_{115} + x_{133} + x_{213},
\]

corresponding to the primitive polynomial of degree 256 [81]:

\[
g(x) = 1 + x^{12} + x^{48} + x^{115} + x^{133} + x^{213} + x^{256},
\]

and

\[
f_N = x_{30}x_{70} \oplus x_{31}x_{71} \oplus x_{32}x_{72} \oplus x_{34}x_{74} \oplus x_{36}x_{76} \oplus x_{40}x_{80} \oplus x_{30}x_{32}x_{36}.
\]

By using the approach presented in Paper C, we obtain the Galois NLFSR shown in Figure 5.5. As the figure shows, the FSR is partitioned into three parts: FSR1, FSR2 and FSR3. The NLFSR in Figure 5.5 can generate two output sequences, OutN and OutL. OutN comes from the output of stage 244 (a non-linear sequence) and OutL comes from the output of stage 0 (a linear sequence). Since the period of the NLFSR is very large, $2^{256} - 1$, we can only evaluate its output sequences for a limited number of samples. In our experiment, we generate 1 000 000 samples from the NLFSR and these samples are divided into 20 sub-sequences in the NIST statistical tests [69]. Every test is based on 1 000 000 samples and each test is repeated 20 times to complete the final result. Table 5.1 shows the statistical test report generated by the NIST test suite for the output sequence OutN. As we can see from the table, the NLFSR-based filter generator has excellent statistical properties.

NIST Test The NIST test suite [69] was developed by the National Institute of Standards and Technology of US. It was used for choosing today’s Advanced Encryption Standard (AES) [66]. It is a helpful tool to evaluate statistical properties of Pseudo-Random Number Generators (PRNGs). In this section, we describe statistical tests that we used for evaluating our approach. The statistical package sts-2.1.1 includes 15 statistical tests. For each test, the suite first applies a procedure to find the statistical value of chi-square variation χ2, a particular parameter for the given sequence, which is obtained from the theoretical studies of an identical sequence under the assumption of . Then, the test suite transforms the χ2 data to a randomness probability data, called “P-value”. The theoretical models of each test are documented in the manual and specifi- cation [69]. The following is a brief list of the 15 statistic tests: • The Frequency (Monobit) Test can determine whether the number of ones and zeroes in a sequence are approximately the same as in a truly random sequence; • Frequency Test within a Block can determine whether the frequency of ones in an M-bit block is approximately M/2; 5.4. EXAMPLE 33 232 x 233 x 235 + x 234 x ... + 238 x 239 x . ) x + ( 240 x N 241 f x + and 242 x ) x ( L + L f 243 x Out + 3 244 x FSR AND AND 18 24 18 58 x x x x -stage LFSR with 20 x 2 256 N FSR Out 244 x 245 x 1 247 + x 246 x FSR 202 ... x + 250 x 124 251 x x + 252 x 110 253 x x + 254 x 45 x + 255 x 11 Figure 5.5: An NLFSR constructed from a x + 0 x AND AND 30 36 30 70 x x x x 32 x 34 CHAPTER 5. DESIGN OF SECURE FSRS

Table 5.1: The statistical tests report for the output sequence OutN .

P-VALUE PROPORTION STATISTICAL TEST 0.534146 20/20 Frequency 0.213309 20/20 BlockFrequency 0.911413 19/20 CumulativeSums 0.122325 20/20 Runs 0.534146 20/20 LongestRun 0.048716 20/20 Rank 0.437274 20/20 FFT 0.911413 20/20 NonOverlappingTemplate 0.275709 20/20 OverlappingTemplate 0.637119 20/20 Universal 0.437274 20/20 ApproximateEntropy 0.534146 12/12 RandomExcursions 0.350485 12/12 RandomExcursionsVariant 0.739918 20/20 Serial 0.437274 20/20 LinearComplexity

• The Runs Test determines whether the oscillation between zeroes and ones is too fast or too slow;

• The Test for the Longest Run of Ones in a Block determines whether the length of the longest run of ones within the tested sequence is consistent with that of a random sequence;

• The Binary Matrix Rank Test checks for linear dependence among fixed-length sub-strings of the original sequence;

• The Discrete Fourier Transform (Spectral) Test detects periodic features (i.e., repetitive patterns that are near each other) in the tested sequence;

• The Non-overlapping Template Matching Test checks the occurrences of a given non-periodic (aperiodic) pattern;

• The Overlapping Template Matching Test checks the occurrences of pre-specified target strings;

• Maurer’s “Universal Statistical” Test detects whether or not the sequence can be significantly compressed without loss of information;

• The Linear Complexity Test determines whether or not the sequence is complex enough to be considered random;

• The Serial Test checks whether the number of occurrences of each of the $2^m$ $m$-bit overlapping patterns is approximately the same as would be expected for a random sequence;

• The Approximate Entropy Test compares the frequency of overlapping blocks of two consecutive lengths ($m$ and $m+1$) against the expected result for a random sequence;

• The Cumulative Sums (Cusums) Test determines whether the cumulative sum of the partial sequences occurring in the tested sequence is too large or too small relative to the expected behaviour of that cumulative sum for random sequences;

• The Random Excursions Test determines whether the number of visits to a particular state within a cycle deviates from what one would expect for a random sequence;

• The Random Excursions Variant Test detects deviations from the expected number of visits to various states in the random walk.
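As an illustration of how a single test maps a sequence to a P-value, the following is a minimal Python sketch of the Frequency (Monobit) Test as it is specified in the NIST documentation [69]. It is not the sts-2.1.1 implementation; the function name `monobit_p_value` and the decision level 0.01 used in the usage note are assumptions made for this example.

```python
import math


def monobit_p_value(bits):
    """Return the P-value of the Frequency (Monobit) Test.

    bits: a non-empty sequence of integers, each 0 or 1.
    """
    n = len(bits)
    if n == 0:
        raise ValueError("the sequence must be non-empty")
    # Map 0 -> -1 and 1 -> +1 and add up the mapped values.
    s_n = sum(2 * b - 1 for b in bits)
    # Normalised absolute sum; small values indicate balanced ones and zeroes.
    s_obs = abs(s_n) / math.sqrt(n)
    # Complementary error function gives the two-sided tail probability.
    return math.erfc(s_obs / math.sqrt(2))


if __name__ == "__main__":
    sample = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
    print(monobit_p_value(sample))
```

A sequence would typically be accepted by this test when the P-value is at least the chosen significance level (0.01 is assumed here); a sequence consisting only of ones yields a P-value very close to zero.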

5.5 Conclusion

This chapter presents a new technique for increasing the throughput of filter generators. We first convert a filter generator into an NLFSR that generates a set of output sequences equivalent to that of the filter generator. Then, we reduce the propagation delay of the NLFSR by redistributing monomials of its feedback functions. The presented approach might be useful for applications that require very high data rates, e.g. 5G mobile communication technology. Some cryptographic systems based on the presented approach have already been designed, e.g. the Espresso stream cipher presented in [29].

Chapter 6

Design of Reliable FSRs

The continuing rise in circuit density and the reduction of device sizes cause many new constraints and problems in Integrated Circuit (IC) design. One of the serious consequences of these changes is the reduction of circuit reliability. In this chapter, we introduce a new approach for improving the reliability of FSRs, which are the key component of stream ciphers. We present a method for detecting and correcting transient faults in FSRs based on duplication and parity checking. Periodic fault detection in functional circuits is very important for cryptographic systems because a random hardware fault can compromise their security. For example, if the output of a Pseudo-Random Sequence Generator (PRSG) contained in a stream cipher gets stuck at 0, then the stream cipher will be sending messages unencrypted. The presented method is more reliable than TMR for large FSRs, while the area overhead of the two approaches is comparable. The presented approach might be important for cryptographic systems using large FSRs.

6.1 Stream Ciphers

Stream ciphers are widely used in modern cryptographic systems. The block diagram of a stream cipher is shown in Figure 6.1. The main building block of a stream cipher is a cryptographically strong pseudo-random sequence generator. Key and IV represent the secret key and the initialization value for the PRSG, which are used to make the initial conditions of the PRSG more difficult for an attacker to guess. To encrypt, the plain text is combined (typically using a bitwise XOR) with the key stream generated by the PRSG. To decrypt, the reverse operation is performed.

The pseudo-random sequence generator is usually based on LFSRs or NLFSRs, not only for the good statistical properties of the sequences they produce, but also for the simplicity and speed of their hardware implementation. Figure 6.2 shows the block diagrams of the PRSGs used in several stream ciphers: A5/1 (used for the Global System for Mobile Communications [67, 45]), E0 (used for Bluetooth communications [15]), the WG-7 stream cipher (used for RFID [62]) and the Grain cipher [50].

Figure 6.1: Encryption and decryption using a stream cipher. (a) Encryption: the PRSG, initialized with Key and IV, produces a key stream that is added to the plain text to give the cipher text. (b) Decryption: the same key stream is added to the cipher text to recover the plain text.
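The encryption and decryption scheme of Figure 6.1 can be sketched in a few lines of Python. This is a toy illustration only: the 4-stage LFSR, its tap positions, the seed and the names `lfsr_keystream` and `xor_bits` are assumptions made for this example, and such a small generator is not cryptographically secure.

```python
def lfsr_keystream(state, taps, nbits):
    """Generate nbits key stream bits from a Fibonacci LFSR.

    state: list of 0/1 values; state[0] is the output stage.
    taps:  indices of the stages that are XORed to form the feedback bit.
    """
    state = list(state)
    out = []
    for _ in range(nbits):
        out.append(state[0])
        fb = 0
        for t in taps:
            fb ^= state[t]
        # Shift: every stage takes the value of its neighbour,
        # the last stage takes the feedback bit.
        state = state[1:] + [fb]
    return out


def xor_bits(data_bits, key_bits):
    """Bitwise XOR of two equally long bit lists."""
    return [d ^ k for d, k in zip(data_bits, key_bits)]


# Hypothetical initial state standing in for the state derived from Key and IV.
seed = [1, 0, 0, 1]
taps = (0, 1)

plain = [1, 0, 1, 1, 0, 0, 1, 0]
cipher = xor_bits(plain, lfsr_keystream(seed, taps, len(plain)))
recovered = xor_bits(cipher, lfsr_keystream(seed, taps, len(cipher)))
```

Because XOR is its own inverse, the receiver recovers the plain text by regenerating the same key stream from the same initial state.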

6.2 Transient Faults and Triple Modular Redundancy

Hardware faults can be classified into two major classes [34]:

• A permanent fault remains active until a corrective action is taken. These faults are usually caused by physical defects in the hardware such as shorts in the circuit, broken interconnections, or stuck cells in a memory;

• A transient fault (also called soft-error [80]) remains active for a short period of time, which can only be detected by online detection or concurrent checking and not by off-line testing.

Transient faults are the dominant type of faults in today’s ICs. For example, about 98% of Random Access Memory (RAM) faults are transient faults [75]. For reliable circuit design, TMR is a common form of hardware redundancy. Its basic configuration is shown in Figure 6.3. The modules are triplicated to perform the same computation in parallel, and majority voting is used to determine the correct result. If one of the modules fails, the majority voter masks the fault by recognizing as correct the result of the remaining two fault-free modules. It is well known that the reliability of TMR can be calculated as [34]:

$$R_{TMR} = (3R_m^2 - 2R_m^3)R_v,$$

where $R_m$ is the reliability of a module and $R_v$ is the reliability of the voter. We apply TMR to an n-stage FSR as shown in Figure 6.4: we triplicate the register stages and the feedback function. Suppose that every register stage has the reliability $R_m = p$ and that the voter is 100% reliable. Then the reliability of the FSR in Figure 6.4 is

$$R_{tmr} = (3p^2 - 2p^3)^n. \tag{6.1}$$
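The TMR reliability formula can be checked by enumerating all working/failed combinations of the three modules and adding up the probability of those in which at least two modules work. The Python sketch below does this under the assumption that the modules fail independently; the function names are chosen for this example only.

```python
from itertools import product


def tmr_module_reliability(r_m, r_v=1.0):
    """Probability that a majority of three independent modules works,
    multiplied by the reliability r_v of the voter."""
    total = 0.0
    for works in product((True, False), repeat=3):
        if sum(works) >= 2:  # the voter masks at most one failed module
            prob = 1.0
            for w in works:
                prob *= r_m if w else (1.0 - r_m)
            total += prob
    return total * r_v


def tmr_fsr_reliability(p, n):
    """Reliability of an n-stage FSR with a perfect voter at every stage."""
    return tmr_module_reliability(p) ** n
```

The enumeration yields $p^3 + 3p^2(1-p)$, which simplifies to the $3p^2 - 2p^3$ term used in Equation (6.1).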

Figure 6.2: The pseudo-random sequence generator of stream ciphers A5/1, E0, WG-7 and Grain. Panels: (a) A5/1 stream cipher; (b) E0 stream cipher; (c) WG-7 stream cipher; (d) Grain cipher.

Figure 6.3: The basic configuration of TMR. Inputs 1 to 3 feed Modules 1 to 3, whose results are passed to a voter that produces the output.

Figure 6.4: The block diagram for an n-stage FSR using TMR. Three copies of the register stages ($x_i$, $y_i$ and $z_i$), each with its own feedback function $f_{fb}$, feed a voter at every stage.

The area overhead of TMR in Figure 6.4 is

$$A_{tmr} = 3(n\text{-stage FSR} + f_{fb}) + n \times \text{Voter}. \tag{6.2}$$

6.3 The Duplication and Parity Checking Approach

Our approach is based on the observation that storage elements (e.g. registers, memory) are more sensitive to transient faults than combinational logic elements [56, 27]. In our reliability evaluation, we assume that transient faults in combinational logic circuits propagate to one of the register stages within one clock cycle, i.e. they manifest themselves as errors in the values of register stages. Therefore, in our evaluation we consider transient faults in register stages only. Voters and error-correcting circuits are assumed to be perfect.

The block diagram of the presented duplication and parity checking approach is shown in Figure 6.5. The output Out represents the output sequence and the output Error indicates the working status of the circuit (Error = 1 means that the circuit has failed). In Figure 6.5, we duplicate the FSR ($F_x$ and $F_y$) and, when an error is detected, the error-correcting circuit ($C_{ec}$) is activated to correct it.

Figure 6.5: The block diagram of the duplication and parity checking approach. The FSRs $F_x$ and $F_y$ pass their outputs $out_x$, $out_y$ and error signals $e_x$, $e_y$ to the circuit $C_{ec}$, which produces Out and Error.

Error Detection

We use an even parity code to detect faults in the FSRs $F_x$ and $F_y$. The even parity code of length n consists of all binary n-tuples that contain an even number of 1’s. Typically, the first n − 1 bits of a codeword are the data bits carrying information, while the last bit is the check bit, which determines the parity of the codeword [34].

Let us consider a 5-stage FSR with feedback function $f_{fb}$ as an example to explain how an error is detected by the parity checking method. In Figure 6.6, $x_i$, $i \in \{1, 2, \dots, 5\}$, represents the state of the i-th stage and $x_{pc}$ represents the check bit. At any time t, the check for errors is done using the equation

$$e_x = x_1(t) \oplus x_2(t) \oplus x_3(t) \oplus x_4(t) \oplus x_5(t) \oplus x_{pc}(t), \tag{6.3}$$

where $e_x$ is the error-checking result ($e_x = 1$ indicates that an odd number of stages in the FSR are affected by faults). Since FSRs in the Fibonacci configuration have the property that

$$x_i(t+1) = x_{i+1}(t), \quad i \in \{1, 2, \dots, n-1\},$$

the parity bit $x_{pc}(t+1)$ can be predicted at time t using the equation

$$x_{pc}(t+1) = x_2(t) \oplus x_3(t) \oplus x_4(t) \oplus x_5(t) \oplus f_{fb}(t). \tag{6.4}$$

As the above equations show, we can check for errors and predict the parity bit at the same time. We also see that the error-checking Equation (6.3) and the parity-predicting Equation (6.4) share some of their XOR terms. At any time t, if an odd number of stages in the FSR are affected by faults, the error-checking signal $e_x$ will be 1 to indicate the presence of this error.
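The interplay of Equations (6.3) and (6.4) can be illustrated with a small Python model of a 5-stage Fibonacci FSR. The feedback function used below is a hypothetical example, since the text does not fix one, and the names `error_check` and `step` are likewise assumptions of this sketch.

```python
def parity(bits):
    """XOR of all bits in the list."""
    acc = 0
    for b in bits:
        acc ^= b
    return acc


def feedback(state):
    # Hypothetical nonlinear feedback function: x1 XOR (x2 AND x3) XOR x4.
    # state[i] holds the value of stage x_{i+1}.
    return state[0] ^ (state[1] & state[2]) ^ state[3]


def error_check(state, x_pc):
    """Equation (6.3): XOR of all stages and the check bit."""
    return parity(state) ^ x_pc


def step(state, x_pc):
    """Advance the FSR by one clock cycle.

    Returns the next state and the parity bit predicted by Equation (6.4).
    """
    fb = feedback(state)
    next_pc = parity(state[1:]) ^ fb  # x2 ^ ... ^ xn ^ f_fb
    next_state = state[1:] + [fb]     # x_i(t+1) = x_{i+1}(t)
    return next_state, next_pc


# State of Figure 6.6, listed as x1..x5, with check bit x_pc = 1.
state, x_pc = [0, 1, 1, 0, 1], 1
fault_free = error_check(state, x_pc)          # no stage is faulty
with_fault = error_check([0, 0, 1, 0, 1], x_pc)  # x2 has been flipped
```

In this model the predicted parity bit always matches the parity of the next state as long as no fault occurs, so the check signal stays at 0 during fault-free operation and changes to 1 when a single stage is flipped.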

(a) A 5-stage FSR with error detection (circuit diagram).

(b) Error detection:

               x5 x4 x3 x2 x1 xpc   ex = x1 ⊕ x2 ⊕ x3 ⊕ x4 ⊕ x5 ⊕ xpc
Correct case:   1  0  1  1  0  1    0
Error case:     1  0  1  0  0  1    1

Figure 6.6: An example of error detection.

Error Correction

When an error is detected by the error-detecting circuit, the error signals ex and ey activate the error-correcting circuit Cec. In Figure 6.7, a fault in y2 can be detected and corrected to the value of x2.

x5 x4 x3 x2 x1 xpc
 1  0  1  1  0  1    ex = x1 ⊕ x2 ⊕ x3 ⊕ x4 ⊕ x5 ⊕ xpc = 0

y5 y4 y3 y2 y1 ypc
 1  0  1  0  0  1    ey = y1 ⊕ y2 ⊕ y3 ⊕ y4 ⊕ y5 ⊕ ypc = 1

Figure 6.7: An example of error correction.

The error-correcting circuit works as shown in Figure 6.8(a). After simplification, the circuit $C_{ec}$ for every register stage is shown in Figure 6.8(b). It should be noted that when $F_x$ and $F_y$ are both in an error condition, the error signal Error is activated to prevent the propagation of errors.

Error detection   Function of circuit Cec
ex  ey
0   0             no error
0   1             correcting value of Fy to Fx
1   0             correcting value of Fx to Fy
1   1             error signal Error = 1

(a) The function of the error-correcting circuit Cec.

(b) The error-correcting circuit for every register stage (circuit diagram: a MUX selects between xi−1 and yi−1, and AND gates driven by ex and ey control the selection and generate Error).

Figure 6.8: The error-correcting circuit Cec.
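The behaviour listed in Figure 6.8(a) can be expressed as a short Python function. This is a behavioural sketch, not the gate-level circuit of Figure 6.8(b); the name `correct_copies` and the convention that each register list ends with its check bit are assumptions of this example.

```python
def parity(bits):
    """XOR of all bits in the list."""
    acc = 0
    for b in bits:
        acc ^= b
    return acc


def correct_copies(x, y):
    """Apply the correction rule of Figure 6.8(a).

    x, y: register contents of F_x and F_y; each list holds the stage
          values followed by the check bit, so a fault-free list has
          even parity.
    Returns (x, y, error) after correction.
    """
    e_x, e_y = parity(x), parity(y)
    if e_x and e_y:
        return list(x), list(y), 1   # both copies suspect: raise Error
    if e_y:
        return list(x), list(x), 0   # overwrite F_y with F_x
    if e_x:
        return list(y), list(y), 0   # overwrite F_x with F_y
    return list(x), list(y), 0       # no error detected


# Values of Figure 6.7, listed as stage 1..5 followed by the check bit.
fx = [0, 1, 1, 0, 1, 1]
fy = [0, 0, 1, 0, 1, 1]   # y2 is faulty
fixed_x, fixed_y, error = correct_copies(fx, fy)
```

As with any single parity bit, an even number of faulty stages in one copy leaves the parity unchanged and is therefore not caught by this check.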

Reliability Evaluation

In our duplication and parity checking approach, we use parity checking for error detection to indicate the working status of the circuit in real time. We define the operation of the FSR as reliable if one of the following conditions holds:

• Case 1: Both Fx and Fy have no error. The output sequence is correct;

• Case 2: An odd number of stages in Fx are affected by faults and Fy has no error. The faults in Fx will be corrected by the error-correcting circuit and the output sequence will be correct;

• Case 3: An odd number of stages in Fy are affected by faults and Fx has no error. The faults in Fy will be corrected by the error-correcting circuit and the output sequence will be correct;

• Case 4: Both Fx and Fy have an odd number of stages affected by faults. The faults will be detected by the error-correcting circuit.

In the traditional dependability evaluation, the last case is called a “failed safe” state [34]. In this state, the circuit is not capable of producing a correct output any longer, but the error is detected and reported, so corrective actions can be taken to recover from the error.

Theorem 2 The reliability of the duplication and parity checking approach for an n-stage FSR is

$$R_{dpc} = \left(p^{n+1} + \frac{1 - (2p-1)^{n+1}}{2}\right)^2, \tag{6.5}$$

where p is the reliability of every register stage in the FSR.

Proof: The reliability of the duplication and parity checking approach for an n-stage FSR is calculated by the equation

$$R_{dpc} = P_{correct}^2 + 2P_{correct}P_{odd} + P_{odd}^2 = (P_{correct} + P_{odd})^2,$$

where $P_{correct}^2$ is the probability that both FSRs work correctly, $P_{correct}P_{odd}$ is the probability that one FSR works correctly and the other one has an odd number of faults (two cases), and $P_{odd}^2$ is the probability that both FSRs have an odd number of faults.

Since $P_{correct}$ is the probability that one FSR works correctly, which means that all $n+1$ registers (the $n$ stages and the parity bit) are reliable, it is given by $p^{n+1}$. Since $P_{odd}$ is the probability that one FSR encounters an odd number of faults, which means that an odd number of stages in the FSR are affected by faults, it is given by

$$P_{odd} = \sum_{m=0}^{\lfloor n/2 \rfloor} \binom{n+1}{2m+1} p^{n-2m}(1-p)^{2m+1}. \tag{6.6}$$

Next we derive a closed formula for Equation (6.6). According to the binomial theorem [18], $(x+y)^n$ can be expanded into a sum of the form

$$(x+y)^n = \binom{n}{0}x^n y^0 + \binom{n}{1}x^{n-1}y^1 + \dots + \binom{n}{n}x^0 y^n = \sum_{k=0}^{n}\binom{n}{k}x^{n-k}y^k, \tag{6.7}$$

where each $\binom{n}{k}$ is a specific positive integer known as a binomial coefficient. Similarly, $(x-y)^n$ expands into

$$(x-y)^n = \binom{n}{0}x^n y^0 - \binom{n}{1}x^{n-1}y^1 + \dots + \binom{n}{n}x^0(-y)^n = \sum_{k=0}^{n}\binom{n}{k}x^{n-k}(-y)^k. \tag{6.8}$$

By subtracting Equation (6.8) from Equation (6.7), we get

$$(x+y)^n - (x-y)^n = 2\sum_{m=0}^{\lfloor (n-1)/2 \rfloor}\binom{n}{2m+1}x^{n-2m-1}y^{2m+1}.$$

The sum of the odd-index terms in Equation (6.7) is therefore

$$\sum_{m=0}^{\lfloor (n-1)/2 \rfloor}\binom{n}{2m+1}x^{n-2m-1}y^{2m+1} = \frac{(x+y)^n - (x-y)^n}{2}. \tag{6.9}$$

Substituting $x = p$ and $y = 1 - p$ and replacing $n$ by $n+1$ in Equation (6.9) turns its left-hand side into Equation (6.6), so $P_{odd} = \frac{1-(2p-1)^{n+1}}{2}$. Therefore,

$$R_{dpc} = \left(p^{n+1} + \frac{1 - (2p-1)^{n+1}}{2}\right)^2.$$

□

Figure 6.9 shows the reliability of our approach and of TMR, computed according to Equation (6.5) and Equation (6.1), respectively. We see that our duplication and parity checking approach is more reliable than TMR for large FSRs.

Figure 6.9: The reliability of our approach and TMR, where the reliability of every register stage is p = 0.968 [72]. The plot shows the reliability R against the number of stages of an FSR (0 to 800) for the original n-stage FSR, the FSR using TMR and the FSR using parity.
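The closed form of Theorem 2 and the comparison with TMR can be reproduced numerically. The following Python sketch evaluates the sum in Equation (6.6), the closed form of $P_{odd}$, and Equations (6.5) and (6.1); the function names are chosen for this example, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def p_odd_sum(p, n):
    """Equation (6.6): probability of an odd number of faults among n + 1 registers."""
    return sum(
        comb(n + 1, 2 * m + 1) * p ** (n - 2 * m) * (1 - p) ** (2 * m + 1)
        for m in range(n // 2 + 1)
    )


def p_odd_closed(p, n):
    """Closed form of Equation (6.6) obtained from Equation (6.9)."""
    return (1 - (2 * p - 1) ** (n + 1)) / 2


def r_dpc(p, n):
    """Equation (6.5): reliability of duplication and parity checking."""
    return (p ** (n + 1) + p_odd_closed(p, n)) ** 2


def r_tmr(p, n):
    """Equation (6.1): reliability of an n-stage FSR using TMR."""
    return (3 * p ** 2 - 2 * p ** 3) ** n


if __name__ == "__main__":
    p = 0.968  # register-stage reliability used in Figure 6.9
    for n in (10, 200, 400, 800):
        print(n, r_tmr(p, n), r_dpc(p, n))
```

For large n the terms $p^{n+1}$ and $(2p-1)^{n+1}$ vanish, so $R_{dpc}$ approaches 1/4 while $R_{tmr}$ tends to 0; for small FSRs TMR gives the higher value.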

Area Overhead Evaluation

The area overhead is calculated based on the area approximation of gates and flip-flops shown in Table 6.1.

Table 6.1: Area approximation of gates and flip-flops.

Gate          Area
2-input AND   1
2-input XOR   2
flip-flop     4

The area overhead of the duplication and parity checking approach for an n-stage FSR is

$$A_{dpc} = 2(n\text{-stage FSR} + f_{fb} + \text{parity bit}) + A_{detection} + A_{correction},$$

where $A_{detection}$ is the area overhead of the error-detecting circuit and $A_{correction}$ is the area overhead of the error-correcting circuit.

In the error-detecting circuit, error checking needs n 2-input XOR gates and parity prediction needs n − 1 2-input XOR gates. However, if we take into account the n − 1 overlapping XOR terms, the area overhead of the error-detecting circuit reduces to

$$A_{detection} = (n+1) \times \text{2-input XOR} = (2n+2) \times \text{2-input AND}.$$

In the error-correcting circuit, every stage needs a 2-input MUX (6 × 2-input AND), so the area overhead of the error-correcting circuit is

$$A_{correction} = (n+1) \times \text{MUX} + 2 \times \text{2-input AND} = (6n+8) \times \text{2-input AND}.$$

The feedback function is usually small, less than n gates, which we approximate as n × 2-input AND. Finally, the area overhead of the duplication and parity checking approach for an n-stage FSR is

$$A_{dpc} = (18n+18) \times \text{2-input AND}.$$

Similarly, one voter needs 5 × 2-input AND. The area overhead for an FSR using TMR according to Equation (6.2) is

$$A_{tmr} = 20n \times \text{2-input AND}.$$

As shown in Figure 6.10, our approach has a smaller area overhead than TMR: it is about 10% smaller on average.
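The arithmetic behind the two totals can be retraced with a short Python sketch that uses the unit areas of Table 6.1 together with the MUX and voter costs stated in the text. The constant and function names are chosen for this example.

```python
# Unit areas in multiples of a 2-input AND gate (Table 6.1).
AND2, XOR2, FF = 1, 2, 4
MUX2 = 6 * AND2    # 2-input MUX, as stated in the text
VOTER = 5 * AND2   # one voter, as stated in the text


def area_dpc(n):
    """Area of the duplication and parity checking approach for n stages."""
    fsr = n * FF                     # n flip-flops
    f_fb = n * AND2                  # feedback function approximated as n ANDs
    parity_bit = FF                  # one flip-flop for the check bit
    detection = (n + 1) * XOR2
    correction = (n + 1) * MUX2 + 2 * AND2
    return 2 * (fsr + f_fb + parity_bit) + detection + correction


def area_tmr(n):
    """Area of an n-stage FSR using TMR, Equation (6.2)."""
    return 3 * (n * FF + n * AND2) + n * VOTER


if __name__ == "__main__":
    for n in (100, 400, 800):
        print(n, area_dpc(n), area_tmr(n), area_dpc(n) / area_tmr(n))
```

The ratio $(18n+18)/(20n)$ approaches 0.9 as n grows, which corresponds to the saving of about 10% quoted above.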

6.4 Conclusion

In this chapter, we present an approach for designing reliable FSRs using duplication and parity checking. We demonstrate that the presented approach is more reliable than TMR for large FSRs, while its area overhead is smaller than that of TMR.

Figure 6.10: The area overhead of our approach and TMR. The plot shows the area overhead (in 2-input AND gates) against the number of stages of an FSR (0 to 800) for the FSR using parity and the FSR using TMR.

Chapter 7

Conclusion and Future Work

The contributions of this thesis can be partitioned into two parts.

• In network analysis, we calculated and analysed the number of attractors in multiple-valued networks using a SAT-based bounded model checking algorithm. The presented algorithm can handle larger networks and compute attractors much faster than other tools based on DDs. We also analysed the robustness of Boolean networks based on the stuck-at fault model and showed how to construct balanced Boolean networks, which are more robust than random Boolean networks.

• In network synthesis, we transformed a filter generator into an NLFSR using the Galois shifting operation. The results showed that the propagation delay is largely reduced and the output sequence has good statistical properties. We also presented an approach for designing reliable FSRs. The results showed that our duplication and parity checking approach is more reliable than TMR for large FSRs, while having a smaller area overhead.

The results presented so far lead to several directions for the future work.

• During our research, we found that there are many similarities between Boolean networks and FSRs, which are key components of many cryptographic systems. One interesting idea is to make use of the accumulated knowledge about the dynamics of Boolean networks in the chaotic regime to construct NLFSRs with full-length cyclic attractors.

• As is known, in many random Boolean networks a minor change in network parameters or functions may result in totally different network characteristics, such as changes of attractor states, network dynamics and behavioural regime. However, there are some Boolean networks that can keep their network characteristics stable under these small changes. In our research, we found that balanced Boolean networks are capable of preserving attractors under functional changes. More work needs to be done to make use of the interesting potential of Boolean networks to tolerate faults.

Bibliography

[1] Miron Abramovici, Melvin A Breuer, and Arthur D Friedman. Digital systems testing and testable design. IEEE press, 1994.

[2] Sheldon B Akers. Binary decision diagrams. Computers, IEEE Transactions on, 100(6):509–516, 1978.

[3] T. Akutsu, S. Kuhara, O. Maruyama, and S. Miyano. A system for identifying genetic networks from gene expression patterns produced by gene disruptions and overexpressions. Genome Informatics, 9:151–160, 1998.

[4] István Albert, Juilee Thakar, Song Li, Ranran Zhang, and Réka Albert. Boolean network simulations for life scientists. Source code for biology and medicine, 3(1):1–8, 2008.

[5] B. Alberts, D. Bray, J. Lewis, M. Raff, K. Roberts, and J. D. Watson. Molecular Biology of the Cell. Garland Publishing, 1994.

[6] Maximino Aldana. Boolean dynamics of networks with scale-free topology. Physica D: Nonlinear Phenomena, 185(1):45–66, 2003.

[7] Maximino Aldana, Enrique Balleza, Stuart Kauffman, and Osbaldo Resendiz. Robustness and evolvability in genetic regulatory networks. Journal of theoretical biology, 245(3):433–448, 2007.

[8] Maximino Aldana, Susan Coppersmith, and Leo P Kadanoff. Boolean dynamics with random couplings. In Perspectives and Problems in Nonlinear Science, pages 23–89. Springer, 2003.

[9] Albert-László Barabási and Réka Albert. Emergence of scaling in random networks. science, 286(5439):509–512, 1999.

[10] Albert-Laszlo Barabasi and Zoltan N Oltvai. Network biology: understanding the cell’s functional organization. Nature reviews genetics, 5(2):101–113, 2004.

[11] Mike Benton. Evolution in four dimensions: Genetic, epigenetic, behavioral, and symbolic variation in the history of life. Journal of Clinical Investigation, 115(11):2961, 2005.

[12] UC Berkeley. Berkeley logic interchange format (BLIF). Oct Tools Distribution, 2:197–247, 1992.

[13] A. Biere, A. Cimatti, E.M. Clarke, M. Fujita, and Y. Zhu. Symbolic model checking using SAT procedures instead of BDDs. Proceedings of Design Automation Conference (DAC’99), pages 317–320, June 1999.

[14] Sven Bilke and Fredrik Sjunnesson. Stability of the Kauffman model. Physical Review E, 65(1):016129, 2001.

[15] SIG Bluetooth. Specification of the Bluetooth system: Profiles, version 1.1, 2001.

[16] An Braeken and Joseph Lano. On the (im)possibility of practical and secure nonlinear filters and combiners. In Proceedings of the 12th international conference on Selected Areas in Cryptography, SAC’05, pages 159–174, Berlin, Heidelberg, 2006. Springer-Verlag.

[17] Tim Bray, Jean Paoli, CM Sperberg-McQueen, Eve Maler, and François Yergeau. Extensible markup language (xml) 1.0, 2011.

[18] Richard A Brualdi. Introductory combinatorics. New York, 1992.

[19] J.R. Burch, E.M. Clarke, K.L. McMillan, D.L. Dill, and L.J. Hwang. Symbolic model checking: $10^{20}$ states and beyond. In Proceedings of the Fifth Annual IEEE Symposium on Logic in Computer Science, pages 1–33. IEEE Computer Society Press, 1990.

[20] Scott Camazine. Self-organization in biological systems. Princeton University Press, 2003.

[21] K Chakrabarty and JP Hayes. Balanced Boolean functions. In IEE Proceedings - Computers and Digital Techniques, volume 145, pages 52–62. IET, January 1998.

[22] Tammy MK Cheng, Sakshi Gulati, Rudi Agius, and Paul A Bates. Understanding cancer mechanisms through network dynamics. Briefings in functional genomics, 11(6):543–560, 2012.

[23] T. W. Cusick and P. Stănică. Cryptographic Boolean functions and applications. Academic Press, San Diego, CA, USA, 2009.

[24] Maria I Davidich and Stefan Bornholdt. Boolean network model predicts cell cycle sequence of fission yeast. PloS one, 3(2):e1672, 2008.

[25] Eric H Davidson and Douglas H Erwin. Gene regulatory networks and the evolution of animal body plans. Science, 311(5762):796–800, 2006.

[26] Leandro Nunes De Castro. Fundamentals of natural computing: basic concepts, algorithms, and applications. CRC Press, 2006.

[27] Yuvraj Singh Dhillon, Abdulkadir Utku Diril, and Abhijit Chatterjee. Soft-error tolerance analysis and optimization of nanometer circuits. In Design, Automation, and Test in Europe, pages 389–400. Springer, 2008.

[28] Barbara Drossel. Random Boolean networks. Reviews of nonlinear dynamics and complexity, 1:69–110, 2008.

[29] E. Dubrova and M. Hell. Espresso: A stream cipher for 5G wireless communication systems. Cryptography and Communications, 2015. Submitted, available at https://eprint.iacr.org/2015/241.

[30] Elena Dubrova. Multiple-valued logic synthesis and optimization. In Logic Synthesis and Verification, pages 89–114, Norwell, MA, USA, 2002. Kluwer Academic Publishers. ISBN 0-7923-7606-4.

[31] Elena Dubrova. Random multiple-valued networks: Theory and applications. In Proceedings of International Symposium on Multiple-Valued Logic (ISMVL’2006), pages 27–33, May 2006.

[32] Elena Dubrova. A transformation from the Fibonacci to the Galois NLFSRs. IEEE Transactions on Information Theory, 55(11):5263–5271, November 2009.

[33] Elena Dubrova. Synthesis of parallel binary machines. In 2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pages 200–206. IEEE, 2011.

[34] Elena Dubrova. Fault-tolerant design. Springer, 2013.

[35] Elena Dubrova. A scalable method for constructing Galois NLFSRs with period $2^n - 1$ using cross-join pairs. Information Theory, IEEE Transactions on, 59(1):703–709, 2013.

[36] Elena Dubrova, Ming Liu, and Maxim Teslenko. Finding attractors in synchronous multiple-valued networks using SAT-based bounded model checking. Journal of Multiple-Valued Logic and Soft Computing, 19(1-3):109–131, 2012.

[37] Elena Dubrova, Maxim Teslenko, and Ming Liu. Finding attractors in synchronous multiple-valued networks using SAT-based bounded model checking. In 40th IEEE International Symposium on Multiple-Valued Logic, ISMVL 2010, Barcelona, Spain, 26-28 May 2010, pages 144–149, 2010.

[38] Elena Dubrova, Maxim Teslenko, and Andres Martinelli. Kauffman networks: Analysis and applications. In Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design, pages 479–484. IEEE Computer Society, 2005.

[39] Elena Dubrova, Maxim Teslenko, and Hannu Tenhunen. A computational scheme based on random Boolean networks. In Transactions on Computational Systems Biology X, pages 41–58. Springer, 2008.

[40] Adrien Fauré and Denis Thieffry. Logical modelling of cell cycle control in eukaryotes: a comparative study. Molecular BioSystems, 5(12):1569–1581, 2009.

[41] Harold Fredricksen. A survey of full length nonlinear shift register cycle algorithms. SIAM Review, 24(2):195–221, 1982.

[42] Herman F Fumiã and Marcelo L Martins. Boolean network model for cancer pathways: Predicting carcinogenesis and targeted therapy outcomes. PloS one, 8(7):e69008, 2013.

[43] A Garg, A Di Cara, I Xenarios, L Mendoza, and G De Micheli. Synchronous versus asynchronous modeling of gene regulatory networks. Bioinformatics, 24(17):1917–1925, 2008.

[44] Jovan Golic. On the security of nonlinear filter generators. In Dieter Gollmann, editor, Fast Software Encryption, volume 1039 of Lecture Notes in Computer Science, pages 173–188. Springer Berlin / Heidelberg, 1996.

[45] Jovan Dj Golić. Cryptanalysis of alleged A5 stream cipher. In Advances in Cryptology-EUROCRYPT’97, pages 239–255. Springer, 1997.

[46] Solomon W Golomb et al. Shift register sequences. Aegean Park Press, 1982.

[47] A Gonzalez Gonzalez, Aurélien Naldi, Lucas Sanchez, Denis Thieffry, and Claudine Chaouiya. GINsim: a software suite for the qualitative modelling, simulation and analysis of regulatory networks. Biosystems, 84(2):91–100, 2006.

[48] John P Hayes. Fault modeling for digital MOS integrated circuits. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, 3(3):200–208, 1984.

[49] Michael Hecker, Sandro Lambeck, Susanne Toepfer, Eugene Van Someren, and Reinhard Guthke. Gene regulatory network inference: data integration in dynamic models–a review. Biosystems, 96(1):86–103, 2009.

[50] Martin Hell, Thomas Johansson, Alexander Maximov, and Willi Meier. The Grain family of stream ciphers. In New Stream Cipher Designs, pages 179–190. Springer, 2008.

[51] Sui Huang and Stuart A Kauffman. Complex gene regulatory networks – from structure to biological observables: Cell fate determination. In Encyclopedia of complexity and systems science, pages 1180–1213. Springer, 2009.

[52] S. A. Kauffman. Metabolic stability and epigenesis in randomly constructed genetic nets. Journal of Theoretical Biology, 22:437–467, 1969.

[53] S. A. Kauffman. The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press, Oxford, 1993.

[54] Stuart Kauffman, Carsten Peterson, Björn Samuelsson, and Carl Troein. Genetic networks with canalyzing Boolean rules are always stable. Proceedings of the National Academy of Sciences of the United States of America, 101(49):17102–17107, 2004.

[55] Fangting Li, Tao Long, Ying Lu, Qi Ouyang, and Chao Tang. The yeast cell-cycle network is robustly designed. Proceedings of the National Academy of Sciences of the USA (PNAS), 101(14):4781–4786, 2004.

[56] Peter Lidén, Peter Dahlgren, Rolf Johansson, and Johan Karlsson. On latching probability of particle induced transients in combinational networks. In Twenty-Fourth International Symposium on Fault-Tolerant Computing, 1994. FTCS-24, pages 340–349. IEEE, 1994.

[57] R. Lidl and H. Niederreiter. Introduction to Finite Fields and their Applications. Cambridge Univ. Press, 1994.

[58] Ming Liu and Elena Dubrova. The robustness of balanced Boolean networks. In Ronaldo Menezes, Alexandre Evsukoff, and Marta C. González, editors, Complex Networks, volume 424 of Studies in Computational Intelligence, pages 19–30. Springer Berlin Heidelberg, 2013. ISBN 978-3-642-30286-2.

[59] Ming Liu and Elena Dubrova. A new approach to reliable FSRs design. In Proceedings of 32nd Nordic Microelectronics Conference NORCHIP, Oct 2014.

[60] Ming Liu, S.S. Mansouri, and E. Dubrova. A faster shift register alternative to filter generators. In Proceedings of 2013 Euromicro Conference on Digital System Design (DSD), pages 713–718, Sept 2013.

[61] Joseph T Lizier, Siddharth Pritam, and Mikhail Prokopenko. Information dynamics in small-world Boolean networks. Artificial Life, 17(4):293–314, 2011.

[62] Yiyuan Luo, Qi Chai, Guang Gong, and Xuejia Lai. A lightweight stream cipher WG-7 for RFID encryption and authentication. In Global Telecommunications Conference (GLOBECOM 2010), 2010 IEEE, pages 1–6. IEEE, 2010.

[63] Marco Mamei, Ronaldo Menezes, Robert Tolksdorf, and Franco Zambonelli. Case studies for self-organization in computer science. Journal of Systems Architecture, 52(8):443–460, 2006.

[64] James L Massey. Shift-register synthesis and BCH decoding. Information Theory, IEEE Transactions on, 15(1):122–127, 1969.

[65] Willi Meier and Othmar Staffelbach. Nonlinearity criteria for cryptographic functions. In Jean-Jacques Quisquater and Joos Vandewalle, editors, Advances in Cryptology – EUROCRYPT ’89, volume 434 of Lecture Notes in Computer Science, pages 549–562. Springer Berlin / Heidelberg, 1990.

[66] NIST FIPS Pub. Announcing the advanced encryption standard (AES). Federal Information Processing Standards Publication 197, 2001.

[67] Jeremy Quirke. Security in the GSM system. AusMobile, May, pages 1–26, 2004.

[68] Oscar S Rothaus. On “bent” functions. Journal of Combinatorial Theory, Series A, 20(3):300–305, 1976.

[69] Andrew Rukhin, Juan Soto, James Nechvatal, Miles Smid, and Elaine Barker. A statistical test suite for random and pseudorandom number generators for cryptographic applications. Technical report, DTIC Document, 2001.

[70] Thomas Schlitt and Alvis Brazma. Current approaches to gene regulatory network modelling. BMC Bioinformatics, 8(6), 2007. ISSN 1471-2105.

[71] C Schwarzer. Matlab random Boolean network toolbox 2003.

[72] Tezzaron Semiconductor. Soft errors in electronic memory-a white paper (2004).

[73] Adi Shamir. Efficient signature schemes based on birational permutations. In Proceedings of CRYPTO’93, number 773 in LNCS, pages 1–12. Springer-Verlag, 1993.

[74] Yan Shouli and Edgar Sanchez-Sinencio. Low voltage analog circuit design techniques: A tutorial. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 83(2):179–196, 2000.

[75] Malcolm Smith. RAM reliability: Soft errors, 1998. URL http://www.crystallineconcepts.com/ram/ram-soft.html.

[76] Takeyuki Tamura and Tatsuya Akutsu. An improved algorithm for detecting a singleton attractor in a Boolean network consisting of AND/OR nodes. In Proceedings of the 3rd International Conference on Algebraic Biology (AB’08), volume 5147 of Lecture Notes in Computer Science, pages 216–229. Springer, July-August 2008.

[77] Gabriel Vasilescu. Electronic noise and interfering signals: principles and applications. Springer Science & Business Media, 2006.

[78] Kai Willadsen. Robustness in Boolean models of genetic regulatory systems. Citeseer, 2006.

[79] Andrew Wuensche. Exploring discrete dynamics. Luniver Press, 2011.

[80] James F Ziegler. Terrestrial cosmic rays. IBM journal of research and development, 40(1):19–39, 1996.

[81] Miodrag Živković. A table of primitive binary polynomials. Mathematics of Computation, 62(205):385–386, 1994.

Appendix A

Finding Attractors in Synchronous Multiple-Valued Networks Using SAT-based Bounded Model Checking

• Elena Dubrova, Maxim Teslenko, and Ming Liu. Finding attractors in synchronous multiple-valued networks using SAT-based bounded model checking. In 40th IEEE International Symposium on Multiple-Valued Logic, ISMVL 2010, Barcelona, Spain, 26-28 May 2010, pages 144–149, 2010

• Elena Dubrova, Ming Liu, and Maxim Teslenko. Finding attractors in synchronous multiple-valued networks using SAT-based bounded model checking. Journal of Multiple-Valued Logic and Soft Computing, 19(1-3):109–131, 2012


Appendix B

The Robustness of Balanced Boolean Networks

• Ming Liu and Elena Dubrova. The robustness of balanced Boolean networks. In Ronaldo Menezes, Alexandre Evsukoff, and Marta C. González, editors, Complex Networks, volume 424 of Studies in Computational Intelligence, pages 19–30. Springer Berlin Heidelberg, 2013. ISBN 978-3-642-30286-2


Appendix C

A Faster Shift Register Alternative to Filter Generators

• Ming Liu, S.S. Mansouri, and E. Dubrova. A faster shift register alternative to filter generators. In Proceedings of 2013 Euromicro Conference on Digital System Design (DSD), pages 713–718, Sept 2013


Appendix D

A New Approach to Reliable FSRs Design

• Ming Liu and Elena Dubrova. A new approach to reliable FSRs design. In Proceedings of 32nd Nordic Microelectronics Conference NORCHIP, Oct 2014
