
Conceptualizing Chaos: Continuous Flows versus Boolean Dynamics

Mason Korb

Thesis Advisor: Winfried Just

Honors Tutorial College

June 4, 2012

Contents

1 Motivations

2 History and Context

3 Some Cool Parts of Math

4 Independence Of the Axioms

5 Different Kinds of Dynamics

6 Chaos in Dynamical Systems

7 Connecting Different Dynamics

8 Very Few Answers and Many Questions

9 Appendix of Personal Contributions

Abstract

In this undergraduate thesis, I focus on the properties of difficult problems. I am not trying to solve these problems but focus on the notion of difficulty itself. What properties make a mathematical question hard? I explore some different concepts which capture a problem's difficulty. In particular, I discuss independence, chaos, and NP-completeness, as well as some relations between these various guises of intractability. I also include some of my research work at Ohio University and show how it relates to these concepts. When relating this discussion of difficulty to applied mathematics, it is crucial to discuss building multiple models of the same natural system, a process called "bi-simulation," and comparing their mathematical intractability. A difficult problem in one model may (or may not) become easier in another model. In this context, we discuss several types of models including Boolean networks, ODEs, and Markov models.

1 Motivations

1.1 Changing our Framework

Mathematics is about giving precise form to problems. When a problem is conceived in a rigorous form it has a certain clarity. We are now able to tinker with the problem in this form. We can see if anything can be proved about the questions we are asking. In this way, it is important to reconceptualize old problems; perhaps when a problem is cast in a new light, certain features will stand out that did not before. Number theory problems can be conceived as graph theory problems, and numerical questions are sometimes dynamical systems problems in disguise. The more ways problems are formulated, the more accessible they are from different areas of mathematics.

Another way of gaining insight into a problem is reframing the mathematics itself. The reframing of a problem often sheds light on the problem itself, and sometimes in the new context the problem becomes simpler and (if we are lucky) solvable. Of course, sometimes reframing a problem results in an equally difficult problem. One famous example of this is the notion of NP–completeness in theoretical computer science. Mathematicians working in theoretical computer science have generated literally thousands of unsolved and mutually equivalent problems; if just one of them were solved in polynomial time we would have solved all of them and have an answer to the question of polynomial time reducibility over the class of NP–hard problems.

Mathematics is to some extent the process of conceptualizing and reconceptualizing patterns. Applied mathematics focuses on patterns which exist in reality and that can be discerned by empirical science. Applied mathematicians build and study models of real world problems. These models rely on different mathematical tools. Modeling acts as an interface between mathematics and reality. After isolating the elements of a problem, the model makers are free to focus on the mathematics. Each new mathematical model for a real world phenomenon (or pattern) is a way of reconceiving reality.

1.2 Modeling with Ordinary Differential Equations and Boolean Networks

A mathematical model is an application of mathematics to a real world situation.

Each model specifies a set of variables which may change over time. For example, when Lorenz created his famous model of the weather he studied the interactions of the variables: temperature, air pressure, and humidity. A single state in the system is called a vector. And the collection of all the possible vectors is called the state space. Now, after we select a vector as an initial condition we would like to see how this vector changes over time. We may think of time as moving basically in two ways. In a discrete time model, time travels forward in steps. If we were only interested in the weather for one snapshot each day then we might think of time as: the first day, the second day, the third, etc. However, in a continuous time model, we imagine time moving continuously and the model makes sense when considering any moment.

Let us summarize the components of a model. A dynamical system is a collection of variables, their corresponding state space, a scheme for interpreting time, and a set of rules which allow us to move a vector through time. These rules for time traveling in our dynamical systems may not be explicit. In fact, it is often the case that we can relate variables in some way that does not yield a specific formula.

When we adopt the assumption that time is traveling continuously, and additionally we have that the variables move continuously over the state space, we call the dynamical system a flow. For example, let us consider temperature. If at one moment we have the temperature in our model at 4◦ C and at another moment we have that it is at 5◦ C, and if the dynamic is a flow, then we are guaranteed a time when the temperature is 4.3◦ C. In fact, (although it may be hard/impossible to compute such a time) we are guaranteed that the system travels through every temperature between 4◦ C and 5◦ C. On the other hand, we may have a system with a discrete state space, where there are only countably many values the vectors may take. One of the most widely studied classes of discrete (in both time and state space) dynamical systems are the so-called Boolean networks [33]. In these models the vectors in our state space are binary. That is, their coordinates can take on only two values. In a weather model, one node in the vector may be described as sunny or not sunny. Another node may record temperature greater than 0◦ C or not. Each node can take only one of these two values. We exclude the possibility of sunny but not too sunny days; we force ourselves to pick one.

Let us move away from the weather example and think about population growth. It may not be immediately clear that an unchecked population will grow exponentially in time:

N(t) = C1 e^{C2 t},   (1)

where N is the population, C1 and C2 are positive constants, and t is time. However, we might find it more intuitive to learn that an unchecked population's growth rate is based on the percentage of the population with reproductive capacity:

Ṅ = C2 N.   (2)

The equation above can be read as follows: the rate of change of a population is equal to a constant times the population. Equation (2) is an ordinary differential equation (or an ODE) and while it does not give us an explicit formula for how the population changes in time, it tells us something deeper about the underlying features of population growth. The general solution of the ODE (2) is given by equation (1). The fact that this ODE has an explicit solution is a rarity! Most ODEs do not have an explicit solution, and if we added more elements to our understanding of population growth, it is unlikely that an explicit formula would exist. In particular, if we added carrying capacity, disease dynamics, etc. to our model, then we would not be able to solve for an explicit formula.
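To make the relationship between (1) and (2) concrete, here is a minimal Python sketch (my own illustration, not from the thesis) that integrates the ODE (2) by Euler's method, with arbitrarily chosen values for C1 and C2, and compares the result against the explicit solution (1):

```python
import math

C1, C2 = 100.0, 0.03   # hypothetical initial population and growth rate
dt, T = 0.01, 50.0     # Euler step size and time horizon

# Euler integration of the ODE (2): N' = C2 * N
N, t = C1, 0.0
while t < T:
    N += C2 * N * dt
    t += dt

# Explicit solution (1): N(t) = C1 * exp(C2 * t)
exact = C1 * math.exp(C2 * T)
print(f"Euler: {N:.2f}, exact: {exact:.2f}")
```

The two numbers agree closely; for models without an explicit solution, the numerical route is the only one available.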

When modeling a real world phenomenon we must select a paradigm in which to construct our model. Each methodology has its unique strengths and weaknesses. For example, when dealing with a small population we know that the state space is discrete: 14.5 people is not meaningful. On the other hand, in a large population we may not need to nitpick, as 14.5 million people is reasonable. The same reasoning applies to discrete versus continuous time. When creating a model for a restaurant we do not have to consider anything deeper than the fiscal day. We can simply think of each unit of time as a business day where we predict a certain amount of expense and revenue. On the other hand, when we look at the growth of an organism every moment is potentially significant.

1.3 Chaos

Imagine that we, like Lorenz did, have created a model for weather. What is next?

We ask the meteorologist about the temperature, humidity and air pressure. She responds in the following way: "It is between 32.3◦ C and 32.4◦ C." We may get similar responses for the other variables. For each variable she responds with an interval rather than a specific value. We may mark out the region of the state space consisting of vectors with coordinates in these intervals and select a hundred points from this region. Next we see what our model does with these initial conditions. It looks like the weather is going to be good for the next hour. We see a cluster of points all traveling together. Next we examine what the weather will be in a week. The vectors have traveled all over the state space. Some points indicate that we might have a tornado, while some vectors indicate quiet showers. So we see the model has limited use. It may be useful for predicting the weather for today and tomorrow but it is virtually worthless when trying to predict the weather in a month. This situation is referred to as "chaos." Chaos is always marked by sensitivity to initial conditions. In a deterministic model we start with the assumption that a single vector will move precisely in the way the rules prescribe. However, it may be impossible to pinpoint exactly which vector reflects reality. When we are forced to work with multiple vectors we may get lucky and find that similar vectors travel through the state space together. When this is not the case we say the system is sensitive to perturbations of initial conditions.

Chaos is defined differently for continuous flows and discrete dynamics. For continuous flows, sensitivity to initial conditions (and usually chaos) is captured by the Maximal Lyapunov Exponent (MLE), a quantity defined for every initial condition. A positive MLE indicates sensitivity to initial conditions on an attractor; a negative value indicates order. When the MLE is zero, the stability of the attractor is unknown without deeper analysis.

Consider our model for weather. It cannot be said that our model could not be used to predict the weather a month from now. However, we would need very precise measurements to input as our initial conditions. Note that this kind of model reflects our dependence on technology to make accurate predictions. Of course, the way nearby vectors diverge is terribly significant when considering our study of real world problems. For example, when considering weather we find that getting exponentially better readings only results in a linear increase in the accuracy of predictions. This is truly valuable knowledge when considering our methodology when exploring meteorology. It shows us that investing in more accurate thermometers and wind socks will help little in predicting future weather. The MLE is a measure of the speed at which nearby vectors diverge along the system's attractor. What is meant by "nearby vectors?" Consider the following vector: v = (Temperature, Humidity, Air Pressure, Wind Speed). We need a way of measuring distance such that the vectors v1 = (57, 57, 57, 57) and v2 = (58, 57, 34, 89) are closer than, say, v1 and v3 = (100, 100, 100, 100). Of course, all types of weird metrics are studied, but for the sake of applied modeling we adopt a metric where vectors are close when each one of their components is close.

What does it mean for vectors to be close in a Boolean network? Consider the following vector: v = (Above Freezing Temperature, Raining, High Air Pressure, Windy). Note that the vector above has four dimensions because this Boolean network deals with four different attributes of weather. W = (0, 1, 1, 0) would then be interpreted as a day which is freezing, raining, has high air pressure and is not windy. We say W and (1, 1, 1, 1) have Hamming distance two because they are the same on all but two positions. Another interpretation is that they disagree on half their coordinates: they have a 50% Hamming percentage. We might call a vector close when it has all the same features except one (Hamming distance one). With a metric to determine "closeness," we can decide whether nearby vectors become closer in features, maintain their distance, or grow farther apart. Note that as we form models with high dimensions, two vectors which agree everywhere but in one value are very close in Hamming percentage. That is, the closeness of vectors is in some ways dependent on the size of our model.
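As a small illustration (not from the thesis), Hamming distance and Hamming percentage are easy to compute; a minimal Python sketch:

```python
def hamming_distance(v, w):
    """Number of coordinates on which two binary vectors disagree."""
    return sum(a != b for a, b in zip(v, w))

def hamming_percentage(v, w):
    """Fraction of coordinates on which the vectors disagree."""
    return hamming_distance(v, w) / len(v)

W = (0, 1, 1, 0)
print(hamming_distance(W, (1, 1, 1, 1)))    # 2
print(hamming_percentage(W, (1, 1, 1, 1)))  # 0.5
```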

In a Boolean network, chaos corresponds to what is known as the slope of the Derrida curve at the origin. This is a measure of how much nearby vectors converge or diverge. When this slope is greater than one, we call the system chaotic. When this slope is equal to one, the system is called critical. When this slope is less than one, the system is called ordered. In some ways, this slope acts as the Boolean network analog of the MLE.

1.4 An Ongoing Research Project at Ohio University

My interest in the topics covered in this thesis was sparked by the following general question: If we model the same real world phenomenon with a Boolean network and a system of ODEs, will we get agreement when considering sensitive dependence (and chaos)? That is, will the ODE yield a positive MLE if and only if the slope of the Derrida curve at the origin (computed from the Boolean network) indicates chaos?

The question above is part of a larger ongoing research project at Ohio University. The research group included at various times: Dr. W. Just, Dr. T. Young, B. Elbert, B. Oduro, Z. Hanyuan, and M. Korb.

To answer the above question we need a way of translating between ODEs and Boolean networks. In this research group we play with a class of "toy" models which converts Boolean networks into ODEs. This lays the groundwork for asking the question above in a meaningful and precise way. A key property of a meaningful conversion between the two types of models might be called 'consistency' between the ODE and the Boolean network. In broad terms, we have consistency when the Boolean network and the ODE agree in their predictions [29].

1.5 My technical contributions so far:

1. As a way of resolving the inconsistencies in the notions of chaos for our two different systems, I decided to focus on a specific class of Boolean networks. The maximal Lyapunov exponent really is a measure on an attractor of an ODE. In contrast, the measure of chaos provided by the slope of the Derrida curve at the origin takes into account all initial conditions with Hamming distance one. By analyzing Boolean networks whose entire state space was an attractor, the measure of the Derrida curve (which is normally a measure on the entire space) became a measure on an attractor. Also, the set of all Boolean networks (of a fixed dimension) with each vector in the attractor forms an elegant algebraic object called a group. Reconceiving the dynamical systems problem in an algebraic formulation turned out to be fairly rewarding:

(a) A note on Time Reversible Networks [39] (See section 9.1). This note explores a class of Boolean networks that was described in [1].

(b) Distance Preserving Maps on Binary Vectors [36]. Reconceptualizing the problem in an algebraic way resulted in many algebraic techniques that have proved useful in ongoing research, as well as a classification of the structure of a specific group of Boolean networks. (See section 9.2.)

(c) In particular, Boolean networks have an accessible structure for computer analysis. In [38] there is a description of how to input these Boolean networks into Matlab using software I developed.

(d) B. Elbert and I collaborated on a graphical user interface that allows us to analyze Boolean networks and their corresponding ODEs. The methods for the creation of this software grew out of the notes above.

2. Counterexamples. After W. Just had developed a method of converting Boolean networks to ODEs, I explored how this conversion method could potentially fail to exhibit consistency. This led to a few avenues of exploration:

(a) I developed an alternative conversion method [40]. This alternative method uses trigonometric rather than polynomial functions. It also had the latent effect of creating what we called 'false positives' when the functions take on the values associated with corners of the hypercube even though that was not the intent of the conversion. An interesting follow-up would be seeing whether these 'false positives' have any qualitative effects on the conversion.

(b) I demonstrate how this conversion can be implemented with some of the software [35] (See section 9.6).

(c) The definition of strong consistency is developed in [41] (See section 9.4). I isolated a form of consistency between Boolean networks and ODEs which is very rare but offers some insight into the more general forms of consistency seen in [29].

1.6 General Methodology

The field of dynamical systems calls upon many branches of mathematics. The proofs presented here are of an algebraic, topological, and sometimes even combinatorial nature. Of course, we also use classic dynamical systems techniques. Additionally, we supplemented our theoretical work with Matlab simulations.

2 History and Context

2.1 ODEs: a History from Calculus to Chaos

The following history is an adaptation of [47]. According to some, the study of differential equations began in 1675, when Gottfried Wilhelm von Leibniz (1646–1716) wrote [20]:

∫ x dx = (1/2) x²   (3)

This is really a differential equation in disguise. The Fundamental Theorem of Calculus tells us that the function¹ defining the area under a (continuous) curve has a slope defined by this curve at every point. We will use the notation ∫_a^b f(x) dx to denote the area under a function f along a to b. And we will let the notation f′ denote the slope of f.

¹A function is a pairing of two sets. We write f : A → B when f is a function that pairs A and B. For example, the function in (3) is a set of ordered pairs of real numbers.

Theorem 1 (The Fundamental Theorem of Calculus)

f(b) − f(a) = ∫_a^b f′(x) dx

So (3) tells us something (implicitly) about the slope of the function (1/2)x².

Soon after Leibniz's discovery, Isaac Newton (1643–1727) identified several basic classes of first order differential equations [34].

Calculus developed rapidly in great part due to applications to real world problems. Pierre-Simon, Marquis de Laplace (1749–1827) helped pioneer modern mathematical astronomy, physics and statistics. He summarized and extended the work of his predecessors in Mécanique Céleste (Celestial Mechanics). This work translated the geometric study of classical mechanics to one based on calculus, opening up a broader range of problems. We owe many techniques used to solve differential equations to Laplace, in particular the Laplace transform and the Laplace differential operator.

But the calculus developed by Leibniz and Newton (interestingly enough, independently and nearly simultaneously) really is just the tip of an iceberg. Henri Poincaré (1854–1912) was the first to view these equations as dynamical systems. With this discovery he formulated the idea of the Poincaré map, which is a way of interpreting these complicated dynamics with a discretized time scheme, making them (somewhat) easier to analyze.

It was not until later that mathematicians became interested in dynamics with both a discrete time scheme and state space for their own sake. So from early on, mathematicians have been converting continuous dynamics into discrete ones. In some ways this research project shall be evaluating the opposite conversion.

The road to understanding chaos begins with Aleksandr Mikhailovich Lyapunov (1857–1918), who pioneered stability theory. In fact, numerical methods to find the Lyapunov exponent are still being designed today. Another impressive leap comes with Edward Lorenz (1917–2008) and his model of climatic fluid flow. People had written down equations which displayed chaos and its features before Lorenz, but no one had noticed its significance in mathematics. Although there has been no universally accepted mathematical definition of chaos, the popular text by Devaney [9] isolates three components as being the essential features of chaos: sensitivity to initial conditions, a transitive mapping, and dense periodic points [2].

Of course, dynamical systems did not develop in a bubble. As Lyapunov was developing techniques to analyze dynamical systems, Poincaré and Georg Cantor (1845–1918) were bickering about infinities. Cantor, the father of set theory, is famous for his theory of transfinite numbers, which Poincaré at one point called a "grave disease" of mathematics. Every mathematical object can be defined in terms of sets. As such, set theory is often considered a "foundational" branch of mathematics. From it, we can build many other mathematical branches. Set theory is also important for understanding many of the "difficulties" discussed in this thesis.

2.2 Laplace's Demon

The universe can really be thought of as a dynamical system. Can we model the unfolding and movement of the universe? Laplace considered a mathematical model of the universe, though he never formalized this in a theory. To be fair, the notion of a dynamical system was formulated about a century after Laplace’s time.

"We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes." Pierre Simon Laplace, A Philosophical Essay on Probabilities [42]

Laplace himself never used the word 'Demon' but knew of the philosophical implications. In fact, when Napoleon asked Laplace why no 'Creator' was mentioned in Celestial Mechanics, he responded with "Je n'avais pas besoin de cette hypothèse-là." (I had no need of that hypothesis.) The mathematicians in the history above understood that a great number of philosophical questions hung in the balance of their theorems.

2.3 A Brief History of Discrete Dynamics

Computers operate using only a finite set of numbers. That is, they really are just traveling through a discrete state space and time. The great advantage of computers is that they travel through this space considerably faster than a human could. Unsurprisingly, the rise of computer science parallels the study of discrete dynamical systems. Before the computer age, Poincaré already discretized continuous dynamics as a way of studying them. Alan Turing (1912–1954) gave mathematics a rigorous notion of an 'algorithm,' which is a mapping on a discrete space. Perhaps one of his most impressive contributions was the concept of the Universal Turing Machine, a machine which can emulate any other machine. Cellular automata were first proposed by John von Neumann (1903–1957) and are amongst the simplest discrete dynamical systems studied for their own sake. We can use discrete dynamics to study a plethora of real world phenomena. Stuart A. Kauffman (born 1939) examines the dynamics of Boolean networks as a way of examining cellular interactions. He argues that complexity in organisms might be in large part due to self–organization as opposed to exclusively a result of natural selection [33].

3 Some Cool Parts of Math

3.1 Crash Course in Set Theory

Sets are relatively general objects and set theory is often used as a foundational system, though lots of mathematics can certainly be built without explicitly mentioning sets. Formalization of mathematics in terms of set theory allows for a precise notion of a 'theorem,' and in particular the identification of some problems which are impossible to solve. Such problems exhibit, in some sense, the ultimate degree of difficulty.

Sets are collections of objects without repetition. {}, {1, 2, 3}, and {Red, Blue, Yellow} are all sets. We shall use A ⊂ B to indicate that A is a subset of B, that is, when all the elements (things in a set) of A are elements of B. We can have a set of colors and then define other sets using this set: NotPrimary = {x ∈ Colors : x is not primary} defines the set of all colors which aren't red, blue, or yellow. This notation {elements : property} is referred to as set builder notation.

Consider the following set: {{}, {1, 2, 3}, {Red, Blue, Yellow}}. This is a set of sets. The set of all subsets of a set is called a power set. For a set S we define the power set P(S) = {x : x ⊂ S}. Employing set builder notation to define a power set makes the job easy, as we don't have to list out all the subsets of S with a given property. Indeed, we will see this is impossible for certain sets. This set builder notation is useful but can get us in big trouble. For example: can we define a set S = {x is a set : x ∉ x}? Suppose for a moment that S is an element of this set S. Then S has the property that S ∉ S, and likewise if S ∉ S we conclude S ∈ S. Clearly something fishy is going on here. The famous, somewhat more intuitive version of this goes as follows: Imagine a barber who shaves all men who do not shave themselves and only men who do not shave themselves. No contradiction yet? Well, can the barber shave himself? Now we have a problem². We promise in the rest of this thesis to be more careful in our future construction of sets.
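For finite sets, at least, set builder notation translates directly into code. A minimal Python sketch (my own illustration, not from the thesis) of the power set:

```python
from itertools import combinations

def power_set(s):
    """All subsets of s, built by taking every possible subset size."""
    items = list(s)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

print(power_set({"Red", "Blue", "Yellow"}))  # 2^3 = 8 subsets
```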

Contradictions like these, while amusing, can cause quite a stir in mathematics, and mathematical formalism is a response to such paradoxes. Zermelo–Fraenkel set theory with the axiom of choice (ZFC) is one of the axiomatic systems that were proposed in the early twentieth century to formulate a theory of sets without the paradoxes of naive (not based on axioms) set theory.

3.2 There exists an infinity of infinities!

Georg Cantor proved that a list that goes on forever like N = {1, 2, 3, . . .} actually has the smallest type of infinity. You might say "Hold on! There are certainly fewer even numbers in that list and the set of even numbers is also an infinite set!" We have to be careful about the way we count. Let us say we have two sets: Number List = {1, 2, 3} and Color List = {Red, Yellow, Blue}. We say that sets have the same size when there is a mapping between them that is 'one–to–one' and 'onto,' like the following one:

²The problem being alluded to is not the one of gender or sexual identity: for this paradox to 'work' we must assume the barber is male.

Figure 1: A bijection (a mapping which is one–to–one and onto) from Number List to Color List. A one–to–one and onto map from Number List to Color List induces a one–to–one and onto inverse mapping.

That is, we have a mapping where every object in the image is mapped to by exactly one element in the preimage (this property is known as one–to–one) and every element in the image is accounted for (when every element of the target set gets mapped to, we say the mapping is 'onto'). So in this way the even natural numbers have the same size as the natural numbers: f(x) = 2x is a one–to–one and onto mapping which takes N to the even natural numbers, 2N. But then one might argue that this only goes to show our way of counting is weak. Some people may just be committed to the idea that there are fewer even natural numbers than natural numbers. That is, we may aim for a system where A ⊊ B implies |A| < |B|. We can refer this devil's advocate to measure theory³ (there are many good ways to count and they all admit an infinity of infinities). For now we will stick to this canonical notion of cardinality (the size of sets).

³Measure theory is on the list of 'cool' math.

Cantor proved that there are more points in the continuous interval [0, 1] = {x : 0 ≤ x ≤ 1} = {all the points between 0 and 1} than in N using his famous diagonalization argument: If there was a function taking N to [0, 1] then we'd have a list matching each natural number to a real number. If we took the number generated by the diagonal of this list and altered every digit then we'd have a new number. This new number could not have appeared on the original list. How come? What if it appeared at position 117? But the 117th digit couldn't agree, because we had changed it.

1 → 0.[3]141596535879 . . .
2 → 0.3[1]07596669169 . . .
3 → 0.12[2]1233189348 . . .
4 → 0.183[9]077882824 . . .
⋮
117 → 0.63 . . . 228[4]6410 . . .
⋮

Cantor's diagonalization argument: The elements of a set can be listed if and only if that set is countable. Altering each digit along the diagonal of the list (marked in brackets) gives us a new number, perhaps 0.4230 . . ., and this new number cannot have appeared on the list, as it disagrees with the nth number on this list at its nth digit. We've made sure of this by changing its value there.

So we have a proof by contradiction⁴. That is, we assumed there was a good way to match up these sets (N and [0, 1]) and found that our list was lacking a number. And we've survived a proof of our first theorem!

⁴We'll add this to the list of 'cool' mathematics.

Theorem 2 The cardinality of the natural numbers, often denoted⁵ ℵ0 ("aleph naught" or "aleph zero"), is less than that of [0, 1], often denoted c. That is, ℵ0 < c.

Any set of cardinality ℵ0 or less is referred to as countable. Otherwise, the set is said to be uncountable. Sets with the cardinality of c are said to have the cardinality of the continuum.

⁵ℵ is the first letter of the Hebrew alphabet. A student of mathematics should be familiar with the Hebrew and Greek alphabets. These languages will not be the most esoteric they study.

The proof above can be generalized to show (with use of the fact that |S| < |P(S)|) that there is an infinity of infinities. We let 2^|S| denote⁶ the cardinality of P(S).

⁶For finite S, this is simply a theorem. Then we extend this notation to infinite sets.

The separation between ℵ0 and c via diagonalization plays a crucial role in dynamics and computer science (both theoretically and in applications).

4 Independence Of the Axioms

Cantor spent many years trying to prove the following:

Hypothesis 1 (The Continuum Hypothesis) There exists no set S with the property ℵ0 < |S| < c. That is, there are no infinities between countable sets and the continuum.

Kurt Gödel (1906–1978) showed in 1940 that the continuum hypothesis cannot be disproven. Paul Cohen (1934–2007) proved in 1963 that the hypothesis cannot be proven.

Definition 1 When statements cannot be proven or disproven using the axioms of ZFC, they are said to be independent of ZFC.

Gödel showed (by using a diagonalization argument!) that any sufficiently powerful set of axioms can't be simultaneously consistent and complete. That is, when the axioms are not in contradiction with each other, there will always exist statements which cannot be proven using the axioms. This is the first form of difficulty in mathematics. Certain statements cannot be proven to be true or false. Consistency of axioms has an element of independence, but it's only in one direction: While some statements can neither be proven nor disproven (like the continuum hypothesis and the Suslin problem⁷), a set of axioms can be proven inconsistent if we can find a contradiction. On the other hand, if there is no contradiction, that is, the axioms are consistent, then there is no proof that they are consistent.

4.1 NP–Completeness and the limitations of computation

Computational complexity is probably the clearest manifestation of the division alluded to in this thesis. Thus far, we have discussed the division between infinities and the division of order and chaos. These represent a division between the easy and the hard. In the opinion of this author there's a definable connection here. We will make some explicit conjectures and statements in later sections. In theoretical computer science we aim to define the difficulty of problems in terms of the number of steps, and the amount of space, required to resolve them. Arguably the most significant question in mathematics is the division between P and NP. This is a general question about algorithms which halt in polynomial time.

⁷The Suslin problem is as follows: Is there a dense and complete ordered set R with the countable chain condition and no maximum or minimum which is not order-isomorphic to the reals? This question was shown to be independent of ZFC by R. Solovay in 1971 [49].

Consider the following question: Say we have a statement with n binary variables, and we need not all that much time (less than 2^n steps) to confirm whether one vector satisfies this statement. How long should it take to generate a vector we know to satisfy this statement? Let's say there is a function l(n) < 2^n which represents the maximum time it takes to evaluate the statement for a given vector. Then we can at least form a weak upper bound on the amount of time required to solve such a question.

Theorem 3 Given a statement f in binary variables x1, x2, . . . , xn, there exists an algorithm which solves f in l(n)2^n steps or less.

A vector y = (y1, y2, . . . , yn) of truth values is said to solve f when f(y) = 1.

Proof. Consider the algorithm which one by one checks every possible assignment. □

Question 1 Is there any algorithm which is faster than the one proposed in Theorem 3?

Yes. There are algorithms which can do better than the "check all the possibilities" method. These run in l(n)b^n time for some b > 1 [19]. When there exists a constant M such that for all large instances of a problem the algorithm runs in less than Mg(n) steps, we say that the algorithm runs in O(g(n)) time. If an algorithm runs in at most O(b^n) time for some b, we say that the algorithm runs in exponential time. It runs in polynomial⁸ time if there is some polynomial p such that the algorithm runs in at most O(p(n)) steps.

⁸A polynomial is a function p(x) = Σ_{n=0}^{∞} α_n x^n where all the α_n's are real but only finitely many are non–zero.

Question 2 Given a statement f in binary variables x1, x2, . . . , xn, is there an algorithm which runs in polynomial time which solves f?

The question above is yet to be answered formally. Moreover, it’s unknown whether this problem is independent of ZFC [14].

4.2 Languages

Complexity theory can be framed in terms of languages. We can think of a language as a coupling of a set and a syntax. Let A be an alphabet of symbols and let A^n be all possible strings made from A of length n. We can consider candidates for words in our language to be strings from A of any length: A* = ⋃_{n∈N} A^n. Then with a given syntax 1_L : A* → {0, 1} we generate a language L = {w ∈ A* : 1_L(w) = 1}.

This syntax 1_L may be a little misleading: While a candidate in A* is either in the language or not, computing this may prove difficult. We need a machine which can compute the value of 1_L(w), and we hope for a machine to do this somewhat efficiently. By a machine we mean an algorithm which can be fed only countable information in addition to this word. This machine has only finitely many internal states and reads the (possibly countably infinite) additional information one bit at a time. Note that machines are significantly weaker than functions. Given a function f and a preimage w, we are not guaranteed we can ever find f(w), the image of w; the function only guarantees the existence of f(w). Machines are much more complicated: they are liable to get stuck in infinite loops. Sometimes we must wait long periods until they finish their computations. This is related to the "halting problem" referred to above. It may be impossible to tell whether the machine is in an infinite loop or we are simply taking a long time to finish the computation. When an abstract machine completes its computation it is said to halt. In light of this, we will do the following: Let M : A* → {0, 1, Does Not Halt}. That is, we consider machines as functions with this additional property that they may just never stop computing. Also, we allow for an on–board clock which, whenever the machine M halts, can tell us the number of computational steps this machine underwent.

Definition 2 Given the existence of a machine M : A* → {0, 1, Does Not Halt} which halts on all words in the language, and outputs 1 if and only if the word is in fact in the language L, we will say that the language is recursively enumerable. We let RE denote the set of recursively enumerable languages.

We may have that the machine halts on words not in the language and reports that these words are not in the language, but we have no guarantee that this happens.

Definition 3 A recursively enumerable language with a machine which always halts is called recursive. We shall let R denote the class of recursive languages.

Definition 4 A recursive language with some polynomial p and a machine which when given a vector of size n halts in less than p(n) steps is said to be in P.

Thus far, we have defined

P ⊂ R ⊂ RE   (4)

and these happen to be (though we haven't proven it yet) strict subsets. For A a strict subset of B, there are elements in B not in A.

The following definition is an adaptation from [45]:

Definition 5 A recursive language L is said to be in NP when there exist two polynomials p and q and a machine M which runs on inputs of two vectors such that for x, y ∈ A*:

1. M(x, y) halts in p(|x|) steps.

2. For all x ∈ L there exists y of length q(|x|) such that M(x, y) = 1.

3. For all x ∉ L and all y of length q(|x|), we have M(x, y) = 0.

Languages in NP are somehow trickier than the rest. It is clear where they fit in relation to the other classes in the sense that we know:

P ⊂ NP ⊂ R ⊂ RE   (5)

Figure 2: An Euler diagram of language complexity. This figure assumes P ≠ NP.

Theorem 4

1. There are languages not in RE.

2. There are languages in RE not in R.

3. There are languages in R not in NP.

4. There are languages in P and therefore in NP.

For a proof see [45]. The theorem shows that every inclusion in (5), with the one possible exception of P ⊂ NP, is strict. This problem, known as the "P versus NP" problem, is considered by many the most significant problem in mathematics. It is one of the seven Millennium Prize Problems selected by the Clay Mathematics Institute to carry a US $1,000,000 prize for the first correct solution. There are a few ways this could be resolved:

• It seems likely that NP ≠ P. There is ample evidence that this is the case.

• It may be the case that every language in NP is also in P. All the evidence above suggests that this is not the case, though evidence is not the same as a proof.

• The problem may be independent of ZFC. There is some evidence that this is not the case, but we cannot (presently) rule out this possibility [14].

Consider the following questions:

• How can we find the attractors of an n–dimensional Boolean network?

• What is the ideal path for a saleswoman to hit all her n sales locations with varying lengths between locations?

• What is the best strategy for a Battleship player on an n × n board?

• What is the ideal algorithm for generating an n² × n² sudoku problem [52]?

The questions above may not look similar, but if we had an algorithm that required only polynomial time steps for just one of them, we would have a polynomial algorithm for all of them. These equivalent problems are called NP–complete, and there are more than 3000 problems known to be NP–complete.

These problems demonstrate a prevalent concept which is not proven: with certain types of problems, no matter how much we twist and turn and reformulate them, they become no easier. At best with these problems we can move the difficulty from one area to another, but the difficulty is never evaded. Even if we accept defeat and concede that these problems cannot be solved "quickly," finding equivalent problems may help us identify which problems exhibit this behavior.

What's the connection between this formal notion of a language and complexity class and the questions above? Those problems are about generating a solution, while the presentation of languages here has to do with the existence of a solution. These two types of problems are equivalent in terms of polynomial halting times.

Let φ be a machine which solves SAT: a Boolean function g is coded as a string of length n and input into φ, and then in f_φ(n) steps φ outputs 1 if this Boolean function is satisfiable and 0 otherwise. Let ψ be a machine which solves the related but different problem SATGEN: a Boolean function in m variables is coded as a string of length n and input into ψ; the algorithm ψ halts in f_ψ(n) steps and outputs a solution x1, x2, . . . , xm if one exists, and otherwise outputs 0. Let us sketch a proof of the following well–known fact.

Theorem 5 There exists a machine φ that solves SAT such that f_φ is a polynomial if and only if there exists a machine ψ that solves SATGEN such that f_ψ is a polynomial.

Proof.

Assume ψ solves SATGEN with f_ψ being polynomial. Then we let φ be the following machine: φ will take the Boolean function g in variables x1, x2, . . . , xm as an input. The function g must be coded somehow as a string, and we take the length of this string to be the instance size. Call ψ with input g, and if ψ outputs 0 (that is, there is no solution) then let φ output 0. Otherwise, output 1. Then φ runs in at most O(f_ψ(n)) steps.

Next we assume φ solves SAT with f_φ being polynomial. ψ will take the Boolean function g in variables x1, x2, . . . , xm as an input. The function g must be coded somehow as a string, and we take the length of this string to be the instance size.

Let ψ be the following algorithm: Call φ with input g, and if φ outputs 0 then let ψ output 0. Otherwise, we will find the solution dynamically. Let y1 = φ(1, x2, . . . , xm), shorthand for φ applied to the formula g with x1 fixed to 1. Assume we've defined y1, . . . , yt, and define the vector vt = (y1, . . . , yt, 1, xt+2, . . . , xm). Then we define yt+1 = φ(vt).

We have set yt+1 as the truth value of the following statement: "There exists an assignment of xt+2, . . . , xm which solves g when x1 = y1, x2 = y2, . . . , xt = yt and xt+1 = 1." If the statement is true then we have set yt+1 = 1, and the statement tells us that φ(y1, . . . , yt, 1, xt+2, . . . , xm) = 1. If the statement is false then we have set yt+1 = 0, and we know that φ(y1, . . . , yt, 0, xt+2, . . . , xm) = 1. We can repeat this process until we have defined ym. Then we output s = (y1, . . . , ym), which solves g. This algorithm ψ calls upon φ a total of m + 1 times (one initial satisfiability check plus one call per variable). So f_ψ ≤ (m + 1)f_φ, and we know f_ψ is bounded by a polynomial. □
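To make this self–reduction concrete, here is a minimal Python sketch (my own illustration, not from the thesis). The hypothetical decision oracle `sat` is stood in for by brute force, so the sketch itself is not polynomial time, but the search procedure makes exactly m + 1 oracle calls, as in the proof:

```python
from itertools import product

def sat(g, m, fixed):
    """Decision oracle: is g satisfiable with the first len(fixed) variables pinned?
    (Brute force stand-in; Theorem 5 assumes this runs in polynomial time.)"""
    k = len(fixed)
    return any(g(fixed + rest) for rest in product((0, 1), repeat=m - k))

def satgen(g, m):
    """Find a satisfying assignment using m + 1 calls to the decision oracle."""
    if not sat(g, m, ()):
        return 0                       # unsatisfiable: output 0 as in the proof
    y = ()
    for _ in range(m):
        # Pin the next variable to 1 if the partial assignment stays satisfiable;
        # otherwise pinning it to 0 must keep the formula satisfiable.
        y += (1,) if sat(g, m, y + (1,)) else (0,)
    return y

g = lambda x: (x[0] or x[1]) and not x[2]   # an example Boolean statement
print(satgen(g, 3))                          # e.g. (1, 1, 0)
```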

4.3 NP–Completeness and Diagonalization

The following is an adaptation from [13]:

Diagonalization: Can we just construct an NP language L specifically designed so that every single polynomial-time algorithm fails to compute L properly on some input? This approach, known as diagonalization, goes back to the 19th century. In 1874, Georg Cantor showed the real numbers are uncountable using a technique known as diagonalization. Given a countable list of reals, Cantor showed how to create a new real number not on that list. Alan Turing, in his seminal paper on computation, used a similar technique to show that the Halting problem is not computable. In the 1960s complexity theorists used diagonalization to show that given more time or memory one can solve more problems. Why not use diagonalization to separate NP from P? Diagonalization requires simulation and we do not know how a fixed NP machine can simulate an arbitrary P machine. Also a diagonalization proof would likely relativize, that is, work even if all machines involved have access to the same additional information. Baker, Gill and Solovay [50] showed no relativizable proof can settle the P versus NP problem in either direction. Complexity theorists have used diagonalization techniques to show some NP-complete problems like Boolean formula satisfiability cannot have algorithms that use both a small amount of time and memory, but this is a long way from P ≠ NP.

5 Different Kinds of Dynamics

Dynamical systems can be defined on sets of every cardinality. For now, we focus on dynamics on spaces with continuum, countable, and finite cardinality.

5.1 Dynamics on Manifolds

ODEs on manifolds⁹ are continuous dynamics. Not all dynamical systems on manifolds need to be continuous. Let x = (x1, x2, . . . , xn) be a vector of real variables and let ẋ = p(x) be a system of n differential equations, where p is a vector of Lipschitz continuous¹⁰ functions (p1, p2, . . . , pn), each from the state space onto R.

⁹Manifolds look like the plane up close. Manifolds are locally isomorphic to a Euclidean space (Euclidean spaces are copies of R^n). In this thesis we only focus on dynamical systems on the manifold R^n.

¹⁰Real world models will likely have at least Lipschitz continuous functions. A function f from a metric space X1 to a metric space X2 is Lipschitz continuous when there exists a constant L > 0 such that for all v1, v2 ∈ X1, d2(f(v1), f(v2)) < L d1(v1, v2), where di is the metric of the space Xi.

Figure 3: Dynamical systems can be found with all sorts of different state spaces and time schemes. In the above layout a few examples are given. In section 3 we discuss methods of converting these dynamics from one section of this chart to another. For example, the research group at OU focused on turning Boolean networks into ordinary differential equations.

Lorenz's model of the weather:

ẋ = σ(y − x)
ẏ = x(ρ − z) − y
ż = xy − βz

where x, y, and z refer to convective motion, and horizontal and vertical temperature variation, respectively. For some parameter settings, the Lorenz model is an example of chaos¹¹. Certainly not all ODEs are chaotic (though it's thought "most" are, at least in the sense of the footnote), and any two-dimensional ODE or any linear ODE¹² would be an example of a non–chaotic ODE.

As we vary the positive constants σ, ρ, and β we find the dynamic can do qualitatively different things. Dependencies of qualitative behavior on parameters are studied by a cool branch of dynamical systems called "bifurcation theory." Continuous models have nice properties and are sought as models because of the ample theory available regarding these dynamics.

¹¹Though the entire system is not transitive, the attractor is, and such systems are colloquially referred to as chaotic.

¹²A linear ODE would have the form ẋ = Ax + b, where A is an n × n real matrix and b is a real–valued vertical vector of length n. Note that the Lorenz model is non–linear because of the two product terms xz and xy.

Consider the following type of dynamic: Say we have a state space X and a function f : X → X. We let f^n denote the nth iteration of this map. We consider time n ∈ N and define a trajectory starting at an initial condition x0:

(x0, x1, x2, . . .) where xn = f^n(x0).

This formulation of a discrete dynamic is called a Difference Equation (DE), and its trajectory should be contrasted with that of Ordinary Differential Equations (ODEs). In an ODE, like the Lorenz model, vectors change continuously as t moves through R; these are dynamics with an implicit continuous solution on a Euclidean space and a continuous time scheme. On the other hand, DEs yield discrete trajectories. DEs can be used to approximate ODEs. The theory for DEs and ODEs is well–developed for low dimensions and linear functions; as we step away from these requirements, mathematical theory dwindles. For an example of a DE, consider the logistic map, a famous discrete dynamic from R to R:

f(x) = µx(1 − x)   (6)

The mapping f is highly dependent on the value we select for µ. For example, for many µ > 3.57 we see chaos: two very similar initial conditions will have wildly different trajectories [21].
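A minimal Python sketch of this sensitivity (my own illustration; the parameter µ = 3.9 and the two initial conditions are arbitrary choices in the chaotic regime):

```python
mu = 3.9                       # a parameter value in the chaotic regime
x, y = 0.5, 0.5000001          # two very similar initial conditions

for n in range(50):
    x = mu * x * (1 - x)
    y = mu * y * (1 - y)

print(x, y)   # after 50 iterations the trajectories are wildly different
```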

5.2 Dynamics on Countable Spaces

A problem previously studied at Ohio University [31] is the Collatz conjecture, or the 3x + 1 problem. Consider the following function f : N → N defined as:

f(x) = x/2 if x is even, and f(x) = 3x + 1 if x is odd.   (7)

Conjecture 1 (The Collatz Conjecture) For every x ∈ N there exists a value n such that f^n(x) = 1.

For some values this is easy to see: f^5(5) = 1. On the other hand, consider the intermediate steps for the number 7.

7 → 22 → 11 → 34 → 17 → 52 → 26 → 13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1

Another way to conceive of this problem is as a dynamical system. In this way the conjecture above claims that the dynamic has one and only one attractor: the {1, 2, 4} attractor. Note that this really is just another DE, but now on a non–Euclidean space, namely the set of positive integers N.

5.3 Dynamics on Finite Spaces

Cellular Automata. At least one study [15] claims that happiness spreads through proximity. Consider the following rule: an inhabitant is only happy when the day before he had a happy neighbor. Let 1 denote happiness and 0 unhappiness; we imagine time t ∈ N moving discretely. Then the happiness of the resident in apartment j at time t + 1 is determined as follows:

xj(t + 1) = xj−1(t) ∨ xj+1(t)   (8)

Let us say we have a row of apartments where people are happy or unhappy. In the center of this row of apartments we have a happy guy, Mandelbrot. Everyone else on this row is unhappy. Mandelbrot is happy today but perhaps not tomorrow. How does this change in time? Let us examine the image in Figure 4. Mandelbrot in apartment 10 is happy the first day and then unhappy the next, and this pattern of every–other–day happiness spreads and persists indefinitely. It is interesting to note that while Mandelbrot will only be happy for half his existence, his happiness proliferates to his neighbors. The pattern created in this example is not terribly fun to look at and is not terribly complicated. On the other hand, this may be unsurprising, as we created it with 8 very simple rules.

Figure 4: Happiness in an apartment complex

The chart above allows us to see under what conditions xj(t) succeeds to a happy state. Note that in these cellular automata the state of each node is determined exclusively by itself and the (immediate) neighboring nodes.
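A minimal Python sketch of rule (8) (my own illustration; I assume the apartments off the ends of the row are permanently unhappy):

```python
def step(row):
    """Apply rule (8): a resident is happy iff a neighbor was happy yesterday.
    Cells beyond the ends of the row are treated as permanently unhappy."""
    padded = [0] + row + [0]
    return [padded[j - 1] | padded[j + 1] for j in range(1, len(row) + 1)]

row = [0] * 21
row[10] = 1                    # Mandelbrot, happy in apartment 10
for day in range(8):
    print("".join(str(b) for b in row))
    row = step(row)
```

Printing one row per day reproduces the spreading, alternating pattern of Figure 4.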

The surprising part is that with a different set of seemingly equally simple rules we can get fairly complicated images. The set of rules in Figure 5 creates the image in Figure 6, known as the Sierpinski triangle.

This image is known as a fractal. The study of fractals was pioneered by Mandelbrot. Part of the interest in these creatures is in this hidden power: simple rules can create complicated patterns.

Figure 5: Some more simple–looking rules

Figure 6: The Sierpinski triangle

When discussing cellular automata, each node only takes input from itself and the two nodes adjacent to it. This can be a valuable model when we are considering things with this kind of metric. On the other hand, if we truly want to model happiness in an apartment complex, we will have to admit that there is an underlying social network. The couple in apartment 3 has dinner with their friends in apartment 6. Their happiness or lack thereof has an effect. Some nodes may feed input into and accept input from many other nodes. In this case we will have to step away from cellular automata and study a more complicated model.

Boolean Networks. If we relax this requirement about which nodes feed input into which nodes, we get a more general model called a Boolean network. These were proposed by Kauffman as models of gene regulatory networks (see [33] for a review). Say we are given n binary variables in t: {x1(t), x2(t), . . . , xn(t)} and a vector of functions f = (f1, f2, . . . , fn) such that xi(t + 1) = fi(x(t)). That is, xi updates based on the states of (potentially) all n variables.

Consider the following toy model: Let B be a Boolean network in 3 dimensions governed by the following rules:

f1(x) = x1 ⊕ (x2 ∧ x3)   (9)

f2(x) = x1 ∨ (x2 ∧ ¬x3)   (10)

f3(x) = ¬(x3 ∧ (x1 ⊕ x2))   (11)

where ∨, ∧, and ⊕ refer to or, and, and exclusive or, respectively, and ¬ denotes negation.

Then B has the state transition map shown in Figure 7.

We can convert equations (9)–(11) from logical expressions to polynomials (with arithmetic taken mod 2):

f1(x) = x1 + x2x3   (12)

f2(x) = x1 + x2 + x1x2 + x2x3 + x1x2x3   (13)

f3(x) = x1x3 + x2x3 + 1   (14)

40 Figure 7: The points in red are on a transient. The blue points are all in some attractor. There are two attractors: The {011, 100, 111} attractor and the fixed point 001.
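A minimal Python sketch (my own illustration) that enumerates the state transition map of B using the polynomial form (12)–(14); its output matches Figure 7, including the cycle 011 → 100 → 111 and the fixed point 001:

```python
from itertools import product

# The polynomial form (12)-(14), with arithmetic mod 2
def f(x):
    x1, x2, x3 = x
    return ((x1 + x2 * x3) % 2,
            (x1 + x2 + x1 * x2 + x2 * x3 + x1 * x2 * x3) % 2,
            (x1 * x3 + x2 * x3 + 1) % 2)

# Print the full state transition map of B
for state in product((0, 1), repeat=3):
    print(state, "->", f(state))
```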

5.4 Stochastic Processes

For a finite dynamical system where we have n states {x1, x2, . . . , xn}, we are not always so lucky as to observe that xj succeeds xi with probability 1. In the simplest such model, called a Markov chain, we assume that some probabilities 0 ≤ pi,j ≤ 1 are known such that xj succeeds xi with probability pi,j. Then we can collect these probabilities in a matrix:

M =
⎡ p1,1 . . . p1,n ⎤
⎢  .          .  ⎥
⎣ pn,1 . . . pn,n ⎦

There are some implicit assumptions tucked into this model.

• We assume that the distribution of probabilities of future states is completely determined by the present state; when we make this assumption we say the model has the Markov property.

Imagine that Mandelbrot spends each day either

1. cleaning apartment 10,

2. writing about fractals, or

3. walking in the park.

It is possible that the probability that Mandelbrot chooses a specific daily activity is only dependent upon what he did yesterday. But that would make Mandelbrot a rather unique person. It seems likely that after 3 days of cleaning his apartment he may want to get out of apartment 10. Assuming the Markov property is not always appropriate but makes many problems tractable.

• We assume we can observe all the possible states.

• We also assume the probabilities are not changing in time. This is another helpful assumption when aiming for tractability. But in a more complicated model we might assume that the summer time increases the chances of walks in the park.

When the probability distributions of the future states are not determined by just the present state but rather the last k states, we say we have a kth order Markov model. In this case we have to examine an n^k × n^k matrix as our model. The problem with using a kth order Markov model is tractability. These models, while being very general, can be computationally expensive.
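A minimal Python sketch of a first order chain over Mandelbrot's three activities (my own illustration; the transition probabilities are made up for the example):

```python
import random

states = ["cleaning", "writing", "walking"]
# Hypothetical transition matrix M: row i gives the distribution of
# tomorrow's activity given that today's activity is states[i].
M = [[0.2, 0.5, 0.3],
     [0.4, 0.2, 0.4],
     [0.3, 0.3, 0.4]]

def simulate(days, start=0):
    i, history = start, []
    for _ in range(days):
        history.append(states[i])
        i = random.choices(range(3), weights=M[i])[0]
    return history

print(simulate(7))   # one possible week of Mandelbrot's activities
```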

6 Chaos in Dynamical Systems

6.1 Chaos in Flows

Many forms of chaos can be discussed and compared [17, 18]. Devaney defines a continuous dynamic to be chaotic if it exhibits at a minimum three qualities: topological transitivity, dense periodic orbits, and sensitive dependence on initial conditions. Although it is known that the first two features imply the third [2], ironically it is commonplace to compute the Lyapunov exponent first when checking for chaos. The Lyapunov exponent of a dynamical system characterizes the rate of divergence of initially close trajectories. In some ways we can view it as the measure of how sensitive a dynamic is to initial conditions. Although a positive Lyapunov exponent, and hence sensitive dependence, need not imply chaos, there is much discussion about how often this happens. It is thought this happens with high probability. On top of these criteria, there have been significant studies of the features of chaos, including strange attractors.

Figure 8: Sensitivity to initial conditions requires that any point is arbitrarily close to points with different future trajectories. This is demonstrated by the Lorenz attractor, now an icon of chaos. [43]

6.2 The Lyapunov Exponent

Sensitivity to initial conditions can be "measured" by how quickly nearby trajectories converge or diverge. The Lyapunov exponent is a measure of this sensitivity on an attractor:

λ = lim_{t→∞} lim_{δZ0→0} (1/t) ln(δZ(t)/δZ0),

where δZ0 is the initial distance of two initial conditions and δZ(t) refers to their distance at time t.

A positive Lyapunov exponent indicates exponential divergence and a negative value indicates exponential convergence. We cannot assume stable orbits when λ = 0; we may simply be observing slower–than–exponential movement. On the other hand, λ = ∞ implies faster–than–exponential trajectories. For different initial conditions we'll find different Lyapunov exponents, but we're mostly interested in the maximal one. If the maximal Lyapunov exponent (MLE) is positive, we say we have sensitivity to initial conditions.
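For a one–dimensional map such as the logistic map (6), the Lyapunov exponent can be estimated by averaging the logarithm of the local stretching factor |f′(x)| along a trajectory. A minimal Python sketch (my own illustration, with µ = 3.9):

```python
import math

mu, x = 3.9, 0.5
total, n = 0.0, 100000
for _ in range(n):
    x = mu * x * (1 - x)
    total += math.log(abs(mu * (1 - 2 * x)))  # log of the local stretching factor
print(total / n)   # typically around 0.5, i.e. positive: sensitive dependence
```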

6.3 Chaos on Discrete Spaces

Discrete spaces also have a notion of sensitivity. But the notion of zooming in real close and focusing on how nearby conditions differ is somewhat less intuitive in discrete systems. We can get around this by using distributions.

Consider a distribution µ which for every set Bn = {Boolean networks on n nodes} assigns probabilities to each element. That is, µ = {µn} where µn : Bn →

[0, 1]. We assign each Boolean network a probability relative to the other networks

of n nodes. Then for a network M let a(M) be the average Hamming distance of

Hamming neighbors after one update. Then we can compute the value

$$d(n) = \sum_{M \in B_n} \mu_n(M)\, a(M)$$

That is, we let d(n) be the average a(M) in the probability distribution µn. Finally, we call d = lim_{n→∞} d(n) the slope of the Derrida curve at the origin. This value is taken to be a measure of order and chaos on Boolean networks: for a slope equal to one we call the distribution critical. For greater values we call the distribution chaotic and for smaller values we call it ordered. This captures the level of sensitivity on initial conditions.

One might speculate that this slope acts as an analog to the MLE for ODEs.
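The quantity a(M), and hence d(n), can be estimated by direct simulation. The sketch below is a hypothetical Python illustration that assumes one specific distribution µn, namely random networks in which every node receives K inputs and an unbiased random truth table; for this distribution the slope is known to be K/2, so K = 2 sits at criticality.

```python
import random

def random_network(n, k, rng):
    """A random Boolean network: every node gets k input nodes and an
    unbiased random truth table over those inputs."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def update(state, inputs, tables):
    new = []
    for inp, table in zip(inputs, tables):
        idx = 0
        for j in inp:
            idx = (idx << 1) | state[j]
        new.append(table[idx])
    return new

def derrida_point(n=200, k=2, samples=500, seed=0):
    """Average Hamming distance, after one update, of pairs of states
    at Hamming distance one; an estimate of the average a(M), i.e. of
    d(n) for this distribution of networks."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        net = random_network(n, k, rng)
        x = [rng.randint(0, 1) for _ in range(n)]
        y = x[:]
        y[rng.randrange(n)] ^= 1  # a Hamming neighbor of x
        fx, fy = update(x, *net), update(y, *net)
        total += sum(a != b for a, b in zip(fx, fy))
    return total / samples

# Expected slopes are roughly K/2: ordered, critical, chaotic.
print(derrida_point(k=1), derrida_point(k=2), derrida_point(k=3))
```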

7 Connecting Different Dynamics

7.1 Comparing Dynamics

Modelers of real world problems often convert one dynamic into another. This can be helpful when seeking tractability and insight into a real world problem.

7.2 What is Possible when Converting Systems?

Let us say we have a deterministic Boolean network with synchronous updating, about as simple a system as we can ever get. Using the conversion method described by [51] or [27] we can convert this system into an ODE. There are various forms of consistency worth analyzing here. Are the fixed points of the Boolean network the fixed points of the ODE? This is proven for the conversion methodology in
[51] and we get this by analogy for [27]. Now that we have that the fixed points of the dynamics are consistent, we should ask about points that are not fixed.

That is, when we observe the movement “a” to “b” in the Boolean network do we get similar movement in the ODE system? Making this vague idea of “similar

movement” formal requires us to split up the continuous state space and attribute

the vectors of the Boolean network to sections of the continuous space. This kind

of analysis is done for specific classes of Boolean networks and presented in [24,41].

Next we can take this ODE and analyze it probabilistically. Already we have chopped

the ODE state space up in a fairly arbitrary way: we have taken the single vector

“a” in the Boolean network and said this is represented by its section of the ODE’s state space. This is somehow like the painful jump from continuous mathematics to logic we see often in mathematics. What we have done here is, for example, said that all

days above 58◦ F are hot days and otherwise they are cold days. So we’ve taken

a continuous space and matched it to either 1 or 0, hot or cold. Now when we

analyze the Boolean network we may see that every hot day with high humidity

yields another hot day (one time step later). On the other hand, when looking at

the ODE we may see this is the case for days with temperature 73◦ F or above

but lower than this we get mixed results. Note that we have already made a

concession to consistency: we can no longer say that a leads to b with probability 1. But maybe we can say it still happens with 93 percent probability. This certainly represents a kind of consistency. Next we can take this Markov chain and, using a classic simulation technique, simulate it with the Gillespie algorithm, which induces an ODE [16]. That is, we are capable of the following conversions:

Boolean Networks ↔ Ordinary Differential Equations ↔ Markov Models

A couple of methodologies have been outlined in the literature and the research group

at Ohio University has developed some independent methodologies for converting

Boolean Networks into ODE systems. In [51] Wittman outlines a way to convert

Boolean functions into ODEs of linear decay functions. While these functions

are analytic and have consistent fixed points, not much can be determined about

consistency outside of fixed points. We attempt to improve upon this with the

conversion method outlined in [29].
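To give a flavor of what such a conversion looks like, here is a minimal Python sketch in the spirit of [51]: each Boolean update function is interpolated multilinearly on the unit cube and coupled with linear decay. The two-node network used here (an AND gate and a negation) is a made-up illustration, not an example taken from [51] or [29].

```python
import itertools

def multilinear(f, x):
    """Multilinear interpolation of a Boolean function f on [0,1]^n:
    a corner-weighted average over the vertices of the unit cube."""
    total = 0.0
    for corner in itertools.product((0, 1), repeat=len(x)):
        weight = 1.0
        for xi, ci in zip(x, corner):
            weight *= xi if ci else (1 - xi)
        total += weight * f(corner)
    return total

# A made-up two-node network: node 1 computes AND, node 2 computes NOT.
fs = [lambda s: s[0] & s[1], lambda s: 1 - s[0]]

def rhs(x):
    """ODE right-hand side with linear decay: x_i' = P_i(x) - x_i."""
    return [multilinear(f, x) - xi for f, xi in zip(fs, x)]

# Forward Euler from a non-Boolean initial condition; the trajectory
# approaches (0, 1), the unique fixed point of both systems.
x, dt = [0.9, 0.8], 0.01
for _ in range(2000):
    x = [xi + dt * di for xi, di in zip(x, rhs(x))]
print([round(xi, 3) for xi in x])
```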

As for taking ODEs and turning them into Boolean networks, there exists a

large body of literature. See for example [8]. Similarly, there exists a large body of

literature on converting ODEs into Markov models and Markov models into ODEs.

This goes beyond the scope of this thesis.

8 Very Few Answers and Many Questions

In the previous sections we have rushed through many delicate parts of mathemat-

ics which one should avoid rushing through. The pay-off for our haste is that we

now can discuss some connections of these fields. In this thesis we have discussed

various forms of difficulty. There are some problems where the difficulty is

really only in the eyes of the beholder. If the problem is rewritten it may become

simpler. Of particular interest are the problems when this is not the case.

Principle 1 No matter how many times (thus far) we change a problem the in- teresting questions never become any easier to answer. At best, we can only move

the difficulties from one component to another component of our question.

This seems to be true (at least empirically) about many difficult mathematical questions. Is there a way to classify and connect these questions that always exhibit this difficulty, whether we call this difficulty “NP hard” or “chaos” or “independence”? It would be interesting to attempt to classify these problems that have this property.

Conjecture 2 Any consistent algorithm which models a chaotic phenomenon will be at least NP hard. A consistent discretization of the phenomenon will be fractal in nature.

The conjecture above is somewhat vague but may act as a guide for future studies.

It’s not quite clear how independence fits into this framework. As we mentioned earlier, it’s possible that the P vs. NP question is independent of the axioms. Discussing difficulty may have an inherent difficulty to it.

8.1 Where does chaos take us?

The obvious answer is that chaos will take us all over the state space of interesting math. We remain unsatiated with the connections drawn in this thesis. We’ve pointed to many unanswered potential connections which should lead to many cool areas of mathematics. (We must apologize for any mathematics which didn’t make the list of ‘cool’ mathematics in this thesis. Though to be fair, the set of patterns worthy of exploration was never enumerable.)

9 Appendix of Personal Contributions

Many interesting ideas were motivated by the research at Ohio University. Here is a selection of some of my personal work of a more technical nature after some slight adaptations. Distance Preserving Bijections on Binary Vectors was published in Midstates Conference for Undergraduate Research in Computer Science and

Mathematics in 2010. The copyright on this publication prohibits republishing this work in a journal or conference proceeding without consent. The other notes are all research notes and no effort has been made to present them in a different form than they were presented to the research group.

9.1 Hallmarks of Chaotic Versus Ordered Dynamics.

Description:

The naive but natural analogy, “The slope of the Derrida curve at the origin acts like the Maximal Lyapunov Exponent” is somewhat imperfect. The slope is computed as the average distance between initial conditions with Hamming distance one after one update. Therefore, it’s a measure on the entire Boolean network. On the other hand, the MLE is really just a measure on the attractors of an ODE.

To make a more perfect analogy, I isolated the class of networks where every initial condition was an element of the attractor. These time–reversible networks were previously described by Kadanoff, Aldana and Coppersmith in [1]. In the

following paper I summarize some features of chaotic distributions of Boolean networks and hone in on these time–reversible networks.

Hallmarks of Chaotic Versus Ordered Dynamics

Aug 5, 2010.

Terms

Definition of Chaos

Chaos has many different definitions, but we’d like to adapt one for the sake of functionality. Consider a distribution µN; then for each N we may select a Boolean network JN ∈ µN. Then we can draw a Derrida curve for each JN.

Let dJN denote the slope of the Derrida curve at Hamming distance 1. Letting dN = E[{dJN : JN ∈ µ}] we have a sequence of di’s in the following fashion: d1, d2, . . . , dN. Now if the sequence converges we let

$$D_\mu = \lim_{N \to \infty} d_N.$$

Note that “little” d makes sense for any JN but “big” D is only logical for a distribution µN .

If Dµ > 1 we will call the system chaotic.

If Dµ = 1 we will call the system critical.

If Dµ < 1 we will call the system ordered.

Whenever context clarifies the meaning of these terms we will drop the subscripts in an effort to decrease clutter.

Frozen Nodes

The ordered regime is marked by the quality that most initial conditions will reach

an attractor with a large number of variables or nodes that will never flip (without

perturbation). Such nodes are called eventually frozen nodes.

Homeostatic Stability

A property of order, homeostatic stability is the property that most small pertur-

bations (flipping a single node) leave a vertex (or initial condition) in its original

basin of attraction.

Reachability among cycles after perturbation

Let {A1,A2,...,Az} denote the attractors of a network. We say that Aj is directly reachable from Ai if there is (at least) one node such that the flip of that node at time t (when the system is in attractor Ai) has the effect of bringing (after a transient) the network to the attractor Aj.

We also define Aj to be indirectly reachable from Ai if there exists a path from Ai to Aj as a result of two or more successive single-bit flips, and we define Aj to be reachable from Ai if it is either directly or indirectly reachable. High reachability is a feature of chaos.

The number of attractors is a feature that divides chaos and order. This quantity should of course be relative to N. The body of literature about Boolean networks tells us that fewer attractors implies order whereas a system with more attractors would be chaotic. This idea is also a result of our intuition regarding the phrases chaos and order.

Long and Short Attractors

The classic definition would be that a long attractor is one whose length increases exponentially with N. In truth we find this phrase somewhat misleading; as we increase N within a distribution attractors don’t truly gain length. For example consider a network with N nodes in a distribution µ. Call it JN. Then as we vary N in the following way JN → JN′, the attractors in JN do not hold any obvious correspondence to those in JN′. The point here is that there exists some c > 0 such that with probability approaching one as N gets large, a randomly selected initial condition finds itself in a basin of an attractor whose length is greater than or equal to c^N. If a distribution µ has this characteristic for 1 < c ≤ 2, we would then say that µ contains long or chaotic attractors.

Ω-Limit Set and its permutations

Here is a related but distinct idea. Consider Ω = the ω-limit set of JN. If we compute the successors of each element in Ω we have a permutation P of length L ≤ 2^N where L = |Ω|. If there exists some 1 < b ≤ 2 such that, with probability approaching one as N gets large, a randomly selected Boolean network has an ω-limit set of size b^N, then we may say the ω-limit set is growing exponentially with N. Note that if the attractors are growing exponentially the ω-limit set must be as well. To put this idea simply: we may write each cycle as C1, C2, . . . , CT and their corresponding attractors (sets of states) as C′1, C′2, . . . , C′T; then the product of cycles is a permutation P that includes every element in the ω-limit set. Obviously we may say that if one of these factors (cycles) grows exponentially then the product does as well.

If we let |K| refer to the length of a permutation K, then,

Ω = C′1 ∪ C′2 ∪ · · · ∪ C′T

P = C1 × C2 × · · · × CT

|P| = |C1| + |C2| + · · · + |CT|

|P| = |Ω|

Table 9.1 was taken from [33] and goes into detail about the features discussed.

We would like to explicate the results above. The results above were found empirically in the 1980s and some have been found less than accurate. The exploration of Boolean networks is a fairly new endeavor and some of the results have been reconsidered. The following was adapted from [10]. Based on computer simulations, the mean attractor number of critical K = 2 Kauffman networks with a constant probability distribution for the 16 possible updating functions was once believed to scale as N^{1/2} [32]. With increasing computer power, a faster increase was seen (linear in [5], “faster than linear” in [48], stretched exponential in [4], [3]).

Then, in a beautiful analytical study, Samuelsson and Troein [46] have proven that the number of attractors grows indeed faster than any power law with the network size N. A proof that the number and length of attractors of critical K = 1 networks increases faster than any power law was published some time later [11]. These two proofs, although they apply to closely related systems, are conceptually different.

The latter derives structural properties of the relevant part of the networks, and obtains from there a lower bound for the number of attractors.

Symbols

For the following sections we will adopt the following symbols

• M will denote the state space of a Boolean network with N nodes. We can

say that M denotes the set of all binary vectors of length N.

• For a vector ~m ∈ M, the symbol |~m| will denote the number of active nodes.

• We can organize M in the following way. Let ML = {~m ∈ M : |~m| = L}.

• “+” will most often refer to element-wise modulo two addition of vectors.

• “¬” will be a shorthand for a single bit perturbation. For a node xi we will

write ¬xi = xi + 1.

• We reduce the phrase “BN is a Boolean network with N nodes and an up-

dating function A on the state space M” to BN = (M,A)

• Let [N] = {1, 2,...,N} and let Π be the set of all permutations on [N].

• For a vector ~m = (m1, m2, . . . , mN ) ∈ M and a permutation π ∈ Π we let

the symbol π(~m) = (mπ(1), mπ(2), . . . , mπ(N))

• K(Ai) shall denote the number of significant variables in the function Ai.

• K(A) shall denote the expected number of significant variables for a function

Ai defined by A.

Reversible Networks

Discussion

For many distributions we see the Boolean networks exhibit all of the features of

chaos or none of them. However this need not be the case in general. In this

section we will explore some examples and counterexamples. Many examples we

will use will relate to a class of Boolean Networks described in [1] under the name

reversible.

Definition 6 We’ll say JN and its updating function A is reversible when A is

injective.

Let ~x,~y ∈ M. Then JN is reversible if and only if

A~x = A~y ⇒ ~x = ~y, or equivalently, ~x ≠ ~y ⇒ A~x ≠ A~y.

We consider Boolean networks with updating functions which are finite maps (in

that we’re only considering a finite number of nodes). So a Boolean is surjective

if and only if it’s injective if and only if it’s bijective.

One of the most significant properties of reversible Boolean networks is that when we look at the state-transition map we find no transient elements. Every element is in some attractor!
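For small N, reversibility can be tested directly. The following is a minimal Python sketch (a hypothetical illustration): it brute-forces injectivity of an updating function on {0, 1}^N, which for these finite maps is equivalent to reversibility.

```python
from itertools import product

def is_reversible(A, n):
    """Brute-force test of Definition 6 below: an updating function on
    the state space {0,1}^n is reversible iff it is injective (and, for
    these finite maps, iff it is surjective or bijective)."""
    images = {A(m) for m in product((0, 1), repeat=n)}
    return len(images) == 2 ** n

# The identity network (each node copies itself) is reversible.
print(is_reversible(lambda m: m, 3))                    # True
# A constant component destroys injectivity (cf. Lemma 11 below).
print(is_reversible(lambda m: (0, m[1], m[2]), 3))      # False
```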

Let’s see some examples.

Example 1.1 Let IN = (M,A) denote a Boolean network where each node copies

itself. Then IN will have the following updating function: for all ~m ∈ M we have

A~m = ~m. Clearly, IN is reversible.

Example 2.1 Let ~x ∈ M and let BN = (M,A) denote a Boolean network with

the following updating function:

for all ~m ∈ M we have A~m = ~m + ~x. It can be seen that BN is reversible.

Example 3.1 Consider π ∈ Π and let BN = (M,A) denote a Boolean network with the following updating function: for all ~m ∈ M we have A~m = π(~m). It is easy to see that BN is reversible.

Example 4.1 Consider π ∈ Π and ~x ∈ M and let BN = (M,A) denote the

following mapping on M:

for all ~m ∈ M we let A~m = π(~m + ~x). We can see that example (4.1) is reversible

because it is a composition of two reversible Boolean networks.

From these examples we can build some sets of reversible networks. Let S denote

the set of all possible updating functions on M.

Reversible Sets of Networks

Example 2.2 We’d like to build a set of Boolean networks from example (2.1). Let C = {A ∈ S | ∃~x ∈ M such that ∀~m ∈ M, A~m = ~m + ~x}. Then the set C is reversible.


Example 3.2 Let P = {A ∈ S | ∃π ∈ Π such that ∀~m ∈ M, A~m = π(~m)}. Then the set P is reversible.

Example 4.2 Let H = {A ∈ S | ∃π ∈ Π, ∃~x ∈ M such that ∀~m ∈ M, A~m = π(~m + ~x)}. Then the set H is reversible.

The sets of mappings in examples (1.1), (2.2), (3.2) and (4.2) are not merely sets but indeed form groups. We’ll submit a proof of this in the next section.

Distributions

How does one go from these sets to what we call “distributions?” Let’s begin with some definitions. Consider that there exist $2^{N \cdot 2^N}$ Boolean networks with N nodes.

We’ll call the set containing all of these the grand pool. The Grand Ensemble assigns equal probability to each element in the grand pool.

Definition 7 A distribution is a function that assigns probabilities to each of the networks in the grand pool. We will often use the following slight abuse of notation. If a distribution µN assigns a probability of zero to a network JN, we will write JN ∉ µN. Otherwise we will write JN ∈ µN.

Definition 8 A uniform distribution is a function that assigns equal weight to every element in a subset of the grand pool.

Definition 9 A restriction κ on a uniform distribution µ is a uniform distribution such that the probability of selecting a Boolean network JN from κN is zero whenever JN ∉ µN.

Call κ a proper restriction of a uniform distribution µ whenever κ ≠ µ.

Definition 10 Assume for all N we have a non-empty set GN of updating functions for Boolean networks with N nodes. We’ll call µN a distribution on the set GN if a network JN ∉ µN whenever JN ∉ GN.

Example 1.2 Consider a distribution µ on the set provided in example (1.1).

This is a fairly boring distribution; for each N our distribution selects the iden- tity Boolean network with N nodes, but it offers a good example of a distribution that straddles the features of ordered and chaotic dynamics. Let’s examine these features. µ is a critical distribution.

Features of Order

• Short Attractors.

• All nodes are frozen.

Features of Chaos

• Many Attractors. Indeed the identity Boolean network has 2^N attractors.

• Homeostatically unstable (maximally).

• High Reachability (although low direct reachability).

Example 2.3 Consider a uniform distribution µ on the set provided in example (2.2). µ is again critical.

Features that lie between order and chaos

• On average half of the single bit flips will take a vector to a new attrac-

tor. This feature makes this distribution neither homeostatically stable nor

homeostatically unstable.

Features of Order

• On average half of the nodes are frozen.

• Short Attractors.

Features of Chaos

• Many Attractors. Indeed as N gets large we have approximately 2^{N−1} attractors.

• High Reachability

Example 3.3 Consider a distribution µ on the set provided in example (3.2). µ is critical despite having exclusively features associated with chaotic dynamics.

Example 4.3 Consider a distribution µ on the set provided in example (4.2). Then the features of µ are exactly the same as the features listed above in example (3.3), and µ is critical.

Introduction

In the previous section we analyzed some distributions on sets without going into

much detail about how we created these sets. Here we’ll look a little deeper and

expand upon how these sets relate to our understanding of dynamics.

We claimed previously that examples (2.2), (3.2) and (4.2) were groups. We prove

this here. But first we need to see that all reversible Boolean networks are related

by one group.

Lemma 6 The set of updating functions for all reversible Boolean networks forms

a group. We’ll call this group R.

For each R ∈ R we see that R permutes M. So we see that R is isomorphic to S_{2^N}, the set of permutations on [2^N].

For ~x ∈ M let C~x denote the following mapping:

∀~m ∈ M, C~x ~m = ~m + ~x

Let C = {C ∈ R|C = C~x for some ~x ∈ M}

Because |M| = 2^N we can see that |C| = 2^N as well.

Lemma 7 C forms a subgroup of R.

Proof.

Note that this set is given in example (2.2) and each element can be seen to be

reversible. Thus, C ⊆ R. Consider C~x, C~y ∈ C.

Note that C~xC~y = C~x+~y, so the closure of C under composition is guaranteed by the closure of M under modulo two addition of vectors.

It follows inverses aren’t too hard to find: letting ~x = ~y we see that

C~xC~x = C~x+~x = C~0.

But what’s the meaning of C~0? This mapping leaves each element of M untouched, making C~0 the identity of our group.

We’ve shown C is closed under composition, it has an identity element and each element has an inverse (even though this was trivial as |C| < ∞), which tells us that C ≤ R.

Let Π denote the set of all permutations on the letters {1, 2,...,N}. Then Π

forms a group under composition.

Consider π ∈ Π and ~m = (m1, m2, . . . , mN ) ∈ M.

Let’s adopt the following shorthand: π(~m) = (mπ(1), mπ(2), . . . , mπ(N))

We let Pπ denote the following mapping:

∀~m ∈ M, Pπ ~m = π(~m)

Let P = {P ∈ R|P = Pπ for some π ∈ Π}.

Lemma 8 P forms a subgroup of R.

Proof.

Firstly note that P is the set described in example (3.2) and that each element in P is reversible. Thus, P ⊆ R. Consider A = Pα and B = Pβ, both elements of

P. Then AB = Pαβ. We need to confirm that AB is in P which means we need to see that αβ is in Π. But Π is closed under composition.

The existence of an identity element of P is guaranteed by the existence of an identity element in Π; likewise we’re guaranteed inverses by their existence in Π.

A^{−1} = P_{α^{−1}} is the mapping such that for all ~m = (m1, m2, . . . , mN) ∈ M we have

A^{−1}(m1, m2, . . . , mN) = (m_{α^{−1}(1)}, m_{α^{−1}(2)}, . . . , m_{α^{−1}(N)}).

So we’ve shown closure, inverses and an identity element and thus, P ≤ R. 

Lemma 9 The set H described in example (4.2) is the group generated by the union of C and P.

Proof. Consider C = C~x ∈ C and P = Pπ ∈ P for some ~x ∈ M and some π ∈ Π.

We let H = PC; then for all ~m ∈ M the mapping H satisfies:

H ~m = π(~m + ~x).

Or equivalently H ~m = π(~m) + π(~x).

So if we let ~y = π(~x) and Y be the mapping in C such that for all ~m ∈ M we have

Y~m = ~m + ~y, then H = YP; thus PC ⊆ CP.

Consider a mapping G = CP. Then for all ~m ∈ M the mapping G satisfies: G~m = π(~m) + ~x, or equivalently G~m = π(~m + π^{−1}(~x)). So CP ⊆ PC. So we’ve shown that CP = PC, which happens if and only if H = CP is a subgroup of R.


Definition 11 We will say a vector ~x is active on l nodes or has an activity level of l if ~x has exactly l non-zero nodes. In particular this tells us

~x = ek1 + ek2 + · · · + ekl

for distinct integers 0 ≤ k1, k2, . . . , kl ≤ N.

We will adopt the shorthand |~x| = l which will read ~x is active on l nodes.

Definition 12 A node will be called active when it takes on the value one. Con- versely a node is called inactive whenever it takes on the value zero.

For example vector ~m = (1, 0, 1, 0, 0) has 2 active nodes positioned at 1 and 3 and

three inactive nodes positioned at 2,4 and 5.

Definition 13 Vectors ~x,~y have Hamming distance h if we can perturb exactly h nodes in ~y to recover ~x.

To see this mathematically ~x = ~y + ek1 + ek2 + ··· + ekh for distinct k1, . . . , kh

and 0 ≤ ki ≤ N. We will often adopt the shorthand |~x + ~y| to denote Hamming distance between ~x and ~y.

Note that a Boolean network JN has a state space of binary vectors of length N.

At most these vectors differ at N locations meaning that Hamming distance on vectors in the state space of JN is a function which has an integer range from 0 to

N. We will use the phrase ~x “neighbors” ~y in the case that |~x + ~y| = 1.

Definition 14 For the use of this research note we would like to adopt the following phrase. A Boolean network JN and its updating function A are h-conservative

when

|~x + ~y| = h ⇒ |A~x + A~y| = h

Note that by definition all Boolean Networks are zero-conservative. Two spe-

cific cases of this situation are of utmost interest to us. Nets which are one-

conservative will simply be referred to as conservative. A Boolean network which

is h-conservative for all h we will call universally conservative.
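Definition 14 can be checked by brute force for small networks. Below is a minimal Python sketch (a hypothetical illustration); the constant-shift network used as a test case is universally conservative, in line with Lemma 10 below.

```python
from itertools import product

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def is_h_conservative(A, n, h):
    """Brute-force check of the h-conservative property: every pair of
    states at Hamming distance h maps to a pair at distance h."""
    states = list(product((0, 1), repeat=n))
    return all(hamming(A(x), A(y)) == h
               for x in states for y in states if hamming(x, y) == h)

# A constant shift m -> m + (1,1,0) preserves every Hamming distance,
# so it is universally conservative.
shift = lambda m: tuple((a + b) % 2 for a, b in zip(m, (1, 1, 0)))
print(all(is_h_conservative(shift, 3, h) for h in range(4)))  # True
```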

Definition 15 A Boolean Network JN and its updating function are H-conservative

for a set H = {h1, h2 . . . hp} if JN is hi − conservative for each hi ∈ H.

Lemma 10 H ∈ H is universally conservative.

Proof. Consider π ∈ Π and ~c ∈ M. Without loss of generality we let H be the following mapping: for all ~m ∈ M we let H~m = π(~m + ~c). Now consider two vectors which differ by p distinct perturbations: ~x, ~y ∈ M such that |~x + ~y| = p. Note that this means there exists a unique vector ~p such that ~y = ~p + ~x and |~p| = p. Let’s compute H~x and H~y:

H~x = π(~x + ~c) = π(~x) + π(~c). H~y = H(~p + ~x) = π(~p + ~x + ~c) = π(~p) + π(~x) + π(~c). So

H~x and H~y differ by a perturbation vector π(~p), but π does not change the activity level of a vector. In other words, |π(~p)| = |~p| = p. So we have shown that

|~x + ~y| = p ⇒ |H~x + H~y| = p

meaning H is p-conservative; but p is arbitrary, so H is universally conservative.



Components of updating functions

Assume that ~x ∈ M and that A~x = ~y. We can write the function A as a set of

functions {A1,A2,...AN } where Ai~x = yi. We get the following set of equations

A~x = {A1,A2,...AN }~x = {A1~x,A2~x,. . . AN ~x} = {y1, y2, . . . yN } = ~y

We call each Ai a component of A.

Definition 16 Let the significant input of a function Ai, denoted by K(Ai),

refer to the number of nodes required in ~m to determine the value of Ai~m.

Definition 17 Let the average significant input of A refer to the expected value of

K(Ai).

We’ll use the phrase “A is a function with K = K0” to denote that the expected significant input for a function Ai of A is K0.

Some functions Ai might be completely determined without any nodes. Such functions are called constant components.

Lemma 11 If an updating function has any constant components, then it is not

injective.

Proof. Assume A = {A1, A2, . . . AN} and that Ai is a constant component. So for

all ~m ∈ M we have that Ai~m = c ∈ {0, 1}. Now consider two vectors

~x = (m1, m2, . . . mi−1, 1, mi+1 . . . mN ), ~y = (m1, m2, . . . mi−1, 0, mi+1 . . . mN ). Then

A~x = A~y but ~x ≠ ~y. So A is not injective.

Corollary 12 If A = {A1,A2,...AN } is injective and K(A)=1 then K(Ai) = 1 for each i.

Proof. Follows directly from Lemma 11: since A is injective, no component is constant, so each K(Ai) ≥ 1. If some K(Ai) > 1 while all the other components take in at least one significant input, then the expected significant input is greater than one, which is a contradiction.

Corollary 13 If A = {A1,A2,...AN } and K(Ai) = K(Aj) = 1 where Ai and Aj both take input from the same node then A cannot be injective.

Proof. Assume Ai and Aj both take input from the gth node of a vector. Now by the lemma there are only two possible functions for these components Ai and

Aj, namely, copy and negate.

Let ~m = (m1, m2, . . . , mN). If Ai copies mg then Ai~m = mg and if it negates it then Ai~m = ¬mg. So if Ai and Aj select the same function then Ai~m = Aj~m, and then the two vectors where the ith and jth positions differ must be transient states, a contradiction. So perhaps Ai and Aj select different functions. Then Ai~m = ¬Aj~m, and then the two vectors where the ith and jth positions take on the same value must be transient states. Another contradiction.

Lemma 14 Boolean networks which are h-conservative (for any fixed h) are closed under composition.

Proof.

Consider two h-conservative Boolean networks A, B and two vectors ~x and ~y such that |~x + ~y| = h.

We can get the desired result simply by successively (first B and then A) applying the h-conservative property.

|~x + ~y| = h ⇒ |B~x + B~y| = h ⇒ |AB~x + AB~y| = h

This makes the set of h-conservative nets a semigroup under composition.

Lemma 15 For any set of integers H the set of updating functions which are

H − conservative and reversible form a group under composition.

Let X denote the set of H − conservative mappings. By the lemma above we know that X is a semigroup.

The set of reversible mappings, R, is a group. Now let DH denote X ∩ R. This is a semigroup because it’s the intersection of two semigroups. But on the other hand,

DH ⊆ R. So DH is a semigroup contained in a finite group. This makes DH a group.  Note that for any sets P and Q comprised of non-negative integers

less than N we have DP ∩ DQ = DP∪Q. In particular this tells us that DP∪Q ≤ DP.

The most extreme example of this would be the following. Let H∗ denote the set of universally conservative and reversible nets. Then for any set P of non-negative

integers less than N we have H∗ ≤ DP. We have proven the following corollary.

Corollary 16 The set of universally conservative and reversible nets is a subgroup of DH for any H.

Theorem 17 Let BN = (M,A). Then the following are equivalent:

(a) BN is reversible and 1-conservative.

(b) BN is universally conservative.

(c) BN is reversible with K=1.

(d) BN ∈ H

We have the following plan for our proof: we will show (a) → (d), (d) → (b), (b) → (a), and (d) ↔ (c).

We’ll begin with a proof of

(a) → (d)

Proof. Let D denote the set of reversible and 1−conservative nets. Then D ≤ R.

Let BN = (M,A) be as in (a); then we have A ∈ D. We will show that BN ∈ H, the group described in example (4.2). Note that showing D = H has two parts.

We need to see that D ⊆ H and that H ⊆ D. The latter proof has already been

done in Lemma 10.

Take the mapping C to be the following mapping:

For all ~m ∈ M we let C ~m = ~m + A~0

Let ψ = CA. Because C and A are 1 − conservative and reversible we know their

composition is also 1-conservative and reversible by Lemma 15. Thus, ψ ∈ D. Let

M1 be the set of vectors with exactly one active node. Now ψ maps the zero vector

to itself so it must permute the elements of M1. How come?

For any 1 ≤ i ≤ N the zero vector neighbors ei; because ψ preserves neighbors we can see that ψ(ei) = ej for some 1 ≤ j ≤ N. So ψ(M1) ⊆ M1. But ψ is

injective so in fact ψ(M1) = M1. Let π denote this permutation.

Then ψ(ei) = eπ(i).

Let P = Pπ denote the mapping described in example (3.1). Lemma 10 gives us that P and P^{−1} are both elements in D.

Now we know P^{−1}ψ ∈ D by Lemma 14. For brevity let φ = P^{−1}ψ.

We have already seen that

∀ 1 ≤ i ≤ N, φ(ei) = ei,

and we claim that φ is the identity mapping.

Let La = {~x ∈ M : |~x| ≤ a}.

Consider the following statement

∀~x ∈ La φ(~x) = ~x (15)

Then we have already completed the base case of an inductive argument as we

have satisfied (15) for a = 1 by our creation of φ.

Assume (15) holds for a ≥ 1; we will prove that it must then hold for a + 1.

Select ~x ∈ La+1 such that |~x| = a + 1.

Now because a ≥ 1 there must exist at least 2 nodes on which ~x is active.

We can perturb these two positions creating two vectors with activity level a.

Label these vectors ~y and ~z. Note that if we perturb both of these positions the resulting vector, call it ~w, has a − 1 active nodes. By (15) we know that

φ(~w) = ~w, φ(~y) = ~y, φ(~z) = ~z.

But ~y and ~z both neighbor ~x, so φ(~x) must neighbor ~y and ~z. This allows us only two possibilities: ~x gets mapped under φ to either ~x or ~w. How come?

Note that vectors ~y and ~z differ on exactly 2 positions; call them i and j.

This means there are only two two-step paths between them. Either we perturb

~y at position i creating an intermediate vector which we then perturb at position j which results in the creation of vector ~z, or we can perturb j first and then i.

Regardless, the intermediate vector we create must neighbor both ~z and ~y, and we insist that because there are only two two-step paths between ~z and ~y there are only two vectors that can neighbor both ~z and ~y. Note that one possibility is that

when perturbing vector ~y, the intermediate vector is active on both i and j this

would be vector ~x. The other possibility would be that this intermediate vector is

active on neither i nor j; this would represent vector ~w. So φ(~x) is either ~x or ~w.

But because φ is injective we know that ~w is the only vector that is mapped to ~w under φ. So we are left with the conclusion that φ(~x) = ~x.

So our inductive argument tells us that φ(~x) = ~x for all ~x ∈ M meaning φ is the identity map. We will use the symbol I to denote the identity.

φ = I. Substituting in our definition of φ we see

P^{−1}CA = I. Recall that C is its own inverse and we see that

A = CP where C ∈ C and P ∈ P.

This means by Lemma 9 that A ∈ H, concluding the proof.

(d) → (b)

Assume BN satisfies the conditions of (d). Then A can be written as the compo- sition of a shift C ∈ C and a permutation P ∈ P.

A = CP or perhaps A = PC.

But C and P are both universally conservative. How come? By Lemma 9 we know that C and P are elements of H and therefore A is an element of H. The rest of the proof is completed for us by Lemma 10.

(b) → (a)

Assume BN = (M,A) satisfies the conditions of (b). We will show it satisfies the conditions of (a). That A is 1-conservative is trivial because A is universally conservative. So the non-trivial part of the proof is that A must be injective. For all h we have:

|~x + ~y| = h ⇒ |A~x + A~y| = h so then obviously

|~x + ~y| > 0 ⇒ |A~x + A~y| > 0, meaning A is injective.

(d) → (c)

Assume the conditions of (d). Then we have some π ∈ Π and some ~x ∈ M such that the mapping A = {A1, A2, . . . , AN} does the following: for all ~m ∈ M, A~m = π(~m + ~x) = π(~m) + π(~x).

Now write out the nodes of a vector ~m in the following way: ~m = (m1, m2, . . . , mN). Consider a component Aj of the updating function A. Let i = π^{−1}(j). Then we know that Aj accepts input from mi. It is easy to check the value of xi; if it is active then Aj~m = ¬mi, and if xi is inactive then Aj~m = mi. Regardless, we’ve seen that the value of Aj~m is completely determined by the value mi. So K(Aj) = 1, but j was arbitrary, so each component of A must have exactly one significant input and therefore K(A) = 1.

(c) → (d)

Assume BN = (M,A) has the conditions of (c). We will show that there exist

C ∈ C and P ∈ P such that A = PC. Let A = {A1,A2,...AN }. Note that

K(Aj) = 1 by Corollary 12. This means Aj requires only one node in vector ~m =

(m1, m2, . . . , mN) to determine the value of Aj~m. Let i(j) = g when Aj requires only position g of a vector ~m to determine the value of Aj~m. Note that we have only two possibilities: either Aj~m = mg or Aj~m = mg + 1. Note that by Corollary 12 each component has exactly 1 input and by Corollary 13 this input must be unique.

Therefore i : [N] → [N] represents a permutation; denote this permutation by π.

Let P be the following mapping: for all ~m ∈ M we let P~m = π^{−1}(~m). Now letting

ψ = PA, we see we have altered the wiring diagram such that each node feeds input only to itself. Note that P ∈ P. Now taking ψ(~0) = ~c, we create a new mapping C: for all ~m ∈ M, we let C ~m = ~m + ~c. If we let φ = Cψ, we conclude that φ is the identity mapping,

φ = I. Substituting in our definition of φ we see

CP^{−1}A = I. Recall that C^{−1} = C, yielding

A = PC with P ∈ P and C ∈ C. So by Lemma 9 we have proven A ∈ H.

Conjecture

We suspect that the following is also equivalent to the conditions of the theorem:

BN is reversible and H-conservative and ∃h1, h2 ∈ H with gcd(h1, h2) = 1.

However we have not yet found a proof of this.

The Derrida Curve

How does this H-conservative property answer our question about the shape of the

Derrida curve? If JN is H-conservative with h ∈ H then the Derrida curve is given by some function d such that d(h/N) = h/N.

The converse of this is, however, generally false. But the converse does hold for h = 1. This fact together with the theorem above yields the following corollary.

Corollary 18 Assume BN = (M,A) is a reversible Boolean network with a Derrida curve given by the function d where d(1/N) = 1/N. Then K(A) = 1.

It may appear that we have proven that for the distribution of reversible networks the phase transition from chaotic dynamics to critical dynamics appears at K = 1.

But this is not the case! Recall that the definition of critical only requires that the limit as N gets large of the slope of the Derrida curve be one. Coppersmith et al. present numerical evidence that this transition appears at K ≈ 1.65 and show that it is bounded between 1.4 and 1.7 in [7] and [6]. Note that while this is indeed a phase transition, it appears that this is not a transition in the classic sense. Adopting the notation used in [7] and [6], we have for the dissipative case ordered dynamics for all distributions with K < Kc and chaotic dynamics for all K > Kc, where Kc = 2. However, for the reversible model the situation is slightly different. Again we see chaotic dynamics for all distributions with K > Kc. However, there are no distributions on the reversible model with ordered dynamics, so we can say that all distributions for K < Kc are critical.

9.2 Distance Preserving Bijections

Description:

I reframed the results of Hallmarks of Chaotic Versus Ordered Dynamics in an algebraic context. This note cannot be republished due to a copyright.

9.3 Unicyclic Boolean Networks

Description:

B. Elbert worked on the so-called unicyclic networks. Here I give an application of the algebraic perspective to unicyclic networks.

Unicyclic Networks

Sept 29, 2011.

Abstract

Some thoughts on unicyclic Boolean networks. A response to a conjecture

of B. Elbert.

We consider an n dimensional Boolean network.

Definition 18 Let I be the n by n identity matrix. We define the matrix A_n in the following way: let A_{i,j} = I_{i+1 mod n, j} for all i, j ∈ [n].

So for n = 3 we have

$$A_3 = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$$

Proposition 19 A^n = I

Proof. Let ei be the unit vector active on position i. Note that Aei = e_{i+1 mod n}. Of course this implies that A^n ei = ei. Note that the unit vectors {e1, . . . , en} form a basis for the state space and A is linear. It follows that because A^n fixes the basis elements it fixes all elements.

Let ~x = (x1, x2, . . . , xn). Let’s define an n = 3 dimensional Boolean network by

f(~x) = A_3~x + (1, 0, 1)^T.

Note that + here is used as mod 2 addition.

Note that every unicyclic Boolean network can be thought of in this way. First we permute the nodes and second we decide which ones to negate. The matrix A tells us how to permute and then a constant tells us how to negate.

Definition 19 f is unicyclic if and only if

f(~x) = A~x + ~c for some ~c ∈ 2^n.

Note the definition above is actually a proposal to be confirmed when this note is coupled with B. Elbert’s notes. But for this paper we will take this to be the

definition. Note that a unicyclic Boolean network B is a constant shift after a permutation, so by [36] we have that B is an isometry. The following conjecture was formulated by B. Elbert:

Conjecture 1 A unicyclic Boolean network must be periodic with period 2n (these need not be minimal periods).

Proof. Let ~x ∈ 2^n and consider f^{2n}(~x)

= A(A(. . . (A~x + ~c) + ~c) + · · · + ~c) + ~c.

Note we have 2n constants ~c and 2n matrices A in the expression above. Multiplying this out we get

A^{2n}~x + (A^{2n−1} + · · · + A + I)~c.

We confirm that everything in the polynomial (A^{2n−1} + · · · + A + I) cancels, as A^k = A^{k+n} for all k. In particular this means that A^k + A^{k+n} = 0, as we’re dealing with mod 2 addition. So we’re left with

A^{2n}~x = (A^n)^2~x = I~x = ~x.

Finally the conclusion is that f^{2n}(~x) = ~x, yielding an affirmative result to Ben’s conjecture.
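As a sanity check on this proof, here is a minimal Python sketch (a hypothetical illustration, not part of the original research note) that implements a unicyclic update f(~x) = A~x + ~c as a cyclic shift plus a constant and verifies f^{2n}(~x) = ~x on every state.

```python
from itertools import product

def unicyclic(n, c):
    """The unicyclic update f(x) = Ax + c over mod 2 arithmetic, with
    A realized as the cyclic shift sending e_i to e_{i+1 mod n}."""
    def f(x):
        shifted = x[-1:] + x[:-1]  # A acting on the coordinates
        return tuple((a + b) % 2 for a, b in zip(shifted, c))
    return f

n, c = 3, (1, 0, 1)
f = unicyclic(n, c)

# Verify Conjecture 1: 2n applications of f return every state.
for x in product((0, 1), repeat=n):
    y = x
    for _ in range(2 * n):
        y = f(y)
    assert y == x
print("f^(2n) is the identity on all", 2 ** n, "states")
```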

9.4 Counter-Examples

Description:

W. Just and I co-authored a note discussing what can go wrong when converting

a Boolean Network into an ODE. In some ways this note acts as fine–tuning of

the definition of consistency. The examples and counterexamples regarding consis-

tencies with their illustrations are my contributions. More precisely, the sections

entitled: “Notations and Some basic definitions,” “D1(f, γ) for Boolean constants

f in one dimension,” “D1(f, γ) for Boolean constants f in higher dimension,”

“D1(f, γ) for nonconstant f in one dimension,” “D1(f, γ) for nonconstant f in

higher dimension,” “The Case n = 2” are my contributions excepting some remarks by W. Just. The note is included here with permission from W. Just. The other sections are written by W. Just excepting a sprinkling of illustrations and introductory examples.

(In)consistency: Some low-dimensional examples

Winfried Just and Mason Korb

Ohio University

June 14, 2011.

Abstract

We consider some basic and hopefully enlightening low-dimensional ex-

amples of Boolean systems and their ODE counterparts and explore whether

their ODE dynamics is consistent with the Boolean dynamics.

This note combines and revises the material of [22] as well as some of the material in [23] and [41]. Thus it supersedes [22] and the relevant parts of [23] and [41]. The part of [23] that is omitted here is jointly covered by [30] and a planned note of Bismark; the part of [41] that is not covered here has been moved to [40].

Notation and some basic definitions

For undefined notions, see [26] and [28]. Our notation mostly follows the one in [26,

28], but in this note we shall do away with B and simply write D1(f,~γ) instead of

D1(B,~γ) and D2(f,~γ) instead of D2(B,~γ), where f denotes the updating function

of B. Notice that the state space of B is implied by the dimension of the domain

of f. We will let d(x, y) denote the Euclidean distance between vectors x, y ∈ R^n.

For the reader’s convenience, let us list the definitions of some functions that

were considered in [26] and [28] and that we will frequently use:

g(xi) = 3xi − xi^3 − 3

$$S(x_i) = \begin{cases} 0 & \text{if } x_i \le -1, \\ 0.5(x_i + 1) & \text{if } -1 < x_i < 1, \\ 1 & \text{if } x_i \ge 1. \end{cases}$$

$$s(x_i) = \begin{cases} 0 & \text{if } x_i \le 0, \\ 1 & \text{if } x_i > 0. \end{cases}$$

We need to be careful about using s(~x) and S(~x). When comparing an n-

dimensional Boolean network with another system we let

s(~x) = (s(x1), . . . , s(xn)) regardless of whether ~x is in R^n or R^{2n}. On the other hand,

we will let S(~x) = (S(x1),...S(xn)) whenever the ODE system has n-dimensions.

In [26] we constructed associated ODE systems D1(f,~γ) and D2(f,~γ) for any

n-dimensional Boolean system B. Recall that D1 was defined in the following

manner: for each i ∈ [n] we let:

x˙i = γi(g(xi) + 6Pi(S(~x))), (16)

where γi > 0. One can think about the γis as constants, but our results will

not be affected if the γis are allowed to depend on the state or even change over

time, as long as they are all bounded and bounded away from zero, that is, if there

are constants M > m > 0 such that m < γi(~x,t) < M for all i, ~x, and t.

Exercise 1 Show that (16) has one globally stable equilibrium < −1 if Pi(S(~x)) < 1/6, one globally stable equilibrium > 1 if Pi(S(~x)) > 5/6, and three equilibria (a locally stable one < −1, an unstable one in (−1, 1), and another locally stable one > 1) when Pi(S(~x)) ∈ (1/6, 5/6).
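One can get a feel for Exercise 1 numerically. The sketch below (a hypothetical Python illustration) finds the equilibria of g(x) + 6p = 0 for a constant value p of Pi(S(~x)); the factor γi is dropped since it does not move equilibria. The values p = 0, 1/2, 1 reproduce the one-equilibrium and three-equilibria regimes.

```python
def g(x):
    return 3 * x - x ** 3 - 3

def equilibria(p, lo=-3.0, hi=3.0, steps=6000):
    """Roots of g(x) + 6p on [lo, hi], by sign scan plus bisection."""
    rhs = lambda x: g(x) + 6 * p
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    roots = []
    for a, b in zip(xs, xs[1:]):
        if rhs(a) * rhs(b) < 0:
            for _ in range(60):  # bisect the bracketed root
                m = (a + b) / 2
                if rhs(a) * rhs(m) <= 0:
                    b = m
                else:
                    a = m
            roots.append((a + b) / 2)
    return roots

for p in (0.0, 0.5, 1.0):
    print(p, [round(r, 4) for r in equilibria(p)])
# p = 0 and p = 1 give a single equilibrium (< -1, resp. > 1);
# p = 1/2 gives three, as described in Exercise 1.
```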

As in [26], we will consider (16) for real-valued functions Pi : R^n → R that have some (but not necessarily all) of the following five properties:

1. Pi takes the same values as fi on vectors of zeros and ones.

2. Pi is continuous and maps [0, 1]^n into [0, 1].

3. Pi is a polynomial function.

4. Pi has the smallest possible degree.

5. Pi is faithful.

The definition of D2 in some ways just reuses D1 after modifying f. Let f =

(f1, f2, . . . , fn) : 2^n → 2^n be given, where 2^n is our shorthand for {0, 1}^n. Recall

that f is the updating function for a uniquely determined n-dimensional Boolean

system B. We want to extend f to an updating function f + : 22n → 22n of a

2n-dimensional Boolean system B^+. Now for each i ∈ [n] we define an auxiliary function ci(~s) = s_{n+i} that copies the value of variable number n + i to variable

number i. Finally, let

f^+ = (c, f) = (c1, . . . , cn, f1, . . . , fn), (17)

and define D2 in the following manner: D2(f,~γ) = D1(f^+,~γ).

Let x^− be the unique root of the polynomial g(xi) = 3xi − xi^3 − 3 and let x^+ be the unique root of the polynomial g(xi) + 6 = 3xi − xi^3 + 3. Then x^− ≈ −2.1038 and x^+ ≈ 2.1038.

Lemma 20 Let f define any n-dimensional Boolean system and let ~γ denote any vector of positive reals of suitable dimension. Then [x^−, x^+]^n is a forward-invariant set in D1(f,~γ) and [x^−, x^+]^{2n} is a forward-invariant set in D2(f,~γ).

Proof. Let ~x(0) ∈ [x^−, x^+]^n. If the trajectory φ~x escapes [x^−, x^+]^n then there exist a time τ and some variable xi such that xi(τ) ∉ [x^−, x^+]. Because our functions are continuous we know there must exist a time t such that xi(t) = x^− with a negative derivative or such that xi(t) = x^+ with a positive derivative. Let us deal with the case that xi(t) = x^−. Then equation (16) becomes:

x˙i = γi(3xi − xi^3 − 3 + 6Pi(S(~x))) = γi(6Pi(S(~x))) (18)

But the sigmoid function S varies between zero and one, so we have seen that

0 ≤ x˙i ≤ 6γi. In other words we can make it as fast or slow as we want but we

cannot make it negative. If xi reaches x^− it will either be pushed back (perhaps slowly, or after a period of time) into the interval or x^− is a fixed point for the variable xi.

A symmetric situation occurs if xi tries to escape past x^+. Then equation (16) becomes:

x˙i = γi(3xi − xi^3 + 3 + 6Pi(S(~x)) − 6) = γi(6Pi(S(~x)) − 6). (19)

But since Pi(S(~x)) ≤ 1, the right-hand side of (19) will never be positive, and we can argue as in the previous case. 

These sets are not actually invariant, but this does not bother us, since we only

care about forward trajectories and their Boolean counterparts anyway. Thus in

view of Lemma 20 we will henceforth consider the state space of D1(f,~γ) to be

[x^−, x^+]^n and the state space of D2(f,~γ) to be [x^−, x^+]^{2n}. Note that both of these

state spaces are compact and connected.

Assume that some ODE system

~x˙ = p(~x) (20)

as above is given and that for every ODE trajectory we defined a symbolic real-time trajectory Ψ(~x(0)) as in [28]. For an initial condition ~x(0) and a time interval T let Ψ(~x(T)) = {s(~x(t)) : t ∈ T}. We may think of Ψ(~x(T)) as the ODE implementation of a Boolean trajectory for initial condition ~x(0) on T. We will spend a good deal of time considering the “quality” of this implementation. Of particular importance for us will be how many times the ODE approximation of the Boolean model changes Boolean states on T.

Definition 20 For an initial condition ~x(0) and a time interval T , we will say

~x(T) is switchless when Ψ(~x(T)) = {s} for some s ∈ 2^n. In this case we will simply write Ψ(~x(T)) = s.
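Definition 20 suggests a simple computation on sampled trajectories. The following is a minimal Python sketch (a hypothetical illustration) of Ψ for a one-dimensional sampled trajectory, using the threshold function s from above.

```python
def s(x):
    """The threshold function s from above: 0 if x <= 0, else 1."""
    return 1 if x > 0 else 0

def boolean_word(traj):
    """Psi for a sampled one-dimensional trajectory: the set of
    Boolean states visited on the time interval. The trajectory is
    switchless (Definition 20) exactly when this set is a singleton."""
    return {s(x) for x in traj}

print(boolean_word([-2.0, -1.5, -0.2]))  # {0}: switchless
print(boolean_word([-0.5, 0.3, 0.8]))    # {0, 1}: a switch occurred
```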

Definition 21 Let U be a subset of the state space of the ODE system (20).

We will call p and f strongly consistent on U when for every initial condi- tion ~x(0) ∈ U there exists a sequence (Tτ )τ∈N of pairwise disjoint consecutive and

nondegenerate intervals with ∪_{τ∈N} Tτ = [0, ∞) such that for all τ ∈ N

(i) ~x(Tτ ) is switchless, and

(ii) The sequence (Ψ(~x(Tτ)) : τ ∈ N) is a Boolean trajectory in B = (2^n, f).

If for all ~x(0) ∈ U the sequence (Ψ(~x(Tτ)) : τ ∈ N) is in fact a synch trajectory of B = (2^n, f), that is, if for all τ ∈ N and ~x(0) we have

(iii) Ψ(~x(Tτ)) = f^τ(s(~x(0))),

then we will call p and f strongly s-consistent on U.

Note that Definition 21 agrees with the definitions of strong (s)-consistency given by Definition 3 of [28], except that its wording explicitly refers to the right- hand side p(~x) of (20), which will be convenient for our work in this note. For

any given p and f there exist maximal U = U(f, p) and U^s = U^s(f, p) such that p and f are strongly (s-)consistent on U (U^s). These sets consist of all initial

Boolean dynamics given by f.

In general, U(f, p) may be a tiny subset of the state space or may even be

empty. It is not immediately clear when we can assume U(f, p) to be an open set

or even to have nonempty interior. The next definition describes some additional

desirable properties of U(f, p) or U^s(f, p).

Definition 22 Let St denote the state space of (20), let f : 2^n → 2^n, and let U = U(f, p) or U^s(f, p).

(i) We say that U is complete if for every Boolean state s ∈ 2^n there exists a nonempty open V ⊂ U with s(~x) = s for every ~x ∈ V.

(ii) We say that U is universal if the set V := {~x ∈ St : ∃ t ≥ 0 ~x(t) ∈ U} contains a dense open subset of St of full Lebesgue measure.

Note that V in point (ii) is the set of initial conditions whose trajectories are eventually (s-)consistent with the Boolean dynamics in the sense defined in [28]. Thus U (U^s) is universal if eventual (s-)consistency holds on almost the entire state space. Example 2 below shows that U^s(f, p) may be complete without being universal and Proposition 23 below shows that U^s(f, p) may be universal without being complete.

In the remainder of this note we will explore the behavior of the notions that we

reviewed above for some very simple low-dimensional examples of Boolean systems.

We will start with the simplest possible Boolean systems and then work our way

up to slightly more complicated ones.

D1(f, γ) for Boolean constants f in one dimension

Let B = (2, f) be a Boolean system of dimension one with a constant updating

function. There are exactly two such systems, given by f(s) ≡ 0 and f(s) ≡ 1.

We need only one variable with index i = 1 here and we have n = 1, but we will

still write xi, Pi, and [0, 1]^n in view of later work.

The most natural choices for Pi are the constant functions Pi(~x) ≡ 0 if f(s) ≡ 0

and Pi(~x) ≡ 1 if f(s) ≡ 1. Then we get s-consistency on the whole state space. In fact, this works under more general assumptions about Pi.

Proposition 21 Assume f : 2 → 2 is a constant Boolean function, γi > 0, and Pi satisfies Conditions 1 and 2 above and is such that Pi([0, 1]^n) ⊂ [0, 1/6) or Pi([0, 1]^n) ⊂ (5/6, 1]. Then (16) and f are strongly s-consistent on the whole state space [x^−, x^+]^n.

Proof: By Exercise 1, under our assumptions the right-hand side of (16) has only one globally stable equilibrium x∗ outside of the interval [−1, 1] at all times,

with x^∗ < −1 if f ≡ 0 and x^∗ > 1 if f ≡ 1. Thus any trajectory in D1(f,~γ) will

move towards this equilibrium. It will cross the threshold of 0 at most once and

the Boolean state s(t) defined by

s(t) = 0 if ~x(t) < 0 (21) s(t) = 1 if ~x(t) ≥ 0 will eventually be the fixed point of B = (2, f). Strong s-consistency immediately follows. 

It may seem puzzling that Propositions 21 and 23 use the assumption that Pi([0, 1]^n) ⊂ [0, 1/6) or Pi([0, 1]^n) ⊂ (5/6, 1]. If fi is a Boolean constant, why would we want to use anything else for Pi than the corresponding constant polynomial?

The answer is that we do not really want to use other Pis, but such alternatives

may naturally result from our conversion methods. For example, the Boolean

expression s1 ∧ s1 ∧ ¬s1 is a contradiction and thus equivalent to the Boolean

constant zero. Our first conversion method described in [26] translates it into

the polynomial P1(x1) = (x1)^2(1 − x1) which maps [0, 1] onto [0, 0.1481] and thus

still satisfies the assumptions of Proposition 21. On the other hand, (s1 ∧ ¬s1)

also represents a contradiction, but it gets translated into a polynomial P1 that

takes all values on the interval [0, 1/4] and thus does not satisfy the assumption of

Proposition 21. We will return to this example below (Example 1). The conversion

methods implemented in [12] do not check whether a given Boolean expression is or is not a contradiction or tautology, and our software cannot avoid such situations.

Exercise 2 Show that the conversion method to faithful Pi that was described in [26] does not give Pis that satisfy the assumptions of Propositions 21 and 23.

On the other hand, the conversion method described in [51] always gives poly- nomials Pi of minimal degree, which for Boolean constants are necessarily constant, no matter how the tautology or contradiction is actually represented as a Boolean expression.

So perhaps we should simply adopt the conversion method of [51] instead of investigating possibly pathological interpretations of Boolean constants? This may not be a good idea, for three reasons.

First of all, for large Boolean systems the conversion method of [51] requires a lot of time to compute; ours can be implemented in a much faster way.

Second, if we aim at results of largest possible generality (see [25]), we also need to deal with Pis that are in some ways less than optimal. Notice, for example, that

the conversion of s1 ∧ s1 ∧ ¬s1 into the polynomial P1(x1) = (x1)^2(1 − x1) is quite natural, but not optimal in the above sense.

Third, we want to build up some results that we can use in a more general setting. Suppose for example that f1 = s1 ∧ s3. Even the method of [51] will translate this into a quadratic polynomial. However, if we investigate the behavior of a trajectory along which s(x3) = 0, then f1 will behave along this trajectory as a Boolean constant in exactly the same way as any contradiction. The more general result Proposition 23 may give us a tool for investigating this trajectory, while a result with the more stringent assumption that Pi be constantly equal to zero would not.

The following example shows that the assumptions of Proposition 21 can be weakened to some extent.

Example 1 Let f(s1) = s1 ∧ ¬s1. Then the corresponding polynomial P1(x1) =

(x1)(1 − x1) does not satisfy the assumptions of Proposition 21, but the corresponding ODE implementation D1(f, 1) is still strongly consistent with f on the whole state space.

Proof: The ODE for the unique variable x1 is

x˙1 = 3x1 − x1^3 − 3 + 6S(x1)(1 − S(x1)). (22)

We can get a feeling for this function by examining Figure 9.

Figure 9: x˙1 = 3x1 − x1^3 − 3 + 6S(x1)(1 − S(x1))

We find that this cubic has only one zero at x^−, so this example gives p(x) and f(s) which are strongly consistent on the whole state space, with {x^−} being the only attractor. The proof is identical to that of Proposition 21.
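The claim of Example 1 is easy to check numerically. Below is a minimal Python sketch (a hypothetical illustration) that integrates equation (22) with forward Euler; every trajectory settles near x^− ≈ −2.1038 and crosses the threshold 0 at most once.

```python
def S(x):
    """The sigmoid S from above."""
    return 0.0 if x <= -1 else (1.0 if x >= 1 else 0.5 * (x + 1))

def rhs(x):
    """Right-hand side of equation (22)."""
    return 3 * x - x ** 3 - 3 + 6 * S(x) * (1 - S(x))

for x0 in (-2.0, -0.5, 0.5, 2.0):
    x, crossings, dt = x0, 0, 0.001
    for _ in range(20000):
        x_next = x + dt * rhs(x)
        if (x > 0) != (x_next > 0):  # count threshold crossings
            crossings += 1
        x = x_next
    print(x0, "->", round(x, 4), "crossings:", crossings)
```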

However, some assumptions beyond Conditions 1 and 2 on Pi are necessary in Proposition 21.

Example 2 Let k = s1 ∧ s1 ∧ s1 ∧ s1 and let f1 = k ∧ ¬k. Then U^s(f, p) for the ODE implementation p = D1(f, 1) based on the corresponding polynomial P1(x1) = x1^4(1 − x1^4) is complete but not universal.

Proof: The ODE for the unique variable x1 is

x˙1 = g(x1) + 6S(x1)^4(1 − S(x1)^4). (23)

Figure 10: x˙1 = g(x1) + 6S(x1)^4(1 − S(x1)^4)

We can get a feeling for this function by examining Figure 10.

The system has three fixed points r1 = x^−, r2 = .58875, r3 = .87703. Let us consider x1(0) ≥ r2. Then there is no t such that Ψ(x1(t)) = 0. This demonstrates that the system is not eventually consistent on any U ⊆ [r2, x^+). On the other hand, if we let U = [x^−, r2) we have strong s-consistency on U. Thus U^s(f, p) = [x^−, r2), which is complete but not universal.

The following example generalizes Example 2 and identifies the mechanism responsible for the observed dynamics.

Example 3 Assume f : 2 → 2 is the constant Boolean function f ≡ 0 and γ1 > 0. Moreover, assume that P1 satisfies Conditions 1 and 2 above and is such that g(x1(0)) + 6P1(S(x1(0))) > 0 for some x1(0) with x1(0) > 0. Then the trajectory of x1(0) in (16) is not eventually strongly consistent with f.

Proof: By Exercise 1, at time t = 0 there will be a locally stable equilibrium x^∗(0) > 1 of (16) and x˙1(x1(0)) > 0, so x1 will move towards x^∗. This situation will persist over some time interval T; for all t ∈ T, the variable x1(t) will increase and move towards a changing equilibrium x^∗(t) > 1. In particular, x1(t) will not cross 0 as it should if the trajectory were consistent with f. In order for x1 to change direction, g(x1) + 6P1(S(x1)) would need to become negative.

But by the Intermediate Value Theorem, this would require x˙1(t1) = 0 at a right endpoint t1 of T, in which case the trajectory of x1(0) would reach a fixed point whose Boolean state s(x1(t1)) = 1 is inconsistent with f. If P1 is Lipschitz continuous rather than merely continuous, the trajectory of x1(0) will never actually reach a fixed point and T will be infinite.

Notice that the assumptions of Example 3 contradict the assumptions of Proposition 21, but they are not an outright negation of the latter.

Problem 1 Formulate assumptions that are both necessary and sufficient in Proposition 21 and prove a version of the proposition under these more general assumptions.

Proposition 22 For any ODE implementation p = D1(f, γ) of a contradiction or tautology f : 2 → 2 the set U^s(f, p) has nonempty interior.

Proof: We prove the proposition for the case of a contradiction; the case of a tautology is analogous. Note that ẋ1(x−) < 0 by Condition 1 on P1. Moreover, since P1 is continuous by Condition 2, there exists an ε > 0 such that for all y with |y − x−| < ε we have ẋ1(y) < 0. It follows that U = [x−, x− + ε) is as required in the proposition. □

D1(f, ~γ) for Boolean constants f in higher dimensions

Proposition 21 easily generalizes to the following result:

Proposition 23 Assume f : 2^n → 2^n is a Boolean function such that each component fi of f is a Boolean constant. Assume γi > 0 for all i ∈ [n] and that each Pi satisfies Conditions 1 and 2 above and is such that Pi([0, 1]^n) ⊂ [0, 1/6) or Pi([0, 1]^n) ⊂ (5/6, 1]. Then (16) and f are strongly consistent and eventually strongly s-consistent on the whole state space [x−, x+]^n. However, if n > 1, then the set on which (16) and f are strongly s-consistent is not complete.

Proof: The proof of strong consistency is exactly the same as the proof of Proposition 21, since we can treat each variable separately. The last sentence will follow from Lemma 28 of the Appendix. We will defer its proof and instead give two illustrative examples here. □

For our first example, assume that s* is the steady state of the Boolean system and ~x(0) is an initial state with s(xi(0)) = 1 − s*i and s(xj(0)) = 1 − s*j for some i ≠ j. Then xi will cross zero at some time ti > 0 and xj will cross zero at some time tj > 0. For the trajectory of ~x(0) to be s-consistent with f, these crossings would have to happen at exactly the same time. This may be true for an individual ~x(0), but not for all initial conditions in an open neighborhood of ~x(0). To see why, consider the simplest case where Pi = Pj are constant. Then the crossing times ti and tj depend monotonically on xi(0) and xj(0) in an identical fashion, and we will have ti = tj only if xi(0) = xj(0). Thus if U is the set of all initial conditions ~x(0) with s(xi(0)) = 1 − s*i and s(xj(0)) = 1 − s*j while s(xk(0)) = s*k for k ∈ [n]\{i, j}, then U^s(f, p) ∩ U will be the nowhere dense subset of U that is obtained by intersecting U with the hyperplane {~x : xi = xj}.
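The monotone dependence of the crossing times on the initial conditions is easy to observe numerically. The sketch below treats one coordinate with the constant implementation Pi ≡ 0 (so ẋ = g(x), with the same assumed cubic g as before) and uses an ode45 event to record when the trajectory crosses zero:

% Crossing time as a function of the initial condition for Pi constantly 0
g = @(x) 3*x - x.^3 - 3;
opts = odeset('Events', @(t, x) deal(x, 1, 0));  % stop the integration when x = 0
x0s = [0.5 1.0 1.5 2.0];
tc = zeros(size(x0s));
for k = 1:numel(x0s)
    [~, ~, te] = ode45(@(t, x) g(x), [0 10], x0s(k), opts);
    tc(k) = te(1);
end
disp(tc)  % strictly increasing in x0: two coordinates cross together only if xi(0) = xj(0)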

For our second example, let us consider the two-dimensional f given by the following Boolean rules:

f1(s) = ¬(s1 ∨ s2) ∨ s1 (24)

f2(s) = ¬(s1 ∨ s2) ∨ s2. (25)

The functions above are not constant, but the example is still easy to analyze

and nicely illustrates the phenomenon of non-simultaneous crossings, so we include

it here. Note that f = (f1, f2) maps each Boolean state s ∈ {01, 10, 11} to itself,

while 00 is mapped to 11. Our standard conversion method to D1(f,~1) gives us

the following set of equations:

Figure 11: Phase portrait of the D1 counterpart of a Boolean network where every element is a fixed point except 00, which is succeeded by 11.

P1(S(x)) = (1 − S(x1))(1 − S(x2)) + S(x1) − (1 − S(x1))(1 − S(x2))S(x1) (26)

P2(S(x)) = (1 − S(x1))(1 − S(x2)) + S(x2) − (1 − S(x1))(1 − S(x2))S(x2).

This results in a phase portrait given by Figure 11 which illustrates a number of

things. First of all, one can see that from many initial conditions, the ODE system

will approach a steady state that corresponds to the correct Boolean steady state

of f. However, for initial conditions with x1(0), x2(0) < 0, the ODE dynamics

will be strongly s-consistent with the Boolean dynamics only if x1(0) = x2(0),

similarly to the situation in the previous examples. Moreover, for all the initial

conditions below or to the left of the two curved sample trajectories shown, the

system will approach one of the two steady states (x+, x−) or (x−, x+). Since these

regions include many initial conditions with x1(0), x2(0) < 0, we will not even have consistency for most trajectories starting with x1(0), x2(0) < 0. Finally, we note that the ODE system also has two unstable fixed points at (0, x+) and (x+, 0) which do not have Boolean counterparts.

Again, the assumptions of Proposition 23 are stronger than necessary.

Example 4 Consider the following two-dimensional Boolean Network: Any initial condition (s1, s2) is succeeded by (0, 0). We let

f1(s) = s2 ∧ ¬s2 f2(s) = s1 ∧ ¬s1. (27)

Letting ẋi = γi[g(xi) + 6Pi(S(~x))] as prescribed by D1(f, ~γ) with the standard implementations of the Pi's, we find:

ẋ1 = γ1[g(x1) + 6S(x2)(1 − S(x2))] (28)

ẋ2 = γ2[g(x2) + 6S(x1)(1 − S(x1))]

Taking γ1 = 2 and γ2 = 10 gives us the phase portrait seen in Figure 12.

The part of the x1-nullcline centered at (0, 1) and the part of the x2-nullcline centered at (1, 0) present no serious problem, and the only attractor of this system is given by {(x−, x−)}. This shows that, again, this system and f(s) are strongly consistent on U = [x−, x+]^2.
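A simulation of (28) confirms this; a sketch, under the same assumptions on S and g as in the earlier examples:

% Simulate (28) with (gamma1, gamma2) = (2, 10)
S = @(x) min(max((x + 1)/2, 0), 1);
g = @(x) 3*x - x.^3 - 3;
gam = [2; 10];
rhs = @(t, x) gam .* [g(x(1)) + 6*S(x(2))*(1 - S(x(2))); ...
                      g(x(2)) + 6*S(x(1))*(1 - S(x(1)))];
[t, y] = ode45(rhs, [0 20], [2; -2]);
disp(y(end, :))   % close to (x-, x-), i.e. approximately (-2.1038, -2.1038)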

It is important to note that our selection of (γ1, γ2) = (2, 10) had no impact on the location of the nullclines. But if we warp these nullclines we can introduce additional fixed points.

Figure 12: Phase portrait of (28).

Example 5 Consider the following two-dimensional Boolean network: Any initial

condition (s1, s2) is succeeded by (0, 0). Let k = s1 ∧ s1 ∧ s2 ∧ s2 and

f1(s) = k ∧ ¬k f2(s) = k ∧ ¬k. (29)

Construct D1(f, ~γ) by taking ~γ = (1, 1), letting k* = (S(x1)S(x2))^2 and using the standard ODE implementation p = D1(f, (1, 1)) of (29):

ẋi = g(xi) + 6k*(1 − k*) (30)

for i ∈ {1, 2}.

This system has three fixed points: (x−, x−), (.58875, .58875), and (.87703, .87703). The phase portrait can be seen in Figure 13.

Figure 13: Phase portrait of (30).

Because for this system x1 = x2 implies ẋ1 = ẋ2, we find that Y = {(y, y) ∈ [x−, x+]^2} is invariant. This shows us that this system is not strongly consistent on [x−, x+]^2. Consider the initial condition ~x(0) = (2, 2). If the system of ODEs in this example and f(s) were strongly consistent, there would exist at least one t > 0 such that Ψ(x(t)) = f(Ψ(x(0))) = 0. This tells us that x(t) has two non-positive components. But because Y is invariant, this means that the trajectory would have to pass through (.87703, .87703), which is impossible as this is a fixed point. However, we can see that f and p are strongly consistent on [x−, x+]^2\Y; in other words, [x−, x+]^2\Y ⊆ U(f, p). Thus U(f, p) is complete and universal. One can also easily see that the intersection of the set U^s(f, p) with the first quadrant is contained in Y, so that U^s(f, p) is universal but not complete. The latter was to be expected from our earlier observations about nongenericity of simultaneous crossings of boundaries.

It is quite interesting to note that in contrast with Example 2 we do get a universal U^s(f, p). This cannot happen in one dimension, where for constant f we must have either U^s(f, p) = [x−, x+] or U^s(f, p) not universal. The additional dimension provides an opportunity for trajectories to move around the problematic areas. We will make good use of this effect in our later work.

Problem 2 (a) Formulate assumptions that are both necessary and sufficient in Proposition 23 and prove a version of the proposition under these more general assumptions.

(b) Formulate assumptions that are both necessary and sufficient in Proposition 23 if we replace “eventually strongly s-consistent on the whole state space [x−, x+]^n” by “U^s(f, p) is universal” in its conclusion and prove a version of the proposition under these more general assumptions.

Part (a) of Problem 2 should not be too difficult; part (b) is more interesting, but also likely to be more challenging.

A generalization: D1(f,~γ) for loop-free f

Recall that with any n-dimensional Boolean system with updating function f we can associate a directed graph Df = ([n], Af), called the connectivity of f, such that <j, i> ∈ Af iff variable sj acts as an essential input in the regulatory function fi. We will write D instead of Df and A instead of Af if f is implied by the context.

We call f loop-free if D contains no directed cycles. Boolean constants as in the previous section are loop-free. The simplest examples of Boolean functions f that are not loop-free have dimension 1 and A = {<1, 1>}.

In any loop-free Boolean system the set of nodes [n] can be partitioned into levels; [n] = ⋃_{ξ=0}^{κ} Lξ, where

• L0 ≠ ∅ and L0 consists of all variables with constant regulatory functions; that is, of all variables with indegree 0 in D.

• Lη+1 consists of all variables i such that i ∉ ⋃_{ξ=0}^{η} Lξ and j ∈ ⋃_{ξ=0}^{η} Lξ for all <j, i> ∈ A.

Consider the sync trajectory of initial state s(0) for a loop-free f. For all i ∈ L0, the Boolean state si(τ) remains constant for all τ ≥ 1. Variables in L1 take their inputs only from variables in L0, so si(τ) will remain fixed for all τ ≥ 2.

By induction it follows that the system will reach a unique steady state s∗ after at most κ + 1 steps. (Notice that in our treatment of “Boolean constants” these variables need to take their constant state only for times t ≥ 1).
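The level decomposition can be computed mechanically from the connectivity graph. A sketch (my own helper, not part of our software), where A is the n × n adjacency matrix with A(j, i) = 1 iff <j, i> ∈ Af:

function L = levels(A)
% LEVELS  Level of each node of a loop-free connectivity graph.
%   A(j,i) = 1 iff variable j is an essential input of variable i.
%   L(i) = xi means i lies in level L_xi.  Assumes A is loop-free;
%   otherwise the while-loop below would not terminate.
n = size(A, 1);
L = -ones(1, n);                 % -1 marks "not yet assigned"
L(sum(A, 1) == 0) = 0;           % level 0: the indegree-0 nodes
while any(L < 0)
    for i = find(L < 0)
        inputs = find(A(:, i))';
        if all(L(inputs) >= 0)   % all inputs already have a level
            L(i) = max(L(inputs)) + 1;
        end
    end
end
end

For example, levels([0 1; 0 0]) returns [0 1]: variable 1 is a Boolean constant, and variable 2 reads its only input from level 0.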

Lemma 24 Let f : 2^n → 2^n be loop-free and let ~γ be an n-dimensional vector of positive reals. Assume that for all i ∈ L0 the assumptions of Proposition 23 are satisfied by Pi, and that Conditions 1 and 2 are satisfied by all Pi. Then the dynamics of D1(f, ~γ) is eventually strongly consistent on the whole state space with f.

Exercise 3 (a) Prove Lemma 24.

(b) Give an example showing that we cannot replace “eventually strongly consistent” with “strongly consistent” in the lemma.

(c) Show that for sufficiently large ε we can replace “eventually strongly consistent” in the lemma with “ε-s-consistent.”

Exercise 4 Find an example of a loop-free Boolean system that is chaotic in the sense of the slope of the Derrida curve.

Notice that an example as in Exercise 4 will automatically give us a Boolean system that is chaotic in the sense of the Derrida curve for which the corresponding ODE system (in view of Lemma 24) has a single steady state and is therefore not chaotic. However, this Boolean system does not exhibit such hallmarks of chaos as long attractors, many attractors, or sensitive dependence (in the long run) on initial conditions.

The really interesting Boolean systems are not loop-free. Therefore, Lemma 24 is of somewhat limited interest all by itself. However, the lemma fails to generalize in some illuminating ways, which may help us build up some helpful intuitions for the later parts of our project.

D1(f, γ) for nonconstant f in one dimension

When n = 1, there are four Boolean systems of dimension n. Two of them represent Boolean constants; these were already dealt with in Section 9.4. The other two have regulatory functions fc(s) = s1 (the “copy” function) and fcn(s) = 1 − s1 (“copy-negation”). Neither of the latter is loop-free. Let us take a closer look at these systems.

The case D1(fc, γ)

First consider our standard implementation P1(x1) = x1. Then P1 is faithful and P1 ◦ S is piecewise linear. It follows that D1(fc, γ) has three steady states: locally asymptotically stable ones at x−, x+ and an unstable one at zero. For x(0) < 0 the trajectory will move towards x−, for x(0) > 0 the system will move towards x+, and for x(0) = 0 the trajectory will remain at the unstable fixed point. The exact same observation holds for every faithful P1. We get the following.

Proposition 25 Let p = D1(fc, γ) be implemented by a faithful P1. Then U^s(f, p) = [x−, x+]\{0}.

The assumption that P1 be faithful is necessary in Proposition 25. For example, consider P1 such that P1 ◦ S(x) takes negative values for x^c < x < 0 and positive values for x > x^c. Then an inspection of the phase-line diagram of p = D1(fc, γ) reveals that for x^c < x(0) < 0 the real-time Boolean trajectory Ψ(x(0)) will not be switchless, which is inconsistent with the Boolean dynamics. For such a choice of P1 we still have eventual strong consistency on [x−, x+]\{x^c} though, which implies that U^s(f, p) is universal.

Exercise 5 Construct an example of P1 that satisfies Conditions 1 and 2 such that for the corresponding p = D1(fc, γ) the set U^s is not universal.

The case D1(fcn, γ)

In this case, all Boolean trajectories satisfy

... ↦ 0 ↦ 1 ↦ 0 ↦ 1 ↦ ...

Consider our standard implementation of p = D1(fcn, γ) with P1(x1) = 1 − x1; then P1 ◦ S is piecewise linear. Thus for x < 0 the form of (16) implies that dx/dt > 0, and for x > 0 the form of (16) implies that dx/dt < 0. We conclude that D1(fcn, γ) has exactly one globally asymptotically stable steady state at zero. For any x(0) ∈ [x−, x+] the trajectory of x(0) in D1(fcn, γ) will retain the Boolean state s(x(0)) at all times, and U(f, p) = ∅. Thus D1(fcn, γ) will be maximally inconsistent with the Boolean dynamics of fcn.
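A sign check of the right-hand side makes this immediate; a sketch with the same assumed S and g as before:

% D1(fcn, 1) with P1(x1) = 1 - x1: the flow points toward 0 from both sides
S = @(x) min(max((x + 1)/2, 0), 1);
rhs = @(x) 3*x - x.^3 - 3 + 6*(1 - S(x));
disp([rhs(-0.5), rhs(0), rhs(0.5)])  % positive, zero, negative: 0 attracts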

The case n = 2

In this case there exist already 2^8 = 256 different Boolean systems of dimension n. Some of these are loop-free (Exercise: How many?) and covered by Section 9.4; some of them are reversible in the sense described in [37] (Exercise: How many?); most are neither. As was first observed by Mason, some of the reversible Boolean systems of dimension n = 2 are chaotic in the sense of the Derrida curve. For example, the system given by

00 ↦ 00, 01 ↦ 11, 10 ↦ 01, 11 ↦ 10

has this property. Since D1(f, ~γ) is a two-dimensional ODE system for the latter, the ODE dynamics must be ordered. This gives, even prior to any simulations, the following result.

Corollary 26 Chaos in a loop-free Boolean system does not imply chaos in the corresponding ODE system D1(f,~γ).

This is quite remarkable, but the absence of chaos in D1(f,~γ) may be a bit of an artifact due to its low dimension. We still need to explore whether such examples exist, or even are the norm, in higher dimensions or if we work with other ODE analogues, such as D2(f,~γ).

The case D2(fcn, γ)

Define a 2-dimensional updating function f^+cn = (f1, f2) : 2^2 → 2^2 by choosing the following regulatory functions:

f1(s) = 1 − s2, f2(s) = s1. (31)

This system is critical and reversible. A non-steady-state attractor is given by

00 ↦ 10 ↦ 11 ↦ 01 ↦ 00, (32)

and since this attractor comprises the whole state space of the Boolean system generated by f^+, it is the only one.

Define p = D1(f^+cn, ~γ) by choosing P1 = 1 − x2 and P2 = x1 as in Section 9.4.

Now let us take a closer look: D1(f^+cn, ~γ) is really nothing else but D2(fcn, ~γ) with the roles of the variables reversed. The reversal is a minor notational blunder, but the systems are clearly conjugate, so we leave our notation here as is in order to minimize the amount of necessary revisions. We found that for D1(fcn, γ) we cannot get any consistency between ODE and Boolean trajectories whatsoever. In a sense, we added just one dummy variable to D1(fcn, γ), and bingo! As we will show here, the resulting ODE system shows as much consistency with the Boolean dynamics as one could possibly hope for.

Let us be careful though that we are not getting ahead of ourselves here. Recall that with a state ~x = (x1, . . . , x2n) in the 2n-dimensional state space of D2(f, ~γ) we associate a Boolean state s(~x) = (s(x1), . . . , s(xn)) of dimension n only, that is, we ignore the auxiliary variables xn+1, . . . , x2n. Then we construct the Boolean sequence s̄t as described in [28] based on these n-dimensional vectors only, and hope that it will be a Boolean trajectory.

It is true for any n-dimensional Boolean system given by f that we can treat D2(f, ~γ) as D1(f^+, ~γ), but the correspondence between (sync) trajectories of f and f^+ is not straightforward. In the example discussed here such a direct correspondence does hold, but in Subsection 9.4 below we will give an example where sync trajectories of f^+ correspond to trajectories of f, but not to sync trajectories of f. In general not every trajectory of f^+ will correspond to a trajectory of f. Such a correspondence does hold for sync trajectories though. We will explore this issue in more detail in a subsequent note.

Since n = 2, we have the luxury of being able to perform an easy phase-plane analysis of D1(f^+cn, ~γ). Figure 14 gives the phase portrait for the choice of parameters γ1 = γ2 = 1.

Figure 14: Nullclines and direction arrows for D1(f^+cn, (1, 1)).

The horizontal and vertical parts of the nullclines occur in the regions of the phase plane where P1(S(x1, x2)) or P2(S(x1, x2)) are constant. The most important fact we can learn from Figure 14 is:

The two nullclines intersect at (0, 0), which is the only steady state.

Let us study this system analytically. By [26] we have:

P1(S(x1, x2)) = 1 for x2 ≤ −1
P1(S(x1, x2)) = 1 − 0.5(x2 + 1) for −1 < x2 < 1
P1(S(x1, x2)) = 0 for x2 ≥ 1 (33)
P2(S(x1, x2)) = 0 for x1 ≤ −1
P2(S(x1, x2)) = 0.5(x1 + 1) for −1 < x1 < 1
P2(S(x1, x2)) = 1 for x1 ≥ 1.

In view of (16) and (31), the Jacobian at (0, 0) is given by

J = ( 3γ1 −3γ1
      3γ2  3γ2 ) (34)

with eigenvalues

λ1 = 1.5(γ1 + γ2) + 0.5√(9(γ1 + γ2)^2 − 72γ1γ2) (35)
λ2 = 1.5(γ1 + γ2) − 0.5√(9(γ1 + γ2)^2 − 72γ1γ2).

Since γ1, γ2 > 0, we get two conjugate complex eigenvalues. Moreover, 1.5(γ1 + γ2) > 0, and it follows that (0, 0) is an unstable focus. By the Poincaré–Bendixson Theorem, each trajectory that starts off the equilibrium (0, 0) will approach a limit cycle, and Figure 14 indicates that the ODE dynamics on [x−, x+]^2\{(0, 0)} will be strongly s-consistent with the Boolean dynamics of f^+cn. In other words, for p = D1(f^+cn, ~γ) we have U^s(f^+cn, p) = [x−, x+]^2\{(0, 0)}, which is complete and universal. For this set of initial conditions, the Boolean sequence generated by x2 (which is the state variable of D2(fcn, ~γ) under our reversed notation) will be

s̄t = (..., 0, 1, 0, 1, ...),

which is exactly the dynamics of fcn.
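Both the eigenvalue computation and the limit cycle are easy to reproduce; the following sketch again assumes the piecewise linear S and the cubic g used above:

% D1(fcn+, (1,1)): unstable focus at the origin and a surrounding limit cycle
S = @(x) min(max((x + 1)/2, 0), 1);
g = @(x) 3*x - x.^3 - 3;
disp(eig([3 -3; 3 3]))     % 3 + 3i and 3 - 3i, in agreement with (35)
rhs = @(t, x) [g(x(1)) + 6*(1 - S(x(2))); g(x(2)) + 6*S(x(1))];
[t, y] = ode45(rhs, [0 40], [0.1; 0]);   % start near the unstable focus
plot(y(:, 1), y(:, 2))     % the trajectory spirals out onto the limit cycle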

Inspection of Figure 14 reveals that the limit cycle visits a “clean state” for every Boolean state along the trajectory. This feature occurs in much more general situations and forms the basis for our work in [30]. Moreover, notice that there exists exactly one limit cycle. This feature depends on the particular form of the Pi's, which were chosen as the simplest possible ones. They are faithful polynomials of lowest possible degrees. In general, if we only assume that the Pi's satisfy Conditions 1 and 2, the phase portrait may be more complicated and the set U^s(f^+, p) does not need to be universal.

Exercise 6 Construct a specific example that confirms the claim made in the previous sentence.

However, in view of the results in [30], the set U^s(f^+, p) will always be complete, and will contain sufficiently clean states for every Boolean state.

Another interesting feature of this example is that we did not need to assume any separation of time scales. This contrasts with our work in [30] where such a separation of time scales was assumed. In the (distant) future we may want to consider the following.

Problem 3 For which systems is an assumption about separation of time scales actually needed to prove some version of consistency?

The case D2(fc, γ)

Define a 2-dimensional updating function f^+c = (f1, f2) : 2^2 → 2^2 by choosing the following regulatory functions:

f1(s) = s2, f2(s) = s1. (36)

This system is critical and reversible, has two steady states 00, 11 and an attractor of length 2 that comprises the other two states 10 and 01.

Define p = D1(f^+c, ~γ) by choosing P1 = x2 and P2 = x1 as in Section 9.4. Figure 15 gives the phase portrait for γ1 = γ2 = 1.

Figure 15: Nullclines and direction arrows for D1(f^+c, (1, 1)).

Inspection of Figure 15 reveals that (0, 0) is the unique unstable steady state and all trajectories that start outside a diagonal separatrix will approach one of the stable steady states (x−, x−), (x+, x+) that correspond to the Boolean steady states 00 and 11. By symmetry of the expressions for ẋ1, ẋ2, the separatrix Sep must consist of all states (x1, x2) such that x2 = −x1, and inspection of Figure 15 reveals that this separatrix is also the stable manifold of the equilibrium (0, 0). Let

U= = {~x ∈ [x−, x+]^2 : x1x2 > 0} and

U≠ = {~x ∈ [x−, x+]^2 : x1x2 < 0}.

One can also see from the phase portrait that U= ⊂ U^s(f, p) and that all trajectories that start in [x−, x+]^2\Sep will eventually enter U=. Thus U^s(f, p) is universal.

But U^s(f, p) is not complete; for completeness, U^s(f, p) would need to contain some points from U≠. But inspection of Figure 15 also reveals that trajectories that start in U≠ either stay on Sep (in which case Ψ(~x) remains switchless) or will enter U=, so that Ψ(~x) will contain exactly one Boolean switch. In this case the Boolean sequence will still be a trajectory of f^+c, but not the sync trajectory. It follows that U(f, p) = [x−, x+]^2\Sep. Thus U(f, p) is both complete and universal. In other words, the dynamics of p will be strongly consistent, but not strongly s-consistent, with the Boolean dynamics of f^+c on [x−, x+]^2\Sep.

Now let us take a closer look: D1(f^+c, ~γ) is really nothing else but D2(fc, ~γ). From the point of view of fc, the set U= contains representatives of every Boolean state, and this set should be considered complete from this point of view. Thus for D2(fc, ~γ) we get strong s-consistency on a complete subset of the state space.

An ODE system without periodic orbits

Let n = 4 and define a Boolean updating function f : 2^4 → 2^4 by choosing the following regulatory functions:

f1(s) = 1 − s2, f2(s) = s1, (37)
f3(s) = 1 − s4, f4(s) = s3.

This system is critical and reversible. It is really nothing else than the direct product of the Boolean system defined by fcn of Subsection 9.4 with itself. There are four disjoint attractors of length four each in this system:

0000 ↦ 1010 ↦ 1111 ↦ 0101 ↦ 0000,
0010 ↦ 1011 ↦ 1101 ↦ 0100 ↦ 0010, (38)
0011 ↦ 1001 ↦ 1100 ↦ 0110 ↦ 0011,
0001 ↦ 1000 ↦ 1110 ↦ 0111 ↦ 0001,

and their union is the whole state space. Note that these sync trajectories correspond to the attractor of fcn given by (32). They differ by how far out of step the variables s1, s2 are with the variables s3, s4.

Now define p = D1(f, ~γ) analogously to the definition in Subsection 9.4. It follows from our previous work that the projection of almost any trajectory of D1(f, ~γ) on the (x1, x2)-plane approaches a stable limit cycle C1, while the projection on the (x3, x4)-plane approaches a stable limit cycle C2. Let us assume for simplicity that γ1 = γ2 = 1 and γ3 = γ4.

Then the minimal time T it takes for (x1(t), x2(t)) ∈ C1 to return to itself is fixed, while the minimal time T(γ3) it takes for (x3(t), x4(t)) ∈ C2 to return to itself depends continuously on γ3. Thus the dynamics on the restriction of the state space of D1(f, ~γ) to C = C1 × C2 is topologically equivalent (even diffeomorphic, but we don't need this here) to the dynamics on a torus given by two maps on the unit circle defined by ϕt(β) = β + αt and ψt(β) = β + α(γ3)t, where α = 2π/T and α(γ3) = 2π/T(γ3), and we consider angles that differ by a multiple of 2π as equal.

It is well known that if α/α(γ3) is irrational, then the latter dynamics is transitive (see [44], pp. 245/246). It follows that for most choices of γ3 (actually, for most choices of ~γ) the system D1(f, ~γ) does not have periodic orbits. However, this system does not have sensitive dependence on initial conditions; it is an example of a quasi-periodic system. These observations lead to the following result, whose formal proof is left as an exercise.

Proposition 27 For the system defined above we have

U(f, p) = [x−, x+]^4\{~x : x1 = x2 = 0 ∨ x3 = x4 = 0}

regardless of the choice of ~γ, but U^s(f, p) = ∅ for all ~γ in a residual subset of (0, ∞)^4.

Thus in this example, strong consistency between the ODE and Boolean tra- jectories is a generic property, but strong s-consistency occurs only for very special choices of ~γ.
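One can watch this quasi-periodicity in a simulation. The sketch below picks γ3 = γ4 = √2, so that the second oscillator is a time-rescaled copy of the first and the ratio of the two periods is the irrational number √2 (the choice of √2 is mine; any irrational ratio would do), with the same assumed S and g as before:

% Two copies of the fcn+ oscillator running at incommensurate speeds
S = @(x) min(max((x + 1)/2, 0), 1);
g = @(x) 3*x - x.^3 - 3;
gam = [1; 1; sqrt(2); sqrt(2)];
rhs = @(t, x) gam .* [g(x(1)) + 6*(1 - S(x(2))); g(x(2)) + 6*S(x(1)); ...
                      g(x(3)) + 6*(1 - S(x(4))); g(x(4)) + 6*S(x(3))];
[t, y] = ode45(rhs, [0 200], [0.1; 0; 0; 0.1]);
plot(y(:, 1), y(:, 3))   % the (x1, x3) projection gradually fills a region: no closed orbit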

The example in this section may appear largely irrelevant, since there is no interaction between variables in the set {x1, x2} and variables in the set {x3, x4}; the system is decomposable. However, this type of dynamics may occur along trajectories in larger systems that are not decomposable. For example, this will happen when some other variables that mediate interactions between these sets take fixed values along the trajectories in question. It will also happen if the variables x1, x2, x3, x4 send input to other variables, but do not themselves receive input from other parts of the system.

Appendix: Nongenericity of strong s-consistency

Here we present the proof of a well-known general result in the theory of ODEs that precludes strong s-consistency on a complete subset of the state space for

ODE implementations of most Boolean systems of interest.

Definition 23 Let f : 2^n → 2^n be a Boolean updating function and let D(f) be an ODE implementation of the corresponding Boolean system. We say that D(f) is a topologically nondegenerate implementation of f if

1. The state space St of D(f) is a compact m-dimensional topological manifold with boundary for some m ≥ n.

2. The right-hand side of D(f) is Lipschitz-continuous.

3. There are subsets Z1, . . . , Zn ⊂ St such that for all i ∈ [n] both Zi and St\Zi are m-dimensional topological manifolds.

4. For all i ∈ [n] the boundary Ni of Zi in St is a union of finitely many (m − 1)-dimensional topological manifolds.

5. For all i, j ∈ [n] with i ≠ j the intersection Ni ∩ Nj is a union of finitely many compact topological manifolds of dimensions ≤ m − 2.

6. The Boolean state si(~x) for ~x ∈ St will be interpreted as zero if ~x ∈ Zi and as one if ~x ∈ St\Zi.

Note that we do not require any smoothness conditions on the manifolds in Definition 23. For this reason we call the implementation only “topologically” nondegenerate. In some subsequent results, we may need to impose more stringent conditions on the boundaries of the Zi's, and it seems prudent to reserve the unmodified adjective “nondegenerate” for such purposes. In this definition we also do not require any kind of consistency between the ODE and the Boolean system; it suffices that we can define real-time Boolean trajectories.

The following lemma implies that for nondegenerate ODE implementations of Boolean systems the set of initial conditions whose trajectories cross multiple boundaries simultaneously is negligible.

Lemma 28 Suppose D(f) is a topologically nondegenerate ODE implementation of a Boolean system. Let i ≠ j, and suppose that ~x(0) is an initial condition and 0 < t0 < t1 are times with {~x(t) : t ∈ [0, t1]} contained in the interior of St such that

(i) ~x(t0) ∈ Ni ∩ Nj.

(ii) For all ~y(0) in some neighborhood U of ~x(0) we have |{t ∈ [0, t1] : ~y(t) ∈ Ni ∩ Nj}| ≤ 1.

Then there exists a neighborhood V of ~x(0) such that the set

NS(i, j) = {~y(0) : ∀ t ∈ [0, t1] ~y(t) ∉ Ni ∩ Nj} (39)

contains a dense open subset of V.

Notice that condition (i) covers both the case when the Boolean states si, sj change simultaneously at time t0 and the case where the trajectory reaches the two boundaries at time t0 and then turns back, as well as mixed scenarios. Condition (ii) precludes, among other things, trajectories that move along Ni ∩ Nj for a while. For all ODE implementations of Boolean systems of interest to us, condition (ii) will be satisfied on a dense open subset of the state space.

Proof of Lemma 28: Let everything in sight be as in the assumptions, and let W be a closed neighborhood of ~x(t0). Define a map F : W × [0, t1] → St × [0, t1] by F(~z(0), t) = (~z(t − t0), t). This definition requires that we can extend ODE trajectories backwards in time, which may not always be the case (see Lemma 20, where we have only forward-invariance for our state space), but since we assumed that ~x(0) is in the interior of St we can choose W sufficiently small so that the relevant trajectories don't leave St in the time interval [−t1 + t0, 0]. Let K be the range of F.

By Theorem 3.16 of [44], F is continuous in both variables. Since W × [0, t1] is compact, F is a homeomorphism between W × [0, t1] and K. Thus K is a topological manifold of dimension m + 1. Let V = {~x ∈ St : (~x, t0) ∈ K}. Thus V is the set of points whose trajectory resides in W at time t0. This set is a neighborhood of ~x(0) by continuity of F. Wlog (by choosing W sufficiently small) we can assume that V ⊂ U, where U is as in (ii). By condition (5) of Definition 23, F(Ni ∩ Nj) is a union of finitely many submanifolds of dimension m − 1 of K. Now let

V* = {~y(0) ∈ V : ∃t ∈ [0, t1] ~y(t) ∈ Ni ∩ Nj}.

Notice that V* is the projection of F(Ni ∩ Nj) onto V. The projection map is continuous, and condition (ii) implies that its restriction to the compact set F(Ni ∩ Nj) is injective. Thus V* is homeomorphic to F(Ni ∩ Nj) and thus is a union of finitely many manifolds of dimension m − 1. Since int(V) has dimension m, the lemma follows. □

9.5 The Arithmetic Conversion

Description:

I propose a different method of constructing a consistent continuous version of a

Boolean function. This method has the advantage of being more straightforward

computationally but may have some undesirable latent effects.

Arithmetic Method

Jan 3, 2011.

Abstract

The following note is intended for the members of the research group interested in the computational side of our project. I propose a different method of constructing Pi from fi. This method has the manifest function of being more straightforward computationally, but we should be concerned about any latent result which changes the features of Pi.

Discussion

Let S = {s1, . . . , sn} be a set of n binary variables. For H ⊂ S we write ∏H to connote the product (or logical conjunction) of these variables. We call such a product a monomial.

Proposition 29 For any Boolean function f over variables from S there exists T ⊂ P(S) such that

f = ⊕_{H∈T} ∏H. (40)

We use ⊕ to denote addition modulo 2. When f is in this form we will say that f is in polynomial form.

Proof. We can defeat this with a few lemmata.

Lemma 30 ⊕_{H∈T} ∏H is satisfiable when T is non-empty.

Proof: Assume T is not empty. Then let M be a minimal (by inclusion) element of T, and set all the elements of M equal to 1 and all the other elements in S equal to 0. This satisfies ⊕_{H∈T} ∏H, as ∏M = 1 and all the other monomials are zero.

Lemma 31 T ≠ R ⇒ ⊕_{H∈T} ∏H ≠ ⊕_{H∈R} ∏H.

Proof: If we look at the difference (which is really ⊕) of these two functions, we see that it is ⊕_{H∈T△R} ∏H, where △ refers to symmetric difference. But T△R is non-empty when T and R are distinct. So Lemma 30 demonstrates that there is a place where the difference of these functions is non-zero. So the functions are distinct.

Now we’re prepared to tackle the original statement:

One can construct a combinatorial proof of this: There are 22n distinct (by lemma

2) functions in this construction (40) as we can choose 22n different subsets T of the power set P (S). On the other hand, how many functions f exists that

121 f :[n] → {0, 1}? Again 22n unique functions! So these functions in polynomial form account for all the Boolean functions. 

Consider the function:

f1(s) = s1 ∨ s2 (41)

Then we can construct a unique polynomial using ⊕ and our monomials which acts exactly like (41):

f1(s) = s1 ⊕ s2 ⊕ s1s2 (42)

Let’s think of this ⊕ as a single operation and we’ll replace each ⊕ with a simple

+. Note that (42) is equivalent to

f1(s) = s1 + s2 + s1s2mod 2 (43)
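The polynomial form can be computed mechanically from a truth table by the Möbius transform over GF(2). The following function is a sketch of mine, not part of our software:

function c = anf(tt)
% ANF  Coefficients of the polynomial form (40) of a Boolean function.
%   tt(k+1) holds f(s) for the assignment s whose bits spell out k,
%   with s1 as the least significant bit.  c(k+1) = 1 iff the monomial
%   over the variables in the binary expansion of k occurs in T.
c = tt(:)';
n = log2(numel(c));
for i = 1:n                        % in-place Moebius transform over GF(2)
    step = 2^(i - 1);
    for k = 1:numel(c)
        if bitand(k - 1, step)
            c(k) = xor(c(k), c(k - step));
        end
    end
end
end

For instance, the truth table of f1(s) = s1 ∨ s2 is [0 1 1 1], and anf([0 1 1 1]) returns [0 1 1 1]: the monomials s1, s2 and s1s2 are present, which is exactly (42).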

How can we convert these into continuous systems?

For this next section let x = (x1, . . . , xn) ∈ R^n.

In [24] W. Just outlines three properties to look for in a conversion from fi to Pi. Additionally, a specific method is outlined which satisfies these properties. That method starts from fi expressed in terms of the operations ∧ and ∨, drawing its arguments from S repeatedly. I would like to consider another method of constructing Pi.

This new method would begin with the function fi in polynomial form. The string specifying a Boolean function in polynomial form will in general be very large. Another possible drawback of this new construction is what I call “false positives.” If this is a serious problem, then we can add “no false positives” to the list of properties we need to consider when constructing Pi; if it is not a serious problem, then we will have constructed an alternative conversion method. Either way it's a win-win! But we're getting ahead of ourselves; let's see how this conversion method works.

How do we construct Pi?

Let a Boolean function fi be given as in (43). Whenever we see si we will replace it with S(xi), and then we will need to address the mod 2 part of this expression. I will let w act as this mod 2 function; I will describe this in more depth shortly. In the case of (43), we have

Pi = w(S(x1) + S(x2) + S(x1)S(x2)).

Now we know that w is to act as a function that plays the role of modulo 2 addition. So we would like w to take on the value 1 at odd integers and 0 at even integers. I will give two options for how we can build w. Much like the way we have built S, one of these may be easier for theory (the one described by W. Just) and one may be easier for computations (this one was described by B. Elbert). Consider the following analytic function:

w(u) = 0.5 − 0.5 cos(πu).

Another option would be to construct the following piecewise linear (hence Lipschitz continuous) function:

m(u) = 1 − |u − 1| for 0 ≤ u ≤ 2,

and let w be the even, 2-periodic extension of m.
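Both constructions are one-liners in MATLAB; a sketch (the variable names are mine):

% Two candidate implementations of w
w_analytic = @(u) 0.5 - 0.5*cos(pi*u);    % analytic: 0 at even, 1 at odd integers
w_triangle = @(u) 1 - abs(mod(u, 2) - 1); % even, 2-periodic triangle wave
disp([w_analytic(0:3); w_triangle(0:3)])  % both rows read 0 1 0 1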

9.6 An Example Demonstrating this Conversion

Description:

I use an example to demonstrate the software in the light of this arithmetic conversion.

Demonstration of the Arithmetic Conversion

Jan 9, 2011.

Let X = (x1, . . . , xn) be a set of real variables, and let (y1, . . . , y_{2^n})′ be the vector of the products of the subsets of X, ordered anti-lexicographically. We can let π^X refer to such a vector. For example,

π^{x1,x2,x3} = (x1x2x3, x2x3, x1x3, x3, x1x2, x2, x1, 1)′.

Discussion

Consider the following trajectory:

00 → 01 → 11 → 10 → 00 → . . . (1)

This is generated by the rules

f1(s) = s2
f2(s) = ¬s1.

Writing this in polynomial form,

f1(s) = s2
f2(s) = 1 + s1.

From here we move on to the format the computer will use:

f = [ 0 1 0 0; 0 0 1 1 ] (s1s2, s2, s1, 1)′ (2)

In general, one can consider f in this form:

f = R · π^s mod 2, (3)

where R represents the Boolean rules and s = (s1, . . . , sn) is the vector of binary variables. If one tries the following code:

rules = [0 1 0 0; 0 0 1 1];
A = process(rules);
ppic(A, 'bin')

One should arrive at a diagram which confirms the trajectory given by (1). So we know how to get the computer to give us what we want on the Boolean network side. What about on the ODE side?

The conversion into a system of ODEs

Now we know from the previous notes [1, 2] how to convert this into a system of ODEs; what we want to demonstrate is how to automate this process.

Following the methods outlined for D1 in the previous notes, we arrive at the following set of ODEs:

ẋ1 = g(x1) + 6w(S(x2))
ẋ2 = g(x2) + 6w(S(x1) + 1)

Note that we can rewrite this in the following manner:

ẋ = g(x) + 6w( [ 0 1 0 0; 0 0 1 1 ] (S(x1)S(x2), S(x2), S(x1), 1)′ )

It looks like we may be able to use the Boolean rules directly to compute p(x) the same way we use them to compute f(s). Let x ∈ R^n be given. In general we can consider D1(f, ~γ) to be (where f is given as in (3)):

ẋ = ~γ(g(x) + 6w(R π^{S(x)})).

This allows us to use the following code to draw a picture of what's happening:

function [] = drawbn2ode2(rules, initialstate)
% Integrate the D1 counterpart of the Boolean network encoded by 'rules'.
[t, y] = ode45(@xdot, [0 10], initialstate);
plot(t, y)

    function dx = xdot(t, vector)
        Sv = S(vector');               % row vector of sigmoid values
        T = myproduct(Sv);             % column vector of all monomials
        dx = g(vector) + 6*mod2curve(rules*T);
    end
end
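The listing above calls the helpers S, g, myproduct and mod2curve without defining them. The sketches below are my guesses at the intended behavior, consistent with the conventions of these notes, and make the function runnable:

function y = S(x)
% Piecewise linear sigmoid used throughout these notes.
y = min(max((x + 1)/2, 0), 1);
end

function y = g(x)
% The decay cubic read off (22).
y = 3*x - x.^3 - 3;
end

function T = myproduct(Sv)
% MYPRODUCT  Column vector of all monomials in the entries of Sv,
%   ordered anti-lexicographically as in pi^x, ending with the constant 1.
n = numel(Sv);
T = ones(2^n, 1);
for j = 1:2^n
    mask = bitget(2^n - j, 1:n);   % variables entering monomial j (bit 1 = x1)
    T(j) = prod(Sv(mask == 1));    % empty product = 1
end
end

function y = mod2curve(u)
% Continuous stand-in for addition modulo 2 (the function w above).
y = 0.5 - 0.5*cos(pi*u);
end

With these in place, drawbn2ode2([0 1 0 0; 0 0 1 1], [2 2]) integrates the system above and plots two coordinates with sustained out-of-phase oscillations, consistent with the cycle in (1).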

If we give this function the rules defined by R in (2), we get Figure 3. One can confirm that this is indeed consistent (at least for this initial condition) with the trajectory in (1). Note that as this program is written now, the nested function ‘xdot’ is not passed the matrix of rules ‘R’ but rather has complete access to it.

This is good when we consider quenched Boolean networks, but we may want to be able to periodically alter the rules of the Boolean network. Such networks are called annealed networks, and I've removed this possibility only because it lets the function run faster.

References

[1] M. Aldana, S. Coppersmith, and L. P. Kadanoff, Boolean Dynamics with Random Couplings, in Perspectives and Problems in Nonlinear Science (J. Kaplan, E. Marsden, and K. R. Sreenivasan, eds.), Springer Verlag, New York, NY, 2003.

[2] J. Banks, J. Brooks, G. Cairns, G. Davis, and P. Stacey, On Devaney's Definition of Chaos, The American Mathematical Monthly 99 (1992), 332–334.

[3] U. Bastolla and G. Parisi, Relevant elements, magnetization and dynamical properties in Kauffman networks: A numerical study, Physica D: Nonlinear Phenomena 115 (1998), 203–218.

[4] U. Bastolla and G. Parisi, The modular structure of Kauffman networks, Physica D: Nonlinear Phenomena 115 (1998), 219–233.

[5] S. Bilke and F. Sjunnesson, Stability of the Kauffman model, Physical Review E 65 (2001), 016129.

[6] S. N. Coppersmith, L. P. Kadanoff, and Z. Zhang, Reversible Boolean networks I: distribution of cycle lengths, Physica D: Nonlinear Phenomena 149 (2001), 11–29, available at arXiv:cond-mat/0004422.

[7] S. N. Coppersmith, L. P. Kadanoff, and Z. Zhang, Reversible Boolean networks II: Phase transitions, oscillations, and local structures, Physica D: Nonlinear Phenomena 157 (2001), 54–74, available at arXiv:cond-mat/0009019.

[8] M. Davidich and S. Bornholdt, The Transition from Differential Equations to Boolean Networks: A Case Study in Simplifying a Regulatory Network Model, Journal of Theoretical Biology 255 (2008), 269–277.

[9] R. Devaney, An Introduction to Chaotic Dynamical Systems, Addison-Wesley, 1989.

[10] B. Drossel, Number of Attractors in random Boolean networks, Physical Review E 72 (2005).

[11] B. Drossel, T. Mihaljev, and F. Greil, Number and Length of Attractors in a Critical Kauffman Model with Connectivity One, Physical Review Letters 94 (2005), 088701.

[12] B. Elbert, A Quick User's Guide to Boolean–Continuous, 2011. Software documentation.

[13] L. Fortnow, The status of the P versus NP problem, Communications of the Association for Computing Machinery 52 (2009), 78–86.

[14] L. Fortnow and S. Aaronson, Is P versus NP formally independent?, Bulletin of the European Association for Theoretical Computer Science 81 (2003), 109–136.

[15] J. H. Fowler and N. A. Christakis, Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study, British Medical Journal Clinical Research Ed. 337 (2008), a2338.

[16] D. T. Gillespie, Exact stochastic simulation of coupled chemical reactions, The Journal of Physical Chemistry 81 (1977), 2340–2361, available at http://pubs.acs.org/doi/pdf/10.1021/j100540a008.

[17] J. L. G. Guirao, D. Kwietniak, M. Lampart, P. Oprocha, and A. Peris, Chaos on hyperspaces, Nonlinear Analysis: Theory, Methods & Applications 71 (2009), 1–8.

[18] J. L. G. Guirao and M. Lampart, Relations between distributional, Li–Yorke and ω chaos, Chaos, Solitons & Fractals 28 (2006), 788–792.

[19] R. Impagliazzo, R. Paturi, and F. Zane, Which Problems Have Strongly Exponential Complexity?, Journal of Computer and System Sciences 63 (2001), 512–530.

[20] E. L. Ince, Ordinary Differential Equations, Dover Publications, Inc., 1956.

[21] C. Jeffries and J. Perez, Observation of a Pomeau–Manneville intermittent route to chaos in a nonlinear oscillator, Physical Review A 26 (1982), 2117–2122.

[22] W. Just, (In)consistency: Some basic examples, 2010. Research Note.

[23] W. Just, More examples, 2010. Research Note.

[24] W. Just, State of the Folder, 2010. Research Note.

[25] W. Just, Consistent ODE implementation for arbitrary Boolean systems I, 2011. Research Note.

[26] W. Just, Converting Boolean systems into ODE systems, 2011. Research Note.

[27] W. Just, Converting Boolean Systems into ODE Systems: an Example, 2011. Research Note.

[28] W. Just, Definitions of consistency between ODE and Boolean systems, 2011. Research Note.

[29] W. Just, Definitions of Consistency between ODEs and Boolean systems, 2011. Research Note.

[30] W. Just, Strong s-consistency for one-stepping Boolean systems, 2011. Research Note.

[31] K. Hicks, G. Mullen, J. Yucas, and R. Zavislak, A polynomial analogue of the 3n+1 problem, American Mathematical Monthly 115 (2008), 615–622.

[32] S. A. Kauffman, Metabolic Stability and epigenesis in randomly constructed genetic nets, Journal of Theoretical Biology 22 (1969), 437–467.

[33] S. A. Kauffman, Origins of Order, Oxford University Press, Oxford, 1993.

[34] M. Kline, Mathematical thought from ancient to modern times, Oxford University Press, 1972.

[35] M. Korb, Demonstration of the Arithmetic Conversion, 2010. Research Note.

[36] M. Korb, Distance Preserving Bijections on Binary Vectors, Midstates Conference for Undergraduate Research in Computer Science and Mathematics 5 (2010), 16–18.

[37] M. Korb, Research Note on Boolean Networks, 2010. Research Note.

[38] M. Korb, Research Note on Boolean Networks: Some MatLab Codes, 2010. Research Note.

[39] M. Korb, A Note on Time Reversible Networks, 2011. Research Note.

[40] M. Korb, Another Way of Constructing Pi, 2011. Research Note.

[41] M. Korb, Strong Consistency: Some Examples, 2011. Research Note.

[42] P. Laplace, A Philosophical Essay on Probabilities, Dover, 1820.

[43] J. Lü and G. Chen, A New Chaotic Attractor Coined, International Journal of Bifurcation and Chaos 12 (2002), 659–661.

[44] J. D. Meiss, Differential Dynamical Systems, SIAM, 2007.

[45] M. Sipser, Introduction to the Theory of Computation, PWS Publishing Company, 1997.

[46] B. Samuelsson and C. Troein, Superpolynomial Growth in the Number of Attractors in Kauffman Networks, Physical Review Letters 90 (2003), 098701.

[47] J. E. Sasser, History of Ordinary Differential Equations: The First Hundred Years, in Proceedings of the Midwest Mathematics History Conferences (D. E. Cameron and J. D. Wine, eds.), Miami University, Oxford, Ohio, October 2–3, 1992; Modern Logic Publishing, 1997.

[48] J. E. S. Socolar and S. A. Kauffman, Scaling in Ordered and Critical Random Boolean Networks, Physical Review Letters 90 (2003), 068702.

[49] R. M. Solovay and S. Tennenbaum, Iterated Cohen Extensions and Souslin's Problem, The Annals of Mathematics 94 (1971), 201–245.

[50] T. P. Baker, J. Gill, and R. M. Solovay, Relativizations of the P =? NP Question, SIAM Journal on Computing 4 (1975), 431–442.

[51] D. Wittmann, J. Krumsiek, J. Saez-Rodriguez, D. Lauffenburger, S. Klamt, and F. Theis, Transforming Boolean models to Continuous Models: Methodology and Application to T-Cell Receptor Signaling, BMC Systems Biology 3 (2009), 98.

[52] T. Yato and T. Seta, Complexity and completeness of finding another solution and its application to puzzles, Proceedings of the National Meeting of the Information Processing Society of Japan (IPSJ) 5 (2003), 1052–1060.