View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by CERN Document Server

Neutral and Punctuated Equilibrium

in Evolving Genetic Networks

Stefan Bornholdta and Kim Sneppenb

a Institut f¨ur Theoretische Physik, Universit¨at Kiel, Leibnizstr. 15, D-24098 Kiel, Germany

[email protected]

b NORDITA, Blegdamsvej 17, DK-2100 Copenhagen, Denmark

[email protected]

(August 23, 1997)

Abstract

We consider of Boolean networks, and demonstrate how require-

ment of continuity in evolution may leads to punctuated equilibrium. We

discuss evolution of genetic flexibility and how this may reconcile the expo-

nential distribution of lifetimes seen by Van Valen with the power law

distribution of genera lifetimes obtained when one averages over the

record. PACS number(s): 87.10.+e, 02.70.Lq, 05.40.+j

Typeset using REVTEX

1 Evolution of is presumably a random process with selection [1]. It has been discussed whether this process can be viewed as some hill climbing process [2], or whether evolution mostly happens as a random walk where most changes have no influence on the , and thus may be considered as neutral [3]. The case of evolution as in some externally imposed fitness landscapes have been originally proposed by [2], and later formed the basis for models of punctuated equilibrium by Newman [4] and Lande [5]. The case for neutral evolution has been presented by Kimura [3], and is experimentally supported on microlevel by observation of the many functionally identically variants there are of most of the important macromolecules of life. The observation of punctuated equilibrium in the fossil record, recently discussed by Gould and Eldredge [6], may be taken as an indication that evolution of a species consists of exaptations of jumping from one hill top to another nearby in some fitness landscape. Naturally such jumps will be rare, separated by large time intervals where species are located at a fitness peak, and the resulting evolutionary pattern will show punctuations as indeed seen in the fossil record. This picture of single species evolution in a given fixed landscape has been modeled explicitly by Newman and Lande [4,5]. However, also neutral evolution may show punctuations: as, for example, might be visualized by finding the exit in a labyrinth or from finding a golf hole by random walk in a flat landscape. The picture here is that genetic changes always take place, but that the phenotypic changes only rarely occur. This has recently been demonstrated by modeling the evolution of RNA secondary structure by P. Schuster and collaborators [7]. For these molecules most base pair exchanges do not induce any changes in their secondary structure, and thus they may be considered as neutral. Occasionally, however, one may suddenly lead to a complete readjustment of the structure, and presumably therefore to a major change in its functionality. In any case, as demonstrated by Bak and Sneppen [8], punctuated equilibrium on the organism level might be connected to the episodic punctuations observed on the ecosys- tem level. The crucial element of such an extrapolation is that the environment of each

2 species depends on species which are ecological neighbors, thereby allowing punctuations to propagate across the ecosystem. In the present paper we introduce a model of single species evolution without inter-species competition. The evolutionary steps in the model consists of random mutations combined with selection of mutants preserving the phenotype with respect to a given environment. Thus, the only requirement is continuity in phenotype. Other changes in genotype are allowed, creating a path of neutral mutations. We will discuss how this requirement of continuity in evolution may constrain and guide the evolution of an individual species in the face of a constantly changing environment. Our fundamental constituents are the genes of the organism, and the evolution we con- sider is on the genetic network level. Although genetic networks consist of biochemical switches [9], it has been proposed that the on-off nature of these switches can be well approximated by Boolean functions [10–12]. Thus we here consider networks of random Boolean functions. The functionality we test for is attractors of these networks [11]. We model continuity by testing for reaching a given attractor on subsequent steps, but allowing changes that modify attractors that are not tested from the actual initial condition. In sub- sequent steps, the initial condition (modeling the environment) assumes new random values such that previous neutral mutations may then surface in the phenotype. The model may be viewed as coarse grained in time, demanding continuity in fulfilling present demands on a given environment, but opening for silent changes in how the organism may respond to changed environments. We note that the continuity requirement also means that we only evolve species for which at least one neighbour genotype has quite similar dynamics. Thus our continuity requirement favours evolution of robust networks where changing of a few rules does not leads to overall changes of expression. In the simplest form we start the simulation by selecting a genetic network with N genes

and assign each a Boolean variable σi = −1 or 1. Further for each of the N genes we define an updating matrix in the form of a lookup table which determines its output for each of the 2N input states from the N genes in the system. Finally we define which gene is actively

3 connected to which, by a matrix wi,j that defines the input to state i from gene number j as

wi,jσj. The entry value of the connectivity matrix wi,j may take values −1, 0 and 1. A value 0 corresponds to no coupling from gene i to gene j, and is defined to enter the updating matrix by the entry value −1 always. Typically only a fraction of the connectivity matrix entries are in use, and the average number of active inputs per gene is called the connectivity K. It varies between 0 and N, meaning that K includes self couplings. Thus K =0means that all is fixed to the output state specified by input corresponding to (−1, ..., −1) to all states. Initially we start with a low but finite connectivity. In the following, an initial average connectivity of K = 1 coupling per site will be used. Boolean networks can exhibit a rich dynamical behavior, including fixed points, periodic attractors and long transients to reach these. Further the number of attractors, their length, and the length of the transients to reach these are known to depend strongly on the coordi- nation number [13]. In this paper we do not address any questions connected with the time scale of these attractors. Instead we consider a longer evolutionary time scale where the geometry of the network may change. This reminds of evolutionary time scales of biological species as compared with the much shorter lifetimes of organisms.

The evolutionary time step of the system is:

1. Select a random initial state of the system {σi}. Let your mother system evolve from this state until a final attractor is determined.

2. Create a daughter network by a) adding, b) removing, or c) adding and removing a weight in the coupling matrix at random, each option occurring with probability =1/3. Let you daughter system evolve from the same initial state as that selected for the mother and test whether it reaches the same attractor as the mother system did. In case it does then replace mother with daughter network and go to step 3. In case another attractor is reached, keep mother network and go to step 3.

3. Then finally one random bit of the total N ∗ 2N lookup table entries is flipped to

4 another value. This allows for a convenient self averaging of the system, and in fact represents a very slow change.

Iterating these steps makes an evolutionary algorithm that represents the sole require- ment of continuity in evolution and how this may proceed under an environment that fluctu- ates. No selective pressure is applied. However, by not testing the entire basin structure of the network we allow for silent mutations in the organism that may surface in a later changed environment. In our model this new environment is represented by the new input vector selected in the next time step. In practice this new environment may well represent either physical changes in the environment or maybe changes in neighbor species in an ecosystem, that we do not attempt to model here. In figure 1a we show how the connectivity of this system evolves with time in a network of size N = 16. One observes that the typical connectivity of the network is confined to lower values than for random networks. This is further quantified in Figure 2 where the distributions of average connectivities are displayed in the statistically stationary state. All data are taken beginning after an equilibration time of ten percent of the length of the total run. Notice that there are two distributions, one counting the frequency of connectivities for all new “species” and one counting the time averaged distribution of connectivities. These two distributions diverge strongly for high connectivities because the few species with high connectivity have very long lifetimes, i.e., it is very difficult to find mutations which do not change the activity pattern of the networks completely. In our case, the activity pattern consists of the transient and the final periodic attractor following the given initial state. The time scale of these patterns becomes large for networks with high connectivity, making it more difficult to keep the exact dynamic pattern under the mutation of a weight. In popular terms, an increased complexity of the network makes further evolution difficult. One may speculate that this is the reason that real genetic networks keep connectivity low: It will be easier to evolve by increasing the number of genes N at a fairly low connectivity level (the present model, however, does not consider variable N). In Figure 1a we further see

5 that marked punctuations occur, where long periods of nearly fixed average connectivity sometimes are interrupted by a sudden change in connectivity. This interplay between long waiting times and short times for actual changes is observed in the fossil record. The phenomenon has been coined “punctuated equilibrium” by Gould and Eldredge. As also seen from Figure 1b, the periods of stasis show a similar structure on shorter time scales as they do on longer time scales. Thus the stasis time distribution may have long tails. This is explored further in Figure 3a where we show this distribution averaged over the

simulation. Approximately the distribution of stasis times is ∝ 1/t2. Periods of stasis at high values of the average connectivity can become very long, which in practice calls for extremely long equilibration times. Thus, the simulation is mainly sensitive to a feasible range of equilibration times limited by the duration of the run. The corresponding (left) part of the stasis time distribution exhibits a smooth average. In the right part of this distribution, corresponding to long stasis times, sample to sample fluctuations remain [14]. In Figure 3b we decompose the stasis time distribution into times obtained for different values of the average connectivity. Again we observe that higher connectivities typically show longer stasis times. Remarkably, when looking at the statistics of a small interval of low K values, we observe exponentially distributed stasis times. The power law behavior then comes about by averaging over the range of all K values. In to test for the robustness of our model we tried other mutation rules (again without any evolutionary pressure, i.e. symmetric in adding and removing weights). In one variant a daughter network was created by a) adding or b) removing a weight in the coupling matrix at random, each option occurring with probability = 1/2, thus allowing for K-changing mutations only. In another variant a daughter network was created by independently adding a random weight with probability = 1/2 and removing a random weight with probability = 1/2. The simulation results with these rules are identical to those described above. Let us briefly discuss the meaning of the stasis times and punctuations observed here. According to the definition of our model above, we quantify waiting times in terms of the

6 number of times mutant networks are exposed to new environments before a neutral mutation occurs that fulfills continuity. Thus they are not to be confused with the “neutral evolution” introduced by Kimura [3] which leads to waiting times consisting of a number of neutral mutations. Furthermore, Kimura’s model is defined at the molecular level, whereas here we consider evolution at the level of genetic networks. The genetic networks are formally defining a “species” and the length of the waiting times indicates the “genetic flexibility” of a species. Associating the interconnectedness of our model networks with the genetic flexibility of real organisms one may attempt to understand a puzzling decomposition of lifetimes of species in the fossil record. First it was noted by Van Valen [15] that each group of closely related species have exponentially distributed lifetimes. Second, an analysis of the overall distribution of genera lifetimes, tabulated by Raup and Sepkowski [16], showed that this is rather distributed as ∝ 1/t2 [17] for genera lifetimes exceeding 15 million years. It is tempting to speculate that groups of closely related species are associated to the same genetic flexibility, and thus may evolve, and eventually get extinct, with a frequency given by this genetic flexibility. This would explain the exponetial distribution of Van Valens. Averaging over all genetic flexibilities may then be an average over different characteristic lifetimes, and our simplified evolution scenario demonstrates how such an averaging can give an overall 1/t2 distribution. We note that the characteristic distribution of 1/t2 cannot be obtained on the basis of fitness landscapes alone, where one rather gets a distribution ∝ 1/t corresponding to a sampling of passing times over a distribution of barriers [17]. In conclusion we have studied evolution of Boolean networks in absence of any com- petition. This simplification allowed us to discuss how the requirement of evolving robust networks in itself may lead to an evolution which exhibits punctuated equilibrium.

7 ACKNOWLEDGMENTS

S.B. thanks NORDITA, Copenhagen, for kind support and warm hospitality and the Deutsche Forschungsgemeinschaft for supporting this work.

8 REFERENCES

[1] C. Darwin, The Origin of Species by Means of , Harmondsworth, 1859.

[2] S. Wright, Evolution 36 427 (1982).

[3] M. Kimura, The Neutral Theory of , Cambridge University Press, Cambridge (U.K), 1983.

[4] C.M. Newman, J.E. Cohen and C. Kipnis, Nature 315 400 (1985).

[5] R. Lande, Procedings National Academy of Sciences, USA 82 7641 (1985).

[6] S.J. Gould and N. Eldredge, Nature 366, 223 (1993).

[7] P. Schuster, J. Weber, W. Gruner and C. Reidys, in: Physics of Biological Systems, From Molecules to Species, Springer Verlag, Berlin, 1997.

[8]P.BakandK.Sneppen,Phys.Rev.Lett.71 4083 (1993).

[9]M.Ptashne,A Genetic Switch, Cell Press & Blackwell Scientific Publications, 1992.

[10] S.A. Kauffman, J. Theor. Biol. 22 437 (1969).

[11] R. Somogyi and C.A. Sniegoski, Complexity (1996) 45-63.

[12] J.A.Sales, M.L. Martins and D.A. Stariolo, Phys. Rev. E55 (1997) 3262

[13] S. Kauffman, Physica D 42 135 (1990).

[14] Note that in the small networks considered here, some finite size effects occur for con- nectivities K close to the network size N. We therefore consider simulations where K remained at lower values (K<13) which is fulfilled by most of the simulations we did.

[15] L. Van Valen, Evolutionary Theory 1, 1 (1973).

[16] D. Raup, Bad genes or Bad luck? W. N. Norton & Company, New York, 1991.

[17] K. Sneppen, P. Bak, H. Flyvbjerg, and M.H. Jensen, Proceedings National Academy of

9 Science, USA, 92 5209 (1995).

[18] R.V. Sole and J. Bascompte, Proc. R. Soc. Lond. B263 161 (1996).

10 FIGURES FIG. 1. Evolution of the genetic network connectivity with time (a) and closeup on a part of the connectivity evolution (b).

FIG. 2. Distributions of average connectivities in the statistically stationary state. The fre- quency of connectivities for all new “species” is shown, as well as the time averaged distribution of connectivities.

FIG. 3. Stasis time distribution in the neutral evolution of networks (a) and the decomposition of stasis time distributions for different intervals of the average connectivity (b).

11 16

14

12

10

K 8

6

4

2

0 0 5e+06 1e+07 1.5e+07 2e+07 t 16

14

12

10

K 8

6

4

2

0 7e+06 7.2e+06 7.4e+06 7.6e+06 7.8e+06 8e+06 t 0.03 K new K

0.025

0.02

0.015

0.01

0.005

0 0 1 2 3 4 5 6 7 8 K 1e+06

100000

10000

1000

100

10 N(T)

1

0.1

0.01

0.001

0.0001 1 10 100 1000 10000 100000 T 1e+06 0<=K<2 2<=K<4 4<=K<6 100000 6<=K<8

10000

1000 N(T) 100

10

1

0.1 1 10 100 1000 10000 T