arXiv:1008.3800v1 [nlin.AO] 23 Aug 2010

Network Complexity of Foodwebs

Russell Standish
School of Mathematics and Statistics, University of New South Wales

Abstract

In previous work, I have developed an information theoretic complexity measure of networks. When applied to several real world food webs, there is a distinct difference in complexity between the real food web, and randomised control networks obtained by shuffling the network links. One hypothesis is that this complexity surplus represents information captured by the evolutionary process that generated the network. In this paper, I test this idea by applying the same complexity measure to several well-known artificial life models that exhibit ecological networks: Tierra, EcoLab and Webworld. Contrary to what was found in real networks, the artificial life generated foodwebs had little information difference between themselves and randomly shuffled versions.

Introduction

In Standish (2005), I developed a method for computing the information complexity of a network. In Standish (2010a), I refined and generalised the method to overcome a problem with higher complexity values of partially filled networks relative to empty and full networks of the same degree, as well as taking account of link weights. Coupled with methods for computing automorphism group size, this network complexity measure is practical for networks of several thousand nodes.

In Standish (2010a), I studied several published datasets of natural networks, including a number of foodwebs available from the Pajek website, and the neural network of C. elegans (see Table 1). In most cases, these networks exhibited significantly heightened complexity values compared with those of control networks obtained by shuffling the links in a random fashion. This leads to the hypothesis that evolutionary processes tend to produce networks with a complexity surplus (∆) compared with random assembly processes.

In this work, I apply the same methods to networks created by artificial life evolutionary systems, in particular the interaction network of Tierra (Ray, 1991) and the foodwebs of EcoLab (Standish, 1994) and Webworld (Caldarelli et al., 1998).

Complexity as Information

The notion of using information content as a complexity measure is fairly simple. In most cases, there is an obvious prefix-free representation language within which descriptions of the objects of interest can be encoded. There is also a classifier of descriptions that can determine if two descriptions correspond to the same object. This classifier is commonly called the observer, denoted O(x).

To compute the complexity of some object x, count the number of equivalent descriptions ω(ℓ, x) of length ℓ that map to the object x under the agreed classifier. Then the complexity of x is given in the limit as ℓ → ∞:

$$C(x) = \lim_{\ell\to\infty} \left( \ell \log_2 N - \log_2 \omega(\ell, x) \right) \qquad (1)$$

where N is the size of the alphabet used for the representation language.

Because the representation language is prefix-free, every description y in that language has a unique prefix of length s(y). The classifier does not care what symbols appear after this unique prefix. Hence $\omega(\ell, O(y)) \ge N^{\ell - s(y)}$. As ℓ increases, ω must increase as fast, if not faster than $N^{\ell}$, and do so monotonically. Therefore the quantity inside the limit, evaluated for O(y), decreases monotonically with ℓ but is bounded below by 0, so equation (1) converges.

To use this formalism with networks, we need to fix two things: a prefix-free representation language, and how to decide when two networks are identical, which will be used to count the representations of a given network. In this context, ignoring link weights, two networks are considered identical if the nodes of one can be placed over the nodes of the second one, such that the links correspond exactly. They are topologically identical. We ignore any labels on the nodes or links.

Network bitstring representation

To represent the network as a bitstring, we need to store the node count (n) and link count (l), as well as a representation of the adjacency matrix. The initial part of the string has ⌈log2 n⌉ ‘1’ bits, followed by a single ‘0’ stop bit. Following that are w = ⌈log2 n⌉ bits representing the value of n in binary.

Dataset             nodes  links        C   e^⟨ln C_ER⟩   ∆ = C − e^⟨ln C_ER⟩   |ln C − ⟨ln C_ER⟩| / σ_ER
celegansneural        297   2345    442.7         251.6                 191.1                        29
celegansmetabolic     453   4050  25421.8       25387.2                  34.6                         ∞
lesmis                 77    508    199.7         114.2                  85.4                        24
adjnoun               112    850   3891          3890                     0.98                        ∞
yeast                2112   4406  33500.6       30218.2                3282.4                     113.0
baydry                128   2138    126.6          54.2                  72.3                        22
baywet                128   2107    128.3          51.0                  77.3                        20
cypdry                 71    641     85.7          44.1                  41.5                        13
cypwet                 71    632     87.4          42.3                  45.0                        14
gramdry                69    911     47.4          31.6                  15.8                        10
gramwet                69    912     54.5          32.7                  21.8                        12
Chesapeake             39    177     66.8          45.7                  21.1                      10.4
ChesLower              37    178     82.1          62.5                  19.6                      10.6
ChesMiddle             37    208     65.2          48.0                  17.3                       9.3
ChesUpper              37    215     81.8          60.7                  21.1                      10.2
CrystalC               24    126     31.1          24.2                   6.9                       6.4
CrystalD               24    100     31.3          24.2                   7.0                       6.2
Everglades             69    912     54.5          32.7                  21.8                      11.8
Florida               128   2107    128.4          51.0                  77.3                      20.1
Maspalomas             24     83     70.3          61.7                   8.6                       5.3
Michigan               39    219     47.6          33.7                  14.0                       9.5
Mondego                46    393     45.2          32.2                  13.0                      10.0
Narragan               35    219     58.2          39.6                  18.6                      11.0
Rhode                  19     54     36.3          30.3                   6.0                       5.3
StMarks                54    354    110.8          73.6                  37.2                      16.0

Table 1: Complexity values of several freely available network datasets, as reported in Standish (2010a). For each network, the number of nodes and links are given, along with the computed complexity C. In the fourth column, the original network is shuffled 1000 times, and the logarithm of the complexity is averaged (⟨ln C_ER⟩). The fifth column gives the difference between these two values, which represents the information content of the specific arrangement of links. The final column gives a measure of the significance of this difference in terms of the number of standard deviations (“sigmas”) of the distribution of shuffled networks. In two examples, the distribution of shuffled networks had zero standard deviation, so ∞ appears in this column.

Knowing the value of n, the number of bits needed to represent l is ⌈log2 L⌉, where L = n(n − 1)/2, so l is stored in a field of that width.

For the final part of the string, the linkfield, we can represent the adjacency matrix such that a ‘1’ bit in position i(n − 1) + j-th represents a link from node i to j if ji, where nodes are numbered 0 ... n − 1,

network. However, this is highly combinatoric, and only really feasible for small networks. However, the number of representations can also be computed by dividing the total number of possible renumberings of the nodes (n!) by the size of the automorphism group, for which several Darga et al., 2008). Even though each of these algorithms i
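As a rough illustration of the counting argument, the following Python sketch computes the description length of the bitstring and the number of equivalent descriptions for a small undirected, unweighted network. It is an assumption-laden sketch of the simple measure implied by equation (1), not the refined measure of Standish (2010a): the function names are hypothetical, the automorphism count is brute force over all n! relabellings (so only usable for very small n), and the bitstring fields are only measured, not emitted.

```python
from itertools import permutations
from math import ceil, factorial, log2


def description_length(n):
    """Bits in one description: unary width prefix, n, l, then the linkfield.

    Assumes n >= 2 and an undirected network, so L = n(n-1)/2 link slots.
    """
    w = ceil(log2(n))        # width of the field holding n
    L = n * (n - 1) // 2     # number of possible links, as in the text
    return (w + 1) + w + ceil(log2(L)) + L


def automorphisms(n, edges):
    """Brute-force count of node relabellings that leave the link set unchanged."""
    links = {frozenset(e) for e in edges}  # assumes no self-loops
    return sum(
        1
        for p in permutations(range(n))
        if {frozenset((p[a], p[b])) for a, b in links} == links
    )


def complexity_bits(n, edges):
    """Description length minus log2 of the number of equivalent descriptions."""
    representations = factorial(n) // automorphisms(n, edges)
    return description_length(n) - log2(representations)


# A four-node path has two automorphisms: the identity and the reversal.
path = [(0, 1), (1, 2), (2, 3)]
print(automorphisms(4, path))
print(complexity_bits(4, path))
```

For networks of realistic size the brute-force loop would have to be replaced by a dedicated automorphism algorithm of the kind cited in the text.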

$$\dot{n}_i = r_i n_i + \sum_j \beta_{ij} n_i n_j, \qquad (3)$$

where n_i is the population density of species i, r_i its growth rate and β_ij the interaction matrix. Extinction is handled via a novel stochastic truncation algorithm, rather than the more usual threshold method. Speciation occurs by randomly mutating the ecological parameters (r_i and β_ij) of the parents, subject to the constraint that the system remain bounded (Standish, 2000).

The interaction matrix is a candidate foodweb, but has too much information. Its off-diagonal terms may be negative as well as positive, whereas for the complexity definition (2), we need the link weights to be positive. There are a number of ways of resolving this issue, such as ignoring the sign of the off-diagonal term (ie taking its absolute value), and antisymmetrising the matrix by subtracting its transpose, then using the sign of the off-diagonal term to determine the link direction.

For the purposes of this study, I chose to subtract each negative β_ij term from both itself and its transpose term β_ji. This effects a maximal encoding of the interaction matrix information in the network structure, with link direction and weight encoding the direction and size of resource flow. The effect is as follows:

Results

Methods and materials

Tierra was run on a 512KB soup, with SlicePow set to 1, until the soup died, typically after some 5 × 10^10 instructions had executed. Some variant runs were performed with SlicePow=0.95, and with different random number generators, but no difference in the outcome was observed.

The source code of Tierra 5.0 was modified in a few places, as described in the Tierra section of this paper. The final source code is available as tierra.5.0.D7.tar.gz from the EcoLab website hosted on SourceForge (http://ecolab.sf.net).

The genebanker output was processed by the eco-tierra.3.D13 code, also available from the EcoLab website, to produce a list of phenotype equivalents for each genotype. A function for processing the interaction log file generated by Tierra and producing a timeseries of foodweb graphs was added to Eco-tierra. The script for running this postprocessing step is process ecollog.tcl.

The EcoLab model was adapted to convert the interaction matrix into a foodweb and log the foodweb to disk every 1000 time steps for later processing. The Webworld model

[Figure 1 plot: C and 100∆, in bits, against instructions executed (×10^10), for the normal RNG and Havege.]

Figure 1: Complexity of the Tierran interaction network for SlicePow=0.95, and ∆, exaggerated by a factor of 100. Two different random number generators were used, Havege and the normal linear congruential generator supplied with Tierra.
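The ∆ plotted alongside C is the measured complexity minus the geometric mean of the complexities of link-shuffled control networks. A minimal Python sketch of that bookkeeping follows. Everything named here is hypothetical: `shuffle_links` assumes a control is made by drawing the same number of directed links uniformly at random without self-loops, `surplus` assumes the population standard deviation is used, and the complexity values of the controls are supplied by the caller rather than computed.

```python
import math
import random
import statistics


def shuffle_links(n, link_count, rng):
    """Draw link_count distinct directed links (i, j), i != j, uniformly at random."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    return rng.sample(pairs, link_count)


def surplus(c, shuffled_c):
    """Return (delta, sigmas) for a complexity c and its shuffled controls.

    delta  = c - exp(mean of ln C over the controls)
    sigmas = |ln c - mean| / standard deviation of ln C over the controls
    """
    logs = [math.log(v) for v in shuffled_c]
    mean_log = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)  # assumption: population standard deviation
    delta = c - math.exp(mean_log)
    sigmas = abs(math.log(c) - mean_log) / sigma if sigma > 0 else math.inf
    return delta, sigmas


# Illustrative numbers only, not results from the paper.
print(surplus(60.0, [48.0, 50.0, 52.0, 49.0]))
```

When every control has the same complexity the standard deviation is zero and the significance is reported as infinite, which is how an ∞ entry can arise in a table of such values.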

[Figure 2 plot: C and 100∆, in bits, against instructions executed (×10^10), for the normal RNG and Havege.]

Figure 2: Complexity of the Tierran interaction network for SlicePow=1, and ∆, exaggerated by a factor of 100. Two different random number generators were used, Havege and the normal linear congruential generator supplied with Tierra.

[Figure 3 plot: C and 100∆, in bits, against timesteps (×10^7).]

Figure 3: Complexity of EcoLab’s foodweb, and ∆, exaggerated by a factor of 100, as described in the text.
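EcoLab's foodweb is derived from the interaction matrix of its Lotka-Volterra dynamics. The Python sketch below shows one possible reading of that step, under stated assumptions: `growth_rates` evaluates the right-hand side of equation (3), and `interaction_to_foodweb` moves every negative off-diagonal term onto its transpose position with the sign flipped, so that all link weights end up non-negative. The function names, the convention that `w[i][j]` is the flow into species i from species j, and the decision to ignore diagonal terms are assumptions made for illustration, not taken from the EcoLab source.

```python
def growth_rates(n, r, beta):
    """Right-hand side of equation (3): r_i n_i + sum_j beta_ij n_i n_j."""
    size = len(n)
    return [
        r[i] * n[i] + sum(beta[i][j] * n[i] * n[j] for j in range(size))
        for i in range(size)
    ]


def interaction_to_foodweb(beta):
    """Build a non-negative weight matrix from a signed interaction matrix.

    Positive beta[i][j] stays at (i, j); negative beta[i][j] is removed from
    (i, j) and its magnitude is added at the transpose position (j, i).
    Diagonal (self-interaction) terms are ignored, which is an assumption.
    """
    size = len(beta)
    w = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i == j:
                continue
            if beta[i][j] > 0:
                w[i][j] += beta[i][j]
            elif beta[i][j] < 0:
                w[j][i] -= beta[i][j]  # subtracting a negative adds its magnitude
    return w


# Species 0 gains from species 1, and species 1 loses to species 0.
beta = [[-1.0, 0.5], [-0.25, -1.0]]
print(interaction_to_foodweb(beta))
print(growth_rates([2.0, 4.0], [1.0, -1.0], beta))
```

In this two-species example both off-diagonal terms describe the same predator-prey relationship, so they combine into a single weighted link and the reverse link is empty.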

[Figure 4 plot: C and 100∆, in bits, against timesteps (×10^7).]

Figure 4: Complexity of Webworld’s foodweb, and ∆, exaggerated by a factor of 100, as described in the text.

was adapted similarly. The model parameters were as documented in the included ecolab.tcl and webworld.tcl experiment files of the ecolab.4.D37 distribution, which is also available from the EcoLab website.

Finally, each foodweb, and 100 link-shuffled control versions, were run through the network complexity algorithm (2). This is documented in the cmpERmodel.tcl script of ecolab.4.D37. The average and standard deviation of ln C were calculated, rather than of C directly, as the shuffled complexity values fitted a log-normal distribution better than a standard normal distribution. The difference between the measured complexity and exp⟨ln C⟩ (ie the geometric mean of the control network complexities) is what is reported as ∆ in Figures 1–4.

Discussion

It can be seen from Figures 1–4 that none of the artificial life models studied generate substantially greater network complexities than do the control networks. By “substantially”, I mean more than 10% of the total network complexity. The complexity difference that exists is nevertheless often statistically significant, albeit small (of the order of a few bits).

By contrast, most of the 26 practical networks studied in Standish (2010a) exhibited substantially greater complexities than their controls, the exceptions being the David Copperfield adjective-noun adjacency dataset (0.98 bits), and the C. elegans metabolic network (which at 34.6 bits is about 0.1% of the total complexity).

The complete failure of several independent artificial evolutionary systems to generate this complexity surplus weakens the case for the surplus being due to the operation of an evolutionary process. It is possible that this is another illustration of the difference between artificial evolutionary systems and natural evolutionary systems observed with Bedau-Packard statistics (Bedau et al., 1998). There is also the possibility that some systematic artifact skews the observational data towards more symmetric networks (which increases complexity values); however, it seems implausible that networks collected by many different observers in many different fields should exhibit the same systematic error. More work needs to be done applying this complexity metric to both artificially evolved networks and observational data of naturally evolved networks to elucidate whether this is an artifact or a real phenomenon.

Conclusion

In this work, I measured the network complexity of several artificially evolved foodwebs to see if I could reproduce the complexity surplus seen in empirical network data. In none of the artificial systems I studied was the complexity surplus substantial enough to be considered a real effect.

References

Bedau, M. A., Snyder, E., and Packard, N. H. (1998). A classification of long-term evolutionary dynamics. In Adami, C., Belew, R., Kitano, H., and Taylor, C., editors, Artificial Life VI, pages 228–237, Cambridge, Mass. MIT Press.

Caldarelli, G., Higgs, P. G., and McKane, A. J. (1998). Modelling coevolution in multispecies communities. J. Theor. Biol., 193:345–358.

Darga, P. T., Sakallah, K. A., and Markov, I. L. (2008). Faster symmetry discovery using sparsity of symmetries. In Proceedings of the 45th Design Automation Conference, Anaheim, California.

Drossel, B., Higgs, P. G., and McKane, A. J. (2001). The influence of predator-prey population dynamics on the long-term evolution of food web structure. J. Theor. Biol., 208:91–107.

Görnerup, O. and Crutchfield, J. P. (2008). Hierarchical self-organization in the finitary process soup. Artificial Life, 14:245–254.

McKay, B. D. (1981). Practical graph isomorphism. Congressus Numerantium, 30:45–87.

Myrvold, W. and Ruskey, F. (2001). Ranking and unranking permutations in linear time. Information Processing Letters, 79:281–284.

Ray, T. (1991). An approach to the synthesis of life. In Langton, C. G., Taylor, C., Farmer, J. D., and Rasmussen, S., editors, Artificial Life II, page 371. Addison-Wesley, Reading, Mass.

Standish, R. K. (1994). Population models with random embryologies as a paradigm for evolution. Complexity International, 2.

Standish, R. K. (2000). The role of innovation within economics. In Barnett, W., Chiarella, C., Keen, S., Marks, R., and Schnabl, H., editors, Commerce, Complexity and Evolution, volume 11 of International Symposia in Economic Theory and Econometrics, pages 61–79. Cambridge UP.

Standish, R. K. (2003). Open-ended artificial evolution. International Journal of Computational Intelligence and Applications, 3:167. arXiv:nlin.AO/0210027.

Standish, R. K. (2004a). Ecolab, Webworld and self-organisation. In Pollack et al., editors, Artificial Life IX, page 358, Cambridge, MA. MIT Press.

Standish, R. K. (2004b). The influence of parsimony and randomness on complexity growth in Tierra. In Bedau et al., editors, ALife IX Workshop and Tutorial Proceedings, pages 51–55. arXiv:nlin.AO/0604026.

Standish, R. K. (2005). Complexity of networks. In Abbass et al., editors, Recent Advances in Artificial Life, volume 3 of Advances in Natural Computation, pages 253–263, Singapore. World Scientific. arXiv:cs.IT/0508075.

Standish, R. K. (2010a). Complexity of networks (reprise). Artificial Life. submitted. arXiv: 0911.348.

Standish, R. K. (2010b). SuperNOVA: a novel algorithm for graph automorphism calculations. Journal of Algorithms - Algorithms in Cognition, Informatics and Logic. submitted, arXiv: 0905.3927.

[Plot without caption: complexity on the horizontal axis, with two fitted curves of log-normal and normal form.]
