Macroecological and macroevolutionary patterns emerge in the universe of GNU/ operating systems Petr Keil, A. A. M. Macdonald, Kelly S. Ramirez, Joanne M. Bennett, Gabriel E. Garcia-Pena, Benjamin Yguel, Bérenger Bourgeois, Carsten Meyer

To cite this version:

Petr Keil, A. A. M. Macdonald, Kelly S. Ramirez, Joanne M. Bennett, Gabriel E. Garcia-Pena, et al.. Macroecological and macroevolutionary patterns emerge in the universe of GNU/Linux operating systems. Ecography, Wiley, 2018, 41 (11), pp.1788-1800. ￿10.1111/ecog.03424￿. ￿hal-02621181￿

HAL Id: hal-02621181 https://hal.inrae.fr/hal-02621181 Submitted on 26 May 2020

HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés.

Distributed under a Creative Commons Attribution| 4.0 International License doi: 10.1111/ecog.03424 41 1788– 1800 ECOGRAPHY Research Macroecological and macroevolutionary patterns emerge in the universe of GNU/Linux operating systems

Petr Keil, A. A. M. MacDonald, Kelly S. Ramirez, Joanne M. Bennett, Gabriel E. García-Peña, Benjamin Yguel, Bérenger Bourgeois and Carsten Meyer

P. Keil (http://orcid.org/0000-0003-3017-1858) ([email protected]) and J. M. Bennett, German Centre for Integrative Biodiversity Research (iDiv) Halle- Jena-Leipzig, Leipzig, . JMB also at: Inst. of Biology, Martin Luther Univ. Halle-Wittenberg, Halle (Saale), Germany. – A. A. M. MacDonald, G. E. García-Peña and B. Bourgeois, Centre de Synthèse et d’Analyse sur la Biodiversité – CESAB, Aix-en-Provence, France. AAMM also at: Univ. of British Columbia, Vancouver, BC, . GEG-P also at: Centro de Ciencias de la Complejidad (C3) y Facultad de Medicina Veterinaria y Zootecnia, México. BB also at: Agroécologie – Agrosup Dijon, Inst. national de la recherche agronomique (INRA), Univ. Bourgogne Franche-Comté, Dijon, France. – K. S. Ramirez, Terrestrial Ecology, Netherlands Inst. of Ecology, Wageningen, the Netherlands. – B. Yguel, Unité MECADEV mécanismes adaptatifs et évolution, UMR 7179 CNRS/MNHN, Brunoy, France, and Centre d’Ecologie et des Sciences de la Conservation (CESCO-UMR 7204), Sorbonne Univ.-MNHN-CNRS-UPMC, CP51, Paris, France. – . Meyer, Macroecology and Society, German Centre for Intergrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany, and Faculty of Biosciences, Pharmacy and Psychology, Univ. of Leipzig, Leipzig, Germany, and Dept of Ecology and Evolutionary Biology, Yale Univ., New Haven, CT, USA.

Ecography What leads to classically recognized patterns of biodiversity remains an open and con- 41: 1788–1800, 2018 tested question. It remains unknown if observed patterns are generated by biological doi: 10.1111/ecog.03424 or non-biological mechanisms, or if we should expect the patterns to emerge in non- biological systems. Here, we employ analogies between GNU/Linux operating systems Subject Editor: Thiago Rangel (distros), a non-biological system, and biodiversity, and we look for a number of well- Editor-in-Chief: Miguel Araújo established ecological and evolutionary patterns in the Linux universe. We demon- Accepted 2 February 2018 strate that patterns of the Linux universe generally match macroecological patterns. Particularly, Linux distro commonness and rarity follow a skewed distribution with a clear excess of rare distros, we observed a power law mean-variance scaling of temporal fluctuation, but there is only a weak relationship between niche breadth (number of software packages) and commonness. The diversity in the Linux universe also follows general macroevolutionary patterns: the number of phylogenetic lineages increases lin- early through time, with clear per-species diversification and extinction slowdowns, something that has been indirectly estimated, but not directly observed in biology. Moreover, the composition of functional traits (software packages) exhibits significant phylogenetic signal. The emergence of macroecological patterns across Linux suggests that the patterns are produced independently of system identity, which points to the possibility of non-biological drivers of fundamental biodiversity patterns. At the same time, our study provides a step towards using Linux as a model system for exploring macroecological and macroevolutionary patterns.

Keywords: complex systems, cultural evolution, , speciation, species-abundance distribution, Taylor’s power law

–––––––––––––––––––––––––––––––––––––––– © 2018 The Authors. This is an Online Open article www.ecography.org This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any 1788 medium, provided the original work is properly cited. Introduction also been a continuous feedback between evolutionary biol- ogy, ecology and economics (Malthus 1798, Maynard Smith Despite the general paucity of strict laws in ecology and evo- 1982, Worster 1994, Bonds et al. 2012), and some adoption lution (Lawton 1999), there are several quantitative patterns of ecological and evolutionary principles for understanding that are consistently observed across taxonomic, geographi- the development and maintenance of software (Fortuna et al. cal and temporal scales. For example, abundances of spe- 2011, Valverde and Sole 2015). However, these fields have cies generally have skewed frequency distributions, with been emphasizing micro-evolutionary, population-genetic, many more rare species than abundant ones (Gaston 1996a, or population-dynamic processes, lacking an evaluation of McGill et al. 2007, Morlon et al. 2009), and there is a the emerging macro-ecological and macro-evolutionary pat- power-law relationship between the mean and the variance of terns. The notable exceptions (Mace and Pagel 1995, Nekola temporal abundance fluctuations (Taylor and Woiwod 1980, and Brown 2007, Blonder et al. 2014, Gjesfjeld et al. 2016) Kendal 2004). In general, species diversity decreases from the often focus on one selected pattern or on multiple but closely equator towards the poles (Rosenzweig 1995, Gaston 2000) related ecological patterns (Blonder et al. 2014). While and increases with area (Rosenzweig 1995, Drakare et al. there are many examples of non-biological systems follow- 2006). Species with large geographic distributions tend to ing patterns observed in biology, we are unaware of a system have wider niches, and vice versa (Slatyer et al. 2013). Species simultaneously exhibiting macroecological and macroevolu- with similar functional traits are often phylogenetically related tionary patterns. (Darwin 1859, Losos 2008). Diversification of evolutionary Here, we propose structural analogies between biological lineages slows down over evolutionary time (Rabosky and diversity and the universe of open GNU/Linux-based Lovette 2008, Rabosky and Glor 2010, Moen and Morlon computer operating systems (hereafter Linux). We have 2014), although generality of the latter pattern has been ques- chosen Linux, since its components are all openly available tioned (Harmon and Harrison 2015, Graham et al. 2016). and well-documented, and because a large number of struc- To explain these patterns, we can invoke uniquely eco- tural analogies can be drawn with the biological system of logical and evolutionary processes: the patterns could be evolution, distribution and abundance of biological species, an outcome of assembly rules, natural selection, behav- hereafter biodiversity (see Table 1 and Material and meth- ior, species interactions, or interplay between specific ods for details). Based on these analogies, we test the Linux functional traits and environments. However, it has been universe for classical macroecological and macroevolution- demonstrated that some of the patterns are not unique to ary patterns and relationships, and we discuss the structural ecological and evolutionary systems, and often emerge in properties that likely generate common patterns in both other complex systems (Gaston et al. 1993, Mace and Pagel biological and computer operating systems. 1995, Bettencourt et al. 2007, Nekola and Brown 2007, Warren et al. 2011, Blonder et al. 2014, Scheffer et al. 2017). Examples are: species-abundance distributions of music fes- Material and methods tival setlists (Nekola and Brown 2007) and basketball wins by a team (Warren et al. 2011), frequency distributions of Biodiversity structures in Linux components of software (Pang and Maslov 2013), latitudi- nal gradients of language diversity (Mace and Pagel 1995), The development and distribution of GNU/Linux operating species–area relationships in corporations, industrial codes, systems uses various open-source licenses (e.g. GNU GPL2, and minerals (Blonder et al. 2014), and diversification www.gnu.org), which allows the code to be freely copied slowdowns in American automobiles (Gjesfjeld et al. 2016). and modified. This has led to the development of hundreds To explain such conspicuous universality, theories have been of operating systems, connected in a branching pattern of proposed that predict the patterns across different systems for descent with modification. These different operating sys- pure statistical reasons, given that the systems share struc- tems possess different applications, and are used with vari- tural constraints (Frank 2009, Sizling et al. 2009, Harte ous degrees of popularity. We propose that each Linux-based 2011, Blonder et al. 2014). One such typical constrain is the , commonly referred to as a distro (Table 1) partition of populations of objects into categories. can be viewed as a lineage or species (Mens et al. 2014a, b). However, structural constraints are not the only way Our decision to give a Linux distro a distinct name, rather that biological and non-biological systems can resemble one than calling it another version of an existing distro, was based another – similarity may also arise from analogous underly- on developers’ subjective view that the distro is sufficiently ing processes. Striking examples can be found in the rapidly different, which is similar to how many biological species expanding literature on cultural evolution (Dawkins 1976, have been defined to date. Cavalli-Sforza and Feldman 1981, Boyd and Richerson The number of hardware devices on which a distro is 1985, Sperber 1996, Mesoudi 2016) which exploits paral- installed can be interpreted as its commonness (e.g. popu- lels between (mostly) Darwinian evolution and the evolu- lation abundance or related, but not identical, range size). tion of languages, beliefs, skills, knowledge, institutions or While popular distros can spread across devices, unpopular other forms of socially transmitted information. There has ones may go extinct. Thus, because distro abundances change

1789 Table 1. Summary of analogies between ecological terms and the ‘Linux universe’, i.e. the system encompassing all Linux distros, Linux software packages, their developers and users, and relationships between those.

Ecological terminology Linux definition Species or lineage A , most commonly referred to as ‘distro’. A distro is a computer operating system comprised of a collection of software packages. Phylogeny Genealogical relationships between distros. Most Linux distros have evolved from one of three main distros: Debian, or SLS. We are not aware of a merging between distros, and hence a tree seems to be a good representation of Linux evolutionary history. Commonness Hits per day (HPD) or number of machines that run a distro. HPD is a yearly average of the number of times (abundance, range size) per day any given distro page on DistroWatch.com is accessed. HPD is a proxy for distro popularity. Diversification event Approximate date at which the development of a distro began. Extinction event Approximate date at which development on a distro ceased. De-extinction may occasionally happen, if development of a distro is resumed. Functional or life-history Software packages available with each distro. These packages or ‘functional traits’ determine the applicability traits of the distro. Natural selection Use and popularity of a distro is based on users and downloads. Unused and unpopular distros go extinct. Niche breadth Number of functions, purposes, or capabilities that a distro has. Similar to multidimensional niche volume. Area/Productivity Population of country where distro was developed. Alternatively, this could be the user base of each distro, however those statistics are not available. through time, distros have population dynamics. Further, user’s computer to another, but is instead enforced centrally computer operating systems, and especially Linux distros, by developers. Thus, the micro-evolutionary mechanism is come with diverse functionalities, which are pre-installed not Darwinian sensu stricto (Lewontin 1970). This is a prob- in the form of software packages, and which can be made lem that is widely discussed in the field of cultural evolution analogous to functional traits of biological species. Finally, (Mesoudi 2017), and it is usually concluded that even imper- new distros emerge through a similar to biological fect analogies are useful. In our case the analogies work well speciation: When developers decide to come up with a new at the macro-evolutionary level – there certainly is a selec- distro, part of the code of an ancestor distro is reused and tion by users, there is origination and extinction, and distros combined with custom-tailored new code, and with existing do change in time. As a result, a clear phylogeny emerges, open-source packages. The evolution of distros is influenced with each tip of the phylogeny (distro) having a unique and by environmental factors such as hardware architecture and inherited combination of traits. We can then argue that if user requirements (Yan et al. 2010); and constrained by user biodiversity and Linux share similar categorization of objects, habits and the need for cost-effective development through but they differ in their inner processes and mechanisms, and reuse of code (Myers 2003, Fortuna et al. 2011); as a result, yet they display the same emerging patterns, then we have a Linux distros have a genealogy which is potentially analogous case for a non-biological and non-mechanistic explanation of to a phylogenetic tree. these patterns. Once the qualitative structural analogies were set, we examined if the analogous structural elements of Linux follow the same quantitative patterns as biodiversity. Dur- Data on species commonness ing this process, we found that some biodiversity patterns were hard to imagine in the Linux realm; such as the lati- We considered popularity (i.e. the number of users) of each tudinal gradient of biodiversity, which would depend on distro as a measure of species commonness, which is analogous a reasonable analogy to latitude. Hence, from the set of to either species abundance or range size. However, because known biodiversity patterns, we chose those for which most GNU/Linux operating systems are freely available for reasonable analogies can be made given available data on download and use, there is no single way to directly deter- Linux. Specifically, we looked for the following macroeco- mine how frequently each distro is used. Therefore, we used logical patterns: 1) the skewed species-abundance distribu- three sources of data as proxies of commonness. tions, 2) power-law mean-variance scaling of population 1) We used popularity metrics from Distrowatch (www. fluctuations, and 3) a positive relationship between niche .com), measured as hits per day (HPD), which breadth and range size. Between macroevolutionary pat- is the daily number of clicks on each distro-specific page on terns we looked for: 4) slowdowns of diversification rates Distrowatch. We assume that HPD correlated to the actual over evolutionary time and 5) a phylogenetic signal in number of users, i.e. to the number of individuals in the functional traits. global population of a particular Linux species. Although this We note that the analogies between Linux and biology assumption is problematic (Edge 2011), we still consider the (Table 1) are imperfect. For example, the evolution of Linux HPD worth exploring, but we also consider alternative mea- distros involves no genetic inheritance, there is little indi- sures. From Distrowatch we extracted data on HPD of 275 vidual-level variation, and the inheritance is not from one distros averaged over a yearly period between 27/04/2016

1790 and 28/04/2015 taken from  http://goo.gl/hMjUXr . of requests (squids) on for pages on specific These data were used to examine the species-abundance operating systems. We downloaded two sets of data: 1) distributions (SAD), and to assess the relationship between data logged between 2010 and 2011 for 8 most popu- niche breadth and commonness. We also extracted HPD for lar distros at the time, and 2) data logged between 2012 each year between (and including) 2002 to 2015 (from the and 2014 for 17 most popular distros at the time. These main page on  www.distrowatch.com , section Page Hit were used for calculations of Taylor’s Power Law (TPL), Ranking; downloaded on 28/04/2015). In each year, these separately in each of the two time periods. data were available only for the 100 most popular distros. This makes them unsuitable for any comprehensive tempo- Models of species-abundance distribution (SAD) ral dynamics of SAD, but it allows analyses of population dynamics for the more common distros. We used distros that We fitted the lognormal and logseries models to the data on were present in the data for  10 consecutive years to assess frequency of commonness from Distrowatch (HPD) and the temporal mean-variance scaling (Taylor’s power law; LinuxCounter (no. of machines). We chose these models TPL). since they describe well the generally observed excess or 2) As an alternative and a more direct measure of distro rare species in ecological abundance data (Baldridge et al. commonness we used the sample of 164,726 computers regis- 2016). We fit the logseries model using package sads in tered, by volunteers, at LinuxCounter ( www.linuxcounter. R (function fitls). To fit the lognormal model, we used net/statistics/distributions ; downloaded on 27/11/2017), the mean and standard deviation of log commonness as where the identity of each distro on each computer is known. the maximum likelihood estimates of parameters of the These data were used to examine the species-abundance lognormal probability density function. To plot the fit- distributions (SAD), and to assess the relationship between ted models alongside the data, we drew a random sample niche breadth and commonness. from each distribution and ordered the outcomes. This was 3) Finally, we used Wikimedia traffic analysis reports repeated 500 times and the average is shown as the solid ( http://goo.gl/2h6Rq6 ), which give monthly counts lines in Fig. 1.

(a) Source: LinuxCounter (b) Source: LinuxCounter 40000 5 Model 4 30000 Lognormal 3 Logseries 20000 2 no. of machines

10000 10

No. of machines 1 log 0 0 0.00.5 1.01.5 2.02.5 0 100 200 300 Rank log10 Rank

(c) Source: DistroWatch (d) Source: DistroWatch

3000 3

2000 2 HP D 10 HP D lo g 1000 1

0 0 0.00.5 1.01.5 2.02.5 0 100 200 log Rank Rank 10

Figure 1. Rank-commonness distributions of 275 extant Linux distros from DistroWatch (a–b) and 307 distros from LinuxCounter (c–d). HPD stands for ‘hits-per-day’ on www.distrowatch.com; no. of machines is number of actual computers on which a particular distro is installed in the set of all computers registered on LinuxCounter – both are proxies for distro commonness (abundance). Black points are the data, solid lines are the lognormal (red) and logseries (blue) SAD models fitted to the data.

1791 Calculation of Taylor’s power law (TPL) In total, we obtained the data for 275 distros, from which we further removed operating systems based on For each distro with  10 yr monitored on Distrowatch, BSD, distros (e.g. ), distros with and for each distro monitored by Wikimedia traffic reports, more than 8000 packages, and the Debian distro. This was we calculated log-transformed temporal variance of the because package lists for these distros on Distrowatch show commonness (Distrowatch HPD or monthly Wikipedia all available packages in their repositories, not packages that requests) and log-transformed temporal mean of common- are bundled on installation media (L. Bodnář, Distrowatch ness. Although a non-linear regression is potentially better administrator, pers. comm.). This left us with 227 distros suited to estimate the TPL scaling exponent, we followed for the analysis. the practice in the broad literature on TPL and fitted a nor- mal linear regression to the data using log variance versus log mean. We took the slope of the regression as the esti- Relationship between niche breadth and commonness of the TPL scaling exponent, and we hereafter call it We fit a generalized linear model (quasipoisson family, log ‘TPL slope’. link function) with commonness as a response and the three measures of niche breadth (described above) as predictors; we Data on niche breadth log-transformed the number of packages prior to the mod- Niche and niche breadth can be complex concepts (Chase eling. For commonness, we used both the HPD measure and Leibold 2003, Peterson et al. 2011), here we use a simpli- obtained from Distrowatch (227 data points) and the num- fied definition of niche breadth as the suite of environments ber of machines with a given distro obtained from Linux- or resources that a species can inhabit or use (Gaston et al. Counter (108 data points). 1997). We used three proxies of niche breadth of Linux distros. Phylogenetic information 1) Number of packages. We made the analogy between species functional traits and software packages that come The majority of current GNU/Linux distros descended from pre-installed with each distro on an installation medium, for three original distros: Debian, Red Hat and Soft Landing example as an .iso file on a live DVD. We defined the number Linux system (SLS, later known as ). Family trees of software packages as niche breadth – more packages mean of these three Linux families were compiled by the GNU/ wider niche breadth. The number of packages was extracted Linux Timeline project (GLDT) consisting of A. Lundqvist, for each distro listed at  www.Distrowatch.com  between D. Rodic, M. A. Mustafa, A. Urosevic and J. A. Sandoval; the 28/04/2015 and 27/04/2016. Distrowatch provides detailed coded trees are available at  http://futurist.se/gldt/  under information on each distro, from which we extracted the GNU Free Documentation Licence. These trees were last full list of packages (-specific example:  http://goo. updated in 2012, which is also the point at which our phylo- gl/0Qhflk ). genetic analyses terminate. We converted the family trees into 2) Number of applications. We used 32 ‘applica- Newick phylogenetic trees, and for each distro we extracted tion’ categories defining the broad purposes of each distro. the date at which its development began (‘speciation’), and These categories were Assistive, Beginners, , the date at which development ceased (‘extinction’). Here we Clusters, Data Rescue, Desktop, Disk Management, Educa- rely on the dates provided by the GLDT project, which does tion, Firewall, Forensics, , Gaming, High Per- not consider the possibility of renewed development after a formance Computing, Live CD, Live DVD, Live Medium, ‘dormancy’ period. Although such ‘de-extinctions’ can hap- Multimedia, Myth TV, , Network Attached Storage, pen in theory, they either did not occur, or were not coded in Old Computers, Privacy, , Router, Scientific, our three Linux phylogenies. Further, the GLDT estimates Security, Server, Source-Sbased, Specialist, Telephony, Thin of the last development dates may be imprecise – here we Client, and . We use number of these applications as assume that the imprecision is either lower, or comparable a measure of niche breadth. We merged Live CD, Live DVD with, the imprecision of extinction dates derived from fossil and Live Medium to a single category. On Distrowatch, each data (Wang and Marshall 2016) or phylogenies (Morlon distro can be labeled with any combination of these catego- 2014), which itself can be substantial. ries. These labels can be obtained from each distro-specific (example of Ubuntu:  http://goo.gl/VDMUt6 ). Speciation and extinction through time 3) Number of special applications. In many biodiversity datasets, patterns are mainly influenced by a small number of We separated the process of diversification among GNU/ common species, which can hide potentially interesting pat- Linux distros into speciation and extinction. For each of the terns of the rare species (Jetz and Rahbek 2002). We applied three distro families we plotted: 1) the cumulative number this concept here and used the same number of applications of speciation events, which is the number of distros that as above, but excluded the prevalent ‘Desktop’ and ‘Live CD’ had been created up to a given date (even if development categories, so that the number reflects only the relatively had ceased) – analogous to cumulative speciation through narrow applications. time plots created for biological systems, the so-called

1792 ‘lineage-through-time plots’ (Nee et al. 1994). 2) The cumu- Results lative number of extinction events through time, which is the cumulative number of distros that had ceased develop- We found that the diversity patterns observed in the GNU/ ment. 3) Per-species instantaneous speciation and extinction Linux universe matched different macroecological and rates; these are numbers of distros that were created or their macroevolutionary patterns to varying degrees. development ceased in a given month, divided by the total number of all extant distros in that month. 4) Per-species Macroecological patterns instantaneous diversification rate, defined as speciation rate minus extinction rate. This quantity is of key interest, since it Species commonness measured as the number of machines or is expected to slow down over time. as HPD, followed a skewed frequency distribution (Fig. 1), with many distros being uncommon and few distros being common, which is typical for ecological abundance data. In Phylogenetic signal of traits both metrics, the distribution can be approximated by a log- normal probability density function, which performed better For this analysis, we used 56 distros that descended from than the logseries model. Although the distribution of HPD Debian (excluding Debian) – these are the distros for which lacked the typical tail of singletons (Fig. 1d), this is likely we have both data on traits (i.e. packages) and phyloge- because Distrowatch does not keep track of the extremely netic relationships. We used the same data on packages as rare distros – in this respect the data from LinuxCounter are in the analysis of niche breadth (above). Data and lists for more representative. all software packages of the 56 distros were collected from We found that the relationship between commonness  www.distrowatch.com  on 14/05/2016 (see the Material mean (M) and variance (V) can be modelled by a power and methods section on niche breadth for more details). We function in all three examined datasets (Fig. 2), with slope created a package by distro binary matrix, with 1 for presences of approximately 2 (Table 2). We detected no significant of the package, and 0 for absence. To test for the phylogenetic positive relationship between commonness and the three signal we performed two analyses. measures of niche breadth (Fig. 3), with the exception of a 1) We estimated if the dissimilarity in composition of weak but significant, positive relationship between common- binary traits is related to the phylogenetic distance between ness and the number of all software packages (quasipoisson distros. We calculated two distance matrices of compositional deviance explained  7%, p  0.01, Fig. 3a), and between dissimilarity, one using βj (Jaccard beta), the other βsim (Beta commonness and the number of all applications (quasipois- sim) (Koleff et al. 2003). We also calculated a distance matrix son deviance explained  12%, p  0.001, Fig. 3b) in the based on phylogenetic distance between the 56 distros. We HPD metric from Distrowatch. then used Mantel tests (1000 permutations) to detect sig- nificant correlations between distance matrices representing Macroevolutionary patterns compositional dissimilarity and phylogenetic distance. We also calculated Spearman correlations (Rho) between the In all three Linux families (i.e. larger groups of Linux distros matrices. with the same origin), the cumulative number of phyloge- 2) We measured the strength of phylogenetic signal netic lineages increased linearly through time (Fig. 4a, b). in each trait separately. If the traits were continuous, it would have been possible to use lambda (Pagel 1999) or K (Blomberg et al. 2003) statistics. Since our traits are all binary (presence or absence of a package), we used the D statistic 15 (Fritz and Purvis 2010) implemented in function phylo.d in R package caper. D measures phylogenetic signal in binary Source DistroWatch traits, and can have negative or positive values, with D  0 10  Variance Wikipedia 2010−2011 indicating phylogenetically conserved traits, D 1 indicat- 10 Wikipedia 2012−2014 ing random distribution of trait states along the phylogeny, log and indicating D  1 overdispersed traits. We calculated 5 D for each package, together with the probability that the observed D comes from a randomly distributed package along the phylogeny. 2.55.0 7.5

log10 Mean

Data and code Figure 2. Taylor’s power law (TPL) of temporal fluctuation of com- monness of Linux distributions, estimated from three data sources. All of the data and R code used for the analyses are archived Each point represents a Linux distribution. Dashed lines have slopes at Zenodo.org  http://doi.org/10.5281/zenodo.1120445  1 and 2, and are presented for visual guidance. Parameters of the under GNU General Public License, ver. 2. linear regressions are in Table 1.

1793 Table 2. Parameters of the Taylor’s power law (TPL) describing beyond the scope of a single paper. Instead, for now we resort temporal fluctuation of commonness of Linux distributions. Pro- to a parsimonious and non-mechanistic explanation for the vided are slope and intercept of the linear regression of log temporal variance vs log temporal mean, their standard errors (SE) and the prevalence of the patterns. Simply, when systems share struc- coefficient of determination (R2). tural constraints, such as partitioning of objects to categories (species) across space (Frank 2009, Harte 2011), or when Source Intercept (SE) Slope (SE) R2 variables in the system interact in similar ways [additively DistroWatch –3.34 (1.25) 2.18 (0.2) 0.8 vs multiplicatively; (Blonder et al. 2014)], they will exhibit Wikipedia 2010–2011 –1.4 (2.42) 1.7 (0.32) 0.81 similar quantitative patterns, independent on system identity Wikipedia 2012–2014 0.11 (1.43) 1.94 (0.08) 0.97 and its detailed inner working. This follows from both the central limit theorem (McGill and Nekola 2010) and the the- In the Debian family, there was a peak in instantaneous ory of maximum entropy (Harte 2011). Thus, we can explain per-species speciation rate around 2005, followed by a pro- the emergence of the patterns without any typically biologi- nounced slowdown and a peak of extinction rates with a cal mechanism such as genetic basis for phenotypic variation, subsequent slowdown in 2006 (Fig. 4c). Similar patterns complex species interactions, behavior, or community assem- occurred in the Red Hat family, but three years earlier bly rules. In fact, all of these mechanisms are mostly absent in (Fig. 4c). Diversification in all three Linux families always the Linux universe, yet the biodiversity patterns emerge any- showed a decline after 2005 (Fig. 4d). However, in Debian way, which really points to their non-biological causes. In the and SLS there also was a relatively low rate of diversifica- following, we discuss each investigated pattern individually. tion rate prior to 2005, while Red Hat underwent a rapid diversification even prior to 2005 (Fig. 4d). Macroecological patterns Our analyses of phylogenetic signal in trait composition showed that more closely related distros (measured by tempo- We showed that commonness of GNU/Linux-based oper- ral distance from their nearest common ancestor) were more ating systems follows both a skewed frequency distribution similar in their composition of traits measured by beta-diversity and a power relationship between mean commonness and its of package composition (Fig. 5b, c) than randomly selected temporal variance. In ecology, species-abundance distribu- pairs of distros. Further, 17% and 23% of the 14,161 pack- tions (Fischer 1943, McGill et al. 2007, Nekola and Brown 2007) and distributions of range sizes (Gaston 1996b, 2003) ages exhibited phylogenetic signal significant atα  0.05 and are often best described by the same skewed lognormal or α  0.1 respectively, measured by the D statistic (Fig. 5d, e). logseries model (Baldridge et al. 2016). This pattern emerges when individuals within a population have roughly constant Discussion per-individual probabilities of reproducing or dying, which in turn leads to stochastic multiplicative (as opposed to addi- Our analyses demonstrate that the Linux universe and bio- tive) dynamics of population size (Lande et al. 2003). In the logical systems not only showcase many structural analogies Linux universe, these multiplicative dynamics emerge when (as already pointed out by Mens et al. 2014a, b), but also users install (reproduction) and uninstall distros (death) share quantitatively similar emerging properties, including on computers, and when each user has a roughly con- patterns of commonness and evolutionary rates. This is in stant probability of installing his/her favourite distro on a line with findings from other complex systems, which also computer, or abandoning the distro. exhibit macroecological patterns (Gaston et al. 1993, Mace Similar dynamics can also produce the mean-variance scal- and Pagel 1995, Bettencourt et al. 2007, Nekola and Brown ing of population abundance fluctuations known as Taylor’s 2007, Warren et al. 2011, Blonder et al. 2014); and macroevo- power law (TPL), which we also detected in the Linux uni- lutionary patterns (Gjesfjeld et al. 2016). The Linux universe verse. TPL generally has a proportion of explained variance shows that not one, but a whole spectrum of biological of  0.8 in both single- and multi-species systems (Taylor patterns can emerge in a single non-biological system. Thus, and Woiwod 1980, 1982, Hubbell 2001), with the slope we should expect similar patterns to emerge in other complex typically falling between 1 and 2 (Kendal 2004, Keil et al. systems with categories of objects that evolve, for example in 2010). Multiple hypotheses have been suggested to interpret languages, musical styles, political parties, companies, or even empirical TPL slopes (Kendal 2004). Slopes closer to 1 were countries and religions. linked to reproductive asynchrony (Ballantyne and Kerkhoff At this point the reader might call for a mechanistic model 2007), species interactions (Kilpatrick and Ives 2003), hard that could replicate, or predict, the patterns across unrelated upper limits on population size (Keil et al. 2010), and even to systems. Although aimed at describing the origin of different statistical artifacts (Kiflawi et al. 2016), while slopes closer to patterns compared to this investigation, such an approach has 2 emerge from simple multiplicative models such as random been already adopted e.g. by Gherardi et al. (2013), where a walk or deterministic (Perry 1994). The slopes close to simple stochastic model was able to predict both distribution 2 observed here suggests that the temporal dynamics of Linux of Linux package sizes and mammalian body masses. How- commonness follow an unbounded stochastic multiplicative ever, to replicate the multitude of patterns presented here, model, which is also what we expect to lead to the observed we would have to attain a degree of sophistication that goes skewed frequency distributions.

1794 (a) Source: DistroWatch (b) Source: DistroWatch (c) Source: DistroWatch

1000 1000 1000

100 100 100 HP D HP D HP D

10 10 10

100 1000 246 0 1234 No. of packages No. of applications No. of special applications

(d) Source: LinuxCounter (e) Source: LinuxCounter (f) Source: LinuxCounter

1000 1000 1000 No. of machines No. of machines No. of machines 10 10 10

100 1000 246 01234 No. of packages No. of applications No. of special applications

Figure 3. Three measures of niche breadth and their relationship with commonness measured as HPD or no. of machines. (a, d) Niche breadth measured as number of packages, (b, e) niche breadth measured as number of all applications, (c, f) niche breadth measured as number of special applications (not including ‘Desktop’ and ‘Live CD’). Solid red lines are means of quasipoisson log-linear regressions, shading shows standard errors of the means. HPD stands for ‘hits-per-day’ on  www.distrowatch.cz ; no. of machines is the number of actual machines on which a particular distro is installed in the set of machines registered on LinuxCounter – both are proxies for distro commonness (abundance).

We detected no, or only very weak, positive relationships be custom-added at any time, and this likely weakens the between commonness and niche breadth of distros (i.e. the strength of using pre-installed software for determining the functionality that distros offers), both when measuring it as success of an operating system. the number of software packages in a distro or as the number of applications (as stated by the developers). In ecology, such Macroevolutionary patterns positive correlation between niche breadth and common- ness (specifically, geographical range size) is a general pattern Diversification slowdowns are often observed in biological (Slatyer et al. 2013). The hypothesized ecological explanation systems [Rabosky and Glor (2010), Moen and Morlon is that by exploiting a greater array of resources and main- (2014), but see Harmon and Harrison (2015) and Graham taining populations in a wider variety of conditions, species et al. (2016)], and can be explained by competition for may become more common (Brown 1984). However, our limited resources or space. Similar reasoning can also explain findings fail to generalize this ecological relationship to the the diversification slowdowns that we detected in the Linux Linux universe. A possible reason for the absence (or weak- universe: we suggest that between years 2000 and 2005 ness) of the link could be our definition of niche breadth, as Linux users had become comfortable with certain distros the number of software packages that come pre-installed with that satisfied most of the potential applications and user a distro can be easily expanded. Unlike biological phenotypes, requirements and therefore diversification slowed down. An which are usually fixed over ecological timescales, numerous alternative explanation can be borrowed from (Gjesfjeld et al. additional packages are available for many distros that can 2016) who observed diversification slowdowns in American

1795 (a)

(b)

(c)

(d)

Figure 4. Speciation, extinction, and diversification of Linux distros through time in three major distro families. (a) Phylogenetic trees of the three families of Linux distros that were used to make the panels (b), (c), and (d). Branches of two major distros, Ubuntu and Debian, are highlighted by green and purple. (b) Accumulation of new species (distros) and extinctions over time. (c) Instantaneous per-distro rates of speciation and extinction rate. (d) Instantaneous per-distro diversification rate (speciation rate – extinction rate). Solid black lines are LOWESS regressions (smoothing span 0.3), shading indicates standard errors of the smoothed means. automobiles, and who argue that the slowdowns can be universe of Linux is open and largely free from the usual cost explained by the emergence of dominant designs (Abernathy or copyright constraints, we consider this economical expla- and Utterback 1978). As Gjesfjeld et al. (2016) explains, nation implausible, and we tend to lean towards explain- when technologies become too specialized, the cost of imple- ing the slowdowns by the ecological mechanism involving menting innovations becomes too high and diversification saturation of a finite niche space. slows down, leading to long-term dominance of the most As mentioned, Linux distros come with diverse func- successful designs. However, since the community-driven tionalities, which are pre-installed in the form of software

1796 (b) (c) (a) 1.00 Rho = 0.342 ** 1.0 Rho = 0.348 ** ElementaryOS Ubuntu Bodhi PinguyOS BackBox Netrunner 0.75 0.8 wattOS PeppermintOS AV CAINE Si m 0.6 Leeenux 0.50 β UltimateEdition Jaccard β ZorinOS Greenie UbuntuStudio 0.4 Madbox 0.25 gNewSense Mint DEFT 0.00 0.2 Ulteo 510152025 510152025 Bardinux DoudouLinux Phylogenetic distance [years * 2] Phylogenetic distance [years * 2] OpenMediaVault Semplice Canaima Metamorphose (d) (e) Proxmox 1500 PelicanHPC Webconverger BOSS siduction Elive LliureX 2000 Voyage 1000 BlankOn ClonezillaLive Guadalinex

antiX count count gnuLiNex 1000 Musix 500 SymphonyOS B2D MAX Omoikane.Arma. 0 0

−10 −5 0510 0.00 0.25 0.50 0.75 1.00 D statistic p value

Figure 5. Phylogenetic signal in 14 161 binary traits (software packages) of 56 Debian-based distros for which we had information on pack- age composition. (a) Phylogenetic tree of the distros. (b–c) Phylogenetic signal in the composition of software packages shipped with each distro, expressed as (significant) positive correlation between phylogenetic distance and trait compositional dissimilarity measured by βJaccard and βSim. Rho is Spearman correlation, asterisks indicate probability p  0.01 that there is no correlation (Mantel test, 9999 permutations). (d) Frequency distribution of the D statistic (Fritz and Purvis 2010) across the 14 161 binary traits; red line (D  1) indicates random distribution of a trait along the phylogeny, D  0 is phylogenetic conservatism. (e) Distribution of probability that a given D comes from a trait randomly distributed along the phylogeny; blue line is p  0.05. packages, and which we have made analogous to functional avoid radical changes that cause bugs due to complex pack- traits of biological species. We found that more closely related age dependencies (Mens et al. 2014a, b). At the same time, distros were more similar in their composition of traits than the signal is not particularly strong. The likely reason for the randomly selected pairs of distros, and 17 to 23% of indi- weakening is that distros are not fully confined to directly vidual software packages were significantly phylogenetically inherit packages from their ancestors. Instead, developers of conserved. These results are consistent with a widespread distros are free to load them with existing open-source soft- macroevolutionary pattern known as phylogenetic ‘signal’ or ware that was originally developed for other distros, making ‘autocorrelation’ of traits, i.e. the tendency of more closely the process of package inheritance more similar to horizontal related species to be ecologically (functionally) more similar gene transfer in bacteria (Ochman et al. 2000) or to horizon- than species drawn at random from the phylogenetic tree of tal transmission of culture in human societies (Henrich and closely related species (Harvey and Pagel 1991, Wiens et al. Broesch 2011, Hewlett et al. 2011). 2010). Finding a phylogenetic signal in Linux distros is expected. Linux as a model system for biodiversity Some degree of functional similarity of a descendant dis- tro with the parent is indeed desired, perhaps by the users Being well-documented, Linux may serve as a useful model (they like the familiar), by the developers (they like to build system for studying certain macroecological or macroevolu- on what has already been built), as well as by the need to tionary patterns and dynamics that are particularly hard to

1797 get by, given our limited capacity to document biological Acknowledgements – We are grateful to Luke Harmon, Antonín processes across space, time, and taxa (Hortal et al. 2015, Macháč and eight anonymous referees for extensive comments. Meyer et al. 2015), and the difficulty of conducting experi- Ladislav Bodnář from  www.distrowatch.com  assisted with ments over large spatial and temporal scales. For instance, access to the raw data. we unveiled slowdowns in both diversification and extinction Funding – This study was conceived at a workshop (25–29/4/2016, Aix-en-Provence, France) organized by Alison Specht and the rates in Linux, processes that happen at timescales that are French Center for Synthesis and Analysis of Biodiversity (CESAB), impossible to directly observe in nature, and usually need to together with Marten Winter and the synthesis division (sDiv) of be indirectly inferred using strong assumptions and complex the German Centre for Integrative Biodiversity Research (iDiv), models. funded via the German Research Foundation (DFG FZT 118). The potential use of the Linux universe as a model for BY benefited from a postdoctoral fund from the French national biodiversity may indeed extend beyond the comparisons research agency LabEx ANR-10-LABX-0003-BCDiv, in the context made here. Here, we have made the first steps, and these ini- of the ‘Investissements d’avenir’ no. ANR-11-IDEX-0004-02. KSR tial analogies can be complemented with a more thorough was supported by ERC Adv grant 26055290. CM acknowledges exploration of the Linux system to fully assess their utility for support from the Volkswagen Foundation through a Freigeist macroecological and macroevolutionary research. More anal- Fellowship. ogies could be made, and patterns analyzed, based on easily retrievable empirical data on Linux. For instance, economic References activity tied to Linux applications could serve as an anal- ogy to productivity of geographical areas and allow study- Abernathy, W. J. and Utterback, J. M. 1978. Patterns of industrial ing the similarity in productivity–biodiversity relationships innovation. – Technol. Rev. 80: 40–47. (Currie et al. 2004) of the two systems. Another example Baldridge, E. et al. 2016. An extensive comparison of species- is the positive relationship between area and speciation rate abundance distribution models. – PeerJ 4: e2823. (Lomolino et al. 2010, Wagner et al. 2014): populations in Ballantyne, F. and Kerkhoff, A. J. 2007. The observed range for larger areas are more likely to become isolated and drift apart temporal mean-variance scaling exponents can be explained by genetically, or to encounter a larger variety of selection pres- reproductive correlation. – Oikos 116: 174–180. sures and undergo adaptive radiations. Our preliminary anal- Bettencourt, L. M. A. et al. 2007. Growth, innovation, scaling, and yses suggest that such a relationship can be observed in the the pace of life in cities. – Proc. Natl Acad. Sci. USA 104: 7301–7306. Linux universe (Supplementary material Appendix 1 Fig. A1). Blomberg, S. P. et al. 2003. Testing for phylogenetic signal in comparative data: behavioral traits are more labile. – Evolution Towards cultural ecology 57: 717–745. Blonder, B. et al. 2014. Separating macroecological pattern and Darwinian evolutionary thinking and models have success- process: comparing ecological, economic, and geological fully been adapted to study evolution of cultural systems systems. – PLoS One 9: e112850. (Dawkins 1976, Cavalli-Sforza and Feldman 1981, Boyd and Bonds, M. H. et al. 2012. Disease ecology, biodiversity, and the Richerson 1985, Sperber 1996, Mesoudi 2016, 2017). How- latitudinal gradient in income. – PLoS Biol. 10: e1001456. ever, as shown in our results and elsewhere, there is a broader Boyd, R. and Richerson, P. J. 1985. Culture and the evolutionary variation of culture that also has analogies in ecology, and process. – Univ. of Chicago Press. that exhibits ecological patterns. Here we see potential for a Brown, J. H. 1984. On the relationship between abundance and new discipline of cultural ecology, which would use ecologi- distribution of species. – Am. Nat. 124: 255–279. cal models to explain culture. For example, fluctuation and Cavalli-Sforza, L. L. and Feldman, M. W. 1981. Cultural transmission and evolution: a quantitative approach. – Princeton Univ. Press. distribution of rarity and commonness can be used to pre- Chase, J. M. and Leibold, M. A. 2003. Ecological niches: linking dict extinction risk (Kunin and Gaston 1997) in cultures and classical and contemporary approaches. – Univ. of Chicago technologies. There is also a potential for biodiversity theory, Press. and its popular models, such as the neutral model (Hubbell Currie, D. J. et al. 2004. Predictions and tests of climate-based 2001) or the theory of island biogeography (MacArthur and hypotheses of broad-scale variation in taxonomic richness. Wilson 1967), to predict and map patterns of cultural diver- – Ecol. Lett. 7: 1121–1134. sity. We suggest that one aspect of cultural variation that can Darwin, C. 1859. On the origin of species by means of natural benefit from biodiversty theory, is the issue of cultural and selection, or the preservation of favoured races in the struggle technological diversity (e.g. in nations, towns, or companies) for life. – John Murray. and its relationship with productivity and stability – a large Dawkins, R. 1976. The selfish gene. – Oxford Univ. Press. body of relevant theory can be offered here by the biodiver- Drakare, S. et al. 2006. The imprint of the geographical, evolutionary and ecological context on species–area relationships. – Ecol. sity-ecosystem function research (BEF; Loreau et al. 2002). Lett. 9: 215–227. Finally, some ecological models, such as the neutral theory Edge, J. 2011. Distribution “popularity.” –  https://lwn.net/ of biodiversity (Hubbell 2001), offer direct connections to Articles/470782/-lwn.net . evolutionary theory, and thus to an even broader interdisci- Fischer, R. A. 1943. A theoretical distribution for the apparent plinary integration. abundance of different species. – J. Anim. Ecol. 12: 54–58.

1798 Fortuna, M. A. et al. 2011. Evolution of a modular software Koleff, P. et al. 2003. Measuring beta diversity for presence–absence network. – Proc. Natl Acad. Sci. USA 108: 19985–19989. data. – J. Anim. Ecol. 72: 367–382. Frank, S. A. 2009. The common patterns of nature. – J. Evol. Biol. Kunin, W. E. and Gaston, K. J. 1997. The biology of rarity: causes 22: 1563–1585. and consequences of rare common differences. – Chapman and Fritz, S. A. and Purvis, A. 2010. Selectivity in mammalian Hall. extinction risk and threat types: a new measure of phylogenetic Lande, R. et al. 2003. Stochastic population dynamics in ecology signal strength in binary traits. – Conserv. Biol. 24: 1042–1051. and conservation. – Oxford Univ. Press. Gaston, K. J. 1996a. The multiple forms of the interspecific Lawton, J. H. 1999. Are there general laws in ecology? – Oikos 84: abundance–distribution relationship. – Oikos 76: 211–220. 177–192. Gaston, K. J. 1996b. Species-range-size distributions: patterns, Lewontin, R. C. 1970. The units of selection. – Annu. Rev. Ecol. mechanisms and implications. – Trends Ecol. Evol. 11: Syst. 1: 1–18. 197–201. Lomolino, M. V. et al. 2010. Biogeography. – Sinauer Associates. Gaston, K. J. 2000. Global patterns in biodiversity. – Nature 405: Loreau, M. et al. 2002. Biodiversity and ecosystem functioning. 220–227. – Oxford Univ. Press. Gaston, K. J. 2003. The structure and dynamics of geographic Losos, J. B. 2008. Phylogenetic niche conservatism, phylogenetic ranges. – Oxford Univ. Press. signal and the relationship between phylogenetic relatedness Gaston, K. J. et al. 1993. Comparing animals and automobiles: a and ecological similarity among species. – Ecol. Lett. 11: vehicle for understanding body size and abundance relationships 995–1003. in species assemblages? – Oikos 66: 172–179. MacArthur, R. H. and Wilson, E. O. 1967. The theory of island Gaston, K. J. et al. 1997. Interspecific abundance–range size biogeography. – Princeton Univ. Press. relationships: an appraisal of mechanisms. – J. Anim. Ecol. 66: Mace, R. and Pagel, M. 1995. A latitudinal gradient in the density 579–601. of human languages in North America. – Proc. R. Soc. B 261: Gherardi, M. et al. 2013. Evidence for soft bounds in Ubuntu 117–121. package sizes and mammalian body masses. – Proc. Natl Acad. Malthus, T. R. 1798. An essay on the principle of population. Sci. USA 110: 21054–21058. – J. Johnson. Gjesfjeld, E. et al. 2016. Competition and extinction explain the Maynard Smith, J. 1982. Evolution and the theory of games. evolution of diversity in American automobiles. – Palgrave – Cambridge Univ. Press. Commun. 2: 16019. McGill, B. J. and Nekola, J. C. 2010. Mechanisms in macroecology: Graham, C. H. et al. 2016. Phylogenetic scale in ecology and evo- AWOL or purloined letter? Towards a pragmatic view of lution. – bioRxiv: 063560. mechanism. – Oikos 119: 591–603. Harmon, L. J. and Harrison, S. 2015. Species diversity is dynamic McGill, B. J. et al. 2007. Species abundance distributions: moving and unbounded at local and continental scales. – Am. Nat. 185: beyond single prediction theories to integration within an 584–593. ecological framework. – Ecol. Lett. 10: 995–1015. Harte, J. 2011. Maximum entropy and ecology: a theory of Mens, T. et al. 2014a. Studying evolving software ecosystems based abundance, distribution, and energetics. – Oxford Univ. Press. on ecological models. – In: Mens, T. et al. (eds), Evolving Harvey, P. H. and Pagel, M. D. 1991. The comparative method in software systems. Springer, pp. 297–326. evolutionary biology. – Oxford Univ. Press. Mens, T. et al. 2014b. ECOS: ecological studies of open source Henrich, J. and Broesch, J. 2011. On the nature of cultural software ecosystems. – Software Maintenance, Reengineering transmission networks: evidence from Fijian villages for adap- and (CSMR-WCRE), pp. 403–406. tive learning biases. – Phil. Trans. R. Soc. B 366: 1139–1148. Mesoudi, A. 2016. Cultural evolution: a review of theory, findings Hewlett, B. S. et al. 2011. Social learning among Congo Basin and controversies. – Evol. Biol. 43: 481–497. hunter-gatherers. – Phil. Trans. R. Soc. B 366: 1168–1178. Mesoudi, A. 2017. Pursuing Darwin’s curious parallel: prospects Hortal, J. et al. 2015. Seven shortfalls that beset large-scale for a science of cultural evolution. – Proc. Natl Acad. Sci. USA knowledge of biodiversity. – Annu. Rev. Ecol. Evol. Syst. 46: 114: 7853–7860. 523–549. Meyer, C. et al. 2015. Global priorities for an effective information Hubbell, S. P. 2001. The unified theory of biodiversity and basis of biodiversity distributions. – Nat. Commun. 6: 8221. biogeography. – Princeton Univ. Press. Moen, D. and Morlon, H. 2014. Why does diversification slow Jetz, W. and Rahbek, C. 2002. Geographic range size and down? – Trends Ecol. Evol. 29: 190–197. determinants of avian species richness. – Science 297: Morlon, H. 2014. Phylogenetic approaches for studying diversifica- 1548–1551. tion. – Ecol. Lett. 17: 508–525. Keil, P. et al. 2010. Predictions of Taylor’s power law, density Morlon, H. et al. 2009. Taking species abundance distributions dependence and pink noise from a neutrally modeled time beyond individuals. – Ecol. Lett. 12: 488–501. series. – J. Theor. Biol. 265: 78–86. Myers, C. R. 2003. Software systems as complex networks: Kendal, W. S. 2004. Taylor’s ecological power law as a consequence structure, function, and evolvability of software collaboration of scale invariant exponential dispersion models. – Ecol. graphs. – Phys. Rev. E 68: 046116. Complex. 1: 193–209. Nee, S. et al. 1994. The reconstructed evolutionary process. – Phil. Kiflawi, M. et al. 2016. Heterogeneous ‘proportionality constants’ Trans. R. Soc. B 344: 305–311. – a challenge to Taylor’s power law for temporal fluctuations in Nekola, J. C. and Brown, J. H. 2007. The wealth of species: abundance. – J. Theor. Biol. 407: 155–160. ecological communities, complex systems and the legacy of Kilpatrick, A. M. and Ives, A. R. 2003. Species interactions can Frank Preston. – Ecol. Lett. 10: 188–196. explain Taylor’s power law for ecological time series. – Nature Ochman, H. et al. 2000. Lateral gene transfer and the nature of 422: 65–68. bacterial innovation. – Nature 405: 299–304.

1799 Pagel, M. 1999. Inferring the historical patterns of biological evolu- Taylor, L. R. and Woiwod, I. P. 1980. Temporal stability as a tion. – Nature 401: 877–884. density-dependent species characteristic. – J. Anim. Ecol. 49: Pang, T. Y. and Maslov, S. 2013. Universal distribution of compo- 209–224. nent frequencies in biological and technological systems. Taylor, L. R. and Woiwod, I. P. 1982. Comparative synoptic – Proc. Natl Acad. Sci. USA 110: 6235–6239. dynamics. I. Relationships between inter- and intra-specific Perry, J. N. 1994. Chaotic dynamics can generate Taylor’s power spatial and temporal variance/mean population parameters. – J. law. – Proc. R. Soc. B 257: 221–226. Anim. Ecol. 51: 879–906. Peterson, A. T. et al. 2011. Ecological niches and geographic Valverde, S. and Sole, R. V. 2015. Punctuated equilibrium in the distributions. – Princeton Univ. Press. large-scale evolution of programming languages. – J. R. Soc. Rabosky, D. L. and Lovette, I. J. 2008 Explosive evolutionary Interface 12: 20150249. radiations: decreasing speciation or increasing extinction Wagner, C. E. et al. 2014. Cichlid species–area relationships are through time? – Evolution 62: 1866–1875. shaped by adaptive radiations that scale with area. – Ecol. Lett. Rabosky, D. L. and Glor, R. E. 2010. Equilibrium speciation 17: 583–592. dynamics in a model adaptive radiation of island lizards. – Proc. Wang, S. C. and Marshall, C. R. 2016. Estimating times of extinc- Natl Acad. Sci. USA 107: 22178–22183. tion in the fossil record. – Biol. Lett. 12: 20150989. Rosenzweig, M. L. 1995. Species diversity in space and time. Warren, R. J. et al. 2011. Universal ecological patterns in college – Cambridge Univ. Press. basketball communities. – PLoS One 6: e17342. Scheffer, M. et al. 2017. Inequality in nature and society. – Proc. Wiens, J. J. et al. 2010. Niche conservatism as an emerging Natl Acad. Sci. USA 114: 13154–13157. principle in ecology and conservation biology. – Ecol. Lett. 13: Sizling, A. L. et al. 2009. Species abundance distribution results 1310–1324. from a spatial analogy of central limit theorem. – Proc. Natl Worster, D. 1994. Nature’s economy: a history of ecological ideas. Acad. Sci. USA 106: 6691–6695. – Cambridge Univ. Press. Slatyer, R. A. et al. 2013. Niche breadth predicts geographical range Yan, K.-K. et al. 2010. Comparing genomes to computer size: a general ecological pattern. – Ecol. Lett. 16: 1104–1114. operating systems in terms of the topology and evolution of Sperber, D. 1996. Explaining culture: a naturalistic approach. their regulatory control networks. – Proc. Natl Acad. Sci. 107: – Blackwell. 9186–9191.

Supplementary material (Appendix ECOG-03424 at  www. ecography.org/appendix/ecog-03424 ). Appendix 1.

1800