
bioRxiv preprint doi: https://doi.org/10.1101/188813; this version posted September 14, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Running Head: Thermodynamic properties of

Biological [...] as interacting compartments: thermodynamic properties and diversity indices

Fernando Meloni,1 Gilberto M. Nakamura,1 and Alexandre Souto Martinez1

1 Universidade de São Paulo, Ribeirão Preto 14040-901, Brazil

(Dated: today)

Diversity indices provide simple and powerful metrics for assessing biological communities. Based on entropy measures, the approach considers statistical and thermodynamic inferences to deduce ecological patterns. However, concerns exist regarding the accuracy of diversity indices. Because relative quantities depend on the sorting of [...] (e.g., guilds and [...]) and their interactions, field observations carry inherent imprecision, thus leading to misinterpretation. Here, we present a framework that is able to appropriately achieve the thermodynamic properties in ecological [...] and ensure the inference power. We demonstrate that effective abundances rather than raw abundances provide a reliable estimator of probabilities, which is evaluated through massive tests. We use empirical and synthetic data to show the advantages and reliability of this new framework under a broad range of conditions. The tests demonstrate that the replication principle is always optimized by the new estimator. Compared to other methods, this approach is simpler and reduces the importance of the schemes used for sorting organisms. We highlight the robustness and the value of effective abundances for ecological contexts: i) to assess and monitor biodiversity, ii) to define the best sorting of organisms according to maximum entropy principles, and iii) to link local to regional diversity (α-, β-, and γ-diversity).

keywords: biodiversity assessment, entropy, extensive and additive properties, estimator of probabilities, interacting compartments, predictable patterns

I. INTRODUCTION

The assessment of biodiversity is a primary concern among ecologists. They are interested in monitoring species and in explaining how [...], [...] type and several other environmental features affect organisms and their organization in nature [15,50]. Beyond basic knowledge, the motivation for this interest has increased over the past decades due to the increasing human impacts on climate and natural ecosystems that place species survival and [...] services at risk [2,26].

The best appraisal of patterns in biological communities considers details of several taxa, such as their biology, genetic variability, [...], behavior, and so forth. However, this level of information is rarely available for practical contexts, and objective measures are considered along with external inferences to deduce the [...] patterns [54,55]. Taxonomic/functional compositions, the shape of the curves describing species-abundance distributions (SAD), and the ecological indices consist of distinct approaches for measuring and comparing biological communities [50]. Each approach addresses levels of information, and their use shows advantages and disadvantages for distinct contexts [41,55]. Among these metrics, ecological indices deserve special attention, and we explore them in this paper.

Also referred to as diversity indices, ecological indices are deduced from the Boltzmann-Shannon-Gibbs entropy, and they are proposed to measure order-disorder in biological communities. These indices enjoy advantages from well-supported concepts, interpretations, formulations, and inferences from information theory, physics, statistics, and thermodynamics, and they allow deducing further relationships from data [13,15,16,36,74,78]. Briefly, the calculation of diversity indices considers the number of categories and the number of organisms (abundances) as primary information [50]. The $W$ categories are defined by any criteria of interest (e.g., species, genera, behavior, [...], and so forth) and are used for sorting the organisms, while relative quantities provide a quantitative inference. This type of basic information can be obtained for a broad range of practical conditions and reflects the applicability of diversity indices. Furthermore, diversity indices provide reliable information about global patterns of biological organization even if information about organisms is limited.

Accordingly, once the $W$ categories are determined a priori, the [...] $A_k$ considers the number of organisms in the $k$-th category and determines the probability $p_k = A_k/A$, for $A = \sum_{k=1}^{W} A_k$ and $\sum_{k=1}^{W} p_k = 1$. These probabilities are used to calculate the classic Shannon diversity index $H$ [50] as follows:

$$H = -\sum_{k=1}^{W} p_k \ln p_k, \tag{1}$$

which is equivalent to the Shannon entropy [74]. In practical contexts, the $H$ values of biological communities living under distinct influences are compared by similar sampling effort. In such cases, the variation $\Delta H$ is interpreted as a metric that addresses the influence of a particular driver (e.g., climate changes or [...] impacts) on the ecological framework [13,50,53].

In addition to the $H$ index, there are several other ecological indices that provide complementary aspects of the biological communities. The notation

$${}^{a}H = \frac{1}{1-a} \ln \sum_{k=1}^{W} p_k^{a} \tag{2}$$

unifies the three most important ecological indices [13,15,36]: the [...] $W$ (number of categories), the $H$ index (Shannon entropy), and the Gini-Simpson index $D = \sum_{k=1}^{W} p_k^{2}$, where the last infers the curve asymmetry of ranked relative quantities. For $a = 0$ and $R > 2$, we have ${}^{0}H = \ln(W)$, which is the harmonic mean; for $a \to 1$, ${}^{1}H = H$, which is the geometric mean; and for $a = 2$, ${}^{2}H = \ln(1/D)$, which is the arithmetic mean. These cases are special cases of the Rényi generalized entropy and are important because they predict useful relationships. One expects to find correlations between the abundances of organisms and the values assumed by distinct diversity indices, such as ${}^{a}f(A) \equiv {}^{a}H$, which also implies that $g({}^{0}H) \equiv {}^{1}H$ and so on [15,36]. These correlations match the concept of the replication principle in ecological contexts [13,15,38], and they are true only if the additive and extensive thermodynamic properties are ensured [13,34,78]. Assuming this circumstance, the entropy observed for small samples can be re-scaled to predict the entropy of the entire [...]. This type of inference would be used, e.g., to explain how the biological diversity is spatially partitioned (γ-, α- and β-diversities) [12,38]. Although $H$ and other diversity indices provide an interesting approach to assess and interpret biological communities, tests using empirical data often show a different reality, and the topic remains open [10,12,14,24].

The inconsistency between theoretical predictions and empirical results is justified by the lack of a precise definition of appropriate classes for ecological systems [34]. A common approach for sorting organisms considers taxonomic arguments, where individuals are strictly grouped by their phylogeny. This method is intuitive but leads to granular schemes that poorly represent ecological constraints. In fact, taxonomic sorting reflects the evolutionary history of groups better than the immediate effect of environmental forces on niche occupancy [72], often resulting in probabilities $p_k$ that fail in evaluating the diversity patterns [76]. In contrast to taxonomic schemes, the guild approaches employ functional traits and few taxonomic arguments, maximizing the niche context for sorting organisms. The individuals are grouped because they share similar traits and explore similar niches, a scheme that minimizes ecological redundancies. Consequently, guild methods tend to improve the assessment of environmental disturbances on biological communities [9,37,43,72]. Multiple arguments (such as phylogeny, morphological traits, genetic information, behavior, niche occupancy and ad hoc information) can be addressed together to elaborate constraints and produce distinct a priori hypotheses, and they are compared next in accordance with MaxEnt principles [13,34,35]. The arrangement that better achieves the thermodynamic properties defines the most plausible granular scheme. However, the approach also brings important concerns. For instance, knowledge of tens, hundreds, or sometimes thousands of distinct organisms is a prerequisite for accurate results using guilds. Beyond the necessity of specialists, this implies some subjectivity in selecting the traits and respective weights (if weights are addressed). Finally, guilds are considered for particular groups of organisms and under specific situations (e.g., tree species in forests, [...] in soil, fishes in lakes, and so on), and they cannot be readily inferred for wide purposes. Therefore, the opinion and context may affect the evaluation of probabilities $p_k$, the values assumed by the diversity indices, and the subsequent conclusions.

Notwithstanding the importance of the assortment of organisms, we emphasize here that the granular scheme is only one side of the problem for the accuracy of biodiversity assessment. We claim that ecologists should address further sources of data variability before defining the best assortment. Ecological systems admit complex spatial-temporal dynamics (many of which are poorly understood), which finally drive the presence and quantities of organisms. Beyond environmental influences, such dynamics derive from intricate food webs and interaction networks, which finally frame the biological organization in the ecosystems. The consequence is a mutual and non-trivial dependence of relative quantities. Because ecological information is generally acquired through field observations and samplings, fluctuations in relative abundances may accrue relevant "noise" in empirical data [45,51], leading to uncertain probabilities $p_k$ for diversity indices. Consequently, the reproducibility of ecological experiments remains a topic of scientific concern [7].

Taking these arguments into consideration, we conjecture that insights taken from non-equilibrium systems [19,40,52,80] could be adapted to a static approach for reducing the effects of short-range fluctuations on observed quantities. In this paper, we explore the inherent inaccuracy of diversity indices as a consequence of a poor interpretation of raw abundances when used to estimate probabilities. We propose a new estimator that is able to separate data variations produced by entropic and non-entropic drivers, assuming samples as stationary (or as quasi-stationary) states. The approach is demonstrated to optimize the results obtained by diversity indices, making them reliable tools for assessing biological organization. The additive and extensive thermodynamic properties of biological communities are recovered from effective abundances, thus allowing a broad range of inferences about time, space, and external influences since the adequate context is considered. Extensive tests with empirical and synthetic data remove any doubt regarding the applicability and reliability of the new estimator of probabilities. This new estimator of probabilities represents a shift in our understanding about abundances, which is taken as the result of interacting compartments rather than independent ones. Our rationale, formulation and interpretations broadly match the framework presented by Würtz and Annila [80], with differences for the general interpretation and mainly for practical application. We discuss the phenomenological process behind the quantitative changes in the data structure, as well as the potential of this approach to answer classic problems concerning biodiversity patterns.

The remainder of this paper is organized as follows. In Sec. II, we present the rationale used to obtain an alternative estimator of probabilities, which enables optimizing the thermodynamic properties in ecological systems. In Sec. III, we use data of ground arthropods sampled in a successional gradient to show that sorting schemes are insufficient for assessing the entropy production in ecological systems, elucidating how effective abundances provide a distinct approach to improve the biodiversity assessment. In the same section, we test the framework in further empirical data, showing that thermodynamic properties are recovered in distinct communities, including [...], arthropods, mammals, fishes and others, that may be used to infer regional diversity from few samples and is minimally affected by the shape of SAD. We finally discuss the implications of our findings and the phenomenological aspects concerning the topic in Sec. IV.

II. ENTROPY OF BIOLOGICAL COMMUNITIES AND ESTIMATORS OF PROBABILITIES

Contextualizing the problem, empirical data are generally obtained from limited samples, and the dynamics driving the spatio-temporal patterns of diversity under in situ conditions are hardly accessible. As distinct influences on relative abundances, on the one hand, we have the phenomena that alter the free energy in ecosystems and directly modify the ecological framework (e.g., [...] suppression, ecological succession, climate changes, and so forth). These phenomena disturb ecological systems and lead to a system reorganization, a consequence of the second law of thermodynamics. We name these phenomena entropic influences, or $\varphi$. On the other hand, [...], [...], seasonality, and so forth may also affect the quantities of organisms, but these are non-entropic influences, or $\theta$. At homeostasis or any stationary state, such phenomena make abundances fluctuate (regularly or not) over time around a fixed point. Despite being thermally neutral, these fluctuations address systematic sub- and/or overestimation of organisms in samples. Consequently, the non-entropic influences impart a high level of uncertainty to $A_k$ values (primarily because of the sampling process) and hinder the assessment of entropy variation. Separating entropic from non-entropic effects on observed abundances is possibly the most important barrier for assessing the effects of environmental changes on biological diversity [11,17,25,32,42,44,71,73,79].

In their theoretical study, Würtz and Annila [80] considered the paths to solve this question. The authors considered ecosystems as systems out of equilibrium, and they interpreted the ecological interactions as chemical potentials. Using [...] as context, the authors explicitly deduced the time and matched the thermodynamic entropy to ecological succession. However, the most important insight was the estimator of probabilities, which is assumed as the result of a product. Unfortunately, this topic has remained poorly explored until now. We develop an independent reasoning that widely converges to a similar mathematical framework. Because the concept was previously proven, we spend less time with justifications in order to focus on the applicability and implications. Our approach also has some differences because we address each sample as a stationary state, a static approach that does not demand that time be explicitly considered [20,21]. In fact, the stationarity addressed here is a condition imposed on the sample, and it is not a necessary condition for the system. The approach strictly focuses on variability in abundances, which for methodological and practical concerns enables wider inference power from $\Delta H$ because time, space or any other driver can be deduced from appropriate experimental designs. With some contextualization, we next present a step-by-step description of the alternative probability estimator $q_k$, which is a function of $A_k$ and $\varphi$ but not necessarily proportional to them.

Following Haegeman and Loreau [33], the Shannon entropy [Eq. (1)] describes the organization of biological communities in a given ecological system. The main aspect that is necessary to obtain a reliable value of $H$ is the proper definition of groups according to which the organisms are classified. As previously mentioned, there are several ways to group organisms in compartments, and we start with the taxonomic view. At a basic granular level, species define the categories, and the abundance $A_k$ represents the number of organisms in the $k$-th species of the system, while the total abundance is $A = \sum_{k=1}^{W} A_k$ and $W$ is the species richness. The ratio

$$p_k = A_k/A \tag{3}$$

estimates the occurrence probability of finding a given element of the $k$-th compartment. Naturally, this definition ensures that $\sum_{k=1}^{W} p_k = 1$.

Despite the intuitive formulation of $p_k$, applying Eq. (3) to Eq. (1) raises several experimental questions. For instance, small changes in abundance $\Delta A_k = \varepsilon_k A$ with $\varepsilon_k \ll 1$, either due to sampling errors or measurements in different ecosystems, produce corrections $\Delta p_k = \varepsilon_k - p_k\left(\sum_{\ell=1}^{W} \varepsilon_\ell\right) + o(\varepsilon^2)$. Since the $\varepsilon_k$ are assumed to be small, corrections to the estimators themselves are also very small. However, they may produce large corrections to the entropy,

$$\Delta H = -\sum_{k=1}^{W} \varepsilon_k \ln p_k - H \sum_{k=1}^{W} \varepsilon_k + o(\varepsilon^2). \tag{4}$$

For clarity, let us consider the particular case of a single variation in abundance, i.e., $\varepsilon_k \equiv \varepsilon$. Under this circumstance, $\Delta H = -\varepsilon \sum_k \ln p_k - \varepsilon W H$. Note that in the first parcel, $\varepsilon \sum_k \ln p_k$, compartments with small probability $p_k \ll 1$ provide considerably larger contributions to $\Delta H$ than compartments with a larger presence in the ecosystem. This result should be compared against compartments with equal probability $p_k = 1/W$, for which $\Delta H = \varepsilon W H - \varepsilon W H = 0$. Therefore, large deviations in $H$ arise due to the estimator Eq. (3), making the index sensitive to [...] dynamics.

Another pressing concern regarding the reliability of Eq. (3) as an ecological index revolves around the organism interactions and their sorting in categories. These interactions are expected to shape the various levels of the biological organization either directly or indirectly. In thermodynamics, even though interactions are known a priori, the sheer amount of components in the system prohibits the exact computation of properties. The chemical potential $\mu$ overcomes this problem by introducing an effective energy change caused by the addition of another particle to the system, including temperature and pressure. In this way, the chemical potential encodes all resulting effects of underlying interactions in the system, with Gibbs free energy as $G = \mu N$, for $N$ identical particles. For ecological problems, the translation is straightforward: we associate a chemical potential $\mu_k \equiv \mu_k(\varphi, A_1, \ldots, A_W)$ to the $k$-th compartment such that the total Gibbs free energy is

$$G = \sum_{k=1}^{W} \mu_k A_k - G_0(\varphi), \tag{5}$$

where $G_0(\varphi)$ depends only on the entropic variable $\varphi$. From Eq. (5), it is clear that the chemical potentials $\mu_k$ represent weights to the compartment occupation $A_k$, from which we infer the estimator

$$q_k = \frac{\mu_k}{\sum_{n=1}^{W} \mu_n}. \tag{6}$$

One reobtains the estimator $p_k$ when $\mu_k \propto A_k$, i.e., the effort required to add another organism to the $k$-th compartment increases linearly with $A_k$.

For more complex environments, we should define or estimate the values $\mu_k$ in some other way. Indeed, from the Maxwell relation $\beta\mu_k = -(\partial H/\partial A_k)$ and $H = -\sum_{k=1}^{W} p_k \ln p_k$ (Eq. (1)) with estimators $p_k = A_k/A$ ($A = \sum_{k=1}^{W} A_k$), one obtains

$$\beta\mu_k^{\mathrm{est}} = \frac{\ln A_k - \ln A}{A} + \frac{H}{A}. \tag{7}$$

It is easy to show that the term $\ln(A_k) - \ln(A)$ in Eq. (7) refers to $H_{\max}$ (maximum entropy possible) only if all relative abundances are the same (Suppl. 1). For ecological systems, further considerations are necessary. The ratio $H/A \ll 1$ for biological communities since $H \sim o(1)$, whereas $A > o(10^3)$. Moreover, $A^{-1} \ln A$ is a constant for fixed $A$ and only shifts the free energy. From these considerations, we have

$$\mu_k \equiv \frac{1}{\beta A} \ln(1 + A_k), \tag{8}$$

which in turn produces our estimator corresponding to the $k$-th compartment

$$q_k \equiv \frac{1}{Z} \ln(1 + A_k), \tag{9}$$

with the partition function $Z = \sum_{j=1}^{W} \ln(1 + A_j)$ guaranteeing that $\sum_{k=1}^{W} q_k = 1$. Comparing distinct estimators and their thermodynamic interpretation, the Gibbs partition function is

$$p_k = \frac{A_k}{A} \equiv \frac{e^{-\beta\varepsilon_k}}{\sum_{n=1}^{W} e^{-\beta\varepsilon_n}}, \tag{10}$$

while $q_k$ is

$$q_k = \frac{\ln(A_k + 1)}{\sum_{n=1}^{W} \ln(A_n + 1)} \equiv \frac{-\beta\varepsilon_k}{\sum_{n=1}^{W} -\beta\varepsilon_n}. \tag{11}$$

The estimator $q_k$ vanishes for vanishing $A_k$, similar to $p_k$. However, $q_k$ exhibits notable differences from $p_k$ in terms of the relative variations $\Delta A_j = \bar{\varepsilon}_j (1 + A_j)$ with $\bar{\varepsilon}_j \ll 1$ (footnote 1),

$$\Delta q_k = \frac{1}{Z}\left(\bar{\varepsilon}_k - q_k \sum_{\ell}^{W} \bar{\varepsilon}_\ell\right) + o(\bar{\varepsilon}^2). \tag{12}$$

Footnote 1: Note the difference between $\varepsilon_k = \Delta A_k/A$ and $\bar{\varepsilon}_k = \Delta A_k/(1 + A_k)$, i.e., $\varepsilon_k/\bar{\varepsilon}_k < 1$.

A brief comparison between Eq. (12) and $\Delta p_k$ reveals an interesting fact. Let us assume that both estimators $q_k$ and $p_k$ share similar numerical values, $p_k \approx q_k$, with constant $\bar{\varepsilon}_k \approx \varepsilon_k = \varepsilon$. These particular constraints are unlikely to naturally occur in uncontrolled environments; they serve only to highlight general properties. For instance, despite the numerical equivalence between the estimators in this hypothetical scenario, the variations $\Delta q_k$ are reduced by a constant factor $Z \sim o(W) > 1$, which mitigates fluctuations of abundances $A_k$. For more realistic systems, we consider the fluctuations $\Delta\varphi$ due to entropic influences:

$$\Delta p_k = \Delta\varphi \left[\frac{1}{A}\frac{\partial A_k}{\partial\varphi} - p_k \sum_{\ell}^{W} \frac{1}{A}\frac{\partial A_\ell}{\partial\varphi}\right], \tag{13a}$$

$$\Delta q_k = \frac{\Delta\varphi}{Z} \left[\frac{\partial}{\partial\varphi}\ln(1 + A_k) - q_k \sum_{\ell}^{W} \frac{\partial}{\partial\varphi}\ln(1 + A_\ell)\right]. \tag{13b}$$

Other than the factor $Z^{-1}$, $\Delta q_k$ and $\Delta p_k$ differ mostly in the way the variations behave: $\Delta q_k$ depends only on relative variations, whereas $\Delta p_k$ requires the variation of $A_k$ with $\varphi$ in addition to the total value $A = A(\varphi)$. Consequently, $q_k$ may produce a better separation of effects from the different compartments in the system when compared to $p_k$ for the same compartmentalization scheme. In other words, the estimator $q_k$ reduces the interdependences among the various compartments. From the relation $q_k \leq A_k/Z$, one obtains the inequality $q_k/p_k \leq (A/Z)$. Therefore, the positive index

$$Q = \frac{1}{W}\,\frac{Z}{A} \sum_{k=1}^{W} \frac{q_k}{p_k} \leq 1 \tag{14}$$

measures the deviation of the probability estimator $q_k$ from the more common estimator $p_k$. This assumption shifts the concept of units in ecological systems, meaning that only exponential variations in relative abundances are properly inferred in terms of entropy variations. Note that our rationale matches the general MaxEnt framework for thermodynamic variational principles [19-21], whereas our partition function $Z$ is similar to that assumed in Würtz and Annila [80]. In fact, Eq. (11) demands deduction and inferences, and we present them in terms of abundances. Eq. (5) indicates that abundances are the result of distinct terms. To clarify this topic, we consider $F_k(\varphi)$ to be the entropic influence on the abundance of the $k$-th compartment, and $\theta_k$ is the non-entropic influence of the $W$ compartments on quantities found in the $k$-th compartment. Then, any entropy production is related to $\Delta F(\varphi)$, while non-entropic effects result from the internal dependence among categories (biological interactions, [...], and so forth).

Applying these terms to abundances, the classic approach assumes that

$$A = \sum_{k=1}^{W} A_k = \sum_{k=1}^{W} F_k(\varphi) \times \theta_k, \tag{15}$$

where $A$ is the result of an arithmetic relation among the $W$ compartments, a framework that makes little sense in terms of dependent compartments. For instance, consider the effects produced by the introduction of an exotic species to a random ecosystem. One does not expect that the addition of $n$ predators will reduce exactly $n$ prey, as assumed by the arithmetic ratio. In fact, a predator consumes several prey, affecting the growth rate of prey because the number of prey also affects the growth rate of predators. Therefore, this mechanism produces a cascade of influences on relative abundances, which are better expressed by multiple dependences [80]. Considering $A$ as the result of a product rather than a sum, Eq. (15) assumes that

$$A = \prod_{k=1}^{W} F_k(\varphi) \times \theta_k. \tag{16}$$

The continuous observation during the stationary state would imply that $\theta_k$ is a function able to describe how $A_k$ fluctuates around the fixed point determined by $F_k(\varphi)$. However, the sampling process makes $\theta_k$ assume discrete values indicating whether $A_k$ deviates from the value produced by $F_k(\varphi)$ at the right moment of the sample. Then, $\theta_k = [\lambda_k F_k(\varphi)]/[F_k(\varphi)]$, where $\lambda_k$, $\forall k$, is a Lagrange multiplier, and

$$A_k = F_k(\varphi) \times \frac{\lambda_k F_k(\varphi)}{F_k(\varphi)}. \tag{17}$$

For $\theta_k = 1$, the influence of the further compartments on the $k$-th is null, whereas for $\theta_k \neq 1$, some influence is detectable.

As categories interact, the quantities in one category are expected to increase at the expense of a decrease in another during the stationary state, maintaining the thermodynamic balance according to the second law. Consequently, all values assumed by $\theta$ alternate around 1 through the $W$ compartments. Concerning Eq. (16), entropic and non-entropic influences on abundances $A_k$ can be isolated at the logarithm scale as

$$\ln(A) = \sum_{k=1}^{W} \ln F_k(\varphi) + \sum_{k=1}^{W} \ln\theta_k. \tag{18}$$

As a consequence of $\theta_k \to 1$, $\sum_{1}^{W} \ln\theta_k \to 0$, and therefore, we have the "effective" abundances $\hat{A}$ as

$$\hat{A} = \sum_{k}^{W} F_k(\varphi) = \sum_{k}^{W} \ln[A_k + 1], \tag{19}$$

where the unit summed to $A_k$ guarantees the relation even when one or more categories are not represented in the sample. This framework states that $\hat{A}$ is a function of "observed" raw abundances $A$, and it exclusively holds the entropic influences on $A_k$. Therefore, it is expected that

$$\Delta\hat{A} \equiv \Delta\varphi. \tag{20}$$

Importantly, as $\theta_k \to 1$, $\forall k$, the average value $\bar{\theta} \approx 1$, and the expression $\sum_{k=1}^{W} \ln\theta_k$ can be rewritten as $W[\ln(W^{-1})]$, making Eq. (18) assume that

$$\tau_W \equiv \ln[A(\varphi)] - W[\ln(W^{-1})] \equiv \left[\sum_{k}^{W} \ln(A_k + 1)\right]^{*}, \tag{21}$$

where $\tau_W$ is the optimized entropy for the granular scheme that produced $W$. The errors associated with Eqs. (19)-(21) reflect whether the assumption $\theta_k \to 1$ deviates from the true distribution produced by $\theta_1, \ldots, \theta_W$, inferring whether the assumed granular scheme deviates from the actual maximum entropy possible for the system. Accordingly, the multiplicative approach and consequent probabilities calculated as $q_k$ offer a robust approach for comparing granular schemes and for assessing the entropy production in ecological systems. Because time is not necessary and thermodynamic properties are recovered, further inferences can be explored, such as the space and consequently the partitioning diversity. In the following, we shall test the plausibility of this rationale using empirical and synthetic data.

III. TESTING THE FRAMEWORK IN EMPIRICAL DATA

Entropy production in soil systems

In the first test with empirical data, we verify whether the aforementioned framework is applicable for practical contexts. For this purpose, we use TGA, which is a data set that contains information of [...] and counts of ground arthropods in a successional gradient of tropical forests. Concurrent to the granular scheme, we aim to evaluate how diversity indices of ground arthropods respond to the entropy production when the classic probabilities, $p_k$, and the new ones, $q_k$, are employed. TGA covers 13 forests in distinct stages of development through a gradient constituted by 10 reforestations and 3 forest remnants. All sites are located in Southeast Brazil and bear a high diversity of tree species (Atlantic forests). Data of plant biomass are removed to infer the entropic influences on the diversity of ground arthropods. Then, samples of tree densities (ind./m²), area of trunks (m²), and litter biomass (kg/m²) are converted to only one variable, $\varphi$, by using principal component analysis [45,59]. $\varphi$ represents a general quantitative descriptor of forest development, and its non-dimensional values stand for the linear coefficients of the eigenvector of the largest eigenvalue (see Supplement 2 for details). Concerning ground arthropods, the input data are counts of ground arthropods sampled in soil and litter during winter and summer, totaling 40 samples per site (see Supplement 2 for details). Part of these data were previously analyzed to explain the colonization of ground arthropods during the development of the forest [57,60]. The organisms are sorted by species (occasionally morpho-species), and data of distinct samples are joined in only one large sample per site. With data of fauna in hand, we consider the species sorting to calculate the number of categories $W$ (species richness), the abundance $A$ and the classic Shannon diversity $H$ (Eq. (1)) for the 13 sites of TGA. Next, we plot respective values against $\varphi$ quantities, and we observe the correlation between them.

Figures 1A-C depict that the variations in the three metrics, namely, $W$, $A$ and $H$, show no correlation with the values of $\varphi$. Consequently, two possible explanations may be assumed. The first one admits no cause-effect relationship between the forest development and the biological organization of ground arthropods. This explanation would imply that the successional process does not produce entropy in litter and soil layers or that ground arthropods are not sensitive to such a process. This hypothesis appears to be implausible due to strong experimental evidence attesting to the contrary [1,4,17,18,27,57,60,64]. The alternative explanation assumes that the cause-effect relationship exists, but $W$, $A$ and $H$ are unable to properly assess entropy production in soil communities. We assume that the second explanation is correct because it is corroborated with experimental evidence.

Figure 1. Relationship between successional process and ecological indices of ground arthropods considering a gradient of thirteen sites ($n = 13$). $\varphi$: indicator of forest development, obtained from a PCA conducted using values of tree [...], tree density and litter biomass. For ground arthropods, $R$: species richness; $A$: total abundance; $H$: Shannon diversity index using species as compartments; $W$: number of guilds according to Parisi et al. [67]; QBS: Qualità Biologica del Suolo, an indicator of soil health based on ground arthropods [67]; $H'$: Shannon index obtained for guilds as compartments; $\tau_W = \ln(A) - W \ln(W^{-1})$, the maximum entropy; $\hat{A} = \sum_{k=1}^{W} \ln A_k$, the effective abundance; and $H_M$: Shannon entropy achieved by probabilities $q_k = \ln(A_k)/\hat{A}$. A-C show that $R$, $A$ and $H$ of ground arthropods are poorly correlated with the forest development. The use of guilds increases the correlation between the number of categories $W$ and the successional stage, as does QBS. However, the use of guilds as compartments produces values of entropy $H'$ that are poorly correlated with the forest development. Finally, the use of $q_k$ probabilities, derived from the effective abundances approach, shows that the forest development affects the organization of ground arthropods, mainly when $q_k$ is considered together with guilds.

There is a clever way to avoid the concerns around $H$ and related indices: sorting organisms with categories other than [...], in which relevant biological and ecological constraints are explicitly addressed. In fact, conceptual and empirical evidence suggests that guild approaches may improve the extraction of ecological information because they fairly reflect the niche occupancy [1,9,61,67]. Ideally, functional approaches would pursue new categories that alter $H$ values and improve the thermodynamic properties. Consequently, a better inference about entropy production for the TGA case is also expected, and we test this hypothesis.

Thus, we sort the arthropods of TGA in guilds following the criteria described in Parisi et al. [67]. Tailored from ground data, this methodology collects informative content about the biological quality of the soil and produces a reliable indicator of soil health, the QBS (Qualità Biologica del Suolo) [5,30,47,61,67]. Accordingly, arthropods are sorted by high-level taxa and morphological traits into taxonomic-functional categories (≈ 28). Each category receives an ad hoc score, the EMI (eco-morphological index), that provides a quantitative inference of how adapted its organisms are to soil niches (Supplement 3). For a random sample, the EMI scores of all represented categories are summed to produce the QBS value. Note that relative abundances show limited importance for the calculation of QBS, and single or multiple occurrences at the same $k$-th category result in equal contributions. Concerning the TGA example, one expects that the number of EMI categories and QBS values increase over the forest development as a consequence of soil maturation [17]. We then conjecture that the EMI categories could also improve the assessment of entropy production in TGA if abundances are used rather than the ad hoc values.

In the following, we refer to the EMI categories as guilds, and we add the prime symbol to differentiate its use (primed) from the use of species (unprimed). For arthropods sorted according to the EMI categories, $W'$ corresponds to the number of guilds, and QBS is the respective index calculated following the approach of Parisi et al. [67]. Using guilds as a granular scheme, we also calculate $H'$, as $p'_k = A'_k/A$, where $A'_k$ is the relative abundance of the $k$-th guild. Then, we plot $W'$, QBS, and $H'$ against the $\varphi$ values, and we compare these results with those obtained for species.

Figures 1D-E depict how the number of guilds $W'$ and the QBS index vary as functions of $\varphi$. The linear fits indicate that both quantities correlate with the forest development, inferring that the successional process affects the organization of soil communities. In fact, guilds join several organisms in a few categories, which naturally reduces the total variance. However, the same result does not hold for $H'$ (Figure 1F). In comparison to $H$ (species), $H'$ reveals that guilds produce negligi[...] making the use of an appropriate estimator of probabilities necessary.

Then, we follow the framework presented in Sec. II, considering abundances as the result of interacting categories. Employing $q_k$ probabilities, we calculate the Shannon index (Eq. (1)) for two granular schemes: species and guilds. We use the notations $H_M$ (species) and $H'_M$ (guilds) to differentiate the indices calculated by $q_k$ from those calculated by $p_k$. The indices $\tau_W$ and $\hat{A}$, predicted by Eq. (19) and Eq. (21), are also calculated, and all of them are then plotted against $\varphi$.

Figure 1G shows that correlations between $\tau_W$ or $\tau'_W$ with $\varphi$ are compatible with the respective correlations observed for $W$ and $W'$, which confirms the gain produced by guilds in comparison to the species scheme (Figures 1A,D,F). Furthermore, the tests indicate that the $q_k$ probabilities always attain better inference about $\varphi$ than the $p_k$ (Figures 1H-I). For both species and guilds, the effective abundances $\hat{A}$ are correlated with the values of $\varphi$, with even the granular schemes still being important. Once TGA addresses successional context, guilds naturally produce stronger inference power about entropy production than species. When $q_k$ is employed together with guilds, abundances finally reveal the entropy production in the TGA data set, with $\hat{A}'$ achieving even better results than QBS (Figure 1E,H). As predicted by Eq. (19), $H'_M$ also correlates with $\varphi$, which is evidence that effective abundances allied to guild compartments recover the thermodynamic properties in the TGA case. Such optimization is predicted by Eq. (21), and since it is confirmed, it ensures the replication principle. Because the replication principle is of great importance for thermodynamic inference, we apply further tests to explicitly evaluate this hypothesis.

Recovering the replication principle in soil systems

The replication principle states that diversity indices are functions of abundances [13,15], which requires that extensive and additive thermodynamic properties be ensured. Concerning the extensive property, it is expected that an increase of abundance leads to a monotonic increase in the number of compartments and in entropy values. For the additive property, it is expected that the entropies of samples considered separately are equivalent to the entropies of samples considered together, implying the independence of categories. Subsequently, we evaluate the reliability of such relationships in the TGA example.
ble improvements in the assessment of entropy produc- Objectively, we test whether qk can actually optimize tion. Furthermore, the H0 behavior is in clear contrast the extensive property in the TGA case by evaluating to that of QBS, although both metrics address the same whether the increase of Aˆ leads to a monotonic increase granular scheme. How is this result possible? Because in W , HM and τW . Such correlations are expected only QBS computes ad hoc quantities and H0 computes abun- when the replication principle is ensured. We also test 0 dances, it is clear that the estimator pk makes H inef- the influence of granular schemes by evaluating the curves ficient. Therefore, the granular schemes alone cannot obtained for species and guilds separately. As predicted 0 assess the entropy production in systems such as TGA, in Sec. II, Figure 2 depicts that the values of HM and HM bioRxiv preprint doi: https://doi.org/10.1101/188813; this version posted September 14, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 8 linearly correlate with respective effective abundances Aˆ erty states that HA + HB ≡ HAB, where A and B are and Aˆ0, the number of categories W and W 0, and maxi- samples, while AB is their union. This would be final 0 mum entropy τW and τW . These results explain the bet- evidence that qk ensures the replication principle (ther- ter inferences about entropy production depicted in Fig- modynamic properties) in ecological systems. To avoid ure 1, suggesting that the qk probabilities clearly achieve any influence of the entropy production on TGA data, the extensive properties in the TGA case. Furthermore, which address correlations among samples, we develop a the results clearly indicate that the optimization of the strategic test to properly evaluate the additivity in this extensive property does not necessarily depend on the data set. 
granular scheme, which also confirms the predictions of The test consists of a shuffling routine followed by sys- Sec. II. tematic comparisons between the variances (σ2) of sam- ple data separately with the variances of the same sam- ple data together.The sites of TGA (m = 13) are la- beled as Ua, Ub,...,Um and are used to create c clusters C1, ..., Cc. Each cluster combines arthropod counts of four random sites, such as C1 = (Ua ∪ Ub ∪ Uc ∪ Ud), C2 = (Ua ∪ Uc ∪ Ud ∪ Ue), and so on. Next, a second set of x clusters is created, now combining the clusters C1, ..., Cm two by two, such as C(1,2), ..., C(x−1,x), where Ci,j = (Ci ∪ Cj). With this shuffled structure in hand, we calculate the variances (σ2) of respective probabili- ties, such as σ2 , σ2 , σ2 , and so on, for the first group C1 C2 C3 of clusters, and σ2 , σ2 , σ2 , and so forth for C(1,2) C(1,3) C(2,3) the second one. Because the additive property explicitly concerns the independence of compartments, the granu- lar scheme is of great importance. Then, we propose four distinct treatments to evaluate the subject:

• Treatment 1: W for categories and pk for probabil- ities;

• Treatment 2: W for categories and qk for probabil- ities;

0 • Treatment 3: W for categories and pk for proba- bilities;

0 • Treatment 4: W for categories and qk for proba- bilities. Briefly, each treatment evaluates the linear correlation between σ2 + σ2 ≡ σ2 . A good linear fit indicates Ci Cj C(i,j) that the treatment allows achieving the additive prop- erty. Conversely, the lack of a correlation suggests that the thermodynamic property is poorly achieved. Figures 3A-D depict the results. Figure 3A indicates that, for Treatment 1, the sum of Figure 2. Thermodynamic relations for ecological indices of variances is not compatible with the variance of distinct ground arthropods sampled in secondary tropical forests when communities together. Therefore, the species scheme and using probabilities q . A: arthropods sorted as species; B: k p probabilities cannot ensure the additive property. Fig- arthropods sorted as guilds67. Div.: diversity indices; W : k ures 3B and 3C indicate that Treatments 2 and 3 par- number of categories; Aˆ: effective abundances; τW : maxi- mum entropy for W compartments and A. The figure shows tially recover the additive property, implying that the that probabilities qk can recover the extensive and additive granular scheme and the estimator of probabilities con- properties of ground arthropod communities. Under such cir- sist of distinct approaches to reduce data variability in cumstances, the thermodynamic relationships are guaranteed, ecological systems. Finally, Treatment 4 (Figure 3D) thus allowing the prediction of how changes in the number of shows that the guild scheme and qk probabilities em- organisms are related to the number of compartments, as well ployed together produce an excellent linear fit between as the system organization. σ2 + σ2 ≡ σ2 , indicating that the additive prop- Ci Cj Ci∪Cj erty is completely recovered. Next, we explicitly evaluate whether the additive prop- Once more, the tests support the results depicted in erty is also recovered by qk in TGA. The additive prop- Figure 1 and the predictions of Sec. II. 
Therefore, the bioRxiv preprint doi: https://doi.org/10.1101/188813; this version posted September 14, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 9

56 tests confirm the independent effect of qk probabilities steppes from Oklahoma ; understory plant communities on the optimization of thermodynamic properties of soil of temperate forests from Canada66; and tree communi- 69 systems. Although the qk probabilities ensure the repli- ties of tropical forests from India . The experimental cation principle in the TGA case, an inference about gen- conditions also greatly vary among studies. There are eral ecological systems requires further tests. Thus, we times series, simple samples in spatially explicit condi- subsequently evaluate the reliability of the approach in tions, counting of individuals, biomass estimation, fre- further contexts. quencies, and so forth. We provide a list of the main characteristics of these data sets in Supplement 3, but we encourage a search in the original papers for more details (all of them are available as open source data). This broad range of experimental conditions is suit- able for an important propose: assessing and comparing the replication principle and the thermodynamic infer- ence produced by the classic and the new estimators of probabilities. Note that we are not attempting to repli- cate the original experiments using a new tool, or evaluat- ing the reliability of the previous findings. In fact, several details about the original experimental designs and infer- ences about external influences are not addressed here, because they are pointless for our propose. Accordingly, climate, human manipulation, entropy production, gran- ular schemes, and other aspects described in these data sets are here considered as only sources of entropy varia- tion. Our only intention is to check the consistency of the Figure 3. Comparing the variances (σ2) of probabilities estimator qk concerning the replication principle, observ- obtained using distinct approaches for quantities of ground ing how the diversity indices fit variations in organism arthropods found in secondary Atlantic forests. 
Thirteen sites quantities. Based on the consistency of thermodynamic showing different levels of development are used to create mul- laws, it is expected that the data variability and the dis- tiple arrangements (n=63). A: species compartments and pk; tinct conditions considered by these data sources mini- the sum of variances shows a poor linear correlation with the mally affect the general patterns and that the abundance variance of sums (R2 < 0.01; p = 0.593), indicating that addi- variability present in each data set would provide suffi- tive and extensive properties of the systems are not achieved. 67 2 cient information to observe the thermodynamic prop- B: guilds and pk (R ≈ 0.90; p << 0.001); C: species and 2 erties. Finally, we aim to evaluate the applicability of qk (R ≈ 0.55; p < 0.001). In both cases, the correlation the framework described in Sec. II as a general tool for between the sum of variances and the variance of sums was improved in comparison to A, suggesting that the thermody- assessing biological diversity in ecological systems. namic properties are better assessed. D: guilds and qk; the Accordingly, for a data set S containing n samples, we excellent linear fit (R2 ≈ 0.98, p << 0.001) indicates that the simply calculate the diversity indices of each sample, and additive and extensive properties of the ecological system are we analyze the correlation among distinct indices under recovered by addressing both approaches together. the conditions found in S. Because we intend to prove the optimization of the replication principle, we evaluate the correlation between the same diversity indices con- sidered by Hill numbers 13,15,36, although not necessarily Beyond soil systems: the generality of the result obtained as Eq. 2. For pk, we observe the values of W , H and by qk D as functions of A, whereas for qk, we observe W , HM and DM as functions of Aˆ. Furthermore, we assess the Notwithstanding the predictions of Sec. 
II and evidence correlation between W and H and HM . In contrast to from soil systems, we hypothesize that the optimiza- the tests with TGA, only one granular scheme is assessed tion of thermodynamic properties by qk cannot hold for per data set. Based on MaxEnt principles, the best es- broader ecological contexts. Hence, we evaluate the repli- timator is considered to be the one capable of producing cation principle in general situations, taking ten data sets index values that optimize the inference produced by rel- available on-line and calculating the entropy produced by ative quantities. The results are depicted in Figure 4 and the estimators pk and qk. These data sets concern a broad Figure 5. range of biological communities in worldwide ecosystems, Concerning the ten data sets, the use of pk almost al- including mammals, perennial plants and ants from US ways produces values for the diversity indices (W , H, and deserts3,62; fishes and benthic invertebrates from tem- D) that poorly correlate with abundances A. Even the perate lakes48,49; ground arthropods sampled in Italy’s correlation between W and H is important only for one organic and conventional crops61; butterflies living in case (Figure 556 ). Therefore, the test corroborates the open areas from Colorado, US63; plant species growing in poor inference provided by the classic estimator of prob- bioRxiv preprint doi: https://doi.org/10.1101/188813; this version posted September 14, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 10 abilities, which assumes a priori the biological categories as independent compartments. Conversely, probabilities qk always achieve strong correlations between effective abundances and the ecological indices or between W and HM (Figures 4 and 5). Note that probabilities qk are ap- propriate even for the case in which pk probabilities are satisfactory. 
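The per-sample comparison described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the `shift` parameter and the sample values are hypothetical, the Gini-Simpson index is assumed to take its usual form $1 - \sum_k p_k^2$, and the estimator $q_k = \ln A_k / \sum_k \ln A_k$ is taken from the figure captions.

```python
import math

def p_probabilities(abundances):
    """Classic estimator p_k = A_k / A (raw relative abundances)."""
    total = sum(abundances)
    return [a / total for a in abundances]

def effective_abundance(abundances, shift=0.0):
    """Effective abundance A_hat = sum_k ln(A_k + shift).

    `shift` is a hypothetical convenience parameter: 0 gives ln(A_k),
    1 gives the ln(A_k + 1) variant. Counts must satisfy A_k + shift > 0.
    """
    return sum(math.log(a + shift) for a in abundances)

def q_probabilities(abundances, shift=0.0):
    """Log-based estimator q_k = ln(A_k + shift) / A_hat."""
    a_hat = effective_abundance(abundances, shift)
    if a_hat <= 0:
        raise ValueError("effective abundance must be positive")
    return [math.log(a + shift) / a_hat for a in abundances]

def shannon(probs):
    """Shannon entropy; zero probabilities contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def gini_simpson(probs):
    """Gini-Simpson index in its usual form (an assumption here)."""
    return 1.0 - sum(p * p for p in probs)

def pearson(xs, ys):
    """Pearson linear correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data set S: each inner list holds the abundances of one sample.
samples = [
    [12, 3, 45, 7],
    [150, 20, 8, 8, 2, 60],
    [5, 5, 5],
    [300, 40, 12, 9, 6, 4, 2, 2],
    [80, 75, 60, 2, 2],
]

raw_total, eff_total, h_p, h_q = [], [], [], []
for sample in samples:
    raw_total.append(sum(sample))                    # A
    eff_total.append(effective_abundance(sample))    # A_hat
    h_p.append(shannon(p_probabilities(sample)))     # H
    h_q.append(shannon(q_probabilities(sample)))     # H_M

print("r(A, H)       =", round(pearson(raw_total, h_p), 3))
print("r(A_hat, H_M) =", round(pearson(eff_total, h_q), 3))
```

With real data, each inner list would be replaced by the counts of one sample, and the same correlations could be computed for $W$ and $D$.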
Therefore, the results corroborate the evidence provided by the TGA example, as well as the prediction of Sec. II. Given the broad range of conditions covered by this test, the results depicted in Figures 4 and 5 stress the value of the framework presented here for the assessment of diversity patterns in ecological systems. This new and strong evidence reinforces that effective abundance, as well as the entire conceptual framework behind it, is a trustworthy way to recover the thermodynamic entropy of ecological systems, which is a substantial gain in the inference power produced by diversity indices based on Boltzmann-Gibbs-Shannon entropy. In fact, the approach is so consistent in all experimental situations that we proceed to look for additional inferences, as follows.
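The additivity test applied earlier to the TGA clusters can also be sketched in Python. This is an illustration under stated assumptions, not the authors' routine: the site counts are randomly generated placeholders, the union of samples is assumed to add the counts of each category, the variance is assumed to be the population variance of the probability vector of a cluster, and the $\ln(A_k + 1)$ form of $q_k$ is used so that singletons do not yield zero weights.

```python
import itertools
import math
import random
import statistics

def merge(samples):
    """Union of samples: counts of the same category are added (assumption)."""
    merged = {}
    for sample in samples:
        for category, count in sample.items():
            merged[category] = merged.get(category, 0) + count
    return merged

def p_probs(counts):
    """Classic estimator p_k = A_k / A."""
    total = sum(counts.values())
    return [c / total for c in counts.values()]

def q_probs(counts):
    """Log-based estimator q_k = ln(A_k + 1) / sum_k ln(A_k + 1)."""
    logs = [math.log(c + 1) for c in counts.values()]
    total = sum(logs)
    return [x / total for x in logs]

def additivity_pairs(sites, estimator, n_clusters=8, cluster_size=4, seed=0):
    """Return (var(C_i) + var(C_j), var(C_i U C_j)) for every pair of clusters."""
    rng = random.Random(seed)
    # First set: each cluster joins the counts of `cluster_size` random sites.
    clusters = [merge(rng.sample(sites, cluster_size)) for _ in range(n_clusters)]
    pairs = []
    # Second set: clusters combined two by two.
    for ci, cj in itertools.combinations(clusters, 2):
        separate = (statistics.pvariance(estimator(ci))
                    + statistics.pvariance(estimator(cj)))
        together = statistics.pvariance(estimator(merge([ci, cj])))
        pairs.append((separate, together))
    return pairs

def pearson(xs, ys):
    """Pearson linear correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: 13 sites, each with counts for 12 of 30 possible categories.
data_rng = random.Random(42)
sites = [
    {f"cat{k}": data_rng.randint(1, 200) for k in data_rng.sample(range(30), 12)}
    for _ in range(13)
]

for name, estimator in (("p_k", p_probs), ("q_k", q_probs)):
    separate, together = zip(*additivity_pairs(sites, estimator))
    print(name, "Pearson r:", round(pearson(separate, together), 3))
```

The four treatments would correspond to running this comparison with species or guild counts as the dictionary keys and with `p_probs` or `q_probs` as the estimator; the random placeholder data are not expected to reproduce the published fits.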

Probabilities $q_k$ as a tool to link the partitioning of diversity

The examples presented above provide strong evidence supporting the use of $q_k$ probabilities to achieve the thermodynamic properties in ecological systems. In turn, we conjecture that $q_k$ probabilities can also be employed to solve an old question in ecology: the partitioning of biodiversity [24]. Briefly, the problem consists of whether the diversity of local samples represents the entire diversity found in the region. The local diversity is named α-diversity, the regional diversity is named γ-diversity, and their relationship is molded by the β-diversity. This important subject has generated debate among ecologists for decades and still lacks a consensus, with dozens of frameworks already proposed [24]. Because probabilities $q_k$ optimize the extensive properties in general ecological systems, we conjecture that an appropriate experimental design could reveal how diversity patterns are affected by the sampling area. If the correlation between sampling area and $H_M$ is confirmed, then the $q_k$ probabilities would provide the basis for definitively solving the dilemma, at least under the thermodynamic perspective.

Hence, we use a data set that contains counts of vascular plants, where the samples are taken in an experimental design with spatial correlation [65]. This data set considers a large parcel of 256 m × 256 m, partitioned into 256 samples of 16 m × 16 m arranged side by side, as in a square lattice. We calculate $A$, $\hat{A}$, $W$, $H_{max}$, $H$ and $H_M$ for one parcel alone, then for two parcels together, and then for three parcels together, and so on, until all 256 of the parcels are addressed together. We plot the respective values against the respective area (m²) represented by the data. For extensive systems, a monotonic increase of all indices is expected. Figure 6 shows that the abundances, effective abundances and the number of species monotonically increase as a function of the sampling area, as expected for extensive systems. The same holds for $H_{max}$ and for $H_M$, but not for $H$. In fact, the entropy calculated by $p_k$ produces an anomalous behavior, suggesting that entropy would decrease despite the increase in sampling area. Therefore, no doubts exist that $q_k$ can actually recover the extensive properties of ecological systems and can be used to link γ-diversity to α-diversity only using a scale correction, while $p_k$ is not a proper metric. Naturally, β-diversity is a function that links local and regional diversity, and it can unequivocally predict how much of the regional diversity is represented in a set of samples.

Figure 4. Relationship between quantities of organisms and entropy measures according to distinct estimators of probabilities. $A$: abundances; $\hat{A}$: effective abundance; $W$: number of compartments (species, genera, guilds, and so forth); black diamonds represent the number of compartments $W$; gray diamonds represent the Shannon entropy; and dark red asterisks represent the Gini-Simpson asymmetry $D$. On the left, the indices are calculated for probabilities taken as $p_k = A_k/A$, and in the middle, by probabilities as $q_k = \ln A_k / \sum_{k=1}^{W} \ln A_k$. On the right, black triangles represent the $H$ obtained by $q_k$, while gray triangles represent the $H$ obtained by $p_k$. Accordingly, probabilities $q_k$ improve the maximum entropy for distinct biological communities, revealing their additive and extensive properties. In lines: 1st: Mammals from Utah, US [3]; 2nd: Ants from Chihuahuan Desert, Arizona, taken by baits [62], n = 135, W = 20; 3rd: Benthic invertebrates from North Lake region [48], n = 30, W = 19; 4th: Fishes from temperate lakes of North America taken by electrofishing [49], n = 35, W = 27; 5th: Butterflies from open areas of Colorado, US [63], n = 66, W = 58. Art: oogazone.com.

Figure 5. Relationship between quantities of organisms and entropy measures according to distinct estimators of probabilities. $A$: abundances; $\hat{A}$: effective abundance; $W$: number of compartments (species, genera, guilds, and so forth); black diamonds represent the number of compartments $W$; gray diamonds represent the Shannon entropy; and dark red asterisks represent the Gini-Simpson asymmetry $D$. On the left, the indices are calculated for probabilities taken as $p_k = A_k/A$, and in the middle, by probabilities as $q_k = \ln A_k / \sum_{k=1}^{W} \ln A_k$. On the right, black triangles represent the $H$ obtained by $q_k$, while gray triangles represent the $H$ obtained by $p_k$. Accordingly, probabilities $q_k$ improve the maximum entropy for distinct biological communities, revealing their additive and extensive properties. In lines: 1st: Abundance of soil arthropod groups found in cultivated [...] from Italy [61], n = 28, W = 28; 2nd: Perennial plants from Chihuahuan Desert, Arizona [62], n = 42, W = 59; 3rd: Vegetation cover of grass and shrub species in prairie steppes from Oklahoma, US [56], n = 60, W = 320; 4th: Semi-quantitative crown cover from trees in the Lac Croche temperate forest, Quebec, Canada [66], n = 40, W = 72; 5th: Basal area of trees in tropical forests from India [69], n = 46, W = 100. Art: oogazone.com.

Figure 6. Influence of sampling area (m²) on the diversity of plants according to different estimators of probabilities. Div.: diversity metrics; $A$: abundances; $\hat{A}$: effective abundance; $W$: species richness; $H_{max}$: maximum entropy possible for $W$ compartments; $H$: Shannon entropy calculated by $p_k = A_k/A$; $H_M$: Shannon entropy calculated by $q_k = \ln(A_k + 1) / \sum_{k=1}^{W} \ln(A_k + 1)$. The analyses consider the data of Palmer et al. [65]. On the top, the increase in sampling area led to the monotonic increase of abundances, effective abundances and species richness. The bottom shows that $H_{max}$ and the entropy produced by $q_k$ ($H_M$) monotonically increase with the area, whereas the entropy produced by $p_k$ ($H$) presents an unpredicted pattern. Therefore, only $H_M$ achieves the extensive properties of this system.

Underlying ecological mechanism

The last doubt regards a possible influence of underlying ecological mechanisms on the inference produced by $q_k$. The forces driving the shape of SADs are poorly known, and the method presented here makes no a priori inferences about them. In fact, distinct mechanisms can produce SADs with different shapes, which could carry consequences for the assessment of entropy in biological communities [68]. Consequently, we test whether $q_k$ probabilities are sensitive to the SAD shape using synthetic data.

We employ computational algorithms to generate data distributed as four probability density functions (PDFs) [31,70] for our treatments, including a) Poisson, b) exponential, c) lognormal, and d) power law. In each case, we produce n = 50 data vectors ($\vec{V}_1, \ldots, \vec{V}_{50}$), where $\vec{V}_j$ represents a "pseudo" biological community found in the j-th site. The length of $\vec{V}_j$ denotes $W_j$, while its entries are the abundances $A_1, A_2, \ldots, A_{W_j}$. In all treatments, we intentionally produce "pseudo" communities whose species richness $W_j$ varies according to a Poisson distribution (Λ = 35). Furthermore, we monotonically increase the parameter values of the generator functions, guaranteeing that all the treatments contain some entropy variability (see Supplement 4 for details). Similar to the tests with empirical data, we compare the correlation between diversity indices produced by $p_k$ and $q_k$ in accordance with thermodynamic predictions.

Briefly, the results obtained for synthetic data (Figures S1-S4) show that probabilities calculated as $p_k$ achieve extensive properties for systems whose SADs are distributed as a Poisson or exponential PDF. However, $p_k$ was hardly efficient for SADs distributed as lognormal or power law functions. Conversely, probabilities calculated as $q_k$ optimized the thermodynamic properties in all treatments. Therefore, the use of $q_k$ probabilities is demonstrated to be robust even when data variability is large, as in the lognormal and power law cases, and the optimization of entropy is hardly sensitive to the underlying mechanisms driving the SADs.

IV. DISCUSSION

We present an important advancement for the assessment and monitoring of biodiversity. In fact, our framework does not develop a new entropy or create a new diversity index. Rather, one simply employs a more appropriate estimate of abundances in well-known concepts and formulations. Then, the novelty concerns which quantities can actually recover the thermodynamic properties of ecological systems. The several tests presented here indicate that our approach maximizes the replication principle, providing a more precise measure of order-disorder patterns of biological diversity. The potential update in the ecological discussion is substantial because several themes showing conflicts between theoretical predictions and empirical data can now be reviewed without further samplings. Next, we discuss some particular implications.

The most important aspect of our findings concerns the estimator of probabilities. The approach proposed here is highly logical in biological terms. The quantitative dependence among categories is a well-known aspect of any ecological system that has remained poorly understood until now. Then, the major conceptual shift occurs for the calculation of probabilities and quantitative interpretations. The dependence among compartments assumes multiplicative processes rather than additive ones and yields important consequences for diversity indices and their inference power. For instance, only exponential increases in the number of organisms (or biomass) produce detectable variations in terms of entropy. Second, the [...] of a new category in ecological systems (e.g., a new guild) depends on the geometric increase of the entire system. Such deductions appear to explain the general interpretation of biomass pyramids [77], for which distinct trophic levels show a geometric dependence. Further evidence comes from the effects produced by the inclusion/exclusion of species in natural ecosystems, such as in biological invasions. Such events can affect the entire community in non-trivial manners, which cannot be explained by simple additive processes [22,46].

Our tests demonstrate that the classic estimator $p_k$, which relies on additive processes, yields weak results when assessing diversity in empirical data. The inefficacy of $p_k$ is not new in ecological discussions, but the consequences of its inaccuracy have not received suitable attention. We wonder how many of the conflicting aspects concerning biological diversity are in fact consequences of misinterpretations produced by this problematic estimator. In our tests, we include some important topics that exemplify our concern: the thermodynamic inference provided by diversity indices, the maximum entropy produced by distinct granular schemes, the assessment of entropy production in changing ecosystems, and the explicit relationship between local and regional diversities, which leads to the partitioning of diversity. All these topics represent a large part of the long-term discussion in ecology, and the estimator $q_k$ appears to offer a potential solution for several of them. Therefore, we claim that the estimator $p_k$ should be definitively left aside in favor of the adoption of $q_k$ and that the topics mentioned here be reassessed under this new approach.

The mathematical formulation proposed here follows well-established concepts and methods used to assess non-equilibrium systems [19–21], which are adapted for a static approach. In fact, the probabilities and interpretations presented here match those proposed for ecological contexts in Ref. [80], which surprisingly has remained poorly explored until now. In comparison with the previous findings, in which time is an inherent aspect [80], the static approach here takes into account that each sample is being interpreted as a stationary (or quasi-stationary) state. Note that the hypothesis of stationarity is fragile under a more rigorous interpretation but that this barely affects the accuracy of the calculation or the practical implications of the results. In this sense, our framework is an additional step in the direction of thermodynamic inferences, concepts, and views for wider ecological scopes. Once the diversity indices can produce trustworthy metrics, time, space or any other variable of interest can be experimentally contextualized and its phenomenological effects deduced from the $\Delta H$ obtained from samples.

Therefore, the importance of $q_k$ is mostly practical and corroborated by empirical data. The framework is simple and is in accordance with thermodynamic rules, and it does not require additional theoretical assumptions and new formulations. The tests with empirical data reveal how resilient the approach is to variations of granular schemes and that it is applicable to a broad range of ecological conditions. In fact, our set of tests could not find bounds for the use of $q_k$, which is another highlight. However, the clearest advantage concerns the inference power and the prediction ability produced by the new estimator, which matches the replication principle. Because $H_M$ uses a formulation similar to that used by the classic $H$, the approach shows another advantage in comparison to other recently proposed methodologies [8,16]: as previously mentioned, $H_M$ can readily be employed to review previous studies without further field work. In fact, the reproducibility of results in ecological experiments is currently in question [7], and some biodiversity studies showing conflicting results can now be reevaluated without further data samplings.

As demonstrated, the thermo-statistical interpretation achieved by $q_k$ in ecological systems provides the basis for deep advancements in the knowledge about biodiversity [12,14,24,38]. As we demonstrated here, α-, γ-, and β-diversities are perfectly inferred by the thermodynamic inferences. Because entropy and the number of categories scale with effective abundances, few samples can be used to estimate the regional pool of species, as well as the total and relative abundances expected for the entire region. The same is applicable to studies searching for adequate schemes for sorting organisms, as we show for the TGA example (species × guilds). Our tests are not extensive, and we argue that several other tools and methods addressing the MaxEnt principles [33–35] can take advantage of the use of $q_k$. This potential deserves particular attention in future studies.

Phenomenological process behind the data transformation

Concerning the phenomenological implications of the new estimator, an important question arises: why does the logarithm of abundances better achieve the thermodynamic properties of biological communities? To answer this question, we need to interpret ecological patterns under a statistical view. The asymmetry of SAD curves is considered to be a universal pattern of ecological systems. Although the underlying mechanisms that produce SADs are still debatable, they are certainly related to the unbalanced fitness of species and the consequent number of respective descendants. As the logarithm function asymmetrically affects relative quantities, it is natural that log-transformed proportions are less uneven than raw proportions. This principle is well known from signal analysis, where mel frequencies [28,75] are used to improve the signal-to-noise ratio and provide a clearer human interpretation of sound waves. In fact, humans can only distinguish great variations in sound waves, while small oscillations are imperceptible. This concept is also used in statistics, where the Box-Cox techniques [6] are used to manipulate relative frequencies in random variables and generate Gaussian PDFs from data distributed according to another function. Similar to the TGA case, the entropic contribution for the formation of polymer chains is separated from non-entropic influences using a similar framework [29]. Therefore, it is clear that the geometric symmetry for relative proportions is not producing arbitrary results and only reflects the quantitative effects of interacting compartments on relative abundances. Therefore, the mechanism proposed here is known in other disciplines, and it is able to reduce data noise and produce a new probability distribution for a distinct physical system. In this sense, we also show its applicability for ecological systems. Compared with generalized entropy [78], the assumptions of the use of $q_k$ do not require a new granular scheme [39] because only the quantitative signal inferred by the observables is considered.

One important aspect addressed here remained underexplored: the underlying mechanism driving the shape of the SAD curves. The shape of SAD curves and its relation with the central limit theorem [23] can be explored to deduce the ecological dynamics. The theorem states that additive processes would result in SADs distributed as Gaussian distributions or exponential distributions (if constraints are addressed), while multiplicative processes would result in lognormal distributions. Obviously, the theorem predicts such patterns as opposite limit cases when realizations are massive, but large spectra can be observed for limited sampling effort. Accordingly, the empirical evidence clearly shows that SAD curves are almost always distributed as a lognormal PDF, occasionally as a power-law PDF, but almost never as PDFs such as Gaussian or exponential. For the lognormal cases, we have a clear indication of multiplicative processes during the pattern formation. However, distinct processes related to sampling effort can produce a power law from data that are actually lognormally distributed [58,81]. Therefore, even when the power-law hypothesis is the best approach to describe the SAD curves, the evidence points in the direction of multiplicative processes molding abundance, which should be explored in favor of defining the underlying mechanisms driving the biological diversity.

ACKNOWLEDGEMENTS

We are grateful for C.R.F. Granzotti's comments during the manuscript preparation. Thanks to E. Stanley and NTL-LTER for permitting us to use their data sets. This work was supported by the São Paulo Research Foundation (FAPESP) Grant No. 2013/06196-4.

REFERENCES

[1] Andersen, A. N. 1997. Using ants as bioindicators: multiscale issues in ant community ecology. Conservation Ecology 1:1–8. URL http://www.consecol.org/vol1/iss1/art8/.
[2] Aronson, J., S. J. Milton, and J. N. Blignaut. 2007. Restoring natural capital: science, business and practice. Island Press.
[3] Bartel, R. A., F. F. Knowlton, and L. C. Stoddart. 2005. Mammal Abundance Indices in the Northern Portion of the Great Basin, 1962-1993. Ecology 86:3130–3130. URL http://dx.doi.org/10.1890/04-1607.

4 Birkhofer, K., C. Dietrich, K. John, Q. Schorpp, A. S. Zaitsev, and V. Wolters. 2016. Regional Conditions and Land-Use Alter the Potential Contribution of Soil Arthropods to Ecosystem Services in Grasslands. Frontiers in Ecology and Evolution 3:150. URL http://journal.frontiersin.org/article/10.3389/fevo.2015.00150.
5 Blasi, S., C. Menta, L. Balducci, F. D. Conti, E. Petrini, and G. Piovesan. 2013. Soil microarthropod communities from Mediterranean forest ecosystems in Central Italy under different disturbances. Environmental Monitoring and Assessment 185:1637–1655. URL http://dx.doi.org/10.1007/s10661-012-2657-2.
6 Box, G., and D. Cox. 1964. An analysis of transformations. Journal of the Royal Statistical Society, Series B 26:211–252. URL http://asa.scitation.org/doi/10.1121/1.1915893.
7 Bruna, E., R. Chazdon, T. Errington, and B. Nosek. 2017. Reproducibility & repeatability in tropical : a call to repeat foundational studies. OSF Preprints pages 1–12. URL file:///C:/Users/PC/Downloads/Bruna_etal_RPTB_Commentary_8may2017%20(1).pdf.
8 Butturi-Gomes, D., M. P. Junior, H. C. Giacomini, and P. D. M. Junior. 2014. Computer intensive methods for controlling bias in a generalized index. Ecological Indicators 37:90–98. URL http://dx.doi.org/10.1016/j.ecolind.2013.10.004.
9 Cadotte, M. W., K. Carscadden, and N. Mirotchnick. 2011. Beyond species: functional diversity and the maintenance of ecological processes and services. Journal of 48:1079–1087. URL http://dx.doi.org/10.1111/j.1365-2664.2011.02048.x.
10 Cernansky, R. 2017. The biodiversity revolution. Nature 546:22–24. URL https://www.nature.com/polopoly_fs/1.22079!/menu/main/topColumns/topLeftColumn/pdf/546022a.pdf.
11 Chalmandrier, L., T. Munkemuller, V. Devictor, S. Lavergne, and W. Thuiller. 2015. Decomposing changes in phylogenetic and functional diversity over space and time. Methods in Ecology and Evolution 6:109–118. URL http://dx.doi.org/10.1111/2041-210X.12297.
12 Chao, A., C.-H. Chiu, and T. Hsieh. 2012. Proposing a resolution to debates on diversity partitioning. Ecology 93:2037–2051. URL http://dx.doi.org/10.1890/11-1817.1.
13 Chao, A., C.-H. Chiu, and L. Jost. 2014. Unifying Species Diversity, Phylogenetic Diversity, Functional Diversity, and Related Similarity and Differentiation Measures Through Hill Numbers. Annual Review of Ecology, Evolution, and 45:297–324. URL https://doi.org/10.1146/annurev-ecolsys-120213-091540.
14 Chao, A., C.-H. Chiu, and L. Jost. 2016. Statistical challenges of evaluating diversity patterns across environmental gradients in mega-diverse communities. Journal of Vegetation Science 27:437–438. URL http://dx.doi.org/10.1111/jvs.12420.
15 Chao, A., N. J. Gotelli, T. C. Hsieh, E. L. Sander, K. H. Ma, R. K. Colwell, and A. M. Ellison. 2014. Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies. Ecological Monographs 84:45–67. URL http://dx.doi.org/10.1890/13-0133.1.
16 Chao, A., L. Jost, T. C. Hsieh, K. H. Ma, W. B. Sherwin, and L. A. Rollins. 2015. Expected Shannon Entropy and Shannon Differentiation between Subpopulations for Neutral Genes under the Finite Island Model. PLOS ONE 10:1–24. URL https://doi.org/10.1371/journal.pone.0125471.
17 Coleman, D. C., D. Crossley, and P. F. Hendrix. 2004. Fundamentals of Soil Ecology. Second edition. Academic Press, Burlington.
18 de Vries, F. T., E. Thebault, M. Liiri, K. Birkhofer, M. A. Tsiafouli, L. Bjornlund, H. Bracht Jorgensen, M. V. Brady, S. Christensen, P. C. de Ruiter, T. d Hertefeldt, J. Frouz, K. Hedlund, L. Hemerik, W. H. G. Hol, S. Hotes, S. R. Mortimer, H. Setala, S. P. Sgardelis, K. Uteseny, W. H. van der Putten, V. Wolters, and R. D. Bardgett. 2013. Soil properties explain ecosystem services across European land use systems. Proceedings of the National Academy of Sciences 110:14296–14301. URL http://www.pnas.org/content/110/35/14296.abstract.
19 Dewar, R. 2003. Information theory explanation of the fluctuation theorem, maximum entropy production and self-organized criticality in non-equilibrium stationary states. Journal of Physics A: Mathematical and General 36:631–641.
20 Dewar, R., and A. Maritan. 2014. A Theoretical Basis for Maximum Entropy Production. Pages 141–167 in Beyond the Second Law: Entropy Production and Non-equilibrium Systems. Springer.
21 Dewar, R. C. 2014. A general maximum entropy framework for thermodynamic variational principles. Pages 49–72 in AIP Conference Proceedings, volume 1636. URL http://aip.scitation.org/doi/abs/10.1063/1.4903723.
22 Didham, R. K., J. M. Tylianakis, N. J. Gemmell, T. A. Rand, and R. M. Ewers. 2007. Interactive effects of modification and species invasion on native species decline. Trends in Ecology & 22:489–496. URL http://www.sciencedirect.com/science/article/pii/S0169534707002054.
23 Durrett, R. 2013. Probability: Theory and Examples.
24 Ellison, A. M. 2010. Partitioning diversity. Ecology 91:1962–1963. URL http://dx.doi.org/10.1890/09-1692.1.
25 Ettema, C. H., and D. A. Wardle. 2002. Spatial soil ecology. Trends in Ecology and Evolution 17:177–183.
26 Falk, D., M. Palmer, and B. Zedler. 2006. Foundations of . Island Press.
27 Fitter, A. H., C. A. Gilligan, K. Hollingworth, A. Kleczkowski, R. M. Twyman, J. W. Pitchford, and T. M. of the Nerc Soil Biodiversity Programme. 2005. Biodiversity and ecosystem function in soil. 19:369–377. URL http://dx.doi.org/10.1111/j.0269-8463.2005.00969.x.
28 Fletcher, H. 1938. Loudness, Masking and Their Relation to the Hearing Process and the Problem of Noise Measurement. The Journal of the Acoustical Society of America 9:275–293. URL http://asa.scitation.org/doi/10.1121/1.1915893.
29 Flory, P. J. 1993. Spatial configuration of macromolecular chains. Pages 156–177 in T. Frängsmyr and S. Forsén, editors. Nobel Lecture, Chemistry 1971-1980. World Scientific Publishing Co. URL https://www.nobelprize.org/nobel_prizes/chemistry/laureates/1974/flory-lecture.pdf.
30 Galli, L., M. Capurro, C. Menta, and I. Rellini. 2014. Is the QBS-ar index a good tool to detect the soil quality in Mediterranean areas? A cork tree Quercus suber L. (Fagaceae) wood as a case of study. Italian Journal of 81:126–135.
31 Gillespie, C. S. 2015. Fitting Heavy Tailed Distributions: The poweRlaw Package. Journal of Statistical Software 64:1–16. URL http://www.jstatsoft.org/v64/i02/.
32 Gotelli, N. J., G. R. Graves, and C. Rahbek. 2010. Macroecological signals of species interactions in the Danish avifauna. Proceedings of the National Academy of Sciences 107:5030–5035. URL http://www.pnas.org/content/107/11/5030.abstract.
33 Haegeman, B., and M. Loreau. 2008. Limitations of entropy maximization in ecology. Oikos 117:1700–1710. URL http://dx.doi.org/10.1111/j.1600-0706.2008.16539.x.
34 Harte, J. 2011. Maximum Entropy and Ecology. Oxford University Press.
35 Harte, J., and E. A. Newman. 2014. Maximum information entropy: a foundation for ecological theory. Trends in Ecology & Evolution 29:384–389. URL http://www.sciencedirect.com/science/article/pii/S0169534714001037.
36 Hill, M. O. 1973. Diversity and Evenness: A Unifying Notation and Its Consequences. Ecology 54:427–432. URL http://www.jstor.org/stable/1934352.
37 Hubbell, S. P. 2005. Neutral theory in community ecology and the hypothesis of functional equivalence. Functional Ecology 19:166–172. URL http://dx.doi.org/10.1111/j.0269-8463.2005.00965.x.
38 Jost, L. 2006. Entropy and diversity. Oikos 113:363–375. URL http://dx.doi.org/10.1111/j.2006.0030-1299.14714.x.

39 Keylock, C. J. 2005. Simpson diversity and the Shannon–Wiener index as special cases of a generalized entropy. Oikos 109:203–207. URL http://dx.doi.org/10.1111/j.0030-1299.2005.13735.x.
40 Kleidon, A., Y. Malhi, and P. M. Cox. 2010. Maximum entropy production in environmental and ecological systems. Philosophical Transactions of the Royal Society of London B: Biological Sciences 365:1297–1302.
41 Laughlin, D. C., R. T. Strahan, M. M. Moore, P. Z. Fulé, D. W. Huffman, and W. W. Covington. 2017. The hierarchy of predictability in ecological restoration: are vegetation structure and functional diversity more predictable than community composition? Journal of Applied Ecology pages n/a–n/a. URL http://dx.doi.org/10.1111/1365-2664.12935.
42 Lavelle, P., T. Decaens, M. Aubert, S. Barot, M. Blouin, F. Bureau, P. Margerie, P. Mora, and J.-P. Rossi. 2006. Soil invertebrates and ecosystem services. European Journal of 42:S3–S15. URL http://www.sciencedirect.com/science/article/pii/S1164556306001038.
43 Lavender, J. T., K. A. Dafforn, M. J. Bishop, and E. L. Johnston. 2017. An empirical examination of effects across twenty degrees of latitude. Ecology pages n/a–n/a. URL http://dx.doi.org/10.1002/ecy.1926.
44 Lebrija-Trejos, E., E. A. Pérez-García, J. A. Meave, F. Bongers, and L. Poorter. 2010. Functional traits and environmental filtering drive community assembly in a species-rich tropical system. Ecology 91:386–398. URL http://dx.doi.org/10.1890/08-1449.1.
45 Legendre, P., and L. Legendre. 2012. Numerical Ecology, Volume 24, 3rd edition. Elsevier.
46 Levine, J. M. 2000. Species Diversity and Biological Invasions: Relating Local Process to Community Pattern. Science 288:852–854. URL http://science.sciencemag.org/content/288/5467/852.
47 Madej, G., G. Barczyk, and M. Gdawiec. 2011. Evaluation of Soil Biological Quality Index (QBS-ar): Its Sensitivity and Usefulness in the Post-Mining Chronosequence - Preliminary Research. Polish Journal of Environmental Studies 20:1367–1372.
48 Magnuson, J., S. Carpenter, and E. Stanley. 2016. North Temperate Lakes LTER: Benthic Macroinvertebrates 1981 - current, Data set ID:11. URL https://lter.limnology.wisc.edu/dataset/.
49 Magnuson, J., S. Carpenter, and E. Stanley. 2016. North Temperate Lakes LTER: Fish Abundance 1981 - current, Data set ID:7. URL https://lter.limnology.wisc.edu/dataset/north-temperate-lakes-lter-fish-abundance-1981-current.
50 Magurran, A. 2013. Measuring Biological Diversity. Wiley.
51 Maldonado, C., C. I. Molina, A. Zizka, C. Persson, C. M. Taylor, J. Alban, E. Chilquillo, N. Ronsted, and A. Antonelli. 2015. Estimating species diversity and distribution in the era of Big Data: to what extent can we trust public ? Global Ecology and 24:973–984. URL http://dx.doi.org/10.1111/geb.12326.
52 Martyushev, L., and V. Seleznev. 2006. Maximum entropy production principle in physics, chemistry and biology. Physics Reports 426:1–45.
53 Mason, N. W., F. de Bello, D. Mouillot, S. Pavoine, and S. Dray. 2013. A guide for using functional diversity indices to reveal changes in assembly processes along ecological gradients. Journal of Vegetation Science 24:794–806. URL http://dx.doi.org/10.1111/jvs.12013.
54 Matthews, T. J., and R. J. Whittaker. 2015. REVIEW: On the species abundance distribution in applied ecology and biodiversity management. Journal of Applied Ecology 52:443–454. URL http://dx.doi.org/10.1111/1365-2664.12380.
55 McGill, B. J., R. S. Etienne, J. S. Gray, D. Alonso, M. J. Anderson, H. K. Benecha, M. Dornelas, B. J. Enquist, J. L. Green, F. He, A. H. Hurlbert, A. E. Magurran, P. A. Marquet, B. A. Maurer, A. Ostling, C. U. Soykan, K. I. Ugland, and E. P. White. 2007. Species abundance distributions: moving beyond single prediction theories to integration within an ecological framework. Ecology Letters 10:995–1015. URL http://dx.doi.org/10.1111/j.1461-0248.2007.01094.x.
56 McGlinn, D. J., P. G. Earls, and M. W. Palmer. 2010. A 12-year study on the scaling of vascular plant composition in an Oklahoma tallgrass prairie. Ecology 91:1872–1872. URL http://dx.doi.org/10.1890/09-2017.1.
57 Meloni, F. 2012. Influência do desenvolvimento florestal sobre a comunidade edáfico-epígea de Arthropoda e a mirmecofauna: bases para a bioindicação do processo sucessional na restauração ecológica. Ph.D. thesis, FFCLRP - Universidade de São Paulo. URL http://www.teses.usp.br/teses/disponiveis/59/59131/tde-02012013-120543/pt-br.php.
58 Meloni, F., C. R. F. Granzotti, S. Bautista, and A. S. Martinez. 2017. Scale dependence and patch size distribution: clarifying patch patterns in Mediterranean drylands. Ecosphere 8:e01690–n/a. URL http://dx.doi.org/10.1002/ecs2.1690.
59 Meloni, F., N. P. Lopes, and E. M. Varanda. 2012. The relationship between leaf , nitrogen metabolites and herbivory in two species of Nyctaginaceae from the Brazilian Cerrado. Environmental and Experimental 75:268–276.
60 Meloni, F., and E. M. Varanda. 2015. Litter and soil arthropod colonization in reforested semi-deciduous seasonal Atlantic forests. Restoration Ecology 23:690–697. URL http://dx.doi.org/10.1111/rec.12236.
61 Menta, C., A. Tagliapietra, C. G., Z. A., and P. S. 2015. Ibs-Bf and Qbs-Ar Comparison: Two Quantitative Indices Based on Soil Fauna Community. EC 2:427–439.
62 Morgan Ernest, S. K., G. M. Yenni, G. Allington, E. M. Christensen, K. Geluso, J. R. Goheen, M. R. Schutzenhofer, S. R. Supp, K. M. Thibault, J. H. Brown, and T. J. Valone. 2016. Long-term monitoring and experimental manipulation of a Chihuahuan desert ecosystem near Portal, Arizona (1977–2013). Ecology 97:1082–1082. URL http://dx.doi.org/10.1890/15-2115.1.
63 Oliver, J., K. Prudic, and S. Collinge. 2006. Boulder County Open Space butterfly diversity and abundance. Ecology 87.
64 Ottermanns, R., P. W. Hopp, M. Guschal, G. P. dos Santos, S. Meyer, and M. Rob-Nickoll. 2011. Causal relationship between leaf litter beetle communities and regeneration patterns of vegetation in the Atlantic rainforest of Southern Brazil (Mata Atlântica). Ecological Complexity 8:299–309. URL http://www.sciencedirect.com/science/article/pii/S1476945X11000444.
65 Palmer, M. M., R. K. Peet, R. A. Reed, W. Xi, and P. S. White. 2007. A multiscale study of vascular plants in a North Carolina Piedmont forest. Ecology 88:162–D1. URL http://esapubs.org/Archive/ecol/E088/162/default.htm.
66 Paquette, A., E. Laliberté, A. Bouchard, S. de Blois, P. Legendre, and J. Brisson. 2007. LAC CROCHE UNDERSTORY VEGETATION DATA SET (1998–2006). Ecology 88:3209–3209. URL http://dx.doi.org/10.1890/07-0513.1.
67 Parisi, V., C. Menta, C. Gardi, C. Jacomini, and E. Mozzanica. 2005. Microarthropod communities as a tool to assess soil quality and biodiversity: a new approach in Italy. Agriculture, Ecosystems & Environment 105:323–333. URL http://www.sciencedirect.com/science/article/pii/S0167880904000970.
68 Pathria, R., and P. Beale. 2011. Statistical Mechanics. Elsevier.
69 Pelissier, R., J.-P. Pascal, N. Ayyappan, B. R. Ramesh, S. Aravajy, and S. R. Ramalingam. 2011. Tree demography in an undisturbed Dipterocarp permanent sample plot at Uppangala, Western Ghats of India. Ecology 92:1376–1376. URL http://dx.doi.org/10.1890/10-1991.1.
70 R Development Core Team. 2017. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org.
71 Ritchie, M. E. 2009. Scale, Heterogeneity, and the Structure and Diversity of Ecological Communities. Princeton University Press.
72 Rosenfeld, J. S. 2002. Functional redundancy in ecology and conservation. Oikos 98:156–162. URL http://dx.doi.org/10.1034/j.1600-0706.2002.980116.x.
73 Schipper, A. M., H. W. M. Hendriks, M. J. Kauffman, A. J. Hendriks, and M. A. J. Huijbregts. 2013. Modelling interactions of toxicants and in wildlife populations. Journal of Applied Ecology 50:1469–1478. URL http://dx.doi.org/10.1111/1365-2664.12142.
74 Shannon, C. E. 1948. A Mathematical Theory of Communication. Bell System Technical Journal 27:379–423. URL http://dx.doi.org/10.1002/j.1538-7305.1948.tb01338.x.
75 Stevens, S., and J. Volkmann. 1937. A scale for the measurement of the psychological magnitude pitch. The Journal of the Acoustical Society of America 8:185–190. URL http://asa.scitation.org/doi/10.1121/1.1915893.
76 Thompson, R., and B. M. Starzomski. 2007. What does biodiversity actually do? A review for managers and policy makers. Biodiversity and Conservation 16:1359–1378. URL http://dx.doi.org/10.1007/s10531-005-6232-9.
77 Trebilco, R., J. K. Baum, A. K. Salomon, and N. K. Dulvy. 2013. : size-based constraints on the pyramids of life. Trends in Ecology & Evolution 28:423–431. URL http://www.sciencedirect.com/science/article/pii/S0169534713000888.
78 Tsallis, C. 2009. Introduction to Nonextensive Statistical Mechanics. Springer.
79 Valencia, E., F. T. Maestre, Y. Le Bagousse-Pinguet, J. L. Quero, R. Tamme, L. Borger, M. Garcia-Gomez, and N. Gross. 2015. Functional diversity enhances the of ecosystem multifunctionality to aridity in Mediterranean drylands. New Phytologist 206:660–671. URL http://dx.doi.org/10.1111/nph.13268.
80 Wurtz, P., and A. Annila. 2010. Ecological succession as an energy dispersal process. Biosystems 100:70–78. URL http://www.sciencedirect.com/science/article/pii/S0303264710000055.
81 Zhao, K., M. Musolesi, P. Hui, W. Rao, and S. Tarkoma. 2015. Explaining the power-law distribution of human mobility through transportation modality 5. URL http://dx.doi.org/10.1038/srep09136.
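
As a supplementary illustration of the central limit theorem argument made in the discussion (additive processes tend toward Gaussian shapes, multiplicative processes toward lognormal ones), the following Python sketch may be useful. It is not part of the original analysis. The step distribution (uniform on [0.5, 1.5]), the number of categories and steps, the seed, and the function names `simulate` and `skewness` are all assumptions chosen only for illustration. Skewness is used here as a simple proxy for the asymmetry of the resulting distribution.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness: third central moment divided by variance^(3/2)."""
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(v - mean) ** 2 for v in values])
    m3 = statistics.fmean([(v - mean) ** 3 for v in values])
    return m3 / m2 ** 1.5


def simulate(n_categories=5000, n_steps=50, seed=1):
    """Build one 'abundance' per category from the same random steps,
    combined either additively (sum) or multiplicatively (product).
    All parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    additive, multiplicative = [], []
    for _ in range(n_categories):
        steps = [rng.uniform(0.5, 1.5) for _ in range(n_steps)]
        additive.append(sum(steps))
        multiplicative.append(math.prod(steps))
    return additive, multiplicative


additive, multiplicative = simulate()
log_multiplicative = [math.log(x) for x in multiplicative]

# Additive outcomes should be nearly symmetric, multiplicative outcomes
# strongly right-skewed, and their logarithms nearly symmetric again.
print("skewness, additive:           ", round(skewness(additive), 3))
print("skewness, multiplicative:     ", round(skewness(multiplicative), 3))
print("skewness, log(multiplicative):", round(skewness(log_multiplicative), 3))
```

Under these assumptions, the multiplicative outcomes are strongly asymmetric while their logarithms are close to symmetric, which is the behavior the discussion attributes to lognormal SAD curves and to the use of log-transformed abundances.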