Cent. Eur. J. Phys. • 7(3) • 2009 • 387-394 DOI: 10.2478/s11534-009-0054-4

Central European Journal of Physics

q-Gaussian approximants mimic non-extensive statistical-mechanical expectation for many-body probabilistic model with long-range correlations

Research Article

William J. Thistleton1∗, John A. Marsh2†, Kenric P. Nelson3‡, Constantino Tsallis4,5§

1 Department of Mathematics, SUNY Institute of Technology, Utica NY 13504, USA
2 Department of Computer and Information Sciences, SUNY Institute of Technology, Utica NY 13504, USA
3 Raytheon Integrated Defense Systems, Principal Systems Engineer
4 Centro Brasileiro de Pesquisas Fisicas, Rua Xavier Sigaud 150, 22290-180 - RJ,
5 Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA

Received 3 November 2008; accepted 25 March 2009

Abstract: We study a strictly scale-invariant probabilistic N-body model with symmetric, uniform, identically distributed random variables. Correlations are induced through a transformation of a multivariate Gaussian distribution with covariance matrix decaying out from the unit diagonal as $\rho/r^{\alpha}$ for $r = 1, 2, \ldots, N-1$, where $r$ indicates displacement from the diagonal and where $0 \le \rho \le 1$ and $\alpha > 0$. We show numerically that the sum of the N dependent random variables is well modeled by a compact support q-Gaussian distribution. In the particular case of $\alpha = 0$ we obtain $q = (1 - \tfrac{5}{3}\rho)/(1 - \rho)$, a result validated analytically in a recent paper by Hilhorst and Schehr. Our present results with these q-Gaussian approximants precisely mimic the behavior expected in the frame of non-extensive statistical mechanics. The fact that the $N \to \infty$ limiting distributions are not exactly, but only approximately, q-Gaussians suggests that the present system is not exactly, but only approximately, q-independent in the sense of the q-generalized central limit theorem of Umarov, Steinberg and Tsallis. Short range interactions ($\alpha > 1$) and long range interactions ($\alpha < 1$) are discussed. Fitted parameters are obtained via a Method of Moments approach. Simple mechanisms which lead to the production of q-Gaussians, such as mixing, are discussed.

PACS (2008): 02.50.-r; 02.60.-x; 02.60.Cb; 02.70.-c; 05.10.-a

Keywords: q-Gaussian • non-extensive statistical mechanics • correlated systems

© Versita Warsaw and Springer-Verlag Berlin Heidelberg.

1. Introduction

∗ E-mail: [email protected]
† E-mail: [email protected]
‡ E-mail: [email protected]
§ E-mail: [email protected]

Central Limit Theorems illuminate the mechanisms which lead to the ubiquitous natural occurrence of certain probability distributions. The most celebrated of these theorems, most commonly known as The Central Limit Theorem, is taught in introductory statistics courses and, in its simplest form, describes how the suitably normalized sum of a sequence of independent, identically distributed random variables with finite first and second moments converges to a Gaussian distribution [1]. A generalization of the traditional Gaussian distribution which maximizes, under certain conditions, the non-additive entropy $S_q$ [2] of a system has been introduced and shown to model a wide variety of naturally occurring systems, especially those with heavy tails [3]. These q-Gaussian distributions are known to be attractors under summation for systems exhibiting the type of correlation known as q-independence [4].

In this paper, we demonstrate another mechanism for the relevance of the q-Gaussian distribution. We show numerically that compact support q-Gaussian distributions can serve approximately as limiting distributions of specially dependent systems of uniformly distributed random variables. The uniform distribution is important because of the simplicity of its form and because it is the maximum entropy distribution under the constraint of compact support.

The generalized Central Limit Theorem of Umarov et al. [4] requires variables to have a special dependency structure, referred to as q-independent. In this paper, we demonstrate numerically that, for uniformly distributed random variables with global correlation simply equal throughout the system, the limiting distribution is approximately a q-Gaussian. This provides a simplified analytical tool for examining an important system, namely one with uniform random variables influenced by uniform global dependency.

We also generalize the above through the study of a strictly scale-invariant (power law decay in covariance matrix) probabilistic N-body model with symmetric, uniform, identically distributed random variables where correlations are induced through a transformation of the multivariate normal distribution with covariance matrix decaying out from the unit diagonal as $\rho/r^{\alpha}$, where $r$ indicates displacement from the diagonal, $r = 1, 2, \ldots, N-1$, $\alpha > 0$ characterizes the range of the correlations, and $0 \le \rho \le 1$ characterizes the strength of the correlation. We show numerically that the non-Gaussian sum of the N dependent random variables is well modeled by a distribution which mimics a compact support q-Gaussian distribution with $q(\rho, \alpha) \le 1$.

This paper is organized as follows. The next section provides the defining relationships and demonstrates the effect of the induced dependency structure for small sample sizes. Section 3 reviews q-Gaussian random variables, which are shown to approximate the distributions of the mean for the system under study. Section 4 discusses parameter fitting in q-Gaussian distributions. Section 5 discusses the probability distribution of the mean of samples of size N drawn from the dependent uniform variables and examines their asymptotic (large N) behavior. Section 6 further explores dependent uniform random variables by examining the effect of power-law decay in the underlying autocorrelation function. Finally, in Section 7 we conclude with a discussion of the relevance of this result and suggestions for further study.

2. Dependent uniform random variables

The basic objects of our study are uniform random variables exhibiting global correlations. To obtain such a system we first construct a multivariate Gaussian random variable $X$ with mean $\mu_X = 0$ and whose covariance matrix $\Sigma_X$ exhibits a particularly simple form of global correlation,

$$
\Sigma_X = \begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{pmatrix}. \tag{1}
$$

We transform each of these Gaussian distributions to produce uniformly distributed random variables via the Probability Integral Transformation. To do so, use the elementary fact that if $Y$ is a random variable obtained under an invertible transformation of random variable $X$ as $Y = g(X)$, then the cumulative distribution functions of $X$ and $Y$ are related as $F_Y(\cdot) = F_X(g^{-1}(\cdot))$. In this way a random variable $X$ may be mapped to a uniform random variable $U$ by applying to $X$ its own cumulative distribution function. That is, setting $g(\cdot) = F_X(\cdot)$ yields $F_U(u) = F_X(F_X^{-1}(u)) = u$, the cumulative distribution function of the uniform distribution. Using this idea we apply a transformation involving the error function to each component of the multivariate Gaussian distribution $X$, yielding a multivariate random variable $U$ of globally dependent, uniformly distributed components. The detailed construction is given immediately below.

Proceed by first defining $N+1$ independent, identically distributed normal random variables $Z_0, \ldots, Z_N \sim N(0,1)$ with zero mean, $\mu = 0$, and unit variance, $\sigma^2 = 1$. Then define a new multivariate random vector $X$ with components defined as $X_i \equiv \sqrt{\rho}\, Z_0 + \sqrt{1-\rho}\, Z_i$ for $i = 1, \ldots, N$. The real parameter $\rho$ introduces global correlations amongst the $X_i$ through dependence on the common term $Z_0$. The new distribution $X$ can then be written as

$$
X = BZ = \begin{pmatrix} \sqrt{\rho} & \sqrt{1-\rho} & 0 & \cdots & 0 \\ \sqrt{\rho} & 0 & \sqrt{1-\rho} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sqrt{\rho} & 0 & 0 & \cdots & \sqrt{1-\rho} \end{pmatrix} \begin{pmatrix} Z_0 \\ Z_1 \\ \vdots \\ Z_N \end{pmatrix}. \tag{2}
$$

Thus expressed, $X$ is seen to be a linear transformation of a multivariate normal random vector $Z$, hence also multivariate normal. It is straightforward to verify that $X$ has zero mean, and covariance matrix

$$
\Sigma_X = B \Sigma_Z B^{T} = [\rho + (1-\rho)\delta_{ij}], \tag{3}
$$

reproducing the form given in Eq. (1) above.

We now transform the random vector $X$ into a multivariate uniform distribution $U$ by defining

$$
U_i \equiv \Phi(X_i) - \frac{1}{2} = \frac{1}{2}\operatorname{erf}\!\left(\frac{X_i}{\sqrt{2}}\right) \tag{4}
$$

for $i = 1, \ldots, N$, where $\Phi(x)$ is the standard normal cumulative distribution function. Each $U_i$ is then distributed uniformly on the interval $\left[-\frac{1}{2}, \frac{1}{2}\right]$ and inherits correlations from the underlying multivariate normal distribution. With our system so constructed we proceed to demonstrate a few basic properties.

First consider the dependency structure inherited by the uniform random variables. Since each of the uniform random variables has mean $E[U_i] = 0$, the covariance is seen to be $\sigma[U_i, U_j] = E[U_i U_j]$ where

$$
\sigma[U_i, U_j] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{1}{2}\operatorname{erf}\!\left(\frac{x_i}{\sqrt{2}}\right) \frac{1}{2}\operatorname{erf}\!\left(\frac{x_j}{\sqrt{2}}\right) \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\!\left(-\frac{x_i^2 - 2\rho x_i x_j + x_j^2}{2(1-\rho^2)}\right) dx_i\, dx_j, \tag{5}
$$

with correlation coefficient $\rho[U_i, U_j] = 12\,\sigma[U_i, U_j]$, since each $U_i$ has variance $1/12$. Fig. 1 presents the linear correlation coefficient (Pearson's $\rho$) of the uniform random variables as a function of the correlation coefficient of the underlying normal system. It is obtained both by numerical integration of Eq. (5) and by direct calculation from simulated normal data transformed to uniformly distributed random variables on $\left[-\frac{1}{2}, \frac{1}{2}\right]$. Induced correlations in the uniform random variables are slightly less than those of the normal random variables. We also show in Fig. 2 the joint probability density functions for the bivariate case for various values of normal correlation $\rho$.

Figure 1. Correlation of transformed uniform random variables as a function of the correlation of the underlying normal random variables.

3. q-Gaussian distributions

The family of q-Gaussian distributions is obtained as a set of maximum entropy distributions when considering the generalized entropic form $S_q[f] = \dfrac{1 - \int [f(x)]^q\, dx}{q-1}$ [2]. In the limit as $q \to 1$, the generalized entropy $S_q$ recovers the traditional Boltzmann-Gibbs-Shannon entropy, and the usual Gaussian distribution maximizes the entropy, as expected. However, when $q \neq 1$, other distributions, the so called q-Gaussian distributions, maximize $S_q$. In particular, when $q > 1$ the resulting q-Gaussians become heavy tailed distributions with increasing $q$. In the case of $q < 1$ the q-Gaussian distributions have compact support. In addition to maximizing the generalized entropy, these q-Gaussian distributions are realized as attractors under summation of certain specially correlated systems [4]. For all $-\infty < q < 3$, the q-Gaussian probability density functions may be written simply as $f(x) = \dfrac{\sqrt{\beta}}{C_q}\, e_q^{-\beta x^2}$, where the q-exponential is defined as $e_q^{\alpha} \equiv (1 + \alpha(1-q))^{\frac{1}{1-q}}$ and has support only for $1 + \alpha(1-q) > 0$, $\beta$ is a scaling constant, and the normalization is accomplished via

$$
C_q = \begin{cases}
\dfrac{2\sqrt{\pi}\,\Gamma\!\left(\frac{1}{1-q}\right)}{(3-q)\sqrt{1-q}\,\Gamma\!\left(\frac{3-q}{2(1-q)}\right)}, & -\infty < q < 1, \\[2ex]
\sqrt{\pi}, & q = 1, \\[2ex]
\dfrac{\sqrt{\pi}\,\Gamma\!\left(\frac{3-q}{2(q-1)}\right)}{\sqrt{q-1}\,\Gamma\!\left(\frac{1}{q-1}\right)}, & 1 < q < 3.
\end{cases} \tag{6}
$$

For the compact support case $q < 1$ we require $1 + (q-1)\beta x^2 > 0$. The special case $q = 1$ is recovered in the limit. We note here that compact support q-Gaussian distributions may also be considered as a reparametrization of a distribution classically denoted as a Pearson Type II distribution [5].

Recalling that, for a multivariate normal distribution, a correlation coefficient value $\rho = 0$ implies that the normal variables are independent, note as a consequence that when the underlying normal random variables are uncorrelated, the derived uniform distributions are themselves independent and the usual central limit theorem obtains, resulting in convergence to a $q = 1$ Gaussian distribution. For $\rho > 0$, the system converges approximately to a q-Gaussian distribution with $q < 1$, with larger correlations associated with $q$ values that lie farther below unity. In fact, as $\rho \to 1$ the limiting distribution approaches a uniform distribution, consistent with $q \to -\infty$.

Figure 2. Two dimensional histogram of bivariate joint probability distribution of uniform random variables with induced dependency structure. Distributions are shown for several values of the underlying normal correlation, $\rho$.

4. Parameter fitting in q-Gaussians

Typical so-called curve-fitting approaches to parameter estimation in the context of q-Gaussian distributions have taken advantage of the fact that a nonlinear transformation (the q-log) of the empirical histogram of q-Gaussian data results in a straight line when the parameters of the distribution are correctly specified. Similar approaches are used when handling data modeled with a Pareto distribution. Alternately, using a procedure known to Gauss, one may attempt a Maximum Likelihood Estimation approach [6] and seek values $\hat{\beta}$ and $\hat{q}$ which maximize the likelihood function $\prod_{i=1}^{n} \mathrm{pdf}(x_i)$, i.e. the values $\hat{\beta}$ and $\hat{q}$ which are most consistent with the data. Recently, a maximum likelihood method for estimating parameters in the q-Exponential distribution has been developed [7]. This approach is not taken in the current context due to difficulties arising from the fact that the random variables of interest have support dependent upon the parameters being estimated.¹ Instead, for the $q < 1$ case of compact support we follow a Method of Moments approach and recommend a Maximum Likelihood approach in the $q > 1$ case of infinite support.

¹ S. Donald, H. Paarsch, Unpublished Manuscript (1993)

Consider a compact support q-Gaussian distribution with density for $1 - (1-q)\beta x^2 > 0$ given as

$$
f(x) = \frac{\sqrt{\beta}}{C_q}\, e_q^{-\beta x^2} = \frac{\sqrt{\beta}\,(3-q)\sqrt{1-q}\;\Gamma\!\left(\frac{3-q}{2(1-q)}\right)}{2\sqrt{\pi}\;\Gamma\!\left(\frac{1}{1-q}\right)} \left(1 - (1-q)\beta x^2\right)^{\frac{1}{1-q}}. \tag{7}
$$

We may reparametrize this equation for convenience with $\alpha = \frac{1}{1-q}$ or $q = \frac{\alpha - 1}{\alpha}$ as

$$
f(x) = \frac{\sqrt{\beta}}{C_q}\, e_q^{-\beta x^2} = \left(1 + \frac{1}{2\alpha}\right) \frac{\Gamma\!\left(\alpha + \frac{1}{2}\right)}{\sqrt{\pi}\,\Gamma(\alpha)} \sqrt{\frac{\beta}{\alpha}} \left(1 - \frac{\beta}{\alpha} x^2\right)^{\alpha}. \tag{8}
$$

It is immediately apparent that $0 < \alpha < \infty$ and also that this parameter allows the distribution to interpolate between the uniform distribution as $\alpha \to 0$ and the Gaussian as $\alpha \to \infty$.
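As a numerical sanity check on the compact-support density, the following sketch evaluates Eq. (7) in the q parametrization and Eq. (8) in the α parametrization, and confirms that the two forms agree and integrate to one. The function names, the test values of q and β, and the midpoint-rule integrator are illustrative assumptions and are not taken from the paper.

```python
import math
import numpy as np


def qgauss_pdf_q(x, q, beta):
    """Compact-support q-Gaussian density of Eq. (7); assumes q < 1 and beta > 0."""
    x = np.asarray(x, dtype=float)
    # log of the normalization constant C_q for q < 1, Eq. (6)
    log_cq = (math.log(2.0) + 0.5 * math.log(math.pi)
              + math.lgamma(1.0 / (1.0 - q))
              - math.log(3.0 - q) - 0.5 * math.log(1.0 - q)
              - math.lgamma((3.0 - q) / (2.0 * (1.0 - q))))
    # Clip at zero so the density vanishes outside the support.
    base = np.clip(1.0 - (1.0 - q) * beta * x**2, 0.0, None)
    return math.sqrt(beta) * math.exp(-log_cq) * base ** (1.0 / (1.0 - q))


def qgauss_pdf_alpha(x, alpha, beta):
    """Same density in the alpha = 1/(1-q) parametrization of Eq. (8)."""
    x = np.asarray(x, dtype=float)
    log_norm = (math.log1p(1.0 / (2.0 * alpha)) + math.lgamma(alpha + 0.5)
                - 0.5 * math.log(math.pi) - math.lgamma(alpha)
                + 0.5 * math.log(beta / alpha))
    base = np.clip(1.0 - beta * x**2 / alpha, 0.0, None)
    return math.exp(log_norm) * base**alpha


def integrate_midpoint(f, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    mid = a + h * (np.arange(n) + 0.5)
    return float(np.sum(f(mid)) * h)


q, beta = 0.5, 2.0                  # illustrative values only
alpha = 1.0 / (1.0 - q)             # alpha = 1/(1-q)
edge = math.sqrt(alpha / beta)      # support is |x| < sqrt(alpha / beta)
total = integrate_midpoint(lambda t: qgauss_pdf_alpha(t, alpha, beta), -edge, edge)
print(f"integral of the density over its support: {total:.6f}")
```

Working with log-gamma rather than the gamma function itself keeps the normalization finite when α is large, which is the regime approaching the Gaussian limit.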


Figure 3. Means of two correlated uniform random variables for various values of normal correlation, $\rho$. Note the typical triangular shape for independent ($\rho = 0$) random variables. As $\rho \to 1$ the mean converges to a uniform distribution with support on $\left[-\frac{1}{2}, \frac{1}{2}\right]$.

We estimate $\alpha$ and $\beta$ (and hence $q$ and $\beta$) via the Method of Moments as follows. Equate the second order sample moment $m_2 \equiv \frac{\sum x_i^2}{n}$ to the corresponding second order population moment $\mu_2 \equiv E[X^2]$, giving us

$$
\frac{\sum x_i^2}{n} = \mu_2 = \frac{1}{\beta}\,\frac{\alpha}{2\alpha + 3}. \tag{9}
$$

Similarly, equate the fourth order sample moment $m_4 \equiv \frac{\sum x_i^4}{n}$ to the corresponding fourth order population moment $\mu_4 \equiv E[X^4]$, giving us

$$
\frac{\sum x_i^4}{n} = \mu_4 = 3\,\frac{1}{\beta^2}\,\frac{\alpha^2}{(2\alpha+3)(2\alpha+5)} = 3\,\mu_2\,\frac{1}{\beta}\,\frac{\alpha}{2\alpha+5}. \tag{10}
$$

This system may be solved explicitly to obtain MOM estimators

$$
\hat{\alpha} = \frac{1}{2}\,\frac{5 m_4 - 9 m_2^2}{3 m_2^2 - m_4}, \tag{11}
$$

$$
\hat{\beta} = \frac{1}{m_2}\,\frac{\hat{\alpha}}{2\hat{\alpha} + 3}. \tag{12}
$$

It is interesting to consider that the system may be parameterized by the kurtosis defined as

$$
\kappa \equiv \frac{\mu_4}{\mu_2^2} = 3\,\frac{2\alpha + 3}{2\alpha + 5}. \tag{13}
$$

Substituting for $q$ gives

$$
\kappa \equiv \frac{\mu_4}{\mu_2^2} = 3\,\frac{\frac{2}{1-q} + 3}{\frac{2}{1-q} + 5} = 3\,\frac{2 + 3(1-q)}{2 + 5(1-q)} = 3\,\frac{5 - 3q}{7 - 5q}. \tag{14}
$$

Estimating the kurtosis from the sample moments as $\hat{\kappa} = \frac{m_4}{m_2^2}$, we again obtain the estimate of $\alpha$ given in Eq. (11) above.

5. Probability distribution of the mean

The approximate convergence of the correlated uniform distributions to q-Gaussian distributions is shown in Fig. 4 through Fig. 6 for various values of the correlation coefficient of the underlying multivariate normal distribution and for increasing values of system size, $N$. As indicated in the figures, as system size is increased, a limiting distribution is approached for $U_{\mathrm{mean}} = \frac{U_1 + U_2 + \cdots + U_N}{N}$.

This system has been analyzed [8] and found to yield an explicit analytic representation for the limiting distribution ($N \to \infty$)

$$
f_U(x) = \left(\frac{2-\rho}{\rho}\right)^{1/2} \exp\!\left(-\frac{2(1-\rho)}{\rho}\left[\operatorname{erf}^{-1}(2x)\right]^2\right), \tag{15}
$$

with support on $-\frac{1}{2} \le x \le \frac{1}{2}$. This distribution is approximately, but not exactly, a q-Gaussian distribution.

The authors also confirmed an intriguing asymptotic result which we originally developed as an ansatz via curve fitting and which relates the fitted $q$ value and the inducing correlation coefficient, $\rho$. We illustrate the result in Fig. 6. An uncorrelated, independent multivariate normal system gives rise to an independent uniform system and so in this case one naturally obtains a fitted value of $q = 1$. As the correlation is increased towards $\rho = 1$, the fitted distributions reproduce the full spectrum of compact support q-Gaussians and in the limit recover the $q \to -\infty$ case of a uniform distribution. Along the way the relationship $q = (1 - \tfrac{5}{3}\rho)/(1 - \rho)$ obtains.
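The construction of Section 2 and the Method of Moments fit of Section 4 can be combined in a short simulation. The sketch below draws the mean of N globally correlated uniform variables, estimates q from the sample moments via Eqs. (11) and (12), and compares it with $q = (1 - \tfrac{5}{3}\rho)/(1 - \rho)$. The system size, sample count, seed and function names are illustrative assumptions rather than the settings used for the figures, and at finite N the agreement is expected to be approximate only.

```python
import math
import numpy as np

# Element-wise error function from the standard library (avoids a SciPy dependency).
_erf = np.vectorize(math.erf)


def correlated_uniform_means(n_samples, N, rho, rng):
    """Means of N uniform variables on [-1/2, 1/2] with global normal correlation rho."""
    z0 = rng.standard_normal((n_samples, 1))             # common term Z_0
    z = rng.standard_normal((n_samples, N))              # Z_1, ..., Z_N
    x = math.sqrt(rho) * z0 + math.sqrt(1.0 - rho) * z   # X_i with covariance of Eq. (1)
    u = 0.5 * _erf(x / math.sqrt(2.0))                   # Eq. (4)
    return u.mean(axis=1)


def mom_fit(x):
    """Method of Moments estimates (alpha, beta, q) for zero-mean data with kurtosis < 3."""
    m2 = np.mean(x**2)
    m4 = np.mean(x**4)
    alpha = 0.5 * (5.0 * m4 - 9.0 * m2**2) / (3.0 * m2**2 - m4)   # Eq. (11)
    beta = alpha / (m2 * (2.0 * alpha + 3.0))                     # Eq. (12)
    q = (alpha - 1.0) / alpha                                     # inverse of alpha = 1/(1-q)
    return float(alpha), float(beta), float(q)


rng = np.random.default_rng(0)      # fixed seed, chosen arbitrarily
rho = 0.5
means = correlated_uniform_means(20_000, 100, rho, rng)
alpha_hat, beta_hat, q_hat = mom_fit(means)
q_formula = (1.0 - 5.0 * rho / 3.0) / (1.0 - rho)
print(f"fitted q = {q_hat:.3f}, formula q = {q_formula:.3f}")
```

The estimator uses raw moments about zero, which relies on the symmetry of the distribution about the origin. For data with sample kurtosis at or above 3 the denominator in Eq. (11) changes sign and the compact-support fit is no longer meaningful.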


Figure 4. Means for small systems, $N = 10$. Shown are the empirically obtained probability densities (simulation) for various values of normal correlation, $\rho$. Also shown are the asymptotic solutions ($N \to \infty$) of Hilhorst and Schehr (analytic). We note that while small systems are not well modeled by this asymptotic solution, especially for light correlations, they are still modeled quite well by fitted q-Gaussians (not shown in figure).

Figure 5. Means for moderately sized systems, $N = 50$. As contrasted with Fig. 4, convergence to the Hilhorst Schehr solution (analytic) is reasonable even for light correlations. Again, these systems are modeled quite well by fitted q-Gaussians.

Figure 6. Relationship between fitted q-Gaussian parameter and inducing correlation for large system. Note that independence results in Gaussian distribution, while the uniform distribution is recovered in the limit as $\rho \to 1$.

6. Power law decay in the inducing correlation matrix

We now consider a generalization of the previous model by including a parameter that controls the extent of correlations in the underlying multivariate normal system, and again probing the limit as $N \to \infty$. Instead of the global correlations implied by Eq. (2), we introduce a parameter $\alpha$ that controls the extent of the correlations, as

$$
\rho_{i,j} = \frac{\rho}{r^{\alpha}},
$$

where $r = |i - j|$.

This correlation structure decreases following a power law in $r = |i - j|$, representing the extent of the dependency. The uniformly globally dependent model thus appears as the $\alpha \to 0$ limit of the generalized model. In the $\alpha = 0$ case, $q$ differs from unity when $\rho$ is nonzero. Does this relation hold also for non-zero $\alpha$? The generalized model is thus intended to probe the behavior of the asymptotic distribution as a function of the range of correlations $\alpha$ and correlation strength $\rho$.

In constructing new correlation matrices that replace Eq. (2), it is important to ensure the correlation matrix is invertible to allow the transformation to uniform variates, following Eq. (4). This condition leads to a maximum allowable value of $\alpha$ for any desired $\rho$. This restriction was adhered to using a numerical approach that checked the correlation matrix for positive definiteness. Maximum values of $\alpha$ expressed as a function of correlation $\rho$ are given in Fig. 8 through Fig. 10.

The results for system size $N = 5$ are shown in Fig. 8, where the Method of Moments estimators for $q$ are plotted as a function of both $\alpha$ and $\rho$. The unphysical combinations of $\alpha$ and $\rho$ appear as a missing region in the surface plot of $q$. In addition to the known result that the $q = 1$ limit is recovered in the case $\rho \to 0$, for this system size the $q = 1$ limit is also asymptotically recovered in the limit of short correlation. Numerical results for the asymptotic value of $q$ in the general case of nonzero $\rho$ are inconclusive due to computational restrictions.
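The admissibility check for the power-law model can be sketched as follows. The code builds the correlation matrix with entries $\rho/r^{\alpha}$ and unit diagonal, tests positive definiteness through its smallest eigenvalue, and draws correlated uniform variates through a Cholesky factor. The function names, the eigenvalue tolerance and the choice of a Cholesky factor for sampling are assumptions made for illustration; the text does not state which numerical test or factorization was actually used.

```python
import math
import numpy as np


def power_law_corr(N, rho, alpha):
    """N x N matrix with unit diagonal and entries rho / |i - j|**alpha elsewhere."""
    idx = np.arange(N)
    r = np.abs(idx[:, None] - idx[None, :]).astype(float)
    np.fill_diagonal(r, 1.0)          # placeholder so r**alpha is defined on the diagonal
    corr = rho / r**alpha
    np.fill_diagonal(corr, 1.0)
    return corr


def is_positive_definite(corr, tol=1e-12):
    """True when the smallest eigenvalue of the symmetric matrix exceeds tol."""
    return bool(np.linalg.eigvalsh(corr).min() > tol)


def sample_uniforms(corr, n_samples, rng):
    """Uniform variates on [-1/2, 1/2] whose underlying normals have correlation corr."""
    chol = np.linalg.cholesky(corr)                       # corr = chol @ chol.T
    x = rng.standard_normal((n_samples, corr.shape[0])) @ chol.T
    erf = np.vectorize(math.erf)
    return 0.5 * erf(x / math.sqrt(2.0))                  # Eq. (4)


rng = np.random.default_rng(0)        # fixed seed, chosen arbitrarily
corr = power_law_corr(5, 0.4, 1.0)    # illustrative N, rho, alpha
if is_positive_definite(corr):
    u = sample_uniforms(corr, 1000, rng)
    print("mean of sample means:", float(u.mean(axis=1).mean()))
else:
    print("inadmissible (rho, alpha) combination for this N")
```

For large α the matrix is effectively tridiagonal with ρ on the first off-diagonals, so strong correlations combined with fast decay fail the check, which is how a maximum allowable α arises.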


Figure 7. Maximum allowable decay rate $\alpha$ as a function of correlation $\rho$ shown for various system sizes, $N$. The restriction that the correlation matrix be positive definite requires $\alpha$ not to be too large. This restriction is also shown as a "missing region" in Fig. 8.

Figure 8. Our numerical results for increasing values of $N$ strongly suggest that $q = 1$ for $\rho = 0$ (and, of course, all values of $\alpha$), as well as for all admissible values of $\rho$ whenever $\alpha > d = 1$. In contrast, for any non-vanishing value of $\rho$, $q$ monotonically decreases when $\alpha$ is below unity and approaches zero. Results for a small system ($N = 5$).

Figure 9. The same results for a moderately sized system ($N = 20$).

Figure 10. Finally, the same results for a larger sized system ($N = 200$). Convergence to asymptotic surface is quite slow, creating difficulties for a numerical approach.

6.1. Discussion

Our present results with these q-Gaussian approximants precisely mimic the behavior that was expected in the frame of non-extensive statistical mechanics. Indeed, for classical d-dimensional many-body Hamiltonians including two-body interactions (without mathematical complexities at the origin), and decaying at long distances like $1/r^{\alpha}$, say attractively, two very distinct regimes exist. If $\alpha > d$, the Boltzmann-Gibbs canonical partition function is well defined, and standard statistics are to be used at thermal equilibrium. If, however, $0 \le \alpha \le d$, nontrivial quasi-stationary regimes emerge, for which non-Boltzmannian statistical mechanics are to be used, possibly nonextensive statistical mechanics with an index $q$ different from unity (see, for instance, [9] and [10]). In the present paper, for a probabilistic model, we have exhibited precisely this behavior, even if, strictly speaking, the limiting distribution is not exactly a q-Gaussian. Indeed, the exact limiting distribution [8] is so close to a q-Gaussian that something quite similar is expected for any model which would exactly yield q-Gaussian attractors (for $\alpha = 0$, one such probabilistic model has been recently introduced in [11]).

We also note here that other mechanisms leading to the production of infinite support q-Gaussian distributions, more commonly referred to as the Student t, have been known for some time, see for example [12] and [13]. In particular, if samples are obtained from a normal population via a sampling procedure governed by a random process which yields sample sizes which are themselves random,


then the resulting sampling distribution of sample means may be pulled away from the Gaussian distribution towards an infinite support q-Gaussian. That is, denoting the random sample size as $N_n$, if the sample size is distributed as negative binomial then asymptotically

$$
N_n \sim \operatorname{negbino}\!\left(r,\ p = \frac{1}{n}\right) \;\Rightarrow\; \frac{1}{N_n}\sum_{i=1}^{N_n} X_i \sim \text{Student } t(\mathrm{df} = r). \tag{16}
$$

Acknowledgements

This work was supported in part by the Air Force Research Laboratory, Information Directorate, under contract number FA8756-04-C-0258. The authors acknowledge stimulating discussions with M. Marsili and A. Williams.

References

[1] G. Grimmett, D. Stirzaker, Probability and Random Processes, 3rd edition (Oxford University Press, Oxford, England, 2001)
[2] C. Tsallis, J. Stat. Phys. 52, 479 (1988)
[3] M. Gell-Mann, C. Tsallis, Nonextensive Entropy: Interdisciplinary Applications (Oxford University Press, New York, 2004)
[4] S. Umarov, C. Tsallis, S. Steinberg, Milan Journal of Mathematics 76, 307 (2008)
[5] K. Pearson, Philos. T. R. Soc. A 186, 343 (1895)
[6] P. Bickel, K. Doksum, Mathematical Statistics (Prentice Hall, Upper Saddle River, NJ, 2001)
[7] C. Shalizi, Phys. Rev. E, arXiv:math/0701854
[8] H. J. Hilhorst, G. Schehr, J. Stat. Mech.-Theory E P06003 (2007)
[9] A. Pluchino, A. Rapisarda, C. Tsallis, Europhys. Lett. 80, 26002 (2007)
[10] A. Pluchino, A. Rapisarda, C. Tsallis, Physica A, 387, 3121 (2008)
[11] A. Rodríguez, V. Schwämmle, C. Tsallis, J. Stat. Mech.-Theory E. P09006 (2008)
[12] V. E. Bening, V. Y. Korolev, Theor. Probab. Appl.+ 49, 377 (2004)
[13] C. Vignat, A. Plastino, Phys. Lett. A 360, 415 (2007)
