Beyond Benford's Law: Distinguishing Noise from Chaos
RESEARCH ARTICLE

Qinglei Li1, Zuntao Fu1*, Naiming Yuan1,2*

1 Laboratory for Climate and Ocean-Atmosphere Studies, Dept. of Atmospheric and Oceanic Sciences, School of Physics, Peking University, Beijing, China
2 Department of Geography, Climatology, Climate Dynamics, and Climate Change, Justus-Liebig University Giessen, Giessen, Germany

* [email protected] (ZTF); [email protected] (NMY)

Abstract

Determinism and randomness are two inherent aspects of all physical processes. Time series from chaotic systems share several features with those generated by stochastic processes, which makes the two almost indistinguishable. In this paper, a new method based on Benford's law is designed to distinguish noise from chaos using only the first digits of the series considered. By applying this method to discrete data, we confirm that chaotic data can indeed be distinguished from noise data, quantitatively and clearly.

OPEN ACCESS

Citation: Li Q, Fu Z, Yuan N (2015) Beyond Benford's Law: Distinguishing Noise from Chaos. PLoS ONE 10(6): e0129161. doi:10.1371/journal.pone.0129161

Academic Editor: Francois G. Schmitt, CNRS, FRANCE

Received: January 5, 2015

Accepted: May 5, 2015

Published: June 1, 2015

Copyright: © 2015 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: All relevant data are within the paper and its Supporting Information files.

Funding: This work was supported by the National Science Foundation of China under grant nos. 41175141 and 41475048. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

Introduction

Time series from chaotic systems (CSs) share with those from stochastic processes (SPs) some properties that make them almost indistinguishable. Although many series from CSs are highly ordered behind a veil of apparent randomness [1–3], distinguishing chaotic from stochastic processes remains a long-standing challenge [4–18]. Moreover, experimental chaotic records are unavoidably contaminated with noise, which makes the task even more complicated.

The discrimination between chaotic and stochastic processes has drawn much attention, since irregular and apparently unpredictable behaviors are often observed in natural measurements. Many studies have aimed to uncover the cause of the unpredictability governing these systems, and much effort has been devoted to understanding this topic [4–18]. First, exponential power spectra have been identified in many idealized nonlinear systems and are taken to be characteristic of low-dimensional chaos, differentiating it from stochastic processes, whose power spectra show power-law behavior [4–7]. Nonlinear forecasting [8,9] has also been applied to make tentative distinctions between dynamical chaos and measurement errors, since the accuracy of a nonlinear forecast diminishes with increasing prediction time interval for chaotic series but not for stochastic series. Recently, network and symbolic-dynamics methods [10–18] have been used to handle this issue; they use structural information among consecutive points in physical or phase space to characterize and distinguish stochastic from chaotic processes.
PLOS ONE | DOI:10.1371/journal.pone.0129161 June 1, 2015 1/11

Although the above-mentioned methods have been successfully applied to distinguish stochastic from chaotic processes, each of them exploits only the magnitude or permutation information of the analyzed processes, as in the power-spectrum method or the network-based methods. We note that digital information, that is, the information carried by the digits of the data values, has so far not been used to characterize and distinguish stochastic from chaotic processes. In fact, digital information is of great importance in characterizing specific processes. For example, the first digits in many datasets are not uniformly distributed, as might be expected, but are heavily skewed toward the smaller digits. This phenomenon was first found by Simon Newcomb in 1881 [19]. The discovery attracted little interest until 1938, when Frank Albert Benford [20] investigated some 20 tables comprising 20229 numbers and drew the conclusion that the first-digit probability distribution in many data sets is

$$P_B(d) = \log_{10}(1 + 1/d) \qquad (1)$$

where $d = 1,2,\ldots,9$ is the first digit. It was later named Benford's Law (BL) by the scientific community. Many scientists in different fields have tried to explain the underlying reasons for BL [20–26], but a successful explanation remains elusive [27,28]. However, although there is no accepted interpretation, BL is taken to be a nearly universal law. In recent years, most BL-related studies have been limited to validating whether particular datasets follow this law [29,30], detecting fraud in elections and accounting [31,32], and testing physical system transitions [33,34]. In particular, Tolle and his coauthors [35] examined three low-dimensional chaotic models of dynamical systems and found examples of either compliance with or deviation from Benford's law, depending on the models and the parameters.

Can Benford's law be exploited to characterize and distinguish stochastic from chaotic processes?
The answer from Tolle's results is no. However, the observed dynamics may be strongly affected by the resolution scales used to document the behavior of the processes considered [36]. To characterize complex multi-scaled series, it is of fundamental importance to incorporate multiple scales when devising measures [36]. Costa et al. [37] and Zunino et al. [13] introduced multi-scale entropy (MSE) and multi-scale permutation entropy (MPE), respectively, to successfully distinguish different states of the analyzed processes or dynamical systems. These results show the importance of multiple scales in characterizing the analyzed processes or systems. Here, for the first time, we introduce multiple scales into Benford's law analysis, and the results show that this does help in distinguishing chaos from noise.

Materials and Methods

Generating SPs

We generate three kinds of well-known stochastic processes by the Fourier transform technique: (1) noise with $f^{-k}$ power spectra, (2) fractional Gaussian noise (FGN) and (3) fractional Brownian motion (FBM). All three SPs belong to a particular class of colored noise that represents stochastic (infinite-dimensional) systems with different power-law spectra [13,14].

Noise with $f^{-k}$ power spectra

1. Generate a set $\{u_i, i = 1,2,\ldots,N\}$ of independent Gaussian variables of zero mean and variance one, and compute the discrete Fourier transform of the sequence, $\{\hat{u}_k^{1}\}$.
2. Correlations are incorporated in the sequence by multiplying the new set by the desired spectral density $f^{-k}$, yielding $\{\hat{u}_k^{2}\}$.
3. $\{\hat{u}_k^{2}\}$ is then symmetrized so as to obtain a real function, and the pertinent inverse Fourier transform $\{x_i\}$ is obtained after discarding the small imaginary components produced by the numerical approximations.
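The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `dft` and `power_law_noise` and the parameter `k_exp` are hypothetical, a naive O(N^2) transform is used to stay dependency-free (an FFT library would normally replace it), and the Fourier amplitudes are scaled by $f^{-k/2}$ on the assumption that the target is a power spectrum (squared amplitude) proportional to $f^{-k}$.

```python
import cmath
import math
import random


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stdlib only, for small N)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * t * k / n)
        out.append(acc / n if inverse else acc)
    return out


def power_law_noise(n, k_exp, seed=0):
    """Return n real samples with a power spectrum of roughly f**(-k_exp)."""
    rng = random.Random(seed)
    # Step 1: independent Gaussian variables (zero mean, unit variance)
    # and their discrete Fourier transform.
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    u_hat = dft(u)
    # Step 2: impose correlations on the positive frequencies.
    # ASSUMPTION: amplitudes are scaled by f**(-k_exp / 2) so that the
    # power (squared amplitude) scales as f**(-k_exp).
    shaped = [0j] * n  # the zero-frequency term stays 0, giving zero mean
    for idx in range(1, n // 2 + 1):
        f = idx / n
        shaped[idx] = u_hat[idx] * f ** (-k_exp / 2.0)
    # Step 3: symmetrize (Hermitian symmetry) so the inverse transform is real.
    for idx in range(1, (n + 1) // 2):
        shaped[n - idx] = shaped[idx].conjugate()
    if n % 2 == 0:
        shaped[n // 2] = complex(shaped[n // 2].real, 0.0)
    x = dft(shaped, inverse=True)
    # Discard the tiny imaginary parts left by floating-point round-off.
    return [z.real for z in x]


if __name__ == "__main__":
    series = power_law_noise(128, k_exp=1.0, seed=42)
    print(len(series), sum(series) / len(series))
```

Fixing the seed makes a realization reproducible, which is convenient when the same noise series has to be analyzed at several scales.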
Fractional Gaussian noise (FGN) and Fractional Brownian motion (FBM)

FBM is the only family of processes which is (a) Gaussian, (b) self-similar, and (c) endowed with stationary increments [14,38,39]. The normalized family of these Gaussian processes, $\{B^H(t), t>0\}$, is endowed with these properties: (i) $B^H(0) = 0$ with probability 1, (ii) $E[B^H(t)] = 0$ (zero mean), and (iii) covariance given by

$$E[B^H(t) B^H(s)] = \left(t^{2H} + s^{2H} - |t - s|^{2H}\right)/2$$

for $t,s \in \mathbb{R}$. Here $E[\cdot]$ refers to the average computed with a Gaussian PDF. The power exponent $0<H<1$ is commonly known as the Hurst parameter (exponent). These processes exhibit "memory" for any Hurst parameter except $H = 1/2$, as one realizes from Eq (11). The case $H = 1/2$ corresponds to classical Brownian motion, in which successive increments are as likely to have the same sign as the opposite sign (there is no correlation among them). Thus, Hurst's parameter defines two distinct regions in the interval (0,1). When $H>1/2$, consecutive increments tend to have the same sign, so these processes are persistent. For $H<1/2$, on the other hand, consecutive increments are more likely to have opposite signs, and we say that the processes are anti-persistent. Let us introduce fractional Gaussian noise (FGN) as the FBM increments, $W^H(t) = B^H(t+1) - B^H(t)$, so as to express the autocorrelation of our Gaussian noise in the form

$$\rho(k) = E[W^H(t) W^H(t+k)] = \left[(k+1)^{2H} - 2k^{2H} + |k-1|^{2H}\right]/2, \quad k > 0$$

Note that for $H = 1/2$ all correlations at nonzero lags vanish, so $\{W^{1/2}(t), t>0\}$ represents white noise. The FBM and FGN processes are continuous but non-differentiable processes (in the classical sense). It is possible to define a generalized power spectrum of the form $\Phi \propto |f|^{-\beta}$, with $\beta = 2H+1$, $1<\beta<3$ for FBM and $\beta = 2H-1$, $-1<\beta<1$ for FGN. To generate the FBM and FGN time series, here we use a modified Fourier filtering technique [39,40], which is both exact and fast.
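As a small numerical check of the formulas above, the following sketch evaluates the FBM covariance, the FGN autocorrelation, and the spectral exponents. The function names are hypothetical and chosen only for illustration. The helper `fgn_autocorrelation_from_fbm` expands $E[W^H(t)W^H(t+k)]$ directly from the FBM covariance, which shows that the closed form for $\rho(k)$ follows from the definition of the increments and does not depend on $t$.

```python
def fbm_covariance(t, s, hurst):
    """E[B^H(t) B^H(s)] = (t^(2H) + s^(2H) - |t - s|^(2H)) / 2."""
    two_h = 2.0 * hurst
    return 0.5 * (t ** two_h + s ** two_h - abs(t - s) ** two_h)


def fgn_autocorrelation(k, hurst):
    """rho(k) = [(k+1)^(2H) - 2 k^(2H) + |k-1|^(2H)] / 2 for lag k > 0."""
    if k <= 0:
        raise ValueError("lag k must be positive")
    two_h = 2.0 * hurst
    return 0.5 * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


def fgn_autocorrelation_from_fbm(t, k, hurst):
    """Same quantity, expanded from the increments W(t) = B(t+1) - B(t)."""
    return (fbm_covariance(t + 1, t + k + 1, hurst)
            - fbm_covariance(t + 1, t + k, hurst)
            - fbm_covariance(t, t + k + 1, hurst)
            + fbm_covariance(t, t + k, hurst))


def spectral_exponent(hurst, process):
    """beta in Phi ~ |f|^(-beta): 2H + 1 for FBM, 2H - 1 for FGN."""
    if process == "fbm":
        return 2.0 * hurst + 1.0
    if process == "fgn":
        return 2.0 * hurst - 1.0
    raise ValueError("process must be 'fbm' or 'fgn'")


if __name__ == "__main__":
    for h in (0.2, 0.5, 0.8):
        # Negative for H < 1/2 (anti-persistent), zero for H = 1/2
        # (white noise), positive for H > 1/2 (persistent).
        print(h, fgn_autocorrelation(1, h), spectral_exponent(h, "fgn"))
```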
Generating CS

In order to compare the results of our proposed method with those from other methods, all the CSs chosen in this paper are those used to distinguish noise from chaos in the literature [13–17,41].

Noninvertible chaotic maps

(1) Gauss map: $x_{n+1} = 1/x_n \ (\mathrm{Mod}\ 1)$.
(2) Linear congruential generator: $x_{n+1} = a x_n + b \ (\mathrm{Mod}\ c)$, where $a = 7141$, $b = 54773$, $c = 259200$.
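The two maps above can be iterated directly, and their first digits compared with the Benford probabilities of Eq (1). The sketch below is illustrative only: the function names, the initial conditions, and the way the first digit is read from a scientific-notation string are assumptions made for this example, and it makes no claim about what the resulting distributions look like.

```python
import math


def gauss_map(x0, n):
    """Iterate x_{n+1} = 1/x_n (Mod 1); return up to n iterates after x0."""
    xs, x = [], x0
    for _ in range(n):
        if x == 0.0:
            break  # 1/x is undefined; stop the orbit here
        x = (1.0 / x) % 1.0
        xs.append(x)
    return xs


def lcg(x0, n, a=7141, b=54773, c=259200):
    """Iterate x_{n+1} = a*x_n + b (Mod c) with the constants given above."""
    xs, x = [], x0
    for _ in range(n):
        x = (a * x + b) % c
        xs.append(x)
    return xs


def first_digit(x):
    """Leading nonzero decimal digit of x (x must be nonzero)."""
    if x == 0:
        raise ValueError("zero has no first digit")
    # Scientific notation puts the leading digit first, e.g. '5.4773...e+04'.
    return int(f"{float(abs(x)):.16e}"[0])


def benford_probability(d):
    """P_B(d) = log10(1 + 1/d) for d = 1, ..., 9."""
    return math.log10(1.0 + 1.0 / d)


def first_digit_frequencies(values):
    """Observed relative frequency of each first digit 1..9 (zeros skipped)."""
    digits = [first_digit(v) for v in values if v != 0]
    return {d: digits.count(d) / len(digits) for d in range(1, 10)}


if __name__ == "__main__":
    # The initial conditions 0.3 and 1 are arbitrary illustrative choices.
    for name, series in (("Gauss map", gauss_map(0.3, 2000)),
                         ("LCG", lcg(1, 2000))):
        freqs = first_digit_frequencies(series)
        print(name)
        for d in range(1, 10):
            print(d, round(freqs[d], 4), round(benford_probability(d), 4))
```

Because the Gauss map is iterated in floating-point arithmetic, long orbits are only a numerical approximation of the exact map.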