arXiv:1008.1804v1 [nlin.CD] 10 Aug 2010 o´gc ePriaadClinis otatN 34o 20 of 5304 No Contract Colciencias, and Pereira nol´ogica de [email protected] ´opez),L rpitsbitdt olna nlss elWrdAppli World Real Analysis: Nonlinear to submitted Preprint opttoa ehd nsaitclpyisadnonlinea analysis and series physics statistical in methods Computational Keywords: and temperature no average or global Recording. stationary monthly are the they in whether nonlinearity data, nonlinear and meth linear data surrogate proposed previously the where and examples distribution linea amplitude stationary , addre non the a hypothesis preserve of the transformation assumpt algorithm; static new the nonlinear a Unfortunately, a propose we data. Here of valid. variance g surrogates and process, mean stochastic local linear stationary hypothesis non null a the of proposed, was algorithm distribu new amplitude a Recently, and spectra) (power thi Typically, autocorrelation p process. to a stochastic as linear emerged a has of data realization surrogate on based testing Hypothesis Abstract ist ofr ocranpoete ftedt.The due testing data. hypothesis the for of preferable is properties approach certain latter to conform to data; ries the to fit good realizations a provides that model ways: method different data two izations surrogate in The undertaken be can hypothesis. system this the compare to and class output process specific hypoth- null a a for formulate esis to is approach The experi- data. underlying mental dynamics popular nonlinear most of investigate to existence the analysis the of series time one nonlinear nowadays in used is tests [1] al. et Theiler Introduction 1. urgt aamto,iiilyitoue by introduced initially method, data Surrogate 1 ∗ ig .Ga´nLoe sspotdb h nvria Tec- Universidad the by supported is L´opez Guar´ın L. Diego mi addresses: Email author. Corresponding [email protected] r ot al eeae urgtsfo a from surrogates generated Carlo Monte are r urgtsgnrtdfo h iese- time the from generated surrogates are e urgt aamto o osainr ieseries time nonstationary for method data surrogate new A ig .Guar´ın L´opez L. Diego [email protected] a eateto lcrclEgneig nvria tecnol Universidad Engineering. Electrical of Department Avr .Ooc Gutierrez), Orozco A. (Alvaro b eerhcne tteIsiuoTecnol´ogico Metropolita Instituto the at center Research EisnDlaoTrejos) Delgado (Edilson DeoL Guar´ın L. (Diego a,1, yia real- Typical ∗ laoA rzoGutierrez Orozco A. Alvaro , constrained cations 10. sdn ygnrtn urgtswihaemd oconform to made are which surrogates generating by done is s oa enadvrac fdt.W rsn oenumerical some present We data. of variance and mean local tcatcpoes urgtsgnrtdb u algorith our by generated Surrogates process. stochastic r ino h aa(hsi o eesr fdt r Gaussian). are data if necessary not is (this data the of tion nrtdb hsagrtmpeev h uoorlto an autocorrelation the preserve algorithm this by enerated yais yohsstsig urgt aa Time data, Surrogate testing, Hypothesis dynamics, r d al u u loih sal odsrmnt between discriminate to able is algorithm our but fail, ods .Uigoragrtmw locnr h rsneof presence the confirm also we algorithm our Using t. drse yti loih sta aaaearealization a are data that is algorithm this by addressed nasalsgeto inlfo ir Electrode Micro a from signal a of segment small a in 2.I re ots ulhptei talvlo signif- of level statistics a pivotal at hypothesis a null requiere a icance test not to does order In it [2]. that fact the to ae eeae ihteAF loih rsreboth preserve algorithm AAFT the surro- with and generated RP gates (AC) the autocorrelation with the generated preserve ones algorithm the while data, original of the (AD) the with distribution amplitude generated the lin- preserves Surrogates method a RS of process. transformation pro- stochastic static ear stochastic nonlinear (iii) gaussian correlated and i.i.d linear (i) cess; (ii) a from process, came null random data the the test that to developed hypothesis transform phase were Fourier [1], Random adjusted surrogates (ii) Amplitude (AAFT) realizations (iii) (RS); and, constrained shuffle (RP); Random for (i) methods named not. classical may it Otherwise, The the rejected. then be surrogates, may the hypothesis of null that from value deviates statistic data the the If of surrogates. the of from distribution the elicited to values value data the from computed compares evokes statistic and simply this of interest one of Then, is test. statistic side) whatever (two side one a for plrwyt ettenl yohssta inli a is signal a that hypothesis null the test to way opular o fGusa mltd itiuini o always not is distribution amplitude Gaussian of ion sdb u loih sta aaaearaiainof realization a are data that is algorithm our by ssed o iao eer.Pria Colombia. Pereira, Pereira. ´ogica of α n a ognrt 1 generate to has one , o Medell´ın, Colombia. no. a dlo egd Trejos Delgado Edilson , /α − (2 1 b /α − coe 9 2018 29, October )surrogates 1) m d the AD and the AC of the original data (in general this is methods, followed by an introduction to our method, not true, this is why an improved version of the AAFT named Amplitude Adjusted Truncated Fourier Trans- algorithm was presented, referred as iAAFT [3]). form (AATFT). Then we introduce a methodology to Recently, Richard et al. [4] showed that surrogates gen- accept or reject a null hypothesis and proceed to apply erated with the mentioned methods are stationarized the methods to several simulated and real , versions of the original data. This imply that while showing the utility of each one. Finally we present some the statistical properties of the data might be time de- concluding remarks. pendent, the statistical properties of the surrogates will not. Because of this, when data becomes from a non- 2. Surrogate data methods stationary process, it is impossible to make a statistical comparison of data with its surrogates. So, the classi- As mentioned, the surrogate data methods, originally cal surrogate data methods are not applicable to non- introduced by Theiler et al. [1], has become a very pop- stationary process. Due the importance of this kind of ular method for hypothesis testing. The original algo- process, many modifications of the classical methods rithms can be stated as follows: have been presented. The first one can be attributed to Schreiber and Schmitz [5], in this approach the surro- 2.1. The existing algorithms gate data preserves the AC and any other desired prop- 2.1.1. Random Phase surrogates (RP) erty of the original time series. To generate a surrogate The surrogate data is generated by the following pro- one starts by random shuffling the data, then measur- cedure: ing (for example) the AC of the surrogates and defin- ing an error function as the square difference of data 1. Start with the original data x[t], t = 1, ··· , N. AC minus the surrogate AC. One has to keep permuting 2. Compute z[n], the Fourier transform of x[t]. pairs until the error function is minimized. To gener- 3. Randomize the phases: z′[n] = z[n]eıφ[n]. ate surrogates for non stationary time series, one has to Where φ[1] = 0 and φ[n] ∈ N(0, 2π), n = 2,..., N. ensure that the surrogates also preserve the local mean 4. Symmetrize z′[n] (to obtain a real inverse Fourier and variance of the data. This procedure can be done Transform): iteratively by means of any optimization algorithm, but z′[n − i + 1] = z′[i + 1], i = 1,..., f loor(n/2), there is no guarantee that one will not be stuck in a lo- if N is even then z′[n/2 + 1] = abs(z′[n/2 + 1]). cal minimum (this issue was overcome in [5] by using z[n] is the complexb conjugate of z[n]. the simulated annealing optimization method). Unfortu- 5. Obtain x′[t], the inverse Fourier transform of z′[n]. nately, this method requires a lot of computational time, b ′ so it is of limited applicability. x [t] is the surrogate data of x[t]. Recently, Nakamura et al. [6] presented a modifica- The surrogates maintain the linear correlation of the tion of the RP method which makes it suitable for non- data, but by means of the phases randomization, any stationary data, they called its method Truncated Fourier nonlinear structure is destroyed. Transform (TFT). Surrogates generated with the TFT algorithm are constrained to preserve the AC and the lo- 2.1.2. Amplitude Adjusted Fourier Transform surro- cal mean and variance of data so, surrogateswill be non- gates (AAFT) stationary if original data are non-stationary. Through The surrogate data is generated by the following pro- this method it is possible to test the null hypothesis cedure: that the data came from a non-stationary linear cor- 1. Start with the original data x[t], t = 1,..., N. related stochastic process. Since surrogates generated 2. Sort the data Sx[k], k = 1,..., N. with this method do not preserve the AD of data, fur- 3. Compute z[n], the Fourier transform of x[t]. ther hypothesis (e.g, data are a realization of a nonlinear statical transformation of a non-stationary linear corre- 4. Make a ranked time series Rx[t] defined to satisfy = lated stochastic process) can not be tested. The aim of Sx[Rx[t]] x[t]. = this paper is to present a new surrogate data method 5. Create a random data set g[t], t 1,..., N. through which is possible to obtain surrogates that are 6. Sort the random gaussian number Sg[k], k = constrained to preserve the AC, AD and local mean and 1,..., N. variance of data, but are otherwise random. 7. Define a new time series y[t] = Sg[Rx[t]]. This document is organized in the following way; ini- 8. Generate a surrogate time series y′[t] from y[t] us- tially we briefly introduce the RP, AAFT and the TFT ing the RP algorithm. 2 9. Make a ranked time series Ry′[t] of y′[t]. 2.2. A new algorithm 10. The surrogate time series of x[t] is given by x′[t] = 2.2.1. Amplitude Adjusted Truncated Fourier Trans- Sx[Ry′[t]]. form surrogates (AATFT) Surrogates generated with the TFT algorithm do not ′ x [t] is the surrogate data of x[t]. preserve AD of data (this is actually not true, if fc is It is evident that this process achieves two aims: First, high enough the surrogates AD will eventually be like just as with RP algorithm, the power spectra (and there- the data AD, but this imply that surrogates are too simi- fore linear correlation) of the data is preserved in the lar to data). It is tempting to think that this issue can be surrogate; and second, the re-ordering process means overcome by simply applying a similar procedure to the that the AD of the data and surrogate is also identical AAFT (or the iAAFT) algorithm, but the solution is not (this is actually not true, as this algorithm does not si- so simple. The idea of the TFT methodis to preservethe multaneously preserve both rank distribution and power low frequency components of data in surrogates, this is spectra, which is why the iAAFT [3] has to be used in done by preserving some phases of frequency domain, most practical situations). and it is possible to observe that thanks to the reorder- ing procedure of the AAFT method the phases will no longer be preserved. 2.1.3. Truncated Fourier Transform Surrogates (TFT) In order to preserve the AC, AD and local mean and The TFT algorithm introduced a way to deal with non variance of data in surrogates we propose the following stationarity; this algorithm works by preserving the low procedure. frequency phases in the Fourier domain, but randomiz- = ing the high frequency components. 1. Start with the original data x[t], t 1,..., N. = The surrogate data is generated by the following proce- 2. Sort the data Sx[k], k 1,..., N. dure: 3. Compute z[n], the Fourier transform of x[t]. 4. Generate a surrogate time series x′[t] of x[t] using 1. Start with the original data x[t], t = 1, ··· , N. the TFT algorithm. ′ ′ 2. Compute z[n], the Fourier transform of x[t]. 5. Compute z [n], the Fourier transform of x [t]. ′ 3. Randomize the phases: z′[n] = z[n]eıφ[n]. Where 6. Change the magnitude of z [n]: z′[n] = (z′[n]/abs(z′[n])) abs(z[n]). φ[n] ∈ N(0,π) if n > fc. 7. Obtain x′[t], the inverse Fourier transform of z′[n]. φ[n] = 0 if n ≤ fc. 8. Make a ranked time series Rx′[t] of x′[t]. 4. Symmetrize z′[n] (as in the RP algorithm). 9. Modify x′[t] so it has the same data as Sx[k] but 5. Obtain x′[t], the inverse Fourier transform of z′[n]. with the order given by Rx′[t]: x′[Rx′[t]] = Sx[k]. x′[t] is the surrogate data of x[t]. x′[t] is the surrogate data of x[t]. While all phases are not randomized in this method, it If one iteratively performs the steps 5 to 9 it is possible is possible to discriminate between linearity and non- to increase the fitness between AC of the data and the linearity because the superposition principle is valid surrogates (the iterative procedure will also reduce the only for linear data. i.e., when data are nonlinear, even preservation of local mean and variance, but this can if the power spectrum is preserved completely, the in- be solved increasing the value of fc). This iterative verse Fourier transform data using randomized phases procedure will be referred to as iAATFT. Note that will exhibit a different dynamical behavior. surrogate time series x ′ [t] is just a shuffling of original The surrogate data generated by this method are influ- time series x[t], so it has the same AD. enced primarily by the choice of frequency fc. If fc is too high, the TFT surrogates are almost identical to the It is important to notice that any implementation of original data. In this case, even if there is nonlinearity the discrete FT assumes that the time series under con- in irregular fluctuations, one may fail to detect it. Con- sideration is periodic with a finite period. When there is versely, if fc is too low, the TFT surrogates are almost a large difference between the first and last points (end- the same as the linear surrogate and the long-termtrends point mismatch), the FT will treat this as a sudden dis- are notpreserved. In this case, even if there is no nonlin- continuity in the time series. As a result, this will intro- earity in irregular fluctuations, one may wrongly judge duce significant spurious high-frequency power into the otherwise. The method for selecting the correct value of power spectrum, which is a critical problem when the fc was presented in [6]. randomization is centered only on the high-frequency 3 portion. For robustness, the AMI must be calculated for lag of To ameliorate this artifact, Nakamura et al. [6] proposed 1. This calculates on average how much information we to symmetrize the original data before the application of have about {xt+1} knowing {xt}. So, if AMI(τ = 1) of the the FT (i.e. {x1, x2, ··· , xn−1, xn, xn, xn−1, ··· , x2, x1}). data deviates from that of the surrogates (i.e. is greater With this procedure, there is no end-point mismatch in or lower) then the null hypothesis may be rejected. the data. In order to reject (or not) a null hypothesis we generate N = 99 surrogates, which gaves us a level of signifi- cance α = 0.02. 3. Testing for nonlinearity

Next we describe our selection of discriminant statis- 3.3. Selection of the correct value of fc tics, and propose a methodology to accept or reject a null hypothesis using this statistic. Finally we study a The selection of the correct value of fc cannotbe done a priori, because it depends on the nature of the data method for selecting the correct value of fc. It is important to clarify that we are not interested in and the length of the time series [6]. Our aim is to performing a deep analysis on the linearity or nonlin- preserve AC, AD and local mean and variances of the earity of any specific time series, our aim is to preset original data in the surrogates. The conservation of AD and study the behaviorof the new surrogate data method is assured by the reordering process of the AATFT al- called AATFT; further applications will be presented in gorithm. To preserve AC and local mean and variance the future. one has to start by randomizing all the phases (100% of frequency domain), if AC, local mean and variance of 3.1. Selection of the discriminant statistics surrogates are not similar to the data then it is neces- sary to randomize only a portion of the phases (i.e. 99% Dynamical measures are often used as discriminating of the higher frequency domain), and keep decreasing statistics. According to [7], the correlation dimension the value of fc by small steps until the surrogates pre- is one of the most popular choices. To estimate these, serve AC, local mean and variance of the original data. we first need to reconstruct the underlying attractor. For It should be noted that it is no possible to preserve the this purpose, a time-delay embedding reconstruction is AC for all lags, but at least for small lags the surrogates usually applied. But this method is not useful for data AC should be identical to the data AC. exhibiting irregular fluctuations and long-term trends, because a smaller time delay is necessary to treat irreg- ular fluctuations and a larger time delay is necessary to 4. Results treat long-term trends. At the moment, there is no opti- mal method for embedding such data [7]. 4.1. Numerical examples Therefore, as discriminant statistics we chose the Aver- age Mutual Information (AMI). The AMI is a nonlinear In order to prove the validity of our surrogate data version of the AC. It can answer the following question: method, we compared results obtained by applying the On average, how much does one learn about the future different algorithms to a fictional time series. from the past?. For further information regarding the AMI, the reader is referred to [7] and references within. 4.1.1. A simple example First, we analyzed the data generated by a linear AR 3.2. Rejection or acceptance of a null hypothesis model given by Surrogate data methods are based on the Monte Carlo hypothesis testing procedure, first one calculates the x(t) = a1 x(t − 1) + a6x(t − 6) + η. (1) statistic for data and surrogates, and then one compares the value of this statistic computed from the data to the Where a1 = 0.3, a6 = 0.2 and η ∈ N(0, 1). We ob- distribution of values elicited from the surrogates. If tained 2048 values and discarded the first half. Our aim there is sufficient difference the null might be rejected, was to prove what has been argued about each algo- otherwise it will not. rithm, in this case the acceptance of the null hypothe- The level of significance of the test is given by the num- sis is granted. Figs. 1 to 4 show that each algorithm ber of surrogates. For a one sided (two sided) test a level achieves its goals, so using the AATFT algorithm we of significance α is reached with 1/α − 1 (2/α − 1) sur- can generate surrogate constrained to have the same AC, rogates. AD and local mean and variance of the data. 4 a) b) a) b)

0 0.2 0 0.2 0.1 0.1 −2 0 −2 0 −0.1 −0.1 −4 −0.2 −4 −0.2

200 400 600 800 1000 10 15 20 200 400 600 800 1000 10 15 20 Iteration Window Iteration Window c) d) c) d) 200 200 1 1 150 150

0.5 100 0.5 100

50 50 0 0 0 0 0 10 20 30 40 50 −0.5 0 0.5 1 0 10 20 30 40 50 −0.5 0 0.5 1 Lag Amplitude Lag Amplitude

Figure 1: a) Original Data (black), surrogate data generated with the Figure 2: a) Original Data (black), surrogate data generated with the RP algorithm (dark gray) and difference between data and surrogate iAAFT algorithm (dark gray) and difference between data and surro- (gray). Values are displaced from one another by 2 for clarity. b) Local gate (gray). Values are displaced from one another by 2 for clarity. b) Mean and variance of data (black) and surrogates (dotted gray). c) AC Local Mean and variance of data (black) and surrogates (dotted gray). of data (black) and surrogates (dotted dark gray), and the difference c) AC of data (black) and surrogates (dotted dark gray), and the dif- between AC of data and surrogates (gray). The difference is displaced ference between AC of data and surrogates (gray). The difference is by 0.2 for clarity. d) AD of data (black) and of surrogates (dotted displaced by 0.2 for clarity. d) AD of data (black) and of surrogates gray) (dotted gray)

4.1.2. Failure of the iAAFT algorithm (the following parameters were used: T = 50,τ = 10, To study the behavior of algorithms in presence of Tmod = 250 and MT = 5.5) and a surrogate generated non stationarity we followed [8]. First we defined an with each algorithm (we excluded the RP algorithm). AR process. Fig. 6 shows an amplification of Fig. 5, it can be noted that surrogates generated with TFT and iAATFT algo- = + + + x(t) a1(t)x(t − 1) a2(t)x(t − 2) a3(t) η. (2) rithms preserve the low frequency behavior. Finally, Fig. 7 shows the results of computing AMI(τ = 1) for Where, data and 99 surrogates generated with each algorithm,

a1 = 2cos (2π/T) exp (−1/τ), it can be observed in Fig. 7 a) that the null hypothe- sis addressed by the iAAFT algorithm was rejected, but a2 = − exp (−2/τ), (3) this happens because the times series is non stationary = a3 1. not because it is nonlinear. As expected, the hypothe- This process can be interpreted as a damped oscillator, sis addressed by TST and AATFT algorithms was not with period T and relaxation time τ. Period-based mod- rejected (Timmer [8] proved that the iAAFT algorithm ulation is introduced by subjecting the mean period of is robust for some kinds of non-stationarity, but as seen the AR process to a sinusoidal fluctuation of the form here this is not a general result). = + T(t) T Mt sin (t2π/Tmod). (4) 4.1.3. Failure of the TFT algorithm This modulation introduces a temporal dependency in Next we generated surrogates for the following pro- cess a1: = = 2 a1(t) = 2cos (2π/T(t)) exp (−1/τ). (5) h(t) g[x(t)] x(t) . (7) However, 4 also introduces a temporal dependency in Where x(t) is given by 2. In this case, the signal is non- the variance, which can be compensated by using linear, but the nonlinearity is given by the observation function g[ ] rather than by the dynamic of the process. a2 2 = 3 Fig. 8 shows that the TFT algorithm detects nonlinear- a3(t) 2 2 2  1 − a − a − 2a a2/(1 − a2) ity, but the iAAFT algorithm does not. This result was  1 2 1  (6)  2  expected, because the hypothesis addressed by the TFT  2 2 2a1(t) a2  × 1 − a1(t) − a2 − . algorithm does not involve a static nonlinear transfor- 1 − a2 ! mation of the linear non stochastic process, while this To generate data, we obtained 2048 values and dis- is exactly the hypothesis addressed by the AATFT algo- carded the first half. Fig. 5 shows the time series rithm. 5 a) b) a) b)

0 0.2 0 0.2 0.1 0.1 −2 0 −2 0 −0.1 −0.1 −4 −0.2 −4 −0.2

200 400 600 800 1000 10 15 20 200 400 600 800 1000 10 15 20 Iteration Window Iteration Window c) d) c) d) 200 200 1 1 150 150

0.5 100 0.5 100

50 50 0 0 0 0 0 10 20 30 40 50 −0.5 0 0.5 1 0 10 20 30 40 50 −0.5 0 0.5 1 Lag Amplitude Lag Amplitude

Figure 3: a) Original Data (black), surrogate data generated with the Figure 4: a) Original Data (black), surrogate data generated with the TFTS algorithm (dark gray) and difference between data and surro- AATFT algorithm (dark gray) and difference between data and surro- gate (gray). Values are displaced from one another by 2 for clarity. b) gate (gray). Values are displaced from one another by 2 for clarity. b) Local Mean and variance of data (black) and surrogates (dotted gray). Local Mean and variance of data (black) and surrogates (dotted gray). c) AC of data (black) and surrogates (dotted dark gray), and the dif- c) AC of data (black) and surrogates (dotted dark gray), and the dif- ference between AC of data and surrogates (gray). The difference is ference between AC of data and surrogates (gray). The difference is displaced by 0.2 for clarity. d) AD of data (black) and of surrogates displaced by 0.2 for clarity. d) AD of data (black) and of surrogates (dotted gray). In this case we randomized the higher 98% of the fre- (dotted gray). In this case we randomized the higher 99% of the fre- quency domain. quency domain.

4.1.4. Failure of the three methods 0 We now present a case where neither of the hypothe- ses are rejected despite the fact that the system that gen- −2 erated the signal is nonlinear. The signal was generated by the Duffing system, given by −4

+ + 2 + = x¨ σx˙ ω0 x βx γ cos ωt. (8) −6

0 100 200 300 400 500 600 700 800 900 1000 = 2 = = = = Iteration In this case, σ 0, ω0 γ ω 1 and β 0.3. The signal x is obviously nonlinear, but this is a case of weak nonlinearity [9]. Figure 5: Data generated by a linear non stationary AR process Fig. 9 shows 1024 points of the x component of the (black), surrogate generated by the iAAFT algorithm (dark gray), the TFT algorithm (gray) and the iAATFT algorithm (light gray) each dis- Duffing equation (integrated for 10.000 steps with a unit placed from the other by 2 units for clarity of 0.1, discarded the first half and then selected a sub- segment of 1024 points which minimized the end-point mismatch) and also shows a surrogate generated with [7], which is given by each algorithm. Surrogates generated with each algo- x˙ = a(y − x) rithm are very similar to the data, in this case we found = that randomizing the higher 95% of the frequency do- y˙ x(b − z) − y (9) main, the AC, local mean and variance of data were z˙ = xy − cz preserved in the surrogates generated with the TFT and = iAATFT methods. Fig. 10 shows that neither of the hy- The system exhibits a chaotic behavior with a 10, = = pothesescan be rejectedusing the AMI. The fact that the b 28 and c 8/3. test fails to reject the hypothesis could be a consequence Fig. 11 shows 1024 points of the x component of the of the selected discriminant statistic or just because the Lorenz system (integrated for 10.000 steps with a unit nonlinearity is so weak that the test simply fails to detect of 0.1, discarded the first half and then selected a sub- it. segment of 1024 points which minimized the end-point mismatch), and also shows a surrogate generated with each algorithm. It is easy to see that the surrogate gen- 4.1.5. A chaotic system erated with the iAAFT algorithm is very different to the Subsequently, we used the iAAFT, TFT and the data despite it preserve the AC and AD of the data (this iAATFT to generate surrogates for the Lorenz system situation was also observed in Fig. 5), this imply that 6 a)

0

−2 0.32 0.34 0.36 0.38 0.4 0.42 0.44 0.46 b) −4

−6

0.295 0.3 0.305 0.31 0.315 0.32 0.325 0.33 0.335 0.34 0.345 820 840 860 880 900 920 940 960 980 1000 1020 AMI (τ = 1) Iteration

Figure 8: AMI (τ = 1) for data generated by a linear non stationary Figure 6: Data generated by a linear non stationary AR process AR process observed through a nonlinear function (longer stem) and (black), surrogate generated by the iAAFT algorithm (dark gray), the 99 surrogates generad with the a) TFT algorithm and b) iAATFT algo- TFT algorithm (gray) and the iAATFT algorithm (light gray) each dis- rithm (10 iterations were performed). We randomized the higher 98% placed from the other by 2 units for clarity of the frequency domain.

a)

0

0.5 0.51 0.52 0.53 0.54 0.55 0.56 0.57 0.58 b) −2

0.51 0.52 0.53 0.54 0.55 0.56 0.57 0.58 0.59 −4 c)

−6

0.53 0.54 0.55 0.56 0.57 0.58 0.59 0.6 AMI (τ = 1) 0 100 200 300 400 500 600 700 800 900 1000 Iteration

= Figure 7: AMI (τ 1) for data generated by a linear non station- Figure 9: x component of the Duffing equation (black), surrogate gen- ary AR process (longer stem) and 99 surrogates generad with the a) erated by the iAAFT algorithm (dark gray), the TFT algorithm (gray) iAAFT algorithm, b) TFT algorithm and c) iAATFT algorithm (10 and the iAATFT algorithm (light gray) each displaced from the other iterations were performed). We randomized the higher 99% of the by 2 units for clarity frequency domain.

4.2.1. Monthly global average temperature (MGAT) data is either: nonlinear and stationary, linear and non As shown in Fig. 13 a) the MGAT data is non station- stationary or nonlinear and non stationary, but we can- ary (it has a trend) and has an end point mismatch, so not make a clear distinction. Fig. 12 helps us clarify the classical surrogate data methods would not be able this issue, it is obvious that data is nonlinear, because to detect nonlinearity. Prior to the generation of surro- the linear and stationary and linear and non stationary gate data with the TFT and AATFT algorithms we pro- hypotheses were rejected. ceeded to symmetrize the data in order to eliminate the end point mismatch. 4.2. Application to real data Fig. 14 shows that both hypotheses were rejected, so there is a good chance that the data is nonlinear (there Based on the previous results, we applied the TFT is always room for error). This result verifies what was and the AATFT algorithms to two experimental sys- found by [6]. tems: (i) monthly global average temperature (MGAT) from January 1880 to February 2010 (1562 data points). 4.2.2. Micro Electrode recordings (MER) This database is public, available on the web and (ii) Mi- We now turn our attention to physiological data. Fig. cro electrode recording (MER) from the substantia ni- 13 b) shows the typical behavior of these kinds of sig- gra pars reticulata (4096 data points) adquiared during a nals, that is synchronization; the spike is generated be- Parkinson surgery held in Valencia (Spain). The equip- cause of the synchronization of a small cumulus of neu- ment used in the acquisition was the LEADPOINT TM rons sorrounding the micro-electrode implanted in the of Medtronic. brain, obviously this is a difficult case and the standard 7 a) a)

2.415 2.42 2.425 2.43 2.435 2.44 2.445 2.45 2.455 b) 1.75 1.8 1.85 1.9 1.95 2 b)

2.41 2.415 2.42 2.425 2.43 2.435 2.44 2.445 2.45 c)

1.84 1.86 1.88 1.9 1.92 1.94 1.96 1.98 2 2.38 2.39 2.4 2.41 2.42 2.43 2.44 2.45 AMI (τ = 1) AMI (τ = 1)

Figure 12: AMI (τ = 1) for x component of the Lorenz system (longer Figure 10: AMI (τ = 1) for x component of the Duffing equation stem) and 99 surrogates generad with the a) TFT algorithm and b) (longer stem) and 99 surrogates generad with the a) TFT algorithm iAATFT algorithm (20 iterations were performed). We randomized and b) iAATFT algorithm (10 iterations were performed). We ran- the higher 97% of the frequency domain. domized the higher 95% of the frequency domain.

a) 0.5

0 0

−2 −0.5 0 500 1000 1500 Moths b) 0.5 −4 0

−0.5 −6

−1 0 100 200 300 400 500 600 700 800 900 1000 0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 time (s)

Figure 11: x component of the Lorenz system (black), surrogate gen- Figure 13: Real time series. a) Monthly global average temperature erated by the iAAFT algorithm (dark gray), the TFT algorithm (gray) from January 1880 to February 2010 and b) Micro electrode recording and the iAATFT algorithm (light gray) each displaced from the other (MER) from the substantia nigra pars reticulata, by 2 units for clarity

gate data methods, by including non stationary and non surrogate data methods are useless. However, the TFT Gaussian processes. Through numerous examples we and the proposed (AATFT) methods are able to mimic demonstrate that classical surrogate data methods will the temporal behavior of data which implies that the fail to discrimine between linear and nonlinear systems preservation of local mean and variance is key to gener- when the underlying process is non-stationary; we also ating valid surrogates. These results are shown in Fig. shown that the same problem occurs with the TFT sur- 15. Finally, Fig. 16 shows that the hypothesis addressed rogate method when the time series generated by a non- by the TFT algorithm is rejected, while the hypothesis stationary process does not have a Gaussian AD. Only addressed by the AAFT algorithm is not. This implies the proposed AATFT algorithm is able to detect the true that data is nonlinear, but nonlinearity is due to the ob- nature of the data in this cases. servation function, further discussion on this matter will With these methods we were able to confirm the pres- be presented in the future. ence of nonlinearity in the monthly global average tem- perature time series, and we also prove that the studied 5. Conclusions MER signal is a realization of a nonlinear statical trans- formation of a linear non-stationary stochastic process, A new surrogate data algorithm was presented. With a result that will be studied further in future works. this algorithm we were able to generate surrogate data that are constrained to the have the same autocorre- 6. ACKNOWLEDGMENTS lation (power spectra), amplitude distribution and lo- cal mean and variance of data, but are otherwise real- We express our sincere gratitude to the Instrumen- izations of a non-stationary linear stochastic process. tation and Control Research Group of the Universidad In this way we expanded the range of uses for surro- Tecnol´ogica of Pereira and to the MIRP group in the 8 a) a)

2.28 2.3 2.32 2.34 2.36 2.38 2.4 0.67 0.68 0.69 0.7 0.71 0.72 0.73

b) a)

2.29 2.3 2.31 2.32 2.33 2.34 2.35 2.36 2.37 2.38 0.65 0.655 0.66 0.665 0.67 0.675 0.68 0.685 0.69 0.695 AMI (τ = 1) AMI (τ = 1)

Figure 14: AMI (τ = 1) for the Monthly global average temperature Figure 16: AMI (τ = 1) for MER signal from the substantia nigra pars from January 1880 to February 2010 (longer stem) and 99 surrogates reticulata (longer stem) and 99 surrogates generad with the a) TFT generad with the a) TFT algorithm and b) iAATFT algorithm (10 it- algorithm and b) iAATFT algorithm (10 iterations were performed). erations were performed). We randomized the higher 98% of the fre- We randomized the higher 97.5% of the frequency domain. quency domain.

[8] J. Timmer, Power of surrogate data testing with respect to non- stationarity, Physical Review E 58 (1998) 5153 – 5156. 0 [9] N. Huang, The Hilbert-Huang Transform in Engineering, Taylor and Francis Group, pp. 1 – 22.

−2

−4

−6

0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 time (s)

Figure 15: MER signal from the substantia nigra pars reticu- lata(black), surrogate generated by the iAAFT algorithm (dark gray), the TFT algorithm (gray) and the iAATFT algorithm (light gray) each displaced from the other by 2 units for clarity

Research Center of the Instituto Tecnol´ogico Metropoli- tano of Medell´ın - Colombia.

References

[1] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, J. D. Farmer, Testing for nonlinearity in time series: The method of surrogate data, Physica D 58 (1992) 77 – 94. [2] J. Theiler, D. Prichard, Constrained realization for hypothesis testing, Physica D 94 (1996) 221 – 235. [3] T. Schreiber, A. Schmitz, Improved surrogate data for nonlinear- ity tests, Physical Review Letters 77 (1996) 635 – 638. [4] C. Richard, A. Ferrari, H. Amoud, P. Honeine, P. Flandrin, P. Borgnat, Statistical hypothesis testing with time-frequency surrogates to check signal stationarity, in: IEEE Int. Conf. on Acoustics, Speech and Signal Proc. ICASSP 2010. [5] T. Schreiber, A. Schmitz, Surrogate time series, Physica D 142 (2000) 346–382. [6] T. Nakamura, M. Small, Y. Hirata, Testing for nonlinearity in irregular fluctuations with long-term trends, Physical Review E 74 (2006) 026205. [7] M. Small, Applied Nonlinear Time Series Analysis - Applications in Physics, Physiology and Finance, World Scientific, 2005. 9