arXiv:cond-mat/0408330v1 [cond-mat.stat-mech] 13 Aug 2004

Deterministic Brownian Motion: The Effects of Perturbing a Dynamical System by a Chaotic Semi-Dynamical System

Michael C. Mackey* and Marta Tyran-Kamińska†

*Corresponding author. Departments of Physiology, Physics & Mathematics and Centre for Nonlinear Dynamics, McGill University, 3655 Promenade Sir William Osler, Montreal, QC, CANADA, H3G 1Y6. email: [email protected]

†Institute of Mathematics, Silesian University, ul. Bankowa 14, 40-007 Katowice, POLAND. e-mail: [email protected]

October 26, 2018

Abstract

Here we review and extend central limit theorems for highly chaotic but deterministic semi-dynamical discrete time systems. We then apply these results to show how Brownian motion-like results are recovered, and how an Ornstein-Uhlenbeck process results within a totally deterministic framework. These results illustrate that the contamination of experimental data by "noise" may, under certain circumstances, be alternately interpreted as the signature of an underlying chaotic process.

Contents

1 Introduction
2 Semi-dynamical systems
  2.1 Density evolution operators
  2.2 Probabilistic and ergodic properties of density evolution
  2.3 Brownian motion from deterministic perturbations
    2.3.1 Central limit theorems
    2.3.2 FCLT for noninvertible maps
  2.4 Weak convergence criteria
3 Analysis
  3.1 Weak convergence of v(t_n) and v_n
  3.2 The linear case in one dimension
    3.2.1 Behaviour of the velocity variable
    3.2.2 Behaviour of the position variable
4 Identifying the Limiting Velocity Distribution
  4.1 Dyadic map
  4.2 Graphical illustration of velocity density evolution with dyadic map perturbations
  4.3 r-dyadic map
5 Gaussian Behaviour in the limit τ → 0
6 Discussion
A Appendix: Limit Theorems for Dependent Random Variables
1 Introduction
Almost anyone who has ever looked through a microscope at a drop of water has been intrigued by the seemingly erratic and unpredictable movement of small particles suspended in the water, e.g. dust or pollen particles. This phenomenon, noticed shortly after the invention of the microscope by many individuals, now carries the name of "Brownian motion" after the English botanist Robert Brown, who wrote about his observations in 1828. Almost three-quarters of a century later, Einstein (1905) gave a theoretical (and essentially molecular) treatment of this macroscopic motion that predicted the phenomenology of Brownian motion. (A very nice English translation of this, and other, work of Einstein's on Brownian motion can be found in Fürth (1956).) The contribution of Einstein led to the development of much of the field of stochastic processes, and to the notion that Brownian movement is due to the summated effect of a very large number of tiny impulsive forces delivered to the macroscopic particle being observed. This was also one of the most definitive arguments of the time for an atomistic picture of the microscopic world. Other ingenious experimentalists used this conceptual idea to explore the macroscopic effects of microscopic influences. One of the more interesting is due to Kappler (1931), who devised an experiment in which a small mirror was suspended by a quartz fiber (cf. Mazo (2002) for an analysis of this experimental setup). Any rotational movement of the mirror would tend to be counterbalanced by a restoring torsional force due to the quartz fiber. The position of the mirror was monitored by shining a light on it and recording the reflected image some distance away (so small changes in the rotational position of the mirror were magnified). Air molecules striking the mirror caused a transient deflection that could thus be monitored, and the frequency of these collisions was controlled by changing the air pressure.
Figure 1.1, taken from Kappler (1931), shows two sets of data taken using this arrangement and offers a vivid depiction of the macroscopic effects of microscopic influences.

In trying to understand theoretically the basis for complicated and irreversible experimental observations, a number of physicists have supplemented the reversible laws of physics with various hypotheses about the irregularity of the physical world. One of the first of these, and arguably one of the most well known, is the so-called "molecular chaos" hypothesis of Boltzmann (1995). This hypothesis, which postulated a lack of correlation between the movement of molecules in a small collision volume, allowed the derivation of the Boltzmann equation from the Liouville equation and led to the celebrated H theorem. The origin of the loss of correlations was never specified. In an effort to understand the nature of turbulent flow, Ruelle (1978, 1979, 1980) postulated a type of mixing dynamics to be necessary. More recently, several authors have made chaotic hypotheses about the nature of dynamics at the microscopic level. The most prominent of these is Gallavotti (1999), and virtually the entire book of Dorfman (1999) is predicated on the implicit assumption that microscopic dynamics have a chaotic (loosely defined, but usually taken to be mixing) nature. All of these hypotheses have been made in spite of the fact that none of the microscopic dynamics that we write down in physics actually display such properties.

Others have taken this suggestion (the chaotic hypothesis) quite seriously, and attempted an experimental confirmation. Figure 1.2 shows a portion of the data, taken from Gaspard et al. (1998), that was obtained in an examination of a microscopic system for the presence of chaotic behavior.
Figure 1.1: In the upper panel is shown a recording of the movement of the mirror in the Kappler (1931) experiment over a period of about 30 minutes at atmospheric pressure (760 mm Hg). The bottom panel shows the same experiment at a pressure of 4 × 10⁻³ mm Hg. Both figures are from Kappler (1931). See the text for more detail.
Figure 1.2: The data shown here, taken from Gaspard et al. (1998), show the position of a 2.5 µm Brownian particle in water over a 300 second period with a sampling interval of 1/60 sec (see Gaspard et al. (1998); Briggs et al. (2001) for the experimental details). The inset figure shows the power spectrum, which displays a typical decay (for Brownian motion) with ω⁻².
Their data analysis showed a positive lower bound on the sum of Lyapunov exponents of the system composed of a macroscopic Brownian particle and the surrounding fluid. From their analysis, they argued that the Brownian motion was due to (or the signature of) deterministic microscopic chaos. However, Briggs et al. (2001) were more cautious in their interpretation, and Mazo (2002, Chapter 18) has explored the possible interpretations of experiments like these in some detail. If true, the existence of deterministic chaos (whatever that means) would be an intriguing possibility since, if generated by a non-invertible dynamics (semi-dynamical system), it could serve as an explanation of a host of unresolved problems in the sciences. Most notably, it could serve as an explanation for the manifest irreversibility of our physical and biological world in the face of physical laws that fail to encompass irreversibility without the most incredulous of assumptions. In particular, it would clarify the foundations of irreversible statistical mechanics, e.g. the operation of the second law of thermodynamics (Dorfman, 1999; Gallavotti, 1999; Mackey, 1989, 1992; Schulman, 1997), and the implications of the second law for the physical and biological sciences. In this paper, we have a rather more modest goal. We address a different facet of this chaotic hypothesis by studying how and when the characteristics of Brownian motion can be reproduced by deterministic systems. To motivate this, in Figures 1.3 through 1.6 we show the position (x)
and velocity (v) of a particle of mass m whose dynamics are described by
dx/dt = v, (1.1)

m dv/dt = −Γv + F(t). (1.2)

In Equations 1.1 and 1.2, F is a fluctuating "force" consisting of a sequence of delta-function-like impulses given by

F(t) = mκ Σ_{n=0}^∞ ξ(t) δ(t − nτ), (1.3)

and ξ is a "highly chaotic" (exact, see Section 2) deterministic variable generated by ξ_{t+τ} = T(ξ_t), where T is the hat map on [−1, 1] defined by

T(y) = 2y + 1 for y ∈ [−1, 0), 1 − 2y for y ∈ [0, 1). (1.4)

In this paper we examine the behavior of systems described by equations like (1.1) through (1.4) and establish, analytically, the eventual limiting behavior of ensembles. In particular, we address the question of how Brownian-like motion can arise from a purely deterministic dynamics. We do this by studying the dynamics from a statistical, or ergodic theory, standpoint.

The outline of the paper is as follows. Section 2 gives required background and new material, including the definitions of a hierarchy of chaotic behaviours (ergodic, mixing, and exact) with a discussion of their different behaviours in terms of the evolution of densities under the action of transfer operators such as the Frobenius-Perron operator. We then go on to treat Central Limit Theorems and Functional Central Limit Theorems for non-invertible dynamical systems. Section 3 returns to the specific problem that Equations 1.1 through 1.4 illustrate. We show how the particle velocity distribution may converge and how the particle position may become asymptotically Gaussian, but for a more general class of maps than given by (1.4). In Section 4 we illustrate the application of the results from Section 3 using a specific chaotic map (the dyadic map, Equation 2.10) to act as a surrogate noise source. Section 5 considers the question of when one can obtain Gaussian processes by studying appropriate scaling limits of the velocity and position variables, and the convergence of the velocity process to an Ornstein-Uhlenbeck process as the interval τ between chaotic perturbations approaches 0.
The paper concludes with a brief discussion in Section 6. The Appendix collects and extends general central limit theorems from probability theory that are used in the main results of the paper.
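For readers who wish to reproduce the qualitative behaviour shown in Figures 1.3 through 1.6, the system (1.1)-(1.4) can be integrated exactly between impulses: over each interval τ the velocity decays by the factor λ = e^{−γτ} and is then incremented by κξ_n, with ξ_n generated by the hat map. The following sketch rests on that observation; the kick/decay ordering and the position update are our simplifications (the paper's Equations 3.21 and 3.28 fix the exact discrete forms), and the default λ = 1/2 corresponds to Figure 1.4.

```python
import numpy as np

def hat_map(y):
    """The hat map of Equation 1.4 on [-1, 1]."""
    return 2.0 * y + 1.0 if y < 0.0 else 1.0 - 2.0 * y

def simulate(n_steps, gamma=10.0, kappa=1.0, lam=0.5, y0=0.12562568):
    """Iterate the impulsively kicked particle of Equations 1.1-1.3.

    Between kicks dv/dt = -gamma*v, so the velocity decays by
    lam = exp(-gamma*tau) over one interval; each kick adds kappa*xi_n.
    (This ordering is one convention; the paper's Equations 3.21 and
    3.28 give the exact forms.)
    """
    v, x, y = 0.0, 0.0, y0
    positions, velocities = [], []
    for _ in range(n_steps):
        v = lam * v + kappa * y        # decay over tau, then the impulse
        x += v * (1.0 - lam) / gamma   # integral of the decaying velocity
        y = hat_map(y)                 # next chaotic perturbation xi_n
        positions.append(x)
        velocities.append(v)
    return np.array(positions), np.array(velocities)

positions, velocities = simulate(1000)
```

With λ close to 1 the velocity trace is strongly correlated (as in Figure 1.5), while λ much smaller than 1 makes successive velocities nearly independent (as in Figure 1.3).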
2 Semi-dynamical systems
We are going to examine the behavior illustrated in Section 1 using techniques from ergodic theory, and closely related concepts from probability theory, applied to the dynamics of semi-dynamical (non-invertible) systems. In this section we collect together the necessary machinery to do so. Much of this background material can be found in Lasota and Mackey (1994).
2.1 Density evolution operators

Let (Y_1, B_1, ν_1) and (Y_2, B_2, ν_2) be two σ-finite measure spaces and let the transformation T : Y_1 → Y_2 be measurable, i.e. T^{-1}(B_2) ⊆ B_1, where T^{-1}(B_2) = {T^{-1}(A) : A ∈ B_2}. Then we say that
Figure 1.3: The top panel shows the simulated position of a particle obeying Equations 1.1 through 1.4 using Equation 3.28, while the bottom panel shows the velocity of the same particle computed with Equation 3.21. The parameters used were: γ = Γ/m = 10, κ = 1, and τ = −(1/10) ln(9 × 10⁻⁴) ≃ 0.932 so λ ≡ e^{−γτ} = 9 × 10⁻⁴. The initial condition on the hat map given by Equation 1.4 was y_0 = 0.12562568.
Figure 1.4: As in Figure 1.3 except that τ = (1/10) ln 2 ≃ 0.069 so λ = 1/2.
Figure 1.5: As in Figure 1.4 except that τ = −(1/10) ln(0.9) ≃ 0.011 so λ = 0.9.
Figure 1.6: As in Figure 1.3, with the parameters of Figure 1.5 and an initial condition on the hat map (1.4) of y0 = 0.1678549321.
T is nonsingular (with respect to ν_1 and ν_2) if ν_1(T^{-1}(A)) = 0 for all A ∈ B_2 with ν_2(A) = 0. Associated with the transformation T we have the Koopman operator U_T defined by U_T g = g ∘ T for every measurable function g : Y_2 → R. We define the transfer operator P_T : L¹(Y_1, B_1, ν_1) → L¹(Y_2, B_2, ν_2) as follows. For any f ∈ L¹(Y_1, B_1, ν_1), there is a unique element P_T f in L¹(Y_2, B_2, ν_2) such that
∫_A P_T f(y) ν_2(dy) = ∫_{T^{-1}(A)} f(y) ν_1(dy). (2.1)

Equation 2.1 simply gives an implicit relation between an initial density of states f and that density after the action of the map T, i.e. P_T f. The Koopman operator U_T : L^∞(Y_2, B_2, ν_2) → L^∞(Y_1, B_1, ν_1) and the transfer operator P_T are adjoint, so
∫_{Y_2} g(y) P_T f(y) ν_2(dy) = ∫_{Y_1} f(y) U_T g(y) ν_1(dy)

for g ∈ L^∞(Y_2, B_2, ν_2), f ∈ L¹(Y_1, B_1, ν_1).

In some special cases Equation 2.1 allows us to obtain an explicit form for P_T. Let Y_2 = R, B_2 = B(R) be the Borel σ-algebra, and ν_2 be the Lebesgue measure. Let Y_1 be an interval [a, b] on the real line R, B_1 = [a, b] ∩ B(R), and ν_1 be the Lebesgue measure restricted to [a, b]. We will simply write L¹([a, b]) when the underlying measure is the Lebesgue measure. The transformation T : [a, b] → R is called piecewise monotonic if

(i) there is a partition a = a_0 < a_1 < ... < a_l = b of [a, b] such that for each integer i = 1, ..., l the restriction of T to (a_{i−1}, a_i) has a C¹ extension to [a_{i−1}, a_i], and
(ii) |T′(x)| > 0 for x ∈ (a_{i−1}, a_i), i = 1, ..., l.

If a transformation T : [a, b] → R is piecewise monotonic, then for f ∈ L¹([a, b]) we have

P_T f(y) = Σ_{i=1}^{l} [f(T_{(i)}^{-1}(y)) / |T′(T_{(i)}^{-1}(y))|] 1_{T((a_{i−1}, a_i))}(y),

where T_{(i)}^{-1} is the inverse function for the restriction of T to (a_{i−1}, a_i). Note that we have equivalently

P_T f(y) = Σ_{x ∈ T^{-1}({y})} f(x) / |T′(x)|. (2.2)

Of course these formulas hold almost everywhere with respect to the Lebesgue measure.

Let (Y, B) be a measurable space and let T : Y → Y be a measurable transformation. The definition of the transfer operator for T depends on a given σ-finite measure on B, which in turn gives rise to different operators for different underlying measures on B. If ν is a probability measure on B which is invariant for T, i.e. ν(T^{-1}(A)) = ν(A) for all A ∈ B, then T is nonsingular. The transfer operator P_T : L¹(Y, B, ν) → L¹(Y, B, ν) is well defined, and when we want to emphasize that the underlying measure ν in the transfer operator is invariant under the transformation T we will write P_{T,ν}. The Koopman operator U_T is also well defined for f ∈ L¹(Y, B, ν) and is an isometry of L¹(Y, B, ν) into L¹(Y, B, ν), i.e. ||U_T f||_1 = ||f||_1 for all f ∈ L¹(Y, B, ν). The following relation holds between the operators U_T, P_{T,ν} : L¹(Y, B, ν) → L¹(Y, B, ν):

P_{T,ν} U_T f = f and U_T P_{T,ν} f = E(f | T^{-1}(B)) (2.3)
for f ∈ L¹(Y, B, ν), where E(· | T^{-1}(B)) : L¹(Y, B, ν) → L¹(Y, T^{-1}(B), ν) denotes the operator of conditional expectation (see Appendix). Both of these equations are based on the following change of variables (Billingsley, 1995, Theorem 16.13): f ∈ L¹(Y, B, ν) if and only if f ∘ T ∈ L¹(Y, B, ν), in which case the following holds
∫_{T^{-1}(A)} f ∘ T(y) ν(dy) = ∫_A f(y) ν(dy), A ∈ B. (2.4)

If the measure ν is finite, we have L^p(Y, B, ν) ⊂ L¹(Y, B, ν) for p ≥ 1. The operator U_T : L^p(Y, B, ν) → L^p(Y, B, ν) is also an isometry in this case. Note that if the conditional expectation operator E(· | T^{-1}(B)) : L¹(Y, B, ν) → L¹(Y, B, ν) is restricted to L²(Y, B, ν), then it is the orthogonal projection of L²(Y, B, ν) onto L²(Y, T^{-1}(B), ν).

One can also consider any σ-finite measure m on B with respect to which T is nonsingular, and the corresponding transfer operator P_T : L¹(Y, B, m) → L¹(Y, B, m). To be specific, let Y be a Borel subset of R^k with Lebesgue measure m, and let B = B(Y) be the σ-algebra of Borel subsets of Y. Throughout this paper m will denote Lebesgue measure and L¹(Y) will denote L¹(Y, B, m). The transfer operator P_T : L¹(Y) → L¹(Y) is usually known as the Frobenius-Perron operator. A measure ν (on Y) is said to have a density g∗ if ν(A) = ∫_A g∗(y) dy for all A ∈ B, where g∗ ∈ L¹(Y) is nonnegative and ∫_Y g∗(y) dy = 1. A measure ν is called absolutely continuous if it has a density. If the Frobenius-Perron operator P_T has a nontrivial fixed point in L¹(Y), i.e. the equation P_T f = f has a nonzero solution in L¹(Y), then the transformation T has an absolutely continuous invariant measure ν, its density g∗ is a fixed point of P_T, and we call g∗ an invariant density under the transformation T. The following relation holds between the operators P_T and P_{T,ν}:

P_T(f g∗) = g∗ P_{T,ν} f for f ∈ L¹(Y, B, ν). (2.5)

In particular, if the density g∗ is strictly positive, i.e. g∗(y) > 0 a.e. for y ∈ Y, then the measures m and ν are equivalent and we also have
P_T(f) = g∗ P_{T,ν}(f / g∗) for f ∈ L¹(Y).

The notion of a piecewise monotonic transformation on an interval can be extended to "piecewise smooth" transformations T : Y → Y with Y ⊂ R^k. If T has, for example, finitely many inverse branches, and the Jacobian matrix DT(x) of T at x exists with det DT(x) ≠ 0 for almost every x, then the Frobenius-Perron operator is given by

P_T f(y) = Σ_{x ∈ T^{-1}({y})} f(x) / |det DT(x)| a.e.

for f ∈ L¹(Y). If T is invertible then we have P_T f(y) = f(T^{-1}(y)) |det DT^{-1}(y)|.

Finally, we briefly mention Ruelle's transfer operator. Let Y be a compact metric space, let T : Y → Y be a continuous map such that T^{-1}({y}) is finite for each y ∈ Y, and let ψ : Y → R be a function (typically continuous or Hölder continuous). The so-called Ruelle operator L_ψ acts on functions rather than on L¹(Y) elements and is defined by
L_ψ f(y) = Σ_{x ∈ T^{-1}({y})} e^{ψ(x)} f(x)

for every y ∈ Y. The function ψ is a so-called potential. If we take ψ(x) = −log |det DT(x)|, provided it makes sense, then we arrive at the representation for the Frobenius-Perron operator.
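As a concrete illustration of Equation 2.2, the hat map of Equation 1.4 has two inverse branches, x = (y − 1)/2 and x = (1 − y)/2, each with |T′(x)| = 2. The sketch below (function names are ours, not the paper's) evaluates P_T f pointwise and confirms that the uniform density 1/2 on [−1, 1] is invariant; for this particular map even the linear density f(x) = (x + 1)/2 is carried to the uniform density in a single application.

```python
import numpy as np

def perron_frobenius(f, branches, deriv_abs, y):
    """P_T f(y) = sum over preimages x of f(x)/|T'(x)|  (Equation 2.2)."""
    return sum(f(b(y)) / deriv_abs(b(y)) for b in branches)

# Inverse branches of the hat map T(y) = 2y+1 (y < 0), 1-2y (y >= 0).
hat_branches = [lambda y: (y - 1.0) / 2.0, lambda y: (1.0 - y) / 2.0]
hat_deriv_abs = lambda x: 2.0      # |T'(x)| = 2 everywhere

uniform = lambda x: 0.5            # candidate invariant density on [-1, 1]
linear = lambda x: (x + 1.0) / 2.0 # a non-invariant starting density

ys = np.linspace(-0.999, 0.999, 201)
err_uniform = max(abs(perron_frobenius(uniform, hat_branches, hat_deriv_abs, y) - 0.5) for y in ys)
err_linear = max(abs(perron_frobenius(linear, hat_branches, hat_deriv_abs, y) - 0.5) for y in ys)
```

The same helper applies to any piecewise monotonic map once its inverse branches and derivative are supplied.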
2.2 Probabilistic and ergodic properties of density evolution

Let (Y, B, ν) be a normalized measure space and let T : Y → Y be a measurable transformation preserving the measure ν. We can discuss the ergodic properties of T in terms of the convergence behavior of its transfer operator P_{T,ν} : L¹(Y, B, ν) → L¹(Y, B, ν). To this end, we note that the transformation T is
(i) Ergodic (with respect to ν) if and only if every invariant set A ∈ B is such that ν(A) = 0 or ν(Y \ A) = 0. This is equivalent to: T is ergodic (with respect to ν) if and only if for each f ∈ L¹(Y, B, ν) the sequence (1/n) Σ_{k=0}^{n−1} P_{T,ν}^k f is weakly convergent in L¹(Y, B, ν) to ∫ f(y) ν(dy), i.e. for all g ∈ L^∞(Y, B, ν)

lim_{n→∞} (1/n) Σ_{k=0}^{n−1} ∫ P_{T,ν}^k f(y) g(y) ν(dy) = ∫ f(y) ν(dy) ∫ g(y) ν(dy);

(ii) Mixing (with respect to ν) if and only if
lim_{n→∞} ν(A ∩ T^{-n}(B)) = ν(A) ν(B) for A, B ∈ B.

Mixing is equivalent to: for each f ∈ L¹(Y, B, ν) the sequence P_{T,ν}^n f is weakly convergent in L¹(Y, B, ν) to ∫ f(y) ν(dy), i.e.

lim_{n→∞} ∫ P_{T,ν}^n f(y) g(y) ν(dy) = ∫ f(y) ν(dy) ∫ g(y) ν(dy) for g ∈ L^∞(Y, B, ν);

(iii) Exact (with respect to ν) if and only if
lim_{n→∞} ν(T^n(A)) = 1 for A ∈ B with T^n(A) ∈ B, ν(A) > 0.

Exactness is equivalent to: for each f ∈ L¹(Y, B, ν) the sequence P_{T,ν}^n f is strongly convergent in L¹(Y, B, ν) to ∫ f(y) ν(dy), i.e.

lim_{n→∞} ∫ |P_{T,ν}^n f(y) − ∫ f(y) ν(dy)| ν(dy) = 0.

The characterization of the ergodic properties of transformations through the properties of the evolution of densities requires that we know an invariant measure ν for T. Examples of ergodic, mixing, and exact transformations are given in the following.
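The strong convergence characterizing exactness can be previewed numerically for the hat map of Equation 1.4: an ensemble of initial conditions concentrated on a tiny subinterval of [−1, 1] spreads to the uniform invariant density within a few iterations. This is a histogram-based Monte Carlo sketch; ensemble size, bin count, and tolerances are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def hat_map(y):
    """Vectorized hat map of Equation 1.4 on [-1, 1]."""
    return np.where(y < 0.0, 2.0 * y + 1.0, 1.0 - 2.0 * y)

def l1_distance_to_uniform(points, bins=50):
    """L1 distance between a histogram density estimate and g* = 1/2."""
    hist, edges = np.histogram(points, bins=bins, range=(-1.0, 1.0), density=True)
    return float(np.sum(np.abs(hist - 0.5) * np.diff(edges)))

# Ensemble drawn from a density supported on the tiny interval [0.30, 0.31].
points = rng.uniform(0.30, 0.31, size=200_000)
d_initial = l1_distance_to_uniform(points)
for _ in range(25):
    points = hat_map(points)
d_final = l1_distance_to_uniform(points)
```

After 25 iterations the residual L1 distance is dominated by sampling noise rather than by any memory of the initial density.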
Example 1 The transformation on [0, 1]
T(y) = y + φ mod 1,

known as rotation on the circle, is ergodic with respect to the Lebesgue measure when φ is irrational. The associated Frobenius-Perron operator is given by
P_T f(y) = f(y − φ).
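Example 1 admits a quick numerical illustration of ergodicity: by the Birkhoff ergodic theorem, time averages along an orbit of the irrational rotation converge to the space average, here ∫_0^1 cos(2πy) dy = 0. The rotation number φ = √2 − 1, the initial point, and the orbit length are our choices.

```python
import math

PHI = math.sqrt(2.0) - 1.0  # an irrational rotation number

def rotation(y):
    """Rotation on the circle, T(y) = y + phi mod 1."""
    return (y + PHI) % 1.0

def time_average(f, y0, n):
    """Average of f along the first n points of the orbit of y0."""
    total, y = 0.0, y0
    for _ in range(n):
        total += f(y)
        y = rotation(y)
    return total / n

# The space average of cos(2*pi*y) over [0, 1] is 0.
avg = time_average(lambda y: math.cos(2.0 * math.pi * y), 0.2, 100_000)
```

Note that the rotation is ergodic but not mixing, so densities do not converge under P_T; only time averages behave this way.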
Example 2 The baker map on [0, 1] × [0, 1],

T(y, z) = (2y, z/2) for 0 ≤ y ≤ 1/2, (2y − 1, 1/2 + z/2) for 1/2 < y ≤ 1,

is mixing with respect to the Lebesgue measure. The Frobenius-Perron operator is given by
P_T f(y, z) = f(y/2, 2z) for 0 ≤ z ≤ 1/2, f(1/2 + y/2, 2z − 1) for 1/2 < z ≤ 1.

Example 3 The hat map on [−1, 1] defined by Equation 1.4 is exact with respect to the Lebesgue measure, and has a Frobenius-Perron operator given by
P_T f(y) = (1/2) f((y − 1)/2) + (1/2) f((1 − y)/2).

Example 4 A class of piecewise linear transformations on [0, 1] is given by
T(y) = Ny − 2n for y ∈ [2n/N, (2n+1)/N), 2n + 2 − Ny for y ∈ [(2n+1)/N, (2n+2)/N), (2.6)

where n = 0, 1, ..., [(N − 1)/2] and [z] denotes the integer part of z. For N ≥ 2, these piecewise linear maps generalize the hat map, are exact with respect to the Lebesgue measure, and have the invariant density

g∗(y) = 1_{[0,1]}(y). (2.7)

Example 5 The Chebyshev maps (Adler and Rivlin (1964)) on [−1, 1], studied by Beck and Roepstorff (1987), Beck (1996) and Hilgers and Beck (1999), are given by
S_N(y) = cos(N arccos y), N = 0, 1, ..., (2.8)

with S_0(y) = 1 and S_1(y) = y. They are conjugate to the transformation of Example 4, and satisfy the recurrence relation S_{N+1}(y) = 2y S_N(y) − S_{N−1}(y). For N ≥ 2 they are exact with respect to the measure with the density

g∗(y) = 1 / (π √(1 − y²)).

For N = 2 the Frobenius-Perron operator is given by
P_{S_2} f(y) = (1 / (2√(2y + 2))) [f(√((y + 1)/2)) + f(−√((y + 1)/2))]
and the transfer operator by

P_{S_2,ν} f(y) = (1/2) [f(√((y + 1)/2)) + f(−√((y + 1)/2))].

The construction of a transfer operator for a conjugate map is presented in the next theorem from Lasota and Mackey (1994).
Theorem 1 (Lasota and Mackey (1994, Theorem 6.5.2)) Let T : [0, 1] → [0, 1] be a measurable and nonsingular (with respect to the Lebesgue measure) transformation. Let ν : B([a, b]) → [0, ∞) be a probability measure with a strictly positive density g∗, that is g∗(y) > 0 a.e. Let a second transformation S : [a, b] → [a, b] be given by S = G^{-1} ∘ T ∘ G, where

G(x) = ∫_a^x g∗(y) dy, a ≤ x ≤ b.

Then the transfer operator P_{S,ν} is given by

P_{S,ν} f = U_G P_T U_{G^{-1}} f, for f ∈ L¹([a, b], B([a, b]), ν), (2.9)

where U_G, U_{G^{-1}} are Koopman operators for G and G^{-1}, respectively, and P_T is the Frobenius-Perron operator for T. As a consequence, ν is invariant for S if and only if the Lebesgue measure is invariant for T.
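Before turning to further examples, the explicit N = 2 Frobenius-Perron operator of Example 5 can be verified numerically: S_2(y) = 2y² − 1 has the two preimages ±√((y + 1)/2), and applying P_{S_2} to g∗(y) = 1/(π√(1 − y²)) should return g∗ itself. A minimal check (function names are ours):

```python
import math

def g_star(y):
    """Invariant density of the Chebyshev maps: 1/(pi*sqrt(1-y^2))."""
    return 1.0 / (math.pi * math.sqrt(1.0 - y * y))

def P_S2(f, y):
    """Frobenius-Perron operator of S_2(y) = 2y^2 - 1 evaluated at y."""
    x = math.sqrt(0.5 * y + 0.5)   # the two preimages of y are +x and -x
    return (f(x) + f(-x)) / (2.0 * math.sqrt(2.0 * y + 2.0))

ys = [k / 50.0 for k in range(-49, 50)]   # stay away from the endpoints
max_err = max(abs(P_S2(g_star, y) - g_star(y)) for y in ys)
```

The identity holds pointwise up to floating-point rounding, confirming that g∗ is a fixed point of P_{S_2}.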
Example 6 The dyadic map on [−1, 1] is given by

T(y) = 2y + 1 for y ∈ [−1, 0], 2y − 1 for y ∈ (0, 1], (2.10)

and has the uniform invariant density

g∗(y) = (1/2) 1_{[−1,1]}(y).

Like the hat map, it is exact with respect to the normalized Lebesgue measure on [−1, 1]. It has a Frobenius-Perron operator given by
P_T f(y) = (1/2) f((y − 1)/2) + (1/2) f((y + 1)/2).

Example 7 Alexander and Yorke (1984) defined a generalized baker transformation (also known as a fat/skinny baker transformation) S_β : [−1, 1] × [−1, 1] → [−1, 1] × [−1, 1] by

S_β(x, y) = (βx + (1 − β) h(y), T(y)),

where 0 < β < 1, T is the dyadic map on [−1, 1], and

h(y) = 1 for y ≥ 0, −1 for y < 0.

For every β ∈ (0, 1) the transformation S_β has an invariant probability measure on [−1, 1] × [−1, 1] and is mixing. The invariant measure is the product of a so-called infinitely convolved Bernoulli measure (see Section 4) and the normalized Lebesgue measure on [−1, 1]. If β = 1/2, the transformation S_β is conjugated through a linear transform of the plane to the baker map of Example 2. If β < 1/2, the transformation S_β does not have an invariant density (with respect to the planar Lebesgue measure).
Example 8 The continued fraction map

T(y) = 1/y mod 1, y ∈ (0, 1], (2.11)
has an invariant density

g∗(y) = 1 / ((1 + y) ln 2) (2.12)

and is exact. The Frobenius-Perron operator is given by
P_T f(y) = Σ_{k=1}^∞ (1 / (y + k)²) f(1 / (y + k))

and the transfer operator by
P_{T,ν} f(y) = Σ_{k=1}^∞ [(y + 1) / ((y + k)(y + k + 1))] f(1 / (y + k)).

Example 9 The quadratic map is given by
T_β(y) = 1 − βy², y ∈ [−1, 1],

where 0 < β ≤ 2. It is known that there exists a positive Lebesgue measure set of parameter values β such that the map T_β has an absolutely continuous (with respect to Lebesgue measure) invariant measure ν_β (Jakobson (1981) and Benedicks and Carleson (1985)). Let α > 0 be a very small number and let
Δ_ε = {β ∈ [2 − ε, 2] : |T_β^n(0)| ≥ e^{−αn} and |(T_β^n)′(T_β(0))| ≥ (1.9)^n for all n ≥ 0}

for ε > 0. Young (1992) proved that for sufficiently small ε and for every β ∈ Δ_ε the transformation T_β is exact with respect to ν_β, and this measure is supported on [T_β²(0), T_β(0)].
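Returning to the continued fraction map of Example 8: the invariance of the density (2.12) can be checked numerically from the series form of the Frobenius-Perron operator. Truncating the sum at K terms leaves a tail of order 1/(K ln 2), so the residual below is set by the truncation, not by the identity itself (the truncation level and test points are our choices).

```python
import math

LN2 = math.log(2.0)

def gauss_density(y):
    """The density (2.12): g*(y) = 1/((1 + y) ln 2)."""
    return 1.0 / ((1.0 + y) * LN2)

def P_T_truncated(f, y, K=50_000):
    """Truncation of P_T f(y) = sum_{k>=1} f(1/(y+k)) / (y+k)^2."""
    return sum(f(1.0 / (y + k)) / (y + k) ** 2 for k in range(1, K + 1))

ys = [0.05 * k for k in range(1, 20)]
max_err = max(abs(P_T_truncated(gauss_density, y) - gauss_density(y)) for y in ys)
```

Analytically each term equals 1/((y + k)(y + k + 1) ln 2), and the sum telescopes to 1/((y + 1) ln 2), which is exactly g∗(y).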
Example 10 The Manneville-Pomeau map T_β : [0, 1] → [0, 1] is given by

T_β(y) = y + y^{1+β} mod 1,

where β ∈ (0, 1). The map has an absolutely continuous invariant probability measure ν_β with density satisfying

c_1 / y^β ≤ g∗(y) ≤ c_2 / y^β

for some constants c_2 ≥ c_1 > 0 (cf. Thaler (1980)), and is exact.

Finally, we discuss the notion of a Sinai-Ruelle-Bowen measure or SRB measure of T, which was first conceived in the setting of Axiom A diffeomorphisms on compact Riemannian manifolds. This notion varies from author to author (Alexander and Yorke, 1984; Eckmann and Ruelle, 1985; Tsujii, 1996; Young, 2002; Hunt et al., 2002). Let (Y, ρ) be a compact metric space with a reference measure m, e.g. a compact subset of R^k with m the Lebesgue measure, or a compact Riemannian manifold with m the Riemannian measure on Y. If T : Y → Y is a continuous map, then by the Bogolyubov-Krylov theorem there always exists at least one invariant probability measure for T. When there is more than one such measure, the question arises as to which invariant measure is "interesting", and this has led to attempts to give a good definition of "physically" relevant invariant measures. Though this seems to be a rather vague and poorly defined concept, loosely speaking one would expect that one criterion for a physically relevant invariant measure would be whether or not it was observable in the context of some laboratory or numerical experiment.
An invariant measure ν for T is called a natural or physical measure if there is a positive Lebesgue measure set Y_0 ⊂ Y such that for every y ∈ Y_0 and for every continuous observable f : Y → R

lim_{n→∞} (1/n) Σ_{i=0}^{n−1} f(T^i(y)) = ∫ f(z) ν(dz). (2.13)

In other words, the average of f along the trajectory of T starting in Y_0 is equal to the average of f over the space Y. We can also say that for each y ∈ Y_0 the measures (1/n) Σ_{i=0}^{n−1} δ_{T^i(y)} are weakly convergent to ν:

(1/n) Σ_{i=0}^{n−1} δ_{T^i(y)} →^d ν

(see the next section for this notation).

Observe that if ν is ergodic, then from the individual Birkhoff ergodic theorem it follows that for every f ∈ L¹(Y, B, ν) Condition 2.13 holds for almost all y ∈ Y, i.e. except for a subset of Y of ν measure zero. Thus, if T has an ergodic absolutely continuous invariant measure ν with density g∗, then every continuous function f is integrable with respect to ν and Condition 2.13 holds for almost every point from the set {y ∈ Y : g∗(y) > 0}, i.e. except for a subset of Lebesgue measure zero. Therefore such a ν is a physical measure for T. Not only absolutely continuous measures are physical measures. Consider, for example, the generalized baker transformation S_β of Example 7. Alexander and Yorke (1984) showed that there is a unique physical measure ν_β for each β ∈ (0, 1). This measure is mixing and hence ergodic. Although for β > 1/2 the transformation expands areas, the measure ν_β might not be absolutely continuous for certain values of the parameter β (e.g. β = (−1 + √5)/2), in which case the Birkhoff ergodic theorem only implies Condition 2.13 on a zero Lebesgue measure set. Therefore a completely different argument was needed in the proof of the physical property. In the context of smooth invertible maps having an Axiom A attractor, the existence of a unique physical measure on the attractor was first proved for Anosov diffeomorphisms by Sinai (1972) and later generalized by Ruelle (1976) and Bowen (1975).
Roughly speaking, these are maps having uniformly expanding and contracting directions, and their physical invariant measures have densities with respect to the Lebesgue measure in the expanding directions (being usually singular in the contracting directions). This property then led to the characterization of a Sinai-Ruelle-Bowen measure. In a recent attempt to go beyond maps having an Axiom A attractor, Young (2002) additionally requires that T has a positive Lyapunov exponent a.e. The precise definition relies strongly on the smoothness and invertibility of the map T. Note that the generalized baker transformation S_β has Lyapunov exponents equal to ln 2 and ln β, and the measure ν_β is absolutely continuous along all vertical directions.
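Condition 2.13 is easy to probe numerically. For the Chebyshev map S_2 of Example 5, which is exact with respect to g∗(y) = 1/(π√(1 − y²)), the time average of f(y) = y² along a typical orbit should approach the space average ∫_{−1}^{1} y² g∗(y) dy = 1/2. A Monte Carlo sketch; the orbit length, initial condition, and the floating-point guard are our choices.

```python
def S2(y):
    """Chebyshev map S_2(y) = 2y^2 - 1 on [-1, 1]."""
    y = 2.0 * y * y - 1.0
    # Guard: keep the floating-point orbit off the fixed point y = 1,
    # onto which rounding could otherwise lock the trajectory.
    return min(y, 1.0 - 1e-12)

def birkhoff_average(f, y0, n):
    """Time average (1/n) sum_{i<n} f(T^i(y0)), the left side of (2.13)."""
    total, y = 0.0, y0
    for _ in range(n):
        total += f(y)
        y = S2(y)
    return total / n

time_avg = birkhoff_average(lambda y: y * y, 0.1234, 1_000_000)
space_avg = 0.5  # integral of y^2 / (pi*sqrt(1-y^2)) over [-1, 1]
```

That the agreement holds for an arbitrarily chosen initial point, and not merely for ν-almost every point, is what the "physical measure" property asserts.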
2.3 Brownian motion from deterministic perturbations

2.3.1 Central limit theorems

We follow the terminology of Billingsley (1968). If ζ is a measurable mapping from a probability space (Ω, F, Pr) into a measurable space (Z, A), we call ζ a Z-valued random variable. The distribution of ζ is the normalized measure µ = Pr ∘ ζ^{-1} on (Z, A), i.e.

µ(A) = Pr(ζ^{-1}(A)) = Pr{ω : ζ(ω) ∈ A} = Pr{ζ ∈ A}, A ∈ A.
If Z = R^k, we also have the associated distribution function of ζ or µ, defined by
F(x) = µ{y : y ≤ x} = Pr{ζ ≤ x}, x ∈ R^k,

where {y : y ≤ x} = {y : y_i ≤ x_i, i = 1, ..., k} for x = (x_1, ..., x_k). The random variables ζ and ξ are, by definition, (statistically) independent if
Pr{ζ ∈ A, ξ ∈ B} = Pr{ζ ∈ A} Pr{ξ ∈ B},

i.e. the distribution of the pair (ζ, ξ) is the product of the distribution of ζ with that of ξ.

Let (Z, ρ) be a metric space and B(Z) be the σ-algebra of Borel subsets of Z. A sequence (µ_n) of normalized measures on (Z, B(Z)) is said to converge weakly to a normalized measure µ if
lim_{n→∞} ∫_Z f(z) µ_n(dz) = ∫_Z f(z) µ(dz)

for every continuous bounded function f : Z → R. Note that the integrals ∫_Z f(z) µ(dz) completely determine µ, thus the sequence (µ_n) cannot converge weakly to two different limits. Note also that weak convergence depends only on the topology of Z, not on the specific metric that generates it; thus two equivalent metrics give rise to the same notion of weak convergence. If we have a family {µ_τ : τ ≥ 0} of normalized measures instead of a sequence, we can also speak of weak convergence of µ_τ to µ when τ goes to ∞ or to some finite value τ_0 in a continuous manner. This then means that µ_τ converges weakly to µ as τ → τ_0 if and only if µ_{τ_n} converges weakly to µ for each sequence (τ_n) such that τ_n → τ_0 as n → ∞.

If Z = R^k and F and F_n are, respectively, the distribution functions of µ and µ_n, then (µ_n) converges weakly to µ if and only if
lim_{n→∞} F_n(z) = F(z) at continuity points z of F.

The characteristic function φ_µ of a normalized measure µ on R^k is defined by
φ_µ(r) = ∫_{R^k} exp(i⟨r, z⟩) µ(dz), r ∈ R^k.
Here all the random variables are defined on the same probability space. Note that if Condition 2.14 holds then

lim_{n→∞} Pr_0(ρ(ζ_n, ζ) > ε) = 0 for all ε > 0

for every probability measure Pr_0 on (Ω, F) which is absolutely continuous with respect to Pr. In other words, convergence in probability is preserved by an absolutely continuous change of measure. We will also frequently use the following result from Billingsley (1968, Theorem 4.1): if (Z, ρ) is a separable metric space, and
if ζ̃_n →^d ζ and ρ(ζ_n, ζ̃_n) →^P 0, then ζ_n →^d ζ. (2.15)

We will write N(0, σ²) for either a real-valued random variable which is Gaussian distributed with mean 0 and variance σ², or the measure on (R, B(R)) with density

(1 / (σ√(2π))) exp(−x² / (2σ²)). (2.16)

Since σN(0, 1) = N(0, σ²) when σ > 0, we can always write σN(0, 1) for σ ≥ 0, which in the case σ = 0 reduces to 0. The characteristic function of N(0, 1) is of the form φ(r) = exp(−r²/2), r ∈ R.

Let (ζ_j)_{j≥1} be a sequence of real-valued zero-mean random variables with finite variance. If there is σ > 0 such that

(Σ_{j=1}^n ζ_j) / √n →^d σN(0, 1), (2.17)

then (ζ_j)_{j≥1} is said to satisfy the central limit theorem (CLT). Note that if the random variables ζ_j are defined on the same probability space (Ω, F, Pr), we have the equivalent formulation of (2.17) in the case of σ > 0:
\[
\lim_{n\to\infty} \Pr\left\{\omega \in \Omega : \frac{\sum_{j=1}^{n} \zeta_j(\omega)}{\sigma\sqrt{n}} \le z\right\} = \Phi(z), \qquad z \in \mathbb{R},
\]
where
\[
\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp\left(-\frac{1}{2} t^2\right) dt \tag*{(2.18)}
\]
is the standard Gaussian distribution function. In the case $\sigma = 0$,
\[
\frac{\sum_{j=1}^{n} \zeta_j}{\sqrt{n}} \stackrel{P}{\to} 0.
\]
Let $\{w(t), t \in [0,\infty)\}$ be a standard Wiener process (Brownian motion), i.e. $\{w(t), t \in [0,\infty)\}$ is a family of real-valued random variables on some probability space $(\Omega, \mathcal{F}, \Pr)$ satisfying the following properties:
(i) the process starts at zero: $w(0) = 0$ a.e.;
(ii) for $0 \le t_1 < t_2$ the random variable $w(t_2) - w(t_1)$ is Gaussian distributed with mean 0 and variance $t_2 - t_1$;

(iii) the increments of $w$ over pairwise disjoint time intervals are independent.

Existence of the Wiener process $\{w(t), t \in [0,1]\}$ is equivalent to the existence of the Wiener measure $W$ on the space $C[0,1]$ of continuous functions on $[0,1]$ with the topology of uniform convergence, which makes $C[0,1]$ a complete separable metric space. Then, simply, $W$ is the distribution of the random variable $W : \Omega \to C[0,1]$ defined by $W(\omega) : t \mapsto w(t)(\omega)$.

Let $D[0,1]$ be the space of right continuous real-valued functions on $[0,1]$ with left-hand limits. We endow $D[0,1]$ with the Skorohod topology, which is defined by the metric
\[
\rho_S(\psi, \tilde{\psi}) = \inf_{s \in S} \Big( \sup_{t \in [0,1]} |\psi(t) - \tilde{\psi}(s(t))| + \sup_{t \in [0,1]} |t - s(t)| \Big), \qquad \psi, \tilde{\psi} \in D[0,1],
\]
where $S$ is the family of strictly increasing, continuous mappings $s$ of $[0,1]$ onto itself such that $s(0) = 0$ and $s(1) = 1$ (Billingsley, 1968, Section 14). The metric space $(D[0,1], \rho_S)$ is separable but not complete; there is, however, an equivalent metric which turns $D[0,1]$ with the Skorohod topology into a complete separable metric space. Since the Skorohod topology and the uniform topology coincide on $C[0,1]$, $W$ can be considered as a measure on $D[0,1]$.

A stronger result than the CLT is a weak invariance principle, also called a functional central limit theorem (FCLT). Let $(\zeta_j)_{j \ge 1}$ be a sequence of real-valued zero-mean random variables with finite variance. Let $\sigma > 0$ and define the process $\{\psi_n(t), t \in [0,1]\}$ by
\[
\psi_n(t) = \frac{1}{\sigma \sqrt{n}} \sum_{j=1}^{[nt]} \zeta_j \quad \text{for } t \in [0,1]
\]
(where the sum from 1 to 0 is set equal to 0). Note that $\psi_n$ is a right continuous step function, a random variable of $D[0,1]$, and $\psi_n(0) = 0$. If
\[
\psi_n \stackrel{d}{\to} w
\]
(here the convergence in distribution is in $D[0,1]$), then $(\zeta_j)_{j \ge 1}$ is said to satisfy the FCLT. If for every $k \ge 1$ and every vector $(t_1, \ldots, t_k) \in [0,1]^k$ with $t_1 < \cdots < t_k$ the distribution of $(\psi_n(t_1), \ldots, \psi_n(t_k))$ converges weakly to that of $(w(t_1), \ldots, w(t_k))$, and if
\[
\lim_{\delta \to 0} \limsup_{n \to \infty} \Pr\Big( \sup_{|t - s| \le \delta} |\psi_n(s) - \psi_n(t)| > \epsilon \Big) = 0, \tag*{(2.19)}
\]
then $\psi_n$ converges in distribution to the Wiener process $w$.

The term functional central limit theorem comes from the mapping theorem (Billingsley, 1968, Theorem 5.1), according to which for any functional $f : D[0,1] \to \mathbb{R}$, measurable and continuous on a set of Wiener measure 1, the distribution of $f(\psi_n)$ converges weakly to the distribution of $f(w)$. This applies in particular to the functional $f(\psi) = \sup_{0 \le s \le 1} \psi(s)$. Instead of real-valued functionals one can also consider mappings with values in a metric space. For example, the theorem applies to any $f : D[0,1] \to D[0,1]$ of the form $f(\phi)(t) = \sup_{s \le t} \phi(s)$ or $f(\phi)(t) = \int_0^t \phi(s)\,ds$.

2.3.2 FCLT for noninvertible maps

How can we obtain, and in what sense, Brownian-like motion from a (semi)dynamical system? This question is intimately connected with central limit theorems for non-invertible systems and various invariance principles. Many CLT results and invariance principles for maps have been proved; see, e.g., the survey of Denker (1989). These results extend back over some decades, including contributions by Ratner (1973), Boyarsky and Scarowsky (1979), Wong (1979), Keller (1980), Jabłoński and Malczak (1983a), Jabłoński (1991), Liverani (1996) and Viana (1997).

First, however, remember that if we have a time series $y(j)$ and a bounded integrable function $h : X \to \mathbb{R}$, then the correlation of $h$ is defined as
\[
R_h(n) = \lim_{N \to \infty} \frac{1}{N} \sum_{j=0}^{N-1} h(y(j+n))\, h(y(j)).
\]
If the time series is generated by a measurable transformation $T : Y \to Y$ operating on a normalized measure space $(Y, \mathcal{B}, \nu)$, and if further $\nu$ is invariant under $T$ and $T$ is ergodic, then we can rewrite the correlation as
\[
R_h(n) = \lim_{N \to \infty} \frac{1}{N} \sum_{j=0}^{N-1} h(y(j+n))\, h(y(j)) = \int h(y)\, h(T^n(y))\,\nu(dy).
\]
\[
\lim_{n \to \infty} \int \left( \frac{\sum_{j=0}^{n-1} h(T^j(y))}{\sqrt{n}} \right)^{2k-1} \nu(dy) = 0.
\]
Proof. First note that the transfer operator $P_{T,\nu}$ preserves the integral, i.e. $\int P_{T,\nu} h(y)\,\nu(dy) = \int h(y)\,\nu(dy)$.
Hence h(y)ν(dy) = 0. Now let n 1. Since ν is a finite measure, the Koopman and ≥ R transfer operators are adjoint on the space L2(Y, ,ν). This implies R R B n n h(y)h(T (y))ν(dy) = h(y)UT h(y)ν(dy) Z Z n 1 = h(y)U − h(y)ν(dy) = 0 PT,ν T Z and completes the proof of (i). Part (ii) follows from Lemma 1 and Theorem 12 in Appendix, since for each n 1 we have ≥ n 1 n 1 − j 1 n j h T = h T − . √n ◦ √n ◦ Xj=0 Xj=1 If σ > 0, then the CLT is a consequence of part (ii), while the FCLT follows from Lemma 2 and 3 in Appendix. Finally, the existence and convergence of moments follow from Theorem 5.3 and 5.4 of Billingsley (1968) and from Lemma 1 in Appendix. Remark 1 Note that if T is invertible then the equation (h) = 0 has only a zero solution, so PT,ν the theorem does not apply. We now address the question of solvability of the equation (h) = 0. PT,ν Proposition 1 Let (Y, ,ν) be a normalized measure space and T : Y Y be a measurable map B → preserving the measure ν. Let Y = Y Y with Y ,Y and ν(Y Y ) = 0 and let a bijective 1 ∪ 2 1 2 ∈ B 1 ∩ 2 map ϕ : Y Y be such that both ϕ and ϕ 1 are measurable and preserve the measure ν. Assume 1 → 2 − that for every A there is B such that B Y and ∈B ∈B ⊆ 1 1 T − (A)= B ϕ(B). (2.20) ∪ If h(y)+ h(ϕ(y)) = 0 for almost every y Y , then h = 0. ∈ 1 PT,ν Proof. From Condition 2.20 it follows that h(y)ν(dy)= h(y)ν(dy)+ h(y)ν(dy). −1 ZT (A) ZB Zϕ(B) Since 1 h(y)ν(dy)= h(ϕ− (ϕ(y)))ν(dy), Zϕ(B) Zϕ(B) the last integral is equal to h(ϕ(y))(ν ϕ)(dy)= h(ϕ(y))ν(dy) ◦ ZB ZB 1 by the change of variables applied to ϕ− and finally by the invariance of ν for ϕ. This, with the definition of , completes the proof. PT,ν 21 Remark 2 The above proposition can be easily generalized. For example, we can have 1 T − (A)= B ϕ (B) ϕ (B) ∪ 1 ∪ 2 with the sets B, ϕ1(B), ϕ2(B) pairwise disjoint. Note that if Y is an interval then it is enough to check Condition 2.20 for intervals of the form [a, b). 
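The cancellation mechanism behind Proposition 1 is easy to verify numerically. The following sketch (our own illustration, not part of the paper) evaluates the transfer operator of the tent map $T(y) = 1 - 2|y|$ on $[-1,1]$ on a grid and checks that odd observables are annihilated while even ones are not; the grid size and test functions are arbitrary choices.

```python
import numpy as np

# Transfer operator of the tent map T(y) = 1 - 2|y| on [-1, 1] with respect
# to its (uniform) invariant measure: each x in [-1, 1] has the two preimages
# +/- (1 - x)/2, and both branches of T have slope of modulus 2, so
#   (P_T f)(x) = 0.5 * ( f((1 - x)/2) + f(-(1 - x)/2) ).
def transfer_tent(f, x):
    return 0.5 * (f((1.0 - x) / 2.0) + f(-(1.0 - x) / 2.0))

x = np.linspace(-1.0, 1.0, 1001)

# Odd observable: Proposition 1 with Y1 = [-1, 0] and phi(y) = -y predicts
# P_{T,nu} h = 0, and indeed the two branches cancel exactly.
h_odd = lambda y: y**3
print(np.max(np.abs(transfer_tent(h_odd, x))))    # 0.0

# An even observable is not annihilated.
h_even = lambda y: y**2
print(np.max(np.abs(transfer_tent(h_even, x))))   # 1.0
```

The exact cancellation for odd $h$ is the content of Example 11 below, which applies it to the tent and Chebyshev maps.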
Example 11 For an even transformation T on [ 1, 1] with an even invariant density we can take − Y = [ 1, 0] and ϕ(y)= y. In this case h = 0 for every odd function on [ 1, 1]. In particular, 1 − − PT,ν − this applies to the tent map and to the Chebyshev maps SN of Example 5 with N even. We also have h = 0 when h(y) = y and S is the Chebyshev map with N odd. Indeed, first observe PSN ,ν N that by Theorem 1 we have h =0 if P f = 0 where T is given by 2.6 and f(y) = cos(πy) PSN ,ν TN N for y [0, 1] and then note that P f = 0 follows because the expression ∈ TN (N 1)/2 − 2n 2n f(y)+ f + y + f y N N − n=1 X reduces to (N 1)/2 − 2nπ cos(πy)(1 + 2 cos ) N n=1 X which is equal to 0. For the dyadic map, we can take ϕ(y) = y + 1. Then any function satisfying h(y)+ h(y + 1) = 0 gives a solution to h = 0. PT,ν The next example shows that the assumption of ergodicity in Theorem 2 is in a sense essential. Example 12 Let T : [0, 1] [0, 1] be defined by → 2y, y [0, 1 ) ∈ 4 T (y)= 2y 1 , y [ 1 , 3 ), − 2 ∈ 4 4 2y 1, y [ 3 , 1]. − ∈ 4 The Frobenius-Perron operator is given by 1 1 1 1 1 1 1 1 PT f(y)= f y 1 1 (y)+ f y + + f y + 1 1 (y). 2 2 [0, 2 ) 2 2 4 2 2 2 [ 2 ,1] Observe that the Lebesgue measure on ([0, 1], ([0, 1])) is invariant for T and that T is not ergodic 1 1 1 1 1 1 B since T − ([0, 2 ]) = [0, 2 ] and T − ([ 2 , 1]) = [ 2 , 1]. Consider the following functions 1, y [0, 1 ) 1, y [0, 1 ) ∈ 4 ∈ 4 1, y [ 1 , 1 ), h (y)= 1, y [ 1 , 3 ), h (y)= − ∈ 4 2 1 − ∈ 4 4 2 2, y [ 1 , 3 ), 1, y [ 3 , 1], − ∈ 2 4 ∈ 4 2, y [ 3 , 1]. ∈ 4 2 2 2 2 We see at once that PT h1 = PT h2 = 0, PT h1 = h1, andPT h2 = h2. It is immediate that for every y [0, 1] we have h2(T (y)) = h2(y)=1 and h2(T (y)) = h2(y). Lemma 1 and Theorem 13 in ∈ 1 1 2 2 Appendix show that n 1 1 − h T j d N(0, 1), √n 1 ◦ → Xj=0 22 while n 1 1 − h T j d ζ, √n 2 ◦ → Xj=0 where ζ has the characteristic function of the form 1 r2 φ (r)= exp h2(y) dy. ζ − 2 2 Z0 The density of ζ is equal to 1 1 x2 1 1 x2 exp + exp , x R. 
2 √2π − 2 2 √8π − 8 ∈ j Consequently, the sequence (h2 T )j 0 does not satisfy the CLT. ◦ ≥ We now discuss the problem of changing the underlying probability space for the sequence j j (h T )j 0. The random variables h T in Theorem 2 are defined on the probability space ◦ ≥ ◦ (Y, ,ν). Since ν is invariant for T , they have the same distribution and constitute a stationary B sequence. We shall show that the result (iii) of Theorem 2 remains true if the transformation T is j exact and h T are random variables on (Y, ,ν0) where ν0 is an arbitrary normalized measure ◦ B j absolutely continuous with respect to ν. In other words we can consider random variables h(T (ξ0)) with ξ0 distributed according to ν0. Now these random variables are not identically distributed and constitute a non-stationary sequence. For example, consider the hat map T (y) = 1 2 y on − | | [ 1, 1] and h(y)= y. Then Theorem 2 applies if ξ is uniformly distributed. We will show that we − 0 can also consider ξ having a density with respect to the normalized Lebesgue measure on [ 1, 1]. 0 − Theorem 3 Let (Y, ,ν) be a normalized measure space and T : Y Y be exact with respect to 2 B → ν. Let h L (Y, ,ν) be such that T,νh = 0 and σ = h 2 > 0. If ξ0 is distributed according to a ∈ B P || || j normalized measure ν0 on (Y, ) which is absolutely continuous with respect to ν, then (h T (ξ0))j 0 B ◦ ≥ satisfies both the CLT and FCLT. Proof. Let g be the density of the measure ν with respect to ν. On the probability space (Y, ) 0 0 B define the random variables ζn by n 1 1 − ζ (y)= h(T j(y)), y Y. n σ√n ∈ Xj=0 To prove the CLT we shall use the continuity theorem (Billingsley, 1968, Theorem 7.6), and to do so we must show that r2 lim exp(irζn(y))g0(y)ν(dy) = exp , r R. n − 2 ∈ →∞ Z Fix ǫ> 0. Since T is exact, there exists m 1 such that ≥ m g (y) 1 ν(dy) ǫ. 
|PT,ν 0 − | ≤ Z Define n m 1 1 − − ζ˜ (y)= h(T j(y)), n>m n σ√n Xj=0 23 and observe that m 1 1 − ζ = h T j + ζ˜ T m n σ√n ◦ n ◦ Xj=0 1 m 1 j for sufficiently large n. Since for every y the sequence − h(T (y)) converges to 0 as n , σ√n j=0 →∞ we obtain P m lim exp(irζn(y))g0(y)ν(dy) exp(irζ˜n(T (y)))g0(y)ν(dy) = 0. n − →∞ Z Z The equality exp(ir ζ˜ T m)= U m(exp(irζ˜ )) implies that n ◦ T n exp(irζ˜ (T m(y)))g (y)ν(dy)= exp(irζ˜ (y)) m g (y)ν(dy), n 0 n PT,ν 0 Z Z since the operators U and are adjoint. This gives T PT,ν exp(irζ˜ (T m(y)))g (y)ν(dy) exp(irζ˜ (y))ν(dy) m g (y) 1 ν(dy) ǫ. n 0 − n ≤ |PT,ν 0 − | ≤ Z Z Z From Theorem 2 and the continuity theorem (Billingsley, 196 8, Theorem 7.6), it follows that r2 lim exp(irζ˜n(y))ν(dy) = exp . n − 2 →∞ Z Consequently, r2 lim sup exp(irζn(y))g0(y)ν(dy) exp ǫ, n − − 2 ≤ →∞ Z which leads to the desired conclusion, as ǫ was arbitrary. Similar arguments as above, the multidimensional version of the continuity theorem, and Lemma 2 in Appendix allow us to show that the finite dimensional distributions of [nt] 1 1 − ψ (t)= h(T j(ξ )) n σ√n 0 Xj=0 converge to the finite dimensional distributions of the Wiener process w. By Lemma 3 in Appendix Condition 2.19 holds with Pr = ν. Since ν0 is absolutely continuous with respect to ν, it is easily seen that this Condition also holds with Pr = ν0, which completes the proof. Does the CLT still hold when h does not satisfy the equation h = 0? The answer to this PT,ν question is positive provided that h can be written as a sum of two functions in which one satisfies the assumptions of Theorem 2 while the other is irrelevant for the Central Limit Theorem to hold. This idea goes back to Gordin (1969). Theorem 4 Let (Y, ,ν) be a normalized measure space, T : Y Y be ergodic with respect to B → ν, and h L2(Y, ,ν) be such that h(y)ν(dy) = 0. 
If there exists h˜ L2(Y, ,ν) such that ∈ B 1 n 1 j 2 ∈ B h˜ = 0 and the sequence − (h h˜) T is convergent in L (Y, ,ν) to 0, then PT,ν √n j=0 R − ◦ B P n 1 j 2 − h T lim || j=0 ◦ ||2 = h˜ 2 (2.21) n n || ||2 →∞ P 24 and n 1 1 − h T j d h˜ N(0, 1). √n ◦ → || ||2 Xj=0 j Moreover, if the series j∞=1 h(y)h(T (y))ν(dy) is convergent, then P R ∞ h˜ 2 = h2(y)ν(dy) + 2 h(y)h(T n(y))ν(dy). (2.22) || ||2 n=1 Z X Z Proof. We have n 1 j n 1 j n 1 j − h T − h˜ T − (h h˜) T j=0 ◦ = j=0 ◦ + j=0 − ◦ . (2.23) P √n P √n P √n Since h˜ = 0, we obtain h˜(y)ν(dy)=0and h˜(T i(y))h˜(T j(y))ν(dy) =0for i = j by Condition PT,ν 6 (i) of Theorem 2. Hence R R n 1 n 1 1 − 1 − h˜ 2 = h˜ T j 2 = h˜ T j 2 || ||2 n || ◦ ||2 ||√n ◦ ||2 Xj=0 Xj=0 n 1 1 − and therefore Equation 2.21 holds. Since the sequence (h h˜) T j is convergent to 0 in √n − ◦ Xj=0 L2(Y, ,ν), it is also convergent to 0 in probability. Equation 2.23, Condition (ii) of Theorem 2 B applied to h˜, and property (2.15) complete the proof of the first part. It remains to show that n 1 j 2 − h T ∞ lim || j=0 ◦ ||2 = h2(y)ν(dy) + 2 h(y)h(T n(y))ν(dy). n n →∞ P n=1 Z X Z Since ν is T -invariant, we have 2 n 1 n 1 j 1 − 1 − h(T j(y)) ν(dy)= h + 2 h(y)h(T l(y))ν(dy), n || ||2 n Z Xj=0 Xj=1 Xl=1 Z n j j but the sequence ( j=1 h(y)h(T (y))ν(dy))n 1 is convergent to j∞=1 h(y)h(T (y))ν(dy), which ≥ completes the proof. P R P R Remark 3 Note that in the above proof of the CLT we only used the weaker condition that the n 1 1 − sequence (h h˜) T j is convergent to 0 in probability. The stronger assumption that this √n − ◦ Xj=0 sequence is convergent in L2(Y, ,ν) was used to derive Equation 2.21. Note also that all of the B computations are useless when h˜ = 0 and the most interesting situation is when h˜ is nontrivial, || ||2 i.e. h˜ > 0. || ||2 Strengthening the assumptions of the last theorem leads to the functional central limit theorem. 
Theorem 5 Let (Y, ,ν) be a normalized measure space, T : Y Y be ergodic with respect to ν, B → and h L2(Y, ,ν) be such that h(y)ν(dy) = 0. If there exists a nontrivial h˜ L2(Y, ,ν) such ∈ ˜ B 1 n 1 ˜ j ∈ B j that T,νh = 0 and the sequence j=0− (h h) T is ν-a.e. convergent to 0, then (h T )j 0 P R√n − ◦ ◦ ≥ satisfies the FCLT. P 25 Proof. Since every sequence convergent ν almost everywhere is convergent in probability, the CLT follows by the preceding Remark and Theorem 4. To derive the FCLT define [nt] 1 [nt] 1 1 − 1 − ψ (t)= h˜ T j and ψ (t)= h T j, t [0, 1], n σ√n ◦ n σ√n ◦ ∈ Xj=0 Xj=0 e where σ = h˜ . Then by (iii) of Theorem 2 we have || ||2 ψ d w. n → By property (2.15) it remains to show that e ρ (ψ , ψ ) P 0. S n n → To this end observe that e k 1 1 − j ρS(ψn, ψn) sup ψn(t) ψn(t) max (h h˜) T . ≤ | − |≤ σ√n 1 k n | − ◦ | 0 t 1 ≤ ≤ ≤ ≤ Xj=0 1 e n 1 j e Since the sequence − (h h˜) T is ν-a.e. convergent to 0, the same holds for the sequence √n j=0 − ◦ 1 k 1 ˜ j max1 k n j=0− (h h) T by an elementary analysis, which completes the proof. σ√n ≤ ≤ | P− ◦ | Remark 4 WithP the settings and notation of Theorem 4 and respectively Theorem 5, if T is exact, j then the same conclusions hold for the sequence (h T (ξ0))j 0 provided that ξ0 is distributed ◦ ≥ according to a normalized measure ν0 on (Y,ν) which is absolutely continuous with respect to ν. Indeed, since convergence to zero in probability is preserved by an absolutely continuous change of measure, we can apply the above arguments again, with Theorem 2 replaced by Theorem 3. One situation when all assumptions of the two preceding theorems are met is described in the following Theorem 6 Let (Y, ,ν) be a normalized measure space, T : Y Y be ergodic with respect to ν, B → and h L2(Y, ,ν). Suppose that the series ∈ B ∞ n h PT,ν n=0 X 2 n is convergent in L (Y, ,ν). Define f = ∞ h and h˜ = h + f f T . 
B n=1 PT,ν − ◦ ˜ 2 ˜ 1 n 1 ˜ j Then h L (Y, ,ν), T,νh = 0, h(y)ν(dy) = 0, and the sequence ( j=0− (h h) T )n 1 ∈ B P P √n − ◦ ≥ is convergent to 0 both in L2(Y, ,ν) and ν a.e. B R − P In particular, h˜ = 0 if and only if h = f T f for some f L2(Y, ,ν). || ||2 ◦ − ∈ B Proof. Since (h + f)= f, we have by Equation 2.3 PT,ν h˜ = (h + f) (f T )= f U f = 0. PT,ν PT,ν − PT,ν ◦ − PT,ν T 1 n 1 j Thus it remains to study the behavior of the sequence − (h h˜) T , which with our √n j=0 − ◦ notations reduces to 1 (f T n f). This sequence is obviously convergent to 0 in L2(Y, ,ν) √n P n ◦ − B because f T f 2 2 f 2. It is also ν-a.e. convergent to 0 which follows from the Borel- || ◦ − || ≤ || || 2 n 2 Cantelli lemma and the fact that for every ǫ> 0 the series ∞ ν(f T nǫ)= ∞ ν(f nǫ) n=1 ◦ ≥ n=1 ≥ is convergent as f L2(Y, ,ν), which completes the proof. ∈ B P P Summarizing our considerations for general h we arrive at the following sufficient conditions for the CLT and FCLT to hold. 26 Corollary 1 Let (Y, ,ν) be a normalized measure space, T : Y Y be ergodic with respect to ν, B → and h L2(Y, ,ν). If ∈ B ∞ n h < , (2.24) ||PT,ν ||2 ∞ nX=0 then σ 0 given by ≥ ∞ σ2 = h2(y)ν(dy) + 2 h(y)h(T n(y))ν(dy) n=1 Z X Z j is finite and (h T )j 0 satisfies the CLT and FCLT provided that σ > 0. ◦ ≥ Proof. Since the operators and U are adjoint on the space L2(Y, ,ν), we have PT,ν T B h(y)h(T n(y))ν(dy)= n h(y)h(y)ν(dy). PT,ν Z Z Thus h(y)h(T n(y))ν(dy) n h(y)h(y) ν(dy) n h h ≤ |PT,ν | ≤ ||PT,ν ||2|| ||2 Z Z by Schwartz’s inequality. Hence ∞ ∞ h(y)h(T n(y)) dν h n h , | | ≤ || ||2 ||PT,ν ||2 n=1 n=1 X Z X n which shows that the series n∞=1 h(y)h(T (y))ν(dy) is convergent. Since assumption 2.24 im- n 2 plies that the series ∞ h is absolutely convergent in L (Y, ,ν), the assertions follow from n=0 PT,νP R B Theorems 4, 5, and 6. P Remark 5 Note that if Condition 2.24 holds then n lim h 2 = 0. n T,ν →∞ ||P || Since ν is finite, we have n lim h 1 = 0. 
n T,ν →∞ ||P || Therefore the validity of Condition 2.24 on a dense subset of h L1(Y, ,ν) : h(y)ν(dy) = 0 { ∈ B } implies that T is exact. R Assume that Y is an interval [a, b] in R for some a, b. Recall that a function h : [a, b] R is → said to be of bounded variation if b n h = sup h(yi 1) h(yi) < , | − − | ∞ a _ Xi=1 where the supremum is taken over all finite partitions, a = y R 1 h(y) h for h V ([0, 1]), y [0, 1]. (2.25) | |≤ ∈ ∈ 0 _ 27 Example 13 For the continued fraction map there exists a positive constant c < 1 such that for every function h of bounded variation over [0, 1] we have (Iosifescu, 1992, Corollary, p. 904) 1 1 n h cn h for all n 1, PT,ν ≤ ≥ 0 0 _ _ where ν is Gauss’s measure with density g as in Example 8. From this and Condition 2.25 it ∗ follows that for every h V ([0, 1]) ∈ 1 n n n T,νh 2 sup T,ν h(y) c h. ||P || ≤ y [0,1] |P |≤ ∈ _0 Consequently, Condition 2.24 is satisfied and Corollary 1 applies. By definition, the Frobenius-Perron operator is a linear operator from L1([0, 1]) to L1([0, 1]), but for sufficiently smooth piecewise monotonic maps it can be defined as a pointwise map of V ([0, 1]) into V ([0, 1]). Since functions of bounded variation have only countably many points of 1 discontinuity, redefining PT at those points does not change its L properties. If, moreover, one is able to give an estimate for the iterates of PT in the bounded variation norm 1 1 h = h + h(y) dy, || ||BV | | 0 0 _ Z n p then obviously one is able to estimate the norm of PT f in all L ([0, 1]) spaces. In many cases there exist c , c > 0 and r (0, 1) such that 1 2 ∈ 1 P nh c rn( h + c h ), h V ([0, 1]). (2.26) || T ||BV ≤ 1 2|| ||1 ∈ 0 _ We now describe two classes of chaotic maps for which one can easily show that Condition 2.24 holds for every h V ([0, 1]). Consider a transformation T : [0, 1] [0, 1] having the following ∈ → properties (i) there is a partition 0 = a0 < a1 < ... 
< al = 1 of [0, 1] such that for each integer i = 1, ..., l the restriction of T to [ai 1, ai) is continuous and convex, − (ii) T (ai 1)=0 and T ′(ai 1) > 0, − − (iii) T ′(0) > 1. For such transformation the Frobenius-Perron operator has a unique fixed point g , where g is ∗ ∗ of bounded variation and a decreasing function of y (Lasota and Mackey, 1994). Moreover it is bounded from below when, for example, T ([0, a1]) = [0, 1]. It is known (Jab lo´nski and Malczak, 1983b) that the estimate in Equation 2.26 holds for the Frobenius-Perron operator PT for trans- formations with these three properties. Suppose that g (y) > 0 for a.a. y [0, 1]. Since ∗ ∈ T,ν f for all f L∞([0, 1], ,ν), we have for all h f L∞([0, 1], ,ν) ||P ||∞ ≤ || ||∞ ∈ B ∈ ∈ B n 1/2 n 1/2 T,νh 2 h T,ν h 1 (2.27) ||P || ≤ || ||∞ ||P || 28 n If h is of bounded variation with h(y)g (y)dy = 0, then hg V ([0, 1]) and T,νh 1 = n ∗ ∗ ∈ ||P || PT (hg ) 1. Thus || ∗ || R n h = O(rn/2) ||PT,ν ||2 by Equation 2.26 and Corollary 1 applies. 1 Let a transformation T : [0, 1] [0, 1] be piecewise monotonic, the function T ′(y) be of bounded → | | variation over [0, 1] and infy [0,1] T ′(y) > 1. For such transformations the Frobenius-Perron opera- ∈ | | tor has a fixed point g and g is of bounded variation (Lasota and Mackey, 1994). Suppose that PT ∗ ∗ has a unique invariant density g which is strictly positive. Then the transformation T is ergodic ∗ and g is bounded from below. There exists k such that T k is exact, the estimate in Equation ∗ k 2.26 is valid for the Frobenius-Perron operator PT k corresponding to T , and the following holds (Jab lo´nski et al., 1985): There exists c > 0 and r (0, 1) such that 1 ∈ 1 1 n n P k f(y) c r ( f + f(y) dy), y [0, 1], f V ([0, 1]); | T |≤ 1 | | ∈ ∈ 0 0 _ Z Hence Corollary 1 applies to T k and every h of bounded variation with h(y)g (y)dy = 0. One ∗ can relax the assumption that g is strictly positive and have instead g c for a.e. y Y = y ∗ ∗ ≥R ∈ ∗ { ∈ [0, 1] : g (y) > 0 . 
Then the above estimate is valid for y Y and f with suppf = y [0, 1] : ∗ } ∈ ∗ { ∈ f(y) = 0 Y and Corollary 1 still applies to T k. 6 }⊂ ∗ We now describe how to obtain the conclusions of Corollary 1 for conjugated maps. If the Lebesgue measure on [0, 1] is invariant with respect to T , then Theorem 1 offers the following. Corollary 2 Let T : [0, 1] [0, 1] be a transformation for which the Lebesgue measure on [0, 1] is → invariant and for which Equation 2.26 holds. Let g be a positive function and S : [a, b] [a, b] be ∗ → given by S = G 1 T G, where − ◦ ◦ x G(x)= g (y) dy, a x b. ∗ ≤ ≤ Za If h : [a, b] R is a function of bounded variation with h(y)g (y)dy = 0, then → ∗ ∞ R n h < , ||PT,ν ||2 ∞ nX=0 2 where 2 denotes the norm in L ([a, b], ([a, b]),ν) and ν is the measure with density g . || · || B ∗ Proof. By Theorem 1 we have 1 f = U P U −1 f, for f L ([a, b], ([a, b]),ν). PS,ν G T G ∈ B The operator U : L2([0, 1]) L2([a, b], ([a, b]),ν) is an isometry, thus G → B n n h = P U −1 h 2 . ||PS,ν ||2 || T G ||L ([0,1]) 1 1 Since G is increasing, G is a function of bounded variation, as a result U −1 h = h G is of − G ◦ − bounded variation over [0, 1], which completes the proof. Finally we discuss the case of quadratic maps. We follow the formulation in Viana (1997). Consider the quadratic map Tβ, β (0, 2), of Example 9 and assume that for the critical point ∈ 2α c = 0 there are constants λc > 1 and α> 0 such that λc > e and 29 (i) (T n) (T (c)) λn for every n 1; | β ′ |≥ c ≥ (ii) T n(c) c e αn for every n 1; | β − |≥ − ≥ (iii) T is topologically mixing on the interval [T 2(c), T (c)], i.e. for every interval I [ 1, 1] β β β ⊂ − there exists k such that T k(I) [T 2(c), T (c)]. β ⊃ β β Young (1992) shows that there is a set of parameters β close to 2 for which conditions (i), (ii), (iii) hold with λ = 1.9 and α = 10 6 and that the central limit theorem holds for (h T j) with c − ◦ β h of bounded variation. 
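Conditions (i)-(iii) are delicate to verify for individual parameters, but the boundary case $\beta = 2$ (not itself in the family $\beta \in (0,2)$, though it is the standard fully chaotic reference point) admits a quick Monte Carlo illustration of the CLT: there the invariant density is $g_*(y) = 1/(\pi\sqrt{1-y^2})$, and for $h(y) = y$ the correlations $\int h \cdot (h \circ T^n)\,d\nu$ vanish for $n \ge 1$ by Chebyshev orthogonality, so $\sigma^2 = \int y^2\,\nu(dy) = 1/2$. A sketch of ours (sample sizes arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Quadratic map at the fully chaotic parameter: T(y) = 1 - 2*y**2 on [-1, 1],
# whose invariant density is g*(y) = 1 / (pi * sqrt(1 - y**2)).
n_traj, n_steps = 20_000, 400

# Start from the invariant (arcsine) law: y = cos(pi * U), U ~ Uniform(0, 1).
y = np.cos(np.pi * rng.random(n_traj))

s = np.zeros(n_traj)
for _ in range(n_steps):
    s += y                     # accumulate h(y) = y along each trajectory
    y = 1.0 - 2.0 * y * y      # advance the map

# CLT: s / sqrt(n) should be close to N(0, sigma^2) with sigma^2 = 1/2.
z = s / np.sqrt(n_steps)
print(z.mean(), z.var())       # approximately 0.0 and 0.5
```

The quadratic form iterates stably in double precision (unlike piecewise-linear maps, whose exact binary-shift dynamics collapse numerically), which is why it is a convenient test case.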
Viana (1997) (Section 5) proves that conditions (i), (ii), (iii) imply all assumptions of our Corollary 1 for the transformation Tβ and functions h of bounded variation with h(y)νβ(dy) = 0. We now use the following generalization of Theorem 6 which allow us to have the CLT and R FCLT for maps with polynomial decay of correlations. Theorem 7 (Tyran-Kami´nska (2004)) Let (Y, ,ν) be a normalized measure space, T : Y Y be B → ergodic with respect to ν, and h L2(Y, ,ν) be such that h(y)ν(dy) = 0. Suppose that ∈ B n 1 R ∞ 3 − k n− 2 h < . (2.28) || PT,ν ||2 ∞ n=1 X Xk=0 ˜ 2 ˜ 1 n 1 ˜ j Then there exists h L (Y, ,ν) such that T,νh = 0 and the sequence ( j=0− (h h) T )n 1 ∈ B P √n − ◦ ≥ is convergent in L2(Y, ,ν) to zero and if B P n 1 − 1 k h = O(nα) with α< || PT,ν ||2 2 Xk=0 1 n 1 ˜ j then ( j=0− (h h) T )n 1 is convergent ν a.e. to 0. √n − ◦ ≥ − P Now we give a simple result that derives CLT and FCLT from a decay of correlations. Corollary 3 (Tyran-Kami´nska (2004)) Let (Y, ,ν) be a normalized measure space, T : Y Y be B → ergodic with respect to ν, and let h L (Y, ,ν) be such that h(y)ν(dy) = 0. If there are α> 1 ∈ ∞ B and c> 0 such that c R h(y)g(T n(y))ν(dy) g (2.29) ≤ nα || ||∞ Z for all g L (Y, ,ν) and n 1, then σ 0 given by ∈ ∞ B ≥ ≥ ∞ σ2 = h2(y)ν(dy) + 2 h(y)h(T n(y))ν(dy) n=1 Z X Z j is finite and (h T )j 0 satisfies the CLT and FCLT provided that σ > 0. ◦ ≥ Only recently the FCLT was established by Pollicott and Sharp (2002) for maps such as the Manneville-Pomeau map of Example 10 and for H¨older continuous functions h with h(y)ν(dy) = 0 under the hypothesis that 0 <β< 1 . When 0 <β< 1 the CLT was proved by Young (1999), 3 2 R where it was shown that in this case condition 2.29 holds. Thus our Corollary 3 gives both the CLT and the FCLT for maps satisfying the following: 30 2 (i) T (0) = 0, T ′(0) = 1, T is increasing and piecewise C and onto [0, 1], (ii) infǫ y 1 T ′(y) > 1 for every ǫ> 0, ≤ ≤ | | 1 β (iii) limy 0 y − T ′′(y) = 0. 
→ 6 1 as the “tower method” of Young (1999) gives us the estimate in Equation 2.29 with α = 1 for β − all H¨older continuous h and g L ([0, 1], ,ν) with the constant c dependent only on h. ∈ ∞ B 2.4 Weak convergence criteria Let (X, ) be a phase space which is either Rk or a separable Banach space, and denote by | · | M1 the space of all probability measures defined on the σ-algebra (X) of Borel subsets of X. For a B real-valued measurable bounded function f, and µ , we introduce the scalar product notation ∈ M1 f,µ = f(x)µ(dx). h i ZX One way to characterize weak convergence in is to use the Fortet-Mourier metric in , which M1 M1 is defined by d (µ ,µ ) = sup f,µ f,µ : f for µ ,µ , FM 1 2 {| h 1i − h 2i | ∈ FFM } 1 2 ∈ M1 where FM = f : X R : sup f(x) 1, f L 1 F { → x X | |≤ | | ≤ } ∈ f(x) f(y) and f L = supx=y | x−y | . This defines a complete metric on 1, and we have µn µ weakly | | 6 M → if and only if d (µ |,µ−) | 0 (cf. Dudley (1989) [Chapter 3]) FM n → We further introduce a distance on by M1 d(µ ,µ ) = sup f,µ f,µ : f 1 for µ ,µ . 1 2 {| h 1i − h 2i | | |L ≤ } 1 2 ∈ M1 This quantity is always defined, but for some measures it may be infinite. It is easy to check that the function d is finite for elements of the set 1 = µ : x µ(dx) < , M1 { ∈ M1 | | ∞} ZX and defines a metric on this set. Moreover, 1 is a dense subset of ( , d ) and M1 M1 FM d (µ ,µ ) d(µ ,µ ). FM 1 2 ≤ 1 2 Let (Y, ,ν) be a normalized measure space and let R : X Y X be a measurable trans- B n × → formation for each n N. We associate with each transformation R an operator P : ∈ n n M1 → M1 defined by Pnµ(A)= 1A(Rn(x,y))ν(dy)µ(dx) (2.30) ZX ZY for µ 1, where ∈ M 1 x A 1A(x)= ∈ 0 x A 6∈ 31 is the indicator function of a set A. Write Unf(x)= f(Rn(x,y))ν(dy) ZY for measurable functions f : X R, for which the integral is defined. 
The operators U and P → n n satisfy the identity < Unf,µ >= Remark 6 Note that if Rn(x,y) does not depend on y, then Unf = URn f where URn is the Koop- man operator corresponding to R : X X. The following relation holds between the Frobenius- n → Perron operator P on L1(X, (X),m) and the operator P : If µ has a density f with respect to Rn B n m, then PRn f is a density of Pnµ. On the other hand if Rn(x,y) does not depend on x, then Unf is equal to URn f(y)ν(dy), where U is the Koopman operator corresponding to R : Y X. The operator P has the same value Rn n → R n ν R 1 for every µ . ◦ n− ∈ M1 Assume that for each n N the transformation R : X Y X satisfies the following ∈ n × → conditions: (A1) There exists a measurable function L : Y R such that n → + R (x,y) R (¯x,y) L (y) x x¯ for x, x¯ X, y Y. | n − n |≤ n | − | ∈ ∈ (A2) The series ∞ R (0, T (y)) R (0,y) ν(dy) | n − n+1 | n=1 Y X Z is convergent, where T : Y Y is a transformation preserving the measure ν. → (A3) The integral R (0,y) ν(dy) is finite for at least one n. Y | n | R Proposition 2 Let the transformations Rn satisfy conditions (A1)-(A3). If lim Ln(y)ν(dy) = 0, n →∞ ZY then there exists a unique measure µ 1 such that (Pnµ) converges weakly to µ for each ∗ ∈ M ∗ measure µ . ∈ M1 Proof. Assumptions (A2) and (A3) imply that P ( 1) 1. By the definition of the metric d n M1 ⊂ M1 we have d(P δ , P δ ) = sup U f(0) U f(0) : f 1 . n 0 n+1 0 {| n − n+1 | | |L ≤ } Since the transformation T preserves the measure ν, we can write Unf(0) = f(Rn(0,y)ν(dy)= f(Rn(0, T (y))ν(dy) ZY ZY 32 for any f with f 1. Hence | |L ≤ U f(0) U f(0) R (0, T (y)) R (0,y) ν(dy). | n − n+1 |≤ | n − n+1 | ZY Consequently d(P δ , P δ ) R (0, T (y)) R (0,y) ν(dy), n 0 n+1 0 ≤ | n − n+1 | ZY and d (P δ , P δ ) d(P δ , P δ ). FM n 0 n+1 0 ≤ n 0 n+1 0 From Condition (A2), the sequence (P δ ) is a Cauchy sequence. Since the space ( , d ) is n 0 M1 FM complete, (Pnδ0) is weakly convergent to a µ 1. 
From (A1) it follows that ∗ ∈ M d(P µ , P µ ) L (y)ν(dy)d(µ ,µ ). n 1 n 2 ≤ n 1 2 ZY 1 Hence (Pnµ) is weakly convergent for each µ 1 and has the limit µ . Since, for sufficiently ∈ M ∗ large n, each operator Pn satisfies d (P µ , P µ ) d (µ ,µ ) FM n 1 n 2 ≤ FM 1 2 and the set 1 is dense in (M , d ), the proof is complete. M1 1 FM 3 Analysis We now return to the original problem posed in Section 1. We consider the position (x) and velocity (v) of a dynamical system defined by dx(t) = v(t), (3.1) dt dv(t) = b(v(t)) + η(t), (3.2) dt with a perturbation η in the velocity. We assume that η(t) consists of a series of delta-function-like perturbations that occur at times t ,t ,t , . These perturbations have an amplitude h(ξ(t)), and 0 1 2 · · · η(t) takes the explicit form ∞ η(t)= κ h(ξ(t))δ(t t ). (3.3) − n n=0 X We assume that ξ is generated by a dynamical system that at least has an invariant measure for the results of Section 3.1 to hold, or is at least ergodic for the Central Limit Theorem to hold as in Section 3.2. In practice, we will illustrate our results assuming that ξ is the trace of a highly chaotic semidy- namical system that is, indeed, even exact in the sense of Lasota and Mackey (1994) (c.f. Section 2). ξ could, for example, be generated by the differential delay equation dξ δ = ξ(t)+ T (ξ(t 1)), (3.4) dt − − 33 where the nonlinearity T has the appropriate properties to generate chaotic solutions (Mackey and Glass (1977); an der Heiden and Mackey (1982)). The parameter δ controls the time scale for these os- cillations, and in the limit as δ 0 we can approximate the behavior of the solutions through a → map of the form ξn+1 = T (ξn). (3.5) Thus, we can think of the map T as being generated by the sampling of a chaotic continuous time signal ξ(t) as, for example, by the taking of a Poincar´esection of a semi-dynamical system operating in a high dimensional phase space. Let (Y, ,ν) be a normalized measure space. 
Let b : Rk Rk, h : Y Rk be measurable B → → transformations, and let (tn)n 0 be an increasing sequence of real numbers. Assume that ξ : ≥ R Y Y is such that ξ(t ) = T (ξ(t )) for n 0, where T : Y Y is a measurable + × → n+1 n ≥ → transformation preserving the measure ν. Combining (3.2) with (3.3) we have dv(t) ∞ = b(v(t)) + κ h(ξ(t))δ(t t ) (3.6) dt − n n=0 X We say that v(t), t t , is a solution of Equation 3.6 if, for each n 0, v(t) is a solution of the ≥ 0 ≥ Cauchy problem dv(t) = b(v(t)), t (t ,t ) dt ∈ n n+1 (3.7) v(t )= v(t )+ κh(ξ(t )), n n− n k where v is an arbitrary point of R and v(t ) = lim − v(t) for n 1. 0 n− t tn → ≥ Let π : R Rk Rk be the semigroup generated by the Cauchy problem (if b is a Lipschitz + × → map then π is well defined) dv˜(t) = b(˜v(t)), t> 0 dt (3.8) v˜(0) =v ˜ , 0 i.e. for everyv ˜ Rk the unique solution of (3.8) is given byv ˜(t)= π(t, v˜ ) for t 0. As a result, 0 ∈ 0 ≥ the solution of (3.6) is given by v(t)= π(t t , v(t )), for t [t ,t ),n 0. − n n ∈ n n+1 ≥ After integration, for t [t ,t ) we have ∈ n n+1 t t t tn x(t) x(t )= v(s)ds = π(s t , v(t ))ds = − π(s, v(t ))ds. − n − n n n Ztn Ztn Z0 Consequently, the solutions of Equations 3.1 and 3.2 are given by t tn − x(t) = x(tn)+ π(s, v(tn))ds, (3.9) Z0 v(t) = π(t t , v(t )), for t [t ,t ),n 0. (3.10) − n n ∈ n n+1 ≥ Observe that x(t) is continuous in t, while v(t) is only right continuous, with left-hand limits, and v(t ) = lim + v(t). n t tn → 34 We are interested in the variables v(tn), vn := v(tn−), ξn := ξ(tn), and xn := x(tn) which appear in the definition of the solution v(t) and x(t). We have v(tn) = vn + κh(ξn), (3.11) v = π(t t , v(t )), (3.12) n+1 n+1 − n n ξn+1 = T (ξn), (3.13) tn+1 tn − xn+1 = xn + π(s, v(tn))ds. (3.14) Z0 We are going to examine the dynamics of these variables from a statistical point of view. Suppose that v0 has a distribution µ, ξ0 has a distribution ν, and that the random variables are independent. 
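The recursion (3.11)-(3.14) is straightforward to iterate. Below is a minimal sketch for the linear drift $b(v) = -\gamma v$ (anticipating Section 3.2), for which $\pi(t,v) = e^{-\gamma t} v$ and the position increment in (3.14) integrates in closed form to $v(t_n)(1 - e^{-\gamma\tau})/\gamma$. The chaotic driver, $h$, and all parameter values are illustrative choices of ours; we use the quadratic map $\xi \mapsto 1 - 2\xi^2$ rather than a piecewise-linear map because the latter degenerates under double-precision iteration.

```python
import numpy as np

# Sampled dynamics (3.11)-(3.14) for b(v) = -gamma*v, pi(t, v) = exp(-gamma*t)*v,
# kicks of size kappa*h(xi_n) with h(xi) = xi, and equally spaced kick times.
gamma, tau, kappa = 1.0, 0.1, 0.1
lam = np.exp(-gamma * tau)               # contraction over one kick interval

rng = np.random.default_rng(1)
xi = np.cos(np.pi * rng.random())        # xi_0 drawn from the invariant density
x, v = 0.0, 0.0
xs, vs = [x], [v]
for n in range(5000):
    v_kicked = v + kappa * xi                 # v(t_n) = v_n + kappa*h(xi_n), Eq. (3.11)
    x += v_kicked * (1.0 - lam) / gamma       # x_{n+1}, Eq. (3.14)
    v = lam * v_kicked                        # v_{n+1}, Eq. (3.12)
    xi = 1.0 - 2.0 * xi * xi                  # xi_{n+1} = T(xi_n), Eq. (3.13)
    xs.append(x)
    vs.append(v)

print(len(xs), len(vs))                  # one sample path of x(t_n) and v_n
```

Plotting `xs` against `n` produces the irregular, Brownian-looking trajectories analyzed in the remainder of the paper.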
What can we say about the long-term behaviour of the distributions of the random variables $v(t_n)$ or $v_n$?

3.1 Weak convergence of $v(t_n)$ and $v_n$

To simplify the presentation and easily use Proposition 2 of Section 2.4, assume that the differences $t_{n+1} - t_n$ do not depend on $n$, so that $t_n = n\tau$ for $n \ge 0$. Define $\Lambda : \mathbb{R}^k \to \mathbb{R}^k$ by

$\Lambda(v) = \pi(\tau, v), \quad v \in \mathbb{R}^k,$  (3.15)

where $\pi$ describes the solutions of the unperturbed system as defined by Equation 3.8. In particular, adding chaotic deterministic perturbations to any exponentially stable system produces a stochastically stable system, as stated in the following

Corollary 4  Let $\Lambda : \mathbb{R}^k \to \mathbb{R}^k$ be a Lipschitz map with a Lipschitz constant $\lambda \in (0,1)$. Let $T : Y \to Y$ be a transformation preserving the measure $\nu$, and let $h : Y \to \mathbb{R}^k$ be such that $\int_Y |h(y)|\,\nu(dy) < \infty$. Assume that the random variables $v_0$ and $\xi_0$ are independent and that $\xi_0$ has the distribution $\nu$. Then $v(n\tau)$ converges in distribution to a probability measure $\mu_*$ on $\mathbb{R}^k$, and $\mu_*$ is independent of the distribution of the initial random variable $v_0$. Moreover, $v_n$ converges in distribution to the probability measure $\mu_* \circ \Lambda^{-1}$.

Proof. From Equations 3.11 and 3.12 it follows that

$v((n+1)\tau) = \Lambda(v(n\tau)) + \kappa h(\xi_{n+1}), \quad n \ge 0.$

Define the transformations $R_n : \mathbb{R}^k \times Y \to \mathbb{R}^k$ recursively:

$R_0(v, y) = v + \kappa h(y),$
$R_{n+1}(v, y) = \Lambda(R_n(v, y)) + \kappa h(T^{n+1}(y)), \quad v \in \mathbb{R}^k,\ y \in Y,\ n \ge 0.$  (3.16)

Then $v(n\tau) = R_n(v_0, \xi_0)$. One can easily check by induction that all assumptions of Proposition 2 are satisfied. Thus $v(n\tau)$ converges in distribution to a unique probability measure $\mu_*$ on $\mathbb{R}^k$. Since $v_{n+1} = \Lambda(v(n\tau))$ and $\Lambda$ is a continuous transformation, it follows from the definition of weak convergence that the distribution of $v_{n+1}$ converges weakly to $\mu_* \circ \Lambda^{-1}$.

We call the measure $\mu_*$ the limiting measure for $v(n\tau)$. Note that $\mu_*$ may depend on $\nu$.
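The mechanism behind Corollary 4 can be seen numerically: two solutions driven by the same chaotic sequence but started from different initial velocities approach each other geometrically, at rate $\lambda^n$, which is why $\mu_*$ forgets the distribution of $v_0$. A sketch with the linear choice $\Lambda(v) = \lambda v$, the dyadic map as driver, and $h(y) = y$ (illustrative assumptions, not required by the corollary):

```python
import random

random.seed(0)
lam, kappa, n = 0.8, 1.0, 30

def T(y):
    # generalized dyadic map on [-1, 1]
    return 2*y + 1 if y <= 0 else 2*y - 1

def v_ntau(v0, xi0):
    # v((n+1)tau) = Lambda(v(n*tau)) + kappa*h(xi_{n+1}), with Lambda(v) = lam*v;
    # the start value is R_0(v0, xi0) = v0 + kappa*h(xi0), h(y) = y, cf. (3.16).
    v, xi = v0 + kappa*xi0, xi0
    for _ in range(n):
        xi = T(xi)
        v = lam*v + kappa*xi
    return v

xi0 = random.uniform(-1.0, 1.0)
gap = v_ntau(2.0, xi0) - v_ntau(-1.5, xi0)
# same kicks, different v0: the difference contracts exactly like lam^n
assert abs(gap - lam**n * (2.0 - (-1.5))) < 1e-9
```

The contraction of the gap is exact here because $\Lambda$ is linear; for a general Lipschitz $\Lambda$ the gap is only bounded by $\lambda^n |v_0 - v_0'|$.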
Remark 7  Although this Corollary shows that there is a unique limiting measure, we cannot conclude in general that this measure has a density absolutely continuous with respect to the Lebesgue measure. See Example 7 and Remark 12.

3.2 The linear case in one dimension

We now consider Equation 3.2 when $b(v) = -\gamma v$ and $\gamma \ge 0$. In this situation we are considering a frictional force linear in the velocity, so Equations 3.1 and 3.2 become

$\frac{dx(t)}{dt} = v(t),$
$\frac{dv(t)}{dt} = -\gamma v(t) + \kappa \sum_{n=0}^{\infty} h(\xi(t))\,\delta(t - t_n).$  (3.17)

To make the computations of the previous section completely transparent, multiply Equation 3.17 by the integrating factor $e^{\gamma t}$, rearrange, and integrate from $(t_n - \epsilon)$ to $(t_{n+1} - \epsilon)$, where $0 < \epsilon < \min_{n \ge 0}(t_{n+1} - t_n)$, to give

$v(t_{n+1} - \epsilon)e^{\gamma(t_{n+1} - \epsilon)} - v(t_n - \epsilon)e^{\gamma(t_n - \epsilon)} = \kappa \sum_{n=0}^{\infty} \int_{t_n - \epsilon}^{t_{n+1} - \epsilon} e^{\gamma z} h(\xi(z))\,\delta(z - t_n)\,dz = \kappa e^{\gamma(t_n - \epsilon)} h(\xi(t_n - \epsilon)).$  (3.18)

Taking the limit $\epsilon \to 0$ in Equation 3.18 and remembering that $v(t_n^-) = v_n$ and $\xi(t_n) = \xi_n$, we have

$v_{n+1} = \lambda_n v_n + \kappa\lambda_n h(\xi_n),$  (3.19)

where $\tau_n \equiv t_{n+1} - t_n$ and $0 \le \lambda_n \equiv e^{-\gamma\tau_n} < 1$.

We simplify this formulation by taking $t_{n+1} - t_n \equiv \tau > 0$, so the perturbations are assumed to arrive periodically. As a consequence, $\lambda_n \equiv \lambda$ with

$\lambda = e^{-\gamma\tau}.$  (3.20)

Then Equation 3.19 becomes

$v_{n+1} = \lambda v_n + \kappa\lambda h(\xi_n).$  (3.21)

This result can also be arrived at from other assumptions.[1]

3.2.1 Behaviour of the velocity variable

For a given initial $v_0$ we have, by induction,

$v_n = \lambda^n v_0 + \kappa\lambda \sum_{j=0}^{n-1} \lambda^{n-1-j} h(\xi_j) = \lambda^n v_0 + \kappa\lambda \sum_{j=0}^{n-1} \lambda^{n-1-j} h(T^j(\xi_0)).$  (3.24)

[1] Alternately but, as it will turn out, equivalently, we can think of the perturbations as constantly applied. In this case we write an Euler approximation to the derivative in Equation 3.17, so with an integration step size of $\tau$ we have

$v(t + \tau) = (1 - \gamma\tau)v(t) + \tau\kappa h(\xi(t)).$  (3.22)

Measuring time in units of $\tau$, so $t_{n+1} = t_n + \tau$, we can then write this in the alternate, equivalent form

$v_{n+1} = \lambda v_n + \kappa_1\lambda h(\xi_n),$  (3.23)

where now $\lambda = 1 - \gamma\tau$ and $\kappa_1 = \kappa\tau\lambda^{-1}$.
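As a numerical aside, the induction leading to (3.24) — from either (3.21) or (3.23) — can be spot-checked by comparing the iterated recursion against the closed form. The sketch below takes the dyadic map of Section 4.1 as the chaotic driver and $h(y) = y$ (illustrative assumptions):

```python
def T(y):
    # generalized dyadic map on [-1, 1], cf. Equation (2.10)
    return 2*y + 1 if y <= 0 else 2*y - 1

lam, kappa = 0.6, 1.0
v0, xi0, n = 0.25, 0.37, 12

# iterate the recursion (3.21): v_{n+1} = lam*v_n + kappa*lam*h(xi_n)
v, xi, xis = v0, xi0, []
for _ in range(n):
    xis.append(xi)
    v = lam*v + kappa*lam*xi
    xi = T(xi)

# closed form (3.24): v_n = lam^n v_0 + kappa*lam * sum_j lam^(n-1-j) h(T^j(xi_0))
v_closed = lam**n * v0 + kappa*lam*sum(lam**(n - 1 - j)*xis[j] for j in range(n))
assert abs(v - v_closed) < 1e-9
```

The two computations use the same chaotic trajectory `xis`, so agreement up to floating-point rounding confirms the algebra of the induction.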
Again, by induction we obtain Equation 3.24.

We now calculate the asymptotic behaviour of the variance of $v_n$ when $\xi_0$ is distributed according to $\nu$, the invariant measure for $T$. Assume for simplicity that $v_0 = 0$, set $\sigma^2 = \int h^2(y)\,\nu(dy)$, and assume that $\int h(y)h(T^n(y))\,\nu(dy) = 0$ for $n \ge 1$. Then we have

$\int v_n^2\,\nu(dy) = \kappa^2 \int \Big( \sum_{j=0}^{n-1} \lambda^{n-j} h(T^j(y)) \Big)^2 \nu(dy).$

Since the sequence $h \circ T^j$ is uncorrelated by our assumption,

$\int \Big( \sum_{j=0}^{n-1} \lambda^{n-j} h(T^j(y)) \Big)^2 \nu(dy) = \sum_{j=0}^{n-1} \lambda^{2n-2j} \int h(T^j(y))^2\,\nu(dy) = \lambda^2\,\frac{1-\lambda^{2n}}{1-\lambda^2}\,\sigma^2.$

Thus

$\int v_n^2\,\nu(dy) = \kappa^2\sigma^2\lambda^2\,\frac{1-\lambda^{2n}}{1-\lambda^2}.$  (3.25)

Since $b(v) = -\gamma v$, we have $\pi(t, v) = e^{-\gamma t}v$. Equation 3.21, in conjunction with Equations 3.12 and 3.11, leads to $v_{n+1} = \lambda v(n\tau)$, where $v(n\tau) = v_n + \kappa h(\xi_n)$. Thus

$v(n\tau) = \lambda^n v_0 + \kappa \sum_{j=0}^{n} \lambda^{n-j} h(T^j(\xi_0)).$  (3.26)

Case 1: If $\lambda < 1$ and $\xi_0$ is distributed according to $\nu$, then by Corollary 4 there exists a unique limiting measure $\mu_*$ for $v(n\tau)$, provided that $h$ is integrable with respect to $\nu$. The sequence $v_n$ also converges in distribution. Since the function $\Lambda$ defined by Equation 3.15 is linear, $\Lambda(v) = \lambda v$, the sequences $v(n\tau)$ and $v_n$ are either both convergent or both divergent in distribution. What can happen if the random variable $\xi_0$ in Equation 3.26 is distributed according to a different measure?

Proposition 3  Let $\lambda < 1$, let the transformation $T$ be exact with respect to $\nu$, and let $h \in L^1(Y, \mathcal{B}, \nu)$. If the random variable $\xi_0$ is distributed according to a normalized measure $\nu_0$ on $(Y, \mathcal{B})$ which is absolutely continuous with respect to $\nu$, then

$\kappa \sum_{j=0}^{n} \lambda^{n-j} h(T^j(\xi_0)) \xrightarrow{d} \mu_*$  (3.27)

and $\mu_*$ does not depend on $\nu_0$.

Proof. Recall that, by the continuity theorem, $\kappa \sum_{j=0}^{n} \lambda^{n-j} h(T^j(\xi_0)) \xrightarrow{d} \mu_*$ if and only if for every $r \in \mathbb{R}$

$\lim_{n\to\infty} \int_Y \exp\Big(ir\kappa \sum_{j=0}^{n} \lambda^{n-j} h(T^j(y))\Big)\,\nu_0(dy) = \int_{\mathbb{R}} \exp(irx)\,\mu_*(dx).$

We know that the last equation is true when $\nu_0 = \nu$.
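The geometric-sum structure behind (3.25) is easy to probe by Monte Carlo. For the dyadic map the choice $h = \operatorname{sgn}$ picks out the leading binary digit, so that $h(T^j(\xi_0)) = \alpha_{j+1}$ is an i.i.d. $\pm 1$ sequence under the invariant (uniform) measure, with $\sigma^2 = 1$ and the uncorrelatedness assumption holding exactly; this particular $h$ is an illustrative assumption, not the choice used later in Section 4:

```python
import random
import statistics

random.seed(1)
lam, kappa, n, trials = 0.7, 1.0, 10, 100_000

def sample_vn():
    # v_n = kappa * sum_{j=0}^{n-1} lam^(n-j) h(T^j(xi_0)).  For the dyadic map
    # with h = sgn, h(T^j(xi_0)) is the (j+1)-st binary digit alpha_{j+1} of
    # xi_0: an i.i.d. +/-1 sequence under the invariant (uniform) measure.
    return kappa*sum(lam**(n - j)*random.choice((-1.0, 1.0)) for j in range(n))

mc_var = statistics.fmean(sample_vn()**2 for _ in range(trials))
exact = kappa**2 * sum(lam**(2*(n - j)) for j in range(n))   # sigma^2 = 1
print(mc_var, exact)   # agree to Monte Carlo accuracy
```

Here `exact` is the geometric sum $\kappa^2\sum_{j=0}^{n-1}\lambda^{2(n-j)}$, i.e. the variance formula evaluated directly rather than through its closed form.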
Since for every $m \ge 1$ the sequence

$\kappa \sum_{j=0}^{m-1} \lambda^{n-j} h(T^j(y))$

converges to $0$ as $n \to \infty$, and

$\sum_{j=m}^{n} \lambda^{n-j}\, h \circ T^j = \sum_{j=0}^{n-m} \lambda^{n-m-j}\, h \circ T^j \circ T^m,$

an analysis similar to that in the proof of Theorem 3 completes the demonstration.

Case 2: If $\lambda = 1$ we have $v_n = v_0 + \kappa \sum_{j=0}^{n-1} h(T^j(\xi_0))$. Since $v_0$ and $\xi_0$ are independent random variables, $v_0$ and $\kappa \sum_{j=0}^{n-1} h(T^j(\xi_0))$ are also independent. Hence $v_n$ converges if and only if $\kappa \sum_{j=0}^{n-1} h(T^j(\xi_0))$ does. Moreover, if there is a limiting measure $\mu_*$ for $v_n$, then the sequence $v(n\tau)$ converges in distribution, say to $\nu_*$, and $\mu_*$ is a convolution of the distribution of $v_0$ and $\nu_*$. As a result, $\mu_*$ depends on the distribution of $v_0$. However, if the map $T$ and the function $h$ satisfy the FCLT, then

$\frac{\sum_{j=0}^{n-1} h(T^j(\xi_0))}{\sigma\sqrt{n}} \xrightarrow{d} N(0,1).$

Hence $\sum_{j=0}^{n-1} h(T^j(\xi_0))$ is not convergent in distribution, since the density is spread over the entire real line.

3.2.2 Behaviour of the position variable

For the position variable we have, for $t \in [n\tau, (n+1)\tau)$,

$x(t) - x(n\tau) = \int_{n\tau}^{t} v(s)\,ds = \int_{n\tau}^{t} e^{-\gamma(s - n\tau)} v(n\tau)\,ds = \frac{1 - e^{-\gamma(t - n\tau)}}{\gamma}\,v(n\tau).$

With $x(n\tau) = x_n$ we have

$x_{n+1} - x_n = \frac{1 - e^{-\gamma\tau}}{\gamma}\,v(n\tau) = \frac{1-\lambda}{\gamma}\,v(n\tau).$  (3.28)

Summing from $0$ to $n$ gives

$x_{n+1} = x_0 + \frac{1-\lambda}{\gamma} \sum_{j=0}^{n} v(j\tau).$  (3.29)

From this and Equation 3.26 we obtain

$x_{n+1} = x_0 + \frac{1-\lambda}{\gamma} \sum_{j=0}^{n} \Big( \lambda^j v_0 + \kappa \sum_{i=0}^{j} \lambda^{j-i} h(T^i(\xi_0)) \Big) = x_0 + \frac{(1-\lambda^{n+1})}{\gamma}\,v_0 + \frac{(1-\lambda)\kappa}{\gamma} \sum_{j=0}^{n} \sum_{i=0}^{j} \lambda^{j-i} h(T^i(\xi_0)).$

Changing the order of summation in the last term gives

$\sum_{j=0}^{n} \sum_{i=0}^{j} \lambda^{j-i} h(T^i(\xi_0)) = \sum_{i=0}^{n} \sum_{j=i}^{n} \lambda^{j-i} h(T^i(\xi_0)) = \sum_{i=0}^{n} \frac{1-\lambda^{n-i+1}}{1-\lambda}\,h(T^i(\xi_0)) = \frac{1}{1-\lambda} \sum_{i=0}^{n} h(T^i(\xi_0)) - \frac{\lambda}{1-\lambda} \sum_{i=0}^{n} \lambda^{n-i} h(T^i(\xi_0)).$

Consequently

$x_{n+1} = x_0 + \frac{(1-\lambda^{n+1})}{\gamma}\,v_0 + \frac{\kappa}{\gamma} \sum_{i=0}^{n} h(T^i(\xi_0)) - \frac{\lambda\kappa}{\gamma} \sum_{i=0}^{n} \lambda^{n-i} h(T^i(\xi_0)).$

In conjunction with Equation 3.28, this gives

$x_n = x_0 + \frac{(1-\lambda^n)}{\gamma}\,v_0 + \frac{\kappa}{\gamma} \sum_{i=0}^{n} h(T^i(\xi_0)) - \frac{\kappa}{\gamma} \sum_{i=0}^{n} \lambda^{n-i} h(T^i(\xi_0)).$  (3.30)
Next we calculate the asymptotic behaviour of the variance of $x_n$. Assume as before that $x_0 = v_0 = 0$, that $\int h(y)h(T^j(y))\,\nu(dy) = 0$, and that $\sigma^2 = \int h^2(y)\,\nu(dy)$. We have

$\Big( \sum_{i=0}^{n} h(T^i(\xi_0)) - \sum_{i=0}^{n} \lambda^{n-i} h(T^i(\xi_0)) \Big)^2 = \Big( \sum_{i=0}^{n} h(T^i(\xi_0)) \Big)^2 + \Big( \sum_{i=0}^{n} \lambda^{n-i} h(T^i(\xi_0)) \Big)^2 - 2 \sum_{i=0}^{n} h(T^i(\xi_0)) \sum_{i=0}^{n} \lambda^{n-i} h(T^i(\xi_0)).$

Since, by assumption, the sequence $h \circ T^i$ is again uncorrelated, we have

$\int \Big( \sum_{i=0}^{n} h(T^i(y)) \Big)^2 \nu(dy) = \sum_{i=0}^{n} \sum_{j=0}^{n} \int h(T^i(y))h(T^j(y))\,\nu(dy) = \sum_{i=0}^{n} \int h(T^i(y))^2\,\nu(dy) = (n+1)\sigma^2.$

Analogous to the computation for the velocity variance,

$\int \Big( \sum_{i=0}^{n} \lambda^{n-i} h(T^i(y)) \Big)^2 \nu(dy) = \sum_{i=0}^{n} \lambda^{2n-2i} \int h(T^i(y))^2\,\nu(dy) = \frac{1-\lambda^{2n+2}}{1-\lambda^2}\,\sigma^2,$

and

$\int \sum_{i=0}^{n} h(T^i(y)) \sum_{j=0}^{n} \lambda^{n-j} h(T^j(y))\,\nu(dy) = \sum_{i=0}^{n} \sum_{j=0}^{n} \lambda^{n-j} \int h(T^i(y))h(T^j(y))\,\nu(dy) = \sum_{i=0}^{n} \lambda^{n-i} \int h(T^i(y))^2\,\nu(dy) = \frac{1-\lambda^{n+1}}{1-\lambda}\,\sigma^2.$

Consequently, if $x_0 = v_0 = 0$ then

$\int x_n^2\,\nu(dy) = \frac{\kappa^2\sigma^2}{\gamma^2}\Big( n + 1 + \frac{1-\lambda^{2n+2}}{1-\lambda^2} - 2\,\frac{1-\lambda^{n+1}}{1-\lambda} \Big).$  (3.31)

Theorem 8  Let $(Y, \mathcal{B}, \nu)$ be a normalized measure space, let $T : Y \to Y$ be a measurable map such that $T$ preserves the measure $\nu$, let $\sigma > 0$ be a constant, and let $h \in L^2(Y, \mathcal{B}, \nu)$ be such that $\int h(y)\,\nu(dy) = 0$. Then

$\frac{\sum_{i=0}^{n} h(T^i(\xi_0))}{\sqrt{n}} \xrightarrow{d} N(0, \sigma^2)$  (3.32)

if and only if

$\frac{x_n}{\sqrt{n}} \xrightarrow{d} N\Big(0, \frac{\kappa^2\sigma^2}{\gamma^2}\Big).$  (3.33)

Proof. Assume that Condition 3.32 holds. From Equation 3.30 we obtain

$\frac{x_n}{\sqrt{n}} = \frac{x_0}{\sqrt{n}} + \frac{(1-\lambda^n)}{\gamma}\,\frac{v_0}{\sqrt{n}} + \frac{\kappa}{\gamma\sqrt{n}} \sum_{i=0}^{n} h(T^i(\xi_0)) - \frac{\kappa}{\gamma\sqrt{n}} \sum_{i=0}^{n} \lambda^{n-i} h(T^i(\xi_0)).$

By assumption

$\frac{\kappa}{\gamma\sqrt{n}} \sum_{i=0}^{n} h(T^i(\xi_0)) \xrightarrow{d} N\Big(0, \frac{\kappa^2\sigma^2}{\gamma^2}\Big).$

Thus the result will follow once we show that the remaining terms converge in probability to zero. The first term,

$\frac{x_0}{\sqrt{n}} + \frac{(1-\lambda^n)}{\gamma}\,\frac{v_0}{\sqrt{n}},$

converges to zero a.e., hence in probability. The sequence $\sum_{i=0}^{n} \lambda^{n-i} h(T^i(\xi_0))$ is convergent in distribution, and $\frac{\kappa}{\gamma\sqrt{n}} \to 0$ as $n \to \infty$. Consequently, the sequence

$\frac{\kappa}{\gamma\sqrt{n}} \sum_{i=0}^{n} \lambda^{n-i} h(T^i(\xi_0))$

converges in probability to zero, which completes the proof. The proof of the converse is analogous.
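Theorem 8 can be visualized by Monte Carlo. Using (3.30) with $x_0 = v_0 = 0$, and again substituting the i.i.d. dyadic digits $\alpha_{i+1}$ for $h(T^i(\xi_0))$ (so $\sigma^2 = 1$ and Condition 3.32 holds by the classical CLT — an illustrative assumption), the scaled variance of $x_n$ approaches $\kappa^2\sigma^2/\gamma^2$:

```python
import math
import random
import statistics

random.seed(7)
gamma, kappa, tau = 1.0, 0.8, 1.0
lam = math.exp(-gamma*tau)
n, trials = 200, 20_000
powers = [lam**k for k in range(n + 1)]

def sample_xn():
    # Equation (3.30) with x_0 = v_0 = 0, taking h(T^i(xi_0)) = alpha_{i+1},
    # the i.i.d. +/-1 digits of the dyadic map (so sigma^2 = 1).
    a = [random.choice((-1.0, 1.0)) for _ in range(n + 1)]
    full = sum(a)                                       # CLT-scale term
    tail = sum(powers[n - i]*a[i] for i in range(n + 1))  # bounded remainder
    return (kappa/gamma)*(full - tail)

var_over_n = statistics.fmean(sample_xn()**2 for _ in range(trials))/n
print(var_over_n, (kappa/gamma)**2)   # close for large n, cf. (3.33)
```

The `tail` term stays bounded in distribution, which is exactly why it is killed by the $1/\sqrt{n}$ scaling in the proof.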
Remark 8  Observe that if the transformation $T$ is exact and $\xi_0$ is distributed according to a measure absolutely continuous with respect to $\nu$, then the conclusion of Theorem 8 still holds. Theorem 8 generalizes the results of Chew and Ting (2002). In Section 2.3.2 we discussed when Condition 3.32 holds for a given ergodic transformation.

Remark 9  Note that if we multiply a Gaussian distributed random variable $N(0,1)$ by a positive constant $c$, then it becomes Gaussian distributed $N(0, c^2)$ with variance $c^2$. Thus, if we multiply both sides of Equation 3.33 by $\frac{1}{\sqrt{\tau}}$, we obtain

$\frac{x(n\tau)}{\sqrt{n\tau}} \xrightarrow{d} N\Big(0, \frac{\kappa^2\sigma^2}{\tau\gamma^2}\Big).$

So if $\kappa = \sqrt{\gamma m\tau}$, as in Chew and Ting (2002), then

$\frac{\kappa^2\sigma^2}{\tau\gamma^2} = \frac{m\sigma^2}{\gamma}.$

4 Identifying the Limiting Velocity Distribution

Let $Y \subset \mathbb{R}$ be an interval, $\mathcal{B} = \mathcal{B}(Y)$, and let $T : Y \to Y$ be a transformation preserving a normalized measure $\nu$ on $(Y, \mathcal{B}(Y))$. Recall from Section 3.1 that $\mu_*$ is the limiting measure for the sequence of random variables $(v(n\tau))$ starting from $v_0 \equiv 0$, i.e.

$v(n\tau) = \kappa \sum_{i=0}^{n} \lambda^{n-i} h(\xi_i),$

where $h : Y \to \mathbb{R}$ is a given integrable function, $0 < \lambda < 1$, $\xi_i = T^i(\xi_0)$, and $\xi_0$ is distributed according to $\nu$.

Proposition 4  Let $Y = [a, b]$ and let $h : Y \to \mathbb{R}$ be a bounded function. Then the limiting measure $\mu_*$ has moments of all orders, given by

$\int x^k\,\mu_*(dx) = \lim_{n\to\infty} \int v(n\tau)^k\,\nu(dy),$  (4.1)

and the characteristic function of $\mu_*$ is of the form

$\phi_*(r) = \sum_{k=0}^{\infty} \frac{(ir)^k}{k!} \int x^k\,\mu_*(dx), \quad r \in \mathbb{R}.$

Moreover, $\mu_*\big(\big[-\frac{\kappa c}{1-\lambda}, \frac{\kappa c}{1-\lambda}\big]\big) = 1$, where $c = \sup_{y \in Y} |h(y)|$.

Proof. Since $h$ is bounded, we have

$|v(n\tau)|^k \le \Big(\frac{\kappa c}{1-\lambda}\Big)^k, \quad n, k \ge 0.$

The existence and convergence of moments now follow from Theorems 5.3 and 5.4 of Billingsley (1968). Since $v(n\tau)$ takes all its values in the interval $\big[-\frac{\kappa c}{1-\lambda}, \frac{\kappa c}{1-\lambda}\big]$, we obtain $\mu_n\big(\big[-\frac{\kappa c}{1-\lambda}, \frac{\kappa c}{1-\lambda}\big]\big) = 1$, where $\mu_n$ is the distribution of $v(n\tau)$. Convergence in distribution (Billingsley, 1968, Theorem 2.1) implies that

$\limsup_{n\to\infty} \mu_n(F) \le \mu_*(F)$

for all closed sets $F$. Therefore $\mu_*\big(\big[-\frac{\kappa c}{1-\lambda}, \frac{\kappa c}{1-\lambda}\big]\big) = 1$.
The formula for the characteristic function is a consequence of the other statements, and the proof is complete.

Remark 10  Note that if the characteristic function of $\mu_*$ is integrable, then $\mu_*$ has a continuous and bounded density, given by

$f_*(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp(-ixr)\,\phi_*(r)\,dr, \quad x \in \mathbb{R}.$

On the other hand, if $\mu_*$ has a density, then $\phi_*(r) \to 0$ as $|r| \to \infty$. Note also that if a density exists, then it must be zero outside the interval $\big[-\frac{\kappa c}{1-\lambda}, \frac{\kappa c}{1-\lambda}\big]$.

Let $Y = [a, b]$ be an interval and $h(y) = y$ for $y \in Y$. Then $h$ is bounded, and for this choice of $h$ all moments of the corresponding limiting distribution $\mu_*$ exist by Proposition 4. However, the calculation might be quite tedious. We are going to determine the measure $\mu_*$ for a specific example of the transformation $T$ by using a different method. Let $h_n : Y \to \mathbb{R}$ be defined by

$h_n(y) = \sum_{i=0}^{n} \lambda^{n-i}\,T^i(y), \quad y \in Y,\ n \ge 0.$  (4.2)

Then $v(n\tau) = \kappa h_n(\xi_0)$ and $v_n = \kappa\lambda h_{n-1}(\xi_0)$. Thus knowing the limiting distribution for these sequences is equivalent to knowing the limiting distribution for $h_n(\xi_0)$.

4.1 Dyadic map

To give a concrete example for which much of the preceding considerations can be completely illustrated, consider the generalized dyadic map defined by Equation 2.10:

$T(y) = \begin{cases} 2y + 1, & y \in [-1, 0] \\ 2y - 1, & y \in (0, 1]. \end{cases}$

Proposition 5  Let $\xi, \xi_0$ be random variables uniformly distributed on $[-1, 1]$. Let $(\alpha_k)$ be a sequence of independent random variables taking values drawn from $\{-1, 1\}$ with equal probability. Assume that $\xi$ is statistically independent of the sequence $(\alpha_k)$. Then for every $\lambda \in (0, 1)$

$h_n(\xi_0) \xrightarrow{d} \frac{1}{2-\lambda}\Big( \xi + \sum_{k=0}^{\infty} \lambda^k \alpha_{k+1} \Big).$  (4.3)

Proof. The random variable

$\xi_0 = \sum_{k=1}^{\infty} \frac{\alpha_k}{2^k}$

is uniformly distributed on $[-1, 1]$. It is easily seen that for the dyadic map

$\xi_i = T^i(\xi_0) = \sum_{k=1}^{\infty} \frac{\alpha_{k+i}}{2^k} \quad \text{for } i \ge 1.$  (4.4)

Using this representation we obtain

$\sum_{i=0}^{n-1} \lambda^{n-1-i}\,\xi_i = \sum_{i=0}^{n-1} \lambda^{n-1-i} \sum_{k=1}^{n-i} \frac{\alpha_{k+i}}{2^k} + \sum_{i=0}^{n-1} \lambda^{n-1-i} \sum_{k=n-i+1}^{\infty} \frac{\alpha_{k+i}}{2^k}.$
Changing the order of summation leads to

$\sum_{k=1}^{n}\Big( \sum_{i=1}^{k} \frac{\lambda^{n-1-k+i}}{2^i} \Big)\alpha_k + \sum_{i=0}^{n-1} \frac{\lambda^{n-1-i}}{2^{n-i}} \sum_{k=1}^{\infty} \frac{\alpha_{k+n}}{2^k} = \sum_{k=1}^{n} \frac{\lambda^{n-k}}{2-\lambda}\Big( 1 - \Big(\frac{\lambda}{2}\Big)^k \Big)\alpha_k + \frac{1 - (\lambda/2)^n}{2-\lambda}\,\xi_n.$

Consequently

$\sum_{i=0}^{n-1} \lambda^{n-1-i}\,\xi_i = \frac{1}{2-\lambda} \sum_{k=1}^{n} \lambda^{n-k}\alpha_k + \frac{1}{2-\lambda}\,\xi_n - \lambda^n w_n,$

where

$w_n = \frac{1}{2-\lambda}\Big( \sum_{k=1}^{n} \frac{\alpha_k}{2^k} + \frac{\xi_n}{2^n} \Big).$

This gives

$h_{n-1}(\xi_0) + \lambda^n w_n = \frac{1}{2-\lambda}\Big( \sum_{k=1}^{n} \lambda^{n-k}\alpha_k + \xi_n \Big).$  (4.5)

Note that for every $n$ we have $|w_n| \le 2$. Therefore $\lambda^n w_n$ is a.s. convergent to $0$ as $n \to \infty$. Since $h_n(\xi_0)$ converges in distribution, say to $\tilde{\mu}_*$, we have $h_{n-1}(\xi_0) + \lambda^n w_n \xrightarrow{d} \tilde{\mu}_*$, and the random variables on the right-hand side of Equation 4.5 converge in distribution to $\tilde{\mu}_*$. Since the random variables $\alpha_k$ are independent, the random variables $\sum_{k=1}^{n} \lambda^{n-k}\alpha_k$ and $\xi_n$ are also independent for every $n$. The same is true for $\sum_{k=0}^{n-1} \lambda^k\alpha_{k+1}$ and $\xi$. Moreover, $\xi_n + \sum_{k=1}^{n} \lambda^{n-k}\alpha_k$ and $\xi + \sum_{k=0}^{n-1} \lambda^k\alpha_{k+1}$ have identical distributions. Thus

$\frac{1}{2-\lambda}\Big( \xi + \sum_{k=0}^{n-1} \lambda^k\alpha_{k+1} \Big) \xrightarrow{d} \tilde{\mu}_*.$

On the other hand, $\sum_{k=0}^{n-1} \lambda^k\alpha_{k+1} \to \sum_{k=0}^{\infty} \lambda^k\alpha_{k+1}$ almost surely as $n \to \infty$, and this implies convergence in distribution. The proof is complete.

Before stating our next result, we review some of the known properties of the random variable which appears in Equation 4.3. For every $\lambda \in (0, 1)$ let

$\zeta_\lambda = \sum_{k=0}^{\infty} \lambda^k\alpha_{k+1},$  (4.6)

and let $\varrho_\lambda$ be the distribution function of $\zeta_\lambda$, $\varrho_\lambda(x) = \Pr\{\zeta_\lambda \le x\}$ for $x \in \mathbb{R}$. Explicit expressions for $\varrho_\lambda$ are, in general, not known. The measure induced by the distribution $\varrho_\lambda$ is called an infinitely convolved Bernoulli measure (see Peres et al. (2000) for the historical background and recent advances). It is known (Jessen and Wintner, 1935) that $\varrho_\lambda$ is continuous and that it is either absolutely continuous or singular. Recall that $x$ is a point of increase of $\varrho_\lambda$ if $\varrho_\lambda(x - \epsilon) < \varrho_\lambda(x + \epsilon)$ for all $\epsilon > 0$.
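The rearrangement that produces (4.5) is a finite algebraic identity in the digits $(\alpha_k)$, so it can be verified numerically for a randomly drawn digit sequence. The sketch below truncates $\xi_0$ to $K$ binary digits, which is harmless because the identity is linear in each $\alpha_k$ (digits beyond $K$ are simply taken to be zero):

```python
import random

random.seed(3)
lam, n, K = 0.65, 12, 40

# alpha[1..K] are the +/-1 digits; alpha[0] is a dummy so indices match (4.4)
alpha = [0.0] + [random.choice((-1.0, 1.0)) for _ in range(K)]

def xi(i):
    # xi_i = T^i(xi_0) = sum_{k>=1} alpha_{k+i}/2^k, cf. Equation (4.4)
    return sum(alpha[k + i]/2.0**k for k in range(1, K - i + 1))

# left-hand side of (4.5): h_{n-1}(xi_0) = sum_{i=0}^{n-1} lam^(n-1-i) xi_i
h_nm1 = sum(lam**(n - 1 - i)*xi(i) for i in range(n))

w_n = (sum(alpha[k]/2.0**k for k in range(1, n + 1)) + xi(n)/2.0**n)/(2.0 - lam)
rhs = (sum(lam**(n - k)*alpha[k] for k in range(1, n + 1)) + xi(n))/(2.0 - lam)

# identity (4.5): h_{n-1}(xi_0) + lam^n w_n equals the right-hand side exactly
assert abs(h_nm1 + lam**n * w_n - rhs) < 1e-12
```

Because the check manipulates the digit sequence directly, it sidesteps the floating-point sensitivity of iterating the expansive map $T$ itself.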
The set of points of increase of $\varrho_\lambda$ (Kershner and Wintner, 1935) is either the interval $\big[-\frac{1}{1-\lambda}, \frac{1}{1-\lambda}\big]$ when $\lambda \ge \frac{1}{2}$, or a Cantor set $K_\lambda$ of zero Lebesgue measure contained in this interval when $\lambda < \frac{1}{2}$; $\varrho_\lambda$ is always singular for $\lambda \in (0, \frac{1}{2})$, and the Cantor set $K_\lambda$ satisfies $K_\lambda = (\lambda K_\lambda + 1) \cup (\lambda K_\lambda - 1)$ and $-\frac{1}{1-\lambda}, \frac{1}{1-\lambda} \in K_\lambda$. Wintner (1935) noted that $\varrho_\lambda$ has the uniform density $\rho_\lambda(x) = \frac{1}{4}\,1_{[-2,2]}(x)$ for $\lambda = \frac{1}{2}$, and that it is absolutely continuous for the $k$th roots of $\frac{1}{2}$. Thus it was suspected that $\varrho_\lambda$ is absolutely continuous for all $\lambda \in [\frac{1}{2}, 1)$. However, Erdős (1939a) showed that $\varrho_\lambda$ is singular for $\lambda = \frac{\sqrt{5}-1}{2}$ and for the reciprocals of the so-called Pisot numbers in $(1, 2)$. Later Erdős (1939b) showed that there is a $\beta < 1$ such that for almost all $\lambda \in (\beta, 1)$ the measure $\varrho_\lambda$ is absolutely continuous. Only recently, Solomyak (1995) showed that $\beta = \frac{1}{2}$.

Proposition 6  For every $\lambda \in (0, 1)$ the density $f_*^\lambda$ of the limiting measure $\mu_*^\lambda$ of $v(n\tau)$ satisfies $f_*^\lambda(v) = 0$ if and only if $|v| \ge \frac{\kappa}{1-\lambda}$. Moreover, on the interval $\big[-\frac{\kappa}{1-\lambda}, \frac{\kappa}{1-\lambda}\big]$ we have

$f_*^\lambda(v) = \frac{2-\lambda}{2\kappa}\Big[\varrho_\lambda\Big(\frac{2-\lambda}{\kappa}\,v + 1\Big) - \varrho_\lambda\Big(\frac{2-\lambda}{\kappa}\,v - 1\Big)\Big],$