The Pennsylvania State University The Graduate School Eberly College of Science
STUDIES ON THE LOCAL TIMES OF DISCRETE-TIME
STOCHASTIC PROCESSES
A Dissertation in Mathematics by Xiaofei Zheng
© 2017 Xiaofei Zheng
Submitted in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy
August 2017

The dissertation of Xiaofei Zheng was reviewed and approved∗ by the following:
Manfred Denker Professor of Mathematics Dissertation Adviser Chair of Committee
Alexei Novikov Professor of Mathematics
Anna Mazzucato Professor of Mathematics
Zhibiao Zhao Associate Professor of Statistics
Svetlana Katok Professor of Mathematics Director of Graduate Studies
∗Signatures are on file in the Graduate School.

Abstract
This dissertation investigates the limit behaviors of the local times $\ell(n, x)$ of the partial sums $\{S_n\}$ of stationary processes $\{\varphi \circ T^n\}$:
\[ \ell(n, x) = \sum_{i=1}^{n} 1_{\{S_i = x\}}. \]
Under the conditional local limit theorem assumption
\[ B_n P(S_n = k_n \mid T^n(\cdot) = \omega) \to g(\kappa) \quad \text{if } \frac{k_n}{B_n} \to \kappa, \quad P\text{-a.s.}, \]
we show that the limiting distribution of the local time is the Mittag-Leffler distribution when the state space of the stationary process is $\mathbb{Z}$. The method is from the infinite ergodic theory of dynamical systems. We also prove that the discrete-time fractional Brownian motion (dfBm) admits a conditional local limit theorem and that the local time of dfBm is closely related to but different from the Mittag-Leffler distribution. We also prove that the local time of certain stationary processes satisfies an almost sure central limit theorem (ASCLT) under the additional assumption that the characteristic operator has a spectral gap.
Table of Contents

Acknowledgments

Chapter 1 Introduction and Overview
    1.0.1 Brownian local time
    1.0.2 Local time of discrete-time stochastic processes
    1.0.3 Connection between the Brownian local time and the local times of discrete-time processes

Chapter 2 Local Limit Theorems
    2.1 Motivation
    2.2 Local limit theorems for independent and identically distributed random variables
    2.3 Local limit theorems for Markov chains
    2.4 Conditional local limit theorems for stationary processes
    2.5 Conditional local limit theorem for discrete-time fractional Brownian motion
        2.5.1 Proof of the conditional local limit theorem
        2.5.2 Estimate of the variance
        2.5.3 Estimate of the mean

Chapter 3 Limiting Distributions of Local Times
    3.1 Local times of random walks
    3.2 Occupation times of Markov chains
    3.3 Ergodic sums of infinite measure preserving transformation
    3.4 Asymptotic distribution of the local times $\ell_n$ of stationary processes with conditional local limit theorems
    3.5 Limit theorems of local times of discrete-time fractional Brownian motion
        3.5.1 Occupation times of discrete-time fractional Brownian motions
        3.5.2 Occupation times of continuous fractional Brownian motions

Chapter 4 Almost Sure Central Limit Theorems
    4.1 Almost sure central limit theorems for local times of random walks
    4.2 Almost sure central limit theorem for stationary processes
    4.3 Proof of almost sure central limit theorem (ASCLT)
        4.3.1 Proof of Theorem 4.2.2
        4.3.2 Proof of Proposition 4.3.1
    4.4 Transfer operators
    4.5 Bounds of local times of stationary processes

Chapter 5 Conclusion and Open Questions

Bibliography
Acknowledgments
Over the past five years I have received support and encouragement from a great number of individuals. I must first thank my adviser, Professor Manfred Denker, for his continuous guidance, endless encouragement and generous help during my graduate study and research. This dissertation could not have been finished without his advice and support. His guidance and friendship have made my graduate study a thoughtful and rewarding journey. I would also like to thank my dissertation committee members, Alexei Novikov, Anna Mazzucato and Zhibiao Zhao, for generously offering their time, insightful comments and support. I learned fractional Brownian motion from Professor Novikov and benefited a lot from his precious guidance and endless patience. I owe my thanks to Professor Mazzucato for her invaluable advice as my mentor when I first came to Penn State. I am grateful to Professor Zhao for his valuable time. I offer my thanks to Professor Svetlana Katok for providing me the opportunity to study in the Ph.D. program. I also thank the staff of the Department of Mathematics for their kind assistance. I am most grateful and indebted to my parents and the rest of my family for their unconditional love. Lastly, I must thank Changguang Dong for his unwavering love, patience and support. There are certainly many others who deserve mentioning, to whom I offer a simple message: Thank You!
Chapter 1
Introduction and Overview
The local time of a continuous-time process is a stochastic process associated with an underlying stochastic process, such as a Brownian motion, a Markov process or a diffusion process, that characterizes the amount of time a particle has spent at a given level. It provides a very fine description of the sample paths of the underlying process. The local time of a discrete-time process, by contrast, measures how often a state is visited, and it is a refinement of the notion of recurrence. The local times in these two cases share many properties [2] and are closely connected by the invariance principle.
1.0.1 Brownian local time
The notion of local time of a Brownian motion was first introduced by Lévy in 1948 [40]. His contribution to the deep properties of the local time of Brownian motions laid the foundation of the theory of local times of stochastic processes. Later the theory was further developed by Trotter, Knight, Ray, Itô, McKean and others. Let $\{W(s), s \ge 0\}$ be a 1-dimensional Brownian motion. The occupation time of a set $A \subset \mathbb{R}$ at time $t$ is defined to be
\[ \mu_t(A, \omega) = \int_0^t 1_{\{W(s,\omega) \in A\}}\, ds, \]
which is a random measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$. Lévy (1948, [40]) proved that for almost all $\omega$ and for any $t \ge 0$, $\mu_t$ is absolutely continuous with respect to the Lebesgue measure, so the Radon-Nikodym derivative $L_{t,\omega}$ exists and is Lebesgue-almost everywhere unique:
\[ \mu_t(A, \omega) = \int_A L_{t,\omega}(x)\, dx. \]
Trotter (1958, [61]) proved that for almost all $\omega \in \Omega$, there exists a function $L(t, x, \omega)$ that is continuous in $(t, x) \in [0, \infty) \times \mathbb{R}$, such that
\[ \mu_t(A, \omega) = \int_A L(t, x, \omega)\, dx. \]
From now on, we use L(t, x), the jointly continuous version of the Brownian local time, and we make the following remarks on the notation:
1. $\{L(t, x)\}_{t \ge 0,\, x \in \mathbb{R}}$ is called the Brownian local time.
2. Fix x, $\{L(t, x)\}_{t \ge 0}$ is called the Brownian local time at level x.
3. Fix t, $\{L(t, x)\}_{x \in \mathbb{R}}$ is the Brownian local time at time t.
4. Fix x and t, L(t, x) is a random variable.
5. When x = 0, we use L(t) to denote L(t, 0) for short.
So L(t, x) can be studied as a function of t or of x. Interesting questions of L(t, x) such as the exact distribution, the limiting distribution as t goes to ∞ when x is fixed, the fluctuation of the Brownian local time are well studied. We recall some striking results related to the problems we are to deal with in this dissertation. As a function of t, the distribution of the Brownian local time at the level 0 is given by the following theorem.
Theorem 1.0.1 (Lévy identity, 1948). The processes $\{(|W(t)|, L(t, 0)) : t \ge 0\}$ and $\{(M(t) - W(t), M(t)) : t \ge 0\}$ have the same distribution, where $M(t) = \max\{W(s) : s \in [0, t]\}$.

Lévy (1948) proved the theorem by showing that $\{M(t) - W(t) : t \ge 0\}$ is a reflected Brownian motion. In [46], the theorem is proved by first defining the Brownian local time through the number of downcrossings of a Brownian motion and then using random walks embedded in the Brownian motion. The local time from this definition can be proved to be the density of the occupation measure. The importance of the concept of Brownian local time also lies in its deep connection with Itô’s formula.
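Theorem 1.0.2 (Tanaka’s formula).
\[ W(t)^+ = \int_0^t 1_{\{W(s) > 0\}}\, dW(s) + \frac{1}{2} L(t) \]
and
\[ |W(t)| = \int_0^t \operatorname{sgn}(W(s))\, dW(s) + L(t), \]
where sgn denotes the sign function
\[ \operatorname{sgn}(x) = \begin{cases} +1, & x > 0, \\ -1, & x \le 0. \end{cases} \]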
The formula remains true when W(t) is replaced by a continuous semimartingale. Tanaka’s formula is the explicit Doob-Meyer decomposition of the submartingale |W(t)| into a martingale part and a continuous increasing process (the local time). Tanaka’s formula is generalized by the Itô-Tanaka formula, which is also an extension of Itô’s formula to convex functions.
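A discrete analogue can make this decomposition concrete. The sketch below is only an illustration and is not one of the results quoted above: for a simple random walk with steps ±1 and S_0 = 0 it checks the elementary identity |S_n| = Σ_{k<n} sgn(S_k)(S_{k+1} − S_k) + #{0 ≤ k < n : S_k = 0}. It assumes the convention sgn(0) = 0, which differs from the convention sgn(0) = −1 used in Theorem 1.0.2, and the function names are hypothetical.

```python
import random

def sign(v):
    """Sign with the convention sign(0) = 0 (an assumption of this sketch)."""
    return (v > 0) - (v < 0)

def discrete_tanaka_terms(steps):
    """Return (|S_n|, martingale part, number of visits to 0 before time n).

    steps: sequence of +1/-1 increments; the walk starts at S_0 = 0.
    """
    position = 0
    martingale = 0      # sum over k < n of sign(S_k) * (S_{k+1} - S_k)
    zero_visits = 0     # #{0 <= k < n : S_k = 0}, a discrete local time at 0
    for step in steps:
        martingale += sign(position) * step
        if position == 0:
            zero_visits += 1
        position += step
    return abs(position), martingale, zero_visits

if __name__ == "__main__":
    rng = random.Random(0)
    steps = [rng.choice((-1, 1)) for _ in range(1000)]
    abs_s, mart, zeros = discrete_tanaka_terms(steps)
    print(abs_s, mart + zeros)  # the two numbers agree
```

The identity holds path by path: away from 0 a ±1 step cannot change the sign of the walk, and each visit to 0 contributes exactly one unit to |S|.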
Theorem 1.0.3 (Itô-Tanaka’s formula). If f is the difference of two convex functions, then
\[ f(W(t)) = f(W(0)) + \int_0^t f'_-(W(s))\, dW(s) + \frac{1}{2} \int_{\mathbb{R}} L(t, x)\, f''(dx). \]
Recall that if f is convex, its second derivative $f''$ in the sense of distributions is a positive measure. Itô-Tanaka’s formula holds for any continuous semimartingale. As for Brownian motion itself, the magnitude of the fluctuations of the Brownian local time can be described by the law of the iterated logarithm (LIL).

Theorem 1.0.4 (Kesten, 1965 [35]). Let $\hat{L}(t) = \sup_{x \in \mathbb{R}} L(t, x)$. Then almost surely, for any given x,
\[ \limsup_{t \to \infty} \frac{\hat{L}(t)}{(2t \log\log t)^{1/2}} = \limsup_{t \to \infty} \frac{L(t, x)}{(2t \log\log t)^{1/2}} = 1, \]
and almost surely,
\[ \liminf_{t \to \infty} \left(\frac{\log\log t}{t}\right)^{1/2} \hat{L}(t) = \gamma > 0. \]
The local times of random walks have similar results, but in Kesten’s words, it is “not a mere ‘translation’ to the discrete case”. In this dissertation, we study the law of the iterated logarithm for the local times of certain stationary processes, which will be discussed in Chapter 4. Revuz and Yor [55] and the survey article of Borodin [15] are good references on Brownian local times. Local times of Markov processes, Gaussian processes, Lévy processes and others have also been studied. Since this dissertation focuses on discrete-time processes, we do not go further. More information on local times of continuous stochastic processes can be found in the survey [27].
1.0.2 Local time of discrete-time stochastic processes
A similar question is how to define the local time of a discrete-time process. Since the time space is discrete, we can only measure how many times the process is in some subset of the state space. That is, for a discrete-time process {Sn}, the occupation time λ(n, A) of a subset A of the state space S at time n is defined to be the total number of the visits of Si to the subset A during the first n transitions:
\[ \lambda(n, A) = \#\{i \le n : S_i \in A\}. \]
Local time can be interpreted as the density of the occupation time measure with respect to the counting measure (when $S = \mathbb{Z}$) or to the Lebesgue measure (when $S = \mathbb{R}$). It is defined as
\[ \ell(n, x) = \#\{i \le n : S_i = x\} \]
for $x \in S$, and hence $\lambda(n, A) = \int_A \ell(n, x)\, dm(x)$, where $m(\cdot)$ is the counting measure or the Lebesgue measure. To our knowledge, the earliest result on the discrete-time process is Chung and Hunt’s work in 1949 [21]. They studied the zeros at time n of simple random walks $S_n$,
\[ \ell_n = \ell(n, 0) := \#\{i \le n : S_i = 0\}, \]
and showed the limiting behaviors of the sequence $\{\ell_n\}_{n \ge 1}$. The exact and the limiting distributions of $\ell_n$ were also studied by Rényi in 1970 [52]. Aleškevičienė [7] [9] studied the asymptotic distribution and the moments of local times of aperiodic recurrent random walks. For the symmetric random walk in dimension 2, Erdős and Taylor [26] studied the asymptotic distribution of the local time and from that obtained strong laws analogous to the law of the iterated logarithm. Another important underlying stochastic process is the Markov process. In 1957, Darling and Kac [24] established some limit theorems for the occupation
times for homogeneous Markov processes $X_t$ with stationary transition probabilities. For each $s > 0$, let $p_s(x, E)$ be the Laplace transform of the transition probability $P(x, E; t)$ of the Markov process $\{X_t\}$, and let $V(x)$ be a non-negative function over the state space. Suppose they satisfy the Darling and Kac condition: there exists a function $h(s) \to \infty$ as $s \to 0$ and a positive constant C such that
\[ \int \frac{p_s(x, dy)}{h(s)} V(y) \to C, \quad s \to 0, \tag{1.1} \]
or equivalently,
\[ E_x\left[\int_0^\infty e^{-su} V(X_u)\, du\right] \sim C h(s), \quad s \to 0, \]
where the convergence is uniform on the support of V. (In the whole dissertation, $a_n \sim b_n$ means $a_n / b_n \to 1$ as $n \to \infty$.) In addition, if h is regularly varying with index $\alpha \in [0, 1)$ as $s \to 0$, then the limiting distribution of
\[ \frac{1}{C h(1/t)} \int_0^t V(X_u)\, du \]
is the Mittag-Leffler distribution with index α. The result essentially exhausts all possibilities, in view of the following converse.

If $\{X_t\}$ and $V(x)$ meet the “Darling and Kac condition”, and if in addition for some scaling function $u(t) > 0$,
\[ \lim_{t \to \infty} P\left(\frac{1}{u(t)} \int_0^t V(X_s)\, ds < x\right) = G(x), \tag{1.2} \]
where G is a non-degenerate distribution function, then h is a regularly varying function with some index α. It follows that G is the distribution function of the Mittag-Leffler distribution. In the proof, the elementary Tauberian theorem of Karamata plays an important role. Darling and Kac’s theory is applicable to Markov chains by replacing the Laplace transform with generating functions. In particular, it works for random walks, that is, sums of independent, identically distributed random variables $\{X_n\}$ with common distribution function F. The distribution of $X_n$ may be lattice or non-lattice. When F has mean 0 and is in the domain of attraction of some stable law with index d, denoted by $F \in D_0(d)$ ($d \in (1, 2]$), it is known that $S_n$, the partial sum of $\{X_n\}$, has the local limit theorem [31] [59]. It leads to the important “Darling-Kac condition” of the i.i.d. case:
\[ \sum_{k=0}^{n} P(S_k \in A) \sim |A|\, n^{1 - \frac{1}{d}} L(n), \quad \text{as } n \to \infty, \tag{1.3} \]
where L(n) is a slowly varying function. By Darling and Kac [24], it implies that the normalized occupation time of $S_n$ converges to the Mittag-Leffler distribution.
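For the simple symmetric random walk, condition (1.3) can be checked numerically. The following is a minimal sketch under stated assumptions: steps ±1 with probability 1/2 each, A = {0} and d = 2, in which case the sum of return probabilities grows like √(2n/π) and the slowly varying factor is asymptotically constant. The function names are illustrative only.

```python
import math

def expected_zero_visits(n):
    """Sum_{k=0}^{n} P(S_k = 0) for the simple symmetric random walk.

    Only even times contribute: P(S_{2j} = 0) = C(2j, j) / 4^j, computed
    through the ratio P(S_{2j} = 0) / P(S_{2j-2} = 0) = (2j - 1) / (2j).
    """
    prob = 1.0          # P(S_0 = 0)
    total = prob
    for j in range(1, n // 2 + 1):
        prob *= (2 * j - 1) / (2 * j)
        total += prob
    return total

def darling_kac_ratio(n):
    """Ratio of the exact sum to the asymptotic value sqrt(2 n / pi)."""
    return expected_zero_visits(n) / math.sqrt(2 * n / math.pi)

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, darling_kac_ratio(n))  # the ratios approach 1
```

The exponent 1 − 1/d = 1/2 shows up as the square-root growth of the expected number of visits to the origin.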
Therefore, in the i.i.d. case, the central limit theorem of $S_n$ implies the central limit theorem of the occupation time of $S_n$. Independently, Kesten [36] and Bretagnolle and Dacunha-Castelle [16] conjectured that the converse is also true: (1.3) implies that $F \in D_0(d)$. Kesten [36] proved it when $S_n$ is a symmetric random walk, and Bingham and Hawkes [13] extended it to symmetric zero-mean Lévy processes, left-continuous random walks and completely asymmetric Lévy processes. The Darling-Kac theorem can also be extended from Markov processes to sums of weakly dependent random variables, for example the Révész-dependent sequences of Csörgő [23]. In our work, instead of random walks or Markov processes, we consider when the “Darling-Kac” theorem holds for local times of random variables with dependence structures and what the limiting distributions of the local times are.
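The definitions of $\ell(n, x)$ and $\lambda(n, A)$ from this section translate directly into code. The sketch below assumes a simple symmetric random walk, for which the classical limit law of $\ell_n / \sqrt{n}$ is that of |N(0, 1)|, a Mittag-Leffler law of index 1/2 up to scaling. It estimates the mean of the normalized local time by simulation; all function names are hypothetical.

```python
import math
import random

def random_walk(n, rng):
    """Return the path [S_1, ..., S_n] of a simple symmetric random walk."""
    path, position = [], 0
    for _ in range(n):
        position += rng.choice((-1, 1))
        path.append(position)
    return path

def local_time(path, x):
    """l(n, x) = #{i <= n : S_i = x}."""
    return sum(1 for s in path if s == x)

def occupation_time(path, subset):
    """lambda(n, A) = #{i <= n : S_i in A}, the sum of l(n, x) over x in A."""
    return sum(1 for s in path if s in subset)

def mean_normalized_local_time(n, trials, seed=0):
    """Monte Carlo estimate of E[l(n, 0)] / sqrt(n)."""
    rng = random.Random(seed)
    total = sum(local_time(random_walk(n, rng), 0) for _ in range(trials))
    return total / (trials * math.sqrt(n))

if __name__ == "__main__":
    print(mean_normalized_local_time(400, 1000))
    print(math.sqrt(2 / math.pi))  # mean of |N(0, 1)|, the large-n limit
```

For moderate n the simulated mean sits somewhat below √(2/π), since the walk cannot return to 0 at odd times and the normalization is only asymptotic.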
One class of dependent random variables is the stationary process $\{X_n\}$ with a “conditional local limit theorem”. The “Darling-Kac condition” is a consequence of the “conditional local limit theorem”. In Chapter 2, we review local limit theorems and then introduce the “conditional local limit theorem”. As an example, we prove that the discrete-time fractional Brownian motion has a conditional local limit theorem. In Chapter 3, we show that under the assumption of the “conditional local limit theorem”, the normalized local time of the partial sum $\{S_n\}$ of $\{X_n\}$ converges to the Mittag-Leffler distribution when the state space of the discrete-time process $\{S_n\}$ is $\mathbb{Z}$. When the state space is $\mathbb{R}$, the limiting distribution is not necessarily the Mittag-Leffler distribution, but it is closely related to it. The proof is in the framework of infinite ergodic theory. Occupation times can be represented as partial sums of iterated transformations in a dynamical system $(X, \mathcal{B}, \mu, S)$. When the dynamical system is pointwise dual ergodic, there exists a “Darling-Kac theorem” for infinite ergodic dynamical systems, due to Aaronson [2], which plays the role of the “Birkhoff theorem” in finite measure spaces. The dynamical system $(X, \mathcal{B}, \mu, S)$ in which the occupation time is defined can be decomposed into pointwise dual ergodic systems, and then Aaronson’s Darling-Kac theorem can be applied in each component. The idea of the proof of Aaronson’s Darling-Kac theorem is to estimate the Laplace transform of all moments of the partial sum of iterated transformations and to apply Karamata’s Tauberian theorem to recover the moments. The “Darling-Kac condition”, and hence the conditional local limit theorem in Chapter 2, is the key point in constructing a pointwise dual ergodic dynamical system. In the i.i.d. case, the conditional local limit theorem reduces to the local limit theorem.
The conditional local limit theorems are widely satisfied, for example in Aaronson and Denker’s work on stationary processes in the domain of attraction of a normal distribution [4] and on partial sums of stationary sequences generated by Gibbs-Markov maps [5]. Since the discrete-time fractional Brownian motion $B_n^H$ also has the conditional local limit theorem when the Hurst index satisfies $H \in (3/4, 1)$, we can use infinite ergodic theory to study the limiting distribution of
\[ \sum_{i=1}^{n} V(B_i^H) \quad \text{as } n \to \infty, \]
where V is a non-negative function over $\mathbb{R}$. If V is the indicator function of some set $B \subset \mathbb{R}$, then $\sum_{i=1}^{n} V(B_i^H)$ becomes the occupation time of $B_n^H$ in the set B: $\lambda(n, B) = \#\{i \le n : B_i^H \in B\}$. The limiting distribution of the normalized sum $\sum_{i=1}^{n} V(B_i^H)$ is closely related to the Mittag-Leffler distribution but is not the Mittag-Leffler distribution itself.

Infinite ergodic theory has many applications in probability. In Chapter 4, we use it to explore another property of the local times of stationary processes with conditional local limit theorems: the almost sure central limit theorem (ASCLT). The almost sure central limit theorem was first discovered independently by Brosamler [20] and Schatte [57]. The simplest form of it says that a sequence of independent identically distributed random variables $\{X_k\}$ with moments $E X_1 = 0$ and $E X_1^2 = 1$ obeys
\[ \lim_{n \to \infty} \frac{1}{\log n} \sum_{k=1}^{n} \frac{1}{k} 1_{\{S_k / \sqrt{k} \le x\}} = \Phi(x) \quad \text{a.s.} \]
for each x. Here $1_{\{\cdot\}}$ denotes the indicator function of events, and Φ is the distribution function of the standard normal distribution. The ASCLT has been obtained for several classes of independent and dependent random variables. We list some of them. Lacey and Philipp [39] gave a new proof of the ASCLT based on an almost sure invariance principle and extended the ASCLT to weakly dependent variables: for any sequence of random variables $\{X_n\}$, if its partial sum can be approximated almost surely by i.i.d. normal random variables $\{Y_n\}$, then $\frac{1}{\log n} \sum_{k \le n} \frac{1}{k} 1_{\{s_k(\cdot, \omega)\}}$ converges weakly to standard Brownian motion on [0, 1], where
\[ s_n(t, \omega) = \begin{cases} n^{-1/2} S_k, & \text{if } t = k/n,\ k = 0, 1, \cdots, n, \\ \text{linear in between.} \end{cases} \]
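The logarithmic average in the simplest form of the ASCLT can be computed along a single simulated path. The sketch below assumes i.i.d. standard normal steps (mean 0 and variance 1, as the statement requires) and compares the average with Φ(x). The helper names are illustrative, and because the weights 1/k are normalized by log n, the convergence is slow and the printed values should only be expected to be roughly close.

```python
import math
import random

def asclt_average(path, x):
    """(1 / log n) * sum_{k=1}^{n} (1/k) * 1{S_k / sqrt(k) <= x}.

    path: the partial sums [S_1, ..., S_n], with n >= 2.
    """
    n = len(path)
    total = sum(1.0 / k for k, s in enumerate(path, start=1)
                if s / math.sqrt(k) <= x)
    return total / math.log(n)

def standard_normal_cdf(x):
    """Distribution function of N(0, 1), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

if __name__ == "__main__":
    rng = random.Random(0)
    partial_sums, s = [], 0.0
    for _ in range(200_000):
        s += rng.gauss(0.0, 1.0)   # i.i.d. N(0, 1) steps
        partial_sums.append(s)
    for x in (-1.0, 0.0, 1.0):
        print(x, asclt_average(partial_sums, x), standard_normal_cdf(x))
```

Unlike the ordinary central limit theorem, the statement concerns one fixed path ω, which is why a single simulated trajectory is used rather than an average over many runs.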
When $\{X_n\}$ are independent but not necessarily identically distributed random variables, Berkes and Dehling [10] gave necessary and sufficient criteria for the generalized ASCLT and its functional version under mild growth conditions on the partial sum $S_n$. Peligrad and Shao [48] obtained the ASCLT for stationary associated sequences, strongly mixing sequences and ρ-mixing sequences under the same conditions that assure the usual central limit theorem. G. Khurelbaatar [28] proved the ASCLT for strongly mixing sequences of random variables with a slower mixing rate than in [48]. He also showed that the ASCLT holds for an associated sequence without a stationarity assumption. In [11], Berkes and Csáki show that not only the central limit theorem, but every “weak” limit theorem for independent random variables has an analogous almost sure version. However, the almost sure central limit theorem for local times has been studied very little. We only know that for aperiodic integer-valued random walks, Berkes and Csáki [11] established an almost sure central limit theorem when $\{X_n\}$ are i.i.d. and their law is in the domain of attraction of a stable law of order $d \in (1, 2]$. In Chapter 4, we shall prove that an almost sure central limit theorem holds for local times of stochastic processes whose Perron-Frobenius operators have spectral gaps, which covers a different class of processes. At the end of Chapter 4, we also review the bounds and deviations of local times. The bounds of local times of random walks were studied by Chung and Hunt [21]. They have a form similar to that of the Brownian local time. We extend the result to the local time of stochastic processes with “conditional local limit theorems” by infinite ergodic theory.
1.0.3 Connection between the Brownian local time and the local times of discrete-time processes
In the 1980s, an extensive literature on the invariance principles of local times appeared. It is well known that a random walk is the discrete version of Brownian motion. In the light of the invariance principle, it is not a surprise that the local time of Brownian motion and the “local time” of random walks should be close to each other. Thanks to the Skorohod embedding scheme, it can be shown that the Brownian local time at zero has the same distribution as the maximum process $\{M_t\}$ of a standard linear Brownian motion (Theorem 1.0.1). For fixed t, $M_t$ has, up to scaling, the Mittag-Leffler distribution with index $\frac{1}{2}$. Recall that the Mittag-Leffler distribution is the limiting distribution of the local times of random walks and Markov chains. Révész proved that a Brownian motion is close to its Skorohod embedded random walk, and simultaneously so are their local times.
Theorem 1.0.5 (Révész, 1981, [53]). Let $\{W(t)\}$ be a Brownian motion defined on a probability space $\{\Omega, \mathcal{F}, P\}$. Then on the same probability space, one can define a sequence $X_1, X_2, \ldots$ of i.i.d. r.v.’s with $P(X_i = 1) = P(X_i = -1) = 1/2$ such that for any $\epsilon > 0$
\[ \lim_{n \to \infty} n^{-1/4 - \epsilon} \sup_{x \in \mathbb{N}} |\ell(n, x) - L(n, x)| = 0 \quad \text{a.s.} \]
and simultaneously
\[ \lim_{n \to \infty} n^{-1/4 - \epsilon} |S_n - W(n)| = 0 \quad \text{a.s.} \]

For more results on the invariance principles of the local times of random walks, one can refer to Bass and Khoshnevisan (1993, 1995), Borodin (1986, 1988), Csáki and Révész (1983), Csörgő and Révész (1984, 1985, 1986), Jacod (1998) and Khoshnevisan (1992, 1993). Recently, Bromberg and Kosloff [19] proved a functional weak invariance principle for the local times of partial sums of Markov chains with finite state space $S \subset \mathbb{Z}$ under the assumption of strong aperiodicity of the Markov chain. The limiting distribution is the Brownian local time. Bromberg [17] [18] extended it to Gibbs-Markov processes and random variables $\{X_n\}$ which are generated by a dynamical system $(X, \mathcal{B}, m, T)$ with a quasi-compact transfer operator. It is close to the method used in this dissertation.

Chapter 2
Local Limit Theorems
2.1 Motivation
The limiting distributions of local times depend on the properties of the underlying discrete-time processes, one of which is the local limit theorem. For example, for the Markov processes mentioned in the Introduction, the local limit theorem implies the “Darling-Kac condition”, and hence the normalized local times of Markov processes have Mittag-Leffler distributions as their limiting distributions [24]. For general stationary processes, the local limit theorem or the conditional local limit theorem is also the key element in determining the limiting distributions of the local times. To prepare for finding the limiting distributions of the local times in the next chapter, in this chapter we review and study the local limit theorems of the discrete-time processes that we are interested in. Before moving on, we make some remarks on the term “local”: in the context of probabilistic limit theorems, the term “local” means the convergence of densities and “global” means the convergence of distribution functions. The term “local limit theorem” is used when the probability mass functions or the probability density functions are approximated by density functions.
2.2 Local limit theorems for independent and identically distributed random variables
Suppose independent and identically distributed random variables $\{X_n\}$ have a lattice distribution, taking values in the arithmetic progression $\{a + kh : k \in \mathbb{Z}\}$. Then their partial sum $S_n$ only takes values in the set $\{an + kh : k \in \mathbb{Z}\}$. The local limit theorem is an asymptotic expression for the probability mass function
P (Sn = an + kh) as n → ∞.
Theorem 2.2.1 (Local limit theorem for mass functions, [31]). In order that for some constants $A_n$ and $B_n > 0$,
\[ \lim_{n \to \infty} \sup_k \left| \frac{B_n}{h} P(S_n = an + kh) - g\left(\frac{an + kh - A_n}{B_n}\right) \right| = 0, \]
where g is the density function of some stable distribution G with exponent $\alpha \in (0, 2]$, it is necessary and sufficient that

1. the common distribution function F of $\{X_j\}$ should belong to the domain of attraction of G, i.e. $\frac{S_n - A_n}{B_n}$ converges in distribution to a stable law with distribution function G, and

2. the interval h be maximal. It means that no matter what the choice of $b \in \mathbb{R}$ and $h_1 > h$, it is impossible to represent all possible values of X in the form of $b + h_1 k$.
Remark 2.2.2. Local limit theorems provide more information than central limit theorems. The simplest example of the local limit theorem is the De Moivre-Laplace theorem, which implies the central limit theorem.
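The De Moivre-Laplace case of Theorem 2.2.1 can be checked directly. The sketch below assumes steps $X_i = \pm 1$ with probability 1/2 each, so that a = −1, h = 2, $A_n = 0$, $B_n = \sqrt{n}$ and g is the standard normal density, and it evaluates the supremum over the attainable lattice points. The function names are illustrative.

```python
import math

def normal_density(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def local_limit_sup_error(n):
    """max_k |(B_n / h) P(S_n = a n + k h) - g((a n + k h - A_n) / B_n)|

    for steps +1 or -1 with probability 1/2 each: a = -1, h = 2, A_n = 0,
    B_n = sqrt(n). Only the attainable points k = 0, ..., n are scanned;
    outside them the probability is 0 and g is of order exp(-n / 2).
    """
    a, h = -1, 2
    b_n = math.sqrt(n)
    worst = 0.0
    for k in range(n + 1):                    # k = number of +1 steps
        value = a * n + k * h                 # lattice point S_n = -n + 2k
        prob = math.comb(n, k) / 2 ** n       # exact binomial probability
        worst = max(worst, abs(b_n / h * prob - normal_density(value / b_n)))
    return worst

if __name__ == "__main__":
    for n in (100, 400, 900):
        print(n, local_limit_sup_error(n))    # the error shrinks as n grows
```

The factor $B_n / h$ accounts for the spacing h of the lattice: each lattice point carries the mass that the density g assigns to an interval of length $h / B_n$ on the normalized scale.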
Another kind of local limit theorem is the approximation of density functions. Suppose $\{X_n\}$ belongs to the domain of attraction of a stable law with density function g, and the density function $p_n$ of $\frac{S_n - A_n}{B_n}$ exists for all $n \ge N$. The local limit problem asks under which conditions $p_n$ converges to g. The answer is given by the following theorem.
Theorem 2.2.3 (Local limit theorem for density functions [31]). In order that for some choice of the constants $A_n$ and $B_n$,
\[ \lim_{n \to \infty} \sup_x |p_n(x) - g(x)| = 0, \]
where g is the density of some stable distribution with exponent $\alpha \in (0, 2]$, it is necessary and sufficient that

1. the common distribution function F of $\{X_n\}$ should belong to the domain of attraction of distribution G with density function g, and

2. there exists N with $\sup_x p_N(x) < \infty$.
2.3 Local limit theorems for Markov chains
The local limit theorems are also known for certain classes of stationary stochastic sequences which are related to Markov chains. Kolmogorov [37] obtained the local limits for Markov chains with finite state space using the methods developed by Markov [43] and Doeblin [25]. Nagaev [47] showed local limits for Markov chains with infinite state space in the normal case using perturbation theory of characteristic operators. Aleškevičienė [6] studied the local limits for certain stationary Markov chains in the stable case. Séva [58] showed that the local limit theorem holds for certain non-uniformly ergodic Markov chains with state space $\mathbb{N} = \{0, 1, 2, \cdots\}$. Let $p(x, A)$ be the stochastic transition function of a Markov chain $X(n)$ with countable state space $S = \{\xi_1, \xi_2, \cdots\}$ and $p^{(n)}_{x,y} = P(X(i + n) = y \mid X(i) = x)$. Suppose
\[ \sup_{x, y, A} |p(x, A) - p(y, A)| = \delta < 1, \tag{2.1} \]
which implies $\inf_{i,j} \sum_{k=1}^{\infty} \min(p_{ik}, p_{jk}) > 0$. Suppose in addition all states in S are essential and constitute a positive class.

Let $f(\xi_n) = a + k_n h$, where $k_n \in \mathbb{Z}$, $a \in \mathbb{R}$ and $h > 0$.

Theorem 2.3.1 (Local limit theorem for Markov chain [47]). If the greatest common factor of the $k_n$ is 1, $\sum_i f^2(\xi_i) p_i < \infty$, $\sigma > 0$, where $p_i$ are the final probabilities, and
\[ P_{\pi n}(s) = \Pr\left(\sum_{i=1}^{n} f(X(i)) = an + sh\right), \]
then
\[ \lim_{n \to \infty} \left| \frac{\sigma \sqrt{n}}{h} P_{\pi n}(s) - \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z_{ns}^2} \right| = 0, \]
where $\pi_i = \pi(\xi_i)$ is the initial distribution and
\[ \sigma \sqrt{n}\, z_{ns} = a(n + 1) + sh - (n + 1) \sum_{i=1}^{\infty} f(\xi_i) p_i. \]
In [6], Aleškevičienė considered the stable case for stationary Markov chains. Let $\{X_n\}$ be a homogeneous Markov chain with an arbitrary set of states S, and let $\mathcal{F}$ be a σ-algebra on it. Denote $Y_i = f(X_i)$; then the $\{Y_i\}$ are identically distributed. Denote $P_1(k) = P(Y_n = k)$, $n = 1, 2, \cdots$. Let $S_n = \sum_{i=1}^{n} Y_i$ and $P_n(k) = P(S_n = k)$, $F_n(x) = P(S_n < x)$ and $F(x) = F_1(x)$.

Theorem 2.3.2 (Local limit theorem for Markov chain [6]). Suppose F(x) belongs to the domain of attraction of the stable law $V_\alpha$ with density function $v_\alpha$ and an integral limit theorem holds for $\{X_n\}$, i.e.
\[ F_n(B_n x + A_n) \to V_\alpha(x). \]
Then in order that
\[ B_n P_n(k) - v_\alpha\left(\frac{k - A_n}{B_n}\right) \to 0 \]
holds uniformly with respect to k, it is necessary and sufficient that the greatest common divisor of the differences $k_1 - k_2$ is 1 when $P_1(k_1)$ and $P_1(k_2)$ are positive.
2.4 Conditional local limit theorems for stationary processes
In the normal case for stationary sequences generated by Lasota-Yorke maps of the interval and functions of bounded variation, a local limit theorem exists. It was proved by a spectral decomposition method by Rousseau-Egele [56] and Morita [44] [45]. For ergodic sums generated by Lipschitz continuous functions and Markov kernels on a compact metric space, Guivarc’h and Hardy [30] also derived a local limit theorem by means of the spectral decomposition. Aaronson and Denker [5] showed the local limits for stationary sequences generated by mixing Gibbs-Markov maps with aperiodic, Lipschitz continuous functions. In their work, the local limit theorems are established with respect to the sequence of conditional measures on the fibers of $T^n$ given by the Perron-Frobenius operators, where T is the measure preserving transformation of the stationary process. The method is from the spectral theory of the Perron-Frobenius operator and perturbation theory.
Definition 2.4.1 (Markov map, [5]). Let $(X, \mathcal{B}, m, T)$ be a nonsingular transformation of a standard probability space. If there is a measurable partition α such that

• $T(a) \in \sigma(\alpha) \mod m$ for all $a \in \alpha$,

• $T|_a$ is bijective, bi-measurable and bi-nonsingular for all $a \in \alpha$, and

• $\sigma(\{T^{-n}\alpha : n \ge 0\}) = \mathcal{B}$,

then $(X, \mathcal{B}, m, T, \alpha)$ is called a Markov map.
For $n \ge 1$, there are m-nonsingular inverse branches of T denoted by
\[ \nu_A : T^n A \to A, \quad A \in \alpha_0^{n-1} := \bigvee_{i=0}^{n-1} T^{-i}\alpha, \]
with Radon-Nikodym derivatives
\[ \nu'_A := \frac{d\, m \circ \nu_A}{dm}. \]
We fix $r \in (0, 1)$ and define the metric $d = d_r$ on X by $d(x, y) = r^{t(x, y)}$, where $t(x, y) = \min\{n + 1 : T^n(x), T^n(y) \text{ belong to different atoms of } \alpha\}$.
Definition 2.4.2 (Gibbs-Markov). A Markov map $(X, \mathcal{B}, m, T, \alpha)$ is called Gibbs-Markov if the following two additional assumptions are satisfied:

• $\inf_{a \in \alpha} m(T(a)) > 0$,

• there exists $M > 0$ such that
\[ \left| \frac{\nu'_A(x)}{\nu'_A(y)} - 1 \right| \le M d(x, y) \]
for all $n \ge 1$, $A \in \alpha_0^{n-1}$, $x, y \in T^n A$.
Definition 2.4.3 (Perron-Frobenius operator). The Perron-Frobenius operator $P_T : L^1(m) \to L^1(m)$ is defined by
\[ \int_X P_T f \cdot g \, dm = \int_X f \cdot (g \circ T) \, dm \tag{2.2} \]
for any $g \in L^\infty(m)$.

An interpretation of the Perron-Frobenius operator is that it describes the evolution of probability densities under the action of T. That is, if f is the density of some probability measure ν with respect to m, then $P_T f$ is the density of the image measure $\nu \circ T^{-1}$.
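As an illustration of the duality relation (2.2), the sketch below uses the doubling map T(x) = 2x mod 1 on [0, 1) with Lebesgue measure. This map is chosen here only as a standard example and is not taken from the text; its Perron-Frobenius operator has the explicit form $(P_T f)(x) = \frac{1}{2}\left(f(x/2) + f((x+1)/2)\right)$. The integrals are approximated by a midpoint rule, and all names are hypothetical.

```python
def doubling_map(x):
    """T(x) = 2x mod 1 on [0, 1), which preserves Lebesgue measure."""
    return (2.0 * x) % 1.0

def transfer(f):
    """Perron-Frobenius operator of the doubling map with respect to Lebesgue
    measure: (P_T f)(x) = (f(x / 2) + f((x + 1) / 2)) / 2."""
    return lambda x: 0.5 * (f(0.5 * x) + f(0.5 * (x + 1.0)))

def integrate(f, cells=10_000):
    """Midpoint rule on [0, 1]."""
    return sum(f((i + 0.5) / cells) for i in range(cells)) / cells

def duality_gap(f, g):
    """|int P_T f * g dm - int f * (g o T) dm|, which (2.2) says is zero."""
    p_f = transfer(f)
    left = integrate(lambda x: p_f(x) * g(x))
    right = integrate(lambda x: f(x) * g(doubling_map(x)))
    return abs(left - right)

if __name__ == "__main__":
    density = lambda x: 2.0 * x          # a probability density on [0, 1]
    observable = lambda x: x * x
    print(duality_gap(density, observable))   # close to 0
    print(transfer(density)(0.25))            # (P_T f)(x) = x + 1/2, so 0.75
```

Here the density 2x is pushed forward to the flatter density x + 1/2, and the constant density 1 is fixed by $P_T$, which reflects the invariance of Lebesgue measure under the doubling map.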
Theorem 2.4.4 (Distribution limit [5]). Let $(X, \mathcal{B}, m, T, \alpha)$ be a mixing, probability preserving Gibbs-Markov map. Let
\[ \varphi : X \to \mathbb{R} \]
be Lipschitz continuous on each $a \in \alpha$, with $D_\alpha \varphi := \sum_{a \in \alpha} D_a \varphi < \infty$. Assume that the m-distribution G of φ is in the domain of attraction of a stable law of order $0 < p < 2$, which is equivalent to
\[ L_1(x) := x^p (1 - G(x)) = (c_1 + o(1)) L(x), \]
\[ L_2(x) := x^p G(-x) = (c_2 + o(1)) L(x) \]
as $x \to \infty$, where L is a slowly varying function on $\mathbb{R}_+$ and where $c_1, c_2 \ge 0$, $c_1 + c_2 > 0$. Then
\[ \frac{S_n - A_n}{B_n} \to Y_p \quad \text{weakly}, \]
where $n L(B_n) = B_n^p$ and
\[ A_n = \begin{cases} 0, & 0 < p < 1, \\ \gamma n, & 1 < p < 2, \\ \gamma n + \frac{2n}{\pi}\left(H_1(B_n) - H_2(B_n)\right), & p = 1. \end{cases} \]

Theorem 2.4.5 (Conditional lattice local limit theorem [5]). Suppose that $\varphi : X \to \mathbb{Z}$ is aperiodic, i.e. the only solution to the equation $e^{it\varphi} = \lambda \frac{f \circ T}{f}$ with $f : X \to \mathbb{T}$ measurable is $t \in 2\pi\mathbb{Z}$, $f \equiv 1$, $\lambda = 1$. Let $A_n, B_n$ be as in Theorem 2.4.4, and suppose that $k_n \in \mathbb{Z}$, $\frac{k_n - A_n}{B_n} \to \kappa \in \mathbb{R}$ as $n \to \infty$. Then
\[ \lim_{n \to \infty} \left\| B_n P_{T^n}\left(1_{\{S_n = k_n\}}\right) - f_{Y_p}(\kappa) \right\|_\infty = 0, \]
where $f_{Y_p}$ is the density function of the random variable $Y_p$ in Theorem 2.4.4. In particular,
\[ \lim_{n \to \infty} B_n P(S_n = k_n) = f_{Y_p}(\kappa). \]

Theorem 2.4.6 (Conditional non-lattice local limit theorem [5]). Suppose that $\varphi : X \to \mathbb{R}$ is aperiodic, let $A_n, B_n$ be as in Theorem 2.4.4, let $I \subset \mathbb{R}$ be an interval, and suppose that $k_n \in \mathbb{Z}$, $\frac{k_n - A_n}{B_n} \to \kappa \in \mathbb{R}$ as $n \to \infty$. Then
\[ \lim_{n \to \infty} B_n P_{T^n}\left(1_{\{S_n \in k_n + I\}}\right) = |I| f_{Y_p}(\kappa), \]
where |I| is the length of I, and in particular,
\[ \lim_{n \to \infty} B_n P(S_n \in k_n + I) = |I| f_{Y_p}(\kappa). \]
Another family of processes admitting a conditional local limit theorem is given by AFU maps. A piecewise monotonic map of the interval is a triple $(X, T, \alpha)$ where $X$ is an interval, $\alpha$ is a finite or countable generating partition (mod Lebesgue measure) of $X$ into open intervals and $T : X \to X$ is a map such that $T|_A$ is continuous and strictly monotonic for each $A \in \alpha$. A piecewise monotonic map of the interval with the following properties is called an AFU map [1]:
A Adler's condition: for all $A \in \alpha$, $T|_A$ extends to a $C^2$ map on $\bar{A}$ and $T''/(T')^2$ is bounded on $X$.
F Finite images: $\{TA : A \in \alpha\}$ is finite.
U Uniform expansion: $\inf |T'| > 1$.
For every measurable $f$ on the interval $X$ define its variation by $\mathrm{Var}_X(f) = \sup \sum_i \|f(x_i) - f(x_{i-1})\|$, where the supremum ranges over all partitions $x_1 < x_2 < \cdots < x_n$ in $X$. Denote
\[ \bigvee_X f := \inf\{\mathrm{Var}_X(f^*) : f^* = f \ \ m_\lambda\text{-a.e.}\}, \]
where $m_\lambda$ is the Lebesgue measure.
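Theorem 2.4.7 ([1]). Let $(X, T, \mu, \alpha)$ be a weakly mixing AFU map, which means that for any functions $f, g \in L^2(X, \mu)$,
\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} \left| \int_X f \circ T^n \cdot g \, d\mu - \int_X f \, d\mu \cdot \int_X g \, d\mu \right| = 0, \]
and suppose that $\phi : X \to \mathbb{R}$ satisfies $C_{\phi,\alpha} := \sup_{A \in \alpha} \bigvee_A \phi < \infty$.

1. If $E[\phi^2] < \infty$ and $\frac{1}{n} \mathrm{Var}(\phi_n) \to \sigma^2 > 0$, then
\[ P_{n,x}\left( \frac{\phi_n - nE(\phi)}{\sigma\sqrt{n}} \in (a, b) \right) \to \frac{1}{\sqrt{2\pi}} \int_a^b e^{-t^2/2} \, dt \]
as $n \to \infty$, uniformly in $x \in X$, where $P_{n,x}(A) = P_{T^n} 1_A(x)$ and $\phi_n = \sum_{i=1}^n \phi \circ T^{i-1}$.

2. If in addition $\phi : X \to \mathbb{Z}$ is aperiodic, then
\[ \sigma\sqrt{n}\, P_{n,x}(\phi_n = k_n) \to \frac{1}{\sqrt{2\pi}} e^{-t^2/2} \]
as $n \to \infty$, $k_n \in \mathbb{Z}$, $\frac{k_n - nE(\phi)}{\sigma\sqrt{n}} \to t$, uniformly in $x \in X$ and $t \in K$ for all compact $K \subset \mathbb{R}$.

3. If in addition $\phi : X \to \mathbb{R}$ is aperiodic, and $I \subset \mathbb{R}$ is a finite interval, then
\[ \sigma\sqrt{n}\, P_{n,x}(\phi_n \in k_n + I) \to \frac{1}{\sqrt{2\pi}} e^{-t^2/2} |I| \]
as $n \to \infty$, $k_n \in \mathbb{Z}$, $\frac{k_n - nE(\phi)}{\sigma\sqrt{n}} \to t$, uniformly in $x \in X$ and $t \in K$ for all compact $K \subset \mathbb{R}$.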
2.5 Conditional local limit theorem for discrete-time fractional Brownian motion
In this section, we prove a conditional local limit theorem for discrete-time fractional Brownian motion. This result plays an important role in establishing the limiting distribution of the local time of discrete-time fractional Brownian motion in the next chapter. Fractional Brownian motion (fBm), also called fractal Brownian motion, was first introduced by Mandelbrot and van Ness (1968) [41]. It is a generalization of Brownian motion. Unlike classical Brownian motion, the increments of fBm need not be independent. Fractional Brownian motion is a continuous-time Gaussian process $B_H(t)$ on $[0, T]$, which starts at zero, has expectation zero for all $t$ in $[0, T]$, and has the following covariance function:
\[ E[B_H(t) B_H(s)] = \frac{1}{2}\left( |t|^{2H} + |s|^{2H} - |t - s|^{2H} \right), \]
where $H \in (0, 1)$ is called the Hurst index associated with the fractional Brownian motion. If $H = 1/2$, then the process is in fact a Brownian motion. The increment process, $X_t = B_H(t+1) - B_H(t)$, is known as fractional Gaussian noise (fGn). When $H > 1/2$, the increments $\{X_n\}$ have the long-range dependence property; that is, the autocovariance function $b(n) = \mathrm{Cov}(X_k, X_{k+n})$ of the stationary sequence $\{X_n\}$ satisfies
\[ \lim_{n \to \infty} \frac{b(n)}{c n^{-\alpha}} = 1 \]
for some constant $c$ and $\alpha \in (0, 1)$. In this case, the dependence between $X_k$ and $X_{k+n}$ decays slowly as $n$ tends to infinity and $\sum_{n=1}^\infty b(n) = \infty$.
We will prove a conditional local limit theorem for the discrete fractional Gaussian noise $X_n$ when $H > 3/4$.
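The long-range dependence property can be illustrated numerically. The following sketch assumes the fGn autocovariance $b(n) = \frac{1}{2}\left[(n+1)^{2H} - 2n^{2H} + |n-1|^{2H}\right]$ implied by the covariance function above; a second-order expansion then suggests $c = H(2H-1)$ and $\alpha = 2 - 2H$. The function name and the value $H = 0.8$ are illustrative choices only.

```python
def fgn_autocov(n, H):
    """Autocovariance of fractional Gaussian noise at lag n:
    b(n) = ((n+1)^{2H} - 2 n^{2H} + |n-1|^{2H}) / 2."""
    n = abs(n)
    return 0.5 * ((n + 1) ** (2 * H) - 2 * n ** (2 * H) + abs(n - 1) ** (2 * H))


H = 0.8
alpha = 2 - 2 * H              # decay exponent, in (0, 1) when 1/2 < H < 1
c = H * (2 * H - 1)            # limiting constant

# b(n) / (c n^{-alpha}) should be close to 1 for large n.
ratio = fgn_autocov(1000, H) / (c * 1000 ** (-alpha))

# Partial sums of b(n) keep growing, roughly like H * N^{2H-1}.
partial_sums = [sum(fgn_autocov(j, H) for j in range(1, N + 1))
                for N in (10, 100, 1000)]
print(ratio, partial_sums)
```

Because the sum telescopes, the partial sum up to $N$ equals $\frac{1}{2}\left[(N+1)^{2H} - N^{2H} - 1\right]$, which makes the divergence explicit.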
Theorem 2.5.1 (Conditional Local Limit Theorem). Suppose $\{X_n\}$ is a sequence of stationary Gaussian random variables on the probability space $(\Omega, \mathcal{F}, P)$ with mean 0, and its covariance function $b(i - j) = E(X_i X_j)$ satisfies
\[ b(n) = \frac{1}{2}\left[ (n+1)^{2H} - 2n^{2H} + (n-1)^{2H} \right], \]
where $3/4 < H < 1$. Denote by $S_n$ the partial sum of $\{X_n\}$: $S_n = \sum_{i=1}^n X_i$. Suppose $(a, b) \subset \mathbb{R}$ is an interval.
Then there exists a normalization sequence $\{d_n\}$ satisfying $d_n / n^H \to K$ for some constant $K$ as $n \to \infty$. For any sequence $\{q_n\}$ and $\kappa \in \mathbb{R}$ such that $\frac{q_n}{d_n} \to \kappa$ as $n \to \infty$, the conditional probability satisfies
\[ \lim_{n \to \infty} d_n P\left( S_n \in (q_n + a, q_n + b) \mid (X_{n+1}, X_{n+2}, \cdots) \right) = (b - a) g(\kappa), \quad \text{a.s.}, \tag{2.3} \]
where $g$ is the density function of the standard normal distribution.
In the case that $q_n = 0$, the convergence is uniform for almost all $\omega$ and for any interval $(a, b)$.
In the rest of the Chapter, we will give the proof of Theorem 2.5.1.
2.5.1 Proof of the conditional local limit theorem
In this part, we state claims which are the key points in the proof of Theorem 2.5.1 and provide the proof modulo these conditions.
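Proof of Theorem 2.5.1. For a fixed positive integer $k$, the conditional probability $P\left( S_n \in (q_n + a, q_n + b) \mid (X_{n+1}, X_{n+2}, \ldots, X_{n+k}) \right)$ is given by a normal distribution. Indeed, let the $(n + k)$-dimensional random vector $X = (X_1, \cdots, X_{n+k})^T$ be partitioned as $\begin{bmatrix} X_{[1]} \\ X_{[2]} \end{bmatrix}$ with sizes $n$ and $k$ respectively. The covariance matrix of $X$ is denoted by
\[ \Sigma = \begin{bmatrix} \Sigma_{11}(n, n) & \Sigma_{12}(n, k) \\ \Sigma_{21}(k, n) & \Sigma_{22}(k, k) \end{bmatrix}, \]
where $\Sigma_{11}$ and $\Sigma_{22}$ are symmetric Toeplitz matrices:
\[ \Sigma_{11}(n, n) = \begin{pmatrix} b(0) & b(1) & b(2) & \cdots & b(n-1) \\ b(1) & b(0) & b(1) & \cdots & b(n-2) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ b(n-1) & b(n-2) & b(n-3) & \cdots & b(0) \end{pmatrix}, \]
and
\[ \Sigma_{22}(k, k) = \begin{pmatrix} b(0) & b(1) & b(2) & \cdots & b(k-1) \\ b(1) & b(0) & b(1) & \cdots & b(k-2) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ b(k-1) & b(k-2) & b(k-3) & \cdots & b(0) \end{pmatrix}. \]
And
\[ \Sigma_{12}(n, k) = \Sigma_{21}^T(k, n) = \begin{pmatrix} b(n) & b(n+1) & b(n+2) & \cdots & b(n+k-1) \\ b(n-1) & b(n) & b(n+1) & \cdots & b(n+k-2) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ b(1) & b(2) & b(3) & \cdots & b(k) \end{pmatrix}. \]
Let $D$ be a $(k + 1) \times (n + k)$ matrix, defined by $D = \begin{bmatrix} e(n) & 0 \\ 0 & I_k \end{bmatrix}$, where $e(n) = (\underbrace{1, 1, \ldots, 1}_{n})$ and $I_k$ is the identity matrix of size $k$. Then $DX \sim \mathcal{N}(0, D \Sigma D^T)$, i.e.
\[ \begin{bmatrix} S_n \\ X_{[2]} \end{bmatrix} \sim \mathcal{N}\left( 0, \begin{bmatrix} e(n) \Sigma_{11} e(n)^T & e(n) \Sigma_{12} \\ \Sigma_{21} e(n)^T & \Sigma_{22} \end{bmatrix} \right). \]
By the conditional normal formula (see for example [12], section 5.5), when $\Sigma_{22}$ is of full rank,
\[ (S_n \mid X_{[2]}) \sim \mathcal{N}\left( \mu(n, k), \sigma^2(n, k) \right), \]
where
\[ \mu(n, k) = e(n) \Sigma_{12}(n, k) \Sigma_{22}^{-1}(k, k) X_{[2]} \tag{2.4} \]
and
\[ \sigma^2(n, k) = e(n) \Sigma_{11}(n, n) e(n)^T - e(n) \Sigma_{12}(n, k) \Sigma_{22}^{-1}(k, k) \Sigma_{21}(k, n) e(n)^T. \tag{2.5} \]
That is, $P(S_n \in A \mid X_{[2]}) = \int_A f(y_1 \mid X_{[2]}) \, dy_1$, with
\[ f(y_1 \mid X_{[2]}) = \frac{1}{\sqrt{2\pi}\, \sigma(n, k)} \exp\left( -\frac{(y_1 - \mu(n, k))^2}{2\sigma^2(n, k)} \right). \]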
For convenience, we introduce a vector $B$:
\[ B = (B(1), B(2), \cdots, B(k))^T := \Sigma_{21}(k, n) e(n)^T, \]
i.e. $B(s) = \sum_{i=s}^{n+s-1} b(i)$, $s = 1, 2, \cdots, k$. Then the mean and the variance can be written as
\[ \mu(n, k) = B^T \Sigma_{22}^{-1}(k, k) X_{[2]} \]
and
\[ \sigma^2(n, k) = e \Sigma_{11}(n, n) e^T - B^T \Sigma_{22}^{-1}(k, k) B. \]
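For small $n$ and $k$ the conditional variance can be evaluated directly from this last formula. The sketch below is a hedged illustration: the helper names, the plain Gaussian elimination solver and the parameters $n = 5$, $k = 8$, $H = 0.8$ are assumptions made for the example, not part of the proof. It also uses the fact that $e \Sigma_{11} e^T = \mathrm{Var}(S_n) = n^{2H}$ for fractional Gaussian noise.

```python
def fgn_autocov(n, H):
    """b(n) = ((n+1)^{2H} - 2 n^{2H} + |n-1|^{2H}) / 2."""
    n = abs(n)
    return 0.5 * ((n + 1) ** (2 * H) - 2 * n ** (2 * H) + abs(n - 1) ** (2 * H))


def solve(M, v):
    """Solve M z = v by Gaussian elimination with partial pivoting."""
    k = len(v)
    A = [row[:] + [v[i]] for i, row in enumerate(M)]   # augmented matrix
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for j in range(col, k + 1):
                A[r][j] -= factor * A[col][j]
    z = [0.0] * k
    for r in range(k - 1, -1, -1):                     # back substitution
        z[r] = (A[r][k] - sum(A[r][j] * z[j] for j in range(r + 1, k))) / A[r][r]
    return z


def conditional_variance(n, k, H):
    """Return (e Sigma11 e^T, sigma^2(n, k)) for fGn with Hurst index H."""
    total = sum(fgn_autocov(i - j, H) for i in range(n) for j in range(n))
    B = [sum(fgn_autocov(i, H) for i in range(s, n + s)) for s in range(1, k + 1)]
    sigma22 = [[fgn_autocov(i - j, H) for j in range(k)] for i in range(k)]
    z = solve(sigma22, B)                              # z = Sigma22^{-1} B
    return total, total - sum(b * zi for b, zi in zip(B, z))


total, sigma2 = conditional_variance(n=5, k=8, H=0.8)
print(total, sigma2)
```

For $k = 1$ the formula reduces to $n^{2H} - B(1)^2$, since $b(0) = 1$, which gives a quick hand check.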
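It follows that
\[ \begin{aligned} & \sigma(n, k) P\left( S_n \in (q_n + a, q_n + b) \mid (X_{n+1}, X_{n+2}, \ldots, X_{n+k}) \right) \\ &= \frac{1}{\sqrt{2\pi}} \int_{a+q_n}^{b+q_n} \exp\left( -\frac{(x - B^T \Sigma_{22}^{-1}(k, k) X_{[2]})^2}{2\sigma^2(n, k)} \right) dx \\ &= \sigma(n, k) \frac{1}{\sqrt{2\pi}} \int_{(a+q_n)/\sigma(n,k)}^{(b+q_n)/\sigma(n,k)} \exp\left( -\frac{1}{2}\left( y - \frac{1}{\sigma(n, k)} B^T \Sigma_{22}^{-1}(k, k) X_{[2]} \right)^2 \right) dy \\ &= \sigma(n, k) \frac{1}{\sqrt{2\pi}} \frac{b - a}{\sigma(n, k)} \exp\left( -\frac{1}{2}\left( \frac{\xi + q_n}{\sigma(n, k)} - \frac{1}{\sigma(n, k)} B^T \Sigma_{22}^{-1}(k, k) X_{[2]} \right)^2 \right) \\ &= \frac{1}{\sqrt{2\pi}} (b - a) \exp\left( -\frac{1}{2}\left( \frac{\xi + q_n}{\sigma(n, k)} - \frac{1}{\sigma(n, k)} B^T \Sigma_{22}^{-1}(k, k) X_{[2]} \right)^2 \right). \end{aligned} \]
We used the mean value theorem in the second-to-last step, with $\xi \in [a, b]$. We now make two claims, which will be proved in Section 2.5.2 and Section 2.5.3 respectively.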
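Claim 1 For fixed $n$, let $d_n^2 = \lim_{k \to \infty} \sigma^2(n, k)$. Then $d_n / n^H \to K$ as $n \to \infty$, where $K$ is a constant.
Claim 2 For fixed $n$, $\lim_{k \to \infty} \frac{1}{\sigma(n, k)} B^T \Sigma_{22}^{-1}(k, k) X_{[2]} = 0$ P-a.s.
As a consequence,
\[ \lim_{k \to \infty} \frac{\xi + q_n}{\sigma(n, k)} = \frac{\xi}{d_n} + \frac{q_n}{d_n} =: \kappa(n). \]
Since $\xi \in [a, b]$ and $\frac{q_n}{d_n} \to \kappa$ as $n \to \infty$, $\lim_{n \to \infty} \kappa(n) = \kappa$.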
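Hence
\[ \lim_{k \to \infty} \sigma(n, k) P\left( S_n \in (q_n + a, q_n + b) \mid (X_{n+1}, X_{n+2}, \ldots, X_{n+k}) \right) = \frac{1}{\sqrt{2\pi}} (b - a) \exp\left( -\frac{\kappa(n)^2}{2} \right) = g(\kappa(n))(b - a) \]
almost surely, where $g$ is the density function of the standard normal random variable. On the other hand, almost surely,
\[ \lim_{k \to \infty} \sigma(n, k) P\left( S_n \in (q_n + a, q_n + b) \mid (X_{n+1}, X_{n+2}, \ldots, X_{n+k}) \right) = d_n P\left( S_n \in (q_n + a, q_n + b) \mid (X_{n+1}, X_{n+2}, \ldots) \right). \]
Hence almost surely,
\[ d_n P\left( S_n \in (q_n + a, q_n + b) \mid (X_{n+1}, X_{n+2}, \ldots) \right) = (b - a) g(\kappa(n)). \]
It follows that
\[ \lim_{n \to \infty} d_n P\left( S_n \in (q_n + a, q_n + b) \mid (X_{n+1}, X_{n+2}, \ldots) \right) = g(\kappa)(b - a) \quad \text{almost surely}. \]
In particular, when $q_n = 0$, almost surely,
\[ d_n P\left( S_n \in (a, b) \mid (X_{n+1}, X_{n+2}, \ldots) \right) = (b - a) g(0). \]
So
\[ \lim_{n \to \infty} d_n P\left( S_n \in (a, b) \mid (X_{n+1}, X_{n+2}, \ldots) \right) = g(0)(b - a), \]
uniformly for almost all $\omega \in \Omega$ and $(a, b)$.
In the following two sections, we prove the two claims.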
2.5.2 Estimate of the variance
In this part, we prove Claim 1:
\[ \lim_{k \to \infty} \sigma^2(n, k) = d_n^2 = n^{2H} L(n). \]
Since $b(n) \sim n^{2H-2} L(n)$, where $L(n)$ is slowly varying [60], the first term of $\sigma^2(n, k)$ satisfies $e(n) \Sigma_{11}(n, n) e(n)^T \sim n^{2H} L(n)$ as $n \to \infty$. It is therefore sufficient to prove that the second term of $\sigma^2(n, k)$ converges to 0 as $k \to \infty$, i.e.
\[ \lim_{k \to \infty} B^T \Sigma_{22}^{-1} B = 0. \]
First we give an estimate of the elements $B(s) = \sum_{i=s}^{n+s-1} b(i)$ of the vector $B$.
Lemma 2.5.2. It holds that
\[ B(s) = \binom{2H}{2} n (s + n)^{2H-2} \left( 1 + O\left(\frac{1}{s}\right) \right), \quad \text{as } s \to \infty, \]
and therefore $\sum_{i=1}^k B^2(i) = O(n^2 (k + n)^{4H-3})$ as $k \to \infty$.
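Proof. By Taylor expansion, $(1 + x)^a = \sum_{i=0}^\infty \binom{a}{i} x^i$ when $|x| < 1$, where $\binom{a}{i} = \frac{a(a-1)\cdots(a-i+1)}{i!}$. So by definition of $B(s)$ and $b(t)$,
\[ \begin{aligned} 2(sn)^{-2H} B(s) &= \left( (s+n)^{2H} - (s+n-1)^{2H} - s^{2H} + (s-1)^{2H} \right)(sn)^{-2H} \\ &= \left( 1 - \frac{sn - s - n}{sn} \right)^{2H} - \left( 1 - \frac{sn - s - n + 1}{sn} \right)^{2H} - \left( 1 - \frac{sn - s}{sn} \right)^{2H} + \left( 1 - \frac{sn - s + 1}{sn} \right)^{2H} \\ &= \sum_{i=2}^\infty \binom{2H}{i} (-1)^i f(s, n, i)(sn)^{-i}, \end{aligned} \tag{2.6} \]
where $f(s, n, i) = (sn - s - n)^i - (sn - s - n + 1)^i - (sn - s)^i + (sn - s + 1)^i$. Using the binomial formula, we can rewrite
\[ f(s, n, i) = \sum_{j=1}^i \binom{i}{j} \left( (sn - s)^{i-j} - (sn - s - n)^{i-j} \right). \]
Since
\[ \begin{aligned} (sn - s)^{i-j} - (sn - s - n)^{i-j} &= \sum_{l=1}^{i-j} \binom{i-j}{l} (sn - s - n)^{i-j-l} n^l \\ &= n(i - j)(sn - s - n)^{i-j-1} \left( 1 + O\left( \frac{n}{sn - s - n} \right) \right), \quad \text{as } \frac{n}{sn - s - n} \to 0, \end{aligned} \]
a straightforward calculation furthermore shows that
\[ f(s, n, i) = i(i - 1) n (sn - s - n)^{i-2} \left( 1 + O\left( \frac{n}{ns - s - n} \right) \right) \]
as $\frac{n}{sn - s - n} \to 0$ and $\frac{1}{sn - s - n} \to 0$. Plugging into equation (2.6) we obtain
\[ \begin{aligned} 2(sn)^{-2H} B(s) &= \sum_{i=2}^\infty \binom{2H}{i} (-1)^i i(i - 1) n (sn - s - n)^{i-2} (sn)^{-i} \left( 1 + O\left( \frac{n}{ns - s - n} \right) \right) \\ &= (2H)(2H - 1) \frac{1}{s^2 n} \left( \frac{1}{s} + \frac{1}{n} \right)^{2H-2} \left( 1 + O\left( \frac{n}{ns - s - n} \right) \right). \end{aligned} \]
This shows $B(s) = \binom{2H}{2} n (s + n)^{2H-2} \left( 1 + O\left(\frac{1}{s}\right) \right)$ as $s \to \infty$.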
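The estimate of Lemma 2.5.2 can be sanity-checked numerically. The sketch below assumes the fGn autocovariance from Theorem 2.5.1 and compares three quantities: the defining sum for $B(s)$, the telescoped closed form used in (2.6), and the leading term $\binom{2H}{2} n (s+n)^{2H-2} = H(2H-1)\, n (s+n)^{2H-2}$. The function names and the parameters $n = 5$, $H = 0.8$ are illustrative.

```python
def fgn_autocov(n, H):
    """b(n) = ((n+1)^{2H} - 2 n^{2H} + |n-1|^{2H}) / 2."""
    n = abs(n)
    return 0.5 * ((n + 1) ** (2 * H) - 2 * n ** (2 * H) + abs(n - 1) ** (2 * H))


def B_direct(s, n, H):
    """B(s) = sum_{i=s}^{n+s-1} b(i), straight from the definition."""
    return sum(fgn_autocov(i, H) for i in range(s, n + s))


def B_closed(s, n, H):
    """Telescoped form: 2 B(s) = (s+n)^{2H} - (s+n-1)^{2H} - s^{2H} + (s-1)^{2H}."""
    p = 2 * H
    return 0.5 * ((s + n) ** p - (s + n - 1) ** p - s ** p + (s - 1) ** p)


def B_leading(s, n, H):
    """Leading term of Lemma 2.5.2: binom(2H, 2) * n * (s+n)^{2H-2}."""
    return H * (2 * H - 1) * n * (s + n) ** (2 * H - 2)


n, H = 5, 0.8
# Relative deviation from the leading term for growing s; it should shrink like 1/s.
deviations = [B_closed(s, n, H) / B_leading(s, n, H) - 1.0 for s in (100, 1000, 10000)]
print(deviations)
```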
The main idea of estimating $B^T \Sigma_{22}^{-1} B$ is to write
\[ B^T \Sigma_{22}^{-1} B = c(k) B^T A^{-1} B = c(k) B^T \sum_{l=0}^\infty (I - A)^l B = c(k) \sum_{l=0}^\infty B^T (I - A)^l B, \]
where $c(k)$ is an appropriate constant chosen below satisfying $\|I - A\|_2 < 1$ for $A = c(k) \Sigma_{22}$.
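The series expansion above is the Neumann series for $A^{-1}$. As a minimal sketch of the mechanism (not of the specific choice of $c(k)$ made below), the following code approximates $M^{-1} v$ for a small symmetric positive definite matrix $M$ by $c \sum_l (I - cM)^l v$. The scaling $c = 1/\|M\|_\infty$ is an assumption made here because it guarantees that all eigenvalues of $I - cM$ lie in $[0, 1)$; the matrix `M` and vector `v` are toy inputs.

```python
def neumann_solve(M, v, c, terms=2000):
    """Approximate M^{-1} v by c * sum_{l=0}^{terms-1} (I - c M)^l v."""
    k = len(v)
    V = v[:]                     # V^{(0)} = v
    acc = [0.0] * k
    for _ in range(terms):
        acc = [a + x for a, x in zip(acc, V)]
        MV = [sum(M[i][j] * V[j] for j in range(k)) for i in range(k)]
        V = [V[i] - c * MV[i] for i in range(k)]   # V^{(l+1)} = (I - cM) V^{(l)}
    return [c * a for a in acc]


M = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]            # symmetric positive definite toy matrix
v = [1.0, 0.0, -1.0]
c = 1.0 / max(sum(abs(x) for x in row) for row in M)   # c = 1 / ||M||_inf

x = neumann_solve(M, v, c)
residual = max(abs(sum(M[i][j] * x[j] for j in range(3)) - v[i]) for i in range(3))
print(x, residual)
```

Here the exact solution is $(1/2, 0, -1/2)$, so the residual should be at the level of rounding error.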
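We consider the minimal and maximal eigenvalues of $\Sigma_{22}$, denoted by $\lambda_{\min}(\Sigma_{22})$ and $\lambda_{\max}(\Sigma_{22})$, before determining $c(k)$, which is closely related to the norm of $\Sigma_{22}^{-1}$.
Lemma 2.5.3. Suppose $H > \frac{1}{2}$. Then $\lambda_{\min}(\Sigma_{22}) \to c = \operatorname{ess\,inf} f > 0$ as $k \to \infty$.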
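Proof. One can define [51] the power spectrum (see [22], chapter 14) with a singularity at $\lambda = 0$ by
\[ f(\lambda) := \sum_{k=-\infty}^\infty b(k) e^{-i2\pi\lambda k}, \quad -\frac{1}{2} < \lambda < \frac{1}{2}, \]
(also known as the spectral density function or spectrum of the stationary process $\{X_n\}$), where $b(k) = E(X_n X_{n+k})$ is the covariance function as before. $f(\lambda)$ has an inverse transformation: $b(k) = \int_{-1/2}^{1/2} f(\lambda) e^{2\pi i k \lambda} \, d\lambda$. For the fractional Gaussian noise $\{X_k\}$, $f(\lambda)$ has the form (cf. [51], Section 2.3):
\[ f(\lambda) = C |1 - e^{i2\pi\lambda}|^2 \sum_{m=-\infty}^\infty \frac{1}{|\lambda + m|^{1+2H}} = 2C(1 - \cos(2\pi\lambda)) \sum_{m=-\infty}^\infty \frac{1}{|\lambda + m|^{1+2H}}. \]
From [29], pages 64-65, $\lim_{k \to \infty} \lambda_{\min}(\Sigma_{22}) = \operatorname{ess\,inf} f$ and $\lim_{k \to \infty} \lambda_{\max}(\Sigma_{22}) = \operatorname{ess\,sup} f$ (including the cases $\pm\infty$). Since $\operatorname{ess\,inf}_\lambda f(\lambda) > 0$, the lemma is proved.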
Lemma 2.5.4. Let $c(k) = \frac{1}{m k^{2H-1} f(n)}$, where $m$ is an appropriate constant independent of $k$ and $n$. Then $\|I - A\|_2 = \|I - c(k) \Sigma_{22}\|_2 < 1$.
Proof. On the one hand, $\|\Sigma_{22}\|_2 \le \sqrt{\|\Sigma_{22}\|_1 \|\Sigma_{22}\|_\infty} = O(k^{2H-1})$. On the other hand,
\[ \frac{B^T \Sigma_{22} B}{B^T B} \le \lambda_{\max}(\Sigma_{22}) = \|\Sigma_{22}\|_2 \]
and $\frac{B^T \Sigma_{22} B}{B^T B} = O(k^{2H-1})$. Hence $\|\Sigma_{22}\|_2 = O(k^{2H-1})$.
The eigenvalues of $I - A := I - c(k) \Sigma_{22}$ are $\{1 - c(k) \lambda_i(\Sigma_{22})\}$. By the choice of $c(k)$, both $|1 - c(k) \lambda_{\max}(\Sigma_{22})|$ and $|1 - c(k) \lambda_{\min}(\Sigma_{22})|$ can be made less than 1. Hence $\|I - c(k) \Sigma_{22}\|_2 < 1$.
With the preparation above, we can continue the estimate of $B^T \Sigma_{22}^{-1} B = c(k) \sum_{l=0}^\infty B^T (I - c(k) \Sigma_{22})^l B$.
Lemma 2.5.5. Suppose $\frac{3}{4} < H < 1$. Then
\[ \lim_{k \to \infty} B^T \Sigma_{22}^{-1}(k, k) B = 0. \]
It follows that $d_n^2 := \lim_{k \to \infty} \sigma^2(n, k) = O(n^{2H})$.
Proof. We first derive an iterative equation, which will be used frequently. For any $k$-dimensional column vector $V = (V(1), V(2), \cdots, V(k))^T$, define $V^{(l)} := (I - c(k) \Sigma_{22})^l V = (I - c(k) \Sigma_{22}) V^{(l-1)}$, $V^{(0)} = V$. Recall that $B^T = (B(1), B(2), \cdots, B(k))$. Then
\[ \begin{aligned} B^T V^{(l)} &= B^T (I - c(k) \Sigma_{22}) V^{(l-1)} \\ &= B^T V^{(l-1)} - c(k) B^T \Sigma_{22} V^{(l-1)} \\ &= \sum_{s=1}^k B(s) V^{(l-1)}(s) - c(k) \sum_{s=1}^k B(s) \sum_{i=1}^k V^{(l-1)}(i) b(i - s) \\ &= \sum_{s=1}^k B(s) V^{(l-1)}(s) - c(k) \sum_{i=1}^k B(i) \sum_{s=1}^k V^{(l-1)}(s) b(i - s) \\ &= \sum_{s=1}^k V^{(l-1)}(s) \left( B(s) - c(k) \sum_{i=1}^k B(i) b(i - s) \right) \\ &= \sum_{s=1}^k V^{(l-1)}(s) B(s) \left( 1 - c(k) \sum_{i=1}^k \frac{B(i)}{B(s)} b(i - s) \right). \end{aligned} \]
So we get an iterative equation for any vector $V$:
\[ \sum_{s=1}^k B(s) V^{(l)}(s) = \sum_{s=1}^k B(s) V^{(l-1)}(s) \left( 1 - q(s) \right), \quad l \ge 1, \tag{2.7} \]
where
\[ q(s) = c(k) \sum_{i=1}^k \frac{B(i)}{B(s)} b(i - s) = c(k) \left( \sum_{1 \le i \le k,\ i \ne s} \frac{B(i)}{B(s)} b(i - s) + 1 \right). \tag{2.8} \]
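Identity (2.7) is purely algebraic, so it can be verified numerically for any positive constant in place of $c(k)$. The sketch below assumes the fGn autocovariance of Theorem 2.5.1 and uses a hypothetical stand-in value `c = 0.2` rather than the $c(k)$ of Lemma 2.5.4; the function names and the parameters $n = 4$, $k = 6$, $H = 0.8$ are illustrative.

```python
def fgn_autocov(n, H):
    """b(n) = ((n+1)^{2H} - 2 n^{2H} + |n-1|^{2H}) / 2."""
    n = abs(n)
    return 0.5 * ((n + 1) ** (2 * H) - 2 * n ** (2 * H) + abs(n - 1) ** (2 * H))


def check_iteration(n, k, H, c, steps=3):
    """Return |lhs - rhs| of equation (2.7) for l = 1, ..., steps, starting from V = B."""
    B = [sum(fgn_autocov(i, H) for i in range(s, n + s)) for s in range(1, k + 1)]
    sigma = [[fgn_autocov(i - j, H) for j in range(k)] for i in range(k)]
    # q(s) = c * sum_i B(i) b(i - s) / B(s), as in (2.8)
    q = [c * sum(B[i] * sigma[i][s] for i in range(k)) / B[s] for s in range(k)]
    V = B[:]
    gaps = []
    for _ in range(steps):
        # V^{(l)} = (I - c Sigma22) V^{(l-1)}
        V_next = [V[i] - c * sum(sigma[i][j] * V[j] for j in range(k)) for i in range(k)]
        lhs = sum(B[s] * V_next[s] for s in range(k))
        rhs = sum(B[s] * V[s] * (1.0 - q[s]) for s in range(k))
        gaps.append(abs(lhs - rhs))
        V = V_next
    return gaps


gaps = check_iteration(n=4, k=6, H=0.8, c=0.2)
print(gaps)
```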
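By Lemma 2.5.2 and $b(t) = \binom{2H}{2} |t|^{2H-2} \left( 1 + \frac{\binom{2H}{4}}{\binom{2H}{2}} \frac{1}{t^2} + \cdots \right)$ for $t \ge 1$,
\[ \begin{aligned} q(s) &\ge c(k) \left( \int_2^{s-2} \frac{B(i)}{B(s)} b(s - i) \, di + \int_{s+2}^{k-1} \frac{B(i)}{B(s)} b(s - i) \, di \right) \\ &\ge c(k) C \left( \int_2^{s-2} \frac{(i + n)^{2H-2}}{(s + n)^{2H-2}} (s - i)^{2H-2} \, di + \int_{s+2}^{k-1} \frac{(i + n)^{2H-2}}{(s + n)^{2H-2}} (i - s)^{2H-2} \, di \right) \\ &= \frac{C}{f(n) m} \left( \frac{n + s}{k} \right)^{2-2H} \left( \int_{2/k}^{(s-2)/k} \left( x + \frac{n}{k} \right)^{2H-2} \left( \frac{s}{k} - x \right)^{2H-2} dx + \int_{(s+2)/k}^{(k-1)/k} \left( x + \frac{n}{k} \right)^{2H-2} \left( x - \frac{s}{k} \right)^{2H-2} dx \right) \\ &= \frac{1}{f(n) m} \left( \frac{n + s}{k} \right)^{2-2H} I(k, s, n). \end{aligned} \]
The integral $I(k, s, n)$ is bounded below by some constant $K$ independent of $k$, $s$ and $n$. For all $s$ satisfying
\[ (s + n)^{2-2H} \ge f(n) n^{2-2H} k^{\gamma(2-2H)}, \tag{2.9} \]
where $\gamma \in [0, 1]$, that is
\[ s \ge s^* := f(n)^{1/(2-2H)} n k^\gamma - n, \]
we have
\[ q(s) \ge q := \frac{n^{2-2H}}{k^{(2-2H)(1-\gamma)}} \frac{K}{m}, \quad \text{when } s \ge s^*. \]
In the iterative equation (2.7), set $V(s) = B(s)$, $s = 1, 2, \cdots, k$, so that $V^{(l)} = (I - c(k) \Sigma_{22})^l B = (I - c(k) \Sigma_{22}) V^{(l-1)}$. Then one has
\[ \begin{aligned} \sum_{s=1}^k B(s) V^{(l)}(s) &= \sum_{s=1}^k \left( 1 - q(s) \right) B(s) V^{(l-1)}(s) \\ &\le \sum_{s=1}^k (1 - q) B(s) V^{(l-1)}(s) + \sum_{s < s^*} (q - q(s)) B(s) V^{(l-1)}(s). \end{aligned} \]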
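The idea is to incorporate the second term (when $s < s^*$) into the first one. For any $\varepsilon > 0$, define
\[ L^* = \inf\left\{ l : \sum_{s=1}^{s^*-1} B(s) V^{(l)}(s) \ge \varepsilon \sum_{s=1}^k B(s) V^{(l)}(s) \right\}. \tag{2.10} \]
If $L^* = \infty$, then $\sum_{s=1}^{s^*-1} B(s) V^{(l)}(s) < \varepsilon \sum_{s=1}^k B(s) V^{(l)}(s)$ for all $l$. It follows that for all $l \ge 1$,
\[ \begin{aligned} \sum_{s=1}^k B(s) V^{(l)}(s) &\le (1 - q) \sum_{s=1}^k B(s) V^{(l-1)}(s) + \varepsilon \left( q - \min_s q(s) \right) \sum_{s=1}^k B(s) V^{(l-1)}(s) \\ &= (1 - q^*) \sum_{s=1}^k B(s) V^{(l-1)}(s) \\ &\le (1 - q^*)^l \sum_{s=1}^k B(s) V^{(0)}(s), \end{aligned} \]
where $q^* = q - \varepsilon (q - \min_s q(s))$. By Lemma 2.5.2,
\[ \begin{aligned} c(k) \sum_{l=1}^\infty \sum_{s=1}^k B(s) V^{(l)}(s) &\le c(k) \sum_{l=1}^\infty (1 - q^*)^l \sum_{s=1}^k B(s)^2 \\ &\le c(k) \frac{C}{q^*} n^2 (k + n)^{4H-3} \\ &\le c(k) \frac{C}{(1 - \varepsilon) q} n^2 (k + n)^{4H-3} \\ &= C \frac{n^{2H}}{f(n)} k^{(2-2H)(-\gamma)}. \end{aligned} \]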
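When $l = 0$,
\[ c(k) \sum_{s=1}^k B(s) V^{(l)}(s) = c(k) \sum_{s=1}^k B^2(s) \le c(k) C \left( n^2 (k + n)^{4H-3} \right) \le C \left( k^{-(2-2H)} n^2 \right). \]
Hence,
\[ B^T \Sigma_{22}^{-1} B = c(k) \sum_{l=0}^\infty B^T V^{(l)} = O(k^{-\gamma(2-2H)}), \quad \text{as } k \to \infty. \]
Hence if $L^* = \infty$, $\lim_{k \to \infty} B^T \Sigma_{22}^{-1} B = 0$.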
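If $L^* < \infty$, then for $l < L^*$,
\[ \sum_{s=1}^{s^*-1} B(s) V^{(l)}(s) < \varepsilon \sum_{s=1}^k B(s) V^{(l)}(s), \]
and
\[ \sum_{s=1}^{s^*-1} B(s) V^{(L^*)}(s) \ge \varepsilon \sum_{s=1}^k B(s) V^{(L^*)}(s). \tag{2.11} \]
In this case, we split $\sum_{l=1}^\infty B^T V^{(l)}$ into two parts:
\[ \sum_{l=1}^\infty \sum_{s=1}^k B(s) V^{(l)}(s) = \sum_{l=1}^{L^*-1} \sum_{s=1}^k B(s) V^{(l)}(s) + \sum_{l=L^*}^\infty \sum_{s=1}^k B(s) V^{(l)}(s). \]
The first term can be handled in the same way as the case $L^* = \infty$:
\[ \begin{aligned} c(k) \sum_{l=1}^{L^*-1} \sum_{s=1}^k B(s) V^{(l)}(s) &\le c(k) \sum_{l=1}^{L^*-1} (1 - q^*)^l \sum_{s=1}^k B^2(s) && (2.12) \\ &\le c(k) \frac{1}{q^*} \sum_{s=1}^k B^2(s) && (2.13) \\ &\le C \frac{n^{2H}}{f(n)} k^{(2-2H)(-\gamma)}. && (2.14) \end{aligned} \]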
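For the second term, set $V$ in the iteration equation (2.7) to be $V_{new} := (I - c(k) \Sigma_{22})^{L^*} B = V^{(L^*)}$. Then, by the change of variable $d = l - L^*$, one has
\[ \begin{aligned} \sum_{l=L^*}^\infty B^T V^{(l)} &= \sum_{l=L^*}^\infty B^T (I - c(k) \Sigma_{22})^{l-L^*} V_{new} \\ &= \sum_{l=L^*}^\infty B^T V_{new}^{(l-L^*)} \\ &= \sum_{d=0}^\infty \sum_{s=1}^k B(s) V_{new}^{(d)}(s) \\ &= \sum_{s=1}^k B(s) V_{new}^{(0)}(s) + \sum_{d=1}^\infty \sum_{s=1}^k (1 - q(s)) B(s) V_{new}^{(d-1)}(s) \\ &\le \sum_{d=0}^\infty \left( 1 - \min_s q(s) \right)^d \sum_{s=1}^k B(s) V^{(L^*)}(s) \\ &\le \sum_{d=0}^\infty \left( 1 - \min_s q(s) \right)^d \cdot \frac{1}{\varepsilon} \sum_{s=1}^{s^*-1} B(s) V^{(L^*)}(s) \quad \text{by (2.11)} \\ &\le \frac{1}{\varepsilon \min_s q(s)} \sum_{s=1}^{s^*-1} B(s)^2 \\ &\le \frac{C}{\min_s q(s)} n^2 k^{\gamma(4H-3)}. \end{aligned} \]
Since $q(s) \ge \frac{1}{f(n) m} \left( \frac{n+s}{k} \right)^{2-2H} K$, we have $\min_s q(s) \ge C \frac{1}{k^{2-2H}}$, and therefore
\[ c(k) \sum_{l=L^*}^\infty B^T V^{(l)} \le c(k) \frac{C}{\min_s q(s)} n^2 k^{\gamma(4H-3)} = O(k^{-(1-\gamma)(4H-3)}). \tag{2.15} \]
Combining (2.12) and (2.15), one has $B^T \Sigma_{22}^{-1} B \le C \left( k^{-(1-\gamma)(4H-3)} + k^{-\gamma(2-2H)} \right)$. Since $H \in (\frac{3}{4}, 1)$, $\lim_{k \to \infty} B^T \Sigma_{22}^{-1} B = 0$ follows.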
2.5.3 Estimate of the mean
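Lemma 2.5.6. Suppose $3/4 < H < 1$. Then almost surely,
\[ \lim_{k \to \infty} B^T \Sigma_{22}^{-1} X_{[2]} = 0, \tag{2.16} \]
where $X_{[2]} = (X_{n+1}, X_{n+2}, \cdots, X_{n+k})^T$.
Proof. The random variable $B^T \Sigma_{22}^{-1} X_{[2]}$ has a normal distribution. Its mean is 0 and its variance is $(B^T \Sigma_{22}^{-1}) \Sigma_{22} (\Sigma_{22}^{-1} B) = B^T \Sigma_{22}^{-1} B$. So for any $\varepsilon > 0$,
\[ \begin{aligned} \sum_{k=1}^\infty P\left( |B^T \Sigma_{22}^{-1}(k, k) X_{[2]}| > \varepsilon \right) &= \sum_{k=1}^\infty P\left( \frac{|B^T \Sigma_{22}^{-1} X_{[2]}|}{(B^T \Sigma_{22}^{-1} B)^{1/2}} > \frac{\varepsilon}{(B^T \Sigma_{22}^{-1} B)^{1/2}} \right) \\ &= \sum_{k=1}^\infty \frac{2}{\sqrt{2\pi}} \int_{\varepsilon / (B^T \Sigma_{22}^{-1} B)^{1/2}}^\infty e^{-x^2/2} \, dx \\ &\le \sum_{k=1}^\infty \frac{2}{\sqrt{2\pi}} \exp\left( -\frac{\left( \varepsilon / (B^T \Sigma_{22}^{-1} B)^{1/2} \right)^2}{2} \right) \\ &= \frac{2}{\sqrt{2\pi}} \sum_{k=1}^\infty \exp\left( -\frac{\varepsilon^2}{2} (B^T \Sigma_{22}^{-1} B)^{-1} \right). \end{aligned} \]
By the Borel-Cantelli Lemma, in order to prove that $B^T \Sigma_{22}^{-1} X_{[2]} \to 0$ as $k \to \infty$ almost surely, it is sufficient to prove that $\sum_{k=1}^\infty \exp\left( -\frac{\varepsilon^2}{2} (B^T \Sigma_{22}^{-1} B)^{-1} \right) < \infty$, which is true from the proof of Lemma 2.5.5.

Chapter 3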
Limiting Distributions of Local Times
There has been a great deal of research on the local time of random walks and Markov processes. It was proved that, under proper normalization, the distribution of the occupation time of Markov processes, and in particular of random walks, is the Mittag-Leffler distribution. We review these results in Sections 3.1 and 3.2. In this work, we explore a class of stationary processes with conditional local limit theorems, and it turns out that the limiting distribution of their local times is closely related to the Mittag-Leffler distribution.
3.1 Local times of random walks
In this section, we present some properties of the local times of random walks.
Throughout this section, let $\{X_i, i = 1, 2, \ldots\}$ be independent identically distributed (i.i.d.) random variables and let $S_n = \sum_{i=1}^n X_i$ be the random walk. Specifically, when $P(X_i = 1) = P(X_i = -1) = \frac{1}{2}$, it is called a simple symmetric random walk. The local time of the random walk $S_n$ at the level $x \in \mathbb{Z}$ at time $n$ is defined as $\ell(n, x) = \#\{i = 1, 2, \ldots, n : S_i = x\}$. It can be interpreted as the number of excursions away from $x$ completed before $n$. We also write $\ell_n$ for $\ell(n, 0)$.
Let $\rho_0(x) = 0$, $\rho_k(x) = \min\{j : j > \rho_{k-1}(x), S_j = x\}$. So $\rho_k(x)$ is the time at which the random walk hits $x$ for the $k$-th time. When $S_i$ is a simple symmetric random walk, $\{\rho_n(x) - \rho_{n-1}(x)\}$ is a sequence of i.i.d. random variables with
\[ P(\rho_1(x) > n) = 2^{-n} \sum_{j=0}^{x-1} \binom{n}{\lfloor (n - j)/2 \rfloor} \quad (x = 1, 2, \ldots). \]
We denote $\rho_k(0)$ by $\rho_k$ for short. Then
\[ P(\rho_1 = 2k) = 2^{-2k+1} k^{-1} \binom{2k - 2}{k - 1}, \quad k = 1, 2, \cdots \]
Also, since $P(\ell(n, x) = k) = P(\rho_k(x) \le n, \rho_{k+1}(x) > n)$, it follows that
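Theorem 3.1.1 (cf. e.g. [54], Theorem 9.4). Let $x > 0$. Then for any $k = 1, 2, \cdots$,
\[ P(\ell(n, x) = k) = \begin{cases} \dfrac{1}{2^{n-k+1}} \dbinom{n-k+1}{(n+x)/2}, & \text{if } n + x \text{ is even}, \\[2ex] \dfrac{1}{2^{n-k}} \dbinom{n-k}{(n+x-1)/2}, & \text{if } n + x \text{ is odd}. \end{cases} \]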
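For small $n$ the formula of Theorem 3.1.1 can be compared with an exhaustive enumeration of all $2^n$ paths of the simple symmetric random walk. The sketch below does this with exact rational arithmetic; the function names are illustrative, and the check is restricted to $1 \le k \le n$.

```python
from fractions import Fraction
from itertools import product
from math import comb


def local_time_pmf_formula(n, x, k):
    """P(l(n, x) = k) from Theorem 3.1.1, for x > 0 and 1 <= k <= n."""
    if (n + x) % 2 == 0:
        return Fraction(comb(n - k + 1, (n + x) // 2), 2 ** (n - k + 1))
    return Fraction(comb(n - k, (n + x - 1) // 2), 2 ** (n - k))


def local_time_distribution(n, x):
    """Exact distribution of l(n, x) obtained by enumerating all 2^n paths."""
    counts = {}
    for steps in product((1, -1), repeat=n):
        position, visits = 0, 0
        for step in steps:
            position += step
            if position == x:
                visits += 1
        counts[visits] = counts.get(visits, 0) + 1
    return {k: Fraction(c, 2 ** n) for k, c in counts.items()}


n, x = 5, 1
exact = local_time_distribution(n, x)
for k in range(1, n + 1):
    print(k, exact.get(k, Fraction(0)), local_time_pmf_formula(n, x, k))
```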
Together with Stirling’s formula, the limiting distribution of the local time can be found.
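Theorem 3.1.2 (cf. e.g. [54], Theorem 9.11).
\[ \lim_{k \to \infty} P\left( \frac{\rho_1(k)}{k^2} < x \right) = \lim_{k \to \infty} P\left( \frac{\rho_k}{k^2} < x \right) = \frac{1}{\sqrt{2\pi}} \int_0^x v^{-3/2} e^{-\frac{1}{2v}} \, dv, \]
\[ P(\rho_1 = 2k) = \frac{1}{2\sqrt{\pi}} k^{-3/2} \exp\left( \frac{\theta_k}{k} \right), \quad \text{where } |\theta_k| \le 1, \]
\[ P(\rho_1 \ge x) = \sqrt{\frac{2}{\pi x}} \left( 1 + O\left( \frac{1}{x} \right) \right) \quad (x \to \infty), \]
and
\[ \lim_{n \to \infty} P(\ell_n / \sqrt{n} < x) = \lim_{n \to \infty} P(\ell(n, z) / \sqrt{n} < x) = \sqrt{\frac{2}{\pi}} \int_0^x e^{-u^2/2} \, du, \quad z \in \mathbb{Z}. \]
The right hand side is the distribution function of the absolute value of a standard normal random variable.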
In 1949, Chung and Hunt [21] also introduced the waiting times $\rho_1, \rho_2, \cdots$. By considering the moment generating function of $\rho_1$, they gave the asymptotic distribution of $\ell_n$.
Theorem 3.1.3 ([21]).
\[ P(\ell_{2n} \ge r) = (2/\pi)^{1/2} \int_t^\infty e^{-u^2/2} \, du + \frac{\varepsilon}{6n}, \]
with $t = 2r(2n + 2/3)^{-1/2}$, $|\varepsilon| < 1$, and $n$ large enough. It follows that as $n \to \infty$, the distribution of $\frac{\rho_n}{4n^2}$ approaches the stable distribution whose characteristic function is
\[ \exp\left\{ -\frac{1}{2}(1 - i \operatorname{sgn} t)|t|^{1/2} \right\}. \]
At the same time the distribution goes to that of $1/Y^2$, where $Y \sim N(0, 1)$. Also, the distribution of $\frac{\ell_n}{\sqrt{n}}$ tends to that of $|Y|$.
Corollary 3.1.4. For every fixed $t$, the distribution of $\frac{\ell_{[nt]}}{\sqrt{n}}$ converges to the distribution of the local time $L(t)$ of the standard Brownian motion $B(t)$ at level zero, i.e. $\frac{\ell_{[nt]}}{\sqrt{n}} \to L(t)$ in distribution.
Chung and Hunt (1949) were also pioneers in establishing the first law of the iterated logarithm (LIL) for the local time of simple symmetric random walks: almost surely, for any $\varepsilon > 0$ there exists $N_0(\omega)$ such that
\[ \frac{\ell_n}{\sqrt{n}} < \left( (2 + \varepsilon) \log\log n \right)^{1/2} \quad \forall\, n > N_0(\omega). \]
This result can be strengthened. In function space, it holds in the sense of weak convergence. Refer to Theorem 1.0.5, Borodin's work [14] and Perkins' work [49] on the weak convergence, or convergence with probability 1, of the local times of general random walks to the Brownian local time.
The study of convergence in distribution is not limited to simple symmetric random walks. A. Aleškevičienė [8], [9] considered aperiodic recurrent random walks, and she relaxed the moment requirement to finite variance only, $\sigma^2 = \mathrm{Var}\, X_1 < \infty$. Then for $r > 0$,
\[ \lim_{n \to \infty} P\left( \frac{\ell(n, x)}{\sigma^{-1}\sqrt{n}} \ge r \right) = \sqrt{\frac{2}{\pi}} \int_r^\infty e^{-u^2/2} \, du, \quad -\infty < x < \infty,\ x \in \mathbb{Z}. \]
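If in addition $\beta_3 = E[|X_1|^3] < \infty$, then for $r > 0$,
\[ \left| P\left( \frac{\ell(n, x)}{\sigma^{-1}\sqrt{n}} \ge r \right) - \sqrt{\frac{2}{\pi}} \int_r^\infty e^{-u^2/2} \, du \right| \le c \left( \frac{1 + |x|}{r\sqrt{n}} + \frac{1 + |x|^{2+\varepsilon}}{r^2 n} \right), \quad x \in \mathbb{Z},\ \varepsilon > 0, \tag{3.1} \]
where $\varepsilon$ is an arbitrarily small positive constant. The approximation is good only when $|x| \le \varepsilon_n \sqrt{n}$, where $\varepsilon_n \to 0$ sufficiently fast. In 1984, she improved (3.1) and proved that if $\beta_3 < \infty$, then
\[ P(\ell(n, x) = 0) = G\left( \frac{|x|}{\sigma\sqrt{n}} \right) + O(n^{-1/6}), \]
\[ P\left( \frac{\ell(n, x)}{\sigma^{-1}\sqrt{n}} < r \right) = G\left( \frac{|x|}{\sigma\sqrt{n}} \right) + \sqrt{\frac{2}{\pi}} \int_0^r e^{-\frac{1}{2}\left( \frac{|x|}{\sigma\sqrt{n}} + u \right)^2} du + O(n^{-1/6}), \]
where $G(x) = \sqrt{\frac{2}{\pi}} \int_0^x e^{-u^2/2} \, du$, $r > 0$.
Theorem 3.1.5 ([9]). In the same setting, if in addition $E[\ell(n, x)^3] < \infty$, then for any integer $l \ge 1$,
\[ E\left( \frac{\ell(n, x)}{\sigma^{-1}\sqrt{n}} \right)^l = \sqrt{\frac{2}{\pi}} \int_0^\infty e^{-\frac{(z+y)^2}{2}} y^l \, dy + \sigma^l n^{-l/2}\, l!\, R_{l,n}(x), \quad z = \frac{|x|}{\sigma\sqrt{n}}. \]
For $R_{l,n}(x)$ one has
\[ \sigma^l n^{-l/2} |R_{l,n}(x)| \le \frac{c_0 \ln n}{\sqrt{2\pi n}} \left( 1 + \frac{c_0}{\sqrt{2\pi n}} \right)^l, \]
\[ \sigma^l n^{-l/2} |R_{l,n}(x)| \le \frac{1}{l!} \frac{c_0 \ln n}{\sqrt{2\pi n}} \int_0^\infty \left( y + \frac{c_0 l}{\sqrt{2\pi n}} \right)^l y^3 e^{-y^2/2} \, dy, \]
where $c_0$ is a constant.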
For general random walks, Jain and Pruitt (1984, [32]) obtained some LIL results for local times of recurrent random walks under a very general condition, using the absolute potential kernel. We describe the details below.
Let $X_n$ be integer-valued i.i.d. random variables with mean 0 and with the same distribution $F$. For $x > 0$, define
\[ G(x) = P(|X| > x), \quad K(x) = \frac{1}{x^2} \int_{|y| \le x} y^2 \, dF(y), \]
and
\[ Q(x) = G(x) + K(x) = E\left( x^{-1}|X| \wedge 1 \right)^2. \]
Assume that
\[ \limsup_{x \to \infty} \frac{G(x)}{K(x)} < 1, \tag{3.2} \]
which implies that $E[|X|] < \infty$. Together with $E[X] = 0$, the random walk is recurrent. The quantity on the left of (3.2) was introduced by Feller (1966) to describe the compactness and convergence of the normalized random walks. If $X$ is in the domain of attraction of a stable law of index $\alpha$, then
\[ \limsup_{x \to \infty} \frac{G(x)}{K(x)} = \frac{2 - \alpha}{\alpha}. \]
So Jain and Pruitt's results include all cases when $X$ is in the domain of attraction of a stable law of index $\alpha > 1$ and of zero mean. The class of distributions described by (3.2) is much larger than this. Jain and Pruitt also pointed out that condition (3.2) excludes the case when the local time has a slowly varying rate of increase. The function $Q$ is continuous and strictly decreasing for $x$ large enough. Thus one can define $a_y$ by $Q(a_y) = 1/y$ for $y > y_0 = 1/Q(1)$ and $a_y = 1$ for $y \in [0, y_0]$.
Let $c_n = a_{n/\log\log n}$, which is the normalizing coefficient.
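Theorem 3.1.6 ([32]). There exists $\theta_1 \in (0, \infty)$ such that for all $x \in \mathbb{Z}$,
\[ \limsup_{n \to \infty} \frac{c_n}{n} \ell(n, x) = \theta_1 \quad \text{a.s.} \]
Theorem 3.1.7 ([32]). Given $\varepsilon > 0$, there exists $\delta > 0$ such that
\[ \limsup_{n \to \infty} \sup_{|x - y| \le c_n \delta} (c_n / n) |\ell(n, x) - \ell(n, y)| < \varepsilon \quad \text{a.s.} \]
Theorem 3.1.8 ([32]). There exists $\theta_2 \in (0, \infty)$ such that
\[ \limsup_{n \to \infty} \sup_x (c_n / n) \ell(n, x) = \theta_2 \quad \text{a.s.} \]
Remark 3.1.9. If $X$ is in the domain of attraction of a stable law, then $\theta_1 = \theta_2$.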
In all, in this section we reviewed the exact and asymptotic distributions of local times of recurrent random walks. Those questions will be answered as well for the local times of stationary processes in our work.
3.2 Occupation times of Markov chains
In this section, we revisit the occupation time of Markov chains and give the formal statement of the results. Let $X(t)$ be a Markov process with stationary transitions taking values in an abstract space, and let $V(x)$ be a non-negative function on that space. Darling and Kac [24] studied the limit of $\frac{1}{u(t)} \int_0^t V(X(s)) \, ds$, where $u(t)$ is a suitable normalization. If $V(x)$ is the indicator function of a set, then $\int_0^t V(X(s)) \, ds$ is the occupation time of the set. They proved that under suitable conditions, the limiting distribution must be the Mittag-Leffler distribution of some index. Karamata's Tauberian theorem plays a key role in the proof. Darling and Kac's method is applicable to Markov chains, and in particular to sums of i.i.d. random variables.
Suppose the transition probability of the Markov process is $P(x, E; t)$ with initial state $X(0) = x_0$. Its Laplace transform is $p_s(x, E) = \int_0^\infty P(x, E; t) e^{-st} \, dt$, which defines a measure. Suppose $P$ and $V$ satisfy the "Darling-Kac condition" (refer to (1.2) in the Introduction and Overview). Next we review some definitions which will be used later.
Definition 3.2.1 (Slowly varying functions). A positive function $h(x)$, defined for $x > 0$, is slowly varying (at infinity) if for all $t > 0$,
\[ \lim_{x \to \infty} \frac{h(tx)}{h(x)} = 1. \]
Definition 3.2.2 (Regularly varying). A measurable function $f : \mathbb{R}_+ \to \mathbb{R}_+$ is regularly varying at infinity if
\[ \lim_{x \to \infty} \frac{f(xy)}{f(x)} = y^\alpha \quad (y > 0). \]
The number $\alpha$ is called the index of regular variation (of $f$) and $f$ is said to be regularly varying at infinity with index $\alpha$. Similarly, a measurable function $f : \mathbb{R}_+ \to \mathbb{R}_+$ is regularly varying at 0 with index $\alpha$ if
\[ \lim_{x \to 0} \frac{f(xy)}{f(x)} = y^\alpha \quad (y > 0). \]
Theorem 3.2.3 (Karamata's Tauberian theorem). Suppose that $u : \mathbb{R}_+ \to \mathbb{R}_+$ is measurable and locally integrable. Let
\[ a(x) := \int_0^x u(t) \, dt, \quad \bar{u}(p) := \int_0^\infty e^{-pt} u(t) \, dt. \]
Then

1. the following are equivalent:
(a) $x \mapsto a(x)$ is regularly varying at $\infty$.
(b) $p \mapsto \bar{u}(p)$ is regularly varying at 0.
(c) $\lim_{p \to 0} \frac{\bar{u}(p)}{a(1/p)}$ exists and the limit is a positive real number.
In this case, $\bar{u}(p) \sim \Gamma(1 + \alpha)\, a\!\left( \frac{1}{p} \right)$ as $p \to 0$, where $\alpha \ge 0$ is the (mutual) index of regular variation.

2. Suppose that $\alpha > 0$.
(a) If $u$ is regularly varying at $\infty$ with index $\alpha - 1$, then $a(t) \sim \frac{t u(t)}{\alpha}$ as $t \to \infty$.
(b) Conversely, if $a$ is regularly varying at $\infty$ with index $\alpha$, and $u$ is monotone, then $u(t) \sim \frac{\alpha a(t)}{t}$ as $t \to \infty$.

Definition 3.2.4 (Mittag-Leffler distribution of order $\alpha$). Let $\alpha \in [0, 1]$. The random variable $Y_\alpha$ on $\mathbb{R}_+$ has the normalized Mittag-Leffler distribution of order $\alpha$ if
\[ E(e^{z Y_\alpha}) = \sum_{p=0}^\infty \frac{\Gamma(1 + \alpha)^p z^p}{\Gamma(1 + p\alpha)}. \]
The probability density function is
\[ f_{Y_\alpha}(x) = \frac{1}{\pi\alpha} \sum_{j=1}^\infty \frac{(-1)^{j-1}}{j!} \sin(\pi\alpha j) \Gamma(\alpha j + 1) x^{j-1}. \]
The cumulative distribution function is
\[ F_{Y_\alpha}(x) = \frac{1}{\pi\alpha} \sum_{j=1}^\infty \frac{(-1)^{j-1}}{j\, j!} \sin(\pi\alpha j) \Gamma(\alpha j + 1) x^j. \]
Note that $E[Y_\alpha] = 1$, $Y_1 = 1$, and the density functions of $Y_0$ and $Y_{1/2}$ are given by
\[ f_{Y_0}(y) = e^{-y} \]
and
\[ f_{Y_{1/2}}(y) = \frac{2}{\pi} e^{-\frac{y^2}{\pi}}. \]
Note that $f_{Y_{1/2}}(y)$ is the density function of the absolute value of a normally distributed random variable.
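The special cases $\alpha = 0$, $\alpha = 1/2$ and $\alpha = 1$ can be cross-checked through moments. Reading off the coefficient of $z^p/p!$ in the series for $E(e^{zY_\alpha})$ gives $E[Y_\alpha^p] = p!\,\Gamma(1+\alpha)^p / \Gamma(1 + p\alpha)$. The sketch below compares this with $p!$ (the moments of the density $e^{-y}$), with the moments $\pi^{p/2}\,\Gamma((p+1)/2)/\sqrt{\pi}$ of the density $\frac{2}{\pi}e^{-y^2/\pi}$ on $y > 0$ (obtained by a direct substitution, stated here as an assumption of the example), and with the constant 1. Function names are illustrative.

```python
import math


def ml_moment(alpha, p):
    """p-th moment of the normalized Mittag-Leffler law of order alpha,
    read off from the moment generating function in Definition 3.2.4."""
    return math.factorial(p) * math.gamma(1 + alpha) ** p / math.gamma(1 + p * alpha)


def half_normal_moment(p):
    """p-th moment of the density (2/pi) * exp(-y^2/pi) on y > 0."""
    return math.pi ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)


for p in range(1, 6):
    print(p,
          ml_moment(0.0, p), math.factorial(p),       # alpha = 0: exponential law
          ml_moment(0.5, p), half_normal_moment(p),   # alpha = 1/2: half-normal law
          ml_moment(1.0, p))                          # alpha = 1: constant 1
```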
The idea of Darling and Kac's method is first to estimate the Laplace transform of the $k$th moment of the occupation time in terms of a slowly varying function. Then, by Karamata's Tauberian theorem, the $k$th moment itself can be represented, and it is consistent with the $k$th moment of the Mittag-Leffler distribution. To illustrate it, we take Brownian motion $W(t)$ as an example. Suppose the initial value of the Brownian motion is 0. Let $V$ be as before. The second moment is
\[ \begin{aligned} \mu_2(t) &= E\left( \int_0^t V(W(s)) \, ds \right)^2 \\ &= 2 \int_0^t \int_0^{s_2} \iint V(x_1) V(x_2) P(0 \mid x_1; s_1) P(x_1 \mid x_2; s_2 - s_1) \, dx_1 \, dx_2 \, ds_1 \, ds_2, \end{aligned} \]
where $P(x \mid y; t)$ is the transition density of the Brownian motion starting at $x$ and arriving at $y$ after time $t$: $P(x \mid y; t) = \frac{1}{2\pi t} \exp\left( -\frac{\|x - y\|^2}{2t} \right)$. Its Laplace transform is
\[ \int_0^\infty e^{-st} P(x \mid y; t) \, dt = \frac{1}{2\pi} \int_0^\infty \frac{1}{t} e^{-st - \|x - y\|^2 / 2t} \, dt = \frac{1}{\pi} K_0\left( \sqrt{2s} \cdot \|x - y\| \right), \]
where $\frac{1}{\pi} K_0\left( \sqrt{2s} \cdot \|x - y\| \right) = \frac{1}{2\pi} \log(1/s) - \frac{1}{\pi} \log \|x - y\| + O(1)$ as $s \to 0$. There are two parts: the infinite part is $\frac{1}{2\pi} \log(1/s)$, which is later called $h(s)$, and the potential part is $\frac{1}{\pi} \log \|x - y\|$. Next the Laplace transform of the second moment is introduced. One has that as $u \to 0$,
\[ u \int_0^\infty e^{-ut} \mu_2(t) \, dt \sim \frac{2!}{(2\pi)^2} \left( \int_{-\infty}^\infty V(x) \, dx \right)^2 (\log 1/u)^2. \]
Similarly, as $u \to 0$,
\[ u \int_0^\infty e^{-ut} \mu_k(t) \, dt \sim \frac{k!}{(2\pi)^k} \left( \int_{-\infty}^\infty V(x) \, dx \right)^k (\log 1/u)^k. \tag{3.3} \]
The fact that $\log 1/s$ is slowly varying enables one to use Karamata's Tauberian theorem. It follows that
\[ \mu_k(t) \sim \frac{k!}{(2\pi)^k} \left( \int_{-\infty}^\infty V(x) \, dx \right)^k (\log t)^k, \quad \text{as } t \to \infty. \]
Since the moments of $X \sim \exp(\lambda)$, for $n = 1, 2, \ldots$, are given by $E(X^n) = \frac{n!}{\lambda^n}$, we get
\[ \lim_{t \to \infty} P\left( \frac{2\pi}{C \log t} \int_0^t V(x(s)) \, ds < \alpha \right) = 1 - e^{-\alpha}. \]
For the case of Markov processes, the Darling-Kac’s condition (1.1) ensures the convergence of the Laplace transform of the kth moments. However, in order to apply Karamata’s Tauberian theorem, some restrictions on the normalization u(t) for the occupation time have to be made.
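Theorem 3.2.5 ([24]). If
\[ h(s) = \frac{L(1/s)}{s^\alpha} \]
with $0 \le \alpha < 1$, where $L(1/s)$ is slowly varying as $s \to 0$, then
\[ \begin{aligned} \lim_{t \to \infty} P\left( \frac{1}{C h(1/t)} \int_0^t V(x(\tau)) \, d\tau < x \right) &= \frac{1}{\pi\alpha} \int_0^x \sum_{j=1}^\infty \frac{(-1)^{j-1}}{j!} \sin(\pi\alpha j) \Gamma(\alpha j + 1) y^{j-1} \, dy \\ &=: G_\alpha(x), \end{aligned} \]
and
\[ \lim_{t \to \infty} E\left( \frac{\int_0^t V(x(s)) \, ds}{C h(1/t)} \right)^k = \frac{k!}{\Gamma(\alpha k + 1)}. \]
The right hand side is known to be the $k$th moment of the Mittag-Leffler distribution $G_\alpha(x)$. The interesting part is that the converse of the theorem is also true.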
Theorem 3.2.6 ([24]). If $X(t)$ and $V(x)$ satisfy the Darling-Kac condition (1.1) and if in addition for some $u(t) > 0$,
\[ \lim_{t \to \infty} P\left( \frac{1}{u(t)} \int_0^t V(X(s)) \, ds < x \right) = G(x), \]
where $G(x)$ is a nondegenerate distribution, then
\[ h(s) = L(1/s) / s^\alpha \]
for some $\alpha \in [0, 1)$ and slowly varying function $L(1/s)$. Hence $G(x) = G_\alpha(x/b)$ for some appropriate constant $b$.
When the method is applied to Markov chains, the Laplace transform is replaced by the generating function. Let $X_n$ be a Markov chain with transition probabilities $P_n(x, E) = P(X_{n+k} \in E \mid X_k = x)$. Define
\[ p_z(x, E) = \sum_{n=0}^\infty P_{n+1}(x, E) z^n \]
with $0 \le z < 1$, and let $V(x)$ be a non-negative, measurable function. Suppose the "Darling-Kac condition" holds: there exist a function $h(z) \to \infty$, $z \to 1$, and $C > 0$ such that
\[ \frac{1}{h(z)} \int p_z(x, dy) V(y) \to C, \quad z \to 1. \]
The convergence is uniform in $\{x \mid V(x) > 0\}$. Under these conditions, one has the following theorem.
Theorem 3.2.7 ([24]). A necessary and sufficient condition that for some normalizing sequence $u_n$ the limiting distribution of
\[ \frac{1}{u_n} \sum_{j=0}^n V(X_j) \]
exists and is nonsingular is that $h(z) = \frac{1}{(1 - z)^\alpha} L\left( \frac{1}{1 - z} \right)$ for some $\alpha$, $0 \le \alpha < 1$, where $L$ is slowly varying as $z \to 1$. If this condition on $h$ is satisfied, $u_n$ can be taken to be $C h(1 - 1/n)$ and the limiting distribution is then the Mittag-Leffler distribution.
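As a concrete illustration, consider the simple symmetric random walk with $x = 0$ and $E = \{0\}$. Using the classical identity $\sum_{m \ge 0} P(S_{2m} = 0) z^{2m} = (1 - z^2)^{-1/2}$, one obtains $p_z(0, \{0\}) = \left( (1 - z^2)^{-1/2} - 1 \right)/z$, which behaves like $(2(1-z))^{-1/2}$ as $z \to 1$, so this case corresponds to $\alpha = 1/2$. The sketch below is an illustrative check of that closed form against the truncated series; the function names and the stopping rule are assumptions of the example.

```python
def return_generating_function(z, tol=1e-16, max_terms=10 ** 6):
    """p_z(0, {0}) = sum_{n >= 0} P_{n+1}(0, {0}) z^n for the simple symmetric
    random walk, summed term by term for 0 < z < 1."""
    total = 0.0
    p_even = 1.0           # P(S_0 = 0)
    z_power = 1.0 / z      # becomes z^(2m-1) after the update in the loop
    for m in range(1, max_terms + 1):
        p_even *= (2 * m - 1) / (2 * m)    # P(S_{2m} = 0) = C(2m, m) / 4^m
        z_power *= z * z
        term = p_even * z_power            # only n + 1 = 2m contributes
        total += term
        if term < tol * total:
            break
    return total


def closed_form(z):
    """((1 - z^2)^{-1/2} - 1) / z."""
    return ((1.0 - z * z) ** -0.5 - 1.0) / z


for z in (0.5, 0.9, 0.99):
    print(z, return_generating_function(z), closed_form(z))

# Ratio to (2(1-z))^{-1/2}; it approaches 1 slowly as z -> 1.
ratios = [closed_form(z) * (2.0 * (1.0 - z)) ** 0.5 for z in (0.99, 0.9999)]
print(ratios)
```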
When the $X_n$ are i.i.d., suppose their common distribution function is $F(x)$ and the characteristic function is $\phi(t)$. Let $F^{(n)}(x)$ be the distribution function of $S_n$. Then
\[ p_z(x, E) = \lim_{T \to \infty} \frac{1}{2\pi} \int_{-T}^T \frac{\psi(t, x, E) \phi(t)}{1 - z\phi(t)} \, dt, \]
where $\psi(t, x, E) = \int_E e^{-it(y - x)} \, dy$. Suppose that either (1) $F(x)$ has an absolutely continuous component or (2) $F(x)$ has a lattice structure, and that $\phi(t) \sim 1 - |t|^\gamma$ as $t \to 0$. Then the "Darling-Kac condition" is met with
\[ h(z) = \frac{1}{\gamma \sin(\pi/\gamma)} \frac{1}{(1 - z)^{1 - 1/\gamma}}, \quad 1 < \gamma \le 2, \]
and
\[ h(z) = \frac{1}{\pi} \log \frac{1}{1 - z}, \quad \gamma = 1. \]
When $V$ is the characteristic function of a set $B$, in case (1) the constant $C$ in the "Darling-Kac condition" is the Lebesgue measure of $B$, while in case (2) $C$ is the number of lattice points in $B$.
3.3 Ergodic sums of infinite measure preserving transformation
To prepare for the study of the asymptotic distribution of stationary processes with a conditional local limit theorem, we recall the related concepts and results of infinite ergodic theory.
Definition 3.3.1 (Conservative measure). Given a dynamical system $(X, \mathcal{B}, T, m)$, we say that the measure $m$ or the transformation $T$ is conservative if $m(A) = 0$ whenever $A \in \mathcal{B}$ is such that $\{T^{-n} A\}_{n \ge 0}$ are pairwise disjoint. It means that given any measurable set $A$, almost all points of $A$ will eventually return to this set, i.e. $A \subset \bigcup_{n \ge 1} T^{-n} A$ (mod $m$) for all $A \in \mathcal{B}$ with $m(A) > 0$.
Proposition 3.3.2 (cf. e.g. [3]).
\[ \left[ \sum_{n=1}^\infty P_{T^n} f = \infty \right] = C(T) \mod m \quad \forall f \in L^1(m),\ f > 0. \]
Here $C(T) = X \setminus D(T)$ is the conservative part of $T$ and $D(T) = U(W(T))$ is the measurable union of the collection of measurable wandering sets $W(T)$ for $T$.
Definition 3.3.3 (Ergodic). Given a dynamical system $(X, \mathcal{B}, T, m)$, a measure preserving and non-singular transformation $T$ is called ergodic if $A \in \mathcal{B}$, $T^{-1} A = A$ implies $m(A) = 0$ or $m(A^c) = 0$.
Generally, for an invariant set $A$, since $T^{-1} A = A$, we have $T^{-1} A^c = A^c$. It implies that $TA = A$ and $TA^c = A^c$. So $(X, \mathcal{B}, T, m)$ can be broken up into two pieces, and ergodicity means that there is no non-trivial invariant set.
Definition 3.3.4 (A moment set). Let (X, B, m, T ) be a conservative, ergodic, measure preserving transformation. Let A ∈ B, 0 < m(A) < ∞ and set
\[
a_n(A) := \sum_{k=0}^{n-1} \frac{m(A \cap T^{-k}A)}{m(A)^2}, \qquad
u_A(\lambda) := \sum_{k=0}^{\infty} e^{-\lambda k}\, \frac{m(A \cap T^{-k}A)}{m(A)^2}. \tag{3.4}
\]
Denote $S_n^T(f)(x) := \sum_{k=0}^{n-1} f(T^k x)$. The set $A$ is called a moment set for $T$ if
\[
\sum_{n=0}^{\infty} e^{-\lambda n} \int_A S_n^T(\mathbf{1}_A)^p\, dm \;\sim\; p!\, m(A)^{p+1}\, \frac{u_A(\lambda)^p}{\lambda} \quad \text{as } \lambda \to 0, \ \forall p \in \mathbb{N}.
\]
Here $\sum_{n=0}^{\infty} e^{-\lambda n} \int_A S_n^T(\mathbf{1}_A)^p\, dm$ is similar to the Laplace transform of the $p$th moment. The moment set condition plays the same role as Darling and Kac’s condition and has a form similar to (3.3). With these preparatory concepts, we can discuss Darling and Kac’s theorem in an infinite measure space. Let $(X, \mathcal{B}, m, T)$ be a conservative, ergodic, measure preserving transformation of a σ-finite non-atomic infinite measure space, and let $f : X \to [0, \infty)$ be an $m$-integrable function with non-zero integral. Then, under certain conditions, $S_n^T(f)(x)$ converges to the Mittag-Leffler distribution after proper scaling.
Theorem 3.3.5 (Aaronson’s Darling-Kac theorem, [2]). Suppose A is a moment set for T , and that uA(λ) is regularly varying with index α ∈ [0, 1] as λ → 0. Then
\[
g\!\left( \frac{S_n^T(f)}{a_n(A)} \right) \to E[g(m(f) Y_\alpha)] \quad \text{weak* in } L^{\infty}(X)
\]
whenever $f \in L^1_+(m)$ and $g \in C([0, \infty])$. Here $Y_\alpha$ has the normalized Mittag-Leffler distribution with index α.
We rewrite this convergence in the following way:
\[
\frac{S_n^T}{a_n(A)} \xrightarrow{\ d\ } Y_\alpha.
\]
Similar to the Markov process case, the converse is also true. If A is a moment set for a conservative, ergodic, measure preserving transformation T and for some random variable Y on (0, ∞) and constants an → ∞,
\[
\frac{S_n^T}{a_n} \xrightarrow{\ d\ } Y,
\]
then $u_A(\lambda)$ is regularly varying as $\lambda \to 0$. A natural question arises: what is a sufficient condition for a set to be a moment set?
Definition 3.3.6 (Pointwise dual ergodic). A conservative ergodic measure preserving transformation (c.e.m.p.t.) $(X, \mathcal{B}, m, T)$ is called pointwise dual ergodic if there are constants $\{a_n\}$ such that
\[
\frac{1}{a_n} \sum_{k=0}^{n-1} P_{T^k} f \to \int_X f\, dm \quad \text{a.e. as } n \to \infty, \ \forall f \in L^1(X),
\]
where $P_{T^n}$ is the Perron-Frobenius operator defined in section 2.3. The sequence $\{a_n\}$ is called the return sequence.
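Theorem 3.3.7 ([2]). Suppose that $T$ is pointwise dual ergodic and that $A \in \mathcal{B}$ with positive measure satisfies
\[
\sup_n \Big\| \frac{1}{a_n(A)} \sum_{k=0}^{n-1} P_{T^k} \mathbf{1}_A \Big\|_{L^{\infty}(A)} \le M_A,
\]
where $a_n(A)$ is defined in (3.4). Then $A$ is a moment set for $T$.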
The following corollary is the generalization of the result on the distributional convergence of normalized occupation times of Markov processes of Darling and Kac.
Corollary 3.3.8 (Generalized Darling-Kac theorem, [2]). Suppose $(X, \mathcal{B}, m, T)$ is pointwise dual ergodic and that $a_n(T)$ is regularly varying with index $\alpha \in [0, 1]$. Then
\[
\frac{S_n^T}{a_n(T)} \xrightarrow{\ d\ } Y_\alpha,
\]
where $Y_\alpha$ has the Mittag-Leffler distribution with index α.
Remark 3.3.9. The scaling factor $a_n(A)$ in Theorem 3.3.7 is exactly the same as $a_n(T)$ in Corollary 3.3.8. The definition of $a_n(A)$ already includes the normalization by $m(A)^2$, so that asymptotically it does not depend on the set $A$.
In the next section, we will construct a pointwise dual ergodic dynamical system and apply the generalized “Darling-Kac” theorem in an infinite space.
3.4 Asymptotic distribution of the local times $\ell_n$ of stationary processes with conditional local limit theorems
In this section, we use the generalized Darling-Kac theorem in the infinite space to study the local times of certain stationary processes with conditional local limit theorems. To be precise, let $(\Omega, \mathcal{F}, P)$ be a probability space and let $\{X_n\}_{n=1}^{\infty}$ be an integer-valued stationary process with $E[X_n] = 0$. By stationarity, there exist a measure preserving transformation $T : \Omega \to \Omega$ and a random variable $\phi : \Omega \to \mathbb{Z}$ with $\phi \in L^1(P)$ and $\int_\Omega \phi\, dP = 0$ such that $X_n = \phi \circ T^{n-1}$ for all $n \ge 1$. As before, denote $S_n^T(f) := \sum_{k=0}^{n-1} f \circ T^k$ for any $f : \Omega \to \mathbb{Z}$. Specifically, $S_n := S_n^T(\phi)$. The local time of $\{X_n\}$ at time $n$ on the level $x \in \mathbb{Z}$ is defined as $\ell(n, x) = \sum_{k=1}^{n} \mathbf{1}_{\{S_k = x\}}$. Denote by $\ell_n$ the local time at 0 at time $n$ for short.
By expanding the probability space $(\Omega, \mathcal{F}, P)$ to a product space, the local times can be represented as ergodic sums in an infinite space. Define $\tilde T : \Omega \times \mathbb{Z} \to \Omega \times \mathbb{Z}$ by
\[
\tilde T(\omega, n) = (T\omega,\ n + \phi(\omega)), \tag{3.5}
\]
where φ is the same as above. By induction, one has
\[
\tilde T^k(\omega, n) = (T^k\omega,\ n + S_k(\omega)). \tag{3.6}
\]
Let $m_{\mathbb{Z}}$ be the counting measure on the integers $\mathbb{Z}$ and let $\mathcal{Z}$ be the Borel σ-algebra of $\mathbb{Z}$. A new dynamical system $(X, \mathcal{B}, \mu, \tilde T)$ can then be defined, where $X = \Omega \times \mathbb{Z}$, $\mathcal{B} = \mathcal{F} \otimes \mathcal{Z}$ and $\mu = P \otimes m_{\mathbb{Z}}$ is the product measure.
Let $A = \Omega \times \{0\}$, and define $S_n^{\tilde T}(f)(\omega, m) := \sum_{i=1}^{n} f \circ \tilde T^i(\omega, m)$. By (3.6), the local time of $\{S_n\}$ has the following representation:
\[
\ell_n(\omega) = \sum_{i=1}^{n} \mathbf{1}_{\{S_i(\omega) = 0\}} = \sum_{i=1}^{n} \mathbf{1}_A(\tilde T^i(\omega, 0)) = S_n^{\tilde T}(\mathbf{1}_A)(\omega, 0).
\]
In this section we study the connection between local times and local limit theorems.
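The representation of $\ell_n$ as an ergodic sum of the skew product can be checked on a finite sample path. The following is a minimal sketch under simplifying assumptions that are not part of the text: ω is modelled as a finite list of increments, $T$ is the left shift on that list, $\phi(\omega)$ is the first entry, and the increments are drawn i.i.d. from $\{-1, 0, 1\}$. The function names are hypothetical.

```python
from collections import deque
import random


def local_time_direct(steps):
    """l_n = #{1 <= i <= n : S_i = 0}, computed from the partial sums S_i."""
    partial_sum, visits = 0, 0
    for x in steps:
        partial_sum += x
        if partial_sum == 0:
            visits += 1
    return visits


def local_time_skew(steps):
    """The same count as an ergodic sum of 1_A along the orbit of (omega, 0).

    Assumption: omega is the finite list of increments, T is the left shift,
    phi(omega) is the first entry, and A = Omega x {0}.
    """
    omega, level, visits = deque(steps), 0, 0
    while omega:
        # One application of the skew product: (omega, m) -> (T omega, m + phi(omega)).
        level += omega.popleft()
        if level == 0:  # the orbit is in A = Omega x {0}
            visits += 1
    return visits


rng = random.Random(0)
sample = [rng.choice((-1, 0, 1)) for _ in range(1000)]
assert local_time_direct(sample) == local_time_skew(sample)
```

Both functions count the same visits; the second one only tracks the $\mathbb{Z}$-coordinate of the orbit, which is all the indicator of $A$ depends on.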
Definition 3.4.1. A centered integer-valued stationary process $\{X_n\}$ is said to have a conditional local limit theorem at 0 if there exist a constant $g(0)$ and a sequence $\{B_n\}$ of positive real numbers such that for all $x \in \mathbb{Z}$
\[
B_n\, P(S_n = x \mid (X_{n+1}, X_{n+2}, \ldots) = \cdot\,) \to g(0) \tag{3.7}
\]
almost surely.
The full formulation of the corresponding form of a local limit theorem goes back to Stone and reads in the conditional form (see [5]) that
\[
B_n\, P(S_n = k_n \mid (X_{n+1}, X_{n+2}, \ldots) = \cdot\,) \to g(\kappa) \quad \text{as } \frac{k_n - A_n}{B_n} \to \kappa, \quad P\text{-a.s.}
\]
for all $\kappa \in \mathbb{R}$, where $A_n$ is some centering constant. Definition 3.4.1 can be reformulated using the dual operator $P_T$ of the isometry $Uf = f \circ T$ on $L^{\infty}(P)$, operating on $L^1(P)$, where we take the $L^p$-spaces of $P$ restricted to the σ-field generated by all $X_n$, and $T$ is the shift $T(X_1, X_2, \ldots) = (X_2, X_3, \ldots)$. This operator is called the transfer operator. The local limit theorem at 0 then reads
\[
B_n\, P_T^n(\mathbf{1}_{\{S_n = x\}}) \to g(0) \quad \text{for all } x \in \mathbb{Z}, \ P\text{-a.s.} \tag{3.8}
\]
In this section, we assume that $\{X_n\}$ has the conditional local limit theorem at 0 as formulated in Definition 3.4.1. In addition, we assume that the convergence is uniform in $x$.
Remark 3.4.2. If the convergence is uniform for almost all ω and all sequences $\{k_n\}$, then taking expectations on both sides shows that $\{X_n\}$ has the local limit theorem. By [31], the $X_n$ are then in the domain of attraction of a stable law with some index $d$:
\[
\frac{S_n - A_n}{B_n} \xrightarrow{\ W\ } Z_d.
\]
The probability density function of $Z_d$ is $g$ as above; denote its cumulative distribution function by $G(x)$. Since $\int_\Omega \phi\, dP = 0$, we can (and will) assume that $A_n = 0$. It is necessary [31] that $\{B_n\}$ is regularly varying of order $\beta = 1/d$.
We will use Hurewicz’s Ergodic Theorem and Disintegration Theorem in the proof later, so we state them below.
Theorem 3.4.3 (Hurewicz’s Ergodic Theorem). Suppose that T is a conservative, non-singular transformation of the σ-finite measure space (X, B, m). Then
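\[
\frac{\sum_{k=1}^{n} P_{T^k} f(x)}{\sum_{k=1}^{n} P_{T^k} g(x)} \to h(f, g)(x) \quad \text{a.s.}
\]
for $x \in X$, $\forall f, g \in L^1(m)$, $g > 0$, where $h(f, g) \circ T = h(f, g)$ and $\int_X h(f, g)\, u g\, dm = \int_X u f\, dm$ for all $u \in L^{\infty}(m)$ with $u \circ T = u$. When $T$ is ergodic,
\[
h(f, g) = \frac{\int_X f\, dm}{\int_X g\, dm}.
\]
Since there is no obvious evidence that $(X, \mathcal{B}, \mu, \tilde T)$ is pointwise dual ergodic, we decompose it into ergodic components with the help of the following theorem.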
Theorem 3.4.4 (Disintegration theorem, cf. e.g. [3]). Let $(X, \mathcal{B}, m)$ and $(Y, \mathcal{C}, \mu)$ be standard probability spaces and suppose $\pi : X \to Y$ is measurable and $\mu = m \circ \pi^{-1}$. Then there exists $Y_0 \in \mathcal{C}$ such that $\mu(Y_0) = 1$ and there exists a measurable function $y \mapsto m_y$ such that $m_y(\pi^{-1}\{y\}) = 1$ for all $y \in Y_0$ and
\[
m(A \cap \pi^{-1}B) = \int_B m_y(A)\, d\mu(y) \qquad \forall A \in \mathcal{B},\ B \in \mathcal{C}.
\]
The measure $m_y$ is called the fibre measure over $\pi^{-1}\{y\}$.
With the assumption of the conditional local limit theorem 3.4.1, the lemma below is the key ingredient in finding the limiting distribution of local times.
Lemma 3.4.5. Suppose {Xn} has the conditional local limit theorem 3.4.1. Then the following holds.
1. T˜ defined in (3.5) is a conservative and measure preserving transformation of (X, B, µ).
2. Ergodic Decomposition: there exists a probability space $(Y, \mathcal{C}, \lambda)$ and a collection of measures $\{\mu_y : y \in Y\}$ on $(X, \mathcal{B})$ such that
(a) For $y \in Y$, $\tilde T$ is a conservative ergodic measure preserving transformation of $(X, \mathcal{B}, \mu_y)$.
(b) For $A \in \mathcal{B}$, the map $y \mapsto \mu_y(A)$ is measurable and
\[
\mu(A) = \int_Y \mu_y(A)\, d\lambda(y).
\]
3. λ-almost surely for $y$, $(X, \mathcal{B}, \mu_y, \tilde T)$ is pointwise dual ergodic.
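Proof. 1. For any $m \in \mathbb{Z}$, let $f : \Omega \times \mathbb{Z} \to \mathbb{R}$ be defined as $f(\omega, k) = (h \otimes \mathbf{1}_{\{m\}})(\omega, k) = h(\omega)\, \mathbf{1}_{\{m\}}(k)$. It can be proved that µ-almost surely for $(\omega, k) \in X$,
\[
P_{\tilde T^n}(h \otimes \mathbf{1}_{\{m\}})(\omega, k) = P_{T^n}\Big( h(\cdot)\, \mathbf{1}_{\{m\}}\big(k - S_n(\cdot)\big) \Big)(\omega). \tag{3.9}
\]
Indeed, for any $u \in L^{\infty}(X, \mu)$,
\begin{align*}
\int_X P_{\tilde T^n}(h \otimes \mathbf{1}_{\{m\}})(\omega, k)\, u(\omega, k)\, d\mu(\omega, k)
&= \int_{\Omega \times \mathbb{Z}} (h \otimes \mathbf{1}_{\{m\}})(\omega, k)\, u \circ \tilde T^n(\omega, k)\, d\mu(\omega, k) \\
&= \int_{\Omega \times \mathbb{Z}} (h \otimes \mathbf{1}_{\{m\}})(\omega, k)\, u(T^n\omega,\ k + S_n(\omega))\, d\mu(\omega, k) \\
&= \int_{\mathbb{Z}} \int_\Omega h(\omega)\, \mathbf{1}_{\{m\}}(k)\, u(T^n\omega,\ k + S_n(\omega))\, dP(\omega)\, dm_{\mathbb{Z}}(k) \\
&= \int_{\mathbb{Z}} \int_\Omega h(\omega)\, \mathbf{1}_{\{m\}}(k' - S_n(\omega))\, u(T^n\omega, k')\, dP(\omega)\, dm_{\mathbb{Z}}(k') \\
&= \int_{\mathbb{Z}} \int_\Omega P_{T^n}\Big( h(\cdot)\, \mathbf{1}_{\{m\}}(k' - S_n(\cdot)) \Big)(\omega)\, u(\omega, k')\, dP(\omega)\, dm_{\mathbb{Z}}(k') \\
&= \int_{\Omega \times \mathbb{Z}} P_{T^n}\Big( h(\cdot)\, \mathbf{1}_{\{m\}}(k' - S_n(\cdot)) \Big)(\omega)\, u(\omega, k')\, d\mu(\omega, k') \\
&= \int_X P_{T^n}\Big( h(\cdot)\, \mathbf{1}_{\{m\}}\big(k - S_n(\cdot)\big) \Big)(\omega)\, u(\omega, k)\, d\mu(\omega, k).
\end{align*}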
So (3.9) is proved. Set $h \equiv 1$. Under the assumption of the conditional local limit theorem 3.4.1,
\[
\sum_{n=1}^{N} P_{T^n}\Big( \mathbf{1}_{\{m\}}\big(k - S_n(\cdot)\big) \Big)(\omega) = \sum_{n=1}^{N} P(S_n = k - m \mid T^n(\cdot) = \omega) \sim g(0) \sum_{n=1}^{N} \frac{1}{B_n} =: a_N \quad P\text{-a.s. for } \omega.
\]
Since $\sum_{n=1}^{\infty} \frac{1}{B_n} = \infty$, it follows that
\[
\sum_{n=1}^{\infty} P_{\tilde T^n}(\mathbf{1} \otimes \mathbf{1}_{\{m\}}) = \infty \quad \mu\text{-a.s.}
\]
By linearity of $P_{\tilde T^n}$, for any $f(\omega, x) = \sum_{m \in \mathbb{Z}} k_m \mathbf{1}_{\{m\}}(x)$ with $k_m > 0$ and $\sum_{m \in \mathbb{Z}} k_m < \infty$, one has
\[
\sum_{n=1}^{\infty} P_{\tilde T^n} f = \infty \quad \mu\text{-a.s.}
\]
By Proposition 3.3.2, the conservative part of $\tilde T$ satisfies $C(\tilde T) = X \bmod \mu$, which means that $(X, \mathcal{B}, \tilde T, \mu)$ is conservative.
2. The proof of the ergodic decomposition is an adaptation of the corresponding argument of section 2.2.9 of [3] (page 63). We show that $\tilde T$ does not necessarily have to be invertible for the ergodic decomposition to exist. This follows from Proposition 3.4.6 and Proposition 3.4.7 below.
Proposition 3.4.6. Let T be a conservative, non-singular, measure preserving transformation of a standard probability space (X, B, m), then there exists a probability space (Y, C, µ), and a collection of probabilities {my : y ∈ Y } on (X, B) such that
1. for $y \in Y$, $T$ is a conservative, ergodic, non-singular, measure preserving transformation of $(X, \mathcal{B}, m_y)$ and
\[
\frac{d\, m_y \circ T^{-1}}{d m_y} = \frac{d\, m \circ T^{-1}}{d m}, \quad m_y\text{-a.s.}
\]
2. for $A \in \mathcal{B}$, the map $y \mapsto m_y(A)$ is measurable, and
\[
m(A) = \int_Y m_y(A)\, d\mu(y).
\]
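Proof. Let $(Y, \mathcal{C}, \mu, I)$ be the invariant factor of $T$ and let $\pi : X \to Y$ be the invariant factor map. By the Disintegration Theorem 3.4.4, there exist $Y_0 \in \mathcal{C}$ with $\mu(Y_0^c) = 0$ and a measurable map $y \mapsto m_y$ from $Y_0$ to $\mathcal{P}(X, \mathcal{B})$ such that
\[
m_y(\pi^{-1}\{y\}) = 1 \quad \forall y \in Y_0, \qquad
m(A \cap \pi^{-1}B) = \int_B m_y(A)\, d\mu(y) \quad \forall A \in \mathcal{B},\ B \in \mathcal{C}.
\]
Also, since $T$ is non-singular, $m \circ T^{-1} \sim m$. Denote $T' := \frac{d\, m \circ T^{-1}}{dm}$. Next we prove
\[
\frac{d\, m_y \circ T^{-1}}{d m_y} = \frac{d\, m \circ T^{-1}}{d m}, \quad m_y\text{-a.s.}
\]
It is sufficient to prove that for any $f \in L^{\infty}(m)_+$ and $g \in L^{\infty}(\mu)_+$,
\[
\int_Y \Big( \int_X T' f\, dm_y \Big) g(y)\, \mu(dy) = \int_Y \Big( \int_X f\, d\, m_y \circ T^{-1} \Big) g(y)\, \mu(dy).
\]
Indeed, since $m_y(\pi^{-1}\{y\}) = 1$ and by the Disintegration Theorem 3.4.4, one has
\begin{align*}
\int_Y \Big( \int_X T' f\, dm_y \Big) g(y)\, \mu(dy)
&= \int_Y \int_X g(\pi x)\, T' f\, dm_y\, \mu(dy) \\
&= \int_X g(\pi x)\, T' f\, dm \\
&= \int_X g(\pi x)\, f\, d\, m \circ T^{-1} \\
&= \int_Y \int_X g(\pi x)\, f\, d\, m_y \circ T^{-1}\, d\mu(y) \\
&= \int_Y \int_X g(\pi x)\, f\, \frac{d\, m_y \circ T^{-1}}{d m_y}\, dm_y\, d\mu(y) \\
&= \int_Y \Big( \int_X f\, \frac{d\, m_y \circ T^{-1}}{d m_y}\, dm_y \Big) g(y)\, d\mu(y) \\
&= \int_Y \Big( \int_X f\, d\, m_y \circ T^{-1} \Big) g(y)\, d\mu(y).
\end{align*}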
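As $P_T f = \frac{d\, \nu_f \circ T^{-1}}{dm}$, where $\nu_f(C) = \int_C f\, dm$, and $T$ is conservative on $(X, \mathcal{B}, m)$, we have
\[
\sum_{n \ge 1} P_{T^n} \mathbf{1} = \sum_{n \ge 1} (T^n)' = \infty \quad m\text{-a.s.},
\]
with the consequence that
\[
\sum_{n \ge 1} (T^n)' = \infty \quad m_y\text{-a.s. for a.e. } y \in Y.
\]
It can be assumed that for every $y \in Y$,
\[
\sum_{n \ge 1} (T^n)' = \infty \quad m_y\text{-a.e.},
\]
with the consequence that $T$ is conservative on $(X, \mathcal{B}, m_y)$ for $y \in Y$.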
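By the Hurewicz ergodic theorem 3.4.3 for $T$ acting on $(X, \mathcal{B}, m)$,
\[
\lim_{n \to \infty} \frac{\sum_{k=1}^{n-1} P_{T^k} \mathbf{1}_A(x)}{\sum_{k=1}^{n-1} P_{T^k} \mathbf{1}(x)} = E(\mathbf{1}_A \mid \mathcal{T})(x) = m_{\pi x}(A)
\]
for a.e. $x \in X$ and $A \in \mathcal{B}$, where $\mathcal{T}$ is the σ-algebra of all invariant sets. Hence, for a.e. $y \in Y$,
\[
\lim_{n \to \infty} \frac{\sum_{k=1}^{n-1} P_{T^k} \mathbf{1}_A}{\sum_{k=1}^{n-1} P_{T^k} \mathbf{1}} = m_y(A), \quad m_y\text{-a.s.} \ \forall A \in \mathcal{A}.
\]
On the other hand, for $y \in Y$, by the Hurewicz ergodic theorem for $T$ acting on $(X, \mathcal{B}, m_y)$,
\[
\lim_{n \to \infty} \frac{\sum_{k=1}^{n-1} P_{T^k} \mathbf{1}_A}{\sum_{k=1}^{n-1} P_{T^k} \mathbf{1}} = E_{m_y}(\mathbf{1}_A \mid \mathcal{T}), \quad m_y\text{-a.s.} \ \forall A \in \mathcal{A}.
\]
So it follows that for $A \in \mathcal{A}$ and a.e. $y \in Y$,
\[
E_{m_y}(\mathbf{1}_A \mid \mathcal{T}) = m_y(A), \quad m_y\text{-a.s.}
\]
Hence $\mathcal{T} = \{\emptyset, X\} \bmod m_y$, and $T$ is ergodic on $(X, \mathcal{B}, m_y)$.
Next, we extend Proposition 3.4.6 to σ-finite measure spaces in the following proposition.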
Proposition 3.4.7. Suppose that $T$ is a conservative, non-singular, measure preserving transformation of a standard σ-finite measure space $(X, \mathcal{B}, \mu)$. Then there is a probability space $(\Omega, \mathcal{J}, \lambda)$ and a collection of measures $\{\mu_\omega : \omega \in \Omega\}$ on $(X, \mathcal{B})$ such that
1. For $\omega \in \Omega$, $T$ is a conservative, ergodic, measure-preserving transformation of $(X, \mathcal{B}, \mu_\omega)$.
2. For $A \in \mathcal{B}$, the map $\omega \mapsto \mu_\omega(A)$ is measurable, and $\mu(A) = \int_\Omega \mu_\omega(A)\, d\lambda(\omega)$.
Proof. The proof follows from Proposition 3.4.6 and [3] (page 63). The key point is to introduce a probability measure $m$ on $(X, \mathcal{C})$ and the density $f := \frac{d\mu}{dm}$. We only need to prove that $T$ is conservative and ergodic with respect to $\mu_\omega$.
Suppose $W$ is a wandering set for $T$, so that the sets $\{T^{-n}W\}$ are disjoint, and suppose $\mu_\omega(W) > 0$. Since $\mu_\omega(W) = \int_W f\, dm_\omega$, one has $m_\omega(W) > 0$, which contradicts the fact that $T$ is conservative on $(X, \mathcal{B}, m_\omega)$. So $T$ is conservative on $(X, \mathcal{B}, \mu_\omega)$.
For any invariant set $A = T^{-1}A$, since $T$ is ergodic on $(X, \mathcal{B}, m_\omega)$, either $m_\omega(A) = 0$ or $m_\omega(A^c) = 0$. Then one has either $\mu_\omega(A) = \int_A f\, dm_\omega = 0$ or $\mu_\omega(A^c) = \int_{A^c} f\, dm_\omega = 0$, so $T$ is ergodic on $(X, \mathcal{B}, \mu_\omega)$.
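The second part of Lemma 3.4.5 can be proved by applying Proposition 3.4.6 and Proposition 3.4.7 above to $(X, \mathcal{B}, \tilde T)$.
3. We finish the proof of Lemma 3.4.5 by showing that λ-almost surely for $y$, $(X, \mathcal{B}, \mu_y, \tilde T)$ is pointwise dual ergodic. Since
\[
\sum_{n=1}^{N} P_{\tilde T^n}(\mathbf{1}_\Omega \otimes \mathbf{1}_{\{m\}}) \sim a_N, \quad \mu\text{-a.s.},
\]
one has
\[
\sum_{n=1}^{N} P_{\tilde T^n}(\mathbf{1}_\Omega \otimes \mathbf{1}_{\{m\}}) \sim a_N, \quad \mu_y\text{-a.s.}
\]
From the second part of Lemma 3.4.5, it is known that $\tilde T$ is conservative and ergodic on $(X, \mathcal{B}, \mu_y)$. By Hurewicz’s ergodic theorem, for all $f \in L^1(\mu_y)$, almost surely,
\[
\frac{1}{a_n} \sum_{k=0}^{n} P_{\tilde T^k} f \sim \frac{\sum_{k=0}^{n} P_{\tilde T^k} f}{\sum_{k=0}^{n} P_{\tilde T^k}(\mathbf{1}_\Omega \otimes \mathbf{1}_{\{m\}})} \to \frac{\int_{\Omega \times \mathbb{Z}} f\, d\mu_y}{\int_{\Omega \times \mathbb{Z}} \mathbf{1}_\Omega \otimes \mathbf{1}_{\{m\}}\, d\mu_y}.
\]
Since $a_N$ does not depend on the integer $m$ we choose, $\frac{1}{\int_{\Omega \times \mathbb{Z}} \mathbf{1}_\Omega \otimes \mathbf{1}_{\{m\}}\, d\mu_y}$ does not depend on $m$; denote it by $C(y)$. Hence $(X, \mathcal{B}, \mu_y, \tilde T)$ is pointwise dual ergodic with return sequence
\[
a_n C(y) = C(y)\, g(0) \sum_{i=1}^{n} \frac{1}{B_i}.
\]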
The theorem below provides the limiting distribution of the local time of stationary processes with a conditional local limit theorem.
Theorem 3.4.8 (Convergence of local times). Suppose that the stationary process $\{X_n := \phi \circ T^{n-1} : n \ge 1\}$ defined on a probability space $(\Omega, \mathcal{F}, P)$ has a conditional local limit theorem 3.4.1 with regularly varying scaling coefficient $B_n = n^{\beta} L(n)$, where $\beta \in [\frac{1}{2}, 1)$ and $L(n)$ is slowly varying. Denote $a_n := g(0) \sum_{k=1}^{n} \frac{1}{B_k} \to \infty$. Then $\frac{\ell_n}{a_n}$ converges to $Y_\alpha$ in the following sense:
\[
\int_\Omega g\!\left( \frac{\ell_n(\omega)}{a_n} \right) H(\omega)\, dP(\omega) \to E[g(Y_\alpha)], \tag{3.10}
\]
for any bounded and continuous function $g$ and any probability density function $H$ on $(\Omega, \mathcal{F}, P)$. Here $Y_\alpha$ has the normalized Mittag-Leffler distribution of order $\alpha = 1 - \beta$.
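Proof of Theorem 3.4.8. Since $B_n = n^{\beta} L(n)$ is regularly varying with order β, by Karamata’s integral theorem (cf. e.g. [62], Theorem A.9), $\{a_n\}_{n=1}^{\infty}$ is regularly varying of order $\alpha = 1 - \beta \in (0, \frac{1}{2}]$ and $a_n \sim C \frac{n^{\alpha}}{L(n)}$.
By Lemma 3.4.5, $(X, \mathcal{B}, \mu_y, \tilde T)$ is pointwise dual ergodic. Since $a_n$ is regularly varying, by applying Corollary 3.3.8, for any $f \in L^1(\mu_y)$, $f \ge 0$, one has strong convergence in the following sense:
\[
\int_X g\!\left( \frac{S_n^{\tilde T}(f)}{a_n} \right) h_y\, d\mu_y \to E[g(C(y)\mu_y(f) Y_\alpha)], \tag{3.11}
\]
for any bounded and continuous function $g$ and for any probability density function $h_y$ on $(X, \mathcal{B}, \mu_y)$. Here $Y_\alpha$ has the normalized Mittag-Leffler distribution of order $\alpha = 1 - \beta$.
Define a probability density function $H(\omega, m)$ on $(X, \mathcal{B}, \mu)$ as
\[
H(\omega, m) =
\begin{cases}
H(\omega), & m = 0, \\
0, & m \ne 0,
\end{cases}
\]
where $H(\omega)$ is an arbitrary probability density function on Ω. For each $y$, let
\[
h_y(\omega, j) =
\begin{cases}
\dfrac{1}{\int_X H(\omega, k)\, d\mu_y(\omega, k)}\, H(\omega, j), & \int_X H(\omega, k)\, d\mu_y(\omega, k) \ne 0; \\[2ex]
0, & \int_X H(\omega, k)\, d\mu_y(\omega, k) = 0.
\end{cases} \tag{3.12}
\]
Then $h_y(\omega, j)$ is a probability density function on $(X, \mathcal{B}, \mu_y)$ for $y \in U$, where $U = \{y \in Y : \int_X H(\omega, j)\, d\mu_y(\omega, j) \ne 0\}$. By the Disintegration Theorem 3.4.4,
\begin{align*}
\int_X g\!\left( \frac{S_n^{\tilde T}(f)(\omega, x)}{a_n} \right) H(\omega, x)\, d\mu(\omega, x)
&= \int_U \int_X g\!\left( \frac{S_n^{\tilde T}(f)(\omega, x)}{a_n} \right) H(\omega, x)\, d\mu_y(\omega, x)\, d\lambda(y) \\
&\quad + \int_{Y \setminus U} \int_X g\!\left( \frac{S_n^{\tilde T}(f)(\omega, x)}{a_n} \right) H(\omega, x)\, d\mu_y(\omega, x)\, d\lambda(y) \\
&= \int_U \Big( \int_X H(\omega, x)\, d\mu_y \Big) \int_X g\!\left( \frac{S_n^{\tilde T}(f)(\omega, x)}{a_n} \right) h_y(\omega, x)\, d\mu_y(\omega, x)\, d\lambda(y) \\
&= \int_Y \Big( \int_X H(\omega, x)\, d\mu_y \Big) \int_X g\!\left( \frac{S_n^{\tilde T}(f)(\omega, x)}{a_n} \right) h_y(\omega, x)\, d\mu_y(\omega, x)\, d\lambda(y) \\
&\to \int_Y \mu_y(H)\, E[g(C(y)\mu_y(f) Y_\alpha)]\, d\lambda(y).
\end{align*}
In the last step, the Dominated Convergence Theorem is used. Let $f = \mathbf{1}_\Omega \otimes \mathbf{1}_{\{0\}}$; then $C(y)\mu_y(f) = 1$, and $\int_Y \mu_y(H)\, d\lambda(y) = \mu(H) = 1$. The result above therefore asserts that
\[
\int_\Omega g\!\left( \frac{\ell_n}{a_n} \right) H(\omega)\, dP(\omega) \to E[g(Y_\alpha)]
\]
for any bounded and continuous function $g$ and any probability density function $H$ on $(\Omega, \mathcal{F}, P)$.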
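The statement of Theorem 3.4.8 can be explored numerically in the simplest case. The sketch below is an illustration under assumptions that are not taken from the text: the increments are i.i.d. uniform on $\{-1, 0, 1\}$ (an aperiodic walk with variance $2/3$), so that the classical local limit theorem gives $B_n = \sqrt{n}$, $L \equiv 1$, $g(0) = \sqrt{3/(4\pi)}$ and $\alpha = 1/2$. For $\alpha = 1/2$ the normalized Mittag-Leffler law is the law of $\sqrt{\pi/2}\,|N|$ with $N$ standard normal, whose first two moments are $1$ and $\pi/2$. The function names and the sample sizes are hypothetical choices.

```python
import math
import random

# Assumed model: i.i.d. steps uniform on {-1, 0, 1}; then P(S_n = 0) ~ G0 / sqrt(n).
G0 = math.sqrt(3.0 / (4.0 * math.pi))


def a(n, beta=0.5, g0=G0):
    """a_n = g(0) * sum_{k=1}^{n} 1/B_k with B_k = k**beta (slowly varying part L = 1)."""
    return g0 * sum(k ** (-beta) for k in range(1, n + 1))


def local_time(n, rng):
    """Number of visits of S_1, ..., S_n to 0 along one simulated path."""
    s = visits = 0
    for _ in range(n):
        s += int(rng.random() * 3) - 1  # uniform on {-1, 0, 1}
        if s == 0:
            visits += 1
    return visits


def simulate(n=1000, paths=2000, seed=0):
    """Empirical first and second moments of l_n / a_n."""
    rng = random.Random(seed)
    an = a(n)
    ys = [local_time(n, rng) / an for _ in range(paths)]
    return sum(ys) / paths, sum(y * y for y in ys) / paths


# Karamata step: a_n is close to g(0) * n**alpha / alpha with alpha = 1 - beta = 1/2.
karamata_ratio = a(10**5) / (G0 * 2.0 * math.sqrt(10**5))
m1, m2 = simulate()
print(karamata_ratio, m1, m2)
```

Under these assumptions the printed moments should be close to $1$ and $\pi/2 \approx 1.57$, up to Monte Carlo error and a small finite-$n$ bias. This only illustrates the i.i.d. case and says nothing about the dependent processes covered by the theorem.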
Corollary 3.4.9 (Gibbs-Markov transformation [5]). Let $(\Omega, \mathcal{B}, P, T, \alpha)$ be a mixing, probability preserving Gibbs-Markov map (see [5] for the definition), and let $\phi : \Omega \to \mathbb{Z}$ be Lipschitz continuous on each $a \in \alpha$, with
\[
D_\alpha \phi := \sup_{a \in \alpha} D_a \phi = \sup_{a \in \alpha} \sup_{x, y \in a} \frac{|\phi(x) - \phi(y)|}{d(x, y)} < \infty
\]
and distribution $G$ in the domain of attraction of a stable law with order $1 < d \le 2$. Then $\{X_n := \phi \circ T^{n-1}\}$ has a conditional local limit theorem with $B_n = n^{1/d} L(n)$, where $L(n)$ is a slowly varying function. By Theorem 3.4.8, the scaled local time of $S_n$ converges strongly to the Mittag-Leffler distribution.
In [5], the conditions for finite and countable state Markov chains and Markov interval maps to imply the Gibbs-Markov property are listed. Next, we show two examples of stationary processes whose local times converge to the Mittag-Leffler distribution.
Example 3.4.10 (Continued Fractions). Any irrational number $x \in (0, 1]$ can be uniquely expressed as a simple non-terminating continued fraction
\[
x = [0; c_1(x), c_2(x), \cdots] := \cfrac{1}{c_1(x) + \cfrac{1}{c_2(x) + \cfrac{1}{c_3(x) + \cdots}}}.
\]
The continued fraction transformation $T$ is defined by
\[
T(x) = \frac{1}{x} - \left[ \frac{1}{x} \right].
\]
Define $\phi : (0, 1] \to \mathbb{N}$ by $\phi(x) = c_1(x)$ and $X_n := \phi \circ T^{n-1}$. We have the following convergence in distribution with respect to any absolutely continuous probability measure $m \ll \lambda$, where λ is the Lebesgue measure:
\[
\frac{\sum_{i=1}^{n} X_i}{n / \log 2} - \log n \to F,
\]
where $F$ has a stable distribution (cf. e.g. [50]).
Let $a_n := \{x \in (0, 1] : c_1(x) = n\}$ for every $n \in \mathbb{N}_+$ and let the partition be $\alpha = \{a_n : n \in \mathbb{N}\}$. Then $(\Omega, \mathcal{B}, \mu, T, \alpha)$ is the continued fraction transformation, where $\Omega = (0, 1]$. It is a mixing and measure preserving Gibbs-Markov map with respect to the Gauss measure $d\mu = \frac{1}{\ln 2} \frac{1}{1 + x}\, dx$. Define the metric on Ω to be $d(x, y) = r^{\inf\{n : a_n(x) \ne a_n(y)\}}$, where $r \in (0, 1)$. Note that φ is Lipschitz continuous on each element of the partition.
Define $(X, \mathcal{F}, \nu, T_X, \beta)$ to be the direct product of $(\Omega, \mathcal{B}, \mu, T, \alpha)$ with itself, with metric $d_X((x, y), (x', y')) = \max\{d(x, x'), d(y, y')\}$. One can check that $(X, \mathcal{F}, \nu, T_X, \beta)$ is still a mixing and measure preserving Gibbs-Markov map. Let $f : X \to \mathbb{Z}$ be defined by $f(x, y) = \phi(x) - \phi(y)$. Since φ is Lipschitz on the elements of α, so is $f$ on the elements of β. Define $Y_n((x, y)) = f \circ T_X^{n-1}(x, y) = X_n(x) - X_n(y)$, $(x, y) \in X$. $Y_n$ is in the domain of attraction of a stable law. Let $S_n := \sum_{i=1}^{n} Y_i$. The local time at level 0 of $S_n$ is denoted by $\ell_n(x, y) = \sum_{i=1}^{n} \mathbf{1}_{\{S_i(x, y) = 0\}}$. By applying Corollary 3.4.9 to the Gibbs-Markov map $(X, \mathcal{F}, \nu, T_X, \beta)$ and the Lipschitz continuous function $f$, $S_n$ has a conditional local limit theorem and the local time $\ell_n$ converges to the Mittag-Leffler distribution after scaling.
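The digits $X_n = \phi \circ T^{n-1}$ of the continued fraction example can be generated by iterating the Gauss map. The following is a minimal sketch with hypothetical function names. It uses exact rational arithmetic, so it is run on rational inputs, whose expansions terminate, whereas the example concerns irrational $x$; floating point inputs would lose accuracy after a few iterations.

```python
from fractions import Fraction


def gauss_map(x):
    """T(x) = 1/x - floor(1/x) for x in (0, 1]."""
    y = 1 / x
    return y - (y.numerator // y.denominator)


def cf_digits(x, n):
    """First n digits X_k = c_1(T^{k-1} x); stops early once the orbit of a rational x hits 0."""
    digits = []
    for _ in range(n):
        if x == 0:
            break
        y = 1 / x
        digits.append(y.numerator // y.denominator)  # c_1 of the current point
        x = gauss_map(x)
    return digits


print(cf_digits(Fraction(93, 415), 10))  # prints [4, 2, 6, 7]
```

For the product system of the example one would run two such digit sequences side by side and form $Y_n = X_n(x) - X_n(y)$.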
Example 3.4.11 (β transformation). Fix $\beta > 1$ and let $T : [0, 1] \to [0, 1]$ be defined by $Tx := \beta x \bmod 1$. Let $\phi : [0, 1] \to \mathbb{Z}$ be defined as $\phi(x) = [\beta x]$ and $X_n(x) = \phi \circ T^{n-1}(x) = [\beta T^{n-1} x]$. There exists an absolutely continuous invariant probability measure $P$. By [1], there is a conditional local limit theorem for the partial sum $S_n$ of $\{X_n\}$. Theorem 3.4.8 can then be applied to $([0, 1], \mathcal{B}, P, T)$ and $\{X_n\}$, and it follows that the scaled local time of $S_n$ at level $E[\phi]$ converges to the Mittag-Leffler distribution.
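The digit process of the β transformation is equally easy to generate. The sketch below uses a hypothetical helper name and plain floating point arithmetic, which is an assumption of the sketch: it is exact for $\beta = 2$ and dyadic $x$, and only approximate for non-integer β.

```python
import math


def beta_digits(x, beta, n):
    """Digits X_k = floor(beta * T^{k-1} x) of the map T x = beta * x mod 1."""
    digits = []
    for _ in range(n):
        y = beta * x
        d = math.floor(y)
        digits.append(d)
        x = y - d  # T x
    return digits


print(beta_digits(0.625, 2.0, 4))  # binary digits of 0.625: prints [1, 0, 1, 0]
```

The partial sums $S_n$ of these digits, centred at $E[\phi]$, are the quantities whose local time the example refers to.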
3.5 Limit theorems of local times of discrete-time fractional Brownian motion
In this section, we study the limiting behavior of the local time of discrete-time fractional Brownian motion.
Since the state space of fractional Brownian motion $B_H$ is $\mathbb{R}$, we consider its occupation time. Suppose $D$ is a subset of the state space $\mathbb{R}$ and $V := \mathbf{1}_D$. The occupation time at $n$ of $B_H$ is defined to be $\lambda(n, D) := \sum_{i \le n} V(B_H(i))$. We have the following result for the occupation time of the discrete-time fractional Brownian motion.
Theorem 3.5.1 (Limiting distribution of the occupation time of fractional Brownian motions). Let $(X_n)_{n \in \mathbb{N}}$ be as in Theorem 2.5.1. Denote the occupation time of $S_k$ in the interval $(a, b)$ at time $n$ by $\ell_n([a, b]) = \sum_{i=1}^{n} \mathbf{1}_{(a, b)}(S_i)$. Then there exists a sequence of numbers $a_n = O(n^{1-H})$ such that
\[
\frac{1}{2\epsilon} \int_{-\epsilon}^{\epsilon} \int_\Omega g\!\left( \frac{\ell_n([a - x, b - x])}{a_n} \right) \psi(\omega)\, dP(\omega)\, dx \to E[g((b - a) Y_\alpha)], \tag{3.13}
\]
for any $\epsilon > 0$, any bounded and continuous function $g$, and any probability density function $\psi \in L^1(P)$, where $Y_\alpha$ is a random variable having the Mittag-Leffler distribution with index $\alpha = 1 - H$.
Remark 3.5.2. (1) Taking $\psi = 1$, one could try to evaluate the left hand side as $\epsilon \to 0$. This could show that the occupation times have a weak limit which is a Mittag-Leffler distribution. We do not know this, but the result shows convergence in the weak-∗ sense in $L^{\infty}(dx)$.
(2) We do not know the connection of this result to the local time of fractional Brownian motion. In [33] it is remarked that the law of the local time of a fractional Brownian motion is not a Mittag-Leffler distribution unless it is Brownian motion, although Kono’s result in [38] suggested that it may be true. Theorem 3.5.1 may give a hint to explain this phenomenon. Kasahara and Matsumoto have found that the limiting distribution of the occupation time of $B_H$ is similar but not equal to a Mittag-Leffler distribution.
3.5.1 Occupation times of discrete-time fractional Brownian motions
In this section, we interpret the stationary Gaussian random variables $\{X_n\}$ above from the point of view of dynamical systems.
Without loss of generality, suppose the random variables $\{X_n\}$ are defined on the probability space $(\Omega, \Sigma, P)$ with $\Omega = \mathbb{R}^{\mathbb{N}}$, where Σ is the σ-algebra generated by the cylinder sets of $\mathbb{R}^{\mathbb{N}}$. $T$ is the shift operator
\[
T : \Omega \to \Omega, \qquad (T\omega)_i = \omega_{i+1},
\]
where $\omega = (\omega_1, \omega_2, \ldots) \in \mathbb{R}^{\mathbb{N}}$. Define $\phi : \Omega \to \mathbb{R}$ as $\phi(\omega) := \omega_1$, with $\int_\Omega |\phi|\, dP < \infty$ and $\int_\Omega \phi\, dP = 0$. Let the random variables $X_n(\omega) := \phi \circ T^{n-1}(\omega) = \omega_n$ have a joint Gaussian distribution with zero mean: for any family of Borel sets $C_1, C_2, \ldots, C_r \subset \mathbb{R}$,
\[
P(\{\omega \in \Omega : X_{n_1}(\omega) \in C_1, X_{n_2}(\omega) \in C_2, \ldots, X_{n_r}(\omega) \in C_r\}) = \int_{C_1 \times C_2 \times \cdots \times C_r} p(t)\, dt.
\]
Here $p$ is the normal probability density function $p(t) = C \exp(-\frac{1}{2}(Dt, t))$, where $t = (t_1, t_2, \cdots, t_r)$ and $D$ is the inverse of the covariance matrix $B = (b(n_i - n_j))$ with $b(n_i - n_j) = E[X_{n_i} X_{n_j}]$.
We represent the occupation time of $\{S_n\}$ by introducing the skew product as before. Let $(X, \mathcal{B}, \mu) = (\Omega \times \mathbb{R}, \Sigma \times \sigma(\mathbb{R}), P \times m_{\mathbb{R}})$, where $\sigma(\mathbb{R})$ is the Borel σ-algebra and $m_{\mathbb{R}}$ is the Lebesgue measure on $\mathbb{R}$. Define $\tilde T : \Omega \times \mathbb{R} \to \Omega \times \mathbb{R}$ by $\tilde T(\omega, r) := (T(\omega), r + \phi(\omega))$. Then by induction, $\tilde T^n(\omega, r) = (T^n\omega, r + S_n(\omega))$, where $S_n$ is the partial sum of $\{X_n\}$.
Define $S_n^{\tilde T}(f) := \sum_{k=1}^{n} f \circ \tilde T^k$ for any $f : X \to \mathbb{R}$. As before, $S_n^T(\phi) = S_n$. Let $A = \Omega \times D$. Then the occupation time of $\{S_n\}$ for a set $D \subset \mathbb{R}$ has the following representation:
\[
\lambda(n, D) = \sum_{i=1}^{n} \mathbf{1}_{\{S_i \in D\}} = \sum_{i=1}^{n} \mathbf{1}_A(\tilde T^i(\omega, 0)) = S_n^{\tilde T}(\mathbf{1}_A)(\omega, 0). \tag{3.14}
\]
Similarly, we can make an ergodic decomposition of the dynamical system $(X, \mathcal{B}, \mu, \tilde T)$.
Proposition 3.5.3 (Conservative and Ergodic Decomposition).
1. T˜ is a conservative and measure preserving transformation of (X, B, µ).
2. There exists a probability space (Y, C, λ) and a collection of measures {µy : y ∈ Y } on (X, B) such that
(a) For $y \in Y$, λ-almost surely, $\tilde T$ is a conservative ergodic measure-preserving transformation of $(X, \mathcal{B}, \mu_y)$.
(b) For $A \in \mathcal{B}$, the map $y \mapsto \mu_y(A)$ is measurable and
\[
\mu(A) = \int_Y \mu_y(A)\, d\lambda(y).
\]
3. λ-almost surely for $y$, $(X, \mathcal{B}, \mu_y, \tilde T)$ is pointwise dual ergodic.
Proof. 1. By Corollary 8.1.5 in [3], to prove that $\tilde T$ is conservative, it is sufficient to prove that
(1) $\phi : \Omega \to \mathbb{R}$ is integrable and $\int_\Omega \phi\, dP = 0$;
(2) $T$ is ergodic and probability measure preserving on $(\Omega, \Sigma, P)$.
By assumption, (1) is true. For (2), by [22] (page 369), $\lim_{n \to \infty} b(n) = 0$, where $b(n) = E[X_k X_{k+n}]$, is a necessary and sufficient condition for $T$ to be mixing, that is, $|P(A \cap T^{-n}B) - P(A)P(B)| \to 0$ as $n \to \infty$. This implies that $T$ is ergodic. $T$ is also a probability preserving transformation, since $\{X_n\}$ is stationary. Hence $\tilde T$ is conservative. $\tilde T$ is measure preserving since $T$ is measure preserving.
2. For the second and the third parts of Proposition 3.5.3, the proof is similar to that of Proposition 3.4.6, except that the return sequence is $a_n C(y)$, where $C(y) = \frac{b - a}{\mu_y(\Omega \times (a, b))}$ for any interval $(a, b)$ with $b > a$.
We end this section with the proof of Theorem 3.5.1.
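Proof. Since $\tilde T$ is pointwise dual ergodic with respect to the measure $\mu_y$, and $a_n$ is regularly varying with index $\alpha = 1 - H$ and has the same order as $\sum_{i=0}^{n} \frac{g(0)}{d_i}$, where $d_n$ is the scaling coefficient in Theorem 2.5.1, Corollary 3.3.8 shows that $\frac{S_n^{\tilde T}}{C(y) a_n}$ converges strongly in distribution, i.e.,
\[
\int_X g\!\left( \frac{S_n^{\tilde T}(f)(\omega, x)}{C(y) a_n} \right) h_y(\omega, x)\, d\mu_y(\omega, x) \to E[g(\mu_y(f) Y_\alpha)], \tag{3.15}
\]
or equivalently,
\[
\int_X g\!\left( \frac{S_n^{\tilde T}(f)(\omega, x)}{a_n} \right) h_y(\omega, x)\, d\mu_y(\omega, x) \to E[g(C(y)\mu_y(f) Y_\alpha)], \tag{3.16}
\]
for any bounded and continuous function $g$ and for any $h_y \in L^1(\mu_y)$ with $\int_X h_y\, d\mu_y = 1$, where $S_n^{\tilde T}(f) = \sum_{i=1}^{n} f \circ \tilde T^{i}$, and $Y_\alpha$ has the normalized Mittag-Leffler distribution of order $\alpha = 1 - H$.
Let $f = \mathbf{1}_\Omega \otimes \mathbf{1}_{(a, b)}$. Then $S_n^{\tilde T}(f)(\omega, x) = \sum_{i=1}^{n} \mathbf{1}_{(a, b)}(x + S_i(\omega))$, which is the occupation time of $S_n$ at time $n$ on the interval $(a - x, b - x)$. Since $C(y) = \frac{b - a}{\mu_y(\mathbf{1}_\Omega \otimes \mathbf{1}_{(a, b)})}$, we have $C(y)\mu_y(\mathbf{1}_\Omega \otimes \mathbf{1}_{(a, b)}) = \int_{\mathbb{R}} \mathbf{1}_{(a, b)}\, dm = b - a$, and the right hand side of (3.16) simplifies to $E[g((b - a) Y_\alpha)]$.
Let $H(\omega, x)$ be any probability density function on $(X, \mathcal{B}, \mu)$. For each $y$, define
\[
h_y(\omega, x) =
\begin{cases}
\dfrac{1}{\int_X H(\omega, x)\, d\mu_y}\, H(\omega, x), & \int_X H(\omega, x)\, d\mu_y \ne 0; \\[2ex]
0, & \int_X H(\omega, x)\, d\mu_y = 0.
\end{cases} \tag{3.17}
\]
Then $h_y(\omega, x)$ is a density function on $(X, \mathcal{B}, \mu_y)$ for $y \in U$, where $U = \{y \in Y : \int_X H(\omega, x)\, d\mu_y \ne 0\}$. By (3.16), one has
\begin{align*}
\int_X g\!\left( \frac{\sum_{i=1}^{n} \mathbf{1}_{(a, b)}(x + S_i(\omega))}{a_n} \right) H(\omega, x)\, d\mu
&= \int_Y \int_X g\!\left( \frac{\sum_{i=1}^{n} \mathbf{1}_{(a, b)}(x + S_i(\omega))}{a_n} \right) H(\omega, x)\, d\mu_y\, d\lambda(y) \\
&= \int_U \int_X g\!\left( \frac{\sum_{i=1}^{n} \mathbf{1}_{(a, b)}(x + S_i(\omega))}{a_n} \right) H(\omega, x)\, d\mu_y\, d\lambda(y) \\
&\quad + \int_{Y \setminus U} \int_X g\!\left( \frac{\sum_{i=1}^{n} \mathbf{1}_{(a, b)}(x + S_i(\omega))}{a_n} \right) H(\omega, x)\, d\mu_y\, d\lambda(y) \\
&= \int_U \Big( \int_X H(\omega, x)\, d\mu_y \Big) \int_X g\!\left( \frac{\sum_{i=1}^{n} \mathbf{1}_{(a, b)}(x + S_i(\omega))}{a_n} \right) h_y(\omega, x)\, d\mu_y\, d\lambda(y) \\
&= \int_Y \Big( \int_X H(\omega, x)\, d\mu_y \Big) \int_X g\!\left( \frac{\sum_{i=1}^{n} \mathbf{1}_{(a, b)}(x + S_i(\omega))}{a_n} \right) h_y(\omega, x)\, d\mu_y\, d\lambda(y) \\
&\to \int_Y \mu_y(H)\, E[g((b - a) Y_\alpha)]\, d\lambda(y) \quad \text{by the Dominated Convergence Theorem} \\
&= E[g((b - a) Y_\alpha)].
\end{align*}
Let $H(\omega, x) = \frac{1}{2\epsilon} \mathbf{1}_{[-\epsilon, \epsilon]}(x) \otimes \psi(\omega)$, where $\epsilon > 0$ and $\psi(\omega)$ is a probability density function on $(\Omega, \Sigma, P)$. Then one has that, as $n \to \infty$,
\[
\frac{1}{2\epsilon} \int_{-\epsilon}^{\epsilon} \int_\Omega g\!\left( \frac{\sum_{i=1}^{n} \mathbf{1}_{(a - x, b - x)}(S_i(\omega))}{a_n} \right) \psi(\omega)\, dP(\omega)\, dx \to E[g((b - a) Y_\alpha)].
\]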
3.5.2 Occupation times of continuous fractional Brownian motions
In [33], Kasahara and Matsumoto studied the occupation time of the $d$-dimensional fractional Brownian motion $B_H^d = (B_{H,1}, B_{H,2}, \cdots, B_{H,d})$, where the $B_{H,i}$ are independent copies of $B_H$. If $Hd < 1$, then the local time of $B_H^d$ exists and there exists a jointly continuous version $L^d(t, x)$ such that
\[
\int_0^t f(B_H^d(u))\, du = \int_{\mathbb{R}^d} f(x)\, L^d(t, x)\, dx
\]
for any bounded and continuous function $f$. The Kallianpur-Robbins law states that for a bounded summable function $V$ on $\mathbb{R}^d$,
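1. if $0 < Hd < 1$, then
\[
\frac{1}{t^{1 - Hd}} \int_0^t V(B_H^d(s))\, ds \to \frac{\bar V}{(\sqrt{2\pi})^{d}}\, L^d(1, 0)
\]
as $t \to \infty$;
2. if $Hd = 1$ and $d \ge 2$, then
\[
\frac{1}{\log t} \int_0^t V(B_H^d(s))\, ds \to \frac{\bar V}{\sqrt{2\pi}}\, L_1
\]
as $t \to \infty$, where $L_1 \sim \exp(1)$ and $\bar V = \int V(x)\, dx$.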
Combining this result with the Darling-Kac theorem, a natural question is whether $L^d(t, x)$ has the Mittag-Leffler distribution. Kasahara and Matsumoto have found that the limiting distribution of the occupation time of $B_H^d$ is similar to, but not equal to, the Mittag-Leffler distribution, which is consistent with our conclusion for the discrete-time fractional Brownian motion.
Theorem 3.5.4 ([33]). Suppose d ≥ 2, 0 < Hd < 1, and let α = 1 − Hd.
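\[
E[L^d(1, 0)^n]
\begin{cases}
= \left( \dfrac{1}{\sqrt{2\pi}} \right)^{nd} \dfrac{n!\, \Gamma(\alpha)^n}{\Gamma(\alpha n + 1)}, & n = 1, \\[2ex]
> \left( \dfrac{1}{\sqrt{2\pi}} \right)^{nd} \dfrac{n!\, \Gamma(\alpha)^n}{\Gamma(\alpha n + 1)}, & n \ge 2.
\end{cases}
\]
Also
\[
E[L^d(1, 0)^n] \le \left( \frac{1}{\sqrt{\pi}} \right)^{nd} \frac{n!\, \Gamma(\alpha)^n}{\Gamma(\alpha n + 1)}, \quad n \ge 1.
\]
One corollary is that when $d \ge 2$, the distribution of $L^d(1, 0)$ is not the Mittag-Leffler distribution with any index.
Remark 3.5.5. The proof that the limiting distribution is not Mittag-Leffler assumes that $d \ge 2$, $0 < Hd < 1$. In fact, the proof still works when $H \ne \frac{1}{2}$, $0 < Hd < 1$.
Chapter 4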
Almost Sure Central Limit Theorems
4.1 Almost sure central limit theorems for local times of random walks
In this chapter, we discuss a new type of limit theorem: the almost sure central limit theorem (ASCLT). It was first discovered by Brosamler (1988, [20]) and Schatte (1988, [57]) independently. It has since been extensively investigated for partial sums of independent random variables and of some dependent variables.
The simplest form of the ASCLT is when $\{X_n\}$ is a sequence of i.i.d. random variables with mean zero and variance $\sigma^2 < \infty$. Let $S_n$ be the partial sum of $\{X_n\}$. Then
\[
\lim_{N \to \infty} \frac{1}{\log N} \sum_{k=1}^{N} \frac{1}{k}\, \mathbf{1}_{\{S_k \le \sigma \sqrt{k}\, x\}} = \Phi(x) \quad \text{a.s.},
\]
where Φ is the standard normal distribution function.
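The logarithmic average in the simplest ASCLT can be computed along a single path as follows. This is a minimal sketch with a hypothetical function name; the increments are passed in as a list and σ is supplied by the caller. Convergence holds only at a logarithmic rate, so for moderate $N$ the value is a rough approximation of $\Phi(x)$ and can even exceed 1, because $\sum_{k \le N} 1/k > \log N$.

```python
import math


def log_average(steps, x, sigma=1.0):
    """(1 / log N) * sum_{k=1}^{N} (1/k) * 1{S_k <= sigma * sqrt(k) * x} for one path."""
    n = len(steps)
    partial_sum = 0.0
    total = 0.0
    for k, step in enumerate(steps, start=1):
        partial_sum += step
        if partial_sum <= sigma * math.sqrt(k) * x:
            total += 1.0 / k
    return total / math.log(n)


# Two deterministic extreme paths: S_k = k never satisfies S_k <= 0, S_k = -k always does.
always_above = log_average([1] * 100, 0.0)
always_below = log_average([-1] * 100, 0.0)
print(always_above, always_below)
```

For a random path one would pass, for example, independent $\pm 1$ increments and compare the output with $\Phi(x)$ for several values of $x$.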
For independent random variables {Xn}, the general form of ASCLT is
\[
\lim_{N \to \infty} \frac{1}{\log N} \sum_{k=1}^{N} \frac{1}{k}\, \mathbf{1}_{\{(S_k - a_k)/b_k \le x\}} = \Psi(x) \quad \text{a.s.}
\]
for some cumulative distribution function Ψ. It is a direct consequence of the central limit theorem
\[
\frac{S_n - a_n}{b_n} \to \mathcal{N}(0, 1)
\]
under some moment hypotheses on the underlying variables $\{X_n\}$. So despite its pointwise character, the ASCLT is weaker than convergence in distribution.
In [11], Berkes and Csáki show that not only the central limit theorem, but every weak limit theorem for independent random variables has an analogous almost sure “logarithmic” version. The generic form of a weak limit theorem for a sequence of independent random variables is
\[
f_k(X_1, X_2, \cdots) \to G,
\]
where $f_k : \mathbb{R}^{\infty} \to \mathbb{R}$ is a measurable function and $G$ is a distribution function. For example, when $f_k(x_1, x_2, \cdots) = \sum_{i=1}^{k} x_i / \sqrt{k}$, it corresponds to the central limit theorem. When $f_k(x_1, x_2, \cdots) = a_k \sum_{i=1}^{k} \mathbf{1}_{\{x\}}\big( \sum_{j=1}^{i} x_j \big)$ for some $a_k$, it becomes the weak convergence of the local time of the process $\{X_n\}$. Other examples include extrema, U-statistics and so on.
In particular, we are interested in the local time. Let $X_1, X_2, \ldots$ be i.i.d. integer-valued random variables with $E X_1 = 0$. Let $\psi(t)$ be the characteristic function of $X_1$ and assume that $\psi(2\pi t) = 1$ if and only if $t$ is an integer, and that ψ satisfies
\[
\frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{\psi(t)}{1 - \lambda \psi(t)}\, dt \sim \frac{C}{(1 - \lambda)^{\alpha}} \quad \text{as } \lambda \to 1,
\]
where $0 < \alpha \le 1/2$. Let $S_n$ be the partial sum of $\{X_n\}$: $S_n = X_1 + X_2 + \cdots + X_n$. It follows that the random walk $\{S_n, n \ge 1\}$ is aperiodic and $X_1$ is in the domain of attraction of a stable law of order $d = 1/(1 - \alpha) \in (1, 2]$. Define the local time $\ell(n, x)$ as before. It was shown by Darling and Kac (1957) [24] that
\[
\lim_{n \to \infty} P\!\left( \frac{\ell_n}{C n^{\alpha}} < x \right) = F_\alpha(x) = \frac{1}{\pi \alpha} \sum_{j=1}^{\infty} \frac{(-1)^{j-1}}{j!\, j} \sin(\pi \alpha j)\, \Gamma(1 + \alpha j)\, x^j.
\]
The local time also satisfies an ASCLT.
Theorem 4.1.1 ([11]). For the random walk defined above, one has
\[
\lim_{N \to \infty} \frac{1}{\log N} \sum_{k=1}^{N} \frac{1}{k}\, \mathbf{1}_{\{\ell_k / (C k^{\alpha}) < x\}} = F_\alpha(x) \quad \text{a.s.}
\]
4.2 Almost sure central limit theorem for stationary processes
When the sequence $\{X_n\}$ is dependent, we would like to know whether the local time of its partial sum still has an almost sure central limit theorem. In general, this is not always true. In this section, we study the case when $\{X_n\}$ is stationary and its characteristic operator has a spectral gap.
Suppose $\{X_n\}$ is a stationary process on a probability space $(\Omega, \mathcal{B}, P)$ such that there exist a random variable $\phi : \Omega \to \mathbb{Z}$ with finite Lipschitz constant and a measure preserving transformation $T : \Omega \to \Omega$ with $X_n = \phi \circ T^{n-1}$.
Define the characteristic function operator $P_t : L^1(P) \to L^1(P)$ by $P_t f := P_T(e^{it\phi} f)$, where $P_T$ is the Perron-Frobenius operator. $P_t$ is a perturbation of $P_T$ around $t = 0$, so $P_t$ and $P_T$ are identical when $t = 0$. By induction, $P_t^n f = P_{T^n}(e^{itS_n} f)$, where $S_n = \sum_{i \le n} \phi \circ T^{i-1} =: S_n(\phi)$. Let $\mathcal{L}$ be the space of functions in $L^1(P)$ with finite Lipschitz norm $\|f\| := \|f\|_\infty + D_f$, where $D_f$ is the Lipschitz constant of $f$. With some conditions on the characteristic operator $P_t$, we have the almost sure central limit theorem for the local time of $S_n$.
Here, we prove that an almost sure weak limit theorem holds for local times of stationary processes when the local limit theorem holds in a stronger form than Definition 3.4.1.
Definition 4.2.1. An integer-valued stationary process $\{X_n\}_{n \ge 1}$ is said to satisfy the $L^{\infty}$ conditional local limit theorem at 0 if there exists a sequence $g_n \in \mathbb{R}$ of real constants such that
\[
\lim_{n \to \infty} g_n = g(0) > 0
\]
and
\[
\| B_n P_T^n(\mathbf{1}_{\{S_n = x\}}) - g_n \|_\infty
\]
decreases exponentially fast.
This condition is stronger than condition (3.7): the convergence holds in $L^{\infty}(P)$ and is exponentially fast.
Theorem 4.2.2 (Almost sure central limit theorem for the local times). Let $\{X_n\}_{n \in \mathbb{N}}$ be an integer-valued stationary process satisfying the local limit theorem at 0 in Definition 4.2.1 with $B_n = n^{\beta} L(n)$, where the slowly varying function $L(n)$ converges to $c > 0$.
Moreover, assume that the following two conditions are satisfied: for some constants $C, K > 0$ and $\delta > 0$, for all bounded Lipschitz continuous functions $g, F \in C_b(\mathbb{R})$, and for all $x \in \mathbb{Z}$,
\[
\operatorname{Cov}\bigl(g(\ell_k),\, F \circ T^{2k}\bigr) \le C (\log\log k)^{-1-\delta} \tag{4.1}
\]
and
\[
\sum_{n=1}^{\infty} \bigl| E\bigl(\mathbf{1}_{\{S_n = x\}} - \mathbf{1}_{\{S_n = 0\}}\bigr) \bigr| \le K\bigl(1 + |x|^{\frac{\alpha}{1-\alpha}}\bigr), \tag{4.2}
\]
where $\alpha = 1 - \beta$. Then
\[
\lim_{N \to \infty} \frac{1}{\log N} \sum_{k=1}^{N} \frac{1}{k}\, \mathbf{1}_{\{\ell_k / a_k \le x\}} = M(x) \quad \text{a.s.} \tag{4.3}
\]
is equivalent to
\[
\lim_{N \to \infty} \frac{1}{\log N} \sum_{k=1}^{N} \frac{1}{k}\, P\Bigl(\frac{\ell_k}{a_k} \le x\Bigr) = M(x), \tag{4.4}
\]
where $M(x)$ is a cumulative distribution function.

Corollary 4.2.3 (Gibbs-Markov maps). The almost sure central limit theorem for the local times holds under the same setting as Corollary 3.4.9. This is because the conditional local limit theorem in the sense of Definition 4.2.1 holds and the assumptions on the transfer operator in Section 4.4 are satisfied.

4.3 Proof of almost sure central limit theorem (ASCLT)

4.3.1 Proof of Theorem 4.2.2

In this section, we show that the local time $\ell_n$ of $\{X_n\}$ satisfies an almost sure weak convergence theorem under the assumptions of Theorem 4.2.2. The following proposition is used in the proof of Theorem 4.2.2; we state it here and prove it in Section 4.3.2.

Proposition 4.3.1.
\[
\operatorname{Var}\Bigl( \frac{1}{\log N} \sum_{k=1}^{N} \frac{1}{k}\, g\Bigl(\frac{\ell_k}{a_k}\Bigr) \Bigr) = O\bigl((\log\log N)^{-1-\delta}\bigr)
\]
for some $\delta > 0$, as $N \to \infty$, where $g$ is any bounded Lipschitz function with Lipschitz constant 1.

Once Proposition 4.3.1 is granted, Theorem 4.2.2 follows by standard arguments, which we sketch briefly.

Proof of Theorem 4.2.2. By the dominated convergence theorem, taking expectations in (4.3) yields (4.4). To prove the other direction, it is sufficient to prove that (see e.g. Lacey and Philipp, 1990 [39])
\[
\lim_{N \to \infty} \frac{1}{\log N} \sum_{k=1}^{N} \frac{1}{k}\, \xi_k = 0 \quad \text{a.s.} \tag{4.5}
\]
for any bounded Lipschitz continuous function $g$ with Lipschitz constant 1, where
\[
\xi_k := g\Bigl(\frac{\ell_k}{a_k}\Bigr) - E\Bigl[ g\Bigl(\frac{\ell_k}{a_k}\Bigr) \Bigr].
\]
Take $N_i = \exp\exp i^{\epsilon}$ for any $\epsilon > \frac{1}{1+\delta}$. Since $\log\log N_i = i^{\epsilon}$, Proposition 4.3.1 bounds the $i$-th term below by $O(i^{-\epsilon(1+\delta)})$ with $\epsilon(1+\delta) > 1$, and hence
\[
\sum_{i=1}^{\infty} \frac{1}{\log^2 N_i}\, E\Bigl( \sum_{k=1}^{N_i} \frac{1}{k}\, \xi_k \Bigr)^2 < \infty. \tag{4.6}
\]
By the Borel-Cantelli lemma,
\[
\lim_{i \to \infty} \frac{1}{\log N_i} \sum_{k=1}^{N_i} \frac{1}{k}\, \xi_k = 0 \quad \text{a.s.} \tag{4.7}
\]
For any $N$, there exists $k$ such that $N_k \le N < N_{k+1}$, and we have
\[
\begin{aligned}
\Bigl| \frac{1}{\log N} \sum_{j=1}^{N} \frac{1}{j}\, \xi_j \Bigr|
&\le \frac{1}{\log N_k} \Bigl( \Bigl| \sum_{j=1}^{N_k} \frac{1}{j}\, \xi_j \Bigr| + \sum_{j=N_k+1}^{N_{k+1}} \frac{1}{j}\, |\xi_j| \Bigr) \\
&\le \frac{1}{\log N_k} \Bigl| \sum_{j=1}^{N_k} \frac{1}{j}\, \xi_j \Bigr| + \frac{C}{\log N_k} \bigl( \log N_{k+1} - \log N_k \bigr) \\
&\to 0 \quad \text{as } k \to \infty, \text{ a.s.}
\end{aligned}
\]
The last step holds because $(1+k)^{\epsilon} - k^{\epsilon} \to 0$ as $k \to \infty$ for any $\epsilon < 1$, so that
\[
\frac{\log N_{k+1}}{\log N_k} = e^{(1+k)^{\epsilon} - k^{\epsilon}} \to 1 \quad \text{as } k \to \infty.
\]
Hence (4.5) holds and the proof is complete.

4.3.2 Proof of Proposition 4.3.1

Proposition 4.3.1 follows from the following two lemmas.

Lemma 4.3.2. Suppose $\{X_n\}$ satisfies the conditional local limit theorem in the sense of Definition 4.2.1. Then $E[\ell_n] = O(a_n)$, where $a_n = g(0) \sum_{i=1}^{n} \frac{1}{B_i}$.

Proof. Since the convergence in the conditional local limit theorem is in the sense of Definition 4.2.1, $B_n P(S_n = 0) = B_n E\bigl(P(S_n = 0 \mid T^n)\bigr) \to g(0)$ as $n \to \infty$. So
\[
E(\ell_n) = \sum_{i=1}^{n} P(S_i = 0) \sim a_n = g(0) \sum_{i=1}^{n} \frac{1}{B_i}.
\]

Lemma 4.3.3. When $j > 2k$ and $k, j \to \infty$,
\[
E\bigl( \ell_j - \ell(j, S_{2k}) - \ell_{2k} + \ell(2k, S_{2k}) \bigr)^2 = O\bigl( a_j\, E\bigl[ |S_{2k}|^{\frac{\alpha}{1-\alpha}} \bigr] \bigr), \tag{4.8}
\]
where $\alpha \in (0, \frac{1}{2}]$ and $a_n = \sum_{i=1}^{n} \frac{g(0)}{B_i}$.

Remark 4.3.4. In the i.i.d. case, it is known from Kesten and Spitzer (1979) [34] that
\[
E\bigl( \ell(n, x) - \ell(n, y) \bigr)^2 \le C |x - y|^{\frac{\alpha}{1-\alpha}}\, n^{\alpha}
\]
when $X_n$ is in the domain of attraction of a stable law of order $d = \frac{1}{1-\alpha}$.

Proof.