The Pennsylvania State University The Graduate School Eberly College of Science

STUDIES ON THE LOCAL TIMES OF DISCRETE-TIME

STOCHASTIC PROCESSES

A Dissertation in by Xiaofei Zheng

c 2017 Xiaofei Zheng

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

August 2017 The dissertation of Xiaofei Zheng was reviewed and approved∗ by the following:

Manfred Denker Professor of Mathematics Dissertation Adviser Chair of Committee

Alexei Novikov Professor of Mathematics

Anna Mazzucato Professor of Mathematics

Zhibiao Zhao Associate Professor of

Svetlana Katok Professor of Mathematics Director of Graduate Studies

∗Signatures are on file in the Graduate School. Abstract

This dissertation investigates the limit behaviors of the local times `(n, x) of the n Pn 1 partial sum {Sn} of stationary processes {φ ◦ T }: `(n, x) = i=1 {Si=x}. Under the conditional local limit theorem assumption:

n kn BnP (Sn = kn|T (·) = ω) → g(κ) if → κ, P − a.s., Bn we show that the limiting distribution of the local time is the Mittag-Leffler distri- bution when the state space of the is Z. The method is from the infinite ergodic theory of dynamic systems. We also prove that the discrete-time fractional Brownian motion (dfBm) admits a conditional local limit theorem and the local time of dfBm is closely related to but different from the Mittag-Leffler dis- tribution. We also prove that the local time of certain stationary processes satisfies an almost sure (ASCLT) under the additional assumption that the characteristic operator has a spectral gap.

iii Table of Contents

Acknowledgments vi

Chapter 1 Introduction and Overview 1 1.0.1 Brownian local time ...... 1 1.0.2 Local time of discrete-time stochastic processes ...... 4 1.0.3 Connection between the Brownian local time and the local times of discrete-time processes ...... 9

Chapter 2 Local Limit Theorems 11 2.1 Motivation ...... 11 2.2 Local limit theorems for independent and identically distributed random variables ...... 12 2.3 Local limit theorems for Markov chains ...... 13 2.4 Conditional local limit theorems for stationary processes ...... 14 2.5 Conditional local limit theorem for discrete-time fractional Brown- ian motion ...... 19 2.5.1 Proof of the conditional local limit theorem ...... 20 2.5.2 Estimate of the ...... 24 2.5.3 Estimate of the ...... 31

Chapter 3 Limiting Distributions of Local Times 33 3.1 Local times of random walks ...... 33 3.2 Occupation times of Markov chains ...... 38 3.3 Ergodic sums of infinite measure preserving transformation . . . . . 44

iv 3.4 Asymptotic distribution of the local times `n of stationary processes with conditional local limit theorems ...... 47 3.5 Limit theorems of local times of discrete-time fractional Brownian motion ...... 59 3.5.1 Occupation times of discrete-time fractional Brownian motions 60 3.5.2 Occupation times of continuous fractional Brownian motions 63

Chapter 4 Almost Sure Central Limit Theorems 65 4.1 Almost sure central limit theorems for local times of random walks 65 4.2 Almost sure central limit theorem for stationary processes . . . . . 67 4.3 Proof of almost sure central limit theorem (ASCLT) ...... 69 4.3.1 Proof of Theorem 4.2.2 ...... 69 4.3.2 Proof of Proposition 4.3.1 ...... 70 4.4 Transfer operators ...... 74 4.5 Bounds of local times of stationary processes ...... 78

Chapter 5 Conclusion and Open Questions 80

Bibliography 82

v Acknowledgments

Over the past five years I have received support and encouragement from a great number of individuals. I must first thank my adviser, Professor Manfred Denker, for his continuous guidance, endless encouragement and generous help during my graduate study and research. This dissertation could not have been finished without his advices and support. His guidance and friendship have made my graduate study a thoughtful and rewarding journey. I would also like to thank my dissertation committee members: Alexei Novikov, Anna Mazzucato and Zhibao Zhao for generously offering their time, insightful comments and support. I learned fractional Brownian motion from Professor Novikov and benefited a lot from his precious guidance and endless patience. I own my thanks to Professor Mazzucato for her invaluable advices as my mentor when I first came to Penn State. I am grateful to Professor Zhao for his valuable time. I offer my thanks to Professor Svetlana Katok for providing me the oppor- tunity to study in the Ph.D. program. I also thank the staffs of the Department of Mathematics for their kindly assistance. I am most grateful and indebted to my parents and the rest of my family for their unconditional love. Lastly, I must thank Changguang Dong for his unwavering love, patience and support. There are certainly many others who deserve mentioning to whom I offer a simple message: Thank You!

vi Chapter 1

Introduction and Overview

The local time of a continuous-time process is a associated with an underlying stochastic process such as a Brownian motion, a Markov process, a diffusion process and so on, that characterizes the amount of time a particle has spent at a given level. It provides a very fine description of the sample paths of the underlying process. While the local time of a discrete-time process measures how often a state is visited and it is a refinement of the notion of recurrence. The local times in these two cases show many similar properties [2] and are closely connected by the invariance principle.

1.0.1 Brownian local time

The notion of local time of a Brownian motion was first introduced by Le´vyin 1948 [40]. His contribution to the deep properties of the local time of Brownian motions laid the foundation of the theory of local times of stochastic processes. And later the theory was further developed by Trotter, Knight, Ray, Itˆo,and McKean, etc. Let {W (s), s ≥ 0} be a 1-dimensional Brownian motion. The occupation time R t 1 of a set A ⊂ R at time t is defined to be µt(A, ω) = 0 {W (s,ω)∈A}ds, which is a random measure on (R, B(R)). Le´vy(1948, [40]) proved that for almost all ω, for any t ≥ 0, µt is absolutely continuous with respect to the Lebesgue measure, so the

Radon-Nikodym derivative Lt,ω exists and is Lebesgue-almost everywhere unique: Z µt(A, ω) = Lt,ω(x)dx. A 2

Trotter (1958, [61]) proved that for almost all ω ∈ Ω, there exists a function L(t, x, ω) that is continuous in (t, x) ∈ [0, ∞) × R, such that Z µt(A, ω) = L(t, x, ω)dx. A

From now on, we use L(t, x), the jointed continuous version of the Brownian local times and we make the following remarks on the notations:

1. {L(t, x)}t≥0,x∈R is called the Brownian local time.

2. Fix x, {L(t, x)}t≥0 is called the Brownian local time at level x.

3. Fix t, {L(t, x)}x∈R is the Brownian local time at time t.

4. Fix x and t, L(t, x) is a random variable.

5. When x = 0, we use L(t) to denote L(t, 0) for short.

So L(t, x) can be studied as a function of t or of x. Interesting questions of L(t, x) such as the exact distribution, the limiting distribution as t goes to ∞ when x is fixed, the fluctuation of the Brownian local time are well studied. We recall some striking results related to the problems we are to deal with in this dissertation. As a function of t, the distribution of the Brownian local time at the level 0 is given by the following theorem.

Theorem 1.0.1 (L´evyidentity, 1948). The processes {(|W (t)|,L(t, 0)) : t ≥ 0} and {(M(t) − W (t),M(t)) : t ≥ 0} have the same distribution, where M(t) = max{W (s): s ∈ [0, t]}.

L´evy(1948) proved the theorem by showing that {M(t) − B(t): t ≥ 0} is a reflected Brownian motion. In [46], the theorem is proved by first defining the Brownian local time through the number of downcrossings of a Brownian motion and then using the embedded random walks into Brownian motions. The local time from this definition can be proved to be the density of the occupation measure. The importance of the concept Brownian local time also lies in its deep con- nection with Itˆo’sformula. 3

Theorem 1.0.2 (Tanaka’s formula).

Z t + 1 W (t) = 1{W (s)>0}dW (s) + L(t) 0 2 and Z t |W (t)| = sgn(W (s))dW (s) + L(t), 0 where sgn denotes the sign function

 +1, x > 0 sgn(x) = −1, x ≤ 0.

It is still true when W (t) is replaced by a continuous semimartingale. Tanaka’s formula is the explicit Doob-Meyer decomposition of the submartingale |W (t)| into the martingale part and a continuous increasing process (local time). Tanaka’s formula can be generalized by Ito-Tanaka’s formula, which is also an extension of Itˆo’sformula to convex functions.

Theorem 1.0.3 (Itˆo-Tanaka’s formula). If f is the difference of two convex func- tions, then

Z t Z 0 1 00 f(W (t)) = f(W (0)) + f−(W (s))dW (s) + L(t, x)f (dx). 0 2 R

Recall that if f is convex, its second derivative f 00 in the sense of distributions is a positive measure. Itˆo-Tanaka’s formula holds for any continuous semimartingale. Similar to Brownian motion, the Brownian local time’s magnitude of the fluc- tuations can also be described by the law of the iterated logarithm (LIL). ˆ Theorem 1.0.4 (Kesten, 1965 [35]). Let L(t) = supx∈R L(t, x), then almost surely, for any given x,

Lˆ(t) L(t, x) lim sup 1/2 = lim sup 1/2 = 1 t→∞ (2t log log t) t→∞ (2t log log t) and almost surely, log log t1/2 lim inf Lˆ(t) = γ > 0. t→∞ t 4

The local times of random walks have similar results, but in Kesten’s words, it is “not a mere ‘translation’ to the discrete case”. In this dissertation, we study the law of the iterated logarithm for the local times of certain stationary processes, which will be discussed in Chapter 4. Revuz and Yor [55] and the survey article Borodin [15] are good references to the Brownian local times. Local times of Markov processes, Gaussian Processes, L´evy Processes and others are also studied. Since this article focuses on the discrete- time processes, we won’t go further. More information on local times of continuous stochastic processes can be found in the survey [27].

1.0.2 Local time of discrete-time stochastic processes

A similar question is how to define the local time of a discrete-time process. Since the time space is discrete, we can only measure how many times the process is in some subset of the state space. That is, for a discrete-time process {Sn}, the occupation time λ(n, A) of a subset A of the state space S at time n is defined to be the total number of the visits of Si to the subset A during the first n transitions:

λ(n, A) = #{i ≤ n : Si ∈ A}. Local time can be interpreted as the density of the occupation time measure with respect to the counting measure (when S = Z) or to the Lebesgue measure (when S = R). It is defined as `(n, x) = #{i ≤ n : Si = x} R for x ∈ S, and hence, λ(n, A) = A `(n, x)dm(x), where m(·) is the counting measure or the Lebesgue measure. To our knowledge, the earliest result on the discrete-time process is Chung and Hunt’s work in 1949 [21]. They studied the zeros at time n of simple random walks

Sn: `n = `(n, 0) := #{i ≤ n : Si = 0} and showed the limiting behaviors of the sequence {`n}n≥1. The exact and the limiting distributions of `n are also studied by R´enyi in 1970 [52]. Aleˇskeviˇcien˙e [7] [9] studied the asymptotic distribution and the moments of local times of aperiodic recurrent random walks. For the case when the symmetric random walk is of dimension 2, Erd˝osand Taylor [26] studied the aysmpototic distribution of the local time and from that they obtained strong laws analogous to the law of the iterated logarithm. Another important underlying stochastic process is the Markov process. In 1957, Darling and Kac [24] established some limit theorems for the occupation 5

times for homogeneous Markov processes Xt with stationary transition probabili- ties. For each s > 0, let ps(x, E) be the Laplace transformation of the transition probability P (x, E; t) of the Markov process {Xt}, and V (x) is a non-negative function over the state space. Suppose they satisfy the Darling and Kac con- dition: there exists a function h(s) → ∞, s → 0, and a positive constant C such that Z p (x, dy) s V (y) → C, s → 0, (1.1) h(s) or equivalently, Z ∞ −su 1 Ex[ e V (Xu)du] ∼ Ch(s), s → 0, 0 the convergence is uniform in the support of V . In addition, if h is regularly varying with index α ∈ [0, 1) as s → 0, then the limiting distribution of 1 Z t V (Xu)du Ch(1/t) 0 is the Mittag-Leffler distribution with index α. The result essentially exhausts all possibilities, in view of the following converse.

If {Xt} and V (x) meet the “Darling and Kac condition”, and if in addition for some scaling function u(t) > 0,

1 lim P ( < x) = G(x), (1.2) t→∞ R t u(t) 0 V (Xs)ds where G is a non-degenerate distribution function, then h is a regularly varying function with some index α. It follows that G is the distribution function of the Mittag-Leffler distribution. In the proof, the elementary Tauberian theorem of Karamata plays an important role. Darling and Kac’s theory is applicable to Markov chains by replacing the Laplace transform with generating functions. In particular, it works for random walks, the sum of independent, identically distributed random variables {Xn} with common distribution function F . Xn may take lattice distribution or non-lattice distribution. When F is of mean 0 and in the domain of attraction of some stable law with index d, denoted as F ∈ D0(d)(d ∈ (1, 2]), it is known that Sn, the

1 an In the whole dissertation, an ∼ bn → 1, n → ∞. bn 6

partial sum of {Xn} has the local limit theorem [31] [59]. It leads to the important “Darling-Kac condition” of the i.i.d. case:

n X 1− 1 P (Sk ∈ A) ∼ |A|n d L(n), as n → ∞, (1.3) k=0 where L(n) is a slowly varying function. By Darling and Kac [24], it implies that the normalized occupation time of Sn converges to the Mittag-Leffler distribution.

Therefore, in the i.i.d. case, the central limit theorem of Sn implies the central limit theorem of the occupation time of Sn. Independently, Kesten [36] and Bretagnolle and Dacunha-Castelle [16] made a conjecture that the converse is also true: (1.3) implies that F ∈ D0(d). Kesten [36] proved it when Sn is a symmetric random walk, and Bingham and Hawkes [13] extent it to symmetric zero-mean L´evyprocesses, left-continuous random walks and completely asymmetric L´evyprocesses. The Darling-Kac theorem can also be extended from Markov processes to the sums of weakly dependent random variables, for example, the R´ev´esz-dependent sequence [23] by Cs¨orgo. In our work, instead of random walks or Markov processes, we consider when the “Darling-Kac” theorem holds for local times for the random variables with dependent structures and what the limiting distributions of the local times are.

One class of dependent random variables is the stationary process {Xn} with a “conditional local limit theorem”. The “Darling-Kac condition” is a consequence of the “conditional local limit theorem”. In Chapter 2, we have a review on local limit theorem and then introduce the “conditional local limit theorem”. As an example, we will prove that the discrete-time fractional Brownian motion has a conditional local limit theorem. In Chapter 3, we shall show that under the assumption of the “conditional local limit theorem”, the normalized local time of the partial sum {Sn} of {Xn} converges to the Mittag-Leffler distribution when the state space of the discrete- time processes {Sn} is Z. When the state space is R, the limiting distribution is not necessarily to be the Mittag-Leffler distribution, but it is closely related to it. The proof is in the framework of the infinite ergodic theory. Occupation times can be represented as a partial sum of iterative transformations in a dynamical system (X, B, µ, S). When the dynamical systems is pointwise dual ergodic, there 7 exists a “Darling-Kac theorem” in the infinite ergodic dynamical system[2] by Aaronson, which is like the “Birkhoff theorem” in finite space. The dynamical system (X, B, µ, S) in which the occupation time is defined can be decomposed into pointwise dual ergodic system and then Aaronson’s Darling-Kac theorem can be applied in each component. The idea of the proof of the Aaronson’s Darling-Kac theorem in infinite dynamical system is to estimate the Laplace transform of all moments of the partial sum of iterative transformations and apply the Karamata’s Tauberian theorem to recover the moments. The “Darling-Kac condition” and hence the conditional local limit theorem in Chapter 2 is the key point in constructing a pointwise dual ergodic dynamical system. In i.i.d. case, the conditional local limit theorem can be deduced to be the local limit theorem. The conditional local limit theorems are widely satisfied, for example Denker and Aaronson’s work on stationary processes in the domain of attraction of a normal distribution [4] and partial sums of stationary sequences generated by Gibbs-Markov maps [5]. H Since the discrete-time fractional Brownian motion Bn also has the conditional local limit theorem when the Hurst index satisfies H ∈ (3/4, 1), it enables us to use the infinite ergodic theory to study the limiting distribution of the following variables: n X H V (Bi ) as n → ∞, i=1 where V is a non-negative function over R. If V (x) is the characteristic function of Pn H H some set B ⊂ R, then i=1 V (Bi ) becomes the occupation time of Bn of the set H Pn H B: λ(B, n) = #{i ≤ n : Bi ∈ B}. The limiting distribution of i=1 V (Bi ) after normalized is closely related to Mittag-Leffler distribution but not Mittag-Leffler distribution. The theory of the infinity ergodic theory has many applications in the area of probability. In Chapter 4, we use it to explore another property of the local times of stationary processes with conditional local limit theorems: the almost sure central limit theorem (ASCLT). Almost sure central limit theory was discovered firstly by Brosamler [20] and Schatte [57] independently. The simplest form of it says that a sequence of independent identically distributed random variables {Xk} 8

2 with moments EX1 = 0 and EX1 = 1 obeys

n 1 X 1 lim 1 S = Φ(x) a.s. { √k ≤x} n→∞ log n k k k=1 for each x. Here 1{·} denotes the indicator function of events, Φ is the distribu- tion function of the standard normal distribution. In the past, ASCLT has been obtained for several classes of independent and dependent random variables. We list some of them. Lacey, Michael T and Philipp, Walter [39] gave a new proof of the ASCLT based on an almost sure invariance principle and extended ASCLT to weakly dependent variables: For any sequence of random variables {Xn}, if its partial sum can be approximated almost surely by i.i.d. normal random variables 1 1 {Y }, then P 1 converges weakly to standard Brownian motion n log n k≤n k {sk(·,ω)} on [0, 1], where

 −1/2 n Sk if t = k/n, k = 0, 1, ··· , n sn(t, ω) = linear in between.

When {Xn} are independent, not necessary identically distributed random vari- ables, Berkes, Istv´anand Dehling, Herold [10] gave necessary and sufficient criteria for the generalized ASCLT and its functional version under mild growth conditions on the partial sum Sn. Peligrad, Magda and Shao, Qi-Man [48] obtained the AS- CLT for stationary associated sequence, strongly mixing and ρ-mixing sequences under the same conditions that assure the usual central limit theorem. G. Khurel- baatar [28] proved ASCLT for strongly mixing sequence of random variables with a slower mixing rate than [48]. And he also showed that ASCLT holds for an associated sequence without a stationary assumption. In [11], Berkes, Istv´anand Cs´aki,Endre show that not only the central limit theorem, but every “weak” limit theorem for independent random variables has an analogous almost sure version. However, the study of the almost sure central limit theorem for local times is very little. We only know that for aperiodic integer-valued random walks, Berkes,

Istv´anand Cs´aki[11], established an almost sure central limit theorem when {Xn} are i.i.d. and its law is in the domain of attraction of a stable law of order d ∈ (1, 2]. In Chapter 4, we shall prove that an almost sure central limit theorem holds for 9 local times of stochastic processes whose Perron-Frobenius operators have spectral gaps, which include a different class of processes. At the end of Chapter 4, we also review the bounds and deviations of local times. The bounds of local times of random walks are studied by Chung and Hunt [21]. They have the similar form as that of Brownian local time. We extend the result to the local time of stochastic processes with “conditional local limit theorems” by infinite ergodic theory.

1.0.3 Connection between the Brownian local time and the local times of discrete-time processes

In 1980s, an extensive literature on the invariance principles of local times ap- peared. It is well known that a random walk is the discrete version of Brownian motion. In the light of the invariance principle, it is not a surprise that the local time of Brownian motion and the “local time” of random walks should be close to each other. Thanks to the Skorohod embedding scheme, it can be shown that Brownian local time at zero has the same distribution as the maximum process

{Mt} of a standard linear Brownian motion (Theorem 1.0.1). And {Mt} has the 1 same distribution as the Mittag-Leffler distribution with index 2 . Recall that Mittag-Leffler distribution is the limiting distribution of the local times of random walks and Markov Chains. R´ev´eszproved that Brownian motion is near to its Skorohod embedding random walk, so are their local times at the same time.

Theorem 1.0.5 (R´ev´esz,1981, [53]). Let {W (t)} be a Brownian motion defined on a probability space {Ω, F,P }. Then on the same probability space, one can define a sequence X1,X2... of i.i.d. r.v.’s with P (Xi = 1) = P (Xi = −1) = 1/2 such that for any  > 0

lim n−1/4− sup |`(n, x) − L(n, x)| = 0, a.s. n→∞ x∈N and simultaneously −1/4− lim n |Sn − W (n)| = 0, a.s. n→∞ For more results on the invariance principles of the local times of random 10 walks, one can refer to Bass and Khoshnevisan(1993, 1995), Borodin(1986, 1988), Csaki and Revesz(1983), Csorgo and Revesz(1984,1985,1986), Jacod(1998), Khosh- nevisan(1992,1993). Recently, Bromberg and Kosloff [19] proved a functional weak invariance prin- ciple for the local times of partial sums of Markov Chains with finite state space S ∈ Z under the assumption of strong aperiodicity of the Markov Chain. The limiting distribution is the Brownian local time. Bromberg [17] [18] extent it to

Gibbs-Markov processes and random variables {Xn} which are generated by a dy- namical system (X, B, m, T ) with a quasi-compact transfer operator. It is close to the method used in this dissertation. Chapter 2

Local Limit Theorems

2.1 Motivation

The limiting distributions of local times depend on the properties of the underlying discrete-time processes, one of which is the local limit theorem. For example, the Markov process, as we mentioned in the Introduction, under the local limit theorem, the “Darling-Kac condition” can be implied and hence the normalized local times of Markov processes have Mittag-Leffler distributions as their limiting distributions [24]. For general stationary processes, the local limit theorem or the conditional local limit theorem is also the key element in determining the limiting distributions of the local times. So to prepare us well for finding the limiting distributions of the local times in the next chapter, we shall review and study the local limit theorems of the discrete-time processes that we are interested in in this chapter. Before moving on, we make some remarks on the term “local”: in the context of probabilistic limit theorems, the term “local” means the convergence of densities and “global” means the convergence of distribution functions. The term “local limit theorem” is used when the probability mass functions or the probability density functions are approximated by density functions. 12

2.2 Local limit theorems for independent and identically distributed random variables

Suppose independent and identical distributed random variables {Xn} have a lat- tice distribution, taking values in the arithmetic progression {a + kh : k ∈ Z}, then its partial sum Sn only takes values in the set {an + kh : k ∈ Z}. The local limit theorem is an asymptotic expression for the probability mass function

P (Sn = an + kh) as n → ∞.

Theorem 2.2.1 (Local limit theorem for mass functions, [31]). In order that for some constants An and Bn > 0,

Bn an + kh − An lim sup P(Sn = an + kh) − g( ) = 0, n→∞ k h Bn where g is the density function of some stable distribution G with exponent α ∈ (0, 2], it is necessary and sufficient that

1. the common distribution function F of {Xj} should belong to the domain of attraction of G, i.e. Sn−An converges in distribution to a stable law with Bn distribution function G, and

2. the interval h be maximal. It means that no matter what the choice of b ∈ R

and h1 > h, it is impossible to represent all possible values of X in the form

of b + h1k.

Remark 2.2.2. Local limit theorems provide more information than central limit theorems. The simplest example of the local limit theorem is the De Moivre-Laplace theorem, from which the central limit theorem can be implied.

Another kind of local limit theorem is the approximation to the density func- tions. Suppose {Xn} belongs to the domain of attraction of a stable law with density function g, and the density function p of Sn−An exists for all n ≥ N, the n Bn local limit theorem is under which pn converges to g. The answer is given by the following theorem. 13

Theorem 2.2.3 (Local limit theorem for density functions [31]). In order that for some choice of the constants An and Bn,

lim sup |pn(x) − g(x)| = 0, n→∞ x where g is the density of some stable distribution with exponent α ∈ (0, 2], it is necessary and sufficient that

1. the common distribution function F of {Xn} should belong to the domain of attraction of distribution G with density function g, and

2. there exists N with supx pN (x) < ∞.

2.3 Local limit theorems for Markov chains

The local limit theorems are also known for certain classes of stationary stochastic sequences which are related to Markov chains. Kolmogorov [37] obtained the local limits for Markov chains with finite state space using the methods developed by Markov [43] and Doeblin [25]. Nagaev [47] showed local limits for Markov chains with infinite state space in the normal case using perturbation theory of charac- teristic operators. Aleˇskeviˇcien˙e[6] studied the local limits for certain stationary Markov chains in the stable case. S´eva, Marc [58] showed that the local limit theorem holds for certain non-uniformly ergodic Markov chains with state space N = {0, 1, 2, ···}. Let p(x, A) be the stochastic transition function of a Markov chain X(n) with (n) countable state space S = {ξ1, ξ2, ···} and px,y = P (X(i + n) = y|X(i) = x). Suppose sup |p(x, A) − p(y, A)| = δ < 1, (2.1) x,y,A P∞ which implies infi,j k=1 min(pik, pjk) > 0. Suppose in addition all states in S are essential and constitute a positive class.

Let f(ξn) = a + knh, where kn ∈ Z, a ∈ R and h > 0.

Theorem 2.3.1 (Local limit theorem for Markov Chain [47] ). If the greatest com- P 2 mon factor of the kn is 1, i f (ξi)pi < ∞, σ > 0, where pi are final probabilities, 14 and n X Pπn(s) = P r( f(X(i)) = an + sh), i=1 then √  2  σ n 1 − zns lim Pπn(s) − √ e 2 = 0 n→∞ h 2π where πi = π(ξi) is the initial distribution and

∞ √ X σ nzns = a(n + 1) + sh − (n + 1) f(ξi)pi. i=1

In [6], Aleˇskeviˇcien˙econsidered the case with stable law of stationary Markov chains. {Xn} is a homogeneous Markov chain with an arbitrary states set S, F is a

σ-algebra on it. Denote Yi = f(Xi), then {Yi} are identically distributed. Denote Pn P1(k) = P (Yn = k), n = 1, 2, ··· . Let Sn = i=1 Xi and Pn(k) = P (Sn = k),

Fn(x) = P (Sn < x) and F (x) = F1(x).

Theorem 2.3.2 (Local limit theorem for Markov chain [6] ). Suppose F (x) belongs to the domain of attraction of the stable law Vα with density function vα and an integral limit theorem holds for {Xn}, i.e.

Fn(Bnx + An) → Vα(x).

Then in order that k − An BnPn(k) − vα( ) → 0 Bn holds uniformly with respect to k, it is necessary and sufficient that the greatest common divisor of the differences k1 − k2 is 1 when P1(k1) and P1(k2) are positive.

2.4 Conditional local limit theorems for station- ary processes

In the normal case for stationary sequences generated by Lasota-Yorke maps of the interval and functions of bounded variation, a local limit theorem exists. It was proved by a spectral decomposition method by Rousseau-Egele [56] and Morita, 15

Takehiko [44] [45]. For ergodic sums generated by Lipschitz continuous functions and Markov kernels on a compact metric space, Guivarc’h, Y. and Hardy, J. [30] also derived a local limit theorem by means of the spectral decomposition. Aaron- son and Denker [5] showed the local limits for stationary sequences generated by mixing Gibbs-Markov maps with aperiodic, Lipschitz continuous functions. In their work, the local limit theorems are established with respect to the sequence of conditional measures on the fibers of T n given by the Perron-Frobenius operators, where T is the measure preserving transition operator of the stationary process. The method is from the spectral theory of Perron-Frobenius operator and the perturbation theory.

Definition 2.4.1 (Markov map, [5]). Let (X, B, m, T ) be a nonsingular transfor- mation of a standard probability space. If there is a measurable partition α such that

• T (a) ∈ σ(α) mod m ∀a ∈ α,

• T|a is bijective, bi-measurable and bi-nonsingular for all a ∈ α, and

• σ({T −nα : n ≥ 0}) = B, then (X, B, m, T, α) is called a Markov map.

For n ≥ 1, there are m−nonsingular inverse branches of T denoted by

n−1 n n−1 _ −i νA : T A → A, A ∈ α0 =: T α i=0 with Radon-Nikodym derivatives

dm ◦ ν ν0 := A . A dm

t(x,y) We fix r ∈ (0, 1) and define the metric d = dr on X by d(x, y) = r where t(x, y) = min{n + 1 : T n(x),T n(y) belong to different atoms of α}.

Definition 2.4.2 (Gibbs-Markov). A Markov map (X, B, m, T, α) is called Gibbs- Markov if the following two additional assumptions are satisfied: 16

• infa∈α m(T (a)) > 0,

•∃ M > 0, such that 0 νA(x) 0 − 1 ≤ Md(x, y) νA(y) n−1 n for all n ≥ 1,A ∈ α0 , x, y ∈ T A.

Definition 2.4.3 (Perron-Frobenius operator). The Perron-Frobenius operator PT 1 1 is defined as PT : L (m) → L (m), Z Z PT f · g dm = f · (g ◦ T ) dm (2.2) X X for any g ∈ L∞(m).

An interpretation of the Perron-Frobenius operator is that it describes the evolution of probability densities under the action of T . That is, if f is the density of some probability measure ν with respect to m, then PT f is the density of the image measure ν ◦ T −1.

Theorem 2.4.4 (Distribution limit [5]). Let (X, B, m, T, α) be a mixing, probabil- ity preserving Gibbs-Markov map. Let

φ : X → R

P be Lipschitz continuous on each a ∈ α, with Dαφ := a∈α Daφ < ∞. Assume m-distribution G of φ is in the domain of attraction of a stable law with order 0 < p < 2, which is equivalent to that

p L1(x) := x (1 − G(x)) = (c1 + o(1))L(x),

p L2(x) := x G(−x) = (c2 + o(1))L(x) as x → ∞ where L is a slowly varying function on R+ and where c1, c2 ≥ 0, c1 + c2 > 0. Then Sn − An → Yp weakly, Bn 17

 0, 0 < p < 1,  p  where nL(Bn) = Bn and An = γn, 1 < p < 2,   2n γn + π (H1(Bn) − H2(Bn)), p = 1. Theorem 2.4.5 (Conditional lattice local limit theorem [5]). Suppose that φ : itφ f◦T X → Z is aperiodic, i.e. the only solution to the equation e = λ f with f : X → T measurable is t ∈ 2πZ, f ≡ 1, λ = 1. Let An,Bn be as in Theorem 2.4.4, and suppose that k ∈ , kn−An → κ ∈ as n → ∞, then n Z Bn R

lim ||BnPT n (1{S =k }) − fY (κ)||∞ = 0, n→∞ n n p

where fYp is the density function of the random variable Yp in Theorem 2.4.4. And in particular,

lim BnP (Sn = kn) = fY (κ). n→∞ p Theorem 2.4.6 (Conditional non-lattice local limit theorem [5]). Suppose that

φ : X → R is aperiodic, let An,Bn be as in Theorem 2.4.4, let I ⊂ R be an interval, and suppose that k ∈ , kn−An → κ ∈ as n → ∞, then n Z Bn R

lim BnPT n (1{S ∈k +I}) = |I|fY (κ), n→∞ n n p where |I| is the length of I and in particular,

lim BnP (Sn ∈ kn + I) = |I|fY (κ). n→∞ p

Another family of processes admitting conditional local limit theorem is the AFU map. A piecewise monotonic map of the interval is a triple (X, T, α) where X is an interval, α is a finite or countable generating partition ( Lebesgue measure) of X into open intervals and T : X → X is a map such that T |A is a continuous and strictly monotonic for each A ∈ α. A piecewise monotonic map of the interval with the following properties is called a AFU map [1]:

2 ¯ 00 0 2 A Adler’s condition: for all A ∈ α, T |A extends to a C map on A and T /(T ) is bounded on X.

F Finite images: {TA : A ∈ α} is finite. 18

U Uniform expansion: inf |T 0| > 1.

For every measurable f on interval X define its finite variation by VarX (f) = P sup i kf(xi)−f(xi−1)k where the supremum ranges over all partitions x1 < x2 <

··· < xn in X. Denote

_ ∗ ∗ f := inf{VarX (f ): f = f mλ − a.e.}, X where mλ is the Lebesgue measure.

Theorem 2.4.7 ([1]). Let (X, T, µ, α) be an AFU map and weakly mixing, which means for any functions f and g ∈ L2(X, µ),

N−1 Z Z Z 1 X n lim f ◦ T · gdµ − fdµ · gdµ = 0, N→∞ N n=0 X X X W and suppose that φ : X → R satisfies Cφ,α := supA∈α A φ < ∞.

2 1 2 1. If E[φ ] < ∞ and n Var(φn) → σ > 0, then

  Z b 2 φn − E(φ) 1 − t Pn,x √ ∈ (a, b) → √ e 2 dt σ n 2π a

n1 as n → ∞, uniformly in x ∈ X, where Pn,x(A) = PT A(x) and φn = Pn i−1 i=1 φ ◦ T .

2. If in addition φ : X → Z is aperiodic, then

2 √ 1 − t σ nPn,x(φn = kn) → √ e 2 2π

k −nE(φ) n √ as n → ∞, kn ∈ Z, σ n → t, uniformly in x ∈ X and t ∈ K for all k ⊂ R compact.

3. If in addition φ : X → R is aperiodic, and I ⊂ R is a finite interval, then

2 √ 1 − t σ nPn,x(φn ∈ kn + I) → √ e 2 |I| 2π 19

k −nE(φ) n √ as n → ∞, kn ∈ Z, σ n → t, uniformly in x ∈ X and t ∈ K for all k ⊂ R compact.

2.5 Conditional local limit theorem for discrete- time fractional Brownian motion

In this section, we show a conditional local limit theorem of discrete-time fractional Brownian motion. This result plays an important role in establishing the limiting distribution of the local time of discrete-time fractional Brownian motion in the next chapter. Fractional Brownian motion (fBm), also called fractal Brownian motion, was first introduced by Mandelbrot and van Ness (1968) [41]. It is a generalization of Brownian motion. Unlike classical Brownian motion, the increments of fBm need not be independent. Fractional Brownian motion is a continuous-time Gaussian process BH (t) on [0,T ], which starts at zero, has expectation zero for all t in [0,T ], and has the following covariance function:

1 E[B (t)B (s)] = (|t|2H + |s|2H − |t − s|2H ), H H 2 where H ∈ (0, 1) is called Hurst index associated with the fractional Brownian motion. If H = 1/2, then the process is in fact a Brownian motion. The increment process, Xt = BH (t+1)−BH (t), is known as fractional Gaussian noise (fGn). When

H > 1/2, the increments {Xn} has long- dependence property, that is, for a stationary sequence {Xn}, its auto covariance functions b(n) = Cov(Xk,Xk+n) satisfy

b(n) lim = 1 n→∞ cn−α for some constant c and α ∈ (0, 1). In this case, the dependence between Xk and P∞ Xk+n decays slowly as n tends to infinity and n=1 b(n) = ∞. We will prove a conditional local limit theorem for the discrete fractional Gaus- sian noise Xn when H > 3/4.

Theorem 2.5.1 (Conditional Local Limit Theorem). Suppose {Xn} is a sequence 20 of stationary Gaussian random variables in the probability space (Ω, F,P ) with 1 mean 0, and its covariance function b(i − j) = E(X X ) satisfies b(n) = [(n + i j 2 2H 2H 2H 1) − 2n + (n − 1) ], where 3/4 < H < 1. Denote by Sn the partial sum of Pn {Xn}: Sn = i=1 Xi. Suppose (a, b) ⊂ R is an interval. H Then there exists a normalization sequence {dn}, satisfying dn/n → K for qn some constant K as n → ∞. For any sequence {qn} and κ ∈ R, such that → κ dn as n → ∞, the conditional probability satisfies

  lim dnP Sn ∈ (qn + a, qn + b)|(Xn+1,Xn+2, ··· ) = (b − a)g(κ), a.s., (2.3) n→∞ where g is the density function of the standard normal distribution.

In the case that qn = 0, the convergence is uniformly for almost all ω and for any interval (a, b).

In the rest of the Chapter, we will give the proof of Theorem 2.5.1.

2.5.1 Proof of the conditional local limit theorem

In this part, we state claims which are the key points in the proof of Theorem 2.5.1 and provide the proof modulo these conditions.

Proof of Theorem 2.5.1. For fixed positive integer k, the conditional probability  P Sn ∈ (qn + a, qn + b)|(Xn+1,Xn+2, ...Xn+k) is given by a normal distribution. T Indeed, let the (n + k)-dimensional random variable X = (X1, ··· ,Xn+k) be " # X partitioned as [1] with sizes n and k respectively. The covariance matrix of X[2] " # Σ11(n, n)Σ12(n, k) X is denoted as Σ = , where Σ11 and Σ22 are symmetric Σ21(k, n)Σ22(k, k) Toeplitz matrices:

 b(0) b(1) b(2) . . . b(n − 1)    b(1) b(0) b(1) . . . b(n − 2) Σ11(n, n) =   ,  . . . . .   . . . . .  b(n − 1) b(n − 2) b(n − 3) . . . b(0) 21

and

 b(0) b(1) b(2) . . . b(k − 1)    b(1) b(0) b(1) . . . b(k − 2) Σ22(k, k) =   .  . . . . .   . . . . .  b(k − 1) b(k − 2) b(k − 3) . . . b(0)

And

 b(n) b(n + 1) b(n + 2) . . . b(n + k − 1) b(n − 1) b(n) b(n + 1) . . . b(n + k − 2) T   Σ12(n, k) = Σ (k, n) =   . 21  . . . . .   . . . . .  b(1) b(2) b(3) . . . b(k)

" # e(n) 0 Let D be a (k + 1) × (n + k) matrix, defined by D = , where 0 Ik T e(n) = (1, 1, ..., 1) and Ik is the identity matrix of size k. Then DX ∼ N (0,DΣD ), | {z } n i.e. " # " # S  e(n)Σ e(n)T e(n)Σ  n ∼ N 0, 11 12 . T X[2] Σ21e(n) Σ22

By the conditional normal formula (see for example [12], section 5.5), when Σ22 is of full rank, 2  (Sn|X[2]) ∼ N µ(n, k), σ (n, k) , where −1 µ(n, k) = e(n)Σ12(n, k)Σ22 (k, k)X[2] (2.4) and

2 T −1 T σ (n, k) = e(n)Σ11(n, n)e(n) − e(n)Σ12(n, k)Σ22 (k, k)Σ21(k, n)e(n) . (2.5)

R That is, P (Sn ∈ A|X[2]) = A f(y1|X[2])dy1, with

2 1 − (y1−µ(n,k)) 2σ2(n,k) f(y1|X[2]) = √ e . 2πσ(n, k) 22

For our convenience, we introduce a vector B:

T T B = (B(1),B(2), ··· ,B(k)) := Σ21(k, n)e(n) ,

Pn+s−1 i.e. B(s) = i=s b(i), s = 1, 2, ··· , k. Then the mean and the variance can be written as T −1 µ(n, k) = B Σ22 (k, k)X[2] and 2 T T −1 σ (n, k) = eΣ11(n, n)e − B Σ22 (k, k)B.

It follows that   σ(n, k)P Sn ∈ (qn + a, qn + b)|(Xn+1,Xn+2, ...Xn+k)

1 Z b+qn  (x − BT Σ−1(k, k)X )2  √ 22 [2] = exp − 2 dx 2π a+qn 2σ (n, k) Z (b+qn)/σ(n,k)   1 1 1 T −1 2 = σ(n, k)√ exp − (y − B Σ22 (k, k)X[2]) dy 2π (a+qn)/σ(n,k) 2 σ(n, k)   1 1 ξ + qn 1 T −1 2 = σ(n, k)√ (b − a)/σ(n, k) exp − ( − B Σ (k, k)X[2]) 2π 2 σ(n, k) σ(n, k) 22   1 1 ξ + qn 1 T −1 2 = √ (b − a) exp − ( − B Σ (k, k)X[2]) . 2π 2 σ(n, k) σ(n, k) 22

We used the mean value theorem in the second-last step and ξ ∈ [a, b]. Now we make two claims and they will be proved in section 2.5.2 and section 2.5.3 separately.

2 2 H Claim 1 For fixed n, let dn = limk→∞ σ (n, k). Then dn/n → K as n → ∞, where K is a constant. 1 Claim 2 For fixed n, lim BT Σ−1(k, k)X = 0 P-a.s. k→∞ σ(n, k) 22 [2] As a consequence,

ξ + q ξ q lim n = + n =: κ(n). k→∞ σ(n, k) dn dn

Since ξ ∈ [a, b] and qn → κ as n → ∞, lim κ(n) = κ. dn n→∞ 23

Hence   lim σ(n, k)P Sn ∈ (qn + a, qn + b)|(Xn+1,Xn+2, ...Xn+k) k→∞ 1 κ(n)2 = √ (b − a) exp(− ) = g(κ(n))(b − a) almost surely, 2π 2 where g is the density function of the standard normal random variable. On the other hand, almost surely,

  lim σ(n, k)P Sn ∈ (qn + a, qn + b)|(Xn+1,Xn+2, ..., Xn+k) k→∞   = dnP Sn ∈ (qn + a, qn + b)|(Xn+1,Xn+2, ...).

Hence almost surely,

  dnP Sn ∈ (qn + a, qn + b)|(Xn+1,Xn+2, ...) = (b − a)g(κ(n)).

It follows that   lim dnP Sn ∈ (qn + a, qn + b)|(Xn+1,Xn+2, ...) = g(κ)(b − a) almost surely. n→∞

In particular, when qn = 0, almost surely,   dnP Sn ∈ (a, b)|(Xn+1,Xn+2, ...) = (b − a)g(0).

So   lim dnP Sn ∈ (a, b)|(Xn+1,Xn+2, ...) = g(0)(b − a), n→∞ uniformly for almost all ω ∈ Ω and (a, b).

In the following two sections, we give the proof of the two claims. 24

2.5.2 Estimate of the variance

In this part, we prove Claim 1:

2 2 2H lim σ (n, k) = dn = n L(n). k→∞

2H−2 2 T Since b(n) ∼ n L(n), the first term of Σ (n, k) satisfies e(n)Σ11(n, n)e(n) ∼ n2H L(n), when n → ∞, where L(n) is slowly varying [60], it is sufficient to prove that the second term of σ2(n, k) converges to 0 as k → ∞, i.e.

T −1 lim B Σ22 (k, k)B = 0. k→∞

Pn+s−1 First we give an estimate of the elements B(s) = i=s b(i) of vector B.

Lemma 2.5.2. It holds that

2H  1  B(s) = n(s + n)2H−2 1 + O( ) , as s → ∞ 2 s

Pk 2 2 4H−3 and therefore i=1 B (i) = O(n (k + n) ) as k → ∞.

a P∞ a i a Proof. By Taylor expansion, (1 + x) = i=0 i x when |x| < 1, where i = a(a − 1) ··· (a − i + 1) . So by definition of B(s) and b(t), i!

2(sn)−2H B(s) = (s + n)2H − (s + n − 1)2H − s2H + (s − 1)2H  (sn)−2H  sn − s − n2H  sn − s − n + 12H = 1 − − 1 − sn sn  sn − s2H  sn − s + 12H − 1 − + 1 − sn sn ∞ X 2H = (−1)if(s, n, i)(sn)−i, (2.6) i i=2 where f(s, n, i) = (sn−s−n)i −(sn−s−n+1)i −(sn−s)i +(sn−s+1)i. Using the   Pi i i−j i−j binomial formula, we can rewrite f(s, n, i) = j=1 j (sn−s) −(sn−s−n) . 25

Since

i−j X i − j (sn − s)i−j − (sn − s − n)i−j = (sn − s − n)i−j−lnl l l=1   n  n = n(i − j)(sn − s − n)i−j−1 1 + O , as → 0, sn − s − n sn − s − n a straight forward calculation furthermore shows that

  n  f(s, n, i) = i(i − 1)n(sn − s − n)i−2 1 + O ns − s − n

n 1 as → 0 and → 0. Plugging into equation (2.6) we obtain sn − s − n sn − s − n

2(sn)−2H B(s) ∞ X 2H n = (−1)ii(i − 1)n(sn − s − n)i−2(sn)−i(1 + O( )) i ns − s − n i=2 1 1 1   n  = (2H)(2H − 1) ( + )2H−2 1 + O . s2n s n ns − s − n

 1  This shows B(s) = 2Hn(s + n)2H−2 1 + O( ) as s → ∞. 2 s

T −1 The main idea of estimating B Σ22 B is to write

∞ ∞ T −1 T −1 T X l X T l B Σ22 B = c(k)B A B = c(k)B (I − A) B = c(k) B (I − A) B, l=0 l=0 where c(k) is an appropriate constant chosen below satisfying ||I − A||2 < 1 for

A = c(k)Σ22.

We consider the minimal and maximal eigenvalues of Σ22, denoted by λmin(Σ22) −1 and λmax(Σ22), before determining c(k), which is closely related to the norm of Σ22 .

1 Lemma 2.5.3. Suppose H > 2 , then λmin(Σ22) → c = essinff > 0 as k → ∞.

Proof. One can define[51] the power spectrum (see [22], chapter 14) with a singu- 26 larity at λ = 0 by

∞ X 1 1 f(λ) := b(k)e−i2πλk, − < λ < , 2 2 k=−∞

(also known as the spectral density function or spectrum of the stationary process

{Xn}) where b(k) = E(XnXn+k) is the covariance function as before. f(λ) has R 1/2 2πikλ an inverse transformation: b(k) = −1/2 f(λ)e dλ. For the fractional Gaussian noise {Xk}, f(λ) has the form (cf. [51], Section 2.3):

∞ ∞ X 1 X 1 f(λ) = C|1 − ei2πλ|2 = C(1 − cos(2πλ)) . |λ + m|1+2H |λ + m|1+2H m=−∞ m=−∞

From [29] page 64/65, limk→∞ λmin(Σ22) = essinff and limk→∞ λmax(Σ22) = esssupf (including the cases ±∞). Since essinfλf(λ) > 0, the lemma is proved.

1 Lemma 2.5.4. Let c(k) = mk2H−1f(n) , where m is an appropriate constant inde- pendent of k and n, then ||I − A||2 = ||I − c(k)Σ22||2 < 1.

p 2H−1 Proof. On the one hand, ||Σ22||2 ≤ ||Σ22||1||Σ22||∞ = O(k ). On the other hand, BT Σ B 22 ≤ λ (Σ ) = ||Σ || BT B max 22 22 2 BT Σ B and 22 = O(k2H−1). Hence ||Σ || = O(k2H−1). BT B 22 2 The eigenvalues of I − A := I − c(k)Σ22 are {1 − c(k)λi(Σ22)}, by the way of choosing c(k), both of |1 − c(k)λmax(Σ22)| and |1 − c(k)λmin(Σ22)| can be less than

1. Hence ||I − c(k)Σ22||2 < 1.

T −1 With the preparation above, we can continue the estimate of B Σ22 B = P∞ T l c(k) l=0 B (I − c(k)Σ22) B.

3 Lemma 2.5.5. Suppose 4 < H < 1, then

T −1 lim B Σ22 (k, k)B = 0. k→∞

2 2 2H It follows that dn := limk→∞ σ (n, k) = O(n ). 27

Proof. We first derive an iterative equation, which will be used frequently. For any k-dimensional column vector V = (V (1),V (2), ··· ,V (k))T , define (l) l (l−1) (0) T V := (I − c(k)Σ22) V = (I − c(k)Σ22)V , V = V . Recall that B = (B(1),B(2), ··· ,B(k)), then

T (l) T (l−1) B V = B (I − c(k)Σ22)V T (l−1) T (l−1) = B V − c(k)B Σ22V k k k X X X = B(s)V (l−1)(s) − c(k) B(s) V (l−1)(i)b(i − s) s=1 s=1 i=1 k k k X X X = B(s)V (l−1)(s) − c(k) B(i) V (l−1)(s)b(i − s) s=1 i=1 s=1 k k X X = V (l−1)(s)(B(s) − c(k) B(i)b(i − s)) s=1 i=1 k k X  X B(i)  = V (l−1)(s)B(s) 1 − c(k) b(i − s) B(s) s=1 i=1

So we get an iterative equation for any vector V :

k k X X   B(s)V (l)(s) = B(s)V (l−1)(s) 1 − q(s) , l ≥ 1 (2.7) s=1 s=1 where

k X B(i) X B(i) q(s) = c(k) b(i − s) = c(k)( b(i − s) + 1). (2.8) B(s) B(s) i=1 1≤i≤k,i6=s

2H 2H 2H−2 ( 4 ) 1 By Lemma 2.5.2 and b(t) = 2 |t| (1 + 2H t2 + ··· ) for t ≥ 1, ( 2 )

 Z s−2 B(i) Z k−1 B(i)  q(s) ≥ c(k) b(s − i)di + b(s − i)di 2 B(s) s+2 B(s)  Z s−2 2H−2 Z k−1 2H−2  (i + n) 2H−2 (i + n) 2H−2 ≥ c(k) C 2H−2 (s − i) di + 2H−2 (i − s) di 2 (s + n) s+2 (s + n) s−2 C n + s  Z k n s = ( )2−2H (x + )2H−2( − x)2H−2dx f(n)m k 2 k k k 28

k−1 Z k n s  + (x + )2H−2(x − )2H−2dx s+2 k k k 1 n + s = ( )2−2H I(k, s, n). f(n)m k

The integral I(k, s, n) is bounded below by some constant K independent of k, s and n. For all s satisfying

(s + n)2−2H ≥ f(n)n2−2H kγ(2−2H) (2.9) where γ ∈ [0, 1], that is

s ≥ s∗ := f(n)1/(2−2H)nkγ − n,

n2−2H K q(s) ≥ q := , when s ≥ s∗. k(2−2H)(1−γ) m In the iterative equation 2.7, set V (s) to be V (s) = B(s), s = 1, 2, ··· , k, (l) l (l−1) V = (I − c(k)Σ22) B = (I − c(k)Σ22)V . Then one has

k k X X B(s)V (l)(s) = 1 − q(s)B(s)V (l−1)(s) s=1 s=1 k X X ≤ (1 − q)B(s)V (l−1)(s) + (q − q(s))B(s)V (l−1)(s) s=1 s

The idea is to incorporate the second term (when s < s∗) into the first one. For any  > 0, define

s∗−1 k X X L∗ = inf{l : B(s)V (l)(s) ≥  B(s)V (l)(s)}. (2.10) s=1 s=1

∗ Ps∗−1 (l) Pk (l) If L = ∞, then s=1 B(s)V (s) <  s=1 B(s)V (s) for all l. It follows 29 that for all l ≥ 1,

k k X X B(s)V (l)(s) ≤ (1 − q) B(s)V (l−1)(s) s=1 s=1 k X + (q − min q(s)) B(s)V (l−1)(s) s s=1 k X = (1 − q∗) B(s)V (l−1)(s) s=1 k X ≤ (1 − q∗)l B(s)V (0)(s) s=1

∗ where q = q − (q − mins q(s)). By Lemma 2.5.2,

∞ k ∞ k X X X X c(k) B(s)V (l)(s) ≤ c(k) (1 − q∗)l B(s)2 l=1 s=1 l=1 s=1 C ≤ c(k) n2(k + n)4H−3 q∗ C ≤ c(k) n2(k + n)4H−3 (1 − )q n2H = C k(2−2H)(−γ). f(n)

When l = 0,

k k X X c(k) B(s)V (l)(s) = c(k) B2(s) ≤ c(k)C(n2(k + n)4H−3) ≤ C(k−(2−2H)n2). s=1 s=1

Hence,

∞ T −1 X T (l) −γ(2−2H) B Σ22 B = c(k) B V = O(k ), as k → ∞. l=0

∗ T −1 Hence if L = ∞, limk→∞ B Σ22 B = 0. 30

If L∗ < ∞, then for l < L∗,

s∗−1 k X X B(s)V (l)(s) <  B(s)V (l)(s), s=1 s=1 and s∗−1 k X ∗ X ∗ B(s)V (L )(s) ≥  B(s)V (L )(s). (2.11) s=1 s=1 P∞ T (l) In this case, we split l=1 B V into two parts:

∞ k L∗−1 k ∞ k X X X X X X B(s)V (l)(s) = B(s)V (l)(s) + B(s)V (l)(s). l=1 s=1 l=1 s=1 l=L∗ s=1

The first term can be handled in the same way as the case when L∗ = ∞:

L∗−1 k L∗−1 k X X X X c(k) B(s)V (l)(s) ≤ c(k) (1 − q∗)l B2(s) (2.12) l=1 s=1 l=1 s=1 k 1 X ≤ c(k) B2(s) (2.13) q∗ s=1 n2H ≤ C k(2−2H)(−γ). (2.14) f(n)

For the second term, set V in the iteration equation (2.7) to be Vnew := (I − L∗ (L∗) ∗ c(k)Σ22) B = V , then by changing variable d = l − L , one has

∞ ∞ X T (l) X T l−L∗ B V = B (I − c(k)Σ22) Vnew l=L∗ l=L∗ ∞ X T (l−L∗) = B Vnew l=L∗ ∞ k X X (d) = B(s)Vnew(s) d=0 s=1 k ∞ k X (0) X X (d−1) = B(s)Vnew(s) + (1 − q(s))B(s)Vnew (s) s=1 d=1 s=1 31

∞ k X X ∗ ≤ (1 − min q(s))d B(s)V (L )(s) s d=0 s=1 ∞ s∗−1 X 1 X ∗ ≤ (1 − min q(s))d · B(s)V (L )(s) by (2.11) s  d=0 s=1 s∗−1 1 X ≤ B(s)2 min q(s) s s=1 C ≤ n2kγ(4H−3). mins q(s)

1 n+s 2−2H 1 Since q(s) ≥ f(n)m ( k ) K, mins q(s) ≥ C k2−2H , one has

∞ X C c(k) BT V (l) ≤ c(k) n2kγ(4H−3) = O(k−(1−γ)(4H−3)). (2.15) mins q(s) l=L∗

T −1 −(1−γ)(4H−3) −γ(2−2H) Combining (2.12) and (2.15), one has B Σ22 B ≤ C(k +k ). 3 T −1 Since H ∈ ( 4 , 1), limk→∞ B Σ22 B = 0 follows.

2.5.3 Estimate of the mean

Lemma 2.5.6. Suppose 3/4 < H < 1, then almost surely,

T −1 lim B Σ22 X[2] = 0, (2.16) k→∞

T where X[2] = (Xn+1,Xn+2, ··· ,Xn+k) .

T −1 Proof. Random variable B Σ22 X[2] has normal distribution. Its mean is 0 and its T −1 −1 T −1 variance is (B Σ22 )Σ22(Σ22 B) = B Σ22 B. So for any  > 0,

∞ ∞ X X 1 1 T −1 T −1 T −1 2 T −1 2 P (|B Σ22 (k, k)X[2]| > ) = P (|B Σ22 X[2]|/(B Σ22 B) > /(B Σ22 B) ) k=1 k=1 ∞ Z ∞ 2 2 X − x = √ e 2 dx 1 2π T −1 2 k=1 /(B Σ22 B) ∞ T −1 1 2 2 X (/(B Σ B) 2 ) ≤ √ exp − 22 2 2π k=1 32

∞ 2 X 2 = √ exp (− (BT Σ−1B)−1). 2 22 2π k=1

T −1 By Borel-Cantelli Lemma, in order to prove that B Σ22 X[2] → 0 as k → ∞ P∞ 2 T −1 −1 almost surely, it is sufficient to prove that k=1 exp (− 2 (B Σ22 B) ) < ∞, which is true from the proof of Lemma 2.5.5. Chapter 3

Limiting Distributions of Local Times

There has been a great deal of research work on the local time of random walks and Markov processes. It was proved that under proper normalization, the distribution of the occupation time of Markov processes, in particular, random walks, is the Mittag-Leffler distribution. We will have a review on that in section 3.1 and 3.2. In this work, we explore a class of stationary processes with conditional local limit theorems and it turns out that the limiting distribution of their local times are closely related to the Mittag-Leffler distribution.

3.1 Local times of random walks

In this section, we present some properties of the local times of random walks.

Throughout this section, let {Xi, i = 1, 2, ...} be independent identically dis- Pn tributed (i.i.d.) random variables and Sn = i=1 Xi is the random walk. Specifi- 1 cally, when P (Xi = 1) = P (Xi = −1) = 2 , it is called a simple symmetric random walk. Local times of the random walk Sn on the level x ∈ Z at time n is defined as `(n, x) = #{i = 1, 2...n : Si = x}. It can be interpreted as the number of excursions away from x completed before n. We also use `n to represent `(n, 0) for short.

Let ρ0(x) = 0, ρk(x) = min{j : j > ρk−1(x),Sj = x}. So ρk(x) is the th that the random walk hits x the k time. When Si is a simple symmetric random 34

walk, {ρn(x) − ρn−1(x)} is a sequence of i.i.d. random variables with

x−1 X  n  P (ρ (x) > n) = 2−n (x = 1, 2, ...). 1 b(n − j)/2c j=0

We denote ρk(0) by ρk for short. Then

2k − 2 P (ρ = 2k) = 2−2k+1k−1 , k = 1, 2, ··· 1 k − 1

Also, since P (`(n, x) = k) = P (ρk(x) ≤ n, ρk+1(x) > n), it follows that

Theorem 3.1.1 (c.f. e.g. [54] Theorem 9.4). Let x > 0, k > 0. Then for any k = 1, 2, ··· ,

 1  n−k+1 , if n + x is even, 2n−k+1 (n+x)/2 P (`(n, x) = k) = 1  n−k , if n + x is odd. 2n−k (n+x−1)/2

Together with Stirling’s formula, the limiting distribution of the local time can be found.

Theorem 3.1.2 (c.f. e.g. [54] Theorem 9.11).

Z x ρ1(k) ρk 1 − 3 − 1 v √ 2 2 lim P ( 2 < x) = lim P ( 2 < x) = v e dv, k→∞ k k→∞ k 2π 0

1 − 3 θk P (ρ = 2k) = √ k 2 exp( ), where |θ | ≤ 1, 1 2 π k k

r 2   1  P (ρ ≥ x) = 1 + O (x → ∞), 1 πx x and

r Z x 2 √ √ 2 − u lim P (`n/ n < x) = lim P (`(n, z)/ n < x) = e 2 du, x ∈ Z. n→∞ n→∞ π 0

The right hand side is the distribution function of the absolute value of a random variable normally distributed. 35

In 1949, Chung and Hunt [21] also introduced waiting time ρ1, ρ2, ··· . By considering the moment generating function of ρ1, they gave the asymptotic dis- tribution of `n.

Theorem 3.1.3 ([21]).

Z ∞ 2 1/2 − u  P (`2n ≥ r) = (2/π) e 2 du + , t 6n

− 1 with t = 2r(2n+2/3) 2 , || < 1, and n is big enough. It follows that as n → ∞, the ρ distribution of n approaches the stable distribution whose characteristic function 4n2 is 1 exp{− (1 − i sgn t)|t|1/2}. 2 At the same time the distribution goes to that of 1/Y 2, where Y ∼ N(0, 1). Also, ` the distribution of √n tends to that of |Y |. n ` Corollary 3.1.4. For every fixed t, the distribution of √[nt] converges to the dis- n tribution of the local time L(t) of the standard Brownian motion B(t) at level zero, ` i.e. √[nt] → L(t) in distribution. n Chung and Hunt (1949) are also pioneers in establishing the first law of iterated logrithm(LIL) for the local time of simple symmetric random walks: almost surely, for any ω, ∃N0(ω), such that

`n 1 √ < ((2 + ) log log n) 2 ∀ n > N (ω). n 0

This result can be strengthened. In functional space, it holds in the sense of weakly convergence. Refer to Theorem 1.0.5, Borodin’s work [14] and Perkins’ work [49] on the weak convergence or convergence with probability 1 for general random walks’ local times to the Brownian local time. The study of the distribution convergence is not limited to simple symmetric random walks. A. Aleˇskeviˇcien´e[8] [9] considered aperiodic recurrent random walks. And she relaxed the moment requirement to be only finite variance σ2 = 36

VarX1 < ∞. Then for r > 0,

r Z ∞ 2 `(n, x) 2 − u lim P ( √ ≥ r) = e 2 du, −∞ < x < ∞, x ∈ . −1 Z n→∞ σ n π r

3 If in addition β3 = E[|X1| ] < ∞, then for r > 0,

r Z ∞ 2  2+  `(n, x) 2 − u 1 + |x| 1 + |x| P ( √ ≥ r) − e 2 du ≤ c √ + , x ∈ ,  > 0, −1 2 Z σ n π r r n r n (3.1) where  is an arbitrary small constant positive number. The approximation is good √ only when |x| ≤ n n, where n → 0 with sufficient speed. In 1984, she improved

(3.1) and proved if β3 < ∞, then

|x| P (`(n, x) = 0) = G( √ ) + O(n−1/6), σ n

r Z r |x| `(n, x) |x| 2 − 1 ( √ +u)2 P ( √ < r) = G( √ ) + e 2 σ n du + O(n−1/6), −1 σ n σ n π 0

r 2 2 R x − u where G(x) = e 2 du, r > 0. π 0 Theorem 3.1.5 ([9]). In the same setting, in addition if E[`(n, x)3] < ∞, then for any integer l ≥ 1,

r Z ∞ 2 `(n, x) l 2 − (z+y) l l − l |x| E( √ ) = e 2 y dy + σ n 2 l!R (x), z = √ −1 l,n σ n π 0 σ n

For Rl,n(x) one has

l −l/2 c0 ln n c0 l σ n |Rl,n(x)| ≤ √ (1 + √ ) , 2πn 2πn

Z ∞ 2 l −l/2 1 c0 ln n c0l l 3 − y σ n |Rl,n(x)| ≤ √ (y + √ ) y e 2 dy, l! 2πn 0 2πn where c0 is a constant.

For general random walks, Jain and Pruitt (1984, [32]) obtained some LIL results for local times of recurrent random walks under a very general condition 37 using absolute potential kernel. We describe the details below.

Let Xn be integer-valued i.i.d. random variables with mean 0 and with the same distribution F . For x > 0, define Z 1 2 G(x) = P (|X| > x),K(x) = 2 y dF (y), x |y|≤x and Q(x) = G(x) + K(x) = E(x−1|X| ∧ 1)2.

Assume that G(x) lim sup < 1, (3.2) x→∞ K(x) which implies that E[|X|] < ∞. Together with E[X] = 0, the random walk is recurrent. The quantity on the left of (3.2) was introduced by Feller (1966) to describe the compactness and convergence of the normalized random walks. If X is in the domain of attraction of a stable law of index α, then

G(x) 2 − α lim sup = . x→∞ K(x) α

So Jain and Pruitt’s results include all cases when X is in the domain of attraction of a stable law of index α > 1 and of zero mean. The class of the distributions described by (3.2) is much larger than this. Jain and Pruitt also pointed out that condition (3.2) excludes the case when the local time has a slowly varying increasing rate. The function Q is continuous and strictly decreasing for x large enough. Thus one can define ay by Q(ay) = 1/y for y > y0 = 1/Q(1) and ay = 1 for y ∈ [0, y0].

Let cn = an/ log log n, which is the normalizing coefficient.

Theorem 3.1.6 ([32]). There exists θ1 ∈ (0, ∞) such that for all x ∈ Z,

cn lim sup `(n, x) = θ1 a.s. n→∞ n

Theorem 3.1.7 ([32]). Given  > 0, there exists δ > 0 such that

lim sup sup (cn/n)|`(n, x) − `(n, y)| <  a.s. n→∞ |x−y|≤cnδ 38

Theorem 3.1.8 ([32]). There exists θ2 ∈ (0, ∞) such that

lim sup sup(cn/n)`(n, x) = θ2 a.s. n→∞ x

Remark 3.1.9. If X is in the domain of attraction of a stable law, then θ1 = θ2.

In all, in this section we reviewed the exact and asymptotic distributions of local times of recurrent random walks. Those questions will be answered as well for the local times of stationary processes in our work.

3.2 Occupation times of Markov chains

In this section, we revisit the occupation time of Markov chains and give the formal statement of the results. Let X(t) be a Markov process with stationary transitions and takes values in an abstract space; V (x) is a non-negative function over the 1 R t abstract space. Darling and Kac [24] studied the limit of u(t) 0 V (X(s))ds, where u(t) is a suitable normalization. If V (x) is the indicator function of a set, then R t 0 V (X(s))ds is the occupation time of the set. They proved that under suitable conditions, the limiting distribution must be the Mittag-Leffler distribution of some index. Karamata’s Tauberian theorem plays a key role in the proof. Darling and Kac’s method is applicable to Markov chain, and in particular, to the sums of i.i.d. random variables. Suppose the transition probability of the Markov process is P (x, E; t) with ini- R ∞ −st tial state X(0) = x0. Its Laplace transformation is ps(x, E) = 0 P (x, E; t)e dt, which defines a measure. Suppose P and V satisfy the “Darling-Kac condition” (refer to (1.2) in the Introduction and Overview). Next we review some definitions which will be used later.

Definition 3.2.1 (Slowly varying functions). A positive function h(x), defined for x > 0, is slowly varying (at infinity) if for all t > 0,

\lim_{x\to\infty}\frac{h(tx)}{h(x)} = 1.

Definition 3.2.2 (Regularly varying). A measurable function f : R+ → R+ is regularly varying at infinity if

\lim_{x\to\infty}\frac{f(xy)}{f(x)} = y^{\alpha} \qquad (y > 0).

The number α is called the index of regular variation (of f), and f is said to be regularly varying at infinity with index α. Similarly, a measurable function f : R+ → R+ is regularly varying at 0 with index α if

\lim_{x\to 0}\frac{f(xy)}{f(x)} = y^{\alpha} \qquad (y > 0).

Theorem 3.2.3. (Karamata’s Tauberian theorem) Suppose that u : R+ → R+ is measurable and locally integrable. Let

a(x) := \int_0^x u(t)\,dt, \qquad \bar u(p) := \int_0^\infty e^{-pt}u(t)\,dt,

then

1. the following are equivalent:

(a) x \mapsto a(x) is regularly varying at ∞.
(b) p \mapsto \bar u(p) is regularly varying at 0.
(c) \lim_{p\to 0} \bar u(p)/a(1/p) exists and the limit is a positive real number.
In this case, \bar u(p) \sim \Gamma(1+\alpha)\,a(1/p) as p → 0, where α ≥ 0 is the (mutual) index of regular variation.

2. Suppose that α > 0.
(a) If u is regularly varying at ∞ with index α − 1, then a(t) \sim \frac{t\,u(t)}{\alpha} as t → ∞.
(b) Conversely, if a is regularly varying at ∞ with index α and u is monotone, then u(t) \sim \frac{\alpha\,a(t)}{t} as t → ∞.

Definition 3.2.4 (Mittag-Leffler distribution of order α). Let α ∈ [0, 1]. The random variable Y_α on R_+ has the normalized Mittag-Leffler distribution of order α if

E(e^{zY_\alpha}) = \sum_{p=0}^{\infty}\frac{\Gamma(1+\alpha)^p z^p}{\Gamma(1+p\alpha)}.

The probability density function is

f_{Y_\alpha}(x) = \frac{1}{\pi\alpha}\sum_{j=1}^{\infty}\frac{(-1)^{j-1}}{j!}\sin(\pi\alpha j)\,\Gamma(\alpha j+1)\,x^{j-1}.

The cumulative distribution function is

F_{Y_\alpha}(x) = \frac{1}{\pi\alpha}\sum_{j=1}^{\infty}\frac{(-1)^{j-1}}{j\,j!}\sin(\pi\alpha j)\,\Gamma(\alpha j+1)\,x^{j}.

Note that E[Y_\alpha] = 1, Y_1 \equiv 1, and the density functions of Y_0 and Y_{1/2} are given by

f_{Y_0}(y) = e^{-y}, \qquad f_{Y_{1/2}}(y) = \frac{2}{\pi}e^{-y^2/\pi}.

Note that f_{Y_{1/2}} is the density function of the absolute value of a normally distributed random variable.
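As a quick numerical sanity check of the moment generating function above, one can use the identification of Y_{1/2} with |Z| for Z ~ N(0, π/2) (consistent with the density just displayed); the k-th moment should then equal k!\,\Gamma(3/2)^k/\Gamma(1+k/2). This is only a sketch of our own, with sample size and seed chosen arbitrarily.

import numpy as np
from math import gamma, factorial, pi, sqrt

rng = np.random.default_rng(0)
# Y_{1/2} = |Z| with Z ~ N(0, pi/2); check E[Y^k] = k! * Gamma(3/2)^k / Gamma(1 + k/2)
samples = np.abs(rng.normal(0.0, sqrt(pi / 2.0), size=1_000_000))
for k in (1, 2, 3, 4):
    empirical = np.mean(samples ** k)
    theoretical = factorial(k) * gamma(1.5) ** k / gamma(1 + k / 2)
    print(k, round(empirical, 3), round(theoretical, 3))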

The idea of Darling and Kac's method is first to estimate the Laplace transform of the kth moments of the occupation time in terms of a slowly varying function; by Karamata's Tauberian theorem the kth moment can then be recovered, and it agrees with the kth moment of the Mittag-Leffler distribution. To illustrate this, we take Brownian motion W(t) as an example. Suppose the Brownian motion starts at 0 and let V be as before. The second moment is

\mu_2(t) = E\Big(\int_0^t V(W(s))\,ds\Big)^2 = 2\int_0^t\!\!\int_0^{s_2}\!\!\iint V(x_1)V(x_2)\,P(0|x_1;s_1)\,P(x_1|x_2;s_2-s_1)\,dx_1\,dx_2\,ds_1\,ds_2,

where P(x|y;t) is the probability density that the (planar) Brownian motion started at x is at y after time t: P(x|y;t) = \frac{1}{2\pi t}\exp\big(-\frac{\|x-y\|^2}{2t}\big). Its Laplace transform is

\int_0^\infty e^{-st}P(x|y;t)\,dt = \frac{1}{2\pi}\int_0^\infty \frac{1}{t}e^{-st-\|x-y\|^2/2t}\,dt = \frac{1}{\pi}K_0\big(\sqrt{2s}\,\|x-y\|\big),

where \frac{1}{\pi}K_0(\sqrt{2s}\,\|x-y\|) = \frac{1}{2\pi}\log(1/s) - \frac{1}{\pi}\log\|x-y\| + O(1) as s → 0. There are two parts: the singular part \frac{1}{2\pi}\log(1/s), later called h(s), and the potential part \frac{1}{\pi}\log\|x-y\|. Next the Laplace transform of the second moment is computed. One has, as u → 0,

u\int_0^\infty e^{-ut}\mu_2(t)\,dt \sim \frac{2!}{(2\pi)^2}\Big(\int V(x)\,dx\Big)^2(\log 1/u)^2.

Similarly, as u → 0,

u\int_0^\infty e^{-ut}\mu_k(t)\,dt \sim \frac{k!}{(2\pi)^k}\Big(\int V(x)\,dx\Big)^k(\log 1/u)^k. \qquad (3.3)

Since \log(1/s) is slowly varying, Karamata's Tauberian theorem applies, and it follows that

\mu_k(t) \sim \frac{k!}{(2\pi)^k}\Big(\int V(x)\,dx\Big)^k(\log t)^k, \quad \text{as } t\to\infty.

Since the moments of X \sim \exp(\lambda) are E(X^n) = n!/\lambda^n for n = 1, 2, \ldots, we get

\lim_{t\to\infty}P\Big(\frac{2\pi}{C\log t}\int_0^t V(W(s))\,ds < \alpha\Big) = 1 - e^{-\alpha}.
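To get a feel for this exponential limit one can replace the planar Brownian motion by the two-dimensional simple random walk, whose number of returns to the origin obeys the lattice analogue (Erdős and Taylor [26]): \ell_n/(\log n/\pi) converges to an exponential law of mean 1. The sketch below is only a rough Monte Carlo illustration of our own — the log n normalization converges very slowly, so agreement is only approximate.

import numpy as np

rng = np.random.default_rng(1)

def returns_to_origin(n_steps):
    # two-dimensional simple random walk; count visits to the origin up to time n_steps
    steps = rng.integers(0, 4, size=n_steps)
    dx = np.where(steps == 0, 1, np.where(steps == 1, -1, 0))
    dy = np.where(steps == 2, 1, np.where(steps == 3, -1, 0))
    x, y = np.cumsum(dx), np.cumsum(dy)
    return np.count_nonzero((x == 0) & (y == 0))

n, trials = 200_000, 200
normalized = np.array([returns_to_origin(n) for _ in range(trials)]) / (np.log(n) / np.pi)
print(normalized.mean())          # roughly 1 (the mean of Exp(1)), up to slow log corrections
print(np.mean(normalized > 1.0))  # roughly exp(-1) ~ 0.37 for an Exp(1) limit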

For the case of Markov processes, the Darling-Kac’s condition (1.1) ensures the convergence of the Laplace transform of the kth moments. However, in order to apply Karamata’s Tauberian theorem, some restrictions on the normalization u(t) for the occupation time have to be made.

Theorem 3.2.5 ([24]). If

h(s) = \frac{L(1/s)}{s^{\alpha}}

with 0 ≤ α < 1, where L(1/s) is slowly varying as s → 0, then

\lim_{t\to\infty}P\Big(\frac{1}{Ch(1/t)}\int_0^t V(X(\tau))\,d\tau < x\Big) = \frac{1}{\pi\alpha}\int_0^x\sum_{j=1}^{\infty}\frac{(-1)^{j-1}}{j!}\sin(\pi\alpha j)\,\Gamma(\alpha j+1)\,y^{j-1}\,dy =: G_\alpha(x),

and

\lim_{t\to\infty}E\Big(\frac{\int_0^t V(X(s))\,ds}{Ch(1/t)}\Big)^k = \frac{k!}{\Gamma(\alpha k+1)}.

The right-hand side is known to be the kth moment of the Mittag-Leffler distribution G_\alpha(x). Interestingly, the converse of the theorem is also true.

Theorem 3.2.6 ([24]). If X(t) and V(x) satisfy the Darling-Kac condition (1.1) and if, in addition, for some u(t) > 0,

\lim_{t\to\infty}P\Big(\frac{1}{u(t)}\int_0^t V(X(s))\,ds < x\Big) = G(x),

where G(x) is a nondegenerate distribution, then

h(s) = L(1/s)/s^{\alpha}

for some α ∈ [0, 1) and slowly varying function L(1/s). Hence G(x) = G_\alpha(x/b) for some appropriate constant b.

When applied to Markov chains, the Laplace transform is replaced by the generating function. Let X_n be a Markov chain with transition probabilities P_n(x,E) = P(X_{n+k}\in E\mid X_k = x). Define

p_z(x,E) = \sum_{n=0}^{\infty}P_{n+1}(x,E)\,z^n

with 0 ≤ z < 1, and let V(x) be a non-negative measurable function. Suppose the "Darling-Kac condition" holds: there exist a function h(z) → ∞ as z → 1 and a constant C > 0 such that

\int V(y)\,\frac{p_z(x,dy)}{h(z)} \to C, \qquad z\to 1,

the convergence being uniform in \{x : V(x) > 0\}. Under these conditions, one has the following theorem.

Theorem 3.2.7 ([24]). A necessary and sufficient condition that, for some normalizing sequence u_n, the limiting distribution of

\frac{1}{u_n}\sum_{j=0}^{n}V(X_j)

exists and is nonsingular is that h(z) = \frac{1}{(1-z)^\alpha}L\big(\frac{1}{1-z}\big) for some α, 0 ≤ α < 1, where L is slowly varying as z → 1. In this case u_n can be taken to be Ch(1-1/n), and the limiting distribution is then the Mittag-Leffler distribution.

When the X_n are i.i.d., suppose their common distribution function is F(x), with characteristic function φ(t), and let F^{(n)}(x) be the distribution function of S_n. Then

p_z(x,E) = \lim_{T\to\infty}\frac{1}{2\pi}\int_{-T}^{T}\frac{\psi(t,x,E)\,\varphi(t)}{1-z\varphi(t)}\,dt,

where \psi(t,x,E) = \int_E e^{-it(y-x)}\,dy. Suppose that F(x) either (1) has an absolutely continuous component or (2) has a lattice structure, and that φ(t) ∼ 1 − |t|^γ as t → 0; then the "Darling-Kac condition" is met with

h(z) = \frac{1}{\gamma\sin(\pi/\gamma)}\,\frac{1}{(1-z)^{1-1/\gamma}}, \quad 1 < \gamma \le 2,

and

h(z) = \frac{1}{\pi}\log\frac{1}{1-z}, \quad \gamma = 1.

When V is the indicator function of a set B, the constant C in the "Darling-Kac condition" is the Lebesgue measure of B in case (1), and the number of lattice points in B in case (2).
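A concrete illustration is the simple symmetric walk (a lattice case with γ = 2). Working from the exact return-probability generating function \sum_n P(S_n=0)z^n = (1-z^2)^{-1/2} rather than from the displayed constant (since \cos t \approx 1 - t^2/2 rather than 1 - t^2), one gets h(z) \approx (1-z)^{-1/2}/\sqrt{2}, hence u_n \approx \sqrt{n/2}, and the scaled local time at 0 should have moments close to k!/\Gamma(k/2+1). The sketch below is our own simulation check; sample sizes and seed are arbitrary and convergence is slow.

import numpy as np
from math import gamma, factorial, sqrt

rng = np.random.default_rng(2)
n, trials = 40_000, 1_000
scaled = np.empty(trials)
for i in range(trials):
    walk = np.cumsum(rng.choice([-1, 1], size=n))
    scaled[i] = np.count_nonzero(walk == 0) / sqrt(n / 2.0)   # u_n ~ h(1 - 1/n) ~ sqrt(n/2)

for k in (1, 2, 3):
    print(k, round(np.mean(scaled ** k), 2), round(factorial(k) / gamma(k / 2 + 1), 2))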

3.3 Ergodic sums of infinite measure preserving transformation

To prepare for the study of the asymptotic distribution of stationary processes with a conditional local limit theorem, we recall the related concepts and results from infinite ergodic theory.

Definition 3.3.1 (Conservative measure). Given a dynamical system (X, B, T, m), we say the measure m, or the transformation T, is conservative if m(A) = 0 whenever A ∈ B is such that \{T^{-n}A\}_{n\ge 0} are pairwise disjoint. This means that, for any measurable set A, almost all points of A eventually return to it, i.e. A \subset \bigcup_{n\ge 1}T^{-n}A (mod m) for all A ∈ B with m(A) > 0.

Proposition 3.3.2 (cf., eg.,[3]).

\Big[\sum_{n=1}^{\infty}P_{T^n}f = \infty\Big] = C(T) \ \mathrm{mod}\ m, \qquad \forall f\in L^1(m),\ f>0.

Here C(T ) = X \D(T ) is the conservative part of T and D(T ) = U(W(T )) is the measurable union of the collection of measurable wandering sets W(T ) for T.

Definition 3.3.3 (Ergodic). Given a dynamical system (X, B, T, m), measure pre- serving and non-singular transformation T is called ergodic if A ∈ B,T −1A = A implies m(A) = 0 or m(Ac) = 0.

Generally, for an invariant set A, since T^{-1}A = A we also have T^{-1}A^c = A^c, which implies TA = A and TA^c = A^c. So (X, B, T, m) can be broken up into two pieces, and ergodicity means that there is no non-trivial invariant set.

Definition 3.3.4 (A moment set). Let (X, B, m, T ) be a conservative, ergodic, measure preserving transformation. Let A ∈ B, 0 < m(A) < ∞ and set

a_n(A) := \sum_{k=0}^{n-1}\frac{m(A\cap T^{-k}A)}{m(A)^2}, \qquad u_A(\lambda) := \sum_{k=0}^{\infty}e^{-\lambda k}\frac{m(A\cap T^{-k}A)}{m(A)^2}. \qquad (3.4)

Denote S_n^{T}(f)(x) := \sum_{k=0}^{n-1}f(T^k x). The set A is called a moment set for T if

\sum_{n=0}^{\infty}e^{-\lambda n}\int_A S_n^{T}(\mathbf{1}_A)^p\,dm \sim p!\,m(A)^{p+1}\,\frac{u_A(\lambda)^p}{\lambda} \quad \text{as } \lambda\to 0,\ \forall p\in\mathbb{N}.

Here \sum_{n=0}^{\infty}e^{-\lambda n}\int_A S_n^{T}(\mathbf{1}_A)^p\,dm is the analogue of the Laplace transform of the pth moment. The moment-set condition plays the same role as Darling and Kac's condition and has a form similar to (3.3). With these preparatory concepts, we can discuss Darling and Kac's theorem in infinite measure spaces. Let (X, B, m, T) be a conservative, ergodic, measure preserving transformation of a σ-finite, non-atomic, infinite measure space, and let f : X → [0, ∞) be an m-integrable function with non-zero integral. Then, under suitable conditions, S_n^{T}(f)(x), properly scaled, converges to the Mittag-Leffler distribution.

Theorem 3.3.5 (Aaronson's Darling-Kac theorem, [2]). Suppose A is a moment set for T, and that u_A(\lambda) is regularly varying with index α ∈ [0, 1] as λ → 0. Then

g\Big(\frac{S_n^{T}(f)}{a_n(A)}\Big) \to E[g(m(f)Y_\alpha)] \quad \text{weak* in } L^\infty(X)

whenever f ∈ L^1_+(m) and g ∈ C([0, ∞]). Here Y_\alpha has the normalized Mittag-Leffler distribution with index α.

We rewrite this convergence in the following way:

\frac{S_n^{T}}{a_n(A)} \xrightarrow{\ d\ } Y_\alpha.

Similar to the Markov process case, the converse is also true. If A is a moment set for a conservative, ergodic, measure preserving transformation T and for some random variable Y on (0, ∞) and constants an → ∞,

\frac{S_n^{T}}{a_n} \xrightarrow{\ d\ } Y,

then u_A(\lambda) is regularly varying as λ → 0. A natural question arises: what is a sufficient condition for a set to be a moment set?

Definition 3.3.6 (Pointwise dual ergodic). A conservative ergodic measure pre- serving transformation (c.e.m.p.t.) (X, B, m, T ) is called pointwise dual ergodic if there are constants {an} such that

\frac{1}{a_n}\sum_{k=0}^{n-1}P_{T^k}f \to \int_X f\,dm \quad \text{a.e. as } n\to\infty,\ \forall f\in L^1(m),

where P_{T^n} is the Perron-Frobenius operator defined in Section 2.3. The sequence

{an} is called the return sequence.

Theorem 3.3.7 ([2]). Suppose that T is pointwise dual ergodic and that A ∈ B with positive measure satisfies

\sup_n\Big\|\frac{1}{a_n(A)}\sum_{k=0}^{n-1}P_{T^k}\mathbf{1}_A\Big\|_{L^\infty(A)} \le M_A,

where a_n(A) is defined in (3.4). Then A is a moment set for T.

The following corollary is the generalization of the result on the distributional convergence of normalized occupation times of Markov processes of Darling and Kac.

Corollary 3.3.8 (Generalized Darling-Kac theorem, [2]). Suppose (X, B, m, T) is pointwise dual ergodic and that a_n(T) is regularly varying with index α ∈ [0, 1]. Then

\frac{S_n^{T}}{a_n(T)} \xrightarrow{\ d\ } Y_\alpha,

where Y_\alpha has the Mittag-Leffler distribution with index α.

Remark 3.3.9. The scaling factor a_n(A) in Theorem 3.3.7 is exactly the same as a_n(T) in Corollary 3.3.8; in the definition of a_n(A) it has been normalized by m(A)^2 and is in fact independent of the set A.

In the next section, we construct a pointwise dual ergodic dynamical system and apply the generalized Darling-Kac theorem in an infinite measure space.

3.4 Asymptotic distribution of the local times \ell_n of stationary processes with conditional local limit theorems

In this section, we use the generalized Darling-Kac theorem in infinite measure spaces to study the local times of certain stationary processes with conditional local limit theorems. To be precise, let (Ω, F, P) be a probability space and let \{X_n\}_{n=1}^{\infty} be an integer-valued stationary process with E[X_n] = 0. By stationarity, there exist a measure preserving transformation T : Ω → Ω and a random variable φ : Ω → Z with \int_\Omega\varphi\,dP = 0 and φ ∈ L^1(P), such that X_n = \varphi\circ T^{n-1} for all n ≥ 1. As before, denote S_n^{T}(f) := \sum_{k=0}^{n-1}f\circ T^k for any f : Ω → Z; in particular, S_n := S_n^{T}(\varphi). The local time of \{X_n\} at time n on the level x ∈ Z is defined as \ell(n,x) = \sum_{k=1}^{n}\mathbf{1}_{\{S_k=x\}}. Denote by \ell_n the local time at 0 at time n for short. By expanding the probability space (Ω, F, P) to a product space, the local times can be represented as ergodic sums in an infinite measure space. Define \tilde T : \Omega\times\mathbb{Z}\to\Omega\times\mathbb{Z} by

\tilde T(\omega, n) = (T\omega,\ n + \varphi(\omega)), \qquad (3.5)

where φ is as above. By induction, one has

\tilde T^{k}(\omega, n) = (T^{k}\omega,\ n + S_k(\omega)). \qquad (3.6)

Let m_{\mathbb{Z}} be the counting measure on Z and let \mathcal{Z} denote the Borel σ-algebra of Z. A new dynamical system (X, B, μ, \tilde T) can then be defined, where X = Ω × Z, B = F ⊗ \mathcal{Z}, and μ = P ⊗ m_{\mathbb{Z}} is the product measure. Let A = Ω × \{0\}, and define S_n^{\tilde T}(f)(\omega,m) := \sum_{i=1}^{n}f\circ\tilde T^{i}(\omega,m). By (3.6), the local time of \{S_n\} has the following representation:

\ell_n(\omega) = \sum_{i=1}^{n}\mathbf{1}_{\{S_i(\omega)=0\}} = \sum_{i=1}^{n}\mathbf{1}_A(\tilde T^{i}(\omega, 0)) = S_n^{\tilde T}(\mathbf{1}_A)(\omega, 0).

In this section we study the connection of local times and local limit theorems as stated below.
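The identity above is easy to check numerically. The following minimal sketch of ours takes a Bernoulli(±1) shift as a stand-in for (Ω, F, P, T) and verifies that counting the zeros of S_1, ..., S_n agrees with counting visits of the skew-product orbit to A = Ω × {0}.

import numpy as np

rng = np.random.default_rng(3)
omega = rng.choice([-1, 1], size=10_000)   # a point of Omega = {-1,+1}^N, phi(omega) = omega_0
n = 5_000

# local time at 0 from the partial sums directly
S = np.cumsum(omega[:n])
direct = np.count_nonzero(S == 0)

# local time as the ergodic sum S_n^{T~}(1_A)(omega, 0) of the skew product
m, visits = 0, 0
for i in range(n):
    m += omega[i]            # second coordinate of T~^i(omega, 0)
    visits += (m == 0)       # indicator of A = Omega x {0}

print(direct, visits)        # the two counts coincide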

Definition 3.4.1. A centered integer-valued stationary process \{X_n\} is said to have a conditional local limit theorem at 0 if there exist a constant g(0) and a sequence \{B_n\} of positive real numbers such that for all x ∈ Z

B_n\,P\big(S_n = x \mid (X_{n+1}, X_{n+2}, \ldots) = \cdot\big) \to g(0) \qquad (3.7)

almost surely.

The full formulation of the corresponding form of a local limit theorem goes back to Stone and reads in the conditional form (see [5]) that

B_n\,P\big(S_n = k_n \mid (X_{n+1}, X_{n+2}, \ldots) = \cdot\big) \to g(\kappa) \quad \text{as } \frac{k_n - A_n}{B_n}\to\kappa, \quad P\text{-a.s.},

for all κ ∈ R, where A_n is some centering constant. Definition 3.4.1 can be reformulated using the dual operator P_T of the isometry Uf = f∘T on L^\infty(P) operating on L^1(P), where we take the L^p-spaces of P restricted to the σ-field generated by all X_n, and T is the shift operation T(X_1, X_2, \ldots) = (X_2, X_3, \ldots). This operator is called the transfer operator. The local limit theorem at 0 then reads

B_n\,P_{T^n}\big(\mathbf{1}_{\{S_n=x\}}\big) \to g(0) \quad \text{for all } x\in\mathbb{Z},\ P\text{-a.s.} \qquad (3.8)

In this section, we assume that \{X_n\} has the conditional local limit theorem at 0 as formulated in Definition 3.4.1. In addition, we assume that the convergence is uniform in x.

Remark 3.4.2. If the convergence were uniform for almost all ω and all \{k_n\}, it would imply that \{X_n\} has the local limit theorem (by taking expectations on both sides); then, by [31], \{X_n\} is in the domain of attraction of a stable law with some index d:

\frac{S_n - A_n}{B_n} \xrightarrow{\ W\ } Z_d.

The probability density function of Z_d is g as above; denote its cumulative distribution function by G(x). Since \int_\Omega\varphi\,dP = 0, we can (and will) assume that A_n = 0. Necessarily [31], \{B_n\} is regularly varying of order β = 1/d.

We will use Hurewicz's Ergodic Theorem and the Disintegration Theorem in the proof later, so we state them below.

Theorem 3.4.3 (Hurewicz’s Ergodic Theorem). Suppose that T is a conservative, non-singular transformation of the σ-finite measure space (X, B, m). Then

\frac{\sum_{k=1}^{n}P_{T^k}f(x)}{\sum_{k=1}^{n}P_{T^k}g(x)} \to h(f,g)(x), \quad \text{a.s.}

for x ∈ X and all f, g ∈ L^1(m), g > 0, where h(f,g)\circ T = h(f,g) and \int_X h(f,g)\,u\,g\,dm = \int_X u\,f\,dm for all u ∈ L^\infty(m) with u∘T = u. When T is ergodic,

h(f,g) = \frac{\int_X f\,dm}{\int_X g\,dm}.

Since there is no obvious evidence that (X, B, μ, \tilde T) is pointwise dual ergodic, we will decompose it using the following theorem.

Theorem 3.4.4 (Disintegration theorem, cf. e.g. [3]). Let (X, B, m) and (Y, C, μ) be standard probability spaces and suppose π : X → Y is measurable with μ = m∘π^{-1}. Then there exist Y_0 ∈ C with μ(Y_0) = 1 and a measurable map y ↦ m_y such that m_y(π^{-1}\{y\}) = 1 for all y ∈ Y_0 and

m(A\cap\pi^{-1}B) = \int_B m_y(A)\,d\mu(y) \qquad \forall A\in B,\ B\in C.

The measure m_y is called the fibre measure over π^{-1}\{y\}.

With the assumption of the conditional local limit theorem 3.4.1, the lemma below is the key ingredient in finding the limiting distribution of local times.

Lemma 3.4.5. Suppose {Xn} has the conditional local limit theorem 3.4.1. Then the following holds.

1. T˜ defined in (3.5) is a conservative and measure preserving transformation of (X, B, µ).

2. Ergodic Decomposition: there exists a probability space (Y, C, λ) and a collection of measures \{\mu_y : y\in Y\} on (X, B) such that

(a) For y ∈ Y, \tilde T is a conservative ergodic measure preserving transformation of (X, B, \mu_y).

(b) For A ∈ B, the map y ↦ \mu_y(A) is measurable and

\mu(A) = \int_Y \mu_y(A)\,d\lambda(y).

3. λ-almost surely in y, (X, B, \mu_y, \tilde T) is pointwise dual ergodic.

Proof. 1. For any m ∈ Z, let f : Ω × Z → R be defined as f(\omega,k) = h(\omega)\otimes\mathbf{1}_{\{m\}}(k). It can be proved that, μ-almost surely for (\omega,k)\in X,

P_{\tilde T^n}\big(h\otimes\mathbf{1}_{\{m\}}\big)(\omega,k) = P_{T^n}\Big(h(\cdot)\,\mathbf{1}_{\{m\}}\big(k - S_n(\cdot)\big)\Big)(\omega). \qquad (3.9)

Indeed, for any u(\omega,k)\in L^\infty(X,\mu),

\int_X P_{\tilde T^n}(h\otimes\mathbf{1}_{\{m\}})(\omega,k)\,u(\omega,k)\,d\mu(\omega,k)
= \int_{\Omega\times\mathbb{Z}}(h\otimes\mathbf{1}_{\{m\}})(\omega,k)\,u\circ\tilde T^n(\omega,k)\,d\mu(\omega,k)
= \int_{\Omega\times\mathbb{Z}}(h\otimes\mathbf{1}_{\{m\}})(\omega,k)\,u(T^n\omega,\,k+S_n(\omega))\,d\mu(\omega,k)
= \int_{\mathbb{Z}}\int_{\Omega}h(\omega)\mathbf{1}_{\{m\}}(k)\,u(T^n\omega,\,k+S_n(\omega))\,dP(\omega)\,dm_{\mathbb{Z}}(k)
= \int_{\mathbb{Z}}\int_{\Omega}h(\omega)\mathbf{1}_{\{m\}}(k'-S_n(\omega))\,u(T^n\omega,\,k')\,dP(\omega)\,dm_{\mathbb{Z}}(k')
= \int_{\mathbb{Z}}\int_{\Omega}P_{T^n}\Big(h(\omega)\mathbf{1}_{\{m\}}(k'-S_n(\omega))\Big)u(\omega,k')\,dP(\omega)\,dm_{\mathbb{Z}}(k')
= \int_{\Omega\times\mathbb{Z}}P_{T^n}\Big(h(\omega)\mathbf{1}_{\{m\}}(k'-S_n(\omega))\Big)u(\omega,k')\,d\mu(\omega,k')
= \int_X P_{T^n}\Big(h(\cdot)\mathbf{1}_{\{m\}}\big(k-S_n(\cdot)\big)\Big)(\omega)\,u(\omega,k)\,d\mu(\omega,k).

So (3.9) is proved. Setting h ≡ 1, under the assumption of the conditional local limit theorem (Definition 3.4.1),

\sum_{n=1}^{N}P_{T^n}\Big(\mathbf{1}_{\{m\}}\big(k-S_n(\cdot)\big)\Big)(\omega) = \sum_{n=1}^{N}P\big(S_n = k-m \mid T^n(\cdot) = \omega\big)

\sim g(0)\sum_{n=1}^{N}\frac{1}{B_n} =: a_N \qquad P\text{-a.s. in } \omega.

Since \sum_{n=1}^{\infty}\frac{1}{B_n} = \infty, it follows that

\sum_{n=1}^{\infty}P_{\tilde T^n}\big(\mathbf{1}\otimes\mathbf{1}_{\{m\}}\big) = \infty \qquad \mu\text{-a.s.}

By linearity of P_{\tilde T^n}, for any f(\omega,x) = \sum_{m\in\mathbb{Z}}k_m\mathbf{1}_{\{m\}}(x) with k_m > 0 and \sum_{m\in\mathbb{Z}}k_m < \infty, one has

\sum_{n=1}^{\infty}P_{\tilde T^n}f = \infty \qquad \mu\text{-a.s.}

By Proposition 3.3.2, the conservative part of \tilde T satisfies C(\tilde T) = X mod μ, which means that (X, B, \tilde T, \mu) is conservative.

2. The proof of the ergodic decomposition is an adaptation of the corresponding argument of Section 2.2.9 of [3] (page 63). We show that \tilde T does not need to be invertible in order to obtain an ergodic decomposition; this follows from Proposition 3.4.6 and Proposition 3.4.7 below.

Proposition 3.4.6. Let T be a conservative, non-singular, measure preserving transformation of a standard probability space (X, B, m), then there exists a probability space (Y, C, µ), and a collection of probabilities {my : y ∈ Y } on (X, B) such that

1. for y ∈ Y , T is a conservative, ergodic, non-singular, measure preserving

transformation of (X, B, my) and

\frac{dm_y\circ T^{-1}}{dm_y} = \frac{dm\circ T^{-1}}{dm}, \qquad m_y\text{-a.s.}

2. for A ∈ B, the map y ↦ m_y(A) is measurable, and

m(A) = \int_Y m_y(A)\,d\mu(y).

Proof. Let (Y, C, μ, I) be the invariant factor of T and let π : X → Y be the invariant factor map. By the Disintegration Theorem 3.4.4, there exist Y_0 ∈ C with \mu(Y_0^c) = 0 and a measurable map y ↦ m_y from Y_0 to P(X, B) such that

m_y(\pi^{-1}\{y\}) = 1 \quad \forall y\in Y_0, \qquad m(A\cap\pi^{-1}B) = \int_B m_y(A)\,d\mu(y) \quad \forall A\in B,\ B\in C.

Also, since T is non-singular, m∘T^{-1} ∼ m. Denote \frac{dm\circ T^{-1}}{dm} = T'. Next we prove

\frac{dm_y\circ T^{-1}}{dm_y} = \frac{dm\circ T^{-1}}{dm} \qquad m_y\text{-a.s.}

It is sufficient to prove that for any f ∈ L^\infty(m)_+ and g ∈ L^\infty(\mu)_+,

\int_Y\Big(\int_X T'f\,dm_y\Big)g(y)\,\mu(dy) = \int_Y\Big(\int_X f\,dm_y\circ T^{-1}\Big)g(y)\,\mu(dy).

Indeed, since m_y(\pi^{-1}\{y\}) = 1, and by the Disintegration Theorem 3.4.4, one has

\int_Y\Big(\int_X T'f\,dm_y\Big)g(y)\,\mu(dy) = \int_Y\int_X g(\pi x)\,T'f\,dm_y\,\mu(dy)
= \int_X g(\pi x)\,T'f\,dm
= \int_X g(\pi x)\,f\,dm\circ T^{-1}
= \int_Y\int_X g(\pi x)\,f\,dm_y\circ T^{-1}\,d\mu(y)
= \int_Y\int_X g(\pi x)\,f\,\frac{dm_y\circ T^{-1}}{dm_y}\,dm_y\,d\mu(y)
= \int_Y\Big(\int_X f\,\frac{dm_y\circ T^{-1}}{dm_y}\,dm_y\Big)g(y)\,d\mu(y)
= \int_Y\Big(\int_X f\,dm_y\circ T^{-1}\Big)g(y)\,d\mu(y).

As P_T f = \frac{d\nu_f\circ T^{-1}}{dm}, where \nu_f(C) = \int_C f\,dm, and T is conservative on (X, B, m), we have

\sum_{n\ge 1}P_{T^n}\mathbf{1} = \sum_{n\ge 1}(T^n)' = \infty \qquad m\text{-a.s.},

with the consequence that

\sum_{n\ge 1}(T^n)' = \infty \qquad m_y\text{-a.s., for a.e. } y\in Y.

It can be assumed that for every y ∈ Y,

\sum_{n\ge 1}(T^n)' = \infty \qquad m_y\text{-a.e.},

with the consequence that T is conservative on (X, B, m_y) for y ∈ Y.

By the Hurewicz ergodic theorem 3.4.3 for T acting on (X, B, m),

\lim_{n\to\infty}\frac{\sum_{k=1}^{n-1}P_{T^k}\mathbf{1}_A(x)}{\sum_{k=1}^{n-1}P_{T^k}\mathbf{1}(x)} = E(\mathbf{1}_A\mid\mathcal{T})(x) = m_{\pi x}(A)

for a.e. x ∈ X and A ∈ B, where \mathcal{T} is the σ-field of all invariant sets; whence, for a.e. y ∈ Y,

\lim_{n\to\infty}\frac{\sum_{k=1}^{n-1}P_{T^k}\mathbf{1}_A}{\sum_{k=1}^{n-1}P_{T^k}\mathbf{1}} = m_y(A), \qquad m_y\text{-a.s.}\ \forall A\in B.

On the other hand, for y ∈ Y , by the Hurewicz ergodic theorem, for T acting on (X, B, my),

\lim_{n\to\infty}\frac{\sum_{k=1}^{n-1}P_{T^k}\mathbf{1}_A}{\sum_{k=1}^{n-1}P_{T^k}\mathbf{1}} = E_{m_y}(\mathbf{1}_A\mid\mathcal{T}), \qquad m_y\text{-a.s.}\ \forall A\in B.

So it follows that for A ∈ B and a.e. y ∈ Y,

E_{m_y}(\mathbf{1}_A\mid\mathcal{T}) = m_y(A), \qquad m_y\text{-a.s.}

Hence \mathcal{T} = \{\emptyset, X\} mod m_y, and T is ergodic on (X, B, m_y).

Next, we extend Proposition 3.4.6 to σ-finite measure spaces in the following proposition.

Proposition 3.4.7. Suppose that T is a conservative, non-singular, measure pre- serving transformation of a standard σ-finite measure space (X, B, µ), then there is a probability space (Ω, J , λ) and a collection of measures {µω : ω ∈ Ω} on (X, B) such that

1. For ω ∈ Ω, T is a conservative, ergodic, measure-preserving transformation

of (X, B, \mu_\omega).

2. For A ∈ B, the map ω ↦ \mu_\omega(A) is measurable, and \mu(A) = \int_\Omega\mu_\omega(A)\,d\lambda(\omega).

Proof. The proof follows from Proposition 3.4.6 and [3] (page 63). The key point is to introduce a probability measure m on (X, B) and the density \frac{d\mu}{dm}. We only need to prove that \mu_\omega is conservative and ergodic. Suppose W is a wandering set for T, so \{T^{-n}W\} are disjoint. Suppose also that \mu_\omega(W) > 0; then, since \mu_\omega(W) = \int_W f\,dm_\omega, one has m_\omega(W) > 0, contradicting the fact that T is conservative on (X, B, m_\omega). So T is conservative on (X, B, \mu_\omega).

For any invariant set A = T^{-1}A, since T is ergodic on (X, B, m_\omega), either m_\omega(A) = 0 or m_\omega(A^c) = 0. Then either \mu_\omega(A) = \int_A f\,dm_\omega = 0 or \mu_\omega(A^c) = \int_{A^c}f\,dm_\omega = 0, so T is ergodic on (X, B, \mu_\omega).

The second part of Lemma 3.4.5 follows by applying Propositions 3.4.6 and 3.4.7 above to (X, B, μ, \tilde T).

3. We finish the proof of Lemma 3.4.5 by showing that, λ-almost surely in y, (X, B, \mu_y, \tilde T) is pointwise dual ergodic. Since

\sum_{n=1}^{N}P_{\tilde T^n}\big(\mathbf{1}_\Omega\otimes\mathbf{1}_{\{m\}}\big) \sim a_N, \qquad \mu\text{-a.s.},

one has

\sum_{n=1}^{N}P_{\tilde T^n}\big(\mathbf{1}_\Omega\otimes\mathbf{1}_{\{m\}}\big) \sim a_N, \qquad \mu_y\text{-a.s.}

From the second part of Lemma 3.4.5, \tilde T is conservative and ergodic on (X, B, \mu_y); by Hurewicz's ergodic theorem, for every f ∈ L^1(\mu_y), almost surely,

\frac{1}{a_n}\sum_{k=0}^{n}P_{\tilde T^k}f \sim \frac{\sum_{k=0}^{n}P_{\tilde T^k}f}{\sum_{k=0}^{n}P_{\tilde T^k}\big(\mathbf{1}_\Omega\otimes\mathbf{1}_{\{m\}}\big)} \to \frac{\int_{\Omega\times\mathbb{Z}}f\,d\mu_y}{\int_{\Omega\times\mathbb{Z}}\mathbf{1}_\Omega\otimes\mathbf{1}_{\{m\}}\,d\mu_y}.

Since a_N does not depend on the integer m we choose, \frac{1}{\int_{\Omega\times\mathbb{Z}}\mathbf{1}\otimes\mathbf{1}_{\{m\}}\,d\mu_y} does not depend on m either; denote it by C(y). Hence (X, B, \mu_y, \tilde T) is pointwise dual ergodic with return sequence

a_n C(y) = C(y)\,g(0)\sum_{i=1}^{n}\frac{1}{B_i}.

The theorem below provides the limiting distribution of the local time of stationary processes with a conditional local limit theorem.

Theorem 3.4.8 (Convergence of local times). Suppose the stationary process \{X_n := \varphi\circ T^{n-1} : n\ge 1\}, defined on a probability space (Ω, F, P), has a conditional local limit theorem 3.4.1 with regularly varying scaling coefficient B_n = n^{\beta}L(n), where β ∈ [1/2, 1) and L(n) is slowly varying. Denote a_n := g(0)\sum_{k=1}^{n}\frac{1}{B_k} \to \infty. Then \frac{\ell_n}{a_n} converges to Y_\alpha in the following sense:

\int_\Omega g\Big(\frac{\ell_n(\omega)}{a_n}\Big)H(\omega)\,dP(\omega) \to E[g(Y_\alpha)], \qquad (3.10)

for any bounded and continuous function g and any probability density function

H on (Ω, F,P ). Here Yα has the normalized Mittag-Leffler distribution of order α = 1 − β.

Proof of Theorem 3.4.8. Since B_n = n^{\beta}L(n) is regularly varying of order β, by Karamata's integral theorem (cf. e.g. [62], Theorem A.9) the sequence \{a_n\}_{n=1}^{\infty} is regularly varying of order α = 1 − β ∈ (0, 1/2] and a_n \sim \frac{Cn^{\alpha}}{L(n)}. By Lemma 3.4.5, (X, B, \mu_y, \tilde T) is pointwise dual ergodic. Since a_n is regularly varying, applying Corollary 3.3.8, for any f ∈ L^1(\mu_y), f ≥ 0, one has strong convergence in the following sense:

\int_X g\Big(\frac{S_n^{\tilde T}(f)}{a_n}\Big)h_y\,d\mu_y \to E\big[g\big(C(y)\mu_y(f)Y_\alpha\big)\big], \qquad (3.11)

for any bounded and continuous function g and any probability density function h_y on (X, B, \mu_y). Here Y_\alpha has the normalized Mittag-Leffler distribution of order α = 1 − β. Define a probability density function H(ω, m) on (X, B, μ) by

H(\omega, m) = \begin{cases} H(\omega), & m = 0,\\ 0, & m \ne 0,\end{cases}

where H(ω) is an arbitrary probability density function on Ω. For each y, let

h_y(\omega, j) = \begin{cases} \dfrac{H(\omega, j)}{\int_X H(\omega,k)\,d\mu_y(\omega,k)}, & \int_X H(\omega,k)\,d\mu_y(\omega,k) \ne 0;\\ 0, & \int_X H(\omega,k)\,d\mu_y(\omega,k) = 0.\end{cases} \qquad (3.12)

Then h_y(\omega, j) is a probability density function on (X, B, \mu_y) for y ∈ U, where U = \{y\in Y : \int_X H(\omega,j)\,d\mu_y(\omega,j)\ne 0\}. By the Disintegration Theorem 3.4.4,

\int_X g\Big(\frac{S_n^{\tilde T}(f)(\omega,x)}{a_n}\Big)H(\omega,x)\,d\mu(\omega,x)
= \int_U\int_X g\Big(\frac{S_n^{\tilde T}(f)(\omega,x)}{a_n}\Big)H(\omega,x)\,d\mu_y(\omega,x)\,d\lambda(y) + \int_{Y\setminus U}\int_X g\Big(\frac{S_n^{\tilde T}(f)(\omega,x)}{a_n}\Big)H(\omega,x)\,d\mu_y(\omega,x)\,d\lambda(y)
= \int_U\Big(\int_X H(\omega,x)\,d\mu_y\Big)\int_X g\Big(\frac{S_n^{\tilde T}(f)(\omega,x)}{a_n}\Big)h_y(\omega,x)\,d\mu_y(\omega,x)\,d\lambda(y)
= \int_Y\Big(\int_X H(\omega,x)\,d\mu_y\Big)\int_X g\Big(\frac{S_n^{\tilde T}(f)(\omega,x)}{a_n}\Big)h_y(\omega,x)\,d\mu_y(\omega,x)\,d\lambda(y)
\to \int_Y \mu_y(H)\,E\big[g\big(C(y)\mu_y(f)Y_\alpha\big)\big]\,d\lambda(y).

In the last step, the Dominated Convergence Theorem is used. Let f = \mathbf{1}_\Omega\otimes\mathbf{1}_{\{0\}}; then C(y)\mu_y(f) = 1. The result above asserts that

\int_\Omega g\Big(\frac{\ell_n}{a_n}\Big)H(\omega)\,dP(\omega) \to E[g(Y_\alpha)]

for any bounded and continuous function g and any probability density function H on (Ω, F, P).

Corollary 3.4.9 (Gibbs-Markov transformation [5]). Let (Ω, B, P, T, α) be a mixing, probability preserving Gibbs-Markov map (see [5] for the definition), and let φ : Ω → Z be Lipschitz continuous on each a ∈ α, with

D_\alpha\varphi := \sup_{a\in\alpha}D_a\varphi = \sup_{a\in\alpha}\sup_{x,y\in a}\frac{|\varphi(x)-\varphi(y)|}{d(x,y)} < \infty,

and distribution G in the domain of attraction of a stable law of order 1 < d ≤ 2. Then \{X_n := \varphi\circ T^{n-1}\} has a conditional local limit theorem with B_n = n^{1/d}L(n), where L(n) is a slowly varying function. By Theorem 3.4.8, the scaled local time of S_n converges strongly to the Mittag-Leffler distribution.

In [5], the conditions for finite and countable state Markov chains and Markov interval maps to imply the Gibbs-Markov property are listed. Next, we show two examples of stationary processes whose local times converge to the Mittag-Leffler distribution.

Example 3.4.10 (Continued Fractions). Any irrational number x ∈ (0, 1] can be uniquely expressed as a simple non-terminating continued fraction

x = [0; c_1(x), c_2(x), \cdots] =: \cfrac{1}{c_1(x)+\cfrac{1}{c_2(x)+\cfrac{1}{c_3(x)+\cdots}}}.

The continued fraction transformation T is defined by

T(x) = \frac{1}{x} - \Big[\frac{1}{x}\Big].

Define φ : (0,1] → N by φ(x) = c_1(x) and X_n := \varphi\circ T^{n-1}. We have the following convergence in distribution with respect to any absolutely continuous probability measure m ≪ λ, where λ is the Lebesgue measure:

\frac{\sum_{i=1}^{n}X_i}{n/\log 2} - \log n \to F,

where F has a stable distribution (cf. e.g. [50]).

Let a_n := \{x\in(0,1] : c_1(x) = n\} for every n ∈ N_+ and let the partition be α = \{a_n : n\in\mathbb{N}\}. Then (Ω, B, μ, T, α), with Ω = (0,1], is the continued fraction transformation. It is a mixing and measure preserving Gibbs-Markov map with respect to the Gauss measure d\mu = \frac{1}{\ln 2}\frac{1}{1+x}dx. Define the metric on Ω by d(x,y) = r^{\inf\{n : a_n(x)\ne a_n(y)\}}, where r ∈ (0,1). Note that φ is Lipschitz continuous on each partition element.

Define (X, F, ν, T_X, β) to be the direct product of (Ω, B, μ, T, α) with itself, with metric d_X((x,y),(x',y')) = \max\{d(x,x'), d(y,y')\}. One can check that (X, F, ν, T_X, β) is still a mixing and measure preserving Gibbs-Markov map. Let f : X → Z be defined by f(x,y) = \varphi(x) - \varphi(y). Since φ is Lipschitz on the partition α, so is f. Define Y_n((x,y)) = f\circ T_X^{n-1}(x,y) = X_n(x) - X_n(y) for (x,y) ∈ X; Y_n is in the domain of attraction of a stable law. Let S_n := \sum_{i=1}^{n}Y_i. The local time at level 0 of S_n is \ell_n(x,y) = \sum_{i=1}^{n}\mathbf{1}_{\{S_i(x,y)=0\}}. Applying Corollary 3.4.9 to the Gibbs-Markov map (X, F, ν, T_X, β) and the Lipschitz continuous function f, S_n has a conditional local limit theorem and, after scaling, the local time \ell_n converges to the Mittag-Leffler distribution. A small numerical sketch follows.
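The sketch below is a purely qualitative illustration of ours, not an implementation from the text: it iterates the Gauss map in double precision (which only approximates the true orbit after a few dozen steps, but is adequate for a picture), forms Y_k = c_k(x) − c_k(y) for two random starting points, and counts how often the partial sums return to 0. No normalization is attempted, since a_n here involves the slowly varying scaling of the stable case.

import random

def partial_quotients(x, n):
    # first n continued fraction digits of x via the Gauss map T(x) = 1/x - [1/x]
    digits = []
    for _ in range(n):
        y = 1.0 / x
        digits.append(int(y))
        x = y - int(y)
        if x == 0.0:
            x = random.random()   # guard against rare exact-zero round-off
    return digits

random.seed(4)
x, y = random.random(), random.random()
n = 20_000
cx, cy = partial_quotients(x, n), partial_quotients(y, n)

S, zeros = 0, 0
for k in range(n):
    S += cx[k] - cy[k]      # Y_k = X_k(x) - X_k(y)
    zeros += (S == 0)       # local time of S_n at level 0

print(zeros)                # the difference walk keeps returning to 0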

Example 3.4.11 (β transformation). Fix β > 1 and let T : [0,1] → [0,1] be defined by Tx := βx mod 1. Let φ : [0,1] → Z be defined by φ(x) = [βx], and X_n(x) = \varphi\circ T^{n-1}(x) = [\beta T^{n-1}x]. There exists an absolutely continuous invariant probability measure P. By [1], there is a conditional local limit theorem for the partial sum S_n of \{X_n\}. Theorem 3.4.8 can then be applied to ([0,1], B, P, T) and \{X_n\}, and it follows that the scaled local time of S_n at level E[φ] converges to the Mittag-Leffler distribution.

3.5 Limit theorems of local times of discrete-time fractional Brownian motion

In this section, we study the limiting behavior of the local time of discrete-time fractional Brownian motion.

Since the state space of fractional Brownian motion B_H is R, we consider its occupation time. Suppose D is a subset of the state space R and V := \mathbf{1}_D; the occupation time of B_H at time n is defined to be \lambda(n,D) := \sum_{i\le n}V(B_H(i)). We have the following result for the occupation time of the discrete-time fractional Brownian motion.

Theorem 3.5.1 (Limiting distribution of the occupation time of fractional Brownian motions). Let (X_n)_{n\in\mathbb{N}} be as in Theorem 2.5.1. Denote the occupation time of S_k in the interval (a,b) at time n by \ell_n([a,b]) = \sum_{i=1}^{n}\mathbf{1}_{(a,b)}(S_i). Then there exists a sequence of numbers a_n = O(n^{1-H}) such that

\frac{1}{2\epsilon}\int_{-\epsilon}^{\epsilon}\Big(\int_\Omega g\Big(\frac{\ell_n([a-x, b-x])}{a_n}\Big)\psi(\omega)\,dP(\omega)\Big)dx \to E[g((b-a)Y_\alpha)], \qquad (3.13)

for any ε > 0, any bounded and continuous function g, and any probability density function ψ ∈ L^1(P), where Y_\alpha is a random variable having the Mittag-Leffler distribution with index α = 1 − H.

Remark 3.5.2. (1) Taking ψ ≡ 1, one could try to evaluate the left-hand side as ε → 0. This would show that the occupation times have a weak limit which is the Mittag-Leffler distribution. We do not know this, but the result does show convergence in the weak-* sense in L^\infty(dx). (2) We do not know the connection of this result to the local time of fractional Brownian motion. In [33] it is remarked that the law of the local time of a fractional Brownian motion is not a Mittag-Leffler distribution unless it is Brownian motion, although Kono's result in [38] suggested that it might be. Theorem 3.5.1 may give a hint to explain this phenomenon. Kasahara and Matsumoto found that the limiting distribution of the occupation time of B_H is similar to, but not equal to, a Mittag-Leffler distribution.

3.5.1 Occupation times of discrete-time fractional Brown- ian motions

In this section, we interpret the stationary Gaussian random variables \{X_n\} above from the point of view of dynamical systems.

Without loss of generality, suppose the random variables \{X_n\} are defined on the probability space (Ω, Σ, P) with Ω = R^{\mathbb{N}}, where Σ is the σ-algebra generated by the cylinder sets of R^{\mathbb{N}}. Let T be the shift operator

T : \Omega\to\Omega, \qquad (T\omega)_i = \omega_{i+1},

where ω = (\omega_1, \omega_2, \ldots) ∈ R^{\mathbb{N}}. Define φ : Ω → R by \varphi(\omega) := \omega_1, with \int_\Omega|\varphi|\,dP < \infty and \int_\Omega\varphi\,dP = 0. Let the random variables X_n(\omega) := \varphi\circ T^{n-1}(\omega) = \omega_n be jointly Gaussian with zero mean: for any family of Borel sets C_1, C_2, \ldots, C_r \subset \mathbb{R},

P\big(\{\omega\in\Omega : X_{n_1}(\omega)\in C_1, X_{n_2}(\omega)\in C_2, \ldots, X_{n_r}(\omega)\in C_r\}\big) = \int_{C_1\times C_2\times\cdots\times C_r}p(t)\,dt.

Here p is the normal probability density function p(t) = C\exp\big(-\tfrac{1}{2}(Dt,t)\big), where t = (t_1, t_2, \cdots, t_r) and D is the inverse of the covariance matrix B = (b(n_i - n_j)), with b(n_i - n_j) = E[X_{n_i}X_{n_j}].

We represent the occupation time of \{S_n\} by introducing the skew product as before: let (X, B, μ) = (Ω × R, Σ ⊗ σ(R), P ⊗ m_{\mathbb{R}}), where σ(R) is the Borel σ-algebra and m_{\mathbb{R}} is the Lebesgue measure on R. Define \tilde T : \Omega\times\mathbb{R}\to\Omega\times\mathbb{R} by \tilde T(\omega, r) := (T(\omega), r+\varphi(\omega)); then, by induction, \tilde T^{n}(\omega, r) = (T^{n}\omega, r+S_n(\omega)), where S_n is the partial sum of \{X_n\}. Define S_n^{\tilde T}(f) := \sum_{k=0}^{n-1}f\circ\tilde T^{k} for any f : X → R; in particular, S_n^{T}(\varphi) = S_n. Let A = Ω × D; then the occupation time of \{S_n\} in the set D ⊂ R has the following representation:

\lambda(n,D) = \sum_{i=1}^{n}\mathbf{1}_{\{S_i\in D\}} = \sum_{i=1}^{n}\mathbf{1}_A(\tilde T^{i}(\omega,0)) = S_n^{\tilde T}(\mathbf{1}_A)(\omega,0). \qquad (3.14)

Similarly, we can make an ergodic decomposition of the dynamical system

(X, B, µ, T˜).

Proposition 3.5.3 (Conservative and Ergodic Decomposition).

1. T˜ is a conservative and measure preserving transformation of (X, B, µ).

2. There exists a probability space (Y, C, λ) and a collection of measures {µy : y ∈ Y } on (X, B) such that

(a) For y ∈ Y , λ- almost surely, T˜ is a conservative ergodic measure-

preserving transformation of (X, B, µy).

(b) For A ∈ B, the map y ↦ \mu_y(A) is measurable and

\mu(A) = \int_Y\mu_y(A)\,d\lambda(y).

˜ 3. λ-almost surely for y, (X, B, µy, T ) is pointwise dual ergodic.

Proof. 1. By Corollary 8.1.5 in [3], to prove that \tilde T is conservative it is sufficient to prove that (1) φ : Ω → R is integrable with \int_\Omega\varphi\,dP = 0, and (2) T is ergodic and probability measure preserving on (Ω, Σ, P).

By assumption, (1) is true. For (2), by [22] (page 369), \lim_{n\to\infty}b(n) = 0 is a necessary and sufficient condition for T to be mixing: |P(A\cap T^{-n}B) -

P(A)P(B)| \to 0 as n → ∞, where b(n) = E[X_k X_{k+n}]. This implies that T

is ergodic. T is also a probability preserving transformation, since {Xn} are stationary. Hence T˜ is conservative. T˜ is measure preserving since T is measure preserving.

2. For the second and the third parts of Proposition 3.5.3, the proof is similar to

that of Proposition 3.4.6, except that the return sequence is a_nC(y), where C(y) = \frac{b-a}{\mu_y(\Omega\otimes(a,b))} for any interval (a,b) with b > a.

We end this section with the proof of Theorem 3.5.1.

Proof. Since \tilde T is pointwise dual ergodic with respect to the measure \mu_y, a_n is regularly varying with index α = 1 − H and has the same order as \sum_{i=0}^{n}\frac{g(0)}{d_i}, where d_n is the scaling coefficient in Theorem 2.5.1. Then, by Corollary 3.3.8, \frac{S_n^{\tilde T}}{C(y)a_n} converges strongly in distribution, i.e.,

\int_X g\Big(\frac{S_n^{\tilde T}(f)(\omega,x)}{C(y)a_n}\Big)h_y(\omega,x)\,d\mu_y(\omega,x) \to E[g(\mu_y(f)Y_\alpha)], \qquad (3.15)

or equivalently,

\int_X g\Big(\frac{S_n^{\tilde T}(f)(\omega,x)}{a_n}\Big)h_y(\omega,x)\,d\mu_y(\omega,x) \to E[g(C(y)\mu_y(f)Y_\alpha)], \qquad (3.16)

for any bounded and continuous function g and any h_y ∈ L^1(\mu_y) with \int_X h_y\,d\mu_y = 1, where S_n^{\tilde T}(f) = \sum_{i=1}^{n}f\circ\tilde T^{i-1} and Y_\alpha has the normalized Mittag-Leffler distribution of order α = 1 − H.

Let f = \mathbf{1}_\Omega\otimes\mathbf{1}_{(a,b)}; then S_n^{\tilde T}(f)(\omega,x) = \sum_{i=1}^{n}\mathbf{1}_{(a,b)}(x+S_i(\omega)), which is the occupation time of S_n at time n in the interval (a-x, b-x). Since C(y) = \frac{b-a}{\mu_y(\mathbf{1}_\Omega\otimes\mathbf{1}_{(a,b)})}, we have C(y)\mu_y(\mathbf{1}_\Omega\otimes\mathbf{1}_{(a,b)}) = \int\mathbf{1}_{(a,b)}\,dm = b-a, so the right-hand side of (3.16) simplifies to E[g((b-a)Y_\alpha)]. Let H(ω, x) be any probability density function on (X, B, μ); for each y, define

h_y(\omega,x) = \begin{cases}\dfrac{H(\omega,x)}{\int_X H(\omega,x)\,d\mu_y}, & \int_X H(\omega,x)\,d\mu_y \ne 0;\\ 0, & \int_X H(\omega,x)\,d\mu_y = 0.\end{cases} \qquad (3.17)

h_y(ω, x) is a density function on (X, B, \mu_y) for y ∈ U, where U = \{y\in Y : \int_X H(\omega,x)\,d\mu_y \ne 0\}. By (3.16), one has

\int_X g\Big(\frac{\sum_{i=1}^{n}\mathbf{1}_{(a,b)}(x+S_i(\omega))}{a_n}\Big)H(\omega,x)\,d\mu
= \int_U\int_X g\Big(\frac{\sum_{i=1}^{n}\mathbf{1}_{(a,b)}(x+S_i(\omega))}{a_n}\Big)H(\omega,x)\,d\mu_y\,d\lambda(y) + \int_{Y\setminus U}\int_X g\Big(\frac{\sum_{i=1}^{n}\mathbf{1}_{(a,b)}(x+S_i(\omega))}{a_n}\Big)H(\omega,x)\,d\mu_y\,d\lambda(y)
= \int_U\Big(\int_X H(\omega,x)\,d\mu_y\Big)\int_X g\Big(\frac{\sum_{i=1}^{n}\mathbf{1}_{(a,b)}(x+S_i(\omega))}{a_n}\Big)h_y(\omega,x)\,d\mu_y\,d\lambda(y)
= \int_Y\Big(\int_X H(\omega,x)\,d\mu_y\Big)\int_X g\Big(\frac{\sum_{i=1}^{n}\mathbf{1}_{(a,b)}(x+S_i(\omega))}{a_n}\Big)h_y(\omega,x)\,d\mu_y\,d\lambda(y)
\to \int_Y \mu_y(H)\,E[g((b-a)Y_\alpha)]\,d\lambda(y) \quad \text{(by the Dominated Convergence Theorem)}
= E[g((b-a)Y_\alpha)].

Let H(\omega,x) = \frac{1}{2\epsilon}\mathbf{1}_{(-\epsilon,\epsilon)}(x)\otimes\psi(\omega), where ε > 0 and ψ(ω) is a probability density function on (Ω, Σ, P). Then, as n → ∞,

\frac{1}{2\epsilon}\int_{-\epsilon}^{\epsilon}\Big(\int_\Omega g\Big(\frac{\sum_{i=1}^{n}\mathbf{1}_{(a-x,b-x)}(S_i(\omega))}{a_n}\Big)\psi(\omega)\,dP(\omega)\Big)dx \to E[g((b-a)Y_\alpha)].
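The scaling a_n = O(n^{1-H}) in Theorem 3.5.1 can be eyeballed by simulation. The sketch below generates fractional Gaussian noise via a Cholesky factorization of its covariance (a standard construction, not taken from the text), sums it into a discrete-time fBm path, and looks at the occupation time of an interval divided by n^{1-H}; all sizes, the interval and the seed are our own choices, and no claim about constants is made.

import numpy as np

rng = np.random.default_rng(5)
H, n, trials = 0.7, 2_000, 200

# covariance of fractional Gaussian noise: r(k) = (|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}) / 2
k = np.arange(n)
r = 0.5 * (np.abs(k + 1) ** (2 * H) - 2 * np.abs(k) ** (2 * H) + np.abs(k - 1) ** (2 * H))
cov = r[np.abs(k[:, None] - k[None, :])]
L = np.linalg.cholesky(cov + 1e-12 * np.eye(n))   # tiny ridge for numerical safety

a, b = -1.0, 1.0
occ = np.empty(trials)
for i in range(trials):
    X = L @ rng.standard_normal(n)    # fractional Gaussian noise increments
    S = np.cumsum(X)                  # discrete-time fractional Brownian motion
    occ[i] = np.count_nonzero((a < S) & (S < b))

print(np.mean(occ) / n ** (1 - H))    # occupation time scales like n^{1-H} (up to constants)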

3.5.2 Occupation times of continuous fractional Brownian motions

In [33], Kasahara and Matsumoto studied the occupation time of the d-dimensional fractional Brownian motion B_H^d = (B_{H,1}, B_{H,2}, \cdots, B_{H,d}), where the B_{H,i} are independent copies of B_H. If Hd < 1, then the local time of B_H^d exists and there exists a jointly continuous version L^d(t,x) such that

\int_0^t f(B_H^d(u))\,du = \int_{\mathbb{R}^d}f(x)L^d(t,x)\,dx

for any bounded and continuous function f. The Kallianpur-Robbins law states that, for a bounded summable function V on R^d:

1. if 0 < Hd < 1, then

\frac{1}{t^{1-Hd}}\int_0^t V(B_H^d(s))\,ds \to \frac{\bar V}{\sqrt{2\pi}^{\,d}}\,L^d(1,0)

as t → ∞;

2. if Hd = 1 and d ≥ 2, then

\frac{1}{\log t}\int_0^t V(B_H^d(s))\,ds \to \frac{\bar V}{\sqrt{2\pi}}\,L_1

as t → ∞, where L_1 ∼ \exp(1) and \bar V = \int V(x)\,dx.

Putting this result together with the Darling-Kac theorem, a question of interest is whether L^d(t,x) has the Mittag-Leffler distribution. Kasahara and Matsumoto found that the limiting distribution of the occupation time of B_H^d is similar to, but not, the Mittag-Leffler distribution, which is consistent with our conclusion for the discrete-time fractional Brownian motion.

Theorem 3.5.4 ([33]). Suppose d ≥ 2, 0 < Hd < 1, and let α = 1 − Hd. Then

E[L^d(1,0)^n]\ \begin{cases} = \dfrac{1}{\sqrt{2\pi}^{\,nd}}\dfrac{n!\,\Gamma(\alpha)^n}{\Gamma(\alpha n+1)}, & n = 1,\\[2mm] > \dfrac{1}{\sqrt{2\pi}^{\,nd}}\dfrac{n!\,\Gamma(\alpha)^n}{\Gamma(\alpha n+1)}, & n \ge 2.\end{cases}

Also

E[L^d(1,0)^n] \le \frac{1}{\sqrt{\pi n}^{\,d}}\,\frac{n!\,\gamma(\alpha)^n}{\gamma(\alpha n+1)}, \qquad n\ge 1.

One corollary is that when d ≥ 2, the distribution of L^d(1,0) is not the Mittag-Leffler distribution of any index.

Remark 3.5.5. The proof that the limiting distribution is not Mittag-Leffler assumes d ≥ 2 and 0 < Hd < 1. In fact, the proof still works when H ≠ 1/2 and 0 < Hd < 1.

Chapter 4

Almost Sure Central Limit Theorems

4.1 Almost sure central limit theorems for local times of random walks

In this chapter, we discuss a new type of limit theorem: the almost sure central limit theorem (ASCLT). It was first discovered by Brosamler (1988, [20]) and Schatte (1988, [57]) independently. It has been extensively investigated for partial sums of independent random variables and for some dependent variables.

The simplest form of the ASCLT is for a sequence \{X_n\} of i.i.d. random variables with mean zero and variance σ^2 < ∞. Let S_n be the partial sum of \{X_n\}. Then

\lim_{N\to\infty}\frac{1}{\log N}\sum_{k=1}^{N}\frac{1}{k}\mathbf{1}_{\{S_k\le\sigma\sqrt{k}\,x\}} = \Phi(x) \quad \text{a.s.},

where Φ is the standard normal distribution function.
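A direct numerical check of this almost sure statement, for one fixed sample path of a ±1 walk (our own toy choice), looks as follows; convergence in log N is slow, so the agreement with Φ(x) is only approximate.

import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(6)
N = 200_000
X = rng.choice([-1.0, 1.0], size=N)      # i.i.d., mean 0, sigma = 1
S = np.cumsum(X)
k = np.arange(1, N + 1)

x = 0.5
log_avg = np.sum((S <= np.sqrt(k) * x) / k) / np.log(N)   # (1/log N) sum (1/k) 1{S_k <= sigma sqrt(k) x}
print(round(log_avg, 3), round(0.5 * (1 + erf(x / sqrt(2))), 3))   # compare with Phi(x)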

For independent random variables {Xn}, the general form of ASCLT is

\lim_{N\to\infty}\frac{1}{\log N}\sum_{k=1}^{N}\frac{1}{k}\mathbf{1}_{\{(S_k-a_k)/b_k\le x\}} = \Psi(x) \quad \text{a.s.}

for some cumulative distribution function Ψ. It is a direct consequence of the central limit theorem

\frac{S_n - a_n}{b_n} \to \mathcal{N}(0,1)

under some moment hypotheses on the underlying variables \{X_n\}. So, despite its pointwise nature, the ASCLT is weaker than convergence in distribution. In [11], Berkes and Csáki show that not only the central limit theorem, but every weak limit theorem for independent random variables has an analogous almost sure "logarithmic" version. The generic form of a weak limit theorem for a sequence of independent random variables is

f_k(X_1, X_2, \cdots) \to G,

where the f_k : \mathbb{R}^\infty\to\mathbb{R} are measurable functions and G is a distribution function. For example, when f_k(x_1, x_2, \cdots) = \sum_{i=1}^{k}x_i/k, it corresponds to the central limit theorem. When f_k(x_1, x_2, \cdots) = \frac{1}{a_k}\sum_{i=1}^{k}x\big(\sum_{j=1}^{i}x_j\big) for some a_k, it becomes the weak convergence of the local time of the process \{X_n\}. Other examples include extrema, U-statistics and so on.

In particular, we are interested in the local time. Let X_1, X_2, \ldots be i.i.d. integer-valued random variables with EX_1 = 0. Let ψ(t) be the characteristic function of X_1, assume that ψ(2πt) = 1 if and only if t is an integer, and that ψ satisfies

\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\psi(t)}{1-\lambda\psi(t)}\,dt \sim \frac{C}{(1-\lambda)^{\alpha}} \quad \text{as } \lambda\to 1,

where 0 < α ≤ 1/2. Let S_n be the partial sum of \{X_n\}: S_n = X_1 + X_2 + \cdots + X_n. It follows that the random walk \{S_n, n\ge 1\} is aperiodic and X_1 is in the domain of attraction of a stable law of order d = 1/(1-\alpha) ∈ (1, 2]. Define the local time \ell(n,x) as before. It was shown by Darling and Kac (1957) [24] that

\lim_{n\to\infty}P\Big(\frac{\ell_n}{Cn^{\alpha}} < x\Big) = F_\alpha(x) = \frac{1}{\pi\alpha}\sum_{j=1}^{\infty}\frac{(-1)^{j-1}}{j!\,j}\sin(\pi\alpha j)\,\Gamma(1+\alpha j)\,x^{j}.

Also the local time has ASCLT.

Theorem 4.1.1 ([11]). For the random walk defined above, one has

\lim_{N\to\infty}\frac{1}{\log N}\sum_{k=1}^{N}\frac{1}{k}\mathbf{1}_{\{\ell_k/(Ck^{\alpha}) < x\}} = F_\alpha(x) \quad \text{a.s.}

4.2 Almost sure central limit theorem for stationary processes

When the sequence \{X_n\} is dependent, we would like to know whether the local time of its partial sum still satisfies an almost sure central limit theorem. In general, this is not always true. In this section, we study the case when \{X_n\} is stationary and its characteristic operator has a spectral gap.

Suppose \{X_n\} is a stationary process on a probability space (Ω, B, P) such that there exist a random variable φ : Ω → Z with finite Lipschitz constant and a measure preserving transformation T : Ω → Ω with X_n = \varphi\circ T^{n-1}. Define the characteristic function operator P_t : L^1(P) → L^1(P) by P_t f := P_T(e^{it\varphi}f), where P_T is the Perron-Frobenius operator. P_t is a perturbation of P_T around t = 0, and P_t and P_T coincide when t = 0. By induction, P_t^n f = P_{T^n}(e^{itS_n}f), where S_n = \sum_{i\le n}\varphi\circ T^{i-1} =: S_n(\varphi). Let L be the space L^1(P) equipped with the Lipschitz norm \|f\| := \|f\|_\infty + D_f, where D_f is the Lipschitz constant of f. Under suitable conditions on the characteristic operator P_t, we obtain the almost sure central limit theorem for the local time of S_n. Here, we prove that an almost sure weak limit theorem holds for local times of stationary processes when the local limit theorem holds in a stronger form than Definition 3.4.1.

Definition 4.2.1. An integer-valued stationary process \{X_n\}_{n\ge 1} is said to satisfy the L^\infty conditional local limit theorem at 0 if there exists a sequence g_n ∈ R of real constants such that

\lim_{n\to\infty}g_n = g(0) > 0

and

\|B_n P_{T^n}(\mathbf{1}_{\{S_n=x\}}) - g_n\|_\infty

decreases exponentially fast.

This condition is stronger than (3.7): the convergence is required to hold in L^\infty(P) and to be exponentially fast.

Theorem 4.2.2 (Almost sure central limit theorem for the local times). Let

\{X_n\}_{n\in\mathbb{N}} be an integer-valued stationary process satisfying the local limit theorem at 0 in Definition 4.2.1 with B_n = n^{\beta}L(n), where the slowly varying function L(n) converges to c > 0. Moreover, assume that the following two conditions are satisfied: for some constants K > 0 and δ > 0, for all bounded Lipschitz continuous functions g, F ∈ C_b(R) and all x ∈ Z it holds that

\mathrm{Cov}\big(g(\ell_k),\,F\circ T^{2k}\big) \le C(\log\log k)^{-1-\delta} \qquad (4.1)

and

\sum_{n=1}^{\infty}\big|E\big(\mathbf{1}_{\{S_n=x\}} - \mathbf{1}_{\{S_n=0\}}\big)\big| \le K\big(1 + |x|^{\frac{\alpha}{1-\alpha}}\big), \qquad (4.2)

where α = 1 − β. Then

\lim_{N\to\infty}\frac{1}{\log N}\sum_{k=1}^{N}\frac{1}{k}\mathbf{1}_{\{\ell_k/a_k\le x\}} = M(x) \quad \text{a.s.} \qquad (4.3)

is equivalent to

\lim_{N\to\infty}\frac{1}{\log N}\sum_{k=1}^{N}\frac{1}{k}P\Big(\frac{\ell_k}{a_k}\le x\Big) = M(x), \qquad (4.4)

where M(x) is a cumulative distribution function.

Corollary 4.2.3 (Gibbs-Markov maps). The almost sure central limit theorem for the local times holds under the same setting as Corollary 3.4.9.

This is because the conditional local limit theorem in the sense of Definition 4.2.1 holds and the assumptions on the transfer operator in Section 4.4 are satisfied.

4.3 Proof of almost sure central limit theorem (ASCLT)

4.3.1 Proof of Theorem 4.2.2

In this section, we shall show that the local time `n of {Xn} has an almost sure weak convergence theorem under the assumptions of Theorem 4.2.2. The following proposition will be used in the proof of Theorem 4.2.2, so we state it below and the proof of it is in Section 4.3.2.

Proposition 4.3.1.

\mathrm{Var}\Big(\frac{1}{\log N}\sum_{k=1}^{N}\frac{1}{k}\,g\Big(\frac{\ell_k}{a_k}\Big)\Big) = O\big((\log\log N)^{-1-\delta}\big)

for some δ > 0, as N → ∞, where g is any bounded Lipschitz function with Lipschitz constant 1.

Once Proposition 4.3.1 is granted, Theorem 4.2.2 can be proved using standard arguments. We sketch these briefly.

Proof of Theorem 4.2.2. By the dominated convergence theorem, statement (4.3) implies (4.4) by taking expectations. To prove the other direction, it is sufficient to prove (see e.g. Lacey and Philipp, 1990 [39]) that

\lim_{N\to\infty}\frac{1}{\log N}\sum_{k=1}^{N}\frac{1}{k}\xi_k = 0 \quad \text{a.s.} \qquad (4.5)

for any bounded Lipschitz continuous function g with Lipschitz constant 1, where

\xi_k := g\Big(\frac{\ell_k}{a_k}\Big) - E\Big[g\Big(\frac{\ell_k}{a_k}\Big)\Big].

Taking N_i = \exp(\exp(i^{\epsilon})) for any \epsilon > \frac{1}{1+\delta}, Proposition 4.3.1 implies

\sum_{i=1}^{\infty}\frac{1}{\log^2 N_i}E\Big(\sum_{k=1}^{N_i}\frac{1}{k}\xi_k\Big)^2 < \infty. \qquad (4.6)

By the Borel-Cantelli lemma,

\lim_{i\to\infty}\frac{1}{\log N_i}\sum_{k=1}^{N_i}\frac{1}{k}\xi_k = 0 \quad \text{a.s.} \qquad (4.7)

For any N, there exists k such that N_k ≤ N < N_{k+1}, and we have

\Big|\frac{1}{\log N}\sum_{j=1}^{N}\frac{1}{j}\xi_j\Big| \le \frac{1}{\log N_k}\Big(\Big|\sum_{j=1}^{N_k}\frac{1}{j}\xi_j\Big| + \sum_{j=N_k+1}^{N_{k+1}}\frac{|\xi_j|}{j}\Big)
\le \frac{1}{\log N_k}\Big|\sum_{j=1}^{N_k}\frac{1}{j}\xi_j\Big| + \frac{C}{\log N_k}\big(\log N_{k+1} - \log N_k\big)
\to 0 \quad \text{as } k\to\infty,\ \text{a.s.}

The last step holds because ((1+k)^{\epsilon} - k^{\epsilon}) \to 0 as k → ∞ for any ε < 1, so \frac{\log N_{k+1}}{\log N_k} = e^{((1+k)^{\epsilon}-k^{\epsilon})} \to 1 as k → ∞. Hence (4.5) holds and the proof is done.

4.3.2 Proof of Proposition 4.3.1

Proposition 4.3.1 is a result of the following two lemmas.

Lemma 4.3.2. Suppose \{X_n\} satisfies the conditional local limit theorem of Definition 4.2.1. Then E[\ell_n] = O(a_n), where a_n = g(0)\sum_{i=1}^{n}\frac{1}{B_i}.

Proof. Since the convergence in the conditional local limit theorem is in the sense of Definition 4.2.1, B_nP(S_n = 0) = B_n E\big(P(S_n = 0\mid T^n)\big) \to g(0) as n → ∞. So

E(\ell_n) = \sum_{i=1}^{n}P(S_i = 0) \sim a_n = g(0)\sum_{i=1}^{n}\frac{1}{B_i}.

Lemma 4.3.3. When j > 2k, and k, j → ∞,

E\big(\ell_j - \ell(j, S_{2k}) - \ell_{2k} + \ell(2k, S_{2k})\big)^2 = O\big(a_j\,E[|S_{2k}|^{\frac{\alpha}{1-\alpha}}]\big), \qquad (4.8)

where α ∈ (0, 1/2] and a_n = \sum_{i=1}^{n}\frac{g(0)}{B_i}.

Remark 4.3.4. In the i.i.d. case, by Kesten and Spitzer (1979) [34], it is known that E(\ell(n,x)-\ell(n,y))^2 \le C|x-y|^{\frac{\alpha}{1-\alpha}}n^{\alpha} when X_n is in the domain of attraction of a stable law of order d = \frac{1}{1-\alpha}.

Proof.

E\big|\ell_j - \ell(j, S_{2k}) - \ell_{2k} + \ell(2k, S_{2k})\big|^2
= E\Big(\sum_{x\in\mathbb{Z}}\Big(\sum_{i=2k+1}^{j}\mathbf{1}_{\{S_i=x\}} - \mathbf{1}_{\{S_i=0\}}\Big)\mathbf{1}_{\{S_{2k}=x\}}\Big)^2

\le 2\sum_{2k+1\le j_1\le j_2\le j}\sum_{x\in\mathbb{Z}}E\Big(\mathbf{1}_{\{S_{j_1}=x\}}\big(\mathbf{1}_{\{S_{j_2}=x\}} - \mathbf{1}_{\{S_{j_2}=0\}}\big)\mathbf{1}_{\{S_{2k}=x\}}\Big)

+ 2\sum_{2k+1\le j_1\le j_2\le j}\sum_{x\in\mathbb{Z}}E\Big(\mathbf{1}_{\{S_{j_1}=0\}}\big(\mathbf{1}_{\{S_{j_2}=x\}} - \mathbf{1}_{\{S_{j_2}=0\}}\big)\mathbf{1}_{\{S_{2k}=x\}}\Big).

Due to the similar form of the two terms above, let z_1 ∈ \{x, 0\}. Then

E\Big(\mathbf{1}_{\{S_{j_1}=z_1\}}\big(\mathbf{1}_{\{S_{j_2}=x\}} - \mathbf{1}_{\{S_{j_2}=0\}}\big)\mathbf{1}_{\{S_{2k}=x\}}\Big)
= E\Big(\mathbf{1}_{\{S_{j_1-2k}\circ T^{2k}=z_1-x\}}\big(\mathbf{1}_{\{S_{j_2-j_1}\circ T^{j_1}=x-z_1\}} - \mathbf{1}_{\{S_{j_2-j_1}\circ T^{j_1}=-z_1\}}\big)\mathbf{1}_{\{S_{2k}=x\}}\Big)
= E\Big(E\big(\mathbf{1}_{\{S_{j_1-2k}\circ T^{2k}=z_1-x\}}\mathbf{1}_{\{S_{2k}=x\}}\mid T^{-j_1}\mathcal{F}\big)\big(\mathbf{1}_{\{S_{j_2-j_1}\circ T^{j_1}=x-z_1\}} - \mathbf{1}_{\{S_{j_2-j_1}\circ T^{j_1}=-z_1\}}\big)\Big)
= E\Big(P_{T^{j_1}}\big(\mathbf{1}_{\{S_{j_1-2k}\circ T^{2k}=z_1-x\}}\mathbf{1}_{\{S_{2k}=x\}}\big)\big(\mathbf{1}_{\{S_{j_2-j_1}=x-z_1\}} - \mathbf{1}_{\{S_{j_2-j_1}=-z_1\}}\big)\Big).

In order to bound this expression consider

B_{j_1-2k}B_{2k}\,P_{T^{j_1}}\big(\mathbf{1}_{\{S_{j_1-2k}\circ T^{2k}=z_1-x\}}\mathbf{1}_{\{S_{2k}=x\}}\big)
= B_{j_1-2k}P_{T^{j_1-2k}}\big(\mathbf{1}_{\{S_{j_1-2k}=z_1-x\}}\big)\,B_{2k}P_{T^{2k}}\big(\mathbf{1}_{\{S_{2k}=x\}}\big),

which, by the assumption of the L^\infty-conditional local limit theorem at 0, can be written in the form g_{j_1-2k}g_{2k} + Z, with g_{j_1-2k}g_{2k} converging to g(0)^2 and Z being an L^\infty random variable with \|Z\|_\infty \le c\theta^{j_1-2k}. Then, for fixed j_2 - j_1, since B_{2k}P(S_{2k} = x) is universally bounded, and using the assumption,

E\Big(E\big(\mathbf{1}_{\{S_{j_1-2k}\circ T^{2k}=z_1-x\}}\mathbf{1}_{\{S_{2k}=x\}}\mid T^{-j_1}\mathcal{F}\big)\big(\mathbf{1}_{\{S_{j_2-j_1}\circ T^{j_1}=x-z_1\}} - \mathbf{1}_{\{S_{j_2-j_1}\circ T^{j_1}=-z_1\}}\big)\Big)

is bounded by

C\,P(S_{2k}=x)\Big(\frac{1}{B_{j_1-2k}}\,q(j_2-j_1,x) + \frac{1}{B_{j_2-j_1}}\,\theta^{j_1-2k}\Big)

for some constant C > 0, where \sum_{j_2-j_1=1}^{\infty}q(j_2-j_1,x) \le Kx^{\frac{\alpha}{1-\alpha}} by (4.2). Summing over x and z_1, then over j_2 and finally over j_1 proves the lemma.

Proof of Proposition 4.3.1. Split \mathrm{Var}\big(\sum_{k=1}^{N}\frac{1}{k}g(\frac{\ell_k}{a_k})\big) into three parts T_1, T_2 and T_3:

\mathrm{Var}\Big(\sum_{k=1}^{N}\frac{1}{k}g\Big(\frac{\ell_k}{a_k}\Big)\Big) = E\Big[\Big(\sum_{k=1}^{N}\frac{1}{k}\xi_k\Big)^2\Big]
= \sum_{k=1}^{N}\frac{1}{k^2}E[\xi_k^2] + 2\sum_{1\le k<j\le 2k}\frac{|E[\xi_k\xi_j]|}{kj} + 2\sum_{1\le k\le 2k<j\le N}\frac{|E[\xi_k\xi_j]|}{kj}

= T_1 + T_2 + T_3.

For T_1, since ξ_k is bounded, there is a constant C_1 such that T_1 ≤ C_1\log N for all N ∈ N. For T_2, there is a constant C_2 such that

T_2 \le \|g\|_\infty^2\sum_{1\le k<j\le 2k}\frac{1}{kj} \le C_2\log N.

For T3, since 1 ≤ k ≤ 2k < j ≤ N, let

f_{(2k,j)} := \frac{1}{a_j}\sum_{i=1}^{j-2k}\mathbf{1}_{\{X_{2k+1}+\cdots+X_{2k+i}=0\}} = \frac{1}{a_j}\big(\ell(j, S_{2k}) - \ell(2k, S_{2k})\big),

which is measurable with respect to \mathcal{F}_{2k+1}^{j} = \sigma(X_{2k+1}, \ldots, X_j). Then

E[\xi_k\xi_j] = \mathrm{Cov}\Big(g\Big(\frac{\ell_k}{a_k}\Big), g\Big(\frac{\ell_j}{a_j}\Big)\Big)
= \mathrm{Cov}\Big(g\Big(\frac{\ell_k}{a_k}\Big), g\Big(\frac{\ell_j}{a_j}\Big) - g(f_{(2k,j)})\Big) + \mathrm{Cov}\Big(g\Big(\frac{\ell_k}{a_k}\Big), g(f_{(2k,j)})\Big).

By assumption (4.1),

\mathrm{Cov}\Big(g\Big(\frac{\ell_k}{a_k}\Big), g(f_{(2k,j)})\Big) \le C(\log\log k)^{-1-\delta} =: C\alpha(k).

Because g is Lipschitz and bounded,

\mathrm{Cov}\Big(g\Big(\frac{\ell_k}{a_k}\Big), g\Big(\frac{\ell_j}{a_j}\Big) - g(f_{(2k,j)})\Big) \le C\,E\Big|\frac{\ell_j}{a_j} - f_{(2k,j)}\Big|.

Moreover,

E\Big|\frac{\ell_j}{a_j} - f_{(2k,j)}\Big| = \frac{1}{a_j}E\big[|(\ell_j - \ell(j, S_{2k})) + \ell(2k, S_{2k})|\big]

\le \frac{1}{a_j}E\big[|(\ell_j - \ell(j, S_{2k})) + \ell(2k, S_{2k}) - \ell_{2k}|\big] + \frac{1}{a_j}E[\ell_{2k}].

So when 1 ≤ k ≤ 2k < j ≤ N, one has

E[\xi_k\xi_j] \le C_1\,E\Big|\frac{\ell_j}{a_j} - f_{(2k,j)}\Big| + C_2\alpha(k)
\le \frac{C_1}{a_j}\Big(E[|\ell_j - \ell(j, S_{2k}) - \ell_{2k} + \ell(2k, S_{2k})|] + E[|\ell_{2k}|]\Big) + C_2\alpha(k).

When 2k < j, Lemma 4.3.2, Lemma 4.3.3 and Jensen's inequality imply that, as k, j → ∞,

E[\xi_k\xi_j] \le \frac{C_1}{a_j}\Big(E[|\ell_j - \ell(j, S_{2k}) - \ell_{2k} + \ell(2k, S_{2k})|] + E[|\ell_{2k}|]\Big) + C_2\alpha(k)
\le \frac{C_1}{a_j}\Big(\big(E[|S_{2k}|^{\frac{\alpha}{1-\alpha}}]\,a_j\big)^{\frac{1}{2}} + a_{2k}\Big) + C_2\alpha(k)
\sim \frac{C_1}{a_j}\Big(B_{2k}^{\frac{\alpha}{2(1-\alpha)}}a_j^{\frac{1}{2}} + a_{2k}\Big) + C_2\alpha(k).

Since B_n = n^{\beta}L(n) and a_n \sim \frac{n^{\alpha}}{L(n)} with α = 1 − β, one has

T_3 = \sum_{1\le k\le N}\sum_{2k<j\le N}\frac{1}{kj}E(\xi_k\xi_j)

\le C\sum_{1\le k\le N}\sum_{2k<j\le N}\frac{1}{kj}\Big(\Big(\frac{2k}{j}\Big)^{\frac{\alpha}{2}}\frac{L(2k)^{\frac{\alpha}{2(1-\alpha)}}}{L(j)^{-1/2}} + \Big(\frac{2k}{j}\Big)^{\alpha}\frac{L(j)}{L(2k)} + \alpha(k)\Big).

Since we assume L(n) → c,

T_3 \le C\sum_{1\le k\le N}\sum_{2k<j\le N}\frac{1}{kj}\Big(\frac{k}{j}\Big)^{\frac{\alpha}{2}} + C\sum_{1\le k\le N}\sum_{2k<j\le N}\frac{1}{kj}\alpha(k)

= T_{31} + T_{32}.

T_{31} \le C\sum_{1\le j\le N}\frac{1}{j^{1+\alpha/2}}\sum_{1\le k<j}k^{\alpha/2-1} = O(\log N).

T_{32} \le C\sum_{k=1}^{N}\frac{\alpha(k)}{k}\sum_{j=k}^{N}\frac{1}{j} \le C\sum_{k=1}^{N}\frac{\log N}{k(\log\log k)^{1+\delta}} = O\big(\log^2 N(\log\log N)^{-(1+\delta)}\big).

So T_3 = O\big((\log\log N)^{-1-\delta}\log^2 N\big) as N → ∞. Hence Proposition 4.3.1 is proved.

4.4 Transfer operators

We turn to the investigation of conditions (4.1) and (4.2) and show that they can be derived from the theory of transfer operators in dynamics. Define the characteristic function operator P_t : L^1(P) → L^1(P) by P_t f := P_T(e^{it\varphi}f), which is a perturbation of P_T. By induction, P_t^n f = P_{T^n}(e^{itS_n}f). Let L be the subspace of L^1(P) of all functions with finite norm \|f\| := \|f\|_\infty + D_f, where D_f is the Lipschitz constant of f. We assume that P_t acts on L and has the following properties:

• There exists δ > 0 such that, for t ∈ C_\delta := [-\delta, \delta], P_t has a representation P_t = \lambda_t\pi_t + N_t with \pi_t N_t = N_t\pi_t = 0, where π_t is a one-dimensional projection generated by an eigenfunction V_t of P_t, i.e. P_tV_t = \lambda_tV_t. This implies that P_t^n = \lambda_t^n\pi_t + N_t^n.

• There exist constants K, K_1 and θ_1 < 1 such that on C_\delta, \|\pi_t\| \le K_1, \|N_t\| \le \theta_1 < 1, and |\lambda_t| \le 1 - K|t|^d.

• There exists θ2 < 1 such that for |t| > δ, kPtk ≤ θ2 < 1.

• φ is Lipschitz continuous.

We end this part by proving conditions (4.1) and (4.2) under these assumptions. This also completes the proof of Corollary 4.2.3, since from [5] one can see that Gibbs-Markov maps satisfy all the assumptions above. An example not satisfying the above conditions can be derived from functions of the fractional Brownian motion.
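These spectral assumptions can be visualized on a toy finite-state example (entirely our own construction, not from the text): for a two-state stationary Markov chain the transfer operator P_T is the time-reversed transition matrix, P_t multiplies by e^{itφ}, and the leading eigenvalue λ_t of the perturbed matrix satisfies |λ_t| ≈ 1 − Kt^2 near t = 0, while the rest of the spectrum stays inside a disc of radius θ_1 < 1.

import numpy as np

# two-state Markov chain with stationary distribution pi; phi is centered so E_pi[phi] = 0
# (the centering makes phi real-valued rather than integer-valued; that is fine for this picture)
Q = np.array([[0.7, 0.3],
              [0.4, 0.6]])
evals, evecs = np.linalg.eig(Q.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()
phi = np.array([1.0, -1.0])
phi = phi - pi @ phi

# transfer operator: (P_T f)(x) = sum_y f(y) pi(y) Q(y, x) / pi(x)   (time-reversed chain)
PT = (pi[None, :] * Q.T) / pi[:, None]

for t in (0.0, 0.05, 0.1, 0.2):
    Pt = PT * np.exp(1j * t * phi)[None, :]           # P_t f = P_T(e^{i t phi} f)
    lam = np.sort(np.abs(np.linalg.eigvals(Pt)))[::-1]
    print(t, round(1 - lam[0], 5), round(lam[1], 3))   # 1 - |lambda_t| grows like t^2; spectral gap below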

Proof of (4.1). Write g\big(\frac{\ell_k}{a_k}\big) = \int_\Omega g\big(\frac{\ell_k}{a_k}\big)\,dP + \hat g =: C_k + \hat g. Then, using P_T^{2k} = \pi_0 + N_0^{2k},

\mathrm{Cov}\Big(g\Big(\frac{\ell_k}{a_k}\Big), F\circ T^{2k}\Big)
= \int_\Omega g\Big(\frac{\ell_k}{a_k}\Big)(F\circ T^{2k})\,dP - \int_\Omega g\Big(\frac{\ell_k}{a_k}\Big)\,dP\int_\Omega F\circ T^{2k}\,dP
= \int_\Omega P_T^{2k}\Big(g\Big(\frac{\ell_k}{a_k}\Big)\Big)F\,dP - \int_\Omega g\Big(\frac{\ell_k}{a_k}\Big)\,dP\int_\Omega F\,dP
= \int_\Omega N_0^{2k}(\hat g)\,F\,dP
\le \|N_0^{2k}(\hat g)\|\,\|F\|_1
\le C\theta_1^{2k}\|g\|
\le C(\log\log(2k))^{-1-\delta}.

Proof of (4.2). Let ζ = (e−itx − 1). Then

E\big(\ell(n,x) - \ell_n\big) = E\Big(\sum_{1\le j\le n}\mathbf{1}_{\{S_j=x\}} - \mathbf{1}_{\{S_j=0\}}\Big)
= \sum_{1\le j\le n}\int_{[-\pi,\pi]}\zeta\,E\big[e^{itS_j}\big]\,dt = \sum_{1\le j\le n}\int_{[-\pi,\pi]}\zeta\,E[P_t^{j}\mathbf{1}]\,dt

and

\int_{[-\pi,\pi]}\zeta\,E[P_t^{j}\mathbf{1}]\,dt = \mathrm{Re}\int_{[-\pi,\pi]}\zeta\,E[P_t^{j}\mathbf{1}]\,dt
= \mathrm{Re}\int_{C_\delta}\zeta\lambda_t^{j}E[\pi_t\mathbf{1}]\,dt + \mathrm{Re}\int_{C_\delta}\zeta\,E[N_t^{j}\mathbf{1}]\,dt + \mathrm{Re}\int_{\bar C_\delta}\zeta\,E[P_t^{j}\mathbf{1}]\,dt.

On C_\delta, one has \|N_t^{n}\| \le \theta_1^{n} with θ_1 < 1, whence

\Big|\sum_{j=1}^{n}\mathrm{Re}\int_{C_\delta}\zeta\,E[N_t^{j}\mathbf{1}]\,dt\Big| \le \sum_{j=1}^{n}\int_{C_\delta}\big|\mathrm{Re}\,\zeta\,E[N_t^{j}\mathbf{1}]\big|\,dt \le 2\sum_{j=1}^{n}\int_{C_\delta}\|N_t^{j}\|\,dt \le \frac{4\delta}{1-\theta_1}.

On \bar C_\delta, one has \|P_t^{j}\| \le \theta_2^{j}, so that

\Big|\sum_{j=1}^{n}\mathrm{Re}\int_{\bar C_\delta}\zeta\,E[P_t^{j}\mathbf{1}]\,dt\Big| \le \sum_{j=1}^{n}\int_{\bar C_\delta}\big|\mathrm{Re}\,\zeta\,E[P_t^{j}\mathbf{1}]\big|\,dt \le 2\sum_{j=1}^{n}\int_{\bar C_\delta}\|P_t^{j}\|\,dt \le \frac{4\pi}{1-\theta_2}.

The main part is the first term. We show that for some constant C > 0

\Big|\sum_{j=1}^{n}\mathrm{Re}\int_{C_\delta}\zeta\lambda_t^{j}\pi_t\mathbf{1}\,dt\Big| \le C|x|^{\frac{\alpha}{1-\alpha}}.

Let \pi_t\mathbf{1} = W_t, so \|W_t\| \le \|\pi_t\| \le K_1. Then

\Big|\sum_{j=1}^{n}\mathrm{Re}\int_{C_\delta}\zeta\lambda_t^{j}W_t\,dt\Big| \le \sum_{j=1}^{n}\Big|\int_{C_\delta}\mathrm{Re}\big(\lambda_t^{j}W_t\big)\big(\cos(xt)-1\big)\,dt\Big| + \sum_{j=1}^{n}\Big|\int_{C_\delta}\mathrm{Im}\big(\lambda_t^{j}W_t\big)\sin(xt)\,dt\Big|.

Since |\mathrm{Re}(\lambda_t^{j}W_t)| \le K_1|\lambda_t|^{j} \le K_1(1-Ct^{d})^{j}, we have

\sum_{j=1}^{n}\Big|\int_{C_\delta}\mathrm{Re}\big(\lambda_t^{j}W_t\big)\big(\cos(xt)-1\big)\,dt\Big| \le K_1C^{-1}\int_{[-\delta,\delta]}\frac{1}{|t|^{d}}\,|\cos(xt)-1|\,dt.

If |δx| < 1, then (1 − cos tx) ≤ |tx|2 ≤ |tx|d, so

\int_{[-\delta,\delta]}\frac{1}{|t|^{d}}|1-\cos tx|\,dt \le 2\int_{[0,1/|x|]}\frac{1}{t^{d}}|tx|^{d}\,dt = 2|x|^{d-1}.

If |δx| ≥ 1, then

\int_{[-\delta,\delta]}\frac{1}{|t|^{d}}|1-\cos tx|\,dt \le 2\int_{[0,1/|x|]}\frac{1}{t^{d}}|tx|^{d}\,dt + 2\int_{[1/|x|,\infty]}\frac{2}{t^{d}}\,dt \le C'|x|^{d-1}

for some constant C'. Next we consider the second part \sum_{j}\big|\int_{C_\delta}\mathrm{Im}(\lambda_t^{j}W_t)\sin(xt)\,dt\big|:

\sum_{j=1}^{n}\Big|\int_{C_\delta}\mathrm{Im}\big(\lambda_t^{j}W_t\big)\sin(xt)\,dt\Big| \le C^{-1}\int_{-\delta}^{\delta}\frac{1}{|t|^{d}}|\sin(xt)|\,dt \le 2C^{-1}|x|^{d-1}\int_0^{|x|\delta}\frac{1}{u^{d}}|\sin(u)|\,du.

If |x|δ ≤ 1, then

\int_0^{|x|\delta}\frac{1}{u^{d}}|\sin(u)|\,du \le \int_0^{1}u^{1-d}\,du < \infty.

If |x|δ ≥ 1, then

\int_0^{|x|\delta}\frac{1}{u^{d}}|\sin(u)|\,du \le \int_0^{1}u^{1-d}\,du + \int_1^{\delta|x|}\frac{1}{u^{d}}\,du < \infty.

So

\Big|\sum_{j}\mathrm{Im}\int_{C_\delta}\lambda_t^{j}W_t\sin(xt)\,dt\Big| \le C'|x|^{d-1}

for some constant C', and the claim is proved.

4.5 Bounds of local times of stationary processes

In 1949, Chung and Hunt proved the following theorems for simple symmetric random walks.

Theorem 4.5.1. If φ(x) ↓ 0 and φ(x)x^{1/2} ↑ ∞, then

P\big(\ell_{2n} < \sqrt{n/2}\,\varphi(n)\ \text{i.o.}\big) = 0 \text{ or } 1, \quad \text{according as} \quad \int_1^{\infty}\frac{\varphi(y)}{y}\,dy < \infty \text{ or } = \infty.

Theorem 4.5.2. If φ(x) ↑ ∞, then

P\big(\ell_{2n} > \sqrt{n/2}\,\varphi(n)\ \text{i.o.}\big) = 0 \text{ or } 1, \quad \text{according as} \quad \int_1^{\infty}\frac{\varphi(y)}{y}e^{-\frac{\varphi(y)^2}{2}}\,dy < \infty \text{ or } = \infty.

Example 4.5.3. By choosing specific φ(x), we get the following.

1. Almost surely, for any ω, ∃N0(ω), such that

\frac{\ell_n}{\sqrt{n}} < \big((2+\epsilon)\log\log n\big)^{\frac{1}{2}} \qquad \forall n > N_0(\omega).

2. Almost surely, for any ω, ∃n1(ω) < n2(ω) < ... < nk(ω)... ↑ ∞, such that

\frac{\ell_{n_i}}{\sqrt{n_i}} > \big((2-\epsilon)\log\log n_i\big)^{1/2}.

3. Almost surely, for any ω, ∃n1(ω) < n2(ω) < ... < nk(ω)... ↑ ∞, such that

\frac{\ell_{n_i}}{\sqrt{n_i}} < \frac{1}{\log n_i}.

4. Almost surely, for any ω, ∃N0(ω), such that

\frac{\ell_n}{\sqrt{n}} > \frac{1}{(\log n)^{1+\delta}}.

Kesten (1965) proved an iterated logarithm law for the local time of random walks.

Theorem 4.5.4 (Kesten, 1965, [35]). Let X_n be a sequence of integer-valued i.i.d. random variables with P(X_i = k) = p_k. Assume the mean of X_i is 0, σ^2 = \mathrm{Var}(X_n) < \infty, and \gcd\{k : p_k > 0\} = 1. Then

\limsup_{n\to\infty}\frac{\sup_x\ell(n,x)}{\sqrt{n\log\log n}} = \frac{\sqrt{2}}{\sigma} \quad \text{a.s.}

There exists a constant γ_1 ∈ (0, ∞) such that

\liminf_{n\to\infty}\frac{\sup_x\ell(n,x)\,\sqrt{\log\log n}}{\sqrt{n}} = \gamma_1 \quad \text{a.s.}

Similar results hold for the Brownian local times L(t, x).

Theorem 4.5.5 (Bound and Deviation). Suppose \{X_n\} is a stationary process satisfying the conditions in Theorem 3.4.8, with local time at level 0 denoted by \ell_n. Then for every β > 1 there exists a constant n_\beta such that for all n_\beta \le t \le L_2(n)^2, where

L_2(n) = \log\log n, one has

e^{-(1-\alpha)\beta t} \le P\big(\ell_n \ge t\,a_{n/t}\big) \le \frac{\Gamma(1+\alpha)}{\alpha^{\alpha}}\,e^{-\frac{1}{\beta}(1-\alpha)t}

and

\limsup_{n\to\infty}\frac{\ell_n}{a_{n/L_2(n)}\,L_2(n)} = K_\alpha,

where K_\alpha = \frac{\Gamma(1+\alpha)}{\alpha^{\alpha}(1-\alpha)^{1-\alpha}} and L_2(n) = \log\log n. This theorem was proved by Chung and Hunt in 1949 [21] for the simple random walk, and by Jain and Pruitt [32] and Marcus and Rosen [42] for more general random walks. In Theorem 4.5.5 it is extended to stationary processes with conditional local limit theorems.

Chapter 5

Conclusion and Open Questions

We formulate the conditional local limit theorems for discrete-time stochastic processes, which are the basic conditions in the study of the limiting behaviour of the local times of certain stationary processes. One example of a process having the conditional local limit theorem is the discrete-time fractional Brownian motion. For stationary processes with conditional local limit theorems, the limiting distribution of their local time is the Mittag-Leffler distribution when the state space is integer-valued; when the state space is real-valued, the limiting distribution is closely related to the Mittag-Leffler distribution. The method comes from the pointwise dual ergodic theorem, or Aaronson's Darling-Kac theorem; essentially, it proceeds via estimates of the moments of the occupation time and Karamata's Tauberian theorem. In the framework of dynamical systems, we also considered the almost sure central limit theorem (ASCLT) of the local times. When the process is not i.i.d., the characteristic operator is introduced, and under the assumption of a spectral gap the ASCLT holds. This work is a successful application of the theory of dynamical systems to problems arising in probability theory. Along this line, there are several directions that may deserve further exploration.

1. α-mixing condition. In Chapter 4, the almost sure central limit theorem for the local times holds under the assumption that the characteristic operator has a spectral gap. This is stronger than an α-mixing condition on the stationary process. A question of interest is whether the local

times still have the almost sure central limit theorem when the assumption is relaxed to be just α-mixing.

2. ASCLT of the local times of dfBm. We have studied the limiting distribution of the local time of the discrete-time fractional Brownian motion. Does it have ASCLT?

3. Limiting distribution of local times `([nt], x). A functional weak invariance principle for the local time of a process generated by a Gibbs-Markov map is proved in the normal case ([17]). For a process generated by a stationary

process {Xn} belonging to the domain of attraction of a stable law, we would like to know whether a functional weak invariance principle of its local time exists. Bibliography

[1] J. Aaronson, M. Denker, O. Sarig, and R. Zweim¨uller.Aperiodicity of cocycles and conditional local limit theorems. Stoch. Dyn., 4(1):31–62, 2004. [2] Jon Aaronson. The asymptotic distributional behaviour of transformations preserving infinite measures. J. Analyse Math., 39:203–234, 1981. [3] Jon Aaronson. An introduction to infinite ergodic theory, volume 50 of Math- ematical Surveys and Monographs. American Mathematical Society, Provi- dence, RI, 1997. [4] Jon Aaronson and Manfred Denker. A local limit theorem for stationary processes in the domain of attraction of a normal distribution. In Asymptotic methods in probability and statistics with applications (St. Petersburg, 1998), Stat. Ind. Technol., pages 215–223. Birkh¨auserBoston, Boston, MA, 2001. [5] Jon Aaronson and Manfred Denker. Local limit theorems for partial sums of stationary sequences generated by Gibbs-Markov maps. Stoch. Dyn., 1(2):193– 237, 2001. [6] A. K. Aleshkyavichene. A local limit theorem for sums of random variables related to a homogeneous Markov chain, for the case of a stable limit distri- bution. Litovsk. Mat. Sb., 1(1–2):5–13, 1961. [7] A. K. Aleshkyavichene. Approximations of distributions of local time. Litovsk. Mat. Sb., 24(4):10–28, 1984. [8] A. K. Aleshkyavichene. Letter to the editors: “Approximation of distributions of local time”. Litovsk. Mat. Sb., 25(4):198, 1985. [9] A. K. Aleshkyavichene. Asymptotic behavior of moments of local times of a random walk. Litovsk. Mat. Sb., 26(2):197–204, 1986. [10] I. Berkes and H. Dehling. Some limit theorems in log density. Ann. Probab., 21(3):1640–1670, 1993. 83

[11] István Berkes and Endre Csáki. A universal result in almost sure central limit theory. Stochastic Process. Appl., 94(1):105–134, 2001.

[12] Martin Bilodeau and David Brenner. Theory of . Springer Science & Business Media, 2008.

[13] N. H. Bingham, C. M. Goldie, and J. L. Teugels. Regular variation, volume 27 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1989.

[14] A. N. Borodin. Asymptotic behavior of local times of recurrent random walks with infinite variance. Teor. Veroyatnost. i Primenen., 29(2):312–326, 1984.

[15] A. N. Borodin. Brownian local time. Uspekhi Mat. Nauk, 44(no. 2(266)):7–48, 1989.

[16] Jean Bretagnolle and Didier Dacunha-Castelle. Théorèmes limites à distance finie pour les marches aléatoires. Ann. Inst. H. Poincaré Sect. B (N.S.), 4:25–73, 1968.

[17] Michael Bromberg. Weak invariance principle for the local times of Gibbs-Markov processes. arXiv preprint arXiv:1406.4174, 2014.

[18] Michael Bromberg. Invariance principle for local time by quasi-compactness. arXiv preprint arXiv:1511.01746, 2015.

[19] Michael Bromberg and Zemer Kosloff. Weak invariance principle for the local times of partial sums of Markov chains. J. Theoret. Probab., 27(2):493–517, 2014.

[20] Gunnar A. Brosamler. An almost everywhere central limit theorem. Math. Proc. Cambridge Philos. Soc., 104(3):561–574, 1988.

[21] K. L. Chung and G. A. Hunt. On the zeros of $\sum_{1}^{n} \pm 1$. Ann. of Math. (2), 50:385–400, 1949.

[22] I. P. Cornfeld, S. V. Fomin, and Ya. G. Sinaĭ. Ergodic theory, volume 245 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, New York, 1982. Translated from the Russian by A. B. Sosinskiĭ.

[23] Sándor Csörgő. Rényi-mixing of occupation times. In Asymptotic methods in probability and statistics (Ottawa, ON, 1997), pages 3–12. North-Holland, Amsterdam, 1998.

[24] D. A. Darling and M. Kac. On occupation times for Markoff processes. Trans. Amer. Math. Soc., 84:444–458, 1957.

[25] Wolfgang Doeblin. Sur deux problèmes de M. Kolmogoroff concernant les chaînes dénombrables. Bull. Soc. Math. France, 66:210–220, 1938.

[26] P. Erdős and S. J. Taylor. Some problems concerning the structure of random walk paths. Acta Math. Acad. Sci. Hungar., 11:137–162, 1960.

[27] Donald Geman and Joseph Horowitz. Occupation densities. Ann. Probab., 8(1):1–67, 1980.

[28] Khurelbaatar Gonchigdanzan. Almost sure central limit theorems for strongly mixing and associated random variables. Int. J. Math. Math. Sci., 29(3):125–131, 2002.

[29] Ulf Grenander and Gabor Szegő. Toeplitz forms and their applications. California Monographs in Mathematical Sciences. University of California Press, Berkeley-Los Angeles, 1958.

[30] Y. Guivarc’h and J. Hardy. Théorèmes limites pour une classe de chaînes de Markov et applications aux difféomorphismes d’Anosov. Ann. Inst. H. Poincaré Probab. Statist., 24(1):73–98, 1988.

[31] I. A. Ibragimov and Yu. V. Linnik. Independent and stationary sequences of random variables. Wolters-Noordhoff Publishing, Groningen, 1971. With a supplementary chapter by I. A. Ibragimov and V. V. Petrov, Translation from the Russian edited by J. F. C. Kingman.

[32] Naresh C. Jain and William E. Pruitt. Asymptotic behavior of the local time of a recurrent random walk. Ann. Probab., 12(1):64–85, 1984.

[33] Yuji Kasahara and Yuki Matsumoto. On Kallianpur-Robbins law for fractional Brownian motion. J. Math. Kyoto Univ., 36(4):815–824, 1996.

[34] H. Kesten and F. Spitzer. A limit theorem related to a new class of self-similar processes. Z. Wahrsch. Verw. Gebiete, 50(1):5–25, 1979.

[35] Harry Kesten. An iterated logarithm law for local time. Duke Math. J., 32:447–456, 1965.

[36] Harry Kesten. A Tauberian theorem for random walk. Israel J. Math., 6:279–294, 1968.

[37] A. N. Kolmogorov. A local limit theorem for classical Markov chains. Izvestiya Akad. Nauk SSSR. Ser. Mat., 13:281–300, 1949.

[38] Norio Kôno. Kallianpur-Robbins law for fractional Brownian motion. In Probability theory and mathematical statistics (Tokyo, 1995), pages 229–236. World Sci. Publ., River Edge, NJ, 1996.

[39] Michael T. Lacey and Walter Philipp. A note on the almost sure central limit theorem. Statist. Probab. Lett., 9(3):201–205, 1990.

[40] Paul Lévy. Processus Stochastiques et Mouvement Brownien. Suivi d’une note de M. Loève. Gauthier-Villars, Paris, 1948.

[41] Benoit B. Mandelbrot and John W. Van Ness. Fractional Brownian motions, fractional noises and applications. SIAM Rev., 10:422–437, 1968.

[42] Michael B. Marcus and Jay Rosen. Laws of the iterated logarithm for the local times of symmetric Lévy processes and recurrent random walks. Ann. Probab., 22(2):626–658, 1994.

[43] A. A. Markov. Extension of the limit theory of probability to sums of quantities connected in a chain. Trans. Phys.-Math. Division Acad. Sci., Series VIII 22, 1908.

[44] Takehiko Morita. A generalized local limit theorem for Lasota-Yorke transformations. Osaka J. Math., 26(3):579–595, 1989.

[45] Takehiko Morita. Correction to: “A generalized local limit theorem for Lasota-Yorke transformations” [Osaka J. Math. 26 (1989), no. 3, 579–595; MR1021432 (91a:58176)]. Osaka J. Math., 30(3):611–612, 1993.

[46] Peter Mörters and Yuval Peres. Brownian motion, volume 30 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2010. With an appendix by Oded Schramm and Wendelin Werner.

[47] S. V. Nagaev. Some limit theorems for stationary Markov chains. Teor. Veroyatnost. i Primenen., 2:389–416, 1957.

[48] Magda Peligrad and Qi Man Shao. A note on the almost sure central limit theorem for weakly dependent random variables. Statist. Probab. Lett., 22(2):131–136, 1995.

[49] Edwin Perkins. Weak invariance principles for local time. Z. Wahrsch. Verw. Gebiete, 60(4):437–451, 1982.

[50] Walter Philipp. Limit theorems for sums of partial quotients of continued fractions. Monatshefte für Mathematik, 105(3):195–206, 1988.

[51] Hong Qian. Fractional Brownian motion and fractional Gaussian noise. In Processes with Long-Range Correlations, pages 22–33. Springer, 2003.

[52] A. Rényi. Probability theory. North-Holland Publishing Co., Amsterdam-London; American Elsevier Publishing Co., Inc., New York, 1970. Translated by László Vekerdi, North-Holland Series in Applied Mathematics and Mechanics, Vol. 10.

[53] Pál Révész. Local time and invariance. In Analytical methods in probability theory (Oberwolfach, 1980), volume 861 of Lecture Notes in Math., pages 128–145. Springer, Berlin-New York, 1981.

[54] Pál Révész. Random walk in random and non-random environments. World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, third edition, 2013.

[55] Daniel Revuz and Marc Yor. Continuous martingales and Brownian motion, volume 293. Springer Science & Business Media, 2013.

[56] J. Rousseau-Egele. Un théorème de la limite locale pour une classe de transformations dilatantes et monotones par morceaux. Ann. Probab., 11(3):772–788, 1983.

[57] Peter Schatte. On strong versions of the central limit theorem. Math. Nachr., 137:249–256, 1988.

[58] Marc Séva. On the local limit theorem for non-uniformly ergodic Markov chains. J. Appl. Probab., 32(1):52–62, 1995.

[59] Charles Stone. Ratio limit theorems for random walks on groups. Trans. Amer. Math. Soc., 125:86–100, 1966.

[60] Murad S. Taqqu. Weak convergence to fractional Brownian motion and to the Rosenblatt process. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 31:287–302, 1974/75.

[61] H. F. Trotter. A property of Brownian motion paths. Illinois J. Math., 2:425–433, 1958.

[62] Ward Whitt. Stochastic-process limits. Springer Series in Operations Research. Springer-Verlag, New York, 2002. An introduction to stochastic-process limits and their application to queues.

Vita

Xiaofei Zheng

Xiaofei Zheng was born in 1987 in Hebei Province, the People’s Republic of China, and was raised there. She attended the No. 1 High School of Xuanhua in the city of Zhangjiakou, where she studied in the science track. In 2008, she was admitted to the Department of Mathematics at Nankai University, and she obtained her Bachelor’s degree in Applied Mathematics from Nankai University in 2012. She was then accepted into the Ph.D. program in Mathematics at The Pennsylvania State University, where she was fortunate to meet Prof. Manfred Denker, under whose supervision she worked on the local times of stochastic processes. She is expected to graduate in August 2017.