Lecture 5: Gaussian Processes & Stationary Processes

Miranda Holmes-Cerfon Applied Stochastic Analysis, Spring 2019 Lecture 5: Gaussian processes & Stationary processes Readings Recommended: • Pavliotis (2014), sections 1.1, 1.2 • Grimmett and Stirzaker (2001), 8.2, 8.6 • Grimmett and Stirzaker (2001) 9.1, 9.3, 9.5, 9.6 (more advanced than what we will cover in class) Optional: • Grimmett and Stirzaker (2001) 4.9 (review of multivariate Gaussian random variables) • Grimmett and Stirzaker (2001) 9.4 • Chorin and Hald (2009) Chapter 6; a gentle introduction to the spectral decomposition for continuous processes, that gives insight into the link to the spectral decomposition for discrete processes. • Yaglom (1962), Ch. 1, 2; a nice short book with many details about stationary random functions; one of the original manuscripts on the topic. • Lindgren (2013) is a in-depth but accessible book; with more details and a more modern style than Yaglom (1962). We want to be able to describe more stochastic processes, which are not necessarily Markov process. In this lecture we will look at two classes of stochastic processes that are tractable to use as models and to simulat: Gaussian processes, and stationary processes. 5.1 Setup Here are some ideas we will need for what follows. 5.1.1 Finite dimensional distributions Definition. The finite-dimensional distributions (fdds) of a stochastic process Xt is the set of measures Pt1;t2;:::;tk given by Pt1;t2;:::;tk (F1 × F2 × ··· × Fk) = P(Xt1 2 F1;Xt2 2 F2;:::Xtk 2 Fk); k where (t1;:::;tk) is any point in T , k is any integer, and Fi are measurable events in R. That is, we take a finite number of points (t1;t2;:::;tk) and consider the distribution of (Xt1 ;Xt2 ;:::Xtk ). Examples Here are some examples of processes and their fdds. + 1. A sequence of i.i.d. random variables. The parameter set is T = Z , and the fdds are Pt1;:::;tk (F1 × k ··· × Fk) = ∏i=1 P(Fi). Miranda Holmes-Cerfon Applied Stochastic Analysis, Spring 2019 2. Markov chains. Let Xt be a continuous-time, homogeneous Markov chain with generator Q and initial probability distribution m0, and let 0 = t0 < t1 < ··· < tk. Then the fdd is given by P(Xt0 2 F0;Xt1 2 F1;:::;Xtk 2 Fk) = ∑ m0(i0) ∑ Pi0i1 (t1 −t0)··· ∑ Pik−1ik (tk −tk−1): i02F0 i12F1 ik2Fk 3. Poisson process. For any 0 ≤ t1 ≤ ::: ≤ tk, let h1;h2;:::;hk be independent Poisson random variables with rates lt1;l(t2 −t1);:::;l(tk −tk−1). Let Pt1;:::;tk be the joint distribution of h1;h1 + h2;:::;h1 + h2 + ··· + hk). Given a set of finite-dimensional distributions, the Kolmogorov Extension Theorem says that, if the fdds satisfy some basic consistency criteria (they are invariant under permutations, and their marginal distributions are consistent), then there exists a stochastic process with the corresponding fdds. See Koralov and Sinai (2010), p. 174, or Grimmett and Stirzaker (2001), section 8.7 p.372, or Terry’s Tao’s notes: https:// terrytao.wordpress.com/2010/10/30/245a-notes-6-outer-measures-pre-measures-and-product-measures/ #more-4371 However, knowing only the fdds of a stochastic process does not let us ask many questions that we may be interested in, such as whether the process is continuous, differentiable, bounded; what is the first passage time to a given set, what its maximum is at time t, etc. You need the values of the process at an uncountable number of points, to decide such questions. Here is an example to illustrate some of the difficulties. Example. Let U ∼ Uniform([0;1]) be a random variable. Define two processes X = (Xt )0≤t≤1 and Y = (Yt )0≤t≤1 by 1 if U = t; X = 0 for all t; Y = t t 0 otherwise Then X and Y have the same fdds, since P(U = t) = 0 for all t. But X, Y are different processes. Some differ- ences include: X is continuous, Y is not; supt X = 0, supt Y = 1; P(Xt =0 for all t) = 1, P(Yt =0 for all t) = 0. This example may seem trivial, but one can often construct less trivial examples of different processes with the same fdds. Such processes are called versions of each other. Any theory which studies properties of sample paths must take care to specify which version of a process is being studied. Typically one considers processes which are right-continuous and have limits from the left (so-called “cadl` ag”` processes, from the French “continue a` droite, limites a` gauche.”) A theorem (see Grimmett and Stirzaker (2001), Theorem 8.7.6, p.373, and also Breiman (1992), p.300) gives conditions under which a stochastic process has a cadl` ag` version. 5.1.2 Covariance functions A highly useful way to characterize properties of a stochastic process is its covariance function, which essentially characterizes the variance of the two-point fdds. T T T Recall that if we have a random vector X = (X1;:::;Xn) , its covariance matrix is S = EXX −(EX)(EX) is the matrix whose elements are the covariance of Xi, Xj, i.e. Si j = EXiEXj −(EXi)(EXj). This generalizes to random functions: 2 Miranda Holmes-Cerfon Applied Stochastic Analysis, Spring 2019 Definition. The covariance function of a real-valued stochastic process (Xt )t2T is B(s;t) ≡ E[(Xs − EXs)(Xt − EXt )] = EXsXt − m(s)m(t); where m(t) = EXt is the mean of the process. 2 2 Notice that C(t;t) = EXt − (EXt ) gives the variance of the process at a single point t. T n One can easily show that a covariance matrix is positive semidefinite, i.e. x Sx ≥ 0 for all x 2 R . (Exercise: show this!) A similar statement is true for covariance functions. k Definition. A function B(s;t) is positive (semi)definite if if the matrix (B(ti;t j))i; j=1 is positive (semi)definite k for all finite slices ftigi=1. Lemma. The covariance function B(s;t) of a stochastic process is positive semidefinite. Proof. This will be an exercise on the homework. Example. Let’s calculate the covariance function of a Poisson process (Nt )t≥0 with parameter l. Assume WLOG that t ≥ s, and calculate: 2 2 C(s;t) = ENt Ns − (ENt )(ENs) = E(Nt − Ns)Ns + ENs − l ts = l 2(t − s)s + ls + l 2s2 − l 2ts = ls: We used the fact that Nt − Ns, Ns are independent. You can repeat the calculation with s > t, to find that C(s;t) = l min(s;t) : 5.2 Gaussian processes A very important class of processes are Gaussian processes. These arise in a number of applications, partly because they are tractable models that are possible to simulate and such that much is known analytically about their fdds. Also, the Central Limit Theorem suggests that they should arise from a superposition of random, uncorrelated effects. T Definition. A random vector X = (X1;:::;Xn) is a multivariate Gaussian if its probability density function has the form 1 − 1 (x−m)T S−1(x−m) f (x1;:::;xn) = e 2 ; (1) p(2p)njSj n×n where S 2 R is a positive definite symmetric matrix, equal to the covariance matrix of the random vector, n and m 2 R is the mean. We sometimes write X ∼ N(m;S). A few facts about multivariate Gaussians that may be useful are listed in the Appendix, section 5.7. Definition. X = (Xt )t2T is a Gaussian process iff all its fdds are Gaussian, i.e. if (Xt1 ;:::;Xtk ) is a multi- k variate Gaussian for any (t1;:::;tk) 2 T . 3 Miranda Holmes-Cerfon Applied Stochastic Analysis, Spring 2019 This means that if we are given functions m(t) and B(s;t), and B is positive definite, then we can construct a Gaussian process whose mean is EXt = m(t) and whose covariance is B(s;t), i.e. all the finite-dimensional k distributions have covariance matrix (B(ti;t j))i; j=1. This gives us a lot of flexibility in constructing Gaussian processes. It also means that, given only the mean and the covariance of the process, we know all the finite dimensional distributions. This is a powerful statement, since means and covariances are readily measurable. It is only true for Gaussian processes. Example. A Brownian motion or Wiener process is a continuous Gaussian process W = (Wt )t≥0 with mean m(t) = 0 and covariance B(s;t) = min(s;t) for s;t ≥ 0, and such that W0 = 0. (We’ll see other definitions later in the course.) Notice that a Brownian motion W = (Wt )t≥0 has the same covariance as a Poisson process with l = 1. If we define a process Y = (Yt )t≥0 by Yt = Nt −t, where Nt is a Poisson process with rate l = 1, then Y;W both have mean 0 and covariance function min(s;t). However, these are clearly not the same process; clearly the Poisson process does not have Gaussian fdds, and it is also not continuous. Exercise 5.1. Show that the function B(s;t) = min(s;t) for s;t ≥ 0 is positive definite. Exercise 5.2. Show, from the definition above, that the Wiener process has stationary independent incre- ments, i.e. (a) the distribution of Wt −Ws depends only on t − s; (b) the variables Wt j −Ws j , 1 ≤ j ≤ n are independent whenever the intervals (s j;t j] are disjoint. Simulating Gaussian processes There is a straightforward algorithm for simulating realizations of a Gaussian process.

Lecture 5: Gaussian Processes & Stationary Processes

Stationary Processes and Their Statistical Properties

Scalable Nonparametric Bayesian Inference on Point Processes with Gaussian Processes

Methods of Monte Carlo Simulation II

Stochastic Process - Introduction

Deep Neural Networks As Gaussian Processes

Autocovariance Function Estimation Via Penalized Regression

Examples of Stationary Time Series

Stationary Processes

Gaussian Process Dynamical Models for Human Motion

Modelling Multi-Object Activity by Gaussian Processes 1

Financial Time Series Volatility Analysis Using Gaussian Process State-Space Models

Gaussian-Random-Process.Pdf