Markov Chains on a General State Space

Total Page:16

File Type:pdf, Size:1020Kb

Markov Chains on a General State Space Markov chains on measurable spaces Lecture notes Dimitri Petritis Master STS mention mathématiques Rennes UFR Mathématiques Preliminary draft of April 2012 ©2005–2012 Petritis, Preliminary draft, last updated April 2012 Contents 1 Introduction1 1.1 Motivating example.............................1 1.2 Observations and questions........................3 2 Kernels5 2.1 Notation...................................5 2.2 Transition kernels..............................6 2.3 Examples-exercises.............................7 2.3.1 Integral kernels...........................7 2.3.2 Convolution kernels........................9 2.3.3 Point transformation kernels................... 11 2.4 Markovian kernels............................. 11 2.5 Further exercises.............................. 12 3 Trajectory spaces 15 3.1 Motivation.................................. 15 3.2 Construction of the trajectory space................... 17 3.2.1 Notation............................... 17 3.3 The Ionescu Tulce˘atheorem....................... 18 3.4 Weak Markov property........................... 22 3.5 Strong Markov property.......................... 24 3.6 Examples-exercises............................. 26 4 Markov chains on finite sets 31 i ii 4.1 Basic construction............................. 31 4.2 Some standard results from linear algebra............... 32 4.3 Positive matrices.............................. 36 4.4 Complements on spectral properties.................. 41 4.4.1 Spectral constraints stemming from algebraic properties of the stochastic matrix........................ 41 4.4.2 Spectral constraints stemming from convexity properties of the stochastic matrix........................ 43 5 Potential theory 49 5.1 Notation and motivation.......................... 49 5.2 Martingale solution to Dirichlet’s problem............... 51 5.3 Asymptotic and invariant σ-algebras.................. 58 5.4 Triviality of the asymptotic σ-algebra.................. 59 6 Markov chains on denumerably infinite sets 65 6.1 Classification of states........................... 65 6.2 Recurrence and transience........................ 69 6.3 Invariant measures............................. 71 6.4 Convergence to equilibrium........................ 75 6.5 Ergodic theorem.............................. 77 6.6 Semi-martingale techniques for denumerable Markov chains.... 79 6.6.1 Foster’s criteria........................... 79 6.6.2 Moments of passage times.................... 84 7 Boundary theory for transient chains on denumerably infinite sets 85 7.1 Motivation.................................. 85 iii 7.2 Martin boundary.............................. 86 7.3 Convexity properties of sets of superharmonic functions....... 91 7.4 Minimal and Poisson boundaries..................... 93 7.5 Limit behaviour of the chain in terms of the boundary........ 95 7.6 Markov chains on discrete groups, amenability and triviality of the boundary................................... 100 8 Continuous time Markov chains on denumerable spaces 103 8.1 Embedded chain.............................. 103 8.2 Infinitesimal generator........................... 103 8.3 Explosion................................... 103 8.4 Semi-martingale techniques for explosion times........... 103 9 Irreducibility on general spaces 105 9.1 Á-irreducibility............................... 105 9.2 c-sets..................................... 109 9.3 Cycle decomposition............................ 114 10 Asymptotic behaviour for Á-recurrent chains 121 10.1 Tail σ-algebras................................ 121 10.2 Structure of the asymptotic σ-algebra.................. 123 11 Uniform Á-recurrence 127 11.1 Exponential convergence to equilibrium................ 127 11.2 Embedded Markov chains......................... 130 11.3 Embedding of a general Á-recurrent chain............... 131 iv 12 Invariant measures for Á-recurrent chains 133 12.1 Convergence to equilibrium........................ 133 12.2 Relationship between invariant and irreducibility measures..... 134 12.3 Ergodic properties............................. 140 13 Quantum Markov chains 141 A Complements in measure theory 143 A.1 Monotone class theorems......................... 143 A.1.1 Set systems............................. 143 A.1.2 Set functions and their extensions................ 145 A.2 Uniformity induced by Egorov’s theorem and its extension..... 146 B Some classical martingale results 147 B.1 Uniformly integrable martingales.................... 147 B.2 Martingale proof of Radon-Nikodým theorem............. 148 B.3 Reverse martingales............................ 149 C Semi-martingale techniques 151 C.1 Integrability of the passage time..................... 151 References 152 Index 156 v 1 Introduction 1.1 Motivating example The simplest non-trivial example of a Markov chain is the following model. Let X {0,1} (heads 1, tails 0) and consider two coins, one honest and one biased Æ giving tails with probability 2/3. Let ½ outcome of biased coin if Xn 0 Xn 1 Æ Å Æ outcome of honest coin if X 1. n Æ For y X and ¼ (y) P(X y), we have 2 n Æ n Æ ¼n 1(y) P(Xn 1 y honest)P(honest) P(Xn 1 y biased)P(biased), Å Æ Å Æ j Å Å Æ j yielding µ ¶ P00 P01 (¼n 1(0),¼n 1(1)) (¼n(0),¼n(1)) , Å Å Æ P10 P11 with P 2/3, P 1/3, and P P 1/2. 00 Æ 01 Æ 10 Æ 11 Æ Iterating, we get ¼ ¼ P n n Æ 0 where ¼ must be determined ad hoc (for instance by ¼ (x) ± for x X). 0 0 Æ 0x 2 In this formulation, the probabilistic problem is translated into a purely al- gebraic problem of determining the asymptotic behaviour of ¼n in terms of the n asymptotic behaviour of P , where P (Px y )x,y X is a stochastic matrix (i.e. P Æ 2 Px y 0 for all x and y and y X Px y 1). Although the solution of this prob- ¸ 2 Æ lem is an elementary exercise in linear algebra, it is instructive to give some de- µ1 a a ¶ tails. Suppose we are in the non-degenerate case where P ¡ with Æ b 1 b ¡ 0 a,b 1. Compute the spectrum of P and left and right eigenvectors: Ç Ç ut P ¸ut Æ Pv ¸v. Æ 1 2 1 0 Figure 1.1: Dotted lines represent all possible and imaginable trajectories of heads and tails trials. Full line represents a particular realisation of such a se- quence of trials. Note that only integer points have a significance, they are joined by line segments only for visualisation purposes. Since the matrix P is not normal in general, the left (resp. right) eigenvectors cor- responding to different eigenvalues are not orthogonal. However we can always normalise ut v ± . Under this normalisation, and denoting by E v ut , ¸ ¸0 Æ ¸¸0 ¸ Æ ¸ ­ ¸ we get ¸ u¸ v¸ E¸ µ ¶ µ ¶ µ ¶ b 1 1 b a ¸1 1 a b Æ a 1 Å b a µ 1 ¶ µ a ¶ µ a a¶ ¸ 1 (a b) 1 ¡ 2 Æ ¡ Å 1 b a b b b ¡ ¡ Å ¡ We observe that for ¸ and ¸0 being eigenvalues, E E ± E , i.e. E are spec- ¸ ¸0 Æ ¸¸0 ¸ ¸ tral projectors. The matrix P admits the spectral decomposition X P ¸E¸, Æ ¸ specP 2 where specP {1,1 (a b)} {¸ C : ¸I P is not invertible}. Now E being Æ ¡ Å Æ 2 ¡ ¸ orthogonal projectors, we compute immediately, for all positive integers n, n X n n n P ¸ E¸ ¸1 E¸1 ¸2 E¸2 Æ ¸ specP Æ Å 2 n and since ¸1 1 while ¸2 1, we get that limn P E¸ . Æ j j Ç !1 Æ 1 Applying this formula for the numerical values of the example, we get that limn ¼n (0.6,0.4). !1 Æ 1.2. Observations and questions 3 1.2 Observations and questions – In the previous example, a prominent role is played by the stochastic ma- trix P. In the general case, P is replaced by an operator, known as stochas- tic kernel that will be introduced and studied in chapter2. – An important question, that has been eluded in this elementary introduc- tion, is whether exists a probability space (­,F ,P) sufficiently large as to carry the whole sequence of random variables (Xn)n N. This question will 2 be positively answered by the Ionescu Tulce˘atheorem and the construc- tion of the trajectory space of the process in chapter3. – In the finite case, motivated in this chapter and fully studied in chapter4, the asymptotic behaviour of the sequence (Xn) is fully determined by the spectrum of P, a purely algebraic object. In the general case (countable or uncountable), the spectrum is an algebraic and topological object defined as specP {¸ C : ¸I P is not invertible}. Æ 2 ¡ This approach exploits the topological structure of measurable functions on X; a thorough study of the general case is performed, following the line of spectral theory of quasi-compact operators, in [14]. Although we don’t follow this approach here, we refer the reader to the previously mentioned monograph for further study. – Without using results from the theory of quasi-compact operators, we can 1 P n study the formal power series (I P)¡ 1 P (compare with the above ¡ Æ n 0 definition of the spectrum). This series canÆ be given both a probabilis- tic and an analytic significance realising a profound connection between harmonic analysis, martingale theory, and Markov chains. It is this latter approach that will be developed in chapter5. – Quantum Markov chains are objects defined on a quantum probability space. Now, quantum probability can be thought as a non-commutative extension of classical probability where real random variables are replaced by self-adjoint operators acting on a Hilbert space. Thus, once again the spectrum plays an important rôle in this context. Exercise 1.2.1 (Part of the paper of December 2006) Let (Xn) be the Markov 0 Pn 1 chain defined in the example of section 1.1, with X {0,1}. Let ´ (A) ¡ 1 (X ) Æ n Æ k 0 A k for A X. Æ ⊆ P 0 1. Compute Eexp(i x X tx ´n({x})) as a product of matrices and vectors, for 2 t R. x 2 2. What is the probabilistic significance of the previous expectation? 0 3. Establish a central limit theorem for ´n({0}). Pn 1 0 4. Let f : X R. Express S (f ) ¡ f (X ) with the help of ´ (A), with ! n Æ k 0 k n A X. Æ ⊆ 4 2 Kernels This chapter deals mainly with the general theory of kernels.
Recommended publications
  • Conditioning and Markov Properties
    Conditioning and Markov properties Anders Rønn-Nielsen Ernst Hansen Department of Mathematical Sciences University of Copenhagen Department of Mathematical Sciences University of Copenhagen Universitetsparken 5 DK-2100 Copenhagen Copyright 2014 Anders Rønn-Nielsen & Ernst Hansen ISBN 978-87-7078-980-6 Contents Preface v 1 Conditional distributions 1 1.1 Markov kernels . 1 1.2 Integration of Markov kernels . 3 1.3 Properties for the integration measure . 6 1.4 Conditional distributions . 10 1.5 Existence of conditional distributions . 16 1.6 Exercises . 23 2 Conditional distributions: Transformations and moments 27 2.1 Transformations of conditional distributions . 27 2.2 Conditional moments . 35 2.3 Exercises . 41 3 Conditional independence 51 3.1 Conditional probabilities given a σ{algebra . 52 3.2 Conditionally independent events . 53 3.3 Conditionally independent σ-algebras . 55 3.4 Shifting information around . 59 3.5 Conditionally independent random variables . 61 3.6 Exercises . 68 4 Markov chains 71 4.1 The fundamental Markov property . 71 4.2 The strong Markov property . 84 4.3 Homogeneity . 90 4.4 An integration formula for a homogeneous Markov chain . 99 4.5 The Chapmann-Kolmogorov equations . 100 iv CONTENTS 4.6 Stationary distributions . 103 4.7 Exercises . 104 5 Ergodic theory for Markov chains on general state spaces 111 5.1 Convergence of transition probabilities . 113 5.2 Transition probabilities with densities . 115 5.3 Asymptotic stability . 117 5.4 Minorisation . 122 5.5 The drift criterion . 127 5.6 Exercises . 131 6 An introduction to Bayesian networks 141 6.1 Introduction . 141 6.2 Directed graphs .
    [Show full text]
  • A Predictive Model Using the Markov Property
    A Predictive Model using the Markov Property Robert A. Murphy, Ph.D. e-mail: [email protected] Abstract: Given a data set of numerical values which are sampled from some unknown probability distribution, we will show how to check if the data set exhibits the Markov property and we will show how to use the Markov property to predict future values from the same distribution, with probability 1. Keywords and phrases: markov property. 1. The Problem 1.1. Problem Statement Given a data set consisting of numerical values which are sampled from some unknown probability distribution, we want to show how to easily check if the data set exhibits the Markov property, which is stated as a sequence of dependent observations from a distribution such that each suc- cessive observation only depends upon the most recent previous one. In doing so, we will present a method for predicting bounds on future values from the same distribution, with probability 1. 1.2. Markov Property Let I R be any subset of the real numbers and let T I consist of times at which a numerical ⊆ ⊆ distribution of data is randomly sampled. Denote the random samples by a sequence of random variables Xt t∈T taking values in R. Fix t T and define T = t T : t>t to be the subset { } 0 ∈ 0 { ∈ 0} of times in T that are greater than t . Let t T . 0 1 ∈ 0 Definition 1 The sequence Xt t∈T is said to exhibit the Markov Property, if there exists a { } measureable function Yt1 such that Xt1 = Yt1 (Xt0 ) (1) for all sequential times t0,t1 T such that t1 T0.
    [Show full text]
  • Local Conditioning in Dawson–Watanabe Superprocesses
    The Annals of Probability 2013, Vol. 41, No. 1, 385–443 DOI: 10.1214/11-AOP702 c Institute of Mathematical Statistics, 2013 LOCAL CONDITIONING IN DAWSON–WATANABE SUPERPROCESSES By Olav Kallenberg Auburn University Consider a locally finite Dawson–Watanabe superprocess ξ =(ξt) in Rd with d ≥ 2. Our main results include some recursive formulas for the moment measures of ξ, with connections to the uniform Brown- ian tree, a Brownian snake representation of Palm measures, continu- ity properties of conditional moment densities, leading by duality to strongly continuous versions of the multivariate Palm distributions, and a local approximation of ξt by a stationary clusterη ˜ with nice continuity and scaling properties. This all leads up to an asymptotic description of the conditional distribution of ξt for a fixed t> 0, given d that ξt charges the ε-neighborhoods of some points x1,...,xn ∈ R . In the limit as ε → 0, the restrictions to those sets are conditionally in- dependent and given by the pseudo-random measures ξ˜ orη ˜, whereas the contribution to the exterior is given by the Palm distribution of ξt at x1,...,xn. Our proofs are based on the Cox cluster representa- tions of the historical process and involve some delicate estimates of moment densities. 1. Introduction. This paper may be regarded as a continuation of [19], where we considered some local properties of a Dawson–Watanabe super- process (henceforth referred to as a DW-process) at a fixed time t> 0. Recall that a DW-process ξ = (ξt) is a vaguely continuous, measure-valued diffu- d ξtf µvt sion process in R with Laplace functionals Eµe− = e− for suitable functions f 0, where v = (vt) is the unique solution to the evolution equa- 1 ≥ 2 tion v˙ = 2 ∆v v with initial condition v0 = f.
    [Show full text]
  • Strong Memoryless Times and Rare Events in Markov Renewal Point
    The Annals of Probability 2004, Vol. 32, No. 3B, 2446–2462 DOI: 10.1214/009117904000000054 c Institute of Mathematical Statistics, 2004 STRONG MEMORYLESS TIMES AND RARE EVENTS IN MARKOV RENEWAL POINT PROCESSES By Torkel Erhardsson Royal Institute of Technology Let W be the number of points in (0,t] of a stationary finite- state Markov renewal point process. We derive a bound for the total variation distance between the distribution of W and a compound Poisson distribution. For any nonnegative random variable ζ, we con- struct a “strong memoryless time” ζˆ such that ζ − t is exponentially distributed conditional on {ζˆ ≤ t,ζ>t}, for each t. This is used to embed the Markov renewal point process into another such process whose state space contains a frequently observed state which repre- sents loss of memory in the original process. We then write W as the accumulated reward of an embedded renewal reward process, and use a compound Poisson approximation error bound for this quantity by Erhardsson. For a renewal process, the bound depends in a simple way on the first two moments of the interrenewal time distribution, and on two constants obtained from the Radon–Nikodym derivative of the interrenewal time distribution with respect to an exponential distribution. For a Poisson process, the bound is 0. 1. Introduction. In this paper, we are concerned with rare events in stationary finite-state Markov renewal point processes (MRPPs). An MRPP is a marked point process on R or Z (continuous or discrete time). Each point of an MRPP has an associated mark, or state.
    [Show full text]
  • Simulation of Markov Chains
    Copyright c 2007 by Karl Sigman 1 Simulating Markov chains Many stochastic processes used for the modeling of financial assets and other systems in engi- neering are Markovian, and this makes it relatively easy to simulate from them. Here we present a brief introduction to the simulation of Markov chains. Our emphasis is on discrete-state chains both in discrete and continuous time, but some examples with a general state space will be discussed too. 1.1 Definition of a Markov chain We shall assume that the state space S of our Markov chain is S = ZZ= f:::; −2; −1; 0; 1; 2;:::g, the integers, or a proper subset of the integers. Typical examples are S = IN = f0; 1; 2 :::g, the non-negative integers, or S = f0; 1; 2 : : : ; ag, or S = {−b; : : : ; 0; 1; 2 : : : ; ag for some integers a; b > 0, in which case the state space is finite. Definition 1.1 A stochastic process fXn : n ≥ 0g is called a Markov chain if for all times n ≥ 0 and all states i0; : : : ; i; j 2 S, P (Xn+1 = jjXn = i; Xn−1 = in−1;:::;X0 = i0) = P (Xn+1 = jjXn = i) (1) = Pij: Pij denotes the probability that the chain, whenever in state i, moves next (one unit of time later) into state j, and is referred to as a one-step transition probability. The square matrix P = (Pij); i; j 2 S; is called the one-step transition matrix, and since when leaving state i the chain must move to one of the states j 2 S, each row sums to one (e.g., forms a probability distribution): For each i X Pij = 1: j2S We are assuming that the transition probabilities do not depend on the time n, and so, in particular, using n = 0 in (1) yields Pij = P (X1 = jjX0 = i): (Formally we are considering only time homogenous MC's meaning that their transition prob- abilities are time-homogenous (time stationary).) The defining property (1) can be described in words as the future is independent of the past given the present state.
    [Show full text]
  • Brownian Motion and the Strong Markov Property
    BROWNIAN MOTION AND THE STRONG MARKOV PROPERTY JAMES LEINER Abstract. This paper is an introduction to Brownian motion. After a brief introduction to measure-theoretic probability, we begin by constructing Brow- d nian motion over the dyadic rationals and extending this construction to R . After establishing some relevant features, we introduce the strong Markov property and its applications. We then use these tools to demonstrate the existence of various Markov processes embedded within Brownian motion. Contents 1. Introduction 1 2. Construction 5 3. Non-Differentiability 9 4. Invariance Properties 10 5. Markov Properties 11 6. Applications 13 7. Derivative Markov Processes 15 Acknowledgments 16 References 16 1. Introduction With a varied array of uses across pure and applied mathematics, Brownian motion is one of the most widely studied stochastic processes. This paper seeks to provide a rigorous introduction to the topic, using [3] and [4] as our primary references. In order to properly ground our discussion, however, we must first become com- fortable with the basic framework of measure-theoretic probability. In this section, we provide a terse review of the subject sufficient to prepare the reader for some of the more advanced proofs within this paper. Most of the important results in this section are presented without proof and individuals wishing for a more detailed exposition of the topic can consult [1]. If so inclined, a large portion of the first half of the paper can be understood without measure theory. In this case, consult [2] for a more basic introduction to probability not requiring measure theory. Definition 1.1.
    [Show full text]
  • Interacting Particle Systems MA4H3
    Interacting particle systems MA4H3 Stefan Grosskinsky Warwick, 2009 These notes and other information about the course are available on http://www.warwick.ac.uk/˜masgav/teaching/ma4h3.html Contents Introduction 2 1 Basic theory3 1.1 Continuous time Markov chains and graphical representations..........3 1.2 Two basic IPS....................................6 1.3 Semigroups and generators.............................9 1.4 Stationary measures and reversibility........................ 13 2 The asymmetric simple exclusion process 18 2.1 Stationary measures and conserved quantities................... 18 2.2 Currents and conservation laws........................... 23 2.3 Hydrodynamics and the dynamic phase transition................. 26 2.4 Open boundaries and matrix product ansatz.................... 30 3 Zero-range processes 34 3.1 From ASEP to ZRPs................................ 34 3.2 Stationary measures................................. 36 3.3 Equivalence of ensembles and relative entropy................... 39 3.4 Phase separation and condensation......................... 43 4 The contact process 46 4.1 Mean field rate equations.............................. 46 4.2 Stochastic monotonicity and coupling....................... 48 4.3 Invariant measures and critical values....................... 51 d 4.4 Results for Λ = Z ................................. 54 1 Introduction Interacting particle systems (IPS) are models for complex phenomena involving a large number of interrelated components. Examples exist within all areas of natural and social sciences, such as traffic flow on highways, pedestrians or constituents of a cell, opinion dynamics, spread of epi- demics or fires, reaction diffusion systems, crystal surface growth, chemotaxis, financial markets... Mathematically the components are modeled as particles confined to a lattice or some discrete geometry. Their motion and interaction is governed by local rules. Often microscopic influences are not accesible in full detail and are modeled as effective noise with some postulated distribution.
    [Show full text]
  • Pre-Publication Accepted Manuscript
    P´eterE. Frenkel Convergence of graphs with intermediate density Transactions of the American Mathematical Society DOI: 10.1090/tran/7036 Accepted Manuscript This is a preliminary PDF of the author-produced manuscript that has been peer-reviewed and accepted for publication. It has not been copyedited, proof- read, or finalized by AMS Production staff. Once the accepted manuscript has been copyedited, proofread, and finalized by AMS Production staff, the article will be published in electronic form as a \Recently Published Article" before being placed in an issue. That electronically published article will become the Version of Record. This preliminary version is available to AMS members prior to publication of the Version of Record, and in limited cases it is also made accessible to everyone one year after the publication date of the Version of Record. The Version of Record is accessible to everyone five years after publication in an issue. This is a pre-publication version of this article, which may differ from the final published version. Copyright restrictions may apply. CONVERGENCE OF GRAPHS WITH INTERMEDIATE DENSITY PÉTER E. FRENKEL Abstract. We propose a notion of graph convergence that interpolates between the Benjamini–Schramm convergence of bounded degree graphs and the dense graph convergence developed by László Lovász and his coauthors. We prove that spectra of graphs, and also some important graph parameters such as numbers of colorings or matchings, behave well in convergent graph sequences. Special attention is given to graph sequences of large essential girth, for which asymptotics of coloring numbers are explicitly calculated. We also treat numbers of matchings in approximately regular graphs.
    [Show full text]
  • A Review on Statistical Inference Methods for Discrete Markov Random Fields Julien Stoehr
    A review on statistical inference methods for discrete Markov random fields Julien Stoehr To cite this version: Julien Stoehr. A review on statistical inference methods for discrete Markov random fields. 2017. hal-01462078v1 HAL Id: hal-01462078 https://hal.archives-ouvertes.fr/hal-01462078v1 Preprint submitted on 8 Feb 2017 (v1), last revised 11 Apr 2017 (v2) HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. A review on statistical inference methods for discrete Markov random fields Julien Stoehr1 1School of Mathematical Sciences & Insight Centre for Data Analytics, University College Dublin, Ireland February 8, 2017 Abstract Developing satisfactory methodology for the analysis of Markov random field is a very challenging task. Indeed, due to the Markovian dependence structure, the normalizing con- stant of the fields cannot be computed using standard analytical or numerical methods. This forms a central issue for any statistical approach as the likelihood is an integral part of the procedure. Furthermore, such unobserved fields cannot be integrated out and the like- lihood evaluation becomes a doubly intractable problem. This report gives an overview of some of the methods used in the literature to analyse such observed or unobserved random fields.
    [Show full text]
  • CONDITIONAL EXPECTATION Definition 1. Let (Ω,F,P) Be A
    CONDITIONAL EXPECTATION STEVEN P.LALLEY 1. CONDITIONAL EXPECTATION: L2 THEORY ¡ Definition 1. Let (­,F ,P) be a probability space and let G be a σ algebra contained in F . For ¡ any real random variable X L2(­,F ,P), define E(X G ) to be the orthogonal projection of X 2 j onto the closed subspace L2(­,G ,P). This definition may seem a bit strange at first, as it seems not to have any connection with the naive definition of conditional probability that you may have learned in elementary prob- ability. However, there is a compelling rationale for Definition 1: the orthogonal projection E(X G ) minimizes the expected squared difference E(X Y )2 among all random variables Y j ¡ 2 L2(­,G ,P), so in a sense it is the best predictor of X based on the information in G . It may be helpful to consider the special case where the σ algebra G is generated by a single random ¡ variable Y , i.e., G σ(Y ). In this case, every G measurable random variable is a Borel function Æ ¡ of Y (exercise!), so E(X G ) is the unique Borel function h(Y ) (up to sets of probability zero) that j minimizes E(X h(Y ))2. The following exercise indicates that the special case where G σ(Y ) ¡ Æ for some real-valued random variable Y is in fact very general. Exercise 1. Show that if G is countably generated (that is, there is some countable collection of set B G such that G is the smallest σ algebra containing all of the sets B ) then there is a j 2 ¡ j G measurable real random variable Y such that G σ(Y ).
    [Show full text]
  • Lecture Notes on Markov and Feller Property
    AN INTRODUCTION TO MARKOV PROCESSES AND THEIR APPLICATIONS IN MATHEMATICAL ECONOMICS UMUT C¸ETIN 1. Markov Property During the course of your studies so far you must have heard at least once that Markov processes are models for the evolution of random phenomena whose future behaviour is independent of the past given their current state. In this section we will make precise the so called Markov property in a very general context although very soon we will restrict our selves to the class of `regular' Markov processes. As usual we start with a probability space (Ω; F;P ). Let T = [0; 1) and E be a locally compact separable metric space. We will denote the Borel σ-field of E with E . Recall that the d-dimensional Euclidean space Rd is a particular case of E that appears very often in applications. For each t 2 T, let Xt(!) = X(t; !) be a function from Ω to E such that it is F=E - −1 measurable, i.e. Xt (A) 2 F for every A 2 E . Note that when E = R this corresponds to the familiar case of Xt being a real random variable for every t 2 T. Under this setup X = (Xt)t2T is called a stochastic process. We will now define two σ-algebras that are crucial in defining the Markov property. Let 0 0 Ft = σ(Xs; s ≤ t); Ft = σ(Xu; u ≥ t): 0 0 Observe that Ft contains the history of X until time t while Ft is the future evolution of X 0 after time t.
    [Show full text]
  • Geometric Ergodicity of the Random Walk Metropolis with Position-Dependent Proposal Covariance
    mathematics Article Geometric Ergodicity of the Random Walk Metropolis with Position-Dependent Proposal Covariance Samuel Livingstone Department of Statistical Science, University College London, London WC1E 6BT, UK; [email protected] Abstract: We consider a Metropolis–Hastings method with proposal N (x, hG(x)−1), where x is the current state, and study its ergodicity properties. We show that suitable choices of G(x) can change these ergodicity properties compared to the Random Walk Metropolis case N (x, hS), either for better or worse. We find that if the proposal variance is allowed to grow unboundedly in the tails of the distribution then geometric ergodicity can be established when the target distribution for the algorithm has tails that are heavier than exponential, in contrast to the Random Walk Metropolis case, but that the growth rate must be carefully controlled to prevent the rejection rate approaching unity. We also illustrate that a judicious choice of G(x) can result in a geometrically ergodic chain when probability concentrates on an ever narrower ridge in the tails, something that is again not true for the Random Walk Metropolis. Keywords: Monte Carlo; MCMC; Markov chains; computational statistics; bayesian inference 1. Introduction Markov chain Monte Carlo (MCMC) methods are techniques for estimating expecta- Citation: Livingstone, S. Geometric tions with respect to some probability measure p(·), which need not be normalised. This is Ergodicity of the Random Walk done by sampling a Markov chain which has limiting distribution p(·), and computing Metropolis with Position-Dependent empirical averages. A popular form of MCMC is the Metropolis–Hastings algorithm [1,2], Proposal Covariance.
    [Show full text]