Probability Theory: STAT310/MATH230; September 12, 2010 Amir Dembo
E-mail address: [email protected]
Department of Mathematics, Stanford University, Stanford, CA 94305.

Contents

Preface
Chapter 1. Probability, measure and integration
  1.1. Probability spaces, measures and σ-algebras
  1.2. Random variables and their distribution
  1.3. Integration and the (mathematical) expectation
  1.4. Independence and product measures
Chapter 2. Asymptotics: the law of large numbers
  2.1. Weak laws of large numbers
  2.2. The Borel-Cantelli lemmas
  2.3. Strong law of large numbers
Chapter 3. Weak convergence, clt and Poisson approximation
  3.1. The Central Limit Theorem
  3.2. Weak convergence
  3.3. Characteristic functions
  3.4. Poisson approximation and the Poisson process
  3.5. Random vectors and the multivariate clt
Chapter 4. Conditional expectations and probabilities
  4.1. Conditional expectation: existence and uniqueness
  4.2. Properties of the conditional expectation
  4.3. The conditional expectation as an orthogonal projection
  4.4. Regular conditional probability distributions
Chapter 5. Discrete time martingales and stopping times
  5.1. Definitions and closure properties
  5.2. Martingale representations and inequalities
  5.3. The convergence of martingales
  5.4. The optional stopping theorem
  5.5. Reversed MGs, likelihood ratios and branching processes
Chapter 6. Markov chains
  6.1. Canonical construction and the strong Markov property
  6.2. Markov chains with countable state space
  6.3. General state space: Doeblin and Harris chains
Chapter 7. Continuous, Gaussian and stationary processes
  7.1. Definition, canonical construction and law
  7.2. Continuous and separable modifications
  7.3. Gaussian and stationary processes
Chapter 8. Continuous time martingales and Markov processes
  8.1. Continuous time filtrations and stopping times
  8.2. Continuous time martingales
  8.3. Markov and Strong Markov processes
Chapter 9. The Brownian motion
  9.1. Brownian transformations, hitting times and maxima
  9.2. Weak convergence and invariance principles
  9.3. Brownian path: regularity, local maxima and level sets
Bibliography
310a: Homework Sets 2007

Preface

These are the lecture notes for a year-long, PhD-level course in Probability Theory that I taught at Stanford University in 2004, 2006 and 2009. The goal of this course is to prepare incoming PhD students in Stanford's mathematics and statistics departments to do research in probability theory. More broadly, the goal of the text is to help the reader master the mathematical foundations of probability theory and the techniques most commonly used in proving theorems in this area. This is then applied to the rigorous study of the most fundamental classes of stochastic processes.

Towards this goal, we introduce in Chapter 1 the relevant elements from measure and integration theory, namely, the probability space and the σ-algebras of events in it, random variables viewed as measurable functions, their expectation as the corresponding Lebesgue integral, and the important concept of independence.

Utilizing these elements, we study in Chapter 2 the various notions of convergence of random variables and derive the weak and strong laws of large numbers.

Chapter 3 is devoted to the theory of weak convergence, the related concepts of distribution and characteristic functions and two important special cases: the Central Limit Theorem (in short clt) and the Poisson approximation.

Drawing upon the framework of Chapter 1, we devote Chapter 4 to the definition, existence and properties of the conditional expectation and the associated regular conditional probability distribution.

Chapter 5 deals with filtrations, the mathematical notion of information progression in time, and with the corresponding stopping times.
Results about the latter are obtained as a by-product of the study of a collection of stochastic processes called martingales. Martingale representations are explored, as well as maximal inequalities, convergence theorems and various applications thereof. Aiming for a clearer and easier presentation, we focus here on the discrete time setting, deferring the continuous time counterpart to Chapter 8.

Chapter 6 provides a brief introduction to the theory of Markov chains, a vast subject at the core of probability theory, to which many textbooks are devoted. We illustrate some of the interesting mathematical properties of such processes by examining a few special cases of interest.

Chapter 7 sets the framework for studying right-continuous stochastic processes indexed by a continuous time parameter, introduces the family of Gaussian processes and rigorously constructs the Brownian motion as a Gaussian process with continuous sample path and zero-mean, stationary independent increments.

Chapter 8 expands our earlier treatment of martingales and strong Markov processes to the continuous time setting, emphasizing the role of right-continuous filtration. The mathematical structure of such processes is then illustrated both in the context of Brownian motion and that of Markov jump processes.

Building on this, in Chapter 9 we reconstruct the Brownian motion via the invariance principle as the limit of certain rescaled random walks. We further delve into the rich properties of its sample path and the many applications of Brownian motion to the clt and the Law of the Iterated Logarithm (in short, lil).

The intended audience for this course should have prior exposure to stochastic processes, at an informal level. While students are assumed to have taken a real analysis class dealing with Riemann integration, and to have mastered this material well, prior knowledge of measure theory is not assumed.
It is quite clear that these notes are much influenced by the textbooks [Bil95, Dur03, Wil91, KaS97] I have been using. I thank my students out of whose work this text materialized and my teaching assistants Su Chen, Kshitij Khare, Guoqiang Hu, Julia Salzman, Kevin Sun and Hua Zhou for their help in the assembly of the notes of more than eighty students into a coherent document. I am also much indebted to Kevin Ross, Andrea Montanari and Oana Mocioalca for their feedback on earlier drafts of these notes, to Kevin Ross for providing all the figures in this text, and to Andrea Montanari, David Siegmund and Tze Lai for contributing some of the exercises in these notes.

Amir Dembo
Stanford, California
April 2010

CHAPTER 1

Probability, measure and integration

This chapter is devoted to the mathematical foundations of probability theory. Section 1.1 introduces the basic measure theory framework, namely, the probability space and the σ-algebras of events in it. The next building blocks are random variables, introduced in Section 1.2 as measurable functions ω ↦ X(ω), and their distribution. This allows us to define in Section 1.3 the important concept of expectation as the corresponding Lebesgue integral, extending the horizon of our discussion beyond the special functions and variables with density to which elementary probability theory is limited. Section 1.4 concludes the chapter by considering independence, the most fundamental aspect that differentiates probability from (general) measure theory, and the associated product measures.

1.1. Probability spaces, measures and σ-algebras

We shall define here the probability space (Ω, F, P) using the terminology of measure theory.

The sample space Ω is a set of all possible outcomes ω ∈ Ω of some random experiment. Probabilities are assigned by A ↦ P(A) to A in F, a subset of all possible sets of outcomes. The event space F represents both the amount of information available as a result of the experiment conducted and the collection of all events of possible interest to us. A pleasant mathematical framework results by imposing on F the structural conditions of a σ-algebra, as done in Subsection 1.1.1. The most common and useful choices for this σ-algebra are then explored in Subsection 1.1.2. Subsection 1.1.3 provides fundamental supplements from measure theory, namely Dynkin's and Carathéodory's theorems and their application to the construction of Lebesgue measure.

1.1.1. The probability space (Ω, F, P). We use 2^Ω to denote the set of all possible subsets of Ω. The event space F is thus a subset of 2^Ω, consisting of all allowed events, that is, those events to which we shall assign probabilities. We next define the structural conditions imposed on F.

Definition 1.1.1. We say that F ⊆ 2^Ω is a σ-algebra (or a σ-field), if
(a) Ω ∈ F,
(b) if A ∈ F then also A^c ∈ F (where A^c = Ω \ A),
(c) if A_i ∈ F for i = 1, 2, 3, … then also ∪_i A_i ∈ F.

Remark. Using De Morgan's law, we know that (∪_i A_i^c)^c = ∩_i A_i. Thus the following is equivalent to property (c) of Definition 1.1.1:
(c') If A_i ∈ F for i = 1, 2, 3, … then also ∩_i A_i ∈ F.

Definition 1.1.2. A pair (Ω, F) with F a σ-algebra of subsets of Ω is called a measurable space. Given a measurable space (Ω, F), a measure µ is any countably additive non-negative set function on this space. That is, µ : F → [0, ∞], having the properties:
(a) µ(A) ≥ µ(∅) = 0 for all A ∈ F.
(b) µ(∪_n A_n) = Σ_n µ(A_n) for any countable collection of disjoint sets A_n ∈ F.
When in addition µ(Ω) = 1, we call the measure µ a probability measure, and often label it by P (it is also easy to see that then P(A) ≤ 1 for all A ∈ F).

Remark.
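For a finite sample space Ω the closure conditions of Definitions 1.1.1 and 1.1.2 reduce to finitely many checks (any countable union of sets drawn from a finite collection is already a finite union), so they can be verified mechanically. The following Python sketch is only an illustration of the definitions on a toy space; the function names and the candidate collections are ours, not from the text:

```python
from itertools import chain, combinations

def powerset(omega):
    """All subsets of a finite sample space, each as a frozenset: this is 2^Omega."""
    s = list(omega)
    return {frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))}

def is_sigma_algebra(F, omega):
    """Check Definition 1.1.1 on a finite space: (a) contains Omega,
    (b) closed under complement, (c) closed under (here: pairwise) union."""
    omega = frozenset(omega)
    if omega not in F:                             # property (a)
        return False
    if any(omega - A not in F for A in F):         # property (b)
        return False
    if any(A | B not in F for A in F for B in F):  # property (c), finite case
        return False
    return True

omega = {1, 2, 3, 4}
trivial = {frozenset(), frozenset(omega)}   # the smallest sigma-algebra
full = powerset(omega)                      # the largest: 2^Omega
partial = {frozenset(), frozenset({1}), frozenset(omega)}  # missing {2,3,4}

print(is_sigma_algebra(trivial, omega))  # True
print(is_sigma_algebra(full, omega))     # True
print(is_sigma_algebra(partial, omega))  # False: not closed under complement

# A uniform probability measure on (Omega, 2^Omega); additivity as in
# Definition 1.1.2(b), checked on two disjoint events.
def mu(A):
    return len(A) / len(omega)

print(mu(frozenset({1})) + mu(frozenset({2, 3})) == mu(frozenset({1, 2, 3})))  # True
```

Note that checking pairwise unions suffices here only because F is finite; for infinite Ω, property (c) genuinely concerns countable unions and no such finite verification exists.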