5 | Probability Spaces and Random Variables


In this chapter, we review some essentials of probability theory as required for the theory of the GLM. We focus on the particularities and inner logic of the probability theory model rather than its practical application, and primarily aim to establish important concepts and notation that will be used in subsequent sections. In Section 5.1, we first introduce the basic notion of a probability space as a model for experiments that involve some degree of randomness. We then discuss some elementary aspects of probability in Section 5.2, which mainly serve to ground the subsequently discussed theory of random variables and random vectors. The fundamental mathematical construct to model univariate data endowed with uncertainty is the concept of a random variable. In Section 5.3, we focus on different ways of specifying probability distributions of random variables, notably probability mass and density functions for discrete and continuous random variables, respectively. The concise mathematical representation of more than one data point requires the concept of a random vector. In Section 5.4, we first discuss the extension of random variable concepts to the multivariate case of random vectors and then focus on three concepts that arise only in the multivariate scenario and are of immense importance for statistical data analysis: marginal distributions, conditional distributions, and independent random variables.

5.1 Probability spaces

Probability spaces are very general and abstract models of random experiments. We use the following definition.

Definition 5.1.1 (Probability space). A probability space is a triple $(\Omega, \mathcal{A}, \mathbb{P})$, where
• $\Omega$ is a set of elementary outcomes $\omega$,
• $\mathcal{A}$ is a σ-algebra, i.e., $\mathcal{A}$ is a set of subsets of $\Omega$ with the following properties:
  ◦ $\Omega \in \mathcal{A}$,
  ◦ $\mathcal{A}$ is closed under the formation of complements, i.e., if $A \in \mathcal{A}$, then also $A^c := \Omega \setminus A \in \mathcal{A}$,
  ◦ $\mathcal{A}$ is closed under countable unions, i.e., if $A_1, A_2, A_3, \ldots \in \mathcal{A}$, then $\cup_{i=1}^{\infty} A_i \in \mathcal{A}$,
• $\mathbb{P}$ is a probability measure, i.e., $\mathbb{P}$ is a mapping $\mathbb{P} : \mathcal{A} \to [0, 1]$ with the following properties:
  ◦ $\mathbb{P}$ is normalized, i.e., $\mathbb{P}(\emptyset) = 0$ and $\mathbb{P}(\Omega) = 1$, and
  ◦ $\mathbb{P}$ is σ-additive, i.e., if $A_1, A_2, \ldots$ is a pairwise disjoint sequence in $\mathcal{A}$ (i.e., $A_i \in \mathcal{A}$ for $i = 1, 2, \ldots$ and $A_i \cap A_j = \emptyset$ for $i \neq j$), then $\mathbb{P}(\cup_{i=1}^{\infty} A_i) = \sum_{i=1}^{\infty} \mathbb{P}(A_i)$.

Example. A basic example is a probability space that models the throw of a die. In this case, the elementary outcomes $\omega \in \Omega$ model the six faces of the die, i.e., one may define $\Omega := \{1, 2, 3, 4, 5, 6\}$. If the die is thrown, it will roll, and once it comes to rest, its upper surface will show one of the elementary outcomes. The typical σ-algebra used in the case of discrete and finite outcome sets (such as the current $\Omega$) is the power set $\mathcal{P}(\Omega)$ of $\Omega$. It is a basic exercise in probability theory to show that the power set indeed fulfils the properties of a σ-algebra as defined above. Because $\mathcal{P}(\Omega)$ contains all subsets of $\Omega$, it also contains the elementary outcome sets $\{1\}, \{2\}, \ldots, \{6\}$, which thus get allocated a probability $\mathbb{P}(\{\omega\}) \in [0, 1]$, $\omega \in \Omega$, by the probability measure $\mathbb{P}$. Probabilities of sets containing a single elementary outcome are also often written simply as $\mathbb{P}(\omega) := \mathbb{P}(\{\omega\})$. The typical value ascribed to $\mathbb{P}(\omega)$, $\omega \in \Omega$, if used to model a fair die, is $\mathbb{P}(\omega) = 1/6$.

The σ-algebra $\mathcal{P}(\Omega)$ contains many more sets than the sets of elementary outcomes. The purpose of these additional elements is to model all sorts of events to which an observer of the random experiment may want to ascribe probabilities. For example, the observer may ask "What is the probability that the upper surface shows a number larger than three?". This event corresponds to the set $\{4, 5, 6\}$, which, because the σ-algebra $\mathcal{P}(\Omega)$ contains all possible subsets of $\Omega$, is contained in $\mathcal{P}(\Omega)$.
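The die example up to this point can be checked computationally. The following is a minimal Python sketch, assuming a fair die; the names `omega`, `power_set`, `is_sigma_algebra`, and `prob` are illustrative and do not appear in the text. It builds the power set of the outcome set, tests the three σ-algebra properties of Definition 5.1.1 on it, and evaluates the probability of the event "a number larger than three".

```python
from fractions import Fraction
from itertools import chain, combinations

# Elementary outcomes of the die (illustrative name, not from the text).
omega = frozenset({1, 2, 3, 4, 5, 6})


def power_set(s):
    """Return the set of all subsets of s, each as a frozenset."""
    items = sorted(s)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(c) for c in subsets}


def is_sigma_algebra(family, outcomes):
    """Check the three defining properties for a finite family of sets.

    For a finite family, closure under countable unions reduces to
    closure under pairwise unions.
    """
    contains_omega = outcomes in family
    closed_complement = all(outcomes - a in family for a in family)
    closed_union = all(a | b in family for a in family for b in family)
    return contains_omega and closed_complement and closed_union


def prob(event):
    """Fair-die assumption: every elementary outcome has probability 1/6."""
    return Fraction(len(event), len(omega))


sigma_algebra = power_set(omega)
larger_than_three = frozenset({4, 5, 6})

print(len(sigma_algebra))                      # 64 events
print(is_sigma_algebra(sigma_algebra, omega))  # True
print(larger_than_three in sigma_algebra)      # True
print(prob(larger_than_three))                 # 1/2
```

Exact fractions are used instead of floating-point numbers so that equalities such as $1/6 + 1/6 + 1/6 = 1/2$ hold without rounding error. A family such as `{omega}` alone fails the check, because the complement of $\Omega$, the empty set, is missing.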
Likewise, the observer may ask "What is the probability that the upper surface shows an even number?", which corresponds to the subset $\{2, 4, 6\}$ of $\Omega$.

The probability measure $\mathbb{P}$ is defined in such a manner that the answers to the following two questions are predetermined: "What is the probability that the upper surface shows nothing?" and "What is the probability that the upper surface shows any number in $\Omega$?". The element of $\mathcal{P}(\Omega)$ that corresponds to the first question is the empty set, and by definition of $\mathbb{P}$, $\mathbb{P}(\emptyset) = 0$. This models the idea that one of the elementary outcomes, i.e., one surface with pips, will show up on every instance of the random experiment. If this is not the case, for example because the pips have worn off one of the surfaces, the probability space model as sketched thus far is not a good model of the die experiment. The element of $\mathcal{P}(\Omega)$ that corresponds to the second question is $\Omega$ itself. Here, the definition of the probability measure assigns $\mathbb{P}(\Omega) = 1$, i.e., the probability that something unspecific will happen is one. Again, if the die falls off the table and cannot be recovered, the probability space model and the experiment are not in good alignment.

Finally, the definition of the probability space as provided above allows one to evaluate probabilities of certain events based on the probabilities of other events by means of the σ-additivity of $\mathbb{P}$. Assume, for example, that the probability space models the throw of a fair die, such that $\mathbb{P}(\{\omega\}) = 1/6$ by definition. Based on this assumption, the σ-additivity property allows one to evaluate the probabilities of many other events. Consider, for example, an observer who is interested in the probability of the event that the surface of the die shows a number smaller than or equal to three.
Because the elementary events $\{1\}, \{2\}, \{3\}$ are pairwise disjoint, and because the event of interest can be written as the countable union $\{1, 2, 3\} = \{1\} \cup \{2\} \cup \{3\}$ of these events, one may evaluate the probability of the event of interest by $\mathbb{P}(\cup_{i=1}^{3} \{i\}) = \sum_{i=1}^{3} \mathbb{P}(i) = 1/6 + 1/6 + 1/6 = 1/2$.

The die example is concerned with the case that a probability space is used to model a random experiment with a finite number of elementary outcomes. In the modelling of scientific experiments, the elementary outcomes are often modelled by the set of real numbers or real-valued vectors. Much of the theoretical development of modern probability theory in the early twentieth century was concerned with the question of how ideas from basic probability with finite elementary outcome spaces can be generalized to the continuous outcome space case of real numbers and vectors. In fact, perhaps the most important contribution of the probability space model as defined above, originally developed by Kolmogorov (1956), is that it is applicable in both the discrete-finite and the continuous-infinite elementary outcome set scenarios.

The study of probability spaces for $\Omega := \mathbb{R}$ or $\Omega := \mathbb{R}^n$, $n > 1$, is a central topic in probability theory, which we by and large omit here. We do, however, note that the σ-algebras employed when $\Omega := \mathbb{R}^n$, $n \geq 1$, are the so-called Borel σ-algebras, commonly denoted by $\mathcal{B}$ for $n = 1$ and $\mathcal{B}^n$ for $n > 1$. The mathematical construction of these σ-algebras is beyond our scope, but for the theory of the GLM, it is not unhelpful to think of Borel σ-algebras as power sets of $\mathbb{R}$ or $\mathbb{R}^n$, $n > 1$. Strictly speaking, this is incorrect, as it can be shown that there are more subsets of $\mathbb{R}$ or $\mathbb{R}^n$, $n > 1$, than there are elements in the corresponding Borel σ-algebras. Nevertheless, many events of interest, such as the event that the elementary outcome of a random experiment with outcome space $\mathbb{R}$ falls into a real interval $[a, b]$, are in $\mathcal{B}$.
5.2 Elementary probabilities

We next discuss a few elementary aspects of probabilities defined on probability spaces. Throughout, let $(\Omega, \mathcal{A}, \mathbb{P})$ denote a probability space, such that $\mathbb{P} : \mathcal{A} \to [0, 1]$ is a probability measure.

Interpretation

We first note that the probability $\mathbb{P}(A)$ of an event $A$ is associated with at least two interpretations. From a Frequentist perspective, the probability of an event corresponds to the idealized long-run frequency of observing the event $A$. From a Bayesian perspective, the probability of an event corresponds to the degree of belief that the event is true. Notably, both interpretations are subjective: the Frequentist perspective envisions an idealized long-run frequency, which can never be realized in practice, while the Bayesian belief interpretation is explicitly subjective and specific to a given observer. However, irrespective of the specific interpretation of the probability of an event, the logical rules for probabilistic inference, also known as probability calculus, are identical under both interpretations.

The General Linear Model | © 2020 Dirk Ostwald CC BY-NC-SA 4.0

Basic properties

We next note the following basic properties of probabilities, which follow directly from the definition of a probability space.

Theorem 5.2.1 (Properties of probabilities). Let $(\Omega, \mathcal{A}, \mathbb{P})$ denote a probability space. Then the following properties hold.
(1) If $A \subset B$, then $\mathbb{P}(A) \leq \mathbb{P}(B)$.
(2) $\mathbb{P}(A^c) = 1 - \mathbb{P}(A)$.
(3) If $A \cap B = \emptyset$, then $\mathbb{P}(A \cup B) = \mathbb{P}(A) + \mathbb{P}(B)$.
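The three properties listed above can be illustrated on the fair-die probability space of Section 5.1. The following is a minimal Python sketch under that assumption; the names `omega`, `events`, `prob`, and the three `check_*` functions are illustrative and not part of the text. It enumerates all events in the power set and tests each property exhaustively. Such a finite check is an illustration for one particular probability space, not a proof of the theorem.

```python
from fractions import Fraction
from itertools import combinations

# Fair-die probability space (an assumption made for this illustration).
omega = frozenset(range(1, 7))

# All events in the power set of omega, each as a frozenset.
events = [
    frozenset(c)
    for k in range(len(omega) + 1)
    for c in combinations(sorted(omega), k)
]


def prob(event):
    """Uniform probability measure: P(A) = |A| / |Omega|."""
    return Fraction(len(event), len(omega))


def check_monotonicity():
    """Property (1): A subset of B implies P(A) <= P(B)."""
    return all(
        prob(a) <= prob(b) for a in events for b in events if a <= b
    )


def check_complement():
    """Property (2): P(A^c) = 1 - P(A)."""
    return all(prob(omega - a) == 1 - prob(a) for a in events)


def check_additivity():
    """Property (3): disjoint A and B imply P(A u B) = P(A) + P(B)."""
    return all(
        prob(a | b) == prob(a) + prob(b)
        for a in events
        for b in events
        if not (a & b)
    )


print(check_monotonicity(), check_complement(), check_additivity())
```

For instance, property (2) gives the probability of "a number larger than three" as `1 - prob(frozenset({1, 2, 3}))`, which agrees with the value 1/2 obtained directly from the elementary probabilities.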