Conditional Measures and Conditional Expectation; Rohlin's Disintegration Theorem
DISCRETE AND CONTINUOUS DYNAMICAL SYSTEMS, Volume 32, Number 7, July 2012, pp. 2565-2582. doi:10.3934/dcds.2012.32.2565

CONDITIONAL MEASURES AND CONDITIONAL EXPECTATION; ROHLIN'S DISINTEGRATION THEOREM

David Simmons
Department of Mathematics, University of North Texas, P.O. Box 311430, Denton, TX 76203-1430, USA

Abstract. The purpose of this paper is to give a clean formulation and proof of Rohlin's Disintegration Theorem [7]. Another (possible) proof can be found in [6]. Note also that our statement of Rohlin's Disintegration Theorem (Theorem 2.1) is more general than the statement in either [7] or [6] in that $X$ is allowed to be any universally measurable space, and $Y$ is allowed to be any subspace of standard Borel space.

Sections 1-4 contain the statement and proof of Rohlin's Theorem. Sections 5-7 give a generalization of Rohlin's Theorem to the category of $\sigma$-finite measure spaces with absolutely continuous morphisms. Section 8 gives a less general but more powerful version of Rohlin's Theorem in the category of smooth measures on $C^1$ manifolds. Section 9 is an appendix which contains proofs of facts used throughout the paper.

1. Notation. We begin with the definition of the standard concept of a system of conditional measures, also known as a disintegration:

Definition 1.1. Let $(X, \mu)$ be a probability space, $Y$ a measurable space, and $\pi : X \to Y$ a measurable function. A system of conditional measures of $\mu$ with respect to $(X, \pi, Y)$ is a collection of measures $(\mu_y)_{y \in Y}$ such that

i) For each $y \in Y$, $\mu_y$ is a measure on $\pi^{-1}(y)$. For $\hat\mu$-almost every $y \in Y$, $\mu_y$ is a probability measure.

ii) The measures $(\mu_y)_{y \in Y}$ satisfy the law of total probability

$$\mu(B) = \int \mu_y(B) \, d\hat\mu(y) \tag{1}$$

for every event $B$ of $X$. (Here and throughout this paper $\hat\mu := \mu \circ \pi^{-1}$.)

Note that we are implicitly assuming that the map $y \mapsto \mu_y(B)$ is $\hat\mu$-measurable; we must be careful to prove this claim.
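To make Definition 1.1 concrete, here is a standard illustration that is not taken from the paper: the unit square with Lebesgue measure, disintegrated along its first coordinate. The symbols $\lambda$ (Lebesgue measure on $[0,1]$), $\lambda^2$, $\delta_y$, and $B_y$ are introduced only for this sketch, and $\hat\mu$ denotes $\mu \circ \pi^{-1}$ as in the definition.

```latex
% Illustrative example (not from the paper): the unit square projected to its first coordinate.
\begin{align*}
X &= [0,1]^2, \quad \mu = \lambda^2, \quad Y = [0,1], \quad \pi(y,t) = y, \\
\hat\mu &= \mu \circ \pi^{-1} = \lambda, \\
\mu_y &= \delta_y \otimes \lambda
  \quad \text{(a probability measure on } \pi^{-1}(y) = \{y\} \times [0,1] \text{)}, \\
\mu_y(B) &= \lambda(B_y), \quad B_y := \{\, t \in [0,1] : (y,t) \in B \,\}, \\
\mu(B) &= \int_0^1 \lambda(B_y) \, dy = \int \mu_y(B) \, d\hat\mu(y).
\end{align*}
```

In this special case the law of total probability (1) is Tonelli's theorem, which also supplies the measurability of $y \mapsto \mu_y(B)$ for Borel $B$. The general theorem has to establish both facts without a product structure to lean on.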
2000 Mathematics Subject Classification. Primary: 28A50, 28C15; Secondary: 28C05.
Key words and phrases. Disintegration, conditional measures, linear functionals, differential forms.

The proof that we will give of Rohlin's Disintegration Theorem is probabilistic; in particular, we will use the following notation, motivated by a probabilistic point of view:

Notation. Let $(X, \mu)$ be a probability space, and let $A$ be a $\mu$-measurable subset of $X$ with $\mu(A) > 0$. For a measurable set $B \subseteq X$ and an integrable function $f$, we write

$$P_\mu(X \in B \mid X \in A) := \mu_A(B) := \frac{\mu(B \cap A)}{\mu(A)}$$

$$E_\mu(f(X) \mid X \in A) := \int f(x) \, d\mu_A(x)$$

To prove the existence of systems of conditional measures, we will use a related concept which depends on topology:

Definition 1.2. Let $(X, \mu)$ be a topological probability space, $Y$ a metric space, and $\pi : X \to Y$ a measurable function. ($\pi$ need not be continuous.) Let $y \in Y$. Then the topological conditional measure of $\mu$ with respect to $(X, \pi, y, Y)$ is the weak-* limit

$$\mu_y := \lim_{\varepsilon \to 0} \mu_{\pi^{-1}(B(y, \varepsilon))} \tag{2}$$

if it exists and is supported entirely on $\pi^{-1}(y)$. (The measures on the right-hand side are defined by Notation 1.)

This definition has the advantage of being specific: for each $y \in Y$, there is at most one measure on $\pi^{-1}(y)$ which can be called the conditional probability of $\mu$ on $\pi^{-1}(y)$. Its disadvantage is that the context of the definition is less general: $X$ is required to be a topological space and $Y$ is required to be a metric space.

We recall the following standard definitions:

Definition 1.3. Standard Borel space is the Cantor space $2^{\mathbb{N}}$ with its Borel $\sigma$-algebra; the Borel isomorphism theorem states that any uncountable Polish space with its Borel $\sigma$-algebra is Borel isomorphic to standard Borel space.

Definition 1.4. A universally measurable space is a measurable space $X$ such that there is an isomorphic embedding $i_X$ of $X$ into standard Borel space, such that for every Borel measure $\mu$ on standard Borel space, $i_X(X)$ is measurable with respect to the completion of $\mu$.

Definition 1.5.
A metric space $X$ is an ultrametric space if it satisfies the ultrametric triangle inequality

$$d(x, z) \le \max(d(x, y), d(y, z))$$

for all $x, y, z \in X$.

2. Statement of Rohlin's Disintegration Theorem. We will prove two versions of Rohlin's Theorem; the first, which is a strengthening of the version given in [7], is an entirely measure-theoretic formulation, whereas the second, which appears to be new, involves topology. Theorems 2.1 and 2.2 correspond to Definitions 1.1 and 1.2, respectively.

Theorem 2.1 (Rohlin's Disintegration Theorem). Let $X$ be a universally measurable space, let $Y$ be a measurable space such that there exists a measurable injective map from $Y$ into standard Borel space, and let $\mu$ be a Borel probability measure on $X$. Let $\pi : X \to Y$ be measurable. Then there exists a system of conditional measures $(\mu_y)_{y \in Y}$ of $\mu$ with respect to $(X, \pi, Y)$. This system is unique in the sense that if $(\nu_y)_{y \in Y}$ is any other system of conditional measures, then $\mu_y = \nu_y$ for $\hat\mu$-almost every $y \in Y$.

Theorem 2.2. Let $(X, \mu)$ be a compact metric probability space, and let $Y$ be a locally compact separable ultrametric space or a separable Riemannian manifold. Let $\pi : X \to Y$ be measurable. Then for $\hat\mu$-almost every $y \in Y$, the topological conditional measure of $\mu$ with respect to $(X, \pi, y, Y)$ exists as in Definition 1.2. Furthermore, the collection of measures $(\mu_y)_{y \in Y}$ is a system of conditional measures as in Definition 1.1. (If $\mu_y$ does not exist, set $\mu_y = 0$.)

The proof will be divided into two parts: deducing Theorem 2.1 from Theorem 2.2, and proving Theorem 2.2.

3. Proof of Rohlin's Theorem: Theorem 2.2 $\rightarrow$ Theorem 2.1. Let $X' = 2^{\mathbb{N}}$ be standard Borel space, and let $i_X : X \to X'$ be the inclusion guaranteed by the universal measurability of $X$. Let $\mu' = \mu \circ i_X^{-1}$; $\mu'$ is a probability measure on $X'$. Then $(X', \mu')$ is a compact metric probability space.

Let $i_Y$ be a measurable injective map from $Y$ into the Cantor space $Y' := 2^{\mathbb{N}}$ equipped with the Borel $\sigma$-algebra.
Note that $Y'$ is a locally compact separable ultrametric space. By [8, 3.2.3, p. 92], the map $\pi$ admits a Borel measurable extension $\pi' : X' \to Y'$. Note that there is no reason to suppose that $\pi'$ is continuous. Thus we have satisfied the hypotheses of Theorem 2.2 for $(X', \mu', \pi', Y')$. (If $X$ and $Y$ are standard Borel, we are done with existence.)

Let $(\mu'_{y'})_{y' \in Y'}$ be a system of conditional measures of $\mu'$ with respect to $(X', \pi', Y')$. For each $y \in Y$, let

$$\mu_y = (\mu'_{i_Y(y)} \restriction i_X(X)) \circ (i_X^{-1})^{-1}$$

if $\mu'_{i_Y(y)}$ is supported on $i_X(X)$, and $\mu_y = 0$ otherwise. Note that this makes sense since $i_X(X)$ is universally measurable. We claim that $(\mu_y)_{y \in Y}$ is a system of conditional measures of $\mu$ with respect to $(X, \pi, Y)$.

First, note that since $\pi'$ is an extension of $\pi$, we have $i_Y \circ \pi = \pi' \circ i_X$, and thus $\hat\mu' = \hat\mu \circ i_Y^{-1}$.

For all $y \in Y$, $\mu'_{i_Y(y)}$ is a measure on $(\pi')^{-1}(i_Y(y))$. If $\mu'_{i_Y(y)}(X' \setminus i_X(X)) > 0$, then $\mu_y = 0$ is a measure on $\pi^{-1}(y)$. If $\mu'_{i_Y(y)}(X' \setminus i_X(X)) = 0$, then $\mu_y = (\mu'_{i_Y(y)} \restriction i_X(X)) \circ (i_X^{-1})^{-1}$ is a measure supported on $i_X^{-1}((\pi')^{-1}(i_Y(y)) \cap i_X(X))$, which by the injectivity of $i_Y$ is equal to $\pi^{-1}(y)$. Furthermore, in this case we have $\mu'_{i_Y(y)} = \mu_y \circ i_X^{-1}$. If additionally $\mu'_{i_Y(y)}$ is a probability measure, then $\mu_y$ is a probability measure.

Now for $\hat\mu'$-almost every $y' \in Y'$, $\mu'_{y'}$ is a probability measure, and $\mu'_{y'}(X' \setminus i_X(X)) = 0$. (The second claim follows from (1) applied to the formula $\mu'(X' \setminus i_X(X)) = \mu(\emptyset) = 0$.) Thus for $\hat\mu$-almost every $y \in Y$, $\mu'_{i_Y(y)}$ is a probability measure, and $\mu'_{i_Y(y)}(X' \setminus i_X(X)) = 0$. By the preceding paragraph, we see that for every $y \in Y$, $\mu_y$ is a measure on $\pi^{-1}(y)$, and for $\hat\mu$-almost every $y \in Y$, $\mu_y$ is a probability measure and $\mu'_{i_Y(y)} = \mu_y \circ i_X^{-1}$. Thus condition (i) of Definition 1.1 is satisfied.

To prove condition (ii), fix $B \subseteq X$ measurable. Since $i_X$ is an embedding, there exists a Borel set $B' \subseteq X'$ such that $B = i_X^{-1}(B')$.
Now for $\hat\mu$-almost every $y \in Y$, $\mu'_{i_Y(y)} = \mu_y \circ i_X^{-1}$ and therefore $\mu'_{i_Y(y)}(B') = \mu_y \circ i_X^{-1}(B') = \mu_y(B)$. Thus the function $y \mapsto \mu_y(B)$ is equal $\hat\mu$-almost everywhere to the composition of $i_Y$ with the map $y' \mapsto \mu'_{y'}(B')$, and is therefore $\hat\mu$-measurable.

Finally, note that $\mu'(B') = \mu(B)$. Applying (1), we see that

$$\mu(B) = \int \mu'_{y'}(B') \, d\hat\mu'(y') = \int \mu'_{i_Y(y)}(B') \, d\hat\mu(y) = \int \mu_y \circ i_X^{-1}(B') \, d\hat\mu(y) = \int \mu_y(B) \, d\hat\mu(y).$$
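As a reading aid, the final chain of equalities can be annotated step by step. The following is a sketch that assumes only the facts established earlier in this section: $\mu'(B') = \mu(B)$, $\hat\mu' = \hat\mu \circ i_Y^{-1}$, and $\mu'_{i_Y(y)} = \mu_y \circ i_X^{-1}$ for $\hat\mu$-almost every $y$. The justifications in the right-hand column are editorial annotations, not the author's wording.

```latex
% Annotated version of the closing computation; right-hand labels are editorial.
\begin{align*}
\mu(B) &= \mu'(B')
  && \text{since } B = i_X^{-1}(B') \text{ and } \mu' = \mu \circ i_X^{-1} \\
&= \int \mu'_{y'}(B') \, d\hat\mu'(y')
  && \text{(1) applied to } (X', \mu', \pi', Y') \\
&= \int \mu'_{i_Y(y)}(B') \, d\hat\mu(y)
  && \text{change of variables, } \hat\mu' = \hat\mu \circ i_Y^{-1} \\
&= \int \mu_y \circ i_X^{-1}(B') \, d\hat\mu(y)
  && \mu'_{i_Y(y)} = \mu_y \circ i_X^{-1} \ \ \hat\mu\text{-a.e.} \\
&= \int \mu_y(B) \, d\hat\mu(y)
  && i_X^{-1}(B') = B
\end{align*}
```

The only step that changes the space of integration is the third line, which is the change-of-variables formula for the pushforward of $\hat\mu$ under $i_Y$.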