
“measureTheory_v2” 2019/5/1 i i page 61 i i

Chapter 5 Construction of a General Measure Structure

Simple methods will soon lead us to results of far reaching theoretical and practical importance. We shall encounter theoretical conclusions which not only are unexpected but actually come as a shock to intuition and common sense.

W. Feller

What I don’t like about measure theory is that you have to say “almost everywhere” almost everywhere.

K. Friedrichs

Keep in mind that there are millions of theorems but only thousands of proofs, hundreds of proof blocks, and dozens of ideas. Unfortunately, no one has figured out how to transfer the ideas directly yet, so you have to extract them from complicated arguments by yourself.

F. Nazarov

Let me do it. You tell me when you want it and where you want it to land, and I’ll do it backwards and tell you when to take off.

K. Johnson

I’ve been giving this lecture to first-year classes for over twenty-five years. You’d think they would begin to understand it by now.

J. Littlewood

Panorama

From developing some ideas of measure theory from an experimental point of view, we turn 180° to the development of a rigorous general measure theory. The ingredients are a set describing the “universe” of points, a class of “measurable” sets along with permissible operations on these sets, and the measure itself. After postulating the basic properties of measure theory, we can then develop consequences and applications of those properties to build a rich theory. That is the focus of the rest of this book.


However, before starting down that path, we have to construct interesting measures satisfying the assumptions and we have to explain how to compute the measure of complicated sets. The latter point is critical, as it is impractical to assign the measure of every complicated set individually. We have to build a systematic method for computing the measures of complicated sets based on the measures of a class of simple sets. Indeed, this is how we approached measure in Chapter 4, where we started with the lengths of intervals.

So, after describing the basic properties of the class of measurable sets and the measure, we develop a systematic approach to compute the measure of complicated sets. A number of different approaches have been developed over the decades. We use the approach of Carathéodory because it is general yet still closely related to intuition. The construction of measure is carried out in two stages, involving first an “outer measure” stage and then a “premeasure” stage. This is a long process that involves overcoming a number of technical difficulties, and it is likely to take some time to understand.

This chapter is abstract, beginning with the assumption of a generic “master” set or “universe” X. We supply a number of simple examples that are illustrative but not very practical. The choice of X is important in practice. For example, as with B and I from Chapter 4, it may be related to modeling a physical situation. Its properties have a strong impact on the properties of measure. For example, whether or not X is a metric space and whether or not X is bounded are important.

In this chapter, we give a real application of the general theory to the systematic construction of measure on a general metric space. In the next chapter, we develop the main application to Euclidean space. We conclude this chapter with a general result on approximation of measures. This is a reality check in the sense that the long, complicated development of measure theory results in a concept of measure that can be computed through an approximation process.

5.1 Sigma algebras

Assume that X is a nonempty set.

We begin by describing the class of “measurable” sets. For convenience, we repeat Definition 2.1.4.

Definition 5.1.1

The family of all subsets of X is called the power set of X and is denoted by PX.

Recall that in Chapter 4, we found that natural questions about sequences of coin flips lead to consideration of countable unions and intersections, as well as complements, of simple intervals. In the abstract, defining a class of measurable sets involves specifying the permissible set operations that allow combining given measurable sets to get new measurable sets. This definition is very important in practice, since most of the time we build rich collections by starting with a collection of simple sets and using the permissible operations. We distinguish the cases of finite and countable numbers of operations. Dealing with finite numbers of operations fits intuition about measuring sizes of sets, but we need to deal with countable numbers of operations to reach the desired generality.

For those readers who have an aversion to “structural classification”, the name algebra does not imply that we dive into subjects like group theory.


Definition 5.1.2: Algebra

An algebra on X is a non-empty collection of subsets M with the following properties,

1. If $A \in \mathcal{M}$ then $A^c \in \mathcal{M}$. (Closed under complements)
2. If $A_1, A_2, \dots, A_m \in \mathcal{M}$ then $\bigcup_{i=1}^m A_i \in \mathcal{M}$. (Closed under finite unions)

Example 5.1.1

Let $X = (0, 1]$ and $\mathcal{M} = \{\emptyset, \text{finite unions of disjoint intervals that are open on the left and closed on the right}\}$. A typical $A \in \mathcal{M}$ has the form $A = \bigcup_{i=1}^m (a_i, b_i]$, with $a_i > b_{i-1}$. It is easy to see that $\mathcal{M}$ is closed under complements and finite unions after noting that the complement in $X$ of such an interval consists of at most two disjoint intervals of the same type, and that the union of two such intervals that overlap is an interval of the same type. The analogous collection is not an algebra if $X = \mathbb{R}$; try the complement condition.

Definition 5.1.3: σ-algebra

A σ- algebra (sigma algebra) on X is a non-empty collection of subsets M with the properties,

1. If $A \in \mathcal{M}$ then $A^c \in \mathcal{M}$. (Closed under complements)
2. If $\{A_i\}_{i=1}^\infty$ is a collection of sets in $\mathcal{M}$ then $\bigcup_{i=1}^\infty A_i \in \mathcal{M}$. (Closed under countable unions)

Two immediate examples (proof is an exercise),

Theorem 5.1.1

PX and {∅, X} are σ- algebras.

Definition 5.1.4

PX is the maximal σ- algebra and {∅, X} is the trivial or minimal σ- algebra.

It is natural to wonder why it is necessary to consider σ-algebras other than PX since it contains all subsets of X and therefore is the “biggest” collection. It turns out that PX contains too many subsets in the case of $X = \mathbb{R}^n$. A simple σ-algebra that is not trivial:

Example 5.1.2

If A ⊂ X, the collection {∅, A, Ac, X} is a σ-algebra.

Conditions defining a σ- algebra can use a variety of set properties.


Example 5.1.3

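If $X$ is uncountable, $\mathcal{M} = \{A \subset X : A \text{ is countable or } A^c \text{ is countable}\}$ is a σ-algebra. First note that $(A^c)^c = A$. Thus, if $A \in \mathcal{M}$, then either $A^c$ is countable or $(A^c)^c = A$ is countable. So, $A^c \in \mathcal{M}$. Let $\{A_i\}_{i=1}^\infty \subset \mathcal{M}$. If all the sets are countable then the countable union is countable and so is in $\mathcal{M}$. If not all the sets are countable, there exists an index $j \in \mathbb{Z}^+$ such that $A_j^c$ is countable. Thus, $\left(\bigcup_{i=1}^\infty A_i\right)^c = \bigcap_{i=1}^\infty A_i^c \subset A_j^c$ is countable, and so the union is in $\mathcal{M}$.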

An algebra does not have to be a σ- algebra.

Example 5.1.4

Consider the algebra M defined in Example 5.1.1. M is not a σ- algebra. To see this, consider the sets

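$$A_1 = \left(0, \tfrac{1}{2}\right], \quad A_2 = \left(\tfrac{1}{2} + \tfrac{1}{2^2},\ \tfrac{1}{2} + \tfrac{1}{2^2} + \tfrac{1}{2^3}\right], \quad A_3 = \left(\tfrac{1}{2} + \tfrac{1}{2^2} + \tfrac{1}{2^3} + \tfrac{1}{2^4},\ \tfrac{1}{2} + \tfrac{1}{2^2} + \tfrac{1}{2^3} + \tfrac{1}{2^4} + \tfrac{1}{2^5}\right], \ \cdots.$$

Then, $\bigcup_{i=1}^\infty A_i$ is a countable union of disjoint intervals that is not a finite union of disjoint intervals.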
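The endpoints in Example 5.1.4 can be tabulated exactly. The following is a minimal sketch, assuming the pattern $A_n = (s_{2n-2}, s_{2n-1}]$ with $s_k = \sum_{i=1}^k 2^{-i}$, which is how the first three sets read; the helper names `partial_sum` and `interval` are illustrative only.

```python
from fractions import Fraction

def partial_sum(k):
    """Return 1/2 + 1/4 + ... + 1/2**k as an exact fraction (0 when k == 0)."""
    return sum((Fraction(1, 2**i) for i in range(1, k + 1)), Fraction(0))

def interval(n):
    """Endpoints (left, right) of A_n = (left, right], assuming n >= 1."""
    return partial_sum(2 * n - 2), partial_sum(2 * n - 1)

intervals = [interval(n) for n in range(1, 6)]
for n, (left, right) in enumerate(intervals, start=1):
    print(f"A_{n} = ({left}, {right}]")

# Gap between consecutive intervals: left endpoint of A_{n+1} minus right endpoint of A_n.
gaps = [intervals[n + 1][0] - intervals[n][1] for n in range(len(intervals) - 1)]
```

Every gap is strictly positive, so no two of the intervals can be merged into a single half-open interval, which is consistent with the union not being a finite union of such intervals.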

Next, we present another example built on half-open intervals that is a σ- algebra.

Example 5.1.5

Let $X = \mathbb{R}$ and let $\mathcal{M}$ = {∅, countable unions of disjoint intervals of the form $[i, i+1)$, and half rays $(-\infty, i)$ and $[i, \infty)$, $i \in \mathbb{Z}$}. ∅ and $\mathbb{R}$ are in $\mathcal{M}$. A countable union of collections of countable unions of disjoint intervals $[i, i+1)$ and indicated rays is a countable union of indicated intervals and rays. To see that $\mathcal{M}$ is closed under complements, for example, note that $[i, i+1)^c = (-\infty, i) \cup [i+1, \infty)$. If $j < i$, then $\big([j, j+1) \cup [i, i+1)\big)^c = (-\infty, j) \cup [j+1, i) \cup [i+1, \infty)$, where $[j+1, i)$ is a finite union of intervals of the form $[k, k+1)$. Thus, we can show that the complement of a countable union of indicated intervals and rays, which is the intersection of the complements of such sets, can be written as a countable union of indicated intervals and rays. Thus, $\mathcal{M}$ is a σ-algebra.

It is a good exercise to compare Examples 5.1.1 and 5.1.5. The starting point for defining a measure consists of a set and a σ- algebra.

Definition 5.1.5: Measurable space

If X has a σ- algebra M , we call (X, M ) a measurable space. The sets in M are called measurable sets.


Example 5.1.6

In Chapter 4, we developed a rough picture of a collection of measurable sets in (0, 1] based on performing set operations such as union, intersection, and complement starting with half-open intervals (a, b]. We extend the idea to all of R and make it precise in Chapter 6 to obtain the Borel and closely related Lebesgue σ-algebras. These are certainly two of the most important examples of a σ-algebra, but establishing their existence takes some work. For now, we anticipate that countable unions and intersections of open, closed, and half-open intervals are in the σ-algebras.

Remark 5.1.1

If familiar, it may help in accepting the abstraction of defining a space through set operations to recall the idea of a topological space, which is a nonempty set of points together with a family of subsets called the open sets that have the properties: (1) the space and the empty set are open; (2) a finite intersection of open sets is open; (3) any union of open sets is open. The properties of a topological space are an expression of the fundamental properties of “openness”, without any additional complications added. A metric space is an example of a topological space, though a metric space has more assumptions. Note that the differences in assumptions between measurable and topological spaces lead to very different constructions.

It is not immediately apparent, but the few assumptions for a σ- algebra have a number of consequences for other set operations.

Theorem 5.1.2: Basic Properties of σ- algebras

1. A σ-algebra on $X$ is an algebra on $X$.
2. If $\mathcal{M}$ is an algebra or σ-algebra on $X$, then $X, \emptyset \in \mathcal{M}$.
3. If $\mathcal{M}$ is an algebra on $X$ and $\{A_i\}_{i=1}^m$ is a collection of sets in $\mathcal{M}$ then $\bigcap_{i=1}^m A_i \in \mathcal{M}$. (Closed under finite intersections)
4. If $\mathcal{M}$ is a σ-algebra on $X$ and $\{A_i\}_{i=1}^\infty$ is a collection of sets in $\mathcal{M}$ then $\bigcap_{i=1}^\infty A_i \in \mathcal{M}$. (Closed under countable intersections)
5. If $\mathcal{M}$ is an algebra or σ-algebra on $X$ and if $A, B \in \mathcal{M}$, then $A \setminus B \in \mathcal{M}$.

Proof. We prove these in order.

Result 1 Follows from the definitions.

Result 2 Since $\mathcal{M}$ is nonempty there is an $A \subset X$ such that $A \in \mathcal{M}$. Thus $A^c \in \mathcal{M}$ and $X = A \cup A^c \in \mathcal{M}$. Since $X \in \mathcal{M}$, $X^c = \emptyset \in \mathcal{M}$.

Result 3 Since $\{A_i^c\}_{i=1}^m \subset \mathcal{M}$ and $\bigcup_{i=1}^m A_i^c \in \mathcal{M}$, $\bigcap_{i=1}^m A_i = \left(\bigcup_{i=1}^m A_i^c\right)^c \in \mathcal{M}$.

Result 4 Since $\{A_i^c\}_{i=1}^\infty \subset \mathcal{M}$ and $\bigcup_{i=1}^\infty A_i^c \in \mathcal{M}$, $\bigcap_{i=1}^\infty A_i = \left(\bigcup_{i=1}^\infty A_i^c\right)^c \in \mathcal{M}$.

Result 5 From above, $A \setminus B = A \cap B^c \in \mathcal{M}$.

Remark 5.1.2

Result 2, ∅, X ∈ M, is often assumed in the definition of algebras and σ-algebras. As shown, the assumption that M is a nonempty collection implies this holds.

The following theorem provides a useful way to generate a σ-algebra on a member of a σ-algebra.

Theorem 5.1.3

Let M be a σ-algebra on X and B ⊂ X. Then, the collection MB = {B ∩ A : A ∈ M } is a σ- algebra on B.

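Proof. Let $B_1 \in \mathcal{M}_B$, then $B_1 = B \cap A_1$ for some $A_1 \in \mathcal{M}$. The complement of $B_1$ in $B$ is $B \cap B_1^c$. Now, $B \cap B_1^c = B \cap (B^c \cup A_1^c)$ and $B \cap B_1^c = (B \cap B^c) \cup (B \cap A_1^c)$. So $B \cap B_1^c = B \cap A_1^c \in \mathcal{M}_B$. Let $\{B_i\}_{i=1}^\infty$ be a sequence in $\mathcal{M}_B$. Each $B_i = B \cap A_i$ for some $A_i \in \mathcal{M}$. Therefore, $\bigcup_{i=1}^\infty B_i = B \cap \bigcup_{i=1}^\infty A_i \in \mathcal{M}_B$.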

This result is very useful. Below, we construct a σ-algebra on R and easily obtain a σ-algebra on any measurable subset of R. As mentioned, in practice we start with a collection of simple sets and build a σ-algebra using set operations. Obviously, it is easier to define an algebra than a σ-algebra. The next result gives conditions under which an algebra is a σ-algebra.

Theorem 5.1.4

An algebra of sets $\mathcal{M}$ on $X$ is a σ-algebra if and only if $\bigcup_{i=1}^\infty A_i \in \mathcal{M}$ whenever $\{A_i\}_{i=1}^\infty \subset \mathcal{M}$ is a disjoint collection. (Closed under countable disjoint unions)

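Proof. ⇒ follows by definition. For ⇐, consider a collection of sets $\{B_i\}_{i=1}^\infty \subset \mathcal{M}$ that may or may not be disjoint. Following Theorem 2.4.2, we can construct a disjoint collection $\{A_j\}_{j=1}^\infty \subset \mathcal{M}$ (each $A_j$ is obtained from finitely many of the $B_i$ by unions and differences, so it lies in the algebra) such that $\bigcup_{i=1}^\infty B_i = \bigcup_{j=1}^\infty A_j \in \mathcal{M}$.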

There is little chance that an arbitrary collection of sets is a σ- algebra. It is natural to wonder if it is possible to construct a σ- algebra starting with a given collection of sets. The following result gives a partial answer.

Theorem 5.1.5

1. The intersection of any collection of σ-algebras on $X$ is a σ-algebra.
2. If $\mathcal{A}$ is a collection of subsets of $X$, there is a unique smallest σ-algebra $\mathcal{M}$ containing $\mathcal{A}$ in the sense that any σ-algebra containing $\mathcal{A}$ also contains $\mathcal{M}$.

The result is "partial" in the sense that while it says that the unique smallest σ-algebra exists, the proof does not give a practical procedure to construct it.

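Proof.

Result 1 Let $\mathcal{O}$ be a collection of σ-algebras on $X$. Define,
$$\mathcal{N} = \bigcap_{\mathcal{M} \in \mathcal{O}} \mathcal{M} = \{A \subset X : A \in \mathcal{M} \text{ for all } \mathcal{M} \in \mathcal{O}\}.$$
Note that $\mathcal{N}$ is nonempty since $X$ belongs to every $\mathcal{M} \in \mathcal{O}$. Suppose $A \in \mathcal{N}$. Then, $A^c \in \mathcal{M}$ for all $\mathcal{M} \in \mathcal{O}$ and therefore $A^c \in \mathcal{N}$. In a similar way, if $\{A_i\}_{i=1}^\infty \subset \mathcal{N}$, then $\bigcup_{i=1}^\infty A_i \in \mathcal{N}$.

Result 2 Define $\mathcal{M}$ to be the intersection of all σ-algebras containing $\mathcal{A}$. There is at least one such σ-algebra, namely the power set of $X$, so the intersection is well defined, and it is itself a σ-algebra by 1. By definition, $\mathcal{M}$ is contained in any σ-algebra that contains $\mathcal{A}$.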

Definition 5.1.6

Let A be a collection of subsets of X. The unique smallest σ- algebra containing A is denoted by σ (A ) and is said to be the σ- algebra generated by A .

Example 5.1.7

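Let $X = \{1, 2, 3, 4\}$ and set $A_1 = \{1, 2\}$, $A_2 = \{2, 3\}$, and finally define $\mathcal{A} = \{A_1, A_2\}$. Then $\sigma(\mathcal{A}) = \mathcal{P}_X$. Indeed, $\{2\} = A_1 \cap A_2$, $\{1\} = A_1 \setminus (A_1 \cap A_2)$, $\{3\} = A_2 \setminus (A_1 \cap A_2)$, $\{4\} = (A_1 \cup A_2)^c$, and now the other subsets can be obtained using unions.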
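On a finite set, the generated σ-algebra can be computed by brute force, since every countable union reduces to a finite one. The following is a minimal sketch under that finiteness assumption; the function name `generate_sigma_algebra` is illustrative and is not notation from the text.

```python
from itertools import combinations

def generate_sigma_algebra(universe, generators):
    """Close `generators` under complement and pairwise union within `universe`.

    Assumes `universe` is finite, so that closure under finite unions
    already gives closure under countable unions.
    """
    universe = frozenset(universe)
    # The empty set and the universe belong to every sigma-algebra.
    family = {frozenset(), universe} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for a in current:
            comp = universe - a
            if comp not in family:
                family.add(comp)
                changed = True
        for a, b in combinations(current, 2):
            u = a | b
            if u not in family:
                family.add(u)
                changed = True
    return family

sigma = generate_sigma_algebra({1, 2, 3, 4}, [{1, 2}, {2, 3}])
print(len(sigma))  # 16, the size of the power set of a 4-point set
```

The loop repeats until no new complement or union appears, which must happen because the family is a subset of the finite power set.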

Example 5.1.8

Let X = N and let A be the collection of sets consisting of a finite number of nonnegative integers. It is an easy exercise to see that σ(A ) = PX.

Example 5.1.9

Let $X = \{a, b, c, d\}$ and set $A_1 = \{a, b\}$, $A_2 = \{b\}$, and finally define $\mathcal{A} = \{A_1, A_2\}$. It is a good exercise to show that,
$$\sigma(\mathcal{A}) = \big\{\emptyset, \{a\}, \{b\}, \{a, b\}, \{c, d\}, \{a, c, d\}, \{b, c, d\}, \{a, b, c, d\}\big\} \neq \mathcal{P}_X.$$

Proving the following is a good exercise.

Theorem 5.1.6

1. If $\mathcal{M}$ is a σ-algebra on $X$, then $\mathcal{M} = \sigma(\mathcal{M})$.
2. For any $A \subset X$, we have $\sigma(\{A\}) = \{\emptyset, A, A^c, X\}$.
3. If $\mathcal{A} \subset \mathcal{B}$ are collections of subsets of $X$, then $\sigma(\mathcal{A}) \subset \sigma(\mathcal{B})$.

The following useful result gives conditions under which the σ- algebra generated by a collection of sets inherits some desirable property of those sets.

Theorem 5.1.7: Principle of Inheritance

Let A be a collection of subsets of X. Define C to be a collection of sets generated from A with a given property, i.e.,

C = {A ∈ σ (A ): A has a desired property}.

If A ⊂ C and C is a σ- algebra, then σ (A ) ⊂ C .

Definition 5.1.7

Theorem 5.1.7 is also called the Principle of Appropriate Sets and Principle of Good Sets.

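Proof. Since $\mathcal{A} \subset \mathcal{C}$ and $\mathcal{C}$ is a σ-algebra, Theorem 5.1.6 gives $\sigma(\mathcal{A}) \subset \sigma(\mathcal{C}) = \mathcal{C}$.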

Example 5.1.10

Let M be a σ- algebra on X and B ∈ M . By Theorem 5.1.3, the collection MB = {B ∩ A : A ∈ M } is a σ- algebra on B. We show that if A is a collection of sets such that σ (A ) = M , then σ (AB) = MB, where AB = {B ∩ A : A ∈ A }. Theorem 5.1.6 implies that σ (AB) ⊂ MB. To show MB ⊂ σ (AB), consider the collection C = {A ∈ M : B ∩ A ∈ σ (AB)}. It is a good exercise to show that C is a σ- algebra. The result now follows from Theorem 5.1.7.

5.2 Measure

In this section, we develop the basic properties of measure, which is a means of quantifying the “size” of the sets in a given σ-algebra on a set X. In other words, a measure is defined on a measurable space. Size could be the analog of length, area, or volume as discussed in Chapter 4. But it can be something else entirely, e.g., evaluating probability. In Chapter 4, the probability measure coincided with the measure of “length”, which worked out because the length of the unit interval is 1 and we require the probability of the entire space to be 1 as well. But that is a special case.

In this section, we develop general properties and consequences. We leave a discussion of how measure is actually computed to later.

Let (X, M ) be a measurable space where X is nonempty.


Definition 5.2.1: Measure

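A (finitely) additive measure on $(X, \mathcal{M})$ is a set function $\mu : \mathcal{M} \to [0, \infty]$ satisfying,
• $\mu(\emptyset) = 0$,
• If $\{A_i\}_{i=1}^m$ is a collection of disjoint sets in $\mathcal{M}$, then
$$\mu\left(\bigcup_{i=1}^m A_i\right) = \sum_{i=1}^m \mu(A_i).$$

A (countably additive) measure on $(X, \mathcal{M})$ is a set function $\mu : \mathcal{M} \to [0, \infty]$ satisfying,
• $\mu(\emptyset) = 0$,
• If $\{A_i\}_{i=1}^\infty$ is a collection of disjoint sets in $\mathcal{M}$, then
$$\mu\left(\bigcup_{i=1}^\infty A_i\right) = \sum_{i=1}^\infty \mu(A_i).$$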

Below, measure refers to a countably additive measure. Finite additivity fits intuition of how a function that measures the size of sets should behave, and countable additivity is the extension to countable collections of sets. Chapter 4 provides motivation for such an extension.

Example 5.2.1

On the σ-algebra $\mathcal{M} = \big\{\emptyset, \{a\}, \{b\}, \{a, b\}, \{c, d\}, \{a, c, d\}, \{b, c, d\}, \{a, b, c, d\}\big\}$ on $X = \{a, b, c, d\}$ defined in Example 5.1.9, define $\mu(\emptyset) = 0$, $\mu(\{a\}) = 1$, $\mu(\{b\}) = 1$, $\mu(\{a, b\}) = 2$, $\mu(\{c, d\}) = 1$, $\mu(\{a, c, d\}) = 2$, $\mu(\{b, c, d\}) = 2$, $\mu(\{a, b, c, d\}) = 3$.

It is easy to verify that µ is a measure.
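The verification in Example 5.2.1 can also be done mechanically. The sketch below, which assumes the values listed in the example and uses an illustrative helper `is_additive`, checks additivity over every disjoint pair; on a finite σ-algebra, pairwise additivity extends to finite unions by induction, and countable disjoint unions involve only finitely many nonempty sets.

```python
from itertools import combinations

# Values of mu from Example 5.2.1, keyed by the sets of the sigma-algebra.
mu = {
    frozenset(): 0,
    frozenset("a"): 1,
    frozenset("b"): 1,
    frozenset("ab"): 2,
    frozenset("cd"): 1,
    frozenset("acd"): 2,
    frozenset("bcd"): 2,
    frozenset("abcd"): 3,
}

def is_additive(mu):
    """Check mu(A | B) == mu(A) + mu(B) for every disjoint pair A, B in the domain."""
    for a, b in combinations(mu, 2):
        if a & b:
            continue  # the pair is not disjoint, so additivity imposes nothing
        if (a | b) not in mu or mu[a | b] != mu[a] + mu[b]:
            return False
    return True

print(is_additive(mu))
```

Changing any single value, for instance setting the value on {a, b} to 5, makes the check fail, which illustrates how rigid the additivity requirement is.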

The next example is sufficiently interesting to be described as a theorem. It shows how to turn an ordinary function on a countable set into a measure. As the basis of a measure space, a countable set is relatively easy to deal with since we can use the power set as a σ-algebra.

Theorem 5.2.1

Let $X$ be countable, $\mathcal{M} = \mathcal{P}_X$, and $f : X \to [0, \infty]$ any function. Define
$$\mu(A) = \sum_{x \in A} f(x), \quad A \in \mathcal{M}.$$

Then, µ is a measure on (X, M ).

Note that we know that X is equivalent to N, i.e., X ∼ N, and we can write f(i) = ai, i = 0, 1, 2, ··· . Hence, Theorem 5.2.1 defines a measure on the space defined by considering subsets of the elements of a given sequence.


Proof. This is a good exercise. Countability of X and the nonnegativity of f are key.
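To see the construction of Theorem 5.2.1 in action on finite subsets, here is a minimal sketch. The helper `make_measure` and the two weight functions are hypothetical choices for illustration; only finite subsets are handled, so the sketch says nothing about infinite sums.

```python
import math

def make_measure(f):
    """Return mu with mu(A) = sum of f(x) over x in A, for finite subsets A."""
    def mu(subset):
        return math.fsum(f(x) for x in subset)
    return mu

# Hypothetical weights on the points 0, 1, 2, ...
geometric = make_measure(lambda i: 2.0 ** -(i + 1))
unit = make_measure(lambda x: 1.0)  # every point has weight 1

evens = {0, 2, 4}
odds = {1, 3, 5}

# Additivity on a disjoint pair: the two printed values agree.
print(geometric(evens) + geometric(odds), geometric(evens | odds))
# With unit weights the measure of a finite set is its number of points.
print(unit(evens | odds))
```

Nonnegativity of the weights is what makes the order of summation irrelevant, which is the point emphasized in the proof sketch above.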

Definition 5.2.2: Counting measure

If f(x) = 1 for all x ∈ X in Theorem 5.2.1, then µ is called the counting measure. If there is a point x0 ∈ X with f(x0) = 1 and f(x) = 0 for x ≠ x0, then µ is called the point mass at x0.

Recall that Example 5.1.3 is an example of how a variety of set properties can be used to define a σ-algebra. This is reflected in choices of measure.

Example 5.2.2

Let $X$ be uncountable and let $\mathcal{M}$ be the σ-algebra defined by,
$$\mathcal{M} = \{A \subset X : A \text{ is countable or } A^c \text{ is countable}\}.$$
It is a good exercise to show that the set function,
$$\mu(A) = \begin{cases} 0, & A \text{ is countable}, \\ 1, & A^c \text{ is countable}, \end{cases}$$
is a measure.

Example 5.2.3

Let $X$ be an infinite set, $\mathcal{M} = \mathcal{P}_X$, and define
$$\mu(A) = \begin{cases} 0, & A \text{ finite}, \\ \infty, & A \text{ infinite}. \end{cases}$$

It is a good exercise to show that µ is a finitely additive measure but it is not countably additive.

Now, we have defined the three ingredients for measure theory.

Definition 5.2.3

If (X, M ) is a measurable space on which there is a measure µ, then the triple (X, M , µ) is called a measure space.

Remark 5.2.1

All of these terms are abused regularly. We might say µ is a measure on X, where M is taken to be the “natural” domain for µ. For example, there is a natural choice in a metric space. If we know M, then we also know X since X is the maximal element of M. So X may not be mentioned explicitly. We try to be careful in notation because we want to emphasize that using measure theory requires specification of a universe, a σ-algebra, and a measure.

Now let (X, M , µ) be a measure space where X is nonempty.

There is a significant difference between the cases when the space has finite measure and when it does not. In the latter situation, we distinguish two particular cases.

Definition 5.2.4

We say that $\mu$ is a finite measure if $\mu(X) < \infty$ and that $(X, \mathcal{M}, \mu)$ is a finite measure space. We say that $\mu$ is a probability measure if $\mu(X) = 1$ and that $(X, \mathcal{M}, \mu)$ is a probability space. We say that $\mu$ is a σ-finite measure if $\mathcal{M}$ contains an increasing sequence $A_1 \subset A_2 \subset A_3 \subset \cdots$ such that $\bigcup_{i=1}^\infty A_i = X$ and $\mu(A_i) < \infty$ for all $i$. In that case, we say that $(X, \mathcal{M}, \mu)$ is a σ-finite measure space.

A finite measure is trivially σ-finite.

Example 5.2.4

Consider $X = \mathbb{N}$ and let $\mu$ be the measure defined in Theorem 5.2.1.
1. If $f(i) = 2^{-(i+1)}$, the resulting measure is a probability measure.
2. If $f(i) = C^{-1}/(i+1)^2$, with $C = \sum_{i=0}^\infty 1/(i+1)^2$, then the resulting measure is a probability measure.
3. If $f(i) = 1/(i+1)$, the resulting measure is not finite, but it is σ-finite.
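The three cases of Example 5.2.4 can be explored numerically through the measures of the finite sets {0, 1, ..., n−1}. The sketch below assumes that the points of N are indexed from 0, so that the normalizing constant is $\sum_{i \ge 0} 1/(i+1)^2 = \pi^2/6$; the function `partial_mass` and the weight names are illustrative.

```python
import math

def partial_mass(f, n):
    """mu({0, 1, ..., n-1}) for the measure with point weights f."""
    return math.fsum(f(i) for i in range(n))

# Assumed normalization: sum over i >= 0 of 1/(i+1)^2 equals pi^2/6.
C = math.pi ** 2 / 6

def geometric(i):
    return 2.0 ** -(i + 1)

def inverse_square(i):
    return 1.0 / (C * (i + 1) ** 2)

def harmonic(i):
    return 1.0 / (i + 1)

for n in (10, 100, 1000):
    print(n, partial_mass(geometric, n), partial_mass(inverse_square, n), partial_mass(harmonic, n))
```

The first two columns approach 1, while the third grows without bound even though each finite set {0, ..., n−1} has finite measure, which is the σ-finite behavior described in the third case.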

The idea behind σ-finite follows naturally from considering that $\mathbb{R} = \bigcup_{i=1}^\infty (-i, i)$, which allows the extension of the Lebesgue measure on a finite interval developed informally in Chapter 4 to $\mathbb{R}$.

We assume that any measure being considered is σ- finite (which includes finite measures).

Dealing with non-σ-finite measures requires additional assumptions and work, and some important results do not hold. The following theorem gives a useful equivalence whose proof is a good exercise.

Theorem 5.2.2

$(X, \mathcal{M}, \mu)$ is σ-finite if and only if there is a countable disjoint collection of sets $\{A_i\}_{i=1}^\infty$ with finite measure such that $X = \bigcup_{i=1}^\infty A_i$.


Example 5.2.5

$$\mathbb{R} = \bigcup_{i=1}^{\infty} (-i, i) = \bigcup_{i=-\infty}^{\infty} (i, i+1].$$

Definition 5.2.5

For any measure space $(X, \mathcal{M}, \mu)$, any countable disjoint collection of sets with finite measure $\{A_i\}_{i=1}^\infty$ such that $X = \bigcup_{i=1}^\infty A_i$ is called a measurable decomposition of $X$.

We noted above that defining a measure is not the same as giving a practical recipe for computing the measure of sets. Even direct verification of the properties of a measure is difficult outside simple examples. A natural approach is to start with a “measure-like” function that satisfies the key properties on an algebra of relatively simple sets and then undertake some kind of limiting process to expand the domain and range of the “proto-measure”. We discuss this below.

Next, we explore consequences of the assumptions made about measures. In particular, we show that a measure, which is a function on sets, behaves continuously with respect to sequences of sets. It may be useful to review Section 2.4 before studying this theorem.

Theorem 5.2.3: Properties of Measure

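1. $\mu$ is finitely additive.
2. If $A, B \in \mathcal{M}$ and $A \subset B$ then $\mu(A) \le \mu(B)$.
3. If $A, B \in \mathcal{M}$ and $A \subset B$ where $\mu(A) < \infty$, then $\mu(B \setminus A) = \mu(B) - \mu(A)$.
4. If $\{A_i\}_{i=1}^\infty$ is a collection of sets in $\mathcal{M}$, then
$$\mu\left(\bigcup_{i=1}^\infty A_i\right) \le \sum_{i=1}^\infty \mu(A_i).$$
5. If $\{A_i\}_{i=1}^\infty$ is a monotone sequence of sets in $\mathcal{M}$ such that $A_1 \subset A_2 \subset A_3 \subset \cdots$, then
$$\mu\left(\bigcup_{i=1}^\infty A_i\right) = \lim_{i \to \infty} \mu(A_i).$$
6. If $\{A_i\}_{i=1}^\infty$ is a monotone sequence of sets in $\mathcal{M}$ such that $A_1 \supset A_2 \supset A_3 \supset \cdots$ and $\mu(A_m) < \infty$ for some $m \in \mathbb{Z}^+$, then
$$\mu\left(\bigcap_{i=1}^\infty A_i\right) = \lim_{i \to \infty} \mu(A_i).$$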

Example 5.2.6

Using the preliminary ideas behind the Lebesgue measure $\mu_L$ on $(0, 1]$, the sets $A_i = (0, 1 - 1/i)$ for $i \ge 2$ are measurable and $\mu_L(A_i) = 1 - 1/i$. We have $A_i \subset A_{i+1}$, $\lim_{i \to \infty} \mu_L(A_i) = 1$, and $\bigcup_{i=2}^\infty A_i = (0, 1)$.

Example 5.2.7

Using the preliminary ideas behind the Lebesgue measure $\mu_L$ on $(0, 1]$, the sets $A_i = (0, 1 + 1/i)$ for $i \ge 1$ are measurable and $\mu_L(A_i) = 1 + 1/i$. We have $A_i \supset A_{i+1}$, $\lim_{i \to \infty} \mu_L(A_i) = 1$, and $\bigcap_{i=1}^\infty A_i = (0, 1]$.

Example 5.2.8

It is a good exercise to verify that the properties of Theorem 5.2.3 hold directly for the measure defined in Theorem 5.2.1.
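A numerical spot check is no substitute for the exercise, but it can make the properties concrete. The sketch below assumes the hypothetical weights $f(i) = 2^{-(i+1)}$ on the points 0, 1, 2, ... and a few arbitrarily chosen finite sets; it checks monotonicity, the difference formula, subadditivity, and the increasing behavior behind continuity from below.

```python
import math

def mu(subset):
    """Measure of a finite set of nonnegative integers with weights 2**-(i+1)."""
    return math.fsum(2.0 ** -(i + 1) for i in subset)

A = {0, 1, 2}
B = {0, 1, 2, 3, 4}  # A is a subset of B
C = {2, 3, 7}        # overlaps A, so the inequality below is strict

monotone = mu(A) <= mu(B)
difference = math.isclose(mu(B - A), mu(B) - mu(A))
subadditive = mu(A | C) <= mu(A) + mu(C)

# Increasing sets A_n = {0, ..., n-1}: their measures increase toward 1.
increasing = [mu(range(n)) for n in (1, 5, 10, 50)]
print(monotone, difference, subadditive, increasing)
```

The measures of the increasing sets approach 1, which is the total mass of these weights, in line with Property 5.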

These properties are essential to computing measures of complicated sets.

Definition 5.2.6

Property 2 is called monotonicity and Property 4 is called subadditivity. Property 5 is called continuity from below and Property 6 is called continuity from above.

Proof. We prove in order.

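Result 1 Let $A_i = \emptyset$ for all $i$ larger than a given finite index.

Result 2 If $A \subset B$, then $\mu(B) = \mu(A) + \mu(B \cap A^c) \ge \mu(A)$. We use the fact that $A$ and $B \cap A^c$ are disjoint.

Result 3 Since $\mu(A) < \infty$ we can subtract $\mu(A)$ from $\mu(B) = \mu(A) + \mu(B \cap A^c)$ to obtain $\mu(B) - \mu(A) = \mu(B \cap A^c) = \mu(B \setminus A)$.

Result 4 Following Theorem 2.4.2, we construct a sequence of disjoint sets: $B_1 = A_1$ and $B_j = A_j \setminus \left(\bigcup_{i=1}^{j-1} A_i\right)$ for $j \ge 2$. Note that $\mu(B_j) \le \mu(A_j)$ for all $j$ and $\bigcup_{i=1}^\infty A_i = \bigcup_{i=1}^\infty B_i$. Hence,
$$\mu\left(\bigcup_{i=1}^\infty A_i\right) = \mu\left(\bigcup_{i=1}^\infty B_i\right) = \sum_{i=1}^\infty \mu(B_i) \le \sum_{i=1}^\infty \mu(A_i).$$

Result 5 First we note that if $\mu(A_i) = \infty$ for some $i$, then $\mu\left(\bigcup_{j=1}^\infty A_j\right) = \mu(A_i) + \mu\left(\bigcup_{j=1}^\infty A_j \setminus A_i\right) = \infty$. Moreover, $\mu(A_j) = \infty$ for $j \ge i$. So the result holds since $\infty = \infty$. So we assume that $\mu(A_i)$ is finite for all $i$. Set $A_0 = \emptyset$. Since $\bigcup_{i=1}^\infty A_i = \bigcup_{i=1}^\infty (A_i \setminus A_{i-1})$, where the latter union is disjoint,
$$\mu\left(\bigcup_{i=1}^\infty A_i\right) = \sum_{i=1}^\infty \mu(A_i \setminus A_{i-1}).$$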


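We write the series as a limit of partial sums and use properties of measure to write,
$$\sum_{i=1}^\infty \mu(A_i \setminus A_{i-1}) = \lim_{m \to \infty} \sum_{i=1}^m \mu(A_i \setminus A_{i-1}) = \lim_{m \to \infty} \sum_{i=1}^m \big(\mu(A_i) - \mu(A_{i-1})\big).$$
We use the fact that the partial sums “telescope” to conclude
$$\mu\left(\bigcup_{i=1}^\infty A_i\right) = \lim_{m \to \infty} \sum_{i=1}^m \big(\mu(A_i) - \mu(A_{i-1})\big) = \lim_{m \to \infty} \mu(A_m).$$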

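Result 6 Let $B_j = A_m \setminus A_j$ for $j > m$, and $B_j = \emptyset$ for $j \le m$. Then, $B_{m+1} \subset B_{m+2} \subset \cdots$, $\mu(A_m) = \mu(B_j) + \mu(A_j)$ for $j > m$, and
$$\bigcup_{j=m+1}^\infty B_j = A_m \setminus \left(\bigcap_{j=m}^\infty A_j\right).$$
By Result 5,
$$\mu(A_m) = \mu\left(\bigcap_{j=m}^\infty A_j\right) + \lim_{i \to \infty} \mu(B_i) = \mu\left(\bigcap_{j=m}^\infty A_j\right) + \lim_{i \to \infty} \big(\mu(A_m) - \mu(A_i)\big).$$
Since $\mu(A_m) < \infty$, we can subtract it from both sides. The result follows because $\bigcap_{j=m}^\infty A_j = \bigcap_{j=1}^\infty A_j$ for a decreasing sequence.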

The extra assumption in proving continuity from above is necessary. It is a good exercise to produce an example of a decreasing sequence $\{A_i\}_{i=1}^\infty$ with $\mu(A_i) = \infty$ for all $i$, but $\bigcap_{i=1}^\infty A_i = \emptyset$.

Remark 5.2.2

The proof of Result 2 shows that additive measures are monotone. The proof of Result 4 is a classic type of measure theory argument.

The continuity of a measure is an important property that leads to other useful properties. For example, the following result says continuity together with finite additivity gives countable additivity.

Theorem 5.2.4

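Let $\mu$ be a finitely additive measure on an algebra $\mathcal{A}$.
1. If for every increasing sequence of sets $\{A_i\}_{i=1}^\infty \subset \mathcal{A}$ with $A = \bigcup_{i=1}^\infty A_i \in \mathcal{A}$ (so $A_i \uparrow A$), $\mu(A_i) \to \mu(A)$, then $\mu$ is countably additive on $\mathcal{A}$.
2. If for every decreasing sequence of sets $\{A_i\}_{i=1}^\infty \subset \mathcal{A}$ with $A_i \downarrow \emptyset$, $\mu(A_i) \to 0$, then $\mu$ is countably additive on $\mathcal{A}$.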

Proof.

Result 1 Let $\{A_i\}_{i=1}^\infty \subset \mathcal{A}$ be a disjoint collection with $A = \bigcup_{i=1}^\infty A_i \in \mathcal{A}$. If $B_m = \bigcup_{i=1}^m A_i$, then $B_m \uparrow A$ and $\mu(B_m) \to \mu(A)$. But finite additivity implies $\mu(B_m) = \sum_{i=1}^m \mu(A_i) \to \sum_{i=1}^\infty \mu(A_i)$, so $\sum_{i=1}^\infty \mu(A_i) = \mu(A)$.

Result 2 Assume $\{A_i\}_{i=1}^\infty \subset \mathcal{A}$ is a disjoint collection with $A = \bigcup_{i=1}^\infty A_i \in \mathcal{A}$ and set $B_m = \bigcup_{i=1}^m A_i$. Since $\mu(A) = \mu(B_m) + \mu(A \setminus B_m)$ and $A \setminus B_m \downarrow \emptyset$, $\mu(B_m) \to \mu(A)$. Now we apply the proof for Result 1.

The next theorem shows that the assumption of σ-finiteness implies that the space cannot have “too much content”.

Theorem 5.2.5

If $(X, \mathcal{M}, \mu)$ is σ-finite, then $\mathcal{M}$ cannot contain an uncountable, disjoint collection of sets of positive measure.

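Proof. Let $\mathcal{E} = \{A_\alpha\}_{\alpha \in \mathcal{A}}$ be a disjoint collection of subsets in $\mathcal{M}$ such that for each $\alpha \in \mathcal{A}$, $\mu(A_\alpha) > 0$. We show that $\mathcal{E}$ is countable. There is a sequence of sets $\{B_i\}_{i=1}^\infty$ such that $B_i \uparrow X$ and $\mu(B_i) < \infty$ for each $i$. For any $A \in \mathcal{E}$, $A = \bigcup_{i=1}^\infty (A \cap B_i)$. We use $\{B_i\}_{i=1}^\infty$ and a countable set of lower bounds to create a countable cover of $\mathcal{E}$ by sub-collections of sets where the measures of sets in a given sub-collection are bounded away from zero by one of the lower bounds. For positive integers $i, j$, define
$$\mathcal{E}_{i,j} = \left\{A \in \mathcal{E} : \mu(A \cap B_i) > \frac{1}{j}\right\}.$$

For all $i, j$, $\mathcal{E}_{i,j} \subset \mathcal{E}$ and for any $A \in \mathcal{E}$ there are $i, j$ such that $A \in \mathcal{E}_{i,j}$, since $\mu(A \cap B_i) \to \mu(A) > 0$ by continuity from below. Hence,
$$\mathcal{E} = \bigcup_{i,j=1}^\infty \mathcal{E}_{i,j}.$$

The result follows if we show that $\mathcal{E}_{i,j}$ is finite for all indices, since a countable union of finite collections is countable. Consider a finite sequence of distinct sets $\{C_k\}_{k=1}^m$ in a given $\mathcal{E}_{i,j}$. Since these sets are disjoint,
$$\frac{m}{j} \le \sum_{k=1}^m \mu(C_k \cap B_i) = \mu\left(\bigcup_{k=1}^m (C_k \cap B_i)\right) \le \mu(B_i).$$

Thus, $m \le j\,\mu(B_i)$, i.e., $m$ is bounded by a constant that depends only on $i$ and $j$, and therefore $\mathcal{E}_{i,j}$ is finite.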

Remark 5.2.3

This proof is a typical σ- finite argument.

5.3 Sets of measure zero, completion of measure

The discussion in Section 4.3 hints at the importance of dealing with sets of measure zero. There is a technical issue about such sets that we settle in this section.

Let (X, M , µ) be a measure space where X is nonempty.


Definition 5.3.1: Sets of measure zero

A set E ∈ M with µ (E) = 0 is called a set of measure zero. If a statement about points x ∈ X is true except for x in a set of measure zero, we say that the statement is true almost everywhere (a.e.).

It is necessary to reconcile this definition with Definition 4.3.4 introduced in Chapter 4 for Lebesgue measure. In that chapter, we built up the idea of measure based on measuring the size of intervals. Now, we are dealing with an abstract construction of measure.

Note that if we prove there is a Lebesgue measure $\mu_L$ and if $A$ is a Lebesgue measurable set that is covered by a countable set of intervals $\{I_i\}_{i=1}^\infty$, then Theorem 5.2.3 implies
$$\mu_L(A) \le \mu_L\left(\bigcup_{i=1}^\infty I_i\right) \le \sum_{i=1}^\infty \mu_L(I_i).$$
Now, if we can make the sum on the right smaller than any given $\delta > 0$ by a suitable choice of $\{I_i\}_{i=1}^\infty$, then $\mu_L(A) = 0$. So, Definition 4.3.4 implies that Definition 5.3.1 holds. The reverse implication requires some proof, and we discuss that later.

First, recall this claim from Chapter 4. Its proof is a good exercise.

Theorem 5.3.1

• A finite or countable union of sets of measure zero in M has measure zero.
• If A is a measurable set of measure zero in M then any measurable subset of A has measure zero.

However, a subset of a measurable set of measure zero is not necessarily measurable!

Example 5.3.1

Consider the σ- algebra {X, ∅} and the measure µ that is zero on this σ- algebra. Then µ is not defined on any proper, non-empty subset of X.

The fact that a set of measure 0 can contain a nonmeasurable set is not really surprising. After all, we have seen that sets of measure 0 can be very complicated, e.g., the Cantor Set (Definition 4.3.6) and the non-Normal numbers $N^c$. We resolve this annoying point by adding all those subsets of sets in a σ-algebra that have measure 0 to the σ-algebra.

Definition 5.3.2

If M contains all subsets of sets in M with measure 0, then (X, M, µ) is complete.

Being complete eliminates some annoying issues and it can always be obtained by enlarging the domain of a given measure to obtain an equivalent measure in the following sense:


Theorem 5.3.2: Completion of a measure

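Let $\mathcal{N} = \{N \in \mathcal{M} : \mu(N) = 0\}$. Define,
$$\overline{\mathcal{M}} = \{A \cup B : A \in \mathcal{M} \text{ and } B \subset N \text{ for some } N \in \mathcal{N}\}.$$
Then, $\overline{\mathcal{M}}$ is a σ-algebra on $X$ that contains $\mathcal{M}$. Moreover, the unique measure $\overline{\mu}$ on $\overline{\mathcal{M}}$ defined by $\overline{\mu}(A \cup B) = \mu(A)$ for all $A \in \mathcal{M}$ and $B \subset N$ for some $N \in \mathcal{N}$ makes $(X, \overline{\mathcal{M}}, \overline{\mu})$ a complete measure space.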

We literally add all the subsets of measurable sets of measure zero to the σ- algebra and define their measure to be 0.

Definition 5.3.3

$\overline{\mu}$ is an example of an extension of $\mu$ from $\mathcal{M}$ to $\overline{\mathcal{M}}$. This particular extension is called the completion of $\mu$ and $\overline{\mathcal{M}}$ is the completion of $\mathcal{M}$ with respect to $\mu$.

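Proof. Clearly $\mathcal{M} \subset \overline{\mathcal{M}}$. We show that $\overline{\mathcal{M}}$ is a σ-algebra. Let $C \in \overline{\mathcal{M}}$, so $C = A \cup B$ with $A \in \mathcal{M}$ and $B \subset N$ for some $N \in \mathcal{N}$. Since $B \subset N$, $N^c \subset B^c$ and $B^c$ is equal to the disjoint union $B^c = N^c \cup (N \cap B^c)$. This implies

$$C^c = A^c \cap B^c = (A^c \cap N^c) \cup (A^c \cap N \cap B^c).$$

Now, $(A^c \cap N^c) \in \mathcal{M}$ since both $A, N \in \mathcal{M}$. Also, $(A^c \cap N \cap B^c) \subset N$. Hence, $C^c \in \overline{\mathcal{M}}$.

Let $\{C_i\}_{i=1}^{\infty}$ be a sequence of sets in $\overline{\mathcal{M}}$. For each i, $C_i = A_i \cup B_i$ with $A_i \in \mathcal{M}$ and $B_i \subset N_i$ for some $N_i \in \mathcal{N}$. Thus,

$$\bigcup_{i=1}^{\infty} C_i = \Big(\bigcup_{i=1}^{\infty} A_i\Big) \cup \Big(\bigcup_{i=1}^{\infty} B_i\Big),$$

where $\bigcup_{i=1}^{\infty} A_i \in \mathcal{M}$ and $\bigcup_{i=1}^{\infty} B_i \subset \bigcup_{i=1}^{\infty} N_i$. From Theorem 5.3.1, $\bigcup_{i=1}^{\infty} N_i \in \mathcal{N}$ and hence $\bigcup_{i=1}^{\infty} C_i \in \overline{\mathcal{M}}$. Thus $\overline{\mathcal{M}}$ is a σ-algebra.

We next verify that $\overline{\mu}$ is a well defined function. If $C \in \overline{\mathcal{M}}$ can be written as $C = A_1 \cup B_1 = A_2 \cup B_2$ with $A_i \in \mathcal{M}$ and $B_i \subset N_i$ for some $N_i \in \mathcal{N}$, then we want to show that $\mu(A_1) = \mu(A_2)$. To see this, note that $A_1 \subset A_1 \cup B_1 = A_2 \cup B_2 \subset A_2 \cup N_2 \in \mathcal{M}$. Thus $\mu(A_1) \le \mu(A_2) + \mu(N_2) = \mu(A_2)$. Similarly, $\mu(A_2) \le \mu(A_1)$.

To show that $(X, \overline{\mathcal{M}}, \overline{\mu})$ is complete, we show that if $C \in \overline{\mathcal{M}}$ satisfies $\overline{\mu}(C) = 0$, then $D \in \overline{\mathcal{M}}$ for any $D \subset C$. With $C = A \cup B$ with $A \in \mathcal{M}$ and $B \subset N$ for some $N \in \mathcal{N}$, we have $\mu(A) = \overline{\mu}(C) = 0$. This implies $A \in \mathcal{N}$, and so $A \cup N \in \mathcal{N}$ by Theorem 5.3.1. But $D = \emptyset \cup D$, where $\emptyset \in \mathcal{M}$ and $D \subset A \cup B \subset A \cup N \in \mathcal{N}$. Thus $D \in \overline{\mathcal{M}}$.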
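For a finite X, the construction in Theorem 5.3.2 can be carried out by direct enumeration. The following Python sketch is only an illustration: the three-point space, its σ-algebra, the measure values, and the helper names `subsets` and `completion` are all assumptions made for this example and do not come from the text.

```python
from itertools import combinations

def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def completion(sigma_algebra, mu):
    """Enumerate the sets A | B with A measurable and B inside a null set."""
    null_sets = [N for N in sigma_algebra if mu[N] == 0]
    completed = {}
    for A in sigma_algebra:
        for N in null_sets:
            for B in subsets(N):
                C = A | B
                # The value must not depend on how C is written as A | B.
                assert completed.setdefault(C, mu[A]) == mu[A]
    return completed

# Hypothetical measure space: X = {a, b, c}, where {b, c} is a null set.
empty, A1, N1, X = frozenset(), frozenset("a"), frozenset("bc"), frozenset("abc")
mu = {empty: 0, A1: 1, N1: 0, X: 1}
mu_bar = completion(list(mu), mu)

for C in sorted(mu_bar, key=lambda s: (len(s), sorted(s))):
    print(sorted(C), mu_bar[C])
```

In this toy case the completed σ-algebra contains all eight subsets of X, and the internal assertion checks the well-definedness step of the proof on every set that has more than one representation.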

Example 5.3.2

Returning to Example 5.3.1, to complete the measure space we add all subsets of X to the σ-algebra.


Thus, $\overline{\mathcal{M}} = \mathcal{P}_X$.

Remark 5.3.1

Unfortunately, unlike the other parts of measure structure, completion does not interact nicely with maps between measurable spaces. For this reason, it is sometimes inconvenient to work with the completion of a given measure. An important example of this situation is the Borel measure on $\mathbb{R}^n$ and its completion, Lebesgue measure (the rigorous formulation of the ideas behind Lebesgue measure in (0, 1] in Chapter 4). Most of the time, completeness is not an issue, but some key results do require a complete measure. The reader should pay attention to whether or not the measure being considered at any given point in a textbook is assumed to be complete.

5.4 Outer measures

We have developed the basic properties of a measure on a σ-algebra, but we have presented only illustrative examples. In this section, we develop a systematic method for constructing a measure based on specifying the values of the measure on a class of “simple” sets. This is the same idea that drove the intuitive development of measure in Chapter 4, which was based on defining the measure of an interval $I = (a, b]$ to be $\mu(I) = b - a$ and, for a finite union of disjoint intervals $\{I_i\}_{i=1}^{m}$,

$$\mu\Big(\bigcup_{i=1}^{m} I_i\Big) = \sum_{i=1}^{m} \mu(I_i).$$

That is actually sufficient for the probability computations in Chapter 4. But, the discussion in Chapter 4 also hints that even in one dimension we are likely to want to consider sets that are more complicated than just finite collections of disjoint intervals, e.g. recall the Cantor set and the set of non-normal numbers. Working in higher dimensions introduces the possibility of further complications in boundaries and interior structure of sets. The technical difficulties involved in construction of measure, as well as computing its values, arise mainly from these complications.

There are various ways to approach construction of measures that appear to be quite different, though they all end up with the same result. If the reader is looking at other books, they will encounter different approaches. The technical challenges involved with measure mean that any approach has aspects that are not very intuitive. We use the “outer measure” approach developed by Carathéodory based on Lebesgue’s development.

Let X be a nonempty set. Recall that even though we are not assuming a measurable space, $\mathcal{P}_X$ is a ready-made σ-algebra.

We first briefly describe an early approach to defining measure due to Peano and Jordan that developed the ancient idea of approximating complex geometric shapes with simple shapes. This approach is intuitive and also reveals some of the technical issues that led to the more complicated development of Lebesgue.

Definition 5.4.1

A Jordan elementary set in $\mathbb{R}^n$ is a finite disjoint union of n-dimensional cubes. We define the measure $\mu(R)$ of an n-dimensional cube R to be its Euclidean volume and the measure of a finite disjoint collection of n-dimensional cubes $\{R_i\}_{i=1}^{m}$ to be

$$\mu\Big(\bigcup_{i=1}^{m} R_i\Big) = \sum_{i=1}^{m} \mu(R_i).$$

A set $A \subset \mathbb{R}^n$ is Jordan measurable if for any $\delta > 0$, there are elementary sets $A_\delta$ and $A^\delta$ with $A_\delta \subset A \subset A^\delta$ and $\mu(A^\delta \setminus A_\delta) < \delta$.

Figure 5.1. Computation of the Jordan measure of an isosceles right triangle.

This uses the fact that the set difference of two Jordan elementary sets is another Jordan elementary set. The idea of this definition is that if we take a sequence of δ → 0, we obtain a sequence of elementary sets $\{A_\delta, A^\delta\}$ such that the Jordan measures of $A_\delta$ and $A^\delta$ converge to the same number. We can think of $\mu(A_\delta)$ and $\mu(A^\delta)$ as the “inner” and “outer” content of A. We define the common limit to be the Jordan measure of A. This is a good idea and it can be used to improve the Riemann integral in particular.

Example 5.4.1

Consider the isosceles right triangle spanning (0, 0), (1, 0), and (0, 1) shown in Figure 5.1. We create a partition of the unit square consisting of nonoverlapping squares with sides of length h = 1/m for integer m ≥ 1. The measure of $A_\delta$ is

$$(m-1)\cdot h^2 + (m-2)\cdot h^2 + \cdots + 0\cdot h^2 = \frac{1}{2}\,\frac{m-1}{m}.$$

The measure of $A^\delta$ is

$$m\cdot h^2 + (m-1)\cdot h^2 + \cdots + 1\cdot h^2 = \frac{1}{2}\,\frac{m+1}{m}.$$

Thus, $\mu(A^\delta \setminus A_\delta) = m h^2 = \frac{1}{m} \to 0$ (illustrated in Figure 5.1 as the diagonal of lighter shaded squares) as m → ∞. The Jordan measure of the triangle is 1/2.
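The counts in Example 5.4.1 can be reproduced by enumerating the grid squares. In the sketch below, the function name `jordan_triangle` and the indexing of squares are choices made here rather than notation from the text: square (i, j) is [ih, (i+1)h] × [jh, (j+1)h], it lies inside the triangle when its upper right corner satisfies x + y ≤ 1, and it overlaps the triangle in a set of positive area when its lower left corner satisfies x + y < 1.

```python
from fractions import Fraction

def jordan_triangle(m):
    """Inner and outer elementary-set measures for the triangle x, y >= 0, x + y <= 1."""
    h = Fraction(1, m)
    # Squares contained in the triangle: upper right corner on or below x + y = 1.
    inner = sum(1 for i in range(m) for j in range(m) if i + j + 2 <= m)
    # Squares meeting the triangle in positive area: lower left corner below x + y = 1.
    outer = sum(1 for i in range(m) for j in range(m) if i + j < m)
    return inner * h * h, outer * h * h

for m in (1, 2, 4, 10, 100):
    lo, hi = jordan_triangle(m)
    print(m, lo, hi, hi - lo)
```

Exact rational arithmetic is used so that the printed values can be compared directly with the closed forms (m − 1)/(2m), (m + 1)/(2m), and 1/m.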

But the Jordan approach has the problem that the collection of Jordan measurable sets is too restrictive.

Example 5.4.2

It is a good exercise to show that the set of rational numbers $\mathbb{Q}$ in the unit interval I is not Jordan measurable.

Lebesgue’s development aimed at preserving the properties of the Jordan measure, though extended to countable disjoint collections of Jordan measurable sets. But, it altered the Jordan approach to set approximation. It defined an “outer measure” $\mu^*(A)$ of any set $A \subset \mathbb{R}^n$ to be the infimum of the measures of countable covers of A consisting of Jordan elementary sets. Lebesgue’s definition of measurability can be expressed as saying that for any δ > 0, there is a Jordan elementary set $A_\delta$ such that $\mu^*(A \,\triangle\, A_\delta) < \delta$. A notion of “inner measure” can also be defined using the outer measure of the complement of A. Since no approximation of a set by interior elementary sets is involved, this approach greatly increases the size of the collection of measurable sets. Indeed, the construction of non-measurable sets becomes quite difficult and depends on set-theoretic axioms such as the axiom of choice.

We present Carathéodory’s generalization of Lebesgue’s approach because it provides a way to construct measures on a wide variety of spaces. This has similarities to the way we originally defined a set of measure zero in terms of countable covers, Def. 4.3.4.

Definition 5.4.2: Outer measure

An outer measure on X is a set function $\mu^* : \mathcal{P}_X \to [0, \infty]$ such that:

1. $\mu^*(\emptyset) = 0$;
2. For any sets $A \subset B$, $\mu^*(A) \le \mu^*(B)$; (Monotonicity)
3. For any collection $\{A_i\}_{i=1}^{\infty}$ of sets in X, $\mu^*\Big(\bigcup_{i=1}^{\infty} A_i\Big) \le \sum_{i=1}^{\infty} \mu^*(A_i)$. (Subadditivity)

Note that an outer measure is defined on all of $\mathcal{P}_X$, which is the largest σ-algebra on X. So, provided we prove that outer measures exist, they place no restrictions on the subsets. But, this is also a disadvantage in the sense that computationally, we prefer to start with a function defined on some smaller collection of sets and build up the values on more complicated sets as opposed to having to specify values for every subset of X.

Example 5.4.3

Set $X = \{a_1, a_2, a_3\}$ and define $\mu^*$ by

$$\begin{aligned}
&\mu^*(\emptyset) = 0, \qquad \mu^*(\{a_i\}) = 1, \quad i = 1, 2, 3,\\
&\mu^*(\{a_1, a_2\}) = \tfrac{3}{2}, \qquad \mu^*(\{a_1, a_3\}) = \mu^*(\{a_2, a_3\}) = 2,\\
&\mu^*(\{a_1, a_2, a_3\}) = \tfrac{5}{2}.
\end{aligned}$$

Then, we can systematically check Definition 5.4.2 to see that $\mu^*$ is an outer measure, e.g. $\mu^*(\{a_1, a_2\}) = \tfrac{3}{2} \le \mu^*(\{a_1\}) + \mu^*(\{a_2\}) = 2$, and so on.

However, if we define $\mu^*(\{a_1, a_2, a_3\}) = 3$, then $\mu^*$ is not an outer measure since $\mu^*(\{a_1, a_2, a_3\}) = 3 > \mu^*(\{a_1, a_2\}) + \mu^*(\{a_3\}) = \tfrac{5}{2}$.
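The systematic check of Definition 5.4.2 in Example 5.4.3 can be automated on a finite set. The sketch below is one way to do it, with the points 1, 2, 3 standing in for a1, a2, a3; the helper names `powerset`, `is_outer_measure`, and `example_5_4_3` are introduced here for illustration. It relies on the observation that on a finite set, monotonicity together with subadditivity for pairs of sets already gives subadditivity for countable collections, since only finitely many distinct sets can occur and repeated sets only increase the right-hand side.

```python
from fractions import Fraction
from itertools import combinations

def powerset(points):
    """All subsets of a finite set, as frozensets."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def is_outer_measure(mu, points):
    """Check the three conditions of Definition 5.4.2 on a finite set."""
    sets = powerset(points)
    if mu[frozenset()] != 0:
        return False
    for A in sets:
        for B in sets:
            if A <= B and mu[A] > mu[B]:
                return False      # monotonicity fails
            if mu[A | B] > mu[A] + mu[B]:
                return False      # subadditivity fails
    return True

def example_5_4_3(total):
    """Values from Example 5.4.3, with the value on the whole set left as a parameter."""
    mu = {frozenset(): Fraction(0)}
    for i in (1, 2, 3):
        mu[frozenset({i})] = Fraction(1)
    mu[frozenset({1, 2})] = Fraction(3, 2)
    mu[frozenset({1, 3})] = Fraction(2)
    mu[frozenset({2, 3})] = Fraction(2)
    mu[frozenset({1, 2, 3})] = Fraction(total)
    return mu

print(is_outer_measure(example_5_4_3(Fraction(5, 2)), {1, 2, 3}))  # True
print(is_outer_measure(example_5_4_3(3), {1, 2, 3}))               # False
```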

Example 5.4.4


Set $X = \mathbb{N}$ and define

$$\sigma(A) = \begin{cases} e^{|A|}, & |A| < \infty,\\ \infty, & |A| = \infty, \end{cases} \qquad A \subset X.$$

σ is monotone since $e^x$ is a monotone increasing function of x ≥ 0. However, if we check subadditivity for A = {1, 2, ··· , m} using the singleton sets, we find that

$$\sigma(A) = e^m > \sum_{i=1}^{m} \sigma(\{i\}) = m e^1$$

for any m ≥ 2, so σ is not an outer measure. The issue is that σ assigns too much “size” to the collection A compared to the sizes of the individual components. Note that once we find one violation, we do not need to check the other possibilities for partitioning A into subsets.
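The comparison in Example 5.4.4 is easy to tabulate. The short sketch below (the function name `sigma` is a label chosen here, and it takes the number of elements rather than the set itself) prints both sides of the subadditivity test for the first few values of m.

```python
import math

def sigma(n):
    """Value of sigma on a finite set with n elements, as in Example 5.4.4."""
    return math.exp(n)

for m in range(1, 6):
    lhs = sigma(m)        # sigma({1, ..., m})
    rhs = m * sigma(1)    # sum of sigma over the m singletons
    print(m, round(lhs, 3), round(rhs, 3), lhs > rhs)
```

The last column reports whether subadditivity is violated for that m; the gap between the two sides grows rapidly with m.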

Example 5.4.5

Instead of the exponential function in Example 5.4.4, we consider a monomial. Set $X = \mathbb{N}$ and for fixed p > 0 define

$$\sigma(A) = \begin{cases} |A|^p, & |A| < \infty,\\ \infty, & |A| = \infty, \end{cases} \qquad A \subset X.$$

For any p, $\sigma : \mathcal{P}_X \to [0, \infty]$ and σ(∅) = 0. Moreover, σ is monotone for any p since $x^p$ is a monotone increasing function of x ≥ 0 for any p > 0. Thus, we have to check subadditivity. Let $\{A_i\}_{i=1}^{\infty}$ be a collection of sets in X and set $A = \bigcup_{i=1}^{\infty} A_i$. There are two cases to treat, |A| finite and |A| infinite. In the finite case, we can assume A = {1, 2, ··· , m} without loss of generality, since |A| does not depend on the values of the members of A.

If p > 1, we check subadditivity in the finite case using the singleton sets to find that $\sigma(A) = m^p > \sum_{i=1}^{m} \sigma(\{i\}) = m \cdot 1^p = m$ for m > 1. Thus, subadditivity fails if p > 1, and σ is not an outer measure.

We next consider p ≤ 1. If any of the $|A_i|$ is infinite, then $\sigma\big(\bigcup_{i=1}^{\infty} A_i\big) = \infty = \sum_{i=1}^{\infty} \sigma(A_i)$. If |A| is infinite while every $|A_i|$ is finite, then infinitely many of the $A_i$ are nonempty and each of these contributes at least 1 to the sum, so $\sum_{i=1}^{\infty} \sigma(A_i) = \infty$, hence subadditivity also holds. Hence, we assume that m = |A| and $\sum_{i=1}^{\infty} \sigma(A_i)$ are finite. By Theorem 2.4.2, we may assume that $\{A_i\}_{i=1}^{\infty}$ are disjoint and therefore there are a finite number of nonempty sets in the collection. Thus, we have to check subadditivity for all possible partitions of A into a finite number of disjoint subsets. To do this, we use the inequality

$$m^p \le x^p + (m - x)^p, \qquad 0 \le x \le m, \quad 0 < p \le 1. \tag{5.1}$$

The inequality is obvious for p = 1, and for p < 1 we define $f(x) = x^p + (m - x)^p - m^p$ and find the extrema of f to prove the result (f is concave on [0, m] and f(0) = f(m) = 0). The inequality implies that subadditivity holds for any partition of A into two disjoint subsets. Using induction, we can prove it holds for any finite partition, so σ is subadditive and an outer measure if p ≤ 1.

The following theorem presents a systematic way to construct an outer measure.

Theorem 5.4.1

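Let $\mathcal{E}$ be a non-empty family of subsets of X with $\emptyset, X \in \mathcal{E}$ such that there is a set function $f : \mathcal{E} \to [0, \infty]$ satisfying f(∅) = 0. For any subset $A \in \mathcal{P}_X$, define

$$\mu^*(A) = \mu_f^*(A) = \inf\Big\{ \sum_{i=1}^{\infty} f(A_i) : \{A_i\} \subset \mathcal{E} \text{ and } A \subset \bigcup_{i=1}^{\infty} A_i \Big\}. \tag{5.2}$$

Then, $\mu_f^*$ is an outer measure.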

Theorem 5.4.1 suggests there are many ways to construct an outer measure corresponding to many possible choices of f. But, it is particularly meaningful when the function f is related to our idea about “area”. We illustrate in Figure 5.2.

Figure 5.2. Illustration of computing the infimum for the outer measure for a set in $\mathbb{R}^2$ in the case that the set function f gives the area of a square. We consider the family of squares and compute the infimum by “refining” the squares that cover the given set.

Definition 5.4.3

We say that the outer measure $\mu_f^*$ in Theorem 5.4.1 is induced by f. We use the subscript f only when it is important to indicate the function f.

Proof. We begin by showing the infimum exists. The collection of countable covers in (5.2) is not empty since $X \in \mathcal{E}$ and X covers A. We are computing an infimum over nonnegative extended real numbers that are bounded below by 0, so the infimum exists. Next, $\mu^*(\emptyset) = 0$ because ∅ is covered by a countable collection of copies of $\emptyset \in \mathcal{E}$ and f(∅) = 0.

Next, we verify monotonicity. If A ⊂ B, then $\mu^*(A) \le \mu^*(B)$ since the collection of countable covers over which the infimum is computed for A includes the collection of countable covers of B.

Let $\{A_i\}_{i=1}^{\infty}$ be a collection of sets in X. Given ε > 0, for each i there is a collection of sets $\{B_{i,j}\}_{j=1}^{\infty} \subset \mathcal{E}$ such that $A_i \subset \bigcup_{j=1}^{\infty} B_{i,j}$ and

$$\sum_{j=1}^{\infty} f(B_{i,j}) \le \mu^*(A_i) + \epsilon\, 2^{-i}.$$

If $A = \bigcup_{i=1}^{\infty} A_i$, then $A \subset \bigcup_{i,j=1}^{\infty} B_{i,j}$, and

$$\mu^*(A) \le \sum_{i,j=1}^{\infty} f(B_{i,j}) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} f(B_{i,j}) \le \sum_{i=1}^{\infty} \big( \mu^*(A_i) + 2^{-i} \epsilon \big) \le \sum_{i=1}^{\infty} \mu^*(A_i) + \epsilon.$$

Since ε is arbitrary, the result follows.

Remark 5.4.1

The trick of specifying the accuracy $\epsilon\, 2^{-i}$ for the countable cover of each successive set $A_i$, and analogs of it, is a standard argument in measure theory.

Example 5.4.6

The measure µ defined on the σ-algebra $\mathcal{M}$ on X = {a, b, c, d} in Example 5.2.1 is defined on a proper subset of $\mathcal{P}_X$. We treat this measure as a set function and compute the corresponding outer measure. Using the definition, we compute

$$\begin{aligned}
&\mu^*(\{c\}) = \mu^*(\{d\}) = 1,\\
&\mu^*(\{a, c\}) = \mu^*(\{a, d\}) = \mu^*(\{b, c\}) = \mu^*(\{b, d\}) = 2,\\
&\mu^*(\{a, b, c\}) = \mu^*(\{a, b, d\}) = 3,\\
&\mu^*(\{a, c, d\}) = \mu^*(\{b, c, d\}) = 2,
\end{aligned}$$

and of course µ and $\mu^*$ agree on $\mathcal{M}$.

Note that the outer measure $\mu_f^*$ induced by a set function f is not the same as f in general. After all, the outer measure is defined on all subsets of X, not just the family $\mathcal{E}$ on which f is defined.

Example 5.4.7

In Example 5.4.5, σ is not an outer measure when p > 1. If we define an outer measure µ∗ from σ using (5.2), we find that µ∗({1, 2, ··· , m}) = m. Thus, the outer measure obtained from σ for p > 1 reduces to σ with p = 1.
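When the family on which f is defined is finite, the infimum in (5.2) can be computed by brute force: because f is nonnegative, repeating a set in a cover never lowers the sum, so it suffices to minimize over finite subfamilies without repetition. The sketch below applies this to a truncation of Example 5.4.7. The truncation to X = {1, 2, 3} (rather than the natural numbers), the choice p = 2, and the helper names `powerset` and `induced_outer_measure` are assumptions made here for illustration.

```python
from itertools import combinations

def powerset(points):
    """All subsets of a finite set, as frozensets."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def induced_outer_measure(A, family, f):
    """Minimum of sum f(E) over subfamilies of `family` that cover A, as in (5.2)."""
    A = frozenset(A)
    best = float("inf")
    for r in range(1, len(family) + 1):
        for cover in combinations(family, r):
            if A <= frozenset().union(*cover):
                best = min(best, sum(f(E) for E in cover))
    return best

X = {1, 2, 3}
family = powerset(X)      # f is defined on every subset of the truncated X
p = 2

def f(E):
    """sigma from Example 5.4.5 with p = 2, restricted to finite sets."""
    return len(E) ** p

for A in family:
    print(sorted(A), f(A), induced_outer_measure(A, family, f))
```

On this small space the cheapest cover of a set uses its singletons, so the induced value of a set is its cardinality, in line with the statement of Example 5.4.7.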

On the other hand, Theorem 5.4.1 implies

Theorem 5.4.2

Any measure on X whose associated σ-algebra is $\mathcal{P}_X$ is an outer measure on X.

Example 5.4.8

It is a good exercise to show that if $X = \mathbb{R}$ and f is defined as f([a, b]) = b − a for $a, b \in \mathbb{R}$ with a ≤ b and f(∅) = 0, then the corresponding outer measure satisfies $\mu_f^*([a, b]) = b - a$ for any such a, b.

As noted, outer measures have the advantage that they are defined on all of $\mathcal{P}_X$. However, a disadvantage of outer measures is that there is no assumption of countable additivity. The next step is to identify a subset of $\mathcal{P}_X$ on which an outer measure satisfies additional properties that make it into a measure. There are several ways to do this. We use Carathéodory’s approach.


Definition 5.4.4: Carathéodory’s Condition

Given an outer measure $\mu^*$ on X, a set A is $\mu^*$-measurable or outer measurable if

$$\mu^*(E) = \mu^*(E \cap A) + \mu^*(E \cap A^c), \tag{5.3}$$

for all E ⊂ X.

We usually say outer measurable instead of $\mu^*$-measurable if $\mu^*$ is clear from the context. Note that the sets E used to test (5.3) are not restricted to $\mu^*$-measurable sets. Below, we use E to denote the testing sets in order to make things a little more readable. There is no other significance to the choice of label.

Remark 5.4.2

The language is a bit confusing because it is possible to compute the outer measure of sets that are not outer measurable!

Figure 5.3. Left: Illustration of Carathéodory’s condition. By varying E we can check Carathéodory’s condition in different regions. Right: Zooming in on a portion of A where the boundary is complicated.

Any set A together with its complement Ac forms a “decomposition” of X in the sense that if E ⊂ X then E is equal to the disjoint union (E ∩ A) ∪ (E ∩ Ac), see Figure 5.3. Therefore, A is µ∗ - measurable if µ∗ is additive with respect to all disjoint unions formed using A. We can interpret this assumption as follows. On one hand, µ∗(E ∩ A) is an outer measure of the part of E that lies “inside” A. On the other hand, µ∗(E) − µ∗(E ∩ Ac) can be interpreted as a way to compute the outer measure of that part of E that does not lie “outside” A, so this is an indirect way to compute the outer measure of the part of E “inside” A. So, (5.3) is the requirement that these ways to compute the outer measure of part of the interior of A should match. By varying E, we can check this condition on different parts of A.
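On a finite set, Carathéodory’s condition (5.3) can be tested exhaustively by running through every test set E. The sketch below does this for the outer measure of Example 5.4.3, with the points 1, 2, 3 standing in for a1, a2, a3; the helper names `powerset` and `caratheodory_measurable` are introduced here for illustration only. It is a useful cross-check on hand computations, which are easy to get wrong because every E has to be tried.

```python
from fractions import Fraction
from itertools import combinations

def powerset(points):
    """All subsets of a finite set, as frozensets."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def caratheodory_measurable(mu, points):
    """Sets A with mu(E) == mu(E & A) + mu(E - A) for every test set E."""
    sets = powerset(points)
    return [A for A in sets
            if all(mu[E] == mu[E & A] + mu[E - A] for E in sets)]

# Outer measure values from Example 5.4.3.
mu = {
    frozenset(): Fraction(0),
    frozenset({1}): Fraction(1),
    frozenset({2}): Fraction(1),
    frozenset({3}): Fraction(1),
    frozenset({1, 2}): Fraction(3, 2),
    frozenset({1, 3}): Fraction(2),
    frozenset({2, 3}): Fraction(2),
    frozenset({1, 2, 3}): Fraction(5, 2),
}

for A in caratheodory_measurable(mu, {1, 2, 3}):
    print(sorted(A))
```

Because (5.3) is symmetric in A and its complement, the printed sets always come in complementary pairs.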

Example 5.4.9

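We check Carathéodory’s condition for the outer measure defined in Example 5.4.3. We start with $\{a_1\}$. Using $E = \{a_1, a_2\}$, we obtain $\mu^*(\{a_1, a_2\}) = \tfrac{3}{2} \ne \mu^*(\{a_1\}) + \mu^*(\{a_2\}) = 2$, so $\{a_1\}$ is not outer measurable. Neither is $\{a_2\}$ by symmetry. Since (5.3) is symmetric in A and $A^c$, the complements $\{a_2, a_3\}$ and $\{a_1, a_3\}$ are not outer measurable either. For $\{a_3\}$, however, (5.3) holds for every E. For example, E = X gives $\mu^*(X) = \tfrac{5}{2} = \mu^*(\{a_3\}) + \mu^*(\{a_1, a_2\})$ and $E = \{a_1, a_3\}$ gives $2 = \mu^*(\{a_3\}) + \mu^*(\{a_1\})$; the remaining test sets are checked in the same way. Hence, $\{a_3\}$ and its complement $\{a_1, a_2\}$ are outer measurable, and the outer measurable sets are $\{\emptyset, \{a_3\}, \{a_1, a_2\}, X\}$.

We want to build a measure out of an outer measure. First a useful result: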

Theorem 5.4.3

Let $\mu^*$ be an outer measure on X. A set A is $\mu^*$-measurable if and only if

$$\mu^*(E) \ge \mu^*(E \cap A) + \mu^*(E \cap A^c), \tag{5.4}$$

for any subset E ⊂ X.

The point is that we only have to check (5.4) not (5.3).

Proof. Since $\mu^*$ is subadditive, we always have $\mu^*(E) \le \mu^*(E \cap A) + \mu^*(E \cap A^c)$. Hence, (5.3) holds for a given E if and only if (5.4) holds for that E.

Example 5.4.10

We showed that the set function σ defined in Example 5.4.5 is an outer measure for p ≤ 1. We now determine the outer measurable sets by checking (5.4). Choose A ⊂ X. First consider finite E with |E| = ℓ and set i = |E ∩ A|, so (5.4) becomes $\ell^p \ge (\ell - i)^p + i^p$ for 0 ≤ i ≤ ℓ. However, (5.1) implies that this inequality does not hold for p < 1 when 0 < i < ℓ, and such an E exists whenever A is neither ∅ nor X, while it does hold for p = 1. When p = 1, it is straightforward to verify that (5.4) holds for infinite E as well. So, when p < 1, the outer measurable sets are {∅, X} and when p = 1, the outer measurable sets are $\mathcal{P}_X$.

We have the major result:

Theorem 5.4.4: Carathéodory

Let $\mu^*$ be an outer measure on X. The collection $\mathcal{M} = \{A \subset X : A \text{ is } \mu^*\text{-measurable}\}$ of $\mu^*$-measurable sets is a σ-algebra and the restriction $\mu^*|_{\mathcal{M}}$ of $\mu^*$ to $\mathcal{M}$ is a measure. Moreover, $(X, \mathcal{M}, \mu^*|_{\mathcal{M}})$ is a complete measure space.

Hang on, this is the first major proof we present.

Proof. $\mathcal{M}$ is nonempty. We show that any set A of outer measure 0 is outer measurable, so $\emptyset \in \mathcal{M}$ since $\mu^*(\emptyset) = 0$. Choose E ⊂ X. Since $E \supset E \cap A^c$, $\mu^*(E) \ge \mu^*(E \cap A^c)$. Similarly, $A \supset E \cap A$, so $\mu^*(A) \ge \mu^*(E \cap A)$. Addition gives $\mu^*(E) + \mu^*(A) = \mu^*(E) \ge \mu^*(E \cap A) + \mu^*(E \cap A^c)$, since $\mu^*(A) = 0$.

$\mathcal{M}$ is an algebra. The definition of outer measurability is symmetric with respect to a set and its complement, so $\mathcal{M}$ is closed under complements. Let $A_1, A_2 \in \mathcal{M}$. Then for any $E_1, E_2 \subset X$,

$$\begin{aligned}
\mu^*(E_1) &= \mu^*(E_1 \cap A_1) + \mu^*(E_1 \cap A_1^c),\\
\mu^*(E_2) &= \mu^*(E_2 \cap A_2) + \mu^*(E_2 \cap A_2^c).
\end{aligned}$$


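For E ⊂ X, set $E_1 = E$ and $E_2 = E \cap A_1^c$ to get

$$\begin{aligned}
\mu^*(E) &= \mu^*(E \cap A_1) + \mu^*(E \cap A_1^c),\\
\mu^*(E \cap A_1^c) &= \mu^*(E \cap A_1^c \cap A_2) + \mu^*(E \cap A_1^c \cap A_2^c),
\end{aligned}$$

or,

$$\mu^*(E) = \mu^*(E \cap A_1) + \mu^*(E \cap A_1^c \cap A_2) + \mu^*\big(E \cap (A_1 \cup A_2)^c\big).$$

Now, $E \cap (A_1 \cup A_2) = (E \cap A_1) \cup (E \cap A_1^c \cap A_2)$, so

$$\mu^*\big(E \cap (A_1 \cup A_2)\big) \le \mu^*(E \cap A_1) + \mu^*(E \cap A_1^c \cap A_2).$$

Hence,

$$\mu^*(E) \ge \mu^*\big(E \cap (A_1 \cup A_2)\big) + \mu^*\big(E \cap (A_1 \cup A_2)^c\big),$$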

for any set E in X. Hence, $A_1 \cup A_2 \in \mathcal{M}$. Induction shows that $\mu^*$ is finitely additive on $\mathcal{M}$.

$\mathcal{M}$ is a σ-algebra. We show that $\mathcal{M}$ is closed under countable disjoint unions, so the result follows from Theorem 5.1.4. Let $\{A_i\}_{i=1}^{\infty}$ be a disjoint collection of sets in $\mathcal{M}$ and set $B_m = \bigcup_{i=1}^{m} A_i$ and $B = \bigcup_{i=1}^{\infty} A_i$. For E ⊂ X,

$$\mu^*(E \cap B_m) = \mu^*(E \cap B_m \cap A_m) + \mu^*(E \cap B_m \cap A_m^c) = \mu^*(E \cap A_m) + \mu^*(E \cap B_{m-1}).$$

By induction,

$$\mu^*(E \cap B_m) = \sum_{i=1}^{m} \mu^*(E \cap A_i).$$

Thus,

$$\mu^*(E) = \mu^*(E \cap B_m) + \mu^*(E \cap B_m^c) = \sum_{i=1}^{m} \mu^*(E \cap A_i) + \mu^*(E \cap B_m^c). \tag{5.5}$$

Now $\{B_m\}_{m=1}^{\infty}$ is an increasing sequence of sets, so $\{B_m^c\}_{m=1}^{\infty}$ is a decreasing sequence of sets. Taking the limit as m → ∞ in (5.5) (noting that all the terms in the sums are nonnegative!) and using monotonicity, we obtain

$$\begin{aligned}
\mu^*(E) &\ge \sum_{i=1}^{\infty} \mu^*(E \cap A_i) + \mu^*(E \cap B^c)\\
&\ge \mu^*\Big(\bigcup_{i=1}^{\infty} (E \cap A_i)\Big) + \mu^*(E \cap B^c)\\
&= \mu^*(E \cap B) + \mu^*(E \cap B^c) \ge \mu^*(E).
\end{aligned}$$

Therefore, all the inequalities are equalities, so $B \in \mathcal{M}$. Taking E = B yields

$$\mu^*\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} \mu^*(A_i)$$

for any disjoint collection $\{A_i\}_{i=1}^{\infty}$ in $\mathcal{M}$.


$\mathcal{M}$ is complete. Choose $A \in \mathcal{M}$ with $\mu^*(A) = 0$ and let B ⊂ A. Since B ⊂ A, $\mu^*(B) \le \mu^*(A) = \mu(A) = 0$, where $\mu = \mu^*|_{\mathcal{M}}$. As proved above, this implies that $B \in \mathcal{M}$ and µ(B) = 0.

Before continuing with the construction of measures, we present an important example of outer measure.

Assume that X is a nonempty metric space with metric d. Recall that the class of open sets (the topology) is centrally important to metric spaces.

Definition 5.4.5: Borel σ-algebra

The σ-algebra generated by the open sets of X is called the Borel σ-algebra and is denoted $\mathcal{B}_X = \mathcal{B}$. The members of $\mathcal{B}_X$ are called Borel sets. A Borel measure on a metric space is a measure whose domain consists of the Borel sets.

The Borel sets include countable unions and intersections of open and closed sets, and countable unions and intersections of those sets, and so on.

In this section, we work in the measurable space (X, B).

Definition 5.4.6

In reference to the Borel σ-algebra of a metric space X, a countable intersection of open sets is called a $G_\delta$-set, a countable union of closed sets is called an $F_\sigma$-set, a countable intersection of $G_\delta$-sets is called a $G_{\delta\delta}$-set, a countable union of $G_\delta$-sets is called a $G_{\delta\sigma}$-set, a countable intersection of $F_\sigma$-sets is called an $F_{\sigma\delta}$-set, and so on.

Note that taking countable unions, intersections and complements starting with open (or closed) sets does not generate all the members of $\mathcal{B}$. The point is that these set operations can be repeated in an unlimited fashion, not simply a countable number of times.

The existence of a metric makes it possible to “separate” sets.

Definition 5.4.7

Let A, B ⊂ X. The distance between A and B is defined, d(A, B) = inf{d(a, b): a ∈ A, b ∈ B}.

This is well defined since d is nonnegative. In the next definition, we add a condition to the definition of an outer measure.

Definition 5.4.8: Metric outer measure

Let $\mu^*$ be an outer measure on X. We say that $\mu^*$ is a metric outer measure if

$$\mu^*(A \cup B) = \mu^*(A) + \mu^*(B), \tag{5.6}$$

for all A, B ⊂ X with d(A, B) > 0.

Figure 5.4. Illustration of computing a metric outer measure from the “inside”.

Note that if d(A, B) > 0, then A ∩ B = ∅. But more than that, the two sets are “well separated”. This separation should mean that we can compute the outer measure of the union by summing the outer measures of each set. Before stating the main result, we prove a theorem by Carathéodory that discusses the approximation of outer measure from “within”.

Theorem 5.4.5

Let $\mu^*$ be a metric outer measure on X, let G ⊂ X be open, and assume A ⊂ G. For each i ≥ 1, set $A_i = \{x \in A : d(x, G^c) \ge 1/i\}$. Then, $\lim_{i \to \infty} \mu^*(A_i) = \mu^*(A)$.

See Figure 5.4.

Proof. Since $\{A_i\}_{i=1}^{\infty}$ is an increasing sequence and $A_i \subset A$ for all i, we just have to show that $\lim \mu^*(A_i) \ge \mu^*(A)$. Each point of A is an interior point of G, therefore, since $A \supset \bigcup_{i=1}^{\infty} A_i$, each point of A must belong to $A_i$ for i sufficiently large. Thus, $A \subset \bigcup_{i=1}^{\infty} A_i$, or $A = \bigcup_{i=1}^{\infty} A_i$. Set $B_i = A_{i+1} \setminus A_i$ for i ≥ 1. For each m,

$$A = A_{2m} \cup \Big(\bigcup_{i=2m}^{\infty} B_i\Big) = A_{2m} \cup \Big(\bigcup_{i=m}^{\infty} B_{2i}\Big) \cup \Big(\bigcup_{i=m}^{\infty} B_{2i+1}\Big).$$

Therefore,

$$\mu^*(A) \le \mu^*(A_{2m}) + \sum_{i=m}^{\infty} \mu^*(B_{2i}) + \sum_{i=m}^{\infty} \mu^*(B_{2i+1}). \tag{5.7}$$

If both the series in (5.7) converge, then letting m → ∞ and noting that $\lim \mu^*(A_{2m}) = \lim \mu^*(A_m)$ shows $\mu^*(A) \le \lim \mu^*(A_m)$. Otherwise, at least one of the series diverges. Without loss of generality, assume the first series in (5.7) diverges. Since $d(B_{2i}, B_{2i+2}) \ge \frac{1}{2i+1} - \frac{1}{2i+2} > 0$,

$$\mu^*(A_{2m}) \ge \mu^*\Big(\bigcup_{i=1}^{m-1} B_{2i}\Big) = \sum_{i=1}^{m-1} \mu^*(B_{2i}) \to \infty \quad \text{as } m \to \infty.$$

So, $\lim \mu^*(A_m) = \infty \ge \mu^*(A)$.

The next result characterizes the µ∗-measurable sets to be precisely the sets we would hope to be outer measurable.

Theorem 5.4.6

$\mu^*$ is a metric outer measure on X if and only if every Borel set is $\mu^*$-measurable.

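Proof. First, we assume $\mu^*$ is a metric outer measure and show that every closed set F is $\mu^*$-measurable. Let E ⊂ X. $E \setminus F$ is contained in the open set $F^c$, so by Theorem 5.4.5 there is a sequence $\{A_i\}_{i=1}^{\infty}$ of subsets of $E \setminus F$ such that $d(A_i, F) \ge 1/i$ and $\lim \mu^*(A_i) = \mu^*(E \setminus F)$. Therefore,

$$\mu^*(E) \ge \mu^*\big((E \cap F) \cup A_i\big) = \mu^*(E \cap F) + \mu^*(A_i) \to \mu^*(E \cap F) + \mu^*(E \setminus F).$$

By Theorem 5.4.3, F is $\mu^*$-measurable. Since the $\mu^*$-measurable sets form a σ-algebra (Theorem 5.4.4) that contains the closed sets, and hence the open sets, every Borel set is $\mu^*$-measurable.

Now we assume that every Borel set is $\mu^*$-measurable. Let A, B ⊂ X satisfy d(A, B) > 0. Choose an open set G ⊃ A such that G ∩ B = ∅. By assumption, G is $\mu^*$-measurable, so

$$\mu^*(A \cup B) = \mu^*\big((A \cup B) \cap G\big) + \mu^*\big((A \cup B) \setminus G\big) = \mu^*(A) + \mu^*(B).$$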

Theorem 5.4.4 implies

Theorem 5.4.7

Let $\mu^*$ be a metric outer measure on X. Then, $(X, \mathcal{B}_X, \mu)$ is a measure space, where $\mu = \mu^*|_{\mathcal{B}_X}$.

5.5 Hausdorff measure

This result suggests that abstract measure theory is applicable to a wide range of interesting examples, e.g. any metric space. That is, provided there are any interesting metric outer measures in a general metric space! We next show there is a default metric outer measure in a general metric space that is interesting. We begin by extending the basic geometric concept of the diameter of a sphere:


Definition 5.5.1

The diameter of a set A ⊂ X is d(A) = sup{d(x, y): x, y ∈ A}. A set A ⊂ X is bounded if d(A) < ∞.

Example 5.5.1

This coincides with the usual definition of diameter for a ball in Rn.

Anticipating that the end result is indeed an outer measure, we define

$$\mu^*_{H,p,\delta}(A) = \inf\Big\{ \sum_{i=1}^{\infty} d(A_i)^p : A \subset \bigcup_{i=1}^{\infty} A_i, \ d(A_i) \le \delta \text{ for all } i \Big\},$$

for any A ⊂ X, with the convention that inf ∅ = ∞. It is an exercise to show that the covering sets can be restricted to be either closed or open.

As δ decreases, $\mu^*_{H,p,\delta}(A)$ increases since the inf is computed over a smaller collection of covers. Thus, we define

Definition 5.5.2: Hausdorff outer measure

The p dimensional Hausdorff outer measure is defined $\mu^*_{H,p}(A) = \lim_{\delta \to 0} \mu^*_{H,p,\delta}(A)$, A ⊂ X.

The role and usefulness of the parameter p is not clear at this point. We note that if A is an “n-dimensional” set in an “n-dimensional space”, then d(A) has units of a single dimension, so we raise the diameter to the power p = n to get the right dimensions for volume in n dimensions. For example, the volume of a ball in $\mathbb{R}^3$ of diameter ρ is $\frac{\pi}{6}\rho^3$. In general, including p provides a way to define the measure of a surface bounding a volume, which is “lower dimensional” than the volume, and we explore that later. The definition of Hausdorff outer measure of a set as the limit of measures of covers with decreasing diameters provides a way to deal with potential geometric complexities.

Example 5.5.2

Consider X = C([0, 1]) equipped with the usual sup = max metric. Suppose $A_1$ is the set of continuous functions with values between .1 and .2 and $A_2$ is the set of continuous functions with values between .6 and .7. Define $A = A_1 \cup A_2$. Then $d(A) = .6 > d(A_1) + d(A_2) = .1 + .1 = .2$.

The main result is,

Theorem 5.5.1

$\mu^*_{H,p}$ is a metric outer measure on X.

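Proof. Theorem 5.4.1 implies that $\mu^*_{H,p,\delta}$ is an outer measure, thus $\mu^*_{H,p}$ is an outer measure.

Choose A, B ⊂ X with d(A, B) > 0 and fix δ < d(A, B). Choose a cover $\{A_i\}_{i=1}^{\infty}$ of A ∪ B with $d(A_i) \le \delta$ for all i. Then, $A_i$ can have a nonempty intersection with A or B, but not both. We divide the sum, discarding any sets that meet neither A nor B,

$$\sum_{i} d(A_i)^p \ge \sum_{A_i \cap A \ne \emptyset} d(A_i)^p + \sum_{A_i \cap B \ne \emptyset} d(A_i)^p.$$

This gives

$$\mu^*_{H,p,\delta}(A) + \mu^*_{H,p,\delta}(B) \le \sum_{i} d(A_i)^p$$

or, taking the infimum over all such covers,

$$\mu^*_{H,p,\delta}(A) + \mu^*_{H,p,\delta}(B) \le \mu^*_{H,p,\delta}(A \cup B).$$

We let δ → 0 to get $\mu^*_{H,p}(A) + \mu^*_{H,p}(B) \le \mu^*_{H,p}(A \cup B)$. The reverse inequality is subadditivity, which gives the result.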

Theorem 5.4.7 implies that $\mu^*_{H,p}$ is a measure when restricted to $\mathcal{B}_X$.

Theorem 5.5.2

$(X, \mathcal{B}_X, \mu_{H,p})$ is a measure space, where $\mu_{H,p} = \mu^*_{H,p}|_{\mathcal{B}_X}$.

Definition 5.5.3: Hausdorff measure

µH,p is the p dimensional Hausdorff measure.

It is an exercise to adapt earlier proofs to show,

Theorem 5.5.3

The Hausdorff measure is regular.

Determining the Hausdorff measure can be a complicated problem because sets can be very complicated.

Example 5.5.3: Sierpinski triangle

We construct the Sierpinski triangle (also known as the Sierpinski gasket; the analogous construction starting from a square is the Sierpinski carpet) using an iterative process. We begin with an equilateral triangle, see Figure 5.5. At step 1, we divide the initial triangle into four equilateral triangles and remove the middle triangle. At every subsequent step, we divide each equilateral triangle in the current figure into four equilateral triangles, and remove the middle triangle. The area of the set at step i is $\left(\frac{3}{4}\right)^i$ × the area of the initial triangle while the length of the perimeter of the set is $\left(\frac{3}{2}\right)^i$ × the perimeter of the initial triangle. Thus, the perimeters of the sets tend to ∞ while their areas tend to 0!
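The area and perimeter factors for the construction in Example 5.5.3 can be tabulated directly. In the sketch below, the function name `sierpinski_factors` is chosen here for illustration, and both quantities are expressed relative to the initial triangle.

```python
from fractions import Fraction

def sierpinski_factors(i):
    """Area and perimeter of the step-i set, relative to the initial triangle."""
    return Fraction(3, 4) ** i, Fraction(3, 2) ** i

for i in range(7):
    area, perimeter = sierpinski_factors(i)
    print(i, float(area), float(perimeter))
```

Each step keeps three of the four subtriangles, which multiplies the area by 3/4, while the boundary of the removed middle triangle adds half of the current perimeter, which multiplies the perimeter by 3/2.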

The properties of a set are related to the parameter p in $\mu^*_{H,p}$. In fact, there is a critical value of p for any set $A \in \mathcal{B}_X$.


Figure 5.5. First 6 sets in the construction of the Sierpinski triangle (gasket) of Example 5.5.3.

Theorem 5.5.4

Let A ∈ BX. If µH,p(A) < ∞, then µH,q(A) = 0 for all q > p. Vice versa, if µH,p(A) > 0, then µH,q(A) = ∞ for all q < p.

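Proof. For the first result, since $\mu_{H,p}(A) < \infty$, for any δ > 0, there is a cover $\{A_i\} \subset \mathcal{B}_X$ with $A \subset \bigcup_i A_i$, $d(A_i) < \delta$ for all i, and $\sum_i d(A_i)^p \le \mu_{H,p}(A) + 1$. For q > p,

$$\sum_i d(A_i)^q \le \delta^{q-p} \sum_i d(A_i)^p \le \delta^{q-p} \big(\mu_{H,p}(A) + 1\big).$$

So, $\mu^*_{H,q,\delta}(A) \le \delta^{q-p} \big(\mu_{H,p}(A) + 1\big)$. We let δ → 0 to obtain $\mu_{H,q}(A) = 0$. The second result follows immediately from the first result.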

This theorem motivates

Definition 5.5.4

Let $A \in \mathcal{B}_X$. The Hausdorff dimension of A is the common value of

$$\dim_H(A) = \inf\{p \ge 0 : \mu_{H,p}(A) = 0\} = \sup\{p \ge 0 : \mu_{H,p}(A) = \infty\}.$$

Determining the Hausdorff dimension of a set is generally a complicated computation that involves establishing both lower and upper bounds, see [Fal03, Fol99]. We give a few examples using heuristic arguments.

Example 5.5.4

We give a plausible computation of the Hausdorff measure of a line segment of length l in the square. We use m circles of diameter l/m as shown in Figure 5.6 (d), and compute the estimate $m\,(l/m)^p = l^p\, m^{1-p}$ of the Hausdorff measure from this cover. If p < 1, this converges to ∞ as m → ∞. If p > 1, this converges to 0 as m → ∞. Hence, we conclude the Hausdorff dimension is 1.

Example 5.5.5

Next, we provide a plausible estimate of the Hausdorff measure of a unit square in $\mathbb{R}^2$. In Figure 5.6, we show three covers of the square. The cover (a) yields an estimate of the Hausdorff measure as $(\sqrt{2})^p$. Because the square is a regular figure with several symmetries, we can consider covers that have symmetries as shown in (b) and (c). Given an integer m > 0, we use circles of diameter 1/m to cover the square in the pattern shown in (b). In general, there are 2m(m + 1) circles in the cover, so the measure of the cover is $2m(m + 1)/m^p$. If we choose p < 2, this converges to ∞ as m → ∞. If we choose p > 2, this converges to 0. Hence, the Hausdorff dimension is 2. Lastly, to convey the fact that covering a set is complicated business, we show a few members of a nonoverlapping cover in (c), but without the computation!
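The heuristic in Examples 5.5.4 and 5.5.5 is to watch how the sum of diameters raised to the power p behaves as the covering circles shrink. The sketch below tabulates the two sums for a few values of p and m; the function names are chosen here for illustration, the circle counts are the ones stated in the examples, and the numbers are only estimates from particular covers rather than values of the Hausdorff measure itself.

```python
def segment_cover_sum(m, p, length=1.0):
    """Sum of diameter**p over m circles of diameter length/m (Example 5.5.4)."""
    return m * (length / m) ** p

def square_cover_sum(m, p):
    """Sum of diameter**p over 2m(m+1) circles of diameter 1/m (Example 5.5.5)."""
    return 2 * m * (m + 1) * (1.0 / m) ** p

for p in (0.5, 1.0, 1.5, 2.0, 2.5):
    row = [(round(segment_cover_sum(m, p), 4), round(square_cover_sum(m, p), 4))
           for m in (10, 100, 1000)]
    print(p, row)
```

For the segment the sums grow with m when p < 1 and shrink when p > 1, and for the square the switch happens at p = 2, which is the behavior the examples use to guess the dimension.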

Figure 5.6. Computation of the Hausdorff measure. (a), (b), (c) are covers of a square. (d) is a cover of a line segment in a square.

There are different approaches that exploit properties of the set such as self-similarity and symmetry. For example, we derive a scaling property of Hausdorff measure that holds in the metric space Rn.

Theorem 5.5.5

If $A \subset \mathbb{R}^n$ is a Borel set and $\alpha > 0$, define $\alpha A = \{\alpha x : x \in A\}$. Then,

\[ \mu_{H,p}(\alpha A) = \alpha^p \mu_{H,p}(A). \]

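Proof. Let $\{A_i\}_{i=1}^\infty$ be a cover of $A$ using sets of diameter no more than $\delta$. Then, $\{\alpha A_i\}_{i=1}^\infty$ is a cover of $\alpha A$ using sets of diameter no more than $\alpha\delta$. This gives $\mu^*_{H,p,\alpha\delta}(\alpha A) \le \alpha^p \mu^*_{H,p,\delta}(A)$ for any $\delta > 0$. Letting $\delta \to 0$ gives $\mu_{H,p}(\alpha A) \le \alpha^p \mu_{H,p}(A)$. We get the reverse inequality by applying the same argument with the substitutions $\alpha \to 1/\alpha$ and $A \to \alpha A$.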

Theorem 5.5.5 is useful for computing Hausdorff dimensions.

Example 5.5.6

We provide a plausible computation of the Hausdorff dimension of the Cantor set $C$ (Definition 4.3.6). $C$ is the disjoint union $C = C_L \cup C_R$ of a “left-hand” part $C_L = C \cap [0, \tfrac{1}{3}]$ and a “right-hand” part $C_R = C \cap [\tfrac{2}{3}, 1]$. $C_L$ and $C_R$ are similar to $C$, except for being scaled by a factor of 1/3 and a shift. By Theorem 5.5.5 and invariance under shifts,

\[ \mu_{H,p}(C) = \mu_{H,p}(C_L) + \mu_{H,p}(C_R) = \left(\tfrac{1}{3}\right)^p \mu_{H,p}(C) + \left(\tfrac{1}{3}\right)^p \mu_{H,p}(C). \]

Assuming that $0 < \left(\tfrac{1}{3}\right)^p \mu_{H,p}(C) < \infty$ for $p = \dim_H(C)$ (which should be proved), we get $1 = 2 \left(\tfrac{1}{3}\right)^p$, so $p = \log_3(2) \approx 0.6309$.


Example 5.5.7: Koch snowflake

The boundary of the Koch snowflake is a well-known example of a fractal [Fal03]. The Koch snowflake is constructed by an iterative process. Starting with an equilateral triangle, we alter each side by dividing each line segment into three segments of equal length, then adding an equilateral triangle that has the middle segment on the side as its base and points outward. In the second step, we add an equilateral triangle to the middle third of each line segment comprising the boundary from the previous step. This continues. We illustrate the first four steps in Figure 5.7. It is easy to prove that the perimeters of the figures obtained by this construction diverge to infinity. We can adapt the scaling argument in Example 5.5.6 to argue that the Hausdorff dimension of the boundary is $\log_3(4) \approx 1.2619$.
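The scaling arguments in Examples 5.5.6 and 5.5.7 both reduce to solving 1 = N r^p for p, where the set consists of N copies of itself scaled by r. The Python sketch below solves that equation and also tracks the perimeter of the Koch construction. The helper names are ours, and the count N = 4 with r = 1/3 for each side of the Koch boundary is an assumption read off the construction, not something the text states explicitly.

```python
import math


def similarity_dimension(copies, ratio):
    """Solve 1 = copies * ratio**p for p, assuming 0 < ratio < 1."""
    return math.log(copies) / math.log(1.0 / ratio)


def koch_perimeter(side, steps):
    """Perimeter after `steps` refinements of an equilateral triangle with the given side.

    Each refinement replaces every segment by 4 segments of one third the length.
    """
    return 3 * side * (4.0 / 3.0) ** steps


if __name__ == "__main__":
    print(similarity_dimension(2, 1 / 3))  # Cantor set: 2 copies scaled by 1/3
    print(similarity_dimension(4, 1 / 3))  # Koch boundary: 4 copies scaled by 1/3
    print([koch_perimeter(1.0, k) for k in range(5)])
```

As in the examples, this computation presumes that the Hausdorff measure at the critical exponent is positive and finite, which still has to be proved.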

Figure 5.7. First four steps in the construction of the Koch snowflake.

Working from the definition, it is a good exercise to prove some elementary properties of Hausdorff dimension.

Theorem 5.5.6

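The Hausdorff dimension has the following properties.

1. If $A \in \mathcal{B}_X$ is countable, then $\dim_H(A) = 0$.
2. If $A, B \in \mathcal{B}_X$ with $A \subset B$, then $\dim_H(A) \le \dim_H(B)$.
3. If $\{A_i\}_{i=1}^\infty \subset \mathcal{B}_X$ with $\dim_H(A_i) \le d$ for all $i$, then $\dim_H\left(\bigcup_{i=1}^\infty A_i\right) \le d$.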

5.6 Premeasures

There is a disconnect between the approach we use to develop measure informally in Chapter 4 and the construction of measure via an outer measure in this chapter. In Chapter 4, we generalize the idea of measuring the sizes of complicated sets based on the familiar idea of the length of simple half-open intervals. In retrospect (Example 5.1.1), with X = (0, 1], the set M = {∅, finite unions of disjoint intervals} is an algebra. So, we develop measure by specifying the measure of elements in an algebra.

In this section, we connect the two approaches. We begin by formalizing the approach used in Chapter 4.


Assume X is a nonempty set.

Definition 5.6.1: Premeasure

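Let $\mathcal{A} \subset \mathcal{P}_X$ be an algebra on $X$. A set function $\mu_0 : \mathcal{A} \to [0, \infty]$ is a premeasure if

1. $\mu_0(\emptyset) = 0$.
2. If $\{A_i\}_{i=1}^\infty \subset \mathcal{A}$ is a collection of disjoint sets with $\bigcup_{i=1}^\infty A_i \in \mathcal{A}$, then

\[ \mu_0\left(\bigcup_{i=1}^\infty A_i\right) = \sum_{i=1}^\infty \mu_0(A_i). \]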

Example 5.6.1

The informal development of measure in Chapter 4 uses the premeasure µL((a, b]) = b − a. We assume that condition 2 holds and add it to the second wish list for measure. We also use the same name for the premeasure and the eventual measure it leads to. We make this rigorous in Chapter 6.

Unlike outer measures, premeasures are countably additive on a restricted class of subsets. However, for premeasures, we have to assume that $\bigcup_{i=1}^\infty A_i \in \mathcal{A}$ because we are working with an algebra, not a σ-algebra. Moving to a full measure from a premeasure therefore involves establishing an additional condition on the domain of subsets.

Theorem 5.6.1

Premeasures are finitely additive and monotone.

Example 5.6.2

Let X = {1, 2, 3}, A = {∅, {1} , {2, 3} , {1, 2, 3}} and let µ0 : A → [0, ∞] with,

µ0(∅) = 0, µ0({1}) = 2, µ0({2, 3}) = 3, µ0({1, 2, 3}) = 5.

Then, µ0 is a premeasure.
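The claim in Example 5.6.2 can be checked mechanically. The sketch below verifies that the collection is an algebra and that µ0 is additive on disjoint pairs. Because the algebra is finite, any disjoint sequence has only finitely many nonempty members, so condition 2 of Definition 5.6.1 reduces to finite additivity here. The function names and the dictionary encoding of µ0 are our own choices.

```python
from itertools import combinations

X = frozenset({1, 2, 3})
# Premeasure of Example 5.6.2, stored as a dict from sets to values.
mu0 = {frozenset(): 0, frozenset({1}): 2, frozenset({2, 3}): 3, X: 5}


def is_algebra(X, collection):
    """Check that `collection` contains X and is closed under complement and pairwise union."""
    sets = set(collection)
    if frozenset(X) not in sets:
        return False
    if any(frozenset(X) - A not in sets for A in sets):
        return False
    return all(A | B in sets for A, B in combinations(sets, 2))


def is_additive(mu0):
    """Check mu0(empty) = 0 and mu0(A | B) = mu0(A) + mu0(B) for disjoint A, B in the domain."""
    if mu0[frozenset()] != 0:
        return False
    for A, B in combinations(mu0, 2):
        if not (A & B) and (A | B) in mu0 and mu0[A | B] != mu0[A] + mu0[B]:
            return False
    return True


if __name__ == "__main__":
    print(is_algebra(X, mu0), is_additive(mu0))
```

Applying `is_algebra` to the collection {∅, {1}, {1, 2}, {1, 2, 3}} of Example 5.6.4 returns False, since the complement of {1} is missing.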

Example 5.6.3

The set function µ defined in Example 5.2.1 is a premeasure.


Example 5.6.4

Let X = {1, 2, 3}, A = {∅, {1} , {1, 2} , {1, 2, 3}} and let ρ : A → [0, ∞] with, ρ(∅) = 0, ρ({1}) = 2, ρ({1, 2}) = 1, ρ({1, 2, 3}) = 3.

ρ is not a premeasure because $\mathcal{A}$ is not an algebra.

As with measures, we distinguish two kinds of premeasures:

Definition 5.6.2

If $\mathcal{A} \subset \mathcal{P}_X$ is an algebra and $\mu_0$ is a premeasure on $\mathcal{A}$, then $\mu_0$ is a finite premeasure if $\mu_0(X) < \infty$ and a σ-finite premeasure if there is an increasing sequence $\{A_i\}_{i=1}^\infty \subset \mathcal{A}$ with $X = \bigcup_{i=1}^\infty A_i$ and $\mu_0(A_i) < \infty$ for all $i$.

We use a given premeasure to induce an outer measure and study the properties of the resulting outer measure.

Theorem 5.6.2

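Let $\mathcal{A}$ be an algebra on $X$, and $\mu_0$ be a premeasure on $\mathcal{A}$. Then, $\mu_0$ induces an outer measure $\mu^*$ defined by

\[ \mu^*(A) = \inf\left\{ \sum_{i=1}^\infty \mu_0(A_i) : \{A_i\}_{i=1}^\infty \subset \mathcal{A},\ A \subset \bigcup_{i=1}^\infty A_i \right\}, \quad A \subset X. \tag{5.8} \]

The outer measure $\mu^*$ satisfies:

1. $\mu^*(A) = \mu_0(A)$, $A \in \mathcal{A}$.
2. Every set in $\mathcal{A}$ is $\mu^*$-measurable.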

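Proof. Theorem 5.4.1 implies that $\mu_0$ induces $\mu^*$. We prove the properties in order.

Property 1 We show $\mu^*|_{\mathcal{A}} \le \mu_0$ and $\mu_0 \le \mu^*|_{\mathcal{A}}$. If $A \in \mathcal{A}$ and $A \subset \bigcup_{i=1}^\infty A_i$ with $\{A_i\}_{i=1}^\infty \subset \mathcal{A}$, define

\[ B_j = A \cap \left( A_j \setminus \bigcup_{i=1}^{j-1} A_i \right), \quad j \ge 1. \]

$\{B_j\}_{j=1}^\infty$ is a disjoint collection whose union is $A \in \mathcal{A}$, and $B_i \subset A_i$ for every $i$, so

\[ \mu_0(A) = \sum_{j=1}^\infty \mu_0(B_j) \le \sum_{i=1}^\infty \mu_0(A_i). \]

Therefore, $\mu_0(A) \le \mu^*(A)$. However, $A \subset \bigcup_{i=1}^\infty A_i$, where $A_1 = A$ and $A_i = \emptyset$ for $i \ge 2$. So $\mu^*(A) \le \mu_0(A)$.

Property 2 Let $E$ be a test set for Carathéodory’s Condition. For $\varepsilon > 0$, there is a collection $\{B_i\}_{i=1}^\infty \subset \mathcal{A}$ such that $E \subset \bigcup_{i=1}^\infty B_i$ and

\[ \sum_{i=1}^\infty \mu_0(B_i) \le \mu^*(E) + \varepsilon. \]

Choose $A \in \mathcal{A}$. Then, $B_i = (B_i \cap A) \cup (B_i \cap A^c)$ is a disjoint union of sets in $\mathcal{A}$ for each $i$, and $\mu_0(B_i) = \mu_0(B_i \cap A) + \mu_0(B_i \cap A^c)$. Thus,

\[ \mu^*(E) + \varepsilon \ge \sum_{i=1}^\infty \mu_0(B_i \cap A) + \sum_{i=1}^\infty \mu_0(B_i \cap A^c) \ge \mu^*(E \cap A) + \mu^*(E \cap A^c). \]

Since $\varepsilon$ was arbitrary, the desired result follows.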
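The proof of Property 1 relies on replacing a cover by a disjoint one, a step that recurs throughout the chapter. A minimal Python sketch of that step on finite sets follows. The function name `disjointify` and the sample sets are illustrative only.

```python
def disjointify(A, cover):
    """Return the pieces B_j = A ∩ (A_j minus (A_1 ∪ ... ∪ A_{j-1})) for each A_j in `cover`."""
    seen = set()      # union of the cover sets processed so far
    pieces = []
    for A_j in cover:
        pieces.append(set(A) & (set(A_j) - seen))
        seen |= set(A_j)
    return pieces


if __name__ == "__main__":
    A = {1, 2, 3, 4}
    cover = [{1, 2, 5}, {2, 3}, {3, 4, 6}]
    print(disjointify(A, cover))
```

Each piece is contained in the corresponding cover set, the pieces are pairwise disjoint, and their union is A whenever the cover contains A. These are the three facts the proof uses.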

Example 5.6.5

The outer measure generated by the premeasure µ0 defined in Example 5.6.2 has values

µ∗({∅}^c ∩ ∅) is not needed here; the values are µ∗(∅) = 0, µ∗({1}) = 2, µ∗({2}) = µ∗({3}) = 3, µ∗({1, 2}) = µ∗({1, 3}) = 5, µ∗({2, 3}) = 3, µ∗({1, 2, 3}) = 5.

All the sets in A are µ∗-measurable. The set {2} is not measurable since µ∗({2, 3}) = 3 while µ∗({2} ∩ {2, 3}) + µ∗({1, 3} ∩ {2, 3}) = 3 + 3 = 6.

Example 5.6.6

Even though the set function ρ defined in Example 5.6.4 is not a premeasure, we can still use it to generate an outer measure since X ∈ A. This has values µ∗(∅) = 0, µ∗({1}) = 1, µ∗({2}) = 1, µ∗({3}) = 3, µ∗({1, 2}) = 1, µ∗({2, 3}) = µ∗({1, 3}) = 3, µ∗({1, 2, 3}) = 3.

Checking case by case shows that the only µ∗-measurable sets are ∅ and X. For example, if we try A = {1} and choose E = {1, 2, 3}, then µ∗(E) = 3 while

µ∗ (A ∩ E) + µ∗ (Ac ∩ E) = µ∗ ({1}) + µ∗ ({2, 3}) = 1 + 3 = 4.
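The case-by-case checks in Examples 5.6.5 and 5.6.6 can be reproduced by brute force. The sketch below computes the outer measure (5.8) on a finite set by minimizing over finite subfamilies of the domain, which suffices here because the values are nonnegative and repeating a set never lowers the sum. It then tests Carathéodory's Condition against every test set E. All function names are ours.

```python
from itertools import chain, combinations


def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = list(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def outer_measure(rho, E):
    """Minimum of sum(rho[A_i]) over subfamilies of the domain of rho that cover E."""
    E = frozenset(E)
    best = float("inf")
    domain = list(rho)
    for r in range(1, len(domain) + 1):
        for family in combinations(domain, r):
            if E <= frozenset().union(*family):
                best = min(best, sum(rho[A] for A in family))
    return best


def measurable_sets(X, rho):
    """Subsets A of X with mu*(E) = mu*(E ∩ A) + mu*(E ∩ A^c) for every E ⊂ X."""
    subsets = powerset(X)
    mu = {E: outer_measure(rho, E) for E in subsets}
    return [A for A in subsets
            if all(mu[E] == mu[E & A] + mu[E - A] for E in subsets)]


X = frozenset({1, 2, 3})
mu0 = {frozenset(): 0, frozenset({1}): 2, frozenset({2, 3}): 3, X: 5}      # Example 5.6.2
rho = {frozenset(): 0, frozenset({1}): 2, frozenset({1, 2}): 1, X: 3}      # Example 5.6.4

if __name__ == "__main__":
    print(measurable_sets(X, mu0))
    print(measurable_sets(X, rho))
```

The brute-force search is exponential in the size of the domain, so it is only meant for toy examples like these.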

Now we are ready to construct a measure by starting with a premeasure on an algebra, inducing an outer measure, and then restricting the outer measure to get a measure.

Theorem 5.6.3: Hahn-Kolmogorov Extension

Let A ⊂ PX be an algebra and µ0 be a σ−finite premeasure defined on A . There exists a unique measure µ on σ (A ) whose restriction to A is µ0.

Definition 5.6.3


We say that the measure space (X, σ (A ) , µ) is induced by the premeasure µ0 on A . Likewise, µ is induced by µ0.

As with Theorem 5.4.4, Theorem 5.6.3 is most valuable when σ (A ) is a “rich” σ- algebra. This is the second major theorem we prove.

Proof. Existence of the extension We have done all the hard work here. We begin by extending the premeasure $\mu_0$ on $\mathcal{A}$ to the outer measure $\mu^*$ on all of $\mathcal{P}_X$ using Theorem 5.4.1 and Theorem 5.6.2. Then, we restrict $\mu^*$ to obtain a measure $\mu$ on the σ-algebra $\mathcal{M}$ of $\mu^*$-measurable sets using Theorem 5.4.4. Since $\mathcal{M}$ contains $\mathcal{A}$, it contains $\sigma(\mathcal{A})$. Thus, we have obtained a measure $\mu$ on $\sigma(\mathcal{A})$ such that $\mu = \mu^*|_{\sigma(\mathcal{A})}$ and $\mu|_{\mathcal{A}} = \mu^*|_{\mathcal{A}} = \mu_0$.

Uniqueness Assume that $\nu$ is another measure on $\sigma(\mathcal{A})$ that extends $\mu_0$. We begin by showing that $\nu(A) \le \mu(A)$ for all $A \in \sigma(\mathcal{A})$. Then, we show that $\mu(A) \le \nu(A)$ for all $A \in \sigma(\mathcal{A})$. This shows that $\nu = \mu$. The idea of showing that $\nu = \mu$ by showing two inequalities on their values is suggested by the definition of outer measure, which involves taking an infimum over countable covers.

First, note that $\nu|_{\mathcal{A}} = \mu|_{\mathcal{A}} = \mu_0$, so for $A \in \mathcal{A}$, $\nu(A) = \mu(A) = \mu_0(A)$. However, if $\{A_i\}_{i=1}^\infty$ is a collection of sets in $\mathcal{A}$, then we know that $A = \bigcup_{i=1}^\infty A_i$ is in $\sigma(\mathcal{A})$, but we do not know if $A$ is in $\mathcal{A}$. So, it is not immediately clear that $\nu(A) = \mu(A)$. To prove this is true, we use the fact that $\left\{\bigcup_{i=1}^m A_i\right\}_{m=1}^\infty$ is an increasing sequence of sets in $\mathcal{A}$. Since $\nu$ and $\mu$ are measures, we use monotonicity (Theorem 5.2.3) to conclude,

\[ \nu(A) = \lim_{m\to\infty} \nu\left(\bigcup_{i=1}^m A_i\right) = \lim_{m\to\infty} \mu\left(\bigcup_{i=1}^m A_i\right) = \mu(A). \]

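Now choose $A \in \sigma(\mathcal{A})$ and $\{A_i\}_{i=1}^\infty \subset \mathcal{A}$ with $A \subset \bigcup_{i=1}^\infty A_i$. Then,

\[ \nu(A) \le \nu\left(\bigcup_{i=1}^\infty A_i\right) \le \sum_{i=1}^\infty \nu(A_i) = \sum_{i=1}^\infty \mu_0(A_i). \]

This says that $\nu$ acts like an outer measure on $\mathcal{A}$. In particular, this means that,

\[ \nu(A) \le \inf\left\{ \sum_{i=1}^\infty \mu_0(A_i) : A \subset \bigcup_{i=1}^\infty A_i \right\} = \mu(A), \]

since $A \in \sigma(\mathcal{A})$.

Next, we show that if $A \in \sigma(\mathcal{A})$ and in addition $\mu(A) < \infty$, then $\nu(A) = \mu(A)$. For $\varepsilon > 0$, choose $\{A_i\}_{i=1}^\infty \subset \mathcal{A}$ with $A \subset \bigcup_{i=1}^\infty A_i$ so that $\sum_{i=1}^\infty \mu_0(A_i) < \mu(A) + \varepsilon$. Thus, $\mu\left(\bigcup_{i=1}^\infty A_i \setminus A\right) < \varepsilon$ since $\mu(A) < \infty$. This implies

\[ \mu(A) \le \mu\left(\bigcup_{i=1}^\infty A_i\right) = \nu\left(\bigcup_{i=1}^\infty A_i\right) = \nu(A) + \nu\left(\bigcup_{i=1}^\infty A_i \setminus A\right) \le \nu(A) + \mu\left(\bigcup_{i=1}^\infty A_i \setminus A\right) \le \nu(A) + \varepsilon. \]

Since $\varepsilon$ is arbitrary, $\mu(A) \le \nu(A)$ and therefore $\nu(A) = \mu(A)$.

Finally, we use the assumption that $\mu_0$ is σ-finite, so that $X = \bigcup_{i=1}^\infty A_i$ with $A_i \in \mathcal{A}$ and $\mu_0(A_i) < \infty$ for all $i$. We can assume that $\{A_i\}_{i=1}^\infty$ is disjoint (by the argument that should now be familiar). For any $A \in \sigma(\mathcal{A})$, $A$ is given by the disjoint union $A = \bigcup_{i=1}^\infty (A \cap A_i)$. Hence,

\[ \mu(A) = \sum_{i=1}^\infty \mu(A \cap A_i) = \sum_{i=1}^\infty \nu(A \cap A_i) = \nu(A). \]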

Remark 5.6.1

Note how we approach uniqueness by assuming the existence of another object, then showing it has to be the same as the original object.

5.7 Approximation of measures

The process for creating a measure on a σ-algebra of a domain X begins by specifying a premeasure on an algebra, constructing an outer measure on the power set of X, then using Carathéodory’s Theorem to obtain a measure on a σ-algebra. It is a complicated process that is required in order to define a consistent process of measuring the sizes of complex sets based on the measure of simple sets.

An important issue that we have not addressed is the relation of the original premeasure on simple sets with which we start to the measure on complex sets that we obtain. Of course, Theorem 5.6.3 states that the measure and premeasure agree on sets in the original algebra. The interesting question is the relation between the two on sets in the larger σ-algebra. The following approximation result states that the measure of a set in the σ-algebra can be approximated using premeasures of sets in the original algebra. Results like this underlie numerical computation of probabilities.

Theorem 5.7.1: Approximation of Measures

Let $(X, \mathcal{M}, \mu)$ be a measure space induced by the σ-finite premeasure $\mu_0$ on the algebra $\mathcal{A} \subset \mathcal{M} = \sigma(\mathcal{A})$. Let $B \in \mathcal{M}$ be a measurable set. For any $\varepsilon > 0$, there is a set $A \in \mathcal{A}$ such that

\[ |\mu(B) - \mu_0(A)| \le \mu(B \triangle A) < \varepsilon \tag{5.9} \]

if $\mu(B) < \infty$, and

\[ \mu(B \triangle A) < \varepsilon \tag{5.10} \]

if $\mu(B) = \infty$.

This kind of approximation result is interesting only in the case of domains X of infinite cardinality, since the approximations become exact after some point in the finite case. This is the third major theorem we prove.

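Proof. We first consider the case that $\mu(B)$ is finite and extend the result to the general case afterwards. By (5.8) and the definition of infimum, given $\varepsilon > 0$, there is a collection $\{A_i\}_{i=1}^\infty \subset \mathcal{A}$ with $B \subset \bigcup_{i=1}^\infty A_i$ and

\[ \sum_{i=1}^\infty \mu(A_i) - \frac{\varepsilon}{2} = \sum_{i=1}^\infty \mu_0(A_i) - \frac{\varepsilon}{2} < \mu(B) \le \sum_{i=1}^\infty \mu_0(A_i) = \sum_{i=1}^\infty \mu(A_i). \]

It follows that

\[ \mu\left(\bigcup_{i=1}^\infty A_i \setminus B\right) < \frac{\varepsilon}{2}. \]

Since $\left\{\bigcup_{i=1}^j A_i\right\}_{j=1}^\infty$ is an increasing sequence of sets, there is an $m$ such that

\[ \mu\left(\bigcup_{i=1}^\infty A_i \setminus \bigcup_{i=1}^m A_i\right) < \frac{\varepsilon}{2}. \]

We set $A = \bigcup_{i=1}^m A_i$ to be the desired approximation. We have,

\[ B \triangle A = (B \setminus A) \cup (A \setminus B) = \left(\left(\bigcup_{i=1}^\infty A_i \setminus A\right) \cap B\right) \cup \left(\left(\bigcup_{i=1}^\infty A_i \setminus B\right) \cap A\right), \]

where the union is disjoint. Thus,

\[ \mu(B \triangle A) = \mu\left(\left(\bigcup_{i=1}^\infty A_i \setminus A\right) \cap B\right) + \mu\left(\left(\bigcup_{i=1}^\infty A_i \setminus B\right) \cap A\right) \le \mu\left(\bigcup_{i=1}^\infty A_i \setminus A\right) + \mu\left(\bigcup_{i=1}^\infty A_i \setminus B\right) < \varepsilon. \]

Now $B = (B \cap A) \cup (B \setminus A)$ and $A = (B \cap A) \cup (A \setminus B)$. Hence, $\mu(A) - \mu(B) \le \mu(A \setminus B) - \mu(B \setminus A)$ and $\mu(B) - \mu(A) \le \mu(B \setminus A) - \mu(A \setminus B)$. Thus,

\[ |\mu(B) - \mu(A)| = |\mu(B) - \mu_0(A)| \le \mu(B \triangle A) = \mu(B \setminus A) + \mu(A \setminus B). \]

This completes the claim.

Now assume that $\mu$ is σ-finite. Let $\{C_i\}_{i=1}^\infty$ be a disjoint collection of sets in $\mathcal{M}$ with $X = \bigcup_{i=1}^\infty C_i$ and $\mu(C_i) < \infty$ for all $i$. Given $B \in \mathcal{M}$, we write the disjoint union $B = \bigcup_{i=1}^\infty (B \cap C_i)$. As above, given $\varepsilon > 0$, there is an $m$ such that $\mu\left(B \setminus \bigcup_{i=1}^m C_i\right) < \frac{\varepsilon}{2}$. We set $C = \bigcup_{i=1}^m C_i$. Since $\mu(C) < \infty$, there is an $A \in \mathcal{A}$ such that $\mu(C \triangle A) < \frac{\varepsilon}{2}$.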


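We have

\[ \mu(B \triangle A) = \mu(B \setminus A) + \mu(A \setminus B) = \mu((B \setminus C) \setminus A) + \mu(C \setminus A) + \mu(A \setminus B) \le \mu(B \setminus C) + \mu(C \setminus A) + \mu(A \setminus C) < \varepsilon. \]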

Remark 5.7.1

The practical use of Theorem 5.7.1 to approximate the measure of a given set depends on the ability to produce collections of sets that in the limit yield the infimum defining the outer measure induced by the given premeasure.
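To make Theorem 5.7.1 concrete, consider a hypothetical example that is not in the text: with length as the premeasure on half-open intervals, let B be the union of the disjoint intervals (2^-i, 2^-i + 4^-i] for i ≥ 1, so that µ(B) = 1/3. Taking A_m to be the union of the first m intervals gives a set in the algebra with A_m ⊂ B, so µ(B △ A_m) = µ(B) − µ0(A_m) = 4^-m/3. The Python sketch below finds the smallest m that meets a given tolerance, using exact rational arithmetic. The function names are ours.

```python
from fractions import Fraction


def interval(i):
    """The i-th half-open interval (2^-i, 2^-i + 4^-i], returned as (left, right)."""
    left = Fraction(1, 2 ** i)
    return left, left + Fraction(1, 4 ** i)


def approximating_set(eps, total=Fraction(1, 3)):
    """Smallest m with mu(B) - mu0(A_m) < eps, where A_m is the union of the first m intervals.

    Returns m together with mu0(A_m). `total` is mu(B), known in closed form for this example.
    """
    m, covered = 0, Fraction(0)
    while total - covered >= eps:
        m += 1
        left, right = interval(m)
        covered += right - left
    return m, covered


if __name__ == "__main__":
    for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 10 ** 6)):
        print(eps, approximating_set(eps))
```

The example is easy because the countable cover achieving the infimum in (5.8) is known explicitly, which is exactly the practical caveat raised in Remark 5.7.1.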

5.8 Zoology of measure creatures

We tabulate the properties of all the kinds of measures we have seen in Table 5.1.

Table 5.1. Characteristics of measure, premeasure and outer-measure

Properties                  premeasure µ0   outer-measure µ∗   measure µ
domain                      algebra         PX                 σ-algebra
monotonicity                ✓               ✓                  ✓
finite-sub-additivity       ✓               ✓                  ✓
finite-additivity           ✓               -                  ✓
countable-sub-additivity    -               ✓                  ✓
countable-additivity        -               -                  ✓

5.9 References

5.10 Worked problems

The ideas underlying what makes a set µ∗-measurable can be the source of much confusion even after years of study. We are particularly interested in probability spaces (where the measure of the space is one) and finite measure spaces in general (e.g., when considering spatial domains commonly encountered with partial differential equations), so we end this chapter by considering two problems that provide other characterizations of µ∗-measurable sets when the measure is finite. These characterizations are useful both conceptually and practically. A useful exercise for the reader is to consider whether these problems can be generalized to cases where µ0 is σ-finite.

It is useful to review Carathéodory’s Condition (Definition 5.4.4) for defining µ∗-measurable sets before attempting the first problem below. Here, we simply note that a conceptual interpretation of this Condition is that if A is a µ∗-measurable set, then A and Ac “partition X nicely” in the sense that µ∗ behaves additively instead of sub-additively on E = (E ∩ A) ∪ (E ∩ Ac) for any E ⊂ X. This is theoretically convenient for proving µ∗ is in fact a measure on the σ-algebra of µ∗-measurable sets since µ∗ is only assumed to be sub-additive in general.

The problem below provides an equivalent condition on µ∗-measurable sets when the outer measure µ∗ is defined by a finite premeasure µ0. Note that this condition only involves the inner and outer approximation by sets in the algebra A on which the premeasure µ0 is defined. In other words, rather than use a condition on µ∗-measurable sets that requires testing the µ∗ measure of every E ⊂ X (Carathéodory’s Condition), this condition is defined entirely in terms of how well a set can itself be approximated in µ∗ measure. It is useful to compare this to the ideas of Jordan measurable sets and the approximation of measures by symmetric differences discussed in the chapter. The differences are subtle but important.

Problem 5.10.1

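Let $\mathcal{A}$ be an algebra on $X$, $\mu_0$ be a finite premeasure on $\mathcal{A}$, and $\mu^*$ be the outer measure induced by $\mu_0$ as in (5.8). A set $A$ is $\mu^*$-measurable if and only if there exist two countable families of sets $\{\{A_{i,j}\}_{j=1}^\infty\}_{i=1}^\infty, \{\{B_{i,j}\}_{j=1}^\infty\}_{i=1}^\infty \subset \mathcal{A}$ defining $A^{\mathrm{inner}} = \bigcup_{i=1}^\infty \bigcap_{j=1}^\infty A_{i,j} \in \sigma(\mathcal{A})$ and $A^{\mathrm{outer}} = \bigcap_{i=1}^\infty \bigcup_{j=1}^\infty B_{i,j} \in \sigma(\mathcal{A})$ such that $A^{\mathrm{inner}} \subset A \subset A^{\mathrm{outer}}$ and $\mu^*(A^{\mathrm{outer}} \setminus A^{\mathrm{inner}}) = 0$.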

The problem below provides perhaps the simplest characterization of sets that are µ∗-measurable when µ∗ is defined by a finite premeasure µ0.

Problem 5.10.2

Let $\mathcal{A}$ be an algebra on $X$, $\mu_0$ be a finite premeasure on $\mathcal{A}$, and $\mu^*$ be the outer measure induced by $\mu_0$ as in (5.8). A set $A$ is $\mu^*$-measurable if and only if

\[ \mu^*(A) + \mu^*(A^c) = \mu_0(X). \]

Hint: The proof of the previous problem greatly simplifies the reverse direction of this proof. The forward direction of this proof is almost trivial.
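The criterion of Problem 5.10.2 is easy to test on the finite premeasure of Example 5.6.2. The sketch below is a sanity check, not a proof. It uses the observation that, for a premeasure on a finite algebra, the infimum in (5.8) is attained by the smallest member of the algebra containing the set, a shortcut we adopt for this illustration. The names `hull`, `outer_measure`, and `satisfies_criterion` are ours.

```python
X = frozenset({1, 2, 3})
mu0 = {frozenset(): 0, frozenset({1}): 2, frozenset({2, 3}): 3, X: 5}  # Example 5.6.2


def hull(domain, E):
    """Smallest member of the finite algebra `domain` that contains E."""
    result = None
    for A in domain:
        if E <= A:
            result = A if result is None else result & A
    return result


def outer_measure(mu0, E):
    """Outer measure of E, computed as mu0 of the smallest algebra set containing E."""
    return mu0[hull(mu0, frozenset(E))]


def satisfies_criterion(X, mu0, A):
    """Test whether mu*(A) + mu*(A^c) equals mu0(X)."""
    A = frozenset(A)
    return outer_measure(mu0, A) + outer_measure(mu0, frozenset(X) - A) == mu0[frozenset(X)]


if __name__ == "__main__":
    for A in ([], [1], [2], [3], [1, 2], [1, 3], [2, 3], [1, 2, 3]):
        print(A, satisfies_criterion(X, mu0, A))
```

For this example, the sets passing the test are exactly the members of the algebra, matching the µ∗-measurable sets identified in Example 5.6.5.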


Chapter 6 Measure Structure in Euclidean Space

The strength of mathematics multiplies, like the giant Antaeus, when it makes contact with reality, the ground upon which it was grown.

C. Carathéodory

The introduction of numbers as coordinates is an act of violence.

H. Weyl

... we shall content ourselves with constructing [measure] on a class of subsets of $\mathbb{R}^n$ that includes all the sets one is likely to meet in practice unless one is deliberately searching for pathological examples.

G. Folland

I remember one occasion when I tried to add a little seasoning to a review, but I wasn’t allowed to. The paper was by Dorothy Maharam, and it was a perfectly sound contribution to abstract measure theory. The domains of the underlying measures were not sets but elements of more general Boolean algebras, and their range consisted not of positive numbers but of certain abstract equivalence classes. My proposed first sentence was: “The author discusses valueless measures in pointless spaces.”

P. Halmos

Panorama

In this chapter, we apply the general measure theory of Chapter 5 to construct an important class of measures on Euclidean space $\mathbb{R}^n$, n ≥ 1. $\mathbb{R}^n$ is of course the canonical example of a metric space. But, it has some special properties in comparison to a general metric space, and this impacts the construction and the properties of measures.

We begin by constructing the default σ-algebra based on the open sets defined by the metric. The key result is that we can generate this σ-algebra by using set operations on a number of different kinds of elementary sets.

The measures we construct generalize the idea of length of an interval to allow for variations in how length is measured. When we measure length using a ruler, we exploit the “directional” or “order” property imposed on the real line, e.g., measuring length of an interval by subtracting the left coordinate position from the right coordinate position of the interval endpoints. This works because position is monotone increasing as we move rightwards from a given point. A standard ruler, of course, uses a homogeneous division of markings. However, we can use any monotone arrangement of marks to measure length; see Figure 6.1. One important historical example is the slide rule, which is based on a logarithmic scaling that relates addition to multiplication; see Figure 6.2. We encounter other important examples in probability later.

Figure 6.1. Rulers with homogeneous and nonhomogeneous scales.

Figure 6.2. An AcuDesign Acumath 400 slide rule.

Because direction plays such a critical role, we first develop the theory for $\mathbb{R}^1$ before moving to $\mathbb{R}^n$. The idea is the same for $\mathbb{R}^n$, but geometry complicates matters significantly.

We also develop some of the basic properties of measure on $\mathbb{R}^n$. The first result shows that the measure of sets can be computed by computing limits of measures of sets of particular types, e.g., closed or compact sets. This is not so useful for computation, but is useful for proving additional properties. The second result says that the particular measure that corresponds to the familiar length, area, and volume is invariant under some special maps such as translation, and these invariance properties distinguish that measure.

The default measure theory on $\mathbb{R}^n$ is closely related to a fundamental property about the approximation of open sets by simple sets. We open the chapter by giving an approximation result based on cubes and we close the chapter by giving the analogous result using open balls.

6.1 Approximation of open sets

In Chapter 4, we based the intuitive development of measure in $\mathbb{R}$ on the length of an interval and showed that this leads to a useful concept that can measure the sizes of complicated sets. But, we did not address the general usefulness of this approach. In other words, can we extend these ideas to handle other kinds of sets in $\mathbb{R}$, such as open sets?

We begin by discussing a fundamental result about approximation of open sets in $\mathbb{R}^n$ by countable numbers of set operations applied to a collection of “simple” sets, where simple refers to both geometry and the simplicity of defining the “length”, “area”, or “volume”, i.e., measure. Such results underlie measure theory in $\mathbb{R}^n$ and have practical importance in numerical computation.

There are various choices for the collection of simple sets. In this section, we present a classic result about “rectangular” sets. Later, we present an analogous result using “balls”.

Definition 6.1.1

Let $a = (a_1, a_2, \dots, a_n)^\top$ and $b = (b_1, b_2, \dots, b_n)^\top$ be points in $\mathbb{R}^n$ for $n \ge 1$ with the usual metric, such that $a_i \le b_i$ for $1 \le i \le n$. A (generalized) closed rectangle is a set of the form,

\[ \{x \in \mathbb{R}^n : a_i \le x_i \le b_i,\ 1 \le i \le n\} = [a_1, b_1] \times \cdots \times [a_n, b_n]. \]

We denote the collection of all generalized closed rectangles by $\mathcal{R}^c$. A (generalized) open rectangle is a set of the form,

\[ \{x \in \mathbb{R}^n : a_i < x_i < b_i,\ 1 \le i \le n\} = (a_1, b_1) \times \cdots \times (a_n, b_n). \]

We denote the collection of all generalized open rectangles by $\mathcal{R}^o$. A (generalized) right half-closed rectangle is a set of the form,

\[ \{x \in \mathbb{R}^n : a_i < x_i \le b_i,\ 1 \le i \le n\} = (a_1, b_1] \times \cdots \times (a_n, b_n]. \]

We denote the collection of all generalized right half-closed rectangles by $\mathcal{R}^{rc}$. A (generalized) left half-closed rectangle is a set of the form,

\[ \{x \in \mathbb{R}^n : a_i \le x_i < b_i,\ 1 \le i \le n\} = [a_1, b_1) \times \cdots \times [a_n, b_n). \]

We denote the collection of all generalized left half-closed rectangles by $\mathcal{R}^{lc}$. A (generalized) open or closed or half-open cube is an open or closed or half-open rectangle with $b_i - a_i = b_j - a_j$ for $1 \le i, j \le n$. A face of the generalized rectangle $[a_1, b_1] \times \cdots \times [a_n, b_n]$ or any of the open or half-closed variations is a set

\[ \{x \in \mathbb{R}^n : x_j = a_j \text{ or } b_j \text{ for some } 1 \le j \le n \text{ and } a_i \le x_i \le b_i,\ 1 \le i \le n,\ i \ne j\}. \]

The approximation result is,

Theorem 6.1.1: Approximation of Open Sets by Cubes

Every open set G in Rn is the union of a countable disjoint collection of half-open cubes.

See Figure 6.3 and consider the earlier Figure 4.1.

Proof. Let $\mathcal{C}_j$ be the countable family of half-open “dyadic” cubes of the form $(i_1 2^{-j}, (i_1 + 1) 2^{-j}] \times \cdots \times (i_n 2^{-j}, (i_n + 1) 2^{-j}]$, $i_1, \dots, i_n \in \mathbb{Z}$, that is, the cubes whose vertices lie on the rectangular lattice of points with spacing $2^{-j}$, and let $\mathcal{C} = \bigcup_{j=1}^\infty \mathcal{C}_j$. $\mathcal{C}$ is a countable collection.

Each point $x \in \mathbb{R}^n$ lies in one and only one member of $\mathcal{C}_j$ for each $j$. If $x \in G$, then $x$ is contained in some open ball of positive radius contained in $G$. It follows (exercise) that $x$ is contained in a half-open cube contained in the ball, and thus in $G$, that belongs to $\mathcal{C}_k$ for some $k$ sufficiently large. This shows that $G$ is contained in the union of all half-open cubes in $\mathcal{C}$ that are contained in $G$.

Note that if $A \in \mathcal{C}_j$ and $B \in \mathcal{C}_k$ with $j < k$, then either $B \subset A$ or $A \cap B = \emptyset$. Let $\tilde{\mathcal{C}}$ denote the collection of half-open cubes in $\mathcal{C}$ that lie in $G$. We construct a disjoint collection iteratively. Let $\hat{\mathcal{C}}_1$ be the cubes in $\tilde{\mathcal{C}} \cap \mathcal{C}_1$. Let $\hat{\mathcal{C}}_2$ be the cubes in $\tilde{\mathcal{C}} \cap \mathcal{C}_2$ that do not lie in any cube in $\hat{\mathcal{C}}_1$. Continuing, let $\hat{\mathcal{C}}_i$ be the cubes in $\tilde{\mathcal{C}} \cap \mathcal{C}_i$ that do not lie in any cube in $\hat{\mathcal{C}}_1 \cup \cdots \cup \hat{\mathcal{C}}_{i-1}$. We illustrate in Fig. 6.3. Finally, $\hat{\mathcal{C}} = \bigcup_{i=1}^\infty \hat{\mathcal{C}}_i$ is the desired disjoint countable collection of cubes.

Figure 6.3. Illustration of the construction in the proof of Theorem 6.1.1. We show the approximation by non-overlapping dyadic cubes constructed from $\mathcal{C}_j$ for j = 1, 2, 3, 4, 5.
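The selection procedure in the proof can be run directly in one dimension, where the open set is an interval (a, b) and the dyadic cubes are intervals (i/2^j, (i+1)/2^j]. The Python sketch below is an illustration under those assumptions, with names of our choosing. It uses the nesting property from the proof: a dyadic interval lies in an earlier selected interval exactly when its parent at the previous level is contained in (a, b).

```python
from fractions import Fraction
from math import ceil


def contained(i, j, a, b):
    """True if the dyadic interval (i/2^j, (i+1)/2^j] lies inside the open interval (a, b)."""
    return a <= Fraction(i, 2 ** j) and Fraction(i + 1, 2 ** j) < b


def dyadic_decomposition(a, b, max_level):
    """Disjoint dyadic intervals, as pairs (i, j), selected from levels 1..max_level."""
    selected = []
    for j in range(1, max_level + 1):
        i = ceil(a * 2 ** j)  # first index with i/2^j >= a
        while Fraction(i + 1, 2 ** j) < b:
            # Keep the interval only if its parent at level j-1 is not inside (a, b);
            # level 1 is the coarsest level, so its intervals have no parent.
            if j == 1 or not contained(i // 2, j - 1, a, b):
                selected.append((i, j))
            i += 1
    return selected


def total_length(selected):
    """Sum of the lengths 2^-j of the selected intervals."""
    return sum(Fraction(1, 2 ** j) for _, j in selected)


if __name__ == "__main__":
    a, b = Fraction(0), Fraction(1, 3)
    for level in range(1, 9):
        print(level, total_length(dyadic_decomposition(a, b, level)))
```

Truncating at a finite level leaves part of (a, b) uncovered; the total length of the selected intervals increases toward b − a as the level grows, which is the countable union asserted by the theorem.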

6.2 Generating the σ-algebra

We start the construction of a measure space by constructing the σ-algebra. The standard σ-algebra on $\mathbb{R}^n$ is constructed using the metric space topology. We recall some definitions from the (optional) Sec. 5.5 applied to $\mathbb{R}^n$.

Definition 6.2.1: Borel σ-algebra

The σ-algebra generated by the open sets of $\mathbb{R}^n$ with the standard metric is called the Borel σ-algebra and is denoted $\mathcal{B}_{\mathbb{R}^n} = \mathcal{B}$. The members of $\mathcal{B}_{\mathbb{R}^n}$ are called Borel sets. A Borel measure on a metric space is a measure whose domain consists of the Borel sets.

The Borel sets include countable unions and intersections of open and closed (through complements) sets, and countable unions and intersections of those sets, and so on. Recall that a countable union of open sets is open, but a countable intersection of open sets may not be open. Likewise, countable intersections of closed sets are closed, but countable unions of closed sets may not be closed. This motivates:

Definition 6.2.2

A countable intersection of open sets is called a Gδ-set, a countable union of closed sets is called an Fσ-set, a countable intersection of Gδ-sets is called a Gδδ-set, a countable union of Gδ-sets is called a Gδσ-set, a countable intersection of Fσ-sets is called an Fσδ-set, and so on.

As noted earlier, taking countable unions, intersections and complements starting with open sets generates a rich class of sets but does not generate all the members of $\mathcal{B}_{\mathbb{R}^n}$. The σ-algebra we use on $\mathbb{R}^n$ is the closure (completion) of the Borel σ-algebra.

However, a significant issue in defining a measure space by beginning with the open sets of a metric space is that the open sets are a large collection on which to define an outer measure. It is preferable to start with a collection of sets that have much simpler geometry. For $\mathbb{R}^n$, it turns out that we can use a simpler collection of sets. A hint of this fact is provided by Theorem 6.1.1. In addition to the sets in Def. 5.3.2c, we define

Definition 6.2.3

We denote the collections of open, closed and compact sets in Rn by G , F , and K respectively.

The following result shows that we can generate $\mathcal{B}_{\mathbb{R}^n}$ using a variety of sets.

Theorem 6.2.1: Generation of Borel σ- algebras in Rn

We have,

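1. $\mathcal{B}_{\mathbb{R}^n} = \sigma(\mathcal{G})$.
2. $\mathcal{B}_{\mathbb{R}^n} = \sigma(\mathcal{F})$.
3. $\mathcal{B}_{\mathbb{R}^n} = \sigma(\mathcal{K})$.
4. $\mathcal{B}_{\mathbb{R}^n} = \sigma(\mathcal{R}^{rc})$.
5. $\mathcal{B}_{\mathbb{R}^n} = \sigma(\mathcal{R}^{o})$.
6. $\mathcal{B}_{\mathbb{R}^n} = \sigma(\mathcal{R}^{lc})$.
7. $\mathcal{B}_{\mathbb{R}^n} = \sigma(\mathcal{R}^{c})$.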

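Proof. Result 1 This is just the definition.

Result 2 For any open set $G \in \mathcal{G}$, $G^c \in \mathcal{F}$. Thus, $G \in \sigma(\mathcal{F})$, which implies that $\mathcal{G} \subset \sigma(\mathcal{F})$ and thus $\sigma(\mathcal{G}) \subset \sigma(\mathcal{F})$. A similar argument with roles reversed leads to $\sigma(\mathcal{F}) \subset \sigma(\mathcal{G})$, which shows 2.

Result 3 Any compact set $K$ is also closed, so $\sigma(\mathcal{K}) \subset \sigma(\mathcal{F})$. We write any closed set as a countable union of compact sets in $\mathbb{R}^n$. For $F \in \mathcal{F}$, define $F_i = F \cap \overline{B_i(0)}$ for $i \in \mathbb{Z}^+$, where $\overline{B_i(0)}$ is the closed ball of radius $i$ centered at the origin 0. Each $F_i$ is an intersection of closed sets and is bounded, and hence is compact in $\mathbb{R}^n$. Moreover, $F = \bigcup_{i=1}^\infty F_i$. Thus, $F \in \sigma(\mathcal{K})$ and so $\sigma(\mathcal{F}) \subset \sigma(\mathcal{K})$. This shows 3.

Result 4 Theorem 6.1.1 implies that any open set $G$ can be written as the countable union of right half-closed rectangles. This implies that $\mathcal{G} \subset \sigma(\mathcal{R}^{rc})$, so $\sigma(\mathcal{G}) \subset \sigma(\mathcal{R}^{rc})$. On the other hand,

\[ (a_1, b_1] \times \cdots \times (a_n, b_n] = \bigcap_{j \in \mathbb{Z}^+} (a_1, b_1 + 1/j) \times \cdots \times (a_n, b_n + 1/j). \tag{6.1} \]

Thus, $\mathcal{R}^{rc} \subset \sigma(\mathcal{R}^{o}) \subset \sigma(\mathcal{G})$. This implies $\sigma(\mathcal{R}^{rc}) \subset \sigma(\mathcal{G})$, showing 4.

Result 5 Since $\mathcal{R}^{o} \subset \mathcal{G}$, $\sigma(\mathcal{R}^{o}) \subset \sigma(\mathcal{G})$. On the other hand, (6.1) shows that $\sigma(\mathcal{G}) = \sigma(\mathcal{R}^{rc}) \subset \sigma(\mathcal{R}^{o})$. This shows 5.

Result 6 $\mathcal{R}^{lc}$ is obtained by taking complements of sets in $\mathcal{R}^{rc}$, so $\sigma(\mathcal{R}^{lc}) = \sigma(\mathcal{R}^{rc})$.

Result 7 A closed rectangle is a member of $\mathcal{F}$, so $\sigma(\mathcal{R}^{c}) \subset \sigma(\mathcal{F})$. On the other hand,

\[ (a_1, b_1] \times \cdots \times (a_n, b_n] = \bigcup_{j \in \mathbb{Z}^+} [a_1 + 1/j, b_1] \times \cdots \times [a_n + 1/j, b_n]. \]

Hence, $\sigma(\mathcal{F}) = \sigma(\mathcal{G}) = \sigma(\mathcal{R}^{rc}) \subset \sigma(\mathcal{R}^{c})$. This shows 7.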

For clarity, we restate these ideas in the one dimensional case. We also add a few more types of generating sets.

Definition 6.2.4

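In one dimension,

$\mathcal{R}^{c}$ is the set of closed intervals $\{[a, b],\ a, b \in \mathbb{R}\}$.
$\mathcal{R}^{o}$ is the set of open intervals $\{(a, b),\ a, b \in \mathbb{R}\}$.
$\mathcal{R}^{rc}$ is the set of right half-closed intervals $\{(a, b],\ a, b \in \mathbb{R}\}$.
$\mathcal{R}^{lc}$ is the set of left half-closed intervals $\{[a, b),\ a, b \in \mathbb{R}\}$.
$\mathcal{R}^{lo}_\infty$ is the set of left half-open rays $\{(a, \infty),\ a \in \mathbb{R}\}$.
$\mathcal{R}^{lc}_\infty$ is the set of left half-closed rays $\{[a, \infty),\ a \in \mathbb{R}\}$.
$\mathcal{R}^{ro}_\infty$ is the set of right half-open rays $\{(-\infty, b),\ b \in \mathbb{R}\}$.
$\mathcal{R}^{rc}_\infty$ is the set of right half-closed rays $\{(-\infty, b],\ b \in \mathbb{R}\}$.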

Theorem 6.2.2: Generation of Borel σ- algebra in R

In one dimension, we have

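1. $\mathcal{B}_{\mathbb{R}} = \sigma(\mathcal{R}^{rc})$.
2. $\mathcal{B}_{\mathbb{R}} = \sigma(\mathcal{R}^{o})$.
3. $\mathcal{B}_{\mathbb{R}} = \sigma(\mathcal{R}^{lc})$.
4. $\mathcal{B}_{\mathbb{R}} = \sigma(\mathcal{R}^{c})$.
5. $\mathcal{B}_{\mathbb{R}} = \sigma(\mathcal{R}^{lo}_\infty)$.
6. $\mathcal{B}_{\mathbb{R}} = \sigma(\mathcal{R}^{lc}_\infty)$.
7. $\mathcal{B}_{\mathbb{R}} = \sigma(\mathcal{R}^{ro}_\infty)$.
8. $\mathcal{B}_{\mathbb{R}} = \sigma(\mathcal{R}^{rc}_\infty)$.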

Proof. Results 1.-4. are proved above. It is a good exercise to prove Results 5.-8. It is possible to use an analog of (6.1) applied to each type of ray.

We can choose any of the generating sets in Theorems 6.2.1 and 6.2.2 to build the complete measure structure. However, it is useful in the context of probability measures to generate the σ- algebra using the generalized right half-closed rectangles or right half-closed rays.


6.3 Borel, Lebesgue-Stieltjes measures on R

We begin with the construction of measure on R. It is inconvenient if not impossible to define a measure on every element of a complex σ-algebra. So, we describe a process by which we can define a measure-like object on a much simpler class of sets, and then systematically extend the definition to obtain a measure on a σ-algebra. On the theme of building on a simple collection of sets,

Definition 6.3.1: Semi-algebra

A semi-algebra or elementary family on a nonempty set X is a collection E of subsets of X such that,

1. ∅ ∈ E .
2. If E, F ∈ E then E ∩ F ∈ E .
3. If E ∈ E , then Ec is a finite disjoint union of members of E .

Example 6.3.1

If X = {a, b, c}, then {∅, {a}, {b}, {c}} is a semi-algebra.

If we define a method to “measure” subsets of a semi-algebra, that method automatically extends to cover intersections and complements of subsets of the semi-algebra. For the latter, it is key that the condition on complements involves finite disjoint unions. It turns out that we can generate an algebra from a semi-algebra in such a way that if we define a method to measure the subsets of the semi-algebra, it automatically extends to subsets of the generated algebra.

Theorem 6.3.1

If E is a semi-algebra, then the collection A of finite disjoint unions of members of E is an algebra.

Proof. First note that $\mathcal{E} \subset \mathcal{A}$ since any $E \in \mathcal{E}$ can be written as the disjoint union $E \cup \emptyset$. So, $\mathcal{A}$ is not empty.

Let $A \in \mathcal{A}$, so $A$ is given by a disjoint union $A = E_1 \cup E_2 \cup \cdots \cup E_m$. We first show that $A^c \in \mathcal{A}$. Note that $A^c = E_1^c \cap E_2^c \cap \cdots \cap E_m^c$. However, each $E_l^c$ can be written as a disjoint union $E_l^c = E^l_1 \cup \cdots \cup E^l_{m_l}$ of members of $\mathcal{E}$. Distributing the intersection over the unions gives

\[ A^c = \bigcap_{l=1}^m \bigcup_{j=1}^{m_l} E^l_j = \bigcup_{(j_1, \dots, j_m)} \bigcap_{l=1}^m E^l_{j_l}, \]

where the union runs over all index tuples with $1 \le j_l \le m_l$. Each intersection belongs to $\mathcal{E}$ because $\mathcal{E}$ is closed under finite intersections, and intersections with different index tuples are disjoint, so this is a disjoint union of sets in $\mathcal{E}$.

We next show that $\mathcal{A}$ is closed under finite unions of sets. This follows by induction after we show that it is closed under the union of two sets. Let $A, B \in \mathcal{A}$, so that there are disjoint unions $A = \bigcup_{i=1}^m C_i$ and $B = \bigcup_{j=1}^l D_j$ of sets in $\mathcal{E}$. We show that $A \cup B = \bigcup_{i,j} (C_i \cup D_j) \in \mathcal{A}$ using induction. We know that there is a disjoint union such that $D_1^c = \bigcup_{i=1}^k E_i$, so $C_1 \cap D_1^c = \bigcup_{i=1}^k (C_1 \cap E_i) \in \mathcal{A}$. But, $C_1 \cup D_1 = (C_1 \cap D_1^c) \cup D_1$. Since $C_1 \cap D_1^c$ is the disjoint union of sets in $\mathcal{E}$, $C_1 \cup D_1$ is a disjoint union of sets in $\mathcal{E}$ and hence $C_1 \cup D_1 \in \mathcal{A}$. Induction on $m$ and $l$ shows that $\bigcup_{i,j} (C_i \cup D_j) \in \mathcal{A}$. Finally, another use of induction shows that $\mathcal{A}$ is closed under finite unions.

Example 6.3.2

In Example 6.3.1, the algebra generated by the semi-algebra is $\mathcal{P}_X$, the power set of $X$.
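The claim in Example 6.3.2 can be checked by brute force. The following Python sketch forms every finite disjoint union of members of the semi-algebra from Example 6.3.1 and compares the result with the power set of X. The helper names `generated_algebra` and `power_set` are illustrative choices, not notation from the text.

```python
from itertools import combinations


def generated_algebra(semi_algebra):
    """Collect all finite disjoint unions of members of a finite semi-algebra."""
    members = [frozenset(s) for s in semi_algebra]
    algebra = set()
    for r in range(len(members) + 1):
        for combo in combinations(members, r):
            # Keep only selections whose members are pairwise disjoint.
            if all(s.isdisjoint(t) for s, t in combinations(combo, 2)):
                algebra.add(frozenset().union(*combo))
    return algebra


def power_set(universe):
    """All subsets of a finite set, as frozensets."""
    items = list(universe)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}


X = {"a", "b", "c"}
E = [set(), {"a"}, {"b"}, {"c"}]   # the semi-algebra of Example 6.3.1
A = generated_algebra(E)
print(len(A), A == power_set(X))   # prints: 8 True
```

The check is only feasible for small finite X, since the number of candidate unions grows exponentially.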

For R, we use the following sets to create a semi-algebra.

Definition 6.3.2: h-interval

We let $\mathcal{R}^{hi}$ denote the collection of h-intervals, consisting of sets in $\mathbb{R}$ of the form $(a, b]$, $(-\infty, b]$, $(a, \infty)$, and $\emptyset$ for $a, b \in \mathbb{R}$.

Theorem 6.3.2

$\mathcal{R}^{hi}$ is a semi-algebra.

Proof. Exercise.

Following Theorem 6.3.1,

Definition 6.3.3

We let $\mathcal{A}^{hi}$ denote the algebra obtained by taking finite disjoint unions of h-intervals in $\mathcal{R}^{hi}$.

It follows,

Theorem 6.3.3

The σ-algebra generated by $\mathcal{A}^{hi}$ is $\mathcal{B}_{\mathbb{R}}$.

Proof. Theorem 6.2.2.

Next, we identify the premeasure. Recall that in Chapter 4, we built an intuitive concept of measure based on computing the length of an interval (a, b] as b − a. We want to generalize this idea, so we examine this concept of length. There are two key observations. First, the definition as b − a, as opposed to, say, a − b, is based on the ordering property of the real numbers, which is realized as ordering the standard number line as increasing from left to right. A key feature of this definition of length is that it is monotone increasing, so if b is moved to the right, the length of the interval (a, b] increases. The second observation is that this definition is homogeneous in the location of the interval (a, b]. All that matters is the length b − a. The consequence for the application to probability is that it leads to the uniform probability measure, which was heavily featured in Chapter 4.


In seeking to generalize these ideas, it appears to be important to preserve the property of monotonicity but not necessarily the property of homogeneity. The latter observation is based on experience with probability, which employs non-uniform probability distributions. Thus, we choose monotone increasing, right continuous functions as the basis for premeasures that lead to measures on $\mathcal{B}_{\mathbb{R}}$.

Definition 6.3.4: Monotone function

A function $F : \mathbb{R} \to \mathbb{R}$ (or $F : \widehat{\mathbb{R}} \to \widehat{\mathbb{R}}$, where $\widehat{\mathbb{R}}$ denotes the extended reals) is (monotone) increasing if $F(x) \le F(y)$ whenever $x < y$, for all $x, y$ in the domain. It is (monotone) decreasing if $F(x) \ge F(y)$ whenever $x < y$, for all $x, y$ in the domain. It is monotone if it is either increasing or decreasing.

We only use monotone increasing functions to build measures, so monotone almost always means monotone increasing in this book.

Example 6.3.3

$x$, $\tanh(x)$, $x^3$, and $x^{1/3}$ are monotone increasing functions. So is

\[
F(x) = \begin{cases} x, & 0 \le x < 1/2, \\ 2x, & 1/2 \le x \le 1. \end{cases}
\]

Monotone functions have several important properties. For example,

Theorem 6.3.4

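If $F : \mathbb{R} \to \mathbb{R}$ is a monotone function, then $F$ has right and left-hand limits at each point $a \in \mathbb{R}$,

\[
F(a+) = \lim_{x \to a+} F(x) = \begin{cases} \inf_{x > a} F(x), & F \text{ is increasing}, \\ \sup_{x > a} F(x), & F \text{ is decreasing}, \end{cases}
\]

\[
F(a-) = \lim_{x \to a-} F(x) = \begin{cases} \sup_{x < a} F(x), & F \text{ is increasing}, \\ \inf_{x < a} F(x), & F \text{ is decreasing}. \end{cases}
\]

Moreover, the limits

\[
F(\infty) = \begin{cases} \sup_{x \in \mathbb{R}} F(x), & F \text{ is increasing}, \\ \inf_{x \in \mathbb{R}} F(x), & F \text{ is decreasing}, \end{cases}
\qquad
F(-\infty) = \begin{cases} \inf_{x \in \mathbb{R}} F(x), & F \text{ is increasing}, \\ \sup_{x \in \mathbb{R}} F(x), & F \text{ is decreasing}, \end{cases}
\]

exist. (They may be $\pm\infty$.)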

Theorem 6.3.4 implies that a monotone function $F : \mathbb{R} \to \mathbb{R}$ extends naturally to a monotone function from $\widehat{\mathbb{R}}$ into $\widehat{\mathbb{R}}$.


Figure 6.4. Illustration of right continuity. The function is right continuous at a1 but is not right continuous at a2 or a3.

Definition 6.3.5

The extension of a monotone function $F : \mathbb{R} \to \mathbb{R}$ is denoted $\widehat{F} : \widehat{\mathbb{R}} \to \widehat{\mathbb{R}}$.

In addition to monotonicity, we need a form of continuity.

Definition 6.3.6: Right continuous function

A monotone function F : R → R is right continuous if F (a) = F (a+) for all a ∈ R. See Fig. 6.4.

The choice of right continuity instead of left continuity is simply convention. The important point is to have a uniform rule for assigning values at points of discontinuity so that we can turn a pointwise function into a set function in a systematic way. It is possible to use left continuity to do this. We now define the key tool for measurement.

Definition 6.3.7: Distribution function

A distribution function is a monotone increasing, right continuous function $F : \mathbb{R} \to \mathbb{R}$. Following Definition 6.3.5, a distribution function $F$ has a natural extension to a monotone function $\widehat{F} : \widehat{\mathbb{R}} \to \widehat{\mathbb{R}}$.

Remark 6.3.1

It is important to keep in mind that a distribution function is finite-valued for finite input. A general extended real-valued monotone function on the extended reals can be infinite for finite input.

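Now, we choose the premeasure. Assume $F : \mathbb{R} \to \mathbb{R}$ is a distribution function. For a disjoint collection of h-intervals $\{(a_j, b_j]\}_{j=1}^{m}$, with $a_j, b_j \in \mathbb{R}$ for all $j$, define

\[
\mu_0 \Big( \bigcup_{j=1}^{m} (a_j, b_j] \Big) = \sum_{j=1}^{m} \big( F(b_j) - F(a_j) \big), \tag{6.2}
\]

and $\mu_0(\emptyset) = 0$. For $(a, \infty)$, define $\mu_0((a, \infty)) = \mu_0((a, \infty]) = \widehat{F}(\infty) - \widehat{F}(a)$.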


This definition is built upon defining the length of an h-interval $(a, b]$ to be the usual length of the image $F((a, b])$; see Figure 6.5. Note that the values for infinite endpoints are implied by the monotonicity of $F$, e.g. $\mu_0((-\infty, \infty)) = \lim_{x \uparrow \infty} F(x) - \lim_{x \downarrow -\infty} F(x)$.
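As an illustration of (6.2), the following Python sketch evaluates the premeasure of a finite disjoint union of h-intervals for a given distribution function. The function name `premeasure` and the representation of (a, b] as a pair `(a, b)` are assumptions made for this example only. Infinite endpoints are passed as `math.inf`, which relies on the chosen F returning its limiting value there (as `math.tanh` does).

```python
import math


def premeasure(F, intervals):
    """Evaluate mu_0 on a finite disjoint union of h-intervals (a, b]."""
    ordered = sorted(intervals)
    # Half-open intervals (a1, b1] and (a2, b2] with a1 <= a2 overlap iff a2 < b1.
    for (a1, b1), (a2, b2) in zip(ordered, ordered[1:]):
        if a2 < b1:
            raise ValueError("the h-intervals must be pairwise disjoint")
    return sum(F(b) - F(a) for a, b in intervals)


# F(x) = x reproduces ordinary length.
length = premeasure(lambda x: x, [(0.0, 1.0), (2.0, 2.5)])

# F(x) = tanh(x): the whole line has finite premeasure F(inf) - F(-inf) = 2.
whole_line = premeasure(math.tanh, [(-math.inf, math.inf)])

# Two representations of (0, 1] give the same value (telescoping sum).
one_piece = premeasure(math.tanh, [(0.0, 1.0)])
two_pieces = premeasure(math.tanh, [(0.0, 0.3), (0.3, 1.0)])
```

The last two lines preview the well-definedness question treated in Step 1 of the proof of Theorem 6.3.5.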


Figure 6.5. Illustration of using various monotone increasing functions to define length of the h-interval (a, b]. The length of intervals in the first three examples depends only on the difference b − a.

Theorem 6.3.5

$\mu_0$ is a premeasure on the algebra $\mathcal{A}^{hi}$.

This is the fourth major proof we present.

Proof. Step 1 Since some elements in $\mathcal{A}^{hi}$ can be represented as a disjoint union of h-intervals in more than one way, we begin by showing that $\mu_0$ is well defined. First, suppose $I = (a, b]$, with $a, b \in \mathbb{R}$, is written as the disjoint union $I = \bigcup_{i=1}^{m} (a_i, b_i]$. Re-labeling so that $a = a_1 < b_1 = a_2 < b_2 = a_3 < \cdots < b_m = b$,

\[
\mu_0(I) = \sum_{i=1}^{m} \big( F(b_i) - F(a_i) \big) = F(b) - F(a)
\]

because of cancellation. The case $I = (a, \infty)$ is treated similarly.

Now assume that $I$ is represented by two disjoint unions, $I = \bigcup_{i=1}^{m} I_i$ and $I = \bigcup_{j=1}^{l} J_j$. We observe that each $I_i$ can be written as a disjoint union $I_i = \bigcup_{j=1}^{l} (I_i \cap J_j)$, and each $J_j$ can be decomposed similarly. This implies

\[
\sum_{i=1}^{m} \mu_0(I_i) = \sum_{i=1}^{m} \sum_{j=1}^{l} \mu_0(I_i \cap J_j) = \sum_{j=1}^{l} \mu_0(J_j).
\]

So, µ0 is well defined.

Step 2 µ0(∅) = 0 is obvious.

Step 3 We observe that $\mu_0$ is finitely additive by construction.

Step 4 We finally show countable additivity. Let $\{I_i\}_{i=1}^{\infty}$ be a disjoint sequence of h-intervals such that $I = \bigcup_{i=1}^{\infty} I_i \in \mathcal{A}^{hi}$, so $I$ is a finite disjoint union of h-intervals. However, finite additivity implies that we can assume that $I$ is an h-interval, i.e. $I = (a, b]$, without loss of generality (exercise).


We show the desired equality $\mu_0(I) = \sum_{i=1}^{\infty} \mu_0(I_i)$ by showing two inequalities. One direction is simpler. Finite additivity implies that for any $m \ge 1$,

\[
\mu_0(I) = \mu_0 \Big( \bigcup_{i=1}^{m} I_i \Big) + \mu_0 \Big( I \setminus \bigcup_{i=1}^{m} I_i \Big) \ge \mu_0 \Big( \bigcup_{i=1}^{m} I_i \Big) = \sum_{i=1}^{m} \mu_0(I_i).
\]

Letting $m \to \infty$, $\mu_0(I) \ge \sum_{i=1}^{\infty} \mu_0(I_i)$.

To show the other direction, we use a fundamental property of the real numbers. We have no problem computing the premeasure on a finite set of h-intervals. The trouble occurs with a countable set of h-intervals. So, we are motivated to reduce the countable collection $\{I_i\}_{i=1}^{\infty}$ to a finite number. Recall the important fact that if a compact set in $\mathbb{R}$ is covered by a countable collection of open sets, then it is covered by a finite number of the open sets. Using this fact is known as a compactness argument. But, we cannot apply that result directly because we are dealing with h-intervals. So, we first “shrink” $(a, b]$ a little bit on the left to get a closed (and compact) interval just a little bit smaller than $(a, b]$. Then we enlarge each h-interval $I_i$ a little bit on the right to get an open interval that is nearly the same size. Thus, we obtain a countable open cover of a compact interval and can use a compactness argument to extract a finite subcover. This all works because the right continuity of $F$ means we can make little changes without introducing too much error in the measure of the various sets.

Now for the details. We first assume that $-\infty < a < b < \infty$ and choose $\epsilon > 0$. We exploit the right continuity of $F$ by extending the covering sets on the right to provide overlap, but not enough to change the length of the covering sets very much. Specifically, since $F$ is right continuous, there is a $\delta > 0$ such that $F(a + \delta) - F(a) < \epsilon$. Likewise, if $I_i = (a_i, b_i]$ for $i = 1, 2, \dots$, then for each $i$ there is a $\delta_i > 0$ such that $F(b_i + \delta_i) - F(b_i) < \epsilon 2^{-i}$. The open intervals $\{(a_i, b_i + \delta_i)\}_{i=1}^{\infty}$ cover the compact set $[a + \delta, b]$, and so there is a finite subcover. We throw out the “redundant” intervals $(a_i, b_i + \delta_i)$ in such a finite subcover that are contained in some larger interval to obtain a subcover with the properties:

1. $\{(a_{i_j}, b_{i_j} + \delta_{i_j})\}_{j=1}^{m}$ covers $[a + \delta, b]$, where $\{i_1, \dots, i_m\}$ is a strictly increasing sequence of positive integers;

2. $a_{i_1} < a + \delta$;

3. $a_{i_1} < a_{i_2} < \cdots < a_{i_m}$;

4. $a_{i_{j+1}} < b_{i_j} + \delta_{i_j} < b_{i_{j+1}} + \delta_{i_{j+1}}$ for $j = 1, \dots, m - 1$;

5. $b \le b_{i_m} + \delta_{i_m}$.

Then,

\begin{align*}
\mu_0(I) &\le F(b) - F(a + \delta) + \epsilon \\
&\le F(b_{i_m} + \delta_{i_m}) - F(a_{i_1}) + \epsilon \\
&= F(b_{i_m} + \delta_{i_m}) - F(a_{i_m}) + \sum_{j=1}^{m-1} \big( F(a_{i_{j+1}}) - F(a_{i_j}) \big) + \epsilon \\
&\le F(b_{i_m} + \delta_{i_m}) - F(a_{i_m}) + \sum_{j=1}^{m-1} \big( F(b_{i_j} + \delta_{i_j}) - F(a_{i_j}) \big) + \epsilon \\
&\le \sum_{j=1}^{m} \mu_0(I_{i_j}) + 2\epsilon \le \sum_{i=1}^{\infty} \mu_0(I_i) + 2\epsilon.
\end{align*}


Since $\epsilon$ is arbitrary, we have proved the result for finite $a$ and $b$. If $a = -\infty$, we make a similar argument to show that

\[
F(b) - F(-M) \le \sum_{i=1}^{\infty} \mu_0(I_i) + 2\epsilon
\]

for any $M < \infty$, and if $b = \infty$,

\[
F(M) - F(a) \le \sum_{i=1}^{\infty} \mu_0(I_i) + 2\epsilon
\]

for any $M < \infty$. The result follows by letting $M \to \infty$ in both cases. The case with $a = -\infty$ and $b = \infty$ is handled similarly (exercise).

Now we apply the framework for constructing measures developed in Chapter 5. We have done all the hard work, so the results are quick to prove.

Theorem 6.3.6: Borel Measures and Distribution Functions

If $F : \mathbb{R} \to \mathbb{R}$ is a distribution function, there is a unique Borel measure $\mu_F$ on $\mathcal{B}_{\mathbb{R}}$ such that

\[
\mu_F((a, b]) = \widehat{F}(b) - \widehat{F}(a) \quad \text{for all } a, b \in \widehat{\mathbb{R}},\ a < b,
\]

where $\widehat{F}$ is the extension of $F$. If $G$ is another such function, then $\mu_F = \mu_G$ if and only if $F - G$ is a constant. Conversely, if $\mu$ is a Borel measure on $\mathbb{R}$ that is finite on all bounded Borel sets and we define

\[
F(x) = \begin{cases} \mu((0, x]), & \text{if } x > 0, \\ 0, & \text{if } x = 0, \\ -\mu((x, 0]), & \text{if } x < 0, \end{cases}
\]

then $F$ is a distribution function and $\mu = \mu_F$.

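This theorem establishes a one-to-one correspondence between the Borel measures on R that are finite on bounded Borel sets and the distribution functions, provided distribution functions that differ by a constant are identified.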

Proof. Theorem 6.3.5 implies that $F$ induces a premeasure $\mu_0$ on the algebra $\mathcal{A}^{hi}$. Thus, it can be extended to a measure $\mu_F$ on the σ-algebra generated by $\mathcal{A}^{hi}$, which is $\mathcal{B}_{\mathbb{R}}$, by Theorem 5.6.3. The measure is σ-finite since

\[
\mathbb{R} = \bigcup_{i=-\infty}^{\infty} (i, i + 1].
\]

Two functions $F$ and $G$ induce the same premeasure $\mu_0$ if and only if $F - G$ is constant. Since $\mu_0$ is σ-finite, Theorem 5.6.3 implies they generate the same measure on $\mathcal{B}_{\mathbb{R}}$.

For the converse, $\mu$ is monotone in the sense of measures (Theorem 5.2.3) and this implies that $F$ is monotone increasing in the sense of functions. The continuity of $\mu$ from above (Theorem 5.2.3 again) implies the right continuity of $F$. Clearly, $\mu = \mu_F$ on $\mathcal{A}^{hi}$, hence $\mu = \mu_F$ by Theorem 5.6.3.

We present a number of examples.


Example 6.3.4

If F (x) = x, then µF ((a, b]) = b − a, and we obtain the Lebesgue measure.

Example 6.3.5

If F (x) = tanh(x), we obtain a measure in which length depends on location. In Figure 6.6, we plot F and F (a + δ) − F (a).
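The dependence on location in Example 6.3.5 is easy to tabulate. The sketch below is only an illustration: the name `mu_tanh`, the step `delta = 0.1` and the sample points are arbitrary choices made here.

```python
import math


def mu_tanh(a, b):
    """Measure of (a, b] induced by the distribution function F(x) = tanh(x)."""
    return math.tanh(b) - math.tanh(a)


delta = 0.1
for a in (-4.0, -1.0, 0.0, 1.0, 4.0):
    # Same interval width everywhere, but the assigned "length" changes with a.
    print(f"a = {a:5.1f}   mu_F((a, a + delta]) = {mu_tanh(a, a + delta):.6f}")
```

Intervals near the origin receive the most measure, and the measure of the whole line is tanh(∞) − tanh(−∞) = 2.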


Figure 6.6. We plot $F$ for Example 6.3.5 on the left and $F(a + \epsilon) - F(a)$ for small $\epsilon$ and $a$ varying through the domain on the right. The length of $(a, a + \epsilon]$ depends on location.

Example 6.3.6

Distribution functions are not restricted to smooth functions like tanh. An example is plotted in Figure 6.7:

\[
F(x) = \begin{cases} 0, & x < 0, \\ x^2, & 0 \le x < 1, \\ 1, & 1 \le x, \end{cases}
\quad \Longrightarrow \quad
\mu_F((a, b]) = \begin{cases} 0, & a \le b < 0, \\ b^2, & a < 0 \le b < 1, \\ 1, & a < 0,\ 1 \le b, \\ b^2 - a^2, & 0 \le a \le b < 1, \\ 1 - a^2, & 0 \le a < 1 \le b, \\ 0, & 1 \le a \le b. \end{cases}
\]
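The case table in Example 6.3.6 does not need to be memorized, since every entry is just F(b) − F(a). A minimal Python sketch, with the names `F` and `mu_F` chosen here to mirror the notation of the example:

```python
def F(x):
    """The distribution function of Example 6.3.6."""
    if x < 0:
        return 0.0
    if x < 1:
        return x * x
    return 1.0


def mu_F(a, b):
    """Measure of the h-interval (a, b], assuming a <= b."""
    return F(b) - F(a)


print(mu_F(-3.0, -1.0))   # interval left of 0
print(mu_F(-1.0, 0.5))    # straddles 0
print(mu_F(0.25, 0.5))    # inside [0, 1)
print(mu_F(0.5, 2.0))     # straddles 1
print(mu_F(1.5, 2.0))     # interval right of 1
```

All of the mass sits on [0, 1], so intervals that avoid it have measure zero.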


Figure 6.7. A plot of F for Example 6.3.6.


Example 6.3.7

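For $\mu((a, b]) = b^{1/3} - a^{1/3}$, the formula in Theorem 6.3.6 gives

\[
F(x) = \begin{cases} x^{1/3} - 0, & x > 0, \\ 0, & x = 0, \\ -(0 - x^{1/3}), & x < 0, \end{cases}
\quad \Longrightarrow \quad F(x) = x^{1/3}.
\]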
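The converse construction in Theorem 6.3.6 can be mimicked numerically for this example. In the sketch below, `cbrt`, `mu` and `F_from_mu` are hypothetical helper names; `mu` plays the role of the given measure evaluated on h-intervals, and `F_from_mu` applies the three-case formula of the theorem.

```python
import math


def cbrt(x):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def mu(a, b):
    """The measure of Example 6.3.7 on an h-interval (a, b]."""
    return cbrt(b) - cbrt(a)


def F_from_mu(x):
    """Recover a distribution function from mu, normalized so that F(0) = 0."""
    if x > 0:
        return mu(0.0, x)
    if x < 0:
        return -mu(x, 0.0)
    return 0.0


for x in (-8.0, -1.0, 0.0, 1.0, 8.0):
    print(x, F_from_mu(x), cbrt(x))   # the last two columns agree
```

The normalization F(0) = 0 is what singles out one representative among the distribution functions that differ by a constant.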

Example 6.3.8

For the measure

\[
\mu((a, b]) = \begin{cases} 0, & a \le b < 0, \\ b, & a < 0 \le b < 1, \\ 1, & a < 0,\ 1 \le b, \\ b - a, & 0 \le a \le b < 1, \\ 1 - a, & 0 \le a < 1 \le b, \\ 0, & 1 \le a \le b, \end{cases}
\]

we obtain

\[
F(x) = \begin{cases} 0, & x < 0, \\ x, & 0 \le x < 1, \\ 1, & 1 \le x. \end{cases}
\]

This is a measure which is the Lebesgue measure on $(0, 1]$ but zero on intervals outside of $(0, 1]$.

Remark 6.3.2

We could begin with any of the generating sets for $\mathcal{B}_{\mathbb{R}}$ in Theorem 6.2.2 by choosing appropriate premeasures and modifying the proofs accordingly.

The following special case is important in probability:

Theorem 6.3.7

If µ is a finite measure on BR and we define

F (x) = µ ((−∞, x]) , x ∈ R, (6.3)

then F is a distribution function and µF = µ.

Proof. Exercise.

Definition 6.3.8

The distribution function determined by Theorem 6.3.7 is called the distribution function of µ.

Recall that the final step is completion of the measure.

Theorem 6.3.8

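If $F : \mathbb{R} \to \mathbb{R}$ is a distribution function, there is a unique complete measure $\overline{\mu}_F$ defined on the completion $\overline{\mathcal{B}}_{\mathbb{R}}$ of $\mathcal{B}_{\mathbb{R}}$ that extends $\mu_F$.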

Proof. Exercise.

Definition 6.3.9: Lebesgue-Stieltjes measure

If $F$ is a distribution function, the complete measure on $\overline{\mathcal{B}}_{\mathbb{R}}$ is called the Lebesgue-Stieltjes (L-S) measure associated with $F$ and is denoted by $\overline{\mu}_F$.

Remark 6.3.3

We drop the overbar notation unless we need to stress the difference between the Borel measure and its completion. Note that this can be misleading!

We can use the Lebesgue-Stieltjes measure on R to define a measure structure on any measurable subset A ⊂ R with positive L-S measure.

Definition 6.3.10

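Let $F : \mathbb{R} \to \mathbb{R}$ be a distribution function and $A \in \mathcal{B}_{\mathbb{R}}$. Set

\[
\mathcal{B}_A = \{ B \cap A : B \in \mathcal{B}_{\mathbb{R}} \}, \qquad \mu_A(B) = \mu_{F,A}(B) = \mu_F(B), \quad B \in \mathcal{B}_A.
\]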

µF,A is called the restriction of µF to A. We usually omit the A in the notation if A is clear from the context.

Theorem 6.3.9

For A ∈ BR, (A, BA, µF,A) is a (complete) measure space.

Proof. Exercise!

It is convenient to gather the measures of other kinds of intervals.

Theorem 6.3.10

Let $F$ be a distribution function and $\mu$ the corresponding Lebesgue-Stieltjes measure. For $a < b$, $a, b \in \mathbb{R}$,

\begin{align}
\mu((a, b]) &= F(b) - F(a), & \mu([a, b)) &= F(b-) - F(a-), \tag{6.4} \\
\mu([a, b]) &= F(b) - F(a-), & \mu((a, b)) &= F(b-) - F(a), \tag{6.5} \\
\mu(\{a\}) &= F(a) - F(a-). \tag{6.6}
\end{align}

Similar results can be written out for half-rays.
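The formulas (6.4) to (6.6) only differ when F has jumps. The following Python sketch uses a hypothetical distribution function with a single jump of size 1 at the origin, for which the left limit can be written in closed form; the names `F`, `F_left` and `mu` are choices made for this illustration.

```python
def F(x):
    """Right continuous distribution function: F(x) = x for x < 0, x + 1 for x >= 0."""
    return x + 1.0 if x >= 0 else x


def F_left(x):
    """Left limit F(x-) of the function above."""
    return x + 1.0 if x > 0 else x


def mu(kind, a, b):
    """Measure of an interval with endpoints a < b, following (6.4) and (6.5)."""
    formulas = {
        "(a,b]": F(b) - F(a),
        "[a,b)": F_left(b) - F_left(a),
        "[a,b]": F(b) - F_left(a),
        "(a,b)": F_left(b) - F(a),
    }
    return formulas[kind]


point_mass_at_zero = F(0.0) - F_left(0.0)   # (6.6) with a = 0

for kind in ("(a,b]", "[a,b)", "[a,b]", "(a,b)"):
    print(kind, mu(kind, 0.0, 1.0), mu(kind, -1.0, 0.0))
```

An interval picks up the extra unit of mass exactly when it contains the jump point 0.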

Proof. For example,

\[
\mu([a, b]) = \lim_{i \to \infty} \mu \Big( \Big( a - \frac{1}{i}, b \Big] \Big) = \lim_{i \to \infty} \Big( F(b) - F \Big( a - \frac{1}{i} \Big) \Big) = F(b) - F(a-).
\]

The other cases are an exercise.

We conclude this section by discussing an important special case:

Definition 6.3.11: Lebesgue measure

The L-S measure associated with the function $F(x) = x$ is called the Lebesgue measure and is denoted by $\mu_L$. We denote the σ-algebra of the associated complete measure space by $\mathcal{L} = \overline{\mathcal{B}}_{\mathbb{R}}$, which is the class of Lebesgue measurable sets.

Recall that we need to reconcile the definition of sets of measure zero in Chapter 4 and in Chapter 5.

Theorem 6.3.11

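A set $A \in \mathcal{L}$ has Lebesgue measure zero if and only if for every $\epsilon > 0$, there is a countable cover of $A$ by open intervals $\{A_i\}_{i=1}^{\infty}$, $A \subset \bigcup_{i=1}^{\infty} A_i$, such that

\[
\sum_{i=1}^{\infty} \mu_L(A_i) < \epsilon.
\]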

Proof. Exercise!

The Lebesgue measure has several interesting properties. For example,

Theorem 6.3.12

Let $A \in \mathcal{L}$ and fix $s \in \mathbb{R}$. The sets

\[
A + s = \{ x + s : x \in A \}, \qquad sA = \{ sx : x \in A \}
\]

are in $\mathcal{L}$ and

\[
\mu_L(A + s) = \mu_L(A), \qquad \mu_L(sA) = |s| \, \mu_L(A).
\]

Proof. We leave this as an exercise for now. We prove the analogous result in higher dimensions below.

Definition 6.3.12

A + s is called the translation of A and sA is called the dilation of A.

We stress that these properties are unique to Lebesgue measure in some sense (which we make precise below) and many useful measures do not have these properties.

6.4 Borel, Lebesgue-Stieltjes measures in $\mathbb{R}^n$

We next discuss the construction of the default measures on $\mathbb{R}^n$. The approach we use is the direct extension of the construction used for $\mathbb{R}$ in Sec. 6.3. There are a couple of issues that arise from this approach.

We discussed the construction of measures in general metric spaces in Sec. 5.5. Of course, $\mathbb{R}^n$ is a standard example of a metric space and we build $\mathcal{B}_{\mathbb{R}^n}$ starting with the open sets defined by the $\mathbb{R}^n$ metric. But, $\mathbb{R}^n$ has more properties and features than a generic metric space and some of these properties are used directly in the construction of the Borel and Lebesgue-Stieltjes measures on $\mathcal{B}_{\mathbb{R}^n}$. Chief among these properties is the possibility of defining an order on points in $\mathbb{R}^n$, leading to the notion of distribution functions.

But, a property of $\mathbb{R}^n$ that we do not use systematically in the first approach to higher dimensions is the product structure, i.e. $\mathbb{R}^n = \mathbb{R} \times \cdots \times \mathbb{R}$. It is natural to ask if we can build up the measure on $\mathbb{R}^n$ by taking a “product” of the measure on $\mathbb{R}$ $n$ times. This is not just an academic question, as it is central to the notion of decomposing an integral over $\mathbb{R}^n$ as multiple integrals over subspaces of $\mathbb{R}^n$ and to probability issues such as independence and computing marginals. But, a direct construction of measure in $\mathbb{R}^n$ is important from a computational point of view and using product structure involves additional complications, hence we leave it for Chapter 10.

Starting with the σ-algebra, recall that Theorem 6.2.1 implies $\mathcal{B}_{\mathbb{R}^n} = \sigma(\mathcal{R}^{rc})$. We extend the definition of h-intervals:

Definition 6.4.1: h-rectangle

We let $\mathcal{R}^{hi}$ denote the space of (generalized) right half-closed rectangles or h-rectangles, consisting of sets in $\mathbb{R}^n$ of the form $(a, b]$, $(-\infty, b]$, $(a, \infty)$, and $\emptyset$ for $a, b \in \mathbb{R}^n$.

Theorem 6.4.1

$\mathcal{R}^{hi}$ is a semi-algebra.

To prove this theorem, we use a result that is important in itself. Notice that we sneak in a little of the product structure.


Definition 6.4.2: Generalized (product) rectangle

Let $\mathcal{E}_1$ and $\mathcal{E}_2$ be collections of generalized rectangles as defined in Definition 6.1.1 in $\mathbb{R}^{n_1}$ and $\mathbb{R}^{n_2}$ respectively, where $n_1$ and $n_2$ are positive integers. The (Cartesian) product family of sets in $\mathbb{R}^{n_1 + n_2}$ is defined,

E1 × E2 = {I1 × I2 : I1 ∈ E1,I2 ∈ E2}.

A set I1 × I2 ∈ E1 × E2 is called a generalized (product) rectangle.

This definition is consistent with Definitions 6.1.1 and 6.4.1, e.g.

((a1, ··· , an), (b1, ··· , bn)] = ((a1, ··· , an−1), (b1, ··· , bn−1)] × (an, bn].

Formally, this uses the identification,

((x1, ··· , xn1 ), (y1, ··· , yn2 )) ↔ (x1, ··· , xn1 , y1, ··· , yn2 )

for $(x_1, \dots, x_{n_1}) \in I_1$ and $(y_1, \dots, y_{n_2}) \in I_2$, which is valid in a generalized rectangle.

Theorem 6.4.2

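Let $\mathcal{E}_1$, $\mathcal{E}_2$ be semi-algebras in $\mathbb{R}^{n_1}$ and $\mathbb{R}^{n_2}$ respectively, where $n_1$ and $n_2$ are positive integers. Then, $\mathcal{E}_1 \times \mathcal{E}_2$ is a semi-algebra in $\mathbb{R}^{n_1 + n_2}$.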

Proof.

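Step 1 $\emptyset = \emptyset \times \emptyset \in \mathcal{E}_1 \times \mathcal{E}_2$, $\mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \in \mathcal{E}_1 \times \mathcal{E}_2$.

Step 2 For $I_1, J_1 \in \mathcal{E}_1$ and $I_2, J_2 \in \mathcal{E}_2$, $(I_1 \times I_2) \cap (J_1 \times J_2) = (I_1 \cap J_1) \times (I_2 \cap J_2) \in \mathcal{E}_1 \times \mathcal{E}_2$, since $I_1 \cap J_1 \in \mathcal{E}_1$ and $I_2 \cap J_2 \in \mathcal{E}_2$.

Step 3 Let $I_1 \in \mathcal{E}_1$ and $I_2 \in \mathcal{E}_2$. Then,

\begin{align}
(I_1 \times I_2)^c &= \{ (x, y) : x \notin I_1, y \notin I_2 \text{ or } x \notin I_1, y \in I_2 \text{ or } x \in I_1, y \notin I_2 \} \notag \\
&= (I_1^c \times I_2^c) \cup (I_1^c \times I_2) \cup (I_1 \times I_2^c), \tag{6.7}
\end{align}

where the sets in the union are disjoint. By assumption, $I_1^c$ and $I_2^c$ are finite disjoint unions of sets in $\mathcal{E}_1$ and $\mathcal{E}_2$ respectively. Hence, each of the sets in the union in (6.7) is given by a finite disjoint union of sets in $\mathcal{E}_1 \times \mathcal{E}_2$.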
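The set identity (6.7) holds for products of arbitrary sets, so it can be sanity-checked on small finite sets. In the sketch below, the universes `X` and `Y`, the sets `I1` and `I2`, and the helper `times` are all hypothetical stand-ins chosen for illustration.

```python
from itertools import combinations, product

X, Y = set(range(4)), set("pqr")   # finite stand-ins for the two factor spaces
I1, I2 = {0, 1}, {"p"}             # stand-ins for I_1 and I_2


def times(S, T):
    """Cartesian product of two finite sets, as a set of pairs."""
    return set(product(S, T))


complement = times(X, Y) - times(I1, I2)
parts = [
    times(X - I1, Y - I2),   # I1^c x I2^c
    times(X - I1, I2),       # I1^c x I2
    times(I1, Y - I2),       # I1 x I2^c
]

covers = set().union(*parts) == complement
disjoint = all(p.isdisjoint(q) for p, q in combinations(parts, 2))
print(covers, disjoint)   # prints: True True
```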

Proof. (Theorem 6.4.1) Let $\mathcal{R}^{hi}_n$ denote the set of h-rectangles in $\mathbb{R}^n$. The proof is by induction on the dimension. Theorem 6.3.2 implies the result for dimension $n = 1$. Assume we have proved the result for dimension $n - 1$. From the definition, $\mathcal{R}^{hi}_n = \mathcal{R}^{hi}_{n-1} \times \mathcal{R}^{hi}_1$. The result follows from Theorem 6.4.2.

Following Theorem 6.3.1,

Definition 6.4.3

We let $\mathcal{A}^{hi}$ denote the algebra obtained by taking finite disjoint unions of h-rectangles in $\mathcal{R}^{hi}$.

Theorem 6.3.1 implies,

Theorem 6.4.3

The σ-algebra generated by $\mathcal{A}^{hi}$ is $\mathcal{B}_{\mathbb{R}^n}$.

Next, we tackle the measure. Recall that for $\mathbb{R}$, we built measures by choosing an increasing, right continuous function $F$, which yields a corresponding measure $\mu_F$ on the Borel σ-algebra, with $\mu_F((a, b]) = F(b) - F(a)$, $a \le b$. Defining and using an “order” on $\mathbb{R}^n$ is not so straightforward.

Definition 6.4.4

Let $a, b$ be points in $\mathbb{R}^n$. We say that $a \le b$ if $a_i \le b_i$ for all $1 \le i \le n$. We use $-\infty$ for $(-\infty, \dots, -\infty)$ and $\infty$ for $(\infty, \dots, \infty)$.

Thinking of two dimensions, a ≤ b if a is the lower left corner and b is the upper right corner of a rectangle with sides parallel to the coordinate axes. This generalizes to n dimensions.

The consequence of working in dimensions larger than 1 is that the notion of “monotone increasing” is considerably more complicated. Basically, we have to control how a function changes as we increment in each of the coordinate directions.

Definition 6.4.5

Let $G : \mathbb{R}^n \to \mathbb{R}$ and $a \le b$, $a, b \in \mathbb{R}^n$. The difference of $G$ in the $i$th coordinate from $a_i$ to $b_i$, $a_i, b_i \in \mathbb{R}$, is

\[
\Delta_{b_i a_i} G = G(x_1, \dots, x_{i-1}, b_i, x_{i+1}, \dots, x_n) - G(x_1, \dots, x_{i-1}, a_i, x_{i+1}, \dots, x_n).
\]

The difference of G from a to b, a, b ∈ Rn, is defined

\[
G((a, b]) = \Delta_{b_1 a_1} \cdots \Delta_{b_n a_n} G = G_0 - G_1 + G_2 - \cdots + (-1)^n G_n,
\]

where $G_i$ is the sum of all $\binom{n}{i}$ terms of the form $G(c_1, \dots, c_n)$ with $c_k = a_k$ for exactly $i$ integers in $\{1, 2, \dots, n\}$ and $c_k = b_k$ for the remaining $n - i$ integers.

Asserting that $\Delta_{b_i a_i} G \ge 0$ means that $G$ is increasing in the $i$th coordinate while fixing the other coordinates. The complications arise from having to deal with changes in all the coordinates simultaneously.

Example 6.4.1

In R2,

\begin{align*}
G((a, b]) &= \Delta_{b_1 a_1} \Delta_{b_2 a_2} G(x_1, x_2) = \Delta_{b_1 a_1} \big( G(x_1, b_2) - G(x_1, a_2) \big) \\
&= G(b_1, b_2) - G(b_1, a_2) - G(a_1, b_2) + G(a_1, a_2).
\end{align*}


Example 6.4.2

In R3,

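\begin{align*}
G((a, b]) &= \Delta_{b_1 a_1} \Delta_{b_2 a_2} \Delta_{b_3 a_3} G(x_1, x_2, x_3) \\
&= G(b_1, b_2, b_3) - \big( G(a_1, b_2, b_3) + G(b_1, a_2, b_3) + G(b_1, b_2, a_3) \big) \\
&\quad + \big( G(a_1, a_2, b_3) + G(a_1, b_2, a_3) + G(b_1, a_2, a_3) \big) - G(a_1, a_2, a_3).
\end{align*}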

This is complicated, but the following example suggests that it is the right approach.

Example 6.4.3

If $G(x_1, x_2) = x_1 \times x_2$ on $\mathbb{R}^2$, then for $a \le b$,

\[
G((a, b]) = \Delta_{b_1 a_1} \Delta_{b_2 a_2} G(x_1, x_2) = b_1 b_2 - b_1 a_2 - a_1 b_2 + a_1 a_2 = (b_2 - a_2)(b_1 - a_1),
\]

which suggests a relation to defining the area of a rectangle to be the product of the length of its sides.
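The alternating sum in Definition 6.4.5 is a sum over the 2^n corners of the rectangle, with sign determined by how many coordinates are taken from a. A minimal Python sketch of that formula follows; the function name `rect_difference` is an assumption made here, not notation from the text.

```python
from itertools import product


def rect_difference(G, a, b):
    """G((a, b]): alternating sum of G over the 2^n corners of the h-rectangle."""
    total = 0.0
    for choice in product((0, 1), repeat=len(a)):
        # choice[k] == 1 means coordinate k is taken from a, otherwise from b.
        corner = [a[k] if pick_a else b[k] for k, pick_a in enumerate(choice)]
        total += (-1) ** sum(choice) * G(*corner)
    return total


# Example 6.4.3: G(x1, x2) = x1 * x2 gives the area (b1 - a1) * (b2 - a2).
area = rect_difference(lambda x1, x2: x1 * x2, (1.0, 2.0), (3.0, 5.0))

# The same formula in three dimensions gives the volume of the box.
volume = rect_difference(lambda x1, x2, x3: x1 * x2 * x3,
                         (0.0, 0.0, 0.0), (1.0, 2.0, 3.0))
print(area, volume)   # prints: 6.0 6.0
```

The cost grows like 2^n evaluations of G, which is one reason the text later turns to integration to build distribution functions.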

Now we define,

Definition 6.4.6: Monotone increasing function

A function F : Rn → R is (monotone) increasing if F ((a, b]) ≥ 0 for all a ≤ b.

With this definition, we have

Theorem 6.4.4

If F is an increasing function on Rn and (c, d] ⊂ (a, b] are h-rectangles in Rn, then F ((c, d]) ≤ F ((a, b]).

This justifies the label of increasing.

Proof. We give the proof in R2. The extension to Rn is tedious but straightforward. Assume (c, d] ⊂ (a, b], or a ≤ c ≤ d ≤ b. I = (a, b] is the disjoint union of 9 h-rectangles I1, ··· ,I9, which are obtained by taking pairs of constraints,

a1 < x1 ≤ c1, c1 < x1 ≤ d1, d1 < x1 ≤ b1,

a2 < x2 ≤ c2, c2 < x2 ≤ d2, d2 < x2 ≤ b2.

See Figure 6.8. In $n$ dimensions, we decompose into $3^n$ h-rectangles.

By writing out all the terms and using cancellation, we find that $F(I) = \sum_{i=1}^{9} F(I_i)$. Since $F$ is increasing, $F(I_i) \ge 0$ for every $i$, and $I_5 = (c, d]$, so $F((c, d]) \le F((a, b])$.
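The cancellation claimed in the proof can be verified on a concrete case. The sketch below uses the hypothetical increasing function F(x1, x2) = x1^3 · x2 (a product of one-dimensional increasing functions) and integer corner points, so that all arithmetic is exact. The helper `rect_difference` implements the corner sum of Definition 6.4.5 and is a name chosen for this example.

```python
from itertools import product


def rect_difference(F, a, b):
    """F((a, b]) as an alternating sum over the corners of the h-rectangle."""
    total = 0
    for choice in product((0, 1), repeat=len(a)):
        corner = [a[k] if pick_a else b[k] for k, pick_a in enumerate(choice)]
        total += (-1) ** sum(choice) * F(*corner)
    return total


def F(x1, x2):
    return x1 ** 3 * x2


a, c, d, b = (0, 0), (1, 1), (2, 3), (4, 5)       # a <= c <= d <= b
cuts = [(a[k], c[k], d[k], b[k]) for k in range(2)]

# The 9 sub-rectangles: one of three strips in each coordinate direction.
pieces = []
for i, j in product(range(3), repeat=2):
    lower = (cuts[0][i], cuts[1][j])
    upper = (cuts[0][i + 1], cuts[1][j + 1])
    pieces.append(rect_difference(F, lower, upper))

whole = rect_difference(F, a, b)
inner = rect_difference(F, c, d)
print(sum(pieces), whole, inner)   # prints: 320 320 14
```

Every piece is nonnegative, so dropping all but the middle one can only decrease the total, which is the content of Theorem 6.4.4.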

Next, we turn to right continuity. First,


Figure 6.8. The decomposition of an h-rectangle into 9 smaller h-rectangles.

Definition 6.4.7

A sequence of points $\{x^{(i)}\}_{i=1}^{\infty}$ in $\mathbb{R}^n$ converges to a point $x$ from the right if for each coordinate $j$, $x^{(i)}_j \ge x_j$ and $\lim_{i \to \infty} x^{(i)}_j = x_j$. We denote this by $x^{(i)} \to x^+$. We also say that $\{x^{(i)}\}_{i=1}^{\infty}$ decreases to $x$ and converges to $x$ from above, and write $x^{(i)} \searrow x$ and $x^{(i)} \downarrow x$.

Definition 6.4.8: Right continuous function

A function $F : \mathbb{R}^n \to \mathbb{R}$ is right continuous at $x$ if for every sequence of points with $x^{(i)} \to x^+$, $F(x^{(i)}) \to F(x)$. It is right continuous if it is right continuous at every point.

Definition 6.4.9: Distribution function

A distribution function $F : \mathbb{R}^n \to \mathbb{R}$ is a monotone increasing, right continuous function.

A simple but very important example is provided by the following result that generalizes Example 6.4.3.

Theorem 6.4.5

Let F1,F2,...,Fn be n distribution functions on R, and define,

F (x1, x2, . . . , xn) = F1(x1) × F2(x2) × · · · × Fn(xn).

$F$ is a distribution function on $\mathbb{R}^n$ with

\[
F((a, b]) = \prod_{i=1}^{n} \big( F_i(b_i) - F_i(a_i) \big).
\]

Proof. First, F is increasing since

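\[
\Delta_{b_1 a_1} \cdots \Delta_{b_n a_n} \big( F_1(x_1) \cdots F_n(x_n) \big) = \big( \Delta_{b_1 a_1} F_1(x_1) \big) \cdots \big( \Delta_{b_n a_n} F_n(x_n) \big) \ge 0.
\]

Right continuity also follows by the right continuity in each coordinate.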


In contrast,

Example 6.4.4

Let F1,F2,...,Fn be n distribution functions on R, and define,

F (x1, x2, . . . , xn) = F1(x1) + F2(x2) + ··· + Fn(xn).

$F$ is a distribution function on $\mathbb{R}^n$. However, for $n \ge 2$, $F((a, b]) = 0$ for all h-rectangles.
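To see the cancellation behind Example 6.4.4 in the simplest case, take n = 2 and apply the formula of Example 6.4.1 to F(x_1, x_2) = F_1(x_1) + F_2(x_2). The following worked equation is a sketch for this case only; the general case follows the same pattern because each F_i is killed by a difference in any other coordinate.

```latex
\begin{align*}
F((a, b]) &= F(b_1, b_2) - F(b_1, a_2) - F(a_1, b_2) + F(a_1, a_2) \\
&= \big( F_1(b_1) + F_2(b_2) \big) - \big( F_1(b_1) + F_2(a_2) \big)
   - \big( F_1(a_1) + F_2(b_2) \big) + \big( F_1(a_1) + F_2(a_2) \big) \\
&= 0.
\end{align*}
```

So the induced measure is identically zero, in contrast to the product construction of Theorem 6.4.5.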

Remark 6.4.1

Constructing monotone increasing functions on Rn, and so distribution functions, by working from the definition is difficult. It turns out that integration provides a systematic way to construct measures in Rn and we give more examples later.

Next, we choose the premeasure. As motivation, we consider

Example 6.4.5

If we choose $F_i(x_i) = x_i$ in Theorem 6.4.5, then $F((a, b]) = \prod_{i=1}^{n} (b_i - a_i)$, which is the $n$-dimensional volume of the h-rectangle $(a, b]$.

In general,

Definition 6.4.10

If $F$ is a distribution function, then for disjoint h-rectangles $\{(a^{(i)}, b^{(i)}]\}_{i=1}^{m}$, define

\[
\mu_0 \Big( \bigcup_{i=1}^{m} (a^{(i)}, b^{(i)}] \Big) = \sum_{i=1}^{m} F((a^{(i)}, b^{(i)}])
\]

and $\mu_0(\emptyset) = 0$.

As in one dimension, the monotonicity of $F$ gives the values of $\mu_0$ on infinite domains, e.g. if $a, b \in \widehat{\mathbb{R}}^n \setminus \mathbb{R}^n$, then

\[
F((a, b)) = \lim_{c \downarrow a,\ d \uparrow b} F((c, d]).
\]

Theorem 6.4.6

$\mu_0$ is a premeasure on $\mathcal{A}^{hi}$.

Proof. The major steps of the proof are similar to the corresponding steps in one dimension.

Step 1 We first establish that $\mu_0$ is well-defined, which is needed because sets in $\mathcal{A}^{hi}$ can be written as disjoint unions of h-rectangles in more than one way. Suppose $I = (a, b]$ is written as the disjoint union $I = \bigcup_{i=1}^{m} (a_i, b_i]$. In one dimension, re-labeling so that $a = a_1 < b_1 = a_2 < b_2 = a_3 < \cdots < b_m = b$, $\mu_0(I) = \sum_{i=1}^{m} (F(b_i) - F(a_i)) = F(b) - F(a)$ because of cancellation. (It is tedious to carry out the cancellation in higher dimensions.)


Now assume that $I$ is represented by two disjoint unions, $I = \bigcup_{i=1}^{m} I_i$ and $I = \bigcup_{j=1}^{l} J_j$. We observe that each $I_i$ can be written as a disjoint union $I_i = \bigcup_{j=1}^{l} (I_i \cap J_j)$, and each $J_j$ can be decomposed similarly. This implies

\[
\sum_{i=1}^{m} \mu_0(I_i) = \sum_{i=1}^{m} \sum_{j=1}^{l} \mu_0(I_i \cap J_j) = \sum_{j=1}^{l} \mu_0(J_j).
\]

So, $\mu_0$ is well defined.

Step 2 $\mu_0(\emptyset) = 0$ is obvious.

Step 3 A cancellation argument similar to the argument in Step 1 shows that $\mu_0$ is finitely additive.

Step 4 We conclude by showing that $\mu_0$ is countably additive. Let $\{I_i\}_{i=1}^{\infty}$ be a disjoint sequence of h-rectangles such that $I = \bigcup_{i=1}^{\infty} I_i \in \mathcal{A}^{hi}$. Then $I$ is a finite disjoint union of h-rectangles. Finite additivity implies that if we show the result for $I = (a, b]$, then the general case follows. Also, finite additivity implies that for any finite $m$,

\[
\mu_0(I) = \mu_0 \Big( \bigcup_{i=1}^{m} I_i \Big) + \mu_0 \Big( I \setminus \bigcup_{i=1}^{m} I_i \Big) \ge \mu_0 \Big( \bigcup_{i=1}^{m} I_i \Big) = \sum_{i=1}^{m} \mu_0(I_i).
\]

This holds for any $m$, so

\[
\mu_0(I) \ge \sum_{i=1}^{\infty} \mu_0(I_i).
\]

So, we just have to establish subadditivity on $\mathcal{A}^{hi}$.

We establish the other direction using a compactness argument. We first assume $-\infty < a < b < \infty$ and choose $\epsilon > 0$. Since $F$ is right continuous, there is a $\delta > 0$ such that $F(a + \delta) - F(a) < \epsilon$. Likewise, if $I_i = (a^{(i)}, b^{(i)}]$ for $i = 1, 2, \dots$, there is a $\delta^{(i)}$ such that $F(b^{(i)} + \delta^{(i)}) - F(b^{(i)}) < \epsilon 2^{-i}$ for all $i$. The open rectangles $\{(a^{(i)}, b^{(i)} + \delta^{(i)})\}_{i=1}^{\infty}$ cover the compact set $[a + \delta, b]$, so there is a finite subcover. We discard any rectangles in the subcover that are contained in a larger rectangle in the subcover. We obtain a subcover with the properties,

1. $\{(a^{(i_j)}, b^{(i_j)} + \delta^{(i_j)})\}_{j=1}^{m}$ covers $[a + \delta, b]$, where $\{i_1, \dots, i_m\}$ is a strictly increasing sequence of positive integers;

2. $a^{(i_1)} < a^{(i_2)} < \cdots < a^{(i_m)}$;

3. $a^{(i_{j+1})} < b^{(i_j)} + \delta^{(i_j)} < b^{(i_{j+1})} + \delta^{(i_{j+1})}$ for $j = 1, \dots, m - 1$.

Then,

\begin{align*}
\mu_0(I) &\le \mu_0((a + \delta, b]) + \epsilon \le \mu_0((a^{(i_1)}, b^{(i_m)} + \delta^{(i_m)}]) + \epsilon \\
&\le \sum_{j=1}^{m} \mu_0((a^{(i_j)}, b^{(i_j)} + \delta^{(i_j)}]) + \epsilon \le \sum_{j=1}^{m} \mu_0(I_{i_j}) + 2\epsilon \le \sum_{i=1}^{\infty} \mu_0(I_i) + 2\epsilon.
\end{align*}

Since $\epsilon$ is arbitrary, this proves subadditivity.

We treat the various cases in which one or more of the coordinates of the points in $(a, b]$ are infinite as in one dimension. For example, if $a = -\infty$, then we show that $F(b) - F(-M) \le \sum_{i=1}^{\infty} \mu_0(I_i) + 2\epsilon$ for any $M < \infty$ and pass to a limit.


Theorem 6.4.7

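Let $F$ be a distribution function on $\mathbb{R}^n$. There is a Borel measure $\mu_F$ on $\mathcal{B}_{\mathbb{R}^n}$ such that $\mu_F((a, b]) = F((a, b])$ for all $a \le b$. Moreover, $\mu_F$ has a unique completion $\overline{\mu}_F$ on $\overline{\mathcal{B}}_{\mathbb{R}^n}$.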

Proof. Theorem 6.4.6 implies that $F$ induces a premeasure $\mu_0$ on $\mathcal{A}^{hi}$. Thus, it can be extended to a measure $\mu_F$ on the σ-algebra generated by $\mathcal{A}^{hi}$, which is $\mathcal{B}_{\mathbb{R}^n}$, by Theorem 5.6.3. The measure is σ-finite since

\[
\mathbb{R}^n = \bigcup_{i=1}^{\infty} ((-i, \dots, -i), (i, \dots, i)].
\]

The measure can be completed as usual.

Definition 6.4.11

Let $F$ be a distribution function in $\mathbb{R}^n$. The completion of the unique Borel measure $\mu_F$ induced by $F$ is called the Lebesgue-Stieltjes (L-S) measure (induced by $F$) on $\mathbb{R}^n$. As above, we only use the overbar notation to refer to the completion when necessary.

As before, there is a converse:

Theorem 6.4.8

Let $\mu$ be a finite measure on $\mathcal{B}_{\mathbb{R}^n}$ and set $F(x) = \mu((-\infty, x])$, $x \in \mathbb{R}^n$. Then, $F$ is a distribution function and $\mu_F = \mu$.

Proof. Since µ is monotone in the sense of measures, F is monotone as well. The continuity of µ from above in the sense of measures implies that F is right continuous. Theorem 6.4.7 implies that F induces a Borel measure µF . We have to show that µF = µ. To keep the notation simple, we prove the theorem for n = 3. The general case is a tedious exercise.

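\begin{align*}
\Delta_{b_3 a_3} F &= F(x_1, x_2, b_3) - F(x_1, x_2, a_3) \\
&= \mu(\{ y : y_1 \le x_1,\ y_2 \le x_2,\ y_3 \le b_3 \}) - \mu(\{ y : y_1 \le x_1,\ y_2 \le x_2,\ y_3 \le a_3 \}) \\
&= \mu(\{ y : y_1 \le x_1,\ y_2 \le x_2,\ a_3 < y_3 \le b_3 \}).
\end{align*}

Thus,

\begin{align*}
\Delta_{b_2 a_2} \Delta_{b_3 a_3} F &= \Delta_{b_2 a_2} \big( F(x_1, x_2, b_3) - F(x_1, x_2, a_3) \big) \\
&= F(x_1, b_2, b_3) - F(x_1, a_2, b_3) - F(x_1, b_2, a_3) + F(x_1, a_2, a_3) \\
&= \mu(\{ y : y_1 \le x_1,\ a_2 < y_2 \le b_2,\ a_3 < y_3 \le b_3 \}).
\end{align*}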


One more application of the difference gives

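$$\mu_F((a, b]) = \Delta^{b_1}_{a_1} \Delta^{b_2}_{a_2} \Delta^{b_3}_{a_3} F = \mu(\{x : a_1 < x_1 \le b_1,\ a_2 < x_2 \le b_2,\ a_3 < x_3 \le b_3\}) = \mu((a, b]).$$

Since $\mu_F$ and $\mu$ agree on all h-rectangles, they agree on $\mathcal{B}_{\mathbb{R}^n}$ by the uniqueness of the extension in Theorem 5.6.3.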
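The iterated differences in the proof can be checked numerically. The following is a minimal sketch, not part of the original text: it expands the $n$ nested differences into a signed sum of $F$ over the $2^n$ vertices of $(a, b]$. The function names and the sample rectangle are illustrative assumptions, and $F(x) = x_1 \cdots x_n$ is used only as a convenient test case.

```python
from itertools import product

def rectangle_measure(F, a, b):
    """Apply the iterated difference operators to F over the h-rectangle (a, b].

    Expanding the n nested differences gives a signed sum of F over the 2**n
    vertices; a vertex gets sign (-1)**k, where k counts coordinates taken from a.
    """
    n = len(a)
    total = 0.0
    for choice in product((0, 1), repeat=n):   # 0 -> use a_i, 1 -> use b_i
        vertex = [b[i] if c else a[i] for i, c in enumerate(choice)]
        sign = (-1) ** (n - sum(choice))
        total += sign * F(vertex)
    return total

def F_product(x):
    """Illustrative choice F(x) = x_1 * ... * x_n (an assumption for this sketch)."""
    p = 1.0
    for xi in x:
        p *= xi
    return p

# Hypothetical rectangle in R^3 with side lengths 2, 3, 2.
a = (0.0, 1.0, -1.0)
b = (2.0, 4.0, 1.0)
vol = rectangle_measure(F_product, a, b)
side_product = (b[0] - a[0]) * (b[1] - a[1]) * (b[2] - a[2])
```

For this choice of $F$, the signed vertex sum and the product of the side lengths should agree, which is the three-dimensional instance of the computation above.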

Definition 6.4.12

If $\mu$ is a finite measure on $\mathcal{B}_{\mathbb{R}^n}$, then $F$ in Theorem 6.4.8 is the distribution function of $\mu$.

An important special case of the Lebesgue-Stieltjes measure is the following.

Definition 6.4.13

The complete measure induced by $F(x_1, \cdots, x_n) = x_1 \times \cdots \times x_n$ with $F((a, b]) = (b_1 - a_1) \times \cdots \times (b_n - a_n)$ (see Example 6.4.5) is called the Lebesgue measure on $\mathbb{R}^n$ and is denoted by $\mu = \mu_x = \mu_L$. We denote the completion of $\mathcal{B}_{\mathbb{R}^n}$ by $\mathcal{L}$, which is the set of Lebesgue measurable sets.

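As in one dimension, we can use the Lebesgue-Stieltjes measure on $\mathbb{R}^n$ to define a measure structure on any measurable subset $A \subset \mathbb{R}^n$ with positive L-S measure.

Definition 6.4.14

Let $F : \mathbb{R}^n \to \mathbb{R}$ be a distribution function and $A \in \mathcal{B}_{\mathbb{R}^n}$. Set

$$\mathcal{B}_A = \{B \cap A : B \in \mathcal{B}_{\mathbb{R}^n}\}, \qquad \mu_A(B) = \mu_{F,A}(B) = \mu_F(B), \quad B \in \mathcal{B}_A.$$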

µF,A is called the restriction of µF to A. We usually drop the A if it is clear from the context.

Theorem 6.4.9

For $A \in \mathcal{B}_{\mathbb{R}^n}$, $(A, \mathcal{B}_A, \mu_{F,A})$ is a (complete) measure space.

Proof. Exercise.

6.5 Regularity of Lebesgue-Stieltjes measure

In this section, we present a fundamental result about equivalent ways to compute the L-S measure. Results like this are generally known as “regularity” results. In some sense, they relate the Lebesgue theory of measure to the Jordan approach to measure. The following results are stated for $\mathbb{R}^n$. Obvious versions hold for the induced measure structure on any set $A \in \mathcal{M}$.


Theorem 6.5.1: Regularity of the Lebesgue-Stieltjes Measure

Assume $F : \mathbb{R}^n \to \mathbb{R}$ is a distribution function, $\mu = \mu_F$ is the associated Lebesgue-Stieltjes measure, and $\mathcal{M} = \mathcal{L}$. For any $A \in \mathcal{M}$,

Computation through open rectangles

$$\mu(A) = \inf\Big\{ \sum_{i=1}^\infty \mu\big((a^{(i)}, b^{(i)})\big) : A \subset \bigcup_{i=1}^\infty (a^{(i)}, b^{(i)}) \Big\}, \qquad (6.8)$$

Computation through open sets

$$\mu(A) = \inf\{\mu(G) : A \subset G,\ G \text{ is open}\}, \qquad (6.9)$$

Computation through closed sets

$$\mu(A) = \sup\{\mu(F) : A \supset F,\ F \text{ is closed}\}, \qquad (6.10)$$

Computation through compact sets

$$\mu(A) = \sup\{\mu(K) : A \supset K,\ K \text{ is compact}\}. \qquad (6.11)$$

Definition 6.5.1

In general, a measure is outer regular if (6.9) holds and inner regular if (6.11) holds for all measurable sets A. A measure that is inner and outer regular is called regular.

Theorem 6.5.1 says that the Lebesgue-Stieltjes measure is regular.

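Proof.

Result 1. Given $\epsilon > 0$, there is a sequence of h-rectangles $\{(a^{(i)}, b^{(i)}]\}_{i=1}^\infty$ covering $A$ with

$$\sum_{i=1}^\infty \mu((a^{(i)}, b^{(i)}]) - \frac{\epsilon}{2} \le \mu(A) \le \sum_{i=1}^\infty \mu((a^{(i)}, b^{(i)}]).$$

By the right continuity of $F$, for each $i$, there is a $\delta^{(i)}$ such that

$$F(b^{(i)} + \delta^{(i)}) - F(b^{(i)}) < \frac{1}{2} \epsilon 2^{-i}.$$

The open rectangles $\{(a^{(i)}, b^{(i)} + \delta^{(i)})\}_{i=1}^\infty$ have the properties

$$\bigcup_{i=1}^\infty (a^{(i)}, b^{(i)}] \subset \bigcup_{i=1}^\infty (a^{(i)}, b^{(i)} + \delta^{(i)}),$$

$$\sum_{i=1}^\infty \mu((a^{(i)}, b^{(i)}]) \le \sum_{i=1}^\infty \mu((a^{(i)}, b^{(i)} + \delta^{(i)})) \le \sum_{i=1}^\infty \mu((a^{(i)}, b^{(i)}]) + \frac{\epsilon}{2}.$$

Thus,

$$\sum_{i=1}^\infty \mu((a^{(i)}, b^{(i)} + \delta^{(i)})) - \epsilon \le \mu(A) \le \sum_{i=1}^\infty \mu((a^{(i)}, b^{(i)} + \delta^{(i)})).$$

Relabeling $b^{(i)} + \delta^{(i)}$ as $b^{(i)}$, this shows (6.8).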


Result 2. Given $\epsilon > 0$, let $G = \bigcup_{i=1}^\infty (a^{(i)}, b^{(i)} + \delta^{(i)})$ as defined in the proof of Result 1. Then, $\mu(G \setminus A) < \epsilon$, proving (6.9).

Result 3. From (6.9), there is an open set $G \supset A^c$ with $\mu(G \setminus A^c) < \epsilon$. $F = G^c$ is closed and $F \subset A$. Also, $\mu(A \setminus F) = \mu(A \cap G) = \mu(G \setminus A^c) < \epsilon$.

Result 4. If $A$ is bounded, then the closed set $F \subset A$ provided by (6.10) is compact, and the result is proved. If $A$ is not bounded, define $I_j = ((-j, \cdots, -j), (j, \cdots, j)]$ for $j = 1, \cdots, \infty$. $I_j \cap A$ is a bounded measurable set and there is a compact set $K_j \subset I_j \cap A$ with $\mu(K_j) \ge \mu(I_j \cap A) - \epsilon 2^{-j}$. Define $\hat{K}_m = \bigcup_{j=1}^m K_j$. $\hat{K}_m$ is a compact set, $\hat{K}_m \subset A$, and $\mu(\hat{K}_m) \ge \mu\big(\bigcup_{j=1}^m I_j \cap A\big) - \epsilon$. Since $\lim_{m \to \infty} \mu\big(\bigcup_{j=1}^m I_j \cap A\big) = \mu(A)$, this proves the result.
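The choice of the widths $\delta^{(i)}$ in Result 1 can be illustrated in one dimension. The sketch below is an illustration under stated assumptions, not the author's construction: the distribution function `F`, the list of intervals, and the halving search in `choose_delta` are all hypothetical. It finds, for each interval $(a_i, b_i]$, a width with $F(b_i + \delta_i) - F(b_i) < \frac{1}{2}\epsilon 2^{-i}$ and accumulates the extra measure introduced by enlarging the intervals.

```python
import math

def F(x):
    """Hypothetical right-continuous distribution function with unit jumps at the integers."""
    return x + math.floor(x)

def choose_delta(F, b, tol):
    """Halve delta until F(b + delta) - F(b) < tol; this terminates by right continuity at b."""
    delta = 1.0
    while F(b + delta) - F(b) >= tol:
        delta /= 2.0
    return delta

eps = 0.1
# Hypothetical intervals (a_i, b_i]; some right endpoints sit at or just below a jump of F.
intervals = [(0.0, 0.5), (0.5, 0.99), (1.0, 1.75), (1.75, 2.0)]

excess = 0.0
for i, (a, b) in enumerate(intervals, start=1):
    tol = 0.5 * eps * 2.0 ** (-i)
    delta = choose_delta(F, b, tol)
    # mu((a, b + delta)) <= mu((a, b + delta]) = mu((a, b]) + F(b + delta) - F(b)
    excess += F(b + delta) - F(b)
```

The accumulated `excess` bounds how much the open cover overshoots the half-open cover, and it stays below $\epsilon/2$ because the tolerances form a geometric series.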

Considering both Problem 5.10.1 and Theorem 6.5.1, it seems reasonable to conclude that if $A \in \mathcal{M}$ and $\mu(A) < \infty$, then there exists an “inner” approximation of $A$ defined by a closed set $F \subset A$ and an “outer” approximation of $A$ defined by an open set $G \supset A$ such that $\mu(G \setminus F) = 0$. Such a conclusion is flawed for several reasons. For instance, the construction of an inner approximation of $A$ guaranteed by Problem 5.10.1 possibly requires countable unions of closed sets, resulting in a set that may not be closed. Similarly, the outer approximation of $A$ guaranteed by Problem 5.10.1 possibly requires taking countable intersections of open sets, resulting in a set that may not be open. The following result states that there exist closed and open sets that form inner and outer approximations, respectively, of $A$ that are themselves “close” in measure. The proof requires a straightforward modification of the details of the previous proof for Results 2 and 3, and we leave the details as an exercise.

Theorem 6.5.2

Assume $F : \mathbb{R}^n \to \mathbb{R}$ is a distribution function, $\mu = \mu_F$ is the associated Lebesgue-Stieltjes measure, and $\mathcal{M} = \mathcal{B}_{\mathbb{R}^n}$. For any $A \in \mathcal{M}$ with $\mu(A) < \infty$ and any $\epsilon > 0$, there exist closed $F \subset A$ and open $G \supset A$ such that $\mu(G \setminus F) < \epsilon$.

These results are sometimes described as approximation results, but they do not provide any direct practical numerical method for approximating measures of sets. We discuss results related to practical approximation in Section 6.7.

6.6 Properties of Lebesgue measure

Theorem 6.3.12 gives a couple of interesting properties of Lebesgue measure in $\mathbb{R}$. We extend those properties to $\mathbb{R}^n$.

Definition 6.6.1

An $n \times n$ matrix $U$ is orthogonal if $UU^\top = U^\top U = I$, where $I$ is the $n \times n$ identity matrix.

Recall that $Uv$ is a rotation (possibly combined with a reflection) of the vector $v$.

Definition 6.6.2

Let $A \in \mathcal{L}$ be a Lebesgue measurable set.


The translation of $A$ is $A + h = \{x = y + h : y \in A\}$, where $h \in \mathbb{R}^n$ is a point. The rotation of $A$ is $UA = \{Ux : x \in A\}$, where $U$ is an $n \times n$ orthogonal matrix. The dilation of $A$ is $\tau A = \{\tau x : x \in A\}$, where $\tau \in \mathbb{R}$ is a number. We illustrate in Figure 6.9.


Figure 6.9. A translation, rotation, and dilation of a set.

Theorem 6.6.1: Invariances of the Lebesgue Measure

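Let $A \in \mathcal{B}_{\mathbb{R}^n}$ or $\mathcal{L}$.

1. For any $h \in \mathbb{R}^n$, $A + h \in \mathcal{B}_{\mathbb{R}^n}$ respectively $\mathcal{L}$ and $\mu_L(A + h) = \mu_L(A)$.
2. For any $n \times n$ orthogonal matrix $U$, $UA \in \mathcal{B}_{\mathbb{R}^n}$ respectively $\mathcal{L}$ and $\mu_L(UA) = \mu_L(A)$.
3. For any $\tau \in \mathbb{R}$, $\tau A \in \mathcal{B}_{\mathbb{R}^n}$ respectively $\mathcal{L}$ and $\mu_L(\tau A) = |\tau|^n \mu_L(A)$.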

Proof. We first show the result holds for h-rectangles. Only the finite case requires proof. For a translation by $h$, $(a, b] \to (a + h, b + h]$, and the side lengths are unchanged: $(b_1 + h_1 - (a_1 + h_1), \cdots, b_n + h_n - (a_n + h_n)) = (b_1 - a_1, \cdots, b_n - a_n)$. Dilation is similarly easy, since $(\tau b_1 - \tau a_1, \cdots, \tau b_n - \tau a_n) = \tau (b_1 - a_1, \cdots, b_n - a_n)$, so the product of the side lengths is multiplied by $|\tau|^n$. For rotation, we observe that, after translating by $-a$, $(a, b]$ is a parallelepiped spanned by the $n$ vectors

$$\begin{aligned}
v^{(1)} &= (b_1 - a_1, 0, 0, \cdots, 0)^\top \\
v^{(2)} &= (0, b_2 - a_2, 0, \cdots, 0)^\top \\
&\ \ \vdots \\
v^{(n)} &= (0, 0, \cdots, 0, b_n - a_n)^\top
\end{aligned}$$

Standard linear algebra says that if $M$ is an $n \times n$ matrix, then the volume of the parallelepiped spanned by the vectors $\{Mv^{(1)}, \cdots, Mv^{(n)}\}$ is $|\det(M)|$ times the volume of the parallelepiped spanned by the vectors $\{v^{(1)}, \cdots, v^{(n)}\}$. Applying this to an orthogonal matrix $U$, we obtain 2. since $|\det(U)| = 1$.

It follows that the result holds for $\mathcal{B}_{\mathbb{R}^n}$ by the uniqueness of the extension of premeasures in Theorem 5.6.3. If $\mu_L(A) = 0$, then direct computation shows that $\mu_L(A + h) = \mu_L(UA) = \mu_L(\tau A) = 0$. In other words, the sets of Lebesgue measure zero are invariant under translations, rotations, and dilations. Since $\mathcal{L}$ consists of unions of sets in $\mathcal{B}_{\mathbb{R}^n}$ and subsets of sets of Lebesgue measure zero, the result holds on $\mathcal{L}$.


Actually, we can show that 2. and 3. follow from 1., so in some sense, translation invariance is the fundamental invariance property characteristic of Lebesgue measure. In fact, Lebesgue measure is the only measure with this property in a certain sense.

Theorem 6.6.2: Uniqueness of Lebesgue measure

If $\mu$ is a translation invariant regular Borel measure that is finite on compact sets, then there is a constant $c$ such that $\mu(A) = c\mu_L(A)$ for all $A \in \mathcal{B}_{\mathbb{R}^n}$.

Proof. Let $I = ((0, \cdots, 0), (1, \cdots, 1)]$ and set $c = \mu(I)$, so $\mu(I) = c\mu_L(I)$. $I$ is the union of $2^{nk}$ disjoint generalized cubes of equal side length $2^{-k}$ for $k \in \mathbb{N}$. By translation invariance, each of these cubes has the same measure. If $R$ denotes any of the cubes,

$$2^{nk} \mu(R) = \mu(I) = c\mu_L(I) = 2^{nk} c \mu_L(R).$$

Thus, $\mu(R) = c\mu_L(R)$ for any generalized cube. It follows that the result holds for h-rectangles, and thus for all Borel measurable sets.

The proof of Theorem 6.6.1 (2) can be extended to handle general linear transformations.

Theorem 6.6.3

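If $T : \mathbb{R}^n \to \mathbb{R}^n$ is a nonsingular linear transformation, then $T : \mathcal{B}_{\mathbb{R}^n} \to \mathcal{B}_{\mathbb{R}^n}$ and

$$\mu_L(TA) = |\det T| \cdot \mu_L(A), \quad A \in \mathcal{B}_{\mathbb{R}^n}. \qquad (6.12)$$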

Proof. When $B$ is an open set, $T^{-1}B$ is open. Hence, $\mathcal{A} = \{TA : A \in \mathcal{B}_{\mathbb{R}^n}\}$ contains all open sets. It is straightforward to show that $\mathcal{A}$ is a σ-algebra, hence $\mathcal{A} = \mathcal{B}_{\mathbb{R}^n}$.

We set $\mu_1(A) = \mu_L(TA)$ and $\mu_2(A) = |\det T| \mu_L(A)$ for $A \in \mathcal{B}_{\mathbb{R}^n}$. If we show that $\mu_1 = \mu_2$ on all h-rectangles, then they are equal on all sets in $\mathcal{B}_{\mathbb{R}^n}$ by the usual argument. By Theorem 6.6.1, it suffices to consider h-rectangles of the form $A = \{x : 0 < x_i \le a_i,\ 1 \le i \le n\}$ for $a = (a_1, \cdots, a_n)$ with $a_i > 0$ for all $i$.

By standard linear algebra results, $T$ can be decomposed into elementary operators $T = S_1 S_2 \cdots S_k$, where each $S_i$ is either a permutation operator, dilation of one variable, or translation in one variable, and $\det T = \prod_{i=1}^k \det S_i$. Permutation operators have determinant equal to $\pm 1$. The effect of the other operators is treated in Theorem 6.6.1. Taking the product yields (6.12).
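Formula (6.12) can be sanity-checked in the plane, where the image of a rectangle under a linear map is a parallelogram whose area is given by the shoelace formula. This is a minimal sketch under assumptions: the matrix `T`, the rectangle, and the helper names are hypothetical, and the shoelace formula replaces the decomposition into elementary operators used in the proof.

```python
def apply(T, p):
    """Apply the 2 x 2 matrix T (a tuple of rows) to the point p."""
    return (T[0][0] * p[0] + T[0][1] * p[1], T[1][0] * p[0] + T[1][1] * p[1])

def det2(T):
    """Determinant of a 2 x 2 matrix."""
    return T[0][0] * T[1][1] - T[0][1] * T[1][0]

def shoelace_area(vertices):
    """Area of a simple polygon from its vertices listed in order."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

a = (2.0, 3.0)                                   # rectangle {0 < x_i <= a_i}
corners = [(0.0, 0.0), (a[0], 0.0), (a[0], a[1]), (0.0, a[1])]
T = ((1.0, 2.0), (-1.0, 3.0))                    # hypothetical nonsingular map

image = [apply(T, p) for p in corners]           # vertices of the parallelogram TA
area_image = shoelace_area(image)                # mu_L(TA), boundary has measure zero
area_predicted = abs(det2(T)) * a[0] * a[1]      # |det T| * mu_L(A)
```

The two areas should coincide. The boundary of the rectangle has Lebesgue measure zero, so it does not matter whether the half-open or closed rectangle is used.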

6.7 Approximation of Lebesgue-Stieltjes measure

In this section, we present several fundamental approximation results that are connected to practical computation. The first is an application of Theorem 5.7.1.


Theorem 6.7.1: Approximating Lebesgue-Stieltjes Measure Using Open Cubes

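If $(\mathbb{R}^n, \mathcal{B}_{\mathbb{R}^n}, \mu)$ is a Lebesgue-Stieltjes measure space and $A \in \mathcal{B}_{\mathbb{R}^n}$ with $\mu(A) < \infty$, then for any $\epsilon > 0$, there is a finite collection of disjoint open cubes $\{I^{(i)}\}_{i=1}^m$ such that

$$\Big| \mu(A) - \mu\Big(\bigcup_{i=1}^m I^{(i)}\Big) \Big| \le \mu\Big(A \,\triangle\, \bigcup_{i=1}^m I^{(i)}\Big) < \epsilon. \qquad (6.13)$$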

Proof. Theorem 5.7.1 implies that (6.13) holds for a finite disjoint collection of h-rectangles $\{I^{(i)}\}_{i=1}^m$. Each h-rectangle $I^{(i)}$ can be written as a finite disjoint union of open generalized cubes except for a finite set of cube faces, which is a set of measure 0.

For the next result, we first state

Theorem 6.7.2

Let $(\mathbb{R}^n, \mathcal{B}_{\mathbb{R}^n}, \mu)$ be a Lebesgue-Stieltjes measure space. The measure of any face of a generalized rectangle is 0.

Proof. Exercise.

The next result is a modification of the fundamental approximation result Theorem 6.1.1.

Theorem 6.7.3: Approximation of Open Sets by Open Cubes

Let $G$ be an open set in $\mathbb{R}^n$ with finite measure. There exists a countable set of open disjoint generalized cubes $\{I_i\}_{i=1}^\infty$ such that $\mu_L\big(G \setminus \bigcup_{i=1}^\infty I_i\big) = 0$.

Proof. We modify the proof of Theorem 6.1.1.

Let $\mathcal{C}_j$ be the countable family of half-open “dyadic” cubes of the form $(i_1 2^{-j}, (i_1 + 1)2^{-j}] \times \cdots \times (i_n 2^{-j}, (i_n + 1)2^{-j}]$, $i_1, \cdots, i_n \in \mathbb{Z}$, that is, the cubes whose vertices lie on the rectangular lattice of points with spacing $2^{-j}$, and let $\mathcal{C} = \bigcup_{j=1}^\infty \mathcal{C}_j$. $\mathcal{C}$ is a countable collection.

Let $G_1 = G \setminus \bigcup_{I \in \mathcal{C}_1,\, I \subset G} I$. The interior $G_1^\circ$ of $G_1$ is an open set that is equal to $G_1$ except for a union of faces of cubes in $\mathcal{C}_1$. We approximate $G_1^\circ$ using half-open cubes in $\bigcup_{j=2}^\infty \mathcal{C}_j$. Abusing notation, we define $G_2 = G_1^\circ \setminus \bigcup_{I \in \mathcal{C}_2,\, I \subset G_1^\circ} I$, and then construct the interior $G_2^\circ$. Proceeding with this construction, we cover $G$ with a countable union of disjoint open cubes except possibly for a countable collection of faces of cubes.
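The dyadic construction can be explored numerically for a concrete open set. The sketch below is an illustration under assumptions, not part of the original proof: it takes $G$ to be the open unit disk in $\mathbb{R}^2$ and, for each level $j$, totals the area of the level-$j$ dyadic squares whose closure lies in $G$. A cube selected at a coarser level splits into level-$j$ cubes inside $G$, so this total should match the area covered by the construction through level $j$, up to faces. The function name and the range of levels are hypothetical choices.

```python
import math

def inner_dyadic_area(level):
    """Total area of level-`level` dyadic squares whose closure lies in the open unit disk.

    A square (i/2^j, (i+1)/2^j] x (k/2^j, (k+1)/2^j] has its closure inside the disk
    exactly when its corner farthest from the origin has norm < 1; integer
    arithmetic keeps that test exact.
    """
    n = 2 ** level
    count = 0
    for i in range(-n, n):
        fx = max(abs(i), abs(i + 1))         # farthest x-coordinate, in units of 2^-level
        for k in range(-n, n):
            fy = max(abs(k), abs(k + 1))     # farthest y-coordinate, same units
            if fx * fx + fy * fy < n * n:
                count += 1
    return count / (n * n)                   # each square has area 4^-level

areas = [inner_dyadic_area(j) for j in range(1, 8)]
gaps = [math.pi - s for s in areas]          # uncovered area after each level
```

The entries of `areas` should be nondecreasing and approach $\mu_L(G) = \pi$ from below, consistent with the uncovered part of $G$ shrinking to a set of measure zero.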

We conclude with an analog of Theorem 6.1.1 that uses open balls instead of cubes.

Definition 6.7.1


The open ball in Rn with center a and radius r is the set

$$B_r(a) = \{x \in \mathbb{R}^n : \|x - a\| < r\}, \qquad \|x\| = \Big(\sum_{i=1}^n x_i^2\Big)^{1/2}.$$

Theorem 6.7.4: Approximation of Open Sets by Open Balls

Let $G$ be an open set in $\mathbb{R}^n$. There is a countable collection of disjoint open balls $\{B_i\}_{i=1}^\infty$ with $B_i \subset G$ for all $i$ such that $\mu_L\big(G \setminus \bigcup_{i=1}^\infty B_i\big) = 0$.

Proof. We first consider the case that $G$ is contained in some cube of finite measure. Let $R = (-1, 1)^n = (-1, 1) \times \cdots \times (-1, 1)$ be the open cube with side length 2 and let $B = B_1(0)$ be the open ball inscribed in $R$. Set $a = \mu_L(B)/\mu_L(R)$, so $0 < a < 1$, and set $b = 1 - a$. Finally, choose $c > 1$ so that $cb < 1$.

By Theorem 6.7.3, $G$ can be written as a union of disjoint generalized open cubes $\{I_i\}_{i=1}^\infty$ and a set of measure 0. The cubes have the form $I_i = d_i R + h^{(i)}$, with $d_i > 0$ and $h^{(i)} \in \mathbb{R}^n$. In each cube $I_i$, we inscribe the open ball $B_i$, where $B_i = d_i B + h^{(i)}$. The invariance properties of Lebesgue measure imply $\mu_L(B_i)/\mu_L(I_i) = a$, so

$$\mu_L(I_i \setminus B_i) = \mu_L(I_i) - \mu_L(B_i) = b\mu_L(I_i).$$

Thus,

$$\mu_L\Big(G \setminus \bigcup_{i=1}^\infty B_i\Big) = \sum_{i=1}^\infty \mu_L(I_i \setminus B_i) = b \sum_{i=1}^\infty \mu_L(I_i) = b\mu_L(G).$$

We are making progress towards approximating $G$ since $b < 1$. We take a finite number $m_1$ of the balls so that

$$\mu_L\Big(G \setminus \bigcup_{i=1}^{m_1} B_i\Big) \le cb\,\mu_L(G).$$

Next, define $G_1 = G \setminus \bigcup_{i=1}^{m_1} \overline{B_i}$, where $\overline{B_i}$ is the closure of $B_i$. $G_1$ is an open set. We repeat the argument to obtain a collection of disjoint open balls $\{B_i^{(2)}\}_{i=1}^{m_2}$ with $B_i^{(2)} \subset G_1$ and

$$\mu_L\Big(G_1 \setminus \bigcup_{i=1}^{m_2} B_i^{(2)}\Big) \le cb\,\mu_L(G_1) \le (cb)^2 \mu_L(G).$$

Continuing, we obtain collections of disjoint open balls $\{B_i^{(j)}\}_{i=1}^{m_j}$, $j = 1, 2, \cdots$ (writing $B_i^{(1)} = B_i$ and setting $G_0 = G$), and sets $G_j = G_{j-1} \setminus \bigcup_{i=1}^{m_j} \overline{B_i^{(j)}}$ such that

$$\mu_L\Big(G_j \setminus \bigcup_{i=1}^{m_{j+1}} B_i^{(j+1)}\Big) \le (cb)^{j+1} \mu_L(G).$$

Since $(cb)^j \to 0$ as $j \to \infty$, and the boundaries of the balls have measure zero, $\mu_L\Big(G \setminus \bigcup_{j=1}^\infty \bigcup_{i=1}^{m_j} B_i^{(j)}\Big) = 0$.
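The constants in this argument are easy to compute. The following sketch is illustrative only: it evaluates $a = \mu_L(B)/\mu_L(R)$ using the standard unit-ball volume formula $\pi^{n/2}/\Gamma(n/2 + 1)$, which is not stated in the text, and counts how many stages are needed before $(cb)^j$ drops below a tolerance. The parameter `slack`, which fixes the choice of $c$, and the function names are hypothetical.

```python
import math

def ball_to_cube_ratio(n):
    """a = mu_L(B_1(0)) / mu_L((-1, 1)^n), using the standard unit-ball volume formula."""
    ball = math.pi ** (n / 2) / math.gamma(n / 2 + 1)
    return ball / 2.0 ** n

def stages_needed(n, tol, slack=0.5):
    """Smallest j with (c*b)^j < tol, where c > 1 is picked so that c*b < 1.

    `slack` is a hypothetical knob: c*b sits this fraction of the way from b to 1.
    """
    b = 1.0 - ball_to_cube_ratio(n)
    cb = b + slack * (1.0 - b)          # equivalent to choosing c = cb / b > 1
    j = 1
    while cb ** j >= tol:
        j += 1
    return j, cb

a2 = ball_to_cube_ratio(2)              # disk in a square
a3 = ball_to_cube_ratio(3)              # ball in a cube
j2, cb2 = stages_needed(2, 1e-6)
```

The ratio $a$ shrinks as $n$ grows, so $b$ moves toward 1 and more stages are needed in higher dimensions, but the geometric decay of $(cb)^j$ holds for every fixed $n$.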


6.8 References

6.9 Worked problems

In the problems below, we explore some subtle and interesting results concerning relationships of intervals to Lebesgue measurable sets. The reader is encouraged to generalize the problem statements and solutions to $\mathbb{R}^n$.

Problem 6.9.1 states that if we take any Lebesgue measurable set, define closed intervals around each point in that set, and then take the union over all of these intervals, then we have constructed another Lebesgue measurable set. This is quite surprising since the union may be uncountable (and certainly must be uncountable whenever the original set has positive measure).

Problem 6.9.1

Suppose $A \in \mathcal{L}$ and

$$B = \bigcup_{x \in A} [x - 1, x + 1].$$

Prove $B \in \mathcal{L}$. Hint: This result would actually be trivial to prove if we used open intervals (why?). With this in mind, consider how to rewrite $B$ as unions of different types of sets in a way that involves the use of Theorem 6.3.12.

Problem 6.9.2 involves relationships of sets of zero $\mu_L$-measure and the $\mu_L$-measure of intersections of sets with intervals. The forward direction of this problem gives a necessary and sufficient condition for a Lebesgue measurable set to have zero $\mu_L$-measure. The contrapositive of the forward direction is quite interesting and essentially states that if $\mu_L(A) > 0$, then at least one measurable subset of $A$ is “approximated well” (in $\mu_L$-measure) by an interval.

Problem 6.9.2

Let $\epsilon \in [0, 1)$ and $A \in \mathcal{L}$. Prove that $\mu_L(A \cap I) \le \epsilon \mu_L(I)$ for all intervals $I$ if and only if $\mu_L(A) = 0$. Hint: To prove the forward direction, first assume $\mu_L(A) < \infty$. Then, for the case $\mu_L(A) = \infty$, apply continuity from below to $A \cap (-i, i)$.

It is possible to describe the construction of $A \in \mathcal{B}_{\mathbb{R}}$ with positive (and finite) $\mu_L$-measure such that $0 < \mu_L(A \cap I) < \mu_L(I)$ for all intervals $I$, which shows the necessity of $\epsilon < 1$ in Problem 6.9.2. However, this is very difficult. We only outline the ideas, give some hints, and encourage the interested reader to fill in the missing details of this construction, which is provided after the solution to Problem 6.9.2 in Appendix A.

The next result is known as Steinhaus' Theorem. In this result, we define a set using the differences of points taken from a Lebesgue measurable set with positive measure and show that this new set must contain an open interval centered at the origin. The reader should note that we are not claiming this new set is also Lebesgue measurable. While a relatively short proof is possible using topological compactness, we encourage the reader to consider how Problem 6.9.2 may be of use.


Problem 6.9.3

Suppose A ∈ L and µL(A) > 0, and define B := {x − y : x, y ∈ A}. Prove there exists δ > 0 such that (−δ, δ) ⊂ B.
