
18.175: Lecture 3
Integration

Scott Sheffield

MIT

Outline

Random variables

Integration

Expectation

Recall definitions

- Probability space is triple (Ω, F, P) where Ω is sample space, F is set of events (the σ-algebra) and P : F → [0, 1] is the probability function.

- σ-algebra is collection of subsets closed under complementation and countable unions. Call (Ω, F) a measure space.

- Measure is function µ : F → R satisfying µ(A) ≥ µ(∅) = 0 for all A ∈ F and countable additivity: µ(∪_i A_i) = Σ_i µ(A_i) for disjoint A_i.

- Measure µ is probability measure if µ(Ω) = 1.

- The Borel σ-algebra B on a topological space is the smallest σ-algebra containing all open sets.
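
A minimal concrete instance (added here for illustration, not part of the original slide): take Ω = {1, 2, 3, ...}, let F be all subsets of Ω, and set

    P(A) = Σ_{k ∈ A} 2^{−k}.

Then P(∅) = 0, P(Ω) = Σ_{k≥1} 2^{−k} = 1, and countable additivity holds because a non-negative series may be summed in any order, so P is a probability measure.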

Defining random variables

- Random variable is a measurable function from (Ω, F) to (R, B). That is, a function X : Ω → R such that the preimage of every set in B is in F. Say X is F-measurable.

- Question: to prove X is measurable, is it enough to show that the pre-image of every open set is in F?

- Theorem: If X⁻¹(A) ∈ F for all A ∈ A and A generates S, then X is a measurable map from (Ω, F) to (S, S).

- Can talk about σ-algebra generated by random variable(s): smallest σ-algebra that makes a random variable (or a collection of random variables) measurable.

- Example of random variable: indicator function of a set. Or sum of finitely many indicator functions of sets.

- Let F(x) = F_X(x) = P(X ≤ x) be distribution function for X. Write f = f_X = F_X′ for density function of X.
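
A quick check of the indicator example (added, not spelled out on the slide): for A ∈ F let X = 1_A. For any B ∈ B, the preimage X⁻¹(B) is one of ∅, A, Aᶜ, Ω (according to which of 0 and 1 lie in B), and all four are in F, so X is F-measurable. Its distribution function is

    F_X(x) = 0 for x < 0,   F_X(x) = 1 − P(A) for 0 ≤ x < 1,   F_X(x) = 1 for x ≥ 1.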

Examples of possible random variable laws

- What functions can be distributions of random variables?

- Non-decreasing, right-continuous, with lim_{x→∞} F(x) = 1 and lim_{x→−∞} F(x) = 0.

- Other examples of distribution functions: uniform on [0, 1], exponential with rate λ, standard normal, Cantor set measure.

- Can also define distribution functions for random variables that are a.s. integers (like Poisson or geometric or binomial random variables, say). How about for a ratio of two independent Poisson random variables? (This is a random rational with dense support on [0, ∞).)

- Higher dimensional density functions analogously defined.
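
For concreteness (an added example, not on the slide): the exponential distribution with rate λ has

    F(x) = 1 − e^{−λx} for x ≥ 0 (and F(x) = 0 for x < 0),   f(x) = F′(x) = λ e^{−λx} for x > 0,

which is non-decreasing, right-continuous, and has the required limits 0 at −∞ and 1 at +∞.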

Other properties

- Compositions of measurable maps between measure spaces are measurable.

- If X_1, ..., X_n are random variables in R, defined on the same measure space, then (X_1, ..., X_n) is a random variable in Rⁿ.

- Sums and products of finitely many random variables are random variables. If X_i is a countable sequence of random variables, then inf_n X_n is a random variable. Same for lim inf, sup, lim sup.

- Given infinite sequence of random variables, consider the event that they converge to a limit. Is this a measurable event?

- Yes. If it has measure one, we say sequence converges almost surely.
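
One way to see the measurability claim (an added sketch, not written out on the slide): by the Cauchy criterion,

    {ω : lim_n X_n(ω) exists} = ∩_{k≥1} ∪_{N≥1} ∩_{m,n≥N} {ω : |X_n(ω) − X_m(ω)| < 1/k},

and each set {|X_n − X_m| < 1/k} is an event since X_n − X_m is a random variable; countable unions and intersections of events are again events.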

Outline

Random variables

Integration

Expectation

Lebesgue integration

- Lebesgue: If you can measure, you can integrate.

- In more words: if (Ω, F) is a measure space with a measure µ with µ(Ω) < ∞ and f : Ω → R is F-measurable, then we can define ∫ f dµ (for non-negative f ; also for general f if both f ∨ 0 and −(f ∧ 0) have finite integrals...).

- Idea: define integral, verify linearity and positivity (a.e. non-negative functions have non-negative integrals) in 4 cases:

  - f takes only finitely many values.
  - f is bounded (hint: reduce to previous case by rounding down or up to nearest multiple of ε for ε → 0).
  - f is non-negative (hint: reduce to previous case by taking f ∧ N for N → ∞).
  - f is any measurable function (hint: treat positive/negative parts separately; difference makes sense if both integrals finite).
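
A sketch of the base case (added; my phrasing, not the slide's): if f takes finitely many values a_1, ..., a_n on the disjoint sets A_i = {ω : f(ω) = a_i} ∈ F, define

    ∫ f dµ = Σ_{i=1}^{n} a_i µ(A_i).

Linearity and positivity for such simple functions reduce to finite additivity of µ, and the remaining three cases are built on this one via the approximations in the hints above.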

Lebesgue integration

- Can we extend previous discussion to case µ(Ω) = ∞?

- Theorem: if f and g are integrable then:

  - If f ≥ 0 a.s. then ∫ f dµ ≥ 0.
  - For a, b ∈ R, have ∫ (af + bg) dµ = a ∫ f dµ + b ∫ g dµ.
  - If g ≤ f a.s. then ∫ g dµ ≤ ∫ f dµ.
  - If g = f a.e. then ∫ g dµ = ∫ f dµ.
  - |∫ f dµ| ≤ ∫ |f| dµ.

- When (Ω, F, µ) = (Rᵈ, Rᵈ, λ), write ∫_E f(x) dx = ∫ 1_E f dλ.
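
The last property follows from the two before it (an added one-line derivation): since −|f| ≤ f ≤ |f| a.s., monotonicity and linearity give −∫ |f| dµ ≤ ∫ f dµ ≤ ∫ |f| dµ, i.e. |∫ f dµ| ≤ ∫ |f| dµ.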

Outline

Random variables

Integration

Expectation

Expectation

- Given probability space (Ω, F, P) and random variable X, we write EX = ∫ X dP. Always defined if X ≥ 0, or if integrals of max{X, 0} and min{X, 0} are separately finite.

- Since expectation is an integral, we can interpret our basic properties of integrals (as well as results to come: Jensen's inequality, Hölder's inequality, Fatou's lemma, monotone convergence, dominated convergence, etc.) as properties of expectation.

- EXᵏ is called the kth moment of X. Also, if m = EX then E(X − m)² is called the variance of X.
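
A quick example (added for concreteness): if X is Bernoulli(p), so P(X = 1) = p and P(X = 0) = 1 − p, then EX = p, EX² = p, and the variance is

    E(X − p)² = EX² − (EX)² = p − p² = p(1 − p).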

18.175 Lecture 3 I Main idea of proof: Approximate φ below by linear function L that agrees with φ at EX . I Applications: Utility, hedge fund payout functions. R p 1/p I H¨older’sinequality: Write kf kp = ( |f | dµ) for R 1 ≤ p < ∞. If 1/p + 1/q = 1, then |fg|dµ ≤ kf kpkgkq. I Main idea of proof: Rescale so that kf kpkgkq = 1. Use some basic calculus to check that for any positive x and y we have xy ≤ xp/p + y q/p. Write x = |f |, y = |g| and integrate R 1 1 to get |fg|dµ ≤ p + q = 1 = kf kpkgkq. I Cauchy-Schwarz inequality: Special case p = q = 2. Gives R |fg|dµ ≤ kf k2kgk2. Says that dot product of two vectors is at most product of vector lengths.

Random variables Integration Expectation Properties of expectation/integration

I Jensen’s inequality: If µ is probability measure and R R φ : R → R is convex then φ( fdµ) ≤ φ(f )dµ. If X is random variable then Eφ(X ) ≥ φ(EX ).

18.175 Lecture 3 I Applications: Utility, hedge fund payout functions. R p 1/p I H¨older’sinequality: Write kf kp = ( |f | dµ) for R 1 ≤ p < ∞. If 1/p + 1/q = 1, then |fg|dµ ≤ kf kpkgkq. I Main idea of proof: Rescale so that kf kpkgkq = 1. Use some basic calculus to check that for any positive x and y we have xy ≤ xp/p + y q/p. Write x = |f |, y = |g| and integrate R 1 1 to get |fg|dµ ≤ p + q = 1 = kf kpkgkq. I Cauchy-Schwarz inequality: Special case p = q = 2. Gives R |fg|dµ ≤ kf k2kgk2. Says that dot product of two vectors is at most product of vector lengths.

Random variables Integration Expectation Properties of expectation/integration

I Jensen’s inequality: If µ is probability measure and R R φ : R → R is convex then φ( fdµ) ≤ φ(f )dµ. If X is random variable then Eφ(X ) ≥ φ(EX ). I Main idea of proof: Approximate φ below by linear function L that agrees with φ at EX .

18.175 Lecture 3 R p 1/p I H¨older’sinequality: Write kf kp = ( |f | dµ) for R 1 ≤ p < ∞. If 1/p + 1/q = 1, then |fg|dµ ≤ kf kpkgkq. I Main idea of proof: Rescale so that kf kpkgkq = 1. Use some basic calculus to check that for any positive x and y we have xy ≤ xp/p + y q/p. Write x = |f |, y = |g| and integrate R 1 1 to get |fg|dµ ≤ p + q = 1 = kf kpkgkq. I Cauchy-Schwarz inequality: Special case p = q = 2. Gives R |fg|dµ ≤ kf k2kgk2. Says that dot product of two vectors is at most product of vector lengths.

Random variables Integration Expectation Properties of expectation/integration

I Jensen’s inequality: If µ is probability measure and R R φ : R → R is convex then φ( fdµ) ≤ φ(f )dµ. If X is random variable then Eφ(X ) ≥ φ(EX ). I Main idea of proof: Approximate φ below by linear function L that agrees with φ at EX . I Applications: Utility, hedge fund payout functions.

18.175 Lecture 3 I Main idea of proof: Rescale so that kf kpkgkq = 1. Use some basic calculus to check that for any positive x and y we have xy ≤ xp/p + y q/p. Write x = |f |, y = |g| and integrate R 1 1 to get |fg|dµ ≤ p + q = 1 = kf kpkgkq. I Cauchy-Schwarz inequality: Special case p = q = 2. Gives R |fg|dµ ≤ kf k2kgk2. Says that dot product of two vectors is at most product of vector lengths.

Random variables Integration Expectation Properties of expectation/integration

I Jensen’s inequality: If µ is probability measure and R R φ : R → R is convex then φ( fdµ) ≤ φ(f )dµ. If X is random variable then Eφ(X ) ≥ φ(EX ). I Main idea of proof: Approximate φ below by linear function L that agrees with φ at EX . I Applications: Utility, hedge fund payout functions. R p 1/p I H¨older’sinequality: Write kf kp = ( |f | dµ) for R 1 ≤ p < ∞. If 1/p + 1/q = 1, then |fg|dµ ≤ kf kpkgkq.

18.175 Lecture 3 I Cauchy-Schwarz inequality: Special case p = q = 2. Gives R |fg|dµ ≤ kf k2kgk2. Says that dot product of two vectors is at most product of vector lengths.

Random variables Integration Expectation Properties of expectation/integration

I Jensen’s inequality: If µ is probability measure and R R φ : R → R is convex then φ( fdµ) ≤ φ(f )dµ. If X is random variable then Eφ(X ) ≥ φ(EX ). I Main idea of proof: Approximate φ below by linear function L that agrees with φ at EX . I Applications: Utility, hedge fund payout functions. R p 1/p I H¨older’sinequality: Write kf kp = ( |f | dµ) for R 1 ≤ p < ∞. If 1/p + 1/q = 1, then |fg|dµ ≤ kf kpkgkq. I Main idea of proof: Rescale so that kf kpkgkq = 1. Use some basic calculus to check that for any positive x and y we have xy ≤ xp/p + y q/p. Write x = |f |, y = |g| and integrate R 1 1 to get |fg|dµ ≤ p + q = 1 = kf kpkgkq.

Properties of expectation/integration

- Jensen's inequality: If µ is a probability measure and φ : R → R is convex then φ(∫ f dµ) ≤ ∫ φ(f) dµ. If X is a random variable then Eφ(X) ≥ φ(EX).

- Main idea of proof: Approximate φ below by a linear function L that agrees with φ at EX.

- Applications: Utility, hedge fund payout functions.

- Hölder's inequality: Write ‖f‖_p = (∫ |f|^p dµ)^{1/p} for 1 ≤ p < ∞. If 1/p + 1/q = 1, then ∫ |fg| dµ ≤ ‖f‖_p ‖g‖_q.

- Main idea of proof: Rescale so that ‖f‖_p ‖g‖_q = 1. Use some basic calculus to check that for any positive x and y we have xy ≤ x^p/p + y^q/q. Write x = |f|, y = |g| and integrate to get ∫ |fg| dµ ≤ 1/p + 1/q = 1 = ‖f‖_p ‖g‖_q.

- Cauchy-Schwarz inequality: Special case p = q = 2. Gives ∫ |fg| dµ ≤ ‖f‖_2 ‖g‖_2. Says that dot product of two vectors is at most product of vector lengths.
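
The "basic calculus" step can be filled in as follows (added sketch): for x, y > 0, concavity of log gives

    log(x^p/p + y^q/q) ≥ (1/p) log(x^p) + (1/q) log(y^q) = log(xy),

since 1/p + 1/q = 1; exponentiating yields Young's inequality xy ≤ x^p/p + y^q/q.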

Bounded convergence theorem

- Bounded convergence theorem: Consider a probability measure µ and suppose |f_n| ≤ M a.s. for all n and some fixed M > 0, and that f_n → f in probability (i.e., lim_{n→∞} µ{x : |f_n(x) − f(x)| > ε} = 0 for all ε > 0). Then

    ∫ f dµ = lim_{n→∞} ∫ f_n dµ.

  (Build counterexample for infinite measure space using wide and short rectangles?...)

- Main idea of proof: for any ε, δ can take n large enough so ∫ |f_n − f| dµ < Mδ + ε.
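
One natural reading of the "wide and short rectangles" hint (an added example, not from the slides): on (R, B, λ) take f_n = (1/n) 1_{[0,n]}. Then |f_n| ≤ 1 and f_n → 0 in measure (for any ε > 0 the set {|f_n| > ε} is empty once n > 1/ε), but ∫ f_n dλ = 1 for every n while ∫ 0 dλ = 0. So the theorem genuinely needs µ(Ω) < ∞.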

18.175 Lecture 3 I Main idea of proof: first reduce to case that the fn are increasing by writing gn(x) = infm≥n fm(x) and observing that gn(x) ↑ g(x) = lim infn→∞ fn(x). Then truncate, used bounded convergence, take limits.

Random variables Integration Expectation Fatou’s lemma

I Fatou’s lemma: If fn ≥ 0 then Z Z lim inf fndµ ≥ lim inf fn)dµ. n→∞ n→∞

(Counterexample for opposite-direction inequality using thin and tall rectangles?)

18.175 Lecture 3 Random variables Integration Expectation Fatou’s lemma

I Fatou’s lemma: If fn ≥ 0 then Z Z lim inf fndµ ≥ lim inf fn)dµ. n→∞ n→∞

(Counterexample for opposite-direction inequality using thin and tall rectangles?)

I Main idea of proof: first reduce to case that the fn are increasing by writing gn(x) = infm≥n fm(x) and observing that gn(x) ↑ g(x) = lim infn→∞ fn(x). Then truncate, used bounded convergence, take limits.
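
The "thin and tall rectangles" hint made concrete (an added example): on [0, 1] with Lebesgue measure take f_n = n · 1_{(0,1/n)}. Then f_n(x) → 0 for every x, so ∫ (lim inf f_n) dµ = 0, while ∫ f_n dµ = 1 for all n, so lim inf ∫ f_n dµ = 1. The inequality in Fatou's lemma can thus be strict, and the opposite-direction inequality fails.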

18.175 Lecture 3 I Main idea of proof: one direction obvious, Fatou gives other.

I Dominated convergence: If fn → f a.e. and |fn| ≤ g for all R R n and g is integrable, then fndµ → fdµ.

I Main idea of proof: Fatou for functions g + fn ≥ 0 gives one side. Fatou for g − fn ≥ 0 gives other.

Random variables Integration Expectation More integral properties

I Monotone convergence: If fn ≥ 0 and fn ↑ f then Z Z fndµ ↑ fdµ.

18.175 Lecture 3 I Dominated convergence: If fn → f a.e. and |fn| ≤ g for all R R n and g is integrable, then fndµ → fdµ.

I Main idea of proof: Fatou for functions g + fn ≥ 0 gives one side. Fatou for g − fn ≥ 0 gives other.

Random variables Integration Expectation More integral properties

I Monotone convergence: If fn ≥ 0 and fn ↑ f then Z Z fndµ ↑ fdµ.

I Main idea of proof: one direction obvious, Fatou gives other.

18.175 Lecture 3 I Main idea of proof: Fatou for functions g + fn ≥ 0 gives one side. Fatou for g − fn ≥ 0 gives other.

Random variables Integration Expectation More integral properties

I Monotone convergence: If fn ≥ 0 and fn ↑ f then Z Z fndµ ↑ fdµ.

I Main idea of proof: one direction obvious, Fatou gives other.

I Dominated convergence: If fn → f a.e. and |fn| ≤ g for all R R n and g is integrable, then fndµ → fdµ.

18.175 Lecture 3 Random variables Integration Expectation More integral properties

I Monotone convergence: If fn ≥ 0 and fn ↑ f then Z Z fndµ ↑ fdµ.

I Main idea of proof: one direction obvious, Fatou gives other.

I Dominated convergence: If fn → f a.e. and |fn| ≤ g for all R R n and g is integrable, then fndµ → fdµ.

I Main idea of proof: Fatou for functions g + fn ≥ 0 gives one side. Fatou for g − fn ≥ 0 gives other.
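
A typical use of dominated convergence (added illustration): on (R, B, λ) let f_n(x) = sin(x/n)/(1 + x²). Then f_n → 0 pointwise, |f_n(x)| ≤ 1/(1 + x²), which is integrable, so ∫ f_n dλ → 0. By contrast, the Fatou example above (f_n = n · 1_{(0,1/n)}) admits no integrable dominating g, which is why its integrals fail to converge to the integral of the limit.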

18.175 Lecture 3 I Prove by checking for indicators, simple functions, non-negative functions, integrable functions.

I Examples: normal, exponential, Bernoulli, Poisson, geometric...

Random variables Integration Expectation Computing expectations

I Change of variables. Measure space (Ω, F, P). Let X be random variable in (S, S) with distribution µ. Then if f (S, S) → (R, R) is measurable we have R Ef (X ) = S f (y)µ(dy).

18.175 Lecture 3 I Examples: normal, exponential, Bernoulli, Poisson, geometric...

Random variables Integration Expectation Computing expectations

I Change of variables. Measure space (Ω, F, P). Let X be random variable in (S, S) with distribution µ. Then if f (S, S) → (R, R) is measurable we have R Ef (X ) = S f (y)µ(dy). I Prove by checking for indicators, simple functions, non-negative functions, integrable functions.

18.175 Lecture 3 Random variables Integration Expectation Computing expectations

I Change of variables. Measure space (Ω, F, P). Let X be random variable in (S, S) with distribution µ. Then if f (S, S) → (R, R) is measurable we have R Ef (X ) = S f (y)µ(dy). I Prove by checking for indicators, simple functions, non-negative functions, integrable functions.

I Examples: normal, exponential, Bernoulli, Poisson, geometric...
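
A worked instance (added example): if X is exponential with rate λ, so µ(dy) = λ e^{−λy} dy on [0, ∞), and f(y) = y, then

    EX = ∫_0^∞ y λ e^{−λy} dy = 1/λ,

by one integration by parts.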
