18.175 Lecture 16: Conditional expectation, random walks, martingales

Scott Sheffield

MIT

Outline

Conditional expectation

Martingales

Random walks

Stopping times

Arcsin law, other SRW stories

Conditional expectation

▶ Say we’re given a probability space (Ω, F0, P) and a σ-field F ⊂ F0 and a random variable X measurable w.r.t. F0, with E|X| < ∞. The conditional expectation of X given F is a new random variable, which we can denote by Y = E(X|F).

▶ We require that Y is F-measurable and that for all A in F, we have ∫_A X dP = ∫_A Y dP.

▶ Any Y satisfying these properties is called a version of E(X|F).

▶ Is it possible that there exists more than one version of E(X|F) (which would mean that in some sense the conditional expectation is not canonically defined)?

▶ Is there some sense in which E(X|F) always exists and is always uniquely defined (maybe up to a set of measure zero)?
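For intuition, here is a minimal numerical sketch (my own illustration, not from the lecture) of the finite case, where F is generated by a partition of Ω: a version of E(X|F) is constant on each cell, equal to the P-weighted cell average, and the defining identity ∫_A X dP = ∫_A Y dP can be checked directly on the generating cells. All names and sizes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite sample space: 12 equally likely outcomes.
omega = np.arange(12)
p = np.full(12, 1 / 12)
X = rng.normal(size=12)          # any integrable random variable X

# F is generated by a partition of omega into three cells.
cells = [omega[:4], omega[4:8], omega[8:]]

# E(X|F): on each cell, the P-weighted average of X over that cell.
Y = np.empty(12)
for cell in cells:
    Y[cell] = np.sum(X[cell] * p[cell]) / np.sum(p[cell])

# Check the defining property on each generating cell A:
# integral of X over A equals integral of Y over A.
for cell in cells:
    assert np.isclose(np.sum(X[cell] * p[cell]), np.sum(Y[cell] * p[cell]))
print(Y)
```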

Conditional expectation

▶ Claim: Assuming Y = E(X|F) as above, and E|X| < ∞, we have E|Y| ≤ E|X|. In particular, Y is integrable.

▶ Proof: let A = {Y > 0} ∈ F and observe: ∫_A Y dP = ∫_A X dP ≤ ∫_A |X| dP. By a similar argument, ∫_{A^c} −Y dP ≤ ∫_{A^c} |X| dP.

▶ Uniqueness of Y: Suppose Y′ is F-measurable and satisfies ∫_A Y′ dP = ∫_A X dP = ∫_A Y dP for all A ∈ F. Then consider the set {Y − Y′ ≥ ε}. Integrating over that set gives zero. This must hold for any ε > 0. Conclude that Y = Y′ almost everywhere.

Radon-Nikodym theorem

▶ Let µ and ν be σ-finite measures on (Ω, F). Say ν << µ (or ν is absolutely continuous w.r.t. µ) if µ(A) = 0 implies ν(A) = 0.

▶ Recall the Radon-Nikodym theorem: If µ and ν are σ-finite measures on (Ω, F) and ν is absolutely continuous w.r.t. µ, then there exists a measurable f : Ω → [0, ∞) such that ν(A) = ∫_A f dµ.

▶ Observe: this theorem implies existence of conditional expectation (see the sketch below).
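A sketch of the standard deduction, which the slide leaves implicit (assuming first X ≥ 0; the general integrable case follows by splitting X = X⁺ − X⁻ and subtracting):

```latex
% Existence of E(X | F) from Radon-Nikodym, for X >= 0:
\begin{align*}
  \nu(A) &:= \int_A X \, dP \qquad (A \in \mathcal{F})
    && \text{defines a finite measure on } (\Omega, \mathcal{F}); \\
  P(A) = 0 &\;\Longrightarrow\; \nu(A) = 0,
    && \text{so } \nu \ll P|_{\mathcal{F}}; \\
  Y &:= \frac{d\nu}{d(P|_{\mathcal{F}})}
    && \text{is } \mathcal{F}\text{-measurable, and} \\
  \int_A Y \, dP &= \nu(A) = \int_A X \, dP
    && \text{for all } A \in \mathcal{F},
\end{align*}
% which is exactly the defining property of a version of E(X | F).
```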

Outline

Conditional expectation

Martingales

Random walks

Stopping times

Arcsin law, other SRW stories

Two big results

▶ Optional stopping theorem: Can’t make money in expectation by timing the sale of an asset whose price is a non-negative martingale.

▶ Martingale convergence: A non-negative martingale almost surely has a limit.
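A quick illustrative simulation of both results (my own sketch, not part of the lecture; the factors, horizon, and selling rule are arbitrary choices): take the non-negative martingale Mn = R1 · · · Rn with i.i.d. factors Ri ∈ {0.5, 1.5}, each with probability 1/2, so E[Ri] = 1. Typical paths collapse toward 0 (their a.s. limit) even though E[Mn] = 1 for all n, and a threshold selling rule earns ≈ M0 = 1 in expectation.

```python
import numpy as np

rng = np.random.default_rng(1)
paths, horizon = 200_000, 20

# Non-negative martingale: running product of i.i.d. mean-one factors.
factors = rng.choice([0.5, 1.5], size=(paths, horizon))
M = np.cumprod(factors, axis=1)            # column n-1 holds M_n; M_0 = 1

print("E[M_n], n = 5, 10, 20:", [round(M[:, n - 1].mean(), 3) for n in (5, 10, 20)])
print("median M_n:           ", [round(float(np.median(M[:, n - 1])), 4) for n in (5, 10, 20)])
# Means hover near 1 while medians collapse toward 0: the martingale
# converges a.s. (here to 0) even though its expectation is constant.

# Optional stopping: sell the first time M_n >= 2, otherwise hold to the end.
crossed = M >= 2
first = crossed.argmax(axis=1)             # first index with M >= 2 (0 if never)
payoff = np.where(crossed.any(axis=1), M[np.arange(paths), first], M[:, -1])
print("expected payoff ≈", round(payoff.mean(), 3))   # ≈ 1 = M_0: no free lunch
```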

Outline

Conditional expectation

Martingales

Random walks

Stopping times

Arcsin law, other SRW stories

Exchangeable events

▶ Start with a measure space (S, S, µ). Let Ω = {(ω1, ω2, . . .) : ωi ∈ S}, let F be the product σ-algebra and P the product probability measure.

▶ A finite permutation of N is a one-to-one map from N to itself that fixes all but finitely many points.

▶ Event A ∈ F is permutable if it is invariant under any finite permutation of the ωi.

▶ Let E be the σ-field of permutable events.

▶ This is related to the tail σ-algebra we introduced earlier in the course. Bigger or smaller?

Hewitt-Savage 0-1 law

▶ If X1, X2, . . . are i.i.d. and A ∈ E then P(A) ∈ {0, 1}.

▶ Idea of proof: Try to show A is independent of itself, i.e., that P(A) = P(A ∩ A) = P(A)P(A). Start with the measure-theoretic fact that we can approximate A by a set An in the σ-algebra generated by X1, . . . , Xn, so that the symmetric difference of A and An has very small probability. Note that An is independent of the event A′n that An holds when X1, . . . , Xn and Xn+1, . . . , X2n are swapped. The symmetric difference between A and A′n is also small (since A is permutable), so A is independent of itself up to this small error. Then make the error arbitrarily small.

18.175 Lecture 16 Application of Hewitt-Savage:

▶ If the Xi are i.i.d. in R^n then Sn = Σ_{i=1}^n Xi is a random walk on R^n.

▶ Theorem: if Sn is a random walk on R then one of the following occurs with probability one:

▶ Sn = 0 for all n
▶ Sn → ∞
▶ Sn → −∞
▶ −∞ = lim inf Sn < lim sup Sn = ∞

▶ Idea of proof: Hewitt-Savage implies that lim sup Sn and lim inf Sn are almost sure constants in [−∞, ∞]. Since lim sup Sn = X1 + lim sup(Sn − X1), a finite constant value would have to depend on X1; so if X1 is not a.s. constant, both values must be ±∞.
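A small sanity check by simulation (my own illustration, with fair ±1 steps): since X1 is non-degenerate with mean zero, the fourth (oscillating) case should occur, so the running extremes of a single long path should keep growing.

```python
import numpy as np

rng = np.random.default_rng(2)

# Fair ±1 random walk: the theorem predicts the oscillating case,
# -∞ = liminf Sn < limsup Sn = +∞.
S = np.cumsum(rng.choice((-1, 1), size=2_000_000))

for n in (10**4, 10**5, 10**6, 2 * 10**6):
    print(f"n = {n:>7}: running max = {S[:n].max():>5}, running min = {S[:n].min():>6}")
# Both extremes keep growing (on the order of sqrt(n)), consistent with
# oscillation between +∞ and -∞.
```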

Outline

Conditional expectation

Martingales

Random walks

Stopping times

Arcsin law, other SRW stories

Stopping time definition

▶ Say that T is a stopping time if the event that T = n is in Fn for each n.

▶ In finance applications, T might be the time one sells a stock. Then this states that the decision to sell at time n depends only on prices up to time n, not on (as yet unknown) future prices.

18.175 Lecture 16 Stopping time examples

▶ Let A1, . . . be i.i.d. random variables equal to −1 with probability .5 and 1 with probability .5, and let X0 = 0 and Xn = Σ_{i=1}^n Ai for n ≥ 0.

▶ Which of the following is a stopping time?

1. The smallest T for which |XT| = 50
2. The smallest T for which XT ∈ {−10, 100}
3. The smallest T for which XT = 0
4. The T at which the Xn sequence achieves the value 17 for the 9th time
5. The value of T ∈ {0, 1, 2, . . . , 100} for which XT is largest
6. The largest T ∈ {0, 1, 2, . . . , 100} for which XT = 0

▶ Answer: first four, not last two.
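To make the distinction concrete, here is a small sketch (my own, with an arbitrary horizon of 100 steps): a stopping time can be computed online, deciding at each step from the path so far, while example 5 (the argmax over T ≤ 100) cannot be known until the whole path is revealed.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.choice([-1, 1], size=100)
X = np.concatenate(([0], np.cumsum(A)))      # X_0, ..., X_100

# Example 2: smallest T with X_T in {-10, 100}. Decidable online:
# at each step we only look at the path up to the current time.
T2 = next((t for t, x in enumerate(X) if x in (-10, 100)), None)

# Example 5: the T in {0, ..., 100} where X_T is largest. We cannot
# certify "this is the max" at time T without seeing the future,
# so this is NOT a stopping time even though it is well defined.
T5 = int(np.argmax(X))

print("T2 =", T2, " T5 =", T5)
```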

18.175 Lecture 16 Stopping time theorems

▶ Theorem: Let X1, X2, . . . be i.i.d. and N a stopping time with N < ∞.

▶ Conditioned on the event N < ∞, the conditional law of {XN+n, n ≥ 1} is independent of FN and has the same law as the original sequence.

▶ Wald’s equation: Let the Xi be i.i.d. with E|Xi| < ∞. If N is a stopping time with EN < ∞ then ESN = EX1 · EN.

▶ Wald’s second equation: Let the Xi be i.i.d. with EXi = 0 and EXi² = σ² < ∞. If N is a stopping time with EN < ∞ then ESN² = σ²·EN.
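A quick Monte Carlo check of Wald’s first equation (my own sketch; the distribution and threshold are arbitrary choices): take Xi ~ Exponential(1) and N = first n with Sn > 10, a stopping time with EN < ∞.

```python
import numpy as np

rng = np.random.default_rng(4)
trials, threshold = 50_000, 10.0

S_N, N = np.empty(trials), np.empty(trials)
for k in range(trials):
    s, n = 0.0, 0
    while s <= threshold:                 # N = inf{n : S_n > threshold}
        s += rng.exponential(1.0)         # X_i ~ Exp(1), so E[X_1] = 1
        n += 1
    S_N[k], N[k] = s, n

# Wald: E[S_N] = E[X_1] * E[N]  (here E[X_1] = 1)
print("E[S_N]       ≈", S_N.mean())
print("E[X_1]*E[N]  ≈", 1.0 * N.mean())   # the two agree, ≈ 11
```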

Wald applications to SRW

▶ S0 = a ∈ Z and at each time step Sj independently changes by ±1 according to a fair coin toss. Fix A ∈ Z and let N = inf{k : Sk ∈ {0, A}}. What is ESN?

▶ What is EN?
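A simulation sketch of both answers (standard gambler’s-ruin facts: optional stopping gives ESN = a, and Wald’s second equation gives EN = a(A − a); the parameters below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
a, A, trials = 3, 10, 50_000

S_N, N = np.empty(trials), np.empty(trials)
for k in range(trials):
    s, n = a, 0
    while s not in (0, A):                # N = first hitting time of {0, A}
        s += rng.choice((-1, 1))          # fair ±1 step
        n += 1
    S_N[k], N[k] = s, n

print("E[S_N] ≈", S_N.mean(), " (optional stopping predicts a =", a, ")")
print("E[N]   ≈", N.mean(),   " (Wald II predicts a*(A-a) =", a * (A - a), ")")
```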

Outline

Conditional expectation

Martingales

Random walks

Stopping times

Arcsin law, other SRW stories

18.175 Lecture 16 Reflection principle

▶ How many walks from (0, x) to (n, y) don’t cross the horizontal axis?

▶ Try counting walks that do cross by giving a bijection to walks from (0, −x) to (n, y).
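A brute-force check of the reflection bijection for one small case (my own sketch, assuming x, y > 0; n, x, y chosen arbitrarily): the number of ±1 walks from x to y that touch level 0 should equal the total number of walks from −x to y.

```python
from itertools import product

def count_paths(start, end, n, must_hit_zero=False):
    # Enumerate all 2^n walks with ±1 steps from `start`; count those
    # ending at `end` (optionally only those that touch level 0).
    total = 0
    for steps in product((-1, 1), repeat=n):
        pos, hit = start, (start == 0)
        for s in steps:
            pos += s
            hit = hit or (pos == 0)
        if pos == end and (hit or not must_hit_zero):
            total += 1
    return total

n, x, y = 10, 2, 4
lhs = count_paths(x, y, n, must_hit_zero=True)   # walks from x that touch 0
rhs = count_paths(-x, y, n)                      # all walks from -x to y
print(lhs, rhs)                                  # equal (45 = 45), by reflection
```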

18.175 Lecture 16 Ballot Theorem

▶ Suppose that in an election candidate A gets α votes and B gets β < α votes. What’s the probability that A is ahead throughout the counting?

▶ Answer: (α − β)/(α + β). Can be proved using the reflection principle.
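An exhaustive check of one small case (my own sketch; α = 6, β = 4 chosen arbitrarily): enumerate all orders in which the ballots can be counted and compare the fraction where A leads strictly throughout with (α − β)/(α + β).

```python
from itertools import combinations
from fractions import Fraction

alpha, beta = 6, 4
n = alpha + beta

good = total = 0
# A counting order is determined by which positions hold the votes for A.
for a_positions in combinations(range(n), alpha):
    total += 1
    a_set = set(a_positions)
    lead, ahead = 0, True
    for i in range(n):
        lead += 1 if i in a_set else -1      # +1 for A, -1 for B
        if lead <= 0:                        # A not strictly ahead
            ahead = False
            break
    good += ahead

print(Fraction(good, total))                 # 1/5
print(Fraction(alpha - beta, alpha + beta))  # (α − β)/(α + β) = 1/5
```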

18.175 Lecture 16 Arcsin theorem

▶ Theorem for the last hitting time: for SRW run up to time 2n, the last visit to zero, rescaled by 2n, converges in law to the arcsine distribution, with density 1/(π√(x(1 − x))) on (0, 1).

▶ Theorem for the amount of time spent positive: the fraction of time the walk spends on the positive side has the same arcsine limit law.
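A simulation sketch of the first statement (my own illustration; trial count and walk length are arbitrary): the distribution of (last zero)/2n piles up near 0 and 1, and its empirical quantiles should match the arcsine CDF F(x) = (2/π) arcsin(√x).

```python
import numpy as np

rng = np.random.default_rng(6)
trials, n2 = 20_000, 1_000                   # 2n = 1000 steps per walk

last_zero = np.empty(trials)
for k in range(trials):
    S = np.cumsum(rng.choice((-1, 1), size=n2))
    zeros = np.flatnonzero(S == 0)           # times (minus 1) when the walk is at 0
    last_zero[k] = (zeros[-1] + 1) / n2 if zeros.size else 0.0

# Compare empirical quantiles with the arcsine CDF F(x) = (2/pi)*arcsin(sqrt(x)).
for q in (0.1, 0.25, 0.5, 0.75, 0.9):
    x = np.quantile(last_zero, q)
    F = (2 / np.pi) * np.arcsin(np.sqrt(x))
    print(f"quantile {q}: last-zero fraction ≈ {x:.3f}, arcsine CDF there ≈ {F:.3f}")
# The two columns roughly agree; mass concentrates near 0 and 1.
```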
