
Math Colloquium: Five Lectures on Probability Theory. Part 2: The Law of Large Numbers

Robert Niedzialomski, [email protected]

March 3rd, 2021

Probability - Intuition

Probability Theory = a mathematical framework for modeling/studying non-deterministic behavior, where a source of randomness is introduced (this means that more than one outcome is possible). The space of all possible outcomes is called the sample space, a set of outcomes is called an event, and the source of randomness is called a random variable.

Discrete Probability

A discrete probability space consists of a finite (or countable) set Ω of outcomes ω together with a non-negative real number p_ω assigned to each outcome ω; p_ω is called the probability of the outcome ω. We require ∑_{ω∈Ω} p_ω = 1.

An event is a set of outcomes, i.e., a subset A ⊂ Ω. The probability of an event A is

    P(A) = ∑_{ω∈A} p_ω.

A random variable is a function X mapping the set Ω to the set of real numbers. We write X : Ω → R.

We note that the following Kolmogorov axioms of probability hold true:

P(∅) = 0;
if A_1, A_2, ... are disjoint events, then P(∪_{n=1}^∞ A_n) = ∑_{n=1}^∞ P(A_n).
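These definitions can be made concrete with a small sketch (the dictionary representation and function names here are illustrative choices of mine, not part of the lecture): a finite probability space is a map from outcomes to probabilities, and the probability of an event is the sum over its outcomes.

```python
from fractions import Fraction

# A discrete probability space: each outcome omega gets a probability p_omega.
# Here the space models a single roll of a fair six-sided die.
space = {omega: Fraction(1, 6) for omega in range(1, 7)}

# The probabilities are non-negative and sum to 1, as required.
assert all(p >= 0 for p in space.values())
assert sum(space.values()) == 1

def prob(event):
    """P(A) = sum of p_omega over the outcomes omega in the event A."""
    return sum(space[omega] for omega in event)

# An event is a subset of outcomes, e.g. "the roll is even".
print(prob({2, 4, 6}))  # 1/2
```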

An Example of Rolling a Die Twice

Example (Rolling a Die Twice)

Suppose we roll a fair die twice and we want to model the probability of the sum of the numbers we roll. The sample space is Ω = {(i, j) : i, j = 1, 2, 3, 4, 5, 6}, with probability of each outcome p_ij = 1/36. Let the random variable X represent the number after the first roll and let Y be the random variable that represents the number after the second roll. Hence X(i, j) = i and Y(i, j) = j. Our goal is to study the random variable X + Y. We compute

P(X + Y = 2) = P({(1, 1)}) = 1/36,
P(X + Y = 3) = P({(1, 2), (2, 1)}) = 2 · (1/36) = 1/18.

Example (Rolling a Die Twice, Continued)

We continue our computation of the distribution of the random variable X + Y and obtain

k:   2     3     4     5     6     7     8     9     10    11    12
p_k: 1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

We have obtained a new probability space

ΩX = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}

with probability p_k of each outcome given in the table above. This probability is called the distribution of the random variable X + Y. The expectation, denoted by E[X + Y], of this random variable is the weighted average (mean) of the values k with weights p_k, i.e.,

E[X + Y] = ∑_{k=2}^{12} p_k · k = 7.
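The table and the expectation can be reproduced by brute force over the 36 outcomes (a small Python sketch; exact arithmetic with `Fraction` is a choice of mine, not of the lecture):

```python
from collections import defaultdict
from fractions import Fraction

# Distribution of X + Y for two rolls of a fair die: each of the 36
# outcomes (i, j) has probability 1/36, and we group them by their sum.
dist = defaultdict(Fraction)
for i in range(1, 7):
    for j in range(1, 7):
        dist[i + j] += Fraction(1, 36)

for k in sorted(dist):  # reproduces the table of p_k above
    print(k, dist[k])

# The expectation is the weighted average of the values k with weights p_k.
expectation = sum(p * k for k, p in dist.items())
print(expectation)  # 7
```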

Distribution and Expectation

Let (Ω, (p_ω)) be a discrete probability space and let X be a random variable on Ω. The probability distribution of X is the discrete probability space

Ω_X = the set of values of X = {X(ω) : ω ∈ Ω}

with probability of an outcome k given by

    p_k = P(X = k) = P({ω ∈ Ω : X(ω) = k}).

The expectation E[X] of X, also called the mean, is given by

    E[X] = ∑_{k∈Ω_X} p_k · k.

Remark: the formula for expectation makes sense for a probability distribution defined on the real line, without reference to a random variable.

Theorem

The expectation of a random variable X can be computed according to the formula

    E[X] = ∑_{ω∈Ω} p_ω X(ω).

Proof. We first notice that

    p_k = P({ω ∈ Ω : X(ω) = k}) = ∑_{ω : X(ω)=k} p_ω.

Therefore

    E[X] = ∑_{k∈Ω_X} p_k · k = ∑_{k∈Ω_X} ∑_{ω : X(ω)=k} p_ω X(ω) = ∑_{ω∈Ω} p_ω X(ω).
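The theorem is easy to check numerically on the dice example (an illustrative Python sketch; the helper names are mine): summing p_k · k over the distribution and summing p_ω X(ω) over the outcomes give the same number.

```python
from fractions import Fraction

# Probability space for two rolls of a fair die.
space = {(i, j): Fraction(1, 36) for i in range(1, 7) for j in range(1, 7)}

def S(omega):
    """The random variable X + Y: the sum of the two rolls."""
    i, j = omega
    return i + j

# E[X + Y] via the distribution: sum over values k of p_k * k.
values = {S(omega) for omega in space}
by_distribution = sum(
    k * sum(p for omega, p in space.items() if S(omega) == k) for k in values
)

# E[X + Y] via the outcomes: sum over omega of p_omega * (X + Y)(omega).
by_outcomes = sum(p * S(omega) for omega, p in space.items())

assert by_distribution == by_outcomes == 7
```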

The theorem gives us the following properties of expectation: For two random variables X and Y we have

E[X + Y ] = E[X ] + E[Y ].

For a random variable X and a real number c we have

E[cX] = cE[X].

We say that a random variable X has zero mean if E[X ] = 0.

Bernoulli Distribution

Suppose we flip a biased coin with

probability of heads = p and probability of tails = q = 1 − p

The probability space is Ω = {H, T } with pH = p and pT = q.

Let X be the random variable that assigns the value 0 to tails and value 1 to heads. This means that X (T ) = 0 and X (H) = 1.

The probability distribution is ΩX = {0, 1} with p1 = p and p0 = q. This distribution is called the Bernoulli distribution. Its expectation is

E[X ] = 0 · p0 + 1 · p1 = p.

Binomial Distribution

Suppose we flip a biased coin n times. What is the probability of getting Heads k times?

The sample space is Ω = {(x_1, x_2, ..., x_n) : x_j = 0, 1}, where 0 represents tails and 1 represents heads. The probability of an outcome (x_1, x_2, ..., x_n) is p^(number of 1’s) · q^(number of 0’s).

Let S_n be the random variable that represents the number of Heads in n flips of the coin. We need to find P(S_n = k). We see that

    P(S_n = k) = (number of outcomes with k ones) · p^k q^(n−k).

Since the number of outcomes with k ones equals the number of k-element subsets of an n-element set, which is the binomial coefficient C(n, k), we have

    P(S_n = k) = C(n, k) p^k q^(n−k).
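The formula can be sketched in a few lines (illustrative Python; `math.comb` computes the binomial coefficient, and the parameter values are arbitrary choices of mine):

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, k, p):
    """P(S_n = k) = C(n, k) p^k q^(n-k), where q = 1 - p."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

n, p = 10, Fraction(1, 3)

# The probabilities over k = 0, ..., n sum to 1, which is the binomial
# theorem applied with a = p and b = q.
assert sum(binomial_pmf(n, k, p) for k in range(n + 1)) == 1

print(binomial_pmf(n, 3, p))
```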

The distribution of the random variable S_n is

probability space: {0, 1, ..., n}
probability distribution: p_k = P(S_n = k) = C(n, k) p^k q^(n−k).

To find the expectation we need to compute

    E[S_n] = ∑_{k=0}^{n} k · C(n, k) p^k q^(n−k).

This requires the use of the Binomial Theorem, which says the following. For any real numbers a and b and any positive integer n we have

    (a + b)^n = ∑_{k=0}^{n} C(n, k) a^k b^(n−k).

Let X_1, X_2, ..., X_n be the random variables representing the 1st, 2nd, ..., n-th flip of the coin. If we wanted to be precise, we would write

Xj (x1,..., xn) = xj

Each random variable X_j, where j = 1, 2, ..., n, has the Bernoulli distribution. Hence E[X_1] = E[X_2] = ... = E[X_n] = p. Moreover, we see that

S_n = X_1 + X_2 + ... + X_n. Therefore E[S_n] = E[X_1] + E[X_2] + ... + E[X_n] = np.

What happens if we keep flipping the coin, record the number of Heads, and take the average by dividing by the number of flips? In other words, we want to study

    lim_{n→∞} (X_1 + X_2 + ... + X_n)/n = lim_{n→∞} S_n/n
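This limit can be explored by simulation (a hypothetical sketch: the bias p = 0.3, the flip counts, and the fixed seed are arbitrary choices of mine, not values from the lecture):

```python
import random

random.seed(0)
p = 0.3  # probability of Heads (an arbitrary choice for illustration)

# Flip the coin repeatedly and watch the running average S_n / n.
heads = 0
for n in range(1, 100_001):
    heads += random.random() < p  # one Bernoulli(p) flip
    if n in (10, 100, 1_000, 10_000, 100_000):
        print(n, heads / n)  # the average drifts toward p as n grows
```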

Law of Large Numbers

Law of Averages: Suppose we repeat an experiment independently n times. Then

    (# of successes in n trials)/n → P(success).

Law of Large Numbers: Let the random variable X_i model the i-th trial of the experiment. This means that P(X_i = 1) = P(success) = p and P(X_i = 0) = P(failure) = q = 1 − p. Then the random variables X_1, X_2, ... are independent and identically distributed (i.i.d.) with the Bernoulli distribution, and

    (X_1 + X_2 + ... + X_n)/n → E[X_1] = P(success) = p.

Theorem (Bernoulli, 1692)

It is the case that S_n/n converges to p as n → ∞, in the sense that for any ε > 0,

    P(p − ε ≤ S_n/n ≤ p + ε) → 1 when n → ∞.

Proof: Let ε > 0. Then

    P(S_n/n ≥ p + ε) = ∑_{k ≥ n(p+ε)} P(S_n = k) = ∑_{k=⌈n(p+ε)⌉}^{n} C(n, k) p^k q^(n−k).

Let λ > 0. Then 0 ≤ λ[k − n(p + ε)] = −λnε + λqk − λp(n − k) for the indices k in the sum, so e^{λ[k−n(p+ε)]} ≥ 1 there, and

    P(S_n/n ≥ p + ε) ≤ ∑_{k=⌈n(p+ε)⌉}^{n} e^{λ[k−n(p+ε)]} C(n, k) p^k q^(n−k)
                     ≤ e^{−λnε} ∑_{k=0}^{n} C(n, k) (p e^{λq})^k (q e^{−λp})^(n−k) = e^{−λnε} (p e^{λq} + q e^{−λp})^n.

We will now use the inequality saying that

    e^x ≤ x + e^{x²}, where x is any real number.

Then

    P(S_n/n ≥ p + ε) ≤ e^{−λnε} (p e^{λq} + q e^{−λp})^n
                     ≤ e^{−λnε} (pλq + p e^{λ²q²} − qλp + q e^{λ²p²})^n
                     ≤ e^{−λnε} (p e^{λ²} + q e^{λ²})^n
                     = e^{λ²n − λnε}.

The minimum of the function λ ↦ λ²n − λεn = nλ(λ − ε) occurs when λ = ε/2. We get that

    P(S_n/n ≥ p + ε) ≤ e^{−nε²/4}.

A symmetric argument gives the same bound for P(S_n/n ≤ p − ε), and this finishes the proof of the theorem.
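The bound can be compared with the exact tail probability (an illustrative Python check; the function name and the values of n, p, ε are arbitrary choices of mine):

```python
from math import ceil, comb, exp

def upper_tail(n, p, eps):
    """Exact P(S_n / n >= p + eps) for S_n ~ Binomial(n, p)."""
    q = 1 - p
    k0 = ceil(n * (p + eps))
    return sum(comb(n, k) * p**k * q**(n - k) for k in range(k0, n + 1))

n, p, eps = 100, 0.5, 0.1
bound = exp(-n * eps**2 / 4)
# The exact tail probability sits below the proved bound e^{-n eps^2 / 4}.
print(upper_tail(n, p, eps), bound)
```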

Weak Law of Large Numbers

Theorem (Weak Law of Large Numbers)

Let X_1, X_2, ... be a sequence of independent identically distributed (i.i.d.) random variables with E[X_1] = µ < ∞ and E[|X_1|²] < ∞. Then the sequence

    S_n/n = (X_1 + X_2 + ... + X_n)/n

converges to µ in the following two ways:

in probability: this means that for any ε > 0,

    P(|S_n/n − µ| ≤ ε) → 1 when n → ∞.

in the L² norm: this means that

    E[|S_n/n − µ|²] → 0 when n → ∞.
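For Bernoulli trials the L² statement can be verified exactly, since E[|S_n/n − p|²] is the variance pq/n of the sample mean (a sketch in exact arithmetic; the function name and parameter values are mine):

```python
from fractions import Fraction
from math import comb

def mean_square_error(n, p):
    """E[(S_n / n - p)^2], computed exactly from the binomial distribution."""
    q = 1 - p
    return sum(
        comb(n, k) * p**k * q**(n - k) * (Fraction(k, n) - p) ** 2
        for k in range(n + 1)
    )

p = Fraction(1, 4)
# The mean square error equals pq/n, so it tends to 0 as n grows.
for n in (10, 50, 200):
    assert mean_square_error(n, p) == p * (1 - p) / n
```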

Strong Law of Large Numbers

Theorem (Strong Law of Large Numbers)

Let X_1, X_2, ... be a sequence of independent identically distributed (i.i.d.) random variables with finite mean E[X_1] = µ < ∞. Then the sequence

    S_n/n = (X_1 + X_2 + ... + X_n)/n

converges to µ almost surely.

References

Walsh, John B. Knowing the Odds: An Introduction to Probability. Graduate Studies in Mathematics, 139. American Mathematical Society, Providence, RI, 2012.

Khoshnevisan, Davar; Rassoul-Agha, Firas. Introduction to Probability, lecture notes available online at www.math.utah.edu/~davar/math5010/summer2012/Lectures.pdf

Grimmett, Geoffrey R.; Stirzaker, David R. Probability and Random Processes. Third edition. Oxford University Press, New York, 2001.
