Chapter 3. Discrete random variables and probability distributions.
Defn: A random variable (r.v.) is a function that takes a sample point in S and maps it to a real number. That is, a r.v. Y maps the sample space (its domain) to the real number line (its range), Y : S → R.
Example: In an experiment, we roll two dice. Let the r.v. Y = the sum of the two dice. The r.v. Y as a function:
(1, 1) → 2
(1, 2) → 3
...
(6, 6) → 12

Discrete random variables and their distributions
Defn: A discrete random variable is a r.v. that takes on a finite or countably infinite number of different values.
Defn: The probability mass function (pmf) of a discrete r.v. Y is a function that maps each possible value of Y to its probability. (It is also called the probability distribution of the discrete r.v. Y.) p_Y : Range(Y) → [0, 1], p_Y(v) = P(Y = v), where 'Y = v' denotes the event {ω : Y(ω) = v}.
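The definition can be made concrete in code: a pmf is just a mapping from values to probabilities. A minimal sketch, using the two-dice sum Y from the example above and exact rational arithmetic:

```python
from fractions import Fraction
from itertools import product

# Build the pmf of Y = sum of two fair dice by counting sample points.
# Each of the 36 ordered outcomes (d1, d2) has probability 1/36.
pmf = {}
for d1, d2 in product(range(1, 7), repeat=2):
    v = d1 + d2
    pmf[v] = pmf.get(v, Fraction(0)) + Fraction(1, 36)

print(pmf[7])             # most likely sum: 6/36 = 1/6
print(sum(pmf.values()))  # total probability: 1
```

Note how the pmf is defined on Range(Y) = {2, ..., 12}, not on the 36 sample points themselves.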
Notation: We use capital letters (like Y) to denote a random variable, and lowercase letters (like v) to denote a particular value the r.v. may take.

Cumulative distribution function
Defn: The cumulative distribution function (cdf) of Y is defined as F(v) = P(Y ≤ v), F : R → [0, 1].
Example: We toss two fair coins. Let Y be the number of heads. What is the pmf of Y? What is the cdf?

Quiz
X a random variable.
values of X:   1    3    5    7
cdf F(a):     .5  .75   .9    1
What is P(X ≤ 3)?
(A) .15 (B) .25 (C) .5 (D) .75

Quiz
X a random variable.
values of X:   1    3    5    7
cdf F(a):     .5  .75   .9    1
What is P(X = 3)?
(A) .15 (B) .25 (C) .5 (D) .75

Quiz
X a random variable.
values of X:   1    3    5    7
cdf F(a):     .5  .75   .9    1
What is P(X < 3)?
(A) .15 (B) .25 (C) .5 (D) .75
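Questions like the three above can be checked numerically: the pmf is recovered from the cdf by differencing, since P(X = v) is the jump of F at v. A sketch using the quiz's table:

```python
values = [1, 3, 5, 7]
cdf = [0.5, 0.75, 0.9, 1.0]

# P(X = v) is the jump of the cdf at v: F(v) minus F at the previous value.
pmf = {v: F - (cdf[i - 1] if i > 0 else 0.0)
       for i, (v, F) in enumerate(zip(values, cdf))}

print(cdf[values.index(3)])           # P(X <= 3) = F(3) = 0.75
print(pmf[3])                         # P(X = 3) = F(3) - F(1) = 0.25
print(cdf[values.index(3)] - pmf[3])  # P(X < 3) = P(X <= 3) - P(X = 3) = 0.5
```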
Example
In an experiment, we roll two fair dice. Let the r.v. Y = the sum of the dice. What is the pmf of Y?

Properties of probability distributions
Theorem: For any discrete probability distribution, the following is true:
1. 0 ≤ p(y) ≤ 1.
2. Σ_y p(y) = 1, where the summation is over all values y with non-zero probability.

Expected value of a r.v.
Defn: Let Y be a discrete r.v. with probability distribution p(y). Then the expected value of Y, denoted E(Y), is E(Y) = Σ_y y p(y).
Note: E(Y) exists if this sum is absolutely convergent (if Σ_y |y| p(y) < ∞).
E(Y) is often called the mean or average of Y and denoted µ.
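As a quick numeric check of the definition, a sketch computing E(Y) for a single fair six-sided die:

```python
from fractions import Fraction

# E(Y) = sum over y of y * p(y); each face of a fair die has p(y) = 1/6.
pmf = {y: Fraction(1, 6) for y in range(1, 7)}
mean = sum(y * p for y, p in pmf.items())
print(mean)  # 7/2, i.e. 3.5
```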
Example: Roll a die many times. What is the average value?

Quiz
We roll two dice. You win $1000 if the sum is 2 and lose $100 otherwise. How much do you expect to win on average per trial?
(A) -$13.11 (B) $69.44 (C) -$69.44 (D) $13.11 (E) None of the above

Expected value of a function of a r.v.
Theorem: If Y is a discrete r.v. with probability function p(y) and g(Y) is a real-valued function of Y, then E[g(Y)] = Σ_y g(y) p(y).
Proof: Suppose g(Y) takes values g1, ..., gm. Then by definition

E[g(Y)] = Σ_{i=1}^{m} gi P(g(Y) = gi)
        = Σ_{i=1}^{m} gi Σ_{y : g(y)=gi} p(y)
        = Σ_{i=1}^{m} Σ_{y : g(y)=gi} gi p(y)
        = Σ_y g(y) p(y).
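The theorem can be verified numerically: summing g(y) p(y) over the values of Y agrees with first building the pmf of the new r.v. g(Y) and applying the definition of E. A sketch with a fair die and g(y) = y² (the choice of g is just for illustration):

```python
from fractions import Fraction

pmf_Y = {y: Fraction(1, 6) for y in range(1, 7)}
g = lambda y: y * y

# Definition route: build the pmf of g(Y), then take its expected value.
pmf_gY = {}
for y, p in pmf_Y.items():
    pmf_gY[g(y)] = pmf_gY.get(g(y), Fraction(0)) + p
lhs = sum(v * p for v, p in pmf_gY.items())

# Theorem's shortcut: sum g(y) p(y) over values of Y directly.
rhs = sum(g(y) * p for y, p in pmf_Y.items())

print(lhs == rhs)  # True
print(rhs)         # 91/6
```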
Example: Toss two fair coins. If Y is the number of heads, what is E(Y²)?

Properties of expected value
Theorem: Let a and b be constants and let Y be a r.v. Then E[aY + b] = aE[Y] + b.
Theorem: Let Y1, Y2, ..., Yk be r.v.'s. Then
E[Y1 + Y2 + ... + Yk] = E[Y1] + E[Y2] + ... + E[Yk].
Proof:
Corollary:
E[g1(Y) + g2(Y) + ... + gk(Y)] = E[g1(Y)] + E[g2(Y)] + ... + E[gk(Y)].

Example
Suppose that n people are sitting around a table, and that everyone at the table got up, ran around the room, and sat back down randomly (i.e., all seating arrangements are equally likely).
What is the expected value of the number of people sitting in their original seat?

St. Petersburg Paradox
Consider a series of flips of a fair coin. A player will receive $2 if the first head occurs on flip 1, $4 if it occurs on flip 2, $8 if on flip 3, and so on. (1) What is the player's expected earning? (2) How much should one pay for the right to play this game?
Possible outcomes: H, TH, TTH, TTTH, TTTTH, ...

Variance and standard deviation
Defn: The variance is the expected squared difference between a random variable and its mean, Var(Y) = E[(Y − µ)²], where µ = E(Y).
Defn: The standard deviation of Y is σ = √Var(Y).
Formula for variance: Var(Y) = E(Y²) − µ².
Proof:
Var(Y) = E(Y² − 2µY + µ²)
       = E(Y²) − 2µE(Y) + µ²
       = E(Y²) − µ².
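A sketch checking that the shortcut Var(Y) = E(Y²) − µ² agrees with the definition E[(Y − µ)²], for Y = number of heads in two fair coin tosses:

```python
from fractions import Fraction

# pmf of Y = number of heads when tossing two fair coins
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

mu = sum(y * p for y, p in pmf.items())                        # E(Y)
var_def = sum((y - mu) ** 2 * p for y, p in pmf.items())       # E[(Y - mu)^2]
var_formula = sum(y * y * p for y, p in pmf.items()) - mu ** 2 # E(Y^2) - mu^2

print(mu)                      # 1
print(var_def == var_formula)  # True
print(var_def)                 # 1/2
```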
Quiz
These graphs show the pmf for 3 random variables. Order them by size of standard deviation from biggest to smallest.
(A) ABC (B) ACB (C) BAC (D) BCA (E) CAB

Example
Suppose Y has pmf:

y      1     2     3     4     5
p(y)  1/10  2/10  4/10  2/10  1/10

Find the expectation, variance and the standard deviation of Y.

Properties of the variance and standard deviation
If a and b are constants, then
Var(aX + b) = a²Var(X),
σ(aX + b) = |a| σ(X).

Quiz
True or false: If Var(X) = 0 then X is constant.
(A) True (B) False

What about the variance of the sum?

Independent r.v.
Defn: Two discrete r.v.'s X and Y are called independent if the events {X = x} and {Y = y} are independent for all possible values of x and y.
This is equivalent to the following
Theorem: Two discrete r.v.'s X and Y are independent if and only if E[f(X)g(Y)] = E[f(X)]E[g(Y)] for all possible functions f and g.
Proof:
Theorem: If r.v.'s X1, X2, ..., Xn are independent, then
Var[X1 + ... + Xn] = Var[X1] + ... + Var[Xn].
Proof:

Discrete r.v.
• Binomial
• Bernoulli
• Geometric
• Poisson
• Negative binomial
• Hypergeometric

Binomial Experiment
Example: Suppose 2% of all items produced from an assembly line are defective. We randomly sample 50 items and count how many are defective (and how many are not).
Defn: A binomial experiment is an experiment with the following characteristics:
1. There are a fixed number, denoted n, of identical trials.
2. Each trial results in one of two outcomes (which we denote "Success" or "Failure").
3. The probability of Success is constant across trials and denoted p. Hence P[Failure] = 1 − p.
4. The trials are independent.

Binomial r.v. and binomial distribution
Example: Suppose 40% of students at a college are male. We select 10 students at random and count how many are male. Is this a binomial experiment?
Defn: The total number of successes in a binomial experiment is called a binomial r.v.
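A binomial r.v. can be simulated directly from this definition, as a count of successes across n independent trials (the values n = 50 and p = 0.02 below just echo the assembly-line example):

```python
import random

def binomial_draw(n, p, rng):
    """One draw: count successes in n independent trials with success prob p."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(0)  # seeded for reproducibility
draws = [binomial_draw(50, 0.02, rng) for _ in range(10_000)]
print(sum(draws) / len(draws))  # sample average, close to n*p = 1.0
```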
Defn: The probability mass function of a binomial random variable is called the binomial distribution.

Formula for the binomial distribution
If we have n trials, a typical sample point looks like: SFFFSSFSFFF... Of the n slots, suppose we have k successes.
1. How many ways are there to select the k slots in which to put an S?
2. For any sample point containing k S’s and (n − k) F’s, what is its probability?
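Combining the answers to these two questions gives the binomial pmf P(Y = k) = C(n, k) p^k (1 − p)^(n−k). A sketch, checked against the defective-items example (n = 50, p = 0.02):

```python
from math import comb

def binomial_pmf(k, n, p):
    # comb(n, k) ways to place the k S's among n slots; each such
    # sample point has probability p**k * (1 - p)**(n - k).
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

print(round(binomial_pmf(0, 50, 0.02), 4))  # P(no defectives) = 0.98^50 ≈ 0.3642
print(round(sum(binomial_pmf(k, 50, 0.02) for k in range(51)), 6))  # 1.0
```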