07: Variance, Bernoulli, Binomial Lisa Yan April 20, 2020

1 Quick slide reference

3 Variance 07a_variance_i

10 Properties of variance 07b_variance_ii

17 Bernoulli RV 07c_bernoulli

22 Binomial RV 07d_binomial

34 Exercises LIVE

07a_variance_i

Variance

Average annual weather

Stanford, CA: average high = 68°F, average low = 52°F
Washington, DC: average high = 67°F, average low = 51°F

Is E[X] enough?

(Figure: normalized histograms of daily high temperatures, 35–90°F, for Stanford (centered near 68°F) and Washington (centered near 67°F), with P(X = x) on the vertical axis. Normalized histograms are approximations of PMFs.)

Variance = "spread"

Consider the following three distributions (PMFs):
• Expectation: E[X] = 3 for all three distributions
• But the "spread" in the distributions is different!
• Variance, Var(X): a formal quantification of "spread"

Variance

The variance of a random variable X with mean E[X] = μ is

Var(X) = E[(X − μ)²]

• Also written as: E[(X − E[X])²]
• Note: Var(X) ≥ 0
• Other names: 2nd central moment, or square of the standard deviation

Var(X) is in units of X squared, so we also define the standard deviation, SD(X) = √Var(X), which is in the same units as X.

Variance of Stanford weather   (Var(X) = E[(X − μ)²])

Stanford, CA: average high μ = E[X] = 68°F, average low = 52°F. For each observed daily high x, compute (x − μ)²:

x       (x − μ)²
57°F    121 (°F)²
71°F    9 (°F)²
75°F    49 (°F)²
69°F    1 (°F)²
…       …

Variance: E[(X − μ)²] = 39 (°F)²
Standard deviation: √39 ≈ 6.2°F
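To make the definition concrete, here is a minimal Python sketch that computes the mean, variance, and standard deviation of a discrete random variable directly from its PMF. The temperature values and probabilities below are hypothetical placeholders, not the actual Stanford data.

```python
# Minimal sketch: E[X], Var(X), SD(X) from a PMF.
# The PMF below is a made-up stand-in for the Stanford high-temperature distribution.
pmf = {57: 0.05, 63: 0.15, 66: 0.20, 68: 0.25, 71: 0.20, 75: 0.10, 80: 0.05}  # P(X = x)

mu = sum(x * p for x, p in pmf.items())               # E[X]
var = sum((x - mu) ** 2 * p for x, p in pmf.items())  # Var(X) = E[(X - mu)^2]
sd = var ** 0.5                                       # SD(X) = sqrt(Var(X))

print(f"E[X] = {mu:.1f} F, Var(X) = {var:.1f} F^2, SD(X) = {sd:.1f} F")
```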

Comparing variance   (Var(X) = E[(X − μ)²])

Stanford, CA: average high = 68°F, Var(X) = 39 (°F)²
Washington, DC: average high = 67°F, Var(Y) = 248 (°F)²

(Figure: the two normalized histograms of daily highs, 35–90°F; the Washington distribution is far more spread out around its mean.)

07b_variance_ii

Properties of Variance

Properties of variance

Definition: Var(X) = E[(X − μ)²]  (units of X squared);  standard deviation SD(X) = √Var(X)  (units of X)

Property 1: Var(X) = E[X²] − (E[X])²
Property 2: Var(aX + b) = a²Var(X)

• Property 1 is often easier to compute than the definition
• Unlike expectation, variance is not linear


Computing variance: a proof that Var(X) = E[(X − μ)²] = E[X²] − (E[X])²

Let E[X] = μ. Then

Var(X) = E[(X − μ)²]
       = Σ_x (x − μ)² p(x)
       = Σ_x (x² − 2μx + μ²) p(x)
       = Σ_x x² p(x) − 2μ Σ_x x p(x) + μ² Σ_x p(x)
       = E[X²] − 2μ E[X] + μ² ⋅ 1
       = E[X²] − 2μ² + μ²
       = E[X²] − μ² = E[X²] − (E[X])²

(Everyone, please welcome the second moment, E[X²]!)

Variance of a 6-sided die

Let Y = the outcome of a single die roll. Recall E[Y] = 7/2. Calculate the variance of Y.

Approach #1: Definition
Var(Y) = (1/6)(1 − 7/2)² + (1/6)(2 − 7/2)² + (1/6)(3 − 7/2)² + (1/6)(4 − 7/2)² + (1/6)(5 − 7/2)² + (1/6)(6 − 7/2)² = 35/12

Approach #2: A property
2nd moment: E[Y²] = (1/6)(1² + 2² + 3² + 4² + 5² + 6²) = 91/6
Var(Y) = E[Y²] − (E[Y])² = 91/6 − (7/2)² = 35/12
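A quick sanity check in Python, computing Var(Y) for a fair die both by the definition and by Property 1; this is just a verification sketch, not part of the original slides.

```python
from fractions import Fraction as F

outcomes = range(1, 7)
p = F(1, 6)                                        # fair die: each outcome w.p. 1/6

mu = sum(p * y for y in outcomes)                  # E[Y] = 7/2
var_def = sum(p * (y - mu) ** 2 for y in outcomes) # definition: E[(Y - mu)^2]
second_moment = sum(p * y ** 2 for y in outcomes)  # E[Y^2] = 91/6
var_prop = second_moment - mu ** 2                 # Property 1: E[Y^2] - (E[Y])^2

print(mu, var_def, var_prop)                       # 7/2 35/12 35/12
```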


Property 2: A proof that Var(aX + b) = a²Var(X)

Proof:
Var(aX + b) = E[(aX + b)²] − (E[aX + b])²                                 (Property 1)
            = E[a²X² + 2abX + b²] − (aE[X] + b)²                          (factoring / linearity of expectation)
            = a²E[X²] + 2abE[X] + b² − (a²(E[X])² + 2abE[X] + b²)
            = a²E[X²] − a²(E[X])²
            = a²(E[X²] − (E[X])²) = a²Var(X)                              (Property 1)
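A small simulation can make Property 2 tangible: scaling a random variable by a scales its variance by a², while the shift b has no effect. This numpy sketch is illustrative only; the constants a = 3 and b = 10 are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(1, 7, size=1_000_000)   # samples of a fair die roll
a, b = 3, 10                             # arbitrary constants for illustration

print(x.var())                 # ~ Var(X) = 35/12 ~ 2.92
print((a * x + b).var())       # ~ a^2 Var(X) = 9 * 35/12 ~ 26.25
```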

07c_bernoulli

Bernoulli RV

Bernoulli Random Variable

Consider an experiment with two outcomes: "success" and "failure."
def: A Bernoulli random variable X maps "success" to 1 and "failure" to 0.
Other names: indicator random variable, boolean random variable

X ~ Ber(p)
PMF:         P(X = 1) = p_X(1) = p,   P(X = 0) = p_X(0) = 1 − p
Expectation: E[X] = p                 Support: {0, 1}
Variance:    Var(X) = p(1 − p)

Examples:
• Coin flip
• Random binary digit
• Whether a disk drive crashed

Remember this nice property of expectation, E[X] = P(X = 1) = p. It will come back!
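As a sketch of how these quantities line up in code, scipy's bernoulli distribution reports the same PMF, mean, and variance; the p = 0.2 below (the ad-click example from the next slide) is just an illustrative choice.

```python
from scipy.stats import bernoulli

p = 0.2                      # e.g., probability a user clicks an ad
Y = bernoulli(p)             # Y ~ Ber(0.2)

print(Y.pmf(1), Y.pmf(0))    # 0.2, 0.8
print(Y.mean(), Y.var())     # E[Y] = p = 0.2, Var(Y) = p(1 - p) = 0.16
```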

Jacob Bernoulli

Jacob Bernoulli (1654-1705), also known as "James", was a Swiss mathematician.
• One of many mathematicians in the Bernoulli family
• The Bernoulli Random Variable is named for him
• My academic great^14 grandfather

Defining Bernoulli RVs   (X ~ Ber(p): p_X(1) = p, p_X(0) = 1 − p, E[X] = p)

• Run a program: crashes w.p. p, works w.p. 1 − p. Let X = 1 if crash.
  X ~ Ber(p): P(X = 1) = p, P(X = 0) = 1 − p.
• Serve an ad: user clicks w.p. 0.2, ignores otherwise. Let Y = 1 if clicked.
  Y ~ Ber(___): P(Y = 1) = ___, P(Y = 0) = ___, E[Y] = ___.
• Roll two dice: success = rolling two 6's, failure = anything else. Let Z = 1 if success.
  Z ~ Ber(___).

(Answers: Y ~ Ber(0.2), E[Y] = 0.2; Z ~ Ber(1/36), since P(two 6's) = (1/6)(1/6) = 1/36.)

07d_binomial

Binomial RV

Binomial Random Variable

Consider an experiment: n independent trials of Ber(p) random variables.
def: A Binomial random variable X is the number of successes in the n trials.

X ~ Bin(n, p)
PMF, for k = 0, 1, …, n:  P(X = k) = p_X(k) = (n choose k) p^k (1 − p)^(n−k)
Expectation: E[X] = np               Support: {0, 1, …, n}
Variance:    Var(X) = np(1 − p)

Examples:
• # heads in n coin flips
• # of 1's in a randomly generated length-n bit string
• # of disk drives crashed in a 1000-computer cluster (assuming disks crash independently)
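For reference, here is a short sketch using scipy's binom distribution to evaluate the PMF and check that the mean and variance match np and np(1 − p); the n = 1000, p = 0.01 disk-crash numbers are illustrative assumptions, not values from the slides.

```python
from scipy.stats import binom

n, p = 1000, 0.01                # e.g., 1000 disks, each crashing independently w.p. 0.01
X = binom(n, p)                  # X ~ Bin(1000, 0.01)

print(X.pmf(10))                 # P(exactly 10 crashes)
print(X.mean(), n * p)           # E[X] = np = 10.0
print(X.var(), n * p * (1 - p))  # Var(X) = np(1 - p) = 9.9
```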

Reiterating notation

X ~ Bin(n, p)
Read as: (1) the random variable X (2) is distributed as a (3) Binomial (4) with parameters n and p.

The parameters of a Binomial random variable:
• n: number of independent trials
• p: probability of success on each trial

Reiterating notation: X ~ Bin(n, p)

If X is a Binomial with parameters n and p, the PMF of X is

P(X = k) = (n choose k) p^k (1 − p)^(n−k)

where P(X = k) is the probability that X takes on the value k, and the right-hand side (as a function of k) is the Probability Mass Function for a Binomial.

Three coin flips

Three fair ("heads" with p = 0.5) coins are flipped.
• X is the number of heads
• X ~ Bin(3, 0.5)
Compute the following event probabilities:
P(X = 0) = p_X(0) = (3 choose 0) p^0 (1 − p)^3 = 1/8
P(X = 1) = p_X(1) = (3 choose 1) p^1 (1 − p)^2 = 3/8
P(X = 2) = p_X(2) = (3 choose 2) p^2 (1 − p)^1 = 3/8
P(X = 3) = p_X(3) = (3 choose 3) p^3 (1 − p)^0 = 1/8
P(X = 7) = p_X(7) = 0   (7 is outside the support of X)

Extra math note: by the Binomial Theorem, we can prove Σ_k P(X = k) = 1.
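A minimal check of the coin-flip PMF using only the standard library; it simply re-evaluates (3 choose k) 0.5^k 0.5^(3−k) for each k.

```python
from math import comb

n, p = 3, 0.5
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

print(pmf)                 # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
print(sum(pmf.values()))   # 1.0, as guaranteed by the Binomial Theorem
```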


Binomial RV is a sum of Bernoulli RVs

Bernoulli: Yᵢ ~ Ber(p)
Binomial:  X ~ Bin(n, p)
• X is the sum of n independent Bernoulli RVs: X = Y₁ + Y₂ + ⋯ + Yₙ, where each Yᵢ ~ Ber(p)
• Ber(p) = Bin(1, p)
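This simulation sketch illustrates the same fact empirically: summing n independent Ber(p) draws produces the same distribution of counts as sampling Bin(n, p) directly. The n = 10, p = 0.3 values are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(109)
n, p, trials = 10, 0.3, 200_000                        # arbitrary illustrative parameters

bern_sums = (rng.random((trials, n)) < p).sum(axis=1)  # sum of n Ber(p) draws per trial
binom_draws = rng.binomial(n, p, size=trials)          # direct Bin(n, p) samples

# The two empirical PMFs should agree (up to sampling noise).
print(np.bincount(bern_sums, minlength=n + 1) / trials)
print(np.bincount(binom_draws, minlength=n + 1) / trials)
```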

Binomial Random Variable (revisited)

For X ~ Bin(n, p): E[X] = np and Var(X) = np(1 − p). We'll prove these later in the course (the key idea is writing X as a sum of n independent Ber(p) RVs).

No, give me the variance proof right now

proofwiki.org

(live) 07: Variance, Bernoulli, and Binomial. Lisa Yan, April 20, 2020

Reminders

• Turn on your camera if you are able; mute your mic in the big room
• Virtual backgrounds are encouraged (classroom-appropriate)

Breakout Rooms for meeting your classmates ◦ Just like sitting next to someone new ◦ This experience is optional: You should be comfortable leaving the room at any time.

We will use Ed instead of Zoom chat

Today’s discussion thread: https://us.edstem.org/courses/109/discussion/39075

Our first common RVs (Review)

X ~ Ber(p): the random variable X is distributed as a Bernoulli with parameter p.
  Example: heads in one coin flip, P(heads) = 0.8 = p.

X ~ Bin(n, p): the random variable X is distributed as a Binomial with parameters n and p.
  Example: # heads in 40 coin flips, P(heads) = 0.8 = p.

Otherwise: identify the PMF, or identify X as a function of an existing random variable.

Breakout Rooms

Check out the questions on the next slide (Slide 38). Post any clarifications here:
https://us.edstem.org/courses/109/discussion/39075

Breakout rooms: 5 min. Introduce yourself!

Statistics: Expectation and variance

1. a. Let X = the outcome of a 4-sided die roll. What is E[X]?
   b. Let Y = the sum of three rolls of a 4-sided die. What is E[Y]?
2. a. Let Z = # of tails on 10 flips of a biased coin (w.p. 0.4 of heads). What is E[Z]?
   b. What is Var(Z)?
3. Compare the variances of X ~ Ber(0.1) and Y ~ Ber(0.5).

If you can identify common RVs, just look up statistics instead of re-deriving from definitions.
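In that spirit, here is a sketch that "looks up" the statistics for questions 2 and 3 with scipy rather than re-deriving them; it assumes the number of tails is Bin(10, 0.6) when heads comes up w.p. 0.4.

```python
from scipy.stats import binom, bernoulli

Z = binom(10, 0.6)                # # of tails in 10 flips when P(heads) = 0.4
print(Z.mean(), Z.var())          # E[Z] = 6.0, Var(Z) = 2.4

# Ber(0.5) has the larger variance: p(1 - p) is maximized at p = 0.5.
print(bernoulli(0.1).var(), bernoulli(0.5).var())   # 0.09 vs 0.25
```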

Think

Slide 41 has a matching question to go over by yourself. We'll go over it together afterwards.

Post any clarifications here! https://us.edstem.org/courses/109/discussion/39075

Think by yourself: 2 min

Visualizing Binomial PMFs   (X ~ Bin(n, p): E[X] = np, P(X = k) = (n choose k) p^k (1 − p)^(n−k))

(Figure: four bar plots, A through D, of P(X = k) versus k for k = 0, 1, …, 10, each with a vertical axis from 0 to 0.3.)

Match each distribution to a graph:
1. Bin(10, 0.5)
2. Bin(10, 0.3)
3. Bin(10, 0.7)
4. Bin(5, 0.5)
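If you want to check your matching, this matplotlib sketch draws the four PMFs so you can compare them with the graphs on the slide (the panels here are in distribution order, not the slide's A–D order).

```python
import matplotlib.pyplot as plt
from scipy.stats import binom

params = [(10, 0.5), (10, 0.3), (10, 0.7), (5, 0.5)]
fig, axes = plt.subplots(2, 2, figsize=(8, 6), sharey=True)

for ax, (n, p) in zip(axes.flat, params):
    ks = range(11)                      # common 0..10 axis; pmf is 0 for k > n
    ax.bar(ks, binom(n, p).pmf(ks))
    ax.set_title(f"Bin({n}, {p})")
    ax.set_xlabel("k")
    ax.set_ylabel("P(X = k)")

plt.tight_layout()
plt.show()
```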

Binomial RV is a sum of Bernoulli RVs (Review)

Bernoulli: Yᵢ ~ Ber(p)
Binomial:  X ~ Bin(n, p), the sum of n independent Bernoulli RVs: X = Y₁ + Y₂ + ⋯ + Yₙ, Yᵢ ~ Ber(p)

Galton Board

Demo: http://web.stanford.edu/class/cs109/demos/galton.html

(Figure: a Galton board whose buckets are labeled 0 through 5.)

Think

Slide 46 has a question to go over by yourself. Post any clarifications here:
https://us.edstem.org/courses/109/discussion/39075

Think by yourself: 2 min

Galton Board   (X ~ Bin(n, p): P(X = k) = (n choose k) p^k (1 − p)^(n−k))

When a marble hits a pin, it has an equal chance of going left or right.
Let X = the bucket index a ball drops into (buckets 0 through 5; the board has n = 5 levels of pins).
What is the distribution of X?
(Interpret: if X is a common random variable, report it; otherwise report the PMF.)

Solution:
• Each pin is an independent trial
• One decision is made at each level i = 1, 2, …, 5
• Consider a Bernoulli RV with success (w.p. 0.5) if the ball went right on level i
• Bucket index X = # of times the ball went right

X ~ Bin(n = 5, p = 0.5)

Galton Board (continued)

X is distributed as a Binomial RV, X ~ Bin(n = 5, p = 0.5). Calculate the probability of a ball landing in bucket k:
P(X = 0) = (5 choose 0) 0.5^5 ≈ 0.03
P(X = 1) = (5 choose 1) 0.5^5 ≈ 0.16
P(X = 2) = (5 choose 2) 0.5^5 ≈ 0.31
…
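As an illustrative check (not part of the slides), this sketch simulates many balls dropping through 5 rows of pins and compares the empirical bucket frequencies to the Bin(5, 0.5) PMF.

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(7)
levels, balls = 5, 100_000

# Each ball makes 5 independent left/right decisions; bucket index = # of rights.
buckets = (rng.random((balls, levels)) < 0.5).sum(axis=1)
empirical = np.bincount(buckets, minlength=levels + 1) / balls

print(np.round(empirical, 3))                      # simulated bucket frequencies
print(np.round(binom(5, 0.5).pmf(range(6)), 3))    # [0.031 0.156 0.312 0.312 0.156 0.031]
```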

Taken together, the probabilities of landing in each bucket k = 0, 1, …, 5 form the PMF of the Binomial RV X ~ Bin(5, 0.5)!

Interlude for jokes/announcements

Announcements

JOKE HERE

Interesting probability news: TikTok Recommendation Algorithm Optimizations

https://www.lesswrong.com/posts/sXqAFdf3a5sA3HMCH/tiktok-recommendation-algorithm-optimizations

Lisa Yan, CS109, 2020 52 NBA Finals (RIP) and genetics

Think, then Breakout Rooms

Check out the questions on the next slide (Slide 55). Post any clarifications here:
https://us.edstem.org/courses/109/discussion/39075

By yourself: 2 min. Breakout rooms: 5 min.

NBA Finals and genetics

1. The Warriors are going to play the Raptors in a 7-game series during the 2019 NBA finals.
   • The Warriors have a probability of 58% of winning each game, independently.
   • A team wins the series if they win at least 4 games (we play all 7 games).
   What is P(Warriors winning)?
2. Each person has 2 genes per trait (e.g., eye color).
   • A child receives 1 gene (equally likely) from each parent.
   • Brown is "dominant", blue is "recessive":
     ◦ a child has brown eyes if either (or both) genes are brown;
     ◦ blue eyes only if both genes are blue.
   • The parents each have 1 brown and 1 blue gene.
   A family has 4 children. What is P(3 children with brown eyes)?

NBA Finals   (X ~ Bin(n, p): P(X = k) = (n choose k) p^k (1 − p)^(n−k))

The Golden State Warriors are going to play the Toronto Raptors in a 7-game series during the 2019 NBA finals. The Warriors win each game with probability 58%, independently; a team wins the series if it wins at least 4 of the 7 games. What is P(Warriors winning)?

1. Define events/RVs & state goal
   X: # games the Warriors win; X ~ Bin(7, 0.58)
   Want: ___
   Desired probability? (select all that apply)
   A. P(X > 4)
   B. P(X ≥ 4)
   C. P(X > 3)
   D. 1 − P(X ≤ 3)
   E. 1 − P(X < 3)

NBA Finals: solution

1. Define events/RVs & state goal
   X: # games the Warriors win; X ~ Bin(7, 0.58). Want: P(X ≥ 4).
2. Solve
   P(X ≥ 4) = Σ_{k=4..7} P(X = k) = Σ_{k=4..7} (7 choose k) (0.58)^k (0.42)^(7−k)
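A quick way to evaluate that sum (a verification sketch, not from the slides): scipy's survival function sf(3) gives P(X > 3) = P(X ≥ 4) directly.

```python
from scipy.stats import binom
from math import comb

X = binom(7, 0.58)
print(X.sf(3))                   # P(X >= 4): survival function evaluated at 3, ~0.67

# The same value by summing the PMF terms explicitly:
print(sum(comb(7, k) * 0.58**k * 0.42**(7 - k) for k in range(4, 8)))
```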

Cool Algebra/Probability Fact: this is identical to the probability of winning if we instead define winning = first to win 4 games.

Genetic inheritance   (X ~ Bin(n, p): P(X = k) = (n choose k) p^k (1 − p)^(n−k))

Each person has 2 genes per trait (e.g., eye color). A child receives 1 gene (equally likely) from each parent. Brown is "dominant", blue is "recessive": a child has brown eyes if either (or both) genes are brown, and blue eyes only if both genes are blue. The parents each have 1 brown and 1 blue gene. A family has 4 children. What is P(3 children with brown eyes)?

Subset of ideas:
A. Product of 4 independent events
B. Probability tree
C. Bernoulli, success = 3 children with brown eyes
D. Binomial, n = 3 trials, success = brown-eyed child
E. Binomial, n = 4 trials, success = brown-eyed child

1. Define events/RVs & state goal
   X: # brown-eyed children; X ~ Bin(4, p). Want: P(X = 3).
2. Identify known probabilities
   B: a brown-eyed child. P(B) = 1 − P(both genes blue) = 1 − (1/2)(1/2) = 3/4, so p = 3/4.
3. Solve
   P(X = 3) = (4 choose 3) (3/4)^3 (1/4)
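And the corresponding numeric check (again just a sketch): with p = 3/4 for a brown-eyed child, evaluate P(X = 3) for X ~ Bin(4, 3/4).

```python
from scipy.stats import binom

p_brown = 1 - 0.5 * 0.5      # a child is blue-eyed only if both inherited genes are blue
X = binom(4, p_brown)        # X = # of brown-eyed children among 4

print(X.pmf(3))              # P(X = 3) = 4 * (3/4)^3 * (1/4) = 0.421875
```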

See you next time