Probability Cheatsheet v1.1.1. Compiled by William Chen (http://wzchen.com) with contributions from Sebastian Chiu, Yuan Jiang, Yuqi Hou, and Jessy Hwang. Material based off of Joe Blitzstein's (@stat110) lectures (http://stat110.net) and Blitzstein/Hwang's Intro to Probability textbook (http://bit.ly/introprobability). Licensed under CC BY-NC-SA 4.0. Please share comments, suggestions, and errors at http://github.com/wzchen/probability_cheatsheet. Last updated March 20, 2015.

Counting

Multiplication Rule - Let's say we have a compound experiment (an experiment with multiple components). If the 1st component has $n_1$ possible outcomes, the 2nd component has $n_2$ possible outcomes, ..., and the $r$th component has $n_r$ possible outcomes, then overall there are $n_1 n_2 \dots n_r$ possibilities for the whole experiment.

Sampling Table - The sampling table describes the different ways to take a sample of size $k$ out of a population of size $n$. The column names denote whether order matters or not.

                        Order Matters          Order Does Not Matter
With Replacement        $n^k$                  $\binom{n+k-1}{k}$
Without Replacement     $\frac{n!}{(n-k)!}$    $\binom{n}{k}$

Naive Definition of Probability - If the likelihood of each outcome is equal, the probability of any event happening is:
$$P(\text{Event}) = \frac{\text{number of favorable outcomes}}{\text{number of outcomes}}$$

Bayes' Rule and Law of Total Probability

Law of Total Probability with partitioning set $B_1, B_2, B_3, \dots, B_n$, and with extra conditioning (just add $C$!):
$$P(A) = P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + \dots + P(A|B_n)P(B_n)$$
$$P(A) = P(A \cap B_1) + P(A \cap B_2) + \dots + P(A \cap B_n)$$
$$P(A|C) = P(A|B_1, C)P(B_1|C) + \dots + P(A|B_n, C)P(B_n|C)$$
$$P(A|C) = P(A \cap B_1|C) + P(A \cap B_2|C) + \dots + P(A \cap B_n|C)$$

Law of Total Probability with $B$ and $B^c$ (special case of a partitioning set), and with extra conditioning (just add $C$!):
$$P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)$$
$$P(A) = P(A \cap B) + P(A \cap B^c)$$
$$P(A|C) = P(A|B, C)P(B|C) + P(A|B^c, C)P(B^c|C)$$
$$P(A|C) = P(A \cap B|C) + P(A \cap B^c|C)$$

Bayes' Rule, and with extra conditioning (just add $C$!):
$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B|A)P(A)}{P(B)}$$
$$P(A|B, C) = \frac{P(A \cap B|C)}{P(B|C)} = \frac{P(B|A, C)P(A|C)}{P(B|C)}$$

Odds Form of Bayes' Rule, and with extra conditioning (just add $C$!):
$$\frac{P(A|B)}{P(A^c|B)} = \frac{P(B|A)}{P(B|A^c)} \cdot \frac{P(A)}{P(A^c)}$$
$$\frac{P(A|B, C)}{P(A^c|B, C)} = \frac{P(B|A, C)}{P(B|A^c, C)} \cdot \frac{P(A|C)}{P(A^c|C)}$$

Simpson's Paradox

It is possible to have
$$P(A|B, C) < P(A|B^c, C) \quad \text{and} \quad P(A|B, C^c) < P(A|B^c, C^c)$$
yet still $P(A|B) > P(A|B^c)$.

Expected Value, Linearity, and Symmetry

Expected Value (aka mean, expectation, or average) can be thought of as the "weighted average" of the possible outcomes of our random variable. Mathematically, if $x_1, x_2, x_3, \dots$ are all of the possible values that $X$ can take, the expected value of $X$ can be calculated as follows:
$$E(X) = \sum_i x_i P(X = x_i)$$

Note that for any $X$ and $Y$, with $a$ and $b$ scaling coefficients and $c$ a constant, the following property of Linearity of Expectation holds:
$$E(aX + bY + c) = aE(X) + bE(Y) + c$$

If two random variables have the same distribution, then by the property of Symmetry their expected values are equal, even when they are dependent.

Conditional Expected Value is calculated like expectation, only conditioned on any event $A$:
$$E(X|A) = \sum_x x P(X = x|A)$$

Indicator Random Variables

An Indicator Random Variable is a random variable that takes on either 1 or 0. The indicator is always an indicator of some event: if the event occurs, the indicator is 1, otherwise it is 0. Indicators are useful for many problems that involve counting and expected value.

Notation:
$$I_A = \begin{cases} 1 & A \text{ occurs} \\ 0 & A \text{ does not occur} \end{cases}$$

Distribution: $I_A \sim \text{Bern}(p)$ where $p = P(A)$.

Fundamental Bridge - The expectation of an indicator for $A$ is the probability of the event: $E(I_A) = P(A)$.
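The fundamental bridge and linearity combine nicely in computation. Below is a minimal Python sketch (not part of the original cheatsheet) that estimates the expected number of fixed points of a random permutation by simulation. Writing the count as a sum of indicators $I_1 + \dots + I_n$, the fundamental bridge gives $E(I_j) = 1/n$ for each position, so linearity predicts an expectation of 1 even though the indicators are dependent. The function name fixed_points and the parameter values are illustrative choices.

```python
import random

# Expected number of fixed points of a random permutation of n items.
# Linearity of expectation + fundamental bridge predict E(count) = n * (1/n) = 1.

def fixed_points(n, rng):
    perm = list(range(n))
    rng.shuffle(perm)
    # count positions j where the permutation maps j to itself
    return sum(1 for j, v in enumerate(perm) if j == v)

rng = random.Random(0)
n, trials = 10, 100_000
avg = sum(fixed_points(n, rng) for _ in range(trials)) / trials
print(f"simulated E(count) = {avg:.3f}, linearity predicts 1.0")
```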
Variance
$$\text{Var}(X) = E(X^2) - [E(X)]^2$$

Expectation and Independence - If $X$ and $Y$ are independent, then
$$E(XY) = E(X)E(Y)$$

Probability and Thinking Conditionally

Independence

Independent Events - $A$ and $B$ are independent if knowing one gives you no information about the other. $A$ and $B$ are independent if and only if one of the following equivalent statements holds:
$$P(A \cap B) = P(A)P(B)$$
$$P(A|B) = P(A)$$

Conditional Independence - $A$ and $B$ are conditionally independent given $C$ if $P(A \cap B|C) = P(A|C)P(B|C)$. Conditional independence does not imply independence, and independence does not imply conditional independence.

Unions, Intersections, and Complements

De Morgan's Laws - A useful relation that can make calculating probabilities of unions easier by relating them to intersections, and vice versa. De Morgan's Law says that the complement is distributive as long as you flip the sign in the middle:
$$(A \cup B)^c \equiv A^c \cap B^c$$
$$(A \cap B)^c \equiv A^c \cup B^c$$

Joint, Marginal, and Conditional Probabilities

Joint Probability - $P(A \cap B)$ or $P(A, B)$ - probability of $A$ and $B$.
Marginal (Unconditional) Probability - $P(A)$ - probability of $A$.
Conditional Probability - $P(A|B)$ - probability of $A$ given that $B$ occurred.
Conditional Probability is Probability - $P(A|B)$ is a probability as well, restricting the sample space to $B$ instead of $\Omega$. Any theorem that holds for probability also holds for conditional probability.

Random Variables and their Distributions

Probability Mass Function (PMF) (discrete only) is a function that takes in a value $x$ and gives the probability that the random variable takes on the value $x$. The PMF is a positive-valued function, and $\sum_x P(X = x) = 1$.
$$P_X(x) = P(X = x)$$

Cumulative Distribution Function (CDF) is a function that takes in a value $x$ and gives the probability that the random variable takes on a value at most $x$.
$$F_X(x) = P(X \le x)$$

Independence - Intuitively, two random variables are independent if knowing one gives you no information about the other. $X$ and $Y$ are independent if for ALL values of $x$ and $y$:
$$P(X = x, Y = y) = P(X = x)P(Y = y)$$

Continuous RVs, LotUS, and UoU

Continuous Random Variables

What's the probability that a CRV is in an interval? Use the CDF (or the PDF, see below). To find the probability that a CRV takes on a value in the interval $[a, b]$, subtract the respective CDFs:
$$P(a \le X \le b) = P(X \le b) - P(X \le a) = F(b) - F(a)$$

Note that for an r.v. with a Normal distribution,
$$P(a \le X \le b) = P(X \le b) - P(X \le a) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right)$$

What is the Cumulative Distribution Function (CDF)? It is the following function of $x$:
$$F(x) = P(X \le x)$$

What is the Probability Density Function (PDF)? The PDF, $f(x)$, is the derivative of the CDF:
$$F'(x) = f(x)$$
Or alternatively,
$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$

Note that by the fundamental theorem of calculus,
$$F(b) - F(a) = \int_{a}^{b} f(x)\,dx$$
Thus, to find the probability that a CRV takes on a value in an interval, you can integrate the PDF, finding the area under the density curve.
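As a quick numerical check of the interval formula for a Normal CRV, here is a minimal Python sketch (not from the cheatsheet) that compares $\Phi((b-\mu)/\sigma) - \Phi((a-\mu)/\sigma)$, with $\Phi$ computed from the error function, against a Monte Carlo estimate of $P(a \le X \le b)$. The specific values of $\mu$, $\sigma$, $a$, $b$ are arbitrary illustrations.

```python
import math
import random

# For X ~ N(mu, sigma^2), compare P(a <= X <= b) computed as a difference of
# standard normal CDFs with a simulation estimate.

def phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, a, b = 2.0, 3.0, 1.0, 5.0
exact = phi((b - mu) / sigma) - phi((a - mu) / sigma)

rng = random.Random(0)
trials = 200_000
hits = sum(a <= rng.gauss(mu, sigma) <= b for _ in range(trials))
print(f"CDF difference = {exact:.4f}, simulation = {hits / trials:.4f}")
```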
How do I find the expected value of a CRV? Where in discrete cases you sum over the probabilities, in continuous cases you integrate over the densities:
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$

Law of the Unconscious Statistician (LotUS)

Expected Value of a Function of an RV - Normally, you would find the expected value of $X$ this way:
$$E(X) = \sum_x x P(X = x) \qquad \text{or} \qquad E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
LotUS states that you can find the expected value of a function $g(X)$ of a random variable this way:
$$E(g(X)) = \sum_x g(x) P(X = x) \ \ \text{(discrete)} \qquad E(g(X)) = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx \ \ \text{(continuous)}$$

Multivariate LotUS
$$E(g(X, Y)) = \sum_x \sum_y g(x, y) P(X = x, Y = y)$$

Independence of Random Variables

Review: $A$ and $B$ are independent if and only if either $P(A \cap B) = P(A)P(B)$ or $P(A|B) = P(A)$. Similar conditions apply to random variables: two random variables are independent if their joint distribution is simply the product of their marginal distributions, or if the conditional distribution of one given the other is the same as its marginal distribution.

In words, random variables $X$ and $Y$ are independent if and only if, for all $x, y$, one of the following holds:
• the joint PMF/PDF/CDF is the product of the marginal PMFs/PDFs/CDFs
• the conditional distribution of $X$ given $Y$ is the same as the marginal distribution of $X$

Moment Generating Functions (MGFs)

Why is it called the Moment Generating Function? Because the $k$th derivative of the moment generating function, evaluated at 0, is the $k$th moment of $X$:
$$\mu'_k = E(X^k) = M_X^{(k)}(0)$$
This is true by Taylor expansion of $e^{tX}$:
$$M_X(t) = E(e^{tX}) = \sum_{k=0}^{\infty} \frac{E(X^k) t^k}{k!} = \sum_{k=0}^{\infty} \frac{\mu'_k t^k}{k!}$$
Or by differentiation under the integral sign and then plugging in $t = 0$:
$$M_X^{(k)}(t) = \frac{d^k}{dt^k} E(e^{tX}) = E\left(\frac{d^k}{dt^k} e^{tX}\right) = E(X^k e^{tX})$$
$$M_X^{(k)}(0) = E(X^k e^{0 \cdot X}) = E(X^k) = \mu'_k$$

MGF of linear combinations - If we have $Y = aX + c$, then
$$M_Y(t) = E(e^{t(aX + c)}) = e^{ct} E(e^{(at)X}) = e^{ct} M_X(at)$$

Uniqueness of the MGF - If it exists, the MGF uniquely determines the distribution.
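To tie LotUS and the MGF moment property together, here is a minimal Python sketch (not from the cheatsheet) that computes $E(X^2)$ for a fair die three ways: the LotUS sum $\sum_x g(x) P(X = x)$, a finite-difference approximation of $M_X''(0)$, and a simulation average. The helper names (g, mgf) and the step size h are illustrative choices, assuming a fair six-sided die as the example distribution.

```python
import math
import random

# Fair die X: compare E(X^2) via LotUS, via the second derivative of the MGF
# M_X(t) = E(e^{tX}) at t = 0 (central finite difference), and via simulation.

support = [1, 2, 3, 4, 5, 6]
pmf = {x: 1 / 6 for x in support}

def g(x):
    return x ** 2

lotus = sum(g(x) * p for x, p in pmf.items())  # LotUS sum

def mgf(t):
    return sum(math.exp(t * x) * p for x, p in pmf.items())

h = 1e-4
second_moment = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2  # ~ M''_X(0) = E(X^2)

rng = random.Random(0)
trials = 200_000
sim = sum(g(rng.choice(support)) for _ in range(trials)) / trials

print(f"LotUS = {lotus:.4f}, MGF = {second_moment:.4f}, simulation = {sim:.4f}")
```

All three values should agree (the exact answer is 91/6, about 15.17), illustrating both LotUS and why derivatives of the MGF at 0 generate moments.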