

Basic Statistics for SGPE Students
Part II: Probability distributions

Nicolai Vitt [email protected]

University of Edinburgh

September 2019

Thanks to Achim Ahrens, Anna Babloyan and Erkal Ersoy for creating these slides and allowing me to use them.

Outline
1. Probability theory

- Conditional probability and independence
- Bayes' theorem
2. Probability distributions

- Discrete and continuous probability functions
- Probability density & cumulative distribution function
- Binomial, Poisson and normal distributions
- E[X] and V[X]
3. Descriptive statistics

- Summary statistics
- Graphs (box plots, ...)
- Data transformations (log transformation, units of measurement)
- Correlation vs. causation
4. Statistical inference

- Population vs. sample
- Law of large numbers
- Central limit theorem
- Confidence intervals
- Hypothesis testing and p-values

Random variables
Most of the outcomes or events we have considered so far have been non-numerical, e.g. either head or tail. If the outcome of an experiment is numerical, we call the variable that is determined by the experiment a random variable. Random variables may be either discrete (e.g. the number of days the sun shines) or continuous (e.g. your salary after graduating from the MSc). In contrast to a continuous random variable, we can list the distinct potential outcomes of a discrete random variable.

Notation
Random variables are usually denoted by capital letters, e.g. X. The corresponding realisations are denoted by small letters, e.g. x.

Should you make the bet?

Example III.1
I propose the following game. We toss a fair coin 10 times. If head appears 4 times or less, I pay you £2. If head appears more than 4 times, you pay me £1. Should you make the bet?

Let’s try to formalise the problem. Let the random variables X1, X2,..., X10 be defined such that

X_i = \begin{cases} 1 & \text{if head appears on the ith toss} \\ 0 & \text{if tail appears on the ith toss} \end{cases} \quad \text{for } i = 1, \dots, 10.

Should you make the bet?
Furthermore, let the random variable Y denote the number of heads. Clearly, Y = X_1 + X_2 + ... + X_{10}. If the realisation of Y is greater than 4, I win. Let P(Y = y) denote the probability that Y takes the value y. Accordingly, P(Y ≤ 4) is the probability that we obtain 4 or fewer heads and P(Y > 4) is the probability that we obtain more than 4 heads. When would you make the bet? Your expected payoff is

E[V] = P(Y ≤ 4) · £2 + P(Y > 4) · (−£1)

where V is the money you get. If E[V] > 0 (and you are risk neutral), you'll choose to play.

Should you make the bet?

Expected value
The expected value of a discrete random variable X is denoted by E[X] and given by

E[X] = x_1 P(X=x_1) + x_2 P(X=x_2) + \cdots + x_k P(X=x_k) = \sum_{i=1}^{k} x_i P(X=x_i)

where k is the number of distinct outcomes.
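To make the definition concrete, here is a minimal Python sketch (not part of the original slides; the function name is ours):

    # E[X] as the probability-weighted sum of the outcomes.
    def expected_value(outcomes, probs):
        return sum(x * p for x, p in zip(outcomes, probs))

    # A fair die (Example III.2 below) has E[X] = (1 + 2 + ... + 6)/6 = 3.5.
    print(expected_value([1, 2, 3, 4, 5, 6], [1/6] * 6))  # 3.5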

Should you make the bet?
To solve the problem, we need to find P(Y ≤ 4) and P(Y > 4). From the additive law (Rule 4), we know that

P(Y ≤ 4) = P(Y=0 ∪ Y=1 ∪ Y=2 ∪ Y=3 ∪ Y=4) = P(Y=0) + P(Y=1) + P(Y=2) + P(Y=3) + P(Y=4)
P(Y > 4) = P(Y=5) + P(Y=6) + P(Y=7) + P(Y=8) + P(Y=9) + P(Y=10)

Hence, we need to find P(Y = y) for y = 0, ..., 10.

Discrete probability distributions
It is common to denote the probability distribution of a discrete random variable Y by f(y).

Discrete probability distribution
The probability distribution or probability mass function of a discrete random variable X associates with each of the distinct potential outcomes x_i (i = 1, ..., k) a probability P(X = x_i). That is, f(x_i) = P(X = x_i). The probabilities sum to 1, i.e. \sum_{i=1}^{k} f(x_i) = 1.

Discrete probability distribution
Two examples:

Example III.2 (Discrete uniform distribution)
Let X be the result from rolling a fair die. The probability distribution is simply

f(x) = P(X = x) = \begin{cases} 1/6 & \text{for } x \in \{1, 2, \dots, 6\} \\ 0 & \text{otherwise} \end{cases}

This probability distribution is an example of a discrete uniform distribution.

Bernoulli distribution
It is said that a random variable X has a Bernoulli distribution with P(X = 1) = p (i.e. probability of success) if X can take only the values 1 (success) and 0 (failure). The probability distribution is given by

f(x) = \begin{cases} p & \text{if } x = 1 \\ 1 - p & \text{if } x = 0 \\ 0 & \text{otherwise} \end{cases}

Binomial coefficient & binomial distribution
Let's start with f(0) = P(Y=0), which is the probability of obtaining no heads. Using the multiplicative law,

P(Y=0) = P(X_1=0) P(X_2=0) \cdots P(X_{10}=0) = \left(\tfrac{1}{2}\right)^{10} = 0.00097656

Now, f (1) = P(Y =1). Since we are interested in the number of heads, we have to take into account that there is more than one combination that results in 1 head.

P(Y=1) = P(X_1=1) P(X_2=0) \cdots P(X_{10}=0)
       + P(X_1=0) P(X_2=1) \cdots P(X_{10}=0)
       + \cdots
       + P(X_1=0) P(X_2=0) \cdots P(X_{10}=1)
     = 10 \cdot \left(\tfrac{1}{2}\right)^{10} = 0.00976563

Binomial coefficient & binomial distribution
Now, f(2) = P(Y=2). How many combinations are there that yield 2 heads out of 10 tosses? Given that the first toss produces a head, there are 9 combinations that yield two heads in total. And so on...

toss:           1 2 3 4 5 6 7 8 9 10
combination 1:  H H T T T T T T T T
combination 2:  H T H T T T T T T T
combination 3:  H T T H T T T T T T
combination 4:  H T T T H T T T T T
combination 5:  H T T T T H T T T T
combination 6:  H T T T T T H T T T
combination 7:  H T T T T T T H T T
combination 8:  H T T T T T T T H T
combination 9:  H T T T T T T T T H


Repeating this with a head fixed on each of the 10 tosses gives us 10 · 9 = 90 combinations. However, this approach has the problem of double counting: each combination appears exactly twice (once for each of its two heads). So we have to divide by 2 and get (10 · 9)/2 = 45 distinct combinations. Thus,

P(Y = 2) = \frac{10 \cdot 9}{2} \left(\frac{1}{2}\right)^{10} = 0.04394531.

Binomial coefficient & binomial distribution
For P(Y = 3), P(Y = 4), ... this gets even more complicated.

Binomial coefficient
Suppose that there is a set of n distinct elements from which it is desired to choose a subset of k elements (typically 1 ≤ k ≤ n). The binomial coefficient gives the number of ways k elements can be selected from n elements. The binomial coefficient is defined as

C_{n,k} = {}_n C_k = \binom{n}{k} = \frac{n!}{k!(n-k)!}

where k! = k(k-1)(k-2) \cdots 1 and 0! = 1.

Remark
Note that

\frac{n!}{(n-k)!} = n \cdot (n-1) \cdot (n-2) \cdots (n-k+1).

For example,

\frac{7!}{(7-3)!} = \frac{7!}{4!} = \frac{7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{4 \cdot 3 \cdot 2 \cdot 1} = 7 \cdot 6 \cdot 5.
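As a quick check (our addition, using only Python's standard library), the binomial coefficients used on these slides can be computed directly:

    from math import comb, factorial

    # Number of ways to choose k elements from n, ignoring order.
    print(comb(10, 2))  # 45 = (10 * 9)/2, the count derived above
    print(factorial(10) // (factorial(2) * factorial(8)))  # 45, from n!/(k!(n-k)!)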

Binomial coefficient & binomial distribution
Let's consider another example to get a better understanding of the binomial coefficient.

Example III.3
Imagine a box with four distinct elements (n = 4) denoted as a, b, c, d. We want to randomly pick two elements (k = 2). If the order of selecting elements matters, there exist 4 · 3 different combinations. However, we don't want the order to matter, so we divide by 2, as there are two ways of ordering two elements ({b, a} and {a, b}). Therefore, there are

\frac{4 \cdot 3}{2} = \frac{4!}{2!(4-2)!} = \binom{4}{2} = 6

different combinations: {a, b}, {a, c}, {a, d}, {b, c}, {b, d}, {c, d}.

Combination vs. permutation


Note the distinction between permutation (order matters) and combination (order does not matter). If order matters (e.g. we distinguish between {a, b} and {b, a}), the solution to the above problem is simply 4 · 3 = 12.

Binomial coefficient & binomial distribution
Back to our problem: For P(Y = 3),

f(3) = P(Y = 3) = \binom{10}{3} \left(\frac{1}{2}\right)^{10} = \frac{10!}{3!(10-3)!} \left(\frac{1}{2}\right)^{10} = \frac{10 \cdot 9 \cdot 8}{3 \cdot 2 \cdot 1} \left(\frac{1}{2}\right)^{10} = 0.1171875.

The binomial coefficient allows us to find a general expression for f(y).

Binomial distribution

If the random variables X_1, ..., X_n form n Bernoulli trials with parameter p (i.e. probability of success), then Y = X_1 + \cdots + X_n follows a binomial distribution. The binomial distribution is given by

f(y; n, p) = \binom{n}{y} p^y (1-p)^{n-y}

for y = 0, 1, ..., n.

Binomial distribution
We now know the specific functional form of f(y) = P(Y = y). Hence, we can obtain the probability that we draw 0, 1, 2, ..., 10 heads.

Binomial distribution (n = 10, p = 0.5):

 y    f(y)
 0    0.00098
 1    0.00977
 2    0.04395
 3    0.11719
 4    0.20508
 5    0.24609
 6    0.20508
 7    0.11719
 8    0.04395
 9    0.00977
10    0.00098

[Figure: bar chart of f(y) for the binomial distribution with n = 10, p = 0.5]
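The table can be reproduced with a few lines of Python (a sketch; we assume scipy is available):

    from scipy.stats import binom

    # pmf of Y ~ Binomial(n = 10, p = 0.5) for y = 0, 1, ..., 10.
    n, p = 10, 0.5
    for y in range(n + 1):
        print(y, round(binom.pmf(y, n, p), 5))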

Cumulative distribution function
However, we are interested in P(Y ≤ 4).

Cumulative distribution function
The cumulative distribution function of a discrete random variable X is denoted by F(x) and is defined as F(x) = P(X ≤ x), where −∞ ≤ x ≤ +∞. The cumulative distribution function F(x) gives the probability that the outcome of X in a random trial will be less than or equal to any specified value x.

Binomial distribution
Cumulative distribution function:

 y    f(y)     F(y)
 0    0.00098  0.00098
 1    0.00977  0.01074
 2    0.04395  0.05469
 3    0.11719  0.17188
 4    0.20508  0.37695
 5    0.24609  0.62305
 6    0.20508  0.82813
 7    0.11719  0.94531
 8    0.04395  0.98926
 9    0.00977  0.99902
10    0.00098  1.00000

[Figure: step plot of the CDF F(y) for the binomial distribution with n = 10, p = 0.5]

For example, F(2) = f (0) + f (1) + f (2) = 0.00098 + 0.00977 + 0.04395 = 0.05469.

Should you make the bet?

Example III.1 (continued) I propose the following game. We toss a fair coin 10 times. If head appears 4 times or less, I pay you £2. If head appears more than 4 times, you pay me £1. Should you make the bet?

We can finally solve the problem. Your expected value is

E[V] = P(Y ≤ 4) · £2 + P(Y > 4) · (−£1)
     = F(4) · £2 + (1 − F(4)) · (−£1)
     = 0.377 · £2 + 0.623 · (−£1)
     ≈ £0.131.

You should make the bet (if you are risk neutral)! What does E[V ] mean? If we repeat the game an infinite number of times, your average payoff will be £0.131.
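As a sketch (scipy assumed), the whole calculation fits in a few lines:

    from scipy.stats import binom

    # F(4) = P(Y <= 4) for Y ~ Binomial(10, 0.5), then the expected payoff.
    F4 = binom.cdf(4, 10, 0.5)     # 0.37695...
    EV = F4 * 2 + (1 - F4) * (-1)  # in pounds
    print(F4, EV)                  # ~0.377, ~0.131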

Binomial distribution (simulation)
Suppose that we play the game m times. That is, we toss the coin 10 times, write down the number of heads, and play again. Let's set m = 20. We get: 3 7 5 2 3 3 2 5 3 7 4 4 4 4 5 3 4 5 4 2.

 y              0    1    2    3    4    5    6    7    8    9    10
 frequency      0    0    3    5    6    4    0    2    0    0    0
 rel. frequency 0.00 0.00 0.15 0.25 0.30 0.20 0.00 0.10 0.00 0.00 0.00

[Figure: empirical distribution from 20 repetitions vs. the binomial distribution (n = 10, p = 0.5)]

Binomial distribution (simulation)
Suppose that we play the game m times. That is, we toss the coin 10 times, write down the number of heads, and play again. Let's set m = 50.

 y              0    1    2    3    4    5    6    7    8    9    10
 frequency      0    0    3    7    11   8    7    7    4    3    0
 rel. frequency 0.00 0.00 0.06 0.14 0.22 0.16 0.14 0.14 0.08 0.06 0.00

[Figure: empirical distribution from 50 repetitions vs. the binomial distribution (n = 10, p = 0.5)]

Binomial distribution (simulation)
Suppose that we play the game m times. That is, we toss the coin 10 times, write down the number of heads, and play again. Let's set m = 100.

 y              0    1    2    3    4    5    6    7    8    9    10
 frequency      0    2    3    17   26   22   14   11   3    2    0
 rel. frequency 0.00 0.02 0.03 0.17 0.26 0.22 0.14 0.11 0.03 0.02 0.00

[Figure: empirical distribution from 100 repetitions vs. the binomial distribution (n = 10, p = 0.5)]

Binomial distribution (simulation)
Suppose that we play the game m times. That is, we toss the coin 10 times, write down the number of heads, and play again. Let's set m = 10,000.

 y              0      1      2      3      4      5      6      7      8      9      10
 frequency      4      89     429    1171   2045   2470   2075   1198   411    103    5
 rel. frequency 0.0004 0.0089 0.0429 0.1171 0.2045 0.2470 0.2075 0.1198 0.0411 0.0103 0.0005

[Figure: empirical distribution from 10,000 repetitions vs. the binomial distribution (n = 10, p = 0.5)]
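The simulation itself is easy to reproduce (a sketch; numpy assumed, and the seed is arbitrary):

    import numpy as np

    # Play the game m times: each game is n = 10 fair coin tosses;
    # record the number of heads per game.
    rng = np.random.default_rng(seed=0)
    m, n, p = 10_000, 10, 0.5
    heads = rng.binomial(n, p, size=m)

    values, counts = np.unique(heads, return_counts=True)
    for y, c in zip(values, counts):
        print(y, c, c / m)  # y, frequency, relative frequency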

Binomial distribution
The binomial distribution has two parameters: n is the number of (Bernoulli) trials and p is the probability of success in each trial. How does the distribution look for different values of n and p?

[Figure: binomial distributions for n = 10 and n = 30, each with p = 0.5, 0.7 and 0.9]

Binomial distribution
Expected value and variance
In the same way we summarise an observed dataset by the sample average and the sample variance (or standard deviation), we can characterise a probability distribution by its expected value and its variance. From the figures we can see that the expected value and variance change with n and p. To find the expected value of Y, note that:

Linearity of expectation

If Y is the sum of random variables X_1, X_2, ..., X_n, then

E[Y] = E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i].

Furthermore, if c is a constant (i.e., non-random) and X a random variable, then

E[X + c] = E[X] + c
E[cX] = c E[X].

Binomial distribution
Expected value and variance

Recall that Y = X1 + X2 + X3 + ··· + Xn. Therefore,

E[Y ] = E[X1] + E[X2] + E[X3] + ··· + E[Xn]

Recall that Xi follows a Bernoulli distribution. The expected value of a Bernoulli variable is

E[Xi] = p · 1 + (1 − p) · 0 = p. Therefore,

E[Y ] = E[X1] + E[X2] + E[X3] + ... + E[Xn] = np.

Binomial distribution
Expected value and variance

Variance
The variance of a discrete random variable X is denoted by V[X] and given by

V[X] = \sum_{i=1}^{k} (x_i - E[X])^2 P(X = x_i).

Variance of the sum of uncorrelated random variables

If Y is the sum of independent (!) random variables X_1, X_2, ..., X_n, then

V[Y] = V\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} V[X_i]

The variance of X_i is given by

V[X_i] = (1 - E[X_i])^2 p + (0 - E[X_i])^2 (1 - p) = p(1 - p)

Therefore V[Y] = np(1 - p).

Poisson distribution

Example III.4
Let X be the number of cars passing by in an hour. On average, λ cars pass by. What is the probability that 3 cars pass by?

To simplify the problem, we divide each hour into 60 minutes. Since E[X] = λ, the probability that one car passes by in any particular minute is λ/60. Using this simplification, we can work with the binomial distribution.

f(3) = P(X = 3) ≈ \binom{60}{3} \left(\frac{\lambda}{60}\right)^3 \left(1 - \frac{\lambda}{60}\right)^{60-3}

However, this approach does not take into account that more than one car may pass by in a minute.

Poisson distribution
We can address this problem by dividing each hour into 3,600 seconds, 3,600,000 milliseconds, and so on. More generally, with n being the number of units we divide the hour into:

f(x) = P(X = x) ≈ \binom{n}{x} \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n-x}

Let n → ∞, do some maths, and you'll arrive at:

Poisson distribution
If X follows a Poisson process, then

f(x) = P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}

for x = 0, 1, 2, 3, ..., ∞. Note that e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = 2.718....
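The limit can be seen numerically (our sketch, scipy assumed): as n grows, the binomial probabilities approach the Poisson probabilities.

    from scipy.stats import binom, poisson

    lam, x = 3, 3
    for n in (60, 3_600, 3_600_000):
        print(binom.pmf(x, n, lam / n))  # approaches the Poisson pmf as n grows
    print(poisson.pmf(x, lam))           # e**-3 * 3**3 / 3! = 0.2240...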

Poisson distribution

[Figure: Poisson distributions with lambda = 1 and lambda = 3]

The Poisson distribution is asymmetric (right-skewed) for E[X] = λ = 1 and λ = 3.

Poisson distribution

[Figure: Poisson distributions with lambda = 10 and lambda = 1000]

The higher λ, the more symmetric the Poisson distribution. Also, the distribution looks very similar to the normal distribution!

Continuous distributions

Probability density function (PDF)
If the random variable X is continuous, we use f(x) to denote the probability density function (PDF) of X. The PDF satisfies two requirements:

(1) f(x) ≥ 0 and (2) \int_{-\infty}^{+\infty} f(x)\,dx = 1

Remark (!)
If the random variable X is continuous, then the probability that X takes a particular value x is zero. That is, f(x) ≠ P(X = x) = 0.

Continuous distributions

Uniform distribution
Let a and b be two real numbers (a < b) and consider an experiment where a number X is randomly selected from the interval [a, b]. If the probability that X belongs to any subinterval of [a, b] is proportional to the length of the subinterval, we say that X is uniformly distributed. The PDF of X is given by

f(x) = \begin{cases} \frac{1}{b-a} & \text{for } x \in [a, b] \\ 0 & \text{otherwise} \end{cases}

We write that X ∼ u(a, b).

Continuous distributions

Example III.5 Anna and Achim arrange to meet ‘between 1pm and 2pm’ at Old College. Their arrival time is uniformly distributed and they arrive independently of each other. What is the probability that no one will have to wait more than 15 minutes? Let X denote Anna’s arrival time and Y denote Achim’s arrival time.

- Express the event 'no one will wait for more than 15 minutes' in terms of X and Y.

- What is the joint distribution of X and Y? (A simulation sketch follows below.)
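Not part of the original slides: a Monte Carlo sketch (numpy assumed) you can use to check your answer. The event is |X − Y| ≤ 15 with X and Y independent and uniform on [0, 60] minutes; the exact answer works out to 1 − (45/60)² = 7/16 = 0.4375.

    import numpy as np

    # P(|X - Y| <= 15) with X, Y ~ u(0, 60), independent.
    rng = np.random.default_rng(seed=0)
    m = 1_000_000
    x = rng.uniform(0, 60, size=m)  # Anna's arrival time (minutes after 1pm)
    y = rng.uniform(0, 60, size=m)  # Achim's arrival time
    print(np.mean(np.abs(x - y) <= 15))  # ~0.4375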

Continuous distributions
The normal distribution is by far the single most important probability distribution. Many natural phenomena are (approximately) normally distributed. Another reason for its importance comes from the central limit theorem (to be discussed in the next lecture).

Normal distribution
If the random variable X follows a normal distribution with mean µ (−∞ < µ < ∞) and variance σ² (σ > 0), its PDF is given by

f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}

for −∞ < x < ∞. We write that X ∼ N(µ, σ²).

Continuous distributions

[Figure: PDFs of N(0, 1) (left) and of u(1, 3) and u(−3, 0) (right)]

E[X] for continuous distributions
Expected value
The expected value of a continuous random variable is

E[X] = \int_{-\infty}^{+\infty} x f(x)\,dx.

E[X] is the balance point of the probability mass: the probability mass to the left of E[X] is in balance with the probability mass to the right of E[X].

E[X] for continuous distributions
Thus, the expected value of the normal distribution is simply at its highest point (due to symmetry), and the expected value of a uniform distribution is halfway between a and b.

[Figure: balance points of the normal PDF and of the uniform PDF on [a, b]]

E[X] for the uniform distribution
Let's do this formally for the uniform distribution.

E[X] = \int_{-\infty}^{+\infty} x f(x)\,dx                    (1)
     = \int_{-\infty}^{+\infty} x \frac{1}{b-a}\,dx           (2)
     = \frac{1}{b-a} \int_a^b x\,dx                           (3)
     = \frac{1}{b-a} \left[\frac{1}{2}x^2\right]_a^b          (4)
     = \frac{1}{b-a} \cdot \frac{1}{2}\left(b^2 - a^2\right)  (5)
     = \frac{1}{2} \cdot \frac{(b-a)(b+a)}{b-a}               (6)
     = \frac{1}{2}(b+a)                                       (7)

E[X] for the uniform distribution

(1) By the definition of the expected value.
(2) By the definition of the uniform distribution.
(3) If k is a constant, then

\int k f(x)\,dx = k \int f(x)\,dx.

(4) Since

\int x^n\,dx = \frac{1}{n+1} x^{n+1} + c \quad (n ≠ −1).

(4-5) Since

\int_a^b f(x)\,dx = \left[F(x)\right]_a^b = F(b) − F(a)

where \frac{d}{dx} F(x) = f(x).

V[X] for continuous distributions
Variance
The variance of a continuous random variable X is

V[X] = \int_{-\infty}^{+\infty} (x − E[X])^2 f(x)\,dx.    (8)

[Figure: PDFs of N(0, 1), N(0, 2) and N(0, 3)]

V[X] for the uniform distribution
Instead of using the definition above, we can make use of the following:

Variance

V[X] = E[(X − E[X])^2]
     = E[X^2 − 2X E[X] + (E[X])^2]
     = E[X^2] − 2(E[X])^2 + (E[X])^2
     = E[X^2] − (E[X])^2

where the third line uses the fact that E[A + B] = E[A] + E[B] and E[E[A]] = E[A]. We know E[X]. Hence, we only need to find E[X^2].

V[X] for the uniform distribution
We treat X^2 as a new random variable which follows the same PDF as X.

E[X^2] = \int_{-\infty}^{+\infty} x^2 f(x)\,dx
       = \frac{1}{b-a} \int_a^b x^2\,dx
       = \frac{1}{b-a} \left[\frac{1}{3}x^3\right]_a^b
       = \frac{1}{b-a} \cdot \frac{1}{3}\left(b^3 − a^3\right)
       = \frac{1}{3} \cdot \frac{(b-a)(a^2 + ab + b^2)}{b-a}
       = \frac{1}{3}(a^2 + ab + b^2)

V[X] = E[X^2] − (E[X])^2 = \frac{1}{3}(a^2 + ab + b^2) − \left(\frac{1}{2}(b + a)\right)^2 = \frac{1}{12}(b − a)^2
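Both results can be verified symbolically (our addition; sympy assumed):

    import sympy as sp

    # E[X] and V[X] for X ~ u(a, b) by direct integration of the PDF 1/(b - a).
    x = sp.symbols('x')
    a, b = sp.symbols('a b', real=True)
    pdf = 1 / (b - a)

    EX = sp.integrate(x * pdf, (x, a, b))
    EX2 = sp.integrate(x**2 * pdf, (x, a, b))
    print(sp.simplify(EX))         # (a + b)/2
    print(sp.factor(EX2 - EX**2))  # (a - b)**2/12, i.e. (b - a)**2/12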

PDF and probability
Consider the standard normal distribution depicted in the figure. As you can see, f(0) ≈ 0.4. To be precise, f(0) = 0.3989423.... It is important to understand that this does not mean that P(X = 0) = 0.3989423...! If a random variable is continuous, there are infinitely many distinct values that the random variable can take. Thus, the probability that the random variable takes a specific value is zero.

[Figure: PDF of N(0, 1)]

PDF and probability
However, we can say that the probability that X is below 0 is equal to the shaded grey area. Since we know that the area under f(x) is 1, we know (due to symmetry) that P(X ≤ 0) = 0.5.

[Figure: PDF of N(0, 1) with the area to the left of 0 shaded]

PDF and probability
But what is, say, P(X ≤ −1)?

[Figure: PDF of N(0, 1) with the area to the left of −1 shaded]

CDF and probability

Cumulative distribution function
The cumulative distribution function (CDF) of a continuous random variable X is denoted by F(x) and given by

F(x) = P(X ≤ x) = \int_{-\infty}^{x} f(u)\,du \quad \text{for } −∞ ≤ x ≤ +∞.

The CDF gives the probability that the outcome of X in a random experiment is less than or equal to x.

CDF and probability

[Figure: PDF and CDF of N(0, 1)]

We can read from the CDF that

F(−1) = P(X ≤ −1) ≈ 0.159...

and

F(0) = P(X ≤ 0) = 0.5.

CDF and probability

[Figure: PDF and CDF of N(0, 1)]

What is the probability that X lies between −1 and 0? That is, what is P(−1 ≤ X ≤ 0)? It is simply

F(0) − F(−1) = 0.5 − 0.159 ≈ 0.341.

CDF and probability

[Figure: PDF and CDF of N(0, 1)]

What is the probability that X is below +1? Due to symmetry, F(1) = 1 − F(−1). Thus,

F(1) = 1 − F(−1) = 1 − 0.159 ≈ 0.841.

Inverse functions and CDF
We will often use the inverse function of the CDF.

Inverse function
In general, if g(x) is an invertible function, then the inverse function is given by g^{-1}(g(x)) = x.
Intuition: A function works like a machine. It takes x as an input and returns the output g(x) = a. An inverse function works the other way around: g^{-1}(a) = x.

Suppose we are interested in the following question: what is the value of x such that P(X ≤ x) = 0.95?

Inverse functions and CDF

[Figure: PDF and CDF of N(0, 1) with the 0.95 quantile marked at x ≈ 1.64]

Using the inverse CDF:

F^{-1}(0.95) ≈ 1.64...

This implies that, due to symmetry, approximately 90% of the probability mass lies within ±1.64.
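In practice, the inverse CDF is available as the quantile function of statistics libraries (a sketch; scipy assumed, where it is called ppf):

    from scipy.stats import norm

    # Quantile function (inverse CDF) of the standard normal distribution.
    print(norm.ppf(0.95))  # 1.6448...
    print(norm.cdf(1.6449) - norm.cdf(-1.6449))  # ~0.90 of the mass within ±1.64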

Standard normal distribution

Standard normal distribution
If X ∼ N(µ, σ²), then

Z = \frac{X − µ}{σ} ∼ N(0, 1).

We say that Z follows a standard normal distribution. The PDF and CDF of the standard normal distribution are often denoted by φ(z) and Φ(z).

[Figure: PDFs of Z ∼ N(0, 1) and X ∼ N(10, 4)]

Standard normal distribution

Example III.6
Suppose X ∼ N(10, 4) and Z = (X − 10)/√4 ∼ N(0, 1). What is P(X ≤ 8)?

P(X ≤ 8) = P\left(\frac{X − µ}{σ} ≤ \frac{8 − µ}{σ}\right) = P\left(\frac{X − 10}{\sqrt{4}} ≤ \frac{8 − 10}{\sqrt{4}}\right) = P(Z ≤ −1)

[Figure: PDFs of Z ∼ N(0, 1) and X ∼ N(10, 4), with the areas P(Z ≤ −1) and P(X ≤ 8) shaded]
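A quick numerical check of Example III.6 (our sketch, scipy assumed):

    from scipy.stats import norm

    # P(X <= 8) for X ~ N(10, 4), i.e. mean 10 and standard deviation 2.
    print(norm.cdf(8, loc=10, scale=2))  # 0.1586...
    print(norm.cdf(-1))                  # the same value via Z = (8 - 10)/2 = -1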

Multivariate distributions (discrete)

Joint probability function
The joint probability function of two discrete random variables X and Y is given by

f(x, y) = P(X = x \text{ and } Y = y) \quad \text{with} \quad \sum_i \sum_j f(x_i, y_j) = 1.

Table of probabilities:

X\Y    1    2    3    4
 1    0.1  0    0.1  0
 2    0.3  0    0.1  0.2
 3    0    0.2  0    0
 4    0    0    0    0

[Figure: 3D bar chart of the joint probability function f(x, y)]

Multivariate distributions (continuous)

Joint probability density function
The joint probability density function (or joint PDF) of two continuous random variables X and Y is given by

f(x, y) \quad \text{with} \quad \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x, y)\,dx\,dy = 1.

[Figure: surface plot of a joint PDF f(x, y)]

Multivariate distributions (continuous)


Example III.7
If X and Y have joint PDF f(x, y), then the probability that X lies between 0 and 2 and that, at the same time, Y lies between 0 and 1 is

P(0 ≤ X ≤ 2 \text{ and } 0 ≤ Y ≤ 1) = \int_0^1 \int_0^2 f(x, y)\,dx\,dy

Marginal distributions

Marginal probability function (discrete)
If X and Y are two discrete random variables for which the joint probability function is f(x, y), then the marginal probability function for X is

f_X(x) = P(X = x) = \sum_y P(X = x \text{ and } Y = y) = \sum_y f(x, y)

The marginal probability gives the probability of observing a specific value of X (say X = x). To calculate the probability of observing x, we need to add the probabilities of all events that correspond to X = x. That is, P(X = x) = f(x, y_1) + f(x, y_2) + ... + f(x, y_n).

Marginal distributions

Example III.8 (Discrete case)
What is the marginal probability function for X? (A computational check follows below.)

Table of probabilities:

X\Y    1    2    3    4
 1    0.1  0    0.1  0
 2    0.3  0    0.1  0.2
 3    0    0.2  0    0
 4    0    0    0    0
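A short computational check (our addition, numpy assumed): summing the joint table over y gives the marginal of X.

    import numpy as np

    # Joint probabilities from Example III.8; rows index x, columns index y.
    joint = np.array([[0.1, 0.0, 0.1, 0.0],
                      [0.3, 0.0, 0.1, 0.2],
                      [0.0, 0.2, 0.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0]])

    fX = joint.sum(axis=1)  # marginal of X: [0.2, 0.6, 0.2, 0.0]
    fY = joint.sum(axis=0)  # marginal of Y: [0.4, 0.2, 0.2, 0.2]
    print(fX, fY, joint.sum())  # the probabilities sum to 1.0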

Marginal probability density function (continuous)
If X and Y are two continuous variables for which the joint probability density function is f(x, y), then the marginal probability density function for X is

f_X(x) = \int_{-\infty}^{+\infty} f(x, y)\,dy

Joint distributions and independence
Recall from the last lecture that, if X and Y are two independent events, then P(X and Y) = P(X)P(Y). We can generalise this statement:

Independence
Two continuous (or discrete) random variables are independent if and only if

f(x, y) = f_X(x) f_Y(y) ⟺ F(x, y) = F_X(x) F_Y(y)

where f_X(x) and f_Y(y) are the marginal PDFs, and F_X(x) and F_Y(y) denote the marginal CDFs.

Conditional distributions
Recall from the last lecture that the conditional probability is defined as P(X|Y) = P(X,Y)/P(Y). Furthermore, recall that if X and Y are two independent events, then P(X|Y) = P(X).

Conditional probability density function
Suppose that X and Y are two continuous (or discrete) random variables for which the joint PDF is f(x, y) and the marginal PDFs are f_X(x) and f_Y(y). Suppose also that the value y has already been observed. The conditional probability density function of X given that Y = y is given by

f_X(x|y) = \frac{f(x, y)}{f_Y(y)}.

Note that if X and Y are independent, we get the relation f_X(x|y) = f_X(x).

Conditional distributions

[Figure: surface plot of the joint PDF f(x, y), with the slice f(x, 1) highlighted]

The black, thick line shows f(x, 1), which is proportional to the conditional distribution of X given Y = 1. More specifically:

f_X(x|Y = 1) = \frac{f(x, 1)}{f_Y(1)}.

(Conditional) expectations and covariance

Example III.9
Let X and Z be two independently distributed standard normal random variables and let Y = X² + Z. Hence, V(X) = V(Z) = 1 and E(X) = E(Z) = 0.
a) Derive E[Y|X].
b) Derive E[Y]. [Hint: Use V[X] = E[X²] − (E[X])².]
c) Derive E[XY]. [Hint: Since X is standard normal, E[X³] = 0.]
d) Find Cov(X, Y) = E[(X − E(X))(Y − E(Y))].
What is interesting about the results? (A simulation sketch follows below.)
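Not in the original slides: a simulation (numpy assumed) that illustrates the punchline, namely that Y = X² + Z has zero covariance with X even though the two are clearly dependent.

    import numpy as np

    # X, Z independent standard normals; Y = X**2 + Z.
    rng = np.random.default_rng(seed=0)
    m = 1_000_000
    x = rng.standard_normal(m)
    z = rng.standard_normal(m)
    y = x**2 + z

    print(np.cov(x, y)[0, 1])       # ~0: Cov(X, Y) = 0
    print(y[np.abs(x) > 2].mean())  # > 4: knowing X changes E[Y|X] = X**2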

Independence and covariance
Covariance is defined as

Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E[XY] − E[X]E[Y].

If X and Y are independent, E[XY] = E[X]E[Y], and therefore Cov(X, Y) = 0. However, Cov(X, Y) = 0 (and Corr(X, Y) = 0) does not imply independence, as demonstrated in the previous example.

Summary

- Random variables are either discrete or continuous. Random variables are usually denoted by capital letters, e.g. X, and realisations by small letters, e.g. x.

- For a continuous random variable, P(X = x) = 0 and f(x) ≠ P(X = x), where f(x) denotes the probability density function.

- Many probability distributions are closely related. For example, we can derive the Poisson distribution from the binomial distribution, and the Poisson distribution behaves similarly to the normal distribution as λ → ∞.

- Independence implies Cov(X, Y) = 0 (and Corr(X, Y) = 0), but not the other way around. Cov(X, Y) and Corr(X, Y) measure the strength of the linear relationship between two variables.
