Basic Statistics for SGPE Students
Part II: Probability distribution

Nicolai Vitt ([email protected])
University of Edinburgh, September 2019

Thanks to Achim Ahrens, Anna Babloyan and Erkal Ersoy for creating these slides and allowing me to use them.

Outline

1. Probability theory
   - Conditional probabilities and independence
   - Bayes' theorem
2. Probability distributions
   - Discrete and continuous probability functions
   - Probability density function & cumulative distribution function
   - Binomial, Poisson and Normal distribution
   - E[X] and V[X]
3. Descriptive statistics
   - Sample statistics (mean, variance, percentiles)
   - Graphs (box plot, histogram)
   - Data transformations (log transformation, unit of measure)
   - Correlation vs. causation
4. Statistical inference
   - Population vs. sample
   - Law of large numbers
   - Central limit theorem
   - Confidence intervals
   - Hypothesis testing and p-values

Random variables

Most of the outcomes or events we have considered so far have been non-numerical, e.g. either head or tail. If the outcome of an experiment is numerical, we call the variable that is determined by the experiment a random variable.

Random variables may be either discrete (e.g. the number of days the sun shines) or continuous (e.g. your salary after graduating from the MSc). In contrast to a continuous random variable, the distinct potential outcomes of a discrete random variable can be listed.

Notation: Random variables are usually denoted by capital letters, e.g. X. The corresponding realisations are denoted by small letters, e.g. x.

Should you make the bet?

Example III.1
I propose the following game. We toss a fair coin 10 times. If head appears 4 times or fewer, I pay you £2. If head appears more than 4 times, you pay me £1. Should you make the bet?

Let's try to formalise the problem. Let the random variables X1, X2, ..., X10 be defined such that

    Xi = 1 if head appears on the ith toss,
         0 if tail appears on the ith toss,      for i = 1, ..., 10.

Furthermore, let the random variable Y denote the number of heads. Clearly, Y = X1 + X2 + ... + X10. If the realisation of Y is greater than 4, I win.

Let P(Y = y) denote the probability that Y takes the value y. Accordingly, P(Y ≤ 4) is the probability that we obtain 4 or fewer heads, and P(Y > 4) is the probability that we obtain more than 4 heads.

When would you make the bet? Your expected payoff is

    E[V] = P(Y ≤ 4) · £2 + P(Y > 4) · (−£1),

where V is the money you get. If E[V] > 0 (and you are risk neutral), you'll choose to play.

Expected value
The expected value of a discrete random variable X is denoted by E[X] and given by

    E[X] = x1 P(X=x1) + x2 P(X=x2) + ··· + xk P(X=xk) = Σ_{i=1}^{k} xi P(X=xi),

where k is the number of distinct outcomes.

To solve the problem, we need to find P(Y ≤ 4) and P(Y > 4). From the additive law (Rule 4), we know that

    P(Y ≤ 4) = P(Y=0 ∪ Y=1 ∪ Y=2 ∪ Y=3 ∪ Y=4)
             = P(Y=0) + P(Y=1) + P(Y=2) + P(Y=3) + P(Y=4),

    P(Y > 4) = P(Y=5) + P(Y=6) + P(Y=7) + P(Y=8) + P(Y=9) + P(Y=10).

Hence, we need to find P(Y = y) for y = 0, ..., 10.
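Before deriving these probabilities analytically, it can help to check the order of magnitude by simulation. The following minimal Python sketch (an illustration added here, not part of the original slides; names such as simulate_bet are hypothetical) estimates P(Y ≤ 4) and E[V] by playing the game repeatedly:

```python
import random

def simulate_bet(n_games=100_000, seed=1):
    """Estimate P(Y <= 4) and the expected payoff E[V] by simulation."""
    rng = random.Random(seed)
    games_won = 0
    for _ in range(n_games):
        # Y = number of heads in 10 tosses of a fair coin
        y = sum(rng.random() < 0.5 for _ in range(10))
        if y <= 4:
            games_won += 1
    p_win = games_won / n_games                   # estimate of P(Y <= 4)
    expected_v = p_win * 2 + (1 - p_win) * (-1)   # E[V] in pounds
    return p_win, expected_v

print(simulate_bet())  # roughly (0.377, 0.13), i.e. E[V] > 0
```

The exact values behind these estimates are derived below.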
Discrete probability distribution

It is common to denote the probability distribution of a discrete random variable Y by f(y).

The probability distribution, or probability mass function, of a discrete random variable X associates with each of the distinct potential outcomes xi (i = 1, ..., k) a probability P(X = xi). That is, f(xi) = P(X = xi). The probabilities add up to 1, i.e.

    Σ_{i=1}^{k} f(xi) = 1.

Two examples:

Example III.2 (Discrete uniform distribution)
Let X be the result from rolling a fair die. The probability distribution is simply

    f(x) = P(X = x) = 1/6  for x ∈ {1, 2, ..., 6},
                      0    otherwise.

This probability distribution is an example of a discrete uniform distribution.

Bernoulli distribution
A random variable X is said to have a Bernoulli distribution with parameter p = P(X = 1) (i.e. the probability of success) if X can take only the values 1 (success) and 0 (failure). The probability distribution is given by

    f(x) = p      if x = 1,
           1 − p  if x = 0,
           0      otherwise.

Binomial coefficient & binomial distribution

Let's start with f(0) = P(Y=0), which is the probability of obtaining no heads. Using the multiplicative law,

    P(Y=0) = P(X1=0) P(X2=0) ··· P(X10=0) = (1/2)^10 = 0.00097656.

Now consider f(1) = P(Y=1). Since we are interested in the number of heads, we have to take into account that there is more than one combination that results in 1 head:

    P(Y=1) = P(X1=1) P(X2=0) ··· P(X10=0)
           + P(X1=0) P(X2=1) ··· P(X10=0)
           + ···
           + P(X1=0) P(X2=0) ··· P(X10=1)
           = 10 · (1/2)^10 = 0.00976563.
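Because the 10 tosses generate only 2^10 = 1024 equally likely sequences, these probabilities can also be verified by brute-force enumeration, mirroring the combination-listing argument above. A short sketch (again an added illustration, not from the slides):

```python
from itertools import product
from collections import Counter

# Enumerate all 2^10 = 1024 equally likely head (1) / tail (0) sequences
counts = Counter(sum(seq) for seq in product([0, 1], repeat=10))

f = {y: counts[y] / 2**10 for y in range(11)}  # f(y) = P(Y = y)
print(f[0])  # 0.0009765625 = (1/2)^10
print(f[1])  # 0.009765625  = 10 * (1/2)^10
```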
Now, f(2) = P(Y=2). How many combinations are there that yield 2 heads out of 10 tosses? Given that the first head occurs on the first toss, there are 9 possible positions for the second head:

    toss:          1 2 3 4 5 6 7 8 9 10
    combination 1: H H T T T T T T T T
    combination 2: H T H T T T T T T T
    combination 3: H T T H T T T T T T
    combination 4: H T T T H T T T T T
    combination 5: H T T T T H T T T T
    combination 6: H T T T T T H T T T
    combination 7: H T T T T T T H T T
    combination 8: H T T T T T T T H T
    combination 9: H T T T T T T T T H

Given that the first head occurs on the second toss instead, there are 8 new positions for the second head; given that it occurs on the third toss, 7; and so on. In total,

    9 + 8 + ··· + 1 = 45

combinations yield exactly 2 heads. This count is the binomial coefficient "10 choose 2", and it gives

    f(2) = P(Y=2) = 45 · (1/2)^10 = 0.04394531.
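With this counting rule in hand, every f(y) follows from the corresponding binomial coefficient, and the bet in Example III.1 can be settled. A closing sketch using math.comb from Python's standard library (an added check, not part of the original slides):

```python
from math import comb

# Binomial pmf with n = 10 tosses and success probability p = 1/2
f = {y: comb(10, y) * 0.5**10 for y in range(11)}

p_low = sum(f[y] for y in range(5))     # P(Y <= 4) = 386/1024, about 0.3770
p_high = 1 - p_low                      # P(Y > 4), about 0.6230
expected_v = p_low * 2 + p_high * (-1)  # E[V] in pounds

print(f[2])        # 0.0439453125 = 45 * (1/2)^10, matching the count above
print(expected_v)  # about 0.1309: positive, so a risk-neutral player takes the bet
```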