Multivariate Normal Distribution Form Normal Density Function (Multivariate)

Total Pages: 16

File Type: pdf, Size: 1020 KB

Introduction to Normal Distribution
Nathaniel E. Helwig
Assistant Professor of Psychology and Statistics
University of Minnesota (Twin Cities)
Updated 17-Jan-2017

Copyright © 2017 by Nathaniel E. Helwig

Outline of Notes
1) Univariate Normal: Distribution form; Standard normal; Probability calculations; Affine transformations; Parameter estimation
2) Bivariate Normal: Distribution form; Probability calculations; Affine transformations; Conditional distributions
3) Multivariate Normal: Distribution form; Probability calculations; Affine transformations; Conditional distributions; Parameter estimation
4) Sampling Distributions: Univariate case; Multivariate case

Univariate Normal

Normal Density Function (Univariate)
Given a variable x ∈ ℝ, the normal probability density function (pdf) is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\} \qquad (1)$$

where
µ ∈ ℝ is the mean
σ > 0 is the standard deviation (σ² is the variance)
e ≈ 2.71828 is the base of the natural logarithm
Write X ∼ N(µ, σ²) to denote that X follows a normal distribution.

Standard Normal Distribution
If X ∼ N(0, 1), then X follows a standard normal distribution:

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \qquad (2)$$

[Figure: the standard normal pdf plotted for x ∈ [−4, 4].]

Probabilities and Distribution Functions
Probabilities relate to the area under the pdf:

$$P(a \le X \le b) = \int_a^b f(x)\,dx = F(b) - F(a) \qquad (3)$$

where

$$F(x) = \int_{-\infty}^{x} f(u)\,du \qquad (4)$$

is the cumulative distribution function (cdf).
Note: F(x) = P(X ≤ x), so 0 ≤ F(x) ≤ 1.

Normal Probabilities
[Figure: areas under the normal curve: 34.1% between µ and 1σ on each side, 13.6% between 1σ and 2σ, 2.1% between 2σ and 3σ, and 0.1% beyond 3σ in each tail. From http://en.wikipedia.org/wiki/File:Standard_deviation_diagram.svg]

Normal Distribution Functions (Univariate)
[Figures: normal pdfs and cdfs for (µ, σ²) = (0, 0.2), (0, 1.0), (0, 5.0), and (−2, 0.5). From http://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg and http://en.wikipedia.org/wiki/File:Normal_Distribution_CDF.svg]
Note that the cdf has an elongated "S" shape, referred to as an ogive.
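As a quick numerical check of equations (3) and (4), the following sketch (mine, not part of the slides) computes P(a ≤ X ≤ b) as F(b) − F(a) with scipy.stats.norm; note that scipy parameterizes the normal by its standard deviation, not its variance.

import numpy as np
from scipy.stats import norm

# X ~ N(mu, sigma^2); scipy's scale argument is the standard deviation
mu, sigma = 0.0, 1.0
X = norm(loc=mu, scale=sigma)

# Equation (3): P(a <= X <= b) = F(b) - F(a)
a, b = -1.0, 1.0
print(X.cdf(b) - X.cdf(a))  # about 0.6827, the 34.1% + 34.1% region of the figure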
Affine Transformations of Normal (Univariate)
Suppose that X ∼ N(µ, σ²) and a, b ∈ ℝ with a ≠ 0. If we define Y = aX + b, then Y ∼ N(aµ + b, a²σ²).
Suppose that X ∼ N(1, 2). Determine the distributions of:
Y = X + 3 ⟹ Y ∼ N(1(1) + 3, 1²(2)) ≡ N(4, 2)
Y = 2X + 3 ⟹ Y ∼ N(2(1) + 3, 2²(2)) ≡ N(5, 8)
Y = 3X + 2 ⟹ Y ∼ N(3(1) + 2, 3²(2)) ≡ N(5, 18)

Likelihood Function
Suppose that x = (x₁, …, xₙ) is an iid sample of data from a normal distribution with mean µ and variance σ², i.e., xᵢ ∼ N(µ, σ²). The likelihood function for the parameters (given the data) has the form

$$L(\mu, \sigma^2 \mid x) = \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x_i - \mu)^2}{2\sigma^2} \right\}$$

and the log-likelihood function is given by

$$LL(\mu, \sigma^2 \mid x) = \sum_{i=1}^{n} \log f(x_i) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2$$

Maximum Likelihood Estimate of the Mean
The MLE of the mean is the value of µ that minimizes

$$\sum_{i=1}^{n} (x_i - \mu)^2 = \sum_{i=1}^{n} x_i^2 - 2n\bar{x}\mu + n\mu^2$$

where x̄ = (1/n) Σᵢ xᵢ is the sample mean. Taking the derivative with respect to µ we find that

$$\frac{\partial}{\partial \mu} \sum_{i=1}^{n} (x_i - \mu)^2 = -2n\bar{x} + 2n\mu \;\Rightarrow\; \hat{\mu} = \bar{x}$$

i.e., the sample mean x̄ is the MLE of the population mean µ.

Maximum Likelihood Estimate of the Variance
The MLE of the variance is the value of σ² that minimizes

$$\frac{n}{2}\log(\sigma^2) + \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \hat{\mu})^2 = \frac{n}{2}\log(\sigma^2) + \frac{\sum_{i=1}^{n} x_i^2}{2\sigma^2} - \frac{n\bar{x}^2}{2\sigma^2}$$

where x̄ = (1/n) Σᵢ xᵢ is the sample mean. Taking the derivative with respect to σ² we find that

$$\frac{\partial}{\partial \sigma^2}\left[ \frac{n}{2}\log(\sigma^2) + \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \hat{\mu})^2 \right] = \frac{n}{2\sigma^2} - \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \hat{\mu})^2$$

which implies that the sample variance σ̂² = (1/n) Σᵢ (xᵢ − x̄)² is the MLE of the population variance σ².
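Both MLE formulas and the affine-transformation rule are easy to check by simulation; a minimal sketch (mine, not the slides'), where the 1/n divisor of the variance MLE corresponds to numpy's default ddof=0:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=np.sqrt(2.0), size=100_000)  # X ~ N(1, 2)

mu_hat = x.mean()                       # MLE of mu: the sample mean
sigma2_hat = np.mean((x - mu_hat)**2)   # MLE of sigma^2: 1/n divisor, not 1/(n-1)
print(mu_hat, sigma2_hat)               # close to 1 and 2

# Affine transformation check: Y = 2X + 3 should be N(5, 8)
y = 2 * x + 3
print(y.mean(), y.var())                # close to 5 and 8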
Bivariate Normal

Normal Density Function (Bivariate)
Given two variables x, y ∈ ℝ, the bivariate normal pdf is

$$f(x, y) = \frac{\exp\left\{ -\frac{1}{2(1-\rho^2)}\left[ \frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} \right] \right\}}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \qquad (5)$$

where
µₓ ∈ ℝ and µᵧ ∈ ℝ are the marginal means
σₓ ∈ ℝ⁺ and σᵧ ∈ ℝ⁺ are the marginal standard deviations
0 ≤ |ρ| < 1 is the correlation coefficient
X and Y are marginally normal: X ∼ N(µₓ, σₓ²) and Y ∼ N(µᵧ, σᵧ²)

Example: µₓ = µᵧ = 0, σₓ² = 1, σᵧ² = 2, ρ = 0.6/√2
[Figure: surface and contour plots of f(x, y). From http://en.wikipedia.org/wiki/File:MultivariateNormal.png]

Example: Different Means
[Figure: density plots of f(x, y) for (µₓ, µᵧ) = (0, 0), (1, 2), and (−1, −1); for all three plots σₓ² = 1, σᵧ² = 2, and ρ = 0.6/√2.]

Example: Different Correlations
[Figure: density plots of f(x, y) for ρ = −0.6/√2, ρ = 0, and ρ = 1.2/√2; for all three plots µₓ = µᵧ = 0, σₓ² = 1, and σᵧ² = 2.]
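To connect equation (5) with the plotted examples, here is a sketch (not from the slides) that evaluates the bivariate density both directly and through scipy's covariance-matrix parameterization, using the first example's parameter values:

import numpy as np
from scipy.stats import multivariate_normal

mu_x, mu_y = 0.0, 0.0
sigma_x, sigma_y = 1.0, np.sqrt(2.0)   # sigma_x^2 = 1, sigma_y^2 = 2
rho = 0.6 / np.sqrt(2.0)

def bvn_pdf(x, y):
    # Equation (5), written out term by term
    zx, zy = (x - mu_x) / sigma_x, (y - mu_y) / sigma_y
    q = (zx**2 + zy**2 - 2 * rho * zx * zy) / (2 * (1 - rho**2))
    return np.exp(-q) / (2 * np.pi * sigma_x * sigma_y * np.sqrt(1 - rho**2))

# The same density via mean vector and covariance matrix
cov = [[sigma_x**2, rho * sigma_x * sigma_y],
       [rho * sigma_x * sigma_y, sigma_y**2]]
mvn = multivariate_normal(mean=[mu_x, mu_y], cov=cov)
print(bvn_pdf(0.5, -0.5), mvn.pdf([0.5, -0.5]))  # the two values agree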
Recommended publications
• Lecture 22: Bivariate Normal Distribution
Lecture 22: Bivariate Normal Distribution. Statistics 104, Colin Rundel, April 11, 2012.

6.5 Conditional Distributions

General Bivariate Normal
Let Z₁, Z₂ ∼ N(0, 1), which we will use to build a general bivariate normal distribution:

$$f(z_1, z_2) = \frac{1}{2\pi} \exp\left\{ -\frac{1}{2}(z_1^2 + z_2^2) \right\}$$

We want to transform these unit normal distributions to have the following arbitrary parameters: µ_X, µ_Y, σ_X, σ_Y, ρ:

$$X = \sigma_X Z_1 + \mu_X, \qquad Y = \sigma_Y\left[ \rho Z_1 + \sqrt{1 - \rho^2}\, Z_2 \right] + \mu_Y$$

General Bivariate Normal: Marginals
First, let's examine the marginal distributions of X and Y:

$$X = \sigma_X Z_1 + \mu_X = \sigma_X N(0, 1) + \mu_X = N(\mu_X, \sigma_X^2)$$

$$Y = \sigma_Y\left[ \rho Z_1 + \sqrt{1 - \rho^2}\, Z_2 \right] + \mu_Y = \sigma_Y\left[ \rho N(0, 1) + \sqrt{1 - \rho^2}\, N(0, 1) \right] + \mu_Y = \sigma_Y\left[ N(0, \rho^2) + N(0, 1 - \rho^2) \right] + \mu_Y = \sigma_Y N(0, 1) + \mu_Y = N(\mu_Y, \sigma_Y^2)$$

General Bivariate Normal: Cov/Corr
Second, we can find Cov(X, Y) and ρ(X, Y):

$$\begin{aligned} \mathrm{Cov}(X, Y) &= E[(X - E(X))(Y - E(Y))] \\ &= E\left[ (\sigma_X Z_1 + \mu_X - \mu_X)\left( \sigma_Y\left[ \rho Z_1 + \sqrt{1 - \rho^2}\, Z_2 \right] + \mu_Y - \mu_Y \right) \right] \\ &= E\left[ (\sigma_X Z_1)\left( \sigma_Y\left[ \rho Z_1 + \sqrt{1 - \rho^2}\, Z_2 \right] \right) \right] \\ &= \sigma_X \sigma_Y E\left[ \rho Z_1^2 + \sqrt{1 - \rho^2}\, Z_1 Z_2 \right] \\ &= \sigma_X \sigma_Y \rho E[Z_1^2] = \sigma_X \sigma_Y \rho \end{aligned}$$

$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \rho$$

General Bivariate Normal: RNG
Consequently, if we want to generate a Bivariate Normal random variable [...]

Multivariate Change of Variables
Let X₁, …, Xₙ have a continuous joint distribution with pdf f defined on S.
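The transformation in this excerpt doubles as a random number generator; a minimal sketch (mine) that draws correlated bivariate normal samples from two independent standard normals, with arbitrary illustrative parameter values:

import numpy as np

rng = np.random.default_rng(42)
mu_X, mu_Y, sigma_X, sigma_Y, rho = 1.0, -2.0, 2.0, 0.5, 0.7

z1 = rng.standard_normal(100_000)
z2 = rng.standard_normal(100_000)

# The transformation from the excerpt
X = sigma_X * z1 + mu_X
Y = sigma_Y * (rho * z1 + np.sqrt(1 - rho**2) * z2) + mu_Y

print(X.mean(), Y.mean())        # close to mu_X and mu_Y
print(np.corrcoef(X, Y)[0, 1])   # close to rho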
  • Efficient Estimation of Parameters of the Negative Binomial Distribution
Efficient Estimation of Parameters of the Negative Binomial Distribution. V. Savani and A. A. Zhigljavsky, Department of Mathematics, Cardiff University, Cardiff, CF24 4AG, U.K. e-mail: SavaniV@cardiff.ac.uk, ZhigljavskyAA@cardiff.ac.uk (corresponding author). Abstract: In this paper we investigate a class of moment based estimators, called power method estimators, which can be almost as efficient as maximum likelihood estimators and achieve a lower asymptotic variance than the standard zero term method and method of moments estimators. We investigate different methods of implementing the power method in practice and examine the robustness and efficiency of the power method estimators. Key Words: Negative binomial distribution; estimating parameters; maximum likelihood method; efficiency of estimators; method of moments.

1. The Negative Binomial Distribution
1.1. Introduction
The negative binomial distribution (NBD) has appeal in the modelling of many practical applications. A large amount of literature exists, for example, on using the NBD to model: animal populations (see e.g. Anscombe (1949), Kendall (1948a)); accident proneness (see e.g. Greenwood and Yule (1920), Arbous and Kerrich (1951)) and consumer buying behaviour (see e.g. Ehrenberg (1988)). The appeal of the NBD lies in the fact that it is a simple two parameter distribution that arises in various different ways (see e.g. Anscombe (1950), Johnson, Kotz, and Kemp (1992), Chapter 5) often allowing the parameters to have a natural interpretation (see Section 1.2). Furthermore, the NBD can be implemented as a distribution within stationary processes (see e.g. Anscombe (1950), Kendall (1948b)) thereby increasing the modelling potential of the distribution.
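The excerpt does not reproduce the paper's power method estimators, but the method of moments it is compared against is standard; a sketch (mine) under the common (mean m, shape k) parameterization with Var(X) = m + m²/k, which requires overdispersion:

import numpy as np
from scipy.stats import nbinom

def nbd_method_of_moments(x):
    # Moment estimates of (m, k), where Var(X) = m + m^2 / k
    x = np.asarray(x)
    m, s2 = x.mean(), x.var()
    if s2 <= m:
        raise ValueError("sample is not overdispersed; NBD fit is degenerate")
    return m, m**2 / (s2 - m)

# scipy uses (n, p), related by m = n(1 - p)/p and k = n
sample = nbinom.rvs(n=2.0, p=0.4, size=50_000, random_state=0)
print(nbd_method_of_moments(sample))  # mean near 3.0, shape near 2.0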
• Applied Biostatistics: Mean and Standard Deviation
Health Sciences M.Sc. Programme, Applied Biostatistics: Mean and Standard Deviation. The mean. The median is not the only measure of central value for a distribution. Another is the arithmetic mean or average, usually referred to simply as the mean. This is found by taking the sum of the observations and dividing by their number. The mean is often denoted by a little bar over the symbol for the variable, e.g. x̄. The sample mean has much nicer mathematical properties than the median and is thus more useful for the comparison methods described later. The median is a very useful descriptive statistic, but not much used for other purposes. Median, mean and skewness. The sum of the 57 FEV1s is 231.51 and hence the mean is 231.51/57 = 4.06. This is very close to the median, 4.1, so the median is within 1% of the mean. This is not so for the triglyceride data. The median triglyceride is 0.46 but the mean is 0.51, which is higher. The median is 10% away from the mean. If the distribution is symmetrical the sample mean and median will be about the same, but in a skew distribution they will not. If the distribution is skew to the right, as for serum triglyceride, the mean will be greater; if it is skew to the left, the median will be greater. This is because the values in the tails affect the mean but not the median. Figure 1 shows the positions of the mean and median on the histogram of triglyceride.
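The excerpt's arithmetic and its mean-versus-median point are easy to reproduce; a small sketch (mine; the lognormal sample is only a stand-in for a right-skewed variable like triglyceride):

import numpy as np

# The excerpt's calculation: mean FEV1 = sum / n
print(231.51 / 57)  # 4.0616..., i.e. 4.06 as reported

# Mean vs. median on a right-skewed sample: the mean is pulled upward
rng = np.random.default_rng(1)
skewed = rng.lognormal(mean=-0.7, sigma=0.5, size=10_000)
print(skewed.mean(), np.median(skewed))  # mean exceeds median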
  • 1. How Different Is the T Distribution from the Normal?
Statistics 101–106, Lecture 7 (20 October 98), © David Pollard. Read M&M §7.1 and §7.2, ignoring starred parts. Reread M&M §3.2. The effects of estimated variances on normal approximations. t-distributions. Comparison of two means: pooling of estimates of variances, or paired observations. In Lecture 6, when discussing comparison of two Binomial proportions, I was content to estimate unknown variances when calculating statistics that were to be treated as approximately normally distributed. You might have worried about the effect of variability of the estimate. W. S. Gosset ("Student") considered a similar problem in a very famous 1908 paper, where the role of Student's t-distribution was first recognized. Gosset discovered that the effect of estimated variances could be described exactly in a simplified problem where n independent observations X₁, …, Xₙ are taken from a normal distribution, N(µ, σ). The sample mean, X̄ = (X₁ + … + Xₙ)/n, has a N(µ, σ/√n) distribution. The random variable

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

has a standard normal distribution. If we estimate σ² by the sample variance, s² = Σᵢ(Xᵢ − X̄)²/(n − 1), then the resulting statistic,

$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

no longer has a normal distribution. It has a t-distribution on n − 1 degrees of freedom. Remark. I have written T, instead of the t used by M&M page 505. I find it causes confusion that t refers to both the name of the statistic and the name of its distribution. As you will soon see, the estimation of the variance has the effect of spreading out the distribution a little beyond what it would be if σ were used.
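A simulation sketch (mine, not Pollard's) of that spreading-out effect: with n = 10, the statistic T lands beyond ±2 noticeably more often than Z does:

import numpy as np
from scipy.stats import norm, t

n, reps = 10, 100_000
rng = np.random.default_rng(7)
x = rng.standard_normal((reps, n))      # mu = 0, sigma = 1

xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)               # sample SD with the n-1 divisor

Z = xbar / (1 / np.sqrt(n))             # known sigma: standard normal
T = xbar / (s / np.sqrt(n))             # estimated sigma: t on n-1 df

print((np.abs(Z) > 2).mean(), 2 * norm.sf(2))         # about 0.046
print((np.abs(T) > 2).mean(), 2 * t.sf(2, df=n - 1))  # about 0.077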
  • 1 One Parameter Exponential Families
1 One parameter exponential families

The world of exponential families bridges the gap between the Gaussian family and general distributions. Many properties of Gaussians carry through to exponential families in a fairly precise sense.
• In the Gaussian world, there are exact small sample distributional results (i.e. t, F, χ²).
• In the exponential family world, there are approximate distributional results (i.e. deviance tests).
• In the general setting, we can only appeal to asymptotics.
A one-parameter exponential family, F, is a one-parameter family of distributions of the form

$$P_\eta(dx) = \exp\left( \eta \cdot t(x) - \Lambda(\eta) \right) P_0(dx)$$

for some probability measure P₀. The parameter η is called the natural or canonical parameter and the function Λ is called the cumulant generating function, and is simply the normalization needed to make

$$f_\eta(x) = \frac{dP_\eta}{dP_0}(x) = \exp\left( \eta \cdot t(x) - \Lambda(\eta) \right)$$

a proper probability density. The random variable t(X) is the sufficient statistic of the exponential family. Note that P₀ does not have to be a distribution on ℝ, but these are of course the simplest examples.

1.0.1 A first example: Gaussian with linear sufficient statistic
Consider the standard normal distribution

$$P_0(A) = \int_A \frac{e^{-z^2/2}}{\sqrt{2\pi}}\, dz$$

and let t(x) = x. Then, the exponential family is

$$P_\eta(dx) \propto \frac{e^{\eta \cdot x - x^2/2}}{\sqrt{2\pi}}$$

and we see that Λ(η) = η²/2.

import numpy as np
import matplotlib.pyplot as plt

eta = np.linspace(-2, 2, 101)
CGF = eta**2 / 2.               # Lambda(eta) = eta^2 / 2 for this family
plt.plot(eta, CGF)
A = plt.gca()
A.set_xlabel(r'$\eta$', size=20)
A.set_ylabel(r'$\Lambda(\eta)$', size=20)
f = plt.gcf()

Thus, the exponential family in this setting is the collection

$$\mathcal{F} = \{ N(\eta, 1) : \eta \in \mathbb{R} \}$$

1.0.2 Normal with quadratic sufficient statistic on ℝᵈ
As a second example, take P₀ = N(0, I_{d×d}), i.e.
  • On the Meaning and Use of Kurtosis
Psychological Methods, 1997, Vol. 2, No. 3, 292–307. Copyright 1997 by the American Psychological Association, Inc.

On the Meaning and Use of Kurtosis. Lawrence T. DeCarlo, Fordham University. For symmetric unimodal distributions, positive kurtosis indicates heavy tails and peakedness relative to the normal distribution, whereas negative kurtosis indicates light tails and flatness. Many textbooks, however, describe or illustrate kurtosis incompletely or incorrectly. In this article, kurtosis is illustrated with well-known distributions, and aspects of its interpretation and misinterpretation are discussed. The role of kurtosis in testing univariate and multivariate normality; as a measure of departures from normality; in issues of robustness, outliers, and bimodality; in generalized tests and estimators; as well as limitations of and alternatives to the kurtosis measure β₂, are discussed. It is typically noted in introductory statistics courses that distributions can be characterized in terms of central tendency, variability, and shape. With respect to shape, virtually every textbook defines and illustrates skewness. On the other hand, another aspect of shape, which is kurtosis, is either not discussed or, worse yet, is often described or illustrated incorrectly. Kurtosis is also frequently not reported in research articles, in spite of the fact that virtually every statistical package provides a measure of kurtosis. [...] The normal distribution has a kurtosis of 3, and β₂ − 3 is often used so that the reference normal distribution has a kurtosis of zero (β₂ − 3 is sometimes denoted as γ₂). A sample counterpart to β₂ can be obtained by replacing the population moments with the sample moments, which gives

$$b_2 = \frac{\sum (X_i - \bar{X})^4 / n}{\left( \sum (X_i - \bar{X})^2 / n \right)^2}$$
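A sketch (mine) of the sample kurtosis b₂ from the excerpt's formula, checked against scipy, which reports the excess kurtosis b₂ − 3 by default:

import numpy as np
from scipy.stats import kurtosis

def b2(x):
    # Sample kurtosis: fourth central moment over squared second central moment
    d = np.asarray(x) - np.mean(x)
    return np.mean(d**4) / np.mean(d**2)**2

rng = np.random.default_rng(3)
x = rng.standard_normal(100_000)
print(b2(x))                          # close to 3 for normal data
print(kurtosis(x, fisher=True) + 3)   # scipy returns b2 - 3, so add 3 back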
  • Approaching Mean-Variance Efficiency for Large Portfolios
Approaching Mean-Variance Efficiency for Large Portfolios. Mengmeng Ao (Wang Yanan Institute for Studies in Economics & Department of Finance, School of Economics, Xiamen University, China; [email protected]), Yingying Li (Hong Kong University of Science and Technology, HKSAR; [email protected]), and Xinghua Zheng (Hong Kong University of Science and Technology, HKSAR; [email protected]). First Draft: October 6, 2014. This Draft: November 11, 2017. Research partially supported by the RGC grants GRF16305315, GRF 16502014 and GRF 16518716 of the HKSAR, and The Fundamental Research Funds for the Central Universities 20720171073.

Abstract: This paper studies the large dimensional Markowitz optimization problem. Given any risk constraint level, we introduce a new approach for estimating the optimal portfolio, which is developed through a novel unconstrained regression representation of the mean-variance optimization problem, combined with high-dimensional sparse regression methods. Our estimated portfolio, under a mild sparsity assumption, asymptotically achieves mean-variance efficiency and meanwhile effectively controls the risk. To the best of our knowledge, this is the first time that these two goals can be simultaneously achieved for large portfolios. The superior properties of our approach are demonstrated via comprehensive simulation and empirical studies. Keywords: Markowitz optimization; Large portfolio selection; Unconstrained regression; LASSO; Sharpe ratio.

1 INTRODUCTION
1.1 Markowitz Optimization Enigma
The groundbreaking mean-variance portfolio theory proposed by Markowitz (1952) continues to play significant roles in research and practice. The optimal mean-variance portfolio has a simple explicit expression that only depends on two population characteristics, the mean and the covariance matrix of asset returns. Under the ideal situation when the underlying mean and covariance matrix are known, mean-variance investors can easily compute the optimal portfolio weights based on their preferred level of risk.
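The excerpt stops before the paper's sparse-regression estimator; purely for reference, here is a sketch (mine) of the classical explicit expression the abstract alludes to, with weights proportional to the inverse covariance matrix times the mean vector, assuming both are known (the numbers are illustrative):

import numpy as np

def tangency_weights(mu, Sigma):
    # Classical mean-variance solution: w proportional to Sigma^{-1} mu,
    # normalized here to be fully invested (weights sum to one)
    w = np.linalg.solve(Sigma, mu)
    return w / w.sum()

mu = np.array([0.05, 0.07, 0.06])
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.06]])
print(tangency_weights(mu, Sigma))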
  • Characteristics and Statistics of Digital Remote Sensing Imagery (1)
Characteristics and statistics of digital remote sensing imagery (1)

Digital Image
• With raster data structure, each image is treated as an array of values of the pixels.
• Image data is organized as rows and columns (or lines and pixels), starting from the upper left corner of the image.
• Each pixel (picture element) is treated as a separate unit.

Statistics of Digital Images Help:
• Look at the frequency of occurrence of individual brightness values in the image displayed;
• View individual pixel brightness values at specific locations or within a geographic area;
• Compute univariate descriptive statistics to determine if there are unusual anomalies in the image data; and
• Compute multivariate statistics to determine the amount of between-band correlation (e.g., to identify redundancy).

Statistics of Digital Images
It is necessary to calculate fundamental univariate and multivariate statistics of the multispectral remote sensor data. This involves identification and calculation of:
– maximum and minimum value
– the range, mean, standard deviation
– between-band variance-covariance matrix
– correlation matrix, and
– frequencies of brightness values
The results of the above can be used to produce histograms. Such statistics provide information necessary for processing and analyzing remote sensing data. A "population" is an infinite or finite set of elements. A "sample" is a subset of the elements taken from a population used to make inferences about certain characteristics of the population (e.g., training signatures). Large samples drawn randomly from natural populations usually produce a symmetrical frequency distribution. Most values are clustered around the central value, and the frequency of occurrence declines away from this central point.
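A sketch of the univariate and between-band statistics the excerpt lists, run on a synthetic 3-band array standing in for real imagery (the data here are random and purely illustrative):

import numpy as np

rng = np.random.default_rng(5)
image = rng.integers(0, 256, size=(100, 100, 3)).astype(float)  # rows x cols x bands

pixels = image.reshape(-1, image.shape[2])  # one row per pixel, one column per band

# Univariate statistics per band: min, max, range, mean, standard deviation
print(pixels.min(axis=0), pixels.max(axis=0))
print(np.ptp(pixels, axis=0), pixels.mean(axis=0), pixels.std(axis=0))

# Multivariate statistics: between-band variance-covariance and correlation matrices
print(np.cov(pixels, rowvar=False))
print(np.corrcoef(pixels, rowvar=False))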
  • Chapter 5 Sections
Chapter 5 sections

Discrete univariate distributions:
5.2 Bernoulli and Binomial distributions
5.3 Hypergeometric distributions (just skim)
5.4 Poisson distributions
5.5 Negative Binomial distributions (just skim)
Continuous univariate distributions:
5.6 Normal distributions
5.7 Gamma distributions
5.8 Beta distributions (just skim)
Multivariate distributions:
5.9 Multinomial distributions (just skim)
5.10 Bivariate normal distributions

5.1 Introduction: Families of distributions
How: Parameter and parameter space; pf/pdf and cdf, with new notation f(x | parameters); mean, variance and the m.g.f. ψ(t); features, connections to other distributions, approximation.
Why: Reasoning behind a distribution; natural justification for certain experiments; a model for the uncertainty in an experiment. "All models are wrong, but some are useful" – George Box

5.2 Bernoulli and Binomial distributions
Def: Bernoulli distributions – Bernoulli(p). A r.v. X has the Bernoulli distribution with parameter p if P(X = 1) = p and P(X = 0) = 1 − p. The pf of X is

$$f(x \mid p) = \begin{cases} p^x (1-p)^{1-x} & \text{for } x = 0, 1 \\ 0 & \text{otherwise} \end{cases}$$

Parameter space: p ∈ [0, 1]. In an experiment with only two possible outcomes, "success" and "failure", let X = number of successes. Then X ∼ Bernoulli(p) where p is the probability of success. E(X) = p, Var(X) = p(1 − p) and ψ(t) = E(e^{tX}) = pe^t + (1 − p). The cdf is

$$F(x \mid p) = \begin{cases} 0 & \text{for } x < 0 \\ 1 - p & \text{for } 0 \le x < 1 \\ 1 & \text{for } x \ge 1 \end{cases}$$

Def: Binomial distributions – Binomial(n, p). A r.v.
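The Bernoulli facts in the excerpt (pf, mean, variance, m.g.f.) can be verified directly; a sketch using scipy.stats, with p = 0.3 as an arbitrary example value:

import numpy as np
from scipy.stats import bernoulli, binom

p = 0.3
print(bernoulli.pmf([0, 1], p))             # [1-p, p]
print(bernoulli.mean(p), bernoulli.var(p))  # p and p(1-p)

# m.g.f. psi(t) = p e^t + (1 - p), checked by simulation at t = 0.5
t = 0.5
draws = bernoulli.rvs(p, size=200_000, random_state=0)
print(np.exp(t * draws).mean(), p * np.exp(t) + (1 - p))

# Binomial(n, p) counts successes in n Bernoulli trials
print(binom.pmf(2, n=5, p=p))               # P(X = 2) for X ~ Binomial(5, 0.3)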
  • Basic Econometrics / Statistics Statistical Distributions: Normal, T, Chi-Sq, & F
Basic Econometrics / Statistics. Statistical Distributions: Normal, T, Chi-Sq, & F. Course: Basic Econometrics: HC43 / Statistics. B.A. Hons Economics, Semester IV / Semester III, Delhi University. Course Instructor: Siddharth Rathore, Assistant Professor, Economics Department, Gargi College.

APPENDIX C: SOME IMPORTANT PROBABILITY DISTRIBUTIONS
In Appendix B we noted that a random variable (r.v.) can be described by a few characteristics, or moments, of its probability function (PDF or PMF), such as the expected value and variance. This, however, presumes that we know the PDF of that r.v., which is a tall order since there are all kinds of random variables. In practice, however, some random variables occur so frequently that statisticians have determined their PDFs and documented their properties. For our purpose, we will consider only those PDFs that are of direct interest to us. But keep in mind that there are several other PDFs that statisticians have studied which can be found in any standard statistics textbook. In this appendix we will discuss the following four probability distributions:
1. The normal distribution
2. The t distribution
3. The chi-square (χ²) distribution
4. The F distribution
These probability distributions are important in their own right, but for our purposes they are especially important because they help us to find out the probability distributions of estimators (or statistics), such as the sample mean and sample variance. Recall that estimators are random variables. Equipped with that knowledge, we will be able to draw inferences about their true population values.
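A sketch (mine) pulling critical values and tail probabilities for the four distributions the appendix lists, via scipy.stats; the degrees of freedom are arbitrary examples:

from scipy.stats import norm, t, chi2, f

# Upper 5% critical values
print(norm.ppf(0.95))              # standard normal
print(t.ppf(0.95, df=20))          # t with 20 degrees of freedom
print(chi2.ppf(0.95, df=10))       # chi-square with 10 df
print(f.ppf(0.95, dfn=3, dfd=30))  # F with (3, 30) df

# And the reverse direction: an upper-tail probability for t(20)
print(t.sf(2.086, df=20))          # about 0.025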
• Univariate and Multivariate Skewness and Kurtosis
Univariate and Multivariate Skewness and Kurtosis for Measuring Nonnormality: Prevalence, Influence and Estimation. Meghan K. Cain, Zhiyong Zhang, and Ke-Hai Yuan, University of Notre Dame.

Author Note: This research is supported by a grant from the U.S. Department of Education (R305D140037). However, the contents of the paper do not necessarily represent the policy of the Department of Education, and you should not assume endorsement by the Federal Government. Correspondence concerning this article can be addressed to Meghan Cain ([email protected]), Ke-Hai Yuan ([email protected]), or Zhiyong Zhang ([email protected]), Department of Psychology, University of Notre Dame, 118 Haggar Hall, Notre Dame, IN 46556.

Abstract: Nonnormality of univariate data has been extensively examined previously (Blanca et al., 2013; Micceri, 1989). However, less is known of the potential nonnormality of multivariate data although multivariate analysis is commonly used in psychological and educational research. Using univariate and multivariate skewness and kurtosis as measures of nonnormality, this study examined 1,567 univariate distributions and 254 multivariate distributions collected from authors of articles published in Psychological Science and the American Education Research Journal. We found that 74% of univariate distributions and 68% of multivariate distributions deviated from normal distributions. In a simulation study using typical values of skewness and kurtosis that we collected, we found that the resulting type I error rates were 17% in a t-test and 30% in a factor analysis under some conditions. Hence, we argue that it is time to routinely report skewness and kurtosis along with other summary statistics such as means and variances.
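Following the abstract's recommendation to report skewness and kurtosis routinely, a sketch using scipy.stats on a deliberately nonnormal sample (the exponential example is my choice):

import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(11)
data = rng.exponential(scale=2.0, size=5_000)

# Report skewness and kurtosis alongside the usual mean and variance
print(data.mean(), data.var(ddof=1))
print(skew(data))      # population skewness of an exponential is 2
print(kurtosis(data))  # excess kurtosis; exponential population value is 6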
  • A Family of Skew-Normal Distributions for Modeling Proportions and Rates with Zeros/Ones Excess
Symmetry (journal article): A Family of Skew-Normal Distributions for Modeling Proportions and Rates with Zeros/Ones Excess. Guillermo Martínez-Flórez (Departamento de Matemáticas y Estadística, Facultad de Ciencias Básicas, Universidad de Córdoba, Montería 14014, Colombia; [email protected]), Víctor Leiva (Escuela de Ingeniería Industrial, Pontificia Universidad Católica de Valparaíso, 2362807 Valparaíso, Chile; correspondence: [email protected] or [email protected]), Emilio Gómez-Déniz (Facultad de Economía, Empresa y Turismo, Universidad de Las Palmas de Gran Canaria and TIDES Institute, 35001 Canarias, Spain; [email protected]), and Carolina Marchant (Facultad de Ciencias Básicas, Universidad Católica del Maule, 3466706 Talca, Chile; [email protected]). Received: 30 June 2020; Accepted: 19 August 2020; Published: 1 September 2020.

Abstract: In this paper, we consider skew-normal distributions for constructing a new distribution which allows us to model proportions and rates with zero/one inflation as an alternative to the inflated beta distributions. The new distribution is a mixture between a Bernoulli distribution for explaining the zero/one excess and a censored skew-normal distribution for the continuous variable. The maximum likelihood method is used for parameter estimation. Observed and expected Fisher information matrices are derived to conduct likelihood-based inference in this new type of skew-normal distribution. Given the flexibility of the new distributions, we are able to show, in real data scenarios, the good performance of our proposal. Keywords: beta distribution; centered skew-normal distribution; maximum-likelihood methods; Monte Carlo simulations; proportions; R software; rates; zero/one inflated data.
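The excerpt describes a mixture of a Bernoulli component for the zero/one excess and a censored skew-normal for the continuous part; a rough sampler sketch of that idea (mine; it clips rather than censors, and every parameter value is illustrative):

import numpy as np
from scipy.stats import skewnorm

rng = np.random.default_rng(13)

def sample_zero_one_inflated(n, p0=0.10, p1=0.05, a=4.0, loc=0.3, scale=0.2):
    # Point masses at 0 and 1 plus a skew-normal body kept inside (0, 1)
    u = rng.random(n)
    body = skewnorm.rvs(a, loc=loc, scale=scale, size=n, random_state=rng)
    body = np.clip(body, 1e-6, 1 - 1e-6)
    return np.where(u < p0, 0.0, np.where(u < p0 + p1, 1.0, body))

x = sample_zero_one_inflated(10_000)
print((x == 0).mean(), (x == 1).mean(), x.mean())  # inflation rates near 0.10 and 0.05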