Multivariate Normal Distribution Edps/Soc 584, Psych 594
Carolyn J. Anderson
Department of Educational Psychology, University of Illinois at Urbana-Champaign
© Board of Trustees, University of Illinois
Spring 2017

Motivation · Intro. to Multivariate Normal · Bivariate Normal · More Properties · Estimation · CLT · Others
Outline
◮ Motivation
◮ The multivariate normal distribution
◮ The bivariate normal distribution
◮ More properties of the multivariate normal
◮ Estimation of µ and Σ
◮ Central Limit Theorem
Reading: Johnson & Wichern pages 149–176
C.J. Anderson (Illinois), Multivariate Normal Distribution, Spring 2015
Motivation

◮ To be able to make inferences about populations, we need a model for the distribution of random variables. We'll use the multivariate normal distribution, because...
◮ It's often a good population model: it's a reasonably good approximation of many phenomena, and a lot of variables are approximately normal (due to the central limit theorem for sums and averages).
◮ The sampling distributions of (test) statistics are often approximately multivariate or univariate normal, due to the central limit theorem.
◮ Due to its central importance, we need to thoroughly understand it and know its properties.
Introduction to the Multivariate Normal
◮ The probability density function of the univariate normal distribution (p = 1 variable):

f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right], \qquad -\infty < x < \infty

◮ The parameters that completely characterize the distribution:
◮ µ = E(X) = mean
◮ σ² = var(X) = variance
Introduction to the Multivariate Normal (continued)

Area corresponds to probability: 68% of the area lies between µ − σ and µ + σ, and 95% between µ − 1.96σ and µ + 1.96σ.
Generalization to Multivariate Normal
The exponent of the univariate normal density,

\left(\frac{x-\mu}{\sigma}\right)^2 = (x-\mu)(\sigma^2)^{-1}(x-\mu),

is a squared statistical distance between x and µ in standard deviation units.

Generalization to p > 1 variables:
◮ We have x_{p×1} and parameters µ_{p×1} and Σ_{p×p}.
◮ The exponent term for the multivariate normal is

(x − µ)′Σ⁻¹(x − µ), where −∞ < x_i < ∞ for i = 1, ..., p.

◮ This is a scalar and reduces to the expression above for p = 1.
◮ It is a squared statistical distance from x to µ (if Σ⁻¹ exists). It takes into consideration both variability and covariability.
◮ Integrating over all values of x,

\int_{x_1}\cdots\int_{x_p} \exp\left[-\frac{1}{2}(\mathbf{x}-\boldsymbol\mu)'\Sigma^{-1}(\mathbf{x}-\boldsymbol\mu)\right]dx_1\cdots dx_p = (2\pi)^{p/2}|\Sigma|^{1/2}
Proper Distribution
Since the total probability over all possible values must equal 1, we divide by (2π)^{p/2}|Σ|^{1/2} to get a "proper" density function.

Multivariate normal density function:

f(\mathbf{x}) = \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \exp\left[-\frac{1}{2}(\mathbf{x}-\boldsymbol\mu)'\Sigma^{-1}(\mathbf{x}-\boldsymbol\mu)\right]

where −∞ < x_i < ∞ for i = 1, ..., p. To denote this, we write X ∼ Np(µ, Σ). For p = 1, this reduces to the univariate normal p.d.f.
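As a quick numerical check of this density, the sketch below (plain Python; `mvn_pdf_2d` is an illustrative helper, and the values of µ and Σ are borrowed from the example used later in these notes) evaluates f(x) for p = 2 directly from the formula. At x = µ the exponent is zero, so f(µ) = 1/(2π|Σ|^{1/2}).

```python
import math

def mvn_pdf_2d(x, mu, Sigma):
    """Bivariate normal density: exp(-(x-mu)' Sigma^{-1} (x-mu)/2) / (2*pi*|Sigma|^{1/2})."""
    a, b, c, d = Sigma[0][0], Sigma[0][1], Sigma[1][0], Sigma[1][1]
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]   # closed-form 2x2 inverse
    dx = [x[0] - mu[0], x[1] - mu[1]]
    # quadratic form (x - mu)' Sigma^{-1} (x - mu)
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))

# at the mean the exponent is 0, so the density equals 1 / (2*pi*sqrt(|Sigma|))
mu = (5.0, 10.0)
Sigma = [[9.0, 16.0], [16.0, 64.0]]   # |Sigma| = 9*64 - 16*16 = 320
peak = mvn_pdf_2d(mu, mu, Sigma)
```

Any point other than µ gives a strictly smaller density, since the quadratic form is positive for a positive definite Σ.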
Bivariate Normal: p = 2

\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \qquad
E(\mathbf{x}) = \begin{pmatrix} E(x_1) \\ E(x_2) \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} = \boldsymbol\mu, \qquad
\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}

and

\Sigma^{-1} = \frac{1}{\sigma_{11}\sigma_{22} - \sigma_{12}^2}
\begin{pmatrix} \sigma_{22} & -\sigma_{12} \\ -\sigma_{12} & \sigma_{11} \end{pmatrix}

If we replace σ12 by ρ12√(σ11σ22), then we get

\Sigma^{-1} = \frac{1}{\sigma_{11}\sigma_{22}(1 - \rho_{12}^2)}
\begin{pmatrix} \sigma_{22} & -\rho_{12}\sqrt{\sigma_{11}\sigma_{22}} \\ -\rho_{12}\sqrt{\sigma_{11}\sigma_{22}} & \sigma_{11} \end{pmatrix}

Using this, let's look at the statistical distance of x from µ...
Bivariate Normal & Statistical Distance

The quantity in the exponent of the bivariate normal is

\begin{aligned}
(\mathbf{x}-\boldsymbol\mu)'\Sigma^{-1}(\mathbf{x}-\boldsymbol\mu)
&= (x_1-\mu_1,\; x_2-\mu_2)\,\frac{1}{\sigma_{11}\sigma_{22}(1-\rho_{12}^2)}
\begin{pmatrix} \sigma_{22} & -\rho_{12}\sqrt{\sigma_{11}\sigma_{22}} \\ -\rho_{12}\sqrt{\sigma_{11}\sigma_{22}} & \sigma_{11} \end{pmatrix}
\begin{pmatrix} x_1-\mu_1 \\ x_2-\mu_2 \end{pmatrix} \\
&= \frac{1}{1-\rho_{12}^2}\left[\left(\frac{x_1-\mu_1}{\sqrt{\sigma_{11}}}\right)^2 + \left(\frac{x_2-\mu_2}{\sqrt{\sigma_{22}}}\right)^2
 - 2\rho_{12}\left(\frac{x_1-\mu_1}{\sqrt{\sigma_{11}}}\right)\left(\frac{x_2-\mu_2}{\sqrt{\sigma_{22}}}\right)\right] \\
&= \frac{1}{1-\rho_{12}^2}\left(z_1^2 + z_2^2 - 2\rho_{12}z_1z_2\right)
\end{aligned}
Bivariate Normal & Independence

f(\mathbf{x}) = \frac{1}{2\pi\sqrt{\sigma_{11}\sigma_{22}(1-\rho_{12}^2)}}
\exp\left\{-\frac{1}{2(1-\rho_{12}^2)}\left[\left(\frac{x_1-\mu_1}{\sqrt{\sigma_{11}}}\right)^2 + \left(\frac{x_2-\mu_2}{\sqrt{\sigma_{22}}}\right)^2
 - 2\rho_{12}\left(\frac{x_1-\mu_1}{\sqrt{\sigma_{11}}}\right)\left(\frac{x_2-\mu_2}{\sqrt{\sigma_{22}}}\right)\right]\right\}

If σ12 = 0, or equivalently ρ12 = 0, then X1 and X2 are uncorrelated. For the bivariate normal, σ12 = 0 implies that X1 and X2 are statistically independent, because the density factors:

\begin{aligned}
f(\mathbf{x}) &= \frac{1}{2\pi\sqrt{\sigma_{11}\sigma_{22}}}
\exp\left\{-\frac{1}{2}\left[\left(\frac{x_1-\mu_1}{\sqrt{\sigma_{11}}}\right)^2 + \left(\frac{x_2-\mu_2}{\sqrt{\sigma_{22}}}\right)^2\right]\right\} \\
&= \frac{1}{\sqrt{2\pi\sigma_{11}}}\exp\left\{-\frac{1}{2}\left(\frac{x_1-\mu_1}{\sqrt{\sigma_{11}}}\right)^2\right\}
 \cdot \frac{1}{\sqrt{2\pi\sigma_{22}}}\exp\left\{-\frac{1}{2}\left(\frac{x_2-\mu_2}{\sqrt{\sigma_{22}}}\right)^2\right\} \\
&= f_1(x_1) \times f_2(x_2)
\end{aligned}
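This factorization is easy to verify numerically. A minimal sketch (plain Python; the values µ1 = µ2 = 0, σ11 = 1, σ22 = 4 are arbitrary choices for illustration): with σ12 = 0, the joint density at any point equals the product of the two univariate marginal densities.

```python
import math

def norm_pdf(x, mu, var):
    """Univariate normal density with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def bvn_pdf_indep(x1, x2, mu1, mu2, s11, s22):
    # bivariate normal density with sigma12 = 0 (so rho12 = 0)
    q = (x1 - mu1) ** 2 / s11 + (x2 - mu2) ** 2 / s22
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(s11 * s22))

# joint density and product of marginals agree at an arbitrary point
joint = bvn_pdf_indep(1.0, 2.0, 0.0, 0.0, 1.0, 4.0)
product = norm_pdf(1.0, 0.0, 1.0) * norm_pdf(2.0, 0.0, 4.0)
```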
[Figure: surface plot of the bivariate normal density with µk = 0, σkk = 1, r = 0.0.]
[Figure: overhead (contour) view, µk = 0, σkk = 1, r = 0.0; circular contours.]
[Figure: surface plot of the bivariate normal density with µk = 0, σkk = 1, r = 0.75.]
[Figure: overhead (contour) view, µk = 0, σkk = 1, r = 0.75; tilted elliptical contours.]
Summary: Comparing r = 0.0 vs r = 0.75

For the figures shown, µ1 = µ2 = 0 and σ11 = σ22 = 1:
◮ With r = 0.0:
◮ Σ = diag(σ11, σ22), a diagonal matrix.
◮ The density shows no directional pattern (no tilt) in the x–y plane.
◮ When you take a slice parallel to the x–y plane, you get a circle.
◮ With r = 0.75:
◮ Σ is not a diagonal matrix.
◮ The density is concentrated along a line in the x–y plane (a linear tilt).
◮ When you take a slice, you get a tilted ellipse.
◮ The tilt depends on the relative values of σ11 and σ22 (and the scale used in plotting).
◮ When Σ = σ²I (i.e., diagonal with equal variances), the distribution is called "spherical normal".
Real Time Software Demo
◮ binormal.m (Peter Dunn)
◮ Graph Bivariate .R (http://www.stat.ucl.ac.be/ISpersonnel/lecoutre/stats/fichiers/˜gallery
Slices of Multivariate Normal Density
◮ For the bivariate normal, you get an ellipse whose equation is

(x − µ)′Σ⁻¹(x − µ) = c²

which gives all (x1, x2) pairs with constant probability (density).
◮ The ellipses are called contours, and all are centered at µ.
◮ Definition: A constant probability contour equals

{all x such that (x − µ)′Σ⁻¹(x − µ) = c²} = {surface of an ellipsoid centered at µ}
Probability Contours: Axes of the Ellipsoid

Important points:

◮ (X − µ)′Σ⁻¹(X − µ) ∼ χ²_p (if |Σ| > 0)
◮ The solid ellipsoid of values x that satisfy

(x − µ)′Σ⁻¹(x − µ) ≤ c² = χ²_p(α)

has probability 1 − α, where χ²_p(α) is the (1 − α)·100th percentile of the chi-square distribution with p degrees of freedom.
Example: Axes of Ellipses & Probability Contours

Back to the example where x ∼ N2(µ, Σ) with

\boldsymbol\mu = \begin{pmatrix} 5 \\ 10 \end{pmatrix} \quad\text{and}\quad
\Sigma = \begin{pmatrix} 9 & 16 \\ 16 & 64 \end{pmatrix} \;\rightarrow\; \rho_{12} = .667

and we want the "95% probability contour". The upper 5% point of the chi-square distribution with 2 degrees of freedom is χ²_2(.05) = 5.9915, so c = √5.9915 = 2.4478.

Axes: µ ± c√λi ei, where (λi, ei) is the i-th (i = 1, 2) eigenvalue/eigenvector pair of Σ:

λ1 = 68.316, e1 = (.2604, .9655)′
λ2 = 4.684, e2 = (.9655, −.2604)′
Major Axis

Using the largest eigenvalue and corresponding eigenvector, µ ± c√λ1 e1:

\begin{pmatrix} 5 \\ 10 \end{pmatrix} \pm 2.45\sqrt{68.316}\begin{pmatrix} .2604 \\ .9655 \end{pmatrix}
= \begin{pmatrix} 5 \\ 10 \end{pmatrix} \pm 20.250\begin{pmatrix} .2604 \\ .9655 \end{pmatrix}
= \begin{pmatrix} 5 \\ 10 \end{pmatrix} \pm \begin{pmatrix} 5.273 \\ 19.551 \end{pmatrix}
\;\rightarrow\; \begin{pmatrix} -.273 \\ -9.551 \end{pmatrix},\; \begin{pmatrix} 10.273 \\ 29.551 \end{pmatrix}
Minor Axis

Same process, but now use λ2 and e2, the smallest eigenvalue and corresponding eigenvector, µ ± c√λ2 e2:

\begin{pmatrix} 5 \\ 10 \end{pmatrix} \pm 2.45\sqrt{4.684}\begin{pmatrix} .9655 \\ -.2604 \end{pmatrix}
= \begin{pmatrix} 5 \\ 10 \end{pmatrix} \pm 5.30\begin{pmatrix} .9655 \\ -.2604 \end{pmatrix}
= \begin{pmatrix} 5 \\ 10 \end{pmatrix} \pm \begin{pmatrix} 5.119 \\ -1.381 \end{pmatrix}
\;\rightarrow\; \begin{pmatrix} -.119 \\ 11.381 \end{pmatrix},\; \begin{pmatrix} 10.119 \\ 8.619 \end{pmatrix}
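The eigenpairs and axis endpoints above can be reproduced with a few lines of code. This sketch (plain Python; for a symmetric 2 × 2 matrix the eigenpairs have a closed form, so no linear-algebra library is needed, and `unit_eigvec` is an illustrative helper) recovers λ1 = 68.316, e1 = (.2604, .9655)′, and the major-axis endpoints; small discrepancies from the slides come from the slides rounding c to 2.45.

```python
import math

Sigma = [[9.0, 16.0], [16.0, 64.0]]
mu = (5.0, 10.0)

# eigenvalues of a 2x2 symmetric matrix from its trace and determinant
tr = Sigma[0][0] + Sigma[1][1]
det = Sigma[0][0] * Sigma[1][1] - Sigma[0][1] * Sigma[1][0]
disc = math.sqrt(tr * tr - 4.0 * det)
lam1, lam2 = (tr + disc) / 2.0, (tr - disc) / 2.0   # 68.316 and 4.684

def unit_eigvec(lam):
    # (Sigma - lam*I) v = 0  =>  v is proportional to (sigma12, lam - sigma11)
    v = (Sigma[0][1], lam - Sigma[0][0])
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)

c = math.sqrt(5.9915)                # c^2 = chi-square_2(.05) for the 95% contour
e1 = unit_eigvec(lam1)
half = c * math.sqrt(lam1)           # half-length of the major axis
end_plus = (mu[0] + half * e1[0], mu[1] + half * e1[1])
end_minus = (mu[0] - half * e1[0], mu[1] - half * e1[1])
```

The same helper applied to λ2 gives the minor-axis direction (.9655, −.2604)′ up to sign.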
Graph of 95% Probability Contour
[Figure: the 95% probability contour, an ellipse in the (x1, x2) plane centered at µ′ = (5, 10), with major-axis endpoints (−0.273, −9.551) and (10.273, 29.551) and minor-axis endpoints (−0.119, 11.381) and (10.119, 8.619).]
Example: Equation for Contour

Equation for the contour:

(x − µ)′Σ⁻¹(x − µ) ≤ 5.99

((x_1-5),\,(x_2-10))\begin{pmatrix} 9 & 16 \\ 16 & 64 \end{pmatrix}^{-1}\begin{pmatrix} x_1-5 \\ x_2-10 \end{pmatrix} \le 5.99

((x_1-5),\,(x_2-10))\begin{pmatrix} .200 & -.050 \\ -.050 & .028 \end{pmatrix}\begin{pmatrix} x_1-5 \\ x_2-10 \end{pmatrix} \le 5.99

.2(x_1-5)^2 + .028(x_2-10)^2 - .1(x_1-5)(x_2-10) \le 5.99

(x − µ)′Σ⁻¹(x − µ) is a quadratic form, i.e., a second-degree polynomial in x1 and x2.
Points Inside or Outside?

Are the following points inside or outside the 95% probability contour?

◮ Is the point (10, 20) inside or outside?

(10, 20) → .2(10 − 5)² + .028(20 − 10)² − .1(10 − 5)(20 − 10)
= .2(25) + .028(100) − .1(50)
= 2.8 ≤ 5.99, so (10, 20) is inside the contour.

◮ Is the point (16, 20) inside or outside?

(16, 20) → .2(16 − 5)² + .028(20 − 10)² − .1(16 − 5)(20 − 10)
= .2(121) + .028(100) − .1(11)(10)
= 16 > 5.99, so (16, 20) is outside the contour.
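The same check can be scripted. A minimal sketch (plain Python; `quad_form` is an illustrative helper) uses the exact inverse entries .200, .028125, and .050 rather than the rounded .028, so the value for (10, 20) comes out 2.8125 instead of 2.8; the conclusions are the same either way.

```python
def quad_form(x1, x2):
    # (x - mu)' Sigma^{-1} (x - mu) for mu = (5, 10), Sigma = [[9, 16], [16, 64]];
    # exact inverse entries: 64/320 = .200, 9/320 = .028125, 16/320 = .050
    return (0.200 * (x1 - 5) ** 2
            + 0.028125 * (x2 - 10) ** 2
            - 0.100 * (x1 - 5) * (x2 - 10))

c2 = 5.9915                     # chi-square_2(.05): the contour is quad_form <= c2
d_inside = quad_form(10, 20)    # 2.8125 -> inside the 95% contour
d_outside = quad_form(16, 20)   # 16.0125 -> outside
```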
Points Inside and Outside
[Figure: the 95% contour in the (x1, x2) plane, centered at µ′ = (5, 10), with the point (10, 20) plotted inside the ellipse and (16, 20) outside.]
More Properties that we’ll Expand on
If X ∼ Np(µ, Σ), then:

◮ Linear combinations of components of X are (multivariate) normal.
◮ All sub-sets of the components of X are (multivariate) normal.
◮ Zero covariance implies that the corresponding components of X are statistically independent.
◮ The conditional distributions of the components of X are (multivariate) normal.
1: Linear Combinations

If X ∼ Np(µ, Σ), then any linear combination

a′X = a1X1 + a2X2 + ⋯ + apXp

is distributed as

a′X ∼ N1(a′µ, a′Σa)

Also, if a′X is normal N1(a′µ, a′Σa) for all possible a, then X must be Np(µ, Σ).

Example:

\mathbf{a}' = (3, 2), \qquad \mathbf{X} \sim \mathcal{N}_2\left(\begin{pmatrix} 5 \\ 10 \end{pmatrix}, \begin{pmatrix} 16 & 12 \\ 12 & 36 \end{pmatrix}\right)

Y = a′X = 3X1 + 2X2

\mu_Y = (3, 2)\begin{pmatrix} 5 \\ 10 \end{pmatrix} = 35 \quad\text{and}\quad
\sigma^2_Y = (3, 2)\begin{pmatrix} 16 & 12 \\ 12 & 36 \end{pmatrix}\begin{pmatrix} 3 \\ 2 \end{pmatrix} = 432

Y ∼ N(35, 432)
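The arithmetic for µY = a′µ = 35 and σ²Y = a′Σa = 432 can be checked directly (a plain Python sketch):

```python
a = (3.0, 2.0)
mu = (5.0, 10.0)
Sigma = [[16.0, 12.0], [12.0, 36.0]]

mean_Y = a[0] * mu[0] + a[1] * mu[1]                  # a' mu = 3*5 + 2*10
Sa = [Sigma[0][0] * a[0] + Sigma[0][1] * a[1],        # Sigma a
      Sigma[1][0] * a[0] + Sigma[1][1] * a[1]]
var_Y = a[0] * Sa[0] + a[1] * Sa[1]                   # a' Sigma a
```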
More Linear Combinations

If X ∼ Np(µ, Σ), then the q linear combinations

\mathbf{Y}_{q\times1} = \mathbf{A}_{q\times p}\mathbf{X} =
\begin{pmatrix} a_{11} & a_{12} & \dots & a_{1p} \\ a_{21} & a_{22} & \dots & a_{2p} \\ \vdots & & & \vdots \\ a_{q1} & a_{q2} & \dots & a_{qp} \end{pmatrix}
\begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{pmatrix}

are distributed as Nq(Aµ, AΣA′).

Also, if Y = AX + d, where d_{q×1} is a vector of constants, then

Y ∼ Nq(Aµ + d, AΣA′).
Numerical Example with Multiple Combinations

\mathbf{X} \sim \mathcal{N}_2\left(\begin{pmatrix} 5 \\ 10 \end{pmatrix}, \begin{pmatrix} 16 & 12 \\ 12 & 36 \end{pmatrix}\right)

Y1 = X1 + X2 and Y2 = X1 − X2, so

\mathbf{A}_{2\times2} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}

\boldsymbol\mu_Y = \mathbf{A}\boldsymbol\mu = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 5 \\ 10 \end{pmatrix} = \begin{pmatrix} 15 \\ -5 \end{pmatrix}

\Sigma_Y = \mathbf{A}\Sigma\mathbf{A}' = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 16 & 12 \\ 12 & 36 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 76 & -20 \\ -20 & 28 \end{pmatrix}

So

\mathbf{Y} \sim \mathcal{N}_2\left(\begin{pmatrix} 15 \\ -5 \end{pmatrix}, \begin{pmatrix} 76 & -20 \\ -20 & 28 \end{pmatrix}\right)
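The matrix products Aµ and AΣA′ above can be verified with a tiny amount of code. A minimal sketch (plain Python; `matmul` and `transpose` are illustrative helpers rather than library calls):

```python
A = [[1, 1], [1, -1]]
mu = [5, 10]
Sigma = [[16, 12], [12, 36]]

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

mu_Y = [sum(A[i][j] * mu[j] for j in range(2)) for i in range(2)]   # A mu
Sigma_Y = matmul(matmul(A, Sigma), transpose(A))                    # A Sigma A'
```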
Multiple Regression as an Example

This example uses what we know about linear combinations and about the distribution of linear combinations.

Linear Regression Model
◮ Y = response variable.
◮ Z1, Z2,..., Zr are predictor/explanatory variables, which are considered to be fixed. ◮ The model is
Y = βo + β1Z1 + β2Z2 + . . . + βr Zr + ǫ
◮ The error of prediction ǫ is viewed as a random variable.
Multiple Regression as an Example Suppose we have n observations on Y and have values of Zi for all i = 1,..., n; that is,
Y1 = βo + β1Z11 + β2Z12 + . . . + βr Z1r + ǫ1
Y2 = βo + β1Z21 + β2Z22 + . . . + βr Z2r + ǫ2 . . . .
Yn = βo + β1Zn1 + β2Zn2 + . . . + βr Znr + ǫn
where E(ǫj) = 0, var(ǫj) = σ² (a constant), and cov(ǫj, ǫk) = 0 for j ≠ k. In terms of matrices,

\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix} =
\begin{pmatrix} 1 & Z_{11} & Z_{12} & \dots & Z_{1r} \\ 1 & Z_{21} & Z_{22} & \dots & Z_{2r} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & Z_{n1} & Z_{n2} & \dots & Z_{nr} \end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_r \end{pmatrix} +
\begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix}

that is, Y = Zβ + ǫ, where E(ǫ) = 0 and cov(ǫ) = σ²I.
Distribution of Y
Y = Zβ + ǫ, where E(ǫ) = 0 and cov(ǫ) = σ²I. Here Zβ is a vector of constants and ǫ is random, so Y is a linear combination of a multivariate normally distributed variable, ǫ.
◮ Mean of Y:
µY = E(Y)= E(Zβ + ǫ)= Zβ + E(ǫ)= Zβ
◮ Covariance matrix of Y: ΣY = σ²I (the same as for ǫ).
◮ The distribution of Y is multivariate normal because ǫ is multivariate normal:

Y ∼ N(Zβ, σ²I)
Least Squares Estimation

Y = Zβ + ǫ, where E(ǫ) = 0 and cov(ǫ) = σ²I.

β and σ² are unknown parameters that need to be estimated from data.
Let y1, y2,..., yn be a random sample with values z1, z2,..., zr on the explanatory variables. The least squares estimate of β is the vector b that minimizes
\sum_{j=1}^{n}(y_j - \mathbf{z}_j'\mathbf{b})^2
= \sum_{j=1}^{n}(y_j - b_0 - b_1z_{j1} - b_2z_{j2} - \dots - b_rz_{jr})^2
= (\mathbf{y} - \mathbf{Z}\mathbf{b})'(\mathbf{y} - \mathbf{Z}\mathbf{b})
= \hat{\boldsymbol\epsilon}'\hat{\boldsymbol\epsilon}

where zj′ is the j-th row of Z and b = (b0, b1, b2, ..., br)′. If Z has full rank (i.e., the rank of Z is r + 1 ≤ n), then the least squares estimate of β is

\hat{\boldsymbol\beta} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{y}
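A minimal numerical sketch of β̂ = (Z′Z)⁻¹Z′y (plain Python; one predictor plus an intercept, so Z′Z is 2 × 2 and can be inverted in closed form; the data are hypothetical and noise-free, so the estimate recovers the generating coefficients exactly):

```python
# hypothetical noise-free data generated from y = 1 + 2 z, so the least
# squares estimate b-hat = (Z'Z)^{-1} Z'y should recover (beta0, beta1) = (1, 2)
z = [0.0, 1.0, 2.0, 3.0]
y = [1.0 + 2.0 * zi for zi in z]
n = len(z)

# with Z = [1  z]:  Z'Z = [[n, sum z], [sum z, sum z^2]],  Z'y = [sum y, sum z*y]
sz = sum(z)
szz = sum(zi * zi for zi in z)
sy = sum(y)
szy = sum(zi * yi for zi, yi in zip(z, y))

# closed-form 2x2 inverse applied to Z'y
det = n * szz - sz * sz
b0 = (szz * sy - sz * szy) / det
b1 = (n * szy - sz * sy) / det
```

With noise added, (b0, b1) would scatter around (β0, β1) with covariance σ²(Z′Z)⁻¹, which is exactly the sampling distribution derived next.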
What's the distribution of β̂?

β̂ = (Z′Z)⁻¹Z′y = Ay

We showed that Y ∼ Nn(Zβ, σ²I).

◮ Mean of β̂:

\mu_{\hat\beta} = E(\hat{\boldsymbol\beta}) = E(\mathbf{A}\mathbf{Y}) = \mathbf{A}E(\mathbf{Y}) = \mathbf{A}\mathbf{Z}\boldsymbol\beta = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Z}\boldsymbol\beta = \boldsymbol\beta
◮ Covariance matrix of β̂:

\Sigma_{\hat\beta} = \mathbf{A}\Sigma_Y\mathbf{A}' = ((\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}')(\sigma^2\mathbf{I})(\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}) = \sigma^2(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1} = \sigma^2(\mathbf{Z}'\mathbf{Z})^{-1}

◮ The distribution of β̂: β̂ ∼ N(β, σ²(Z′Z)⁻¹).
The distribution of Ŷ

The "fitted values" or predicted values are

ŷ = Zβ̂ = Hy

where H = Z(Z′Z)⁻¹Z′. The matrix H is the "hat" matrix.
◮ We just showed that β̂ ∼ N(β, σ²(Z′Z)⁻¹), and so ŷ = Zβ̂ is a linear combination of a vector that is multivariate normal.
◮ Mean of Ŷ: E(Ŷ) = E(Zβ̂) = Z E(β̂) = Zβ
◮ Covariance matrix of Ŷ:

\Sigma_{\hat{Y}} = \mathbf{Z}\Sigma_{\hat\beta}\mathbf{Z}' = \mathbf{Z}(\sigma^2(\mathbf{Z}'\mathbf{Z})^{-1})\mathbf{Z}' = \sigma^2\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}' = \sigma^2\mathbf{H}

◮ Distribution of Ŷ: Ŷ ∼ N(Zβ, σ²H), a singular multivariate normal, since H has rank r + 1 < n.
The distribution of ǫ̂

The estimated residuals are

ǫ̂ = y − ŷ = (I − H)y

and they contain the information necessary to estimate σ². The least squares estimate of σ² is

s^2 = \frac{\hat{\boldsymbol\epsilon}'\hat{\boldsymbol\epsilon}}{n - (r+1)}

The estimates β̂ and ǫ̂ are uncorrelated. The multivariate normality assumption ǫ ∼ Nn(0, σ²I), together with what we know about linear combinations of random variables, allowed us to derive the distributions of these various random variables.
The distribution of ǫ̂ (continued)
Last few comments on this example:
◮ The least squares estimates of β and ǫ are also the maximum likelihood estimates.
◮ The maximum likelihood estimate of σ² is σ̂² = ǫ̂′ǫ̂/n.
◮ β̂ and ǫ̂ are statistically independent.
2: Sub-sets of Variables

If X ∼ Np(µ, Σ), then all sub-sets of X are (multivariate) normally distributed. For example, let's partition X into two sub-sets:
\mathbf{X}_{p\times1} = \begin{pmatrix} X_1 \\ \vdots \\ X_q \\ X_{q+1} \\ \vdots \\ X_p \end{pmatrix}
= \begin{pmatrix} \mathbf{X}_{1(q\times1)} \\ \mathbf{X}_{2((p-q)\times1)} \end{pmatrix}
\quad\text{and}\quad
\boldsymbol\mu = \begin{pmatrix} \boldsymbol\mu_{1(q\times1)} \\ \boldsymbol\mu_{2((p-q)\times1)} \end{pmatrix}

\Sigma_{p\times p} = \begin{pmatrix} \Sigma_{11(q\times q)} & \Sigma_{12(q\times(p-q))} \\ \Sigma_{21((p-q)\times q)} & \Sigma_{22((p-q)\times(p-q))} \end{pmatrix}
= \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}
Sub-sets of Variables (continued)

Then for

\mathbf{X} = \begin{pmatrix} \mathbf{X}_{1(q\times1)} \\ \mathbf{X}_{2((p-q)\times1)} \end{pmatrix}

the distributions of the sub-sets are

X1 ∼ Nq(µ1, Σ11) and X2 ∼ N_{p−q}(µ2, Σ22)

This result means that:
◮ Each of the Xi's is univariate normal (next page).
◮ All possible sub-sets are multivariate normal.
◮ All marginal distributions are (multivariate) normal.
Little Example on Sub-sets

Suppose that X = (X1, X2, X3)′ ∼ N3(µ, Σ). Due to the result on sub-sets of multivariate normals,

X1 ∼ N(µ1, σ11), X2 ∼ N(µ2, σ22), X3 ∼ N(µ3, σ33)

Also,

\begin{pmatrix} X_2 \\ X_3 \end{pmatrix} \sim \mathcal{N}_2\left(\begin{pmatrix} \mu_2 \\ \mu_3 \end{pmatrix}, \begin{pmatrix} \sigma_{22} & \sigma_{23} \\ \sigma_{32} & \sigma_{33} \end{pmatrix}\right)
3: Zero Covariance & Statistical Independence

There are three parts to this one:

◮ If X1 (q1 × 1) and X2 (q2 × 1) are statistically independent, then cov(X1, X2) = Σ12 = 0.
◮ If

\begin{pmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{pmatrix} \sim \mathcal{N}_{q_1+q_2}\left(\begin{pmatrix} \boldsymbol\mu_1 \\ \boldsymbol\mu_2 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}\right)

then X1 and X2 are statistically independent if and only if Σ12 = Σ21′ = 0.
◮ If X1 and X2 are statistically independent and distributed as N_{q1}(µ1, Σ11) and N_{q2}(µ2, Σ22), respectively, then

\begin{pmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{pmatrix} \sim \mathcal{N}_{q_1+q_2}\left(\begin{pmatrix} \boldsymbol\mu_1 \\ \boldsymbol\mu_2 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \mathbf{0} \\ \mathbf{0} & \Sigma_{22} \end{pmatrix}\right)
Example
\mathbf{Y}_{4\times1} = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \end{pmatrix}
\quad\text{and}\quad
\Sigma_Y = \begin{pmatrix} 2 & 1 & 0 & .5 \\ 1 & 3 & 0 & .5 \\ 0 & 0 & 4 & 0 \\ .5 & .5 & 0 & 1 \end{pmatrix}

and Y ∼ N4(µ, Σ). Let's take X1 = (Y1, Y2, Y4)′ and X2 = (Y3). Rearranging the rows and columns,

\begin{pmatrix} Y_1 \\ Y_2 \\ Y_4 \\ Y_3 \end{pmatrix} \sim \mathcal{N}_4\left(\begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_4 \\ \mu_3 \end{pmatrix}, \begin{pmatrix} 2 & 1 & .5 & 0 \\ 1 & 3 & .5 & 0 \\ .5 & .5 & 1 & 0 \\ 0 & 0 & 0 & 4 \end{pmatrix}\right)

So the set X1 is statistically independent of X2.
4: Conditional Distributions
Let X′ = (X1(q1×1)′, X2(q2×1)′) be distributed as N_{q1+q2}(µ, Σ) with

\boldsymbol\mu = \begin{pmatrix} \boldsymbol\mu_1 \\ \boldsymbol\mu_2 \end{pmatrix}
\quad\text{and}\quad
\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}

and |Σ| > 0 (i.e., Σ positive definite). Then the conditional distribution of X1 given X2 = x2 is (multivariate) normal with mean and covariance matrix

\boldsymbol\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(\mathbf{x}_2 - \boldsymbol\mu_2)
\quad\text{and}\quad
\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}

Let's look more closely at this for the simple case q1 = q2 = 1.
Conditional Distribution for q1 = q2 = 1

Bivariate normal distribution:

\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim \mathcal{N}_2\left(\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}\right)

f(x_1 \mid x_2) \text{ is } \mathcal{N}_1\left(\mu_1 + \frac{\sigma_{12}}{\sigma_{22}}(x_2 - \mu_2),\; \sigma_{11} - \frac{\sigma_{12}^2}{\sigma_{22}}\right)

Notes:
◮ σ12 = ρ12√σ11√σ22
◮ Σ12Σ22⁻¹ = σ12/σ22 = ρ12(√σ11/√σ22)
◮ Σ11 − Σ12Σ22⁻¹Σ21 = σ11 − σ12²/σ22 = σ11(1 − ρ12²)

Alternative way to write f(x1 | x2):

f(x_1 \mid x_2) \text{ is } \mathcal{N}_1\left(\mu_1 + \rho_{12}\frac{\sqrt{\sigma_{11}}}{\sqrt{\sigma_{22}}}(x_2 - \mu_2),\; \sigma_{11}(1 - \rho_{12}^2)\right)
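For the running example (µ = (5, 10)′, σ11 = 9, σ12 = 16, σ22 = 64), these formulas give a concrete regression of X1 on X2. A minimal sketch (plain Python; `cond_mean` is an illustrative helper): the conditional mean is 5 + (16/64)(x2 − 10) and the conditional variance is 9 − 16²/64 = 5, in agreement with σ11(1 − ρ12²) since ρ12² = 256/576 = 4/9.

```python
mu1, mu2 = 5.0, 10.0
s11, s12, s22 = 9.0, 16.0, 64.0

def cond_mean(x2):
    # E(X1 | X2 = x2) = mu1 + (s12 / s22) * (x2 - mu2)
    return mu1 + (s12 / s22) * (x2 - mu2)

# var(X1 | X2 = x2) = s11 - s12^2 / s22, free of x2
cond_var = s11 - s12 ** 2 / s22
rho2 = s12 ** 2 / (s11 * s22)        # rho_12^2 = 4/9
alt_var = s11 * (1.0 - rho2)         # same value, written via the correlation
```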
Multiple Regression as a Conditional Distribution

Consider the case where q1 = 1 and q2 > 1.

◮ All conditional distributions are normal.
◮ The conditional covariance matrix Σ11 − Σ12Σ22⁻¹Σ21 does not depend on the values of the conditioning variables.
◮ The conditional means have the following form:
Let

\Sigma_{12}\Sigma_{22}^{-1} = \boldsymbol\beta_{q_1\times q_2} =
\begin{pmatrix}
\beta_{1,q_1+1} & \beta_{1,q_1+2} & \cdots & \beta_{1,q_1+q_2} \\
\beta_{2,q_1+1} & \beta_{2,q_1+2} & \cdots & \beta_{2,q_1+q_2} \\
\vdots & & \ddots & \vdots \\
\beta_{q_1,q_1+1} & \beta_{q_1,q_1+2} & \cdots & \beta_{q_1,q_1+q_2}
\end{pmatrix}

Then the conditional means are

\begin{pmatrix}
\mu_1 + \sum_{i=q_1+1}^{q_1+q_2} \beta_{1i}(x_i - \mu_i) \\
\mu_2 + \sum_{i=q_1+1}^{q_1+q_2} \beta_{2i}(x_i - \mu_i) \\
\vdots \\
\mu_{q_1} + \sum_{i=q_1+1}^{q_1+q_2} \beta_{q_1 i}(x_i - \mu_i)
\end{pmatrix}
Estimation of µ and Σ & Sampling Distributions of Estimators

Suppose we have a p-dimensional normal distribution with mean µ and covariance matrix Σ.
Take n observations x1, x2, ..., xn (each a (p × 1) vector):

Xj ∼ Np(µ, Σ), j = 1, 2, ..., n, and independent.

For p = 1, we know that the MLEs are

\hat\mu = \bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j, \qquad \bar{x} \sim \mathcal{N}\left(\mu, \frac{\sigma^2}{n}\right)

and

n\hat\sigma^2 = \sum_{j=1}^{n}(x_j - \bar{x})^2, \qquad \frac{1}{\sigma^2}\sum_{j=1}^{n}(x_j - \bar{x})^2 \sim \chi^2_{(n-1)}

or

\hat\sigma^2 = \frac{1}{n}\sum_{j=1}^{n}(x_j - \bar{x})^2 \sim \frac{\sigma^2}{n}\chi^2_{(n-1)}
Estimation of µ and Σ: Multivariate Case

The maximum likelihood estimator of µ is

\hat{\boldsymbol\mu} = \bar{\mathbf{X}} = \frac{1}{n}\sum_{j=1}^{n}\mathbf{X}_j

and the ML estimator of Σ is

\hat\Sigma = \frac{n-1}{n}\mathbf{S} = \mathbf{S}_n = \frac{1}{n}\sum_{j=1}^{n}(\mathbf{X}_j - \hat{\boldsymbol\mu})(\mathbf{X}_j - \hat{\boldsymbol\mu})'

Sampling distribution of µ̂: the estimator is a linear combination of i.i.d. normal random vectors, each Np(µ, Σ):

\hat{\boldsymbol\mu} = \bar{\mathbf{X}} = \frac{1}{n}\mathbf{X}_1 + \frac{1}{n}\mathbf{X}_2 + \dots + \frac{1}{n}\mathbf{X}_n

So µ̂ also has a normal distribution: X̄ ∼ Np(µ, (1/n)Σ).
Sampling Distribution of Σ̂

\hat\Sigma = \frac{n-1}{n}\mathbf{S}

The matrix

(n-1)\mathbf{S} = \sum_{j=1}^{n}(\mathbf{x}_j - \bar{\mathbf{x}})(\mathbf{x}_j - \bar{\mathbf{x}})'

is distributed as a Wishart random matrix with (n − 1) degrees of freedom.

Wishart distribution:
◮ A multivariate analogue to the chi-square distribution. ◮ It’s defined as
W_m(\cdot \mid \Sigma) = \text{Wishart distribution with } m \text{ degrees of freedom}
= \text{the distribution of } \sum_{j=1}^{m}\mathbf{Z}_j\mathbf{Z}_j'

where Zj ∼ Np(0, Σ) and independent.

Note: X̄ and S are independent.
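The defining sum can be simulated. A minimal sketch (plain Python; Σ = I and m = 500 are arbitrary illustration choices): summing m outer products Zj Zj′ of independent N(0, I) draws gives one Wishart draw W, and since E(W) = mΣ, the scaled matrix W/m should be close to the identity.

```python
import random

random.seed(1)
m, p = 500, 2  # degrees of freedom and dimension; Sigma = I for simplicity

# accumulate W = sum_{j=1}^{m} z_j z_j' with z_j ~ N_p(0, I)
W = [[0.0] * p for _ in range(p)]
for _ in range(m):
    z = [random.gauss(0.0, 1.0) for _ in range(p)]
    for i in range(p):
        for j in range(p):
            W[i][j] += z[i] * z[j]

W_over_m = [[W[i][j] / m for j in range(p)] for i in range(p)]  # approx Sigma = I
```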
Law of Large Numbers
Data are not always (multivariate) normal. The Law of Large Numbers (for multivariate data):

Let X1, X2, ..., Xn be independent observations from a population with mean E(X) = µ. Then X̄ = (1/n)Σ_{j=1}^n Xj converges in probability to µ as n gets large; that is, X̄ → µ for large samples. Similarly, S (or Sn) approaches Σ for large samples.
These are true regardless of the true distribution of the Xj ’s.
Central Limit Theorem

Let X1, X2, ..., Xn be independent observations from a population with mean E(X) = µ and finite (non-singular, full-rank) covariance matrix Σ. Then

\sqrt{n}(\bar{\mathbf{X}} - \boldsymbol\mu) \text{ has an approximate } \mathcal{N}(\mathbf{0}, \Sigma) \text{ distribution}

if n ≫ p (i.e., n is "much larger than" p). So, for "large" n,

\bar{\mathbf{X}} = \text{sample mean vector} \approx \mathcal{N}\left(\boldsymbol\mu, \frac{1}{n}\Sigma\right),

regardless of the underlying distribution of the Xj's.

What if Σ is unknown? If n is large "enough", S will be close to Σ, so

\sqrt{n}(\bar{\mathbf{X}} - \boldsymbol\mu) \approx \mathcal{N}_p(\mathbf{0}, \mathbf{S})
\quad\text{or}\quad
\bar{\mathbf{X}} \approx \mathcal{N}_p\left(\boldsymbol\mu, \frac{1}{n}\mathbf{S}\right).

Since n(X̄ − µ)′Σ⁻¹(X̄ − µ) ∼ χ²_p,

n(\bar{\mathbf{X}} - \boldsymbol\mu)'\mathbf{S}^{-1}(\bar{\mathbf{X}} - \boldsymbol\mu) \approx \chi^2_p
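The CLT can be watched in action with a small simulation. A minimal sketch (plain Python; the Uniform(0,1) population, n = 50, and 2000 replications are arbitrary choices): the population is far from normal, yet the first coordinate of X̄ has mean near µ1 = 0.5 and variance near σ11/n = (1/12)/50.

```python
import random

random.seed(0)
n, reps = 50, 2000  # sample size per mean, number of replicated sample means

# non-normal population: Uniform(0, 1), with mean 0.5 and variance 1/12;
# track the first coordinate of the sample mean vector
xbar1 = []
for _ in range(reps):
    s = 0.0
    for _ in range(n):
        s += random.random()
    xbar1.append(s / n)

m1 = sum(xbar1) / reps
v1 = sum((a - m1) ** 2 for a in xbar1) / reps
# CLT: m1 should be near 0.5 and v1 near (1/12)/50, despite the non-normal population
```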
Few more comments
◮ Using S instead of Σ does not seriously affect the approximation.
◮ n must be large relative to p; that is, (n − p) must be large.
◮ The probability contours for X̄ are tighter than those for X, since X̄ has covariance (1/n)Σ rather than Σ. See the next slide for an example.
Comparison of Probability Contours Returning to our example and pretending we have n = 20. Below are contours for 99%, 95%, 90%, 75%, 50% and 20%:
[Figure: two panels of concentric probability contours, one for Xj (left, wider) and one for X̄ (right, much tighter), both centered at µ.]
Why Such a Big Difference with Only n = 20?
For Xj:

\Sigma = \begin{pmatrix} 9 & 16 \\ 16 & 64 \end{pmatrix} \;\rightarrow\; \lambda_1 = 68.316 \text{ and } \lambda_2 = 4.684

For X̄ with n = 20:

\frac{1}{20}\begin{pmatrix} 9 & 16 \\ 16 & 64 \end{pmatrix} = \begin{pmatrix} 0.45 & 0.80 \\ 0.80 & 3.20 \end{pmatrix} \;\rightarrow\; \lambda_1 = 3.42 \text{ and } \lambda_2 = 0.23
Note that 68.316/20 = 3.42 and 4.684/20 = 0.23.
Other Multivariate Distributions: Skew-Normal
Marshall-Olkin bivariate exponential
Contours for 4 different ones