Lecture 1. Random Vectors and Multivariate Normal Distribution


1.1 Moments of a random vector

A random vector $X$ of size $p$ is a column vector consisting of $p$ random variables $X_1, \ldots, X_p$, written $X = (X_1, \ldots, X_p)'$. The mean or expectation of $X$ is defined as the vector of expectations,
\[
\mu \equiv E(X) = \begin{pmatrix} E(X_1) \\ \vdots \\ E(X_p) \end{pmatrix},
\]
which exists if $E|X_i| < \infty$ for all $i = 1, \ldots, p$.

Lemma 1. Let $X$ be a random vector of size $p$ and $Y$ be a random vector of size $q$. For any non-random matrices $A_{(m \times p)}$, $B_{(m \times q)}$, $C_{(1 \times n)}$, and $D_{(m \times n)}$,
\[
E(AX + BY) = A\,E(X) + B\,E(Y), \qquad E(AXC + D) = A\,E(X)\,C + D.
\]

For a random vector $X$ of size $p$ satisfying $E(X_i^2) < \infty$ for all $i = 1, \ldots, p$, the variance-covariance matrix (or just covariance matrix) of $X$ is
\[
\Sigma \equiv \mathrm{Cov}(X) = E[(X - EX)(X - EX)'].
\]
The covariance matrix of $X$ is a $p \times p$ square, symmetric matrix. In particular, $\Sigma_{ij} = \mathrm{Cov}(X_i, X_j) = \mathrm{Cov}(X_j, X_i) = \Sigma_{ji}$.

Some properties:
1. $\mathrm{Cov}(X) = E(XX') - E(X)E(X)'$.
2. If $c_{(p \times 1)}$ is a constant vector, $\mathrm{Cov}(X + c) = \mathrm{Cov}(X)$.
3. If $A_{(m \times p)}$ is a constant matrix, $\mathrm{Cov}(AX) = A\,\mathrm{Cov}(X)\,A'$.

Lemma 2. The $p \times p$ matrix $\Sigma$ is a covariance matrix if and only if it is non-negative definite.

1.2 Multivariate normal distribution - nonsingular case

Recall that the univariate normal distribution with mean $\mu$ and variance $\sigma^2$ has density
\[
f(x) = (2\pi\sigma^2)^{-1/2} \exp\left[-\tfrac{1}{2}(x - \mu)\sigma^{-2}(x - \mu)\right].
\]
Similarly, the multivariate normal distribution for the special case of a nonsingular covariance matrix $\Sigma$ is defined as follows.

Definition 1. Let $\mu \in \mathbb{R}^p$ and $\Sigma_{(p \times p)} > 0$. A random vector $X \in \mathbb{R}^p$ has the $p$-variate normal distribution with mean $\mu$ and covariance matrix $\Sigma$ if it has probability density function
\[
f(x) = |2\pi\Sigma|^{-1/2} \exp\left[-\tfrac{1}{2}(x - \mu)'\Sigma^{-1}(x - \mu)\right], \qquad (1)
\]
for $x \in \mathbb{R}^p$. We use the notation $X \sim N_p(\mu, \Sigma)$.

Theorem 3. If $X \sim N_p(\mu, \Sigma)$ for $\Sigma > 0$, then
1. $Y = \Sigma^{-1/2}(X - \mu) \sim N_p(0, I_p)$,
2. $X \overset{\mathcal{L}}{=} \Sigma^{1/2} Y + \mu$ where $Y \sim N_p(0, I_p)$,
3. $E(X) = \mu$ and $\mathrm{Cov}(X) = \Sigma$,
4. for any fixed $v \in \mathbb{R}^p$, $v'X$ is univariate normal,
5. $U = (X - \mu)'\Sigma^{-1}(X - \mu) \sim \chi^2(p)$.

Example 1 (Bivariate normal).

1.2.1 Geometry of multivariate normal

The multivariate normal distribution has location parameter $\mu$ and shape parameter $\Sigma > 0$. In particular, consider the contour of equal density
\[
E_c = \{x \in \mathbb{R}^p : f(x) = c_0\} = \{x \in \mathbb{R}^p : (x - \mu)'\Sigma^{-1}(x - \mu) = c^2\}.
\]
Moreover, consider the spectral decomposition $\Sigma = U\Lambda U'$, where $U = [u_1, \ldots, u_p]$ and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_p)$ with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p > 0$. The set $E_c$, for any $c > 0$, is an ellipsoid centered at $\mu$ with principal axes $u_i$ of length proportional to $\sqrt{\lambda_i}$. If $\Sigma = I_p$, the ellipsoid is the surface of a sphere of radius $c$ centered at $\mu$.

As an example, consider a bivariate normal distribution $N_2(0, \Sigma)$ with
\[
\Sigma = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}
= \begin{pmatrix} \cos(\pi/4) & -\sin(\pi/4) \\ \sin(\pi/4) & \cos(\pi/4) \end{pmatrix}
\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} \cos(\pi/4) & -\sin(\pi/4) \\ \sin(\pi/4) & \cos(\pi/4) \end{pmatrix}'.
\]
The location of the distribution is the origin ($\mu = 0$), and the shape ($\Sigma$) of the distribution is determined by the ellipse given by the two principal axes (one along the 45-degree line, the other along the $-45$-degree line). Figure 1 shows the density function and the corresponding contours $E_c$ for $c = 0.5, 1, 1.5, 2, \ldots$

Figure 1: Bivariate normal density and its contours.

Notice that an ellipse in the plane can represent a bivariate normal distribution. In higher dimensions ($p > 2$), ellipsoids play the same role.
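As a numerical companion to Section 1.2.1 (not part of the original notes), the following sketch recovers the spectral decomposition of the example covariance matrix and draws samples using the representation $X \overset{\mathcal{L}}{=} \Sigma^{1/2}Y + \mu$ from Theorem 3. The sample size, seed, and NumPy/SciPy calls are illustrative choices; the eigenvalues 3 and 1 correspond to principal axes along the $\pm 45$-degree directions, as in Figure 1.

```python
# A minimal numerical sketch of Section 1.2.1 (illustrative only).
import numpy as np
from scipy.stats import multivariate_normal

mu = np.zeros(2)
Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

# Spectral decomposition Sigma = U diag(lambda) U'
lam, U = np.linalg.eigh(Sigma)          # eigh returns eigenvalues in ascending order
lam, U = lam[::-1], U[:, ::-1]          # reorder so that lambda_1 >= lambda_2
print(lam)                              # [3. 1.]: axis lengths proportional to sqrt(3) and 1

# Draw samples using X = mu + Sigma^{1/2} Y with Y ~ N_2(0, I_2)
rng = np.random.default_rng(1)
sqrt_Sigma = U @ np.diag(np.sqrt(lam)) @ U.T     # symmetric square root of Sigma
Y = rng.standard_normal((10_000, 2))
X = mu + Y @ sqrt_Sigma                          # sqrt_Sigma is symmetric, so rows transform correctly

# Sample moments should be close to mu and Sigma (Theorem 3, item 3)
print(X.mean(axis=0))
print(np.cov(X, rowvar=False))

# Density value at a point, matching (1)
print(multivariate_normal(mean=mu, cov=Sigma).pdf([1.0, 1.0]))
```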
1.3 General multivariate normal distribution

The characteristic function of a random vector $X$ is defined as
\[
\varphi_X(t) = E(e^{it'X}), \qquad t \in \mathbb{R}^p.
\]
Note that the characteristic function is $\mathbb{C}$-valued and always exists. We collect some important facts:
1. $\varphi_X(t) = \varphi_Y(t)$ for all $t$ if and only if $X \overset{\mathcal{L}}{=} Y$.
2. If $X$ and $Y$ are independent, then $\varphi_{X+Y}(t) = \varphi_X(t)\,\varphi_Y(t)$.
3. $X_n \Rightarrow X$ if and only if $\varphi_{X_n}(t) \to \varphi_X(t)$ for all $t$.

An important corollary follows from the uniqueness of the characteristic function.

Corollary 4 (Cramér–Wold device). If $X$ is a $p \times 1$ random vector, then its distribution is uniquely determined by the distributions of the linear functions $t'X$, for every $t \in \mathbb{R}^p$.

Corollary 4 paves the way to the definition of the (general) multivariate normal distribution.

Definition 2. A random vector $X \in \mathbb{R}^p$ has a multivariate normal distribution if $t'X$ is univariate normal for all $t \in \mathbb{R}^p$.

The definition says that $X$ is MVN if every projection of $X$ onto a one-dimensional subspace is normal, with the convention that a degenerate distribution $\delta_c$ is normal with variance 0, i.e., $c \sim N(c, 0)$. The definition does not require that $\mathrm{Cov}(X)$ be nonsingular.

Theorem 5. The characteristic function of a multivariate normal distribution with mean $\mu$ and covariance matrix $\Sigma \ge 0$ is, for $t \in \mathbb{R}^p$,
\[
\varphi(t) = \exp\left[it'\mu - \tfrac{1}{2}t'\Sigma t\right].
\]
If $\Sigma > 0$, then the pdf exists and is the same as (1).

In the following, the notation $X \sim N(\mu, \Sigma)$ is valid for a non-negative definite $\Sigma$. However, whenever $\Sigma^{-1}$ appears in a statement, $\Sigma$ is assumed to be positive definite.

Proposition 6. If $X \sim N_p(\mu, \Sigma)$ and $Y = AX + b$ for $A_{(q \times p)}$ and $b_{(q \times 1)}$, then $Y \sim N_q(A\mu + b, A\Sigma A')$.

The next two results concern independence and conditional distributions of normal random vectors. Let $X_1$ and $X_2$ be a partition of $X$ whose dimensions are $r$ and $s$, $r + s = p$, and suppose $\mu$ and $\Sigma$ are partitioned accordingly. That is,
\[
X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}
\sim N_p\!\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},
\begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right).
\]

Proposition 7. The normal random vectors $X_1$ and $X_2$ are independent if and only if $\mathrm{Cov}(X_1, X_2) = \Sigma_{12} = 0$.

Proposition 8. The conditional distribution of $X_1$ given $X_2 = x_2$ is
\[
N_r\!\left(\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2),\; \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\right).
\]

Proof. Consider the new random vectors $X_1^* = X_1 - \Sigma_{12}\Sigma_{22}^{-1}X_2$ and $X_2^* = X_2$, so that
\[
X^* = \begin{pmatrix} X_1^* \\ X_2^* \end{pmatrix} = AX, \qquad
A = \begin{pmatrix} I_r & -\Sigma_{12}\Sigma_{22}^{-1} \\ 0_{(s \times r)} & I_s \end{pmatrix}.
\]
By Proposition 6, $X^*$ is multivariate normal. An inspection of the covariance matrix of $X^*$ shows that $X_1^*$ and $X_2^*$ are independent. The result follows by writing
\[
X_1 = X_1^* + \Sigma_{12}\Sigma_{22}^{-1}X_2,
\]
and noting that the distribution (law) of $X_1$ given $X_2 = x_2$ is $\mathcal{L}(X_1 \mid X_2 = x_2) = \mathcal{L}(X_1^* + \Sigma_{12}\Sigma_{22}^{-1}X_2 \mid X_2 = x_2) = \mathcal{L}(X_1^* + \Sigma_{12}\Sigma_{22}^{-1}x_2 \mid X_2 = x_2)$, which is an MVN of dimension $r$.

1.4 Multivariate Central Limit Theorem

If $X_1, X_2, \ldots \in \mathbb{R}^p$ are i.i.d. with $E(X_i) = \mu$ and $\mathrm{Cov}(X_i) = \Sigma$, then
\[
n^{-1/2}\sum_{j=1}^{n}(X_j - \mu) \Rightarrow N_p(0, \Sigma) \quad \text{as } n \to \infty,
\]
or equivalently,
\[
n^{1/2}(\bar{X}_n - \mu) \Rightarrow N_p(0, \Sigma) \quad \text{as } n \to \infty,
\]
where $\bar{X}_n = n^{-1}\sum_{j=1}^{n} X_j$.

The delta method can be used to obtain the asymptotic normality of $h(\bar{X}_n)$ for a function $h : \mathbb{R}^p \to \mathbb{R}$. In particular, let $\nabla h(x)$ denote the gradient of $h$ at $x$. Using the first two terms of the Taylor series,
\[
h(\bar{X}_n) = h(\mu) + (\nabla h(\mu))'(\bar{X}_n - \mu) + O_p\!\left(\|\bar{X}_n - \mu\|_2^2\right).
\]
Then Slutsky's theorem gives the result
\[
\sqrt{n}\,\bigl(h(\bar{X}_n) - h(\mu)\bigr) = (\nabla h(\mu))'\sqrt{n}\,(\bar{X}_n - \mu) + O_p\!\left(\sqrt{n}\,(\bar{X}_n - \mu)'(\bar{X}_n - \mu)\right)
\Rightarrow (\nabla h(\mu))'\,N_p(0, \Sigma) = N\!\left(0, (\nabla h(\mu))'\Sigma\,\nabla h(\mu)\right) \quad \text{as } n \to \infty.
\]
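The delta-method variance formula can be checked with a short Monte Carlo experiment. The sketch below is not from the notes; the choice $h(x) = x_1 x_2$, the parameter values, and the sample sizes are arbitrary illustrative choices. It compares the empirical variance of $\sqrt{n}\,(h(\bar{X}_n) - h(\mu))$ with $(\nabla h(\mu))'\Sigma\,\nabla h(\mu)$.

```python
# A small Monte Carlo sketch of the delta method in Section 1.4 (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([1.0, 2.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

def h(x):                               # h : R^2 -> R, an arbitrary smooth function
    return x[0] * x[1]

grad_h = np.array([mu[1], mu[0]])       # gradient of h evaluated at mu

n, reps = 500, 5000
stats = np.empty(reps)
for r in range(reps):
    X = rng.multivariate_normal(mu, Sigma, size=n)   # i.i.d. sample X_1, ..., X_n
    xbar = X.mean(axis=0)
    stats[r] = np.sqrt(n) * (h(xbar) - h(mu))

# Empirical variance vs. the delta-method variance (grad h)' Sigma (grad h) = 8
print(stats.var(), grad_h @ Sigma @ grad_h)
```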
1.5 Quadratic forms in normal random vectors

Let $X \sim N_p(\mu, \Sigma)$. A quadratic form in $X$ is a random variable of the form
\[
Y = X'AX = \sum_{i=1}^{p}\sum_{j=1}^{p} X_i a_{ij} X_j,
\]
where $A$ is a $p \times p$ symmetric matrix and $X_i$ is the $i$th element of $X$. We are interested in the distribution of quadratic forms and the conditions under which two quadratic forms are independent.

Example 2. A special case: if $X \sim N_p(0, I_p)$ and $A = I_p$, then
\[
Y = X'AX = X'X = \sum_{i=1}^{p} X_i^2 \sim \chi^2(p).
\]

Fact 1. Recall the following:
1. A $p \times p$ matrix $A$ is idempotent if $A^2 = A$.
2. If $A$ is symmetric, then $A = \Gamma'\Lambda\Gamma$, where $\Lambda = \mathrm{diag}(\lambda_i)$ and $\Gamma$ is orthogonal.
3. If $A$ is symmetric and idempotent, then
   (a) its eigenvalues are either 0 or 1,
   (b) $\mathrm{rank}(A) = \#\{\text{nonzero eigenvalues}\} = \mathrm{trace}(A)$.

Theorem 9. Let $X \sim N_p(0, \sigma^2 I)$ and let $A$ be a $p \times p$ symmetric matrix. Then
\[
Y = \frac{X'AX}{\sigma^2} \sim \chi^2(m)
\]
if and only if $A$ is idempotent of rank $m < p$.

Corollary 10. Let $X \sim N_p(0, \Sigma)$ and let $A$ be a $p \times p$ symmetric matrix. Then $Y = X'AX \sim \chi^2(m)$ if and only if either (i) $A\Sigma$ is idempotent of rank $m$ or (ii) $\Sigma A$ is idempotent of rank $m$.

Example 3. If $X \sim N_p(\mu, \Sigma)$, then $(X - \mu)'\Sigma^{-1}(X - \mu) \sim \chi^2(p)$.

Theorem 11. Let $X \sim N_p(0, I)$, let $A$ be a $p \times p$ symmetric matrix, and let $B$ be a $k \times p$ matrix.
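The chi-square results of this section are easy to probe numerically. The sketch below is not part of the original notes; the dimension, covariance matrix, and projection matrix are arbitrary illustrative choices. It checks Example 3 (with a general $\Sigma$) and Theorem 9 (with $\sigma^2 = 1$ and a rank-2 projection matrix) against the claimed chi-square distributions.

```python
# A numerical sketch of Example 3 and Theorem 9 (illustrative only).
import numpy as np
from scipy.stats import chi2, kstest

rng = np.random.default_rng(42)

p = 3
mu = np.array([1.0, -1.0, 0.5])
M = rng.standard_normal((p, p))
Sigma = M @ M.T + p * np.eye(p)                      # an arbitrary positive definite covariance

# Example 3: U = (X - mu)' Sigma^{-1} (X - mu) ~ chi^2(p)
X = rng.multivariate_normal(mu, Sigma, size=20_000)
Sinv = np.linalg.inv(Sigma)
U = np.einsum('ij,jk,ik->i', X - mu, Sinv, X - mu)
print(kstest(U, chi2(df=p).cdf))                     # large p-value: consistent with chi^2(p)

# Theorem 9 with sigma^2 = 1: A = projection onto a 2-dim subspace (symmetric idempotent, rank 2)
Z = rng.standard_normal((20_000, p))                 # Z ~ N_p(0, I_p)
B = rng.standard_normal((p, 2))
A = B @ np.linalg.inv(B.T @ B) @ B.T                 # symmetric idempotent, rank 2 < p
Q = np.einsum('ij,jk,ik->i', Z, A, Z)
print(kstest(Q, chi2(df=2).cdf))                     # consistent with chi^2(2)
```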