Proof of Conjectures on the Standard Deviation, Skewness and Kurtosis of the Shifted Gompertz Distribution

Authors: Fernando Jiménez Torres
Dpto. de Métodos Estadísticos, Universidad de Zaragoza, María de Luna 3, Campus Río Ebro, 50018, Zaragoza, Spain
fjimenez@unizar.es

Abstract: Three conjectures on the standard deviation, skewness and kurtosis of the shifted Gompertz distribution, as the shape parameter increases to $+\infty$, are proved. The proofs use the exponential integral function and polygamma functions. In addition, an explicit expression for the $i$th moment of this probabilistic model is obtained. These results make it possible to place the shifted Gompertz distribution in the Skewness–Kurtosis diagram, which helps in deciding whether to include the shifted Gompertz distribution among the candidate models for fitting data. Their usefulness is illustrated by fitting a real malaria data set, using the maximum likelihood method to estimate the parameters of the shifted Gompertz distribution and of some classical models. Goodness-of-fit measures are used to compare their performance.

Key-Words: Shifted Gompertz distribution; Moment generating function; Skewness–Kurtosis diagram; Polygamma functions.

AMS Subject Classification: 60E05, 62E15, 26B20.

1. INTRODUCTION

One of the most important references on models of the timing of adoption of innovations is the model of Bass [1]. Starting from this model, Bemmaor [3] proposed that, at the individual level, the time to adoption of a new product in a market is randomly distributed according to the shifted Gompertz distribution. More recently, Lover et al. [6] showed that modeling studies of the time to first relapse in human malaria infections in the New World tropical region can support the shifted Gompertz distribution.
Some statistical properties of the shifted Gompertz distribution were obtained in Bemmaor [3]. Jiménez Torres and Jodrá [8] gave explicit expressions for the first and second moments, derived a closed-form expression for the quantile function, and considered the limit distributions of extreme order statistics. Jiménez Torres [7] used the method of least squares, the method of maximum likelihood and the method of moments to estimate the parameters of the shifted Gompertz distribution. In this paper we extend and complete the known statistical properties of the shifted Gompertz distribution by proving the three conjectures presented in Jiménez Torres and Jodrá [8] and by obtaining a general expression for the moments.

The Gompertz distribution has been given in different forms in the literature. For a random variable $Z$ with the cumulative distribution function (cdf) found in Bemmaor [3],
\[
F_Z(z) = P(Z \le z) = e^{-\alpha e^{-\beta z}}, \qquad -\infty < z < +\infty,
\]
the standard deviation, skewness and excess kurtosis are equal to $\pi/(\sqrt{6}\,\beta)$, $12\sqrt{6}\,\zeta(3)/\pi^3$ and $2.4$, respectively, where $\zeta(\cdot)$ denotes the Riemann zeta function. The skewness of a random variable $X$ with mean $\mu$ and standard deviation $\sigma$ is defined by $\gamma_1 = E[(X-\mu)^3]/\sigma^3$ and is a measure of the asymmetry of the probability distribution. The excess kurtosis of $X$ is given by $\gamma_2 = E[(X-\mu)^4]/\sigma^4 - 3$ and describes the shape of the tails of the probability distribution.

Let $X$ be a random variable having the shifted Gompertz distribution with parameters $\alpha$ and $\beta$, where $\alpha > 0$ is a shape parameter and $\beta > 0$ is a scale parameter. The probability density function of $X$ is
\[
f_X(x) = \beta e^{-(\beta x + \alpha e^{-\beta x})}\left(1 + \alpha\left(1 - e^{-\beta x}\right)\right), \qquad x > 0. \tag{1.1}
\]
This model can be characterized as the maximum of two independent random variables, one with a Gompertz distribution (parameters $\alpha > 0$ and $\beta > 0$) and the other with an exponential distribution (parameter $\beta > 0$).
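The characterization as a maximum can be checked by simulation. The following is a minimal sketch, not part of the original paper: the function names (`sg_pdf`, `sg_cdf`, `sample_max`) and the parameter values are illustrative assumptions. It assumes that the cdf of the maximum is the product of the exponential cdf $1 - e^{-\beta x}$ and the Gompertz cdf $e^{-\alpha e^{-\beta x}}$, compares that product with the empirical cdf of simulated maxima, and checks that its derivative matches the density (1.1).

```python
import bisect
import math
import random

def sg_pdf(x, alpha, beta):
    """Density (1.1) of the shifted Gompertz distribution, for x > 0."""
    u = math.exp(-beta * x)
    return beta * u * math.exp(-alpha * u) * (1.0 + alpha * (1.0 - u))

def sg_cdf(x, alpha, beta):
    """Exponential cdf times Gompertz cdf: the cdf of the maximum, for x > 0."""
    u = math.exp(-beta * x)
    return (1.0 - u) * math.exp(-alpha * u)

def sample_max(alpha, beta, rng):
    """One draw of max(Z, E), Z Gompertz(alpha, beta), E exponential(beta)."""
    p = rng.random()
    while p == 0.0:  # avoid log(0) in the inverse cdf
        p = rng.random()
    z = -math.log(-math.log(p) / alpha) / beta  # inverts exp(-alpha*exp(-beta*z))
    e = rng.expovariate(beta)                   # exponential with mean 1/beta
    return max(z, e)

alpha, beta = 2.0, 1.5  # illustrative values, not taken from the paper
rng = random.Random(0)
draws = sorted(sample_max(alpha, beta, rng) for _ in range(100_000))
grid = [0.1 * k for k in range(1, 51)]

# Largest gap between the empirical cdf of the maxima and the product cdf.
max_cdf_gap = max(
    abs(bisect.bisect_right(draws, x) / len(draws) - sg_cdf(x, alpha, beta))
    for x in grid
)

# Largest gap between a central difference of the cdf and the density (1.1).
h = 1e-5
max_pdf_gap = max(
    abs((sg_cdf(x + h, alpha, beta) - sg_cdf(x - h, alpha, beta)) / (2 * h)
        - sg_pdf(x, alpha, beta))
    for x in grid
)

print("cdf gap:", max_cdf_gap, "pdf gap:", max_pdf_gap)
```

Both gaps should be small: the first is limited by sampling noise, the second by the finite-difference step.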
From (1.1), given that $\lim_{\alpha \to 0} f_X(x) = \beta e^{-\beta x}$, the shifted Gompertz distribution approaches an exponential distribution with mean $1/\beta$ as the parameter $\alpha$ decreases to 0. So, for a fixed value of $\beta$, $\lim_{\alpha \to 0} \sigma = 1/\beta$, where $\sigma$ is the standard deviation of $X$. For the shifted Gompertz distribution we have $\lim_{\alpha \to 0} \gamma_1 = 2$ and $\lim_{\alpha \to 0} \gamma_2 = 6$, which are the skewness and excess kurtosis of the exponential distribution. If the shape parameter $\alpha$ increases to infinity, the asymptotic behavior of the shifted Gompertz distribution is nontrivial and the corresponding limits require analytic tools for their calculation. Based on numerical evidence shown in Jiménez Torres and Jodrá [8], the following three conjectures were presented:

Conjecture 1: $\displaystyle \lim_{\alpha \to +\infty} \sigma = \frac{\pi}{\sqrt{6}\,\beta}$

Conjecture 2: $\displaystyle \lim_{\alpha \to +\infty} \gamma_1 = \frac{12\sqrt{6}\,\zeta(3)}{\pi^3}$

Conjecture 3: $\displaystyle \lim_{\alpha \to +\infty} \gamma_2 = 2.4$

The remainder of this note is organized as follows. In Section 2, we prove Conjecture 1. In Section 3, we provide an explicit expression for the $i$th moment of the shifted Gompertz distribution. In Section 4 and Section 5, we prove Conjecture 2 and Conjecture 3, respectively. In Section 6 we show the importance of these results in the choice of the shifted Gompertz distribution among the models to fit a real data set and, finally, the main conclusions are presented in Section 7.

2. PROOF OF CONJECTURE 1

In Jiménez Torres and Jodrá [8] explicit expressions for the moments of orders 1 and 2 of $X$ were obtained. The first moment of $X$, or mean $\mu$ of $X$, is
\[
E[X] = \frac{1}{\beta}\left(\gamma + \log(\alpha) + E_1(\alpha) + \frac{1 - e^{-\alpha}}{\alpha}\right), \tag{2.1}
\]
where $\gamma \approx 0.57721$ is the Euler–Mascheroni constant and $E_1(x)$ is the exponential integral function, defined by $E_1(x) = \int_x^{+\infty} \frac{e^{-t}}{t}\,dt$, $x > 0$. The second moment of $X$ is
\[
E[X^2] = \frac{2}{\alpha\beta^2}\left(\gamma + \log(\alpha) + E_1(\alpha) + {}_3F_3[1,1,1;2,2,2;-\alpha]\,\alpha^2\right), \tag{2.2}
\]
where ${}_3F_3[1,1,1;2,2,2;-\alpha] = \sum_{k=1}^{+\infty} \frac{(-\alpha)^{k-1}}{k!\,k^2}$ is a generalized hypergeometric function.

Moreover, we need the following expression (see Geller and Ng [5]) for $a > 0$ and $b > 0$:
\[
\int_b^{+\infty} \frac{E_1(ax)}{x}\,dx = \frac{1}{2}\left((\gamma + \log(ab))^2 + \zeta(2)\right) + \sum_{k=1}^{+\infty} \frac{(-ab)^k}{k!\,k^2}, \tag{2.3}
\]
where $\zeta(2) = \pi^2/6$. In particular, using (2.3) with $a = 1$ and $b = \alpha$, we obtain
\[
\int_\alpha^{+\infty} \frac{E_1(x)}{x}\,dx = \frac{1}{2}\left((\gamma + \log(\alpha))^2 + \zeta(2)\right) + \sum_{k=1}^{+\infty} \frac{(-\alpha)^k}{k!\,k^2}, \tag{2.4}
\]
and in the next theorem we prove Conjecture 1.

Theorem 2.1. The limit of the standard deviation, $\sigma$, of the shifted Gompertz distribution $X$ as the shape parameter $\alpha$ increases to $+\infty$ is finite and its value is
\[
\lim_{\alpha \to +\infty} \sigma = \frac{\pi}{\sqrt{6}\,\beta}. \tag{2.5}
\]

Proof: The variance of a random variable $X$ is $\sigma^2 = E[X^2] - (E[X])^2$. By (2.4), the term ${}_3F_3[1,1,1;2,2,2;-\alpha]\,\alpha^2 = -\alpha \sum_{k=1}^{+\infty} \frac{(-\alpha)^k}{k!\,k^2}$ in (2.2) can be written in terms of the integral of $E_1(x)/x$. Hence, from (2.1), (2.2) and (2.4) we have
\[
\begin{aligned}
\sigma^2 &= \frac{2}{\alpha\beta^2}\left[\gamma + \log(\alpha) + E_1(\alpha) - \alpha\int_\alpha^{+\infty} \frac{E_1(x)}{x}\,dx + \frac{\alpha}{2}\left((\gamma + \log(\alpha))^2 + \zeta(2)\right)\right] \\
&\quad - \frac{1}{\beta^2}\left(\gamma + \log(\alpha) + E_1(\alpha) + \frac{1 - e^{-\alpha}}{\alpha}\right)^2 \\
&= \frac{\zeta(2)}{\beta^2} + R(\alpha),
\end{aligned}
\tag{2.6}
\]
where
\[
\begin{aligned}
R(\alpha) &= \frac{2}{\alpha\beta^2}\left(\gamma + \log(\alpha) + E_1(\alpha)\right) - \frac{2}{\beta^2}\int_\alpha^{+\infty} \frac{E_1(x)}{x}\,dx \\
&\quad + \frac{1}{\beta^2}\left(\gamma + \log(\alpha)\right)^2 - \frac{1}{\beta^2}\left(\gamma + \log(\alpha) + E_1(\alpha) + \frac{1 - e^{-\alpha}}{\alpha}\right)^2.
\end{aligned}
\tag{2.7}
\]
So, $\lim_{\alpha \to +\infty} \sigma^2 = \zeta(2)/\beta^2 + \lim_{\alpha \to +\infty} R(\alpha)$. Now, in (2.7) we take the limit as $\alpha$ increases to $+\infty$, taking into account the following limits related to the exponential integral function (see Geller and Ng [5]):
\[
\lim_{x \to +\infty} \left(\log(x)\,E_1(x)\right) = \lim_{x \to +\infty} \left(e^{-x} E_1(x)\right) = \lim_{x \to +\infty} \left(x^p E_1(x)\right) = 0. \tag{2.8}
\]
So, $\lim_{\alpha \to +\infty} R(\alpha) = 0$, and Conjecture 1 is proved.

To prove Conjecture 2 and Conjecture 3 we need expressions for the moments of orders 3 and 4, respectively. In Section 3 we are more ambitious and obtain a general expression for the moment of order $i$ of the shifted Gompertz distribution.

3. MOMENT OF ORDER i OF X

The $i$th moment of $X$, defined by $E[X^i] = \int_0^{+\infty} x^i f_X(x)\,dx$, $i = 1, 2, \ldots$, where $f_X(x)$ is given in (1.1), does not seem to have a closed-form expression in terms of elementary functions, but we can find a series expansion.
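The moment integral above can also be evaluated by quadrature, which gives an independent numerical check of the limits stated in the three conjectures. The sketch below is an illustration under stated assumptions, not the author's method: it uses a composite Simpson rule on a truncated interval, the helper names (`simpson`, `shape_measures`) are hypothetical, and the value used for $\zeta(3)$ (Apéry's constant) is hard-coded.

```python
import math

APERY = 1.2020569031595942  # zeta(3), hard-coded (assumed constant)

def sg_pdf(x, alpha, beta):
    """Density (1.1) of the shifted Gompertz distribution, for x > 0."""
    u = math.exp(-beta * x)
    return beta * u * math.exp(-alpha * u) * (1.0 + alpha * (1.0 - u))

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with n (even) subintervals on [a, b]."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def shape_measures(alpha, beta):
    """Return (sigma, skewness, excess kurtosis) computed by quadrature."""
    # Truncation point: the density is negligible beyond it (assumption).
    upper = (max(math.log(alpha), 0.0) + 50.0) / beta

    def moment(i, center=0.0):
        return simpson(lambda x: (x - center) ** i * sg_pdf(x, alpha, beta),
                       0.0, upper)

    mu = moment(1)
    var = moment(2, mu)
    sigma = math.sqrt(var)
    gamma1 = moment(3, mu) / sigma ** 3
    gamma2 = moment(4, mu) / var ** 2 - 3.0
    return sigma, gamma1, gamma2

beta = 2.0  # illustrative scale parameter
for alpha in (1e-6, 1.0, 1000.0):
    print(alpha, shape_measures(alpha, beta))

# Limits stated in the text, for comparison with the alpha = 1000 row.
print("limits:", math.pi / (math.sqrt(6.0) * beta),
      12.0 * math.sqrt(6.0) * APERY / math.pi ** 3, 2.4)
```

For very small $\alpha$ the three values should be close to $1/\beta$, 2 and 6, and for large $\alpha$ they should approach the conjectured limits.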
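Let $\gamma(a, b)$ be the lower incomplete gamma function defined for any $a > 0$ and $b > 0$ by
\[
\gamma(a, b) = \int_0^b v^{a-1} e^{-v}\,dv, \tag{3.1}
\]
and let $M_X(t)$ be the moment generating function of $X$, i.e., $M_X(t) = E\left[e^{tX}\right]$. In the next theorem we obtain an expression for this function.

Theorem 3.1. The moment generating function of the shifted Gompertz distribution $X$ for $|t| < \beta$ is
\[
M_X(t) = \alpha^{t/\beta - 1}\,(\alpha + t/\beta)\,\gamma(1 - t/\beta, \alpha) + e^{-\alpha}. \tag{3.2}
\]

Proof: By definition, we have
\[
\begin{aligned}
E\left[e^{tX}\right] &= \int_0^{+\infty} e^{tx} f_X(x)\,dx = \beta \int_0^{+\infty} e^{tx - \beta x - \alpha e^{-\beta x}}\left(1 + \alpha\left(1 - e^{-\beta x}\right)\right)dx \\
&= (1 + \alpha)\beta \int_0^{+\infty} e^{tx - \beta x - \alpha e^{-\beta x}}\,dx - \alpha\beta \int_0^{+\infty} e^{tx - 2\beta x - \alpha e^{-\beta x}}\,dx.
\end{aligned}
\tag{3.3}
\]
The change of variable $v = \alpha e^{-\beta x}$ in (3.3) provides
\[
\begin{aligned}
E\left[e^{tX}\right] &= \alpha^{t/\beta - 1}(1 + \alpha) \int_0^{\alpha} e^{-v} v^{-t/\beta}\,dv - \alpha^{t/\beta - 1} \int_0^{\alpha} e^{-v} v^{1 - t/\beta}\,dv \\
&= \alpha^{t/\beta - 1}\left((1 + \alpha)\,\gamma(1 - t/\beta, \alpha) - \gamma(2 - t/\beta, \alpha)\right).
\end{aligned}
\tag{3.4}
\]
Integrating by parts in (3.1) yields the recurrence relation $\gamma(a + 1, b) = a\,\gamma(a, b) - b^a e^{-b}$.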
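Formula (3.2) and the recurrence relation can be checked numerically. The following sketch is illustrative only and makes several assumptions: the lower incomplete gamma function is evaluated with the standard power series $\gamma(a, b) = b^a e^{-b} \sum_{n \ge 0} b^n / (a(a+1)\cdots(a+n))$ truncated after a fixed number of terms, the defining integral of $E[e^{tX}]$ is approximated by a composite Simpson rule on a truncated interval, and all function names and parameter values are hypothetical.

```python
import math

def lower_gamma(a, b, terms=200):
    """Lower incomplete gamma (3.1) via a truncated power series (a > 0, b > 0)."""
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= b / (a + n)
        total += term
    return b ** a * math.exp(-b) * total

def mgf_closed_form(t, alpha, beta):
    """Right-hand side of (3.2); requires t < beta so that 1 - t/beta > 0."""
    s = t / beta
    return (alpha ** (s - 1.0) * (alpha + s) * lower_gamma(1.0 - s, alpha)
            + math.exp(-alpha))

def mgf_numeric(t, alpha, beta, upper=60.0, n=40000):
    """Simpson approximation of the integral of exp(t*x) * f_X(x) on [0, upper]."""
    def g(x):
        u = math.exp(-beta * x)
        return (math.exp(t * x) * beta * u * math.exp(-alpha * u)
                * (1.0 + alpha * (1.0 - u)))
    h = upper / n
    total = g(0.0) + g(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0

# Illustrative (t, alpha, beta) triples with |t| < beta.
cases = [(0.8, 3.0, 2.0), (-1.0, 3.0, 2.0), (0.0, 0.5, 1.0)]
mgf_gaps = [abs(mgf_closed_form(t, a, b) - mgf_numeric(t, a, b))
            for t, a, b in cases]

# Recurrence gamma(a+1, b) = a*gamma(a, b) - b^a * exp(-b) at a sample point.
a0, b0 = 0.6, 3.0
recurrence_gap = abs(lower_gamma(a0 + 1.0, b0)
                     - (a0 * lower_gamma(a0, b0) - b0 ** a0 * math.exp(-b0)))

print("MGF gaps:", mgf_gaps, "recurrence gap:", recurrence_gap)
```

At $t = 0$ the closed form reduces to $(1 - e^{-\alpha}) + e^{-\alpha} = 1$, as expected for a moment generating function.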