Location-Scale Distributions

Total Page:16

File Type:pdf, Size:1020Kb

Location-Scale Distributions Location–Scale Distributions Linear Estimation and Probability Plotting Using MATLAB Horst Rinne Copyright: Prof. em. Dr. Horst Rinne Department of Economics and Management Science Justus–Liebig–University, Giessen, Germany Contents Preface VII List of Figures IX List of Tables XII 1 The family of location–scale distributions 1 1.1 Properties of location–scale distributions . 1 1.2 Genuine location–scale distributions — A short listing . 5 1.3 Distributions transformable to location–scale type . 11 2 Order statistics 18 2.1 Distributional concepts . 18 2.2 Moments of order statistics . 21 2.2.1 Definitions and basic formulas . 21 2.2.2 Identities, recurrence relations and approximations . 26 2.3 Functions of order statistics . 32 3 Statistical graphics 36 3.1 Some historical remarks . 36 3.2 The role of graphical methods in statistics . 38 3.2.1 Graphical versus numerical techniques . 38 3.2.2 Manipulation with graphs and graphical perception . 39 3.2.3 Graphical displays in statistics . 41 3.3 Distribution assessment by graphs . 43 3.3.1 PP–plots and QQ–plots . 43 3.3.2 Probability paper and plotting positions . 47 3.3.3 Hazard plot . 54 3.3.4 TTT–plot . 56 4 Linear estimation — Theory and methods 59 4.1 Types of sampling data . 59 IV Contents 4.2 Estimators based on moments of order statistics . 63 4.2.1 GLS estimators . 64 4.2.1.1 GLS for a general location–scale distribution . 65 4.2.1.2 GLS for a symmetric location–scale distribution . 71 4.2.1.3 GLS and censored samples . 74 4.2.2 Approximations to GLS . 76 4.2.2.1 B approximated by a diagonal matrix . 76 4.2.2.2 B approximated by an identity matrix . 82 4.2.2.3 BLOM’s estimator . 85 4.2.2.4 DOWNTON’s estimator . 90 4.2.3 Approximations to GLS with approximated moments of order statistics . 91 4.3 Estimators based on empirical percentiles . 97 4.3.1 Grouped data . 97 4.3.2 Randomly and multiply censored data . 101 4.4 Goodness–of–fit . 104 5 Probability plotting and linear estimation — Applications 105 5.1 What will be given for each distribution? . 105 5.2 The case of genuine location–scale distributions . 106 5.2.1 Arc–sine distribution — X ∼ AS(a, b) . 106 5.2.2 CAUCHY distribution — X ∼ CA(a, b) . 109 5.2.3 Cosine distributions . 113 5.2.3.1 Ordinary cosine distribution — X ∼ COO(a, b) . 114 5.2.3.2 Raised cosine distribution — X ∼ COR(a, b) . 116 5.2.4 Exponential distribution — X ∼ EX(a, b) . 118 5.2.5 Extreme value distributions of type I . 122 5.2.5.1 Maximum distribution (GUMBEL distribution) — X ∼ EMX1(a, b) . 124 5.2.5.2 Minimum distribution (Log–WEIBULL)— X ∼ EMN1(a, b) . 128 5.2.6 Half–distributions . 130 5.2.6.1 Half–CAUCHY distribution — X ∼ HC(a, b) . 131 Contents V 5.2.6.2 Half–logistic distribution — X ∼ HL(a, b) . 133 5.2.6.3 Half–normal distribution — X ∼ HN(a, b) . 140 5.2.7 Hyperbolic secant distribution — X ∼ HS(a, b) . 144 5.2.8 LAPLACE distribution — X ∼ LA(a, b) . 147 5.2.9 Logistic distribution — X ∼ LO(a, b) . 152 5.2.10 MAXWELL–BOLTZMANN distribution — X ∼ MB(a, b) . 157 5.2.11 Normal distribution — X ∼ NO(a, b) . 161 5.2.12 Parabolic distributions of order 2 . 167 5.2.12.1 U–shaped parabolic distribution — X ∼ PAU(a, b) . 167 5.2.12.2 Inverted U–shaped parabolic distribution — X ∼ PAI(a, b) . 169 5.2.13 RAYLEIGH distribution — X ∼ RA(a, b) . 171 5.2.14 Reflected exponential distribution — X ∼ RE(a, b) . 175 5.2.15 Semi–elliptical distribution — X ∼ SE(a, b) . 178 5.2.16 TEISSIER distribution — X ∼ TE(a, b) . 181 5.2.17 Triangular distributions . 183 5.2.17.1 Symmetric version — X ∼ TS(a, b) . 184 5.2.17.2 Right–angled and negatively skew version — X ∼ TN(a, b) . 187 5.2.17.3 Right–angled and positively skew version — X ∼ TP (a, b) . 190 5.2.18 Uniform or rectangular distribution — X ∼ UN(a, b) . 192 5.2.19 U–shaped beta distribution — X ∼ UB(a, b) . 197 5.2.20 V–shaped distribution — X ∼ VS(a, b) . 200 5.3 The case of ln–transformable distributions . 203 5.3.1 Estimation of the shift parameter . 203 5.3.2 Extreme value distributions of types II and III . 206 5.3.2.1 Type II maximum or inverse WEIBULL distribution — X ∼ EMX2(a, b, c) . 207 5.3.2.2 Type III maximum or reflected WEIBULL distribution — X ∼ EMX3(a, b, c) . 209 5.3.2.3 Type II minimum or FRECHET´ distribution — X ∼ EMN2(a, b, c) . 212 VI Contents 5.3.2.4 Type III minimum or WEIBULL distribution — X ∼ EMN3(a, b, c) . 214 5.3.3 Lognormal distributions . 219 5.3.3.1 Lognormal distribution with lower threshold — X ∼ LNL(a, ea,eb) . 219 5.3.3.2 Lognormal distribution with upper threshold — X ∼ LNU(a, ea,eb) . 223 5.3.4 PARETO distribution — X ∼ PA(a, b, c) . 225 5.3.5 Power–function distribution — X ∼ PO(a, b, c) . 229 6 The program LEPP 232 6.1 Structure of LEPP . 232 6.2 Input to LEPP . 234 6.3 Directions for using LEPP . 235 Mathematical and statistical notations 237 Bibliography 245 Author Index 255 Subject Index 257 Preface VII Preface Statistical distributions can be grouped into families or systems. Such groupings are de- scribed in JOHNSON/KOTZ/KEMP (1992, Chapter 2), JOHNSON/KOTZ/BALAKRISHNAN (1994, Chapter 12) or PATEL(KAPADIA/OWEN (1976, Chapter 4). The most popular families are those of PEARSON,JOHNSON and BURR, the exponential, the stable and the infinitely divisible distributions or those with a monotone likelihood ratio or with a mono- tone failure rate. All these categories have attracted the attention of statisticians and they are fully discussed in the statistical literature. But there is one family, the location–scale family, which hitherto has not been discussed in greater detail. To my knowledge this book is the first comprehensive monograph on one–dimensional continuous location–scale dis- tributions and it is organized as follows. Chapter 1 goes into the details of location–scale distributions and gives their properties along with a short list of those distributions which are genuinely location–scale and which — after a suitable transformation of its variable — become member of this class. We will only consider the ln–transformation. Location–scale distributions easily lend them- selves to an assessment by graphical methods. On a suitably chosen probability paper the cumulative distribution function of the universe gives a straight line and the cumulative distribution of a sample only deviates by chance from a straight line. Thus we can realize an informal goodness–of–fit test. When we fit the straight line free–hand or by eye we may read off the location and scale parameters as percentiles. Another and objective method is to find the straight line on probability paper by a least–squares technique. Then, the estimates of the location and scale parameters will be the parameters of that straight line. Because probability plotting heavily relies on ordered observations Chapter 2 gives — as a prerequisite — a detailed representation of the theory of order statistics. Probability plotting is a graphical assessment of statistical distributions. To see how this kind of graphics fits into the framework of statistical graphics we have written Chapter 3. A first core chapter is Chapter 4. It presents the theory and the methods of linear estimating the location and scale parameters. The methods to be implemented depend on the type of sample, i.e. grouped or non–grouped, censored or uncensored, the type of censoring and also whether the moments of the order statistics are easily calculable or are readily available in tabulated form or not. In the latter case we will give various approximations to the optimal method of general least–squares. Applications of the exact or approximate linear estimation procedures to a great number of location–scale distributions will be presented in Chapter 5, which is central to this book. For each of 35 distributions we give a warrant of arrest enumerating the characteristics, the underlying stochastic model and the fields of application together with the pertinent probability paper and the estimators of the location parameter and the scale parameter. Distributions which have to be transformed to location–scale type sometimes have a third parameter which has to be pre–estimated before applying probability plotting and the lin- ear estimation procedure. We will show how to estimate this third parameter. VIII Preface The calculations and graphics of Chapter 5 have been done using MATLAB,1 Version 7.4 (R2007a). The accompanying CD contains the MATLAB script M–file LEPP and all the function–files to be used by the reader when he wants to do inference on location–scale distributions. Hints how to handle the menu–driven program LEPP and how to organize the data input will be given in Chapter 6 as well as in the comments in the files on the CD. Dr. HORST RINNE, Professor Emeritus of Statistics and Econometrics Department of Economics and Management Science Justus–Liebig–University, Giessen, Germany 1 MATLAB R is a registered trade–mark of ‘The MathWorks, Inc.’ List of Figures 2/1 Structure of the variance–covariance matrix of order statistics from a distri- bution symmetric around zero . 24 3/1 Explanation of the QQ–plot and the PP–plot .
Recommended publications
  • The Poisson Burr X Inverse Rayleigh Distribution and Its Applications
    Journal of Data Science,18(1). P. 56 – 77,2020 DOI:10.6339/JDS.202001_18(1).0003 The Poisson Burr X Inverse Rayleigh Distribution And Its Applications Rania H. M. Abdelkhalek* Department of Statistics, Mathematics and Insurance, Benha University, Egypt. ABSTRACT A new flexible extension of the inverse Rayleigh model is proposed and studied. Some of its fundamental statistical properties are derived. We assessed the performance of the maximum likelihood method via a simulation study. The importance of the new model is shown via three applications to real data sets. The new model is much better than other important competitive models. Keywords: Inverse Rayleigh; Poisson; Simulation; Modeling. * [email protected] Rania H. M. Abdelkhalek 57 1. Introduction and physical motivation The well-known inverse Rayleigh (IR) model is considered as a distribution for a life time random variable (r.v.). The IR distribution has many applications in the area of reliability studies. Voda (1972) proved that the distribution of lifetimes of several types of experimental (Exp) units can be approximated by the IR distribution and studied some properties of the maximum likelihood estimation (MLE) of the its parameter. Mukerjee and Saran (1984) studied the failure rate of an IR distribution. Aslam and Jun (2009) introduced a group acceptance sampling plans for truncated life tests based on the IR. Soliman et al. (2010) studied the Bayesian and non- Bayesian estimation of the parameter of the IR model Dey (2012) mentioned that the IR model has also been used as a failure time distribution although the variance and higher order moments does not exist for it.
    [Show full text]
  • New Proposed Length-Biased Weighted Exponential and Rayleigh Distribution with Application
    CORE Metadata, citation and similar papers at core.ac.uk Provided by International Institute for Science, Technology and Education (IISTE): E-Journals Mathematical Theory and Modeling www.iiste.org ISSN 2224-5804 (Paper) ISSN 2225-0522 (Online) Vol.4, No.7, 2014 New Proposed Length-Biased Weighted Exponential and Rayleigh Distribution with Application Kareema Abed Al-kadim1 Nabeel Ali Hussein 2 1,2.College of Education of Pure Sciences, University of Babylon/Hilla 1.E-mail of the Corresponding author: [email protected] 2.E-mail: [email protected] Abstract The concept of length-biased distribution can be employed in development of proper models for lifetime data. Length-biased distribution is a special case of the more general form known as weighted distribution. In this paper we introduce a new class of length-biased of weighted exponential and Rayleigh distributions(LBW1E1D), (LBWRD).This paper surveys some of the possible uses of Length - biased distribution We study the some of its statistical properties with application of these new distribution. Keywords: length- biased weighted Rayleigh distribution, length- biased weighted exponential distribution, maximum likelihood estimation. 1. Introduction Fisher (1934) [1] introduced the concept of weighted distributions Rao (1965) [2]developed this concept in general terms by in connection with modeling statistical data where the usual practi ce of using standard distributions were not found to be appropriate, in which various situations that can be modeled by weighted distributions, where non-observe ability of some events or damage caused to the original observation resulting in a reduced value, or in a sampling procedure which gives unequal chances to the units in the original That means the recorded observations cannot be considered as a original random sample.
    [Show full text]
  • Suitability of Different Probability Distributions for Performing Schedule Risk Simulations in Project Management
    2016 Proceedings of PICMET '16: Technology Management for Social Innovation Suitability of Different Probability Distributions for Performing Schedule Risk Simulations in Project Management J. Krige Visser Department of Engineering and Technology Management, University of Pretoria, Pretoria, South Africa Abstract--Project managers are often confronted with the The Project Evaluation and Review Technique (PERT), in question on what is the probability of finishing a project within conjunction with the Critical Path Method (CPM), were budget or finishing a project on time. One method or tool that is developed in the 1950’s to address the uncertainty in project useful in answering these questions at various stages of a project duration for complex projects [11], [13]. The expected value is to develop a Monte Carlo simulation for the cost or duration or mean value for each activity of the project network was of the project and to update and repeat the simulations with actual data as the project progresses. The PERT method became calculated by applying the beta distribution and three popular in the 1950’s to express the uncertainty in the duration estimates for the duration of the activity. The total project of activities. Many other distributions are available for use in duration was determined by adding all the duration values of cost or schedule simulations. the activities on the critical path. The Monte Carlo simulation This paper discusses the results of a project to investigate the (MCS) provides a distribution for the total project duration output of schedule simulations when different distributions, e.g. and is therefore more useful as a method or tool for decision triangular, normal, lognormal or betaPert, are used to express making.
    [Show full text]
  • Estimation of Common Location and Scale Parameters in Nonregular Cases Ahmad Razmpour Iowa State University
    Iowa State University Capstones, Theses and Retrospective Theses and Dissertations Dissertations 1982 Estimation of common location and scale parameters in nonregular cases Ahmad Razmpour Iowa State University Follow this and additional works at: https://lib.dr.iastate.edu/rtd Part of the Statistics and Probability Commons Recommended Citation Razmpour, Ahmad, "Estimation of common location and scale parameters in nonregular cases " (1982). Retrospective Theses and Dissertations. 7528. https://lib.dr.iastate.edu/rtd/7528 This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Retrospective Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. INFORMATION TO USERS This reproduction was made from a copy of a document sent to us for microfilming. While the most advanced technology has been used to photograph and reproduce this document, the quality of the reproduction is heavily dependent upon the quality of the material submitted. The following explanation of techniques is provided to help clarify markings or notations which may appear on this reproduction. 1. The sign or "target" for pages apparently lacking from the document photographed is "Missing Page(s)". If it was possible to obtain the missing page(s) or section, they are spliced into the film along with adjacent pages. This may have necessitated cutting through an image and duplicating adjacent pages to assure complete continuity. 2. When an image on the film is obliterated with a round black mark, it is an indication of either blurred copy because of movement during exposure, duplicate copy, or copyrighted materials that should not have been filmed.
    [Show full text]
  • A Comparison of Unbiased and Plottingposition Estimators of L
    WATER RESOURCES RESEARCH, VOL. 31, NO. 8, PAGES 2019-2025, AUGUST 1995 A comparison of unbiased and plotting-position estimators of L moments J. R. M. Hosking and J. R. Wallis IBM ResearchDivision, T. J. Watson ResearchCenter, Yorktown Heights, New York Abstract. Plotting-positionestimators of L momentsand L moment ratios have several disadvantagescompared with the "unbiased"estimators. For generaluse, the "unbiased'? estimatorsshould be preferred. Plotting-positionestimators may still be usefulfor estimatingextreme upper tail quantilesin regional frequencyanalysis. Probability-Weighted Moments and L Moments •r+l-" (--1)r • P*r,k Olk '- E p *r,!•[J!•. Probability-weightedmoments of a randomvariable X with k=0 k=0 cumulativedistribution function F( ) and quantile function It is convenient to define dimensionless versions of L mo- x( ) were definedby Greenwoodet al. [1979]to be the quan- tities ments;this is achievedby dividingthe higher-orderL moments by the scale measure h2. The L moment ratios •'r, r = 3, Mp,ra= E[XP{F(X)}r{1- F(X)} s] 4, '", are definedby ßr-" •r/•2 ß {X(u)}PUr(1 -- U)s du. L momentratios measure the shapeof a distributionindepen- dently of its scaleof measurement.The ratios *3 ("L skew- ness")and *4 ("L kurtosis")are nowwidely used as measures Particularlyuseful specialcases are the probability-weighted of skewnessand kurtosis,respectively [e.g., Schaefer,1990; moments Pilon and Adamowski,1992; Royston,1992; Stedingeret al., 1992; Vogeland Fennessey,1993]. 12•r= M1,0, r = •01 (1 - u)rx(u) du, Estimators Given an ordered sample of size n, Xl: n • X2:n • ''' • urx(u) du. X.... there are two establishedways of estimatingthe proba- /3r--- Ml,r, 0 =f01 bility-weightedmoments and L moments of the distribution from whichthe samplewas drawn.
    [Show full text]
  • Study of Generalized Lomax Distribution and Change Point Problem
    STUDY OF GENERALIZED LOMAX DISTRIBUTION AND CHANGE POINT PROBLEM Amani Alghamdi A Dissertation Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY August 2018 Committee: Arjun K. Gupta, Committee Co-Chair Wei Ning, Committee Co-Chair Jane Chang, Graduate Faculty Representative John Chen Copyright c 2018 Amani Alghamdi All rights reserved iii ABSTRACT Arjun K. Gupta and Wei Ning, Committee Co-Chair Generalizations of univariate distributions are often of interest to serve for real life phenomena. These generalized distributions are very useful in many fields such as medicine, physics, engineer- ing and biology. Lomax distribution (Pareto-II) is one of the well known univariate distributions that is considered as an alternative to the exponential, gamma, and Weibull distributions for heavy tailed data. However, this distribution does not grant great flexibility in modeling data. In this dissertation, we introduce a generalization of the Lomax distribution called Rayleigh Lo- max (RL) distribution using the form obtained by El-Bassiouny et al. (2015). This distribution provides great fit in modeling wide range of real data sets. It is a very flexible distribution that is related to some of the useful univariate distributions such as exponential, Weibull and Rayleigh dis- tributions. Moreover, this new distribution can also be transformed to a lifetime distribution which is applicable in many situations. For example, we obtain the inverse estimation and confidence intervals in the case of progressively Type-II right censored situation. We also apply Schwartz information approach (SIC) and modified information approach (MIC) to detect the changes in parameters of the RL distribution.
    [Show full text]
  • A Study of Non-Central Skew T Distributions and Their Applications in Data Analysis and Change Point Detection
    A STUDY OF NON-CENTRAL SKEW T DISTRIBUTIONS AND THEIR APPLICATIONS IN DATA ANALYSIS AND CHANGE POINT DETECTION Abeer M. Hasan A Dissertation Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY August 2013 Committee: Arjun K. Gupta, Co-advisor Wei Ning, Advisor Mark Earley, Graduate Faculty Representative Junfeng Shang. Copyright c August 2013 Abeer M. Hasan All rights reserved iii ABSTRACT Arjun K. Gupta, Co-advisor Wei Ning, Advisor Over the past three decades there has been a growing interest in searching for distribution families that are suitable to analyze skewed data with excess kurtosis. The search started by numerous papers on the skew normal distribution. Multivariate t distributions started to catch attention shortly after the development of the multivariate skew normal distribution. Many researchers proposed alternative methods to generalize the univariate t distribution to the multivariate case. Recently, skew t distribution started to become popular in research. Skew t distributions provide more flexibility and better ability to accommodate long-tailed data than skew normal distributions. In this dissertation, a new non-central skew t distribution is studied and its theoretical properties are explored. Applications of the proposed non-central skew t distribution in data analysis and model comparisons are studied. An extension of our distribution to the multivariate case is presented and properties of the multivariate non-central skew t distri- bution are discussed. We also discuss the distribution of quadratic forms of the non-central skew t distribution. In the last chapter, the change point problem of the non-central skew t distribution is discussed under different settings.
    [Show full text]
  • A New Generalization of the Generalized Inverse Rayleigh Distribution with Applications
    S S symmetry Article A New Generalization of the Generalized Inverse Rayleigh Distribution with Applications Rana Ali Bakoban * and Ashwaq Mohammad Al-Shehri Department of Statistics, College of Science, University of Jeddah, Jeddah 22254, Saudi Arabia; [email protected] * Correspondence: [email protected] Abstract: In this article, a new four-parameter lifetime model called the beta generalized inverse Rayleigh distribution (BGIRD) is defined and studied. Mixture representation of this model is derived. Curve’s behavior of probability density function, reliability function, and hazard function are studied. Next, we derived the quantile function, median, mode, moments, harmonic mean, skewness, and kurtosis. In addition, the order statistics and the mean deviations about the mean and median are found. Other important properties including entropy (Rényi and Shannon), which is a measure of the uncertainty for this distribution, are also investigated. Maximum likelihood estimation is adopted to the model. A simulation study is conducted to estimate the parameters. Four real-life data sets from difference fields were applied on this model. In addition, a comparison between the new model and some competitive models is done via information criteria. Our model shows the best fitting for the real data. Keywords: beta generalized inverse Rayleigh distribution; statistical properties; mean deviations; quantile; Rényi entropy; Shannon entropy; incomplete beta function; Montecarlo Simulation Citation: Bakoban, R.A.; Al-Shehri, MSC: Primary 62E10; Secondary 60E05 A.M. A New Generalization of the Generalized Inverse Rayleigh Distribution with Applications. Symmetry 2021, 13, 711. https:// 1. Introduction doi.org/10.3390/sym13040711 Modeling and analysis of lifetime phenomena are important aspects of statistical Academic Editor: Sergei D.
    [Show full text]
  • The Asymmetric T-Copula with Individual Degrees of Freedom
    The Asymmetric t-Copula with Individual Degrees of Freedom d-fine GmbH Christ Church University of Oxford A thesis submitted for the degree of MSc in Mathematical Finance Michaelmas 2012 Abstract This thesis investigates asymmetric dependence structures of multivariate asset returns. Evidence of such asymmetry for equity returns has been reported in the literature. In order to model the dependence structure, a new t-copula approach is proposed called the skewed t-copula with individ- ual degrees of freedom (SID t-copula). This copula provides the flexibility to assign an individual degree-of-freedom parameter and an individual skewness parameter to each asset in a multivariate setting. Applying this approach to GARCH residuals of bivariate equity index return data and using maximum likelihood estimation, we find significant asymmetry. By means of the Akaike information criterion, it is demonstrated that the SID t-copula provides the best model for the market data compared to other copula approaches without explicit asymmetry parameters. In addition, it yields a better fit than the conventional skewed t-copula with a single degree-of-freedom parameter. In a model impact study, we analyse the errors which can occur when mod- elling asymmetric multivariate SID-t returns with the symmetric multi- variate Gauss or standard t-distribution. The comparison is done in terms of the risk measures value-at-risk and expected shortfall. We find large deviations between the modelled and the true VaR/ES of a spread posi- tion composed of asymmetrically distributed risk factors. Going from the bivariate case to a larger number of risk factors, the model errors increase.
    [Show full text]
  • On the Meaning and Use of Kurtosis
    Psychological Methods Copyright 1997 by the American Psychological Association, Inc. 1997, Vol. 2, No. 3,292-307 1082-989X/97/$3.00 On the Meaning and Use of Kurtosis Lawrence T. DeCarlo Fordham University For symmetric unimodal distributions, positive kurtosis indicates heavy tails and peakedness relative to the normal distribution, whereas negative kurtosis indicates light tails and flatness. Many textbooks, however, describe or illustrate kurtosis incompletely or incorrectly. In this article, kurtosis is illustrated with well-known distributions, and aspects of its interpretation and misinterpretation are discussed. The role of kurtosis in testing univariate and multivariate normality; as a measure of departures from normality; in issues of robustness, outliers, and bimodality; in generalized tests and estimators, as well as limitations of and alternatives to the kurtosis measure [32, are discussed. It is typically noted in introductory statistics standard deviation. The normal distribution has a kur- courses that distributions can be characterized in tosis of 3, and 132 - 3 is often used so that the refer- terms of central tendency, variability, and shape. With ence normal distribution has a kurtosis of zero (132 - respect to shape, virtually every textbook defines and 3 is sometimes denoted as Y2)- A sample counterpart illustrates skewness. On the other hand, another as- to 132 can be obtained by replacing the population pect of shape, which is kurtosis, is either not discussed moments with the sample moments, which gives or, worse yet, is often described or illustrated incor- rectly. Kurtosis is also frequently not reported in re- ~(X i -- S)4/n search articles, in spite of the fact that virtually every b2 (•(X i - ~')2/n)2' statistical package provides a measure of kurtosis.
    [Show full text]
  • Arcsine Laws for Random Walks Generated from Random Permutations with Applications to Genomics
    Applied Probability Trust (4 February 2021) ARCSINE LAWS FOR RANDOM WALKS GENERATED FROM RANDOM PERMUTATIONS WITH APPLICATIONS TO GENOMICS XIAO FANG,1 The Chinese University of Hong Kong HAN LIANG GAN,2 Northwestern University SUSAN HOLMES,3 Stanford University HAIYAN HUANG,4 University of California, Berkeley EROL PEKOZ,¨ 5 Boston University ADRIAN ROLLIN,¨ 6 National University of Singapore WENPIN TANG,7 Columbia University 1 Email address: [email protected] 2 Email address: [email protected] 3 Email address: [email protected] 4 Email address: [email protected] 5 Email address: [email protected] 6 Email address: [email protected] 7 Email address: [email protected] 1 Postal address: Department of Statistics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong 2 Postal address: Department of Mathematics, Northwestern University, 2033 Sheridan Road, Evanston, IL 60208 2 Current address: University of Waikato, Private Bag 3105, Hamilton 3240, New Zealand 1 2 X. Fang et al. Abstract A classical result for the simple symmetric random walk with 2n steps is that the number of steps above the origin, the time of the last visit to the origin, and the time of the maximum height all have exactly the same distribution and converge when scaled to the arcsine law. Motivated by applications in genomics, we study the distributions of these statistics for the non-Markovian random walk generated from the ascents and descents of a uniform random permutation and a Mallows(q) permutation and show that they have the same asymptotic distributions as for the simple random walk.
    [Show full text]
  • A Multivariate Student's T-Distribution
    Open Journal of Statistics, 2016, 6, 443-450 Published Online June 2016 in SciRes. http://www.scirp.org/journal/ojs http://dx.doi.org/10.4236/ojs.2016.63040 A Multivariate Student’s t-Distribution Daniel T. Cassidy Department of Engineering Physics, McMaster University, Hamilton, ON, Canada Received 29 March 2016; accepted 14 June 2016; published 17 June 2016 Copyright © 2016 by author and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/ Abstract A multivariate Student’s t-distribution is derived by analogy to the derivation of a multivariate normal (Gaussian) probability density function. This multivariate Student’s t-distribution can have different shape parameters νi for the marginal probability density functions of the multi- variate distribution. Expressions for the probability density function, for the variances, and for the covariances of the multivariate t-distribution with arbitrary shape parameters for the marginals are given. Keywords Multivariate Student’s t, Variance, Covariance, Arbitrary Shape Parameters 1. Introduction An expression for a multivariate Student’s t-distribution is presented. This expression, which is different in form than the form that is commonly used, allows the shape parameter ν for each marginal probability density function (pdf) of the multivariate pdf to be different. The form that is typically used is [1] −+ν Γ+((ν n) 2) T ( n) 2 +Σ−1 n 2 (1.[xx] [ ]) (1) ΓΣ(νν2)(π ) This “typical” form attempts to generalize the univariate Student’s t-distribution and is valid when the n marginal distributions have the same shape parameter ν .
    [Show full text]