Triangular, Gamma, Erlang, Weibull Distributions

IV. Triangular Distribution

Known values are the minimum (a), the mode (b, the most likely value of the pdf), and the maximum (c).

[Figure: triangular probability density function f(x) on a ≤ x ≤ c, peaking at x = b with height h = 2/(c - a); area under the curve = 1.]

The pdf is given by

$$
f(x) =
\begin{cases}
\dfrac{2(x-a)}{(c-a)(b-a)} & \text{for } a \le x \le b \quad \left(\text{slope} = \dfrac{h}{b-a}\right) \\[2ex]
\dfrac{-2(x-c)}{(c-a)(c-b)} & \text{for } b \le x \le c \quad \left(\text{slope} = \dfrac{-h}{c-b}\right) \\[2ex]
0 & \text{otherwise}
\end{cases}
\qquad \text{where } h = \frac{2}{c-a}
$$

The expected value is given by

$$
E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx
= \int_a^b \frac{2(x-a)}{(c-a)(b-a)}\,x\,dx + \int_b^c \frac{2(c-x)}{(c-a)(c-b)}\,x\,dx
= \frac{a+b+c}{3}
$$

The derivation is fairly tedious; with a little work it can be shown that

$$
E(X) = h\left[\frac{2b^3 - 3ab^2 + a^3}{6(b-a)} + \frac{c^3 - 3cb^2 + 2b^3}{6(c-b)}\right]
= \frac{a^3(c-b) + b^3(a-c) + c^3(b-a)}{3(c-a)(b-a)(c-b)}
= \frac{(a+b+c)(c-a)(b-a)(c-b)}{3(c-a)(b-a)(c-b)}
= \frac{a+b+c}{3}
$$

Remark: For a discrete sample, the measures of centrality that are typically determined are the mean, the mode, and the median. The mean is the average value of the sample and corresponds to E(X). The mode corresponds to the maximum value of the pdf. When working with a sample, it is necessary to resort to a histogram (which can be tricky) to estimate the mode of the underlying pdf. The median is the point at which half of the area under the curve is to the left and half is to the right.

The triangular distribution is typically employed when not much is known about the distribution, but the minimum, mode, and maximum can be estimated.

Sampling from the triangular distribution requires solving

$$
x = \int_{-\infty}^{\text{rsample}} f(z)\,dz
$$

for rsample, given a random probability x.
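The sampling equation above can be solved numerically even before any closed form is worked out. The following is a minimal sketch, not part of the original notes: it integrates the triangular pdf with the trapezoid rule and finds rsample by bisection. The function names, the step count, and the tolerance are all assumptions chosen for illustration.

```python
def tri_pdf(x, a, b, c):
    """Triangular pdf with minimum a, mode b, maximum c (assumes a < b < c)."""
    if a <= x <= b:
        return 2.0 * (x - a) / ((c - a) * (b - a))
    if b < x <= c:
        return 2.0 * (c - x) / ((c - a) * (c - b))
    return 0.0


def tri_cdf_numeric(t, a, b, c, steps=2000):
    """Approximate F(t), the integral of the pdf from a to t (trapezoid rule)."""
    if t <= a:
        return 0.0
    if t >= c:
        return 1.0
    h = (t - a) / steps
    total = 0.5 * (tri_pdf(a, a, b, c) + tri_pdf(t, a, b, c))
    for i in range(1, steps):
        total += tri_pdf(a + i * h, a, b, c)
    return total * h


def tri_sample_numeric(x, a, b, c, tol=1e-9):
    """Solve F(rsample) = x for rsample by bisection on [a, c]."""
    lo, hi = a, c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tri_cdf_numeric(mid, a, b, c) < x:
            lo = mid   # too little area to the left: move right
        else:
            hi = mid   # enough area to the left: move left
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # x = 0.5 gives the median of the distribution.
    print(f"{tri_sample_numeric(0.5, 1.0, 2.0, 4.0):.3f}")
```

With a = 1, b = 2, c = 4 and x = 0.5, the sketch returns approximately 2.268, the median. In a simulation, x would be drawn uniformly from [0, 1]. Numerical inversion of this kind is slow compared with a closed-form inverse, so it is mainly useful as a cross-check.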
Since f(z) is piecewise continuous, its distribution function F(t) is given by

$$
F(t) = \int_{-\infty}^{t} f(z)\,dz =
\begin{cases}
0 & \text{for } t \le a \\[1ex]
\displaystyle\int_a^t f(z)\,dz & \text{for } a < t \le b \\[2ex]
1 - \displaystyle\int_t^c f(z)\,dz & \text{for } b \le t < c \\[2ex]
1 & \text{for } t \ge c
\end{cases}
$$

Hence, for a ≤ rsample ≤ b we get

$$
x = \int_a^{\text{rsample}} f(z)\,dz
= \int_a^{\text{rsample}} \frac{2(z-a)}{(b-a)(c-a)}\,dz
= \left[\frac{z^2 - 2az}{(b-a)(c-a)}\right]_a^{\text{rsample}}
= \frac{(\text{rsample} - a)^2}{(b-a)(c-a)}
\tag{A}
$$

and for b ≤ rsample ≤ c, since

$$
\int_{\text{rsample}}^{c} f(z)\,dz
= \int_{\text{rsample}}^{c} \frac{2(c-z)}{(c-b)(c-a)}\,dz
= \left[\frac{2cz - z^2}{(c-b)(c-a)}\right]_{\text{rsample}}^{c}
= \frac{(c - \text{rsample})^2}{(c-b)(c-a)}
$$

we get

$$
x = 1 - \frac{(c - \text{rsample})^2}{(c-b)(c-a)}
\tag{B}
$$

Since

$$
\int_a^b f(z)\,dz = \frac{b-a}{c-a}
$$

equation A is used to solve for rsample if the random probability x ≤ (b - a)/(c - a); otherwise equation B is used:

$$
\text{rsample} = a + \sqrt{(b-a)(c-a)\,x} \qquad \text{for } x \le \frac{b-a}{c-a}
\tag{A}
$$

$$
\text{rsample} = c - \sqrt{(c-b)(c-a)(1-x)} \qquad \text{for } x > \frac{b-a}{c-a}
\tag{B}
$$

[Figure: the sampling function rsample plotted against x on [0, 1], rising from a at x = 0 through b at x = (b - a)/(c - a) to c at x = 1.]

Example (note: the median corresponds to x = 0.5). For a = 1, b = 2, c = 4:

    mean   = (a + b + c)/3 = 2.333
    mode   = 2
    median = c - sqrt((c - b)(c - a)(0.5)) = 4 - sqrt(3) = 2.268

V. Gamma Distribution

A large number of useful functions are related to the exponential function. The gamma function is one of these. It traces back to 18th-century work by Euler, who used interpolation methods to define n! for non-integral values (Legendre later named it the gamma function in a series of books published between 1811 and 1826). The gamma function appears naturally in the study of anti-differentiation; for example, it arises in the context of differential equations when calculating Laplace transforms.

The gamma function is given by

$$
\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx \qquad (\alpha > 0)
$$

Integrating by parts, we get Γ(α) = (α - 1)Γ(α - 1) for α > 1. Since

$$
\Gamma(1) = \int_0^{\infty} e^{-x}\,dx = 1
$$

it follows that when α is an integer, Γ(α) = (α - 1)!. Hence, the gamma function is a generalization of the factorial, applying to all α > 0, not just integers.
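The integration-by-parts step quoted above can be written out as follows. This is a standard derivation added for completeness, taking u = x^(α-1) and dv = e^(-x) dx; the boundary term vanishes because α > 1.

```latex
\begin{aligned}
\Gamma(\alpha)
  &= \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx \\
  &= \Bigl[-x^{\alpha-1} e^{-x}\Bigr]_0^{\infty}
     + (\alpha-1)\int_0^{\infty} x^{\alpha-2} e^{-x}\,dx \\
  &= 0 + (\alpha-1)\,\Gamma(\alpha-1), \qquad \alpha > 1 .
\end{aligned}
% Repeating the step for an integer n, down to \Gamma(1) = 1:
\Gamma(n) = (n-1)(n-2)\cdots 2 \cdot 1 \cdot \Gamma(1) = (n-1)!
```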
The gamma distribution is obtained from the gamma function by specifying the pdf

$$
f(x) =
\begin{cases}
k\,x^{\alpha-1} e^{-x/\beta} & \text{for } x > 0 \\
0 & \text{otherwise}
\end{cases}
\qquad \text{for fixed } \alpha > 0 \text{ and } \beta > 0
$$

where the proportionality constant k is chosen so that the integral of f(x) over (-∞, ∞) equals 1. The value of k is easy to find by substituting x = βt:

$$
1 = k\int_0^{\infty} x^{\alpha-1} e^{-x/\beta}\,dx
= k\,\beta^{\alpha}\int_0^{\infty} t^{\alpha-1} e^{-t}\,dt
= k\,\beta^{\alpha}\,\Gamma(\alpha)
$$

so

$$
k = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}
\qquad\text{and}\qquad
f(x) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\,x^{\alpha-1} e^{-x/\beta}
$$

α is called the shape (or order) parameter; β is called the scale parameter.

Note that when α = 1, f(x) = (1/β) e^(-x/β), which is the exponential distribution with mean β.

In general, E(X) = αβ and σ² = αβ². Hence, if the mean and standard deviation can be estimated, then α and β can also be determined (α = E(X)²/σ² and β = σ²/E(X)).

Algorithm for calculating the natural logarithm of the gamma function

Attributed to Lanczos, C., Journal S.I.A.M. Numerical Analysis, ser. B, vol. 1, p. 86 (1964) and adapted from Numerical Recipes in C by Press, W.H., B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling (Cambridge University Press, 1988).

    FUNCTION lngamma(z)
      /* Use the reflection formula for z < 1 */
      IF z < 1
        z ← 1 - z
        RETURN ln(πz) - (lngamma(1 + z) + ln(sin(πz)))
      ENDIF
      coeff ← 76.18009173, -86.50532033, 24.01409822,
              -1.231739516, 0.00120858003, -0.00000536382
      /* These values are the (approximate) coefficients for the first
         6 terms of an infinite series involved in an exact formulation
         for the gamma function credited to Lanczos. They yield an
         approximation for the variable "a" (determined below) which is
         within |ε| < 2 × 10^-10 of its true value */
      a ← 1
      FOR i ← 1 TO 6
        a ← a + coeff(i)/(i + z - 1)
      ENDFOR
      RETURN ln(a·sqrt(2π)) - (z + 4.5) + (z - 0.5)·ln(z + 4.5)
    END

[Figure: gamma pdf f(x) for fixed mean αβ = 5 and varying values of α and β: (α = 0.5, β = 10), (α = 1.5, β = 3.3333), (α = 5, β = 1), (α = 10, β = 0.5).]

[Figure: the corresponding distribution functions F(x) and sampling functions rsample.]

The gamma distribution is used to model waiting times or time to complete a task.
More specifically, it can be shown that if we have exponentially distributed interarrival times with mean 1/λ, the time needed to obtain k changes is distributed according to a gamma distribution with α = k and β = 1/λ.

Gamma Function:

$$
\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx \qquad (\alpha > 0)
$$

The general relationship Γ(α) = (α - 1)Γ(α - 1) holds for α > 1. It can also be shown that

$$
\Gamma(\alpha)\,\Gamma(1-\alpha) = \frac{\pi}{\sin(\pi\alpha)} \qquad \text{for } 0 < \alpha < 1
$$

(Note that, in particular, this means that Γ(0.5) = √π.)

For 0 < α < 1 we have 1 + α > 1, so Γ(1 + α) = αΓ(α). This in turn gives the reflection formula

$$
\Gamma(1-\alpha) = \frac{\pi\alpha}{\Gamma(1+\alpha)\,\sin(\pi\alpha)} \qquad \text{for } 0 < \alpha < 1
$$

Selected values computed according to the algorithm for ln(Γ(α)):

    Γ(0.25) ≈ 3.62560991
    Γ(0.5)  = √π ≈ 1.77245385
    Γ(0.75) ≈ 1.22541670
    Γ(1)    = 0! = 1
    Γ(1.25) ≈ 0.90640248
    Γ(1.5)  = 0.5·Γ(0.5) = √π/2 ≈ 0.88622693
    Γ(1.75) ≈ 0.91906253
    Γ(2)    = 1! = 1
    Γ(2.25) ≈ 1.13300310
    Γ(2.5)  = 1.5·Γ(1.5) = 3√π/4 ≈ 1.32934039
    Γ(2.75) ≈ 1.60835942
    Γ(3)    = 2! = 2
    Γ(3.25) ≈ 2.54925697
    Γ(3.5)  = 2.5·Γ(2.5) = 15√π/8 ≈ 3.32335097
    Γ(3.75) ≈ 4.42298841
    Γ(4)    = 3! = 6
    Γ(4.25) ≈ 8.28508514
    Γ(4.5)  = 3.5·Γ(3.5) = 105√π/16 ≈ 11.63172840
    Γ(4.75) ≈ 16.58620654
    Γ(5)    = 4! = 24
    Γ(5.25) ≈ 35.21161185
    Γ(5.5)  = 4.5·Γ(4.5) = 945√π/32 ≈ 52.34277778
    Γ(5.75) ≈ 78.78448105
    Γ(6)    = 5! = 120

[Figure: plot of Γ(α) against α for 0 < α ≤ 5, with the factorial values 0!, 1!, 2!, and 3! marked.]

For the pdf of the gamma distribution

$$
f(x) =
\begin{cases}
k\,x^{\alpha-1} e^{-x/\beta} & \text{for } x > 0 \\
0 & \text{otherwise}
\end{cases}
\qquad \text{for fixed } \alpha > 0 \text{ and } \beta > 0
$$

note that if:

    α < 1   then x^(α-1) → ∞ as x → 0
    α = 1   then the distribution is the exponential distribution
    α > 1   then x^(α-1) → 0 as x → 0

The earlier example showed three basic shapes, each of which is described by the behavior of the derivative f'(x) (slope function) of f(x).
$$
f'(x) = k\left[(\alpha-1)\,x^{\alpha-2} e^{-x/\beta} + \left(-\frac{1}{\beta}\right) x^{\alpha-1} e^{-x/\beta}\right]
$$

There are actually 5 cases:

    α < 1       the slope → -∞ as x → 0, since each term is < 0 and each exponent of x is < 0
    α = 1       the slope → -1/β² as x → 0, in accord with the exponential distribution, since k = 1/β, term 1 is 0, and term 2 is -1/β
    1 < α < 2   the slope → +∞ as x → 0, since the lead term → +∞ and term 2 is 0
    α = 2       the slope → +1/β² as x → 0, since k = 1/β^α = 1/β²
    α > 2       the slope → 0 as x → 0

In each case the slope → 0 as x → +∞.

The gamma distribution is usually sampled by the accept-reject technique, which means that to get k, the value of Γ(α) must be computed.
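Since k depends on Γ(α), the ln Γ algorithm given earlier is what makes the pdf computable in practice. Below is a minimal Python sketch of that pseudocode, using the same six coefficients. The function names and the choice to evaluate the pdf in log space are assumptions of this example, not part of the original notes; the check for z ≤ 0 is also an addition, since the text defines Γ(α) only for α > 0.

```python
import math

# Lanczos coefficients as listed in the pseudocode (Numerical Recipes form).
COEFF = (76.18009173, -86.50532033, 24.01409822,
         -1.231739516, 0.00120858003, -0.00000536382)


def lngamma(z):
    """Natural log of the gamma function for z > 0."""
    if z <= 0:
        raise ValueError("lngamma is only sketched here for z > 0")
    if z < 1:
        # Reflection: Gamma(1 - w) = pi*w / (Gamma(1 + w) * sin(pi*w)), w = 1 - z
        w = 1.0 - z
        return math.log(math.pi * w) - (lngamma(1.0 + w)
                                        + math.log(math.sin(math.pi * w)))
    series = 1.0
    for i, coeff in enumerate(COEFF, start=1):
        series += coeff / (i + z - 1.0)
    return (math.log(series * math.sqrt(2.0 * math.pi))
            - (z + 4.5) + (z - 0.5) * math.log(z + 4.5))


def gamma_pdf(x, alpha, beta):
    """Gamma pdf with shape alpha and scale beta; k = 1/(beta^alpha * Gamma(alpha))."""
    if x <= 0:
        return 0.0
    ln_k = -(alpha * math.log(beta) + lngamma(alpha))
    # Work in log space so large alpha does not overflow.
    return math.exp(ln_k + (alpha - 1.0) * math.log(x) - x / beta)


if __name__ == "__main__":
    for a in (0.25, 0.5, 2.5, 5.0):
        print(a, math.exp(lngamma(a)))
```

The printed values can be compared against the table of selected values, for example Γ(0.5) ≈ 1.77245385 and Γ(5) = 24. Python's standard library also provides math.lgamma, which is a convenient independent check on the sketch.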