Statistical Modeling of Skewed Data Using Newly Formed Parametric Distributions

UNLV Retrospective Theses & Dissertations 1-1-2008 Statistical modeling of skewed data using newly formed parametric distributions Kahadawala Cooray University of Nevada, Las Vegas Follow this and additional works at: https://digitalscholarship.unlv.edu/rtds Repository Citation Cooray, Kahadawala, "Statistical modeling of skewed data using newly formed parametric distributions" (2008). UNLV Retrospective Theses & Dissertations. 2825. http://dx.doi.org/10.25669/yyox-bk71 This Dissertation is protected by copyright and/or related rights. It has been brought to you by Digital Scholarship@UNLV with permission from the rights-holder(s). You are free to use this Dissertation in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s) directly, unless additional rights are indicated by a Creative Commons license in the record and/or on the work itself. This Dissertation has been accepted for inclusion in UNLV Retrospective Theses & Dissertations by an authorized administrator of Digital Scholarship@UNLV. For more information, please contact [email protected]. STATISTICAL MODELING OF SKEWED DATA USING NEWLY FORMED PARAMETRIC DISTRIBUTIONS by Kahadawala Cooray Bachelor of Science University of Colombo, Sri Lanka 1994 A dissertation submitted in partial fulfillment of the requirements for the Doctor of Philosophy Degree in Mathematical Sciences Department of Mathematical Sciences College of Sciences Graduate College University of Nevada, Las Vegas August 2008 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. UMI Number: 3338257 Copyright 2008 by Cooray, Kahadawala All rights reserved. INFORMATION TO USERS The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleed-through, substandard margins, and improper alignment can adversely affect reproduction. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion. UMI UMI Microform 3338257 Copyright 2009 by ProQuest LLC. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code. ProQuest LLC 789 E. Eisenhower Parkway PC Box 1346 Ann Arbor, Ml 48106-1346 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Copyright by Kahadawala Cooray 2008 All Rights Reserved Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Dissertation Approval I’he Graduate College University of Nevada, Las Vegas May 12 2008 The Dissertation prepared by Kahadawala Cooray Entitled Statistical Modeling of Skewed Data Using Newly Formed Parametric Distributions is approved in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mathematical Sciences Examination Committee Chair Dean of the Graduate College Exaimnation ^mnmittee Member Examination CommiUee MÆmbei Exammution Committee Member Graduate College Faculty Representative 11 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. ABSTRACT Statistical Modeling of Skewed Data Using Newly Formed Parametric Distributions by Kahadawala Cooray Dr. Malwane M. A. Ananda, Examination Committee Chair Professor of Statistics University of Nevada, Las Vegas Several newly formed continuous parametric distributions are introduced to an alyze skewed data. Firstly, a two-parameter smooth continuous lognormal-Pareto composite distribution is introduced for modeling highly positively skewed data. The new density is a lognormal density up to an unknown threshold value and a Pareto density for the remainder. The resulting density is similar in shape to the lognormal density, yet its upper tail is larger than the lognormal density and the tail behavior is quite similar to the Pareto density. Parameter estimation methods and the goodness- of-fit criterion for the new distribution are presented. A large actuarial data set is analyzed to illustrate the better fit and applicability of the new distribution over other leading distributions. Secondly, the Odd Weibull family is introduced for modeling data with a wide variety of hazard functions. This three-parameter family is derived by considering the distributions of the odds of the Weibull and inverse Weibull fami lies. As a result, the Odd Weibull family is not only useful for testing goodness-of-ht of the Weibull and inverse Weibull as submodels, but it is also convenient for modeling iii Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. and fitting different data sets, especially in the presence of censoring and truncation. This newly formed family not only possesses all five major hazard shapes: constant, increasing, decreasing, bathtub-shaped and unimodal failure rates, but also has wide variety of density shapes. The model parameters for exact, grouped, censored and truncated data are estimated in two different ways due to the fact that the inverse transformation of the Odd Weibull family does not change its density function. Ex amples are provided based on survival, reliability, and environmental sciences data to illustrate the variety of density and hazard shapes by analyzing complete and in complete data. Thirdly, the two-parameter logistic-sinh distribution is introduced for modeling highly negatively skewed data with extreme observations. The resulting family provides not only negatively skewed densities with thick tails, but also vari ety of monotonie density shapes. The advantages of using the proposed family are demonstrated and compared by illustrating well-known examples. Finally, the folded parametric families are introduced to model the positively skewed data with zero data values. IV Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. TABLE OF CONTENTS ABSTRACT........................................................................................................... iii TABLE OF CONTENTS...................................................................................... v LIST OF TABLES................................................................................................ viii LIST OF FIGURES.............................................................................................. x ACKNOWLEDGMENTS.................................................................................... xii CHAPTER I INTRODUCTION...................................................................... 1 1.1 The background of some continuous univariate distributions .............. 1 1.2 Motivation to construct new distributions .............................................. 3 1.2.1 The lognormal-Pareto composite distribution ........................... 3 1.2.2 The Odd Weibull distribution ...................................................... 5 1.2.3 The logistic-sinh distribution ....................................................... 8 1.2.4 The folded distributions ................................................. 9 CHAPTER II THE LOGNORMAL-PARETO COMPOSITE DISTRIBUTION.................................................................................................... 11 2.1 Introduction................................................................................................ 11 2.2 Model derivation ....................................................................................... 13 2.3 Moment properties .................................................................................... 14 2.4 Maximum likelihood estimators for complete data ............................... 17 2.5 Approximate conditional coverage probabilities for ML estimators 19 2.6 Least square estimators for complete data ............................................ 21 2.7 Bayesian estimators for complete data .................................................. 24 2.8 Goodness-of-ht tests ................................................................................. 27 2.9 Maximum likelihood estimators for right censored data ................. 34 2.10 Illustrative examples ................................................................................. 36 2.11 The Weibull-Pareto composite distribution ............................................ 42 2.11.1 Motivation to medical diagnostics ............................................. 43 2.11.2 Model derivation ......................................................................... 45 2.11.3 Parameter estimation under the least square method .............. 50 2.11.4 Parameter estimation under the likelihood method ................. 53 2.11.5 Approximate coverage probabilities for ML estimators 54 2.11.6 The ML estimation for Type I right censored data .................. 56 2.11.7 Illustrative examples .................................................................... 57 V Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 2.12 The loglogistic-Pareto composite distribution ........................................... 70 2.13 The inverse Weibull-Pareto composite distribution ................................. 74 2.14 Grouped likelihood procedure for

Load more