Generalized Extreme Value-Pareto Distribution Function and Its Applica

Total Page:16

File Type:pdf, Size:1020Kb

Generalized Extreme Value-Pareto Distribution Function and Its Applica China Ocean Eng., 2019, Vol. 33, No. 2, P. 127–136 DOI: 10.1007/s13344-019-0013-9, ISSN 0890-5487 http://www.chinaoceanengin.cn/ E-mail: [email protected] Generalized Extreme Value-Pareto Distribution Function and Its Applica- tions in Ocean Engineering CHEN Bai-yua, ZHANG Kuang-yuanb, WANG Li-pingc, JIANG Songd, LIU Gui-line, * aCollege of Engineering, University of California Berkeley, Berkeley, USA bDepartment of Energy and Mining Engineering, Pennsylvania State University, University Park, PA, USA cSchool of Mathematical Sciences, Ocean University of China, Qingdao 266100, China dSchool of Management, Xi’an University of Architecture and Technology, Xi’an 710055, China eCollege of Engineering, Ocean University of China, Qingdao 266100, China Received September 14, 2018; revised October 30, 2018; accepted December 2, 2018 ©2019 Chinese Ocean Engineering Society and Springer-Verlag GmbH Germany, part of Springer Nature Abstract In this paper, we establish a generalized extreme Value-Pareto distribution model and derive an analytical expression of Weibull–Pareto distribution model. Based on a data sample of 26-year wave height, we adopt the new model to estimate the design wave height for 500, 700 and 1000-year return periods. Results show that the design wave height from Weibull–Pareto distribution is between that of the Weibull distribution and that of the Pearson-III distribution. For the 500-year return period design wave height, the results from the new model is 1.601% lower than those from the Weibull distribution and 1.319% higher than those from the Pearson-III distribution. The Weibull–Pareto distribution innovatively considers the fractal features, extreme-value statistics and the truncated data in the derivation process. Therefore, it is a more holistic and practical model for estimating the design parameters in marine and coastal environments. Key words: fractal, Pareto distribution, design wave height Citation: Chen, B. Y., Zhang, K. Y., Wang, L. P., Jiang, S., Liu, G. L., 2019. Generalized extreme value-pareto distribution function and its applications in ocean engineering. China Ocean Eng., 33(2): 127–136, doi: 10.1007/s13344-019-0013-9 1 Introduction and Background quired. The essence of the design lies in the creation of the In area of designing coastal and marine structures, the most reasonable and appropriate model to estimate marine estimation of parameters of the marine environment such as environment design parameters. wave height and water level is one of the critical factors. Traditionally, univariate extreme value models, such as Proper and accurate calculation of the wave height or water Gumbel, Weibull, Pearson-III type distributions and the level is very important for the research on safety and build- maximum entropy distribution have been adopted for mod- ing cost of engineering structures, especially under severe eling wave height and water level under extreme conditions marine conditions (Wang et al., 2017; Xian et al., 2018). (Liu et al., 2006; Wang et al., 2014). As the multivariate The marine environment can be very complicated. Offshore statistical theory develops, multivariate models like Logist- projects need to withstand the damage caused by natural ic model, Hüsler-Reiss model, Beta model and other mix- disasters such as strong waves and typhoons. When design- ture models have been adopted in an accumulation of literat- ing coastal structures and assessing the relevant risk, if the ure, and compound extreme value distribution under ex- designing security’s standard is not high enough, extreme treme conditions has been introduced recently (Chen et al., marine impacts such as typhoons can easily result in im- 2017a; Liu G.L. et al., 2015, 2018; Liu X.J. et al., 2018; measurable losses. On the other hand, if the designing se- Wang and Wang, 2013; Wang et al., 2013, 2016). Tradition- curity standard is too high, there may be issues like over-in- ally, the main idea of deriving design wave height (or wave vestment and financial waste. Therefore, the estimation of a height for a specific return period) is to select a certain type proper and accurate design standard, which can not only of probability distribution (such as, Weibull, Pearson-III, withstand the natural impacts but also guarantee the secur- Gumbel distribution) to model the distribution of the max- ity and stability of the designed marine structures, is re- imum annual wave height. Then observed data of maxim- Foundation item: This work was supported by the NSFC-Shandong Joint Fund (Grant No. U1706226), and Graduate Education Reform and Research Fund (Grant No. HDJG18007). *Corresponding author. E-mail: [email protected] 128 CHEN Bai-yu et al. China Ocean Eng., 2019, Vol. 33, No. 2, P. 127–136 um annual wave height and frequency is used to determine most distribution functions of extreme values/statistics have the model parameters. After the calculation of parameters, the form of a power function. Taking advantage of the the cumulative density function is used to finally determine fractal characteristics of power function when the large- the design wave height for a specific return period. Specific- value tail is truncated, this research discusses the changing ally, this modeling process is based on the following as- laws of marine environment elements over a shorter time sumptions: 1) The annual maximum wave height follows period (20 to 50 years). Then, using self-similarity of data the same probability distribution both in the short-term and statistics, this paper calculates the design wave height over the long-term; 2) the observed statistical characteristics in a an extended period of time (100 or 500 years). Second, relatively short period of time are strictly self-similar to when using above-threshold data to derive design wave those in the long-term (100-year return period or longer); 3) height, the tail of the distribution function approaches 0, a due to constrains in the distribution function, only the annu- poor fit of tail data. Therefore, this research introduces the al maximum wave height (only one data point per year) is Pareto distribution that exhibits fractal characteristics when used in the models of wave height and water level, instead low-value data is truncated. Third, this research combines of the majority of observed data. generalized extreme distribution function (de Haan and Fer- The data volume and quality are fundamental for the reira, 2006) and Pareto distribution function, to take advant- modeling and analysis and the quantity of the data often dir- age of their complementary strengths. Fourth, in Section 3, ectly influences the quality of the results. Due to practical we derived the generalized extreme value – Pareto distribu- constraints in many cases, only data over a limited time tion model, and gave the explicit express of Weibull–Pareto range (For example, 20 or 30 years) is available. Therefore, distribution, where the steps are: first create a function fam- it is crucial to make full use of all obtained data and extract ily; then, prove when random variable T follows the gener- as much useful information as possible. alized extreme value distribution, and X follows Pareto dis- The probabilistic analysis on ocean environmental ele- tribution, the expression for Weibull–Pareto distribution can ments such as wave height and water level with Pearson-III, be found. Weibull distribution and multivariate extreme value distri- bution is designed to explore the changing laws of marine 2 Fractal characteristics of the function environmental factors on different time scales (i.e., 100 Fractal refers to a system with irregularity and complex- years, 500 years) and locations, and the expansion and infer- ity, but with similarities between parts and the whole. There ences to and from different scales. In previous extreme is randomness in the formation of fractal with a certain reg- value models, if the random variable is larger than a certain ularity in the randomness. The most important fundamental threshold, the distribution function shows the form of a feature of fractal is self-similarity and scale-invariance. power function, which has a fractal characteristic if the Self-similarity means that an entity complies to the same large-value tail is truncated. Therefore, these distribution statistical distribution locally and globally. The scale-invari- functions can describe the power functions at large values ance feature refers to the entity’s structure and remains un- very well. However, the tail of the distribution function con- changed on different scales. verges to 0 so rapidly that the distribution function fits the Dimension is another important quantity to describe the data poorly near the tail. The Pareto distribution shows geometric entity. In the traditional Euclidean geometry, a fractal characteristics at the low-end of the tail and has su- single point is considered to be zero-dimension. A straight perior performance in fitting the tailing data. However, its line is one-dimensional, plane is two-dimensional, and performance for large values is mediocre. space is three-dimensional. The fractal dimension, in con- This research proposes a generalized extreme value – trast to the Euclidean geometry of the line, plane and space, Pareto distribution function to make full use of the marine has the dimension that may not only be an integer, but can data based on the extreme value statistical theory and fractal also be a fraction, which is different compared with the tra- theory. The new model contains the commonly used ex- ditional dimensions in Euclidean geometry. So far, fractal treme value distribution that shows the statistical character- has been abstracted into a theory. Data analysis based on istics of random variables in extreme conditions, as well as fractal has shown promising results (Barrs and Chen, 2018; the fractal characteristics that utilize data efficiently. If the Cai et al., 2011; Escalante et al., 2016; Liu G.L. et al., 2018; observed data are from a rather short period of time, the data Zhang S.F.
Recommended publications
  • A New Parameter Estimator for the Generalized Pareto Distribution Under the Peaks Over Threshold Framework
    mathematics Article A New Parameter Estimator for the Generalized Pareto Distribution under the Peaks over Threshold Framework Xu Zhao 1,*, Zhongxian Zhang 1, Weihu Cheng 1 and Pengyue Zhang 2 1 College of Applied Sciences, Beijing University of Technology, Beijing 100124, China; [email protected] (Z.Z.); [email protected] (W.C.) 2 Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA; [email protected] * Correspondence: [email protected] Received: 1 April 2019; Accepted: 30 April 2019 ; Published: 7 May 2019 Abstract: Techniques used to analyze exceedances over a high threshold are in great demand for research in economics, environmental science, and other fields. The generalized Pareto distribution (GPD) has been widely used to fit observations exceeding the tail threshold in the peaks over threshold (POT) framework. Parameter estimation and threshold selection are two critical issues for threshold-based GPD inference. In this work, we propose a new GPD-based estimation approach by combining the method of moments and likelihood moment techniques based on the least squares concept, in which the shape and scale parameters of the GPD can be simultaneously estimated. To analyze extreme data, the proposed approach estimates the parameters by minimizing the sum of squared deviations between the theoretical GPD function and its expectation. Additionally, we introduce a recently developed stopping rule to choose the suitable threshold above which the GPD asymptotically fits the exceedances. Simulation studies show that the proposed approach performs better or similar to existing approaches, in terms of bias and the mean square error, in estimating the shape parameter.
    [Show full text]
  • A Family of Skew-Normal Distributions for Modeling Proportions and Rates with Zeros/Ones Excess
    S S symmetry Article A Family of Skew-Normal Distributions for Modeling Proportions and Rates with Zeros/Ones Excess Guillermo Martínez-Flórez 1, Víctor Leiva 2,* , Emilio Gómez-Déniz 3 and Carolina Marchant 4 1 Departamento de Matemáticas y Estadística, Facultad de Ciencias Básicas, Universidad de Córdoba, Montería 14014, Colombia; [email protected] 2 Escuela de Ingeniería Industrial, Pontificia Universidad Católica de Valparaíso, 2362807 Valparaíso, Chile 3 Facultad de Economía, Empresa y Turismo, Universidad de Las Palmas de Gran Canaria and TIDES Institute, 35001 Canarias, Spain; [email protected] 4 Facultad de Ciencias Básicas, Universidad Católica del Maule, 3466706 Talca, Chile; [email protected] * Correspondence: [email protected] or [email protected] Received: 30 June 2020; Accepted: 19 August 2020; Published: 1 September 2020 Abstract: In this paper, we consider skew-normal distributions for constructing new a distribution which allows us to model proportions and rates with zero/one inflation as an alternative to the inflated beta distributions. The new distribution is a mixture between a Bernoulli distribution for explaining the zero/one excess and a censored skew-normal distribution for the continuous variable. The maximum likelihood method is used for parameter estimation. Observed and expected Fisher information matrices are derived to conduct likelihood-based inference in this new type skew-normal distribution. Given the flexibility of the new distributions, we are able to show, in real data scenarios, the good performance of our proposal. Keywords: beta distribution; centered skew-normal distribution; maximum-likelihood methods; Monte Carlo simulations; proportions; R software; rates; zero/one inflated data 1.
    [Show full text]
  • Extreme Value Theory and Backtest Overfitting in Finance
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Bowdoin College Bowdoin College Bowdoin Digital Commons Honors Projects Student Scholarship and Creative Work 2015 Extreme Value Theory and Backtest Overfitting in Finance Daniel C. Byrnes [email protected] Follow this and additional works at: https://digitalcommons.bowdoin.edu/honorsprojects Part of the Statistics and Probability Commons Recommended Citation Byrnes, Daniel C., "Extreme Value Theory and Backtest Overfitting in Finance" (2015). Honors Projects. 24. https://digitalcommons.bowdoin.edu/honorsprojects/24 This Open Access Thesis is brought to you for free and open access by the Student Scholarship and Creative Work at Bowdoin Digital Commons. It has been accepted for inclusion in Honors Projects by an authorized administrator of Bowdoin Digital Commons. For more information, please contact [email protected]. Extreme Value Theory and Backtest Overfitting in Finance An Honors Paper Presented for the Department of Mathematics By Daniel Byrnes Bowdoin College, 2015 ©2015 Daniel Byrnes Acknowledgements I would like to thank professor Thomas Pietraho for his help in the creation of this thesis. The revisions and suggestions made by several members of the math faculty were also greatly appreciated. I would also like to thank the entire department for their support throughout my time at Bowdoin. 1 Contents 1 Abstract 3 2 Introduction4 3 Background7 3.1 The Sharpe Ratio.................................7 3.2 Other Performance Measures.......................... 10 3.3 Example of an Overfit Strategy......................... 11 4 Modes of Convergence for Random Variables 13 4.1 Random Variables and Distributions...................... 13 4.2 Convergence in Distribution..........................
    [Show full text]
  • Procedures for Estimation of Weibull Parameters James W
    United States Department of Agriculture Procedures for Estimation of Weibull Parameters James W. Evans David E. Kretschmann David W. Green Forest Forest Products General Technical Report February Service Laboratory FPL–GTR–264 2019 Abstract Contents The primary purpose of this publication is to provide an 1 Introduction .................................................................. 1 overview of the information in the statistical literature on 2 Background .................................................................. 1 the different methods developed for fitting a Weibull distribution to an uncensored set of data and on any 3 Estimation Procedures .................................................. 1 comparisons between methods that have been studied in the 4 Historical Comparisons of Individual statistics literature. This should help the person using a Estimator Types ........................................................ 8 Weibull distribution to represent a data set realize some advantages and disadvantages of some basic methods. It 5 Other Methods of Estimating Parameters of should also help both in evaluating other studies using the Weibull Distribution .......................................... 11 different methods of Weibull parameter estimation and in 6 Discussion .................................................................. 12 discussions on American Society for Testing and Materials Standard D5457, which appears to allow a choice for the 7 Conclusion ................................................................
    [Show full text]
  • A New Weibull-X Family of Distributions: Properties, Characterizations and Applications Zubair Ahmad1* , M
    Ahmad et al. Journal of Statistical Distributions and Applications (2018) 5:5 https://doi.org/10.1186/s40488-018-0087-6 RESEARCH Open Access A new Weibull-X family of distributions: properties, characterizations and applications Zubair Ahmad1* , M. Elgarhy2 and G. G. Hamedani3 * Correspondence: [email protected] Abstract 1Department of Statistics, Quaid-i-Azam University 45320, We propose a new family of univariate distributions generated from the Weibull random Islamabad 44000, Pakistan variable, called a new Weibull-X family of distributions. Two special sub-models of the Full list of author information is proposed family are presented and the shapes of density and hazard functions are available at the end of the article investigated. General expressions for some statistical properties are discussed. For the new family, three useful characterizations based on truncated moments are presented. Three different methods to estimate the model parameters are discussed. Monti Carlo simulation study is conducted to evaluate the performances of these estimators. Finally, the importance of the new family is illustrated empirically via two real life applications. Keywords: Weibull distribution, T-X family, Moment, Characterizations, Order statistics, Estimation 1. Introduction In the field of reliability theory, modeling of lifetime data is very crucial. A number of statistical distributions such as Weibull, Pareto, Gompertz, linear failure rate, Rayleigh, Exonential etc., are available for modeling lifetime data. However, in many practical areas, these classical distributions do not provide adequate fit in modeling data, and there is a clear need for the extended version of these classical distributions. In this re- gard, serious attempts have been made to propose new families of continuous prob- ability distributions that extend the existing well-known distributions by adding additional parameter(s) to the model of the baseline random variable.
    [Show full text]
  • Package 'Renext'
    Package ‘Renext’ July 4, 2016 Type Package Title Renewal Method for Extreme Values Extrapolation Version 3.1-0 Date 2016-07-01 Author Yves Deville <[email protected]>, IRSN <[email protected]> Maintainer Lise Bardet <[email protected]> Depends R (>= 2.8.0), stats, graphics, evd Imports numDeriv, splines, methods Suggests MASS, ismev, XML Description Peaks Over Threshold (POT) or 'methode du renouvellement'. The distribution for the ex- ceedances can be chosen, and heterogeneous data (including histori- cal data or block data) can be used in a Maximum-Likelihood framework. License GPL (>= 2) LazyData yes NeedsCompilation no Repository CRAN Date/Publication 2016-07-04 20:29:04 R topics documented: Renext-package . .3 anova.Renouv . .5 barplotRenouv . .6 Brest . .8 Brest.years . 10 Brest.years.missing . 11 CV2............................................. 11 CV2.test . 12 Dunkerque . 13 EM.mixexp . 15 expplot . 16 1 2 R topics documented: fgamma . 17 fGEV.MAX . 19 fGPD ............................................ 22 flomax . 24 fmaxlo . 26 fweibull . 29 Garonne . 30 gev2Ren . 32 gof.date . 34 gofExp.test . 37 GPD............................................. 38 gumbel2Ren . 40 Hpoints . 41 ini.mixexp2 . 42 interevt . 43 Jackson . 45 Jackson.test . 46 logLik.Renouv . 47 Lomax . 48 LRExp . 50 LRExp.test . 51 LRGumbel . 53 LRGumbel.test . 54 Maxlo . 55 MixExp2 . 57 mom.mixexp2 . 58 mom2par . 60 NBlevy . 61 OT2MAX . 63 OTjitter . 66 parDeriv . 67 parIni.MAX . 70 pGreenwood1 . 72 plot.Rendata . 73 plot.Renouv . 74 PPplot . 78 predict.Renouv . 79 qStat . 80 readXML . 82 Ren2gev . 84 Ren2gumbel . 85 Renouv . 87 RenouvNoEst . 94 RLlegend . 96 RLpar . 98 RLplot . 99 roundPred . 101 rRendata . 102 Renext-package 3 SandT . 105 skip2noskip .
    [Show full text]
  • Parametric (Theoretical) Probability Distributions. (Wilks, Ch. 4) Discrete
    6 Parametric (theoretical) probability distributions. (Wilks, Ch. 4) Note: parametric: assume a theoretical distribution (e.g., Gauss) Non-parametric: no assumption made about the distribution Advantages of assuming a parametric probability distribution: Compaction: just a few parameters Smoothing, interpolation, extrapolation x x + s Parameter: e.g.: µ,σ population mean and standard deviation Statistic: estimation of parameter from sample: x,s sample mean and standard deviation Discrete distributions: (e.g., yes/no; above normal, normal, below normal) Binomial: E = 1(yes or success); E = 0 (no, fail). These are MECE. 1 2 P(E ) = p P(E ) = 1− p . Assume N independent trials. 1 2 How many “yes” we can obtain in N independent trials? x = (0,1,...N − 1, N ) , N+1 possibilities. Note that x is like a dummy variable. ⎛ N ⎞ x N − x ⎛ N ⎞ N ! P( X = x) = p 1− p , remember that = , 0!= 1 ⎜ x ⎟ ( ) ⎜ x ⎟ x!(N x)! ⎝ ⎠ ⎝ ⎠ − Bernouilli is the binomial distribution with a single trial, N=1: x = (0,1), P( X = 0) = 1− p, P( X = 1) = p Geometric: Number of trials until next success: i.e., x-1 fails followed by a success. x−1 P( X = x) = (1− p) p x = 1,2,... 7 Poisson: Approximation of binomial for small p and large N. Events occur randomly at a constant rate (per N trials) µ = Np . The rate per trial p is low so that events in the same period (N trials) are approximately independent. Example: assume the probability of a tornado in a certain county on a given day is p=1/100.
    [Show full text]
  • Hand-Book on STATISTICAL DISTRIBUTIONS for Experimentalists
    Internal Report SUF–PFY/96–01 Stockholm, 11 December 1996 1st revision, 31 October 1998 last modification 10 September 2007 Hand-book on STATISTICAL DISTRIBUTIONS for experimentalists by Christian Walck Particle Physics Group Fysikum University of Stockholm (e-mail: [email protected]) Contents 1 Introduction 1 1.1 Random Number Generation .............................. 1 2 Probability Density Functions 3 2.1 Introduction ........................................ 3 2.2 Moments ......................................... 3 2.2.1 Errors of Moments ................................ 4 2.3 Characteristic Function ................................. 4 2.4 Probability Generating Function ............................ 5 2.5 Cumulants ......................................... 6 2.6 Random Number Generation .............................. 7 2.6.1 Cumulative Technique .............................. 7 2.6.2 Accept-Reject technique ............................. 7 2.6.3 Composition Techniques ............................. 8 2.7 Multivariate Distributions ................................ 9 2.7.1 Multivariate Moments .............................. 9 2.7.2 Errors of Bivariate Moments .......................... 9 2.7.3 Joint Characteristic Function .......................... 10 2.7.4 Random Number Generation .......................... 11 3 Bernoulli Distribution 12 3.1 Introduction ........................................ 12 3.2 Relation to Other Distributions ............................. 12 4 Beta distribution 13 4.1 Introduction .......................................
    [Show full text]
  • Handbook on Probability Distributions
    R powered R-forge project Handbook on probability distributions R-forge distributions Core Team University Year 2009-2010 LATEXpowered Mac OS' TeXShop edited Contents Introduction 4 I Discrete distributions 6 1 Classic discrete distribution 7 2 Not so-common discrete distribution 27 II Continuous distributions 34 3 Finite support distribution 35 4 The Gaussian family 47 5 Exponential distribution and its extensions 56 6 Chi-squared's ditribution and related extensions 75 7 Student and related distributions 84 8 Pareto family 88 9 Logistic distribution and related extensions 108 10 Extrem Value Theory distributions 111 3 4 CONTENTS III Multivariate and generalized distributions 116 11 Generalization of common distributions 117 12 Multivariate distributions 133 13 Misc 135 Conclusion 137 Bibliography 137 A Mathematical tools 141 Introduction This guide is intended to provide a quite exhaustive (at least as I can) view on probability distri- butions. It is constructed in chapters of distribution family with a section for each distribution. Each section focuses on the tryptic: definition - estimation - application. Ultimate bibles for probability distributions are Wimmer & Altmann (1999) which lists 750 univariate discrete distributions and Johnson et al. (1994) which details continuous distributions. In the appendix, we recall the basics of probability distributions as well as \common" mathe- matical functions, cf. section A.2. And for all distribution, we use the following notations • X a random variable following a given distribution, • x a realization of this random variable, • f the density function (if it exists), • F the (cumulative) distribution function, • P (X = k) the mass probability function in k, • M the moment generating function (if it exists), • G the probability generating function (if it exists), • φ the characteristic function (if it exists), Finally all graphics are done the open source statistical software R and its numerous packages available on the Comprehensive R Archive Network (CRAN∗).
    [Show full text]
  • Probability Distributions Used in Reliability Engineering
    Probability Distributions Used in Reliability Engineering Probability Distributions Used in Reliability Engineering Andrew N. O’Connor Mohammad Modarres Ali Mosleh Center for Risk and Reliability 0151 Glenn L Martin Hall University of Maryland College Park, Maryland Published by the Center for Risk and Reliability International Standard Book Number (ISBN): 978-0-9966468-1-9 Copyright © 2016 by the Center for Reliability Engineering University of Maryland, College Park, Maryland, USA All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from The Center for Reliability Engineering, Reliability Engineering Program. The Center for Risk and Reliability University of Maryland College Park, Maryland 20742-7531 In memory of Willie Mae Webb This book is dedicated to the memory of Miss Willie Webb who passed away on April 10 2007 while working at the Center for Risk and Reliability at the University of Maryland (UMD). She initiated the concept of this book, as an aid for students conducting studies in Reliability Engineering at the University of Maryland. Upon passing, Willie bequeathed her belongings to fund a scholarship providing financial support to Reliability Engineering students at UMD. Preface Reliability Engineers are required to combine a practical understanding of science and engineering with statistics. The reliability engineer’s understanding of statistics is focused on the practical application of a wide variety of accepted statistical methods. Most reliability texts provide only a basic introduction to probability distributions or only provide a detailed reference to few distributions.
    [Show full text]
  • On Families of Generalized Pareto Distributions: Properties and Applications
    Journal of Data Science 377-396 , DOI: 10.6339/JDS.201804_16(2).0008 On Families of Generalized Pareto Distributions: Properties and Applications D. Hamed1 and F. Famoye2 and C. Lee2 1 Department of Mathematics, Winthrop University, Rock Hill, SC, USA; 2Department of Mathematics, Central Michigan University, Mount Pleasant, Michigan, USA In this paper, we introduce some new families of generalized Pareto distributions using the T-R{Y} framework. These families of distributions are named T-Pareto{Y} families, and they arise from the quantile functions of exponential, log-logistic, logistic, extreme value, Cauchy and Weibull distributions. The shapes of these T-Pareto families can be unimodal or bimodal, skewed to the left or skewed to the right with heavy tail. Some general properties of the T-Pareto{Y} family are investigated and these include the moments, modes, mean deviations from the mean and from the median, and Shannon entropy. Several new generalized Pareto distributions are also discussed. Four real data sets from engineering, biomedical and social science are analyzed to demonstrate the flexibility and usefulness of the T-Pareto{Y} families of distributions. Key words: Shannon entropy; quantile function; moment; T-X family. 1. Introduction The Pareto distribution is named after the well-known Italian-born Swiss sociologist and economist Vilfredo Pareto (1848-1923). Pareto [1] defined Pareto’s Law, which can be stated as N Axa , where N represents the number of persons having income x in a population. Pareto distribution is commonly used in modelling heavy tailed distributions, including but not limited to income, insurance and city size populations.
    [Show full text]
  • Discriminating Between the Weibull and Log-Normal Distributions
    Discriminating Between The Weibull and Log-Normal Distributions Debasis Kundu1 Anubhav Manglick2 Abstract Log-Normal and Weibull distributions are the most popular distributions for modeling skewed data. In this paper, we consider the ratio of the maximized likelihood in choosing between the two distributions. The asymptotic distribution of the logarithm of the maximized likeli- hood ratio has been obtained. It is observed that the asymptotic distribution is independent of the unknown parameters. The asymptotic distribution has been used to determine the minimum sample size required to discriminate between two families of distributions for a user speci¯ed probability of correct selection. We perform some numerical experiments to observe how the asymptotic methods work for di®erent sample sizes. It is observed that the asymptotic results work quite well even for small samples also. Two real data sets have been analyzed. Key Words and Phrases Asymptotic distributions; Likelihood ratio tests; Probability of correct selection; Location Scale Family. Address of Correspondence: Professor Debasis Kundu, Department of Mathematics, Indian Institute of Technology Kanpur, Pin 208016, INDIA. e-mail: [email protected]. 1 Department of Mathematics, Indian Institute of Technology Kanpur, Pin 208016, INDIA. 2 Faculty of Mathematics and Informatics, University of Passau, GERMANY. 1 1 Introduction: We address the following problem in this paper. Suppose an experimenter has observed n data points, say x1; : : : ; xn and he wants to use either two-parameter log-normal model or two parameter Weibull model, which one is preferable? It is well known that both the log-normal and Weibull models can be used quite e®ectively to analyze skewed data set.
    [Show full text]