Statistical distributions in general insurance stochastic processes

by

Jan Hendrik Harm Steenkamp

Submitted in partial fulfilment of the requirements for the degree

Magister Scientiae

In the Faculty of Natural & Agricultural Sciences, University of Pretoria, Pretoria

January 2014

© University of Pretoria


I, Jan Hendrik Harm Steenkamp, declare that the dissertation, which I hereby submit for the degree Magister Scientiae in Mathematical Statistics at the University of Pretoria, is my own work and has not previously been submitted for a degree at this or any other tertiary institution.

SIGNATURE:

DATE: 31 January 2014


Summary

A general insurance risk model consists of an initial reserve, the premiums collected, the return on investment of these premiums, the claims frequency and the claim sizes. Except for the initial reserve, these components are all stochastic. The assumed distribution of the claim sizes is an integral part of the model and can greatly influence decisions on reinsurance agreements and ruin probabilities.

An array of parametric distributions is available for describing the distribution of claims. The study is focussed on parametric distributions that have positive skewness and are defined for positive real values. The main properties and parameterizations are studied for a number of distributions. Maximum likelihood estimation and method-of-moments estimation are considered as techniques for fitting these distributions. Multivariate numerical maximum likelihood estimation algorithms are proposed, together with discussions on the efficiency of each of the estimation algorithms based on simulation exercises. These discussions are accompanied by programs developed in SAS PROC IML that can be used to simulate from the various parametric distributions and to fit these parametric distributions to observed data.

The presence of heavy upper tails in the context of general insurance claim size distributions indicates a high risk of observing very large and even extreme claims, which needs to be allowed for in the modelling of claims. Methods used to describe tail weight, together with techniques that can be used to detect the presence of heavy upper tails, are studied. These methods are then applied to the parametric distributions to classify the heaviness of their tails.

The study is concluded with an application of the techniques developed to fit the parametric distributions and to evaluate the tail heaviness of real-life claims data. The goodness-of-fit of the various fitted distributions is discussed. Based on the final results, further research topics are identified.


Acknowledgements

To Jesus Christ I am sincerely grateful for being blessed with the opportunity to have studied and for the ability to have conducted this study. I received strength to persist despite setbacks and challenges.

To my family and in particular my mother I would like to express my thanks for your support and encouragement throughout the duration of my studies.

To my supervisor, Dr. Inger Fabris-Rotelli, thank you for your time, sharing of knowledge, valuable input and guidance during this research.

To my employer and colleagues I would like to express my thanks for your support and helping me to free up time to do this research.


Contents

1 Introduction 18
   1.1 Problem Statement ...... 18
   1.2 Literature Review ...... 21
   1.3 Structure of the dissertation ...... 26

2 General Insurance Overview 28
   2.1 Inter-occurrence times ...... 29
   2.2 Number of claims ...... 30
      2.2.1 Poisson Process ...... 32
      2.2.2 Mixed Poisson Process ...... 34
   2.3 Claim size ...... 37
   2.4 Aggregate claim amount ...... 39
   2.5 Premium Income ...... 42
   2.6 Risk Reserve ...... 43
   2.7 Reinsurance ...... 44
   2.8 Ruin Problems ...... 47

3 Theory and Properties of Distributions 48
   3.1 Introduction ...... 48
   3.2 Moments of parametric distributions ...... 55
   3.3 Distributions with monotone hazard rates ...... 56
   3.4 Heavy-tailed distributions ...... 59
      3.4.1 Classification of tail heaviness based on the moment generating function ...... 59
      3.4.2 Classification of tail heaviness based on exponential boundedness ...... 60


      3.4.3 Comparison of tail weights using limiting tail behaviour ...... 61
      3.4.4 Classification of tail heaviness based on the hazard function ...... 63
      3.4.5 Classification of tail heaviness based on the hazard rate function ...... 65
      3.4.6 Classification of tail heaviness based on the residual hazard rate function ...... 66
   3.5 Left Censored Variables and Integrated Tail Distributions ...... 69
      3.5.1 Excess Loss Variable ...... 69
      3.5.2 Left Censored ...... 70
      3.5.3 Limited Loss Variable ...... 70
      3.5.4 Coefficient of variation when the distribution is heavy-tailed ...... 71
   3.6 Subexponential Distributions ...... 74
   3.7 Practical Methods to Detect Heavy Tails ...... 75
      3.7.1 Overview ...... 75
      3.7.2 Method 1: Using definition of the subexponential class ...... 75
      3.7.3 Method 2: Using the mean residual hazard rate function ...... 76
      3.7.4 Method 3: Testing for an exponentially bounded tail using quantile plots ...... 77

4 Parametric Loss Distributions 80
   4.1 Continuous Distributions for Claim Sizes ...... 81
      4.1.1 Gamma Distribution ...... 81
      4.1.2 Exponential Distribution ...... 81
      4.1.3 Chi-square Distribution ...... 83
      4.1.4 Two-parameter Exponential Distribution ...... 84
      4.1.5 Erlang Distribution ...... 85
      4.1.6 Extreme Value Distribution ...... 87
      4.1.7 Frechet Distribution ...... 92
      4.1.8 Weibull Distribution ...... 93
      4.1.9 Rayleigh Distribution ...... 96
      4.1.10 Gumbel Distribution ...... 96
      4.1.11 Pareto (Lomax) Distribution ...... 100


      4.1.12 Generalized Pareto Distribution ...... 102
      4.1.13 Lognormal Distribution ...... 103
      4.1.14 Beta-prime Distribution (Pearson Type VI) ...... 106
      4.1.15 Birnbaum-Saunders Distribution ...... 108
      4.1.16 Burr Distribution ...... 109
      4.1.17 Dagum Distribution ...... 113
      4.1.18 Generalized Beta Distribution of the Second Kind ...... 116
      4.1.19 Singh-Maddala Distribution ...... 118
      4.1.20 Kappa Family of Distributions ...... 121
      4.1.21 Loggamma Distribution ...... 123
      4.1.22 Snedecor’s F Distribution ...... 126
      4.1.23 Log-logistic Distribution ...... 128
      4.1.24 Folded and Half Normal Distributions ...... 129
      4.1.25 Inverse Gamma Distribution ...... 133
      4.1.26 Inverse Chi-square Distribution ...... 135
      4.1.27 Inverse Gaussian Distribution ...... 136
      4.1.28 Skew-normal Distribution ...... 138
      4.1.29 Exponential-Gamma Distribution ...... 141
   4.2 Other Useful Distributions ...... 143
      4.2.1 Normal Distribution ...... 143
      4.2.2 Logistic Distribution ...... 146
      4.2.3 ...... 147

5 Tail Behaviour of Parametric Distributions 149
   5.1 Introduction ...... 149
   5.2 Classification of Tail Heaviness for Parametric Continuous Distributions ...... 150
      5.2.1 Gamma and related distributions ...... 151
      5.2.2 Erlang Distribution ...... 153
      5.2.3 Generalized Extreme Value Distribution ...... 153
      5.2.4 Pareto Distribution ...... 155
      5.2.5 Generalized Pareto Distribution ...... 156
      5.2.6 Lognormal Distribution ...... 158


      5.2.7 Beta-prime Distribution ...... 158
      5.2.8 Birnbaum-Saunders Distribution ...... 160
      5.2.9 Burr and related distributions ...... 161
      5.2.10 Dagum Distribution ...... 163
      5.2.11 Singh-Maddala Distribution ...... 164
      5.2.12 Kappa Family of Distributions ...... 165
      5.2.13 Generalized Beta Distribution of the Second Kind ...... 167
      5.2.14 Log-logistic Distribution ...... 168
      5.2.15 Folded Normal Distribution ...... 170
      5.2.16 Inverse Gamma and related distributions ...... 170
      5.2.17 Loggamma Distribution ...... 172
      5.2.18 Inverse Gaussian Distribution ...... 173
      5.2.19 Snedecor’s F Distribution ...... 174
      5.2.20 Skew-normal Distribution ...... 174

6 Fitting Parametric Loss Distributions 177
   6.1 Introduction ...... 177
   6.2 Numerical Optimisation ...... 179
   6.3 Simulating Parametric Loss Distributions ...... 180
   6.4 Fitting Analysis ...... 182
   6.5 Methods of Fitting Parametric Loss Distributions ...... 183
      6.5.1 Gamma Distribution ...... 183
      6.5.2 Exponential Distribution ...... 185
      6.5.3 Chi-square Distribution ...... 186
      6.5.4 Two-parameter Exponential Distribution ...... 187
      6.5.5 Erlang Distribution ...... 189
      6.5.6 Generalized Extreme Value Distributions ...... 189
      6.5.7 Frechet Distribution ...... 191
      6.5.8 Weibull Distribution ...... 194
      6.5.9 Rayleigh Distribution ...... 199
      6.5.10 Gumbel Distribution ...... 200
      6.5.11 Pareto Distribution ...... 202
      6.5.12 Generalized Pareto Distribution ...... 203


      6.5.13 Lognormal Distribution ...... 205
      6.5.14 Beta-prime Distribution ...... 206
      6.5.15 Birnbaum-Saunders Distribution ...... 208
      6.5.16 Burr and related distributions ...... 209
      6.5.17 Dagum Distribution ...... 210
      6.5.18 Generalized Beta Distribution of the Second Kind ...... 212
      6.5.19 Kappa Family of Distributions ...... 214
      6.5.20 Log-logistic Distribution ...... 216
      6.5.21 Folded Normal Distribution ...... 218
      6.5.22 Inverse Gamma and related distributions ...... 219
      6.5.23 Loggamma Distribution ...... 221
      6.5.24 Snedecor’s F Distribution ...... 223
      6.5.25 Inverse Gaussian Distribution ...... 225
      6.5.26 Skew-normal Distribution ...... 227

7 Practical Application 232
   7.1 Introduction ...... 232
   7.2 Goodness-of-fit Assessment ...... 232
      7.2.1 Likelihood ratio test ...... 233
      7.2.2 Quantile plots ...... 234
   7.3 Claims Size Data ...... 236
      7.3.1 Background ...... 236
      7.3.2 Description of data and segmentation ...... 237
   7.4 Fitting of Distributions ...... 238
      7.4.1 Approach ...... 239
      7.4.2 Results ...... 239
      7.4.3 Conclusion ...... 265

8 Conclusion 267
   8.1 Overview of study conducted ...... 267
   8.2 Results and Observations ...... 269
   8.3 Future Research ...... 270


A SAS Code 272
   A.1 Gamma Distribution ...... 272
   A.2 Exponential Distribution ...... 274
   A.3 Chi-square Distribution ...... 275
   A.4 Two-parameter Exponential Distribution ...... 276
   A.5 Erlang Distribution ...... 277
   A.6 Frechet Distribution ...... 278
   A.7 Weibull Distribution ...... 281
   A.8 Rayleigh Distribution ...... 284
   A.9 Gumbel Distribution ...... 285
   A.10 Pareto Distribution ...... 288
   A.11 Generalized Pareto Distribution ...... 290
   A.12 Lognormal Distribution ...... 293
   A.13 Beta-prime Distribution ...... 294
   A.14 Birnbaum-Saunders Distribution ...... 296
   A.15 Burr Distribution ...... 298
   A.16 Dagum Distribution ...... 300
   A.17 Generalized Beta Distribution of the Second Kind ...... 302
   A.18 Kappa Family of Distributions - Three-parameter case ...... 305
   A.19 Log-logistic Distribution ...... 307
   A.20 Folded Normal Distribution ...... 310
   A.21 Inverse Gamma Distribution ...... 313
   A.22 Inverse Chi-square Distribution ...... 315
   A.23 Loggamma Distribution ...... 316
   A.24 Snedecor’s F Distribution ...... 318
   A.25 Inverse Gaussian Distribution ...... 320
   A.26 Skew-normal Distribution ...... 323


B Summary of Continuous Distributions 329
   B.1 Beta-prime Distribution - δ1, δ2 ...... 329
   B.2 Birnbaum-Saunders Distribution - α, β ...... 330
   B.3 Burr Type XII Distribution - c, k ...... 330
   B.4 Chi-square Distribution - ν ...... 331
   B.5 Dagum Distribution - a, b, p ...... 332
   B.6 Erlang Distribution - λ, n ...... 333
   B.7 Exponential Distribution - θ ...... 333
   B.8 Folded Normal Distribution - µ, σ ...... 334
   B.9 Frechet Distribution - β, δ, λ ...... 336
   B.10 Gamma Distribution - θ, κ ...... 336
   B.11 Generalized Beta Distribution of the Second Kind - a, b, p, q ...... 337
   B.12 Generalized Pareto Distribution - κ, θ, τ ...... 338
   B.13 Gumbel Distribution - α, ξ ...... 339
   B.14 Inverse Chi-square Distribution - ν ...... 339
   B.15 Inverse Gamma Distribution - α, θ ...... 340
   B.16 Inverse Gaussian Distribution - µ, λ ...... 341
   B.17 Kappa Family (Two-parameter) Distribution - α, β ...... 341
   B.18 Kappa Family (Three-parameter) Distribution - α, β, θ ...... 342
   B.19 Loggamma Distribution - a, λ ...... 343
   B.20 Logistic Distribution - θ, ξ ...... 344
   B.21 Log-logistic Distribution - α, β ...... 344
   B.22 Lognormal Distribution - µ, σ ...... 345
   B.23 Normal Distribution - µ, σ ...... 345
   B.24 Pareto Type II (Lomax) Distribution - κ, θ ...... 346
   B.25 Rayleigh Distribution - θ ...... 347
   B.26 Singh-Maddala Distribution - a, b, q ...... 347
   B.27 Skew-normal Distribution - µ, σ, λ (Centered Parameterization) ...... 348
   B.28 Snedecor’s F Distribution - ν1, ν2 ...... 349
   B.29 Two-parameter Exponential Distribution - η, θ ...... 350
   B.30 Weibull Distribution - β, θ ...... 351


List of Figures

1.1 Distribution of log-returns for the JSE ALSI (2002-2012) ...... 19
1.2 Example of a distribution of claim sizes ...... 20
1.3 Example of a heavy-tailed distribution ...... 21

2.1 Illustration of discrete and continuous time intervals ...... 29
2.2 Example of a sample path of a claim number process ...... 31
2.3 Comparison of Weibull (θ = 4, β = 1.1) Upper Tail and Exponential Bound (e^{−4^{−1.1}x}) ...... 38

3.1 Comparison of Exponential (θ = 2) and Weibull (β = 0.95, θ = 3) probability density functions ...... 49
3.2 Comparison of the probability density and cumulative distribution functions for Exponential distribution with θ = 2 ...... 49
3.3 Comparison of the probability density and cumulative distribution functions for Weibull distribution with β = 0.95 and θ = 3 ...... 50
3.4 Comparison of ratios of Exponential (θ = 2) and Weibull (β = 0.95, θ = 3) probability density functions to upper tail weights ...... 50
3.5 Comparison of Exponential density functions for different values of θ ...... 53
3.6 Comparison of upper tail of Exponential distribution and exponential bound ...... 61
3.7 Comparison of the limiting tail behaviour of the Gamma and Pareto Type II distributions ...... 62
3.8 Comparison of two Pareto Type II distributions ...... 68
3.9 Quantile plots to assess fitted Exponential distributions ...... 79


4.1 Gamma Density Function with varying values of θ ...... 82
4.2 Gamma Density Function with varying values of κ ...... 82
4.3 Exponential Density Function with varying values of θ ...... 83
4.4 Chi-square Density Function with varying values of ν ...... 84
4.5 Double Exponential Density Function with varying values of θ ...... 85
4.6 Double Exponential Density Function with varying values of η ...... 86
4.7 Erlang Density Function with varying values of λ ...... 87
4.8 Erlang Density Function with varying values of n ...... 88
4.9 Frechet Density Function with varying values of β ...... 94
4.10 Frechet Density Function with varying values of δ ...... 94
4.11 Frechet Density Function with varying values of λ ...... 95
4.12 Weibull Density Function with varying values of θ ...... 96
4.13 Weibull Density Function with varying values of β ...... 97
4.14 Gumbel Density Function with varying values of α ...... 100
4.15 Gumbel Density Function with varying values of ξ ...... 100
4.16 Pareto Type II Density Function with varying values of κ ...... 102
4.17 Pareto Type II Density Function with varying values of θ ...... 103
4.18 Generalized Pareto Density Function with varying values of κ ...... 104
4.19 Generalized Pareto Density Function with varying values of τ ...... 104
4.20 Generalized Pareto Density Function with varying values of θ ...... 105
4.21 Lognormal Density Function with varying values of µ ...... 105
4.22 Lognormal Density Function with varying values of σ ...... 106
4.23 Pearson Type VI Density Function with varying values of δ1 and δ2 ...... 107
4.24 Birnbaum-Saunders Density Function with varying values of α ...... 109
4.25 Birnbaum-Saunders Density Function with varying values of β ...... 110
4.26 Burr Type XII Density Function with varying values of k ...... 112
4.27 Burr Type XII Density Function with varying values of c ...... 112
4.28 Burr Type III Density Function with varying values of k ...... 114
4.29 Burr Type III Density Function with varying values of c ...... 114
4.30 Dagum Density Function with varying values of b ...... 115
4.31 Dagum Density Function with varying values of p ...... 116


4.32 Generalized Beta (of the Second Kind) Density Function with varying values of p ...... 117
4.33 Generalized Beta (of the Second Kind) Density Function with varying values of q ...... 118
4.34 Singh-Maddala Density Function with varying values of q ...... 120
4.35 Kappa Two-Parameter Density Function with varying values of α ...... 122
4.36 Kappa Two-Parameter Density Function with varying values of β ...... 122
4.37 Kappa Three-Parameter Density Function with varying values of α ...... 124
4.38 Kappa Three-Parameter Density Function with varying values of β ...... 124
4.39 Kappa Three-Parameter Density Function with varying values of θ ...... 125
4.40 Comparison of the Loggamma and Gamma distributions ...... 125
4.41 Loggamma Density Function with varying values of a ...... 126
4.42 Loggamma Density Function with varying values of λ ...... 127
4.43 F Density Function with varying values of ν1 ...... 127
4.44 F Density Function with varying values of ν2 ...... 128
4.45 Log-logistic Density Function with varying values of α ...... 129
4.46 Log-logistic Density Function with varying values of β ...... 130
4.47 Comparison of the Normal and Folded Normal Distributions for µ ≠ 0 ...... 130
4.48 Folded Normal Density Function with varying values of µ ...... 131
4.49 Folded Normal Density Function with varying values of σ ...... 132
4.50 Inverse Gamma Density Function with varying values of α ...... 133
4.51 Inverse Gamma Density Function with varying values of β ...... 134
4.52 Comparison of Gamma and Inverse Gamma Distributions ...... 134
4.53 Inverse Gaussian Density Function with varying values of µ ...... 137
4.54 Inverse Gaussian Density Function with varying values of λ ...... 137
4.55 Skew-normal Density Function with varying values of λ ...... 139
4.56 Exponential-Gamma Density Function with varying values of α ...... 142
4.57 Exponential-Gamma Density Function with varying values of k ...... 143
4.58 Comparison of Normal and related distributions ...... 144


4.59 Logistic Distribution with varying values of ξ ...... 146
4.60 Logistic Distribution with varying values of θ ...... 147

5.1 Comparison of the Exponential and Two-parameter Exponential Densities ...... 152
5.2 Ratio of Variance to Square of Expected value for the Birnbaum-Saunders distribution for varying values of α ...... 161
5.3 Hazard Rate Function for the Burr Type XII distribution for varying parameter values ...... 162
5.4 Hazard Rate Function for the Dagum distribution for varying parameter values ...... 163
5.5 Hazard Rate Function for the Two-Parameter Kappa distribution for varying parameter values ...... 166
5.6 Hazard Rate Function for the Three-Parameter Kappa distribution for varying parameter values ...... 166
5.7 Ratio of Variance to Square of Expected value for the Generalized Beta distribution for varying parameter values ...... 167
5.8 Hazard rate function of Log-logistic distribution for various parameter values ...... 169
5.9 Evaluation of var(X)/(E(X))^2 for varying values of λ and a ...... 172
5.10 Hazard rate function of Skew-normal distribution for various parameter values ...... 175

7.1 Empirical Distribution of the Observed Claims ...... 238
7.2 Quantile-quantile plot: Fitted Gamma Distribution ...... 240
7.3 Quantile-quantile plot: Fitted Birnbaum-Saunders Distribution ...... 240
7.4 Quantile-quantile plot: Fitted Loggamma Distribution ...... 241
7.5 Quantile-quantile plot: Fitted Inverse Gaussian Distribution ...... 242
7.6 Quantile-quantile plot: Fitted Generalized Pareto Distribution ...... 242
7.7 Quantile-quantile plot: Fitted Pareto Distribution ...... 243
7.8 Quantile-quantile plot: Fitted Lognormal Distribution ...... 244
7.9 Quantile-quantile plot: Fitted Folded Normal Distribution ...... 245
7.10 Quantile-quantile plot: Fitted Rayleigh Distribution ...... 245
7.11 Quantile-quantile plot: Fitted Log-logistic Distribution ...... 246
7.12 Quantile-quantile plot: Fitted Burr Distribution ...... 246
7.13 Quantile-quantile plot: Fitted Weibull Distribution ...... 247


7.14 Quantile-quantile plot: Fitted Gumbel Distribution ...... 247
7.15 Quantile-quantile plot: Fitted Frechet Distribution ...... 248
7.16 Quantile-quantile plot: Fitted Exponential and Two-parameter Exponential Distributions ...... 249
7.17 Quantile-quantile plot: Fitted Beta-prime Distribution ...... 249
7.18 Quantile-quantile plot: Fitted Dagum Distribution ...... 250
7.19 Quantile-quantile plot: Fitted Kappa Distribution ...... 251
7.20 Quantile-quantile plot: Fitted Inverse Gamma Distribution ...... 251
7.21 Expected value of Snedecor F distributed variable for varying values of ν2 ...... 252
7.22 Skewness of Skew-normal distribution for varying values of λ ...... 253
7.23 Quantile-quantile plot: Fitted Generalized Beta Distribution of the Second Kind ...... 254
7.24 Quantile-quantile plots comparison for fitted Loggamma, Lognormal and Inverse Gaussian distributions ...... 255
7.25 Density functions for fitted Loggamma, Lognormal and Inverse Gaussian distributions ...... 255
7.26 Distribution of observed claims smaller than or equal to 350000 ...... 258
7.27 Distribution of observed claims larger than 350000 ...... 259
7.28 Comparison of quantile-quantile plots for distributions fitted to the small claims ...... 259
7.29 Density functions of the fitted Inverse Gaussian and Generalized Pareto distributions ...... 261
7.30 Comparison of quantile-quantile plots for distributions fitted to the large claims ...... 263
7.31 Density functions of the fitted Gamma, Inverse Gaussian and Folded Normal distributions ...... 264


List of Tables

7.1 Descriptive Statistics of the Observed Claims ...... 238
7.2 Goodness-of-fit Statistics of the fitted distributions ...... 256
7.3 Descriptive Statistics of the Observed Claims smaller than 350000 ...... 260
7.4 Goodness-of-fit Statistics of the fitted distributions to claims smaller than 350000 ...... 262
7.5 Descriptive Statistics of the Observed Claims larger than 350000 ...... 262
7.6 Goodness-of-fit Statistics of the fitted distributions to claims larger than 350000 ...... 264


Symbols and Notation

χ²(ν) Chi-square distribution with parameter ν.

EXP (θ) Exponential distribution with parameter θ.

F_X(·), F(·), F Cumulative distribution function, where subscript X indicates that it is for a specific random variable X.

f_X(·), f(·), f Probability density function, where subscript X indicates that it is for a specific random variable X.

F̄(·) Tail function 1 − F(·).

F⁻¹(·) Generalized inverse function of F(·).

GAM(θ, κ) Gamma distribution with parameters θ and κ.

INVχ²(ν) Inverse Chi-square distribution with parameter ν.

INV GAM(α, θ) Inverse Gamma distribution with parameters α and θ.

LN(µ, σ) Lognormal distribution with parameters µ and σ.

M_X(t) Moment generating function for a random variable X.

N(µ, σ²) Normal distribution with mean µ and variance σ².

SN(µ, σ, λ) Skew-normal distribution with parameters µ, σ and λ.

Ψ(·) Digamma function.

Ψ′(·) Trigamma function.


Chapter 1

Introduction

1.1 Problem Statement

Statistical distributions and their properties have been widely studied and applied in many fields in practice. One of the components of the risk models considered in general insurance is that of the claims. Two processes can be distinguished: the claim frequencies and the claim sizes.

- Claim frequency

The number of claims that occur in a specific time period is a discrete random variable that can take on any nonnegative integer value. This is where a discrete counting distribution is frequently used.

- Claim size

Continuous distributions have in particular been studied in the context of general insurance to model claim sizes [75], [105]. The random variable for claim sizes is defined on R⁺. Well-known distributions that are often used in practice to model claim sizes include the Gamma, Pareto and Weibull distributions.
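The two processes are often combined in a compound model: a random number of claims per period, each claim with a random size. The sketch below illustrates this with a Poisson frequency and Gamma severities; the parameter values and function names are purely illustrative assumptions of this sketch, not taken from the data studied later.

```python
import math
import random

def poisson(lam, rng=random):
    """Sample a Poisson(lam) count via Knuth's multiplication method."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def aggregate_claim(lam, shape, scale, rng=random):
    """One period of a compound Poisson model: a Poisson(lam) number of
    claims, each with a Gamma(shape, scale) claim size."""
    n = poisson(lam, rng)
    return sum(rng.gammavariate(shape, scale) for _ in range(n))

# Simulate 5000 periods; E[S] = lam * shape * scale = 3 * 2 * 500 = 3000.
random.seed(1)
totals = [aggregate_claim(lam=3.0, shape=2.0, scale=500.0) for _ in range(5000)]
```

The dissertation itself carries out such simulations in SAS PROC IML; the stdlib-Python version above is only meant to make the structure of the model concrete.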

In practice the assumption of normality is often used in problems across various industries, as the Normal distribution is well known, well studied and easy to implement and use for inference. When the assumption of normality is satisfied, techniques such as Monte Carlo simulation can be used (although normality is not a prerequisite for this simulation technique). In some real-life problems the assumption of normality is reasonable and provides the opportunity for performing inference on this basis. Luenberger [82] states that the stochastic process of stock prices can be described by a discrete-time multiplicative model that satisfies the condition that the natural logarithm of the return of a share of stock at a given time


(assuming stock prices are observed daily, for example) is a Normally distributed random variable. Furthermore, the series of log-returns is independent. This theory is given for a single stock price process. To illustrate this concept, the log-returns based on daily historical closing values for the Johannesburg Stock Exchange All-share Index¹ (JSE ALSI) for the period January 2002 to July 2012 were considered. The empirical distribution is illustrated by the histogram in Figure 1.1.

Figure 1.1: Distribution of log-returns for the JSE ALSI (2002-2012)

The Kolmogorov-Smirnov test statistic for testing the hypothesis of normality has an associated p-value of > 0.15, which means that this hypothesis cannot be rejected, although the histogram in Figure 1.1 suggests that the distribution may be slightly skewed to the right. An assumption of normality therefore seems reasonable for this example, and it appears appropriate to model stock log-returns with a Normal distribution.
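The computation behind this test can be sketched as follows. This is a minimal stand-alone illustration of the Kolmogorov-Smirnov statistic for normality; the function names and the simulated price series are assumptions of this sketch, not the ALSI data.

```python
import math
import random

def log_returns(prices):
    """Daily log-returns: r_t = ln(P_t / P_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def normal_cdf(x):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ks_statistic_normal(sample):
    """One-sample KS statistic of the standardized sample against the
    standard Normal CDF: D = sup_x |F_n(x) - Phi(x)|."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    z = sorted((x - mean) / sd for x in sample)
    d = 0.0
    for i, zi in enumerate(z):
        cdf = normal_cdf(zi)
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d

# Simulated closing prices whose log-returns are Normal by construction,
# so D should stay well below the approximate 5% critical value 1.36/sqrt(n).
random.seed(0)
prices = [100.0]
for _ in range(2500):
    prices.append(prices[-1] * math.exp(random.gauss(0.0005, 0.012)))
d = ks_statistic_normal(log_returns(prices))
```

Note that because the mean and standard deviation are estimated from the same data, the plain KS critical values are conservative (the Lilliefors correction applies); statistical software typically reports an adjusted p-value.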

For the example of the stock price process, the domain of the log-returns is theoretically unbounded while the mean is non-zero. The Normal distribution is therefore suitable not only because the shape of the empirical distribution is similar to the shape of the Normal distribution, but also in terms of the domain of the log-returns. These returns can also be centralised (in order to have a mean-adjusted series with mean zero and the variance unchanged) or standardized (in order to have a mean-adjusted series with mean zero and unit variance). Often, in practice, the domain of a specific random variable is limited to a bounded subset of the real numbers. If one considers a group of motor insurance policies, then the number of policyholders may be symmetrically distributed across the ages of the policyholders. Parametric distributions such as the

1Taken from http://markets.ft.com


Normal and Student-t distributions are not suitable to use, since these distributions imply non-zero probabilities associated with values falling outside of the domain of the random variable for the policyholder’s age.

When one considers general insurance claims, the claim sizes are often neither Normally nor symmetrically distributed. These distributions are also only defined for positive real values. An example of a typical claim size distribution, given in Figure 1.2 and taken from Klugman et al [75], shows that this distribution is not symmetric.

Figure 1.2: Example of a distribution of claim sizes

In cases where the distributions of claim sizes exhibit high degrees of positive skewness, it is important to consider the concept of upper tail heaviness of the distributions. A heavy upper tail in the context of general insurance claims implies that the probability of observing a very large claim in a specific range becomes very small relative to the probability of observing claims in that range and larger. This means that there will still be significant weight in such an upper tail, despite the low densities at those larger values. Figure 1.3 gives an example of such a heavy-tailed distribution. An understanding of the upper tail weight of a distribution can provide insight into the risk associated with the occurrence of very large claims. This study is focussed on developing an understanding of the overall structure of a general insurance model as well as identifying and briefly describing each of the components within this model. One of these components deals specifically with claim sizes. The use of parametric distributions to model claim sizes is considered in this study by introducing an array of parametric distributions together with their properties. Techniques that can be used to identify heavy tails are studied and applied to the parametric distributions to classify each as heavy-tailed or not. The maximum likelihood estimation

© University of Pretoria

CHAPTER 1. INTRODUCTION 21

Figure 1.3: Example of a heavy-tailed distribution

process that can be followed to fit each distribution is derived, after which it is applied to simulated datasets. The study is concluded by fitting the parametric distributions to a real-life set of claims, after which the goodness-of-fit of each fitted distribution is briefly discussed, together with a description of the tail weight of these claims using the techniques introduced.
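One way to make the notion of a heavy upper tail concrete is to compare a tail function with an exponential bound, a criterion formalized later in the study. The sketch below (with arbitrary illustrative parameter values, an assumption of this sketch) compares the Pareto Type II (Lomax) tail function with that of an Exponential distribution having the same mean; the ratio grows without bound, which is characteristic of a heavy tail.

```python
import math

def pareto_sf(x, kappa, theta):
    """Pareto Type II (Lomax) tail function: (theta / (theta + x))**kappa."""
    return (theta / (theta + x)) ** kappa

def exp_sf(x, lam):
    """Exponential tail function: exp(-lam * x)."""
    return math.exp(-lam * x)

kappa, theta = 2.5, 3.0       # illustrative Pareto parameters; mean = theta/(kappa-1)
lam = (kappa - 1.0) / theta   # Exponential rate giving the same mean
ratios = [pareto_sf(x, kappa, theta) / exp_sf(x, lam) for x in (5.0, 20.0, 50.0, 100.0)]
```

However small the rate λ is chosen, the ratio eventually diverges: the Pareto tail admits no exponential bound, whereas a light-tailed distribution would keep the ratio bounded.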

1.2 Literature Review

With the problem statement in mind, the literature review was conducted with a focus on skew distributions and the use of these distributions in general insurance. Furthermore, it was of interest to gain an overview of which topics in general insurance have been studied more recently.

The topic of skew distributions is well studied across different industries, although its application in insurance (and specifically general or non-life insurance) appears to have been studied to a lesser extent. Specific topics within general insurance (not necessarily related to the use of distributions) have been studied extensively in recent years. An overview of the observations made from the literature review is discussed below and is structured as follows:

- We start off with a brief overview of the research conducted more recently on skew and heavy-tailed distributions.


- The focus then moves to research done in general insurance with the specific use of distributions to model claim sizes.

- A brief discussion is given on the main themes (not only limited to the use of statistical distributions) that have received the most attention in more recent research in general insurance.

- The discussion is concluded by consolidating these observations and discussing how these topics fit into the context of, and serve as motivation for, this dissertation.

Skew and Heavy-tailed Distributions

It is widely recognised in the literature that there is a need for considering skew and potentially heavy-tailed distributions. These parametric distributions have been used and studied extensively. It is argued by Önalan [93], for example, that asset returns are often skewed. If the distribution of asset returns is in fact skew, measures such as value at risk (VaR) and expected shortfall (which are important measures of market risk) will be misstated if a Normal distribution is still assumed. In this context Önalan considers the Inverse Gaussian distribution, as it has the flexibility to be represented as a skew distribution and has heavier tails than the Normal distribution.

An extensive amount of research has been conducted specifically on finding skew distributions by using generalizations of well-known parametric distributions. A generalization of the Normal distribution is compared with the known Skew-normal distributions, while new forms of the skew Student-t and skew Cauchy distributions are studied by Abtahi et al [1].

Similar studies were conducted by Nadarajah and Kotz [89] [90] using the probability density and cumulative distribution functions of the Logistic, Laplace, Student-t, Uniform, Exponential Power, Bessel function and Generalized t distributions, as well as Types II and VII of the Pearson distribution. In these studies the general form 2g(u)G(λu) is used, where g(u) and G(u) represent the probability density function and the cumulative distribution function, respectively, coming from the same distribution. Generalizations of the Logistic distribution are also studied by Gupta and Kundu [61], specifically for unimodal data.

In more recent research Nadarajah and Kotz [92] use a similar form to that considered in their earlier study [90]. In this research a skew distribution of the form 2f(u)G(λu) is considered, where f(u) and G(u) represent the probability density function and the cumulative distribution function, respectively, of any two distributions (not necessarily the same distribution). A combination of two


distributions, typically chosen from the Normal, Student-t, Cauchy, Laplace, Logistic or Uniform distributions, is used. When both the density function and the cumulative distribution function are from the Normal distribution, the resulting distribution is the Skew-normal distribution as introduced by Azzalini and Capitanio [8], [9] and Pewsey [100].

Recognising that the topic of skew distributions had been widely studied, a general class of asymmetric univariate distributions is considered by Arellano-Valle et al. [5], which contains the complete family of univariate asymmetric distributions. Theoretical properties of this class of distributions are discussed, together with simulation methods, maximum likelihood estimation, method-of-moments estimation and extensions to multivariate skew distributions.

In a similar way a generalization is introduced by Goerg [58], in which a parametric nonlinear transformation of symmetric distributions is introduced in order to allow for asymmetry. In its most general form a random variable is modelled as the transformation of a standardized, symmetrically distributed random variable, where the symmetric distribution of the underlying variable is well-known and has useful properties. Azzalini and Genton [10] consider a generalization of a parametric class of distributions (of which the skew t distribution is a member) with parameters that can be used to regulate skewness, as is reasonable to consider in some practical cases.

In contrast to finding skew distributions by means of generalizations of known parametric distributions, a class of Skew-normal distributions, resulting from the convolution of two independent random variables, one being Normal and one being Beta distributed, is introduced by Barrera et al. [14].

Non-parametric skew distributions have also been considered using tech- niques such as kernel density estimation. In one example of such a study, conducted by Lotti and Santarelli [81], kernel density estimation was used to obtain an estimated distribution which showed positive skewness and sat- isfied theory associated with industrial heterogeneity.

Skew Distributions in General Insurance

It is evident that the use of skew distributions in general insurance is focussed more on parametric distributions such as the Weibull, Gamma, Lognormal, Pareto and Generalized Pareto for modeling claim sizes; see, for example, [39], [66], [20], [43], [105].


Space-time processes are studied by Davis and Mikosch [38], where the distributions thereof are not Gaussian but of an extreme value type, such as the Pareto distribution. Such processes exhibit typical distributions at different locations at different points in time, and therefore make sense to employ in general insurance applications, for example where certain neighbourhoods are more likely to have burglaries, and to have them more frequently in the festive season.

To decide on an insurance deductible (the portion of a claim that will be retained by the policyholder) or the value of claim sizes (whether per individual policy or per aggregate claim size from all policies) from which reinsurance should be implemented, measures such as value at risk (VaR) can be used. Against this background Bali and Theodossiou [12] discuss positively skewed Extreme Value distributions, including the Generalized Pareto distribution, the Generalized Extreme Value distribution and the Box-Cox Generalized Extreme Value distribution. Skewed heavy-tailed distributions are also discussed, including the Skewed Generalized Error distribution, the Skewed Generalized Student-t distribution, the Exponential Generalized Beta distribution of the Second Kind and the Inverse Hyperbolic Sine distribution.

Aspects relating to the upper right tail of claim frequency and claim size distributions are considered in [95]. For this purpose quantile methods are used to develop statistical models for quantiles of these distributions in order to obtain better fitted tails. Issues regarding accurately capturing the tail behaviour by assessing the goodness-of-fit of these loss distributions with measures such as the Kolmogorov-Smirnov (KS) statistic, the Chi-square statistic and the Akaike Information Criterion (AIC) are discussed. The number of near-extremes, which is the number of observed claim sizes close to the mth largest claim, is studied by Hashorva and Hüsler [64], and a consistent estimator for the upper tail of the largest claim size is derived.

Often claims are assumed to be independent, or the underlying processes of claim frequency and claim severity are assumed to be independent. Guegan and Zhang [60] consider the fact that dependence does exist in multivariate financial and insurance data in practice, and use copulas to describe influences from macroeconomic factors on the dependencies that exist among these processes. Further research has focussed on taking the dependence among risks into account (for example the matrix-variate models discussed by Akdemir and Gupta [3] and the Koehler-Symanowski copula function discussed by Palmitesta and Provasi [97]), but not specifically on modeling dependent insurance risks that result in dependent claims, which may affect claim frequencies and claim sizes.


General Insurance Themes

In general, stochastic processes form part of the set of quantitative methods used for modeling components within the general insurance framework [47], [50]. These mostly relate to claim arrivals, reinsurance, ruin theory and experience rating.

Ruin theory has been researched often in more recent years, with a focus on the Lundberg inequality and the effect of implementing reinsurance [49], [40].

Research on ruin theory also includes studies on finding the finite-time ruin probability under the assumption of stochastic interest rates (associated with the invested portion of the surplus), as well as the derivation of precise estimates for finite-time ruin probabilities. An example is where this is done for a discrete-time insurance risk model where losses come from a sub-exponential class of distributions [115]. Other examples include where bounds for ultimate ruin are given while approximations for the ruin probabilities under heavy-tailed claim occurrence are given [120], [27], [28], and where asymptotic formulae are obtained for the ruin probabilities over finite and infinite time horizons when claim sizes are assumed to be Pareto-distributed. The latter is done in [116], where the upper tail of stochastic present values of future aggregate claims is studied.

Reinsurance treaties have been studied with allowance made for investing premiums, recognising that investment returns are also of a stochastic nature [110].

Most of the literature on claim arrivals makes allowance only for single claims to arrive at a time. In practice a portfolio of policies introduces the reality of multiple claims within the same period arising from different insured risks (or even at the same time), where these risks may also be dependent in some instances [60]. By studying bivariate random claim sizes, Hashorva [63] provides a way in which two claim sizes can be modelled jointly.

It was of interest to know what research has been done on factors that may affect claim sizes, specifically when these result in left-censoring. One such example arises as a result of deductibles that are in place on insurance policies. Denuit et al. [39] refer to this as the problem of unobserved heterogeneity when modeling claim counts. In addition to the known deductible per policy, there exists a pseudo-deductible, defined by Braun et al. [22] as a latent threshold above the policy deductible that governs the policyholder's claim behaviour and further censors the claim size distribution. This means that in spite of the policyholder incurring some loss(es) covered by the


insurance policy, the policyholder is willing to retain the loss rather than have the premium of the policy increase due to the submitted claim. This reasoning is in light of the existence of Bonus-Malus systems, a technique that helps in forming more homogeneous subgroups of policyholders in order to effectively charge certain subgroups lower or higher premiums based on recent claims experience, as is discussed by Boland [20].

Conclusion

From the in-depth literature review to determine the extent to which research has been conducted in the field of general insurance, and what aspects of insurance are considered in a statistical context, the following was found:

- Skew and heavy-tailed parametric distributions have been studied extensively, whilst generalizations of parametric distributions have received considerable attention in light of finding distributions that are more flexible to the presence of skewness in observed data. Alternative methods, such as finding skew distributions using convolutions, as well as non-parametric skew distributions, have also been studied.

- A great amount of research has been done on methodologies to model claim frequencies, times between claim arrivals and claim sizes. It is widely recognised that claim size distributions should be skewed, and aspects such as these distributions being heavy-tailed are considered. While the skewness of these claim size distributions has been recognised, it was most frequently modelled using well-known parametric distributions. The effect of dependence among risks has also been considered by means of introducing copulas when modeling multiple claims.

- Research in the field of general insurance has been largely focussed on ruin theory and reinsurance in the presence of uncertainty related to returns on investments, whilst the impact on claims behaviour due to the presence of bonus-malus systems has also received some attention.

This supports the need for investigating the possible uses of parametric and generalized families of skew distributions specifically in the context of claim sizes in general insurance. This conclusion serves as a motivation for the aim of this dissertation as outlined in Section 1.1.

1.3 Structure of the dissertation

The remainder of the dissertation is structured as follows:

- An overview of the stochastic elements that the general insurance risk model consists of will be given in Chapter 2.


- Various parametric distributions that can be used to model skew and heavy-tailed claim size distributions are presented together with their key properties: the moments, parameterizations and links with other distributions. This is done in Chapter 4.

- Methods that can be used to describe tail weight and detect heavy tails are studied in Chapter 3.

- In Chapter 5 the distributions studied in Chapter 4 are classified in terms of their tail behaviour using the techniques and underlying theory discussed in Chapter 3.

- Techniques that can be used to simulate from the distributions discussed in Chapter 4, methods to fit these distributions and ways to test the accuracy of the fitted distributions are discussed in Chapter 6.

- The theory and techniques discussed in Chapters 2 to 6 come together in Chapter 7.


Chapter 2

General Insurance Overview

Mathematical and statistical models are used in general insurance to perform calculations or make decisions in the following key areas:

- Premium Rating

- Reserving

- Reinsurance Arrangements

- Testing for Solvency

These models are based on historical claims experience in order to address the frequency of claims occurrence as well as their severity [39]. It is important to segment the business's underwritten risks into subgroups that are homogeneous. This means that the subgroups of the business should be similar in terms of the risks insured and the portion of the market the business aims to serve. The insured risk refers to the actual asset being insured. In the case of vehicle insurance, for example, the business may be segmented into private car insurance, fleet car insurance and public transport insurance. Further segmentation might be in terms of the type of product that is made available, which is related to the type of insurance product. Within private car insurance one may find various types of risks insured, like smaller sedans, large sedans and SUVs. Upon setting up a new product in an insurance company, or setting up an insurance company, the first aspect of concern is the initial reserve. The initial reserve is the amount of capital set aside for a specific subgroup or portfolio in order to meet claims that need to be paid out before receiving all the premiums, or sufficient premiums, to cover that risk. Three main components in studying the key areas in general insurance are [71], [67]:


1. The initial reserve, denoted by u.

2. The premiums collected. These can either be collected up to a predetermined time, over the interval (0, t], or over a predetermined number of intervals, t.

Figure 2.1: Illustration of discrete and continuous time intervals

When working with a problem where one is interested in the total premiums collected up to a predetermined time, the premium collection process is defined on a continuous time space with t ∈ R^+. This is illustrated in Figure 2.1. When working with a problem with outcomes or events associated with, or states occupied at, certain points in time or stages in a process, the premium collection process is defined on a discrete time space {t : t = 0, 1, 2, ...}.

3. The claims made up to a predetermined time or over a predetermined number of time intervals, t, which is denoted by X(t).

Various stochastic elements can be identified from these general insurance components.

2.1 Inter-occurrence times

The duration from inception of an investigation period up to the nth claim can be denoted by τn. This means that from inception the time until the nth claim arises is a random variable. The duration from the (n − 1)th to the nth claim is given by Tn = τn − τn−1, where Tn is therefore a random variable for n ≥ 1. This also means that τ0 = 0, which implies that T1 = τ1. This duration between successive claims is called the inter-occurrence time [105], [6].


2.2 Number of claims

Consider the number of claims from inception of the investigation period up to some time t, where t ≥ 0, denoted by N(t), namely N(t) = sup{n : τn ≤ t}. A series of these outcomes for different values of t gives rise to a counting process defined as {N(t), t ≥ 0}. There exists a relationship between the number of claims and the sequence of claim arrival times {τ1, τ2, τ3, . . . , τn}, since the event {N(t) = n} is equivalent to the event {τn ≤ t < τn+1}. The claim number process is assumed to be a counting process satisfying the following properties [39]:

- N(0) = 0

- N(t) ∈ N0

- For h > 0, N(t) ≤ N(t + h), where N(t + h) − N(t) is the number of claims in (t, t + h].

Definition 1. Continuity from the right. A function f(x) is said to be continuous at a point c from the right, with c being an interior point of an interval [a, b] in the sense that a ≤ c < b, if for every ε > 0 there exists an h = δ(ε) > 0 such that whenever c ≤ x < c + h and x is an element of the interval [a, b], it is also true that |f(x) − f(c)| < ε [15].

Example 1. The sample paths of the counting process are monotone non-decreasing and right continuous. This can be illustrated by means of an example: suppose an insurance company has issued a group of one-year policies and the claims arising from losses on these insured risks are captured at the end of each month subsequent to inception of the policy. If the first claim occurred during the second month, then it will only be captured at the end of that month. If the year is partitioned such that ti denotes the end of the ith month, then at the end of the second month it is known that:

- N(t0) = 0, N(t1) = 0 and N(t2 − h) = 0 for some arbitrary small h > 0.

- N(t2) = 1.

- For 0 < t < t2, the function N(t) is constant (equal to 0), so for any arbitrarily small h > 0 it is continuous on the interval (h, t2 − h).

Similarly, if a second claim arises during the 5th month since inception of the policy, then one will have that:

- N(t0) = N(t1) = 0.


- N(t2) = N(t3) = N(t4) = 1 and N(t5 − h) = 1 for some arbitrary small h > 0.

- N(t5) = 2.

- For t2 < t < t5, the function N(t) is constant (equal to 1), so for any arbitrarily small h > 0 it is continuous on the interval (t2 + h, t5 − h).

The same principle will hold for any number of claims. It will also hold irrespective of how small the time intervals are made, say for instance daily. From this example it can be seen that N(t) satisfies the conditions for a function to be right continuous. One can, for example, use the point t2 and define an interval J = [t2, t5]. For any arbitrarily small ε > 0, one can find an h = δ(ε), where 0 < δ(ε) < (t5 − t2). Then t2 qualifies as the point c in Definition 1, with t2 + δ(ε) ∈ J. For t satisfying t2 ≤ t ≤ t2 + h, it is true that |N(t) − N(t2)| = 0 < ε. Hence it is true that N(t) is right continuous.
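The step behaviour described in this example can be sketched numerically. Below is an illustrative Python sketch (the dissertation's own programs are written in SAS PROC IML); the claim times are the hypothetical ones from the example above, with claims captured at the ends of months 2 and 5.

```python
# Illustrative sketch of the example above: N(t) is a monotone
# non-decreasing, right-continuous step function of the claim times.

def claim_count(t, arrival_times):
    """N(t): the number of arrival times tau_n with tau_n <= t."""
    return sum(tau <= t for tau in arrival_times)

arrivals = [2.0, 5.0]  # hypothetical capture times, in months

h = 1e-9  # an arbitrarily small increment
print(claim_count(0.0, arrivals))      # N(t0) = 0
print(claim_count(2.0 - h, arrivals))  # N(t2 - h) = 0
print(claim_count(2.0, arrivals))      # N(t2) = 1: the jump at t2
print(claim_count(2.0 + h, arrivals))  # still 1 just after t2 (right continuity)
print(claim_count(5.0, arrivals))      # N(t5) = 2
```

Evaluating just before t2 gives 0 while evaluating at t2 gives 1, matching the right-continuity argument above.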

Figure 2.2: Example of a sample path of a claim number process

Figure 2.2 is a graph showing an example of a sample path for the claim num- ber process as described above, with further claims arising over the course of the 12-month period.

Claims are generally assumed to arrive one at a time (excluding the case of multiple simultaneous arrivals). Multiple arrivals are possible, but are generally not dealt with, as they occur with low probability.


The probability mass function of the number of claims is defined as:

$$p_k = P(N(t) = k) = P\left(\sum_{i=1}^{k} T_i \le t < \sum_{i=1}^{k+1} T_i\right) \tag{2.1}$$

Definition 2. Indicator function. Define an indicator function as

$$I(A) = \begin{cases} 1 & \text{if } A \text{ is true}, \\ 0 & \text{if } A \text{ is not true}. \end{cases} \tag{2.2}$$

Definition 3. Renewal Process. Let T1, T2, ... be a sequence of independent and identically distributed random variables with Ti ≥ 0 for i = 1, 2, .... Consider the inter-occurrence times given by the sequence {Tn, n ∈ N}. Define τn = T1 + ... + Tn for n = 1, 2, ... with τ0 = 0. The sequence {τn, n ∈ N} is called the renewal point process, with τn being the nth (renewal) epoch, which is simply the time of the arrival of the nth claim. The renewal counting process is defined as {N(t), t ≥ 0} with N(t) = Σ_{n=1}^∞ I(τn ≤ t). The renewal counting process is equivalent to the renewal epoch process, because N(t) = n ⇔ τn ≤ t < τn+1 [105].

Definition 4. Index of dispersion. Let X be a random variable, for an insurance risk for instance, with probability mass/density function f(x). Let the mean and variance of X be denoted by E(X) = µ and var(X) = σ², respectively. The index of dispersion of X is given by I_X = σ²/µ.

The index of dispersion can be used to assess the homogeneity of data. Hoel mentions its use in the biological sciences for the Binomial and Poisson distributions. The index of dispersion as it is given here is also related to the coefficient of variation [57], which considers the ratio of the standard deviation to the mean as a relative measure of the spread of the data [112]. These two measures therefore appear to have a similar purpose. It is argued in [112] that the spread of two datasets cannot be compared purely based on their variances or standard deviations, but instead on relative measures of spread.
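As a small illustration of Definition 4, the index of dispersion can be estimated directly from a sample. This is an illustrative Python sketch (the dissertation's programs use SAS PROC IML), with hypothetical claim counts:

```python
# Illustrative sketch: estimate the index of dispersion I = var/mean of
# Definition 4 from a sample, using hypothetical claim counts.

def index_of_dispersion(sample):
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    return var / mean

counts = [3, 1, 4, 2, 0, 5, 2, 3, 1, 4]  # hypothetical monthly claim counts
print(index_of_dispersion(counts))  # → 1.0
```

A value near 1 is consistent with Poisson-distributed counts (Section 2.2.1), while values well above 1 indicate overdispersion.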

Different types of processes exist in order to model the number of claims. A few are discussed here.

2.2.1 Poisson Process

The Poisson process is a member of the class of renewal processes. In this case it is assumed that some renewal process of times of claim arrivals


{τn, n ≥ 1} generates a series of inter-occurrence times, {Tn, n ≥ 1}. The series of inter-occurrence times is assumed to consist of non-negative values that are independent of each other for subsequent values of n and follow the same distribution as some generic variable T.

Definition 5. A continuous time process {N(t), t ∈ [0, ∞)} is said to have stationary increments if the distribution of N(t) − N(s) for t > s depends only on the length of the interval t − s. Furthermore, this stochastic process is said to have independent increments if the increments over any set of disjoint intervals are independent [75].

The process is said to be a Poisson process with a parameter λ if it is a continuous time process {N(t), t ∈ [0, ∞)} with state space S = {0, 1, 2, 3, ...} satisfying the following properties:

- N(0) = 0

- N(t) has independent increments.¹

- N(t) has stationary increments², that is, the increments have a Poisson distribution with a parameter proportional to λ, with a factor equal to the distance in time between the two respective points in time. This means that for λ > 0

$$P(N(t) - N(s) = k) = \frac{e^{-\lambda(t-s)}(\lambda(t-s))^k}{k!} \tag{2.3}$$

where k = 0, 1, 2, ... and for all s < t.

From (2.3) it follows that:

$$P(N(t) = k) = \frac{e^{-\lambda t}(\lambda t)^k}{k!} \quad \text{where } k = 0, 1, 2, \ldots \text{ and } \forall\, t > 0.$$

This is exactly the probability mass function of a random variable from a Poisson distribution with parameter λt. It can be said that N(t) ∼ POI(λt) and therefore E(N(t)) = var(N(t)) = λt. It is proven below that all inter-occurrence times are distributed like a generic random variable, T, and that the time from one claim to the next possesses a no-memory property and is Exponentially distributed with

¹ For any partition 0 ≤ t0 < t1 < . . . < tn < ∞ and any k ∈ N, the random variables N(t1) − N(t0), N(t2) − N(t1), . . . , N(tk) − N(tk−1) are independent [39].
² For any 0 < s < t < ∞, any k ∈ N and an increment h > 0, it is true that P(N(t + h) − N(t) = k) = P(N(s + h) − N(s) = k) [39].


parameter λ, for which it is known that the expected time from one claim arrival to the next is given by 1/λ. The Exponential distribution and its properties are defined in Section 4.1.2.

Proof: Assume that all inter-occurrence times are distributed like some generic random variable, T, defined as the time from the previous claim to the next at any stage of the process. Let T* be the random variable for the time from the time s until the next claim occurs, with s being any point in time after the previous claim, but before the next claim.

$$P(T^* > t \mid N(s) = i) = P(N(s + t) = i \mid N(s) = i) = e^{-\lambda t}.$$

This probability is independent of the value of i and therefore

$$P(T^* > t) = P(T^* > t \mid N(s) = i) = e^{-\lambda t}.$$

This then gives that

$$P(T^* \le t) = 1 - P(T^* > t) = 1 - e^{-\lambda t},$$

meaning that T* ∼ EXP(λ). The distribution of the time from the previous claim until the next can be derived in a similar way. Suppose the ith claim occurred at time τi. To find the distribution of the time T from the ith claim until the (i + 1)th claim:

$$P(T > t \mid N(\tau_i) = i) = P(N(\tau_i + t) = i \mid N(\tau_i) = i) = e^{-\lambda t}.$$

Again this probability is independent of the value of i and therefore T ∼ EXP (λ).

For the Poisson process the index of dispersion is 1, which is a special case, since

$$I(t) = \frac{\mathrm{var}(N(t))}{E(N(t))} = \frac{\lambda t}{\lambda t} = 1. \tag{2.4}$$

This will always be true for the Poisson distribution, irrespective of the value of its parameter. This implies that the homogeneity of the data under all Poisson distributions is the same. For other distributions this will generally depend on the values of the parameters.
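This characterisation suggests a simple simulation check. The sketch below (in Python for illustration; the dissertation's own programs are written in SAS PROC IML, and the rate and horizon are hypothetical) simulates the Poisson process by accumulating Exponential inter-occurrence times and estimates E(N(t)) and the index of dispersion:

```python
import random

# Simulate N(t) for a Poisson process with rate lam by accumulating
# Exponential(lam) inter-occurrence times, then estimate E(N(t)) and
# I(t) = var(N(t)) / E(N(t)); theory gives lam*t and 1 respectively.

def claim_number(lam, t, rng):
    n, tau = 0, rng.expovariate(lam)   # tau: arrival time of the next claim
    while tau <= t:
        n += 1
        tau += rng.expovariate(lam)    # add the next inter-occurrence time
    return n

rng = random.Random(1)
lam, t, reps = 2.0, 5.0, 20000         # hypothetical rate and horizon
counts = [claim_number(lam, t, rng) for _ in range(reps)]

mean = sum(counts) / reps
var = sum((c - mean) ** 2 for c in counts) / (reps - 1)
print(round(mean, 1), round(var / mean, 2))  # near lam*t = 10 and near 1
```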

2.2.2 Mixed Poisson Process

For a Poisson process the assumption is that the mean number of claims within a given period is a constant multiple, λ, of the length of that pe- riod. This period, t, can be considered to be, without loss of generality, 1 year. In many instances the mean number of claims may not necessarily be


a fixed value. A mixed Poisson process involves the expectation of Poisson probabilities for which the parameter itself is random [39]. This will be the case whenever the business is penetrating market segments in which it did not previously have business, or when there is seasonality in the claims, for example when a particular year has seen more erratic weather conditions leading to policyholders incurring losses as a result thereof.

The distribution function of the claim numbers, N(t), can be derived using arguments from Bayesian methods by treating the mean number of claims per period, λ, as a random variable itself. Having said that N(t) ∼ POI(λt), λ is now an outcome of the random variable Λ with distribution function FΛ(λ), which is called the mixing distribution. The distribution of the claim number is therefore conditional on the outcome of λ [105]. Finding the marginal distribution of N(t) yields

$$p_k(t) = \int_{\Lambda} \frac{e^{-\lambda t}(\lambda t)^k}{k!}\, f_{\Lambda}(\lambda)\, d\lambda.$$

This gives the mixture or mixed distribution of N(t) [11]. The mean and variance of N(t) can be derived as follows:

$$E(N(t)) = \sum_{k=0}^{\infty} k \times p_k(t) = \int_{0}^{\infty} \sum_{k=0}^{\infty} k\, \frac{e^{-\lambda t}(\lambda t)^k}{k!}\, f_{\Lambda}(\lambda)\, d\lambda$$

This follows under the assumption that the orders of integration and summation may be interchanged. The expression e^{−λt}(λt)^k / k! is the probability mass function of a random variable, K, with a POI(λt) distribution. Furthermore we have that [11]

$$E(K) = \sum_{k=0}^{\infty} k\, \frac{e^{-\lambda t}(\lambda t)^k}{k!} = \lambda t.$$

The expression Σ_{k=0}^∞ k e^{−λt}(λt)^k / k! in the integral can thus be substituted with λt to give

$$E(N(t)) = \int_{0}^{\infty} \lambda t\, f_{\Lambda}(\lambda)\, d\lambda = t\,E(\Lambda)$$


and (from Bain and Engelhardt [11])

$$E(N(t)^2) = \int_{0}^{\infty} E(N(t)^2 \mid \lambda)\, f_{\Lambda}(\lambda)\, d\lambda = \int_{0}^{\infty} \left(\mathrm{var}(N(t) \mid \lambda) + E(N(t) \mid \lambda)^2\right) f_{\Lambda}(\lambda)\, d\lambda = t\,E(\Lambda) + t^2\,E(\Lambda^2).$$

From this follows that

$$\mathrm{var}(N(t)) = t\,E(\Lambda) + t^2\,\mathrm{var}(\Lambda).$$

Also the index of dispersion is given by

$$I(t) = \frac{\mathrm{var}(N(t))}{E(N(t))} = 1 + t\,\frac{\mathrm{var}(\Lambda)}{E(\Lambda)}.$$
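The overdispersion implied by this formula can be checked by simulation. Below is an illustrative Python sketch (the choice of a Gamma mixing distribution and all parameter values are hypothetical, not taken from the dissertation): with Λ ∼ Gamma with shape a and rate b, the formulas above give E(N(t)) = t·a/b and I(t) = 1 + t/b.

```python
import random

# Mixed Poisson sketch: draw Lambda ~ Gamma(shape a, rate b), then
# N(t) | Lambda ~ POI(Lambda * t), sampled by counting unit-rate
# Exponential arrivals within [0, Lambda * t].

def poisson_count(mean, rng):
    n, s = 0, rng.expovariate(1.0)
    while s <= mean:
        n += 1
        s += rng.expovariate(1.0)
    return n

def mixed_count(a, b, t, rng):
    lam = rng.gammavariate(a, 1.0 / b)   # gammavariate takes shape and scale
    return poisson_count(lam * t, rng)

rng = random.Random(7)
a, b, t, reps = 3.0, 2.0, 4.0, 20000     # hypothetical parameter values
counts = [mixed_count(a, b, t, rng) for _ in range(reps)]

mean = sum(counts) / reps
var = sum((c - mean) ** 2 for c in counts) / (reps - 1)
print(round(mean, 1), round(var / mean, 1))  # near t*a/b = 6 and 1 + t/b = 3
```

The estimated index of dispersion lands well above 1, in contrast with the plain Poisson process of Section 2.2.1.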

The index of dispersion is therefore not necessarily 1 (the Poisson process is a special case). It is however true that for the mixed Poisson process the index of dispersion is 1 if and only if the distribution of Λ is degenerate at some value λ0 [105]. Other models exist for the number of claim arrivals at some point in time, t. Some of these include:

- The Sparre-Andersen model [4]. When one considers a renewal counting process with inter-occurrence times that are independent and identically distributed with distribution function FT, together with independent, identically distributed claim sizes with distribution function FΘ, where the claim sizes are also independent of the inter-occurrence times, the model is called the Sparre-Andersen model [105].

- Compound Poisson process [43], [39], [75]

- Recursively defined claim number distributions, which include the Poisson, negative binomial and binomial distributions that satisfy Panjer's recurrence relation [39]

- Claim number processes as Markov processes which satisfy the Kolmogorov differential equations, such as the Pure Birth process [75], [29], [105].


2.3 Claim size

Let the claim arriving at time τn be of size Θn. The sequence of claim sizes, {Θn, n = 1, 2, 3, ...}, is usually assumed to be made up of claims with sizes that are independent and identically distributed, although this may not always be true. Claim sizes are described by means of a distribution, or more specifically a loss distribution. The risks taken on by an insurer are said to be dangerous whenever the claim size distribution has a heavy tail, which means that there are relatively large probabilities of larger claims arriving within the portfolio. When large claims are possible, but the chance of such an occurrence shows an exponential decline as the size of the claims increases, the risk or portfolio of risks is said to have a well-behaved distribution, generally satisfying the relationship 1 − FΘ(θ) ≤ c e^{−aθ} for some positive-valued c and a, where Θ is a generic, nonnegative claim size random variable and FΘ(θ) the distribution function of the claim sizes. To see why this bound makes sense, consider an arbitrarily large claim of size θ*. Then

$$P(\Theta > \theta^*) = 1 - P(\Theta \le \theta^*) = 1 - F_{\Theta}(\theta^*) \le c\,e^{-a\theta^*}.$$

It is desired that the above probability tends to 0 as the claim size increases or becomes very large. If the loss distribution does satisfy this inequality, it is true that

$$\lim_{\theta^* \to \infty} P(\Theta > \theta^*) = \lim_{\theta^* \to \infty} \left(1 - F_{\Theta}(\theta^*)\right) \le \lim_{\theta^* \to \infty} c\,e^{-a\theta^*} = 0. \tag{2.5}$$

For this reason it can be argued that a loss distribution is well-behaved if it is true that 1 − FΘ(θ) ≤ c e^{−aθ}. To illustrate this concept, consider the following example.

Example 2. Consider a random variable X that is Weibull distributed with parameters θ and β [75], with a cumulative distribution function given by (4.16). The Weibull distribution is introduced more formally in Section 4.1.8. It is argued in Section 5.2.3 that the distribution has a heavy tail for values of β in the interval (0, 1).

If we suppose that 0 < β < 1 and further suppose that there exists a value c > 0 such that

$$1 - F_X(x) \le e^{-cx}, \tag{2.6}$$

which implies that

$$e^{-(x/\theta)^{\beta}} \le e^{-cx} \quad \therefore \quad \frac{x^{\beta-1}}{\theta^{\beta}} \ge c.$$

Hence if 0 < β < 1, then

$$\lim_{x \to \infty} c \le \lim_{x \to \infty} \frac{x^{\beta-1}}{\theta^{\beta}} = 0,$$

which contradicts the fact that there exists a value c > 0. Hence there do not exist an a > 0 and c > 0 such that

$$1 - F_X(x) \le a\,e^{-cx}.$$

It can therefore be seen that the Weibull distribution with parameter values β ∈ (0, 1) is not considered to be well-behaved.

Figure 2.3: Comparison of the Weibull (θ = 4, β = 1.1) upper tail and the exponential bound e^{−4^{−1.1} x}


Conversely, if we consider the case where β > 1, then we can derive the following inequality for values of x > 1:

$$1 - F_X(x) = e^{-(x/\theta)^{\beta}} < e^{-(1/\theta^{\beta})\,x} \quad \text{if } x > 1.$$

This then gives that there exist an a = θ^{−β} and c = 1 such that 1 − F(x) < c e^{−ax} for all x > 1. This is graphically illustrated in Figure 2.3.
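The two cases of this example can also be checked numerically. The following Python sketch (illustrative only, with θ = 4 chosen to match Figure 2.3) evaluates the Weibull upper tail against the exponential bound e^{−ax} with a = θ^{−β}:

```python
import math

# Weibull upper tail 1 - F(x) = exp(-(x/theta)**beta) versus the
# exponential bound exp(-a*x) with a = theta**(-beta) (see Example 2).

def weibull_tail(x, theta, beta):
    return math.exp(-((x / theta) ** beta))

theta = 4.0
grid = [1.5, 2.0, 5.0, 10.0, 50.0, 100.0]  # test points with x > 1

# beta = 1.1 > 1: the exponential bound holds at every grid point
beta1 = 1.1
bound_holds = all(weibull_tail(x, theta, beta1) < math.exp(-(theta ** -beta1) * x)
                  for x in grid)

# beta = 0.5 < 1: the tail exceeds the corresponding bound (heavy tail)
beta2 = 0.5
bound_fails = any(weibull_tail(x, theta, beta2) > math.exp(-(theta ** -beta2) * x)
                  for x in grid)

print(bound_holds, bound_fails)  # True True
```

This mirrors the algebra above: for β > 1 the exponent (x/θ)^β exceeds x/θ^β whenever x > 1, while for β < 1 the inequality reverses.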

In Chapter 3 the concept of an exponentially bounded upper tail is formally introduced. The condition stated here for a distribution to be considered well-behaved is the same as the condition for a distribution to have an exponentially bounded upper tail [105].

In some instances insurance contracts are agreed for which the limit of insurance cover provided significantly exceeds the sizes of historically observed losses. In such a case one will have to find the probability of experiencing larger losses than historically observed, based on the actual historical losses. Wang [119] states that extreme value theory can be useful in extrapolating the probabilities of observing these larger losses from the observed probabilities of the smaller losses.

2.4 Aggregate claim amount

The aggregate claim amount at time t depends on both the number of claims up to time t and the sizes of all those claims, and is denoted by X(t), where

$$X(t) = \sum_{i=1}^{N(t)} \Theta_i. \tag{2.7}$$

Hence the cumulative distribution function is given by

N(t)  X FX(t)(x) = P(X(t) ≤ x) = P  Θi ≤ x i=1

Define $X(t) = 0$ if $N(t) = 0$. Therefore $X(t)$ is a random sum (since $N(t)$ is stochastic) of random variables $\Theta_i$ for $i = 1, 2, \ldots, N(t)$. Ideally the interdependence between $\{N(t); t \ge 0\}$ and $\{\Theta_n; n = 1, 2, \ldots, N(t)\}$ should be studied and allowed for, but most often these two processes are treated as if they are stochastically independent.
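The random sum above is straightforward to simulate. The following sketch (the study's own programs are written in SAS PROC IML; Python is used here purely for illustration) draws realisations of $X(t)$ under the common simplifying assumptions of a Poisson claim number process and, as a hypothetical choice of severity, exponentially distributed claim sizes:

```python
import math
import random

def poisson_draw(lam, rng):
    """Knuth's inversion sampler for a Poisson(lam) claim count N(t)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate_aggregate_claim(lam_t, claim_sampler, rng):
    """One realisation of X(t) = sum_{i=1}^{N(t)} Theta_i, with N(t) and the
    Theta_i sampled independently (the usual simplifying assumption)."""
    n = poisson_draw(lam_t, rng)
    return sum(claim_sampler(rng) for _ in range(n))

rng = random.Random(1)
# Illustrative severity: exponential claim sizes with mean 2
draws = [simulate_aggregate_claim(5.0, lambda r: r.expovariate(0.5), rng)
         for _ in range(20000)]
mean_x = sum(draws) / len(draws)  # close to E(N(t)) E(Theta) = 5 * 2 = 10
```

The sample mean of the draws sits close to $E(N(t))E(\Theta)$, as expected from Wald's identity under the independence assumption.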


Definition 6. k-fold convolution. The distribution of the sum of two independent random variables, $X$ and $Y$, can be calculated from their respective distributions, $F_X(x)$ and $F_Y(y)$. This distribution is called the convolution of $F_X$ and $F_Y$, written $F_X * F_Y$, which is defined [105] by

$$F_X * F_Y(x) = \int_{-\infty}^{\infty} F_X(x - \varphi) f_Y(\varphi)\, d\varphi. \tag{2.8}$$

Furthermore, the n-fold convolution of FX is defined iteratively as follows:

For $n = 0$: $F_X^{*0}(x) = \delta_0(x) = \begin{cases} 1 & \text{if } x \ge 0,\\ 0 & \text{if } x < 0, \end{cases}$

for $n = 1$: $F_X^{*1}(x) = F_X(x)$,

for $n = 2$: $F_X^{*2}(x) = F_X * F_X(x)$,

for $n = 3$: $F_X^{*3}(x) = F_X^{*2} * F_X(x)$,

and in general $F_X^{*n}(x) = F_X^{*(n-1)} * F_X(x)$.

Using the law of total probability and the property of a k-fold convolution being iteratively defined and determined,

$$\begin{aligned}
F_{X(t)}(x) &= P(X(t) \le x)\\
&= P\left(\bigcup_{k=1}^{\infty}\left(\{k \text{ claims}\} \cap \left\{\sum_{i=1}^{k}\Theta_i \le x\right\}\right)\right)\\
&= \sum_{k=0}^{\infty} P\left(\{N(t)=k\} \cap \left\{\sum_{i=1}^{k}\Theta_i \le x\right\}\right)
\end{aligned}$$

since the events are disjoint and if $k = 0$ then $\sum_{i=1}^{k}\Theta_i = 0$.

If it is further assumed that the claim number process, $\{N(t)\}_{t\ge 0}$, is independent of the claim sizes, $\{\Theta_i\}_{i\in\mathbb{N}}$, then the aggregate claim distribution


can be written as

$$\begin{aligned}
F_{X(t)}(x) &= \sum_{k=0}^{\infty} P(N(t)=k) \times P\left(\sum_{i=1}^{k}\Theta_i \le x\right)\\
&= \sum_{k=0}^{\infty} P(N(t)=k) \times P(\Theta_1 + \Theta_2 + \dots + \Theta_k \le x),
\end{aligned}$$

assuming that the claims are identically distributed. Thus

$$F_{X(t)}(x) = \sum_{k=0}^{\infty} p_k(t) \times F_\Theta^{*k}(x)$$

where $F_\Theta^{*k}(x)$ is the $k$-fold convolution of $F_\Theta$. Note that when the individual claims are assumed to be independent and identically Gamma distributed, the aggregate claim amount of a fixed number of claims, $n$, can be modelled using the Dirichlet distribution that is defined in Section 4.2.3.

Definition 7. Laplace-Stieltjes Transform. For a distribution function, $G(t)$, of a random variable, $T$, the Laplace-Stieltjes transform is defined as [73], [15]:

$$\hat{L}_T(s) = \int_0^\infty e^{-st}\, dG(t) = \int_0^\infty e^{-st} g(t)\, dt = E\left(e^{-sT}\right). \tag{2.9}$$

The properties of the aggregate claim amount can also be expressed in terms of a Laplace-Stieltjes transform which is defined as follows:

 N(t)  X −s Θ    i ˆ −sX(t)  i=1  LX(t)(s) = E e = E e  .    

Defining the probability generating function of the number of claims, $N(t)$, as $\hat{g}_{N(t)}(z) = \sum_{i=0}^{\infty} p_i(t) z^i$ where $p_i(t) = P(N(t)=i)$ allows one to decompose the Laplace-Stieltjes transform of the aggregate claim amount as the composition of the probability generating function of the number of claims with the Laplace-Stieltjes transform of the individual claim sizes, under the assumption that the claim sizes are independent and identically distributed:


$$\begin{aligned}
\hat{L}_{X(t)}(s) &= E\left(e^{-sX(t)}\right)\\
&= E_{N(t)}\left[E\left(e^{-sX(t)} \mid N(t)\right)\right]\\
&= E_{N(t)}\left[E\left(e^{-(s\Theta_1 + s\Theta_2 + s\Theta_3 + \dots + s\Theta_{N(t)})}\right)\right]\\
&= E_{N(t)}\left[E\left(e^{-s\Theta_1}\right) \times E\left(e^{-s\Theta_2}\right) \times \dots \times E\left(e^{-s\Theta_{N(t)}}\right)\right] \quad \text{assuming claim sizes are independent}\\
&= E_{N(t)}\left[\left(E\left(e^{-s\Theta}\right)\right)^{N(t)}\right] \quad \text{assuming claim sizes are identically distributed.}
\end{aligned}$$

Using the definition of a Laplace-Stieltjes transform for the claim sizes, the Laplace-Stieltjes transform for the aggregate claim amount can now be writ- ten [75] as

$$\hat{L}_{X(t)}(s) = E_{N(t)}\left[\left(\hat{L}_\Theta(s)\right)^{N(t)}\right] = \sum_{i=0}^{\infty}\left(\hat{L}_\Theta(s)\right)^i P(N(t)=i) = \hat{g}_{N(t)}\left(\hat{L}_\Theta(s)\right).$$
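The composition $\hat{g}_{N(t)}(\hat{L}_\Theta(s))$ can be checked numerically. A sketch under illustrative assumptions: Poisson claim counts, for which $\hat{g}_{N(t)}(z) = e^{\lambda(z-1)}$, and exponential claim sizes with $\hat{L}_\Theta(s) = 1/(1+\theta s)$, compared against a Monte Carlo estimate of $E(e^{-sX(t)})$:

```python
import math
import random

lam, theta, s = 3.0, 2.0, 0.7

# Closed forms under the assumed processes
lst_claim = 1.0 / (1.0 + theta * s)               # L-hat_Theta(s), Exponential(theta)
lst_aggregate = math.exp(lam * (lst_claim - 1.0))  # g-hat_{N(t)}(L-hat_Theta(s)), Poisson(lam)

rng = random.Random(42)

def draw_aggregate():
    """One draw of X(t): a Poisson(lam) number of Exponential(theta) claims."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:                      # Knuth's Poisson sampler
        p *= rng.random()
        if p <= threshold:
            break
        k += 1
    return sum(rng.expovariate(1.0 / theta) for _ in range(k))

# Monte Carlo estimate of E(exp(-s X(t))) under independence
estimate = sum(math.exp(-s * draw_aggregate()) for _ in range(50000)) / 50000
# estimate should sit close to lst_aggregate
```

The agreement of the two quantities illustrates the decomposition; any other claim-count/claim-size pair with known transforms could be substituted.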

Other alternatives for studying the claim size and its distribution include:

• Central limit theorem approximations [75] which may be unreliable when the true claim distribution is heavy-tailed.

• Fitting claim frequency and claim size distributions to data. This approach is suitable for modeling aggregate claim sizes as described above, in which case the expression of the Laplace-Stieltjes transform given above can be used [75].

• Methods of risk comparison can also be used. If different risk classes have different distribution forms, one can compare aspects such as the tail heaviness, the index of dispersion (to understand the spread of potential claim sizes) and the stochastic order of these risks [105], [88], [75]. Concepts of the tail weight of distributions and stochastic orders are discussed in Chapter 3.

2.5 Premium Income

The premium charged to a policyholder is to provide cover to the policyholder against the estimated future losses resulting from the insured risks


[39]. From the insurer's point of view the premium income up to some time $t$ is denoted by $\Pi(t)$. Premiums should be set or calculated such that they accumulate quickly enough to meet claims as they arrive. In general, allowance should be made for covering administration costs, profit and other margins within the premium calculation. The accumulated premium up to time $t$ is therefore given by

$$\Pi(t) = (1 + \eta)\, E(N(t))\, E(\Theta)$$

where $\eta$ is the safety loading to account for costs, profit and other margins [105], while the component $E(N(t))E(\Theta)$ is referred to as the pure premium [39]. This will be true if it can be assumed that the claim sizes are all independent and identically distributed like some generic claim size random variable, $\Theta$, and that the claim number process and the claim size process are stochastically independent.

2.6 Risk Reserve

The risk reserve at a given time, $t$, is a function of the initial capital, $u$, the premiums received, $\Pi(t)$, and the realised claims up to time $t$, $X(t)$:

$$R(t) = u + \Pi(t) - X(t).$$

The premiums collected in the $n$th collection period (usually months) are given by $\Pi_n = \Pi(n) - \Pi(n-1)$, while the total amount of claims reported in the $n$th period is given by $X_n = X(n) - X(n-1)$. Therefore the reserve at the end of the $n$th period is given by

$$R_n = R(n) = u + \Pi(n) - X(n) = u - \sum_{i=1}^{n}(X_i - \Pi_i).$$

Let $Y_n = X_n - \Pi_n$. Then $Y_n$ is the surplus of the claims over the premiums accumulated during the $n$th period. The process $\{Y_n\}_{n\ge 0}$ consists of random variables which are independent and identically distributed. Therefore the sum of these surpluses, $S_n = \sum_{i=1}^{n} Y_i$, yields a process $\{S_n\}_{n\ge 0}$ which is a random walk.
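The period-by-period bookkeeping of $R_n$ can be sketched as follows (the premium and claim figures are hypothetical, with exponential aggregate claims chosen purely for illustration):

```python
import random

def reserve_path(u, premiums, claims):
    """R_n = u - sum_{i<=n}(X_i - Pi_i): the reserve at the end of each period."""
    r, path = u, []
    for pi_n, x_n in zip(premiums, claims):
        r += pi_n - x_n          # add the period's surplus Pi_n - X_n
        path.append(r)
    return path

rng = random.Random(7)
premiums = [10.0] * 12                                 # constant periodic premium
claims = [rng.expovariate(1 / 9) for _ in range(12)]   # illustrative claims, mean 9
path = reserve_path(100.0, premiums, claims)
# The final entry equals u + Pi(12) - X(12)
```

The final reserve is simply the initial capital plus the accumulated surplus, matching the closed form for $R_n$ above.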

Assuming that the premiums are non-random, the probability that the reserve at some point in time $t$ is at least $x$ is given by

$$P(R(t) \ge x) = P(u + \Pi(t) - X(t) \ge x) = F_{X(t)}(\Pi(t) + u - x).$$

Understanding the underlying processes of the number of claims and claim sizes gives insight into the total claim amount that may arise within a particular period. Linking that with the premiums collected within that period


allows one to get an idea of the solvency of the portfolio of risks underwritten. Historical experience can be used to develop models to estimate probabilities of policy lapses or of irregular payments, say where the policyholder skips a payment but makes up that payment in the following period. Since a portion of collected premiums is typically invested to generate profits for the business and to meet the claims on risks that are aligned with inflation, the macro-economic environment poses additional uncertainty and randomness to the premiums accumulated.

Let $I_i$ be an index denoting the change in price of an insured good relative to its price or cost at inception of the insurance policy or contract, at times $i = 1, 2, 3, \ldots, n$. The average aggregate claim size can then be scaled using these indices to represent an average aggregate claim amount at inception of the policy or contract [105]:

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n}\frac{X_i}{I_i}.$$

Similarly for the variance of the aggregate claim amount:

$$\sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{X_i}{I_i} - \bar{X}\right)^2.$$

The indices can be estimated, say, from economic factors with time-series analysis or interest rate models. Thus when the scaled average of the aggregate claims is known for the last $n$ periods, an estimate for the aggregate claim amount can be found for the $(n+1)$th period as [105]

$$\hat{X}_{n+1} = \hat{I}_{n+1}\bar{X} \quad \text{and} \quad \hat{\sigma}^2_{n+1} = \hat{I}^2_{n+1}\sigma^2.$$
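A small numerical sketch of this index scaling (the claims and indices are hypothetical):

```python
def index_scaled_stats(claims, indices):
    """Scale each period's aggregate claim X_i back to inception prices via I_i,
    then return the scaled mean and the (n-1)-denominator variance."""
    scaled = [x / i for x, i in zip(claims, indices)]
    n = len(scaled)
    mean = sum(scaled) / n
    var = sum((s - mean) ** 2 for s in scaled) / (n - 1)
    return mean, var

# Hypothetical aggregate claims and price indices relative to inception
claims = [110.0, 126.0, 143.0]
indices = [1.10, 1.20, 1.30]

mean, var = index_scaled_stats(claims, indices)   # scaled claims: 100, 105, 110
i_next = 1.40                                     # estimated index for period n+1
x_next = i_next * mean                            # X-hat_{n+1}
var_next = i_next ** 2 * var                      # sigma-hat^2_{n+1}
```

Here the scaled claims are 100, 105 and 110, so the inception-level mean is 105 with variance 25, and the index estimate 1.40 inflates these to the next period.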

2.7 Reinsurance

Reasons for needing to cede claims liability or part thereof to a reinsurer include:

- To provide protection against excessively large claims

- To provide protection against a large number of claims close to each other in time or arriving almost simultaneously

- To provide protection against change, especially deterioration, in premium collection patterns


- Legal restrictions in terms of reserving and holding provisions for solvency may limit the capacity of smaller insurers to offer competitive products at competitive premiums. Reinsurance can increase the business's capacity to offer and issue more products.

- It provides increased capacity for an insurer of any size.

- It enables an insurer to enter a new branch of the business or introduce new products and be reasonably protected against risks of which they have limited knowledge and no claims experience.

Different types of reinsurance treaties exist. The key idea is for the insurer who issues the product or policy - called the direct insurer - to retain a portion of the claims and to cede the remaining portion to the reinsurer. The portion retained may be a fixed amount, called the deductible, or else a proportion of the claim amount. These two approaches may come into effect either per individual claim or on the aggregate claim amount. The retained portion is denoted by $h(X(t))$ or $h(\Theta_i)$ for the aggregate and individual cases respectively; hence the reinsured portion is simply $X(t) - h(X(t))$ or $\Theta_i - h(\Theta_i)$. The reinsurer can again enter a reinsurance treaty with a second reinsurer, thereby creating a reinsurance chain. The most common types of reinsurance treaties used are given below.

Proportional/Quota Share Reinsurance

A fixed proportion of the value of the risks, 1 − a, is reinsured. This gives the reinsured portion of an aggregate claim amount [107], [114], [106] as

$$(1-a)X(t) = (1-a)\sum_{i=1}^{N(t)}\Theta_i = \sum_{i=1}^{N(t)}(1-a)\Theta_i,$$

which is the sum of proportions of the individual claims. From this it follows that

$$P[(1-a)X(t) \le x] = F_{X(t)}\left(\frac{x}{1-a}\right). \tag{2.10}$$

This is a special case where $a$ is constant for all risks [105]. Where $a$ is not constant, the reinsurance agreement is referred to as Surplus Reinsurance.

Excess Loss Reinsurance

The effect of this treaty is a reduction in the mean and the variance of the amount paid in respect of the insurer's liability due to claims. A fixed monetary value, $b$, is used as a threshold, termed the retention level [106]. The retention level is applied per individual claim and reinsurance will


therefore come into effect whenever an individual claim amount exceeds this retention level. An upper limit may be implemented from the reinsurer's side in order to cap the reinsurer's liability at some value $c$, such that the liability is limited to $c - b$; claims above this second retention level may be ceded to a second-in-line reinsurer or may again be the liability of the direct insurer [75], [105]. The reinsured amount of the direct insurer is therefore given by

$$\sum_{i=1}^{N(t)}(\Theta_i - b)_+ \quad \text{where } x_+ = \max\{x, 0\}.$$

The retention level may also be linked to some inflation index.

Stop-Loss Reinsurance

This arrangement is similar to that of Excess Loss Reinsurance in having a retention level, $b$, but with the difference that it is applied to the aggregate claim amount. It therefore provides an advantage in that smaller claims can now be aggregated in order for the reinsurance to come into effect [105], [75]. The reinsured amount of the direct insurer is therefore given by

$$\left(\sum_{i=1}^{N(t)}\Theta_i - b\right)_+ \quad \text{where } x_+ = \max\{x, 0\}. \tag{2.11}$$

ECOMOR Reinsurance

The acronym refers to the French phrase describing this agreement, excédent du coût moyen relatif. If the individual claims are arranged from the smallest to the largest claim size, they can be denoted as order statistics $(\Theta_{(1)}, \Theta_{(2)}, \Theta_{(3)}, \ldots, \Theta_{(N(t)-r-1)}, \Theta_{(N(t)-r)}, \Theta_{(N(t)-r+1)}, \ldots, \Theta_{(N(t))})$. The $r$ largest claims, $\Theta_{(N(t))}, \Theta_{(N(t)-1)}, \Theta_{(N(t)-2)}, \ldots, \Theta_{(N(t)-r+1)}$, are reinsured, with a random retention level equal to the $(r+1)$th largest claim, $\Theta_{(N(t)-r)}$ [105], [48]. Thus the reinsured amount is given by

$$Z(t) = \sum_{i=1}^{r}\left(\Theta_{(N(t)-i+1)} - \Theta_{(N(t)-r)}\right) = \sum_{i=1}^{N(t)}\left(\Theta_i - \Theta_{(N(t)-r)}\right)_+ \quad \text{where } x_+ = \max\{x, 0\}.$$
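The reinsured amounts under the four treaties can be computed directly from a vector of observed claims. A sketch with hypothetical claim sizes (the function names are illustrative):

```python
def quota_share(claims, a):
    """Reinsured amount under quota share: (1 - a) of every claim."""
    return sum((1 - a) * c for c in claims)

def excess_loss(claims, b, c=None):
    """Reinsured amount (Theta_i - b)_+ per claim, capped at c - b when an
    upper limit c applies on the reinsurer's side."""
    total = 0.0
    for claim in claims:
        layer = max(claim - b, 0.0)
        if c is not None:
            layer = min(layer, c - b)
        total += layer
    return total

def stop_loss(claims, b):
    """Reinsured amount (sum_i Theta_i - b)_+ on the aggregate."""
    return max(sum(claims) - b, 0.0)

def ecomor(claims, r):
    """Reinsured amount: the r largest claims in excess of the (r+1)-th
    largest claim, which acts as a random retention level."""
    ordered = sorted(claims, reverse=True)
    retention = ordered[r]
    return sum(ordered[i] - retention for i in range(r))

claims = [5.0, 40.0, 12.0, 75.0, 3.0]
```

For this claim vector, quota share with $a = 0.8$ cedes 27, excess of loss with $b = 30$ cedes 55 (or 30 with a layer limit $c = 50$), stop loss with $b = 100$ cedes 35, and ECOMOR with $r = 2$ cedes 91.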


2.8 Ruin Problems

When starting a new branch of insurance products, an initial amount of capital, u, is put at risk. The aim will then be to minimize the probability that the claims surplus, given by X(t) − Π(t), exceeds the initial capital amount. If the claims surplus exceeds the initial capital amount then it means that R(t) = u + Π(t) − X(t) < 0. This occurrence is called ruin. The ruin time, denoted by ζ, is defined [7] as

$$\zeta = \inf\{t \ge 0 : R(t) < 0\} = \zeta(u). \tag{2.12}$$

The ruin time depends on the stochastic elements making up $R(t)$. For a finite time horizon the probability of no ruin, or survival probability [42], [75], is given by

$$\psi(u, x) = P\left(\inf_{0 \le t \le x} R(t) \ge 0\right) = P(\zeta(u) > x) \tag{2.13}$$

and for an infinite time horizon

$$\psi(u) = P\left(\inf_{0 \le t} R(t) \ge 0\right) = P(\zeta(u) = \infty). \tag{2.14}$$

Finding estimates of ruin probabilities and determining distributions for the event of ruin is difficult. In the literature bounds for ultimate ruin are given by Chen et al [27], [28]. Ruin probabilities are largely influenced by the claim size distribution, especially when the distribution is heavy-tailed. Wei and Hu [120] studied approximations for the ruin probabilities under the occurrence of heavy-tailed claims. The probability of ruin is obviously also dependent on the level of initial capital.
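Although closed forms are rarely available, finite-horizon ruin probabilities can at least be estimated by simulation. A discrete-period sketch, assuming Poisson claim counts and exponential claim sizes per period (all figures hypothetical):

```python
import math
import random

def ruin_probability(u, premium, lam, claim_mean, horizon, n_paths, seed=0):
    """Monte Carlo estimate of P(ruin within `horizon` periods) for
    R_n = u + n*premium - (claims paid so far)."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(n_paths):
        reserve = u
        for _ in range(horizon):
            # Knuth's Poisson sampler for the period's claim count
            threshold, n, p = math.exp(-lam), 0, 1.0
            while True:
                p *= rng.random()
                if p <= threshold:
                    break
                n += 1
            reserve += premium - sum(rng.expovariate(1 / claim_mean)
                                     for _ in range(n))
            if reserve < 0:
                ruined += 1
                break
    return ruined / n_paths

# Large initial capital with a positive loading: ruin is (practically) impossible;
# no initial capital with a negative loading: ruin becomes very likely
safe = ruin_probability(1000.0, 12.0, 1.0, 10.0, 20, 200)
risky = ruin_probability(0.0, 5.0, 1.0, 10.0, 20, 500, seed=1)
```

The two configurations illustrate the dependence of the ruin probability on the initial capital and the premium loading noted above.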


Chapter 3

Theory and Properties of Distributions

3.1 Introduction

The aim of this chapter is to gather and introduce measures and methods that can be used to describe parametric distributions in terms of location, spread and skewness together with tail heaviness. We introduce an argument below, based on our own reasoning, on how to detect and compare tail heaviness, after which we present formal theoretical results from the literature that can be used to evaluate tail heaviness in Sections 3.2 to 3.7.

Let the probability density function and the cumulative distribution function for a random variable $X$ be given by $f_X(x)$ and $F_X(x)$, respectively, with $X \in \mathbb{R}^+$. We would like to evaluate the behaviour of $f_X(x)$ and $F_X(x)$ in particular for large values of $X$; that is, evaluating the upper tail function $\bar{F}_X(x) = 1 - F_X(x)$.

When a unimodal distribution exhibits skewness, the widths of the ranges on both sides of the mode, say $x_{mode}$, are not the same. If a distribution exhibits right skewness, the range of values larger than the mode is wider than the range of values smaller than the mode. Furthermore, the probability densities attached to the values in this upper range are typically lower, due to the fact that a larger range of values is present compared to the lower range. Lastly, the probability density function will be monotone decreasing over the range $(x_{mode}, \infty)$.

Consider the comparison of the probability density functions of an Exponential (with $\theta = 2$) and Weibull (with $\beta = 0.95$, $\theta = 3$) distribution as given in Figure 3.1; the Exponential and Weibull distributions are formally


Figure 3.1: Comparison of Exponential (θ = 2) and Weibull (β = 0.95, θ = 3) probability density functions

Figure 3.2: Comparison of the probability density and cumulative distribution functions for the Exponential distribution with $\theta = 2$

defined in Sections 4.1.2 and 4.1.8. From this figure it can be seen that for a value of $x$ of approximately 2.8 the two probability density functions are equal (as indicated by the vertical reference line on the graphical comparison), whereafter the probability density function of the Weibull distribution remains larger than that of the Exponential distribution. It also shows that


Figure 3.3: Comparison of the probability density and cumulative distribution functions for the Weibull distribution with $\beta = 0.95$ and $\theta = 3$

the upper tail weight at this point is larger for the Weibull distribution than for the Exponential distribution.

Figure 3.4: Comparison of ratios of Exponential (θ = 2) and Weibull (β = 0.95, θ = 3) probability density functions to upper tail weights

If a distribution has a heavy tail, it means that as $x$ gets larger the probability density becomes smaller while the upper tail weight $1 - F_X(x) = \bar{F}_X(x)$ tends not to go towards 0 as quickly. This suggests that after some point $x^*$ the probability density function $f_X(x^* + t)$ goes to 0 much faster than $1 - F_X(x^* + t)$ as $t$ gets larger, and hence that the ratio of the probability density function to $\bar{F}_X(x)$ will be a decreasing function of $t$. This is illustrated in Figures 3.2 and 3.3, where it can be seen that for the Exponential distribution both functions move towards 0 at a similar rate, whilst for the Weibull distribution the upper tail does not move to 0 as quickly for large values of $x$.

In Figure 3.4 the ratios of the probability density functions to the upper tail weights of the Exponential ($\theta = 2$) and Weibull ($\beta = 0.95$, $\theta = 3$) distributions are graphically compared for various values of $x$. From (B.34) and (4.1),

$$\frac{f_X(x)}{\bar{F}_X(x)} = \frac{1}{\theta} \quad \text{for } x \ge 0 \text{ (Exponential)}, \tag{3.1}$$

and from (B.179) and (4.16),

$$\frac{f_X(x)}{\bar{F}_X(x)} = \frac{\beta}{\theta^\beta}\, x^{\beta-1} \quad \text{for } x \ge 0 \text{ (Weibull)}.$$

From this comparison it can be seen that for all values of $x$ the ratio for the Weibull distribution remains smaller than the ratio for the Exponential distribution. At the point where the densities of the two distributions are equal, this indicates that the upper tail weight of the Weibull distribution is larger than that of the Exponential distribution. The ratio is constant for the Exponential distribution, while for the Weibull distribution it is consistently lower and consistently decreasing.

Figure 3.4 suggests that the Weibull distribution has a heavier tail than the Exponential distribution and can also be classified as being heavy-tailed. In Chapter 5 it will be shown mathematically that the Exponential distribution does not have a heavy tail, but that the Weibull distribution does for values of $\beta$ between 0 and 1.
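The two ratios in (3.1) are easy to tabulate. A short sketch reproducing the comparison in Figure 3.4 for the same parameter values:

```python
def exponential_ratio(x, theta):
    """f_X(x)/F-bar_X(x) for the Exponential: constant 1/theta."""
    return 1.0 / theta

def weibull_ratio(x, beta, theta):
    """f_X(x)/F-bar_X(x) for the Weibull: (beta/theta**beta) * x**(beta - 1)."""
    return beta * x ** (beta - 1) / theta ** beta

xs = [1.0, 2.0, 4.0, 8.0]
exp_ratios = [exponential_ratio(x, 2.0) for x in xs]    # flat at 1/theta = 0.5
wei_ratios = [weibull_ratio(x, 0.95, 3.0) for x in xs]  # decreasing in x for beta < 1
```

For $\beta = 0.95 < 1$ the Weibull ratio decreases in $x$ and stays below the constant Exponential ratio, exactly the pattern described above.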

Continuing our reasoning, suppose now that in general we consider any two different distributions, $F_X(\cdot)$ and $G_X(\cdot)$, and suppose one can find a value situated in the upper tail of the distribution, say $x^*$, such that $f_X(x^*) = g_X(x^*)$. Consider the case where $\bar{F}_X(x^*) > \bar{G}_X(x^*)$ while $f_X(x^*) = g_X(x^*)$. This implies that, given the value of $x^*$, the weight in the upper tail of $F_X(\cdot)$ is


larger than the weight in the upper tail of $G_X(\cdot)$. It follows then that

$$\frac{g_X(x^*)}{\bar{G}_X(x^*)} > \frac{f_X(x^*)}{\bar{F}_X(x^*)}.$$

This comparison is, however, not necessarily true for all values of $x \ge x^*$, but is only proven to hold at the point $x^*$.

One may need to consider whether the upper tail weight of the one distribution is in general larger than the weight of the other across all values in the upper tail; that is, for any arbitrary value of $x^*$. If we have identified such a point where the tail weight of the one distribution $F(\cdot)$ exceeds that of the other distribution $G(\cdot)$, one can consider the following ratios:

$$h_F(a) = \frac{f_X(x^* + a)}{\bar{F}_X(x^* + a)} \quad \text{for any } a \ge 0,$$

$$h_G(a) = \frac{g_X(x^* + a)}{\bar{G}_X(x^* + a)} \quad \text{for any } a \ge 0.$$

If $h_F(a)$ remains below $h_G(a)$ for all $a \ge 0$, distribution $F_X(\cdot)$ will always have greater upper tail weight than distribution $G_X(\cdot)$. Mathematically it means that for any $a \ge 0$

$$\frac{f_X(x^* + a)}{\bar{F}_X(x^* + a)} < \frac{g_X(x^* + a)}{\bar{G}_X(x^* + a)}.$$

If this condition can be confirmed, then distribution $F_X(\cdot)$ has a heavier tail than distribution $G_X(\cdot)$ irrespective of which value $x^*$ we consider.

Example 3. Consider two Exponential distributions with parameters $\theta_1$ and $\theta_2$, respectively, where $0 < \theta_2 < \theta_1 < \infty$. The two pairs of probability density and cumulative distribution functions follow from (B.34) and (4.1) with $\theta = \theta_1$ and $\theta = \theta_2$, respectively. In Figure B.34, shown in Chapter 4, it can be seen that for increasing values of the parameter $\theta$, the weight towards the upper tail of the distribution increases. For the purpose of this example consider Figure 3.5, where we compare the probability density functions for the cases where $\theta = 0.5$ and $\theta = 0.25$.

We can find the value $x^*$ such that $f_X(x^*) = g_X(x^*)$; that is, the value

$$x^* = \frac{\theta_1\theta_2 \ln\left(\frac{\theta_1}{\theta_2}\right)}{\theta_1 - \theta_2}.$$


Figure 3.5: Comparison of Exponential density functions for different values of $\theta$

In the context of the values of $\theta$ considered in Figure 3.5, $x^* \approx 0.35$. Consider the ratio of $\bar{F}_X(x^*)$ to $\bar{G}_X(x^*)$ taken from (4.1):

$$\frac{\bar{F}_X(x^*)}{\bar{G}_X(x^*)} = \frac{e^{-x^*/\theta_1}}{e^{-x^*/\theta_2}} = e^{x^*\left(\frac{1}{\theta_2} - \frac{1}{\theta_1}\right)} = \frac{\theta_1}{\theta_2} > 1.$$

This means that $\bar{F}_X(x^*) > \bar{G}_X(x^*)$. Furthermore,

$$\frac{f_X(x)}{\bar{F}_X(x)} = \theta_1^{-1} \quad \forall\, x > 0$$

and

$$\frac{g_X(x)}{\bar{G}_X(x)} = \theta_2^{-1} \quad \forall\, x > 0.$$

Therefore, since $\theta_1 > \theta_2$,

$$\frac{f_X(x)}{\bar{F}_X(x)} < \frac{g_X(x)}{\bar{G}_X(x)} \quad \forall\, x > 0,$$

which means that $F_X(\cdot)$ has a heavier tail than $G_X(\cdot)$.

Conversely, one can consider this evaluation of the tail weights of two distributions for a given upper tail weight $\alpha$. We can find $x_A$ and $x_B$ such that

$$\bar{F}_X(x_A) = \bar{G}_X(x_B) = \alpha.$$

One can then evaluate $f_X(x_A)$ and $g_X(x_B)$. If $f_X(x_A) > g_X(x_B)$, it suggests that for a given upper tail weight $\alpha$ distribution $G_X(\cdot)$ has a heavier tail than $F_X(\cdot)$: the upper tail of $G_X(\cdot)$ is spread across a wider range of values on $\mathbb{R}^+$ than the upper tail of $F_X(\cdot)$.

Example 4. As a continuation of Example 3, consider Exponential distributions $F_X(\cdot)$ and $G_X(\cdot)$ with parameters $\theta_1$ and $\theta_2$, where $0 < \theta_2 < \theta_1 < \infty$.

We need to find $x_A$ and $x_B$ such that $\bar{F}_X(x_A) = \alpha$ and $\bar{G}_X(x_B) = \alpha$. Solving these equations yields the following solutions:

$$x_A = -\theta_1 \ln(\alpha),$$

$$x_B = -\theta_2 \ln(\alpha),$$

from which the probability densities can be evaluated:

$$f_X(x_A) = \frac{\alpha}{\theta_1}, \qquad g_X(x_B) = \frac{\alpha}{\theta_2}.$$

From this it follows that for the values $x_A$ and $x_B$ associated with the specified upper tail weight $\alpha$ we have that $f_X(x_A) < g_X(x_B)$, which suggests that $F_X(\cdot)$ has a heavier tail than $G_X(\cdot)$ at this point of the distribution.

This tests one point only and does not yet prove heavier tail weight for all values in the domain. One will have to consider a range of values for $\alpha$ and evaluate the densities at each, in order to determine whether the conclusion holds over the full domain and hence that distribution $F_X(\cdot)$ does in fact have a heavier tail than $G_X(\cdot)$.
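Continuing Example 4 numerically, the comparison can be repeated over a range of $\alpha$ values:

```python
import math

def tail_quantile(alpha, theta):
    """x with F-bar(x) = alpha for the Exponential(theta): x = -theta ln(alpha)."""
    return -theta * math.log(alpha)

def density(x, theta):
    """Exponential probability density."""
    return math.exp(-x / theta) / theta

theta1, theta2 = 0.5, 0.25
checks = []
for alpha in (0.10, 0.05, 0.01):
    x_a = tail_quantile(alpha, theta1)
    x_b = tail_quantile(alpha, theta2)
    # f_X(x_A) = alpha/theta1 and g_X(x_B) = alpha/theta2
    checks.append(density(x_a, theta1) < density(x_b, theta2))
# every alpha tested points to the theta1 distribution as the heavier-tailed one
```

Here each $\alpha$ gives $f_X(x_A) = \alpha/\theta_1 < \alpha/\theta_2 = g_X(x_B)$, consistent with the closed-form conclusion of the example.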

In the remainder of this chapter theoretical statistical measures will be given that will be used in Chapter 4 to describe and analyse the characteristics of statistical distributions. Distributions of insurance claims typically exhibit skewness, as was discussed in Section 1.1. For this reason one needs to identify rules that can assist in identifying skewness and also help in ascertaining whether heavy tails exist.

When considering claim size distributions, large claim sizes are represented by the upper (right) tail. Klugman et al [75] highlight the importance of studying the tails of distributions of losses (or claims in this study), due to the fact that the losses (claims) in the tail of the distribution have the biggest impact on the total loss (aggregate claim size). They further suggest that an understanding of the tail weight of a distribution can help to choose or confirm a potential distribution to model the losses (claim sizes). It is therefore required to have an understanding of:


• whether a distribution exhibits skewness,

• whether the upper tail of the distribution is heavy, and

• if multiple distributions are considered, then it is useful to have a measure for comparing upper tail heaviness.

The remainder of this chapter is structured as follows:

• Definitions and formulae for calculating moments from parametric dis- tributions are given in Section 3.2.

• Properties of hazard rates and hazard rate functions are introduced in Section 3.3. Based on these results the concept of stochastic order is introduced.

• Criteria that can be used to assess whether a distribution has a heavy tail are introduced in Section 3.4. This is done by using the properties of hazard rates and hazard rate functions as presented in Section 3.3, as well as considering the use of the moment generating function, the concept of distributions being exponentially bounded and the comparative limiting tail behaviour of pairs of distributions.

• In Section 3.5 aspects relating to data where left censoring is present are discussed. In this context measures that can be used to test whether the assumption of heavy-tailedness holds are given.

• A class of distributions, the subexponential class, is introduced in Section 3.6. It is then also shown that these distributions have heavy tails.

• The chapter is concluded in Section 3.7 with practical methods that can be used to detect whether the distributions of observed data are heavy-tailed.

Skewness and kurtosis properties of parametric distributions are given in Chapter 4 in the context of the coefficient of skewness and the coefficient of kurtosis. Coefficients of skewness, kurtosis and variation are unitless measures and can therefore be used to compare distributions' skewness, kurtosis and spread (where spread is measured in terms of the coefficient of variation) [57], [75].

3.2 Moments of parametric distributions

In this section definitions are presented that can be used to calculate the moments of parametric distributions either directly or by using the moment


generating function. The use of the formulae presented in these definitions depends on whether the moments exist and whether the moment generating function for the specific distribution exists.

Definition 8. The $k$th moment about the origin [11], [112] is defined as

$$E(X^k) = \begin{cases} \displaystyle\int_A x^k f_X(x)\,dx & \text{if } X \text{ is continuous,}\\[2mm] \displaystyle\sum_A x^k P(X = x) & \text{if } X \text{ is discrete,} \end{cases}$$

where $A$ is the domain of $X$ and $k \in \mathbb{Z}$.

Definition 9. The $k$th moment about the mean [11], [112] is defined as

$$E\left((X - \mu)^k\right) = \begin{cases} \displaystyle\int_A (x - \mu)^k f_X(x)\,dx & \text{if } X \text{ is continuous,}\\[2mm] \displaystyle\sum_A (x - \mu)^k P(X = x) & \text{if } X \text{ is discrete,} \end{cases}$$

where $A$ is the domain of $X$ and $k \in \mathbb{Z}$.

Definition 10. For a random variable $X$ the moment generating function [11], [112] is given by

$$M_X(t) = E\left(e^{tX}\right) = \begin{cases} \displaystyle\int_A e^{tx} f_X(x)\,dx & \text{if } X \text{ is continuous,}\\[2mm] \displaystyle\sum_A e^{tx} P(X = x) & \text{if } X \text{ is discrete.} \end{cases}$$

The variance, skewness and kurtosis (based on the second, third and fourth moments about the mean) can be calculated from the first four moments about the origin:

$$\mathrm{var}(X) = E\left((X - \mu_X)^2\right) = \sigma_X^2 \quad \text{where } \mu_X = E(X), \tag{3.2}$$

$$\mathrm{skewness}(X) = \frac{E\left((X - \mu_X)^3\right)}{\sigma_X^3} \quad \text{where } \sigma_X = \sqrt{\mathrm{var}(X)}, \tag{3.3}$$

$$\mathrm{kurtosis}(X) = \frac{E\left((X - \mu_X)^4\right)}{\sigma_X^4}. \tag{3.4}$$
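As a numerical check of (3.2) and (3.3), the moments about the origin can be approximated by quadrature; for the Exponential distribution with $\theta = 2$ this recovers $\mathrm{var}(X) = \theta^2 = 4$ and a skewness of 2. A sketch using a simple trapezoidal rule:

```python
import math

def kth_moment(pdf, k, upper, n=20000):
    """Trapezoidal approximation of E(X^k) = integral_0^upper x^k f(x) dx."""
    h = upper / n
    total = 0.0
    for i in range(n + 1):
        x = i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * x ** k * pdf(x)
    return total * h

theta = 2.0
exp_pdf = lambda x: math.exp(-x / theta) / theta

m1 = kth_moment(exp_pdf, 1, 60.0)   # E(X) = theta = 2
m2 = kth_moment(exp_pdf, 2, 60.0)   # E(X^2) = 2 theta^2 = 8
m3 = kth_moment(exp_pdf, 3, 60.0)   # E(X^3) = 6 theta^3 = 48
variance = m2 - m1 ** 2                                       # (3.2): theta^2 = 4
skewness = (m3 - 3 * m1 * m2 + 2 * m1 ** 3) / variance ** 1.5  # (3.3): 2
```

The central moments are expanded in terms of the raw moments, mirroring how (3.2) and (3.3) are computed from the first moments about the origin.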

3.3 Distributions with monotone hazard rates

In this section the hazard function and hazard rate function are introduced. The associated residual hazard distribution and mean residual hazard func- tion are given. The section is concluded by deriving a relationship between the hazard rate and stochastic order of distributions.


Let $X$ be a nonnegative, absolutely continuous¹ random variable.

Definition 11. Hazard function. The hazard function of a random variable, $X$, with cumulative distribution function $F_X(\cdot)$ is defined as

$$h_X(x) = -\ln(1 - F_X(x)) = -\ln(\bar{F}_X(x)). \tag{3.5}$$

Definition 12. Hazard rate function. The hazard rate function of $X$, defined for $F_X(t) < 1$ [13], is given by

$$h_X^*(t) = \frac{f_X(t)}{1 - F_X(t)}. \tag{3.6}$$

If $h_X^*(t)$ is an increasing function of $t$, then the random variable is said to have an increasing hazard rate (IHR), while it is said to have a decreasing hazard rate (DHR) if $h_X^*(t)$ is a decreasing function of $t$ [105].

It follows from [105] that if the probability density function $f_X(x)$ is continuous, then $h_X(x)$ is differentiable and hence $\frac{d}{dx}h_X(x) = h_X^*(x)$.

Definition 13. Residual hazard distribution. The residual hazard distribution [105], $F_t(x)$, is given by

$$F_t(x) = \frac{F_X(t + x) - F_X(t)}{1 - F_X(t)} \tag{3.7}$$

$$= P(X - t \le x \mid X > t) \quad \text{if } F_X(t) < 1. \tag{3.8}$$

In Theorem 1 it is given that an increasing hazard rate function is associated with a decreasing residual hazard rate function.

Definition 14. Mean residual hazard function. The mean residual hazard function [105], [48] is given by

$$\mu_{F_t} = E(X - t \mid X > t) = \frac{\int_t^\infty \bar{F}_X(x)\,dx}{1 - F_X(t)} \quad \text{if } F_X(t) < 1. \tag{3.9}$$

¹A random variable $X$ is absolutely continuous if a measurable function [80] $f: \mathbb{R} \to \mathbb{R}^+$ exists such that $\int f(x)\,dx = 1$ and for each $M \in \mathcal{M}(\mathbb{R})$ it is true that $P(X \in M) = \int_M f(x)\,dx$. The distribution of $X$ is then called absolutely continuous [105].


The mean residual hazard function essentially gives the average amount by which the random variable $X$ exceeds the value $t$. It therefore follows that if the density function moves to 0 at a very slow rate (which occurs for heavy-tailed distributions), this function will increase with increasing values of $t$ [24].

Definition 15. Stochastic Order. A distribution $F_X(\cdot)$ is said to be stochastically larger than a distribution $G_X(\cdot)$ if $\bar{F}_X(x) \ge \bar{G}_X(x)\ \forall\, x \in \mathbb{R}$. This can alternatively be written as $F \ge_{st} G$ [105]. This definition implies that if the upper tail of the one distribution is larger than or equal to the upper tail of the other for all values of $x$, then that distribution is stochastically larger. It also means that the distribution has a heavier tail. Using the results of Definitions 12 and 15, Rolski et al [105] give the following theorem. An alternative proof is given here.

Theorem 1. The distribution $F$ has an increasing hazard rate if and only if $\forall\, t_1 \le t_2$ it is true that $F_{t_1} \ge_{st} F_{t_2}$.

Proof:

$$h_X^*(t) = \frac{f_X(t)}{1 - F_X(t)} = \frac{-\frac{d}{dt}(1 - F_X(t))}{1 - F_X(t)} = -\frac{d}{dt}\ln(1 - F_X(t)).$$

Hence

$$1 - F_X(t) = e^{-\int_0^t h_X^*(s)\,ds}. \tag{3.10}$$

From (3.10) and Definition 13 we have that

$$\bar{F}_t(x) = \frac{\bar{F}_X(t + x)}{\bar{F}_X(t)} = \frac{e^{-\int_0^{t+x} h_X^*(s)\,ds}}{e^{-\int_0^{t} h_X^*(s)\,ds}} = e^{-\int_t^{t+x} h_X^*(s)\,ds}. \tag{3.11}$$

If $h_X^*(s)$ is increasing, then $\bar{F}_t(x)$ is decreasing in $t$, so for $t_1 \le t_2$ we have $\bar{F}_{t_1}(x) \ge \bar{F}_{t_2}(x)\ \forall\, x \in \mathbb{R}$. This implies that $F_{t_1} \ge_{st} F_{t_2}$.

Conversely, if for $t_1 \le t_2$ we have $\bar{F}_{t_1}(x) \ge \bar{F}_{t_2}(x)\ \forall\, x \in \mathbb{R}$, then $\bar{F}_t(x)$ is decreasing in $t$. From (3.11) it follows that $h_X^*(s)$ is increasing.
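The characterisation via the residual upper tail can be inspected numerically. For the Weibull distribution $\bar{F}_X(x) = e^{-(x/\theta)^\beta}$, so the residual upper tail is $\bar{F}_t(x) = \exp((t/\theta)^\beta - ((t+x)/\theta)^\beta)$; for $\beta > 1$ (IHR) this should decrease in $t$, and for $\beta < 1$ (DHR) increase. A sketch:

```python
import math

def weibull_residual_sf(t, x, beta, theta):
    """F-bar_t(x) = F-bar(t + x)/F-bar(t) for the Weibull distribution."""
    return math.exp((t / theta) ** beta - ((t + x) / theta) ** beta)

ts, x = (0.0, 1.0, 2.0, 4.0), 0.5
ihr_vals = [weibull_residual_sf(t, x, 2.0, 1.0) for t in ts]  # beta > 1: IHR
dhr_vals = [weibull_residual_sf(t, x, 0.5, 1.0) for t in ts]  # beta < 1: DHR
# ihr_vals decrease in t (so F_{t1} >=_st F_{t2} for t1 <= t2);
# dhr_vals increase in t, reversing the stochastic ordering
```

This matches Theorem 1: an increasing hazard rate makes the residual life stochastically smaller as $t$ grows, and a decreasing hazard rate makes it stochastically larger.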


3.4 Heavy-tailed distributions

Klugman et al [75] argue that knowing something about the tail weight of an underlying distribution may assist in narrowing down the choices of potential theoretical distributions that can be used to model particular claims data. In this section techniques are discussed that can be used to determine whether parametric distributions are heavy-tailed.

Definition 16. A distribution $F(\cdot)$ for a random variable $X$ is considered to be heavy-tailed [56] if and only if

$$\int_{\mathbb{R}} e^{\lambda x} \bar{F}(x)\,dx = \infty \quad \text{for all } \lambda > 0.$$

Definition 17. A distribution $F(\cdot)$ for a random variable $X$ is considered to be light-tailed [56] if and only if

$$\int_{\mathbb{R}} e^{\lambda x} \bar{F}(x)\,dx < \infty \quad \text{for some } \lambda > 0.$$

These two definitions make intuitive sense: if a distribution has a heavy tail, its upper tail weight $\bar{F}(x)$ decays so slowly that the growth of $e^{\lambda x}$ dominates, the integrand $e^{\lambda x}\bar{F}(x)$ does not tend to 0 as $x$ tends to infinity, and hence the integral cannot be finite, no matter how small $\lambda > 0$ is chosen.

3.4.1 Classification of tail heaviness based on the moment generating function

From Klugman et al [75] it follows that one can classify a distribution as having a light or heavy tail based on whether all of its moments exist. If only some, or none, of the moments exist, then this is indicative of a distribution with a heavy tail, whilst if all moments exist, the distribution is classified as having a relatively light tail. This statement is more formally defined by Rolski et al [105] as follows:

Definition 18. An absolutely continuous distribution with cumulative distribution function $F_X(x)$ is said to have a heavy tail if for all $s > 0$ the moment generating function $M_X(s)$ is infinite.

Example 5. Consider a random variable $X$ that is Gamma distributed with parameters $\theta$ and $\kappa$. The $r$th order moment for $X$ is given in (B.63). Also consider a random variable $Y$ that is Pareto Type II distributed with parameters $\kappa$ and $\theta$. The $r$th order moment for $Y$ is given in (B.140), [75]. The


moments of $Y$ exist only if $-2 < r < \kappa + 1$, i.e. not all of the moments of $Y$ exist. There is no restriction on the value of $r$ in terms of calculating moments for $X$.

We can conclude that the Gamma distribution does not have a heavy tail; the Pareto distribution, on the other hand, does have a heavy tail. Note that the Gamma and Pareto Type II distributions are discussed in Chapter 4 in Sections 4.1.1 and 4.1.11, respectively.

It is important to note that if a distribution does not have all its moments, it will not have a moment generating function, in which case $E(e^{tX}) = \infty$ for all $t > 0$. The converse is, however, not true.

3.4.2 Classification of tail heaviness based on exponential boundedness

Definition 19. An absolutely continuous distribution with cumulative distribution function $F_X(x)$ is said to have an exponentially bounded tail if there exist $a > 0$ and $b > 0$ such that

$$\bar{F}_X(x) = 1 - F_X(x) \le ae^{-bx} \quad \forall\, x > 0. \tag{3.12}$$
A distribution with an exponentially bounded tail is said to have a light tail [105].

It was stated and argued in Section 2.3 that if we denote the random variable for the claim size by $\Theta$, then the claim size distribution $F_\Theta(\theta)$ is well behaved if $1 - F_\Theta(\theta) \le ce^{-a\theta}$. In the context of Definition 19 this means that such a claim size distribution is exponentially bounded and hence has a light tail.

Example 6. If $X$ is Exponentially distributed with parameter $\theta$, then it follows from (4.1) that

$$\bar{F}_X(x) = e^{-\frac{x}{\theta}} < e^{-\frac{x}{\theta + \varepsilon}}.$$

Thus there exist $a, b > 0$ (where $a = 1$ and $b = \frac{1}{\theta + \varepsilon}$ with $\varepsilon < \theta$) such that $\bar{F}_X(x) < ae^{-bx}$, which implies that $F_X(x)$ is light-tailed. In Figure 3.6 the Exponential distribution's upper tail is displayed for a parameter value $\theta$ of $0.9$. It also shows an exponential bound calculated using $a = 1$, $b = \frac{1}{\theta + 0.3}$.

Example 7. If $X$ is Weibull distributed with parameters $\beta$ and $\theta$ where $\beta > 1$, then from (4.16) and (4.1) it follows that $\bar{F}(x) = e^{-(\frac{x}{\theta})^\beta} \le e^{-\frac{x}{\theta}}$ for $x \ge \theta$ (which suffices for the tail), and the right-hand side is the upper tail of the Exponential distribution, which from Example 6 is exponentially bounded. This indicates that for $\beta > 1$ the Weibull distribution is exponentially bounded and does not exhibit a heavy tail.
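The bound of Example 6 can be verified on a grid, using the same values as Figure 3.6 ($\theta = 0.9$, $\varepsilon = 0.3$). This is an illustrative Python sketch, not part of the dissertation's SAS code.

```python
import math

theta, eps = 0.9, 0.3
a, b = 1.0, 1.0 / (theta + eps)  # the bound F_bar(x) <= a * exp(-b*x)

def exp_sf(x):
    """Upper tail of the Exponential distribution with parameter theta."""
    return math.exp(-x / theta)

# The Exponential tail sits below the exponential bound everywhere on x > 0.
grid = [0.1 * i for i in range(1, 201)]
bounded = all(exp_sf(x) <= a * math.exp(-b * x) for x in grid)
print(bounded)  # True
```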



Figure 3.6: Comparison of upper tail of Exponential distribution and expo- nential bound

3.4.3 Comparison of tail weights using limiting tail behaviour

Klugman et al [75] discuss the use of the limiting behaviour of the ratio of the tails of two distributions to determine whether the tail of the one distribution is heavier than the tail of the other. Consider any two distributions with cumulative distribution functions $F_1(x)$ and $F_2(x)$, and consider the limit, as $x$ tends to infinity, of the ratio of $\bar{F}_1(x)$ to $\bar{F}_2(x)$. Using L'Hospital's Rule [111],

$$\lim_{x \to \infty} \frac{\bar{F}_1(x)}{\bar{F}_2(x)} = \lim_{x \to \infty} \frac{f_1(x)}{f_2(x)}. \tag{3.13}$$

If the ratio tends to infinity as $x$ tends to infinity, the probabilities of the two distributions diverge with respect to larger values; more specifically, the distribution in the numerator of the ratio assigns higher probabilities to larger values.

Often the limit of the ratio of the tails of the two distributions under consideration tends to neither $0$ nor infinity, which suggests that neither of the two distributions has a dominantly heavier tail. The concept of tail equivalence as introduced by Rolski et al [105] is useful in this instance. Two distributions $F_X(\cdot)$ and $G_X(\cdot)$ are tail-equivalent if the ratio $\frac{\bar{G}_X(x)}{\bar{F}_X(x)}$ tends to some constant $c$ as $x$ tends to infinity, where $0 < c < \infty$. The following example is taken from Klugman et al [75].



Example 8. Consider two random variables: $X$, which is Gamma distributed with parameters $\theta = \theta_1$ and $\kappa = \kappa_1$, and $Y$, which is Pareto Type II distributed with parameters $\kappa = \kappa_2$ and $\theta = \theta_2$.

Figure 3.7: Comparison of the limiting tail behaviour of the Gamma and Pareto Type II distributions

From (B.61) and (B.139) it follows that:

$$\lim_{x \to \infty} \frac{f_Y(x)}{f_X(x)} = \lim_{x \to \infty} \frac{\kappa_2 \theta_2^{\kappa_2} (x + \theta_2)^{-\kappa_2 - 1}}{\frac{1}{\Gamma(\kappa_1)} \theta_1^{-\kappa_1} x^{\kappa_1 - 1} e^{-\frac{x}{\theta_1}}} > c \lim_{x \to \infty} \frac{e^{\frac{x}{\theta_1}}}{(x + \theta_2)^{\kappa_2 + \kappa_1}} \quad \text{for } \kappa_1 > 1 \text{ and } \theta_2 > 0$$
$$= \infty.$$
This means that the distribution of $Y$ has a heavier tail than that of $X$. The density functions for a Gamma and a Pareto Type II distribution are given in Figure 3.7, together with the ratio of the Pareto density to the Gamma density up to a reasonably high value of $x$, indicating that this ratio tends to infinity as $x$ tends to infinity. The Pareto distribution therefore has a heavier tail than the Gamma distribution.
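The divergence of the density ratio in Example 8 can be observed numerically. The sketch below is an illustrative Python check with parameter values of my own choosing ($\theta_1 = 1$, $\kappa_1 = 2$; $\kappa_2 = 3$, $\theta_2 = 2$).

```python
import math

def gamma_pdf(x, theta=1.0, kappa=2.0):
    """Gamma density with scale theta and shape kappa."""
    return (theta ** -kappa) * x ** (kappa - 1) * math.exp(-x / theta) / math.gamma(kappa)

def pareto2_pdf(x, kappa=3.0, theta=2.0):
    """Pareto Type II (Lomax) density."""
    return kappa * theta ** kappa * (x + theta) ** (-kappa - 1)

# The ratio f_Y / f_X grows without bound: the Pareto tail dominates.
ratios = [pareto2_pdf(x) / gamma_pdf(x) for x in (5.0, 20.0, 50.0)]
print(ratios)  # strictly increasing; the last value is astronomically large
```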

The concept of tail equivalence is illustrated in the next example.

Example 9. Consider the Exponential and Two-parameter Exponential distributions. These distributions are formally introduced in Chapter 4. The cumulative distribution function for the Exponential distribution is given in (4.1) and the cumulative distribution function for the Two-parameter Exponential distribution is given by [11]:



$$G_X(x) = 1 - e^{-\frac{x - \eta}{\theta}} \quad \text{for } x > \eta.$$

The Two-parameter Exponential distribution is essentially the Exponential distribution shifted $\eta$ units to the right on the real line. Alternatively, the Exponential distribution can be seen as a special case of the Two-parameter Exponential distribution with $\eta = 0$. The rates of tail decay of these two distributions are therefore the same, but if we consider at any point $x^*$ the area under the density to the right (i.e. the upper tail weight), then the weight for the Two-parameter Exponential distribution will be larger than that of the Exponential distribution because of the shift of $\eta$ units to the right. Hence the tails are expected to be equivalent:

$$\frac{\bar{G}_X(x)}{\bar{F}_X(x)} = \frac{e^{-\frac{x - \eta}{\theta}}}{e^{-\frac{x}{\theta}}} = e^{\frac{\eta}{\theta}}.$$

Hence there exists a value $c = e^{\frac{\eta}{\theta}}$ with $0 < c < \infty$ such that $\lim_{x \to \infty} \frac{\bar{G}_X(x)}{\bar{F}_X(x)} = c$. Furthermore, this value of $c$ is larger than $1$, which indicates that the upper tail weight at any point $x^*$ will be larger for the Two-parameter Exponential distribution than for the Exponential distribution.
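The constant ratio $e^{\eta/\theta}$ of Example 9 can be checked directly (illustrative Python sketch; $\theta = 2$, $\eta = 1$ are my own values).

```python
import math

theta, eta = 2.0, 1.0

def exp_sf(x):
    """Exponential upper tail."""
    return math.exp(-x / theta)

def two_param_exp_sf(x):
    """Two-parameter Exponential upper tail, valid for x > eta."""
    return math.exp(-(x - eta) / theta)

# The ratio equals the constant c = e^{eta/theta} for every x > eta.
c = math.exp(eta / theta)
ratios = [two_param_exp_sf(x) / exp_sf(x) for x in (2.0, 10.0, 100.0)]
print(ratios, c)  # all three ratios equal c, approximately 1.6487
```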

3.4.4 Classification of tail heaviness based on the hazard function

In this section a result based on the hazard function is given that can be used to determine whether a distribution has a heavy tail.

The following result is presented in [105] and is quite useful in deriving a relationship between the hazard function, $h_X(x)$, and the heavy-tailedness of the distribution with cumulative distribution function $F_X(x)$.

Result 1. For a random variable $X$ with distribution function $F_X(x)$,
$$M_X(s) = \int_{-\infty}^{\infty} e^{sx} f(x)\,dx \quad \forall\, s \in \mathbb{R}.$$
Furthermore,
$$M_X(s) - 1 = \int_{-\infty}^{\infty} \left(e^{sx} - 1\right) f_X(x)\,dx = -s \int_{-\infty}^{0} F_X(x) e^{sx}\,dx + s \int_{0}^{\infty} \bar{F}_X(x) e^{sx}\,dx.$$



Theorem 2. Let $\alpha_F = \limsup_{x \to \infty} \frac{h_X(x)}{x}$. If $\alpha_F = 0$, then the distribution of $X$ is heavy-tailed$^2$ [56].

Proof: We will prove this result for $X$ a nonnegative random variable. This is sufficient in the context of claim sizes, as claim sizes are nonnegative. The proof as provided by Rolski et al [105] is given here, but additional steps are included in order to make the derivation clearer and easier for the reader to follow.

Suppose that $\alpha_F = 0$, so that $\limsup_{x \to \infty} \frac{h(x)}{x} = 0$. Then for each $\varepsilon > 0$ there exists an $x^*$ such that for all $x \ge x^*$ we have that $h(x) \le \varepsilon x$. Hence

$$-\ln\left(\bar{F}_X(x)\right) \le \varepsilon x$$
$$\bar{F}_X(x) \ge e^{-\varepsilon x}.$$

Therefore, for $s > 0$,
$$\int_0^{\infty} e^{sx} \bar{F}_X(x)\,dx \ge \int_{x^*}^{\infty} e^{sx - \varepsilon x}\,dx = \infty \quad \forall\, s \ge \varepsilon.$$

Since $\varepsilon$ is an arbitrary positive number,

$$\int_0^{\infty} e^{sx} \bar{F}_X(x)\,dx = \infty \quad \forall\, s > 0.$$
From Result 1 it follows that

$$\int_0^{\infty} e^{sx} \bar{F}_X(x)\,dx = \frac{1}{s}\left(M_X(s) - 1 + s \int_{-\infty}^{0} F_X(x) e^{sx}\,dx\right).$$

If $X$ is a nonnegative random variable, then $F_X(x) = 0$ for $x < 0$. Hence
$$\int_0^{\infty} e^{sx} \bar{F}_X(x)\,dx = \frac{M_X(s) - 1}{s} = \infty \quad \forall\, s > 0.$$
Thus

$$M_X(s) = \infty \quad \forall\, s > 0. \tag{3.14}$$

Therefore, if for a nonnegative random variable $X$ with distribution function $F_X(\cdot)$ we have that $\alpha_F = 0$, it follows that the distribution is heavy-tailed.

$^2$ Let $x = (x_n)$ be a bounded sequence in $\mathbb{R}$. The limit superior of $x$, denoted by $\limsup x$, is the infimum of the set of $v \in \mathbb{R}$ for which $v < x_n$ holds for at most finitely many $n \in \mathbb{N}$ [15].



In practice the Normal distribution is often used if the distribution of the data appears to be close to symmetric. When considering a problem where the observed data are strictly positive, the Normal distribution may still be useful, provided that the chosen parameter values are such that $P(X \le 0)$ is very close to $0$. In such an event the question may be asked whether this Normal distribution has a heavy tail. Since Theorem 2 is proved only for distributions with nonnegative support, an alternative method needs to be considered to ascertain whether a Normal distribution has a heavy tail. The easiest way to show that it does not have a heavy tail is by using (3.13), in which the limiting tail behaviour of a Normal distribution (for any parameter values) is compared to the limiting tail behaviour of an Exponential distribution (for any parameter value). With $Y$ Normal and $X$ Exponential, it follows from (B.34) and (B.133) that the Exponential distribution has a heavier tail than the Normal distribution in that
$$\lim_{x \to \infty} \frac{f_Y(x)}{f_X(x)} = 0.$$
Figure 3.6 shows that the Exponential distribution has an exponentially bounded upper tail and consequently is not heavy-tailed. Hence the Normal distribution does not have a heavy tail.

3.4.5 Classification of tail heaviness based on the hazard rate function

Klugman et al [75] argue that a distribution with a decreasing hazard rate function, $h^*_X(t)$, has a heavy tail. In this case the term decreasing should be interpreted as non-increasing, in that the function may be flat over some ranges. The hazard rate function expresses the probability of a certain value within the range of the distribution relative to the tail to the right of that value. Having a decreasing hazard rate function therefore suggests that the probability of observing specific large values decreases at a slower rate than elsewhere, so that the probability of observing values beyond these specific values remains substantial. As such, this suggests that the right tail weights for these distributions are not light. In conclusion, Klugman et al state that one distribution has a heavier tail than another if its hazard rate decreases at a faster rate than the hazard rate of the other distribution.

The following mathematical reasoning can be followed to understand why a decreasing hazard rate function is indicative of a heavy tail:



$$\frac{f_X(t)}{1 - F_X(t)} \approx \frac{P\left(X \in (t - dt,\, t + dt)\right)}{P(X > t)}$$
$$\approx \frac{P(t - dt < X < t + dt)}{P(X > t - dt)}$$
$$= \frac{P(X > t - dt) - P(X > t + dt)}{P(X > t - dt)}$$
$$= 1 - P(X > t + dt \mid X > t - dt). \tag{3.15}$$

From the expression in (3.15) it follows that if the hazard rate is a decreasing function, the ratio
$$\frac{P(X > t + dt)}{P(X > t - dt)}$$
will tend to $1$ as $t$ tends to $\infty$. This means that over a small region around large values of $t$, the probability to the right of that value (i.e. in the right tail) is not changing very quickly, suggesting that there is still significant mass in the right tail. This is indicative of a distribution having a heavy tail. Alternatively, the argument of Klugman et al [75] that a decreasing hazard rate function is indicative of a heavy-tailed distribution can be supported by the fact that $\frac{d}{dx} h_X(x) = h^*_X(x)$. A decreasing hazard rate function will therefore yield that

$$\alpha_F = \limsup_{x \to \infty} \frac{h_X(x)}{x} = \limsup_{x \to \infty} \frac{\frac{d}{dx} h_X(x)}{1} \quad \text{(using L'Hospital's Rule [111])}$$
$$= \limsup_{x \to \infty} h^*_X(x) = 0, \quad \text{since } h^*_X \text{ is a decreasing function.}$$

This suggests that if the hazard rate function is a decreasing function, then $\alpha_F = 0$, which, by Theorem 2, is indicative of a heavy-tailed distribution.
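The $\alpha_F$ criterion has closed forms for the distributions used throughout this chapter. The illustrative Python sketch below (parameter values are my own) contrasts the cumulative hazard $h_X(x) = -\ln \bar{F}_X(x)$ of a Pareto Type II distribution, which grows only logarithmically so that $h_X(x)/x \to 0$, with that of a Weibull distribution with $\beta = 2$, for which the ratio diverges.

```python
import math

def pareto2_cumhaz(x, kappa=2.0, theta=1.0):
    # h(x) = -ln(sf) = kappa * ln(1 + x/theta): grows only logarithmically
    return kappa * math.log1p(x / theta)

def weibull_cumhaz(x, beta=2.0, theta=1.0):
    # h(x) = (x/theta)^beta: grows faster than linearly for beta > 1
    return (x / theta) ** beta

# alpha_F = limsup h(x)/x: tends to 0 for the Pareto (heavy tail),
# but diverges for the Weibull with beta = 2 (light tail).
for x in (1e2, 1e4, 1e6):
    print(pareto2_cumhaz(x) / x, weibull_cumhaz(x) / x)
```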

3.4.6 Classification of tail heaviness based on the mean resid- ual hazard rate function

The classification of a distribution as heavy-tailed can also be done using the mean residual hazard rate function. From (3.10) it follows that if we


consider the ratio of right tail probabilities,
$$\frac{\bar{F}_X(t + y)}{\bar{F}_X(t)} = \frac{e^{-\int_0^{t+y} h^*_X(x)\,dx}}{e^{-\int_0^{t} h^*_X(x)\,dx}} = e^{-\int_0^{y} h^*_X(x + t)\,dx}. \tag{3.16}$$

Therefore it follows from (3.16) that the ratio of $\bar{F}_X(x + t)$ to $\bar{F}_X(t)$ is increasing in $t$ if the hazard rate function is decreasing. The mean residual hazard function, as defined in Definition 14, can be written as:

$$\mu_{F_t} = E(X - t \mid X > t) = \frac{\int_t^{\infty} \bar{F}(x)\,dx}{\bar{F}(t)} = \int_0^{\infty} \frac{\bar{F}(x + t)}{\bar{F}(t)}\,dx = \int_0^{\infty} e^{-\int_0^{x} h^*_X(y + t)\,dy}\,dx. \tag{3.17}$$

This means that if the hazard rate function is decreasing, the mean residual hazard function is increasing [23]. Klugman et al [75] and Brown [23] highlight that the converse does not necessarily hold. An increasing mean residual hazard rate function is therefore associated with a distribution with a heavy tail.

From Davis [37] it follows that if the hazard rate function $h^*_X(t)$ is a constant function (i.e. equal to some constant $k > 0$), this yields a differential equation of the form
$$-\frac{d}{dt}\ln\left(\bar{F}(t)\right) = k,$$
for which a solution exists of the form $\bar{F}(t) = e^{-kt}$. In a similar fashion Bryson [24] considers testing the null hypothesis that the mean residual hazard rate function is constant against the alternative that it is increasing. Under the alternative hypothesis the set of linear functions of the form $a + bx$ is considered, which results in a differential equation with solution

$$\bar{F}(x) = \left(\frac{a}{a + bx}\right)^{1 + \frac{1}{b}}.$$



The solution under the null hypothesis is therefore obtained as the limiting case where $b \to 0$, which is given by

$$\bar{F}(x) = e^{-\frac{x}{a}},$$
which corresponds to the solution that Davis obtained when considering a constant hazard rate function. In this context it can be seen that when the alternative hypothesis is true, i.e. when the mean residual hazard function is increasing, an exponential bound cannot be obtained. It therefore follows that if the hazard rate function is decreasing, the mean residual hazard rate function is increasing and the corresponding distribution does not have an exponentially bounded tail.

The mean residual hazard function can also be used to compare the tail heaviness of two distributions: a distribution with a heavier tail than another will have a mean residual hazard function that increases at a higher rate than that of the other distribution [75]. This concept is illustrated in Example 10 below.

Example 10. Consider two random variables X1 and X2 that are both Pareto Type II distributed with parameters θ and κ where θ = 1 for both random variables and κ has values of 2 and 4 for X1 and X2, respectively. The distributions are illustrated in Figure 3.8.

Figure 3.8: Comparison of two Pareto Type II distributions

In Figure 3.8 it can be seen that for κ = 4 the upper tail weight is lower than for κ = 2. The general expression for the mean residual hazard rate function can be derived as follows:



$$\mu_{F_t} = \frac{\int_t^{\infty} \bar{F}(x)\,dx}{\bar{F}(t)} = \frac{\int_t^{\infty} \left(\frac{x}{\theta} + 1\right)^{-\kappa}\,dx}{\left(\frac{t}{\theta} + 1\right)^{-\kappa}} = \frac{t + \theta}{\kappa - 1}.$$

The rate of increase of the mean residual hazard rate function is therefore given by
$$\frac{d}{dt}\mu_{F_t} = (\kappa - 1)^{-1}.$$

This means that if we consider the two cases where $\kappa = \kappa_1 = 2$ and $\kappa = \kappa_2 = 4$, we have that $(\kappa_1 - 1)^{-1} > (\kappa_2 - 1)^{-1}$, which indicates that the mean residual hazard rate function for $X_1$ increases at a higher rate than for $X_2$, suggesting that the distribution of $X_1$ has a heavier tail than that of $X_2$. This is also supported by the graphical representation of the two distributions in Figure 3.8.
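The closed form $\mu_{F_t} = (t+\theta)/(\kappa-1)$ derived above can be checked by Monte Carlo simulation. The sketch below is illustrative Python (the dissertation's simulations use SAS PROC IML) for the $X_2$ case of Example 10 ($\kappa = 4$, $\theta = 1$) with threshold $t = 1$, using inverse-transform sampling for the Pareto Type II distribution.

```python
import random

random.seed(1)
kappa, theta, t = 4.0, 1.0, 1.0
n = 200_000

# Inverse transform for the Pareto Type II: X = theta * ((1-U)^(-1/kappa) - 1)
sample = [theta * ((1.0 - random.random()) ** (-1.0 / kappa) - 1.0) for _ in range(n)]

# Average excess over the threshold among exceedances.
excess = [x - t for x in sample if x > t]
mc_mean = sum(excess) / len(excess)
theoretical = (t + theta) / (kappa - 1)
print(mc_mean, theoretical)  # both close to 2/3
```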

3.5 Left Censored Variables and Integrated Tail Distributions

Consider a random variable $X$ with distribution function $F_X(x)$ defined on $\mathbb{R}^+$. If $X$ is a random variable for measuring the loss (or any similar type of exposure), then one can and should clearly distinguish between the following associated random variables:

- Excess loss variable

- Left censored variable

- Limited loss variable

Klugman et al [75] give formal definitions of each of these associated random variables. These are discussed in Sections 3.5.1 to 3.5.3.

3.5.1 Excess Loss Variable

Definition 20 (Excess Loss Variable). The excess loss variable for a random variable $X$ with a threshold $t$ is defined as $Y = X - t$ where $X > t$. The expected value of this variable is given by $E(X - t \mid X > t)$, which is the mean residual hazard function in (3.9), as defined in Definition 14.



In the context of life contingencies [99] the expectation given in Definition 14 is referred to as the complete (remaining) life expectation of a person, given that the person is $t$ years old. The variable $Y$ has zero density for all values less than or equal to $0$.

3.5.2 Left Censored Variable

Definition 21 (Left Censored Variable). The left censored variable for a random variable $X$ with a threshold $t$ is defined as
$$Y = (X - t)_+ = \begin{cases} 0 & \text{if } X \le t, \\ X - t & \text{if } X > t. \end{cases}$$

The expectation of the left censored variable can be derived as follows:

$$E\left((X - t)_+\right) = \int_0^{\infty} (x - t)_+ f_X(x)\,dx$$
$$= \int_0^{t} 0 \times f_X(x)\,dx + \int_t^{\infty} (x - t) f_X(x)\,dx$$
$$= \left(1 - F_X(t)\right) \times E(X - t \mid X > t)$$
$$= \int_t^{\infty} \left(1 - F_X(x)\right)\,dx \quad \text{from Definition 14.} \tag{3.18}$$
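The identity (3.18) can be verified by simulation. The illustrative Python sketch below (my own parameter values) uses an Exponential distribution with $\theta = 2$ and threshold $t = 1$, for which the integral has the closed form $\theta e^{-t/\theta}$.

```python
import math
import random

random.seed(2)
theta, t, n = 2.0, 1.0, 200_000

# Simulate Exponential losses and average the left censored variable (X - t)_+.
sample = [random.expovariate(1.0 / theta) for _ in range(n)]
mc = sum(max(x - t, 0.0) for x in sample) / n

# Closed form of the tail integral: int_t^inf e^{-x/theta} dx = theta * e^{-t/theta}
closed = theta * math.exp(-t / theta)
print(mc, closed)  # both near 1.2131
```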

The random variable $Y$ takes the value $0$ whenever $X \le t$ (a point mass of size $F_X(t)$ at $0$) and has non-zero density on $(0, \infty)$. This variable and its distribution are of particular interest when insurance claims are modelled where an excess or retention level is in place with respect to the insured party. In reinsurance this will be of interest to the reinsurer (second line insurer) with respect to claims from the cedant (first line insurer).

3.5.3 Limited Loss Variable

Definition 22 (Limited Loss Variable). The limited loss variable for a random variable $X$ with an upper threshold $\nu$ is defined as
$$Y = \begin{cases} X & \text{if } X < \nu, \\ \nu & \text{if } X \ge \nu. \end{cases} \tag{3.19}$$



The expectation of the limited loss variable can be derived as follows:
$$E(Y) = \int_0^{\nu} x f_X(x)\,dx + \int_{\nu}^{\infty} \nu f_X(x)\,dx$$
$$= \int_0^{\infty} x f_X(x)\,dx - \int_{\nu}^{\infty} x f_X(x)\,dx + \int_{\nu}^{\infty} \nu f_X(x)\,dx$$
$$= E(X) - \int_{\nu}^{\infty} (x - \nu) f_X(x)\,dx$$
$$= E(X) - \int_{\nu}^{\infty} \left(1 - F_X(x)\right)\,dx \quad \text{from Definition 14.} \tag{3.20}$$
The random variable $Y$ and its distribution are of particular interest when insurance claims are modelled from the point of view of the first line insurer when reinsurance treaties are in place. The upper threshold is the retention level. The random variable is commonly referred to in statistical theory as a right censored variable.
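The limited-loss expectation (3.20) can likewise be checked by simulation (illustrative Python sketch with my own values: Exponential $\theta = 2$, retention $\nu = 3$, for which $E(Y) = \theta(1 - e^{-\nu/\theta})$).

```python
import math
import random

random.seed(3)
theta, nu, n = 2.0, 3.0, 200_000

# Simulate Exponential losses and average the limited loss variable min(X, nu).
sample = [random.expovariate(1.0 / theta) for _ in range(n)]
mc = sum(min(x, nu) for x in sample) / n

# E(X) - int_nu^inf e^{-x/theta} dx = theta - theta * e^{-nu/theta}
closed = theta * (1.0 - math.exp(-nu / theta))
print(mc, closed)  # both near 1.5537
```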

3.5.4 Coefficient of variation when the distribution is heavy- tailed

Another useful approach introduced by Klugman et al [75] is to consider the coefficient of variation of a nonnegative random variable. Knowing whether the hazard rate is increasing or decreasing can indicate whether the coefficient of variation is restricted to certain values, either by a lower limit or by an upper limit. The converse isn't necessarily implied; one can, however, disprove an assertion that a distribution is heavy-tailed (or that it is not) by finding a counter-example through the coefficient of variation.

Definition 23 (Integrated Tail Distribution). Consider the excess loss and left censored variables as defined in Definitions 20 and 21. If $t = 0$, then the expectation of the excess loss variable equals the expectation of the left censored variable:
$$E(X) = \int_0^{\infty} \bar{F}_X(x)\,dx,$$
hence

$$\int_0^{\infty} \frac{\bar{F}_X(x)}{E(X)}\,dx = 1. \tag{3.21}$$
This is referred to as the integrated tail distribution [105]. The ratio in this integral behaves like a probability density function in that the integral over the full range of values that $X$ can take on is equal to $1$.



Let $f_e(x) = \frac{\bar{F}_X(x)}{E(X)}$, and therefore

$$F_e(x) = \int_0^x f_e(u)\,du = \int_0^x \frac{\bar{F}_X(u)}{E(X)}\,du = 1 - \int_x^{\infty} \frac{\bar{F}_X(u)}{E(X)}\,du. \tag{3.22}$$

If the mean residual hazard rate function is increasing, one will have, for positive, non-zero values of $X$ [75], that

$$\mu_{F_x} \ge \mu_{F_0}$$
$$\frac{\int_x^{\infty} \left(1 - F_X(u)\right)\,du}{1 - F_X(x)} \ge \frac{\int_0^{\infty} \left(1 - F_X(u)\right)\,du}{1 - F_X(0)}$$
$$\frac{\int_x^{\infty} \left(1 - F_X(u)\right)\,du}{1 - F_X(x)} \ge E(X), \text{ hence}$$
$$\frac{\int_x^{\infty} \left(1 - F_X(u)\right)\,du}{E(X)} \ge \left(1 - F_X(x)\right)$$
$$\therefore \left(1 - F_e(x)\right) \ge \left(1 - F_X(x)\right)$$
$$\therefore \int_0^{\infty} \left(1 - F_e(x)\right)\,dx \ge \int_0^{\infty} \left(1 - F_X(x)\right)\,dx = E(X), \tag{3.23}$$
but

$$\int_0^{\infty} \left(1 - F_e(x)\right)\,dx = E_e(X) = \int_0^{\infty} x f_e(x)\,dx = \int_0^{\infty} \frac{x \bar{F}_X(x)}{E(X)}\,dx = \frac{1}{E(X)} \left[\left. \frac{x^2}{2} \bar{F}_X(x) \right|_0^{\infty} + \int_0^{\infty} \frac{x^2}{2} f_X(x)\,dx\right].$$
Hence
$$\int_0^{\infty} \left(1 - F_e(x)\right)\,dx = \frac{1}{2 E(X)} \int_0^{\infty} x^2 f_X(x)\,dx = \frac{E(X^2)}{2 E(X)} \ge \int_0^{\infty} \left(1 - F_X(x)\right)\,dx = E(X), \tag{3.24}$$
following from the expression in (3.23).



Using the expression in (3.24) and the fact that $\text{var}(X) = E(X^2) - (E(X))^2$, it can be concluded that if the mean residual hazard rate function is increasing (as is the case when the distribution has a heavy tail), then

$$\frac{\text{var}(X)}{(E(X))^2} \ge 1. \tag{3.25}$$

This means that if the distribution has a heavy tail, the coefficient of varia- tion has a value of at least 1. Furthermore if a distribution has a decreasing mean residual hazard rate function, the coefficient of variation will be at most 1 [75].

The relationship between heavy-tailed distributions, their mean residual hazard rate functions and coefficients of variation can therefore be summarised as follows:

• Case 1: The distribution with cumulative distribution function $F_X(\cdot)$ has a heavy tail
  ⇒ $\mu_{F_t}$ is increasing (from (3.17))
  ⇒ $\frac{\text{var}(X)}{(E(X))^2} \ge 1$ (from (3.25))

• Case 2: The distribution with cumulative distribution function $F(\cdot)$ does not have a heavy tail, with $\mu_{F_t}$ decreasing
  ⇒ $\frac{\text{var}(X)}{(E(X))^2} \le 1$

• Case 3: The distribution with cumulative distribution function $F(\cdot)$ does not have a heavy tail, with $\mu_{F_t}$ increasing
  ⇒ $\frac{\text{var}(X)}{(E(X))^2} \ge 1$

The relationship in (3.25) can therefore only be used to test a proposition of heavy-tailedness: if we, for example, assume that the distribution has a heavy tail but the coefficient of variation is less than $1$, this contradicts our assumption and the proposition is proven wrong. Observing a coefficient of variation greater than or equal to $1$ does not necessarily imply that the distribution under consideration has a heavy tail. This can be used in practice where we do not know the distribution but can calculate a sample coefficient of variation.
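The sample coefficient of variation mentioned above can be sketched as follows (illustrative Python, parameter choices my own): a heavy-tailed Pareto Type II sample with $\kappa = 5$, $\theta = 1$ (theoretical coefficient of variation $\sqrt{5/3} \approx 1.29$) versus a light-tailed Gamma sample with shape $4$ (theoretical value $1/2$).

```python
import math
import random

random.seed(4)
n = 200_000

def sample_cv(xs):
    """Sample coefficient of variation: sample std dev / sample mean."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return math.sqrt(v) / m

# Pareto Type II via inverse transform; Gamma via the stdlib generator.
pareto = [((1.0 - random.random()) ** (-1.0 / 5.0) - 1.0) for _ in range(n)]
gamma = [random.gammavariate(4.0, 1.0) for _ in range(n)]

cv_pareto = sample_cv(pareto)
cv_gamma = sample_cv(gamma)
print(cv_pareto)  # above 1, consistent with a heavy tail
print(cv_gamma)   # below 1, consistent with a light tail
```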



3.6 Subexponential Distributions

The aim of this section is to introduce the subexponential class of distributions. Examples and key properties of this class of distributions are also given.

Definition 24 (Subexponential Class). If a distribution function $F_X(x)$, defined on $\mathbb{R}^+$, satisfies the property
$$\lim_{x \to \infty} \frac{1 - F^{*2}(x)}{1 - F(x)} = 2, \tag{3.26}$$
the distribution is said to be a member of the subexponential class$^3$, denoted by $\mathcal{S}$ [118], [56], [7].

Examples of distributions belonging to the subexponential class include the Lognormal distribution, the Pareto distribution and the Weibull distribution with $\beta \in (0, 1)$ [105]. These distributions are formally introduced in Chapter 4.

Example 11. The Exponential distribution is not a member of the subexponential class, since
$$\lim_{x \to \infty} \frac{1 - F^{*2}(x)}{1 - F(x)} = \lim_{x \to \infty} \frac{e^{-\lambda x}(1 + \lambda x)}{e^{-\lambda x}} = \lim_{x \to \infty} (1 + \lambda x) = \infty.$$
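The defining ratio (3.26) can be evaluated numerically when the convolution is intractable. The illustrative Python sketch below (my own construction, using the identity $P(X_1 + X_2 > x) = \bar{F}(x) + \int_0^x \bar{F}(x-y)f(y)\,dy$ and simple trapezoidal quadrature) shows the ratio near $2$ for a Pareto Type II distribution, against the diverging ratio $1 + \lambda x$ of Example 11.

```python
import math

def ratio_two_fold(sf, pdf, x, steps=20_000):
    """(1 - F^{*2}(x)) / (1 - F(x)) via P(S > x) = sf(x) + int_0^x sf(x-y) pdf(y) dy."""
    h = x / steps
    integral = 0.0
    for i in range(steps + 1):
        y = i * h
        w = 0.5 if i in (0, steps) else 1.0  # trapezoidal weights
        integral += w * sf(x - y) * pdf(y)
    return (sf(x) + h * integral) / sf(x)

# Pareto Type II (kappa=2, theta=1): subexponential, ratio approaches 2.
p_sf = lambda u: (1.0 + u) ** -2.0
p_pdf = lambda u: 2.0 * (1.0 + u) ** -3.0
r_pareto = ratio_two_fold(p_sf, p_pdf, 50.0)

# Exponential (lambda=1): ratio is 1 + x, diverging, so not subexponential.
e_sf = lambda u: math.exp(-u)
e_pdf = lambda u: math.exp(-u)
r_exp = ratio_two_fold(e_sf, e_pdf, 50.0)

print(r_pareto)  # close to 2
print(r_exp)     # close to 51 = 1 + x
```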

Theorem 3. Each distribution F ∈ S is heavy-tailed [105].

Once it is known that a distribution is a member of the subexponential class, it follows immediately that the distribution has a heavy tail. This requires one to know which parametric distribution is under consideration, together with its cumulative distribution function. If these are known, one can determine algebraically whether the distribution satisfies the condition set out in Definition 24.

Often it is algebraically impossible or difficult to calculate the convolution, or to integrate the convolution of the parametric cumulative distribution function. In some instances an explicit formula for the cumulative distribution function does not exist, in which case it will not be possible to derive the convolution. In practice the parametric distribution is not always known in advance, and there might not even be a suitable parametric distribution that can be fitted to the observed data. In other cases where the parametric distribution is known, the parameter values are not always known. For the Weibull distribution, for instance, it is true that it belongs to the subexponential class, but only for certain values of its parameters, as discussed

$^3$ $F^{*2}$ refers to the two-fold convolution of $F_X(x)$.


earlier in this section.

Although it might be useful to determine whether a distribution is a member of the subexponential class, this is a potentially more difficult way of determining whether a particular distribution has a heavy tail. There may, however, be cases in the literature in which these results are already provided and can be leveraged.

3.7 Practical Methods to Detect Heavy Tails

3.7.1 Overview

Classifying claims as large is difficult in that claims can only be regarded as small or large relative to a class of insured risks. First, some additional notation needs to be introduced in the context of insurance claims.

Let $\{\Theta_i,\, 1 \le i \le n\}$ denote a series of $n$ successive observed claims with the aggregate claim size as given in (2.7), with $X(t) = X_n$ and $N(t) = n$. The value of $X_n$ will be large if at least $\Theta_{(n)}$ is large, where $\Theta_{(n)}$ is the $n$th order statistic from the series of $n$ claims.

If excessively large claims are causing $X_n$ to be large, then $P(X_n > x) \approx P(\Theta_{(n)} > x)$ as $x \to \infty$. Rolski et al [105] propose a few mathematical arguments that can be used to classify claims as being large. These arguments will be discussed in the following sections.

3.7.2 Method 1: Using definition of the subexponential class

Under this argument the aim is to use the empirical cumulative distribution function $F_n(x)$ to test whether the condition for membership of the subexponential class, as given in Definition 24, holds. This requires one to show that, for large values of $x$,
$$\frac{1 - F_n^{*2}(x)}{1 - F_n(x)}$$
tends to $2$ as $x \to \infty$. One can evaluate this expression at a large (or the largest) order statistic $\Theta_{(n)}$. It is, however, not easy to show that this expression is close to or tends to $2$ using an empirical cumulative distribution function. Note that
$$F_n(x) = \frac{1}{n} \max\{i : \Theta_{(i)} \le x\} \quad \forall\, x \in \mathbb{R},$$
and that the following events are equivalent:
$$\left\{ F_n(x) = \frac{k}{n} \right\} \equiv \left\{ \Theta_{(k)} \le x < \Theta_{(k+1)} \right\}.$$



3.7.3 Method 2: Using the mean residual hazard rate func- tion

From (3.9) in Definition 14 it follows that the aim here is to determine whether the mean residual hazard rate function is increasing and therefore whether $\mu_{F_t} \to \infty$ as $t \to \infty$. If this can be confirmed, it implies that $\alpha_F = 0$ and that the distribution is heavy-tailed.

The empirical analogue of $\mu_{F_t}$ is given by
$$\mu_n\left(\Theta_{(n-k)}\right) = \frac{1}{1 - F_n\left(\Theta_{(n-k)}\right)} \int_{\Theta_{(n-k)}}^{\infty} \left(1 - F_n(y)\right)\,dy,$$

where $n$ should be large and $k$ chosen such that $\frac{k}{n} \to 0$.

From the fact that $\left\{ F_n(x) = \frac{k}{n} \right\} \equiv \left\{ \Theta_{(k)} \le x < \Theta_{(k+1)} \right\}$ it follows that

$$\mu_n\left(\Theta_{(n-k)}\right) = \int_{\Theta_{(n-k)}}^{\infty} \frac{1 - F_n(y)}{1 - F_n\left(\Theta_{(n-k)}\right)}\,dy = \sum_{i=n-k}^{n-1} \int_{\Theta_{(i)}}^{\Theta_{(i+1)}} \frac{1 - F_n(y)}{1 - F_n\left(\Theta_{(n-k)}\right)}\,dy = \frac{1}{k} \sum_{i=n-k}^{n-1} (n - i)\left(\Theta_{(i+1)} - \Theta_{(i)}\right) = \frac{1}{k} \sum_{j=n-k+1}^{n} \left(\Theta_{(j)} - \Theta_{(n-k)}\right).$$
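The final order-statistic form of the estimator is easy to implement. The illustrative Python sketch below (not the dissertation's SAS code; values are my own) checks it on Exponential data, for which the mean residual function is constant and equal to $\theta$ by memorylessness.

```python
import random

random.seed(5)
theta, n, k = 2.0, 100_000, 5_000

# Sorted sample of simulated Exponential claims (the order statistics).
claims = sorted(random.expovariate(1.0 / theta) for _ in range(n))

def mean_residual(order_stats, k):
    """(1/k) * sum_{j=n-k+1}^{n} (Theta_(j) - Theta_(n-k)), 1-based order statistics."""
    n = len(order_stats)
    base = order_stats[n - k - 1]   # Theta_(n-k) with 0-based indexing
    top = order_stats[n - k:]       # Theta_(n-k+1), ..., Theta_(n)
    return sum(x - base for x in top) / k

m_hat = mean_residual(claims, k)
print(m_hat)  # close to theta = 2 by the memoryless property
```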

The idea here is to choose a series of values for $k$ and then to evaluate the empirical mean residual hazard rate function for each of these $k$'s. One can then assess whether this series of empirical values is increasing or decreasing. Despite being able to derive these results algebraically, it is not clear how to choose suitable values for $k$ in order to show that if $\Theta_{(n-k)} \to \infty$ then $\mu_n(\Theta_{(n-k)}) \to \infty$.

In practice the empirical analogue, $\mu_n(\Theta_{(n-k)})$, can be used. If the claims distribution has a heavier tail than the proposed distribution, then $\mu_n(\Theta_{(n-k)})$ for $0 < k \le n$ will be consistently larger than $\mu_{F_t}$.

One should plot the empirical mean residual hazard function against the theoretical function to ascertain whether the claims distribution has a lighter or a heavier tail.



For both this method and the previous method it follows that while it is theoretically possible to identify heavy-tailedness, it is difficult to use these theoretical results to establish heavy-tailedness empirically.

3.7.4 Method 3: Testing for an exponentially bounded tail using quantile plots

Using Definition 19 one can argue from the concept of an exponentially bounded tail that a distribution with a lighter tail than an Exponential distribution will satisfy the inequality in (3.12). If this inequality holds, we can derive for $x \ge 0$:

$$\ln\left(\bar{F}_X(x)\right) \le \ln\left(ce^{-ax}\right)$$
$$\limsup_{x \to \infty} \frac{-\ln\left(\bar{F}_X(x)\right)}{x} \ge \limsup_{x \to \infty} \frac{-\ln\left(ce^{-ax}\right)}{x} = a > 0$$

$$\therefore \alpha_F > 0 \quad \text{whenever } \bar{F}_X(x) \le ce^{-ax}.$$
This means that the distribution does not have a heavy tail. Conversely, if $F_X(\cdot)$ satisfies the condition

$$\bar{F}_X(x) > ce^{-ax} \quad \forall\, a > 0,$$
then $\alpha_F = 0$.

Methods for detecting light tails involve comparing samples with Exponential distributions using techniques proposed by Rolski et al [105]:

- Quantile plots

- Mean residual hazard functions

Using quantile plots

Quantile plots rely on a sample of observed claim sizes $\Theta_1 = \theta_1, \Theta_2 = \theta_2, \ldots, \Theta_n = \theta_n$, where $\theta_1, \theta_2, \ldots, \theta_n$ are the observed values of the random variables $\Theta_1, \Theta_2, \ldots, \Theta_n$.

Definition 25 (Generalized Inverse Function). For a monotone increasing cumulative distribution function $F_X(x)$, the generalized inverse function $F_X^{-1}(y)$ [105] is given by
$$F_X^{-1}(y) = \inf\{x : F_X(x) \ge y\}. \tag{3.27}$$

Definition 26 (Quantile Function). The quantile function is defined as

$$Q_F(y) = F^{-1}(y) \quad \text{as given in (3.27).} \tag{3.28}$$



Given the order statistics of claim sizes $\Theta_{(1)} \le \Theta_{(2)} \le \ldots \le \Theta_{(n)}$, the following events are equivalent:

$$\left\{ Q_n(y) = \Theta_{(k)} \right\} \equiv \left\{ \frac{k - 1}{n} < y \le \frac{k}{n} \right\}.$$

Example 12. For the Exponential distribution, the cumulative distribution function in (4.1) can be used to obtain the associated quantile function

$$Q_F(y) = -\theta \ln(1 - y) \quad \text{for } 0 < y < 1.$$

One can then compare $Q_F(y)$ and $Q_n(y)$ in order to identify whether the sample is from a distribution close to an Exponential distribution. If the plot of $Q_F(y)$ on the x-axis against $Q_n(y)$ on the y-axis is close to linear with a slope close to $1$, the underlying distribution is close to the fitted Exponential distribution. If the true underlying distribution has a tail that is heavier than the tail of an Exponential distribution, then the observed quantiles are expected to increase faster than the fitted quantiles, so that the quantile plot bends away above the straight line with slope $1$.

The next example is based on simulated data and illustrates how a quantile plot can be used to assess whether the observed data has a distribution with a heavier tail than the Exponential distribution.

Example 13. Consider four sets of simulated observations of size n:

- From an Exponential distribution with parameter θ = 3.

- From a Birnbaum-Saunders distribution with parameters α = 2 and β = 1.

- From a Folded Normal distribution with parameters µ = 3 and σ = 2.

- From a Pareto Type II distribution with parameters θ = 2 and κ = 3.

These distributions are formally introduced in Chapter 4.

For each of the observed samples the following was done:

- Construct the empirical cumulative distribution function based on the observed values. These values relate to n quantiles.

- Fit an Exponential distribution to the data.


Figure 3.9: Quantile plots to assess fitted Exponential distributions

- Calculate the n quantiles as implied by the fitted Exponential distribu- tion.

- Plot the quantiles from the fitted distribution QF(y) against the observed quantiles Qn(y).
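The steps above can be sketched in a few lines. The snippet below is a hypothetical Python analogue of the SAS PROC IML programs used in this study: it simulates an Exponential sample standing in for observed claims (using the mean parameterization of (4.1)), fits an Exponential by maximum likelihood, and summarizes the quantile plot through its least-squares slope:

```python
import math
import random

random.seed(1)
n, theta = 500, 3.0
# simulated "observed" claims: Exponential with mean theta (illustrative only)
claims = sorted(random.expovariate(1 / theta) for _ in range(n))

theta_hat = sum(claims) / n  # maximum likelihood estimate of the mean
# quantiles implied by the fitted Exponential at y_k = k/(n+1)
fitted = [-theta_hat * math.log(1 - k / (n + 1)) for k in range(1, n + 1)]

# least-squares slope of observed quantiles against fitted quantiles;
# a value close to 1 indicates that the Exponential fit is adequate
sx, sy = sum(fitted), sum(claims)
sxx = sum(f * f for f in fitted)
sxy = sum(f * c for f, c in zip(fitted, claims))
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
print(round(slope, 2))
```

For the heavier-tailed samples in Example 13 the points would bend away from the line instead of producing a slope near 1.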

It can be seen that for the data simulated from the Exponential distribution the fit is good and the tail is therefore considered to be exponentially bounded. For the data simulated from the Birnbaum-Saunders distribution it can be seen from the quantile plot that the distribution has a heavier tail than the fitted Exponential distribution, suggesting that an exponential bound may perhaps not exist for this distribution. For the data simulated from the Folded Normal distribution it is evident that the tail is lighter than that of the fitted Exponential, which clearly indicates that the tail is exponentially bounded. For the data simulated from the Pareto Type II distribution it is clear that the tail is heavier than that of the fitted Exponential distribution, suggesting that an exponential bound may not exist. It is argued in Chapter 5 that the Pareto Type II distribution is heavy-tailed, while the Exponential, Birnbaum-Saunders and Folded Normal distributions are not.


Chapter 4

Parametric Loss Distributions

In this chapter we consider parametric distributions that can be used to model the severity of claims (losses associated with claims). Since claims can only take on nonnegative values, only distributions with domain in $\mathbb{R}^+$ are considered. Each of the distributions is discussed in terms of its first four moments:

- Mean - as a measure of centrality

- Variance/Standard Deviation - as a measure of spread

- Skewness

- Kurtosis

Because claims distributions exhibit skewness, the skewness of a particular distribution is of interest. Furthermore, the shapes of the distributions for different parameter values are investigated graphically. Links among variables are also indicated.

The Normal distribution is popular because its shape is suitable for particular problems and studies, its properties are well known and easy to use, and built-in functions are often available in statistical and mathematical software packages. In cases where the observed data appears to be close to symmetric, and even close to Normal, one might be interested in using the Normal distribution. The only problem is that the domain of this distribution includes negative values. Montgomery [87] considers 3-sigma limits (that is, three standard deviations from the mean) on a Normal distribution for process quality control, where this implies that for any process that is operating in an in-control state, the


probability of observing a non-conforming good produced is 0.27%. This 0.27% is a two-sided probability, meaning that the lower tail probability is 0.135%. This means that if we have data suggesting normality and the mean is greater than or equal to three times the standard deviation, a fitted distribution will imply a probability of observing a negative value of at most 0.135%. In such an instance, the Normal distribution may still prove to be useful. Alternatively, variations on the Normal distribution such as the Folded Normal, Skew-normal and Birnbaum-Saunders distributions may be useful. These distributions are discussed in sections 4.1.24, 4.1.28 and 4.1.15, respectively.

Details are provided with respect to the moments, techniques for the estimation of parameters in cases of censored and non-censored data, related distributions, and techniques for simulating from these distributions. Some of these distributions are discussed in Section 4.1 below, for which the estimation of the parameters is discussed in Chapter 6.

4.1 Continuous Distributions for Claim Sizes

4.1.1 Gamma Distribution

The probability density function for the Gamma distribution, with parameters θ and κ, is defined in (B.61), [11], [105]. The expressions for the rth moment about the origin, mean, variance, skewness and kurtosis, as given by Bain and Engelhardt [11], are given in (B.61) to (B.67). It can be seen that the skewness and kurtosis depend only on the value of κ: the distribution is always positively skewed, with the skewness decreasing towards symmetry as κ increases. Figure 4.1 shows the effect of varying the values for θ, from which it can be seen that θ is a scale parameter. Figure 4.2 shows the effect of varying the values for κ, from which it can be seen that κ is a shape parameter.
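As a quick illustration of the fitting techniques treated formally in Chapter 6, a method-of-moments fit of the Gamma parameters can be sketched as below; this is Python rather than the SAS PROC IML used in the study, and the true values κ = 4 and θ = 2 are illustrative only:

```python
import random
import statistics

random.seed(7)
kappa, theta = 4.0, 2.0  # illustrative true shape and scale
data = [random.gammavariate(kappa, theta) for _ in range(20000)]

m = statistics.fmean(data)     # sample mean, estimates kappa*theta
v = statistics.pvariance(data) # sample variance, estimates kappa*theta**2
kappa_hat = m * m / v          # method-of-moments shape estimate
theta_hat = v / m              # method-of-moments scale estimate
print(round(kappa_hat, 2), round(theta_hat, 2))
```

Matching the first two sample moments to κθ and κθ² gives closed-form estimates, which also serve as useful starting values for numerical maximum likelihood estimation.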

4.1.2 Exponential Distribution

The probability density function of a random variable X that is Exponentially distributed [105], [11], as given in (B.34), is a special case of the probability density function of the Gamma distribution on the same domain with κ = 1, using the fact that

Γ(r + 1) = rΓ(r) = ... = r!


Figure 4.1: Gamma Density Function with varying values of θ

Figure 4.2: Gamma Density Function with varying values of κ

The cumulative distribution function is given by:

$$F_X(x) = 1 - e^{-\frac{x}{\theta}} \quad \text{for } x \ge 0 \qquad (4.1)$$

The expression for the rth moment, which can be obtained using (B.63) with κ = 1, is given in (B.36). One can leverage the moments derived for the Gamma distribution to find the expressions for the moments of the Exponential distribution. These


expressions are given in (B.37), (B.38), (B.39) and (B.40). Similarly, the moment generating function can be found using the moment generating function of the Gamma distribution (B.62) with κ = 1. Since the Exponential distribution is a special case of the Gamma distribution where κ = 1, the shape will not be affected by varying values of θ, but the scale will vary. In Figure 4.3 the probability density functions for various values of θ can be seen.

Figure 4.3: Exponential Density Function with varying values of θ

From Figure 4.3 it can be seen that the probability density starts at x = 0 at a maximum level equal to $\frac{1}{\theta}$.

4.1.3 Chi-square Distribution

Consider a random variable Y from a Gamma distribution with θ = 2 and κ = ν/2. From expression (B.61) the probability density function of Y follows directly, as given in (B.13). From Bain and Engelhardt [11] it follows that Y is Chi-square distributed with ν degrees of freedom (denoted Y ∼ χ²(ν)). The rth moment can be obtained using (B.63) with θ = 2 and κ = ν/2. This expression is given in (B.15). Using the expressions for the Gamma distribution given in (B.65), (B.66) and (B.67), one can find the other moments for this distribution (where X ∼ χ²(ν)). The relationship can be stated more generally using the moment generating functions of the Chi-square and Gamma distributions, respectively (see Bain


and Engelhardt [11]). The moment generating function for a random variable Y ∼ χ²(ν) is given in (B.14). Furthermore, if X is Gamma distributed with parameters θ and κ, then Z = 2X/θ ∼ χ²(2κ). Hence the Chi-square distribution, as a special case of the Gamma distribution, does not depend on the value of θ, as this special case is where θ = 2. Figure 4.4 shows the density functions for different values of ν.

Figure 4.4: Chi-square Density Function with varying values of ν
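The relationship Z = 2X/θ ∼ χ²(2κ) can be checked quickly by simulation; the sketch below (Python, with arbitrary illustrative parameter values) compares the sample mean and variance of Z with the χ²(ν) values ν and 2ν:

```python
import random
import statistics

random.seed(3)
theta, kappa = 5.0, 1.5  # illustrative Gamma parameters
x = [random.gammavariate(kappa, theta) for _ in range(50000)]
z = [2 * xi / theta for xi in x]  # should be chi-square with 2*kappa df

nu = 2 * kappa                    # here: 3 degrees of freedom
mean_z = statistics.fmean(z)      # near nu
var_z = statistics.pvariance(z)   # near 2*nu
print(mean_z, var_z)
```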

4.1.4 Two-parameter Exponential Distribution

The probability density function of a random variable X that is said to have a Two-parameter Exponential distribution [11] is given in (B.171).

It can be seen that the Exponential distribution is a special case of the Two-parameter Exponential distribution with η = 0. For this reason the variance, skewness and kurtosis of the Two-parameter Exponential are the same as for the Exponential distribution, since the η parameter is purely a location parameter that shifts the distribution η units from the origin. For completeness, expressions for the mean, variance, skewness and kurtosis are given in Section B.29 in Appendix B.

The moment generating function in (B.172) can be found from first principles as follows:


$$
\begin{aligned}
E\left(e^{tX}\right) &= \int_\eta^\infty e^{tx}\,\frac{1}{\theta}\,e^{-\left(\frac{x-\eta}{\theta}\right)}\,dx\\
&= e^{t\eta}\int_0^\infty e^{-y(1-t\theta)}\,dy \quad \text{by letting } y = \frac{x-\eta}{\theta}\\
&= \frac{e^{t\eta}}{1-t\theta} \qquad (4.2)
\end{aligned}
$$

Consider the transformation Y = X − η.

$$F_Y(y) = P(X \le y + \eta) = 1 - e^{-\frac{y}{\theta}} \qquad (4.3)$$

From (4.3) it follows that Y ∼ EXP(θ). To derive the moments of X, one can therefore leverage the fact that $E(Y^r) = \theta^r r!$ from (B.36). Hence $E(X^r) = E((Y + \eta)^r)$. Figure 4.5 shows how the density of X varies with different values of θ, while Figure 4.6 shows how it varies with different values of η.

Figure 4.5: Double Exponential Density Function with varying values of θ
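The shift relationship Y = X − η also gives a direct way to simulate from the Two-parameter Exponential and to check its moments; a brief Python sketch with illustrative parameter values:

```python
import random
import statistics

random.seed(11)
eta, theta = 2.0, 3.0
# X = eta + Y with Y ~ EXP(theta): shift an Exponential draw by eta
xs = [eta + random.expovariate(1 / theta) for _ in range(40000)]

mean_x = statistics.fmean(xs)     # near eta + theta = 5
var_x = statistics.pvariance(xs)  # near theta**2 = 9 (shift leaves variance unchanged)
print(mean_x, var_x)
```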

4.1.5 Erlang Distribution

A random variable X that is said to have an Erlang distribution with pa- rameters n and λ, denoted as X ∼ ERL(n, λ), has a probability density function as given by (B.25), [105].


Figure 4.6: Double Exponential Density Function with varying values of η

If one considers a set of n random variables (X1, X2, ..., Xn), where the individual random variables are independent, identically Exponentially distributed with parameter θ, and lets

$$Z = \sum_{i=1}^{n} X_i,$$

then the distribution of Z can be derived using the moment generating function of the Exponential distribution as given in (B.35). From Bain and Engelhardt [11] it follows that the moment generating function for Z is exactly the same as the moment generating function of a random variable that has an Erlang distribution with parameters λ and n. It can therefore be concluded that an Erlang distribution is the distribution of a sum of n independent and identically EXP(1/λ) distributed random variables. Klugman et al [75] consider aggregate severity distributions which are representative of a portfolio of losses. In such cases the overall severity is an aggregation of the individual losses. The Erlang distribution is a specific case where the severity distribution is a discrete (since n is an integer value) mixture of Exponential distributions.

The moments for the Erlang distribution can be found using (B.27), which is a general formula for its rth moment. The mean, variance, skewness and kurtosis are given in (B.28), (B.29), (B.30) and (B.31), respectively. Rolski et al [105] also state that the sum of two independent Erlang distributed random variables with the same parameter λ will again be Erlang distributed; that is, if X1 ∼ ERL(n1, λ) and X2 ∼ ERL(n2, λ), then Y = (X1 + X2) ∼ ERL(n1 + n2, λ). This follows from general reasoning in that the relationship between the sum of independent, identically distributed Exponential random variables and the Erlang distribution holds for any n ∈ ℕ. Since the result holds for any n, it holds for n1 and n2, and therefore also for n1 + n2. Figure 4.7 shows how the density of X varies with different values of λ, while Figure 4.8 shows how it varies for different values of n.

Figure 4.7: Erlang Density Function with varying values of λ
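The sum representation above is also a convenient way to simulate Erlang variates; the sketch below (illustrative n and λ) checks the sample mean and variance against n/λ and n/λ²:

```python
import random
import statistics

random.seed(5)
n_shape, lam = 4, 2.0
# one ERL(n, lambda) draw = sum of n iid Exponential variables with rate lambda
draws = [sum(random.expovariate(lam) for _ in range(n_shape))
         for _ in range(30000)]

mean_z = statistics.fmean(draws)     # near n/lambda = 2
var_z = statistics.pvariance(draws)  # near n/lambda**2 = 1
print(mean_z, var_z)
```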

4.1.6 Extreme Value Distribution

Hosking et al [68] use the Generalized Extreme Value distribution, as introduced by Jenkinson [70], to derive probability weighted moments for this distribution.

$$F_X(x) = \begin{cases} e^{-\left(1 - k\frac{x-\xi}{\alpha}\right)^{1/k}} & k \neq 0,\\[4pt] e^{-e^{-\frac{x-\xi}{\alpha}}} & k = 0. \end{cases} \qquad (4.4)$$

X is bounded by ξ + α/k from above if k > 0 and from below if k < 0, where ξ ∈ ℝ and α > 0. Usually the shape parameter k lies in (−1/2; 1/2). This form of the distribution is a combination of three possible types of limiting distributions used for extreme values, originally derived by Fisher and Tippett [54], [76], [31]. There are essentially three distinct forms of the Extreme Value distributions as encapsulated by the generalized form. These forms stem from the resulting distributions for different values of k [98].


Figure 4.8: Erlang Density Function with varying values of n

- Frechet, when k < 0

- Weibull, when k > 0

- Gumbel, when k = 0

These distributions are particularly useful in accurately modeling large claims in non-life insurance policies for which other distributions potentially lack goodness-of-fit for claims occurring in the upper tails [39]. These three distributions are discussed in Sections 4.1.7, 4.1.8 and 4.1.10.

A concept of probability weighted moments was introduced by Greenwood [59]. This is a generalization of the usual moments and provides a way in which the parameters of the three extremal distributions can be estimated.

Definition 27. Probability Weighted Moments
The probability weighted moments of a random variable X with a distribution function F(x) = P(X ≤ x) are given by

$$M_{p,r,s} = E\left(X^p (F(X))^r (1 - F(X))^s\right). \qquad (4.5)$$

The usual pth order non-central moment of X is given by $M_{p,0,0}$.

The inverse distribution function of F(x) in (4.4) is given by

$$x = x(F) = \begin{cases} \frac{\alpha}{k}\left(1 - (-\ln(F))^k\right) + \xi & k \neq 0,\\ \xi - \alpha\ln(-\ln(F)) & k = 0. \end{cases}$$


Deriving moments for X where k ≠ 0

To derive the moments for X for the case where k ≠ 0, one can follow Greenwood's approach using probability weighted moments.

For p = 1:

$$
\begin{aligned}
M_{1,r,0} &= E\left(X (F(X))^r (1 - F(X))^0\right)\\
&= \int_0^1 x(F)\,F^r\,dF\\
&= \int_0^1 F^r\left(\xi + \frac{\alpha}{k}\left(1 - (-\ln(F))^k\right)\right)dF\\
&= \int_0^\infty \left(\xi + \frac{\alpha}{k}\left(1 - w^k\right)\right)e^{-w(r+1)}\,dw \quad \text{by letting } w = -\ln(F)\\
&= \left(\xi + \frac{\alpha}{k}\right)\int_0^\infty e^{-w(r+1)}\,dw - \frac{\alpha}{k}\int_0^\infty w^k e^{-w(r+1)}\,dw\\
&= \left(\xi + \frac{\alpha}{k}\right)\frac{1}{r+1} - \frac{\alpha}{k}(r+1)^{-(k+1)}\int_0^\infty t^k e^{-t}\,dt \quad \text{by letting } t = w(r+1)\\
&= \frac{1}{r+1}\left[\left(\xi + \frac{\alpha}{k}\right) - \frac{\alpha}{k}\frac{\Gamma(k+1)}{(r+1)^k}\right]
\end{aligned}
$$

For p = 2:

$$
\begin{aligned}
M_{2,r,0} &= E\left(X^2 (F(X))^r (1 - F(X))^0\right)\\
&= \int_0^1 x^2(F)\,F^r\,dF\\
&= \int_0^1 F^r\left(\xi + \frac{\alpha}{k}\left(1 - (-\ln(F))^k\right)\right)^2 dF
\end{aligned}
$$

Hence, writing $A = \frac{\alpha}{k} + \xi$ and $B = \frac{\alpha}{k}$ and expanding the square,

$$
\begin{aligned}
M_{2,r,0} &= \int_0^1 \left(A^2 - 2AB(-\ln(F))^k + B^2(-\ln(F))^{2k}\right)F^r\,dF\\
&= \frac{A^2}{r+1} - 2AB\int_0^1 (-\ln(F))^k F^r\,dF + B^2\int_0^1 (-\ln(F))^{2k} F^r\,dF\\
&= \frac{\left(\frac{\alpha}{k} + \xi\right)^2}{r+1} - 2\left(\frac{\alpha^2}{k^2} + \frac{\alpha\xi}{k}\right)\frac{\Gamma(k+1)}{(r+1)^{k+1}} + \frac{\alpha^2}{k^2}\,\frac{\Gamma(2k+1)}{(r+1)^{2k+1}} \qquad (4.6)
\end{aligned}
$$

where the two remaining integrals follow from the substitutions $w = -\ln(F)$ and $t = w(r+1)$, exactly as in the derivation of $M_{1,r,0}$.

The derivation of $M_{3,r,0}$ follows similarly to that of $M_{1,r,0}$ and $M_{2,r,0}$, but with the algebra being a bit more involved. Writing $A = \frac{\alpha}{k} + \xi$ and $B = \frac{\alpha}{k}$,

$$
\begin{aligned}
M_{3,r,0} &= E\left(X^3 (F(X))^r (1 - F(X))^0\right)\\
&= \int_0^1 (x(F))^3 F^r\,dF\\
&= \int_0^1 \left(A - B(-\ln(F))^k\right)^3 F^r\,dF\\
&= A^3\int_0^1 F^r\,dF - 3A^2B\int_0^1 (-\ln(F))^k F^r\,dF\\
&\quad + 3AB^2\int_0^1 (-\ln(F))^{2k} F^r\,dF - B^3\int_0^1 (-\ln(F))^{3k} F^r\,dF \qquad (4.7)
\end{aligned}
$$

In (4.7) two distinct types of integral can be identified, with simplifications as given below:

$$\int_0^1 F^r\,dF = \frac{1}{r+1} \qquad (4.8)$$


and

$$\int_0^1 (-\ln(F))^{jk} F^r\,dF = \int_0^\infty \left(\frac{w}{r+1}\right)^{jk}\frac{e^{-w}}{r+1}\,dw = \frac{\Gamma(jk+1)}{(r+1)^{jk+1}} \qquad (4.9)$$

where j = 1, 2, 3 and the substitution $F = e^{-\frac{w}{r+1}}$ has been used.

Using the results in (4.8) and (4.9) one can find the final expression for $M_{3,r,0}$:

$$M_{3,r,0} = \frac{A^3}{r+1} - 3A^2B\,\frac{\Gamma(k+1)}{(r+1)^{k+1}} + 3AB^2\,\frac{\Gamma(2k+1)}{(r+1)^{2k+1}} - B^3\,\frac{\Gamma(3k+1)}{(r+1)^{3k+1}} \qquad (4.10)$$

with $A = \frac{\alpha}{k} + \xi$ and $B = \frac{\alpha}{k}$.

Hence if k ≠ 0, the first three moments about the origin can be found from $M_{1,r,0}$, $M_{2,r,0}$ and $M_{3,r,0}$ with r = 0:

$$E(X) = M_{1,0,0} = \left(\xi + \frac{\alpha}{k}\right) - \frac{\alpha}{k}\Gamma(k+1) \qquad (4.11)$$

and consequently the expression for the variance can be found:

$$\text{var}(X) = \frac{\alpha^2}{k^2}\left(\Gamma(2k+1) - (\Gamma(k+1))^2\right) \qquad (4.12)$$
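Expression (4.11) can be verified numerically, since $E(X) = \int_0^1 x(F)\,dF$; the Python sketch below applies a midpoint rule to the inverse distribution function for one illustrative parameter set (ξ = 0, α = 1, k = 0.2):

```python
import math

xi, alpha, k = 0.0, 1.0, 0.2  # illustrative GEV parameters with k != 0

def x_of_F(F):
    """Inverse distribution function of the Generalized Extreme Value cdf."""
    return xi + (alpha / k) * (1 - (-math.log(F)) ** k)

# E(X) = integral of x(F) dF over (0, 1), approximated by the midpoint rule
N = 200000
mean_num = sum(x_of_F((i + 0.5) / N) for i in range(N)) / N

mean_formula = (xi + alpha / k) - (alpha / k) * math.gamma(k + 1)  # (4.11)
print(mean_num, mean_formula)
```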

Deriving moments for X where k = 0

The derivations of these moments can be done similarly to the approach of Fisher and Tippett [54], who consider each of the three possible cases that the Generalized Extreme Value distribution can take on. For the case where k = 0 (which relates to the Gumbel distribution) they derive moments that have the same form as the expressions for the mean, variance, skewness and kurtosis as given in (B.80) to (B.83) with ξ = 0 and α = 1. These moments and the derivation thereof will be studied in more detail in Section 4.1.10.


4.1.7 Frechet Distribution

The generalized form as proposed by Jenkinson [70] for maxima [98] is given below:

$$F(x) = e^{-\left(1 - \frac{k(x-\xi)}{\alpha}\right)^{1/k}} \quad \text{with } k < 0,\ \alpha > 0 \text{ and } \xi \in \mathbb{R} \qquad (4.13)$$

The form as given in (4.13) is for maxima. The form for minima is also given in [98]. Essentially this means that for n observed independent, identically distributed random variables, X1,X2, ..., Xn, the maximum will exhibit an Extreme Value distribution. Consider the order statistics X(1),X(2), ..., X(n), where X(1) ≤ X(2) ≤ ... ≤ X(n). The cumulative distribution function for the X(n) is given by

$$
\begin{aligned}
F(x) &= P(X_{(n)} \le x)\\
&= P(\{X_{(1)} \le x\} \cap \{X_{(2)} \le x\} \cap \ldots \cap \{X_{(n)} \le x\})\\
&= P(\{X_1 \le x\} \cap \{X_2 \le x\} \cap \ldots \cap \{X_n \le x\})\\
&= P(\{X_1 \le x\})\,P(\{X_2 \le x\})\ldots P(\{X_n \le x\}) \quad \text{by independence of } X_1, X_2, \ldots, X_n\\
&= F_X(x)F_X(x)\ldots F_X(x) \quad \text{since } X_1, X_2, \ldots, X_n \text{ are identically distributed}\\
&= (F_X(x))^n
\end{aligned}
$$

where $F_X(x)$ is the cumulative distribution function of the $X_i$'s.
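The product form $F(x) = (F_X(x))^n$ is easy to confirm empirically; the sketch below uses Uniform(0,1) variables purely for illustration, for which $P(X_{(n)} \le x) = x^n$:

```python
import random

random.seed(2)
n, trials, x0 = 5, 20000, 0.8
# empirical P(max of n Uniform(0,1) draws <= x0) against F(x0)**n = x0**n
hits = sum(max(random.random() for _ in range(n)) <= x0
           for _ in range(trials))
emp = hits / trials
exact = x0 ** n
print(emp, exact)
```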

If one considers the maximum of a set of independent, identically distributed random variables (with distribution function given by $F_X(x)$), the maximum is said to exhibit an Extreme Value distribution. If this distribution is assumed to be for the case where k < 0, then the maximum is said to exhibit a Frechet distribution. In this case it is then true that:

$$F(x) = (F_X(x))^n = e^{-\left(\frac{\delta}{x-\lambda}\right)^\beta} \qquad (4.14)$$

by letting $\lambda = \frac{\alpha}{k} + \xi$, $\delta = -\frac{\alpha}{k}$ and $\beta = -k^{-1}$. The form of the cumulative distribution function in (4.14) is for the Frechet distribution as given by Park and Sohn [98] and is valid ∀ x ≥ λ.

If one considers an observed sample {Y1,Y2, ...Yn} of independent, identically distributed random variables, the order statistics of the observed sample are given by {Y(1),Y(2), ...Y(n)}. The distribution of Y(1) can be found by using (4.14) if it is true that Y(i) ≤ λ. If this is true, one can find a series

{−X(n) + 2λ, −X(n−1) + 2λ, ..., −X(1) + 2λ}


where

$$-X_{(n)} \le -X_{(n-1)} \le \ldots \le -X_{(1)} \quad \text{and} \quad Y_{(i)} = -X_{(n-i+1)} + 2\lambda \ \text{ for } i = 1, 2, \ldots, n.$$

$$
\begin{aligned}
\therefore P(Y_{(1)} \le y) &= P(-X_{(n)} + 2\lambda \le y)\\
&= 1 - P(X_{(n)} < 2\lambda - y)\\
&= \begin{cases} 1 - e^{-\left(\frac{\delta}{2\lambda - y - \lambda}\right)^\beta} & \text{if } 2\lambda - y \ge \lambda, \text{ using (4.14)}\\ 1 - 0 & \text{otherwise.}\end{cases}\\
&= \begin{cases} 1 - e^{-\left(\frac{\delta}{\lambda - y}\right)^\beta} & \text{if } y \le \lambda,\\ 1 & \text{otherwise.}\end{cases}
\end{aligned}
$$

The focus of the study is on distributions that can be used to model claim sizes, and therefore the focus is on the form of the Frechet distribution for maxima. The probability density function, where X ≥ λ, can be found by considering the first order derivative of (4.14). This expression is given in (B.56). The probability density function can now be used to obtain the moments about the origin for $X_{(n)}$. The general expression is given in (B.57).

From (B.57) and using the properties of the Gamma function, the expressions for the mean, variance and skewness can be found. These expressions are presented in (B.58), (B.59) and (B.60). Graphs can be constructed to study the behaviour of the shape and scale of this distribution for various sets of parameters. Consider Figures 4.9, 4.10 and 4.11, showing the probability density functions for various values of β, δ and λ, respectively. It is clear from Figure 4.9 that the shape of the density function is affected by the value of β, which suggests that β is a shape parameter. It can be seen that the value of δ affects the scale of the density function (see Figure 4.10). Lastly, it can be seen from Figure 4.11 that λ only affects the location of the distribution, hence being a location parameter.

4.1.8 Weibull Distribution

Consider the expression for the Generalized Extreme Value distribution (4.4) for values of k larger than 0 as given in section 4.1.6.

$$F(y) = e^{-\left(\frac{\lambda - y}{\delta}\right)^\beta} \quad \text{where } \delta = \frac{\alpha}{k},\ \lambda = \frac{\alpha}{k} + \xi \ \text{and}\ \beta = \frac{1}{k}. \qquad (4.15)$$


Figure 4.9: Frechet Density Function with varying values of β

Figure 4.10: Frechet Density Function with varying values of δ

The cumulative distribution function given in (4.15) is referred to as the standardized Weibull distribution and is only valid for values of Y ≤ λ. Also note that since this is for the case where k > 0, it means that β > 0. Klugman et al [75] state that this standardized form of the Weibull distribution is associated with the distribution of minimum values of distributions. This form is different from the well-known form given in (4.16). The standardized Weibull is also further associated with distributions that have a


Figure 4.11: Frechet Density Function with varying values of λ

finite right-hand endpoint (in which case it is definitely not suitable for modelling large claims).

The random variable Y associated with the standardized Weibull has a domain defined for values less than or equal to λ and has a distribution with a finite upper tail. If we consider the converse, where a variable is defined for values of λ or more and does not have a finite upper tail, it becomes useful for the modelling of large claims. Consider the transformation X = λ − Y and let δ = θ; then

$$F(x) = 1 - F_Y(\lambda - x) = \begin{cases} 1 - \exp\left(-\left(\frac{x}{\theta}\right)^\beta\right) & \text{if } x > 0,\\ 0 & \text{otherwise.}\end{cases} \qquad (4.16)$$

The form as given in (4.16) is the expression often used in the literature [11], [105] and [75]. The probability density function in (B.179) is given by [75]. The rth moment, given in (B.180), follows directly from the probability density function by using the properties of the Gamma function:

$$E(X^r) = \int_0^\infty x^r\,\frac{\beta}{\theta^\beta}\,x^{\beta-1}\,e^{-\left(\frac{x}{\theta}\right)^\beta}\,dx = \theta^r\,\Gamma\left(\frac{r}{\beta} + 1\right) \quad \text{with } r > -\beta.$$
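The moment formula above can be checked against simulated Weibull claims; a short Python sketch with illustrative θ and β, using the standard library generator:

```python
import math
import random
import statistics

random.seed(9)
theta, beta = 2.0, 1.5
# random.weibullvariate takes the scale (theta) first, then the shape (beta)
xs = [random.weibullvariate(theta, beta) for _ in range(50000)]

m1 = statistics.fmean(xs)
m2 = statistics.fmean(x * x for x in xs)
e1 = theta * math.gamma(1 / beta + 1)       # E(X) from the formula
e2 = theta ** 2 * math.gamma(2 / beta + 1)  # E(X^2) from the formula
print(m1, e1, m2, e2)
```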


Using (B.180), the expressions in (B.181), (B.182) and (B.183) for the mean, variance and skewness can be obtained. The moment generating function for the Weibull distribution does not exist [75], [105], [11].

Figures 4.12 and 4.13 shed some light on how the probability density function varies in terms of its shape and scale for different values of θ and β. It should be noted that Figure 4.12 shows various density functions for different values of θ while keeping the value of β constant at 2. The special case where β = 2 is referred to as the Rayleigh Distribution and is discussed in Section 4.1.9.

Figure 4.12: Weibull Density Function with varying values of θ

4.1.9 Rayleigh Distribution

The Rayleigh distribution is a special case of the Weibull, as given in (B.179), with β = 2 [11]. Using (B.179), the probability density function given in (B.146) can be obtained. The mean, variance and skewness follow directly from (B.180), (B.181), (B.182) and (B.183). Since the moment generating function for the Weibull distribution does not exist, the moment generating function does not exist for the Rayleigh distribution either.

4.1.10 Gumbel Distribution

Again, we start off with the generalized extreme value representation in (4.4) of Jenkinson [70]. The case where k = 0 relates to the Gumbel distribution [98], [76], [75]. This form is suitable for the distribution of the maxima from


Figure 4.13: Weibull Density Function with varying values of β

vector samples [98]. This means that if we were to observe multiple sets of claims, we can identify the largest claim from each set of claims (the maximum). This will give us a series of maxima. Let the random variables for any one set of claims be denoted by (X1, X2, ..., Xn), with X(n) the random variable for the largest order statistic of the set of claims. The largest claim will be the observed value for the maximum, denoted by X(n). The Gumbel distribution is then a potential distribution that can be used to model the maxima of the sets of claims.

The associated distribution for the minima can be obtained by defining a series (Y1,Y2, ...Yn), where Yi = (2ξ − Xi) and ξ > 0. Since we also have a series (X1,X2, ...Xn), we can obtain two series of order statistics (Y(1),Y(2), ...Y(n)) and (X(1),X(2), ...X(n)) where Y(i) = 2ξ − X(n−i+1). In particular Y(1) = 2ξ − X(n). The cumulative distribution function can then be derived for the minima:

$$F(y) = P(Y_{(1)} \le y) = 1 - e^{-e^{-\left(\frac{\xi - y}{\alpha}\right)}}$$

The probability density function of the Gumbel distribution can be found by taking the first order derivative of (4.4) for the case where k = 0. A generalization of the Gumbel distribution is given by Adeyemi and Ojo [2] in terms of the probability density function:

$$f(u) = \frac{\lambda^\lambda}{\Gamma(\lambda)}\,e^{-\lambda u - \lambda e^{-u}}, \quad \text{where } -\infty < u < \infty,\ \lambda > 0 \qquad (4.17)$$


If one lets $u = \frac{x - \xi}{\alpha}$ and λ = 1, then the expression in (4.17) reduces to the probability density function of the Gumbel distribution as given in (B.79). To find the moments of X, consider the relationship between the moment generating function of X and the moment generating function of U:

$$
\begin{aligned}
M_X(t) &= E\left(e^{tX}\right)\\
&= \int_{-\infty}^\infty e^{tx}\,\frac{1}{\alpha}\,e^{-\left(e^{-\frac{x-\xi}{\alpha}} + \frac{x-\xi}{\alpha}\right)}\,dx\\
&= e^{t\xi}\int_{-\infty}^\infty e^{t\alpha u}\,e^{-\left(e^{-u} + u\right)}\,du \quad \text{by letting } u = \frac{x-\xi}{\alpha}\\
&= e^{t\xi}M_U(\tau) \quad \text{where } \tau = t\alpha \qquad (4.18)
\end{aligned}
$$

The moment generating function for U is given by Adeyemi and Ojo:

$$M_U(\tau) = \frac{\lambda^\tau}{\Gamma(\lambda)}\,\Gamma(\lambda - \tau) \qquad (4.19)$$

They define the cumulant generating function in terms of the moment generating function as:

$$c_U(\tau) = \ln(M_U(\tau)) = \tau\ln(\lambda) + \ln(\Gamma(\lambda - \tau)) - \ln(\Gamma(\lambda)) \qquad (4.20)$$

From (4.20) the cumulant generating function for X can be derived as

$$
\begin{aligned}
c_X(t) &= \ln(M_X(t))\\
&= \ln\left(e^{t\xi}M_U(\tau)\right) \quad \text{from (4.18)}\\
&= t\xi + c_U(\tau)\\
&= t\xi + \ln(\Gamma(1 - \alpha t)) \quad \text{if } \lambda = 1 \qquad (4.21)
\end{aligned}
$$

The Maclaurin series expansion of (4.21) at the point 0 [111] yields

$$c_X(t) = c_X(0) + \frac{c'_X(0)}{1!}t + \frac{c''_X(0)}{2!}t^2 + \frac{c'''_X(0)}{3!}t^3 + \ldots = 0 + \zeta_1\frac{t}{1!} + \zeta_2\frac{t^2}{2!} + \zeta_3\frac{t^3}{3!} + \ldots \quad \text{where } \zeta_r = \left.\frac{d^r}{dt^r}c_X(t)\right|_{t=0} \qquad (4.22)$$

From Adeyemi and Ojo it follows that

$$\left.\frac{d^r}{dt^r}\ln\Gamma(\lambda - t)\right|_{t=0} = (r-1)!\left(\sum_{j=1}^{\infty} j^{-r} - \sum_{j=1}^{\lambda-1} j^{-r}\right) = (r-1)!\sum_{j=1}^{\infty} j^{-r} \quad \text{since } \lambda = 1
$$


Hence, the cumulants can be evaluated using (4.22) and the expressions given by Adeyemi and Ojo [2] for the first four cumulants:

$$
\begin{aligned}
\zeta_1 &= \xi + \left.\frac{d}{dt}\ln(\Gamma(1 - \alpha t))\right|_{t=0}\\
&= \xi + \alpha\left(\gamma - \sum_{j=1}^{\lambda-1} j^{-1}\right)\\
&= \xi + \alpha\gamma \quad \text{where } \gamma \equiv \text{Euler-Mascheroni constant ([2], [75], [76], [111])}\\
&= \xi + 0.57721566\alpha
\end{aligned}
$$

$$
\begin{aligned}
\zeta_2 &= \left.\frac{d^2}{dt^2}\ln(\Gamma(\lambda - \alpha t))\right|_{t=0}\\
&= \alpha^2\sum_{j=1}^{\infty} j^{-2}\\
&= \frac{\pi^2}{6}\alpha^2 \quad \text{from [111], [2], [76]} \qquad (4.23)\\
&= 1.64493407\alpha^2
\end{aligned}
$$

$$
\begin{aligned}
\zeta_3 &= \left.\frac{d^3}{dt^3}\ln(\Gamma(\lambda - \alpha t))\right|_{t=0}\\
&= 2\alpha^3\sum_{j=1}^{\infty} j^{-3}\\
&\approx 2(1.2021)\alpha^3 \quad \text{from [111], [2]}\\
&= 2.4042\alpha^3
\end{aligned}
$$

$$
\begin{aligned}
\zeta_4 &= \left.\frac{d^4}{dt^4}\ln(\Gamma(\lambda - \alpha t))\right|_{t=0}\\
&= 6\alpha^4\sum_{j=1}^{\infty} j^{-4}\\
&= \frac{\pi^4}{15}\alpha^4 \quad \text{from [2]}
\end{aligned}
$$

If we consider the form of the probability density function as given in (B.79), the effects of varying the values of α and ξ can be seen in Figures 4.14 and 4.15. It is clear from these figures that α is the scale parameter while ξ affects the location.
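The first two cumulants can be checked by inverse-cdf simulation from (4.4) with k = 0; a Python sketch with illustrative ξ and α:

```python
import math
import random
import statistics

random.seed(4)
xi, alpha = 1.0, 2.0
# inverse of F(x) = exp(-exp(-(x - xi)/alpha)) applied to Uniform(0,1) draws
xs = [xi - alpha * math.log(-math.log(random.random())) for _ in range(60000)]

gamma_e = 0.57721566  # Euler-Mascheroni constant
mean_x = statistics.fmean(xs)     # near zeta_1 = xi + gamma_e*alpha
var_x = statistics.pvariance(xs)  # near zeta_2 = (pi**2/6)*alpha**2
print(mean_x, xi + gamma_e * alpha, var_x, math.pi ** 2 * alpha ** 2 / 6)
```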


Figure 4.14: Gumbel Density Function with varying values of α

Figure 4.15: Gumbel Density Function with varying values of ξ

4.1.11 Pareto (Lomax) Distribution

The probability density function for a random variable X that is said to have a Pareto Type II distribution [11] is given in (B.139). The probability density function can be rewritten as follows [75]:

$$f_X(x) = \frac{\kappa\theta^\kappa}{(x + \theta)^{\kappa+1}} \quad \text{for } x > 0 \qquad (4.24)$$


In this form it is also known as the Lomax distribution [75]. Lomax originally used the following functional form in 1954 as a cumulative distribution function [113]:

$$F_X(x) = 1 - \left(\frac{a}{x + a}\right)^k \qquad (4.25)$$

in order to model business failure data. The associated probability density function (which can be found by taking the first order derivative of (4.25) with respect to x) is the same as expression (B.139) with κ = k and θ = a.

The general expression to calculate the rth moment (B.140) can be obtained using the properties of the beta function (as defined in Definition 28).

Definition 28. Beta Function
Let B(x, y) be defined for x, y in P = {x ∈ ℝ : x > 0} by

$$B(x, y) = \int_{0^+}^{1^-} t^{x-1}(1 - t)^{y-1}\,dt = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)} \qquad (4.26)$$

This integral will be proper if x ≥ 1 and y ≥ 1. If 0 < x < 1 or 0 < y < 1, the integral is improper [15].

$$
\begin{aligned}
E(X^r) &= \int_0^\infty x^r\,\frac{\kappa}{\theta}\left(1 + \frac{x}{\theta}\right)^{-(\kappa+1)}dx\\
&= \theta^r\kappa\int_0^1 (1 - y)^{(r+1)-1}\,y^{(\kappa-r)-1}\,dy \quad \text{by letting } y = \frac{\theta}{\theta + x}\\
&= \theta^r\,\frac{\Gamma(r+1)\Gamma(\kappa - r)}{\Gamma(\kappa)} \quad \text{by using Definition 28}
\end{aligned}
$$
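The beta-function result can be verified by inverse-cdf simulation from the Pareto Type II cumulative distribution function; a Python sketch with illustrative θ and κ (κ > r is required for the rth moment to exist):

```python
import math
import random
import statistics

random.seed(8)
theta, kappa = 2.0, 5.0
# invert F(x) = 1 - (theta/(theta + x))**kappa at Uniform(0,1) draws
xs = [theta * ((1 - random.random()) ** (-1 / kappa) - 1) for _ in range(80000)]

r = 1
exact = theta ** r * math.gamma(r + 1) * math.gamma(kappa - r) / math.gamma(kappa)
mean_x = statistics.fmean(xs)
print(mean_x, exact)  # exact = theta/(kappa - 1) = 0.5 for these values
```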

The first four moments about the origin follow directly from (B.140). These moments can then be used to find the expressions for the mean, variance and skewness given in (B.141), (B.142) and (B.143). In Section 4.1.12 it will be shown that the Pareto Type II distribution is a special case of the Generalized Pareto distribution. For the purpose of calculating the skewness of a random variable X from a Pareto Type II distribution, one can also leverage the expression derived for the Generalized Pareto, given in (B.78).


The moment generating function for the Pareto Type II distribution does not exist. It is therefore not possible to study the heaviness of the tail of this distribution by means of using a moment generating function. The cumulative distribution function can, however, be used and is given below:

$$F_X(x) = 1 - \left(\frac{\theta}{\theta + x}\right)^\kappa \quad \text{for } x > 0 \qquad (4.27)$$

The effects of varying the parameter values (that is for κ and θ) can be seen in Figures 4.16 and 4.17.

Figure 4.16: Pareto Type II Density Function with varying values of κ

4.1.12 Generalized Pareto Distribution

Denuit et al [39] indicate the use of the Generalized Pareto distribution in accurately modeling large claims in non-life insurance policies for which other distributions potentially lack goodness-of-fit for claims occurring in the upper tails. An expression for the probability density function, as presented by Klugman et al [75], for a random variable defined on $\mathbb{R}^+$ that is said to have a Generalized Pareto distribution is given in (B.74).

An expression to calculate the rth moment is given by Klugman et al [75]. This expression, as given in (B.75), can be used to find expressions for the moments about the origin directly, from which expressions for the mean, variance and skewness follow - see (B.76) to (B.78).


Figure 4.17: Pareto Type II Density Function with varying values of θ

If we let τ = 1 in the probability density function (B.74), then it reduces to:

$$f_X(x) = \frac{\kappa\theta^\kappa}{(x + \theta)^{\kappa+1}}$$

which is the same as the expression in (B.139).

The Pareto Type II (Lomax) distribution is therefore a special case of the Generalized Pareto distribution. Consequently the mean, variance and skewness of the Pareto Type II distribution can be calculated directly from (B.76), (B.77) and (B.78). The effect of using different values for the parameters $\kappa$, $\tau$ and $\theta$ can be seen in Figures 4.18, 4.19 and 4.20.

4.1.13 Lognormal Distribution

A random variable $X$ that has a Lognormal distribution (denoted $X \sim LN(\mu, \sigma^2)$) has a probability density function given by [11], [75]

$f_X(x) = \dfrac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{(\ln(x)-\mu)^2}{2\sigma^2}}$ for $x > 0$, with $-\infty < \mu < \infty$ and $\sigma > 0$.

There exists a relationship, given in Result 2, between the Lognormal and Normal distributions that is quite useful. We will use this relationship to derive moments for the Lognormal distribution.

Result 2. A random variable Y ∼ LN(µ, σ2) ⇐⇒ the random variable X = ln(Y ) ∼ N(µ, σ2) [82], [11].


Figure 4.18: Generalized Pareto Density Function with varying values of κ

Figure 4.19: Generalized Pareto Density Function with varying values of τ

Expression (B.127) for the $r$th moment of a random variable $X$ that is $LN(\mu, \sigma^2)$ distributed can be obtained using Result 2 (with $Y = \ln(X) \sim N(\mu, \sigma^2)$):

$E(X^r) = E\left((e^Y)^r\right) = M_Y(r)$,

where $M_Y(r)$ is the moment generating function of $Y$, which is Normally distributed with parameters $\mu$ and $\sigma^2$. From (B.127) the mean as well as the rest of the first four moments about the origin can be obtained, from which


Figure 4.20: Generalized Pareto Density Function with varying values of θ

expressions (B.129) and (B.130) for the variance and skewness follow.

It is clear from (B.130) that the distribution is positively skewed for all values of $\sigma$, which makes it suitable for modeling insurance claims. Figures 4.21 and 4.22 illustrate this skewness as well as how the scale and shape of the distribution are affected by varying the values of $\mu$ and $\sigma$.
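Result 2 makes the moments immediate: $E(X^r) = M_Y(r) = e^{r\mu + r^2\sigma^2/2}$. As an illustrative sketch (in Python rather than the dissertation's SAS PROC IML), this closed form can be checked against simulated $e^Y$ values:

```python
import math
import random

def lognormal_moment(r, mu, sigma):
    """E(X^r) for X ~ LN(mu, sigma^2): the Normal mgf of Y evaluated at r."""
    return math.exp(r * mu + 0.5 * (r * sigma) ** 2)

random.seed(2)
mu, sigma = 0.5, 0.4
# Simulate X = exp(Y) with Y ~ N(mu, sigma^2), per Result 2.
xs = [math.exp(random.gauss(mu, sigma)) for _ in range(200_000)]
print(sum(xs) / len(xs), lognormal_moment(1, mu, sigma))
```

Setting $r = 1$ recovers the familiar mean $e^{\mu + \sigma^2/2}$.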

Figure 4.21: Lognormal Density Function with varying values of µ


Figure 4.22: Lognormal Density Function with varying values of σ

In the literature it is argued that the Lognormal distribution is popular for modeling non-life insurance claims sizes [41], for example for modeling motor insurance claim data [104]. While there may be instances in which the data appear to be close to normal, the Normal distribution may lack realism in the context of modeling claims data in that Normal random variables can also be negative. Luenberger specifically highlights this shortcoming when considering distributions for modeling stock prices [82]. He concludes that the log of the change in stock prices appears to be normal, but mentions that real-life stock prices suggest heavier tails and higher concentration around the mean, with somewhat larger weight attached to larger changes, which suggests some degree of skewness. It is in this context that Goerg [58] argues for the use of skew distributions even when data appear to be close to normal, but with a slight skewness, and therefore supports the view of considering the Lognormal distribution in the context of modeling claims size data.

4.1.14 Beta-prime Distribution (Pearson Type VI)

Fabián [51] considers the Beta-prime distribution function in the context of finding an appropriate measure of central tendency. In this context the following form is considered:

$f_{p,q}(x) = \dfrac{1}{xB(p,q)} \cdot \dfrac{x^p}{(x+1)^{p+q}}$ for $x \ge 0$, with $p, q > 0$ \qquad (4.28)


This form of the distribution is also known as the Pearson Type VI distribution.

Bradlow et al [21] consider this distribution as a prior distribution for the parameter of the negative binomial distribution when performing Bayesian inference. The form they consider is a reparameterization of the form considered by Fabián, which is given in (B.1). The $r$th moment can be derived by recognising that (B.1) can be rewritten in a format that is similar to the integrand of the beta function (as given in Definition 28) by letting $z = \frac{x}{x+1}$:

$E(X^r) = \dfrac{1}{B(\delta_1,\delta_2)}\displaystyle\int_0^1 z^{(\delta_1+r)-1}(1-z)^{(\delta_2-r)-1}\,dz = \dfrac{B(\delta_1+r,\, \delta_2-r)}{B(\delta_1,\delta_2)}$

From (B.2) the first three moments can be derived in order to obtain expressions for the mean, variance and skewness coefficient, which are given in (B.3), (B.4) and (B.5).
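The substitution $z = \frac{x}{x+1}$ also gives a convenient way to simulate: if $Z \sim \text{Beta}(\delta_1, \delta_2)$ then $X = Z/(1-Z)$ is Beta-prime. A Python sketch (helper names ours) checks the moment ratio against such samples:

```python
import math
import random

def log_beta(a, b):
    """log B(a, b) via log-gamma, numerically stable."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def betaprime_moment(r, d1, d2):
    """E(X^r) = B(d1 + r, d2 - r) / B(d1, d2), valid for r < d2."""
    return math.exp(log_beta(d1 + r, d2 - r) - log_beta(d1, d2))

random.seed(3)
d1, d2 = 2.0, 5.0
# If Z ~ Beta(d1, d2), then X = Z / (1 - Z) is Beta-prime(d1, d2).
xs = []
for _ in range(200_000):
    z = random.betavariate(d1, d2)
    xs.append(z / (1.0 - z))
print(sum(xs) / len(xs), betaprime_moment(1, d1, d2))
```

With $r = 1$ the ratio reduces to $\delta_1/(\delta_2 - 1)$, the usual Beta-prime mean.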

Figure 4.23: Pearson Type VI Density Function with varying values of δ1 and δ2

The scale of this distribution is affected by both δ1 and δ2. This is illustrated in Figure 4.23.


4.1.15 Birnbaum-Saunders Distribution

Rieck and Nedelman [103] define the Birnbaum-Saunders distribution for a random variable $T$ having a cumulative distribution function given by

$F_T(t) = \Phi\!\left(\alpha^{-1}\left(\sqrt{\dfrac{t}{\beta}} - \sqrt{\dfrac{\beta}{t}}\right)\right)$ for $t \ge 0$, with $\alpha, \beta > 0$ \qquad (4.29)

where $\Phi(x)$ is the standard Normal cumulative distribution function, $\alpha$ is the shape parameter, and $\beta$ is the scale parameter as well as the median of $T$.

The distribution was originally introduced by Birnbaum and Saunders [19]. The distribution, as it is presented here, follows from renewal theory for the number of time frames needed in order to observe a specific event. One can therefore use this distribution to model the number of months until the first claim arises for a specific insured party.

From the cumulative distribution function it follows that [19]

$F_T(t) = \Phi\!\left(\alpha^{-1}h\!\left(\dfrac{t}{\beta}\right)\right)$ where $h(x) = \sqrt{x} - \dfrac{1}{\sqrt{x}}$,

from which it follows that

$\alpha^{-1}h\!\left(\dfrac{T}{\beta}\right) \sim N(0,1)$
$\therefore\; h\!\left(\dfrac{T}{\beta}\right) \sim N(0, \alpha^2)$

and, if $X \sim N\!\left(0, \dfrac{\alpha^2}{4}\right)$, it therefore means that

$2X = h\!\left(\dfrac{T}{\beta}\right)$.

If one lets $\varphi(x) = h^{-1}(2x)$, then $T = \beta\varphi(X)$. It is further shown that

$\varphi(X) = 1 + 2X^2 + 2X\sqrt{X^2+1}$
$\therefore\; T = \beta\left(1 + 2X^2 + 2X\sqrt{X^2+1}\right)$ with $X \sim N\!\left(0, \dfrac{\alpha^2}{4}\right)$. \qquad (4.30)
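Transformation (4.30) gives a direct simulation recipe: draw $X \sim N(0, \alpha^2/4)$ and map it to $T$. A Python sketch (illustrative only; function names ours) uses it to confirm that $\beta$ is the median, since $X = 0$ maps to $T = \beta$:

```python
import math
import random

def bs_cdf(t, alpha, beta):
    """Birnbaum-Saunders CDF, equation (4.29), with Phi computed via erf."""
    z = (math.sqrt(t / beta) - math.sqrt(beta / t)) / alpha
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bs_sample(alpha, beta, rng=random):
    """Equation (4.30): T = beta*(1 + 2X^2 + 2X*sqrt(X^2+1)), X ~ N(0, alpha^2/4)."""
    x = rng.gauss(0.0, alpha / 2.0)
    return beta * (1.0 + 2.0 * x * x + 2.0 * x * math.sqrt(x * x + 1.0))

random.seed(4)
alpha, beta = 0.5, 2.0
ts = sorted(bs_sample(alpha, beta) for _ in range(100_001))
print(ts[50_000], beta)  # the sample median should be close to beta
```

Note also that $F_T(\beta) = \Phi(0) = 0.5$ exactly, matching the median claim.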


The mean and variance as given by Birnbaum and Saunders [19] are given in (B.7) and (B.8) in Appendix B. It is also shown that the density function of $T$ attains its maximum at $t = \beta$, which means that the median and the mode of $T$ are both equal to $\beta$. From (B.7) the mean of $T$ is strictly larger than the median and mode, which indicates that the distribution possesses positive skewness.

In order to obtain a graphical presentation of the Birnbaum-Saunders distribution, one must first obtain an expression for the probability density function. This expression, as given in (B.6), can be derived from (4.29) using the chain rule and the Fundamental Theorem of Calculus [111]. The resulting expression is also given by Kundu et al [77] in their study of the shape of the hazard function of the Birnbaum-Saunders distribution.

Figure 4.24: Birnbaum-Saunders Density Function with varying values of α

Expression (B.6) was used to generate graphs for the density of T for varying values of α and β as shown in Figures 4.24 and 4.25.

4.1.16 Burr Distribution

A number of forms of cumulative distribution functions were suggested by Burr [26], yielding a wide range of values for skewness and kurtosis [113]. These functions all have either a similar algebraic form or are reparameterizations of one another. The form that Burr considers and further analyses


Figure 4.25: Birnbaum-Saunders Density Function with varying values of β

in his paper [26] is the form known as Type XII of the Burr distribution, which has a cumulative distribution function given by

$F_X(x) = 1 - (1 + x^c)^{-k}$ for $x > 0$, with $c, k > 0$ \qquad (4.31)

Both c and k are shape parameters.

Tadikamalla [113] argues that the Burr distribution can be fitted to a given dataset by matching the mean, variance, skewness and kurtosis - which essentially amounts to the method-of-moments [11].

A scale parameter α can be introduced as follows [113]:

$F_X(x) = 1 - \left(1 + \left(\dfrac{x}{\alpha}\right)^c\right)^{-k}$ \qquad (4.32)

that can be rewritten as

$F_X(x) = 1 - \left(1 + \dfrac{x^c}{\alpha^*}\right)^{-k}$, where $\alpha^* = \alpha^c$.

The probability density function, given in (B.9), can be derived for the Type XII distribution by taking the derivative with respect to $x$ of the cumulative distribution function (4.31). Taking the first derivative of the probability density function and setting it equal to 0 gives that, if $c > 1$, the distribution is unimodal at $x = \left(\dfrac{c-1}{kc+1}\right)^{1/c}$.
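Since (4.31) inverts in closed form, $F^{-1}(u) = \left((1-u)^{-1/k} - 1\right)^{1/c}$, both the mode expression and the moment $kB(r/c + 1,\, k - r/c)$ can be checked numerically. A hedged Python sketch (names ours, not from the text):

```python
import math
import random

def burr12_pdf(x, c, k):
    """Burr Type XII density: f(x) = kc x^(c-1) (1 + x^c)^(-(k+1))."""
    return k * c * x ** (c - 1.0) * (1.0 + x ** c) ** (-(k + 1.0))

def burr12_sample(c, k, rng=random):
    """Invert F(x) = 1 - (1 + x^c)^(-k)."""
    u = rng.random()
    return ((1.0 - u) ** (-1.0 / k) - 1.0) ** (1.0 / c)

c, k = 3.0, 2.0
mode = ((c - 1.0) / (k * c + 1.0)) ** (1.0 / c)  # stated mode for c > 1
random.seed(5)
xs = [burr12_sample(c, k) for _ in range(200_000)]
# E(X) = k * Gamma(1 + 1/c) * Gamma(k - 1/c) / Gamma(k + 1), via log-gamma.
mean = k * math.exp(math.lgamma(1.0 + 1.0 / c) + math.lgamma(k - 1.0 / c)
                    - math.lgamma(k + 1.0))
print(sum(xs) / len(xs), mean)
```

The density evaluated at the stated mode should dominate nearby points, which is a quick sanity check on the unimodality claim.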


The $r$th moment of $X$, as given in (B.10), can be derived as follows:

$E(X^r) = \displaystyle\int_0^\infty x^r kcx^{c-1}(1+x^c)^{-(k+1)}\,dx$
$\quad= k\displaystyle\int_0^1 y^{\frac{r+c}{c}-1}(1-y)^{k-\frac{r}{c}-1}\,dy$ by letting $y = \dfrac{x^c}{1+x^c}$
$\quad= kB\!\left(\dfrac{r}{c}+1,\, k-\dfrac{r}{c}\right)$ using Definition 28
$\quad= k\dfrac{\Gamma\!\left(\frac{r}{c}+1\right)\Gamma\!\left(k-\frac{r}{c}\right)}{\Gamma(k+1)}$

The first four moments follow directly from (B.10). From these moments, expressions for the mean and variance, given in (B.11) and (B.12), can be calculated. To obtain expressions for the skewness and kurtosis the following general expressions can be used:

$\text{skewness}(X) = \dfrac{E(X^3) - 3E(X)E(X^2) + 2(E(X))^3}{(\text{var}(X))^{3/2}}$ \qquad (4.33)

$\text{kurtosis}(X) = \dfrac{E(X^4) - 4E(X)E(X^3) + 6E(X^2)(E(X))^2 - 3(E(X))^4}{(\text{var}(X))^2}$ \qquad (4.34)

Figures 4.26 and 4.27 show how the shape and scale of the probability density function for the Burr Type XII distribution vary with varying values of $k$ and $c$.

Let $Y = \dfrac{1}{X}$, then

$G_Y(y) = 1 - F_X\!\left(\dfrac{1}{y}\right) = (1 + y^{-c})^{-k}$ for $y > 0$, with $c, k > 0$ \qquad (4.35)

The distribution as given in (4.35) for the random variable $Y$ is referred to as the Burr Type III distribution [113]. The probability density function can be obtained by taking the first derivative of (4.35) with respect to $y$:

$g_Y(y) = kcy^{-(c+1)}(1+y^{-c})^{-(k+1)}$ for $y > 0$, with $c, k > 0$

Furthermore, the $r$th moment can be derived from the relationship between the random variables $X$ and $Y$ using (B.10):

$E(Y^r) = E(X^{-r}) = k\dfrac{\Gamma\!\left(1-\frac{r}{c}\right)\Gamma\!\left(k+\frac{r}{c}\right)}{\Gamma(k+1)}$ \qquad (4.36)


Figure 4.26: Burr Type XII Density Function with varying values of k

Figure 4.27: Burr Type XII Density Function with varying values of c

The first three moments and the variance follow directly from (4.36) and from the moments and variance derived for Type XII.

$E(Y) = k\dfrac{\Gamma\!\left(k+\frac{1}{c}\right)\Gamma\!\left(1-\frac{1}{c}\right)}{\Gamma(k+1)}$

and

$\text{var}(Y) = kB\!\left(\dfrac{2}{c}+k,\, 1-\dfrac{2}{c}\right) - k^2\left[B\!\left(\dfrac{1}{c}+k,\, 1-\dfrac{1}{c}\right)\right]^2$

Thus far we have that $X$ is Burr Type XII distributed and that $Y = \frac{1}{X}$ is then Burr Type III distributed. If we now consider $Z = c\ln(Y)$, then $Z$ is again Burr distributed, this time Type II [113].

$F_Z(z) = G_Y\!\left(e^{z/c}\right) = \left(1 + e^{-z}\right)^{-k}$ for $-\infty < z < \infty$, with $k > 0$ \qquad (4.37)

Consequently the probability density function can be obtained by taking the first order derivative of (4.37) to get:

$f_Z(z) = \dfrac{ke^{zk}}{(1+e^z)^{k+1}}$

Furthermore, the moment generating function can be obtained as follows:

$M_Z(t) = E\!\left(e^{tZ}\right) = E\!\left(Y^{tc}\right) = k\dfrac{\Gamma(1-t)\Gamma(k+t)}{\Gamma(k+1)}$ using (4.36),

from which the moments of the Burr Type II distribution can be derived. Figures 4.28 and 4.29 show how the density of the Burr Type III distribution varies with varying values of $k$ and $c$.

4.1.17 Dagum Distribution

This distribution was originally introduced by Dagum in 1977 with distribution function [74]:

$F_X(x) = \left(1 + \left(\dfrac{b}{x}\right)^a\right)^{-p}$ for $x > 0$, with $a, b, p > 0$ \qquad (4.38)

This is the Dagum Type I distribution. Generalizations of this distribution were introduced by Dagum in 1977 and 1980 and were used to model the distribution of personal income [34], [35]. It is therefore expected that the distribution should possess positive skewness.

An important link with the Burr distribution can be found when (4.38)


Figure 4.28: Burr Type III Density Function with varying values of k

Figure 4.29: Burr Type III Density Function with varying values of c

is compared with the expression for the cumulative distribution function of a random variable with a Burr Type XII distribution that incorporates a scale parameter - which is given in (4.32). Suppose $Y$ is Burr distributed with parameters $b^{-1}$, $a$ and $p$, and let $X = Y^{-1}$, in which case it follows

from (4.32) that

$F_X(x) = 1 - F_Y\!\left(\dfrac{1}{x}\right) = \left(1 + \left(\dfrac{b}{x}\right)^a\right)^{-p}$

which is the cumulative distribution function of a random variable that is Dagum$(b, a, p)$ distributed.

The probability density function of $X$, as given in (B.21), can be obtained by taking the first derivative of (4.38) with respect to $x$. An expression for the $r$th moment, as given in (B.22), can be derived as follows:

$E(X^r) = \displaystyle\int_0^\infty x^r ab^ap\, x^{-(a+1)}\left(1 + b^ax^{-a}\right)^{-(p+1)}\,dx$
$\quad= b^rp\displaystyle\int_0^1 y^{\left(\frac{r}{a}+p\right)-1}(1-y)^{\left(1-\frac{r}{a}\right)-1}\,dy$ by letting $y = \dfrac{x^a}{x^a + b^a}$
$\quad= b^rpB\!\left(\dfrac{r}{a}+p,\, 1-\dfrac{r}{a}\right)$ \qquad (4.39)

The moments follow directly from (B.22) and are given in (B.23) and (B.24).
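Because (4.38) also inverts in closed form, $F^{-1}(u) = b\left(u^{-1/p} - 1\right)^{-1/a}$, the moment formula (4.39) can be checked by simulation. An illustrative Python sketch (helper names are ours):

```python
import math
import random

def dagum_sample(a, b, p, rng=random):
    """Invert the Dagum CDF F(x) = (1 + (b/x)^a)^(-p), equation (4.38)."""
    u = rng.random()
    return b * (u ** (-1.0 / p) - 1.0) ** (-1.0 / a)

def dagum_moment(r, a, b, p):
    """Equation (4.39): E(X^r) = b^r * p * B(r/a + p, 1 - r/a), valid for r < a."""
    log_b = (math.lgamma(r / a + p) + math.lgamma(1.0 - r / a)
             - math.lgamma(p + 1.0))  # B(r/a + p, 1 - r/a): arguments sum to p + 1
    return b ** r * p * math.exp(log_b)

random.seed(6)
a, b, p = 4.0, 2.0, 1.5
xs = [dagum_sample(a, b, p) for _ in range(200_000)]
print(sum(xs) / len(xs), dagum_moment(1, a, b, p))
```

The sample mean and $b\,p\,B(1/a + p,\, 1 - 1/a)$ should agree to Monte Carlo accuracy.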

Figure 4.30: Dagum Density Function with varying values of b

For varying values of $a$, the shape of the density function is affected in a similar way to how the shape of the Burr Type III distribution is affected by varying values of $c$, as illustrated in Figure 4.29. Figures 4.30 and 4.31 illustrate how the scale and location are affected by varying values of $b$ and $p$.


Figure 4.31: Dagum Density Function with varying values of p

4.1.18 Generalized Beta Distribution of the Second Kind

The distribution was first introduced in 1984. Kleiber [74] discusses similarities of this distribution with respect to the Dagum distribution (as discussed in Section 4.1.17) and the Singh-Maddala distribution (as discussed in Section 4.1.19). The probability density function for a random variable $X$ that is said to be Generalized Beta distributed ($X \sim GB2(a, b, p, q)$) is given by (B.70).

It appears as if the probability density function is very similar to the probability density function of the Generalized Pareto distribution given in (B.74). Klugman et al [75] also specifically refer to the Generalized Pareto distribution as the beta of the second kind, which is not exactly the same as the Generalized Beta of the Second Kind that they present. It is therefore not possible to obtain a direct link between the Generalized Pareto distribution and the Generalized Beta of the Second Kind.

Klugman et al [75] also refer to the Generalized Beta of the Second Kind as the transformed beta, and relate it to the Pearson Type VI distribution. If we consider a special case of the probability density function of the Generalized Beta (of the Second Kind) distribution with $a = b = 1$, then we get the following function:

$f_X(x)\big|_{a=1,\,b=1} = \dfrac{x^{p-1}}{B(p,q)(1+x)^{p+q}}$ for $x \ge 0$, with $p, q > 0$

which is exactly the same as the expression for the probability density function of the Beta-prime distribution (Pearson Type VI) as given by Fabián [51] - see (4.28).

A general expression can be derived using (B.70) that can be used to obtain the $r$th moment. Key steps in deriving this expression, as given in (B.71), are shown here:

$E(X^r) = \displaystyle\int_0^\infty \dfrac{x^{r+ap-1}\,a\,b^{-ap}}{B(p,q)\left(1 + \left(\frac{x}{b}\right)^a\right)^{p+q}}\,dx$
$\quad= \dfrac{b^r}{B(p,q)}\displaystyle\int_0^1 y^{\left(p+\frac{r}{a}\right)-1}(1-y)^{\left(q-\frac{r}{a}\right)-1}\,dy$ by letting $y = \dfrac{\left(\frac{x}{b}\right)^a}{1 + \left(\frac{x}{b}\right)^a}$
$\quad= b^r\dfrac{B\!\left(p+\frac{r}{a},\, q-\frac{r}{a}\right)}{B(p,q)}$ from the Definition 28 of the Beta function

From (B.71) the first four moments about the origin can be derived, from which the expressions for the mean and variance, as given in (B.72) and (B.73), follow directly.
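The same substitution yields a simulation route: if $Z \sim \text{Beta}(p, q)$ then $X = b\left(Z/(1-Z)\right)^{1/a}$ is $GB2(a, b, p, q)$. As a hedged Python sketch (names ours, not from the dissertation's SAS IML programs):

```python
import math
import random

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def gb2_moment(r, a, b, p, q):
    """E(X^r) = b^r * B(p + r/a, q - r/a) / B(p, q), valid for -pa < r < qa."""
    return b ** r * math.exp(log_beta(p + r / a, q - r / a) - log_beta(p, q))

random.seed(7)
a, b, p, q = 2.0, 3.0, 2.0, 3.0
# If Z ~ Beta(p, q), then X = b * (Z / (1 - Z))^(1/a) is GB2(a, b, p, q).
xs = []
for _ in range(200_000):
    z = random.betavariate(p, q)
    xs.append(b * (z / (1.0 - z)) ** (1.0 / a))
print(sum(xs) / len(xs), gb2_moment(1, a, b, p, q))
```

Setting $r = 2$ with these parameters gives $b^2 B(3,2)/B(2,3) = 9$, a convenient exact check.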

Figure 4.32: Generalized Beta (of the Second Kind) Density Function with varying values of p

The effect of varying the values of parameter $a$ is similar to the effect of varying parameter $c$ in the Burr Type III distribution - which is illustrated in Figure 4.29. Also, the effect of varying the values of parameter $b$ is similar to the effect of varying parameter $b$ for the Dagum distribution - which is illustrated in Figure 4.30. The effects of varying the values of parameters $p$


Figure 4.33: Generalized Beta (of the Second Kind) Density Function with varying values of q

and $q$ can be seen in Figures 4.32 and 4.33. It is evident from these figures that the value of parameter $p$ has an effect on the shape and scale of the distribution, while the value of parameter $q$ has an effect on the scale of the distribution.

4.1.19 Singh-Maddala Distribution

Kleiber [74] states in a theorem that a random variable $X$ is Singh-Maddala distributed with parameters $a$, $b$ and $q$ if and only if $Y = \frac{1}{X}$ is Dagum$\left(a, \frac{1}{b}, q\right)$ distributed. Using this theorem, we derive the distribution and density functions for the random variable $X$ from the expression for the cumulative distribution function of the Dagum distribution - given by (4.38).

$F_X(x) = 1 - F_Y\!\left(\dfrac{1}{x}\right) = 1 - \left(1 + \left(\dfrac{x}{b}\right)^a\right)^{-q}$ with $a, b, q > 0$ \qquad (4.40)

It can be seen that the cumulative distribution function (4.40) for $X$ is a special case of the Burr Type XII distribution with a scale parameter, as given in (4.32), with $c = a$, $k = q$ and $\alpha = b$.

By taking the first derivative of the cumulative distribution function with respect to $x$, the probability density function, given in (B.153), can be obtained. The $r$th moment can now be derived by using (B.153):

$E(X^r) = \displaystyle\int_0^\infty x^r\,\dfrac{qa}{x}\left(\dfrac{x}{b}\right)^a\left(1 + \left(\dfrac{x}{b}\right)^a\right)^{-(q+1)}\,dx$
$\quad= qb^r\displaystyle\int_0^1 y^{\left(\frac{r}{a}+1\right)-1}(1-y)^{\left(q-\frac{r}{a}\right)-1}\,dy$ by letting $y = \dfrac{\left(\frac{x}{b}\right)^a}{1 + \left(\frac{x}{b}\right)^a}$
$\quad= qb^rB\!\left(\dfrac{r}{a}+1,\, q-\dfrac{r}{a}\right)$ from Definition 28.

The first four moments of $X$ follow directly from (B.154), from which expressions for the variance, skewness and kurtosis can be found. The expressions for the mean and the variance are given in (B.155) and (B.156).

If a random variable $X$ is Lomax$(a, k)$ distributed, then the probability density function is exactly the same as (B.139) given in Section 4.1.11 with $\kappa = k$ and $\theta = a$. If we now consider the probability density function for the Singh-Maddala distribution, given in (B.153), and let $a = 1$, $b = a$ and $q = k$, we get an expression which is exactly the same as the probability density function for the Lomax distribution with parameters $a$ and $k$. Furthermore, consider the cumulative distribution function for the Lomax distribution:

$F_X(x) = 1 - \left(1 + \dfrac{x}{a}\right)^{-k}$.

This is a special case of the Burr Type XII distribution (with a scale parameter), since it is the same as the expression in (4.32) with $c = 1$ and $\alpha = a$.

In conclusion, if a random variable is Lomax$(a, k)$ distributed, it is also Singh-Maddala$(1, a, k)$ distributed, as well as Burr Type XII distributed with scale parameter $\alpha = a$ and parameter $c = 1$ [113]. Moments for this special case of the Singh-Maddala distribution can be derived from (B.154):

$E(X) = \dfrac{a}{k-1}$
$\text{var}(X) = \dfrac{a^2k}{(k-1)^2(k-2)}$
$\text{skewness}(X) = \dfrac{2(k+1)}{k-3}\sqrt{\dfrac{k-2}{k}}$ for $k > 3$

Note that the expression for skewness$(X)$ equals 0 at $k = 2$; formally, however, the third moment (and hence the skewness) only exists for $k > 3$.
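The Lomax/Singh-Maddala correspondence is easy to verify numerically. An illustrative Python sketch (function names ours) compares the two CDFs on a grid and checks the Lomax mean $a/(k-1)$ by inverse-transform sampling:

```python
import random

def singh_maddala_cdf(x, a, b, q):
    """F(x) = 1 - (1 + (x/b)^a)^(-q), equation (4.40)."""
    return 1.0 - (1.0 + (x / b) ** a) ** (-q)

def lomax_cdf(x, a, k):
    """F(x) = 1 - (1 + x/a)^(-k)."""
    return 1.0 - (1.0 + x / a) ** (-k)

a, k = 2.0, 3.0
grid = [0.1, 0.5, 1.0, 5.0, 25.0]
agree = all(abs(lomax_cdf(x, a, k) - singh_maddala_cdf(x, 1.0, a, k)) < 1e-12
            for x in grid)
print(agree)  # Lomax(a, k) coincides with Singh-Maddala(1, a, k)

random.seed(8)
# Inverse-transform samples from the Lomax CDF.
xs = [a * ((1.0 - random.random()) ** (-1.0 / k) - 1.0) for _ in range(200_000)]
print(sum(xs) / len(xs), a / (k - 1.0))
```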

Because of the similarity between the Dagum distribution and the Singh-Maddala distribution, the shape, scale and location of the Singh-Maddala distribution are affected by varying values of parameters $a$, $b$ and $q$ in a similar manner as to how the shape, scale and location of the Dagum distribution are affected

by varying values of parameters $a$, $b$ and $p$. The distribution's shape is primarily affected by the choice of parameter $a$, as illustrated in Figure 4.27 for varying values of parameter $c$ in the Burr Type XII distribution (recall that we have established a correspondence between the Dagum and Burr Type XII distributions in Section 4.1.17).

The scale and location of the Singh-Maddala distribution are affected by parameter $b$ in a similar manner as to how parameter $b$ affects the scale of the Dagum distribution. Since a Singh-Maddala random variable with scale parameter $b$ corresponds to the reciprocal of a Dagum random variable with scale parameter $\frac{1}{b}$, it is expected that the scale of the Singh-Maddala distribution when $b = 1.25$ (for example) would be comparable with the scale of the Dagum distribution when $b = 0.8$ (if the other parameters are kept the same). The effect of varying values of $b$ is illustrated in Figure 4.30. Note that the effect of increasing $b$ for the Singh-Maddala distribution will be illustrated by the effect of decreasing $b$ for the Dagum distribution, due to the inverse relationship between the $b$ parameters of these distributions.

Figure 4.34: Singh-Maddala Density Function with varying values of q

Lastly, the effect of varying parameter $q$ for the Singh-Maddala distribution appears to be somewhat different from the effect that varying values of $p$ have on the Dagum distribution. Figure 4.34 illustrates the effect of varying values of $q$ on the Singh-Maddala distribution.

© University of Pretoria

CHAPTER 4. PARAMETRIC LOSS DISTRIBUTIONS 121

4.1.20 Kappa Family of Distributions

There are two variations considered by Tadikamalla [113]: the two-parameter and the three-parameter case.

Two-parameter Case

In this instance the two parameters are the shape parameter $\alpha$ and the scale parameter $\beta$.

This two-parameter version of the distribution was introduced in 1973 by Mielke [85] and is a special case of the Burr Type III distribution as given in (4.35). Consider a random variable $X$ being Burr Type III distributed with cumulative distribution function given by (4.35). Let $Y = \beta X$ and let the parameters of the Burr Type III distribution be taken as $c = \alpha$ and $k = \frac{1}{\alpha}$:

$F_Y(y) = \left(1 + \left(\dfrac{y}{\beta}\right)^{-\alpha}\right)^{-1/\alpha}$ for $y \ge 0$, with $\alpha > 0$ and $\beta > 0$. \qquad (4.41)

The random variable Y is said to have a Kappa (α, β) distribution [113]. The probability density function, given in (B.104), can be found by taking the first order derivative of (4.41).

To derive the $r$th moment of $Y$, one can leverage (4.36) as derived for the Burr Type III distribution, recognising that $E(Y^r) = \beta^r E(X^r)$ where $X$ is Burr Type III distributed. From this approach the expression in (B.105) can be obtained. Expressions for the mean, variance and skewness as given in (B.106) to (B.108) can be found from the general expression for the $r$th moment of $Y$, in a similar way to how the moments for the Burr Type III distribution were found in Section 4.1.16. Graphs were generated for varying values of $\alpha$ and $\beta$ to study the effect that these parameters have on the scale, shape and location of the density function of the Kappa Two-Parameter distribution.

Figure 4.35 suggests that for smaller values of $\alpha$ the scale is larger (consistent with the variance expression in (B.107)), while the location also shifts upwards as $\alpha$ decreases. The shape is also affected by the value of $\alpha$.

Figure 4.36 suggests that for larger values of $\beta$ the scale is increased (corresponding with (B.107)), while the location is also increased. The shape does, however, appear not to be affected by the value of $\beta$.


Figure 4.35: Kappa Two-Parameter Density Function with varying values of α

Figure 4.36: Kappa Two-Parameter Density Function with varying values of β

Three-parameter Case

Mielke and Johnson [86] introduced a version of the three-parameter distribution in 1973 [113], [24]:

$F_X(x) = \left[\dfrac{\left(\frac{x}{\beta}\right)^{\alpha\theta}}{\alpha + \left(\frac{x}{\beta}\right)^{\alpha\theta}}\right]^{1/\alpha}$


An alternative parameterization was given by Mielke and Johnson in 1974. Both the original and alternative forms have been used in hydrology and meteorology problems as alternatives to the Gamma and Lognormal distributions [113]. The alternative form is a special case of the Generalized Beta distribution of the Second Kind, as given in (B.70), with $a = \theta$, $b = \beta$, $p = \alpha$ and $q = 1$. The probability density function of this alternative form is given in (B.109). It can also be seen as a special case of the Burr Type III distribution with distribution function as given in (4.35). If we consider a random variable $Y$ that has a Burr Type III distribution, let $X = \beta Y$, and further let $c = \theta$ and $k = \alpha$, then

$F_X(x) = F_Y\!\left(\dfrac{x}{\beta}\right) = \dfrac{\left(\frac{x}{\beta}\right)^{\theta\alpha}}{\left(1 + \left(\frac{x}{\beta}\right)^{\theta}\right)^{\alpha}}$ for $x \ge 0$, with $\alpha, \beta, \theta > 0$ \qquad (4.42)

One can use the cumulative distribution function as derived in (4.42) to obtain the associated probability density function by differentiating with respect to $x$. The resulting expression has exactly the same form as the probability density function given in (B.109). Therefore, it can be concluded that the three-parameter Kappa distribution is a special case of the Generalized Beta distribution (of the Second Kind), as given in (B.70), with $a = \theta$, $p = \alpha$, $b = \beta$ and $q = 1$, as well as being a special case of the Burr Type III distribution.

The general expression for the $r$th moment, given in (B.110), can therefore be derived by using (B.71), as derived for the Generalized Beta distribution (of the Second Kind), by making suitable substitutions for $a$, $b$, $p$ and $q$. Using (B.110), the first four moments about the origin can be derived, from which the mean and variance follow - see (B.111) and (B.112). Furthermore, the skewness and kurtosis can be calculated using (4.33) and (4.34). Figures 4.37 and 4.38 show that the values of $\alpha$ and $\beta$ affect the scale and location of the Kappa Three-Parameter distribution. The shape of the distribution is affected by the value of $\theta$, as can be seen in Figure 4.39.

4.1.21 Loggamma Distribution

If a random variable $X$ is Loggamma distributed with parameters $a$ and $\lambda$, then its probability density function is as given in (B.113) [113], [105]. It is given by Tadikamalla [113] that if a random variable $Y$ is Gamma distributed with parameters $a$ and $\lambda$, then $X = e^Y$ is Loggamma distributed


Figure 4.37: Kappa Three-Parameter Density Function with varying values of α

Figure 4.38: Kappa Three-Parameter Density Function with varying values of β

with parameters $a$ and $\lambda$. Consider the probability density function of the random variable $Y$, using the expression as given in (B.61) with $\kappa = a$ and $\theta = \frac{1}{\lambda}$, and let $X = e^Y$. The resulting probability density function for $X$, which is also given by Tadikamalla [113] and Rolski et al [105], is given in (B.113).


Figure 4.39: Kappa Three-Parameter Density Function with varying values of θ

Note that since the probability density function for Y is only valid for values where y > 0, the probability density function for X is only valid for values of x such that x = ey > 1.
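Both the exponential relationship and the support claim above can be checked with a short Python sketch (illustrative only; names ours). The moment $E(X^r) = E(e^{rY}) = (\lambda/(\lambda - r))^a$ for $r < \lambda$ is the standard Gamma mgf, stated here purely for the check:

```python
import math
import random

def loggamma_moment(r, a, lam):
    """E(X^r) = (lam / (lam - r))^a for r < lam (the Gamma mgf at r)."""
    return (lam / (lam - r)) ** a

random.seed(9)
a, lam = 2.0, 5.0
# Y ~ Gamma(shape a, rate lam) via gammavariate's shape/scale convention.
xs = [math.exp(random.gammavariate(a, 1.0 / lam)) for _ in range(200_000)]
print(min(xs) > 1.0)  # the support of X = e^Y is x > 1
print(sum(xs) / len(xs), loggamma_moment(1, a, lam))
```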

Figure 4.40: Comparison of the Loggamma and Gamma distributions

Figure 4.40 shows a comparison of the Gamma and Loggamma distributions with the same parameter values, from which it can be seen that the Loggamma distribution has more weight in the upper tail, with lower density for the lower values of $x$.

Now, to obtain an expression for the $r$th moment of $X$, as given in (B.114), the relationship between the Gamma and Loggamma distributions can be used, in which case it follows that

$E(X^r) = E\!\left(e^{rY}\right) = M_Y(r) = \left(\dfrac{\lambda}{\lambda - r}\right)^a$ for $r < \lambda$.

The first four moments follow directly from expression (B.114), from which the expressions for the mean, variance and skewness that are given in (B.115), (B.116) and (B.117) follow.

Figure 4.41: Loggamma Density Function with varying values of a

The parameter a affects both the scale and shape of the probability density function. This is illustrated in Figure 4.41. Figure 4.42 shows how the λ parameter affects the scale of the distribution.

4.1.22 Snedecor’s F Distribution

Consider two independent random variables, $V_1$ and $V_2$, where $V_1 \sim \chi^2(\nu_1)$ and $V_2 \sim \chi^2(\nu_2)$. Let $X = \dfrac{V_1/\nu_1}{V_2/\nu_2}$, then $X \sim F(\nu_1, \nu_2)$ [11].

In order to derive the $r$th moment of $X$ that is given in (B.165), we can use the fact that $V_1$ and $V_2$ are independent, and thus use the expression derived for the $r$th moment of the Chi-square distribution as given in (B.15). This expression can be used to obtain the moments about the origin, from which expressions for the mean, variance and skewness follow. These expressions, as provided by Bain and Engelhardt [11], are given in Section B.28 in Appendix B.
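The ratio construction above can be simulated directly, since $\chi^2(\nu)$ is Gamma with shape $\nu/2$ and scale 2. A hedged Python sketch (names ours) checks the well-known mean $\nu_2/(\nu_2 - 2)$ for $\nu_2 > 2$:

```python
import random

def f_sample(nu1, nu2, rng=random):
    """X = (V1/nu1) / (V2/nu2), with V_i ~ chi-square(nu_i) = Gamma(nu_i/2, scale 2)."""
    v1 = rng.gammavariate(nu1 / 2.0, 2.0)
    v2 = rng.gammavariate(nu2 / 2.0, 2.0)
    return (v1 / nu1) / (v2 / nu2)

random.seed(10)
nu1, nu2 = 5.0, 10.0
xs = [f_sample(nu1, nu2) for _ in range(200_000)]
print(sum(xs) / len(xs), nu2 / (nu2 - 2.0))  # E(X) = nu2/(nu2 - 2) for nu2 > 2
```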


Figure 4.42: Loggamma Density Function with varying values of λ

Figure 4.43: F Density Function with varying values of ν1

Figures 4.43 and 4.44 show the effects of varying values for $\nu_1$ and $\nu_2$. From these figures it is evident that the value of $\nu_1$ affects the shape, location and scale of the distribution. For varying values of $\nu_2$ the scale and location appear to be affected.


Figure 4.44: F Density Function with varying values of ν2

4.1.23 Log-logistic Distribution

Consider a random variable $Y$ that has a logistic distribution (as given in Section 4.2.2). Now let $X = e^Y$, $\theta = \frac{1}{\beta}$ and $\xi = \ln(\alpha)$. The cumulative distribution function can be derived using the cumulative distribution function of the logistic distribution in (4.55):

$F_X(x) = F_Y(\ln(x))\big|_{\theta = \frac{1}{\beta},\, \xi = \ln(\alpha)} = \dfrac{\left(\frac{x}{\alpha}\right)^\beta}{1 + \left(\frac{x}{\alpha}\right)^\beta}$ \qquad (4.43)

This is the expression for the cumulative distribution function for a random variable $X$ that has a Log-logistic distribution, which is also referred to as the Fisk distribution in some literature [113], [75].

We can find the probability density function by taking the first order derivative of (4.43) with respect to $x$ to find the expression in (B.122). This form of the probability density function is also given by Klugman et al [75].

The probability density function of $X$ as given in (B.122) will be used to derive a general expression for the $r$th moment of $X$, as given in (B.123):

$E(X^r) = \displaystyle\int_0^\infty x^r\,\dfrac{\beta\left(\frac{x}{\alpha}\right)^\beta}{x\left(1 + \left(\frac{x}{\alpha}\right)^\beta\right)^2}\,dx$
$\quad= \alpha^r\displaystyle\int_0^1 y^{\left(\frac{r}{\beta}+1\right)-1}(1-y)^{\left(1-\frac{r}{\beta}\right)-1}\,dy$ by letting $y = \dfrac{x^\beta}{\alpha^\beta + x^\beta}$
$\quad= \alpha^r\Gamma\!\left(1+\dfrac{r}{\beta}\right)\Gamma\!\left(1-\dfrac{r}{\beta}\right)$ \qquad (4.44)

The first four moments about the origin of $X$ follow directly from (B.123), from which the expression for the variance in (B.125) can be obtained.
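Equation (4.43) inverts to $F^{-1}(u) = \alpha\left(u/(1-u)\right)^{1/\beta}$, so (4.44) can be checked by simulation. An illustrative Python sketch (names ours); note $\Gamma(1+x)\Gamma(1-x) = \pi x/\sin(\pi x)$ gives an exact value for the mean:

```python
import math
import random

def loglogistic_sample(alpha, beta, rng=random):
    """Invert (4.43): F(x) = (x/alpha)^beta / (1 + (x/alpha)^beta)."""
    u = rng.random()
    return alpha * (u / (1.0 - u)) ** (1.0 / beta)

def loglogistic_moment(r, alpha, beta):
    """Equation (4.44): E(X^r) = alpha^r Gamma(1 + r/beta) Gamma(1 - r/beta), r < beta."""
    return alpha ** r * math.gamma(1.0 + r / beta) * math.gamma(1.0 - r / beta)

random.seed(11)
alpha, beta = 2.0, 4.0
xs = [loglogistic_sample(alpha, beta) for _ in range(200_000)]
print(sum(xs) / len(xs), loglogistic_moment(1, alpha, beta))
```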

Figure 4.45: Log-logistic Density Function with varying values of α

The scale, location and shape of the Log-logistic distribution are affected by the value of $\beta$, while varying values of $\alpha$ only affect the location and scale. Figures 4.45 and 4.46 support these statements.

4.1.24 Folded and Half Normal Distribution

Consider a random variable $W$ that is Normally distributed (as described in Section 4.2.1) with mean $\mu$ and variance $\sigma^2$. The random variable $X = |W|$ is Folded Normally distributed with probability density function given by [46], [79] - see (B.43). Figure 4.47 compares the Normal and Folded Normal distributions with the same values for $\mu$ and $\sigma$. These graphs show exactly how the part of the


Figure 4.46: Log-logistic Density Function with varying values of β

Figure 4.47: Comparison of the Normal and Folded Normal Distributions for µ ≠ 0

Normal distribution to the left of the point $x = 0$ is folded over to the part where $x$ is nonnegative, adding to the density of the lower nonnegative values of $x$ for the Folded Normal distribution.
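The folding construction $X = |W|$ is itself a one-line simulator. As a hedged Python sketch (names ours), the sample mean of $|W|$ can be compared with the standard closed-form folded-normal mean, which should agree with the expression in (B.54):

```python
import math
import random

def phi(z):
    """Standard Normal CDF via erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def folded_normal_mean(mu, sigma):
    """E|W| for W ~ N(mu, sigma^2): sigma*sqrt(2/pi)*exp(-mu^2/(2 sigma^2)) + mu*(1 - 2*Phi(-mu/sigma))."""
    return (sigma * math.sqrt(2.0 / math.pi) * math.exp(-mu * mu / (2.0 * sigma ** 2))
            + mu * (1.0 - 2.0 * phi(-mu / sigma)))

random.seed(12)
mu, sigma = 1.0, 1.0
xs = [abs(random.gauss(mu, sigma)) for _ in range(200_000)]
print(sum(xs) / len(xs), folded_normal_mean(mu, sigma))
```

At $\mu = 0$ the formula reduces to $\sigma\sqrt{2/\pi}$, the Half Normal mean.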


A general expression for the $r$th moment is derived by Elandt [46]:

$E(X^r) = \sigma^r\displaystyle\sum_{j=0}^{r}\binom{r}{j}\theta^{r-j}\left[I_j(-\theta) + (-1)^{r-j}I_j(\theta)\right]$ \qquad (4.45)

where

$I_j(a) = \dfrac{1}{\sqrt{2\pi}}\displaystyle\int_a^\infty y^j e^{-\frac{1}{2}y^2}\,dy$ for $j = 1, 2, \ldots$
$I_0(a) = 1 - \dfrac{1}{\sqrt{2\pi}}\displaystyle\int_{-\infty}^a e^{-\frac{1}{2}y^2}\,dy$

and $\theta = \dfrac{\mu}{\sigma}$.

For the purpose of deriving the first four moments of this distribution, we have to calculate $I_0\!\left(\frac{\mu}{\sigma}\right), I_1\!\left(\frac{\mu}{\sigma}\right), \ldots, I_4\!\left(\frac{\mu}{\sigma}\right)$ and $I_0\!\left(-\frac{\mu}{\sigma}\right), I_1\!\left(-\frac{\mu}{\sigma}\right), \ldots, I_4\!\left(-\frac{\mu}{\sigma}\right)$. One can consider deriving the terms of the form $I_j(-\theta) + (-1)^{r-j}I_j(\theta)$ for $r = 1, 2, 3, \ldots$ and $j \le r$. The expressions required to calculate the first four moments are given in (B.45) to (B.53). The expressions for the mean and variance, which can be derived using (B.45) to (B.53), are given in (B.54) and (B.55). One can calculate the moments about the origin first, after which the skewness and kurtosis can be calculated using (4.33) and (4.34).

Figure 4.48: Folded Normal Density Function with varying values of µ

Similar to the Normal distribution (as discussed in section 4.2.1) the Folded Normal distribution’s scale is affected by the value of σ. It also affects the


shape (especially for values less than µ). This is illustrated in Figure 4.49. In contrast to µ purely being a location parameter for the Normal distribution, it affects both the location as well as the shape of the Folded Normal distribution. The reason for the value of µ affecting the shape is that the larger the value of µ, the smaller the area of the Normal distribution that lies to the left of 0 and hence the smaller the area that is folded over to the nonnegative values of x. Also, the larger the value of µ, the closer the Folded Normal will be to the Normal distribution. This is evident from Figure 4.48.

Figure 4.49: Folded Normal Density Function with varying values of σ

The Half Normal distribution follows directly from the Folded Normal as a special case where µ = 0 [36]:

f_X(x) = \sqrt{\frac{2}{\pi\sigma^2}}\; e^{-\frac{1}{2}\left(\frac{x}{\sigma}\right)^2} \qquad \text{for } x \ge 0

This form is also stated by Birnbaum [18] and is used by Daniel [36] to interpret factorial two-level experiments. The case where µ = 0 in Figure 4.48 provides a graphical presentation of the probability density function of the Half Normal distribution.

Maximum likelihood estimation and method-of-moments estimation are used by Leone et al [79] and Elandt [46] to obtain parameter estimates.


4.1.25 Inverse Gamma Distribution

Result 3. If a random variable Y has a GAM(θ, κ) distribution, then X = 1/Y has an INVGAM(α, κ) distribution with α = 1/θ.

Proof: From (B.61) we have that if Y ∼ GAM(θ, κ), then

f_Y(y) = \frac{1}{\theta^{\kappa}\Gamma(\kappa)}\, y^{\kappa-1} e^{-\frac{y}{\theta}} \quad \text{for } y > 0

If X = 1/Y, then

f_X(x) = f_Y(h(x)) \left| \frac{d}{dx} h(x) \right| \quad \text{where } h(x) = \frac{1}{x}

= \frac{\left(\frac{\alpha}{x}\right)^{\kappa} e^{-\frac{\alpha}{x}}}{x\,\Gamma(\kappa)} \quad \text{by letting } \alpha = \frac{1}{\theta},

which is of the form of the expression given by Klugman et al [75] for the probability density function of the Inverse Gamma distribution - also given in (B.89).

Using the relationship between X ∼ INV GAM(α, θ) and Y ∼ GAM(θ, κ), an expression for the rth moment of X can be derived from the moments of Y . The resulting expression is given in (B.90), from which expressions for the mean, variance and skewness follow. These expressions are given in (B.91) to (B.93).
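Result 3 can be confirmed numerically. The sketch below (illustrative parameter values; scipy's `invgamma` takes a shape `a` and a `scale`, so the transformed density is compared with shape κ and scale 1/θ) applies the change-of-variables formula f_Y(1/x)/x² directly:

```python
import numpy as np
from scipy.stats import gamma, invgamma

theta, kappa = 2.0, 3.0
x = np.linspace(0.05, 5.0, 200)

# Density of X = 1/Y obtained by the change-of-variables formula:
f_transformed = gamma.pdf(1.0 / x, a=kappa, scale=theta) / x**2
# Inverse Gamma density with shape kappa and scale 1/theta:
f_invgamma = invgamma.pdf(x, a=kappa, scale=1.0 / theta)

print(np.max(np.abs(f_transformed - f_invgamma)))  # ~0
```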

Figure 4.50: Inverse Gamma Density Function with varying values of α

Both the scale and location of the distribution are affected by varying the values of α or varying the values of β. This is illustrated by the graphs in


Figure 4.51: Inverse Gamma Density Function with varying values of β

Figures 4.50 and 4.51 and can be deduced from the fact that both these parameters affect the values of the mean and variance (as per (B.91) and (B.92)).

If the Gamma and Inverse Gamma distributions are compared for equal parameter values for κ and α = θ^{-1}, the Inverse Gamma distribution will have a heavier tail. This is graphically displayed in Figure 4.52.

Figure 4.52: Comparison of Gamma and Inverse Gamma Distributions


This can also be proven theoretically by considering the ratio of the two distributions' limiting tail behaviour using (3.13). Let the random variable from the Inverse Gamma distribution be denoted by Y and the random variable from the Gamma distribution be denoted by X:

\lim_{x\to\infty} \frac{f_Y(x)}{f_X(x)} = \lim_{x\to\infty} \frac{\left(\frac{\alpha}{x}\right)^{\kappa} e^{-\frac{\alpha}{x}} \Big/ \left(x\,\Gamma(\kappa)\right)}{x^{\kappa-1} e^{-\frac{x}{\theta}} \Big/ \left(\theta^{\kappa}\,\Gamma(\kappa)\right)} \quad \text{using (B.61) and (B.89)}

= \lim_{x\to\infty} \frac{(\alpha\theta)^{\kappa}\, e^{-\left(\frac{\alpha}{x} - \frac{x}{\theta}\right)}}{x^{2\kappa}}

= \infty

This indicates that the Inverse Gamma distribution has a heavier tail than the Gamma distribution if the parameter values for κ are the same and α = θ−1.
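The divergence of this density ratio can also be seen numerically. A sketch (parameter values illustrative; the ratio x^{-2κ} e^{x/θ - α/x} is eventually increasing, so points beyond its minimum at x = 2κθ are used):

```python
import numpy as np
from scipy.stats import gamma, invgamma

theta, kappa = 2.0, 3.0
alpha = 1.0 / theta
# Points beyond x = 2*kappa*theta = 12, where the ratio is increasing:
x = np.array([20.0, 50.0, 100.0, 200.0])

# f_InvGamma(x) / f_Gamma(x) with shared kappa and alpha = 1/theta:
ratio = invgamma.pdf(x, a=kappa, scale=alpha) / gamma.pdf(x, a=kappa, scale=theta)
print(ratio)  # strictly increasing and diverging
```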

4.1.26 Inverse Chi-square Distribution

From Bernardo and Smith [17] it follows that if a random variable Y ∼ χ²(ν), then X = 1/Y ∼ INVχ²(ν). The probability density function of X, as given in (B.84), can be found from the probability density function of Y, as given in (B.13), using the following relationship:

f_X(x) = f_Y(h(x)) \left| \frac{d}{dx} h(x) \right| \quad \text{where } h(x) = \frac{1}{x}

It is shown in Section 4.1.3 that if Z ∼ GAM(θ, κ), then X = \frac{2Z}{\theta} ∼ χ²(2κ). Similarly, it can be shown that if W ∼ INVGAM\left(\frac{\nu}{2}, \theta\right), then X = \frac{W}{2\theta} ∼ INVχ²(ν), using (B.89) with α = \frac{\nu}{2} and the following relationship:

f_X(x) = f_W(h(x)) \left| \frac{d}{dx} h(x) \right| \quad \text{where } h(x) = 2\theta x

Now, using the relationship between a random variable W that is Inverse Gamma distributed with parameters \frac{\nu}{2} and θ and a random variable X that is Inverse Chi-square distributed with parameter ν, the general expression (B.85) for the rth moment of X can be found from (B.90) (which is the general expression for the rth moment of W). The first four moments about the origin of X follow directly from (B.85), from which the variance and skewness can be calculated. Refer to (B.86) to (B.88) for expressions for the mean, variance and skewness.


Since the Inverse Chi-square distribution is a special case of the Inverse Gamma distribution with parameters α = \frac{\nu}{2} and θ = \frac{1}{2}, the probability density function will be affected by varying values of ν in a similar manner as the density function for the Inverse Gamma distribution is affected by varying values of α, which is illustrated in Figure 4.50.

4.1.27 Inverse Gaussian Distribution

A detailed review of the Inverse Gaussian Distribution is given by Folks and Chhikara [55] with its properties and applications.

The distribution was originally presented by Schrödinger in 1915 as the probability density function of the first passage time in a Brownian motion (also referred to as a Wiener process) [105]. A formal definition of the Wiener process is given by Luenberger [82].

Schrödinger's study considers a Wiener process {Z_t} with a drift ν and variance σ². It is then argued that the time T (which is a random variable) until the value α is reached by Z_t for the first time (referred to as the first passage time [105]) is a random variable that is Inverse Gaussian distributed with probability density function given by:

f_T(t) = \frac{\alpha}{\sigma\sqrt{2\pi t^3}}\, e^{-\frac{(\alpha - \nu t)^2}{2\sigma^2 t}} \qquad \text{for } t > 0,\ \nu > 0.

A reparameterization of this functional form was presented by Tweedie with parameters µ and λ [55]. This reparameterization can be obtained by letting α = νµ and σ = \frac{\alpha}{\sqrt{\lambda}} and is given by Rolski et al [105]. This form is given in (B.96). It also follows from the review of Folks and Chhikara that if µ = 1 the probability density function given in (B.96) reduces to the density function of the Wald distribution.

The expression for the moment generating function (B.97), as presented by Rolski et al [105], can be used to find the moments about the origin, from which expressions for the variance and skewness follow. Expressions for the mean, variance and a general expression for the rth moment of T are also given by Folks and Chhikara [55] - see expressions (B.98) to (B.101). Figures 4.53 and 4.54 show how the scale and location of the distribution are affected by varying the values of µ and λ, respectively.

The cumulative distribution function for T is given by Folks and Chhikara


Figure 4.53: Inverse Gaussian Density Function with varying values of µ

Figure 4.54: Inverse Gaussian Density Function with varying values of λ

[55] as follows:

F_T(t) = \Phi\!\left(\sqrt{\frac{\lambda}{t}}\left(\frac{t}{\mu} - 1\right)\right) + e^{\frac{2\lambda}{\mu}}\,\Phi\!\left(-\sqrt{\frac{\lambda}{t}}\left(\frac{t}{\mu} + 1\right)\right) \qquad (4.46)

where Φ(·) is the cumulative distribution function of the standard Normal distribution. This form is also derived by Shuster [109].
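Expression (4.46) can be verified against a library implementation. In the sketch below (illustrative µ and λ; scipy's `invgauss` parameterizes IG(µ, λ) through shape µ/λ and scale λ, which is an assumption of this mapping) the formula is compared with `invgauss.cdf`:

```python
import numpy as np
from scipy.stats import norm, invgauss

def ig_cdf(t, mu, lam):
    """F_T(t) as in (4.46)."""
    s = np.sqrt(lam / t)
    return (norm.cdf(s * (t / mu - 1.0))
            + np.exp(2.0 * lam / mu) * norm.cdf(-s * (t / mu + 1.0)))

mu, lam = 1.5, 2.0
t = np.linspace(0.1, 10.0, 100)
diff = np.abs(ig_cdf(t, mu, lam) - invgauss.cdf(t, mu / lam, scale=lam))
print(diff.max())  # ~0
```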


4.1.28 Skew-normal Distribution

Azzalini introduced a general family of distributions [8] as follows: for a probability density function f that is symmetric about 0 and a cumulative distribution function G that is absolutely continuous with G′ symmetric about 0, the function

h(y) = 2G(\lambda y)f(y) \quad \text{for } -\infty < y < \infty

is a valid probability density function for any real-valued λ.

If f and G are the probability density and cumulative distribution functions, respectively, of a standard Normal random variable, then

h(y) = 2G(\lambda y)f(y) = 2\Phi(\lambda y)\phi(y) \quad \text{for } -\infty < y < \infty \qquad (4.47)

is the probability density function of the Skew-normal distribution, denoted as Y ∼ SN(λ).

Some of the most important properties of the Skew-normal distribution given by Azzalini [8] are as follows:

• If λ = 0, then Y ∼ N(0, 1)

• h(y) tends to the half-normal distribution as λ → ∞

• If Y ∼ SN(λ), then −Y ∼ SN(−λ)

• h(y) is unimodal

• If Y ∼ SN(λ), then Y² ∼ χ²(1) (i.e. similar to the case of a standard Normal variable Z, for which Z² ∼ χ²(1)).

• ln(h(y)) is a concave function of y.

Denote the probability density and cumulative distribution functions of Y as follows:

\phi(y; \lambda) = h(y) = 2\Phi(\lambda y)\phi(y) \quad \text{for } -\infty < y < \infty

\Phi(y; \lambda) = 2\int_{-\infty}^{y}\int_{-\infty}^{\lambda t} \phi(t)\phi(u)\,du\,dt \qquad (4.48)

The probability density function, as given in (4.47), is illustrated for various values of λ in Figure 4.55.
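The density (4.47) can be written out directly and checked against a standard implementation. A sketch (λ = 3 is illustrative; scipy's `skewnorm` uses a shape parameter in the role of λ):

```python
import numpy as np
from scipy.stats import norm, skewnorm
from scipy.integrate import quad

def sn_pdf(y, lam):
    """Skew-normal density (4.47): 2 * Phi(lam * y) * phi(y)."""
    return 2.0 * norm.cdf(lam * y) * norm.pdf(y)

lam = 3.0
y = np.linspace(-4.0, 4.0, 101)
print(np.max(np.abs(sn_pdf(y, lam) - skewnorm.pdf(y, lam))))  # ~0

# The density integrates to 1 for any real lambda:
area, _ = quad(sn_pdf, -np.inf, np.inf, args=(lam,))
print(area)  # ~1
```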


Figure 4.55: Skew-normal Density Function with varying values of λ

The moment generating function is given in (4.49). This is the form as proposed by Azzalini [8]. Expressions for the mean, variance, skewness and kurtosis are also given by Azzalini - see (4.50) to (4.53).

M_Y(t) = 2 e^{\frac{t^2}{2}} \Phi(\delta t) \quad \text{where } \delta = \frac{\lambda}{\sqrt{1+\lambda^2}} \qquad (4.49)

and

E(Y) = b\delta \qquad (4.50)

var(Y) = 1 - b^2\delta^2 \qquad (4.51)

where

b = \sqrt{\frac{2}{\pi}} \quad \text{and} \quad \delta = \frac{\lambda}{\sqrt{1+\lambda^2}}

while the skewness and kurtosis are given by:

\mathrm{skewness}(Y) = \frac{1}{2}(4-\pi)\,\mathrm{sign}(\lambda)\left(\frac{(E(Y))^2}{var(Y)}\right)^{\frac{3}{2}} \qquad (4.52)


and

\mathrm{kurtosis}(Y) = 2(\pi - 3)\left(\frac{(E(Y))^2}{var(Y)}\right)^2 \qquad (4.53)

Alternative parameterizations can also be considered where one can introduce location and scale parameters [8], [100]. Let ξ and η denote the location and scale parameters, respectively. If X ∼ SN(λ), then we can have

Y = \xi + \eta X

in which case Y ∼ SN_D(ξ, η, λ), with ξ, η ∈ ℝ, which is referred to as the direct parameterization.

The expressions for the mean and variance of Y under the direct parameterization follow directly from (4.50) and (4.51):

E(Y) = \xi + \eta b\delta

var(Y) = \eta^2(1 - b^2\delta^2)

while the skewness coefficient is given by:

\gamma_1 = \frac{b\delta^3(2b^2 - 1)}{(1 - b^2\delta^2)^{\frac{3}{2}}}

where

b = \sqrt{\frac{2}{\pi}} \quad \text{and} \quad \delta = \frac{\lambda}{\sqrt{1+\lambda^2}}

The skewness coefficient can take on values in the interval (−0.99527, 0.99527) [100].

It is further argued that the information matrix associated with the maximum likelihood estimation process for ξ, η and λ is singular. In this light an alternative parameterization based on a centred random variable X is introduced:

Y = \mu + \sigma\,\frac{X - E(X)}{\sqrt{var(X)}} \quad \text{with } -\infty < \mu < \infty \text{ and } \sigma > 0

where X ∼ SN(λ) with skewness coefficient given by γ₁. This skewness coefficient is preserved in this location-scale, centred parameterization. The


random variable Y is then said to be SN_C(µ, σ, γ₁), which is referred to as the centred parameterization [100]. Expressions for the mean, variance and skewness of this form are given in (B.158) to (B.160).

Recently published work by Azzalini and Capitanio [9] provides a detailed formulation of the Skew-normal distribution in a univariate and multivariate context and details on extensions and generalizations of this distribution. In this context they provide the properties and likelihood estimation details for these distributions.

4.1.29 Exponential-Gamma Distribution

Consider a random variable X with an Exponential distribution that is conditional on its parameter θ, so that X|θ has a probability density function as given in (B.34). Now suppose β = 1/θ is GAM(α, k) distributed for β ≥ 0 with α, k > 0, so that β has a probability density function of the form given in (B.61). The unconditional distribution of X can then be found as follows:

f_X(x) = \int_0^{\infty} f(x|\beta) f_B(\beta)\,d\beta

= \int_0^{\infty} \beta e^{-\beta x}\, \frac{1}{\alpha^k \Gamma(k)}\, e^{-\beta/\alpha} \beta^{k-1}\,d\beta

= \frac{\Gamma(k+1)}{\alpha^k \Gamma(k)} \left(x + \alpha^{-1}\right)^{-(k+1)} \quad \text{by letting } y = \beta\left(x + \alpha^{-1}\right)

= k\alpha \left(\frac{1}{x\alpha + 1}\right)^{k+1} \qquad (4.54)

This marginal probability density function of X is referred to as the Exponential-Gamma distribution [113].

If one now lets α^{-1} = a, then the probability density function in (4.54) can be rewritten in exactly the same form as given in (B.139) for the Pareto Type II (Lomax) distribution. It was also shown in Section 4.1.19 that the Lomax distribution is a special case of the Singh-Maddala distribution.
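The identification of (4.54) with the Lomax density can be checked directly. A sketch (illustrative α and k; scipy's `lomax` takes shape `c` and a `scale`, so the correspondence assumed here is c = k and scale = 1/α):

```python
import numpy as np
from scipy.stats import lomax

alpha, k = 2.0, 3.0
x = np.linspace(0.0, 10.0, 200)

# The mixture density (4.54):
f_expgamma = k * alpha * (1.0 / (x * alpha + 1.0)) ** (k + 1)
# Lomax density with shape k and scale a = 1/alpha:
f_lomax = lomax.pdf(x, c=k, scale=1.0 / alpha)
print(np.max(np.abs(f_expgamma - f_lomax)))  # ~0
```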

If one lets Y = \frac{X}{a}, then

f_Y(y) = f_X(h(y)) \left|\frac{d}{dy} h(y)\right| \quad \text{where } h(y) = ay

= k(1 + y)^{-(k+1)}.

This probability density function has exactly the same form as (B.9), but with c = 1, which is the probability density function for a random variable that is Burr


Type XII distributed. One can then conclude that Y is Burr Type XII distributed with parameters c and k, where c = 1.

Expression (B.10), derived for the rth moment of the Burr Type XII distribution, can then be used to derive a similar expression for X, which has an Exponential-Gamma distribution. Alternatively, we can use the fact that X is also Lomax distributed with parameters a and k, where a = 1/α, and use (B.141), (B.142) and (B.143) for the mean, variance and skewness as derived in Section 4.1.11.

Figure 4.56: Exponential-Gamma Density Function with varying values of α

Figures 4.56 and 4.57 show how the shape of the distribution changes with varying values of α and k, respectively. As expected, these compare well with the trends seen from varying the parameters of the Pareto Type II (Lomax) distribution, as shown in Figures 4.16 and 4.17. It should be noted that for increasing values of θ the weight of the upper tail of the Pareto Type II distribution increases while the density on the lower values decreases. Similarly, for decreasing values of α the weight of the upper tail of the Exponential-Gamma distribution increases with an associated decrease in the density on the lower values. This makes sense given that the Exponential-Gamma distribution is a special case of the Lomax distribution with parameters κ = k and θ = α^{-1}.


Figure 4.57: Exponential-Gamma Density Function with varying values of k

4.2 Other Useful Distributions

In this section overviews of the Normal and Logistic distributions are given.

These distributions are symmetric and are defined for all values in ℝ. As such they do not possess the properties of being skewed and defined on ℝ⁺ that one would ideally require in order to model claims size data. There are, however, distributions suitable for modeling claims size data which are related to these distributions. An understanding of these two distributions enables one to obtain a better understanding of their related distributions.

4.2.1 Normal Distribution

The Normal distribution is symmetric around its mean value (hence has skewness of 0) and is defined for all values in ℝ. Because of its symmetry it is not really suitable for modelling claim sizes, especially since the underlying distributions of claims are skewed and often heavy-tailed. The probability density function, moment generating function and expressions for the mean and variance, as presented by Bain and Engelhardt [11], are given in (B.133) to (B.136).

The Normal distribution does have useful properties, has important relationships with other non-central distributions, is well-studied and easy to simulate from, while certain transformations of the Normal distribution are useful for modeling skew claims data. Knowing and understanding the relationship between the skew distributions and the Normal distribution can enable one to utilise the useful properties of the Normal distribution. Good examples of such cases include the Skew-normal, Birnbaum-Saunders and Folded Normal distributions. All three of these distributions can be simulated using the Normal distribution; see for example the SAS code given for these simulations in appendices A.26, A.14 and A.20.

Figure 4.58: Comparison of Normal and related distributions

Consider Figure 4.58 which illustrates the following in terms of distributions related to the Normal distribution:

- Firstly, consider the standard Normal distribution, which is symmetric around 0. This distribution is well studied and has useful properties. For large samples, approximations to statistics such as the sample mean [31] can be obtained using the central limit theorem [105], [75]. Under the central limit theorem the limiting behaviour of the standardized statistic¹ conforms to a standard Normal distribution. This is a very important use of the standard Normal distribution for various problems, irrespective of the domain of the problem.

- A non-central version of the standard Normal distribution may be used as is given in Figure 4.58 with the mean of 0.5. One can also adjust

¹ For a statistic T_n, where n denotes the sample size, \lim_{n\to\infty} \frac{T_n - E(T_n)}{\sqrt{var(T_n)}} has a Normal distribution with mean 0 and variance 1 [75], [31].


the variance. In this example it is clear, though, that a large portion of the distribution still has density on the negative values. One can work around this by choosing values for µ and σ such that the density on the negative domain becomes very small. Alternatively, one can resort to one of the related distributions.

- A possible choice is to consider the Folded Normal, as discussed in Section 4.1.24, where the density associated with the negative values is folded over to their positive counterparts, which significantly increases the density on the lower positive values. Depending on how far the value of µ is from 0, this folding over may have a small impact on the upper tail but inflate the density on the lower positive values too much.

- An alternative related distribution may therefore be the Skew-normal distribution, as discussed in Section 4.1.28. Figure 4.58 shows how this distribution has a higher density on the lower positive values, although not as high as for the Folded Normal distribution. It shows some density on the negative values, but this density may be reduced by varying the value of the parameter λ. It is also evident that the upper right tail has a lower weight than those of the Normal and Folded Normal distributions.

- In many practical problems related to financial and operational risk, as well as insurance, tail heaviness plays a role and should as such be accounted for in the choice of distribution [82], [75]. As an alternative to the Skew-normal in our example in Figure 4.58, it can be seen that the choice of parameters for the Birnbaum-Saunders distribution gives a distribution that is somewhat similar in shape to the Skew-normal, but with more weight on the upper tail.

- In conclusion, it can be seen from Figure 4.58 how the distributions related to the Normal distribution can be used to better capture the behaviour of particular problems in practice. The importance of the Normal distribution and its use are also highlighted, especially in cases where large sample properties such as the central limit theorem can be used. Since the Normal distribution forms the base for these related distributions, it is useful in:

- Understanding the properties of these distributions,
- Obtaining a sense of the heaviness of the upper tails of these distributions, and
- Simulating from these distributions using the standard Normal distribution.
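The simulation point can be made concrete. The sketch below mirrors the idea behind the SAS programs referenced above (A.26, A.14, A.20) in Python for illustration; the skew-normal uses the well-known representation Y = δ|Z₀| + √(1−δ²)Z₁ and the Birnbaum-Saunders its Normal transformation, with all parameter values chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Folded Normal(mu, sigma): the absolute value of a Normal draw.
mu, sigma = 1.0, 2.0
folded = np.abs(rng.normal(mu, sigma, n))

# Skew-normal(lambda) from two independent standard Normal draws.
lam = 3.0
delta = lam / np.sqrt(1.0 + lam**2)
z0, z1 = rng.standard_normal(n), rng.standard_normal(n)
skew = delta * np.abs(z0) + np.sqrt(1.0 - delta**2) * z1

# Birnbaum-Saunders(gamma_bs, beta_bs) via its Normal transformation.
gamma_bs, beta_bs = 0.5, 1.0
z = rng.standard_normal(n)
bs = beta_bs * (gamma_bs * z / 2.0 + np.sqrt((gamma_bs * z / 2.0) ** 2 + 1.0)) ** 2

print(skew.mean())  # should be near b * delta with b = sqrt(2/pi)
```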


4.2.2 Logistic Distribution

A random variable Y is said to have a Logistic distribution with location parameter ξ and scale parameter θ if it has a probability density function given by (B.118) [11]. Expressions for the mean and variance, as given in (B.119) and (B.120), are given by Gupta et al [62].

If we let Z = \frac{Y - \xi}{\theta}, then we have that

f_Z(z) = f_Y(h(z)) \left|\frac{d}{dz} h(z)\right| \quad \text{where } h(z) = \theta z + \xi

= \frac{e^z}{(1 + e^z)^2} \quad \text{using expression (B.118)}

and similarly

f_Z(-z) = \frac{e^z}{(1 + e^z)^2}.

Hence we can conclude that f_Z(z) = f_Z(−z), which shows that Z is symmetric around 0 and therefore that the Logistic distribution is symmetric around the value of ξ; its skewness is therefore 0.

Figure 4.59: Logistic Distribution with varying values of ξ

From Figures 4.59 and 4.60 it follows that ξ affects the location and θ the scale of the distribution.


Figure 4.60: Logistic Distribution with varying values of θ

The cumulative distribution function is given by [11]:

F_Y(y) = \frac{e^{\frac{y-\xi}{\theta}}}{1 + e^{\frac{y-\xi}{\theta}}} \qquad \text{for } y \in \mathbb{R} \qquad (4.55)

The Log-logistic distribution, as discussed in Section 4.1.23, is related to this distribution in that if Y is Logistic distributed then e^Y is Log-logistic distributed. Similar to the Normal distribution, the Logistic distribution is defined for all values on ℝ, whilst the focus for the purpose of modeling claims sizes is on distributions with domain on ℝ⁺. Having an understanding of this distribution does, however, provide insight into the properties of the associated Log-logistic distribution.
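The exponential relationship can be checked numerically. In scipy's terms the Log-logistic is the `fisk` distribution, and under the assumption made here e^Y for Y ∼ Logistic(ξ, θ) corresponds to `fisk` with shape c = 1/θ and scale e^ξ (the values of ξ and θ are illustrative):

```python
import numpy as np
from scipy.stats import logistic, fisk

xi, theta = 0.5, 0.4
x = np.linspace(0.05, 20.0, 200)

# P(exp(Y) <= x) = P(Y <= log(x)) for Y ~ Logistic(xi, theta):
cdf_via_logistic = logistic.cdf(np.log(x), loc=xi, scale=theta)
cdf_fisk = fisk.cdf(x, c=1.0 / theta, scale=np.exp(xi))
print(np.max(np.abs(cdf_via_logistic - cdf_fisk)))  # ~0
```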

4.2.3 Dirichlet Distribution

Consider k independent GAM(1, κ_i) distributed random variables Z₁, Z₂, ..., Z_k [11], [76] such that the joint probability density function is given by

f_{Z_1,Z_2,\ldots,Z_k}(z_1, z_2, \ldots, z_k) = \prod_{i=1}^{k} \frac{1}{\Gamma(\kappa_i)}\, z_i^{\kappa_i - 1} e^{-z_i} \quad \text{for } 0 < z_i < \infty

where κ_j > 0 for each j = 1, 2, ..., k.

Let Y_i = Z_i \big/ \sum_{j=1}^{k} Z_j for i = 1, 2, ..., k − 1 and Y_k = \sum_{j=1}^{k} Z_j. Bain and Engelhardt [11] derive the joint probability density function for the Y_i random


variables for a special case where k = 3. The value of Y_i can be interpreted as the size of the ith random variable relative to the sum of the sizes of all random variables. The joint probability density function of Y₁, Y₂, ..., Y_{k−1} is given by:

f_{Y_1,Y_2,\ldots,Y_{k-1}}(y_1, y_2, \ldots, y_{k-1}) = \frac{\Gamma(\kappa_1 + \kappa_2 + \ldots + \kappa_k)}{\Gamma(\kappa_1)\Gamma(\kappa_2)\cdots\Gamma(\kappa_k)} \left(1 - \sum_{j=1}^{k-1} y_j\right)^{\kappa_k - 1} \prod_{j=1}^{k-1} y_j^{\kappa_j - 1} \qquad (4.56)

This joint density function of Y1,Y2, ..., Yk−1 given in (4.56) is referred to as the Dirichlet distribution [11], [52].

In the context of non-life insurance this distribution may be used to model aggregate losses where the individual losses are independent and Gamma distributed. The loss from each claim is assumed to have parameters θ = 1 and κ_j, where κ_j is the parameter associated with the jth claim.
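The gamma construction above is easy to verify by simulation. A sketch (the κ values are illustrative): the normalized components Y_i = Z_i / ΣZ_j should have means κ_i / Σκ_j, a known property of the Dirichlet distribution:

```python
import numpy as np

rng = np.random.default_rng(7)
kappa = np.array([2.0, 3.0, 5.0])
n = 200_000

# Z_i ~ GAM(theta = 1, kappa_i), independent; Y_i = Z_i / sum_j Z_j:
z = rng.gamma(shape=kappa, scale=1.0, size=(n, 3))
y = z / z.sum(axis=1, keepdims=True)

print(y.mean(axis=0))  # approximately kappa / kappa.sum() = [0.2, 0.3, 0.5]
```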


Chapter 5

Tail Behaviour of Parametric Distributions

5.1 Introduction

In Chapter 4 a comprehensive collection of distributions was studied in terms of the distribution functions, their shape, scale and location, together with derivations of the moments. In Chapter 3 useful properties of heavy-tailed distributions were discussed in detail. The aim of this chapter is to consider the distributions introduced in Chapter 4 and to apply the theory developed in Chapter 3 to ascertain which of these distributions have heavy tails and how these tails compare amongst one another. In Chapter 3 the following concepts that describe the properties of distributions in terms of tail weight were introduced:

- Hazard function - given in (3.5) in Definition 11.

- Hazard rate function - given in (3.6) in Definition 12.

- Residual hazard (rate) function - given in (3.7) in Definition 13.

- Mean residual hazard function - given in (3.9) in Definition 14.

- Stochastic order - as per Definition 15.

- Exponentially bounded tails - inequality given in (3.12) as per Defini- tion 19.

From Chapter 3 one can identify criteria that can be utilized in order to classify distributions as heavy-tailed or not. These criteria are summarised in the techniques below:


CHAPTER 5. TAIL BEHAVIOUR OF PARAMETRIC DISTRIBUTIONS 150

Technique 1: As discussed in Section 3.4.1, if only some or if none of the moments of a particular distribution exist, then it is indicative of a distribution that has a heavy tail.

Technique 2: From Definition 18 it follows that a distribution of a random variable X is said to have a heavy tail if MX (s) = ∞ ∀ s > 0.

Technique 3: If \alpha_F = \limsup_{x\to\infty} \frac{h_X(x)}{x} = 0, then it follows from Theorem 2 that the distribution has a heavy tail.

Technique 4: In Section 3.4.5 it is argued that if h_X^*(t) is a decreasing function for increasing values of t, then the distribution has a heavy tail. It is also noted that this condition implies the condition given as Technique 3.

Technique 5: A distribution is said to be contained in the subexponential class of distributions if

\lim_{x\to\infty} \frac{\overline{F}_X^{*2}(x)}{\overline{F}_X(x)} = 2 \quad \text{from Definition 24}

From Theorem 3 it follows that if F_X(·) ∈ S then the distribution is heavy-tailed.

Technique 6: If a cumulative distribution function F_X(·) for a random variable X has a heavy tail, then it follows from (3.25) in Section 3.5.4 that

\frac{var(X)}{(E(X))^2} \ge 1

It is important to note that the converse doesn't necessarily hold.

Technique 7: A distribution is said to have an exponentially bounded tail if there exists an a > 0 and a b > 0 such that

1 - F_X(x) \le a e^{-bx} \quad \forall\ x \ge 0.

In this case the distribution F(·) is said to have a light tail. This follows from Definition 19.

5.2 Classification of Tail Heaviness for Parametric Continuous Distributions

In the following sections each of the distributions as introduced in Chapter 4 will be assessed, based on the criteria as set out in Techniques 1 to 7, to determine whether the distribution has a heavy tail or not.


5.2.1 Gamma and related distributions

In order to derive the hazard function and hazard rate function, the cumulative distribution function is required. An explicit expression for the cumulative distribution function of the Gamma distribution is not available. It is therefore also not possible to assess whether the Gamma distribution is a member of the subexponential class as defined in Definition 24. For this reason, Techniques 1, 2 and 6 will be considered here.

Assessment using Techniques 1 and 2

Consider the moment generating function for a random variable X that is gamma distributed with parameters θ and κ as given in (B.62). The rth moment of X can be derived by evaluating the rth order derivative of the moment generating function where t = 0:

M_X^{(r)}(t) = \theta^r (1 - \theta t)^{-(\kappa + r)} \prod_{j=0}^{r-1} (\kappa + j) \quad \text{for } r = 1, 2, 3, \ldots \qquad (5.1)

From (5.1) it follows that all the moments of X exist (although they grow without bound as r increases, each is finite). Based on these criteria, the Gamma distribution doesn't have a heavy tail.

Assessment using Technique 6

An alternative way to show that the distribution doesn’t have a heavy tail, is by supposing that it does in fact have a heavy tail. Then

\frac{var(X)}{(E(X))^2} = \frac{\kappa\theta^2}{\kappa^2\theta^2} \quad \text{from (B.64) and (B.65)}

= \frac{1}{\kappa} \quad \text{where } \kappa > 0

If κ > 1, then \frac{var(X)}{(E(X))^2} < 1, which contradicts the assertion that a heavy-tailed distribution has a value of at least 1 for this ratio. If κ = 1, the ratio is equal to 1, which does not contradict an assertion of a heavy tail; the technique then provides no evidence against that assertion. If κ < 1 the ratio is larger than 1. This also does not contradict an assertion of a heavy tail, but neither does it prove that the Gamma distribution has a heavy tail for the case where κ < 1.


Conclusion

It can therefore be concluded, based on assessment using techniques 1 and 2 that the Gamma distribution doesn’t have a heavy tail. Because the Exponential and Chi-square distributions are special cases of the Gamma distribution, it follows that neither of these two distributions has a heavy tail.

Figure 5.1: Comparison of the Exponential and Two-parameter Exponential Densities

From Figure 5.1 it can be seen that the Two-parameter Exponential density function is exactly the same as the Exponential density function, just shifted η units to the right. This means that the tail weight of the Two-parameter Exponential distribution will be the same as for the Exponential distribution, implying that it does not have a heavy tail. Alternatively, one can consider the ratio of the variance to the square of the expected value from (B.173) and (B.174):

\frac{var(X)}{(E(X))^2} = \frac{\theta^2}{(\theta + \eta)^2} < 1,

since θ, η > 0, which implies from Technique 6 that the Two-parameter Exponential distribution does not have a heavy tail.


5.2.2 Erlang Distribution

In order to derive the hazard function and hazard rate function, the cumulative distribution function is required. An explicit expression for the cumulative distribution function of the Erlang distribution is not available. It is therefore also not possible to assess whether the Erlang distribution is a member of the subexponential class as defined in Definition 24. For this reason, Techniques 1, 2 and 6 will be considered here.

Assessment using Techniques 1 and 2

For a random variable, X, that is Erlang distributed the moment generating function is given in (B.26) from which it follows that

r−1  (r) n Y −(n+r) MX (t) = λ  (n + j) (λ − t) , j=0 which exists for all values of r ∈ N. It can therefore be concluded that all moments of the Erlang distribution exists which suggests that the distribu- tion doesn’t have a heavy tail.

Assessment using Technique 6

Alternatively, the argument that the distribution does not have a heavy tail can be supported by considering the fact that

\frac{var(X)}{(E(X))^2} = \frac{1}{n} \le 1 \quad \forall\ n \in \mathbb{N}

where E(X) and var(X) are as given in (B.28) and (B.29). This ratio is therefore never larger than 1, and for all values of n > 1 it is always less than 1. If n = 1, then it follows from Section 4.1.5 that the distribution reduces to the Exponential distribution, which does not have a heavy tail.

Conclusion

Based on assessments using Techniques 1, 2 and 6, the Erlang distribution does not have a heavy tail.

5.2.3 Generalized Extreme Value Distribution

The Generalized Extreme Value distribution as it is given in (4.4) incorpo- rates the Frechet, Gumbel and Weibull distributions.


Working with these distributions can be algebraically cumbersome. The focus in this section is therefore on those techniques for assessing whether distributions have heavy tails that are algebraically easier to apply, namely Techniques 3, 4 and 7.

Assessment using Technique 7

Given the form of the Generalized Extreme Value distribution as given by Jenkinson (see (4.4)), it has an exponentially bounded tail for k = 0 (i.e. for the Gumbel distribution) and for k > 0 (i.e. for the Weibull distribution) [105]. For the case where k < 0 (i.e. for the Frechet distribution) this form doesn't have an exponentially bounded tail. It can therefore be concluded at this stage, based on Technique 7, that the Frechet distribution potentially has a heavy tail (since it doesn't have an exponentially bounded tail) while the Gumbel distribution doesn't.

Assessment using Techniques 3 and 4

Consider the hazard rate function of the Frechet distribution, stated here in terms of the parameterization given in Section 4.1.7:

h_X^*(t) = \frac{\beta \left(\frac{\delta}{t-\lambda}\right)^{\beta}}{(t-\lambda)\left(e^{\left(\frac{\delta}{t-\lambda}\right)^{\beta}} - 1\right)} \qquad \text{for } t \ge \lambda

This hazard rate function is a decreasing function of t. Based on Technique 4 (and consequently Technique 3), this suggests that the distribution does have a heavy tail. Klugman et al [75] argue that the Weibull distribution, as the special case where k > 0 in (4.4), is the form for minima. We are interested in knowing whether the Weibull distribution for maxima has a heavy tail. A form for the maxima, related to the version given in (4.16) for minima, is as follows:

\[ F_X(x) = \begin{cases} 1 - e^{-\left(\frac{x+\lambda}{\delta}\right)^{\beta}} & \text{for } x \ge \lambda \\ 0 & \text{otherwise} \end{cases} \]

We then have that the hazard rate function is given by:

\[ h^{*}_{X}(t) = -\frac{d}{dx}\ln\left(1 - F_X(x)\right)\bigg\rvert_{x=t} = \frac{\beta}{\delta}\left(\frac{t+\lambda}{\delta}\right)^{\beta-1} \quad \text{for } t \ge \lambda \]


From this hazard rate function it follows that for any t1 and t2 for which it is true that λ < t1 < t2 we have that:

* If δ > 0 and β > 1 or β < 0, then

\[ \frac{\beta}{\delta}\left(\frac{t_1+\lambda}{\delta}\right)^{\beta-1} < \frac{\beta}{\delta}\left(\frac{t_2+\lambda}{\delta}\right)^{\beta-1} \]

in which case h^{*}_{X}(t) is an increasing function.

* If δ > 0 and 0 < β < 1, then

\[ \frac{\beta}{\delta}\left(\frac{t_1+\lambda}{\delta}\right)^{\beta-1} > \frac{\beta}{\delta}\left(\frac{t_2+\lambda}{\delta}\right)^{\beta-1} \]

in which case h^{*}_{X}(t) is a decreasing function and hence the distribution has a heavy tail.

Similarly, for the two-parameter form of the Weibull distribution for maxima (as given in (4.16)) the distribution has a heavy tail for values of β between 0 and 1. Consequently the Rayleigh distribution does not have a heavy tail, since it is a special case of the two-parameter form of the Weibull distribution with β = 2. This consequence can be stated more formally in terms of the hazard rate function of the Rayleigh distribution, which is given by

\[ h^{*}_{X}(t) = \frac{2t}{\theta^2} \quad \text{for } t \ge 0 \]

from which it follows that for any t_1 and t_2 with t_1 < t_2 we have that h^{*}_{X}(t_1) < h^{*}_{X}(t_2).
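The monotonicity argument above can be verified numerically; a minimal Python sketch (the helper name is ours), using the two-parameter Weibull hazard rate h(t) = (β/δ)(t/δ)^{β−1}:

```python
def weibull_hazard(t, beta, delta):
    # Hazard rate of the two-parameter Weibull: (beta/delta)*(t/delta)^(beta-1)
    return (beta / delta) * (t / delta) ** (beta - 1)

ts = [0.5, 1.0, 2.0, 4.0]
dec = [weibull_hazard(t, 0.5, 1.0) for t in ts]  # 0 < beta < 1: heavy-tail case
inc = [weibull_hazard(t, 2.0, 1.0) for t in ts]  # beta = 2: the Rayleigh case
assert all(a > b for a, b in zip(dec, dec[1:]))  # hazard decreasing
assert all(a < b for a, b in zip(inc, inc[1:]))  # hazard increasing
```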

Conclusion

The Frechet distribution does have a heavy tail. The Weibull distribution (for maxima) does have a heavy tail if β ∈ (0, 1). Neither the Gumbel distribution nor the Rayleigh distribution has a heavy tail.

5.2.4 Pareto Distribution

For the Pareto Type II distribution we will consider Techniques 3, 4 and 6 to assess whether the distribution has a heavy tail. Techniques 1 and 2 require the moment generating function, which is not available, and Techniques 5 and 7 require extensive algebra.


Assessment using Techniques 3 and 4

Using (B.139) and (4.25), with a = θ, from Section 4.1.11 we have for x > 0 that the hazard rate function is given by

\[ h^{*}_{X}(t) = \frac{f_X(t)}{1 - F_X(t)} = \frac{\kappa}{t + \theta} \qquad (5.2) \]

The hazard rate function is a decreasing function of t for positive values of θ and κ, which suggests that the Pareto distribution does have a heavy tail.

Assessment using Technique 6

Using the mean and variance in (B.141) and (B.142), it follows that

\[ \frac{\operatorname{var}(X)}{(\operatorname{E}(X))^2} = \frac{\kappa}{\kappa - 2} \ge 1 \quad \forall\, \kappa > 2. \]

This suggests that the condition for heavy-tailedness is violated for values of κ < 2.

Conclusion

The Pareto distribution does have a heavy tail, but this appears to hold only for values of κ ≥ 2.

5.2.5 Generalized Pareto Distribution

An explicit expression for the cumulative distribution function of the Generalized Pareto distribution is not available. The moment generating function also does not exist, but a general expression for the rth moment is given in (B.75). Hence we consider Techniques 1 and 6.

Assessment using Technique 1

From (B.75) follows that

\[ f_X(x) = \frac{\Gamma(\kappa+\tau)\,\theta^{\kappa}x^{\tau-1}}{\Gamma(\kappa)\Gamma(\tau)(x+\theta)^{\kappa+\tau}} \quad \text{for } x \ge 0 \]

and


\[ \operatorname{E}(X^r) = \theta^{r}\,\frac{\Gamma(\tau+r)\Gamma(\kappa-r)}{\Gamma(\kappa)\Gamma(\tau)} \quad \text{for } -\tau < r < \kappa \]

which suggests that the moments of X only exist for values of r in the interval (−τ, κ). Based on this, it appears as if this distribution may have a heavy tail.

Assessment using Technique 6

Consider the following (using the expressions for the mean and variance given in (B.76) and (B.77)):

\[ \frac{\operatorname{var}(X)}{(\operatorname{E}(X))^2} = \frac{\kappa+\tau-1}{\tau(\kappa-2)} \]

Evaluating the value of this ratio for various coordinates of (κ, τ) showed that this ratio is not always greater than or equal to 1, which suggests that this distribution is not necessarily heavy-tailed.

Limiting tail behaviour relative to the Pareto distribution

Alternatively, consider the limit of the ratio of the probability density function of the Generalized Pareto distribution to that of the Pareto distribution.

\[ \lim_{x\to\infty} \frac{\dfrac{\Gamma(\kappa+\tau)\,\theta^{\kappa}x^{\tau-1}}{\Gamma(\kappa)\Gamma(\tau)(x+\theta)^{\kappa+\tau}}}{\dfrac{\kappa\theta^{\kappa}}{(x+\theta)^{\kappa+1}}} = \frac{\Gamma(\kappa+\tau)}{\kappa\,\Gamma(\kappa)\Gamma(\tau)} > 0 \quad \text{from (B.139) and (B.74)} \]

The limit of this ratio does not tend to infinity. Its reciprocal will also not tend to infinity, but to some constant. From Section 3.4.3 it follows that the Pareto distribution will not necessarily have a heavier tail than the Generalized Pareto distribution; instead the density in the upper tail will be some multiple (as suggested by the ratio tending to a constant) of the density of the Generalized Pareto distribution. The same applies when considering the Generalized Pareto distribution relative to the Pareto distribution.
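This limiting behaviour can be illustrated numerically; in the Python sketch below (function names and parameter values are ours) the density ratio settles at the constant Γ(κ+τ)/(κΓ(κ)Γ(τ)):

```python
import math

def gen_pareto_pdf(x, kappa, tau, theta):
    # Generalized Pareto density of the form in (B.74)
    c = math.gamma(kappa + tau) / (math.gamma(kappa) * math.gamma(tau))
    return c * theta**kappa * x ** (tau - 1) / (x + theta) ** (kappa + tau)

def pareto_pdf(x, kappa, theta):
    # Pareto Type II density of the form in (B.139)
    return kappa * theta**kappa / (x + theta) ** (kappa + 1)

kappa, tau, theta = 3.0, 2.0, 1.5
limit = math.gamma(kappa + tau) / (kappa * math.gamma(kappa) * math.gamma(tau))
ratio = gen_pareto_pdf(1e6, kappa, tau, theta) / pareto_pdf(1e6, kappa, theta)
assert abs(ratio - limit) < 1e-3  # the ratio tends to a finite positive constant
```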


Conclusion

It is true that the Pareto distribution (as a special case of the Generalized Pareto distribution) has a heavy tail, as shown in the previous section (also see Rolski et al [105]). It is also true that there is no evidence that either one of the Pareto or Generalized Pareto distributions has a heavier tail than the other. It can therefore be concluded that, depending on the values of κ and τ, the Generalized Pareto has a heavy tail.

5.2.6 Lognormal Distribution

An explicit expression for the cumulative distribution function of the Lognormal distribution does not exist, and neither does the moment generating function. Since the moment generating function of X does not exist, it is argued that the distribution does have a heavy tail [105]. As an alternative we will consider an assessment using Technique 6.

Assessment using Technique 6

Consider the ratio of the variance to the square of the expected value using (B.128) and (B.129):

\[ \frac{\operatorname{var}(X)}{(\operatorname{E}(X))^2} = e^{\sigma^2} - 1 \qquad (5.3) \]

This ratio will only be larger than 1 if |σ| > 0.83255.
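The threshold 0.83255 is simply √(ln 2), the point at which e^{σ²} − 1 crosses 1; a quick Python check (the helper name is ours):

```python
import math

def lognormal_ratio(sigma):
    # var(X)/(E(X))^2 for the Lognormal equals exp(sigma^2) - 1
    return math.exp(sigma**2) - 1

threshold = math.sqrt(math.log(2))  # ~0.83255
assert abs(threshold - 0.83255) < 1e-5
assert lognormal_ratio(threshold + 0.01) > 1
assert lognormal_ratio(threshold - 0.01) < 1
```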

Conclusion

Based on the non-existence of the moment generating function, the Lognormal distribution does have a heavy tail. The assessment using Technique 6, however, depends on the value of σ: it was shown above that if |σ| < 0.83255, then the condition for heavy-tailedness is violated.

5.2.7 Beta-prime Distribution

Explicit expressions for the cumulative distribution function and the mo- ment generating function do not exist. For this reason we will consider an assessment using Technique 6.


Assessment using Technique 6

Consider the ratio of the variance to the square of the expected value, using the mean and variance given in (B.3) and (B.4):

\[ \frac{\operatorname{var}(X)}{(\operatorname{E}(X))^2} = \frac{(\delta_1+1)(\delta_2-1)}{\delta_1(\delta_2-2)} > \frac{\delta_2-1}{\delta_2-2} > 1 \qquad (5.4) \]

The ratio is greater than 1, which is consistent with the necessary condition for a heavy tail, but this does not prove that the distribution does in fact have a heavy tail.

If we consider the probability density function of the Generalized Pareto distribution as given in (B.74) with θ = 1, κ = δ1 and τ = δ2, then the resulting probability density function is that of the Beta-prime distribution as given in (4.28). This means that the Beta-prime distribution is a special case of the Generalized Pareto distribution. Based on the arguments given in respect of the tail weight of the Generalized Pareto distribution, it is expected that the tail weight of the Beta-prime distribution will depend on the values of δ1 and δ2.

Limiting tail behaviour relative to the Pareto distribution

One can also consider the tail behaviour of the Beta-prime distribution relative to that of the Pareto distribution, for which tail heaviness was confirmed in Section 5.2.4.

Let fX(·) denote the Beta-prime density function as given in (4.28) and fY(·) the Pareto Type II density function as given in (B.139). The ratio of these two density functions is given by:

\[ \frac{f_X(x)}{f_Y(x)} = \frac{\Gamma(\delta_1+\delta_2)}{\kappa\theta^{\kappa}\,\Gamma(\delta_1)\Gamma(\delta_2)} \cdot \frac{x^{\delta_1-1}}{(x+1)^{\delta_1+\delta_2}(x+\theta)^{\kappa+1}} \qquad (5.5) \]

If δ1 > 0, δ2 > 0, κ > 0 and θ > 0, then

\[ x^{\delta_1-1} < x^{\delta_1} < (x+1)^{\delta_1+\delta_2} \quad \forall\, x > 0 \]
\[ (x+1)^{\delta_1+\delta_2} \le (x+1)^{\delta_1+\delta_2}(x+\theta)^{\kappa+1} \quad \forall\, x \ge 1. \]


Hence for x ≥ 1 we have that

\[ \frac{x^{\delta_1-1}}{(x+1)^{\delta_1+\delta_2}(x+\theta)^{\kappa+1}} < 1 \]

from which it follows that for x ≥ 1

\[ \lim_{x\to\infty}\frac{f_X(x)}{f_Y(x)} = \frac{\Gamma(\delta_1+\delta_2)}{\kappa\theta^{\kappa}\,\Gamma(\delta_1)\Gamma(\delta_2)} \lim_{x\to\infty}\frac{x^{\delta_1-1}}{(x+1)^{\delta_1+\delta_2}(x+\theta)^{\kappa+1}} = 0. \]

This implies that the Beta-prime distribution has a lighter tail than the Pareto distribution.
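The limit can be checked numerically by evaluating the ratio (5.5) at increasing x (a Python sketch with illustrative parameter values; the function name is ours):

```python
import math

def ratio_5_5(x, d1, d2, kappa, theta):
    # Ratio (5.5) of the Beta-prime density to the Pareto Type II density
    c = math.gamma(d1 + d2) / (kappa * theta**kappa * math.gamma(d1) * math.gamma(d2))
    return c * x ** (d1 - 1) / ((x + 1) ** (d1 + d2) * (x + theta) ** (kappa + 1))

vals = [ratio_5_5(x, 2.0, 3.0, 2.5, 1.0) for x in (1.0, 10.0, 100.0, 1000.0)]
assert all(a > b for a, b in zip(vals, vals[1:]))  # strictly decreasing
assert vals[-1] < 1e-12                            # tending to zero
```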

Conclusion

From the assessment using Technique 6 no contradiction can be found to the assertion that the Beta-prime distribution has a heavy tail. The limiting tail behaviour of the Beta-prime distribution relative to the Pareto distribution suggests that the Beta-prime distribution has a lighter tail than the Pareto distribution. It can therefore not be confirmed that the Beta-prime has a heavy tail, neither can any contradiction to an assertion of heavy-tailedness be found.

5.2.8 Birnbaum-Saunders Distribution

The cumulative distribution function of the Birnbaum-Saunders distribution is defined in terms of the cumulative distribution function of the standard Normal distribution, as given in (4.29). An explicit closed-form expression for the cumulative distribution function does not exist, hence we will use Technique 6 to show that the ratio of the variance to the square of the expected value forms a contradiction to an assertion that the Birnbaum-Saunders distribution has a heavy tail.

Assessment using Technique 6

The mean and variance, as given in (B.7) and (B.8), can be used to evaluate the ratio of the variance to the square of the expected value:

\[ \frac{\operatorname{var}(T)}{(\operatorname{E}(T))^2} = \frac{\alpha^2\beta^2\left(1+\frac{5}{4}\alpha^2\right)}{\left(\beta+\frac{\alpha^2\beta}{2}\right)^2} = \frac{4\alpha^2+5\alpha^4}{(\alpha^2+2)^2} = 1 + \frac{4(\alpha^4-1)}{(\alpha^2+2)^2} \]


This ratio can take on values greater than or less than 1 as is graphically shown in Figure 5.2 for varying values of α.

Figure 5.2: Ratio of Variance to Square of Expected value for the Birnbaum-Saunders distribution for varying values of α

Conclusion

For the Birnbaum-Saunders distribution there are instances, depending on the value of α, for which the ratio of the variance to the square of the expected value is less than 1, indicating that the distribution cannot have a heavy tail.
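The dependence on α is easy to see directly from the final form of the ratio, 1 + 4(α⁴ − 1)/(α² + 2)², which falls below 1 exactly when α < 1 (a small Python check; the function name is ours):

```python
def bs_ratio(alpha):
    # var(T)/(E(T))^2 for the Birnbaum-Saunders distribution
    return 1 + 4 * (alpha**4 - 1) / (alpha**2 + 2) ** 2

assert bs_ratio(0.5) < 1   # contradicts an assertion of heavy-tailedness
assert bs_ratio(1.0) == 1  # boundary case
assert bs_ratio(2.0) > 1   # no contradiction found
```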

5.2.9 Burr and related distributions

The cumulative distribution function for the Burr distribution is available in an explicit form, but finding the two-fold convolution of the distribution function in order to determine if the distribution is a member of the subexponential class is algebraically cumbersome. An expression for the moment generating function does not exist. We will consider Techniques 1, 3 and 4 to determine whether the Burr distribution has a heavy tail.

Assessment using Technique 1

An expression for the rth moment exists and is given in (B.10). From this expression it follows that the moments only exist for values of r satisfying −c < r < ck. This is indicative of the distribution potentially having a heavy tail.


Assessment using Techniques 3 and 4

Using the cumulative distribution and probability density functions as given in Section 4.1.16 for the Burr Type XII distribution, the hazard rate function can be derived using (B.9) and (4.31):

\[ h^{*}_{X}(t) = \frac{kct^{c-1}}{\alpha^{c}+t^{c}} \]

It is not possible to determine algebraically whether this function is decreas- ing or increasing. We can, however, determine where this function will reach a stationary point. Consider the first order derivative with respect to t:

\[ \frac{d}{dt}h^{*}_{X}(t) = \frac{kc\left(c\alpha^{c}t^{c-2} - \alpha^{c}t^{c-2} - t^{2c-2}\right)}{(\alpha^{c}+t^{c})^2} \]

Figure 5.3: Hazard Rate Function for the Burr Type XII distribution for varying parameter values

Setting \(\frac{d}{dt}h^{*}_{X}(t) = 0\) yields that the function will reach a stationary point at \(t = \alpha(c-1)^{1/c}\). From numerical examples (as shown in Figure 5.3) it can be found that for values smaller than the stationary point the hazard rate function is increasing, and for values larger than this stationary point it is decreasing, suggesting that the distribution has a heavy tail.
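The location of the stationary point can be confirmed numerically (a Python sketch with illustrative parameter values; names are ours):

```python
def burr_hazard(t, k, c, alpha):
    # Hazard rate of the Burr Type XII distribution: k*c*t^(c-1)/(alpha^c + t^c)
    return k * c * t ** (c - 1) / (alpha**c + t**c)

k, c, alpha = 2.0, 3.0, 1.5
t_star = alpha * (c - 1) ** (1.0 / c)  # stationary point alpha*(c-1)^(1/c)
eps = 1e-3
# The hazard peaks at t_star: it is lower just before and just after.
assert burr_hazard(t_star - eps, k, c, alpha) < burr_hazard(t_star, k, c, alpha)
assert burr_hazard(t_star + eps, k, c, alpha) < burr_hazard(t_star, k, c, alpha)
```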


Conclusion

Based on the numerical analysis of the hazard rate function, it was seen that beyond the point \(t = \alpha(c-1)^{1/c}\) the hazard rate function is a monotone decreasing function. A decreasing hazard rate function also implies that αF = 0 - see Section 3.4.5. The Burr distribution therefore has a heavy tail.

5.2.10 Dagum Distribution

An expression for the moment generating function does not exist, hence we will consider Techniques 1, 3 and 4 to determine whether the Dagum distribution has a heavy tail.

Assessment using Technique 1

An expression for the rth moment exists and is given in (B.22). From this expression it follows that the moments only exist for values of r satisfying −pa < r < a. This is indicative of the distribution potentially having a heavy tail.

Assessment using Techniques 3 and 4

Figure 5.4: Hazard Rate Function for the Dagum distribution for varying parameter values


The hazard rate function for a random variable X from the Dagum distri- bution is given by:

\[ h^{*}_{X}(t) = \frac{ab^{a}p\,t^{-(a+1)}\left(1+b^{a}t^{-a}\right)^{-(p+1)}}{1-\left(1+\left(\frac{b}{t}\right)^{a}\right)^{-p}} \qquad (5.6) \]

This function cannot readily be simplified algebraically into a form from which it is clear whether it is increasing or decreasing, and for which values of t. It is therefore recommended to consider a sketch of this function for various values of a, b and p, as shown in Figure 5.4. From this figure it can be seen that for t sufficiently large the hazard rate function is strictly decreasing, which suggests that the Dagum distribution has a heavy tail.
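The tail behaviour of (5.6) can also be checked numerically (a Python sketch with illustrative parameter values; the function name is ours):

```python
def dagum_hazard(t, a, b, p):
    # Hazard rate (5.6): f(t)/(1 - F(t)) for the Dagum distribution
    f = a * b**a * p * t ** (-(a + 1)) * (1 + b**a * t ** (-a)) ** (-(p + 1))
    F = (1 + (b / t) ** a) ** (-p)
    return f / (1 - F)

vals = [dagum_hazard(t, 2.0, 1.0, 0.5) for t in (5.0, 10.0, 50.0, 100.0)]
assert all(x > y for x, y in zip(vals, vals[1:]))  # decreasing in the upper tail
```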

Conclusion

Based on the numerical analysis of the hazard rate function, the Dagum distribution has a heavy tail.

5.2.11 Singh-Maddala Distribution

In Section 4.1.19 it is shown that if a random variable Y follows a Dagum distribution with parameters a, b and q, then X = 1/Y follows a Singh-Maddala distribution with parameters a, b and q. Furthermore, this parameterization of the Singh-Maddala distribution is a special case of the Burr Type XII distribution with parameters c = a, k = q and α = b. Since the Singh-Maddala distribution is a special case of the Burr distribution, it follows directly that the distribution has a heavy tail. We will show that the condition for heavy-tailedness still holds for the Singh-Maddala distribution.

Assessment using Technique 1

An expression for the rth moment exists and is given in (B.154). From this expression it follows that the moments only exist for values of r satisfying −qa < r < a, which is indicative of the distribution potentially having a heavy tail.

Assessment using Techniques 3 and 4

The hazard rate function can be obtained from the expression for the Dagum distribution in (5.6) by replacing t with t^{-1} and b with b^{-1}:

\[ h^{*}_{X}(t) = \frac{qat^{a-1}}{b^{a}+t^{a}} \]


From this expression it can be seen that for t > 1 the numerator will always be less than the denominator. Furthermore, the function will reach a stationary point at \(t = b(a-1)^{1/a}\). This is similar to what one can see for both the Burr and Dagum distributions as discussed in Sections 5.2.9 and 5.2.10. It was argued that the hazard rate function of the Burr Type XII distribution is decreasing for values of t larger than the stationary point; the same argument holds for the hazard rate function of the Singh-Maddala distribution.

Conclusion

The Singh-Maddala distribution has a heavy tail.

5.2.12 Kappa Family of Distributions

In Section 4.1.20 the two- and three-parameter cases are discussed.

Assessment using Technique 1

For the two-parameter case the expression for the rth moment is given in (B.105). From this expression it is evident that the moments exist only for values of r where −1 < r < α. Similarly, for the three-parameter case the expression for the rth moment is given in (B.110), from which it follows that the moments exist only for values of r such that −αθ < r < θ. Since the moments of these distributions only exist for limited values of r, it is suggested that these distributions are heavy-tailed.

Assessment using Techniques 3 and 4

From (B.104), (4.41), (B.109) and (4.42) the hazard rate functions follow:

\[ h^{*}_{X}(t) = \begin{cases} \dfrac{\beta^{\alpha}}{(t^{\alpha}+\beta^{\alpha})\left[(t^{\alpha}+\beta^{\alpha})^{1/\alpha}-t\right]} & \text{for the two-parameter case}\\[2ex] \dfrac{\alpha\theta t^{\alpha\theta-1}/\beta^{\alpha\theta}}{\left[1+\left(\frac{t}{\beta}\right)^{\theta}\right]\left[\left(1+\left(\frac{t}{\beta}\right)^{\theta}\right)^{\alpha}-\left(\frac{t}{\beta}\right)^{\alpha\theta}\right]} & \text{for the three-parameter case} \end{cases} \]

It is not clear, using algebraic manipulation, whether these hazard rate functions are decreasing or increasing, and over which parts of the domain. To assess this, we can use the graphs


Figure 5.5: Hazard Rate Function for the Two-Parameter Kappa distribution for varying parameter values

Figure 5.6: Hazard Rate Function for the Three-Parameter Kappa distribution for varying parameter values

shown in Figures 5.5 and 5.6, respectively. From these figures it can be seen that for large values of t the hazard rate functions are decreasing. A decreasing hazard rate function also implies that αF = 0 - see Section 3.4.5. This suggests that the two- and three-parameter Kappa distributions exhibit heavy tails.


Conclusion

It follows from the assessment results given above that the Kappa family of distributions is heavy-tailed.

5.2.13 Generalized Beta Distribution of the Second Kind

We will use Techniques 1 and 6 to determine whether the distribution has a heavy tail.

Assessment using Technique 1

The expression for the rth moment of X as given in (B.71) is valid for values of r where −pa < r < qa. This means that not all of the moments exist which further indicates that the distribution may potentially have a heavy tail.

Assessment using Technique 6

The variance and the expected value are known and therefore one can consider the ratio of the variance to the square of the expected value:

Figure 5.7: Ratio of Variance to Square of Expected value for the Generalized Beta distribution for varying parameter values


\[ \frac{\operatorname{var}(X)}{(\operatorname{E}(X))^2} = \frac{\Gamma\left(\frac{2}{a}+p\right)\Gamma\left(q-\frac{2}{a}\right)\Gamma(p)\Gamma(q)}{\Gamma\left(\frac{1}{a}+p\right)^{2}\Gamma\left(q-\frac{1}{a}\right)^{2}} - 1 \]

Algebraically it is not possible to determine whether this ratio is less than or greater than 1 and for which parameter values it is. Consider Figure 5.7 where this ratio is presented for varying values of a, p and q. From the graphs in this figure it can be seen that:

- As the values of a increase, with the values of p and q kept fixed, this ratio increases.

- As the values of q increase, with the values of a and p kept fixed, this ratio decreases.

- As the values of p increase, with the values of a and q kept fixed, this ratio decreases.

It can be seen that depending on the combination of the values for a, p and q, this ratio can be less than 1 which contradicts an assertion of heavy- tailedness in those instances.

Conclusion

Based on the assessment using Technique 1 it appears as if the distribution might exhibit tail heaviness. From the assessment using Technique 6 it is evident, though, that there are certain combinations of values for a, p and q for which the distribution definitely does not have a heavy tail.

5.2.14 Log-logistic Distribution

We will use Techniques 1, 3 and 4 to determine whether the Log-logistic distribution has a heavy tail.

Assessment using Technique 1

The expression for the rth moment as given in (B.123) suggests that the moments exist only for values of r satisfying −β < r < β. This suggests that the distribution potentially has a heavy tail.


Assessment using Techniques 3 and 4

The hazard rate function for the Log-logistic distribution can be found using the probability density and cumulative distribution functions as given in (B.122) and (4.43):

\[ h^{*}_{X}(t) = \frac{\beta t^{\beta-1}}{\alpha^{\beta}+t^{\beta}} \]

Figure 5.8: Hazard rate function of Log-logistic distribution for various parameter values

From this expression we can find the point at which the function reaches a stationary point, namely \(t = \alpha(\beta-1)^{1/\beta}\) for values of β ≥ 1. When 0 < β < 1, the hazard rate function is monotone decreasing for all values of t ≥ 0. As can be seen from Figure 5.8, the shape of the function is dependent on the values of α and β. When considering these graphs, especially for values of \(t > \alpha(\beta-1)^{1/\beta}\), the function is decreasing and therefore indicative of the distribution having a heavy tail. A decreasing hazard rate function also implies that αF = 0 - see Section 3.4.5.
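Both regimes can be verified numerically (a Python sketch with illustrative parameter values; the function name is ours):

```python
def loglogistic_hazard(t, alpha, beta):
    # Hazard rate of the Log-logistic: beta*t^(beta-1)/(alpha^beta + t^beta)
    return beta * t ** (beta - 1) / (alpha**beta + t**beta)

# 0 < beta < 1: monotone decreasing everywhere
dec = [loglogistic_hazard(t, 1.0, 0.5) for t in (0.5, 1.0, 2.0, 4.0)]
assert all(a > b for a, b in zip(dec, dec[1:]))

# beta >= 1: decreasing beyond the stationary point t* = alpha*(beta-1)^(1/beta)
alpha, beta = 1.0, 2.0
t_star = alpha * (beta - 1) ** (1.0 / beta)
tail = [loglogistic_hazard(t_star + d, alpha, beta) for d in (1.0, 2.0, 4.0)]
assert all(a > b for a, b in zip(tail, tail[1:]))
```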

Conclusion

From the graphical analysis of the hazard rate function it is evident that the function is decreasing for values of t greater than the stationary point when β ≥ 1. When 0 < β < 1, the function is monotone decreasing for all positive values of t. This suggests that the distribution is heavy-tailed.


5.2.15 Folded Normal Distribution

The Folded Normal distribution is a transformation of the Normal distribution (as discussed in Section 4.1.24) with a probability density function as given in (B.43). The Half Normal distribution is a special case of the Folded Normal with µ = 0. Essentially, the Folded Normal takes a Normal distribution and maps the area of the distribution corresponding to values of X < 0 onto the positive domain (ℝ⁺). This means that generally the density for lower positive values will be higher than for an associated Normal distribution. The extreme case, in which the most is added to the density associated with the upper right tail, is µ = 0, i.e. the Half Normal distribution, in which case the tail density will be double the density of an associated Normal distribution on the same domain. This means that if the Half Normal distribution does not have a heavy tail, the Folded Normal will also not have a heavy tail. We will use Technique 6 to show that the distribution does not have a heavy tail.

Assessment using Technique 6

We will proceed in finding a contradiction to this condition for the Half Normal to argue that the Half Normal doesn’t have a heavy tail and conse- quently the Folded Normal in general doesn’t have a heavy tail. Following from (B.54) and (B.55):

var(X) π = − 1 < 1 ∀ µ ≥ 0, σ > 0 ( E(X))2 2 which indicates that the Half Normal doesn’t have a heavy tail.

Conclusion

It was argued above that the Half Normal distribution is the special case of the Folded Normal with the heaviest tail. Since the Half Normal distribution doesn’t have a heavy tail, it can be concluded that the Folded Normal, in general, does not have a heavy tail.

5.2.16 Inverse Gamma and related distributions

Techniques 1 and 6 will be used to ascertain whether this distribution has a heavy tail.


Assessment using Technique 1

The rth moment of an Inverse Gamma random variable X is given in (B.90) which is only valid for r < α. Since the moments exist only for limited values of r, it is indicative of the distribution potentially being heavy-tailed.

Assessment using Technique 6

Using (B.91) and (B.92), the ratio of the variance to the square of the expected value equals 1/(α − 2), which is

- less than or equal to 0 if α ≤ 2,

- between 0 and 1 if α ≥ 3 and

- larger than 1 if 2 < α < 3.

We have identified values of α that lead to a contradiction of the assertion that the distribution does have a heavy tail.
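The three cases above follow directly from the form 1/(α − 2) (a small Python check; the function name is ours):

```python
def invgamma_ratio(alpha):
    # var(X)/(E(X))^2 for the Inverse Gamma equals 1/(alpha - 2)
    return 1.0 / (alpha - 2.0)

assert invgamma_ratio(1.5) < 0        # alpha < 2: ratio not meaningful
assert invgamma_ratio(2.5) > 1        # 2 < alpha < 3: ratio exceeds 1
assert 0 < invgamma_ratio(4.0) < 1    # alpha > 3: ratio in (0, 1)
```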

It was shown in Section 4.1.26 that if W is Inverse Gamma distributed with parameters ν/2 and θ, then X = W/(2θ) is Inverse Chi-square distributed with parameter ν. Hence the ratio of the variance to the square of the expected value results in 2/(ν − 4), which is

- less than or equal to 0 if ν ≤ 4,

- between 0 and 1 if ν ≥ 6 and

- larger than 1 if 4 < ν < 6.

Conclusion

From the assessment using Technique 1 it appears as if the distribution may have a heavy tail. This cannot be confirmed, though. The assessment using Technique 6 indicated that we can identify values of α for which the assertion of heavy-tailedness is contradicted. It can therefore be concluded that for values of α ∉ (2, 3) the Inverse Gamma distribution does not have a heavy tail.

Similarly, for the Inverse Chi-square distribution we know that for values of ν ∉ (4, 6) the distribution does not have a heavy tail.


5.2.17 Loggamma Distribution

We will consider Techniques 1 and 6 to determine whether the distribution has a heavy tail.

Assessment using Technique 1

The expression for the rth moment as given in (B.114) suggests that the moments exist for all values of r. Therefore there is no indication, based on this technique, of the distribution having a heavy tail.

Assessment using Technique 6

Using (B.116) and (B.115), the ratio of the variance to the square of the expected value can be evaluated:

\[ \frac{\operatorname{var}(X)}{(\operatorname{E}(X))^2} = \left(\frac{(\lambda-1)^{2}}{\lambda(\lambda-2)}\right)^{a} - 1 \]

Figure 5.9: Evaluation of var(X)/(E(X))² for varying values of λ and a

Figure 5.9 presents this ratio for varying values of λ for various values of a. The graph only includes values for λ ≥ 2.5. It can be seen that for larger values of λ this ratio becomes smaller than 1, which indicates that there exist values of λ and a for which the distribution does not have a heavy tail.


Conclusion

When using Technique 6, evaluation of the ratio for various values of a and λ shows that this ratio can have values that are greater than or less than 1 which suggests that there exists parameter values for which this distribution doesn’t have a heavy tail.

5.2.18 Inverse Gaussian Distribution

Techniques 1, 2 and 6 will be used here to determine whether this distribu- tion has a heavy tail.

Assessment using Technique 1

It is stated by Rolski et al [105] that all the positive and negative moments of this distribution exist, which suggests that the distribution does not have a heavy tail.

Assessment using Technique 2

The expression for the moment generating function given in (B.97) is finite, which also suggests that the distribution does not have a heavy tail.

Assessment using Technique 6

The ratio of the variance to the square of the expected value is obtained using (B.99) and (B.100):

\[ \frac{\operatorname{var}(T)}{(\operatorname{E}(T))^2} = \frac{\mu}{\lambda} \]

This ratio is larger than 1 if µ > λ, less than 1 if µ < λ and equal to 1 if µ = λ; for values of µ < λ the condition for heavy-tailedness is therefore contradicted.

Conclusion

From the assessment using Technique 6 we could identify values of µ and λ for which an assertion of heavy-tailedness is contradicted, but the assessments from Techniques 1 and 2 indicate that, irrespective of the values of µ and λ, the distribution does not have a heavy tail.


5.2.19 Snedecor’s F Distribution

Techniques 1 and 6 will be considered here to determine if the F distribution has a heavy tail.

Assessment using Technique 1

The expression for the rth moment is given in (B.165), from which it follows that the moments exist only for values of r ∈ (−ν₁/2, ν₂/2). This suggests that the moments exist only for limited values of r, in which case the distribution potentially has a heavy tail.

Assessment using Technique 6

Consider the ratio of the variance to the square of the expected value:

\[ \frac{\operatorname{var}(X)}{(\operatorname{E}(X))^2} = \frac{\nu_1+2}{\nu_1(\nu_2+2)} \quad \text{from (B.166) and (B.167)} \]

This ratio will be less than 1 for values of ν₁ and ν₂ satisfying the inequality

\[ \nu_1 > \frac{2}{\nu_2+1}. \]

This means that when the parameter values satisfy this inequality, the distribution does not have a heavy tail.

Conclusion

When using Technique 6, algebraic analysis of the ratio of the variance to the square of the expected value indicates that this ratio can have values that are greater than or less than 1 which suggests that there exists parameter values for which this distribution doesn’t have a heavy tail.

5.2.20 Skew-normal Distribution

Consider the parameterization of the Skew-normal distribution as given in Section 4.1.28 in (4.47). We will use Technique 3 together with some of the results discussed in Section 4.1.28 to argue why the Skew-normal distribution does not have a heavy tail.


Assessment using Technique 3

Using the expression for the cumulative distribution as given in (4.48), the hazard rate function for a random variable that is SN(λ) distributed can be obtained as follows:

\[ h_Y(t) = \frac{\phi(t;\lambda)}{1-\Phi(t;\lambda)} = \frac{2\phi(t)\Phi(\lambda t)}{1-2\displaystyle\int_{-\infty}^{t}\int_{-\infty}^{\lambda s}\phi(s)\phi(u)\,du\,ds} = \frac{2\phi(t)\Phi(\lambda t)}{1-2\displaystyle\int_{-\infty}^{t}\phi(s)\Phi(\lambda s)\,ds} \]

This function has an integral in the denominator, which makes it difficult to assess algebraically whether the function is increasing or decreasing. For this assessment we consider sketching this function, making use of numerical integration to calculate the denominator at the various values of t. This is shown in Figure 5.10.

Figure 5.10: Hazard rate function of Skew-normal distribution for various parameter values


From the graphs of the hazard rate function for various values of λ, it is clear that at least for some values of λ the distribution does not have a heavy tail, while the distribution's skewness and tail weight increase as the absolute value of λ increases.
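The numerical-integration approach described above can be sketched as follows (a Python sketch with Simpson's rule standing in for the numerical integration; all function names are ours). For λ = 0 the Skew-normal reduces to the standard Normal, which gives a convenient sanity check:

```python
import math

def phi(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def sn_cdf(t, lam, lo=-10.0, n=4000):
    # Simpson's rule for Phi(t; lam) = 2 * int_{-inf}^{t} phi(s)*Phi(lam*s) ds
    h = (t - lo) / n
    s = phi(lo) * Phi(lam * lo) + phi(t) * Phi(lam * t)
    for i in range(1, n):
        x = lo + i * h
        s += (4 if i % 2 else 2) * phi(x) * Phi(lam * x)
    return 2 * s * h / 3

def sn_hazard(t, lam):
    return 2 * phi(t) * Phi(lam * t) / (1 - sn_cdf(t, lam))

# Sanity check: with lam = 0 the Skew-normal is the standard Normal,
# whose hazard at t = 0 equals phi(0)/(1 - Phi(0)) = 2*phi(0).
assert abs(sn_hazard(0.0, 0.0) - 2 * phi(0.0)) < 1e-6
```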

Conclusion

Whether the distribution has a heavy tail for some values of λ is not clear from our investigation thus far. From Section 4.1.28 follows that as λ tends to infinity, the Skew-normal distribution will tend to the half-normal distri- bution [8]. In Section 5.2.15 it is argued that the half-normal distribution is a special case of the Folded Normal and that the Half Normal doesn’t have a heavy tail. Since it can be seen from Figure 4.55 that for increasing values of λ the tail weight of the Skew-normal increases, it follows that the Skew-normal also doesn’t have a heavy tail as the half-normal is considered to be the extreme case.


Chapter 6

Fitting Parametric Loss Distributions

6.1 Introduction

In this chapter methods will be introduced that can be used to fit the para- metric distributions that were introduced in Chapter 4 to observed data. Algorithms for the distribution fitting are proposed for each distribution. Methods that can be used to simulate from each of the distributions are also discussed. These simulations were used to assess the performance of the fitting algorithms.

The process of fitting parametric distributions to observed data requires that the parameters need to be estimated. Two frequently used techniques include:

- Method-of-moments [11], [75]

- Maximum likelihood estimation [11], [75], [76], [31].

These techniques are defined below.

Definition 29. Method-of-moments Estimation Consider a distribution with parameter vector θ of length p. The method-of-moments estimate of θ is the solution to the p equations [75]

\[ \mu'_{k}(\theta) = \hat{\mu}'_{k}, \quad k = 1, 2, \ldots, p \]

where \(\mu'_k\) denotes the kth moment about the origin as given in Definition 8 and \(\hat{\mu}'_k\) denotes the kth observed moment about the origin, given by \(\hat{\mu}'_k = \frac{1}{n}\sum_{j=1}^{n} x_j^k\).
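As a concrete illustration (a Python sketch rather than the SAS PROC IML programs developed in this study), method-of-moments for a Gamma(k, θ) sample matches the first two moments: since mean = kθ and variance = kθ², solving gives k̂ = mean²/var and θ̂ = var/mean:

```python
import random

random.seed(1)
k_true, theta_true = 3.0, 2.0
xs = [random.gammavariate(k_true, theta_true) for _ in range(100_000)]

m = sum(xs) / len(xs)                        # first sample moment
v = sum((x - m) ** 2 for x in xs) / len(xs)  # sample variance

k_hat = m * m / v     # from mean = k*theta and var = k*theta^2
theta_hat = v / m
assert abs(k_hat - k_true) < 0.1
assert abs(theta_hat - theta_true) < 0.1
```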


Definition 30. Maximum Likelihood Estimation Maximum likelihood estimation requires one to find the values of the parameters that maximise the likelihood function [31], where the likelihood function is given by

\[ L(\beta) = \prod_{i=1}^{n} f_{X_i}(x_i; \beta) \quad (6.1) \]
where β denotes the vector of distribution parameters that need to be estimated. The log-likelihood function is given by

\[ l(\beta) = \ln(L(\beta)) = \sum_{i=1}^{n} \ln\left(f_{X_i}(x_i; \beta)\right). \quad (6.2) \]

The maximum likelihood estimate is given by

\[ \hat{\beta} = \arg\max_{\beta}\, l(\beta). \quad (6.3) \]

Various statistical software packages are available that can be used to fit parametric distributions and to simulate from these distributions. Examples include:

- SAS This is a well-known statistical package offering built-in statistical distributions that can be fitted to observed data. Built-in functions also exist from which observations can be simulated. In this study we used these functions only for the purpose of simulating. The built-in functions were not used to fit distributions, since we considered fitting these distributions from first principles using numerical techniques as described in Section 6.4 below.

- R Becoming increasingly popular amongst practitioners, R provides a distribution fitting package which performs the parameter estimation and produces goodness-of-fit statistics and graphics. Empirical density estimation and smoothing can also be performed. In terms of the variety of parametric distributions for which functionality is available, R compares well with SAS. In general, distributions can be fitted using the built-in function fitdistr() in the MASS package, using various methods including method-of-moments and maximum likelihood estimation.

Alternatively, the fitdistrplus package provides an integrated solution


which facilitates the choice of parametric distribution by means of a skewness-kurtosis plot of candidate distributions, which can then be compared with the skewness and kurtosis of the observed data to assess which of the possible distributions are viable. This package is flexible in terms of various estimation techniques and also provides statistics and graphical results that can be used to evaluate the goodness-of-fit of fitted distributions.

- Mathematica This software package provides possibly the most extensive set of parametric distributions that can be fitted to data, making use of maximum likelihood estimation, method-of-moments estimation and other similar techniques. The functions in the package can also be used for simulation. Furthermore, the package provides comprehensive graphical presentations of probability density functions, cumulative distribution functions and hazard rate functions. In comparison to other packages (such as SAS and R mentioned above), the analysis in Mathematica is much more interactive: settings and options can be adjusted without having to write any syntax, whilst the results are immediately obtained.

6.2 Numerical Optimisation

To maximise the log-likelihood, partial derivatives with respect to each of the parameters should be derived, set equal to 0, and solved to obtain expressions for the maximum likelihood estimators of these parameters. Often it is not possible to find explicit analytic formulae for the different parameters. In such a case a numerical procedure needs to be implemented, such as the Newton-Raphson iterative algorithm. A general updating equation for the Newton-Raphson iterative process [111] is given by
\[ x^{*} = x - \frac{f(x)}{f'(x)} \]
where $x^{*}$ approximates the root of the function $f(x)$. If the parametric distribution has only one parameter, this form can be applied to $f = l'$ to find the value of its parameter β that maximises the log-likelihood function iteratively as follows:

\[ \beta^{(m+1)} = \beta^{(m)} - \left.\frac{l'(\beta)}{l''(\beta)}\right|_{\beta=\beta^{(m)}} \quad \text{for } m = 1, 2, 3, \ldots \quad (6.4) \]
where $\beta^{(m)}$ is the updated estimate of β obtained in the $m$-th step of the iterative procedure.
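As a check of this one-parameter scheme (a Python sketch; the dissertation's own code is SAS PROC IML), consider the Exponential(θ) log-likelihood $l(\theta) = -n\ln\theta - \frac{1}{\theta}\sum x_j$, for which the iteration must converge to the known estimate $\hat{\theta} = \bar{x}$:

```python
import math
import random

random.seed(2)

# Simulated Exponential(theta = 3) sample.
theta_true, n = 3.0, 1000
data = [-theta_true * math.log(random.random()) for _ in range(n)]
S = sum(data)

def dl(theta):    # l'(theta) = -n/theta + S/theta^2
    return -n / theta + S / theta ** 2

def d2l(theta):   # l''(theta) = n/theta^2 - 2S/theta^3
    return n / theta ** 2 - 2 * S / theta ** 3

# Newton-Raphson on l'(theta) = 0, i.e. the updating equation (6.4).
theta = 1.0
for _ in range(100):
    step = dl(theta) / d2l(theta)
    theta -= step
    if abs(step) < 1e-10:
        break

print(theta, S / n)   # the iteration converges to the sample mean
```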


If the parametric distribution has more than one parameter, one can use a Taylor series expansion [111] to write l(β(m)) as follows:

\[ l(\beta^{(m)}) \approx l(\beta) + \frac{\partial l(\beta)}{\partial \beta}^{T}\left(\beta^{(m)} - \beta\right) + \frac{1}{2}\left(\beta^{(m)} - \beta\right)^{T}\left(\frac{\partial^{2} l(\beta)}{\partial \beta\,\partial \beta^{T}}\right)\left(\beta^{(m)} - \beta\right), \]
where $T$ denotes the transpose of the vector.

The aim is to find a value $\beta^{(m)}$ that maximises the log-likelihood. This can be found by taking the partial derivative of the Taylor-expanded log-likelihood function and setting it equal to 0:
\[ \frac{\partial l(\beta^{(m)})}{\partial \beta^{(m)}} \approx \frac{\partial l(\beta)}{\partial \beta} + \left(\frac{\partial^{2} l(\beta)}{\partial \beta\,\partial \beta^{T}}\right)\left(\beta^{(m)} - \beta\right). \]

Now let
\[ \frac{\partial l(\beta^{(m)})}{\partial \beta^{(m)}} = 0, \]
from which it follows that
\[ \beta^{(m)} = \beta - \left(\frac{\partial^{2} l(\beta)}{\partial \beta\,\partial \beta^{T}}\right)^{-1}\frac{\partial l(\beta)}{\partial \beta}, \]
which can be used to set up an updating formula for an iterative estimation procedure as follows:

\[ \beta^{(m+1)} = \beta^{(m)} - \left(\left.\frac{\partial^{2} l(\beta)}{\partial \beta\,\partial \beta^{T}}\right|_{\beta=\beta^{(m)}}\right)^{-1}\left.\frac{\partial l(\beta)}{\partial \beta}\right|_{\beta=\beta^{(m)}}. \quad (6.5) \]
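A minimal Python sketch of the multivariate updating equation (6.5); the Normal(µ, σ) distribution is used here purely as a convenient check, since its maximum likelihood estimates $\hat{\mu} = \bar{x}$ and $\hat{\sigma}^2 = \frac{1}{n}\sum(x_j - \bar{x})^2$ are known in closed form:

```python
import math
import random

random.seed(3)

# Sample from N(mu = 2, sigma = 1.5); a convenient check since the
# maximum likelihood estimates are known in closed form.
n = 2000
data = [random.gauss(2.0, 1.5) for _ in range(n)]

def grad_hess(mu, sigma):
    """Gradient vector and Hessian matrix of the Normal log-likelihood."""
    d = [x - mu for x in data]
    s1, s2 = sum(d), sum(e * e for e in d)
    g = [s1 / sigma**2,
         -n / sigma + s2 / sigma**3]
    H = [[-n / sigma**2,        -2 * s1 / sigma**3],
         [-2 * s1 / sigma**3,   n / sigma**2 - 3 * s2 / sigma**4]]
    return g, H

# Updating equation (6.5): beta_new = beta - H^{-1} g (2x2 solve by hand).
mu, sigma = sum(data) / n, 1.0          # initial values
for _ in range(100):
    g, H = grad_hess(mu, sigma)
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    step_mu = (H[1][1] * g[0] - H[0][1] * g[1]) / det
    step_sg = (H[0][0] * g[1] - H[1][0] * g[0]) / det
    mu, sigma = mu - step_mu, sigma - step_sg
    if abs(step_mu) + abs(step_sg) < 1e-12:
        break

mu_hat = sum(data) / n
s2_hat = sum((x - mu_hat) ** 2 for x in data) / n
print(mu, sigma, mu_hat, math.sqrt(s2_hat))
```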

6.3 Simulating Parametric Loss Distributions

For the distributions discussed in Chapter 4, algorithms that can be used to fit these distributions to real-life data will be derived in Section 6.5. In order to assess the reasonability of these algorithms we simulate data from each of these distributions (with specified parameters), after which the derived algorithm is applied in order to assess the performance of the fitting algorithm. In simulating these distributions we make use of one of three main techniques:


1. Using SAS software's built-in functions to generate random data points from specific parametric distributions. These built-in functions are limited, though. It is important to also consider which parameterization of the distribution is assumed in the built-in function. The simulated data points can easily be assessed in terms of their empirical distributions and empirical moments, which can be compared to their theoretical counterparts. It is generally seen that as the number of simulated data points increases, the simulated distribution and its associated moments tend to be closer to the theoretical counterparts.

2. Using the probability integral transformation. This is a well-known technique whereby the cumulative distribution function of a parametric distribution is set equal to a uniformly distributed random variable on (0, 1). This means that we can simulate uniform random variables and find random values under the proposed parametric distribution by applying the inverse of the cumulative distribution function to the random uniform variable [11]. This technique depends on whether it is possible to find the inverse of the cumulative distribution function algebraically.

3. In many instances neither of the two approaches described above can be used, since no built-in function exists and the inverse of the cumulative distribution function does not exist; in some cases the cumulative distribution function is not even known. The following pragmatic approach is suggested in this case:

- Consider a sufficiently large domain on the real line. For the distributions considered in this study a sufficiently large portion of the positive half of the real line should be considered.
- Partition the chosen portion of the real line into arbitrarily small intervals.
- On each interval calculate the probability density at either the smallest, largest or middle value of the interval.
- Use a numerical integration procedure to calculate a cumulative distribution function for the chosen partitioning.
- Test that the chosen domain together with the partitioning is sufficient, such that the maximum value of the resulting cumulative distribution function is sufficiently close to 1.

The approach outlined above gives a close representation of the true cumulative distribution function. Having this representation available, the probability integral transformation can be used as follows:


- Simulate a random uniform variable u, where 0 ≤ u ≤ 1.
- Use the constructed cumulative distribution function, F(x), to obtain a simulated value from the distribution by finding the value of X as inf{x | u ≤ F(x)}.
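A hedged Python sketch of the grid-based construction above (the dissertation's implementations are in SAS PROC IML); the Gamma(κ = 2.5, θ = 1) density is used as a stand-in for a density whose cumulative distribution function cannot be inverted algebraically:

```python
import bisect
import math
import random

random.seed(4)

# Target density: Gamma(kappa = 2.5, theta = 1), standing in for a density
# whose cdf has no closed-form inverse.
kappa = 2.5
def pdf(x):
    return x ** (kappa - 1.0) * math.exp(-x) / math.gamma(kappa)

# Steps 1-2: a sufficiently large portion of the positive half-line,
# partitioned into small intervals.
step, upper = 0.01, 40.0
grid = [i * step for i in range(1, int(upper / step) + 1)]

# Steps 3-4: evaluate the density on the grid and integrate numerically
# (trapezoidal rule) to build a cumulative distribution function.
cdf, acc, prev = [], 0.0, pdf(0.0)
for x in grid:
    cur = pdf(x)
    acc += 0.5 * (prev + cur) * step
    cdf.append(acc)
    prev = cur

# Step 5: the domain and partitioning are adequate if the cdf ends near 1.
assert cdf[-1] > 0.999999

# Probability integral transform: X = inf{x : u <= F(x)}.
def simulate():
    u = random.random() * cdf[-1]
    return grid[bisect.bisect_left(cdf, u)]

sample = [simulate() for _ in range(20000)]
mean = sum(sample) / len(sample)
print(mean)   # the theoretical mean is kappa * theta = 2.5
```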

Slight variations on the approaches given above are followed, and in some cases a combination of these techniques is used. The technique used in each instance is briefly described in each distribution's respective subsection in Section 6.5.

6.4 Fitting Analysis

In Section 6.5 the following are included for each parametric distribution:

- Derivation of the parameter estimation algorithm.

- A brief description of the simulation technique used to obtain observations from the parametric distribution being discussed.

- Discussion of the performance of the estimation algorithm when applied to the simulated data.

The simulation and the application of these estimation algorithms are facilitated with illustrative SAS programs. These SAS programs are included in Appendix A. A separate program is provided for each parametric distribution and is structured as follows:

1. The program starts off with the simulation component. The simulation is done using one of the techniques as discussed in Section 6.3.

2. A basic univariate analysis is performed using the built-in procedures, which gives insight into how the underlying distribution looks.

3. For the distributions where method-of-moments estimates can be found, this estimation will follow using the results obtained in the univariate analysis.

4. The last part of the SAS code contains the portion necessary to perform maximum likelihood estimation, using the derived estimation algorithms. In some instances this estimation can be performed directly, but in most cases it involves numerical estimation. The iterative numerical estimation procedures are built in the PROC IML environment in SAS and apply the updating equation given in (6.5) until the procedure converges.


6.5 Methods of Fitting Parametric Loss Distributions

6.5.1 Gamma Distribution

Deriving method-of-moments estimators and maximum likelihood estimation algorithm

Using the mean and variance as given in (B.64) and (B.65), the method-of-moments estimates follow directly. The method-of-moments estimators of θ and κ are given in (B.68). It is not clear how to evaluate these estimates' bias. Fisher [53], [30] indicates that the method-of-moments estimates may be inefficient and that maximum likelihood estimates should instead be considered.

The likelihood function in terms of κ and θ follows from (B.61):

\[ L(\kappa, \theta) = \prod_{j=1}^{n} \frac{1}{\theta^{\kappa}\Gamma(\kappa)}\, x_j^{\kappa-1} e^{-x_j/\theta} \]
\[ l(\kappa, \theta) = n\left[\kappa\ln\left(\frac{1}{\theta}\right) - \ln(\Gamma(\kappa))\right] + (\kappa-1)\sum_{j=1}^{n}\ln(x_j) - \frac{1}{\theta}\sum_{j=1}^{n} x_j \quad (6.6) \]

The estimation of κ and θ requires numerical techniques. The Newton-Raphson algorithm [111], [15] can be used to find values of κ and θ that maximize (6.6). The iterative procedure will have an updating formula of the form [30]:

\[ \hat{\kappa}_{k+1} = \hat{\kappa}_k - \frac{\ln(\hat{\kappa}_k) - \Psi(\hat{\kappa}_k) - \ln(\bar{x}) + w}{\frac{1}{\hat{\kappa}_k} - \Psi'(\hat{\kappa}_k)} \quad \text{for } k \in \mathbb{N} \quad (6.7) \]
where
\[ w = \frac{1}{n}\sum_{i=1}^{n}\ln(x_i), \]
from which the estimate of θ can be obtained as
\[ \hat{\theta} = \frac{\bar{x}}{\hat{\kappa}}. \]
The digamma and trigamma functions are defined and used by several authors; see Kotz and Nadarajah [76] for example. Choi and Wette [30] gave

exact formulae for the digamma and trigamma functions as follows:

\[ \Psi(\kappa) = -\gamma - \frac{1}{\kappa} + \kappa\sum_{i=1}^{\infty}\frac{1}{i(i+\kappa)} \quad (6.8) \]
where
\[ \gamma \equiv \text{Euler--Mascheroni constant [76]} \quad (6.9) \]

\[ \Psi'(\kappa) = \sum_{i=0}^{\infty}\left(\frac{1}{i+\kappa}\right)^{2} \quad (6.10) \]

Exact values for $\Psi(\kappa)$ and $\Psi'(\kappa)$ can be found using tables given by Pairman [96]. Software packages, such as SAS and MATLAB, have built-in functions that give values of the digamma and trigamma functions for any valid value of κ. Approximations to the digamma and trigamma functions are given by Jordan [72], [30]; these are obtained by expanding the digamma and trigamma functions as Bernoulli series.

The iterative scheme given in (6.7) is convergent for any positive initial value $\hat{\kappa}_0$ [30]. Furthermore, Choi and Wette conducted a numerical study of the possible bias of the maximum likelihood estimates (for which no analytic formula is known). It is shown that the bias decreases as the sample size increases. Empirical results showed that estimates of κ are generally higher than the true value of κ; that is, $E(\hat{\kappa}) \geq \kappa$. Estimates of θ appear to be very close to the true value of θ, especially for larger sample sizes, and as such the maximum likelihood estimator of θ is considered to be approximately unbiased for larger samples.
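A Python sketch of the scheme (6.7) (the thesis's implementation is in SAS PROC IML, using the built-in digamma and trigamma functions; here they are approximated by the recurrence-plus-asymptotic-series approach mentioned above):

```python
import math
import random

random.seed(5)

def digamma(x):
    """Psi(x): recurrence Psi(x) = Psi(x+1) - 1/x, then an asymptotic
    series once the argument is large enough."""
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    return (r + math.log(x) - 1.0 / (2 * x)
            - 1.0 / (12 * x**2) + 1.0 / (120 * x**4) - 1.0 / (252 * x**6))

def trigamma(x):
    """Psi'(x): recurrence Psi'(x) = Psi'(x+1) + 1/x^2, then an
    asymptotic series."""
    r = 0.0
    while x < 6.0:
        r += 1.0 / x**2
        x += 1.0
    return (r + 1.0 / x + 1.0 / (2 * x**2)
            + 1.0 / (6 * x**3) - 1.0 / (30 * x**5) + 1.0 / (42 * x**7))

# Simulate Gamma(kappa = 3, theta = 2) data; the integer shape lets each
# observation be generated as a sum of three exponentials.
n = 3000
data = [sum(-2.0 * math.log(random.random()) for _ in range(3))
        for _ in range(n)]

xbar = sum(data) / n
w = sum(math.log(x) for x in data) / n      # w = (1/n) sum ln(x_i)

# Iterative scheme (6.7), convergent from any positive starting value.
kappa = 1.0
for _ in range(100):
    step = ((math.log(kappa) - digamma(kappa) - math.log(xbar) + w)
            / (1.0 / kappa - trigamma(kappa)))
    kappa -= step
    if abs(step) < 1e-10:
        break

theta = xbar / kappa
print(kappa, theta)
```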

Simulation technique

SAS code is given in Appendix A.1 which can be used to simulate from a specified Gamma distribution. This approach utilizes the built-in SAS function to generate observations from a Gamma distribution.

Performance of the estimation algorithm

The SAS code in Appendix A.1 also includes the coded iterative algorithm that can be used to fit a Gamma distribution to observed data. Generally, the larger the sample size, the closer the maximum likelihood estimates are to the parameters that were specified for the purpose of the simulation. The method-of-moments estimates of the parameters are used as initial values in the iterative procedure and appear to work well in ensuring convergence of the procedure.


6.5.2 Exponential Distribution

Deriving method-of-moments and maximum likelihood estimators

The method-of-moments estimator, as given in (B.41), can be found by setting $E(X) = \bar{X}$. One can also use maximum likelihood estimation to obtain an estimate for θ as follows:

\[ l(\theta) = \ln(L(\theta)) = \ln\left(\prod_{j=1}^{n}\frac{1}{\theta}e^{-x_j/\theta}\right) = -n\ln(\theta) - \frac{n\bar{x}}{\theta} \quad \text{from (B.34)} \]

Taking the first order derivative of the log-likelihood function with respect to θ and setting it equal to 0 yields the maximum likelihood estimate as given in (B.42). This is the same as the method-of-moments estimate, which is an unbiased estimator of θ in that
\[ E(\hat{\theta}) = E(\bar{X}) = \frac{1}{n}\sum_{j=1}^{n}E(X_j) = \theta. \]

The Exponential distribution is a special case of the Gamma distribution, which suggests that in practice it would be more worthwhile to first fit the Gamma distribution. If the true underlying distribution is an Exponential distribution, then this will be indicated by the maximum likelihood estimate of the κ parameter being very close to 1.

Simulation technique

SAS code is given in Appendix A.2 which can be used to simulate from a specified Exponential distribution. A built-in function in SAS is used to simulate random observations from an Exponential distribution.

Accuracy of estimates

For the Exponential distribution the maximum likelihood estimator of the parameter θ is simply the sample mean, which is an unbiased estimator of θ. Generally, the larger the sample size, the closer the maximum likelihood estimate is to the parameter that was specified for the purpose of the simulation.


6.5.3 Chi-square Distribution

Deriving method-of-moments estimators and maximum likelihood estimation algorithm

The method-of-moments estimate of ν given in (B.20) can be found by setting E(X) equal to the sample mean. This is an unbiased estimator of ν, since $E(\hat{\nu}) = E(\bar{X}) = \nu$.

Since the Chi-square distribution is a special case of the Gamma distribution with $\theta = 2$ and $\kappa = \frac{\nu}{2}$, one can use the log-likelihood function as derived for the Gamma distribution in (6.6):
\[ l(\nu) = -\frac{n\nu}{2}\ln(2) - n\ln\Gamma\left(\frac{\nu}{2}\right) + \left(\frac{\nu}{2}-1\right)\sum_{j=1}^{n}\ln(x_j) - \frac{1}{2}\sum_{j=1}^{n}x_j \]

To maximize the log-likelihood function, one needs to find the solution to $\frac{\partial l}{\partial \nu} = 0$. This can be obtained using the Newton-Raphson iteration algorithm, which takes the form
\[ \hat{\nu}_{k+1} = \hat{\nu}_k - \frac{\partial l/\partial \nu}{\partial^{2} l/\partial \nu^{2}}. \]

One can leverage the methodology given by Choi and Wette [30] for the Gamma distribution in (6.7) by letting $\kappa = \frac{\nu}{2}$, in which case the log-likelihood function reduces to the log-likelihood function as derived for the Gamma distribution with $\theta = 2$:
\[ \hat{\kappa}_{k+1} = \hat{\kappa}_k - \frac{\Psi(\hat{\kappa}_k) - \frac{1}{n}\sum_{j=1}^{n}\ln(x_j) + \ln(2)}{\Psi'(\hat{\kappa}_k)} \quad (6.11) \]

Expression (6.11) can then be used as an iterative algorithm to find the value of κ that maximizes the log-likelihood function, from which one can find the maximum likelihood estimate of ν as $\hat{\nu}_{MLE} = 2\hat{\kappa}_{MLE}$. The bias of this estimate can be studied by means of numerical techniques as discussed by Choi and Wette [30], since no explicit expression is available for the bias of this estimate.

Since the Chi-square distribution is a special case of the Gamma distribution, it would be more worthwhile to instead fit the Gamma distribution. If the true underlying distribution is a Chi-square distribution, then this will be indicated by the maximum likelihood estimate of the θ parameter being very close to 2.


Simulation technique

SAS code is given in Appendix A.3 which can be used to simulate from a specified Chi-square distribution. The relationship between the Chi-square and Gamma distributions is used: the built-in function in SAS for simulating from a Gamma distribution is used to simulate from the Chi-square distribution.

Performance of the estimation algorithm

The SAS code given in Appendix A.3 also includes code for estimating the maximum likelihood estimates using the procedure as described above. The method-of-moments estimate of ν can be used as initial value in the iterative procedure. Generally the larger the sample size, the closer the maximum likelihood estimate is to the parameter that was specified for the purpose of the simulation.

6.5.4 Two-parameter Exponential Distribution

Deriving the maximum likelihood estimators

The likelihood function can be obtained using the probability density function given in (B.171):
\[ L(\theta, \eta) = \frac{1}{\theta^{n}}\exp\left(-\frac{1}{\theta}\left(\sum_{j=1}^{n}x_j - n\eta\right)\right), \]
hence
\[ l(\theta, \eta) = -n\ln(\theta) - \frac{n\bar{x}}{\theta} + \frac{n\eta}{\theta}. \]

Taking the first order derivatives of the log-likelihood function with respect to θ and η, respectively, and setting these derivatives equal to 0 yields
\[ \theta = \bar{x} - \eta \quad \text{and} \quad \frac{\partial l}{\partial \eta} = \frac{n}{\theta} = 0. \]
The second equation has no solution, so these equations do not give any means of finding an estimate of η. Bain and Engelhardt [11] give reasoning on how the maximum likelihood estimate

can be found for η. It is given that the likelihood function is only valid for values of x ≥ η. Also it can be noted that

\[ \sum_{j=1}^{n}(x_j - \eta) \quad (6.12) \]
will be minimized if $\eta = x_{(1)}$, where $x_{(1)}$ is the smallest observed value of X. Consequently the likelihood function will be maximized if $\eta = x_{(1)}$. Hence the maximum likelihood estimators (as also given in (B.177) and (B.178)) are $\hat{\eta} = X_{(1)}$ and $\hat{\theta} = \bar{X} - X_{(1)}$ for η and θ, respectively. To study the bias of these estimators, first consider the distribution of the first order statistic, $X_{(1)}$, of a set of independent, identically distributed random variables $X_1, X_2, \ldots, X_n$.

\[ P(X_{(1)} \le x) = 1 - (1 - F_X(x))^{n} = 1 - e^{-n(x-\eta)/\theta}, \quad \text{where } X \sim EXP(\theta, \eta). \]

It can then be concluded that $X_{(1)} \sim EXP\left(\frac{\theta}{n}, \eta\right)$. Using (B.37) and (B.38), it follows that $E(\hat{\eta}) = \frac{\theta}{n} + \eta$, which means that the maximum likelihood estimator of η is biased. It should be noted that $E(\hat{\eta})$ tends to η as n tends to ∞. Also

\[ E(\hat{\theta}) = E(\bar{X}) - E(X_{(1)}) = \frac{n-1}{n}\theta, \]
which means that this estimator is biased, but $E(\hat{\theta})$ tends to θ as n tends to ∞. Since $E\left(\frac{n}{n-1}\hat{\theta}\right) = \theta$, it follows that $\frac{n}{n-1}\left(\bar{X} - X_{(1)}\right)$ is an unbiased estimator of θ.
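These estimators and the bias correction are simple to verify numerically; a Python sketch (the thesis's own programs are in SAS):

```python
import math
import random

random.seed(6)

# Simulate X = eta + Y with Y ~ EXP(theta), as described in Section 6.5.4.
theta_true, eta_true, n = 2.0, 5.0, 2000
data = [eta_true - theta_true * math.log(1.0 - random.random())
        for _ in range(n)]

# Maximum likelihood estimates: eta_hat = X_(1), theta_hat = Xbar - X_(1).
eta_hat = min(data)
theta_hat = sum(data) / n - eta_hat

# Bias-corrected estimator of theta: (n/(n-1)) * (Xbar - X_(1)).
theta_unbiased = n / (n - 1) * theta_hat

print(eta_hat, theta_hat, theta_unbiased)
```

Note that $\hat{\eta}$ always overestimates η (its bias θ/n is positive), which the seeded run above reflects.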

Simulation technique

The SAS code that can be used to simulate from the Two-parameter Exponential distribution is given in Appendix A.4. It follows from (4.3) that if a random variable X is Two-parameter Exponentially distributed with parameters θ and η, then Y = X − η is Exponentially distributed with parameter θ. This relationship is used to simulate from the Two-parameter Exponential distribution by first simulating Y from an Exponential distribution and then calculating X = Y + η.


Accuracy of estimates

The maximum likelihood estimates can be determined directly from the sample mean and the first (smallest) order statistic. The accuracy of the maximum likelihood estimates improves with increasing sample size.

6.5.5 Erlang Distribution

Using the expression for the mean given in (B.28), the method-of-moments estimate for λ as given by (B.32) can be found. The maximum likelihood estimator can be obtained by taking the first order derivative of the log-likelihood function with respect to λ. The likelihood function can be obtained using (B.25), from which the log-likelihood function follows:

\[ l(\lambda) = nN\ln(\lambda) - N\ln((n-1)!) + (n-1)\sum_{j=1}^{N}\ln(x_j) - \lambda\sum_{j=1}^{N}x_j, \]
where N is the number of observations and n is the integer shape parameter, i.e. each observation is the sum of n exponentially distributed components. Setting the first order derivative of the log-likelihood function with respect to λ equal to 0 yields the maximum likelihood estimator given in (B.33), which is the same as the method-of-moments estimator.
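A Python sketch confirming the estimator $\hat{\lambda} = nN/\sum x_j$ (the thesis's program is in Appendix A.5, in SAS):

```python
import math
import random

random.seed(7)

# Simulate N Erlang(n = 3, lambda = 0.5) observations, each a sum of
# n independent exponentials with rate lambda.
shape_n, lam_true, N = 3, 0.5, 2000
data = [sum(-math.log(random.random()) / lam_true for _ in range(shape_n))
        for _ in range(N)]

# Setting dl/dlambda = nN/lambda - sum(x_j) = 0 gives the estimator,
# which coincides with the method-of-moments estimate.
lam_hat = shape_n * N / sum(data)
print(lam_hat)
```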

Simulation technique

Consider the probability density function of the Gamma distribution as given in (B.61). If we let $\theta = \lambda^{-1}$ and $\kappa = n$, the probability density function of the Erlang distribution, as given in (B.25), is obtained. One can therefore simulate observations from the Erlang distribution using the built-in function in SAS for simulating from a Gamma distribution. The SAS code that can be used is given in Appendix A.5.

Accuracy of estimates

As with the previous distributions discussed, the accuracy of this estimate improves with increasing sample sizes.

6.5.6 Generalized Extreme Value Distributions

In this section we introduce the likelihood function and log-likelihood function for the Generalized Extreme Value distribution, which incorporates the Frechet, Gumbel and Weibull distributions. In Sections 6.5.7, 6.5.10 and


6.5.8 each distribution's specific estimation algorithms, simulation techniques and performance of the estimation algorithms are discussed. Coles [31] expresses the Generalized Extreme Value distribution as follows:

\[ F(x) = \begin{cases} e^{-\left(1+\xi\left(\frac{x-\mu}{\sigma}\right)\right)^{-1/\xi}} & \text{if } \xi \neq 0 \\ e^{-e^{-\frac{x-\mu}{\sigma}}} & \text{if } \xi = 0, \end{cases} \quad (6.13) \]
which is purely a reparameterization of the form given in (4.4) with ξ = −k, µ = ξ and σ = α. Coles gives the log-likelihood functions for the cases where (1) ξ ≠ 0 and (2) ξ = 0. Recall that the non-zero case relates to the Frechet (where ξ > 0) and Weibull (where ξ < 0) distributions, and the zero case relates to the Gumbel distribution. If ξ ≠ 0:

\[ l(\mu, \sigma, \xi) = -m\ln(\sigma) - \sum_{i=1}^{m}\left(1+\xi\left(\frac{x_i-\mu}{\sigma}\right)\right)^{-1/\xi} - \left(1+\frac{1}{\xi}\right)\sum_{i=1}^{m}\ln\left(1+\xi\left(\frac{x_i-\mu}{\sigma}\right)\right) \quad (6.14) \]
and if ξ = 0:

\[ l(\mu, \sigma) = -m\ln(\sigma) - \sum_{i=1}^{m}e^{-\frac{x_i-\mu}{\sigma}} - \sum_{i=1}^{m}\frac{x_i-\mu}{\sigma}. \quad (6.15) \]

The maximisation should be done using numerical techniques, since explicit formulae are not available for the maximum likelihood estimators of these parameters. The general expression for an iterative estimation process is given by (6.5). Coles gives conditions in terms of the true underlying value of the parameter ξ that describe the nature of the estimates and how likely they are to be obtained. We can also translate these in terms of the parameter k as given in (4.4):

- When ξ > −0.5 (that is if k < 0.5): The maximum likelihood estimates are obtainable and exhibit the usual asymptotic results. This will therefore apply to the Frechet and Gumbel distributions and for the Weibull (only if 0 < k < 0.5).

- When −1 < ξ < −0.5 (that is if 0.5 < k < 1): The maximum likelihood estimates are obtainable, but don’t have the standard asymptotic properties. Based on the range of values to which this applies, it can only be applicable to the Weibull distribution.


- When ξ < −1 (that is if k > 1): The maximum likelihood estimates are unlikely to be obtained. Again this can only be applicable to the Weibull distribution.

- When ξ ≤ −0.5 (that is, if k ≥ 0.5): This corresponds to a distribution with a very short bounded upper tail, which is rarely encountered. It should be noted that the case k > 0 refers to the Weibull distribution, and in particular this is a form for minima. We are instead interested in the form for maxima, as discussed in Section 4.1.8.

The estimation of the parameters of the Frechet, Weibull and Gumbel distributions is discussed in the following three sections using the parameterizations discussed in Sections 4.1.7, 4.1.8 and 4.1.10. The links with the form given by Coles are shown, while the log-likelihood expressions given by Coles (see (6.14) and (6.15)) are utilised in deriving iterative estimation processes.

6.5.7 Frechet Distribution

Deriving the maximum likelihood estimation algorithm

Considering the form of the Frechet distribution as given by (4.14) and using the form as given by Coles in (6.13), it can be seen that (4.14) is a reparameterization of (6.13) with $\xi = \frac{1}{\beta}$, $\sigma = \frac{\delta}{\beta}$ and $\mu = \lambda + \delta$.

One can therefore obtain the log-likelihood function in terms of the λ, δ and β parameters from (6.14) as follows:

\[ l(\lambda, \delta, \beta) = l(\mu, \sigma, \xi)\big|_{\mu=\lambda+\delta,\ \sigma=\delta/\beta,\ \xi=\beta^{-1}} = m\ln(\beta) - \sum_{i=1}^{m}\left(\frac{\delta}{x_i-\lambda}\right)^{\beta} - (1+\beta)\sum_{i=1}^{m}\ln(x_i-\lambda) + \beta m\ln(\delta). \]
Taking partial derivatives with respect to λ, δ and β and setting these expressions equal to 0 does not yield any explicit formulae for the maximum likelihood estimators of these parameters. As such, the numerical procedure given in (6.5) can be used. To implement this procedure we need the following vector of first order partial derivatives and the matrix of second order partial derivatives (known as the Hessian matrix):

\[ \frac{\partial l(\beta)}{\partial \beta} = \left(\frac{\partial l(\beta)}{\partial \lambda},\ \frac{\partial l(\beta)}{\partial \delta},\ \frac{\partial l(\beta)}{\partial \beta}\right)^{T} \]


and
\[ \frac{\partial^{2} l(\beta)}{\partial \beta\,\partial \beta^{T}} = \begin{pmatrix} \frac{\partial^{2}l(\beta)}{\partial\lambda^{2}} & \frac{\partial^{2}l(\beta)}{\partial\delta\,\partial\lambda} & \frac{\partial^{2}l(\beta)}{\partial\beta\,\partial\lambda} \\ \frac{\partial^{2}l(\beta)}{\partial\lambda\,\partial\delta} & \frac{\partial^{2}l(\beta)}{\partial\delta^{2}} & \frac{\partial^{2}l(\beta)}{\partial\beta\,\partial\delta} \\ \frac{\partial^{2}l(\beta)}{\partial\lambda\,\partial\beta} & \frac{\partial^{2}l(\beta)}{\partial\delta\,\partial\beta} & \frac{\partial^{2}l(\beta)}{\partial\beta^{2}} \end{pmatrix} \]
where

\[ \frac{\partial l(\beta)}{\partial \lambda} = -\frac{\beta}{\delta}\sum_{i=1}^{m}\left(\frac{\delta}{x_i-\lambda}\right)^{\beta+1} + (1+\beta)\sum_{i=1}^{m}\frac{1}{x_i-\lambda} \]

\[ \frac{\partial l(\beta)}{\partial \delta} = -\beta\delta^{\beta-1}\sum_{i=1}^{m}(x_i-\lambda)^{-\beta} + \frac{\beta m}{\delta} \]

\[ \frac{\partial l(\beta)}{\partial \beta} = \frac{m}{\beta} - \sum_{i=1}^{m}\left(\left(\frac{\delta}{x_i-\lambda}\right)^{\beta} - 1\right)\ln\left(\frac{\delta}{x_i-\lambda}\right) \]

and

\[ \frac{\partial^{2} l(\beta)}{\partial \lambda^{2}} = -\beta(\beta+1)\delta^{-2}\sum_{i=1}^{m}\left(\frac{\delta}{x_i-\lambda}\right)^{\beta+2} + (1+\beta)\sum_{i=1}^{m}\frac{1}{(x_i-\lambda)^{2}} \]

\[ \frac{\partial^{2} l(\beta)}{\partial \delta\,\partial \lambda} = -\beta^{2}\delta^{-2}\sum_{i=1}^{m}\left(\frac{\delta}{x_i-\lambda}\right)^{\beta+1} \]

\[ \frac{\partial^{2} l(\beta)}{\partial \beta\,\partial \lambda} = -\delta^{-1}\sum_{i=1}^{m}\left(\frac{\delta}{x_i-\lambda}\right)^{\beta+1} - \beta\delta^{-1}\sum_{i=1}^{m}\left(\frac{\delta}{x_i-\lambda}\right)^{\beta+1}\ln\left(\frac{\delta}{x_i-\lambda}\right) + \sum_{i=1}^{m}\frac{1}{x_i-\lambda} \]

\[ \frac{\partial^{2} l(\beta)}{\partial \delta^{2}} = -\beta(\beta-1)\delta^{-2}\sum_{i=1}^{m}\left(\frac{\delta}{x_i-\lambda}\right)^{\beta} - \frac{m\beta}{\delta^{2}} \]

\[ \frac{\partial^{2} l(\beta)}{\partial \beta\,\partial \delta} = \frac{m}{\delta} - \delta^{-1}\sum_{i=1}^{m}\left(\frac{\delta}{x_i-\lambda}\right)^{\beta} - \beta\delta^{-1}\sum_{i=1}^{m}\left(\frac{\delta}{x_i-\lambda}\right)^{\beta}\ln\left(\frac{\delta}{x_i-\lambda}\right) \]

\[ \frac{\partial^{2} l(\beta)}{\partial \beta^{2}} = -\frac{m}{\beta^{2}} - \sum_{i=1}^{m}\left(\frac{\delta}{x_i-\lambda}\right)^{\beta}\left(\ln\left(\frac{\delta}{x_i-\lambda}\right)\right)^{2} \]

These expressions can also be obtained using the first order and second order partial derivatives given in terms of an alternative parameterization by Otten and Van Montfort [94].

Simulation technique

This distribution can be simulated using the probability integral transformation, where the random value X from the Frechet distribution can be obtained from a simulated uniformly distributed random variable U as follows:
\[ X = \delta\left(-\ln(U)\right)^{-1/\beta} + \lambda. \]

The SAS code to perform this simulation is given in Appendix A.6.

Performance of estimation algorithm

The SAS code to perform the maximum likelihood estimation is included in Appendix A.6. The iterative procedure reaches convergence, but is very sensitive to the choice of initial values. If initial values are not well set, convergence is often not obtained, whilst the procedure breaks in some instances due to the arguments of the natural log functions becoming negative. If the initial value of λ is not very close to the true value, convergence is difficult to obtain. In order to choose an initial value for λ that is more likely to ensure convergence, we considered the fact that theoretically there exists a relationship between the expected value of X and the parameter values, as given in (B.58). Based on this relationship, a natural choice for the initial value of λ is
\[ \lambda_{0} = \bar{x} - \delta_{0}\Gamma\left(1 - \frac{1}{\beta_{0}}\right), \]
where $\lambda_0$, $\delta_0$ and $\beta_0$ denote the initial values of λ, δ and β. Using this approach to set the initial value of λ appeared to improve the likelihood of convergence of the iterative procedure.

By varying the values of the parameters of the distribution from which the observations were simulated, it was seen that the greater the value of λ relative to the values of β and δ, the more accurate the maximum likelihood estimate of λ whilst the maximum likelihood estimates of β and δ are often less accurate.


Due to the fact that convergence of the iterative procedure may be difficult to obtain, together with the fact that the degree of accuracy of the maximum likelihood estimates may not always be very high, it is very important to consider measures such as quantile plots to assess the goodness-of-fit of the fitted distribution. The use of quantile plots to assess goodness-of-fit is discussed in Section 7.2.2.

6.5.8 Weibull Distribution

Deriving the maximum likelihood estimation algorithm

Considering the form of the Weibull distribution as given by (4.16) and using the form as given by Coles in (6.13), it can be seen that (4.16) is a reparameterization of (6.13) with $\xi = -\beta^{-1}$, $\sigma = \frac{\delta}{\beta}$ and $\mu = \lambda - \delta$.

From Klugman et al. [75] it follows that this form of the Weibull distribution only has support for values of x < λ. This distribution is therefore suitable for modeling minima of distributions.

Let $X_{(1)}, X_{(2)}, X_{(3)}, \ldots, X_{(n)}$ be the order statistics of an observed sample $(X_1, X_2, X_3, \ldots, X_n)$, where $X_1, X_2, X_3, \ldots, X_n$ are independent and identically distributed. If we let $Y_i = -X_i$, then $Y_{(1)} = -X_{(n)}$ is the smallest order statistic of $Y_1, Y_2, Y_3, \ldots, Y_n$. Hence
\[ P(X_{(n)} \le x) = P(-X_{(n)} \ge -x) = P(Y_{(1)} \ge -x) = 1 - P(Y_{(1)} \le -x). \]

This relationship can be used in conjunction with the log-likelihood function as given by Coles for minima [31] in order to get the log-likelihood function for maxima.

Denote $F(x) = P(Y_{(1)} \le x)$ and $H(x) = P(X_{(n)} \le x)$. Furthermore, if we can find the relationship between the likelihood function for maxima and the likelihood function for minima, we can use it to obtain the log-likelihood function for maxima from the function given by Coles in (6.14).

\[ L(\lambda, \delta, \beta) = \prod_{i=1}^{m}h(x_i) = \prod_{i=1}^{m}\frac{d}{dx_i}H(x_i) = \prod_{i=1}^{m}\frac{d}{dx_i}\left(1 - F(-x_i)\right) = \prod_{i=1}^{m}f(-x_i), \]
but $f(-x_i)$ is associated with $F(-x_i)$, where $F(x)$ is a special case of the Generalized Extreme Value distribution as given by (6.13) with $x = -x$, $\xi = -\beta^{-1}$, $\sigma = \delta\beta^{-1}$ and $\mu = \lambda - \delta$. This enables us to use the likelihood

function given by Coles in (6.14) as:

\[ l(\lambda, \delta, \beta) = l(\mu, \sigma, \xi; x)\big|_{x_i=-x_i,\ \xi=-\beta^{-1},\ \sigma=\delta\beta^{-1},\ \mu=\lambda-\delta} = -m\beta\ln(\delta) + m\ln(\beta) - \sum_{i=1}^{m}\left(\frac{x_i+\lambda}{\delta}\right)^{\beta} + (\beta-1)\sum_{i=1}^{m}\ln(x_i+\lambda). \quad (6.16) \]

Taking partial derivatives with respect to λ, δ and β and setting these expressions equal to 0 does not yield any explicit formulae for the maximum likelihood estimators of these parameters. Thus the numerical procedure given in (6.5) can be used. To implement this procedure we need the following vector of first order partial derivatives and the Hessian matrix:

\[ \frac{\partial l(\beta)}{\partial \beta} = \left(\frac{\partial l(\beta)}{\partial \lambda},\ \frac{\partial l(\beta)}{\partial \delta},\ \frac{\partial l(\beta)}{\partial \beta}\right)^{T} \]

and
\[ \frac{\partial^{2} l(\beta)}{\partial \beta\,\partial \beta^{T}} = \begin{pmatrix} \frac{\partial^{2}l(\beta)}{\partial\lambda^{2}} & \frac{\partial^{2}l(\beta)}{\partial\delta\,\partial\lambda} & \frac{\partial^{2}l(\beta)}{\partial\beta\,\partial\lambda} \\ \frac{\partial^{2}l(\beta)}{\partial\lambda\,\partial\delta} & \frac{\partial^{2}l(\beta)}{\partial\delta^{2}} & \frac{\partial^{2}l(\beta)}{\partial\beta\,\partial\delta} \\ \frac{\partial^{2}l(\beta)}{\partial\lambda\,\partial\beta} & \frac{\partial^{2}l(\beta)}{\partial\delta\,\partial\beta} & \frac{\partial^{2}l(\beta)}{\partial\beta^{2}} \end{pmatrix} \]
where

m  β−1 m ∂l(β) β X xi + λ X 1 = − + (β − 1) ∂λ δ δ x + λ i=1 i=1 i m  β ∂l(β) β X xi + λ mβ = − ∂δ δ δ δ i=1

m  β   m ∂l(β) X xi + λ xi + λ X m = − ln + ln(x + λ) − m ln(δ) + ∂β δ δ i β i=1 i=1 and

2 m  β−2 m ∂ l(β) β(β − 1) X xi + λ X = − − (β − 1) (x + λ)−1 ∂λ2 δ2 δ i i=1 i=1


$$\frac{\partial^2 l}{\partial\delta\,\partial\lambda} = \left(\frac{\beta}{\delta}\right)^2\sum_{i=1}^{m}\left(\frac{x_i+\lambda}{\delta}\right)^{\beta-1}$$

$$\frac{\partial^2 l}{\partial\beta\,\partial\lambda} = -\frac{1}{\delta}\sum_{i=1}^{m}\left(\frac{x_i+\lambda}{\delta}\right)^{\beta-1}\left(1 + \beta\ln\left(\frac{x_i+\lambda}{\delta}\right)\right) + \sum_{i=1}^{m}\frac{1}{x_i+\lambda}$$

$$\frac{\partial^2 l}{\partial\delta^2} = \frac{\beta}{\delta^2}\left(m - \frac{\beta+1}{\delta^{\beta}}\sum_{i=1}^{m}(x_i+\lambda)^{\beta}\right)$$

$$\frac{\partial^2 l}{\partial\beta\,\partial\delta} = -\frac{m}{\delta} + \frac{1}{\delta}\sum_{i=1}^{m}\left(\frac{x_i+\lambda}{\delta}\right)^{\beta}\left(1 + \beta\ln\left(\frac{x_i+\lambda}{\delta}\right)\right)$$

$$\frac{\partial^2 l}{\partial\beta^2} = -\frac{m}{\beta^2} - \sum_{i=1}^{m}\left(\frac{x_i+\lambda}{\delta}\right)^{\beta}\left(\ln\left(\frac{x_i+\lambda}{\delta}\right)\right)^2$$
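With the gradient and Hessian available, the update in (6.5) is a standard multivariate Newton-Raphson step. The following Python fragment is an illustration of that update rule only (it is not the thesis's SAS code), applied to a simple concave two-parameter surface with a known maximum:

```python
# A minimal sketch of the multivariate Newton-Raphson update used throughout
# this chapter: theta_new = theta_old - H(theta)^(-1) g(theta).

def newton_raphson(grad, hess, theta, tol=1e-10, max_iter=100):
    """Iterate theta <- theta - H^{-1} g for a two-parameter log-likelihood."""
    for _ in range(max_iter):
        g1, g2 = grad(theta)
        (h11, h12), (h21, h22) = hess(theta)
        det = h11 * h22 - h12 * h21          # invert the 2x2 Hessian directly
        step1 = (h22 * g1 - h12 * g2) / det
        step2 = (-h21 * g1 + h11 * g2) / det
        theta = (theta[0] - step1, theta[1] - step2)
        if abs(step1) + abs(step2) < tol:
            break
    return theta

# Demo on a concave quadratic with known maximum at (2, -1):
# l(a, b) = -(a - 2)^2 - (b + 1)^2
grad = lambda t: (-2.0 * (t[0] - 2.0), -2.0 * (t[1] + 1.0))
hess = lambda t: ((-2.0, 0.0), (0.0, -2.0))
print(newton_raphson(grad, hess, (0.0, 0.0)))  # converges to (2.0, -1.0)
```

For a quadratic surface the step is exact, so convergence takes a single iteration; for the log-likelihoods in this chapter several iterations are needed and convergence depends on the initial values, as discussed in the performance subsections.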

If we let $\tilde{x} = x + \lambda$ and $\delta = \theta$, we have the cumulative distribution function for the two-parameter version of the Weibull distribution as given in (4.16).

In this case the log-likelihood function given in (6.16) simplifies to
$$l(\theta,\beta) = -m\beta\ln(\theta) + m\ln(\beta) - \sum_{i=1}^{m}\left(\frac{\tilde{x}_i}{\theta}\right)^{\beta} + (\beta-1)\sum_{i=1}^{m}\ln(\tilde{x}_i). \quad (6.17)$$
Similar to the three-parameter case of the Weibull distribution, maximization of this likelihood function requires numerical procedures. In order to apply the multivariate iterative procedure given in (6.5), the following is required:

$$\frac{\partial l(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}} = \begin{pmatrix} \dfrac{\partial l}{\partial\theta} \\[1ex] \dfrac{\partial l}{\partial\beta} \end{pmatrix}$$
and
$$\frac{\partial^2 l(\boldsymbol{\beta})}{\partial\boldsymbol{\beta}\,\partial\boldsymbol{\beta}^T} = \begin{pmatrix} \dfrac{\partial^2 l}{\partial\theta^2} & \dfrac{\partial^2 l}{\partial\beta\,\partial\theta} \\[1ex] \dfrac{\partial^2 l}{\partial\theta\,\partial\beta} & \dfrac{\partial^2 l}{\partial\beta^2} \end{pmatrix}$$


where
$$\frac{\partial l}{\partial\theta} = -\frac{m\beta}{\theta} + \frac{\beta}{\theta}\sum_{i=1}^{m}\left(\frac{x_i}{\theta}\right)^{\beta}$$

$$\frac{\partial l}{\partial\beta} = \frac{m}{\beta} + \sum_{i=1}^{m}\ln(x_i) - m\ln(\theta) - \sum_{i=1}^{m}\left(\frac{x_i}{\theta}\right)^{\beta}\ln\left(\frac{x_i}{\theta}\right)$$
and

$$\frac{\partial^2 l}{\partial\theta^2} = \frac{m\beta}{\theta^2} - \beta(\beta+1)\theta^{-(\beta+2)}\sum_{i=1}^{m}x_i^{\beta}$$

$$\frac{\partial^2 l}{\partial\theta\,\partial\beta} = -\frac{m}{\theta} + \frac{1}{\theta}\sum_{i=1}^{m}\left(\frac{x_i}{\theta}\right)^{\beta}\left(1 + \beta\ln\left(\frac{x_i}{\theta}\right)\right)$$

$$\frac{\partial^2 l}{\partial\beta^2} = -\frac{m}{\beta^2} - \sum_{i=1}^{m}\left(\frac{x_i}{\theta}\right)^{\beta}\left(\ln\left(\frac{x_i}{\theta}\right)\right)^2$$

Simulation technique

The probability integral transformation is used to simulate random variables X from this distribution as follows:

$$X = \delta\left(-\ln(1-U)\right)^{\frac{1}{\beta}} - \lambda$$
where $U$ is a uniformly distributed random variable on (0, 1). The SAS code for this simulation is given in Appendix A.7.
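The transformation above can be sketched as follows. This Python fragment is an illustration only (the thesis's own implementation is the SAS code in Appendix A.7), and the parameter values are arbitrary:

```python
import math
import random

def simulate_weibull3(n, delta, beta, lam, rng=random):
    """Probability integral transform for the three-parameter Weibull:
    if U ~ Uniform(0, 1) then X = delta * (-ln(1 - U))**(1/beta) - lam."""
    return [delta * (-math.log(1.0 - rng.random())) ** (1.0 / beta) - lam
            for _ in range(n)]

random.seed(1)
sample = simulate_weibull3(100_000, delta=1.0, beta=2.0, lam=0.0)
# With lam = 0 and delta = 1, E(X) = Gamma(1 + 1/beta) = Gamma(1.5) ~ 0.886,
# so the sample mean should land close to that value.
print(sum(sample) / len(sample))
```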

Performance of estimation algorithm

The iterative procedure described above for finding the maximum likelihood estimates of the three-parameter version of the Weibull distribution appears to have poor convergence properties. This may be attributed to the structural form of the Weibull distribution, where we have that x ≥ λ, which suggests that the true value of λ is some value close to the first (smallest) order statistic. An initial value of λ above its true value is likely to be too far away from the true value, in which case the iterative procedure moves even further away from this true value, and consequently from the true values of β and δ as well.

Since λ is some value close to the smallest observed order statistic, it may be reasonable to set $\hat{\lambda} = X_{(1)}$, in which case the data can be adjusted for this threshold value as follows:
$$\tilde{X} = X - \hat{\lambda} = X - X_{(1)}$$
after which the iterative procedure as given for the two-parameter version of the Weibull distribution may be used. This approach is followed in the SAS code given in Appendix A.7. An estimate for λ is obtained by setting it equal to the first order statistic and then subtracting the first order statistic from all observed values.

The SAS code in Appendix A.7 also includes a section where a regression approach is followed in fitting the Weibull distribution. If the true underlying distribution of a set of observations is the Weibull distribution, then the empirical cumulative distribution function would be a reasonable representation of the theoretical distribution. After adjusting the data based on the observed first order statistic, the problem of fitting a two-parameter Weibull distribution can be translated to a regression problem as follows [105]:
$$\ln(\tilde{X}) = a + b\ln\left(-\ln\left(1-\hat{F}_X(x)\right)\right) \quad (6.18)$$
where
$$\hat{F}_X(x) = \frac{1}{n}\{\text{Number of observations} \leq x\}.$$

This is the result of letting $\hat{F}_X(x) = F_X(x)$; then
$$\hat{F}_X(x) = 1 - e^{-\left(\frac{x}{\theta}\right)^{\beta}}$$
from (4.16), hence
$$\ln(X) = \ln(\theta) + \frac{1}{\beta}\ln\left(-\ln\left(1-\hat{F}_X(x)\right)\right).$$
This means that once we have obtained the estimates of $a$ and $b$ in (6.18), we can obtain the estimates of $\theta$ (or $\delta$ for the three-parameter case) as well as $\beta$ as follows:
$$\hat{\theta} = \hat{\delta} = e^{a} \quad \text{and} \quad \hat{\beta} = \frac{1}{b}.$$

The estimates from the two approaches differ, and in some instances the regression estimates are more accurate. It may therefore be useful to apply the regression approach prior to maximum likelihood estimation and use its estimates as initial values in the iterative procedure.
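The regression recipe can be sketched in a few lines. This is an illustration rather than the thesis's SAS implementation; the plotting position $(i-0.5)/n$ is an assumption used here in place of $i/n$ so that $\hat{F} = 1$ never occurs at the largest observation:

```python
import math
import random

def fit_weibull_regression(data):
    """Least-squares fit of ln(x) = a + b * ln(-ln(1 - F_hat)) over the
    order statistics; returns (theta_hat, beta_hat) = (e^a, 1/b)."""
    xs = sorted(data)
    n = len(xs)
    ys = [math.log(x) for x in xs]
    zs = [math.log(-math.log(1.0 - (i - 0.5) / n)) for i in range(1, n + 1)]
    zbar, ybar = sum(zs) / n, sum(ys) / n
    b = sum((z - zbar) * (y - ybar) for z, y in zip(zs, ys)) / \
        sum((z - zbar) ** 2 for z in zs)
    a = ybar - b * zbar
    return math.exp(a), 1.0 / b

random.seed(2)
# Simulate from a two-parameter Weibull with theta = 2, beta = 1.5 via the
# probability integral transform, then recover the parameters by regression.
data = [2.0 * (-math.log(1.0 - random.random())) ** (1.0 / 1.5)
        for _ in range(5000)]
print(fit_weibull_regression(data))
```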


6.5.9 Rayleigh Distribution

Deriving method-of-moments estimators and maximum likelihood estimation algorithm

The Rayleigh distribution is a special case of the two-parameter Weibull distribution with β = 2. This means that fitting this distribution requires the estimation of only a single parameter. The method-of-moments estimator, given in (B.151), can be obtained by using the expression for the mean in (B.147) and setting it equal to $\bar{x}$. This is an unbiased estimator of θ, since $E(\hat{\theta}) = \theta$.

Maximum likelihood can also be used, in which case we can leverage the log-likelihood function for the two-parameter version of the Weibull distribution, given in (6.17), with β = 2. Setting the first derivative equal to 0 yields:

$$\hat{\theta}^2 = \frac{1}{n}\sum_{i=1}^{n}x_i^2.$$
We have that $E(\hat{\theta}^2) = \frac{4+\pi}{4}\theta^2$, hence
$$\tau(\hat{\theta}) = \frac{4}{4+\pi}\hat{\theta}^2$$
is an unbiased estimator of $\theta^2$. Using the invariance property¹ one can find the unbiased estimator of θ as given in (B.152).

Since the Rayleigh distribution is a special case of the Weibull distribution, it makes sense to first fit the Weibull distribution. If the true underlying distribution is in fact Rayleigh (suggested by an estimate of β close to 2), one can proceed by calculating the maximum likelihood estimate of θ specifically for the Rayleigh distribution as given above.

Simulation technique

Simulation from the Rayleigh distribution can be done using the probability integral transformation as follows:

$$X = \theta\sqrt{-\ln(1-U)}$$
where $U$ is a uniformly distributed random variable on (0, 1).

¹If $\hat{\theta}$ is a maximum likelihood estimator of θ and $u(\theta)$ is a function of θ, then $u(\hat{\theta})$ is a maximum likelihood estimator of $u(\theta)$ [11].
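The simulation and a moment-based estimate can be sketched as follows. This is an illustration (not the Appendix A.8 SAS code) for the parameterization with CDF $1 - e^{-(x/\theta)^2}$; the Appendix B parameterization used in the thesis may scale differently:

```python
import math
import random

def simulate_rayleigh(n, theta, rng=random):
    """X = theta * sqrt(-ln(1 - U)) has CDF 1 - exp(-(x/theta)^2)."""
    return [theta * math.sqrt(-math.log(1.0 - rng.random())) for _ in range(n)]

random.seed(3)
sample = simulate_rayleigh(20_000, theta=3.0)
# Under this parameterization the second sample moment estimates theta^2.
theta_hat = math.sqrt(sum(x * x for x in sample) / len(sample))
print(theta_hat)
```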


Accuracy of estimators

SAS code that can be used to perform the calculation of the method-of- moments and maximum likelihood estimates is given in Appendix A.8. The accuracy of these estimates increases with increasing sample sizes.

6.5.10 Gumbel Distribution

Deriving the maximum likelihood estimation algorithm

The log-likelihood function in (6.15) for the Gumbel distribution is in terms of the parameterization of the distribution function as given by Coles [31]. By recognising that Coles's form is purely a reparameterization of the form given in (4.4) where k = 0, with µ = ξ and σ = α, the log-likelihood function can be written as:
$$l(\xi,\alpha) = l(\mu,\sigma)\big|_{\mu=\xi,\;\sigma=\alpha} = -m\ln(\alpha) - \sum_{i=1}^{m}e^{-\frac{x_i-\xi}{\alpha}} - \frac{1}{\alpha}\sum_{i=1}^{m}(x_i-\xi).$$

By taking the first derivatives with respect to α and ξ and setting these equal to 0 no explicit expressions in terms of α or ξ can be obtained. For this reason a numerical procedure such as the iterative process given by (6.5) can be implemented in which case the vector of first derivatives

$$\frac{\partial l(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}} = \begin{pmatrix} \dfrac{\partial l(\xi,\alpha)}{\partial\xi} \\[1ex] \dfrac{\partial l(\xi,\alpha)}{\partial\alpha} \end{pmatrix}$$
and the Hessian matrix
$$\frac{\partial^2 l(\boldsymbol{\beta})}{\partial\boldsymbol{\beta}\,\partial\boldsymbol{\beta}^T} = \begin{pmatrix} \dfrac{\partial^2 l(\xi,\alpha)}{\partial\xi^2} & \dfrac{\partial^2 l(\xi,\alpha)}{\partial\alpha\,\partial\xi} \\[1ex] \dfrac{\partial^2 l(\xi,\alpha)}{\partial\xi\,\partial\alpha} & \dfrac{\partial^2 l(\xi,\alpha)}{\partial\alpha^2} \end{pmatrix}$$
need to be calculated, where

$$\frac{\partial l(\xi,\alpha)}{\partial\xi} = \frac{m}{\alpha} - \frac{1}{\alpha}\sum_{i=1}^{m}e^{-\frac{x_i-\xi}{\alpha}}$$
and
$$\frac{\partial l(\xi,\alpha)}{\partial\alpha} = -\frac{m}{\alpha} + \frac{1}{\alpha^2}\sum_{i=1}^{m}(x_i-\xi)\left(1 - e^{-\frac{x_i-\xi}{\alpha}}\right)$$
and


$$\frac{\partial^2 l(\xi,\alpha)}{\partial\xi^2} = -\frac{1}{\alpha^2}\sum_{i=1}^{m}e^{-\frac{x_i-\xi}{\alpha}}$$
$$\frac{\partial^2 l(\xi,\alpha)}{\partial\alpha\,\partial\xi} = \frac{1}{\alpha^2}\sum_{i=1}^{m}\left(e^{-\frac{x_i-\xi}{\alpha}} - \frac{1}{\alpha}(x_i-\xi)e^{-\frac{x_i-\xi}{\alpha}} - 1\right)$$
and
$$\frac{\partial^2 l(\xi,\alpha)}{\partial\alpha^2} = \frac{m}{\alpha^2} - \frac{2}{\alpha^3}\sum_{i=1}^{m}(x_i-\xi)\left(1 - e^{-\frac{x_i-\xi}{\alpha}}\right) - \frac{1}{\alpha^4}\sum_{i=1}^{m}(x_i-\xi)^2 e^{-\frac{x_i-\xi}{\alpha}}$$

Simulation technique

To simulate from the Gumbel distribution one can use the probability inte- gral transformation [11] as follows:

$$X = -\alpha\ln(-\ln(U)) + \xi$$
where $U$ is the simulated uniform random variable. The SAS code for this simulation is shown in Appendix A.9.

Performance of estimation algorithm

Generally it is seen that with increasing sample sizes the algorithm converges faster and with improved accuracy. It can also be seen that the larger the value of the α parameter relative to the ξ parameter, the less accurate and less stable the estimate of the ξ parameter becomes.

Convergence of the maximum likelihood estimation algorithm tends to be highly affected by the choice of the initial values of α and ξ. For this reason we determine initial values using a regression approach: we set the theoretical expression for the cumulative distribution function equal to the empirical distribution function $\hat{F}_X(x)$ and rewrite the expression with X as a linear function of $\hat{F}_X(x)$ as follows:

$$X = \xi + \alpha\left(-\ln\left(-\ln\left(\hat{F}_X(x)\right)\right)\right)$$
from which the intercept and coefficient can be solved and serve as initial values for ξ and α, respectively. This approach to finding initial values is also incorporated in the SAS code given in Appendix A.9.
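The initial-value recipe can be sketched as follows. This is illustrative Python, not the Appendix A.9 SAS code; the plotting position $(i-0.5)/n$ is an assumption used to keep $\hat{F}$ strictly inside (0, 1):

```python
import math
import random

def gumbel_initial_values(data):
    """Regress the order statistics on -ln(-ln(F_hat)); the intercept and
    slope serve as initial values for xi and alpha."""
    xs = sorted(data)
    n = len(xs)
    zs = [-math.log(-math.log((i - 0.5) / n)) for i in range(1, n + 1)]
    zbar, xbar = sum(zs) / n, sum(xs) / n
    alpha0 = sum((z - zbar) * (x - xbar) for z, x in zip(zs, xs)) / \
             sum((z - zbar) ** 2 for z in zs)
    xi0 = xbar - alpha0 * zbar
    return xi0, alpha0

random.seed(4)
# Gumbel sample via X = -alpha * ln(-ln(U)) + xi with xi = 5, alpha = 2.
data = [-2.0 * math.log(-math.log(random.random())) + 5.0 for _ in range(5000)]
print(gumbel_initial_values(data))
```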

6.5.11 Pareto Distribution

Deriving the method-of-moments estimators and maximum likeli- hood estimation algorithm

Method-of-moments estimation can be performed using the expressions for the mean and variance of the Pareto Type II distribution as given in (B.141) and (B.142). Setting $E(X) = \bar{x}$ and $\mathrm{var}(X) = s_x^2$ yields the estimators for κ and θ that are given in (B.144) and (B.145), respectively. Estimates for κ and θ can also be obtained using maximum likelihood estimation. The likelihood function can be obtained using the probability density function, given in (B.139), as:

$$L(\kappa,\theta) = \kappa^{n}\theta^{n\kappa}\prod_{j=1}^{n}(x_j+\theta)^{-(\kappa+1)}$$
from which the log-likelihood function follows:

$$l(\kappa,\theta) = n\ln(\kappa) + n\kappa\ln(\theta) - (\kappa+1)\sum_{j=1}^{n}\ln(x_j+\theta).$$

Setting the log-likelihood function's first order partial derivatives with respect to κ and θ equal to 0 does not yield explicit expressions in terms of κ and θ. The maximum likelihood estimates therefore need to be obtained using the Newton-Raphson numerical optimization technique as used in previous sections, for which the vector of first order derivatives
$$\frac{\partial l}{\partial\kappa} = \frac{n}{\kappa} + n\ln(\theta) - \sum_{j=1}^{n}\ln(x_j+\theta) \quad \text{and} \quad \frac{\partial l}{\partial\theta} = \frac{n\kappa}{\theta} - (\kappa+1)\sum_{j=1}^{n}(x_j+\theta)^{-1}$$
is required, as well as the Hessian matrix for which the entries of second order partial derivatives are given by:

$$\frac{\partial^2 l}{\partial\kappa^2} = -\frac{n}{\kappa^2},$$
$$\frac{\partial^2 l}{\partial\theta^2} = -\frac{n\kappa}{\theta^2} + (\kappa+1)\sum_{j=1}^{n}(x_j+\theta)^{-2}, \quad \text{and}$$
$$\frac{\partial^2 l}{\partial\kappa\,\partial\theta} = \frac{n}{\theta} - \sum_{j=1}^{n}(x_j+\theta)^{-1}$$


Simulation technique

To simulate from the Pareto distribution the probability integral transformation can be used [11] by setting the cumulative distribution function of the Pareto distribution (as given in (4.25) with a = θ and k = κ) equal to the uniform random variable and solving for X. This means that we obtain a simulated value for X as follows:
$$X = \theta(1-U)^{-\frac{1}{\kappa}} - \theta$$
where $U$ is the simulated uniform random variable. The SAS code for this simulation is shown in Appendix A.10.
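A minimal illustration of this transform (Python sketch with arbitrary parameter values, not the Appendix A.10 SAS code):

```python
import random

def simulate_pareto2(n, kappa, theta, rng=random):
    """Pareto Type II via the probability integral transform:
    X = theta * (1 - U)**(-1/kappa) - theta."""
    return [theta * (1.0 - rng.random()) ** (-1.0 / kappa) - theta
            for _ in range(n)]

random.seed(5)
sample = simulate_pareto2(200_000, kappa=3.0, theta=2.0)
# For kappa > 1 the Pareto Type II mean is theta / (kappa - 1) = 1 here.
print(sum(sample) / len(sample))
```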

Performance of the estimation algorithm

The method-of-moments estimates for θ and κ can be used as initial values for the iterative algorithm. The algorithm appears to have good convergence properties when the method-of-moments estimates are used as initial values, whilst increasing sample size improves the accuracy of the estimates.

6.5.12 Generalized Pareto Distribution

Deriving the maximum likelihood estimation algorithm

The likelihood function can be obtained from (B.74), from which the log-likelihood follows:
$$l(\kappa,\tau,\theta) = n\ln(\Gamma(\kappa+\tau)) - n\ln(\Gamma(\kappa)) - n\ln(\Gamma(\tau)) + n\kappa\ln(\theta) + (\tau-1)\sum_{j=1}^{n}\ln(x_j) - (\kappa+\tau)\sum_{j=1}^{n}\ln(x_j+\theta).$$

Maximizing the log-likelihood function requires the numerical technique with the iterative updating formula given in (6.5). Hence we need expressions for the first and second order derivatives with respect to κ, τ and θ:
$$\frac{\partial l}{\partial\kappa} = n\Psi(\kappa+\tau) - n\Psi(\kappa) + \sum_{j=1}^{n}\ln\left(\frac{\theta}{x_j+\theta}\right)$$
$$\frac{\partial l}{\partial\tau} = n\Psi(\kappa+\tau) - n\Psi(\tau) + \sum_{j=1}^{n}\ln\left(\frac{x_j}{x_j+\theta}\right)$$
$$\frac{\partial l}{\partial\theta} = \frac{n\kappa}{\theta} - (\kappa+\tau)\sum_{j=1}^{n}(x_j+\theta)^{-1}$$


and

$$\frac{\partial^2 l}{\partial\kappa^2} = n\Psi'(\kappa+\tau) - n\Psi'(\kappa)$$
$$\frac{\partial^2 l}{\partial\tau\,\partial\kappa} = n\Psi'(\kappa+\tau)$$
$$\frac{\partial^2 l}{\partial\theta\,\partial\kappa} = \frac{n}{\theta} - \sum_{j=1}^{n}(x_j+\theta)^{-1}$$
$$\frac{\partial^2 l}{\partial\tau^2} = n\Psi'(\kappa+\tau) - n\Psi'(\tau)$$
$$\frac{\partial^2 l}{\partial\theta\,\partial\tau} = -\sum_{j=1}^{n}(x_j+\theta)^{-1}$$
$$\frac{\partial^2 l}{\partial\theta^2} = -\frac{n\kappa}{\theta^2} + (\kappa+\tau)\sum_{j=1}^{n}(x_j+\theta)^{-2}.$$

Simulation technique

Simulating from this distribution is not straightforward, since random number generation functions for this particular distribution generally do not exist in software packages. Furthermore, a closed-form expression for the cumulative distribution function is not available, which means that the probability integral transformation cannot be used. A solution is proposed here based on the relationship that exists between the Generalized Pareto distribution and the Compound Gamma distribution [45].

Consider a conditional random variable X|β that is Gamma distributed with parameters α and $\beta^{-1}$; the probability density function is given by (B.61) with $\theta = \beta^{-1}$ and $\kappa = \alpha$. Furthermore, suppose β is itself Gamma distributed with parameters γ and δ; the probability density function is given by (B.61) with $\theta = \delta^{-1}$ and $\kappa = \gamma$. The unconditional probability density function of X is then given by:
$$f_X(x) = \frac{1}{\delta B(\alpha,\gamma)}\left(\frac{x}{x+\delta}\right)^{\alpha-1}\left(\frac{\delta}{x+\delta}\right)^{\gamma+1} \quad (6.19)$$
which is the probability density function of a Compound Gamma distribution [45]. If we now let δ = θ, γ = κ and α = τ, then the probability density function in (6.19) simplifies to the probability density function of the Generalized Pareto distribution as given in (B.74). The Generalized Pareto distribution can therefore be viewed as a reparameterization of the Compound Gamma distribution. Hence, to simulate from a Generalized Pareto distribution, the following steps can be followed:


1. Simulate a random β from a $GAM\left(\kappa, \frac{1}{\theta}\right)$ distribution.

2. Simulate a random X from a $GAM\left(\tau, \frac{1}{\beta}\right)$ distribution, where β is the value simulated in the previous step.

3. Repeat the previous two steps n times to obtain a simulated sample of n observations from a Generalized Pareto distribution with parameters θ, κ and τ.

The SAS code that can be used to perform the simulation as described above is given in Appendix A.11.
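The three steps above can be sketched as follows (Python illustration; `random.gammavariate(alpha, beta)`, with `beta` the scale, plays the role of the GAM simulator, and the parameter values are arbitrary):

```python
import random

def simulate_gen_pareto(n, kappa, tau, theta, rng=random):
    """Compound Gamma representation: beta ~ GAM(kappa, 1/theta), then
    X | beta ~ GAM(tau, 1/beta)."""
    out = []
    for _ in range(n):
        beta = rng.gammavariate(kappa, 1.0 / theta)
        out.append(rng.gammavariate(tau, 1.0 / beta))
    return out

random.seed(6)
sample = simulate_gen_pareto(100_000, kappa=4.0, tau=2.0, theta=3.0)
# For kappa > 1, E(X) = tau * theta / (kappa - 1) = 2 here.
print(sum(sample) / len(sample))
```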

Performance of estimation algorithm

Generally the algorithm converges, given that suitable initial values are chosen. The accuracy of the estimates improves with increasing sample size.

6.5.13 Lognormal Distribution

Deriving the maximum likelihood estimators

Using the probability density function as given in (B.126) for the Lognormal distribution, the likelihood function can be obtained from which the log- likelihood function follows as given by:

$$l(\mu,\sigma) = -n\ln(\sigma) - \frac{n}{2}\ln(2\pi) - \sum_{j=1}^{n}\ln(x_j) - \frac{1}{2\sigma^2}\sum_{j=1}^{n}(\ln(x_j)-\mu)^2$$
After taking the first order partial derivatives with respect to µ and σ and setting these derivatives equal to 0, we get the maximum likelihood estimators that are given in (B.131) and (B.132). It follows from Result 2 in Section 4.1.13 that if $X \sim LN(\mu,\sigma^2)$, then $Y = \ln(X) \sim N(\mu,\sigma^2)$, and this relationship enables one to determine the expected values of the maximum likelihood estimators of the Lognormal distribution:
$$E(\hat{\mu}_{MLE}) = \frac{1}{n}\sum_{j=1}^{n}E(\ln(X_j)) = \frac{1}{n}\sum_{j=1}^{n}E(Y_j) = \frac{1}{n}\sum_{j=1}^{n}\mu = \mu$$
and
$$E(\hat{\sigma}^2_{MLE}) = E\left(\frac{1}{n}\sum_{j=1}^{n}\left(\ln(X_j) - \frac{1}{n}\sum_{j=1}^{n}\ln(X_j)\right)^2\right) = E\left(\frac{1}{n}\sum_{j=1}^{n}(Y_j-\bar{Y})^2\right) = \frac{n-1}{n}\sigma^2.$$
This shows that $\hat{\mu}_{MLE}$ is an unbiased estimator of µ, whilst $\hat{\sigma}^2_{MLE}$ is biased but asymptotically unbiased, since $\frac{n-1}{n} \to 1$ as $n \to \infty$.

Simulation technique

The fact that there exists a relationship between the Lognormal and the Normal distribution makes it possible to simulate Lognormal random vari- ables from simulated Normal random variables. This is shown in the SAS code given in Appendix A.12.
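This relationship can be sketched as follows (Python illustration, not the Appendix A.12 SAS code; parameter values are arbitrary):

```python
import math
import random

def simulate_lognormal(n, mu, sigma, rng=random):
    """If Y ~ N(mu, sigma^2) then X = exp(Y) ~ LN(mu, sigma^2)."""
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]

random.seed(7)
sample = simulate_lognormal(50_000, mu=1.0, sigma=0.5)
# The MLEs of mu and sigma^2 are the sample mean and the (1/n)-variance of
# the log-observations, computed directly from the data.
logs = [math.log(x) for x in sample]
mu_hat = sum(logs) / len(logs)
sigma2_hat = sum((y - mu_hat) ** 2 for y in logs) / len(logs)
print(mu_hat, sigma2_hat)
```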

Accuracy of estimates

The maximum likelihood estimates of the parameters of the Lognormal distribution can be calculated directly from sample data with the expressions given in (B.131) and (B.132). The accuracy of these estimates improves with increasing sample size, but with sample sizes as small as 500 these estimates are already very close to the true values when observations are simulated from the distribution.

6.5.14 Beta-prime Distribution

Deriving the maximum likelihood estimation algorithm

The likelihood can be obtained using (B.1) from which the log-likelihood function follows as given here:

$$l(\delta_1,\delta_2) = n\ln(\Gamma(\delta_1+\delta_2)) - n\ln(\Gamma(\delta_1)) - n\ln(\Gamma(\delta_2)) + (\delta_1-1)\sum_{j=1}^{n}\ln(x_j) - (\delta_1+\delta_2)\sum_{j=1}^{n}\ln(x_j+1).$$
Optimizing this log-likelihood using numerical methods requires the first order and second order partial derivatives:
$$\frac{\partial l}{\partial\delta_1} = n\Psi(\delta_1+\delta_2) - n\Psi(\delta_1) + \sum_{j=1}^{n}\ln\left(\frac{x_j}{x_j+1}\right)$$
$$\frac{\partial l}{\partial\delta_2} = n\Psi(\delta_1+\delta_2) - n\Psi(\delta_2) - \sum_{j=1}^{n}\ln(x_j+1).$$


The second order derivatives are given below:

$$\frac{\partial^2 l}{\partial\delta_1^2} = n\Psi'(\delta_1+\delta_2) - n\Psi'(\delta_1)$$
$$\frac{\partial^2 l}{\partial\delta_1\,\partial\delta_2} = n\Psi'(\delta_1+\delta_2)$$
$$\frac{\partial^2 l}{\partial\delta_2^2} = n\Psi'(\delta_1+\delta_2) - n\Psi'(\delta_2).$$

Simulation technique

Similar to the Generalized Pareto distribution, simulation from the Beta-prime distribution is more complex: built-in functions generally do not exist in software packages, and the probability integral transformation cannot be used because a closed-form expression for the cumulative distribution function is not available.

If we consider the probability density function of the Compound Gamma distribution [45] as given in (6.19) and let δ = 1, γ = δ2 and α = δ1, then (6.19) reduces to the probability density function for the Beta-prime distri- bution as given in (B.1). Hence we can use the same simulation technique as derived for the Generalized Pareto distribution. In terms of the Beta-prime distribution this will entail the following:

1. Simulate β from a $GAM(\delta_2, 1)$ distribution.

2. Simulate an X from a $GAM\left(\delta_1, \frac{1}{\beta}\right)$ distribution.

3. Repeat n times to obtain a simulated sample with n observations.

The SAS code to simulate observations from a Beta-prime distribution using this algorithm is given in Appendix A.13.
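Sketched in Python (illustrative, not the Appendix A.13 SAS code; arbitrary parameter values):

```python
import random

def simulate_beta_prime(n, d1, d2, rng=random):
    """beta ~ GAM(d2, 1), then X | beta ~ GAM(d1, 1/beta) gives a
    Beta-prime(d1, d2) observation."""
    out = []
    for _ in range(n):
        beta = rng.gammavariate(d2, 1.0)
        out.append(rng.gammavariate(d1, 1.0 / beta))
    return out

random.seed(8)
sample = simulate_beta_prime(100_000, d1=2.0, d2=4.0)
# For d2 > 1 the Beta-prime mean is d1 / (d2 - 1) = 2/3 here.
print(sum(sample) / len(sample))
```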

Performance of estimation algorithm

In some instances this algorithm does not converge, especially for samples with observed values that are large in magnitude. Since this distribution is a special case of the Generalized Pareto distribution, one can first fit the Generalized Pareto distribution and then assess the parameter estimates to determine whether a Beta-prime distribution is in fact implied.


6.5.15 Birnbaum-Saunders Distribution

Deriving the maximum likelihood estimation algorithm

The probability density function, given in (B.6), can be used to derive the likelihood and log-likelihood functions. These functions are given by Lemonte et al [78]. The log-likelihood function proposed is given by:

$$l(\alpha,\beta) \propto -n\ln(\alpha) - \frac{1}{2}\alpha^{-2}\sum_{j=1}^{n}\left(\sqrt{\frac{t_j}{\beta}} - \sqrt{\frac{\beta}{t_j}}\right)^2 + \sum_{j=1}^{n}\ln\left(t_j^{-\frac{3}{2}}\beta^{\frac{1}{2}} + \beta^{-\frac{1}{2}}t_j^{-\frac{1}{2}}\right).$$

The optimization of this function needs to be done numerically. We will consider using the iterative approach as given in (6.5) for which the first and second order partial derivatives are required. The first order partial derviatives are given by:

$$\frac{\partial l(\alpha,\beta)}{\partial\alpha} = -\frac{n}{\alpha} + \frac{1}{\alpha^3}\sum_{j=1}^{n}\left(\sqrt{\frac{t_j}{\beta}} - \sqrt{\frac{\beta}{t_j}}\right)^2$$
$$\frac{\partial l(\alpha,\beta)}{\partial\beta} = -\frac{1}{2\alpha^2}\sum_{j=1}^{n}\left(\frac{1}{t_j} - \frac{t_j}{\beta^2}\right) + \frac{1}{2\beta}\sum_{j=1}^{n}\frac{\beta-t_j}{\beta+t_j}.$$

and

$$\frac{\partial^2 l(\alpha,\beta)}{\partial\alpha^2} = \frac{n}{\alpha^2} - \frac{3}{\alpha^4}\sum_{j=1}^{n}\left(\sqrt{\frac{t_j}{\beta}} - \sqrt{\frac{\beta}{t_j}}\right)^2$$
$$\frac{\partial^2 l(\alpha,\beta)}{\partial\beta^2} = -\frac{1}{\alpha^2}\sum_{j=1}^{n}\frac{t_j}{\beta^3} - \frac{1}{2\beta^2}\sum_{j=1}^{n}\frac{\beta-t_j}{\beta+t_j} + \frac{1}{\beta}\sum_{j=1}^{n}\frac{t_j}{(\beta+t_j)^2}$$
and
$$\frac{\partial^2 l}{\partial\alpha\,\partial\beta} = \frac{1}{\alpha^3}\sum_{j=1}^{n}\left(\frac{1}{t_j} - \frac{t_j}{\beta^2}\right).$$

Simulation technique

The relationship between the Normal distribution and the Birnbaum-Saunders distribution as given in (4.30) can be used to simulate random variables from the Birnbaum-Saunders distribution. The SAS code utilising this simulation technique is given in Appendix A.14.
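The standard Normal representation commonly used for this purpose is $T = \beta\left(\frac{\alpha Z}{2} + \sqrt{\left(\frac{\alpha Z}{2}\right)^2 + 1}\right)^2$ with $Z \sim N(0,1)$; we restate it here from the general Birnbaum-Saunders literature as an assumption, since (4.30) itself is not reproduced in this section. A Python sketch:

```python
import math
import random

def simulate_birnbaum_saunders(n, alpha, beta, rng=random):
    """T = beta * (alpha*Z/2 + sqrt((alpha*Z/2)^2 + 1))^2 with Z ~ N(0, 1)."""
    out = []
    for _ in range(n):
        w = alpha * rng.gauss(0.0, 1.0) / 2.0
        out.append(beta * (w + math.sqrt(w * w + 1.0)) ** 2)
    return out

random.seed(9)
sample = simulate_birnbaum_saunders(50_000, alpha=0.5, beta=2.0)
# T is an increasing function of Z, so the median of T is beta (at Z = 0).
sample.sort()
print(sample[len(sample) // 2])
```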


Performance of estimation algorithm

The iterative algorithm appears to converge easily and is not overly sensitive to the choice of the initial values of α and β.

6.5.16 Burr and related distributions

In this section the maximum likelihood estimation of the parameters and the simulation of the Burr and Singh-Maddala distributions are discussed.

Deriving the maximum likelihood estimation algorithm

Using the probability density function as given in (4.32), the likelihood function can be derived in order to estimate the distribution parameters using maximum likelihood estimation. From this function the log-likelihood function follows:
$$l(k,c,\alpha) = n\ln(k) + n\ln(c) - n\ln(\alpha) - (k+1)\sum_{j=1}^{n}\ln\left(1 + \left(\frac{x_j}{\alpha}\right)^{c}\right) + (c-1)\sum_{j=1}^{n}\ln\left(\frac{x_j}{\alpha}\right).$$

The log-likelihood function can be optimized numerically using (6.5). The first order partial derivatives with respect to k, c and α are given by:

$$\frac{\partial l}{\partial k} = \frac{n}{k} - \sum_{j=1}^{n}\ln\left(1+\left(\frac{x_j}{\alpha}\right)^{c}\right)$$
$$\frac{\partial l}{\partial c} = \frac{n}{c} + \sum_{j=1}^{n}\ln\left(\frac{x_j}{\alpha}\right) - (k+1)\sum_{j=1}^{n}\frac{\left(\frac{x_j}{\alpha}\right)^{c}\ln\left(\frac{x_j}{\alpha}\right)}{1+\left(\frac{x_j}{\alpha}\right)^{c}}$$
$$\frac{\partial l}{\partial\alpha} = -\frac{n}{\alpha} + \frac{c(k+1)}{\alpha}\sum_{j=1}^{n}\frac{x_j^{c}}{\alpha^{c}+x_j^{c}} - \frac{n(c-1)}{\alpha}$$
and
$$\frac{\partial^2 l}{\partial k^2} = -\frac{n}{k^2}$$
$$\frac{\partial^2 l}{\partial c\,\partial k} = -\sum_{j=1}^{n}\frac{x_j^{c}\ln\left(\frac{x_j}{\alpha}\right)}{\alpha^{c}+x_j^{c}}$$
$$\frac{\partial^2 l}{\partial\alpha\,\partial k} = \frac{c}{\alpha}\sum_{j=1}^{n}\frac{x_j^{c}}{\alpha^{c}+x_j^{c}}$$


$$\frac{\partial^2 l}{\partial c^2} = -\frac{n}{c^2} - (k+1)\sum_{j=1}^{n}\frac{\left(\frac{x_j}{\alpha}\right)^{c}\left(\ln\left(\frac{x_j}{\alpha}\right)\right)^2}{\left(1+\left(\frac{x_j}{\alpha}\right)^{c}\right)^2}$$
$$\frac{\partial^2 l}{\partial\alpha\,\partial c} = -\frac{n}{\alpha} + \frac{k+1}{\alpha}\sum_{j=1}^{n}\frac{x_j^{c}}{x_j^{c}+\alpha^{c}}\left(1 + \frac{c\,\alpha^{c}\ln\left(\frac{x_j}{\alpha}\right)}{x_j^{c}+\alpha^{c}}\right)$$
and
$$\frac{\partial^2 l}{\partial\alpha^2} = \frac{nc}{\alpha^2} - \frac{c(k+1)}{\alpha^2}\sum_{j=1}^{n}\frac{x_j^{c}}{\alpha^{c}+x_j^{c}}\left(1 + \frac{c\,\alpha^{c}}{\alpha^{c}+x_j^{c}}\right).$$

In Section 4.1.19 it was shown that the Singh-Maddala distribution is a special case of the Burr Type XII distribution. Therefore the same procedure for fitting the Burr distribution can be followed without having to make an assumption upfront of whether the suitable distribution is the Burr Type XII or in fact one of its special cases.

Simulation technique

The probability integral transformation can be used to simulate from the Burr or any of its related distributions. The use of this technique is illus- trated in the SAS code given in appendix A.15.
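For the Burr Type XII form used above, inverting $F(x) = 1 - \left(1+(x/\alpha)^c\right)^{-k}$ gives $X = \alpha\left((1-U)^{-1/k} - 1\right)^{1/c}$; a Python sketch follows (illustrative, not the Appendix A.15 SAS code; arbitrary parameter values):

```python
import random

def simulate_burr12(n, k, c, alpha, rng=random):
    """Probability integral transform for the Burr Type XII distribution."""
    return [alpha * ((1.0 - rng.random()) ** (-1.0 / k) - 1.0) ** (1.0 / c)
            for _ in range(n)]

random.seed(10)
sample = simulate_burr12(50_000, k=2.0, c=3.0, alpha=1.5)
# The median solves (1 + (x/alpha)^c)^(-k) = 0.5:
median = 1.5 * (0.5 ** (-1.0 / 2.0) - 1.0) ** (1.0 / 3.0)  # theoretical
sample.sort()
print(sample[len(sample) // 2], median)
```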

Performance of estimation algorithm

Various simulations with varying parameter values were considered. For large values of c the iterative algorithm fails to converge, and in some cases it moves towards combinations of values for c, k and α that result in arguments to the log functions that are not permissible. For larger values of k the accuracy of the estimation decreases and the algorithm appears less likely to converge.

6.5.17 Dagum Distribution

Deriving the maximum likelihood estimation algorithm

The probability density function for a random variable X from the Dagum distribution is given by (B.21), from which the likelihood function and in turn the log-likelihood function can be obtained. This likelihood is proposed by Domma [44] and is given here in terms of the parameterization as introduced in Section 4.1.17:
$$l(a,b,p) = n\ln(a) + na\ln(b) + n\ln(p) - (a+1)\sum_{j=1}^{n}\ln(x_j) - (p+1)\sum_{j=1}^{n}\ln\left(1 + b^{a}x_j^{-a}\right)$$

This function needs to be maximized using numerical techniques; therefore the first and second order partial derivatives with respect to a, b and p need to be obtained [44]. These partial derivatives are as given by Domma:

$$\frac{\partial l}{\partial a} = \frac{n}{a} + n\ln(b) - \sum_{j=1}^{n}\ln(x_j) - (p+1)\sum_{j=1}^{n}\frac{b^{a}}{x_j^{a}+b^{a}}\ln\left(\frac{b}{x_j}\right)$$
$$\frac{\partial l}{\partial b} = \frac{na}{b} - \frac{a(p+1)}{b}\sum_{j=1}^{n}\frac{b^{a}}{x_j^{a}+b^{a}}$$
$$\frac{\partial l}{\partial p} = \frac{n}{p} - \sum_{j=1}^{n}\ln\left(1+\left(\frac{b}{x_j}\right)^{a}\right)$$
and

$$\frac{\partial^2 l}{\partial a^2} = -\frac{n}{a^2} - (p+1)\sum_{j=1}^{n}\frac{b^{a}x_j^{a}\left(\ln\left(\frac{b}{x_j}\right)\right)^2}{\left(x_j^{a}+b^{a}\right)^2}$$
$$\frac{\partial^2 l}{\partial b\,\partial a} = \frac{n}{b} - \frac{p+1}{b}\sum_{j=1}^{n}\frac{b^{a}}{x_j^{a}+b^{a}}\left(1 + \frac{a\,x_j^{a}\ln\left(\frac{b}{x_j}\right)}{x_j^{a}+b^{a}}\right)$$
$$\frac{\partial^2 l}{\partial p\,\partial a} = -\sum_{j=1}^{n}\frac{b^{a}}{x_j^{a}+b^{a}}\ln\left(\frac{b}{x_j}\right)$$
$$\frac{\partial^2 l}{\partial b^2} = -\frac{na}{b^2} + \frac{a(p+1)}{b^2}\sum_{j=1}^{n}\frac{b^{a}}{x_j^{a}+b^{a}}\left(1 - \frac{a\,x_j^{a}}{x_j^{a}+b^{a}}\right)$$
$$\frac{\partial^2 l}{\partial p\,\partial b} = -\frac{a}{b}\sum_{j=1}^{n}\frac{b^{a}}{x_j^{a}+b^{a}}$$
$$\frac{\partial^2 l}{\partial p^2} = -\frac{n}{p^2}.$$


Simulation technique

One can simulate from the Dagum distribution using the probability integral transformation as follows:

$$X = b\left(U^{-\frac{1}{p}} - 1\right)^{-\frac{1}{a}}$$
where $U$ is a Uniform random variable defined on the domain (0, 1). The SAS code that can be used to simulate observations from a Dagum distribution using the probability integral transformation is given in Appendix A.16.
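Sketched in Python (illustrative, not the Appendix A.16 SAS code; arbitrary parameter values):

```python
import random

def simulate_dagum(n, a, b, p, rng=random):
    """Probability integral transform: X = b * (U**(-1/p) - 1)**(-1/a)."""
    return [b * (rng.random() ** (-1.0 / p) - 1.0) ** (-1.0 / a)
            for _ in range(n)]

random.seed(11)
sample = simulate_dagum(50_000, a=2.0, b=1.0, p=3.0)
# Setting U = 0.5 in the transform gives the theoretical median:
median = 1.0 * (0.5 ** (-1.0 / 3.0) - 1.0) ** (-1.0 / 2.0)
sample.sort()
print(sample[len(sample) // 2], median)
```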

Performance of estimation algorithm

The iterative algorithm does not possess strong convergence properties. Considering multiple combinations of parameters, simulating from these distributions and then attempting to fit them yielded convergence very rarely. The algorithm is also very sensitive to the choice of initial values for the parameters. In the cases where convergence is obtained, accuracy improves only with very large sample sizes. These properties potentially make this distribution less likely to be successful in practical applications.

6.5.18 Generalized Beta Distribution of the Second Kind

Deriving the maximum likelihood estimation algorithm

The likelihood function can be obtained using the probability density func- tion as given by (B.70) from which the log-likelihood function, as given here, follows:

$$l(a,b,p,q) = n\ln(a) - nap\ln(b) + n\ln(\Gamma(p+q)) - n\ln(\Gamma(p)) - n\ln(\Gamma(q)) + (ap-1)\sum_{j=1}^{n}\ln(x_j) - (p+q)\sum_{j=1}^{n}\ln\left(1+\left(\frac{x_j}{b}\right)^{a}\right)$$

This log-likelihood function needs to be maximized numerically using (6.5). The vector of first order partial derivatives is required which contains the following partial derivatives:

$$\frac{\partial l}{\partial a} = \frac{n}{a} - np\ln(b) + p\sum_{j=1}^{n}\ln(x_j) - (p+q)\sum_{j=1}^{n}\frac{x_j^{a}}{b^{a}+x_j^{a}}\ln\left(\frac{x_j}{b}\right)$$
$$\frac{\partial l}{\partial b} = -\frac{nap}{b} + \frac{a(p+q)}{b}\sum_{j=1}^{n}\frac{x_j^{a}}{b^{a}+x_j^{a}}$$


$$\frac{\partial l}{\partial p} = -na\ln(b) + n\Psi(p+q) - n\Psi(p) + a\sum_{j=1}^{n}\ln(x_j) - \sum_{j=1}^{n}\ln\left(1+\left(\frac{x_j}{b}\right)^{a}\right)$$
$$\frac{\partial l}{\partial q} = n\Psi(p+q) - n\Psi(q) - \sum_{j=1}^{n}\ln\left(1+\left(\frac{x_j}{b}\right)^{a}\right).$$

We also need the Hessian matrix with the following entries:

$$\frac{\partial^2 l}{\partial a^2} = -\frac{n}{a^2} - (p+q)\sum_{j=1}^{n}\frac{x_j^{a}b^{a}\left(\ln\left(\frac{x_j}{b}\right)\right)^2}{\left(b^{a}+x_j^{a}\right)^2}$$
$$\frac{\partial^2 l}{\partial b\,\partial a} = -\frac{np}{b} + (p+q)\sum_{j=1}^{n}\frac{x_j^{a}}{\left(b^{a}+x_j^{a}\right)^2}\,a\,b^{a-1}\ln\left(\frac{x_j}{b}\right) + \frac{p+q}{b}\sum_{j=1}^{n}\frac{x_j^{a}}{b^{a}+x_j^{a}}$$
$$\frac{\partial^2 l}{\partial p\,\partial a} = -n\ln(b) + \sum_{j=1}^{n}\ln(x_j) - \sum_{j=1}^{n}\frac{x_j^{a}}{b^{a}+x_j^{a}}\ln\left(\frac{x_j}{b}\right)$$
$$\frac{\partial^2 l}{\partial q\,\partial a} = -\sum_{j=1}^{n}\frac{x_j^{a}}{b^{a}+x_j^{a}}\ln\left(\frac{x_j}{b}\right)$$

$$\frac{\partial^2 l}{\partial b^2} = \frac{nap}{b^2} - a^2 b^{a-2}(p+q)\sum_{j=1}^{n}\frac{x_j^{a}}{\left(b^{a}+x_j^{a}\right)^2} - \frac{a(p+q)}{b^2}\sum_{j=1}^{n}\frac{x_j^{a}}{b^{a}+x_j^{a}}$$
$$\frac{\partial^2 l}{\partial p\,\partial b} = -\frac{na}{b} + \frac{a}{b}\sum_{j=1}^{n}\frac{x_j^{a}}{b^{a}+x_j^{a}}$$
$$\frac{\partial^2 l}{\partial q\,\partial b} = \frac{a}{b}\sum_{j=1}^{n}\frac{x_j^{a}}{b^{a}+x_j^{a}}$$

$$\frac{\partial^2 l}{\partial p^2} = n\left(\Psi'(p+q) - \Psi'(p)\right)$$
$$\frac{\partial^2 l}{\partial q\,\partial p} = n\Psi'(p+q)$$
$$\frac{\partial^2 l}{\partial q^2} = n\left(\Psi'(p+q) - \Psi'(q)\right)$$

The algorithm as given in (6.5) can now be used.


Simulation technique

To simulate from this distribution, a technique similar to the one used for the Generalized Pareto and Beta-prime distributions can be applied. If the probability density function of the Compound Gamma distribution [45] as given in (6.19) is considered and we let $X = Y^a$, $\delta = b^a$, $\gamma = q$ and $\alpha = p$, then the resulting probability density function of Y is the same as the probability density function for the Generalized Beta distribution of the Second Kind as given in (B.70). Hence, to simulate from this distribution, the following steps can be followed:

1. Simulate a β from a $GAM\left(q, \frac{1}{b^a}\right)$ distribution.

2. Using the simulated value of β from the previous step, simulate an X from a $GAM\left(p, \frac{1}{\beta}\right)$ distribution.

3. Obtain $Y = X^{\frac{1}{a}}$.

4. Repeat steps (1) to (3) n times to obtain a simulated sample of size n from a Generalized Beta distribution of the Second Kind.

The SAS code to perform the simulation as described above is given in Appendix A.17.
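The steps above can be sketched in Python (illustrative, not the Appendix A.17 SAS code; arbitrary parameter values):

```python
import math
import random

def simulate_gb2(n, a, b, p, q, rng=random):
    """beta ~ GAM(q, 1/b^a); X | beta ~ GAM(p, 1/beta); Y = X**(1/a)."""
    out = []
    for _ in range(n):
        beta = rng.gammavariate(q, 1.0 / b ** a)
        x = rng.gammavariate(p, 1.0 / beta)
        out.append(x ** (1.0 / a))
    return out

random.seed(12)
sample = simulate_gb2(100_000, a=2.0, b=1.0, p=2.0, q=3.0)
# For aq > 1, E(Y) = b * Gamma(p + 1/a) * Gamma(q - 1/a) / (Gamma(p) * Gamma(q)).
mean_theory = math.gamma(2.5) * math.gamma(2.5) / (math.gamma(2.0) * math.gamma(3.0))
print(sum(sample) / len(sample), mean_theory)
```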

Performance of estimation algorithm

Various parameter values to simulate from, together with various combinations of initial values for the iterative procedure, were considered. Generally it was seen that convergence is difficult to obtain, even when initial values are chosen close to the true values and the sample size is very large. It is therefore expected that it might be difficult to fit this distribution to real-life data.

6.5.19 Kappa Family of Distributions

Deriving the maximum likelihood estimation algorithm

Consider the cumulative distribution function of the three-parameter case, as given in (4.42), and let $\theta = \alpha$ and $\alpha = \alpha^{-1}$; the cumulative distribution function then simplifies to the cumulative distribution function for the two-parameter case (which is given in (Kappa2CDF)). This means that we need a distribution fitting algorithm for the three-parameter case only, from which we can assess whether a two-parameter parameterization is implied based on the maximum likelihood estimates of α and θ.

The likelihood function follows directly from (B.109) while the associated log-likelihood is given by:

$$l(\alpha,\beta,\theta) = n\ln(\alpha) + n\ln(\theta) - n\alpha\theta\ln(\beta) + (\alpha\theta-1)\sum_{j=1}^{n}\ln(x_j) - (\alpha+1)\sum_{j=1}^{n}\ln\left(1+\left(\frac{x_j}{\beta}\right)^{\theta}\right).$$

To find the values of α, β and θ that maximises this log-likelihood function, we need to first consider the first order partial derivatives with respect to α, β and θ, respectively.

$$\frac{\partial l}{\partial\alpha} = \frac{n}{\alpha} - n\theta\ln(\beta) + \theta\sum_{j=1}^{n}\ln(x_j) - \sum_{j=1}^{n}\ln\left(1+\left(\frac{x_j}{\beta}\right)^{\theta}\right)$$
$$\frac{\partial l}{\partial\beta} = -\frac{n\alpha\theta}{\beta} + \frac{\theta(\alpha+1)}{\beta}\sum_{j=1}^{n}\frac{x_j^{\theta}}{\beta^{\theta}+x_j^{\theta}}$$
$$\frac{\partial l}{\partial\theta} = \frac{n}{\theta} - n\alpha\ln(\beta) + \alpha\sum_{j=1}^{n}\ln(x_j) - (\alpha+1)\sum_{j=1}^{n}\frac{x_j^{\theta}}{x_j^{\theta}+\beta^{\theta}}\ln\left(\frac{x_j}{\beta}\right).$$

By setting these partial derivatives equal to 0, explicit expressions in terms of α, β and θ cannot be obtained from which the maximum likelihood esti- mates can directly be calculated from. We therefore also need to derive the second order partial derivatives which are required to estimate these param- eters numerically using (6.5). The second order partial derivatives are given below:

$$\frac{\partial^2 l}{\partial\alpha^2} = -\frac{n}{\alpha^2}$$
$$\frac{\partial^2 l}{\partial\beta\,\partial\alpha} = -\frac{n\theta}{\beta} + \frac{\theta}{\beta}\sum_{j=1}^{n}\frac{x_j^{\theta}}{\beta^{\theta}+x_j^{\theta}}$$
$$\frac{\partial^2 l}{\partial\theta\,\partial\alpha} = -n\ln(\beta) + \sum_{j=1}^{n}\ln(x_j) - \sum_{j=1}^{n}\frac{x_j^{\theta}}{\beta^{\theta}+x_j^{\theta}}\ln\left(\frac{x_j}{\beta}\right)$$
and


$$\frac{\partial^2 l}{\partial\beta^2} = \frac{n\alpha\theta}{\beta^2} - \frac{\theta(\alpha+1)}{\beta^2}\sum_{j=1}^{n}\frac{x_j^{\theta}}{\beta^{\theta}+x_j^{\theta}}\left(1 + \frac{\theta\beta^{\theta}}{\beta^{\theta}+x_j^{\theta}}\right)$$
$$\frac{\partial^2 l}{\partial\theta\,\partial\beta} = -\frac{n\alpha}{\beta} + \frac{\alpha+1}{\beta}\sum_{j=1}^{n}\frac{x_j^{\theta}}{\beta^{\theta}+x_j^{\theta}}\left(1 + \frac{\theta\beta^{\theta}\ln\left(\frac{x_j}{\beta}\right)}{\beta^{\theta}+x_j^{\theta}}\right)$$

$$\frac{\partial^2 l}{\partial\theta^2} = -\frac{n}{\theta^2} - (\alpha+1)\sum_{j=1}^{n}\frac{x_j^{\theta}\beta^{\theta}\left(\ln\left(\frac{x_j}{\beta}\right)\right)^2}{\left(\beta^{\theta}+x_j^{\theta}\right)^2}.$$

Simulation technique

The probability integral transformation is used to simulate from this distri- bution using the following expression:

$$X = \beta\left(U^{-\frac{1}{\alpha}} - 1\right)^{-\frac{1}{\theta}}$$
where $U$ is a uniform random variable on the domain (0, 1).
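Sketched in Python (illustrative; arbitrary parameter values):

```python
import random

def simulate_kappa3(n, alpha, beta, theta, rng=random):
    """Probability integral transform:
    X = beta * (U**(-1/alpha) - 1)**(-1/theta)."""
    return [beta * (rng.random() ** (-1.0 / alpha) - 1.0) ** (-1.0 / theta)
            for _ in range(n)]

random.seed(13)
sample = simulate_kappa3(50_000, alpha=2.0, beta=1.0, theta=3.0)
# Setting U = 0.5 in the transform gives the theoretical median:
median = 1.0 * (0.5 ** (-1.0 / 2.0) - 1.0) ** (-1.0 / 3.0)
sample.sort()
print(sample[len(sample) // 2], median)
```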

Performance of estimation algorithm

It was seen that convergence is not always guaranteed, even when initial values are chosen close to the true values and despite having the sample size very large. Thus it may be difficult to fit this distribution in practice.

6.5.20 Log-logistic Distribution

Deriving the maximum likelihood estimation algorithm

The probability density function for a random variable X from a Log-logistic distribution is given by (B.122). Using this function, the log-likelihood in terms of its parameters, α and β, can be expressed as follows:

$$l(\alpha,\beta) = n\ln(\beta) + \beta\sum_{j=1}^{n}\ln\left(\frac{x_j}{\alpha}\right) - \sum_{j=1}^{n}\ln(x_j) - 2\sum_{j=1}^{n}\ln\left(1+\left(\frac{x_j}{\alpha}\right)^{\beta}\right).$$


The maximum likelihood estimates are obtained numerically, hence the first and second order partial derivatives of the likelihood with respect to α and β are required in order to make use of (6.5). The first order partial derivatives are given by the following expressions:

$$\frac{\partial l}{\partial\alpha} = -\frac{n\beta}{\alpha} + \frac{2\beta}{\alpha}\sum_{j=1}^{n}\frac{x_j^{\beta}}{\alpha^{\beta}+x_j^{\beta}}$$
$$\frac{\partial l}{\partial\beta} = \frac{n}{\beta} + \sum_{j=1}^{n}\ln(x_j) - n\ln(\alpha) - 2\sum_{j=1}^{n}\frac{x_j^{\beta}}{\alpha^{\beta}+x_j^{\beta}}\ln\left(\frac{x_j}{\alpha}\right).$$

The second order partial derivatives are given below:

$$\frac{\partial^2 l}{\partial\alpha^2} = \frac{n\beta}{\alpha^2} - \frac{2\beta}{\alpha^2}\sum_{j=1}^{n}\frac{x_j^{\beta}}{\alpha^{\beta}+x_j^{\beta}} - \frac{2\beta^2}{\alpha^2}\sum_{j=1}^{n}\frac{x_j^{\beta}\alpha^{\beta}}{\left(\alpha^{\beta}+x_j^{\beta}\right)^2}$$

2 n β n β β xj  ∂ l n 2 X xj 2β X α xj ln α = − + β + 2 ∂β∂α α α αβ + x α  β β j=1 j j=1 α + xj and

2 n β β xj 2 ∂ l n X xj α ln α

2 = − 2 − 2 2 . ∂β β  β β j=1 xj + α

Simulation technique

One can simulate from the Log-logistic distribution easily by using the probability integral transformation. The SAS code to perform this simulation is included in Appendix A.19.

Performance of estimation algorithm

Convergence of the estimation algorithm is often not achieved, while in some cases the algorithm converges quickly and with reasonable accuracy. Generally the accuracy increases with increasing sample size.

Generally iterative numerical algorithms’ performance can be influenced by the choice of initial values. In this instance a regression approach can be used to find initial values for α and β. Consider the empirical distribution


function F̂_X(x) and set it equal to the expression for the cumulative distribution function in (4.43). The equation can then be rewritten as follows:
\[
\ln(X) = \ln(\alpha) - \frac{1}{\beta}\ln\left(\frac{1}{\hat{F}_X(x)} - 1\right)
\]
which is in the form of a linear regression

\[
Y = a + bX \quad\text{with}\quad Y = \ln(X),\; a = \ln(\alpha),\; b = -\beta^{-1} \;\text{and}\; X = \ln\left(\hat{F}_X(x)^{-1} - 1\right).
\]

The estimates from this expression can then be used to find initial values for α and β. This is also included in Appendix A.19.
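The regression-based choice of initial values can be sketched as follows (stdlib Python rather than the dissertation's SAS; the function name is illustrative). The empirical distribution function is continuity-corrected to i/(n+1) in this sketch so that ln(1/F̂ − 1) stays finite at the largest observation:

```python
import math

def loglogistic_initial_values(data):
    """Regression-based starting values: regress ln(x_(i)) on
    ln(1/F_hat - 1); intercept = ln(alpha), slope = -1/beta."""
    xs = sorted(data)
    n = len(xs)
    # empirical CDF with continuity correction i/(n+1) keeps F_hat < 1
    pairs = [(math.log(1.0 / (i / (n + 1.0)) - 1.0), math.log(x))
             for i, x in enumerate(xs, start=1)]
    mx = sum(p for p, _ in pairs) / n
    my = sum(q for _, q in pairs) / n
    slope = (sum((p - mx) * (q - my) for p, q in pairs)
             / sum((p - mx) ** 2 for p, _ in pairs))
    intercept = my - slope * mx
    return math.exp(intercept), -1.0 / slope   # (alpha_0, beta_0)

# example: recover rough starting values from simulated log-logistic data,
# using the inverse CDF x = alpha * (u / (1 - u))**(1/beta)
import random
rng = random.Random(7)
true_alpha, true_beta = 3.0, 2.5
sim = [true_alpha * (u / (1.0 - u)) ** (1.0 / true_beta)
       for u in (1.0 - rng.random() for _ in range(5_000)) if u < 1.0]
alpha0, beta0 = loglogistic_initial_values(sim)
```

The returned pair is only a starting point for the iterative maximum likelihood algorithm, not a final estimate.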

6.5.21 Folded Normal Distribution

Deriving the maximum likelihood estimation algorithm

The likelihood function for the Folded Normal distribution is derived directly from the probability density function as given in (B.43), from which the log-likelihood function follows:

\[
l(\mu,\sigma) = -n\ln(\sigma) - \frac{n}{2}\ln(2\pi) - \frac{1}{2}\sum_{j=1}^{n}\left(\frac{x_j-\mu}{\sigma}\right)^2 + \sum_{j=1}^{n}\ln\left(1 + e^{-\frac{2x_j\mu}{\sigma^2}}\right).
\]

To find the values of μ and σ that maximise the log-likelihood function, the partial derivatives with respect to μ and σ are required; these are given in the expressions below:

\[
\frac{\partial l}{\partial\mu} = \sum_{j=1}^{n}\frac{x_j-\mu}{\sigma^2} - \frac{2}{\sigma^2}\sum_{j=1}^{n}\frac{x_j\, e^{-2x_j\mu/\sigma^2}}{1+e^{-2x_j\mu/\sigma^2}}
\]
\[
\frac{\partial l}{\partial\sigma} = -\frac{n}{\sigma} + \sum_{j=1}^{n}\frac{(x_j-\mu)^2}{\sigma^3} + \frac{4}{\sigma^3}\sum_{j=1}^{n}\frac{x_j\mu\, e^{-2x_j\mu/\sigma^2}}{1+e^{-2x_j\mu/\sigma^2}}.
\]
Setting these partial derivatives equal to 0 yields the following relationship between the maximum likelihood estimators for μ and σ [79], [46]:

\[
\hat{\sigma}^2 = \frac{1}{n}\sum_{j=1}^{n}x_j^2 - \hat{\mu}^2.
\]


This still does not provide explicit expressions for the estimators of μ and σ. Therefore the Newton-Raphson algorithm needs to be implemented in order to derive the maximum likelihood estimates numerically. This requires the evaluation of the second order partial derivatives, which are given below:

\[
\frac{\partial^2 l}{\partial\mu^2} = -\frac{n}{\sigma^2} + \frac{4}{\sigma^4}\sum_{j=1}^{n}\frac{x_j^2\, e^{-2x_j\mu/\sigma^2}}{\left(1+e^{-2x_j\mu/\sigma^2}\right)^2}
\]
\[
\frac{\partial^2 l}{\partial\sigma\,\partial\mu} = -2\sum_{j=1}^{n}\frac{x_j-\mu}{\sigma^3} + \frac{4}{\sigma^3}\sum_{j=1}^{n}\frac{x_j\, e^{-2x_j\mu/\sigma^2}}{1+e^{-2x_j\mu/\sigma^2}} - \frac{8\mu}{\sigma^5}\sum_{j=1}^{n}\frac{x_j^2\, e^{-2x_j\mu/\sigma^2}}{\left(1+e^{-2x_j\mu/\sigma^2}\right)^2}
\]
\[
\frac{\partial^2 l}{\partial\sigma^2} = \frac{n}{\sigma^2} - \frac{3}{\sigma^4}\sum_{j=1}^{n}(x_j-\mu)^2 - \frac{12\mu}{\sigma^4}\sum_{j=1}^{n}\frac{x_j\, e^{-2x_j\mu/\sigma^2}}{1+e^{-2x_j\mu/\sigma^2}} + \frac{16\mu^2}{\sigma^6}\sum_{j=1}^{n}\frac{x_j^2\, e^{-2x_j\mu/\sigma^2}}{\left(1+e^{-2x_j\mu/\sigma^2}\right)^2}.
\]

Simulation technique

The simulation follows directly from the definition of the Folded Normal distribution in Section 4.1.24: a random observation W is first simulated from a Normal distribution with parameters μ and σ, and a random observation X from the Folded Normal distribution is then obtained by taking X = |W|. The SAS code that can be used to perform the simulation is given in Appendix A.20.
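A minimal Python sketch of this folding step (the dissertation's code is in SAS; for μ = 0 the closed-form folded mean σ√(2/π) is used below purely as a sanity check):

```python
import math
import random

def simulate_folded_normal(n, mu, sigma, seed=None):
    """Simulate W ~ N(mu, sigma^2) and fold: X = |W|."""
    rng = random.Random(seed)
    return [abs(rng.gauss(mu, sigma)) for _ in range(n)]

sample = simulate_folded_normal(20_000, mu=0.0, sigma=2.0, seed=3)
# folded-normal mean when mu = 0 is sigma * sqrt(2 / pi)
expected_mean = 2.0 * math.sqrt(2.0 / math.pi)
```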

Performance of estimation algorithm

The estimation algorithm appears to perform well with quick convergence and reasonable accuracy for various combinations of parameter values and initial values. For larger values of σ and values for µ close to 0 the estimation becomes less accurate.

6.5.22 Inverse Gamma and related distributions

Deriving the method-of-moments estimators and maximum likelihood estimation algorithm

- For the Inverse Gamma distribution
Expressions for the mean and variance are given in (B.91) and (B.92). Setting E(X) = x̄ and var(X) = s_x² yields the method-of-moments estimators


for α and θ that are given in (B.94) and (B.95).

The expression for the probability density function of a random variable that is Inverse Gamma distributed with parameters α and θ is given in (B.89), from which the log-likelihood function follows:

\[
l(\alpha,\theta) = n\alpha\ln(\theta) - n\ln(\Gamma(\alpha)) - (\alpha+1)\sum_{j=1}^{n}\ln(x_j) - \theta\sum_{j=1}^{n}x_j^{-1}.
\]

The maximum likelihood estimators of α and θ can be obtained numerically using (6.5). This algorithm requires the first and second order partial derivatives with respect to α and θ, which are given below:

\[
\frac{\partial l}{\partial\alpha} = n\ln(\theta) - n\Psi(\alpha) - \sum_{j=1}^{n}\ln(x_j)
\]
\[
\frac{\partial l}{\partial\theta} = \frac{n\alpha}{\theta} - \sum_{j=1}^{n}x_j^{-1}
\]
and
\[
\frac{\partial^2 l}{\partial\alpha^2} = -n\Psi'(\alpha), \qquad
\frac{\partial^2 l}{\partial\theta\,\partial\alpha} = \frac{n}{\theta}, \qquad
\frac{\partial^2 l}{\partial\theta^2} = -\frac{n\alpha}{\theta^2}.
\]

- For the Inverse Chi-square distribution
The Inverse Chi-square distribution is a special case of the Inverse Gamma distribution with α = ν/2 and θ = 1/2 (as shown in Section 4.1.26). The log-likelihood function can therefore be derived from the log-likelihood function of the Inverse Gamma distribution:

\[
l(\nu) = -\frac{n\nu}{2}\ln(2) - n\ln\Gamma\left(\frac{\nu}{2}\right) - \sum_{j=1}^{n}(2x_j)^{-1} - \left(\frac{\nu}{2}+1\right)\sum_{j=1}^{n}\ln(x_j).
\]

Setting the first order partial derivative equal to 0 does not yield an explicit expression for the maximum likelihood estimator of ν; hence numerical estimation should be performed using the Newton-Raphson algorithm as given in (6.4). For this particular problem the updating equation takes the following form:


\[
\nu^{(\text{New})} = \nu^{(\text{Old})} - \frac{\frac{n}{2}\ln(2) + \frac{n}{2}\Psi\left(\frac{\nu^{(\text{Old})}}{2}\right) + \frac{1}{2}\sum_{j=1}^{n}\ln(x_j)}{\frac{n}{4}\Psi'\left(\frac{\nu^{(\text{Old})}}{2}\right)}.
\]
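The updating equation can be sketched in stdlib Python (the dissertation implements it in SAS PROC IML). The digamma and trigamma functions Ψ and Ψ′ are approximated here by finite differences of math.lgamma, an assumption made only to keep the sketch dependency-free:

```python
import math
import random

def digamma(x, h=1e-5):
    """Psi via central difference of lgamma (adequate for illustration)."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)

def trigamma(x, h=1e-4):
    """Psi' via second difference of lgamma."""
    return (math.lgamma(x + h) - 2.0 * math.lgamma(x) + math.lgamma(x - h)) / h ** 2

def fit_inverse_chisq(data, nu0=1.0, tol=1e-8, max_iter=200):
    """Newton-Raphson updating equation for the degrees of freedom nu."""
    n = len(data)
    sum_log = sum(math.log(x) for x in data)
    nu = nu0
    for _ in range(max_iter):
        numer = (n / 2.0) * math.log(2.0) + (n / 2.0) * digamma(nu / 2.0) + sum_log / 2.0
        denom = (n / 4.0) * trigamma(nu / 2.0)
        step = numer / denom
        nu -= step
        if abs(step) < tol:
            break
    return nu

# example: recover nu = 5 from simulated data (X = 1/Y with Y ~ chi-square(5),
# i.e. Y ~ Gamma(shape 2.5, scale 2))
rng = random.Random(11)
data = [1.0 / rng.gammavariate(2.5, 2.0) for _ in range(4_000)]
nu_hat = fit_inverse_chisq(data)
```

Because the log-likelihood is concave in ν, the iteration converges from a wide range of starting values.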

Simulation technique

- For the Inverse Gamma distribution
Using Result 3, it follows that one can simulate from the Inverse Gamma distribution using the following steps:

1. Simulate Y from a GAM(1/θ, α) distribution.

2. Calculate X = Y^{-1}, which gives a simulated value from an Inverse Gamma distribution with parameters θ and α.

The SAS code that can be used to simulate from an Inverse Gamma distribution is included in Appendix A.21.
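A Python sketch of the two steps above, assuming GAM(1/θ, α) denotes a Gamma distribution with scale 1/θ and shape α (which matches random.gammavariate(shape, scale)):

```python
import random

def simulate_inverse_gamma(n, alpha, theta, seed=None):
    """Step 1: Y ~ Gamma(shape alpha, scale 1/theta); Step 2: X = 1/Y."""
    rng = random.Random(seed)
    return [1.0 / rng.gammavariate(alpha, 1.0 / theta) for _ in range(n)]

sample = simulate_inverse_gamma(20_000, alpha=3.0, theta=4.0, seed=5)
# for alpha > 1 the Inverse Gamma mean is theta / (alpha - 1) = 2 here
```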

- For the Inverse Chi-square distribution
Given the relationship between the Inverse Chi-square and the Inverse Gamma distributions, the same steps can be used with θ = 1/2 and α = ν/2. The SAS code to perform this simulation is given in Appendix A.22.

Performance of estimation algorithm

- For the Inverse Gamma distribution
The algorithm performs reasonably well in terms of convergence and the accuracy of its estimates, but it is sensitive to the choice of initial values. In the SAS code provided in Appendix A.21 the observed mean is used as initial value for θ in the iterative estimation algorithm, since the mean provides an indication of the scale of the observed values.

- For the Inverse Chi-square distribution
The algorithm converges easily and generally provides reasonably accurate estimates of the parameter ν.

6.5.23 Loggamma Distribution

Deriving the maximum likelihood estimation algorithm

The likelihood function for the Loggamma distribution can be derived from the probability density function as given in (B.113). The log-likelihood


function is then given by:

\[
l(a,\lambda) = na\ln(\lambda) - n\ln(\Gamma(a)) + (a-1)\sum_{j=1}^{n}\ln(\ln(x_j)) - (\lambda+1)\sum_{j=1}^{n}\ln(x_j).
\]

The maximum likelihood estimators for a and λ need to be found, which requires the first order partial derivatives of the log-likelihood function with respect to a and λ:

\[
\frac{\partial l}{\partial a} = n\ln(\lambda) - n\Psi(a) + \sum_{j=1}^{n}\ln(\ln(x_j))
\]
\[
\frac{\partial l}{\partial\lambda} = \frac{na}{\lambda} - \sum_{j=1}^{n}\ln(x_j).
\]

When these first order partial derivatives are set to 0, it is not possible to derive explicit expressions for the maximum likelihood estimators of a and λ. It can, however, be derived that

\[
\hat{\lambda}_{MLE} = n\,\hat{a}_{MLE}\left(\sum_{j=1}^{n}\ln(x_j)\right)^{-1}.
\]

The maximum likelihood estimates will therefore have to be determined numerically using (6.5), for which the second order partial derivatives are also required:

\[
\frac{\partial^2 l}{\partial a^2} = -n\Psi'(a), \qquad
\frac{\partial^2 l}{\partial\lambda\,\partial a} = \frac{n}{\lambda}, \qquad
\frac{\partial^2 l}{\partial\lambda^2} = -\frac{na}{\lambda^2}.
\]

Simulation technique

One can simulate random variables from a Loggamma distribution by first simulating a value Y from a Gamma distribution and then calculating X = e^Y, which is a simulated value from a Loggamma distribution. The SAS code that can be used to perform this simulation is given in Appendix A.23.
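A Python sketch of this transformation, assuming the Gamma variable has shape a and rate λ (scale 1/λ), consistent with the log-likelihood above:

```python
import math
import random

def simulate_loggamma(n, a, lam, seed=None):
    """X = exp(Y) with Y ~ Gamma(shape a, scale 1/lam)."""
    rng = random.Random(seed)
    return [math.exp(rng.gammavariate(a, 1.0 / lam)) for _ in range(n)]

sample = simulate_loggamma(20_000, a=2.0, lam=4.0, seed=9)
# for lam > 1 the Loggamma mean is (1 - 1/lam)**(-a) = (4/3)**2 here,
# consistent with the initial-value expression used for this distribution
```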


Performance of estimation algorithm

It was found that the algorithm is sensitive to the choice of initial values, in particular the combination of values used. In order to choose initial values for a and λ whose relationship to each other is sensible given the form of the distribution, the expression for the expected value as given in (B.115) was considered. Setting this expression equal to the observed sample mean x̄, the equation can be rewritten to express λ as a function of a and x̄:
\[
\lambda = \left(1 - \bar{x}^{-\frac{1}{a}}\right)^{-1}
\]

This expression was utilised to set up initial values for a and λ by choosing some value for a and then calculating an associated initial value for λ. This significantly improved the performance of the iterative estimation procedure in terms of the likelihood of reaching convergence. This approach is incorporated in the SAS code in Appendix A.23.

6.5.24 Snedecor’s F Distribution

Deriving the method-of-moments estimators and maximum likelihood estimation algorithm

Setting the expressions for the mean and variance (as given in (B.166) and (B.167)) equal to their observed sample counterparts, x̄ and s_x², yields method-of-moments estimates for ν₁ and ν₂ as given in (B.169) and (B.170), respectively.

An iterative estimation algorithm to obtain the maximum likelihood estimates is derived below, in which case the method-of-moments estimates can be used as starting values. The likelihood function follows from the product of the probability density functions as given by Bain and Engelhardt [11]. The associated log-likelihood function is given by:
\[
l(\nu_1,\nu_2) = n\ln\Gamma\left(\frac{\nu_1+\nu_2}{2}\right) - n\ln\Gamma\left(\frac{\nu_1}{2}\right) - n\ln\Gamma\left(\frac{\nu_2}{2}\right) + \frac{n\nu_1}{2}\ln\left(\frac{\nu_1}{\nu_2}\right) + \left(\frac{\nu_1}{2}-1\right)\sum_{j=1}^{n}\ln(x_j) - \frac{\nu_1+\nu_2}{2}\sum_{j=1}^{n}\ln\left(1+\frac{\nu_1}{\nu_2}x_j\right).
\]

This function cannot be maximized by means of finding the first order partial derivatives and solving for maximum likelihood estimators after setting the


first order partial derivatives equal to 0. For this reason both the first and second order partial derivatives are required as set out below:

\[
\frac{\partial l}{\partial\nu_1} = \frac{n}{2}\Psi\left(\frac{\nu_1+\nu_2}{2}\right) - \frac{n}{2}\Psi\left(\frac{\nu_1}{2}\right) + \frac{n}{2}\ln\left(\frac{\nu_1}{\nu_2}\right) + \frac{n}{2} + \frac{1}{2}\sum_{j=1}^{n}\ln(x_j) - \frac{1}{2}\sum_{j=1}^{n}\ln\left(1+\frac{\nu_1}{\nu_2}x_j\right) - \frac{\nu_1+\nu_2}{2}\sum_{j=1}^{n}\frac{x_j}{\nu_2+\nu_1 x_j}
\]

\[
\frac{\partial l}{\partial\nu_2} = \frac{n}{2}\Psi\left(\frac{\nu_1+\nu_2}{2}\right) - \frac{n}{2}\Psi\left(\frac{\nu_2}{2}\right) - \frac{n\nu_1}{2\nu_2} - \frac{1}{2}\sum_{j=1}^{n}\ln\left(1+\frac{\nu_1}{\nu_2}x_j\right) + \frac{\nu_1+\nu_2}{2}\sum_{j=1}^{n}\frac{\nu_1 x_j}{\nu_2(\nu_2+\nu_1 x_j)}.
\]

\[
\frac{\partial^2 l}{\partial\nu_1^2} = \frac{n}{4}\Psi'\left(\frac{\nu_1+\nu_2}{2}\right) - \frac{n}{4}\Psi'\left(\frac{\nu_1}{2}\right) + \frac{n}{2\nu_1} - \sum_{j=1}^{n}\frac{x_j}{\nu_2+\nu_1 x_j} + \frac{\nu_1+\nu_2}{2}\sum_{j=1}^{n}\left(\frac{x_j}{\nu_2+\nu_1 x_j}\right)^2
\]
\[
\frac{\partial^2 l}{\partial\nu_2\,\partial\nu_1} = \frac{n}{4}\Psi'\left(\frac{\nu_1+\nu_2}{2}\right) - \frac{n}{2\nu_2} - \frac{1}{2}\sum_{j=1}^{n}\frac{(\nu_2-\nu_1)x_j}{\nu_2(\nu_2+\nu_1 x_j)} + \frac{\nu_1+\nu_2}{2}\sum_{j=1}^{n}\frac{x_j}{(\nu_2+\nu_1 x_j)^2}
\]
and
\[
\frac{\partial^2 l}{\partial\nu_2^2} = \frac{n}{4}\Psi'\left(\frac{\nu_1+\nu_2}{2}\right) - \frac{n}{4}\Psi'\left(\frac{\nu_2}{2}\right) + \frac{n\nu_1}{2\nu_2^2} + \sum_{j=1}^{n}\frac{\nu_1 x_j}{\nu_2(\nu_2+\nu_1 x_j)} - \frac{\nu_1+\nu_2}{2}\sum_{j=1}^{n}\frac{\nu_1 x_j(2\nu_2+\nu_1 x_j)}{\nu_2^2(\nu_2+\nu_1 x_j)^2}.
\]

These partial derivatives can be used in (6.5) to estimate ν₁ and ν₂ numerically.

Simulation technique

The simulation from this distribution can be done using built-in SAS functions. This is illustrated in the SAS code given in Appendix A.24.


Performance of estimation algorithm

The algorithm generally converges, but does not always achieve a high level of accuracy.

6.5.25 Inverse Gaussian Distribution

Deriving the method-of-moments estimators and maximum likelihood estimation algorithm

Using the expressions for the mean and variance as given in (B.99) and (B.100) and setting them equal to the sample mean and the sample variance, respectively, yields the method-of-moments estimates for μ and λ as given in (B.102) and (B.103). These estimates can be used as starting values in the iterative algorithm given in (6.5) that is required to obtain the maximum likelihood estimates.

The log-likelihood function is given by:

\[
l(\mu,\lambda) = \frac{n}{2}\ln(\lambda) - \frac{n}{2}\ln(2\pi) - \frac{3}{2}\sum_{j=1}^{n}\ln(x_j) - \frac{\lambda}{2\mu^2}\sum_{j=1}^{n}\frac{(\mu-x_j)^2}{x_j}.
\]

The first order partial derivatives with respect to µ and λ are given below:

\[
\frac{\partial l}{\partial\mu} = \frac{\lambda}{\mu^3}\sum_{j=1}^{n}\frac{(\mu-x_j)^2}{x_j} - \frac{\lambda}{\mu^2}\sum_{j=1}^{n}\frac{\mu-x_j}{x_j}
\]
\[
\frac{\partial l}{\partial\lambda} = \frac{n}{2\lambda} - \frac{1}{2\mu^2}\sum_{j=1}^{n}\frac{(\mu-x_j)^2}{x_j}.
\]

The second order partial derivatives are given below:

\[
\frac{\partial^2 l}{\partial\mu^2} = -\frac{3\lambda}{\mu^4}\sum_{j=1}^{n}\frac{(\mu-x_j)^2}{x_j} + \frac{4\lambda}{\mu^3}\sum_{j=1}^{n}\frac{\mu-x_j}{x_j} - \frac{\lambda}{\mu^2}\sum_{j=1}^{n}x_j^{-1}
\]
\[
\frac{\partial^2 l}{\partial\lambda\,\partial\mu} = \frac{1}{\mu^3}\sum_{j=1}^{n}\frac{(\mu-x_j)^2}{x_j} - \frac{1}{\mu^2}\sum_{j=1}^{n}\frac{\mu-x_j}{x_j}
\]
\[
\frac{\partial^2 l}{\partial\lambda^2} = -\frac{n}{2\lambda^2}.
\]


Simulation technique

The simulation cannot be done using built-in functions in SAS, nor can the probability integral transformation be applied, since a closed-form expression for the cumulative distribution function is not available. The following approach can be used to simulate from this distribution:

1. Consider a sufficiently large subset of the positive real line, say [0, ζ]. Various iterations may have to be run to determine whether the selected subset is in fact large enough.

2. Partition the subset into intervals of size ε. This will result in the intervals [0, ε], (ε, 2ε], (2ε, 3ε], .... Decreasing the interval size ε will refine the partitioning, from which more accurate numerical results are expected, whilst it will lead to an increase in processing time.

3. Calculate the value of the cumulative distribution function at each interval's upper bound using the expression given in (4.46). This means that for the ith interval one will calculate the following:
\[
F_X(i\epsilon) = \Phi\left(\sqrt{\frac{\lambda}{i\epsilon}}\left(\frac{i\epsilon}{\mu}-1\right)\right) + e^{\frac{2\lambda}{\mu}}\,\Phi\left(-\sqrt{\frac{\lambda}{i\epsilon}}\left(\frac{i\epsilon}{\mu}+1\right)\right)
\]

One may also consider the mid-point or lower bound of each interval at which to calculate the cumulative distribution function. It is important to assess at this point whether the calculated value of the cumulative distribution function associated with the upper bound of the last interval is sufficiently close to 1. This indicates whether the chosen subset of the positive real line is sufficiently large.

4. One can now simulate uniform random variables on the interval [0, 1], say U. This can be done using a built-in function in SAS.

5. Now one can apply an approach similar to the probability integral transformation. The largest value X = iε such that F_X(iε) ≤ U is then considered a simulated value from the Inverse Gaussian distribution.

This simulation approach is used and illustrated in the SAS code given in Appendix A.25.
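The five steps can be sketched in stdlib Python (the dissertation's implementation is in SAS; the ζ and ε defaults below are illustrative choices):

```python
import bisect
import math
import random

def inv_gauss_cdf(x, mu, lam):
    """F(x) = Phi(sqrt(lam/x)(x/mu - 1)) + exp(2 lam/mu) Phi(-sqrt(lam/x)(x/mu + 1))."""
    def std_normal_cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    a = math.sqrt(lam / x)
    return (std_normal_cdf(a * (x / mu - 1.0))
            + math.exp(2.0 * lam / mu) * std_normal_cdf(-a * (x / mu + 1.0)))

def simulate_inverse_gaussian(n, mu, lam, zeta=50.0, eps=0.001, seed=None):
    rng = random.Random(seed)
    grid = [i * eps for i in range(1, int(zeta / eps) + 1)]     # step 2
    cdf = [inv_gauss_cdf(x, mu, lam) for x in grid]             # step 3
    if cdf[-1] < 0.999:                                         # step 1 check
        raise ValueError("increase zeta: [0, zeta] is not large enough")
    # steps 4-5: invert the tabulated CDF at uniform draws
    return [grid[min(bisect.bisect_left(cdf, rng.random()), len(grid) - 1)]
            for _ in range(n)]

sample = simulate_inverse_gaussian(5_000, mu=2.0, lam=6.0, seed=13)
```

bisect on the tabulated CDF plays the role of picking the grid point matching the uniform draw, a generalized-inverse lookup.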

Performance of estimation algorithm

When the method-of-moments estimates for the parameters μ and λ were used as initial values in the iterative maximum likelihood estimation procedure, convergence was easily obtained and the estimates were reasonably accurate.

6.5.26 Skew-normal Distribution

Deriving the method-of-moments estimators and maximum likelihood estimation algorithm

In comparison to the estimation for the other distributions considered in this study, the estimation of the parameters of the Skew-normal distribution using maximum likelihood is more complex.

In this section the method-of-moments estimation is discussed for the direct parameterization, and both the method-of-moments estimation and the maximum likelihood estimation are discussed for the centered parameterization as given by Pewsey [100].

• Direct Parameterization

- Method-of-Moments Estimation:

Suppose we observe a sample of size n of observations (y₁, y₂, ..., y_n) associated with a random variable Y = ξ + ηX where X ∼ SN(λ). Let ȳ and s_y denote the sample mean and standard deviation, respectively. The observed sample can now be standardized to obtain (y_{1S}, y_{2S}, ..., y_{nS}), which is associated with a random variable Y_S = ξ_S + η_S X, where
\[
\hat{\xi}_S = \frac{\hat{\xi} - \bar{y}}{s_y} \quad\text{and}\quad \hat{\eta}_S = \frac{\hat{\eta}}{s_y}.
\]

If one now equates the theoretical expressions for the moments of the standardized random variable with the observed sample central moments (m₁, m₂ and m₃), one can find the method-of-


moments estimates [100]:
\[
\hat{\xi}_S = -\left(\frac{2}{4-\pi}\cdot\frac{m_3}{s_y^3}\right)^{\frac{1}{3}}
\]
\[
\hat{\eta}_S = \sqrt{1+\hat{\xi}_S^2}
\]
and
\[
\hat{\delta} = -\frac{\hat{\xi}_S}{b\,\hat{\eta}_S},
\]
from which the estimate of λ follows:
\[
\hat{\lambda} = \frac{\hat{\delta}}{\sqrt{1-\hat{\delta}^2}} \tag{6.20}
\]

The sampling distributions of ξ̂ and δ̂ are bimodal for values of λ close to 0, which means that if the true distribution is close to normal, these estimates may be inaccurate. From the arguments above for the method-of-moments estimation it follows that the direct parameterization may not be ideal for the purpose of fitting the Skew-normal distribution.

• Centered Parameterization

- Method-of-Moments Estimation: In the context of this parameterization, the expressions for the mean, variance and skewness coefficient (as given in (B.158) to (B.160)) can be set equal to the sample mean, variance and skewness to obtain the method-of-moments estimates for μ and σ as given in (B.161) and (B.162). This consequently means that the method-of-moments estimate for δ is as given in (B.164), from which the method-of-moments estimate of λ can be calculated exactly as for the direct parameterization (given in (6.20)).

- Maximum Likelihood Estimation: Consider an observed sample of size n, (y₁, y₂, ..., y_n), where Y ∼ SN_C(μ, σ, γ₁) (similar to the parameterization and notation given in Section 4.1.28). One can then standardize the observed outcomes using the sample mean and sample variance as follows:

\[
y_i^{(S)} = \frac{y_i - \bar{y}}{s_y} \quad\text{for } i = 1, 2, \ldots, n,
\]


then one effectively considers a standardized random variable

\[
Y^{(S)} \sim SN_C(\mu_S, \sigma_S, \gamma_1).
\]

A constraint in terms of the direct parameterization is given which can equivalently be introduced in the centered parameterization, and which requires the following relationship to hold [8], [100]:

\[
\hat{\sigma}_S = \sqrt{\hat{\mu}_S^2\left(1+\hat{\tau}^2\right) + 1} - \hat{\mu}_S\hat{\tau}
\]

where \(\hat{\tau} = \left(\frac{2}{4-\pi}\,\hat{\gamma}_1\right)^{\frac{1}{3}}\).

The values of μ̂_S and τ̂ are found by maximizing the log-likelihood given by:

\[
l(\mu_S,\tau) = -n\ln\left(\sqrt{\mu_S^2(1+\tau^2)+1} - \mu_S\tau\right) - \frac{n}{2}\ln(1+\tau^2) + \sum_{j=1}^{n}\ln\Phi\left(\frac{\dfrac{(y_{jS}-\mu_S)\,\tau}{\sqrt{\mu_S^2(1+\tau^2)+1}-\mu_S\tau} + \tau^2}{\sqrt{\left(b^2+\tau^2(b^2-1)\right)(1+\tau^2)}}\right) \tag{6.21}
\]

From (6.21) it follows that τ can take on values in the range \(\left(-\sqrt{\frac{2}{\pi-2}},\ \sqrt{\frac{2}{\pi-2}}\right)\).

The final estimates of μ, σ and γ₁ can be found from μ̂_S and τ̂ as follows:

\[
\hat{\mu} = \bar{y} + s_y\hat{\mu}_S
\]
\[
\hat{\sigma} = s_y\hat{\sigma}_S = s_y\left(\sqrt{\hat{\mu}_S^2(1+\hat{\tau}^2)+1} - \hat{\mu}_S\hat{\tau}\right)
\]
and
\[
\hat{\gamma}_1 = \frac{4-\pi}{2}\,\hat{\tau}^3.
\]
If μ̂ = E(X) = bδ and σ̂ = √var(X) = √(1 − b²δ²), the centered parameterization reduces to the standard parameterization, i.e. Y = X. Therefore one will be able to assess from the estimates whether the centered or the standard parameterization is implied.


Simulation technique

Since a closed-form expression for the cumulative distribution function is not available, the probability integral transformation cannot be used to simulate from this distribution. A technique similar to the one introduced in Section 6.5.25 for the Inverse Gaussian distribution is suggested here. The following steps can be followed for the simulation:

1. Consider a sufficient subset of the real line. Since the standard Skew-normal parameterization considers a transformation of a standard Normal random variable, it will be sufficient to consider the interval [−10, 10], as 99.73% of observations under a standard Normal distribution lie between −3 and 3.

2. Partition the subset into intervals of size ε. This will result in the intervals [−10, −10+ε], (−10+ε, −10+2ε], (−10+2ε, −10+3ε], ..., (10−ε, 10]. Decreasing the interval size ε will refine the partitioning, from which more accurate numerical results are expected, whilst it will lead to an increase in processing time.

3. Calculate the value of the probability density function at each interval's upper bound using the probability density function as given in (4.47). This means that for the ith interval one will calculate the following:

\[
h(-10+i\epsilon) = 2\Phi\left(\lambda(-10+i\epsilon)\right)\phi(-10+i\epsilon)
\]

One may also consider the mid-point or lower bound of each interval at which to calculate the density.

4. The cumulative distribution function can now be constructed by using a numerical integration procedure to calculate the area under the constructed probability density function.

5. One can now simulate uniform random variables on the interval [0, 1], say U.

6. Now one can apply an approach similar to the probability integral transformation. The largest value X = −10 + iε such that F_X(X) ≤ U is then considered a simulated value from the Skew-normal distribution.

The SAS code illustrating this simulation approach is included in Appendix A.26.
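A Python sketch of this grid-based approach (illustrative defaults; trapezoidal integration stands in for the unspecified numerical integration procedure, and the closed-form mean bδ of SN(λ) is used only as a sanity check):

```python
import bisect
import math
import random

def simulate_skew_normal(n, lam, lo=-10.0, hi=10.0, eps=0.001, seed=None):
    """Tabulate h(x) = 2 Phi(lam x) phi(x) on a grid, build the CDF by
    trapezoidal integration, then invert it at uniform draws."""
    rng = random.Random(seed)
    m = int(round((hi - lo) / eps))
    xs = [lo + i * eps for i in range(m + 1)]

    def pdf(x):
        std_cdf = 0.5 * (1.0 + math.erf(lam * x / math.sqrt(2.0)))
        std_pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        return 2.0 * std_cdf * std_pdf

    dens = [pdf(x) for x in xs]
    cdf = [0.0]
    for i in range(1, m + 1):
        cdf.append(cdf[-1] + 0.5 * (dens[i - 1] + dens[i]) * eps)
    cdf = [c / cdf[-1] for c in cdf]   # total mass ~1; normalise exactly
    return [xs[min(bisect.bisect_left(cdf, rng.random()), m)]
            for _ in range(n)]

sample = simulate_skew_normal(10_000, lam=3.0, seed=17)
# mean of SN(lam) is b*delta = sqrt(2/pi) * lam / sqrt(1 + lam**2)
sn_mean = math.sqrt(2.0 / math.pi) * 3.0 / math.sqrt(1.0 + 3.0 ** 2)
```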


Accuracy of estimation

• The method-of-moments estimates under all three parameterizations appeared to be fairly accurate. Instability under the standard and direct parameterizations started to appear for values of λ very close to 0, whilst the centered parameterization remained stable.

• It follows that the centered parameterization is the most stable, and due to its relationship with the standard parameterization this suggests that considering the centered parameterization may work well as a general approach to fitting a Skew-normal distribution.

• The maximum likelihood estimation was assessed by testing whether the values for μ_S and τ that maximize the log-likelihood function given in (6.21) do in fact lead to accurate maximum likelihood estimates for μ, σ and λ under both the standard and centered parameterizations. The SAS code for this assessment is included in Appendix A.26.

It was found that the values for μ_S and τ that maximize the log-likelihood function are associated with values for μ, σ and λ that are very close to the values used in the simulations, under both the standard and centered parameterizations.


Chapter 7

Practical Application

7.1 Introduction

The properties and techniques that have been studied in order to fit and classify distributions can now be applied to real-life data on general insurance claims sizes. In Chapter 2 the key components of a general insurance risk model were introduced. The techniques gathered through Chapters 3 to 6 are particularly useful for modelling and understanding the risks associated with claims severity. This may in turn enable the practitioner to have an improved view of the portfolio of insured risks and of how risks coming from the upper tails of distributions may influence factors such as reserving, negotiating reinsurance agreements and quantifying the likelihood of ruin.

Furthermore it is of importance to interpret results once these techniques are applied to gain an understanding of which distributions perform well and how the goodness-of-fit compares amongst the various fitted distributions. These aspects will be discussed in the remainder of this chapter.

7.2 Goodness-of-fit Assessment

Once a distribution is fitted, it is of importance to assess how well it fits and in particular to assess the ability of the fitted distribution to capture the behaviour of the observed data across the entire range of observed values including the upper tails.

For the purpose of evaluating the goodness-of-fit in this practical application, we will consider the use of the following criteria:

• The likelihood ratio test to assess the significance of the fitted parameters.


• Quantile plots to assess the goodness-of-fit over the range of observed values, together with the correlation between the observed and theoretical quantiles as a single measure quantifying the overall goodness-of-fit.

7.2.1 Likelihood ratio test

Consider a general hypothesis for the fitted parameters of a distribution, where θ is the vector of distribution parameters. Let θ₀ be the vector of parameters that maximizes the likelihood function under the null hypothesis and θ₁ be the vector of parameters that can vary over all possible values under the alternative hypothesis. Let the likelihood functions under the null and alternative hypotheses be denoted as follows:

\[
L_0 = L(\theta_0) \quad\text{and}\quad L_1 = L(\theta_1).
\]
Define a test statistic
\[
T = 2\ln\left(\frac{L_1}{L_0}\right) = 2\left(\ln(L_1) - \ln(L_0)\right) = 2(l_1 - l_0).
\]
Reject H₀ if T > c, with critical value c such that P(T > c) = α, where α is the chosen level of significance [75].

Coles [31] highlights the use of this test for the purpose of model selection and explains it as follows. Suppose M₁ represents the model with parameter vector θ and M₀ represents a subset of the model M₁ obtained by constraining k of the parameters in θ.

Let l(M₁) be the maximized log-likelihood for model M₁ and let l(M₀) be the maximized log-likelihood for model M₀. Let D = 2(l(M₁) − l(M₀)), which is the deviance test statistic. Under the null hypothesis this statistic approximately follows a Chi-square distribution with k degrees of freedom. The fitted model parameters for model M₁ will be considered significant if D > c, with the critical value c such that P(D > c) = α. This will be the case where the constraint on model M₀ is for all parameters to be 0.

To conclude, in order to test a global hypothesis for the fitted model of the form:

\[
H_0: \theta = 0
\]

\[
H_1: \theta \neq 0,
\]

let \(L_0 = L(\theta)\big|_{\theta=0}\) and \(L_{max} = L(\theta)\big|_{\theta=\hat{\theta}}\). The test statistic is
\[
T = -2\ln\left(\frac{L_0}{L_{max}}\right) \sim \chi^2(r),
\]
where r denotes the number of estimated parameters. Reject the null hypothesis if T > c, with the critical value c such that P(T > c) = α, where α is the chosen level of significance. This tests whether the model parameters are jointly significant.
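As a minimal illustration (not from the dissertation), the sketch below applies the deviance test to nested Exponential models; the 95th-percentile Chi-square critical values are hard-coded for small degrees of freedom:

```python
import math
import random

CHI2_95 = {1: 3.841, 2: 5.991, 3: 7.815}   # chi-square 95th percentiles

def exp_loglik(data, lam):
    """Exponential log-likelihood l(lambda) = n ln(lambda) - lambda * sum(x)."""
    return len(data) * math.log(lam) - lam * sum(data)

def deviance_test(data, lam0, df=1):
    """D = 2(l(M1) - l(M0)), with M0 the model constrained to lambda = lam0."""
    lam_hat = len(data) / sum(data)          # unconstrained MLE
    d = 2.0 * (exp_loglik(data, lam_hat) - exp_loglik(data, lam0))
    return d, d > CHI2_95[df]

rng = random.Random(2)
data = [rng.expovariate(1.0) for _ in range(500)]
d_true, reject_true = deviance_test(data, lam0=1.0)   # H0 true: D is small
d_far, reject_far = deviance_test(data, lam0=2.0)     # H0 badly wrong: rejected
```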


7.2.2 Quantile plots

Rolski et al [105] discuss the use of quantile plots to evaluate the goodness-of-fit of fitted distributions, a technique that also provides visual insight into how successfully the fitted distribution captures the tail behaviour seen in the observed data.

The quantile function is defined in Definition 26 in terms of the generalized inverse function, which is defined in Definition 25. One can also construct the empirical equivalent of the quantile function, which is given by

Q_n(y) = Q_{F_n}(y). For an ordered sample of size n the following two events are equivalent:
\[
\left\{Q_n(y) = U_{(k)}\right\} = \left\{\frac{k-1}{n} < y \leq \frac{k}{n}\right\}
\]
where U_{(k)} denotes the value of the empirical cumulative distribution function associated with the kth order statistic from the ordered sample. Due to the fact that most of the distributions we consider are defined on the entire positive real line, the empirical cumulative distribution function at the largest order statistic will have the value 1. This relates to a theoretical quantile function value of Q_F(1) = ∞. For this reason it makes sense to apply a continuity correction, in which case the following two events are equivalent:

\[
\left\{Q_n(y) = U_{(k)}\right\} = \left\{\frac{k-1}{n+1} < y \leq \frac{k}{n+1}\right\}
\]

For increasing sample sizes the impact of this correction becomes almost negligible, but it still ensures that a finite value of the theoretical quantile function is obtained.

One can derive theoretical quantile functions for various theoretical distributions, depending on whether the cumulative distribution functions exist and whether closed-form expressions for them can be found algebraically. Beirlant et al [16] also indicate the use of, amongst others, the Pareto quantile function as a basis for evaluating the goodness-of-fit of fitted Pareto distributions and for assessing the accurate capturing of the observed tail behaviour.

The correlation between the empirical quantile function and the theoretical quantile function can quantify the overall goodness-of-fit in a single value. In the context of Extreme Value distributions, Kotz and Nadarajah state that a correlation coefficient test can be done based on the correlation between the sample order statistics and their expected values.


These expected values are essentially the quantiles implied by the proposed parametric distribution [76].

Quantile plots provide quantitative value in addition to visual interpretation. For the Exponential distribution it follows from (4.1) that
\[
Q_F(y) = -\frac{1}{\lambda}\ln(1-y).
\]
If the plot suggests that the relationship between Q_F(y) and Q_n(y) is close to linear, then (i) the underlying claim size distribution is close to an Exponential distribution and (ii) a straight line can be fitted through the scatter plot using simple linear regression.

The simple linear regression model can be represented as follows [65]:
\[
y = \beta_1 X + \varepsilon.
\]

If we have a sample of n observations, the coefficient β₁ can be estimated using ordinary least squares estimation. From Steyn et al [112] it follows that a least squares estimator for β₁ is given by

\[
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} y_i x_i}{\sum_{i=1}^{n} x_i^2}.
\]

In the context of our quantile function based on the observed claim sizes we have
\[
y_i = U_{(i)} \quad\text{and}\quad x_i = Q_F\left(\frac{i}{n+1}\right).
\]

If the underlying distribution is the same as the proposed theoretical dis- tribution then the slope of the line passing through the coordinates given     i by QF n+1 ,U(i) should theoretically be 1 and in practical applications close to 1. Hence if we consider the following:

\[
U_{(i)} = Q_F\left(\frac{i}{n+1}\right) = -\frac{1}{\lambda}\ln\left(1-\frac{i}{n+1}\right),
\]
which is of the form y = β₁X with X = ln(1 − i/(n+1)) and β₁ = −1/λ. Hence we can estimate λ as
\[
\hat{\lambda} = -\frac{\sum_{i=1}^{n}\left[\ln\left(1-\frac{i}{n+1}\right)\right]^2}{\sum_{i=1}^{n}\ln\left(1-\frac{i}{n+1}\right)U_{(i)}}.
\]


The adequacy of the fitted regression is usually assessed by means of the associated R² (coefficient of determination) value. Given the relationship between R² and the correlation between the dependent and independent variables in a simple linear regression model, one can evaluate the correlation, from which the R² value automatically follows [112]. Consider the correlation between the observed quantiles and the quantiles implied by the proposed distribution:

\[
\mathrm{corr}(u, Q_F(\cdot)) = \frac{\sum_{i=1}^{n}\left(u_{(i)}-\bar{u}\right)\left(Q_F\left(\frac{i}{n+1}\right)-\bar{Q}_F\right)}{\sqrt{\sum_{i=1}^{n}\left(u_{(i)}-\bar{u}\right)^2\sum_{i=1}^{n}\left(Q_F\left(\frac{i}{n+1}\right)-\bar{Q}_F\right)^2}}
\]
where
\[
\bar{u} = \frac{1}{n}\sum_{i=1}^{n}u_{(i)} \quad\text{and}\quad \bar{Q}_F = \frac{1}{n}\sum_{i=1}^{n}Q_F\left(\frac{i}{n+1}\right).
\]
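A Python sketch of this Exponential quantile-based assessment, combining the regression-through-origin estimate of λ with the quantile correlation (the function name is illustrative, not from the dissertation):

```python
import math
import random

def exponential_qq_assessment(data):
    """Regress the order statistics through the origin on
    x_i = ln(1 - i/(n+1)) (slope -1/lambda), then compute the correlation
    between observed and fitted theoretical quantiles."""
    ys = sorted(data)
    n = len(ys)
    xs = [math.log(1.0 - i / (n + 1.0)) for i in range(1, n + 1)]
    lam_hat = -sum(x * x for x in xs) / sum(x * y for x, y in zip(xs, ys))
    qs = [-x / lam_hat for x in xs]   # theoretical quantiles Q_F(i/(n+1))
    my = sum(ys) / n
    mq = sum(qs) / n
    corr = (sum((y - my) * (q - mq) for y, q in zip(ys, qs))
            / math.sqrt(sum((y - my) ** 2 for y in ys)
                        * sum((q - mq) ** 2 for q in qs)))
    return lam_hat, corr

rng = random.Random(4)
claims = [rng.expovariate(0.5) for _ in range(2_000)]
lam_hat, corr = exponential_qq_assessment(claims)
```

A correlation close to 1 indicates that the Exponential fit tracks the observed quantiles closely across the whole range, including the upper tail.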

7.3 Claims Size Data

7.3.1 Background

Data from a South African provider of non-life insurance was obtained. This included claims sizes arising from various types of insured risks. The insurer provides cover for special risks generally not covered by most insurance providers. Claims may be submitted directly by individuals or by a direct insurer with whom an individual has an insurance contract.

As mentioned in Chapter 2, a suitable segmentation needs to be applied to obtain subgroups of insured risks that are homogeneous. For this purpose we considered the main lines of insurance provided as a criterion for segmentation and decided to use the line of insured risks with the largest number of claims.

Based on the insurance provider's claims procedure that needs to be followed when submitting a claim, it follows that policies in respect of the line of insurance we considered do not necessarily have an excess included, unless it has been voluntarily elected by the insured party. With the data made available for the purpose of this application, such excess amounts were not provided and as such all claims in the data were treated as if no excess

© University of Pretoria

CHAPTER 7. PRACTICAL APPLICATION 237 amounts were in force.

In practice most insurance providers introduce excess amounts for certain insured risks in order to eliminate a high frequency of very small claims and to reduce the premiums payable by insured parties in respect of their policies. This means that observed claims are truncated or censored in terms of both the likelihood of a loss event and the size of the claim, since the loss suffered by the insurance provider excludes the amount of the excess payable by the insured party. Modifications therefore need to be made to the observed frequency of claims as well as the observed claims sizes [105]. One may use theory developed for truncated distributions - see for example Jawitz [69] - when excess amounts are present. Alternatively the excess amounts, if kept on a database, can be added to the observed claims in order to calculate the true full loss amounts. One may then proceed in modelling these loss amounts. Such an approach may work well, especially in cases where excess amounts vary from one insured party to the next.

7.3.2 Description of data and segmentation

Some lines of insured risks had very few claims historically, which makes it difficult to apply the techniques developed in Chapters 3 to 6. The purpose here is to illustrate how these techniques can be applied in practice. For this reason it was decided to focus only on one of the segments for which a sufficient number of claims was available.

The historical claims sizes, expressed in ZAR (South African Rand), spanned a few calendar years. As such it was deemed prudent to adjust the historical values for the effect of inflation. The inflation as implied by the consumer price index (CPI), as published by Statistics South Africa (StatsSA), was used to perform this adjustment.
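The CPI adjustment can be sketched as follows; the index levels below are illustrative placeholders, not actual StatsSA figures:

```python
# Hypothetical CPI index levels per calendar year (illustrative
# placeholders, not actual StatsSA figures).
cpi = {2010: 100.0, 2011: 105.0, 2012: 110.8}
base_year = 2012  # express every claim in base-year money

def adjust_for_inflation(amount, year):
    """Scale a historical claim to base-year prices by the CPI ratio."""
    return amount * cpi[base_year] / cpi[year]

adjusted = adjust_for_inflation(1000.0, 2010)  # 1000 * 110.8 / 100 = 1108.0
```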

The final dataset considered consisted of 2966 historical claims from a single segment of insured risks, adjusted for inflation. Tests for outliers were not performed, since we specifically wanted to keep the more extreme observations in the data in order to assess the different distributions' capability of capturing the upper tail behaviour accurately. Table 7.1 shows the key summary statistics of the set of observed claims. The empirical distribution is shown in Figure 7.1. From this figure it can be seen that the distribution of the observed claims is extremely skew. This is supported by the very large value for the skewness given in Table 7.1, together with the fact that the median is so much smaller than the mean - 85% of the observed values are smaller than the mean.


Number of observations   2966
Mean                     57275.61
Median                   9755.52
Mode                     6239.52
Standard Deviation       178040.50
Skewness                 5.70
Kurtosis                 39.59
Maximum                  2345099.55
99th percentile          995336.68
Minimum                  127.95
1st percentile           257.38

Table 7.1: Descriptive Statistics of the Observed Claims

Figure 7.1: Empirical Distribution of the Observed Claims

The descriptive statistics in Table 7.1 and Figure 7.1 indicate that there is considerable weight in the upper tail of the observed claims distribution. The ratio of the sample variance to the square of the sample mean is 9.66, in which case it follows from Section 3.5.4 that there is no evidence against the hypothesis of the distribution having a heavy tail.
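A minimal sketch of this ratio diagnostic. The theoretical value of 1 for the Exponential distribution is a standard fact used here as a benchmark; whether this matches the exact criterion of Section 3.5.4 is an assumption:

```python
import numpy as np

def tail_ratio(x):
    """Sample variance divided by the squared sample mean. For the
    Exponential distribution the theoretical ratio is exactly 1;
    values well above 1 point to a heavier-than-exponential tail."""
    x = np.asarray(x, dtype=float)
    return np.var(x, ddof=1) / np.mean(x) ** 2

rng = np.random.default_rng(1)
ratio_exp = tail_ratio(rng.exponential(scale=10.0, size=100_000))
ratio_heavy = tail_ratio(rng.lognormal(mean=0.0, sigma=1.2, size=100_000))
```

For the simulated Exponential sample the ratio is close to 1, while the Lognormal sample gives a ratio well above 1, in line with the value of 9.66 observed for the claims data.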

7.4 Fitting of Distributions

In the literature it is often seen that claims sizes are modelled by positively skewed distributions such as the Lognormal, Loggamma, Gamma, Pareto and Burr distributions [66], [25], [41]. In the context of reserving and estimating IBNR claims the Lognormal appears to be considered most often as an assumed distribution for the incremental claims sizes [102], [83], whilst the Normal distribution is also assumed in some instances [108], [84].

With the summary statistics in Table 7.1 clearly indicating that the data under consideration is highly skewed, it makes sense to investigate the skew distributions discussed in Chapters 4, 5 and 6. Furthermore, because the Lognormal is used so often, this practical application also aims to assess the goodness-of-fit of this distribution relative to alternative distributions and to assess its robustness in being able to fit the distribution easily to various datasets with claims sizes.

7.4.1 Approach

The approach followed was to consider each of the distributions discussed in Chapter 4 and to then apply the estimation techniques as derived in Chapter 6 to fit the specific parametric distributions to the observed claims data.

After fitting each of the distributions an assessment was conducted by means of a quantile-quantile plot to determine whether the fitted distribution provides an adequate fit to the observed data across the full range of observed values including the upper tail. The findings related to the quality of each of the parametric distributions' goodness-of-fit are discussed in the next section.
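The quantile-quantile comparison used throughout this section can be sketched generically as follows (a SciPy stand-in for the thesis's SAS routines; the function name and simulated data are ours):

```python
import numpy as np
from scipy import stats

def qq_points(data, frozen):
    """Observed order statistics paired with the theoretical quantiles
    F^{-1}(i/(n+1)) of a fitted (frozen) SciPy distribution."""
    obs = np.sort(np.asarray(data, dtype=float))
    n = len(obs)
    theo = frozen.ppf(np.arange(1, n + 1) / (n + 1.0))
    return theo, obs

rng = np.random.default_rng(2)
claims = stats.gamma.rvs(a=2.0, scale=5.0, size=200, random_state=rng)
theo, obs = qq_points(claims, stats.gamma(a=2.0, scale=5.0))
# Points close to the 45-degree line indicate an adequate fit.
```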

7.4.2 Results

In this section we briefly discuss the goodness-of-fit as observed from the quantile-quantile plot for each of the fitted parametric distributions.

1. Gamma Distribution

The estimation algorithm reached convergence quite quickly. From the quantile-quantile plot in Figure 7.2 it can be seen that the fit was generally poor, with the density too high on the lower range of values whilst the tail of the observed data is heavier than the fitted distribution's tail.
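A Gamma maximum likelihood fit analogous to the one above can be sketched with SciPy (a stand-in for the PROC IML algorithm of Chapter 6; the simulated data are illustrative):

```python
import numpy as np
from scipy import stats

# Simulate positive "claims" and fit a two-parameter Gamma by maximum
# likelihood; floc=0 pins the location parameter at zero.
rng = np.random.default_rng(3)
claims = stats.gamma.rvs(a=2.0, scale=5.0, size=20_000, random_state=rng)
a_hat, loc_hat, scale_hat = stats.gamma.fit(claims, floc=0)
```

With a large correctly-specified sample the estimates land close to the true shape and scale.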

2. Birnbaum-Saunders Distribution

Figure 7.2: Quantile-quantile plot: Fitted Gamma Distribution

The estimation algorithm reached convergence quite quickly. Similar to the Gamma distribution, as shown in Figure 7.3, the fit was generally poor, with the density too high on the lower range of values whilst the tail of the observed data was heavier than the fitted distribution's tail.

Figure 7.3: Quantile-quantile plot: Fitted Birnbaum-Saunders Distribution

Figure 7.4: Quantile-quantile plot: Fitted Loggamma Distribution

3. Loggamma Distribution

The estimation algorithm reached convergence. In Figure 7.4 it can be seen that the tail of the fitted distribution was too heavy. On the lower 90% of the values the fitted distribution provided a fit to the data that was reasonable, but with densities being too low in the middle range of the observed values.

4. Inverse Gaussian Distribution

The estimation algorithm converged. Similar to the Loggamma distribution, the tail of the fitted distribution is too heavy. On the lower 90% of the values the fitted distribution provides a fit to the data that is reasonable, but with densities being too low in the middle range of the observed values. Figure 7.5 gives a comparison of the theoretical and observed quantiles.

5. Generalized Pareto Distribution

Figure 7.5: Quantile-quantile plot: Fitted Inverse Gaussian Distribution

The estimation algorithm is very sensitive to the choice of initial values. The final fitted distribution had a tail that was too heavy in comparison with the empirical distribution of the observed values, and the goodness-of-fit on the middle range of values was also poor. A comparison of the fitted distribution's theoretical quantiles with the observed quantiles is given in Figure 7.6.

Figure 7.6: Quantile-quantile plot: Fitted Generalized Pareto Distribution

6. Pareto Distribution

Figure 7.7: Quantile-quantile plot: Fitted Pareto Distribution

In Figure 7.7 it can be seen that the distribution does not provide an adequate fit to the observed data.

7. Lognormal Distribution As graphically presented in the quantile-quantile plot given in Figure 7.8, the fitted distribution appeared to be reasonable in that it provided an adequate fit over a very large range of values. The upper tail of the fitted distribution was too heavy, but to a lesser extent than what was observed for the Loggamma, Inverse Gaussian and Generalized Pareto distributions.

8. Folded Normal Distribution As suggested by the quantile-quantile plot given in Figure 7.9, the fitted distribution completely lacked any fit to the observed values, with the upper tail weight being far too light.

9. Rayleigh Distribution This fitted distribution completely lacked any fit to the observed values, with the upper tail weight being far too light - see Figure 7.10.


Figure 7.8: Quantile-quantile plot: Fitted Lognormal Distribution

10. Log-logistic Distribution The fitted distribution had properties that were similar to those of the Loggamma and Inverse Gaussian distributions, as shown in Figure 7.11, in that it provided a fit that was reasonable on the lower range of observed values, with the middle range's probability densities being underestimated and the upper range having a tail that was too heavy.

11. Burr Distribution The estimation algorithm is sensitive to the choice of initial values in order to obtain convergence. Convergence was obtained when applied to the observed values. The heaviness of the tail of the fitted distribution was too extreme. When excluding the 20 largest values, the quantile-quantile plot as given in Figure 7.12 indicates that the fitted distribution provided a good fit on the lower range of observed values only.

It was shown in Section 4.1.19 that the Singh-Maddala distribution is a special case of the Burr distribution with parameters c = k = q and α = b. For the distribution that we fitted to the observed claims, the estimated values of c and k were 1.19 and 0.71, respectively. This indicated that the fitted Burr distribution was not the special case.
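The sensitivity to initial values noted above can be mitigated with a multi-start strategy; a sketch using SciPy's Burr XII implementation (our own simplified stand-in, not the thesis's IML algorithm; parameter names follow SciPy's `c`, `d` shapes, written here as `c`, `k`):

```python
import numpy as np
from scipy import optimize, stats

def burr_nll(params, x):
    """Negative log-likelihood of the Burr XII distribution."""
    c, k, scale = params
    if min(c, k, scale) <= 0:
        return np.inf
    return -np.sum(stats.burr12.logpdf(x, c, k, scale=scale))

rng = np.random.default_rng(4)
data = stats.burr12.rvs(2.0, 1.5, scale=10.0, size=2_000, random_state=rng)

# Run the optimiser from several starting points and keep the best
# local optimum, guarding against a poor choice of initial values.
starts = [(1.0, 1.0, float(np.median(data))),
          (2.0, 2.0, float(np.mean(data))),
          (0.5, 3.0, 5.0)]
fits = [optimize.minimize(burr_nll, s, args=(data,), method="Nelder-Mead")
        for s in starts]
best = min(fits, key=lambda r: r.fun)
```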

12. Weibull Distribution The fitted distribution provided a very poor fit to the observed values with a tail weight that was extremely heavy as shown in Figure 7.13.


Figure 7.9: Quantile-quantile plot: Fitted Folded Normal Distribution

Figure 7.10: Quantile-quantile plot: Fitted Rayleigh Distribution


Figure 7.11: Quantile-quantile plot: Fitted Log-logistic Distribution

Figure 7.12: Quantile-quantile plot: Fitted Burr Distribution

13. Gumbel Distribution The maximum likelihood estimation algorithm did not converge. For this reason it was decided to use the regression approach as described in Section 6.5.10 to obtain parameter estimates for the Gumbel distribution based on the observed values. This fitted distribution did not provide a good fit overall or on any smaller range of values - see Figure 7.14.
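A sketch of this style of probability-plot regression for the Gumbel distribution (the exact parameterization of Section 6.5.10 may differ; here the quantile function is taken as Q(p) = mu - beta * ln(-ln p)):

```python
import numpy as np

def gumbel_regression(u):
    """Probability-plot regression for the Gumbel distribution: the
    order statistics are approximately mu - beta * ln(-ln(i/(n+1))),
    i.e. linear in x_i = -ln(-ln(i/(n+1))) with intercept mu and
    slope beta."""
    u = np.sort(np.asarray(u, dtype=float))
    n = len(u)
    p = np.arange(1, n + 1) / (n + 1.0)
    x = -np.log(-np.log(p))
    beta, mu = np.polyfit(x, u, 1)  # returns (slope, intercept)
    return mu, beta

# Exact check: feeding in the theoretical Gumbel(mu = 3, beta = 2)
# quantiles recovers the parameters.
n = 500
p = np.arange(1, n + 1) / (n + 1.0)
u_exact = 3.0 - 2.0 * np.log(-np.log(p))
mu_hat, beta_hat = gumbel_regression(u_exact)
```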


Figure 7.13: Quantile-quantile plot: Fitted Weibull Distribution

Figure 7.14: Quantile-quantile plot: Fitted Gumbel Distribution


Figure 7.15: Quantile-quantile plot: Fitted Frechet Distribution

14. Frechet Distribution Convergence was difficult to obtain and occurred at a slow rate. Figure 7.15 shows that the final fitted distribution had an extremely heavy tail and therefore did not provide an adequate fit to the observed values. This then led to probability densities on the lower values, as implied by the fitted distribution, that were too low.

15. Exponential and Two-parameter Exponential Distribution For both the Exponential distribution and the Two-parameter Exponential distribution the fitted distributions lacked adequate fits to the observed data, with the upper tails of these fitted distributions being too light. This is clearly illustrated by the quantile-quantile plot given in Figure 7.16.

16. Chi-square Distribution The shape of this distribution implies a relationship between the mean and the variance in that both depend on the parameter ν, which led to the distribution not providing an adequate fit to the observed values.

17. Beta-prime Distribution Figure 7.17 shows that the fitted distribution did not provide a good fit to the observed values at all, while the fitted tail was too heavy.


Figure 7.16: Quantile-quantile plot: Fitted Exponential and Two-parameter Exponential Distributions

Figure 7.17: Quantile-quantile plot: Fitted Beta-prime Distribution


Figure 7.18: Quantile-quantile plot: Fitted Dagum Distribution

18. Dagum Distribution The estimation algorithm initially moved into a state of oscillation between two sets of parameter values calculated in the iterative algorithm. In an attempt to break this oscillation cycle, smaller step sizes were introduced in the iterative process, which helped the estimation algorithm to converge.

The fitted distribution had properties that were comparable with those of the Loggamma and Inverse Gaussian distributions in that it gave a reasonable fit for the lower range of observed values. Figure 7.18 shows that the probability densities of the fitted distribution on the middle range were underestimated and that the upper tail weight was too heavy.
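The step-size damping used to break the oscillation can be illustrated on a toy fixed-point problem (a generic sketch, not the Dagum likelihood equations themselves):

```python
def damped_fixed_point(g, x0, alpha=0.5, tol=1e-10, max_iter=1000):
    """Fixed-point iteration with a shortened step,
    x <- x + alpha * (g(x) - x); alpha < 1 damps the update and can
    break a two-cycle oscillation of the undamped iteration."""
    x = float(x0)
    for _ in range(max_iter):
        x_new = x + alpha * (g(x) - x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")

# g(x) = 2.8/x oscillates forever between x0 and 2.8/x0 when alpha = 1,
# but with alpha = 0.5 the damped iteration converges to sqrt(2.8).
root = damped_fixed_point(lambda x: 2.8 / x, x0=1.0, alpha=0.5)
```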

19. Kappa Family of Distributions The estimation algorithm appeared to be sensitive to the choice of initial values. The fitted three-parameter version gave a reasonable fit for the lower range of observed values only. It can be seen in Figure 7.19 that the upper tail weight of the fitted distribution was too heavy.

20. Inverse Gamma Distribution Method-of-moments estimates of the parameters yielded a distribution with an upper tail that was too light. The maximum likelihood estimation algorithm is very sensitive to the choice of initial values, but


Figure 7.19: Quantile-quantile plot: Fitted Kappa Distribution

Figure 7.20: Quantile-quantile plot: Fitted Inverse Gamma Distribution


convergence was obtained. Figure 7.20 shows that this yielded a fitted distribution that also had a tail that was too light, with the overall fit being poor.

21. Inverse Chi-square Distribution This fitted distribution did not provide an adequate fit.

22. Snedecor’s F Distribution

Figure 7.21: Expected value of Snedecor F distributed variable for varying values of ν2

Algebraically it can be argued that this distribution is not adequate to model the claims sizes as is. For a random variable X that is F distributed with parameters ν1 and ν2, we have from Section 4.1.22

$$E(X^2) = \left(\frac{\nu_2}{\nu_1}\right)^{2}\frac{\nu_1(\nu_1 + 2)}{(\nu_2 - 2)(\nu_2 - 4)}.$$

This value can only be positive, which implies that ν1 > 0 and ν2 > 4. Also, for values of ν2 > 4 the expected value of X, E(X) = ν2/(ν2 − 2), is a decreasing function of ν2 bounded between 1 and 2 - see Figure 7.21. This means that we will not be able to find an estimate for ν2 that will yield an expected value corresponding to the observed mean.
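This behaviour of E(X) = ν2/(ν2 − 2) can be checked numerically with SciPy:

```python
from scipy import stats

# For X ~ F(nu1, nu2), E(X) = nu2 / (nu2 - 2) for nu2 > 2: it does not
# depend on nu1 and decreases towards 1 as nu2 grows, so it can never
# match a large observed sample mean.
means = [stats.f.mean(5, nu2) for nu2 in (5, 10, 50, 1000)]
```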

The F distribution will therefore not be suitable to model claims sizes as is. Instead one should consider the use of the F distribution when some transformation of the claims sizes is considered for which the range of possible values from the F distribution makes sense in explaining the behaviour of the transformed random variable.

23. Skew-normal Distribution

Figure 7.22: Skewness of Skew-normal distribution for varying values of λ

In Figure 7.22 the skewness of the Skew-normal distribution for varying values of λ is represented graphically. From this graph it is evident that as |λ| tends to infinity, the skewness tends in absolute value to approximately 0.995.

In Section 4.1.28 it was noted that as λ tends to infinity, the Skew-normal distribution tends to the half-normal distribution discussed in Section 4.1.24, [8]. The skewness of the half-normal distribution is approximately 0.995.

The skewness of the observed claims is 5.70, which far exceeds the skewness of the half-normal distribution and therefore also the maximum skewness attainable by the Skew-normal distribution. This suggests that the Skew-normal distribution cannot be fitted to the observed data. An attempt to obtain the method-of-moments estimates supported this argument in that the estimate of λ could not be computed: it is a function of the observed skewness coefficient, and the observed value yielded a negative argument to a square root.
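A sketch of the failing method-of-moments calculation, using the standard Skew-normal skewness formula (assumed to match the thesis's parameterization; the function name is ours):

```python
import numpy as np

def skewnormal_mom_lambda(gamma1):
    """Method-of-moments shape estimate for the Skew-normal based on
    the standard skewness formula. Returns nan when |gamma1| exceeds
    the attainable maximum (about 0.995): then delta^2 >= 1 and the
    square root below would receive a negative argument."""
    c = (2.0 * abs(gamma1) / (4.0 - np.pi)) ** (2.0 / 3.0)
    delta2 = (np.pi / 2.0) * c / (1.0 + c)
    if delta2 >= 1.0:
        return float("nan")
    delta = np.sign(gamma1) * np.sqrt(delta2)
    return delta / np.sqrt(1.0 - delta2)

lam_ok = skewnormal_mom_lambda(0.5)    # feasible skewness value
lam_bad = skewnormal_mom_lambda(5.70)  # the observed skewness: infeasible
```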

24. Generalized Beta Distribution of the Second Kind Convergence of the estimation algorithm could not be obtained. A grid search approach was followed to find a set of parameter values that maximized the likelihood function. In Figure 7.23 it can be seen that the goodness-of-fit was poor with the upper tail being too light.

Figure 7.23: Quantile-quantile plot: Fitted Generalized Beta Distribution of the Second Kind

In the items above the goodness-of-fit of the distributions was discussed individually. It is evident that, out of the distributions we considered, no specific distribution provided an overall good fit to the observed data. From the visual inspection of the goodness-of-fit using the quantile-quantile plots, the Loggamma, Inverse Gaussian and Lognormal distributions provided the best fit to the observed data.

Figure 7.24 shows a comparison of the quantile plots for the fitted Lognormal, Loggamma and Inverse Gaussian distributions. The probability density functions of the three fitted distributions are shown in Figure 7.25. This indicates that the Lognormal captures the overall distribution the best, with the tail behaviour reasonably well captured but with a significant underestimation of probability densities in the middle range of observed values. On the middle range of values the Loggamma and Inverse Gaussian also show an underestimation of densities, but to some extent capture it better than the Lognormal.

In Table 7.2 statistics are given that can be used to evaluate the goodness-of-fit of these three distributions as described in Sections 7.2.2 and 7.2.1, based on:

- The correlation between the observed values and the values predicted by the fitted distribution.


Figure 7.24: Quantile-quantile plots comparison for fitted Loggamma, Lognormal and Inverse Gaussian distributions

Figure 7.25: Density functions for fitted Loggamma, Lognormal and Inverse Gaussian distributions

- The R2 and slope of a linear regression fitted to assess on an overall level how much of the variation in the observed claims sizes is explained by the fitted distribution. This is done by fitting the linear regression through the origin and assessing how close the slope is to 1.


Distribution             Loggamma   Lognormal        Inverse Gaussian
Correlation              0.86       0.97             0.95
Slope                    1.16       0.77             1.26
R2                       0.73       0.93             0.89
LR Test Statistic        42541      > 2.62 × 10^11   > 3.13 × 10^12
α                        0.05       0.05             0.05
Degrees of Freedom       2          2                2
Critical Value           5.99       5.99             5.99
Parameters Significant   Yes        Yes              Yes

Table 7.2: Goodness-of-fit Statistics of the fitted distributions

- The likelihood ratio (abbreviated as LR) test statistic.
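The slope-through-the-origin and chi-square critical value computations behind Table 7.2 can be sketched as follows (function names are ours; the R^2 convention for the no-intercept fit is an assumption):

```python
import numpy as np
from scipy import stats

def origin_slope_r2(theo, obs):
    """No-intercept regression of observed on theoretical quantiles:
    slope = sum(xy)/sum(x^2); R^2 here is computed against the mean
    of the observations (one common convention for this diagnostic)."""
    theo = np.asarray(theo, dtype=float)
    obs = np.asarray(obs, dtype=float)
    slope = np.sum(theo * obs) / np.sum(theo ** 2)
    resid = obs - slope * theo
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((obs - obs.mean()) ** 2)
    return slope, r2

# Perfectly matching quantiles give slope 1 and R^2 = 1.
q = np.linspace(1.0, 100.0, 50)
slope, r2 = origin_slope_r2(q, q)

# Chi-square critical value used in Table 7.2 (alpha = 0.05, df = 2).
crit = stats.chi2.ppf(0.95, df=2)
```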

From the statistics in Table 7.2 it is evident that all three fitted distribu- tions’ parameters are significant.

In some instances industry knowledge and experience can be used to support the use of one distribution rather than another. Even though claims as extreme as those suggested by the Loggamma and Inverse Gaussian distributions have not been experienced, the business expectation may be that such large claims are likely to occur in the near future, in which case some reasoning exists to choose the Loggamma or Inverse Gaussian distribution to model the claims rather than the Lognormal distribution.

The one aspect that is not satisfactory about the goodness-of-fit of any of these three fitted distributions is the lack of fit in the middle range. The lack of fit results in an underestimation of the likelihood of fairly large claims. This underestimation is likely to result in an understatement of expected losses and may therefore severely impact aspects such as reinsurance treaties being modelled, the pricing of risks with larger risk exposure in monetary terms, the capital reserve being held and, potentially, the misstatement of ruin probabilities.

Klugman et al [75] discuss the use of a splicing method to create new distributions. This technique is somewhat similar to the technique of mixing, which is applied to cases where it appears that two or more separate random processes are responsible for generating claims. With the technique of splicing it is assumed that one model is adequate to capture the claims behaviour over one interval while other models are adequate on other intervals. A two-component spliced distribution's density function is defined as follows:

$$f(x) = \begin{cases} a_1 f_1(x), & c_0 < x < c_1, \\ a_2 f_2(x), & c_1 \le x < c_2, \end{cases}$$

where a_1 and a_2 are constants with values such that f(x) is a legitimate density function and a_1 + a_2 = 1.
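A hypothetical two-component spliced density in this spirit (the component distributions, threshold and weights below are chosen purely for illustration):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Threshold and weights chosen for illustration only; a1 + a2 = 1.
c1, a1, a2 = 100.0, 0.9, 0.1
body = stats.expon(scale=40.0)        # f1: body of the distribution
tail = stats.pareto(b=1.8, scale=c1)  # f2: Pareto tail on x >= c1

def spliced_pdf(x):
    """Two-component spliced density: the body is renormalised to
    (0, c1) so that each piece integrates to 1 on its own interval."""
    if x < c1:
        return a1 * body.pdf(x) / body.cdf(c1)
    return a2 * tail.pdf(x)

# The total mass is then a1 + a2 = 1, so f is a legitimate density.
mass_body, _ = quad(spliced_pdf, 0.0, c1)
mass_tail, _ = quad(spliced_pdf, c1, np.inf)
total = mass_body + mass_tail
```

Renormalising each component to its own interval is one common way of meeting the legitimacy condition; other weighting schemes are discussed by Nadarajah and Bakar [91].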

The tail behaviour of the observed claims may be different from the behaviour of the smaller claims, in which case a single parametric distribution may not suffice to capture the behaviour of both the small claims and the upper extreme claims sizes [66]. This serves as a motivation for using a technique such as splicing [75]. We, however, focus in this study on the fitting of the individual distributions.

In recent literature the use of techniques similar to splicing is considered. Cooray and Ananda [32] as well as Teodorescu [117] use a Lognormal-Pareto composite model, with the Lognormal distribution for the smaller losses up to some threshold and a Pareto distribution for the values above the threshold. Nadarajah and Bakar [91] discuss methods of finding the weights a_1 and a_2 so that the composite density function is legitimate, i.e. to ensure that the total area under the probability density function sums to 1. The Lognormal-Pareto and Lognormal-Burr composite distributions are applied by them to model fire insurance claims. Also see Preda and Ciumara [101] for a comparative study of a Weibull-Pareto composite and a Lognormal-Pareto composite distribution. Composites of the Lognormal, Loggamma and Gamma distributions are also discussed by Hewitt and Lefkowitz [66].

This notion of composite distributions is generally described in the literature for the situation where the upper right tail is better described by one distribution and the majority of the data to the left of the upper right tail is better described by another. So generally it means that the domains of these distributions largely overlap, with conditions specified to indicate which to use for values above the threshold and which to use for values below the threshold.

In the data that we are currently considering, the scenario is somewhat different in that it appears as if there is one random process underlying the lower part of the range of observed values and another random process underlying the upper part. This is very likely to occur where different events leading to losses on the same insured risks can be completely different in terms of the likely size of the resulting losses. For property insurance, for example, the loss resulting from a burglary might be very small in comparison to the loss resulting from a fire. This means that one can fit separate distributions to the less severe and more severe classes of losses, respectively. Note that this is somewhat different from splicing, perhaps closer to the idea of segmentation.

If we inspect the observed claims a bit closer and consider splitting the dataset at various threshold levels, it does appear as if there may potentially be two random processes present generating these claims. More specifically, it appears that if we split the data at a value of 350000 the two sets of claims exhibit reasonable distribution forms, as shown in Figures 7.26 and 7.27.

Figure 7.26: Distribution of observed claims smaller than or equal to 350000

In light of the fact that composite distributions are referred to in the literature, this approach may give an improvement on the fits we obtained with the Lognormal, Loggamma and Inverse Gaussian distributions. For this reason we fitted the parametric distributions to the two sets of claims and first visually assessed the goodness-of-fit using the quantile-quantile plots. We present the results of the distributions for which reasonable fits were obtained below.

1. Claims smaller than the threshold of 350000 The distribution of observed claims with values less than 350000 is still extremely skew, with a skewness coefficient of 3.87. For this reason the tails of the fitted distributions are generally too heavy.


Figure 7.27: Distribution of observed claims larger than 350000

Figure 7.28: Comparison of quantile-quantile plots for distributions fitted to the small claims

The metrics shown in Table 7.3 also support the fact that the observed data is very skew. As can be expected, the skewness is lower once the larger claims are excluded to be considered separately. The ratio of the variance to the square of the expected value gives a value of 3.48, which does not give any evidence against the hypothesis of the distribution having a heavy tail.


Number of observations   2840
Mean                     24457.41
Median                   8981.98
Mode                     6239.52
Standard Deviation       45652.18
Skewness                 3.87
Kurtosis                 17.48

Table 7.3: Descriptive Statistics of the Observed Claims smaller than 350000

The following distributions yielded the best fits:

- Inverse Gaussian Distribution This distribution provides a good fit over the lower half of ob- served claims with some degree of underestimation of the upper tail weight. The extreme upper tail weight is too heavy, but this is present for less than 0.07% of the observations. - Generalized Pareto Distribution The distribution provides a good fit for the lower and middle ranges of observed values, but has a tail that is too heavy.

The quantile-quantile plots of these two fitted distributions are shown in Figure 7.28. From these plots it can be seen that both distributions show some lack of fit, with an underestimation in the middle range while the upper tails are too heavy. The fitted Inverse Gaussian has a tail that is lighter than the tail of the Generalized Pareto distribution and for this reason appears to provide the better fit. The goodness-of-fit statistics as discussed in Sections 7.2.2 and 7.2.1 are given in Table 7.4. These statistics support the visual interpretation from the quantile-quantile plots and also indicate that the parameters of the fitted distributions are significant. The probability density functions of the fitted distributions are graphically compared in Figure 7.29. In general it is evident that there is no specific distribution that provides a very good fit to the observed claims with values less than 350000.

2. Claims larger than the threshold of 350000 There are 126 observed claims larger than the threshold value. The distribution of these claims is still positively skewed, with a skewness coefficient of 1.45. As can be seen in Figure 7.27 and from the metrics given in Table 7.5, the upper tail of the observed claims' distribution is not that heavy, and it can therefore be expected that it will be easier to fit a distribution to these observed claims. The ratio of the variance to the square of the mean yields a value of 0.64, which indicates that the distribution does not have a heavy tail.


Figure 7.29: Density functions of the fitted Inverse Gaussian and Generalized Pareto distributions

Since most of the distributions considered have domains on the positive real line, it was decided to subtract the threshold value from all of the claims under consideration and to then fit distributions to these transformed claims sizes. The following distributions yielded the best fits:

- Gamma Distribution The overall fit of the distribution is very good across the whole range of observed values. There is a slight indication that the tail might be too light.

- Inverse Gaussian Distribution We considered the method-of-moments estimates, since convergence of the maximum likelihood estimation algorithm could not be obtained. This yields a reasonable fit, but with some overestimation of the probability densities on the lower values.

- Lognormal Distribution The fitted distribution provides a reasonable fit across a large portion of the range of transformed claim values. The tail of the fitted distribution is too heavy, though.

- Folded Normal Distribution Despite the rate of convergence of the iterative estimation being quite slow, the distribution provides a very good fit to the transformed claims.

Distribution             Inverse Gaussian   Generalized Pareto
Correlation              0.98               0.83
Slope                    0.93               1.31
R2                       0.95               0.69
LR Test Statistic        > 5.31 × 10^11     32822
α                        0.05               0.05
Degrees of Freedom       2                  3
Critical Value           5.99               7.81
Parameters Significant   Yes                Yes

Table 7.4: Goodness-of-fit Statistics of the fitted distributions to claims smaller than 350000

Number of observations   126
Mean                     446987.37
Median                   373045.90
Mode                     373045.90
Standard Deviation       358511.94
Skewness                 1.45
Kurtosis                 2.85

Table 7.5: Descriptive Statistics of the Observed Claims larger than 350000

- Gumbel Distribution The fitted distribution provides a reasonable fit to a large portion of the range of transformed claim values, but in the context of modeling claims sizes the distribution has a shortcoming in that it is defined on the whole real line and therefore has non-zero probability density on the negative half of the real line. The fitted distribution's upper tail is also slightly too light.

- Two-parameter Exponential Distribution The fitted distribution provides a reasonable fit to the transformed claim values. The upper tail of the fitted distribution is too heavy, with the probability densities on the lower range of values being underestimated.


- Three-parameter Kappa Family of Distributions The fitted three-parameter Kappa distribution provides a reasonable fit. There is an overestimation of the probability densities on the extreme lower values, which results in a slight underestimation of probability densities across the whole range of values. Furthermore the upper tail of the fitted distribution is slightly too light.

Figure 7.30: Comparison of quantile-quantile plots for distributions fitted to the large claims

Generally the distributions fitted to these larger values provide very good fits to the transformed claim values. The ability of the distributions to capture the tail behaviour is also better compared with the ability of the distributions fitted to the smaller claims. From the observations made above for the 7 distributions it is evident that the Gamma, Inverse Gaussian and Folded Normal distributions provide the best fits across the whole range of transformed claim values. A visual comparison of these three distributions is given in the form of quantile-quantile plots in Figure 7.30. The goodness-of-fit statistics as discussed in Sections 7.2.2 and 7.2.1 are given in Table 7.6.

It is seen in Figure 7.30 that all three fitted distributions provide reasonable fits to the observed claims, but with the Inverse Gaussian having slightly too much weight in the upper tail. The density functions of these three fitted distributions are graphically compared in


CHAPTER 7. PRACTICAL APPLICATION 264

Distribution             Gamma            Inverse Gaussian   Folded Normal

Correlation              0.99             0.97               0.99
Slope                    0.94             1.04               1.07
R^2                      0.98             0.94               0.99

LR Test Statistic        > 1.13 × 10^11   > 1.22 × 10^11     4.12 × 10^19
α                        0.05             0.05               0.05
Degrees of Freedom       2                2                  2
Critical Value           5.99             5.99               5.99
Parameters Significant   Yes              Yes                Yes

Table 7.6: Goodness-of-fit Statistics of the fitted distributions to claims larger than 350000

Figure 7.31.

Figure 7.31: Density functions of the fitted Gamma, Inverse Gaussian and Folded Normal distributions


CHAPTER 7. PRACTICAL APPLICATION 265

7.4.3 Conclusion

The application of the estimation algorithms as derived in Chapter 6 to the real claims size data illustrated how these techniques can be used in practice. The application highlighted the importance of choosing suitable initial values for the iterative estimation algorithms. Furthermore, it echoed the arguments often found in the literature that it is often difficult to obtain a single distribution that provides a good fit to both the smaller and extremely large observed values [75], [32], [91].

The concept of composite or spliced distributions was considered. The aim of looking at such an approach was not to find the final composite distribution, but rather to motivate why it makes sense to fit different distributions on different ranges of the observed values. In addition to fitting distributions to all claims together, we used a threshold value to form two sets of claims on two different ranges and then fitted distributions to each of the sets.

With our choice of a threshold value of 350000 we have seen that the distribution shape for the smaller claims is completely different from the shape for the larger claims. This supported the approach of modeling the small and larger claims separately. Due to the small claims' distribution still being highly skewed, it was difficult to obtain good fitted distributions. For the large claims a few distributions provided very good fits to the observed data.

In such a context, where it is difficult to obtain a good fitted distribution on a portion of the range, Klugman et al specifically mention the use of composite or spliced distributions where one distribution is parametric and the other empirical [75].

Generally it was seen that the Inverse Gaussian and Lognormal distributions provide good fits to the observed claims irrespective of whether all, only the small or only the large claims are considered. Both of these distributions are related to the Normal distribution, which adds some appeal to using them, since the Normal distribution has well-known properties, is widely known and is easy to use in analysis, modeling and simulations.

Our choice of a threshold value of 350000 to distinguish between small and large claims was decided on after considering a range of possible thresholds. One may, however, consider fitting composite distributions on smaller and larger claims based on various thresholds in order to evaluate whether a smaller or larger threshold value yields better fits to either one or both of the smaller and larger claims. This suggests that the exercise of finding a threshold and fitting multiple distributions may very


likely become an iterative process in which the threshold is changed from one iteration to another, the parametric distributions re-fitted and the goodness-of-fit re-assessed until satisfactory fitted distributions are obtained.
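The iterative threshold search described above can be sketched as follows. This is an illustrative Python outline, not the SAS used in the study; for brevity an exponential branch (whose MLE is the sample mean) stands in for the richer parametric families actually fitted at each iteration:

```python
import numpy as np

def choose_threshold(claims, candidates):
    # For each candidate threshold u, fit an exponential by MLE to the
    # claims at or below u and to the excesses above u, and keep the
    # split with the largest combined log-likelihood.  In practice each
    # iteration would compare several families and formal
    # goodness-of-fit statistics, as described in the text.
    claims = np.asarray(claims, float)
    best_u, best_ll = None, -np.inf
    for u in candidates:
        small = claims[claims <= u]
        large = claims[claims > u] - u
        if small.size < 2 or large.size < 2:
            continue
        ll = 0.0
        for part in (small, large):
            theta = part.mean()                      # exponential MLE
            ll += (-np.log(theta) - part / theta).sum()
        if ll > best_ll:
            best_u, best_ll = u, ll
    return best_u, best_ll
```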

Finally, more than one potential distribution or more than one set of composite distributions may be obtained that can be fitted to the data. As we have seen in our real-life application, in some instances it might be very difficult to capture the behaviour of the underlying risk process on certain ranges of values, in which case expert judgement and industry knowledge should play a role to ascertain whether, for instance, a heavier or lighter tail is required. Modeling of claim sizes will ultimately affect estimated aggregate losses, which in turn affects pricing strategies, reserving and the cession of risks (if reinsurance treaties are present or are considered). Consequently it ultimately has an influence on the probability of ruin.


Chapter 8

Conclusion

The aim of this chapter is to summarize the different components this study consisted of and the results obtained in the application of the derived techniques, and to suggest further research topics that may supplement the study conducted.

8.1 Overview of study conducted

The key focus of this study was to consider claims sizes as a stochastic component forming part of a typical general insurance risk model.

The study started with a detailed literature review, which indicated frequent use of and reference to such a risk model. It further revealed that research has been focussed on all components within this model, with a strong focus on studying techniques for modeling ruin probabilities in the presence of reinsurance treaties and uncertainty associated with the returns on the investment of premiums collected. Research has also been done on modeling claim frequencies and claims sizes, with the Lognormal, Pareto, Weibull and Burr distributions being used most frequently.

The general insurance risk model together with its individual stochastic elements was introduced in Chapter 2. Going forward, the focus of this study was to develop techniques that can be used to understand, describe and model claims sizes, and in particular the behaviour associated with claims size distributions that are heavy-tailed.

It was argued in Chapter 1 that the distributions of claims sizes often exhibit skewness, where the skewness is associated with events leading to very large to extreme losses. From a reserving perspective as well as from a ruin



probability perspective it is important to have an understanding of the likelihood of such large losses in order to predict large claim occurrences with a reasonable level of accuracy. This means that one will have to know how to model the distribution of these losses. Furthermore, if there is a fairly large probability of experiencing very large losses, an understanding of distributions' tail heaviness is required and it should be allowed for in the modeling process, in which case distributions with heavy tails can be considered.

A variety of univariate parametric distributions that can be used for modeling claims sizes were introduced in Chapter 4. These distributions are all either positively skewed or can take on positively skewed forms depending on their parameter values. The key properties of these distributions were introduced together with special parameterizations. Links between parametric distributions have been discussed.

In Chapter 6 techniques were studied that can be used for fitting the parametric distributions that were introduced in Chapter 4. Generally either or both the maximum likelihood and method-of-moments estimation techniques were considered. Since most of the parametric distributions have more than one parameter, the maximum likelihood estimation became a multivariate numerical problem. Consequently, iterative multivariate estimation algorithms were derived and SAS PROC IML programs developed to perform these parameter estimations numerically. For each distribution the estimation algorithm was tested on a sample simulated from the distribution itself, after which the efficiency, likelihood of convergence of the algorithm and the accuracy of the estimates were evaluated and discussed.
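As an illustration of these iterative estimation algorithms, the Newton-Raphson update for the Gamma shape parameter used in the PROC IML program of Appendix A.1 can be sketched in Python (an illustrative translation; scipy is assumed for the digamma and trigamma functions):

```python
import numpy as np
from scipy.special import digamma, polygamma

def fit_gamma_mle(x, tol=1e-10, max_iter=500):
    # Newton-Raphson on the profile score for the Gamma shape kappa,
    # with the scale theta profiled out as xbar/kappa, mirroring the
    # update used in the SAS program of Appendix A.1.
    x = np.asarray(x, float)
    xbar = x.mean()
    wbar = np.log(x).mean()
    kappa = xbar ** 2 / x.var()              # method-of-moments start
    for _ in range(max_iter):
        g = np.log(kappa) - digamma(kappa) - np.log(xbar) + wbar
        gprime = 1.0 / kappa - polygamma(1, kappa)   # trigamma
        step = g / gprime
        kappa = kappa - step
        if abs(step) < tol:
            break
    return kappa, xbar / kappa               # (kappa_hat, theta_hat)
```

The method-of-moments estimate serves as the initial value, exactly as in the SAS program, which is what makes convergence reliable in practice.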

Since it was indicated that it is important to have an idea of the degree of upper tail heaviness, an in-depth study was conducted in Chapter 3 to understand what is meant by tail heaviness. The concept of a heavy tail was formally defined. Measures to evaluate tail behaviour such as hazard rates, hazard rate functions, mean residual hazard rate functions and exponential bounds were introduced together with illustrative examples. These measures can be used to describe the upper tail weight of a parametric distribution, while some measures can also be used to compare the upper tail weights of two distributions. Based on these measures, a set of techniques was derived that was used in Chapter 5 to evaluate the heaviness of the tails of the parametric distributions introduced in Chapter 4. It was seen that it is generally not very easy to confirm whether or not a distribution has a heavy tail.
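A simple empirical counterpart of these residual-life measures is the sample mean excess function; a minimal sketch, assuming only positive observations:

```python
import numpy as np

def mean_excess(x, thresholds):
    # Empirical mean residual life e(u) = average of (X - u) over the
    # observations exceeding u.  For an exponential tail e(u) is
    # roughly constant in u; an e(u) that increases with u suggests a
    # heavier-than-exponential upper tail.
    x = np.asarray(x, float)
    return np.array([(x[x > u] - u).mean() for u in thresholds])
```

Plotting e(u) against u for the observed claims gives a quick visual diagnostic before any of the formal measures of Chapter 3 are applied.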

The statistical techniques discussed in Chapters 3 to 6 were then applied to a real-life general insurance claims dataset in order to ascertain how well these techniques may work in practice. This application was discussed in


detail in Chapter 7.

8.2 Results and Observations

Based on the literature review it is clear that a great amount of research is available on all components of the general insurance risk model, including the use of statistical distributions to model claims sizes.

The research conducted revealed that there exist many parametric distributions suitable for modeling claims sizes based on their domain and shape. There are also many variations on parametric distributions, such as the composite and spliced distributions [91], [101], that can be considered.

Maximum likelihood is a well-known technique for fitting distributions. It appeared, however, to not always be easy to use on real-life data, with difficulties experienced in obtaining convergence of the estimation algorithms.

Techniques and metrics to describe and detect tail heaviness proved to often be algebraically complex when applied to parametric distributions. Furthermore, it was found that whilst it is often possible to find a counter-argument against the hypothesis of heavy-tailedness, and thereby confirm that a distribution does not have a heavy tail, it is generally more difficult to confirm that a distribution does have a heavy tail.

The application of the techniques gathered in this study to real-life claims data revealed that it may be difficult to find a single parametric distribution to describe the distribution of the observed claims. Whilst recognising that one may segment data and perform modeling on the segmented portions, it may still be difficult to find a distribution that fits the data well, especially across the whole range of observed values. In this light it was acknowledged that judgment cannot be applied solely from a theoretical perspective, but also requires business expert judgment on whether fitted distributions make sense. In the practical application parametric distributions were fitted with very high levels of goodness-of-fit. The general shortcoming was on the upper tails, where the fitted distributions' upper tails were generally heavier than the observed distributions' upper tail weights. Such fits may still be adequate if such tail heaviness is considered by business practitioners to be likely, despite not having observed such extreme events historically.



8.3 Future Research

Based on the literature review and study conducted, together with the results from the application, the following items were identified for further research:

• Composite distributions It was realised only during the application stage that it can become difficult to find a single distribution providing an adequate fit to the whole range of observed claims. In the application a segmentation based on the claim size was performed and separate models fitted to these segments with reasonably good results. In the literature the use of composite distributions received considerable attention by authors such as Klugman et al [75], Nadarajah and Bakar [91], Cooray and Ananda [32], Teodorescu [117] and Preda and Ciumara [101]. This technique performs very well if there is a specific distribution that can provide a good fit on the tail or upper range of observed values while another distribution fits well on the lower range of observed values. An overlay of these distributions can then be considered in the form of a branching function where the final fitted model is a single distribution, but with two branches coming from two different parametric distributions. As a result, one will have to incorporate these two parametric forms and the cut-off point between the two branches into the model fitting algorithm.
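Such a branching function can be written as a two-component spliced density. A hedged Python sketch of the general form discussed in [75] (the function names and the fixed weight w are illustrative; in a full fit w, the cut-off c and both sets of parameters would be estimated jointly):

```python
import numpy as np

def spliced_pdf(x, pdf_body, cdf_body, pdf_tail, cdf_tail, c, w):
    # Two-branch spliced density with mass w on [0, c] and 1-w above c:
    #   f(x) = w * pdf_body(x) / cdf_body(c)             for x <= c
    #        = (1-w) * pdf_tail(x) / (1 - cdf_tail(c))   for x >  c
    # Each branch is renormalized so the whole density integrates to 1.
    x = np.asarray(x, float)
    body = w * pdf_body(x) / cdf_body(c)
    tail = (1 - w) * pdf_tail(x) / (1 - cdf_tail(c))
    return np.where(x <= c, body, tail)
```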

• Censoring and treatment of excess amounts The data used in the application did not include any indication of whether excess agreements were in place, what the excess values were and whether the claims sizes in the dataset were inclusive or exclusive of excess amounts. The presence of an excess generally affects both the likelihood of a policyholder to claim as well as the actual claim size [22]. This means that if excesses were in place for the insured events of which we modelled the claims, a degree of censoring is present and should ideally be allowed for. For this reason it would make sense to consider left censored distributions - see for example [75], [11]. Furthermore, the values modelled should consistently be inclusive or exclusive of the excess amounts, depending on whether total loss values or net claims paid amounts are required to be modelled.
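As a simplified illustration of allowing for left censoring, consider an exponential claim size where n_cens claims are only known to fall below the excess d. The log-likelihood combines the censoring probability with the density of the fully observed claims and can be maximized numerically (a Python sketch under these stated assumptions, not the treatment in [75], [11] themselves; a real application would use the distributions of Chapter 4):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def exp_mle_left_censored(x_obs, n_cens, d):
    # MLE of the exponential mean theta when n_cens claims are only
    # known to lie at or below d (left censoring) and x_obs holds the
    # exactly observed claims above d.
    x_obs = np.asarray(x_obs, float)

    def negloglik(theta):
        ll = n_cens * np.log1p(-np.exp(-d / theta))    # log P(X <= d)
        ll += (-np.log(theta) - x_obs / theta).sum()   # exact claims
        return -ll

    res = minimize_scalar(negloglik, bounds=(1e-6, 1e6), method="bounded")
    return res.x
```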

• Dependent risks Whilst it is often assumed that insured risks are independent, certain classes of risk are exposed to a degree of dependence. An example would be where a single natural disaster such as a hail storm can affect multiple policyholders at the same time, in which case they are not completely independent in terms of the insured event. Cossette et al [33] considered copulas whereby allowance is made for the dependence between risks. In other studies copulas are used to describe influences from macro-economic factors on the dependencies that exist among processes - see [60].

• Generalized distributions Parametric distributions are popular because their properties are often well-known, which enables the user to easily perform analysis if the problem at hand can be described by a parametric distribution. In our application it was seen that it may be difficult to always find a parametric distribution providing a good fit to the underlying data, especially on the full range of observed values. Generalized skew distributions have been considered in the literature - see Goerg [58] and Azzalini [8]. In these generalized distributions a flexible approach is introduced to model skewed data by means of generalizing a symmetric parametric distribution, such as the Normal or t distribution, to allow for the asymmetry present in the data.
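Azzalini's skew-normal construction referred to above has density f(z) = 2·φ(z)·Φ(αz) with z = (x − loc)/scale, which reduces to the Normal density when α = 0. A minimal sketch:

```python
import numpy as np
from scipy.special import erf

def skew_normal_pdf(x, alpha, loc=0.0, scale=1.0):
    # Azzalini skew-normal density: 2 * phi(z) * Phi(alpha * z) / scale,
    # where phi and Phi are the standard Normal pdf and cdf.
    z = (np.asarray(x, float) - loc) / scale
    phi = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    Phi = 0.5 * (1 + erf(alpha * z / np.sqrt(2)))
    return 2.0 * phi * Phi / scale
```

Positive α shifts probability mass to the right, giving exactly the kind of positive skewness seen in claims data while retaining the Normal distribution as a special case.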

• Non-parametric methods As was seen in the application of the techniques to fit the parametric distributions to real-life data, it is not easy to find distributions that provide a good fit to the whole range of observed values. Similar observations may be made even when considering alternatives such as the composite distributions. In such cases it may be worthwhile to consider non-parametric techniques such as kernel density estimation, which is an unsupervised learning procedure [65]. Essentially a refinement of a histogram is constructed based on observed values. This technique is useful to summarise the distribution of data in a non-parametric way and may be a useful step towards constructing a parametric model [57], [81].
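A minimal Gaussian kernel density estimator illustrates the idea (a Python sketch; the Silverman rule-of-thumb bandwidth is an assumption, not a recommendation from the cited works):

```python
import numpy as np

def gaussian_kde(data, grid, bandwidth=None):
    # Gaussian kernel density estimate evaluated on `grid`, i.e. the
    # smoothed refinement of a histogram described in the text.
    data = np.asarray(data, float)
    grid = np.asarray(grid, float)
    n = data.size
    if bandwidth is None:
        bandwidth = 1.06 * data.std() * n ** (-1 / 5)  # Silverman rule
    u = (grid[:, None] - data[None, :]) / bandwidth
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))
```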


Appendix A

SAS Code

The following sections show the SAS code for the simulation and maximum likelihood estimation of the parameters for the distributions as discussed in Chapter 4.

A.1 Gamma Distribution

/* ======*/ /* FITTING GAMMA DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */

/* ======*/

/* Simulate n observations from a Gamma(theta,kappa) distribution */ %let theta = 5; %let kappa = 3; %let n = 100000; data Gamma; do i = 1 to &n; X = &theta*rangam(0,&kappa); output; end; drop i; run; proc univariate data = Gamma; var X; histogram / gamma; inset n mean(5.3) std='Std Dev'(5.3) skewness(5.3);



output out=Moments mean=_MEAN_ std=_STDDEV_ skewness=_SKEWNESS_ kurtosis=_KURTOSIS_; run; data Moments; set Moments; kappa_hat = (_MEAN_**2)/(_STDDEV_**2); theta_hat = (_STDDEV_**2)/_MEAN_; call symput("initial_kappa",kappa_hat); call symput("initial_theta",theta_hat); run;

/* Fit Gamma distribution using maximum likelihood estimation */ proc iml; use Gamma; read all into X_Obs; use Moments; read all into Moments; Moments = Moments‘; theta_old = &initial_theta; kappa_old = &initial_kappa; iteration = J(1000,1,-1); kappa_old = J(1,1,&initial_kappa); kappa_new = J(1,1,1); xbar = (1/nrow(X_Obs))*sum(X_Obs); wbar = (1/nrow(X_Obs))*sum(log(X_Obs)); do i=1 to 1000; kappa_new[1,1] = kappa_old - (log(kappa_old)-digamma(kappa_old) -log(xbar)+wbar)/((1/kappa_old)-trigamma(kappa_old)); if i=1 then difference = kappa_new-kappa_old; iteration[i,1] = kappa_new; difference = kappa_new-kappa_old; kappa_old[1,1] = kappa_new[1,1];


if abs(difference) < 0.0001 then i=1000; end; kappa_hat = kappa_new; theta_hat = xbar/kappa_hat; print kappa_new theta_hat; quit;

A.2 Exponential Distribution

/* ======*/ /* FITTING EXPONENTIAL DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

/* Simulate n observations from an Exponential(theta) distribution */ %let theta = 5; %let n = 100000;

data Exponential; do i = 1 to &n; X = &theta*rangam(0,1); output; end; drop i; run; proc univariate data = Exponential; var X; histogram / gamma; inset n mean(5.3) std=’Std Dev’(5.3) skewness(5.3); output out=Moments mean=_MEAN_ std=_STDDEV_ skewness=_SKEWNESS_ kurtosis=_KURTOSIS_; run;

/* Note that the maximum likelihood estimate is equal to the method-of-moments estimate */ data Moments;


set Moments; theta_hat = _MEAN_; call symput("initial_theta",theta_hat); run;

A.3 Chi-square Distribution

/* ======*/ /* FITTING CHI-SQUARE DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

/* Simulate n observations from Chi-square(nu) distribution */ %let nu = 5; %let n = 100000; data ChiSquare; do i = 1 to &n; X = 2*rangam(0,&nu/2); output; end; drop i; run; proc univariate data = ChiSquare; var X; histogram; inset n mean(5.3) std=’Std Dev’(5.3) skewness(5.3); output out=Moments mean=_MEAN_ std=_STDDEV_ skewness=_SKEWNESS_ kurtosis=_KURTOSIS_; run; data Moments; set Moments; nu_hat = _MEAN_; call symput("nu_hat",nu_hat); run;



/* Fit the Chi-square distribution using maximum likelihood estimation */ proc iml; use ChiSquare; read all into X_Obs; kappa_old = &nu_hat; do i = 1 to 1000; kappa_new = kappa_old - (digamma(kappa_old)-(1/nrow(X_Obs)) *sum(log(X_Obs))+log(2))/(trigamma(kappa_old)); difference = kappa_new - kappa_old; kappa_old = kappa_new; if abs(difference) < 0.0001 then i=1000; end; nu_hat = 2*kappa_new; print nu_hat;

quit;

A.4 Two-parameter Exponential Distribution

/* ======*/ /* FITTING TWO-PARAMETER EXPONENTIAL DISTRIBUTION USING */ /* MAXIMUM LIKELIHOOD ESTIMATION */ /* ======*/

/* Simulate n observations from a Two-parameter Exponential(theta,eta) distribution */ %let theta = 5; %let eta = 3; %let n = 100000; data TwoParmaterExponential; do i = 1 to &n; X = &theta*rangam(0,1)+&eta;


output; end; drop i; run; proc univariate data = TwoParmaterExponential; var X; histogram; inset n mean(5.3) std='Std Dev'(5.3) skewness(5.3); output out=Moments mean=_MEAN_ std=_STDDEV_ min=_MIN_ skewness=_SKEWNESS_ kurtosis=_KURTOSIS_; run;

/* Note that the maximum likelihood estimate is equal to the method-of-moments estimate */ data Moments; set Moments; eta_hat = _MIN_; theta_hat = _MEAN_-_MIN_; run;

A.5 Erlang Distribution

/* ======*/ /* FITTING ERLANG DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

/* Simulate n observations from a Erlang(n,lambda) distribution */ %let lambda = 5; %let n = 10; %let k = 100000; data Erlang; do i = 1 to &k; X = (1/&lambda)*rangam(0,&n); output; end; drop i; run;


proc univariate data = Erlang; var X; histogram / gamma; /* Note that 1 divided by the scale parameter estimate is the estimate for lambda */ inset n mean(5.3) std='Std Dev'(5.3) skewness(5.3); output out=Moments mean=_MEAN_ std=_STDDEV_ n=_COUNT_ skewness=_SKEWNESS_ kurtosis=_KURTOSIS_; run;

/* Note that the maximum likelihood and method-of-moments estimates of lambda are the same */ data Moments; set Moments; lambda_hat = &n / _MEAN_; run;

A.6 Frechet Distribution

/* ======*/ /* FITTING FRECHET DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */

/* ======*/

/* Simulate n observations from a Frechet(beta,delta,lambda) distribution */ %let beta = 8; %let delta = 2; %let lambda = 100; %let n = 100000; data Frechet; do i = 1 to &n; U = ranuni(0); /* Random uniform(0,1) variable */ X = &delta*((-log(U))**(-1/&beta))+&lambda; /* Using the probability integral transformation */ output; end; drop i U; run; proc univariate data = Frechet;


var X; histogram; inset n mean(5.3) std='Std Dev'(5.3) skewness(5.3); output out=Moments mean=_MEAN_ std=_STDDEV_ skewness=_SKEWNESS_ kurtosis=_KURTOSIS_; run; quit; data _NULL_; set Moments; call symput("Mean",_MEAN_); run;

/* Fit the Frechet distribution using maximum likelihood estimation */ proc iml; beta_old = 10; delta_old = 1; lambda_old= &Mean-delta_old*Gamma(1-1/beta_old); use Frechet; read all into X_Obs;

N_obs = nrow(X_Obs);

Partial_1 = J(3,1,0); Partial_2 = J(3,3,0); do i=1 to 2500;

B_Old = beta_old // delta_old // lambda_old;

/* First Order Partial Derivatives */ dl_dbeta = N_Obs/beta_old -sum(log(J(N_obs,1,delta_old) /(X_Obs-J(N_Obs,1,lambda_old)))#((J(N_obs,1,delta_old) /(X_Obs-J(N_Obs,1,lambda_old)))##beta_old-J(N_Obs,1,1))); dl_ddelta = -beta_old*(delta_old**(beta_old-1)) *sum((X_Obs-J(n_obs,1,lambda_old))##(-beta_old)) +beta_old*N_obs/delta_old;



dl_dlambda = -((beta_old)/(delta_old))*sum((J(N_obs,1,delta_old) /(X_Obs-J(N_Obs,1,lambda_old)))##(beta_old+1)) +(1+beta_old)*sum((x_Obs-J(n_obs,1,lambda_old))##(-1));

/* Second Order Partial Derivatives */ /* (the cross-derivatives w.r.t. (lambda,beta) and (lambda,delta) were interchanged in the original listing; swapped here) */ dl2_dbeta_dbeta = -N_Obs/(beta_old**2) - sum(((((X_Obs-J(N_Obs,1,lambda_old))##(-1)) #(J(N_Obs,1,delta_old)))##beta_old) #((log(((X_Obs-J(N_Obs,1,lambda_old))##(-1)) #(J(N_Obs,1,delta_old))))##(2))); dl2_ddelta_dbeta = N_Obs/delta_old-(1/delta_old) *sum((((X_Obs-J(N_Obs,1,lambda_old))##(-1)) #(J(N_Obs,1,delta_old)))##(beta_old)) -beta_old/delta_old *sum(((((X_Obs-J(N_Obs,1,lambda_old))##(-1)) #(J(N_Obs,1,delta_old)))##(beta_old)) #(log((((X_Obs-J(N_Obs,1,lambda_old))##(-1)) #(J(N_Obs,1,delta_old)))))); dl2_dlambda_ddelta = -(((beta_old)/(delta_old))**(2))

*sum((((X_Obs-J(N_Obs,1,lambda_old))##(-1)) #(J(N_Obs,1,delta_old)))##(beta_old+1)); dl2_ddelta_ddelta = -((beta_old*(beta_old-1))/(delta_old**2)) *sum((((X_Obs-J(N_Obs,1,lambda_old))##(-1)) #(J(N_Obs,1,delta_old)))##(beta_old)) -(N_Obs*beta_old)/(delta_old**2); dl2_dlambda_dbeta = -(1/delta_old) *sum((((X_Obs-J(N_Obs,1,lambda_old))##(-1)) #(J(N_Obs,1,delta_old)))##(beta_old+1)) -(beta_old/delta_old) *sum(((((X_Obs-J(N_Obs,1,lambda_old))##(-1)) #(J(N_Obs,1,delta_old)))##(beta_old+1)) #(log((((X_Obs-J(N_Obs,1,lambda_old))##(-1)) #(J(N_Obs,1,delta_old)))))) +sum((X_Obs-J(N_Obs,1,lambda_old))##(-1)); dl2_dlambda_dlambda = -(beta_old*(beta_old+1))/(delta_old**2) *sum((((X_Obs-J(N_Obs,1,lambda_old))##(-1))



#(J(N_Obs,1,delta_old)))##(beta_old+2)) +(beta_old+1) *sum(((X_Obs-J(N_Obs,1,lambda_old))##(-2)));

Partial_1 = dl_dbeta // dl_ddelta // dl_dlambda; Partial_2 = (dl2_dbeta_dbeta || dl2_ddelta_dbeta || dl2_dlambda_dbeta) // (dl2_ddelta_dbeta || dl2_ddelta_ddelta || dl2_dlambda_ddelta) // (dl2_dlambda_dbeta || dl2_dlambda_ddelta || dl2_dlambda_dlambda);

B_New = B_Old - ((Partial_2)**(-1))*(Partial_1); Difference = B_New - B_Old;

B_Old = B_New; beta_old = B_Old[1]; delta_old = B_Old[2]; lambda_old = B_Old[3]; diff = sum(abs(Difference)); if diff < 0.00001 then i = 2500;

end; print beta_old delta_old lambda_old diff; quit;

A.7 Weibull Distribution

/* ======*/ /* FITTING WEIBULL DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

/* Simulate n observations from Weibull(beta,delta,lambda) distribution */ %let beta = 2.5; %let delta = 10; %let lambda = 5; %let n = 100000;



data Weibull; do i = 1 to &n; U = ranuni(0); /* Random uniform(0,1) variable */ /* FOR MAXIMA */ X = ( &delta*((-log(1-U))**(1/&beta))-&lambda); /* Using the probability integral transformation */ output; end; drop i U; run; proc means data = Weibull noprint; var X; output out=Lambda_Estimate min=Lambda_Hat; run; data _NULL_; set Lambda_Estimate; call symput("Lambda_Hat",-Lambda_Hat); run; data Weibull_Adj(keep = X_Adj); set Weibull;

X_Adj = X+&Lambda_Hat+0.0001; run;

/* Fit the Weibull distribution using maximum likelihood estimation */ proc iml; beta_old = 5; theta_old = 8; use Weibull_Adj; read all into X_Obs;

/*print X_Obs;*/

N_obs = nrow(X_Obs);

Partial_1 = J(2,1,0); Partial_2 = J(2,2,0);



B_Old = beta_old // theta_old; do i=1 to 1000;

Ratio = X_Obs#J(N_Obs,1,1/theta_old);

/* First Order Partial Derivatives */ /* (the (beta_old-1) factor on sum(log(X_Obs)) in the original listing does not follow from the Weibull log-likelihood; corrected here) */ dl_dbeta = N_Obs/beta_old+sum(log(X_Obs))-N_Obs *log(theta_old)-sum((Ratio##beta_old)#log(Ratio)) ; dl_theta = -N_Obs*beta_old/theta_old+(beta_old/theta_old) *sum(Ratio##beta_old);

/* Second Order Partial Derivatives */ dl2_dbeta_dbeta = -N_Obs/(beta_old**2) -sum((Ratio##beta_old)#((log(Ratio))##2)); dl2_dtheta_dbeta = -N_Obs/theta_old+(1/theta_old) *sum((Ratio##beta_old)#(J(N_Obs,1,1) +(J(N_Obs,1,beta_old))#(log(Ratio)))); dl2_dtheta_dtheta = N_Obs*beta_old/(theta_old**2) -beta_old*(beta_old+1)/(theta_old**2) *sum(Ratio##beta_old);

partial_1 = dl_dbeta // dl_theta; partial_2 = (dl2_dbeta_dbeta || dl2_dtheta_dbeta ) // (dl2_dtheta_dbeta || dl2_dtheta_dtheta) ;

B_New = B_Old - (inv(partial_2))*partial_1; beta_old = B_New[1]; theta_old = B_New[2]; difference=sum(abs(B_New-B_Old)); B_Old = B_New; if difference < 0.00001 then i=1000; end; print difference B_New; quit;



/* Regression Approach */ proc sort data = Weibull_Adj; by X_Adj; run;

%let co = %sysfunc(open(Weibull_Adj)); %let cn = %sysfunc(attrn(&co,nobs)); %let cc = %sysfunc(close(&co)); data Weibull_Adj; set Weibull_Adj; by X_Adj; F_X = _N_ / (&cn+1); log_X = log(X_Adj); log_Q = log(-log(1-F_X)); run; proc reg data=Weibull_Adj outest=test; model log_X = log_Q; output out=Residuals r=Res; run; quit;

proc gplot data = Residuals; plot Res*X_Adj; run;

A.8 Rayleigh Distribution

/* ======*/ /* FITTING RAYLEIGH DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

/* Simulate n observations from a Rayleigh(theta) distribution */ %let theta = 5; %let n = 100000; data Rayleigh; do i = 1 to &n; U = ranuni(0); /* Random uniform(0,1) variable */



X = &theta*sqrt(-log(1-U)); /* Using the probability integral transformation */ output; end; drop i U; run; proc univariate data = Rayleigh; var X; histogram / Rayleigh; /* The estimate of Theta is given by sqrt(2)*Sigma */ inset n mean(5.3) std=’Std Dev’(5.3) skewness(5.3); output out=Moments mean=_MEAN_ std=_STDDEV_ skewness=_SKEWNESS_ kurtosis=_KURTOSIS_ N = _COUNT_; run; data Moments; set Moments; Theta_Hat_MME = 2*_MEAN_/sqrt(CONSTANT(’PI’)); Theta_Hat_MLE = sqrt((_STDDEV_**2+_MEAN_**2)); run;

A.9 Gumbel Distribution

/* ======*/ /* FITTING GUMBEL DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

/* Simulate n observations from a Gumbel(xi,alpha) distribution */ %let xi = 0.8; %let alpha = 5; %let n = 10000; data Gumbel; do i = 1 to &n; U = ranuni(0); /* Random uniform(0,1) variable */ X = -&alpha*log(-log(U))+&xi; /* Using the probability integral transformation */ output; end; drop i U;


run;

/* Regression Approach */ %let co = %sysfunc(open(Gumbel)); %let cn = %sysfunc(attrn(&co,nobs)); %let cc = %sysfunc(close(&co)); proc sort data = Gumbel; by X; run; data Gumbel; set Gumbel; by X; F_X = _N_/(&cn+1); Q = -log(-log(F_X)); run; proc reg data = Gumbel outest=Parameter_Estimates rsquare; model X = Q; run; quit; data _NULL_; set Parameter_Estimates; call symput("Intercept",Intercept); call symput("Q",Q); run; data Gumbel; set Gumbel (keep = X); run;

/* Fit the Gumbel distribution using maximum likelihood estimation */ proc iml; xi_old =&Intercept; alpha_old =&Q; use Gumbel; read all into X_Obs;



N_obs = nrow(X_Obs);

B_Old = xi_old // alpha_old; do i=1 to 10;

/* First Order Partial Derivatives */ dl_dxi = N_Obs/alpha_old - (1/alpha_old) *sum(exp(-((X_Obs - J(N_Obs,1,xi_old)) #((J(N_obs,1,alpha_old))##(-1))))); dl_dalpha = -N_Obs/alpha_old+(1/(alpha_old**2)) *sum((X_Obs - J(N_Obs,1,xi_old))#(J(N_Obs,1,1) -exp(-((X_Obs - J(N_Obs,1,xi_old)) #((J(N_obs,1,alpha_old))##(-1))))));

Partial_1 = dl_dxi // dl_dalpha;

/* Second Order Partial Derivatives */ dl2_dxi_dxi = -(1/(alpha_old**2)) *sum(exp(-((X_Obs - J(N_Obs,1,xi_old)) #((J(N_obs,1,alpha_old))##(-1))))); dl2_dalpha_dalpha = N_Obs/alpha_old**2-(2/(alpha_old**3))

*sum((X_Obs - J(N_Obs,1,xi_old)) #(J(N_Obs,1,1)-exp(-((X_Obs - J(N_Obs,1,xi_old)) #((J(N_obs,1,alpha_old))##(-1)))))) -(1/(alpha_old**4))*sum((X_Obs - J(N_Obs,1,xi_old)) #(exp(-((X_Obs - J(N_Obs,1,xi_old)) #((J(N_obs,1,alpha_old))##(-1))))) #(X_Obs - J(N_Obs,1,xi_old))); dl2_dalpha_dxi = -N_Obs/(alpha_old**2)+(1/alpha_old**2) *sum(exp((-1/alpha_old)#(X_Obs-J(N_Obs,1,xi_old)))) -(1/alpha_old)*sum(((1/alpha_old**2) #(X_Obs-J(N_Obs,1,xi_old)))#(exp((-1/alpha_old) #(X_Obs-J(N_Obs,1,xi_old)))));

Partial_2 = (dl2_dxi_dxi || dl2_dalpha_dxi) // (dl2_dalpha_dxi || dl2_dalpha_dalpha);

B_New = B_Old-(inv(Partial_2))*(Partial_1); difference = sum(abs(B_New-B_Old)); xi_old = B_New[1];


alpha_old = B_New[2]; if difference < 0.0001 then i=10000; B_Old = B_New; end; xi_hat = xi_old; alpha_hat = alpha_old; print xi_hat alpha_hat difference; quit;

A.10 Pareto Distribution

/* ======*/ /* FITTING PARETO DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

/* Simulate n observations from a Pareto(theta,kappa) distribution */

%let theta = 9; %let kappa = 5; %let n = 100000; data Pareto; do i = 1 to &n; U = ranuni(1234); /* Random uniform(0,1) variable */ X = &theta*((1-U)**(-1/&kappa)-1); /* Using the probability integral transformation */ output; end; drop i U; run; proc univariate data = Pareto; var X; histogram;* / Pareto; inset n mean(5.3) std=’Std Dev’(5.3) skewness(5.3); output out=Moments mean=_MEAN_ std=_STDDEV_ skewness=_SKEWNESS_



kurtosis=_KURTOSIS_ N = _COUNT_; run; data Moments; set Moments; Theta_Hat_MME =_MEAN_*(_STDDEV_**2+_MEAN_**2)/ (_STDDEV_**2-_MEAN_**2); Kappa_Hat_MME =2*(_STDDEV_**2)/(_STDDEV_**2 - _MEAN_**2); run; data _NULL_; set Moments; call symput("theta_hat_MME",Theta_Hat_MME); call symput("Kappa_Hat_MME",Kappa_Hat_MME); run; proc iml; use Pareto; read all into X_Obs;

N_Obs = nrow(X_Obs); kappa_old = &Kappa_Hat_MME; theta_old = &Theta_Hat_MME;

B_Old = kappa_old // theta_old; do i = 1 to 10000;

/* First Order Partial Derivatives */ dl_dkappa = N_obs/kappa_old+N_Obs*log(theta_old) -sum(log(X_Obs+J(N_Obs,1,theta_old))); dl_dtheta = N_Obs*kappa_old/theta_old-(kappa_old+1) *sum((X_Obs+J(N_Obs,1,theta_old))##(-1));

/* Second Order Partial Derivatives */
dl2_dkappa_dkappa = -N_Obs/(kappa_old**2);
/* Cross-derivative requires the reciprocal of (X + theta) */
dl2_dtheta_dkappa = N_Obs/theta_old
   -sum((X_Obs+J(N_Obs,1,theta_old))##(-1));
dl2_dtheta_dtheta = -N_Obs*kappa_old/(theta_old**2)
   +(kappa_old+1)*sum((X_Obs+J(N_Obs,1,theta_old))##(-2));


Partial_1 = dl_dkappa // dl_dtheta; Partial_2 = (dl2_dkappa_dkappa || dl2_dtheta_dkappa) // (dl2_dtheta_dkappa || dl2_dtheta_dtheta);

B_New = B_Old-(inv(Partial_2))*Partial_1;
difference = sum(abs(B_New-B_Old));
kappa_old = B_New[1];
theta_old = B_New[2]; /* update theta from B_New, not B_Old */
B_Old = B_New;
if difference < 0.0001 then i = 10000;
end;
print difference B_New;
quit;
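The Pareto fit above combines method-of-moments starting values with a bivariate Newton-Raphson iteration. A minimal Python cross-check of the same scheme (standard library only) is sketched below; the 2x2 Newton system is solved by hand, the moment formulas match those in the SAS DATA step, and the seed and sample size are illustrative assumptions.

```python
import math
import random

# Illustrative run with the text's Pareto(theta = 9, kappa = 5).
random.seed(7)
theta_t, kappa_t, n = 9.0, 5.0, 20000
xs = [theta_t * ((1 - random.random()) ** (-1 / kappa_t) - 1) for _ in range(n)]

# Method-of-moments starting values, matching the formulas in the SAS code.
m = sum(xs) / n
s2 = sum((x - m) ** 2 for x in xs) / n
theta = m * (s2 + m * m) / (s2 - m * m)
kappa = 2 * s2 / (s2 - m * m)

# Bivariate Newton-Raphson on the log-likelihood; 2x2 system solved directly.
for _ in range(200):
    s_log = sum(math.log(x + theta) for x in xs)
    s_inv = sum(1 / (x + theta) for x in xs)
    s_inv2 = sum(1 / (x + theta) ** 2 for x in xs)
    g1 = n / kappa + n * math.log(theta) - s_log            # dl/dkappa
    g2 = n * kappa / theta - (kappa + 1) * s_inv            # dl/dtheta
    h11 = -n / kappa ** 2
    h12 = n / theta - s_inv                                 # d2l/dtheta dkappa
    h22 = -n * kappa / theta ** 2 + (kappa + 1) * s_inv2
    det = h11 * h22 - h12 * h12
    dk = (h22 * g1 - h12 * g2) / det
    dt = (h11 * g2 - h12 * g1) / det
    kappa, theta = kappa - dk, theta - dt
    if abs(dk) + abs(dt) < 1e-8:
        break
print(kappa, theta)
```

Starting from the moment estimates, which are already close to the truth for this parameterization, the Newton step converges in a few iterations.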

A.11 Generalized Pareto Distribution

/* ======*/ /* FITTING GENERALIZED PARETO DISTRIBUTION USING MAXIMUM */ /* LIKELIHOOD ESTIMATION */ /* ======*/

/* Simulate n observations from a Generalized Pareto(theta,kappa,tau) distribution */ %let theta = 18; %let kappa = 2; %let tau = 4 ; %let n = 100000; data Two_Stage_Simulation; do j=1 to &n; B = (1/&theta)*rangam(0,&kappa); X = (1/B)*rangam(0,&tau); output; end; keep X;


run;
proc univariate data = Two_Stage_Simulation;
var X;
histogram;
inset n mean(5.3) std='Std Dev'(5.3) skewness(5.3);
output out=Moments mean=_MEAN_ std=_STDDEV_ skewness=_SKEWNESS_
       kurtosis=_KURTOSIS_ N = _COUNT_;
run;
data _NULL_;
set Moments;
call symput("mean",_MEAN_);
run;
proc iml;
use Two_Stage_Simulation;
read all into X_Obs;

N_Obs = nrow(X_Obs); theta_old = 2; kappa_old = 1.5; tau_old = (kappa_old-1)*&mean/theta_old;

B_Old = theta_old // kappa_old // tau_old; do i=1 to 100000;

/* First Order Partial Derivatives */ dl_dtheta = N_Obs*kappa_old/theta_old-(kappa_old+tau_old) *sum((X_Obs+J(N_Obs,1,theta_old))##(-1)) ; dl_dkappa = N_Obs*Digamma(kappa_old+tau_old) -N_Obs*Digamma(kappa_old) +sum(log(J(N_Obs,1,theta_old) #((X_Obs+J(N_Obs,1,theta_old))##(-1)))); dl_dtau = N_Obs*Digamma(kappa_old+tau_old) -N_Obs*Digamma(tau_old) +sum(log(X_Obs#((X_Obs+J(N_Obs,1,theta_old))##(-1))));

/* Second Order Partial Derivatives */


dl2_dtheta_dtheta = -N_Obs*kappa_old/(theta_old**2)
   +(kappa_old+tau_old)*sum((X_Obs+J(N_Obs,1,theta_old))##(-2));
dl2_dkappa_dtheta = N_Obs/theta_old
   -sum((X_Obs+J(N_Obs,1,theta_old))##(-1));
dl2_dtau_dtheta = -sum((X_Obs+J(N_Obs,1,theta_old))##(-1));
dl2_dkappa_dkappa = N_Obs*trigamma(kappa_old+tau_old)
   -N_Obs*trigamma(kappa_old);
dl2_dtau_dkappa = N_Obs*trigamma(kappa_old+tau_old);
dl2_dtau_dtau = N_Obs*trigamma(kappa_old+tau_old)
   -N_Obs*trigamma(tau_old);

Partial_1 = dl_dtheta // dl_dkappa // dl_dtau; Partial_2 = (dl2_dtheta_dtheta||dl2_dkappa_dtheta||dl2_dtau_dtheta)// (dl2_dkappa_dtheta||dl2_dkappa_dkappa||dl2_dtau_dkappa)// (dl2_dtau_dtheta ||dl2_dtau_dkappa ||dl2_dtau_dtau);

B_New = B_Old - (inv(Partial_2))*Partial_1; difference = sum(abs(B_New - B_Old)); theta_old = B_New[1]; kappa_old = B_New[2]; tau_old = B_New[3]; if difference<0.00000001 then i=100000;

B_Old = B_New; print B_Old difference; end; theta_hat = theta_old; kappa_hat = kappa_old; tau_hat = tau_old; print theta_hat kappa_hat tau_hat difference; quit;


A.12 Lognormal Distribution

/* ======*/ /* FITTING LOGNORMAL DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

/* Simulate n observations from a Lognormal(mu,sigma) distribution */ %let mu = 0.8; %let sigma = 0.5; %let n = 10000; data LogNormal; do i = 1 to &n; U = &mu+&sigma*rannor(0); X = exp(U); output; end; drop i U; run;

/* Fit Lognormal distribution using MLE*/ proc iml; use LogNormal; read all into X_Obs;

N_Obs = nrow(X_Obs); mu_hat = (1/N_Obs)*sum(log(X_Obs)); sigma_hat = sqrt((1/N_Obs)*sum((log(X_Obs)- J(N_Obs,1,mu_hat))##(2))); print mu_hat sigma_hat; quit;
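Unlike the distributions requiring iterative schemes, the lognormal MLEs are available in closed form, exactly as the IML step above computes them. The same two lines carry over directly to a Python cross-check (seed and sample size are illustrative):

```python
import math
import random

# Illustrative run with the text's Lognormal(mu = 0.8, sigma = 0.5).
random.seed(1)
mu_t, sigma_t, n = 0.8, 0.5, 20000
xs = [math.exp(mu_t + sigma_t * random.gauss(0, 1)) for _ in range(n)]

# Closed-form MLEs: sample mean and (biased, 1/n) standard deviation of log X.
logs = [math.log(x) for x in xs]
mu_hat = sum(logs) / n
sigma_hat = math.sqrt(sum((v - mu_hat) ** 2 for v in logs) / n)
print(mu_hat, sigma_hat)
```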


A.13 Beta-prime Distribution

/* ======*/ /* FITTING BETA-PRIME DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

/* Simulate n observations from a Beta-prime(delta_1,delta_2) distribution */

%let delta_1 = 4;
%let delta_2 = 20;
%let n = 100000;
%let increment = 0.005;
data Two_Stage_Simulation;
do j=1 to &n;
B = 1*rangam(0,&delta_2);
X = (1/B)*rangam(0,&delta_1);
output;
end;
keep X;
run;
proc univariate data = Two_Stage_Simulation;
var X;
histogram;
inset n mean(5.3) std='Std Dev'(5.3) skewness(5.3);
output out = Moments mean=_MEAN_ std=_STDDEV_ skewness=_SKEWNESS_
       kurtosis=_KURTOSIS_ N = _COUNT_;
run;
data _NULL_;
set Moments;
call symput("Mean",_MEAN_);
run;
proc iml;
use Two_Stage_Simulation;
read all into X_Obs;


N_Obs = nrow(X_Obs); delta_1_old = 2; delta_2_old = (delta_1_old/&mean)+1;

B_Old = delta_1_old // delta_2_old; do i=1 to 1000;

/* First Order Partial Derivatives */ dl_ddelta_1 = N_Obs*Digamma(delta_1_old+delta_2_old) -N_Obs*Digamma(delta_1_old) +sum(log(X_Obs#((X_Obs+J(N_Obs,1,1))##(-1)))); dl_ddelta_2 = N_Obs*Digamma(delta_1_old+delta_2_old) -N_Obs*Digamma(delta_2_old)-sum(log(X_Obs+J(N_Obs,1,1)));

/* Second Order Partial Derivatives */ dl2_ddelta_1_ddelta_1=N_Obs*Trigamma(delta_1_old+delta_2_old) -N_Obs*Trigamma(delta_1_old); dl2_ddelta_2_ddelta_1=N_Obs*Trigamma(delta_1_old+delta_2_old); dl2_ddelta_2_ddelta_2=N_Obs*Trigamma(delta_1_old+delta_2_old)

-N_Obs*Trigamma(delta_2_old);

/*print dl2_ddelta_1_ddelta_1 dl2_ddelta_2_ddelta_1;*/ /*print dl2_ddelta_2_ddelta_2;*/

Partial_1 = dl_ddelta_1 // dl_ddelta_2; Partial_2 = (dl2_ddelta_1_ddelta_1||dl2_ddelta_2_ddelta_1) // (dl2_ddelta_2_ddelta_1||dl2_ddelta_2_ddelta_2);

B_New = B_Old - (inv(Partial_2))*Partial_1; delta_1_old = B_New[1]; delta_2_old = B_New[2]; difference = sum(abs(B_New-B_Old)); B_Old = B_New; if difference < 0.00001 then i =1000; end; delta_1_hat = delta_1_old;


delta_2_hat = delta_2_old;
print delta_1_hat delta_2_hat difference;
quit;

A.14 Birnbaum-Saunders Distribution

/* ======*/ /* FITTING BIRNBAUM-SAUNDERS DISTRIBUTION USING MAXIMUM */ /* LIKELIHOOD ESTIMATION */ /* ======*/ %let alpha = 0.3; %let beta = 2000; data Birnbaum_Saunders; do i=1 to 100000; X = ((&alpha)/2)*rannor(0); T = &beta*(1+2*X**2+2*X*sqrt(X**2+1)); output; end; drop X i; run;

proc iml; use Birnbaum_Saunders; read all into T_Obs;

N_Obs = nrow(T_Obs); alpha_old = 1; beta_old = 1;

B_Old = alpha_old // beta_old; do i = 1 to 1000;

/* First Order Partial Derivatives */ dl_dalpha = -N_Obs/alpha_old+(1/alpha_old**3) *sum((sqrt(T_Obs#J(N_Obs,1,1/beta_old)) -sqrt(J(N_Obs,1,beta_old)#(T_Obs##(-1))))##(2)); dl_dbeta = -(1/(2*alpha_old**2))*sum((T_Obs##(-1)


- T_Obs#J(N_Obs,1,beta_old**(-2)))) +(1/(2*beta_old))*sum((J(N_Obs,1,beta_old)-T_Obs) #((J(N_Obs,1,beta_old)+T_Obs)##(-1)));

/* Second Order Partial Derivatives */ dl2_dalpha_dalpha = N_Obs/alpha_old**2-(3/alpha_old**4) *sum((sqrt(T_Obs#J(N_Obs,1,1/beta_old)) -sqrt(J(N_Obs,1,beta_old)#(T_Obs##(-1))))##(2)); dl2_dbeta_dalpha = (1/(alpha_old**3))*sum(T_Obs##(-1) -T_Obs#J(N_Obs,1,beta_old**(-2))); dl2_dbeta_dbeta = -(1/alpha_old**2) *sum(T_Obs#J(N_Obs,1,beta_old**(-3))) -(1/(2*beta_old**(2)))*sum((J(N_Obs,1,beta_old)-T_Obs) #((J(N_Obs,1,beta_old)+T_Obs)##(-1))) +(1/beta_old)*sum(T_Obs#((beta_old+T_Obs)##(-2)));

Partial_1 = dl_dalpha // dl_dbeta; Partial_2 = (dl2_dalpha_dalpha || dl2_dbeta_dalpha) // (dl2_dbeta_dalpha || dl2_dbeta_dbeta);

B_New = B_Old-(inv(Partial_2))*Partial_1; alpha_old = B_New[1]; beta_old = B_New[2]; difference = sum(abs(B_New - B_Old)); B_Old = B_New; if difference < 0.00001 then i=1000; end; alpha_hat = alpha_old; beta_hat = beta_old; print alpha_hat beta_hat difference; quit;
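As a quick sanity check on the Birnbaum-Saunders fit, moment-type estimators based on the arithmetic and harmonic sample means are often used in the literature as starting values: since E[T] = beta(1 + alpha^2/2) and E[1/T] = (1 + alpha^2/2)/beta, the product of the two means recovers beta^2 and their ratio recovers (1 + alpha^2/2)^2. The Python sketch below uses this shortcut rather than the full Newton scheme; the parameters (alpha = 0.3, beta = 2 instead of the text's beta = 2000) and seed are illustrative assumptions.

```python
import math
import random

# Illustrative parameters; beta is scaled down from the text's value of 2000.
random.seed(3)
alpha_t, beta_t, n = 0.3, 2.0, 20000
ts = []
for _ in range(n):
    z = (alpha_t / 2) * random.gauss(0, 1)
    ts.append(beta_t * (1 + 2 * z * z + 2 * z * math.sqrt(z * z + 1)))

s = sum(ts) / n                   # arithmetic mean;  E[T]   = beta*(1 + alpha^2/2)
r = n / sum(1 / t for t in ts)    # harmonic mean;    E[1/T] = (1 + alpha^2/2)/beta
beta_hat = math.sqrt(s * r)
alpha_hat = math.sqrt(2 * (math.sqrt(s / r) - 1))
print(alpha_hat, beta_hat)
```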


A.15 Burr Distribution

/* ======*/ /* FITTING BURR DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

%let alpha = 50000; %let c = 2; %let k = 10; data Burr; do i = 1 to 100000; X = &alpha*((1-ranuni(0))**(-1/&k)-1)**(1/&c); output; end; drop i; run; proc iml; use Burr; read all into X_Obs;

N_Obs = nrow(X_Obs); alpha_old = 1; c_old = 1; k_old = 1;

B_Old = alpha_old // c_old // k_old; do i=1 to 1000;

/* First Order Partial Derivatives */ dl_dalpha = -N_Obs*c_old/alpha_old+(c_old*(k_old+1)/alpha_old) *sum((X_Obs##c_old)#((J(N_Obs,1,alpha_old)##c_old +X_Obs##c_old)##(-1))); dl_dc = N_Obs/c_old +sum(log(X_Obs#((J(N_Obs,1,alpha_old))##(-1)))) -(k_old+1)*sum((((1/alpha_old)#X_Obs)##c_old) #(log(((1/alpha_old)#X_Obs)))


#((J(N_Obs,1,1)+((1/alpha_old)#X_Obs)##c_old)##(-1)));
dl_dk = N_Obs/k_old-sum(log(J(N_Obs,1,1)
   +((1/alpha_old)#X_Obs)##c_old));

/* Second Order Partial Derivatives */ dl2_dalpha_dalpha = N_Obs*c_old/(alpha_old**2) -(c_old*(k_old+1)/(alpha_old**2)) *sum(((X_Obs##c_old) #((X_Obs##c_old+J(N_Obs,1,alpha_old)##c_old)##(-1))) #(J(N_Obs,1,1)+(J(N_Obs,1,c_old*alpha_old**(c_old))) #((J(N_Obs,1,alpha_old**c_old)+X_Obs##c_old)##(-1)))); dl2_dc_dalpha = -N_Obs/alpha_old+((k_old+1)/alpha_old) *sum(((X_Obs##c_old) #((X_Obs##c_old+J(N_Obs,1,alpha_old**c_old))##(-1))) #(J(N_Obs,1,1)+((c_old*alpha_old**c_old) #log((1/alpha_old)#X_Obs))#((X_Obs##c_old +J(N_Obs,1,alpha_old**c_old))##(-1)))); dl2_dk_dalpha = (c_old/alpha_old)*sum((X_Obs##c_old) #((X_Obs##c_old+J(N_Obs,1,alpha_old**c_old))##(-1))); dl2_dc_dc = -N_Obs/(c_old**2)-(k_old+1)*sum(((log((1/alpha_old)

#X_Obs))##(2))#(((1/alpha_old)#X_Obs)##c_old) #(((J(N_Obs,1,1)+((1/alpha_old)#X_Obs)##c_old)##2)##(-1))); dl2_dk_dc = -sum((X_Obs##c_old)#(log((1/alpha_old)#X_Obs)) #((X_Obs##c_old+J(N_obs,1,alpha_old**c_old))##(-1))); dl2_dk_dk = -N_Obs/(k_old**2);

Partial_1=dl_dalpha // dl_dc // dl_dk; Partial_2=(dl2_dalpha_dalpha||dl2_dc_dalpha||dl2_dk_dalpha)// (dl2_dc_dalpha ||dl2_dc_dc ||dl2_dk_dc)// (dl2_dk_dalpha ||dl2_dk_dc ||dl2_dk_dk);

B_New = B_Old - (inv(Partial_2))*Partial_1; difference = sum(abs(B_New-B_Old)); alpha_old = B_New[1]; c_old = B_New[2]; k_old = B_New[3]; B_Old = B_New;


if difference < 0.000001 then i = 1000;
end;
alpha_hat = alpha_old;
c_hat = c_old;
k_hat = k_old;
print alpha_hat c_hat k_hat difference;
quit;
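A useful spot-check on the Burr score equations: with alpha and c held fixed, the equation dl/dk = n/k - sum log(1+(x/alpha)^c) = 0 solves in closed form, because Y = log(1+(X/alpha)^c) is exactly Exponential(k) under the Burr model. The Python sketch below exercises only this conditional estimator (alpha and c are pinned at their true values, which is an illustrative simplification, as are the smaller parameter values and seed).

```python
import math
import random

# Illustrative parameters; alpha and c are held at their true values, so only
# the score equation for k is exercised.
random.seed(5)
alpha_t, c_t, k_t, n = 2.0, 2.0, 10.0, 20000
xs = [alpha_t * ((1 - random.random()) ** (-1 / k_t) - 1) ** (1 / c_t)
      for _ in range(n)]

# With alpha, c known, Y = log(1 + (X/alpha)^c) is Exponential(k), so the
# conditional MLE of k is the reciprocal of the mean of Y.
k_hat = n / sum(math.log(1 + (x / alpha_t) ** c_t) for x in xs)
print(k_hat)
```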

A.16 Dagum Distribution

/* ======*/ /* FITTING DAGUM DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

/* Simulate n observations from a Dagum(a,b,p) distribution */ %let a = 46; %let b = 372; %let p = 110;

%let n = 100000; data Dagum; do j=1 to &n; X = &b*(ranuni(0)**(-1/&p)-1)**(-1/&a); output; end; keep X; run; proc iml; use Dagum; read all into X_Obs;

N_Obs = nrow(X_Obs); a_old = 3; b_old = 5; p_old = 6;


Beta_Old = a_old // b_old // p_old; do i=1 to 10000;

/* First Order Partial Derivatives */ dl_da = N_Obs/a_old+N_Obs*log(b_old)-sum(log(X_Obs)) -(p_old+1)*sum((J(N_Obs,1,b_old**a_old) #((X_Obs##a_old+J(N_Obs,1,b_old**a_old))##(-1))) #log(b_old#(X_Obs##(-1)))); dl_db = N_Obs*a_old/b_old-(a_old*(p_old+1)/b_old) *sum((b_old**a_old)#((X_Obs##a_old +J(N_Obs,1,b_old**a_old))##(-1))); dl_dp = N_Obs/p_old-sum(log(J(N_Obs,1,1) +(b_old**a_old)#(X_Obs##(-a_old))));

/* Second Order Partial Derivatives */ dl2_da_da = -N_Obs/(a_old**2) -(p_old)*sum((b_old**a_old)#(X_Obs##a_old) #((log(b_old#X_Obs##(-1)))##2)

#((X_Obs##a_old+J(N_Obs,1,b_old**a_old))##(-2))); dl2_db_da = N_Obs/b_old-((p_old+1)/b_old)*sum(((b_old**a_old) #(X_Obs##a_old+J(N_Obs,1,b_old**a_old))##(-1)) #(J(N_Obs,1,1)+(a_old#(X_Obs##a_old) #(log(b_old#(X_Obs##(-1))))) #((X_Obs##a_old+J(N_Obs,1,b_old**a_old))##(-1)))); dl2_dp_da = -sum((b_old**a_old) #((X_Obs##a_old+J(N_Obs,1,b_old**a_old))##(-1)) #(log(b_old#(X_Obs##(-1))))); dl2_db_db = -N_Obs*a_old/(b_old**2) +(a_old*(p_old+1)/(b_old**2))*sum(((b_old**a_old) #((X_Obs##a_old+J(N_Obs,1,b_old**a_old))##(-1))) #(J(N_Obs,1,1)-a_old#(X_Obs##a_old) #((X_Obs##a_old+J(N_Obs,1,b_old**a_old))##(-1)))); dl2_dp_db = -(a_old/b_old)*sum((b_old**a_old) #((X_Obs##a_old+J(N_Obs,1,b_old**a_old))##(-1)));


dl2_dp_dp = -N_Obs/(p_old**2);

Partial_1 = dl_da // dl_db // dl_dp; Partial_2 = (dl2_da_da || dl2_db_da || dl2_dp_da) // (dl2_db_da || dl2_db_db || dl2_dp_db) // (dl2_dp_da || dl2_dp_db || dl2_dp_dp);

Beta_New = Beta_Old -(inv(Partial_2))*Partial_1; difference = sum(abs(Beta_New - Beta_Old)); a_old = Beta_New[1]; b_old = Beta_New[2]; p_old = Beta_New[3]; if i>990 then print a_old b_old p_old difference; if difference<0.00000001 then i=10000;

Beta_Old = Beta_New; end; a_hat = a_old; b_hat = b_old; p_hat = p_old; print a_hat b_hat p_hat difference; quit;
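The Dagum score equation for p admits the same conditional shortcut as the Burr case: from dl/dp = n/p - sum log(1 + (b/x)^a) = 0, and since log(1 + (b/X)^a) is Exponential(p) under the Dagum model, p has a closed-form conditional MLE when a and b are known. The sketch below (parameters, seed and sample size are illustrative; a and b are pinned at their true values) verifies this against simulated data.

```python
import math
import random

# Illustrative parameters; a and b are held at their true values, so only the
# score equation for p is exercised.
random.seed(11)
a_t, b_t, p_t, n = 3.0, 5.0, 6.0, 20000
xs = [b_t * (random.random() ** (-1 / p_t) - 1) ** (-1 / a_t) for _ in range(n)]

# From dl/dp = n/p - sum log(1 + (b/x)^a) = 0:
p_hat = n / sum(math.log(1 + (b_t / x) ** a_t) for x in xs)
print(p_hat)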

A.17 Generalized Beta Distribution of the Second Kind

/* ======*/ /* FITTING GENERALIZED BETA DISTRIBUTION OF THE SECOND KIND */ /* USING MAXIMUM LIKELIHOOD ESTIMATION */ /* ======*/

/* Simulate n observations from a GB2(a,b,q,p) distribution */ %let a = 0.8; %let b = 10;


%let q = 8; %let p = 0.5; %let n = 1000000; data Two_Stage_Simulation; do j=1 to &n; B = (1/(&b**&a))*rangam(0,&q); X = ((1/B)*rangam(0,&p))**(1/&a); output; end; keep X; run; proc iml; use Two_Stage_Simulation; read all into X_Obs;

N_Obs = nrow(X_Obs); a_old = 1; b_old = 2; q_old = 2; p_old = 1;

Beta_Old = a_old // b_old // q_old // p_old; do i=1 to 10000;

Term_1 = log((1/b_old)#X_Obs); Term_2 = X_Obs##a_old+J(N_Obs,1,b_old**a_old);

/* First Order Partial Derivatives */ dl_da = N_Obs/a_old-q_old*sum(Term_1)+(p_old+q_old) *sum((b_old**a_old)#(Term_1)#((Term_2)##(-1))); dl_db = (N_obs*a_old*q_old)/b_old-(p_old+q_old) *(a_old*b_old**(a_old-1))*sum(Term_2##(-1)); dl_dq = N_Obs*a_old*log(b_old)+N_Obs*Digamma(p_old+q_old) -N_Obs*Digamma(q_old)-sum(log(Term_2)); dl_dp = N_Obs*Digamma(p_old+q_old)-N_Obs*Digamma(p_old) +a_old*sum(log(X_Obs))-sum(log(Term_2));


/* Second Order Partial Derivatives */ dl2_da_da = -N_Obs/a_old**2-(p_old+q_old)*(b_old**a_old) *sum((X_Obs##a_old)#(Term_1##2)#((Term_2)##(-2))); dl2_db_da = N_Obs*q_old/b_old-(p_old+q_old) *sum((J(N_Obs,1,b_old**(2*a_old-1))+(b_old**(a_old-1)) #X_Obs##a_old-(a_old*(b_old**(a_old-1))) #(X_Obs##a_old)#Term_1)#(Term_2##(-2))); dl2_dq_da = N_Obs*log(b_old) -sum((J(N_Obs,1,(b_old**a_old)*log(b_old))+(X_Obs##a_old) #log(X_Obs))#(Term_2##(-1))); dl2_dp_da = sum(log(X_Obs)) -sum((J(N_Obs,1,(b_old**a_old)*log(b_old)) +(X_Obs##a_old)#log(X_Obs))#(Term_2##(-1))); dl2_db_db = -N_Obs*a_old*q_old/(b_old**2) -(a_old**2)*(b_old**(a_old-2)) *(p_old+q_old)*sum((X_Obs##a_old)#(Term_2##(-2))) +a_old*(b_old**(a_old-2)) *(p_old+q_old)*sum(Term_2##(-1)); dl2_dq_db = N_Obs*a_old/b_old-(a_old*(b_old**(a_old-1))) *sum(Term_2##(-1)); dl2_dp_db = -(a_old*(b_old**(a_old-1)))*sum(Term_2##(-1));

dl2_dq_dq = N_Obs*(Trigamma(p_old+q_old)-Trigamma(q_old)); dl2_dp_dq = N_Obs*Trigamma(p_old+q_old); dl2_dp_dp = N_Obs*(Trigamma(p_old+q_old)-Trigamma(p_old));

Partial_1=dl_da // dl_db // dl_dq // dl_dp; Partial_2=(dl2_da_da||dl2_db_da||dl2_dq_da||dl2_dp_da)// (dl2_db_da||dl2_db_db||dl2_dq_db||dl2_dp_db)// (dl2_dq_da||dl2_dq_db||dl2_dq_dq||dl2_dp_dq)// (dl2_dp_da||dl2_dp_db||dl2_dp_dq||dl2_dp_dp);

Beta_New = Beta_Old - (inv(Partial_2))*Partial_1; difference = sum(abs(Beta_New - Beta_Old)); a_old = Beta_New[1]; b_old = Beta_New[2]; q_old = Beta_New[3]; p_old = Beta_New[4]; if difference<0.00000001 then i=10000;


Beta_Old = Beta_New; end; a_hat = a_old; b_hat = b_old; q_hat = q_old; p_hat = p_old; print a_hat b_hat q_hat p_hat difference; quit;

A.18 Kappa Family of Distributions - Three-parameter Case

/* ======*/ /* FITTING THREE-PARAMETER KAPPA DISTRIBUTION USING MAXIMUM */ /* LIKELIHOOD ESTIMATION */ /* ======*/

/* Simulate n observations from a Three-parameter Kappa(alpha,beta,theta) distribution */
%let alpha = 4;
%let beta = 15;
%let theta = 3;
%let n = 10000;
data Kappa_Three_Par;
do j=1 to &n;
X = &beta*((ranuni(0))**(-1/&alpha)-1)**(-1/&theta);
output;
end;
keep X;
run;
proc iml;
use Kappa_Three_Par;
read all into X_Obs;

N_Obs = nrow(X_Obs);


alpha_old = 2; beta_old = 5; theta_old = 2;

B_Old = alpha_old // beta_old // theta_old; do i=1 to 10000;

Term_1 = (X_Obs##theta_old)#((X_Obs##theta_old +J(N_Obs,1,beta_old**theta_old))##(-1)); Term_2 = (1/beta_old)#X_Obs;

/* First Order Partial Derivatives */ dl_dalpha = N_Obs/alpha_old-N_Obs*theta_old*log(beta_old) +theta_old*sum(log(X_Obs))-sum(log(J(N_Obs,1,1) +Term_2##theta_old)); dl_dbeta = -N_Obs*alpha_old*theta_old/beta_old +(theta_old*(alpha_old+1)/beta_old)*sum(Term_1); dl_dtheta = N_Obs/theta_old-N_Obs*alpha_old*log(beta_old) +alpha_old*sum(log(X_Obs))-(alpha_old+1) *sum(Term_1#log(Term_2));

/* Second Order Partial Derivatives */ dl2_dalpha_dalpha = -N_Obs/alpha_old**2; dl2_dbeta_dalpha = -N_Obs*theta_old/beta_old +(theta_old/beta_old)*sum(Term_1); dl2_dtheta_dalpha = -N_Obs*log(beta_old)+sum(log(X_Obs)) -sum(Term_1#log(Term_2)); dl2_dbeta_dbeta = (N_Obs*alpha_old*theta_old)/(beta_old**2) -((theta_old*(alpha_old+1))/(beta_old**2))*sum((Term_1) #((J(N_Obs,1,1)+(theta_old*beta_old**theta_old) #((J(N_Obs,1,beta_old**theta_old) +X_Obs##theta_old)##(-1))))); dl2_dtheta_dbeta = -N_Obs*alpha_old/beta_old +((alpha_old+1)/beta_old) *sum(Term_1#(J(N_Obs,1,1) +(theta_old*beta_old**theta_old) #log((1/beta_old)#X_Obs)#((X_Obs##theta_old +J(N_Obs,1,beta_old**theta_old))##(-1))));


dl2_dtheta_dtheta = -N_Obs/theta_old**2-(alpha_old+1) *sum(Term_1#((beta_old**theta_old) #((X_Obs##theta_old +J(N_Obs,1,beta_old**theta_old))##(-1))) #((log(Term_2))##2));

Partial_1 = dl_dalpha // dl_dbeta // dl_dtheta; Partial_2 = (dl2_dalpha_dalpha||dl2_dbeta_dalpha||dl2_dtheta_dalpha)// (dl2_dbeta_dalpha ||dl2_dbeta_dbeta ||dl2_dtheta_dbeta )// (dl2_dtheta_dalpha||dl2_dtheta_dbeta||dl2_dtheta_dtheta);

B_New = B_Old - (inv(Partial_2))*Partial_1; difference = sum(abs(B_New - B_Old)); alpha_old = B_New[1]; beta_old = B_New[2]; theta_old = B_New[3]; if difference<0.000001 then i=10000;

B_Old = B_New; end;

alpha_hat = alpha_old; beta_hat = beta_old; theta_hat = theta_old; print alpha_hat beta_hat theta_hat difference; quit;

A.19 Log-logistic Distribution

/* ======*/ /* FITTING LOG_LOGISTIC DISTRIBUTION USING MAXIMUM */ /* LIKELIHOOD ESTIMATION */ /* ======*/

/* Simulate n observations from a Log-logistic(alpha,beta) distribution */ %let alpha = 9697;


%let beta = 18; %let n = 100000; data Log_Logistic; do j=1 to &n; X = &alpha*((ranuni(0))**(-1)-1)**(-1/&beta); output; end; keep X; run;

/* Find initial values for alpha and beta using a regression approach */ proc sort data = Log_Logistic; by X; run;

%let co = %sysfunc(open(Log_Logistic)); %let cn = %sysfunc(attrn(&co,nobs)); %let cc = %sysfunc(close(&co));

%put &cn;

data Log_Logistic_Regression; set Log_Logistic; by X; F_X = _N_/(&cn+1); Y = log(X); X = -log(F_X**(-1)-1); run; proc reg data = Log_Logistic_Regression noprint outest=Parameter_Estimates rsquare; model Y = X; output out=Test; run; quit; data _NULL_; set Parameter_Estimates; call symput("alpha_initial",exp(Intercept)); call symput("beta_initial",1/X); run;


/* Maximum Likelihood Estimation */ proc iml; use Log_Logistic; read all into X_Obs;

N_Obs = nrow(X_Obs); alpha_old = &alpha_initial; beta_old = &beta_initial;

B_Old = alpha_old // beta_old; do i=1 to 1000;

Term_1 = (X_Obs##beta_old)#((X_Obs##beta_old +J(N_Obs,1,alpha_old**beta_old))##(-1)); Term_2 = (J(N_Obs,1,alpha_old**beta_old)) #((X_Obs##beta_old +J(N_Obs,1,alpha_old**beta_old))##(-1)); Term_3 = log((1/alpha_old)#X_Obs);

/* First Order Partial Derivatives */ dl_dalpha = -(N_Obs*beta_old/alpha_old)+(2*beta_old/alpha_old) *sum(Term_1); dl_dbeta = (N_Obs/beta_old)+sum(Term_3)-2*sum(Term_1#Term_3);

/* Second Order Partial Derivatives */ dl2_dalpha_dalpha = (N_Obs*beta_old/(alpha_old**2)) -(2*beta_old/(alpha_old**2))*sum(Term_1) -(2*(beta_old**2)/(alpha_old**2)) *sum(Term_1#Term_2); dl2_dbeta_dalpha = -N_Obs/alpha_old+(2/alpha_old)*sum(Term_1) +(2*beta_old/alpha_old) *sum(Term_1#Term_2#Term_3); dl2_dbeta_dbeta = -N_Obs/(beta_old**2) -2*sum(Term_3#Term_3#Term_1#Term_2);


Partial_1 = dl_dalpha // dl_dbeta; Partial_2 = (dl2_dalpha_dalpha || dl2_dbeta_dalpha) // (dl2_dbeta_dalpha || dl2_dbeta_dbeta ) ;

B_New = B_Old - (inv(Partial_2))*Partial_1; difference = sum(abs(B_New - B_Old)); alpha_old = B_New[1]; beta_old = B_New[2]; if difference<0.00000001 then i=1000;

B_Old = B_New; end; alpha_hat = alpha_old; beta_hat = beta_old; print alpha_hat beta_hat difference; quit;
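The regression-based initialization used above is worth isolating: since F(x) = (1+(x/alpha)^(-beta))^(-1) implies log x = log alpha + (1/beta)*logit(F), regressing the log order statistics on the logit of the plotting positions i/(n+1) yields log alpha as intercept and 1/beta as slope. The Python sketch below reproduces that step with a hand-rolled least-squares fit (seed and sample size are illustrative assumptions).

```python
import math
import random

# Illustrative run of the regression-based initialization with the text's
# parameter values (alpha = 9697, beta = 18).
random.seed(13)
alpha_t, beta_t, n = 9697.0, 18.0, 20000
xs = sorted(alpha_t * (1 / random.random() - 1) ** (-1 / beta_t)
            for _ in range(n))

# log x = log alpha + (1/beta) * logit(F); plotting positions F_i = i/(n+1).
ys = [math.log(x) for x in xs]
zs = [-math.log((n + 1) / i - 1) for i in range(1, n + 1)]

zbar = sum(zs) / n
ybar = sum(ys) / n
slope = (sum((z - zbar) * (y - ybar) for z, y in zip(zs, ys))
         / sum((z - zbar) ** 2 for z in zs))
alpha_hat = math.exp(ybar - slope * zbar)
beta_hat = 1 / slope
print(alpha_hat, beta_hat)
```

These regression estimates are close enough to the truth to serve as Newton starting values, which is exactly the role they play in the SAS program.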

A.20 Folded Normal Distribution

/* ======*/ /* FITTING FOLDED NORMAL DISTRIBUTION USING MAXIMUM */ /* LIKELIHOOD ESTIMATION */ /* ======*/

%let mu = 3354; %let sigma = 7678; data Folded_Normal; do i = 1 to 100000; X = abs(&mu+&sigma*rannor(0)); output; end; drop i; run; proc univariate data = Folded_Normal; var X;


histogram;
output out=Moments Mean = _MEAN_ std = _STDEV_ skewness=_SKEW_
       kurtosis = _KURTOSIS_;
run;
data _NULL_;
set Moments;
sigma_MME = _MEAN_*sqrt(constant('PI')/2);
call symput ("MEAN",_MEAN_);
call symput ("STDEV",_STDEV_);
run;
proc iml;
use Folded_Normal;
read all into X_Obs;

N_Obs = nrow(X_Obs); mu_old = &MEAN; sigma_old = &STDEV;

B_Old = mu_old // sigma_old;

do i = 1 to 1000;

Term_1 = 2#(X_Obs#J(N_Obs,1,mu_old)) #((J(N_Obs,1,sigma_old**2))##(-1)); Term_2 = (X_Obs-J(N_Obs,1,mu_old)) #(J(N_Obs,1,sigma_old**(-2))); Term_3 = ((X_Obs-J(N_Obs,1,mu_old))##2) #(J(N_Obs,1,sigma_old**(-3))); Term_4 = (X_Obs#J(N_Obs,1,mu_old)) #(J(N_Obs,1,sigma_old**(-3)));

/* First Order Partial Derivatives */ dl_dmu = sum(Term_2)-2*sum((X_Obs#J(N_Obs,1,sigma_old**(-2))) #((exp(-Term_1))#((J(N_Obs,1,1) +exp(-Term_1))##(-1)))); dl_dsigma = -N_Obs/sigma_old+sum(Term_3) +4*sum(Term_4#((exp(-Term_1))


#((J(N_Obs,1,1)+exp(-Term_1))##(-1))));

/* Second Order Partial Derivatives */ dl2_dmu_dmu = -N_Obs/sigma_old**2 +4*sum(((X_Obs#J(N_Obs,1,sigma_old**(-2)))##(2)) #((exp(Term_1))#((J(N_Obs,1,1)+exp(Term_1))##(-2)))); dl2_dsigma_dmu = -2*sum((X_Obs-J(N_Obs,1,mu_old)) #(J(N_Obs,1,sigma_old**(-3)))) +4*sum(X_Obs#J(N_Obs,1,sigma_old**(-3)) #((J(N_Obs,1,1)+exp(Term_1))##(-1))) -8*sum((X_Obs##2)#J(N_Obs,1,mu_old) #(J(N_Obs,1,sigma_old**(-5)))#((exp(Term_1)) #((J(N_Obs,1,1)+exp(Term_1))##(-2)))); dl2_dsigma_dsigma = N_Obs/sigma_old**2-3*sum(Term_2##2) -12*sum((1/sigma_old)#Term_4#((exp(Term_1) +J(N_Obs,1,1))##(-1)))+(16*mu_old**2/sigma_old**6) *sum(X_Obs#X_Obs#exp(Term_1)#((exp(Term_1) +J(N_Obs,1,1))##(-2)));

Partial_1 = dl_dmu // dl_dsigma; Partial_2 = (dl2_dmu_dmu || dl2_dsigma_dmu) //

(dl2_dsigma_dmu || dl2_dsigma_dsigma);

B_New = B_Old - (inv(Partial_2))*Partial_1; mu_old = B_New[1]; sigma_old = B_New[2]; difference = sum(abs(B_New-B_Old)); if difference < 0.00001 then i = 1000;

B_Old = B_New; end; mu_hat = mu_old; sigma_hat = sigma_old; print mu_hat sigma_hat difference; quit;


A.21 Inverse Gamma Distribution

/* ======*/ /* FITTING INVERSE GAMMA DISTRIBUTION USING MAXIMUM */ /* LIKELIHOOD ESTIMATION */ /* ======*/

/* Simulate n observations from an Inverse Gamma(theta,alpha) distribution */

%let theta = 6750;
%let alpha = 4;
%let n = 100000;
data Inverse_Gamma;
do i = 1 to &n;
X = ((1/&theta)*rangam(0,&alpha))**(-1);
output;
end;
drop i;
run;
proc univariate data = Inverse_Gamma;
var X;
histogram;
inset n mean(5.3) std='Std Dev'(5.3) skewness(5.3);
output out=Moments mean=_MEAN_ std=_STDDEV_ skewness=_SKEWNESS_
       kurtosis=_KURTOSIS_;
run;
data _NULL_;
set Moments;
call symput("Mean",_MEAN_);
run;

/* Fit the Inverse Gamma distribution using maximum likelihood estimation */ proc iml; use Inverse_Gamma; read all into X_Obs;


N_Obs = nrow(X_Obs); theta_old = &Mean; alpha_old = 1;

B_Old = alpha_old // theta_old; do i=1 to 10000;

/* First Order Partial Derivatives */ dl_dalpha = N_Obs*log(theta_old)-N_Obs*Digamma(alpha_old) -sum(log(X_Obs)); dl_dtheta = N_Obs*alpha_old/theta_old-sum(X_Obs##(-1));

/* Second Order Partial Derivatives */ dl2_dalpha_dalpha = -N_Obs*Trigamma(alpha_old); dl2_dtheta_dalpha = N_Obs/theta_old; dl2_dtheta_dtheta = -N_Obs*alpha_old/(theta_old**2);

/*print dl2_dalpha_dalpha dl2_dtheta_dalpha;*/

/*print dl2_dtheta_dtheta;*/

Partial_1 = dl_dalpha // dl_dtheta; Partial_2 = (dl2_dalpha_dalpha || dl2_dtheta_dalpha) // (dl2_dtheta_dalpha || dl2_dtheta_dtheta);

B_New = B_Old - (inv(Partial_2))*Partial_1; difference = sum(abs(B_New - B_Old)); alpha_old = B_New[1]; theta_old = B_New[2]; if difference<0.000001 then i=10000;

B_Old = B_New; end; alpha_hat = alpha_old; theta_hat = theta_old;


print alpha_hat theta_hat difference;
quit;
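For the inverse gamma, moment matching also gives simple closed-form starting values: mean = theta/(alpha-1) and variance = theta^2/((alpha-1)^2 (alpha-2)) imply mean^2/variance = alpha - 2. The Python sketch below checks this on simulated data; note the parameter values (theta = 30, alpha = 6, chosen so that fourth moments exist and the sample variance is stable) and seed are illustrative assumptions, not the values used in the text.

```python
import random

# Moment-matching values for the inverse gamma; parameters here are
# illustrative, not those used in the text.
random.seed(17)
theta_t, alpha_t, n = 30.0, 6.0, 50000
xs = [theta_t / random.gammavariate(alpha_t, 1.0) for _ in range(n)]

# mean = theta/(alpha-1), var = theta^2/((alpha-1)^2 (alpha-2)),
# hence mean^2/var = alpha - 2.
m = sum(xs) / n
s2 = sum((x - m) ** 2 for x in xs) / n
alpha_hat = m * m / s2 + 2
theta_hat = m * (alpha_hat - 1)
print(alpha_hat, theta_hat)
```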

A.22 Inverse Chi-square Distribution

/* ======*/ /* FITTING INVERSE CHI-SQUARE DISTRIBUTION USING MAXIMUM */ /* LIKELIHOOD ESTIMATION */ /* ======*/

/* Simulate n observations from an Inverse Chi-square(nu) distribution */
%let nu = 30;
%let n = 100000;
data Inverse_Chi_Square;
do i = 1 to &n;
X = ((rangam(0,&nu/2))**(-1))/2;
output;
end;
drop i;
run;

/* Fit the Inverse Chi-square distribution using maximum likelihood estimation */
proc iml;
use Inverse_Chi_Square;
read all into X_Obs;

N_Obs = nrow(X_Obs); nu_old = 1; do i=1 to 1000; nu_new = nu_old-((N_Obs/2)*log(2)+(N_Obs/2)*Digamma(nu_old/2) +(1/2)*sum(log(X_Obs)))/((N_Obs/4) *Trigamma(nu_old/2)); difference = sum(abs(nu_new - nu_old));


if difference<0.000001 then i=1000; nu_old = nu_new; end; nu_hat = nu_old; print nu_hat difference; quit;
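The scalar Newton-Raphson update above can be replicated in Python. The standard library has no digamma or trigamma, so this sketch includes small recurrence-plus-asymptotic-series helpers for them (an assumption of the sketch; SAS provides both as built-ins). The seed and sample size are likewise illustrative.

```python
import math
import random

# Scalar Newton-Raphson for the inverse chi-square fit, mirroring the SAS update.
random.seed(19)
nu_t, n = 30.0, 20000
xs = [1 / (2 * random.gammavariate(nu_t / 2, 1.0)) for _ in range(n)]

def digamma(x):
    r = 0.0
    while x < 6:            # shift upward with psi(x) = psi(x+1) - 1/x
        r -= 1 / x
        x += 1
    return (r + math.log(x) - 1 / (2 * x) - 1 / (12 * x ** 2)
            + 1 / (120 * x ** 4) - 1 / (252 * x ** 6))

def trigamma(x):
    r = 0.0
    while x < 6:            # psi'(x) = psi'(x+1) + 1/x^2
        r += 1 / x ** 2
        x += 1
    return (r + 1 / x + 1 / (2 * x ** 2) + 1 / (6 * x ** 3)
            - 1 / (30 * x ** 5) + 1 / (42 * x ** 7))

slog = sum(math.log(x) for x in xs)
nu = 1.0
for _ in range(200):
    # Score and second derivative of the inverse chi-square log-likelihood in nu.
    grad = -(n / 2) * math.log(2) - (n / 2) * digamma(nu / 2) - slog / 2
    hess = -(n / 4) * trigamma(nu / 2)
    step = grad / hess
    nu -= step
    if abs(step) < 1e-8:
        break
print(nu)
```

Because the score is convex and decreasing in nu, Newton from below approaches the root monotonically, so even the crude start nu = 1 converges.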

A.23 Loggamma Distribution

/* ======*/ /* FITTING LOGGAMMA DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

/* Simulate n observations from a Loggamma(lambda,a) distribution */ %let lambda = 67;

%let a = 465;
%let n = 100000;
data Loggamma;
do i = 1 to &n;
Y = (1/&lambda)*rangam(0,&a);
X = exp(Y);
output;
end;
drop i Y;
run;
proc univariate data = Loggamma;
var X;
histogram;
inset n mean(5.3) std='Std Dev'(5.3) skewness(5.3);
output out=Moments mean=_MEAN_ std=_STDDEV_ skewness=_SKEWNESS_
       kurtosis=_KURTOSIS_;
run;


data _NULL_;
set Moments;
call symput("Mean",_MEAN_);
run;

/* Fit the Loggamma distribution using maximum likelihood estimation */
proc iml;
use Loggamma;
read all into X_Obs;
N_Obs = nrow(X_Obs);
a_old = 100;
lambda_old = (1-&mean**(-1/a_old))**(-1);

B_Old = a_old // lambda_old; do i=1 to 10000;

/* First Order Partial Derivatives */ dl_da = N_Obs*log(lambda_old)-N_Obs*Digamma(a_old)

+sum(log(log(X_Obs))); dl_dlambda = N_Obs*a_old/lambda_old-sum(log(X_Obs));

/* Second Order Partial Derivatives */ dl2_da_da = -N_Obs*Trigamma(a_old); dl2_dlambda_da = N_Obs/lambda_old; dl2_dlambda_dlambda = -N_Obs*a_old/(lambda_old**2);

Partial_1 = dl_da // dl_dlambda; Partial_2 = (dl2_da_da || dl2_dlambda_da) // (dl2_dlambda_da || dl2_dlambda_dlambda);

B_New = B_Old - (inv(Partial_2))*Partial_1; difference = sum(abs(B_New-B_Old)); a_old = B_New[1]; lambda_old = B_New[2];


if difference < 0.000001 then i=10000;

B_Old = B_New; end; a_hat = a_old; lambda_hat = lambda_old; print a_hat lambda_hat difference; quit;

A.24 Snedecor’s F Distribution

/* ======*/ /* FITTING SNEDECOR F DISTRIBUTION USING MAXIMUM LIKELIHOOD */ /* ESTIMATION */ /* ======*/

/* Simulate n observations from a F(nu1,nu2) distribution */ %let nu1 = 8; %let nu2 = 10;

%let n = 100000;
data Snedecor_F;
do i = 1 to &n;
X = FINV(ranuni(0),&nu1,&nu2);
output;
end;
drop i;
run;
proc univariate data = Snedecor_F;
var X;
histogram;
inset n mean(5.3) std='Std Dev'(5.3) skewness(5.3);
output out=Moments mean=_MEAN_ std=_STDDEV_ skewness=_SKEWNESS_
       kurtosis=_KURTOSIS_ N = _COUNT_;
run;
data Moments;
set Moments;


nu_1_MME = (2*_MEAN_**2)/((_STDDEV_**2+_MEAN_**2)*(2-_MEAN_)-_MEAN_**2);
nu_2_MME = 2*_MEAN_/(_MEAN_-1);
call symput("nu_1_MME",nu_1_MME);
call symput("nu_2_MME",nu_2_MME);
run;
proc iml;
use Snedecor_F;
read all into X_Obs;

N_Obs = nrow(X_Obs); nu_1_old = &nu_1_MME; nu_2_old = &nu_2_MME;

B_Old = nu_1_old // nu_2_old; do i=1 to 100;

Term_1 = J(N_Obs,1,1)+(nu_1_old/nu_2_old)#X_Obs; Term_2 = X_Obs#((J(N_Obs,1,nu_2_old)+nu_1_old#X_Obs)##(-1)); Term_3 = X_Obs#((J(N_Obs,1,nu_2_old)+nu_1_old#X_Obs)##(-2));

/* First Order Partial Derivatives */ dl_dnu1 = (N_Obs/2)*Digamma((nu_1_old+nu_2_old)/2)-(N_Obs/2) *Digamma(nu_1_old/2)+(N_Obs/2)*log(nu_1_old/nu_2_old) +(N_Obs/2)+0.5*sum(log(X_Obs))-0.5*sum(log(Term_1)) -((nu_1_old+nu_2_old)/2)*sum(Term_2); dl_dnu2 = (N_Obs/2)*Digamma((nu_1_old+nu_2_old)/2) -(N_Obs/2)*Digamma(nu_2_old/2) -((N_Obs*nu_1_old)/(2*nu_2_old))-0.5*sum(log(Term_1)) +((nu_1_old+nu_2_old)/2)*sum((nu_1_old/nu_2_old)#Term_2);

/* Second Order Partial Derivatives */ dl2_dnu1_dnu1 = (N_Obs/4)*Trigamma((nu_1_old+nu_2_old)/2) -(N_Obs/4)*Trigamma(nu_1_old/2) +(N_Obs/(2*nu_1_old)) -sum(Term_2)+((nu_1_old+nu_2_old)/2)*sum(Term_2##2);


dl2_dnu2_dnu1 = (N_Obs/4)*Trigamma((nu_1_old+nu_2_old)/2)
   -(N_Obs/(2*nu_2_old))
   -0.5*((nu_2_old-nu_1_old)/nu_2_old)*sum(Term_2)
   +((nu_1_old+nu_2_old)/2)*sum(Term_3);
dl2_dnu2_dnu2 = (N_Obs/4)*Trigamma((nu_1_old+nu_2_old)/2)
   -(N_Obs/4)*Trigamma(nu_2_old/2)
   +((N_Obs*nu_1_old)/(2*nu_2_old**2))
   +(nu_1_old/nu_2_old)*sum(Term_2)
   -((nu_1_old+nu_2_old)/2)*(nu_1_old/(nu_2_old**2))
    *sum(Term_3#(J(N_Obs,1,2*nu_2_old)+nu_1_old#X_Obs));

Partial_1 = dl_dnu1 // dl_dnu2; Partial_2 = (dl2_dnu1_dnu1 || dl2_dnu2_dnu1) // (dl2_dnu2_dnu1 || dl2_dnu2_dnu2);

B_New = B_Old - 0.5*(inv(Partial_2))*Partial_1; difference = sum(abs(B_New-B_Old)); nu_1_old = B_New[1]; nu_2_old = B_New[2]; if difference < 0.000001 then i = 1000;

B_Old = B_New;

end; nu_1_hat = nu_1_old; nu_2_hat = nu_2_old; print nu_1_hat nu_2_hat difference; quit;

A.25 Inverse Gaussian Distribution

/* ======*/ /* FITTING INVERSE GAUSSIAN DISTRIBUTION USING MAXIMUM */ /* LIKELIHOOD ESTIMATION */ /* ======*/

/* Simulate n observations from an Inverse Gaussian(mu,lambda) distribution */


%let mu = 4;
%let lambda = 10;
%let n = 10000;
%let increment = 0.0001;
data Real_Line_Partition;
do i = 1 to 1000001;
X = &increment*(i-1);
output;
end;
run;
data Inverse_Gaussian_CDF;
set Real_Line_Partition;
if X=0 then F_X=0;
else F_X = CDF('NORMAL',sqrt(&lambda/X)*((X/&mu)-1))
   +exp(2*(&lambda/&mu))*CDF('NORMAL',-sqrt(&lambda/X)*((X/&mu)+1));
run;
data Random_Numbers;
do i=1 to &n;
u=ranuni(0);
output;
end;
run;
proc sql;
create table Random_Inverse_Gaussian as
select a.U, max(b.X) as X
from Random_Numbers as a left join Inverse_Gaussian_CDF as b
on a.U>=b.F_X
group by U;
quit;
proc univariate data = Random_Inverse_Gaussian;
var X;
histogram;
inset n mean(5.3) std='Std Dev'(5.3) skewness(5.3);
output out=Moments mean=_MEAN_ std=_STDDEV_ skewness=_SKEWNESS_
       kurtosis=_KURTOSIS_ N = _COUNT_;
run;


data Inverse_Gaussian; set Random_Inverse_Gaussian (keep = X); run; data Moments; set Moments; mu_MME = _MEAN_; lambda_MME = (_MEAN_**3)/(_STDDEV_**2); call symput("mu_MME",mu_MME); call symput("lambda_MME",lambda_MME); run; proc iml; use Inverse_Gaussian; read all into X_Obs; N_Obs = nrow(X_Obs); mu_old = &mu_MME; lambda_old = &lambda_MME;

B_Old = mu_old // lambda_old;

do i=1 to 1000;

Term_1 = ((J(N_Obs,1,mu_old)-X_Obs)##2)#((X_Obs)##(-1));
Term_2 = ((J(N_Obs,1,mu_old)-X_Obs))#((X_Obs)##(-1));
dl_dmu = (lambda_old/mu_old**3)*sum(Term_1)
       - (lambda_old/mu_old**2)*sum(Term_2);
dl_dlambda = N_Obs/(2*lambda_old)
           - (1/(2*mu_old**2))*sum(Term_1);
dl2_dmu_dmu = -(3*lambda_old/mu_old**4)*sum(Term_1)
            + (4*lambda_old/mu_old**3)*sum(Term_2)
            - (lambda_old/mu_old**2)*sum(X_Obs##(-1));
dl2_dlambda_dmu = (mu_old**(-3))*sum(Term_1)
                - (mu_old**(-2))*sum(Term_2);
dl2_dlambda_dlambda = -N_Obs/(2*lambda_old**2);


Partial_1 = dl_dmu // dl_dlambda;
Partial_2 = (dl2_dmu_dmu || dl2_dlambda_dmu) //
            (dl2_dlambda_dmu || dl2_dlambda_dlambda);

B_New = B_Old - (inv(Partial_2))*Partial_1;
difference = sum(abs(B_New - B_Old));
mu_old = B_New[1];
lambda_old = B_New[2];
if difference < 0.00001 then i = 1000;

B_Old = B_New;
end;
mu_hat = mu_old;
lambda_hat = lambda_old;
print mu_hat lambda_hat difference;
quit;
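For the Inverse Gaussian the likelihood equations can in fact be solved in closed form (mu-hat is the sample mean and 1/lambda-hat is the mean of 1/x_i - 1/mu-hat), which gives an independent check on the Newton-Raphson output above. A Python sketch, not the thesis's SAS routine; the parameter values and seed are assumptions of the example:

```python
import numpy as np

rng = np.random.default_rng(2)
mu_true, lam_true = 4.0, 10.0
# numpy's Wald sampler is the Inverse Gaussian(mu, lambda) distribution
x = rng.wald(mu_true, lam_true, size=200000)

# Closed-form maximum likelihood estimates
mu_hat = x.mean()
lam_hat = len(x) / np.sum(1.0 / x - 1.0 / mu_hat)
```

These closed-form values also make good initial values for iterative schemes in more general models.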

A.26 Skew-normal Distribution

/* ========================================================== */
/* FITTING AZZALINI SKEW-NORMAL DISTRIBUTION USING MAXIMUM    */
/* LIKELIHOOD ESTIMATION                                      */
/* ========================================================== */
%let lambda = 2;

%let xi = 7; %let eta = 5;

%let mu = 45; %let sigma = 234;

%let min = -10; %let max = 10; %let increment = 0.0005;

%let n = 3000;

/* Simulate observations from a Skew-normal distribution and
   apply parameterizations */
data Skew_Normal;
   retain cap_h_X 0;
   retain pre_h_X 0;
   do i = 1 to (&max-&min)/&increment;
      X = &min+(i)*&increment;
      h_X = 2*(CDF('NORMAL',&lambda*X,0,1))*(PDF('NORMAL',X,0,1));
      cap_h_X = cap_h_X + 0.5*(pre_h_X+h_X)*&increment;
      output;
      pre_h_X = h_X;
   end;
   drop i;
run;

data Random_Numbers;
   do i = 1 to &n;
      U = ranuni(0);
      output;
   end;
run;

proc sql;
   create table Azzalini_Skew_Normal as
   select unique(a.U) as U, max(b.X) as X
   from Random_Numbers as a
   left join Skew_Normal as b
      on a.U >= b.cap_h_X
   group by U;
quit;

data Azzalini_Skew_Normal;
   set Azzalini_Skew_Normal;
   b = sqrt(2/constant('PI'));
   delta = &lambda/sqrt(1+&lambda**2);
   Y_Direct = &xi+&eta*X;
   Y_Centered = &mu+&sigma*((X-b*delta)
                /(sqrt(1-(b**2)*(delta**2))));
   drop b delta;
run;

/* Assess the simulated distributions */
proc univariate data = Azzalini_Skew_Normal;
   var X;
   histogram;
   output out=Moments_X mean=_MEAN_ std=_STDEV_
          skewness=_SKEW_ kurtosis=_KURTOSIS_;
run;

proc univariate data = Azzalini_Skew_Normal;
   var Y_Direct;
   histogram;
   output out=Moments_Y_Direct mean=_MEAN_ std=_STDEV_
          skewness=_SKEW_ kurtosis=_KURTOSIS_;
run;

proc univariate data = Azzalini_Skew_Normal;
   var Y_Centered;
   histogram;
   output out=Moments_Y_Centered mean=_MEAN_ std=_STDEV_
          skewness=_SKEW_ kurtosis=_KURTOSIS_;
run;

/* METHOD-OF-MOMENTS ESTIMATION */
/* Standard Parameterization */
data Moments_X;
   set Moments_X;
   b = sqrt(2/CONSTANT('PI'));
   delta_hat = sqrt((_SKEW_**(2/3))/((b**2)*(_SKEW_**(2/3))
               +(b**(2/3))*(2*b**2-1)**(2/3)));
   lambda_hat = delta_hat/sqrt(1-delta_hat**2);
run;

/* Direct Parameterization */
data Moments_Y_Direct;
   set Moments_Y_Direct;
   xi_s = -((2/(4-CONSTANT('PI')))**(1/3))*(_SKEW_**(1/3));
   eta_s = sqrt(1+xi_s**2);
   b = sqrt(2/CONSTANT('PI'));
   delta_hat = -xi_s/(b*eta_s);
   xi_hat = _STDEV_*xi_s+_MEAN_;
   eta_hat = eta_s*_STDEV_;
   lambda_hat = delta_hat/sqrt(1-delta_hat**2);
run;


/* Centered Parameterization */
data Moments_Y_Centered;
   set Moments_Y_Centered;
   mu_hat = _MEAN_;
   sigma_hat = _STDEV_;
   gamma_hat = _SKEW_;
   b = sqrt(2/CONSTANT('PI'));
   delta_hat = sqrt((gamma_hat**(2/3))/((b**2)*(gamma_hat**(2/3))
               +(b**(2/3))*(2*b**2-1)**(2/3)));
   lambda_hat = delta_hat/sqrt(1-delta_hat**2);
run;
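The centered-parameterization estimator above recovers delta, and hence lambda, from the sample skewness alone. A Python sketch of the same recovery, assuming the convolution representation Z = delta*|W1| + sqrt(1-delta^2)*W2 of a skew-normal variate; the seed and lambda are assumptions of the example:

```python
import numpy as np

rng = np.random.default_rng(3)
lam = 2.0
delta = lam / np.sqrt(1.0 + lam**2)
b = np.sqrt(2.0 / np.pi)

# Simulate skew-normal data via the convolution representation
w1, w2 = rng.normal(size=(2, 500000))
z = delta * np.abs(w1) + np.sqrt(1.0 - delta**2) * w2

# Method-of-moments: invert the skewness formula for delta, then lambda
s = z.std()
g1 = np.mean((z - z.mean())**3) / s**3
delta_hat = np.sqrt(g1**(2/3) /
                    (b**2 * g1**(2/3) + b**(2/3) * (2*b**2 - 1)**(2/3)))
lam_hat = delta_hat / np.sqrt(1.0 - delta_hat**2)
```

Because lambda = delta/sqrt(1-delta^2) is steep near |delta| = 1, small errors in the skewness inflate noticeably in lambda-hat.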

/* Assess whether the centered parameterization is valid
   for the standard case */
data Moments_X_with_Centered_Approach;
   set Moments_X;
   mu_hat = _MEAN_;
   sigma_hat = _STDEV_;
   gamma_hat = _SKEW_;
   b = sqrt(2/CONSTANT('PI'));
   delta_hat = sqrt((gamma_hat**(2/3))/((b**2)*(gamma_hat**(2/3))
               +(b**(2/3))*(2*b**2-1)**(2/3)));
   lambda_hat = delta_hat/sqrt(1-delta_hat**2);
   test_E_X = b*delta_hat;
   test_S_X = sqrt(1-(b**2)*(delta_hat**2));
run;

/* Assess the accuracy of the maximum likelihood estimation
   using a grid-search approach */
%let Parameterization = X;
data Keep_X;
   set Azzalini_Skew_Normal (keep = &Parameterization);
run;

data _NULL_;
   set Moments_&Parameterization(obs=1);
   call symput("STDEV",_STDEV_);
   call symput("MEAN",_MEAN_);
run;

proc iml;
reset;
use Keep_X;
read all into X_Obs;
min_tau = -sqrt(2/(CONSTANT('PI')-2))+0.000001;
max_tau = -min_tau;
min_mu_s = -3;
max_mu_s = 3;

Loops = 100;

N_Obs = nrow(X_Obs);

Parameters = J(Loops*Loops,2,0);
log_likelihood = J(Loops*Loops,1,0);
do i = 1 to Loops;
   do j = 1 to Loops;

b = sqrt(2/CONSTANT('PI'));
mu_s = min_mu_s+i*((max_mu_s-min_mu_s)/Loops);
tau = min_tau+j*((max_tau-min_tau)/Loops);

Parameters[(i-1)*Loops+j,1] = mu_s;
Parameters[(i-1)*Loops+j,2] = tau;

Numerator = (((1/&STDEV)#(X_Obs-J(N_Obs,1,&MEAN))
           -J(N_Obs,1,mu_s))#J(N_Obs,1,tau))
           #((J(N_Obs,1,sqrt((mu_s**2)*(1+tau**2)+1)
           -mu_s*tau))##(-1))+tau**2;

Denominator = sqrt((b**2+(tau**2)*(b**2-1))*(1+tau**2));
VAL = (1/Denominator)*Numerator;

CDF = CDF('NORMAL',VAL,0,1);
do k = 1 to N_Obs;
   if CDF[k,1] = 0 then CDF[k,1] = 0.000000000001;
end;
log_likelihood[(i-1)*Loops+j,1] =
   -N_Obs*log(sqrt((mu_s**2)*(1+tau**2)+1)-mu_s*tau)
   -(N_Obs/2)*log(1+tau**2)+sum((log(CDF)));
   end;
end;

Grid = Parameters || log_likelihood;
Column_Labels = {'mu_s' 'tau' 'log_likelihood'};
create Grid_Data from Grid[colname = Column_Labels];
append from Grid;
quit;

proc sort data = Grid_Data;
   by descending log_likelihood;
run;

proc print data = Grid_Data (obs=10);
run;

data MLE;
   set Grid_data (obs=1);
   b = sqrt(2/CONSTANT('PI'));
   gamma_hat = ((tau)**3)*((4-constant('PI'))/2);
   mu_hat = &STDEV*mu_s+&MEAN;
   sigma_hat = &STDEV*(sqrt((mu_s**2)*(1+tau**2)+1)-mu_s*tau);
   delta_hat = sqrt((gamma_hat**(2/3))/((b**2)*(gamma_hat**(2/3))
               +(b**(2/3))*(2*b**2-1)**(2/3)));
   lambda_hat = delta_hat/sqrt(1-delta_hat**2);
run;


Appendix B

Summary of Continuous Distributions

B.1 Beta-prime Distribution - δ1, δ2

Probability Density Function:

f_{\delta_1,\delta_2}(x) = \frac{1}{B(\delta_1,\delta_2)}\left(\frac{x}{x+1}\right)^{\delta_1-1}\left(\frac{1}{x+1}\right)^{\delta_2+1} for x \ge 0 with \delta_1, \delta_2 > 0,   (B.1)

where B(\delta_1,\delta_2) is the beta function as given in Definition 28.

Moments:

E(X^r) = \frac{B(\delta_1+r,\,\delta_2-r)}{B(\delta_1,\delta_2)}   (B.2)

E(X) = \frac{\delta_1}{\delta_2-1}   (B.3)

\mathrm{var}(X) = \frac{\delta_1(\delta_1+\delta_2-1)}{(\delta_2-1)^2(\delta_2-2)}   (B.4)

\mathrm{skewness}(X) = \frac{\sqrt{\delta_2-2}\left[\delta_1(\delta_1+1)(\delta_1+2)(\delta_2-1)^2 - 3\delta_1^2(\delta_1+1)(\delta_2-1)(\delta_2-3) + 2\delta_1^3(\delta_2-2)(\delta_2-3)\right]}{\left[\delta_1(\delta_1+\delta_2-1)\right]^{3/2}(\delta_2-3)}   (B.5)


Tail Weight: A heavy tail could not be confirmed, but no contradiction of the condition for heavy-tailedness was found either.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. In some instances the iterative algorithm does not converge, especially for larger observed values.

B.2 Birnbaum-Saunders Distribution - α, β

Probability Density Function:

f_X(x) = \frac{1}{\sqrt{2\pi}}\,\frac{1}{2\alpha}\left(\frac{1}{\sqrt{x\beta}} + \frac{\sqrt{\beta}}{x^{3/2}}\right)\exp\left\{-\frac{1}{2\alpha^2}\left(\frac{x}{\beta}+\frac{\beta}{x}-2\right)\right\} for x > 0 with \alpha, \beta > 0   (B.6)

Moments:

E(X) = \beta\left(1+\frac{\alpha^2}{2}\right)   (B.7)

\mathrm{var}(X) = \alpha^2\beta^2\left(1+\frac{5}{4}\alpha^2\right)   (B.8)

Tail Weight: No heavy tail.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. The iterative algorithm converges reasonably easily.

B.3 Burr Type XII Distribution - c, k

Probability Density Function:

f_X(x) = kc\,x^{c-1}\left(1+x^{c}\right)^{-(k+1)} for x > 0 with c, k > 0   (B.9)

Moments:

E(X^r) = \frac{k\,\Gamma\!\left(\frac{r}{c}+1\right)\Gamma\!\left(k-\frac{r}{c}\right)}{\Gamma(k+1)}   (B.10)


E(X) = \frac{k\,\Gamma\!\left(\frac{1}{c}+1\right)\Gamma\!\left(k-\frac{1}{c}\right)}{\Gamma(k+1)}   (B.11)

\mathrm{var}(X) = \frac{k\,\Gamma(k+1)}{\Gamma(k+1)^2}\,\Gamma\!\left(\frac{2}{c}+1\right)\Gamma\!\left(k-\frac{2}{c}\right) - \frac{k^2}{\Gamma(k+1)^2}\left[\Gamma\!\left(\frac{1}{c}+1\right)\Gamma\!\left(k-\frac{1}{c}\right)\right]^2   (B.12)

Tail Weight: A heavy tail is suggested by the decreasing hazard rate function.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. For large values of c the iterative algorithm fails to converge. For larger values of k the accuracy of the estimation decreases and convergence of the iterative algorithm becomes less likely.

B.4 Chi-square Distribution - ν

Probability Density Function:

f_X(x) = \frac{1}{2^{\nu/2}\,\Gamma\!\left(\frac{\nu}{2}\right)}\,x^{\frac{\nu}{2}-1}e^{-\frac{1}{2}x} for x \ge 0 with \nu > 0   (B.13)

Moment Generating Function:

M_X(t) = (1-2t)^{-\frac{\nu}{2}}   (B.14)

Moments:

E(X^r) = 2^{r}\prod_{j=0}^{r-1}\left(\frac{\nu}{2}+j\right)   (B.15)

E(X) = \nu   (B.16)
\mathrm{var}(X) = 2\nu   (B.17)
\mathrm{skewness}(X) = \frac{2\sqrt{2}}{\sqrt{\nu}}   (B.18)
\mathrm{kurtosis}(X) = 3 + \frac{12}{\nu}   (B.19)

Tail Weight: No heavy tail.

© University of Pretoria

APPENDIX B. SUMMARY OF CONTINUOUS DISTRIBUTIONS 332

Parameter Estimation:
Method-of-moments:

\hat{\nu} = \bar{x}   (B.20)

Maximum likelihood: Numerical optimization of the log-likelihood function is used. The algorithm generally converges easily when using the method-of-moments estimate as initial value. Accuracy increases with larger sample sizes.
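Since E(X) = \nu, the method-of-moments estimate is simply the sample mean. A quick numeric illustration in Python, with the degrees of freedom, sample size and seed assumed for the example:

```python
import numpy as np

rng = np.random.default_rng(4)
nu = 7.0
x = rng.chisquare(df=nu, size=200000)

nu_hat = x.mean()    # method-of-moments estimate (B.20): E(X) = nu
var_check = x.var()  # should be near 2*nu, consistent with (B.17)
```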

B.5 Dagum Distribution - a, b, p

Probability Density Function:

f_X(x) = a\,b^{a}p\,x^{-(a+1)}\left(1+b^{a}x^{-a}\right)^{-(p+1)} for x > 0 with a, b, p > 0   (B.21)

Moments:

E(X^r) = b^{r}p\,B\!\left(\frac{r}{a}+p,\,1-\frac{r}{a}\right)   (B.22)

E(X) = \frac{p\,b\,\Gamma\!\left(\frac{1}{a}+p\right)\Gamma\!\left(1-\frac{1}{a}\right)}{\Gamma(p+1)}   (B.23)

\mathrm{var}(X) = p\,b^{2}B\!\left(\frac{2}{a}+p,\,1-\frac{2}{a}\right) - p^{2}b^{2}\left[B\!\left(\frac{1}{a}+p,\,1-\frac{1}{a}\right)\right]^{2}   (B.24)

Tail Weight: The hazard rate function is decreasing, indicating that the distribution has a heavy tail.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. Convergence of the iterative estimation algorithm is rarely obtained. In cases where convergence is obtained, the accuracy of the estimates increases with increasing sample sizes.


B.6 Erlang Distribution - λ, n

Probability Density Function:

f_X(x) = \frac{\lambda^{n}x^{n-1}e^{-\lambda x}}{(n-1)!} for n \in \mathbb{N}, \lambda > 0 and x \in \mathbb{R}^{+}   (B.25)

Moment Generating Function:

M_X(t) = \left(\frac{\lambda}{\lambda-t}\right)^{n}   (B.26)

Moments:

E(X^r) = \frac{\Gamma(n+r)}{\lambda^{r}\Gamma(n)}   (B.27)

E(X) = \frac{n}{\lambda}   (B.28)
\mathrm{var}(X) = \frac{n}{\lambda^2}   (B.29)
\mathrm{skewness}(X) = \frac{2}{\sqrt{n}}   (B.30)
\mathrm{kurtosis}(X) = 3 + \frac{6}{n}   (B.31)

Tail Weight: No heavy tail.

Parameter Estimation:
Method-of-moments:

\hat{\lambda} = \frac{n}{\bar{x}}   (B.32)

Maximum likelihood:

\hat{\lambda} = \frac{n}{\bar{x}}   (B.33)

B.7 Exponential Distribution - θ

Probability Density Function:

f_X(x) = \frac{1}{\theta}\,e^{-\frac{x}{\theta}} for x \ge 0 with \theta > 0   (B.34)


Moment Generating Function:

M_X(t) = \frac{1}{1-\theta t}.   (B.35)

Moments:

E(X^r) = r!\,\theta^{r}   (B.36)

E(X) = \theta   (B.37)
\mathrm{var}(X) = \theta^2   (B.38)
\mathrm{skewness}(X) = 2   (B.39)
\mathrm{kurtosis}(X) = 9   (B.40)

Tail Weight: No heavy tail.

Parameter Estimation:
Method-of-moments:

\hat{\theta} = \bar{x}   (B.41)

Maximum likelihood:

\hat{\theta} = \bar{x}   (B.42)

B.8 Folded Normal Distribution - µ, σ

Probability Density Function:

f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\left(e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} + e^{-\frac{1}{2}\left(\frac{x+\mu}{\sigma}\right)^2}\right)   (B.43)

for x \ge 0 with -\infty < \mu < \infty and \sigma > 0.

Moments:

E(X^r) = \sigma^{r}\sum_{j=0}^{r}\binom{r}{j}\theta^{r-j}\left[I_j(-\theta) + (-1)^{r-j}I_j(\theta)\right]   (B.44)

where

I_j(a) = \frac{1}{\sqrt{2\pi}}\int_a^{\infty} y^{j}e^{-\frac{1}{2}y^2}\,dy for j = 1, 2, \ldots

I_0(a) = 1 - \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{a} e^{-\frac{1}{2}y^2}\,dy

and \theta = \frac{\mu}{\sigma}.


For r = 1:

I_0(-\theta) - I_0(\theta) = -(1 - 2F(\theta))   (B.45)

I_1(-\theta) + I_1(\theta) = \sqrt{\frac{2}{\pi}}\,e^{-\frac{\theta^2}{2}}   (B.46)

For r = 2:

I_0(-\theta) + I_0(\theta) = 1   (B.47)
I_1(-\theta) - I_1(\theta) = 0   (B.48)
I_2(-\theta) + I_2(\theta) = 1   (B.49)

For r = 3:

I_2(-\theta) - I_2(\theta) = -\left(\sqrt{\frac{2}{\pi}}\,\theta e^{-\frac{1}{2}\theta^2} + (1 - 2I_0(-\theta))\right)   (B.50)

I_3(-\theta) + I_3(\theta) = \sqrt{\frac{2}{\pi}}\,e^{-\frac{1}{2}\theta^2}(\theta^2+2)   (B.51)

For r = 4:

I_3(-\theta) - I_3(\theta) = 0   (B.52)
I_4(-\theta) + I_4(\theta) = 3   (B.53)

E(X) = \sqrt{\frac{2}{\pi}}\,\sigma e^{-\frac{1}{2}\left(\frac{\mu}{\sigma}\right)^2} - \mu\left(1 - 2F\!\left(\frac{\mu}{\sigma}\right)\right)   (B.54)

\mathrm{var}(X) = \mu^2 + \sigma^2 - \left(\sqrt{\frac{2}{\pi}}\,\sigma e^{-\frac{1}{2}\left(\frac{\mu}{\sigma}\right)^2} - \mu\left(1 - 2F\!\left(\frac{\mu}{\sigma}\right)\right)\right)^{2}   (B.55)

Tail Weight: No heavy tail.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. The algorithm has good convergence properties. For larger values of σ and values of µ closer to 0 the estimation becomes less accurate.


B.9 Frechet Distribution - β, δ, λ

Probability Density Function:

f_X(x) = \begin{cases} \frac{\beta}{\delta}\left(\frac{\delta}{x-\lambda}\right)^{\beta-1}\left(\frac{\delta}{x-\lambda}\right)^{2}\exp\left\{-\left(\frac{\delta}{x-\lambda}\right)^{\beta}\right\} & \text{if } x \ge \lambda, \\ 0 & \text{otherwise} \end{cases}   (B.56)

for x \ge \lambda with \delta, \beta > 0 and \lambda \in \mathbb{R}.

Moments:

E(X^r) = \int_0^{\infty}\left(\delta y^{-\frac{1}{\beta}} + \lambda\right)^{r}e^{-y}\,dy   (B.57)

E(X) = \delta\,\Gamma\!\left(1-\frac{1}{\beta}\right) + \lambda   (B.58)

\mathrm{var}(X) = \delta^2\left(\Gamma\!\left(1-\frac{2}{\beta}\right) - \Gamma\!\left(1-\frac{1}{\beta}\right)^{2}\right)   (B.59)

\mathrm{skewness}(X) = \frac{\Gamma\!\left(1-\frac{3}{\beta}\right) - 3\Gamma\!\left(1-\frac{1}{\beta}\right)\Gamma\!\left(1-\frac{2}{\beta}\right) + 2\Gamma\!\left(1-\frac{1}{\beta}\right)^{3}}{\left[\Gamma\!\left(1-\frac{2}{\beta}\right) - \Gamma\!\left(1-\frac{1}{\beta}\right)^{2}\right]^{3/2}}   (B.60)

Tail Weight: The hazard rate function is decreasing, which indicates a heavy tail.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. Convergence of the estimation algorithm is highly dependent on the choice of initial values. When the true value of λ is large in relation to the values of β and δ, the estimates of β and δ become less accurate.

B.10 Gamma Distribution - θ, κ

Probability Density Function:

f_X(x) = \frac{1}{\theta^{\kappa}\Gamma(\kappa)}\,x^{\kappa-1}e^{-\frac{x}{\theta}} for x \ge 0 with \theta, \kappa > 0   (B.61)


Moment Generating Function:

M_X(t) = (1-\theta t)^{-\kappa}   (B.62)

Moments:

E(X^r) = \frac{\Gamma(\kappa+r)}{\Gamma(\kappa)}\,\theta^{r} = \left(\prod_{j=0}^{r-1}(\kappa+j)\right)\theta^{r}   (B.63)

E(X) = \kappa\theta   (B.64)
\mathrm{var}(X) = \kappa\theta^2   (B.65)
\mathrm{skewness}(X) = \frac{2}{\sqrt{\kappa}}   (B.66)
\mathrm{kurtosis}(X) = \frac{3(\kappa+2)}{\kappa}   (B.67)

Tail Weight: No heavy tail.

Parameter Estimation:
Method-of-moments:

\hat{\theta} = \frac{s_x^2}{\bar{x}}, and   (B.68)

\hat{\kappa} = \frac{\bar{x}^2}{s_x^2}   (B.69)

Maximum likelihood: Numerical optimization of the log-likelihood function is used. The algorithm generally converges easily when using the method-of-moments estimates as initial values. Accuracy increases with larger sample sizes.
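Equations (B.68)-(B.69) are easy to verify numerically; a Python sketch, with the parameter values, sample size and seed assumed for the example:

```python
import numpy as np

rng = np.random.default_rng(5)
theta, kappa = 2.0, 3.0
x = rng.gamma(shape=kappa, scale=theta, size=300000)

theta_hat = x.var() / x.mean()      # (B.68): s^2 / xbar
kappa_hat = x.mean()**2 / x.var()   # (B.69): xbar^2 / s^2
```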

B.11 Generalized Beta Distribution of the Second Kind - a, b, p, q

Probability Density Function:

f_X(x) = \frac{a\,x^{ap-1}}{b^{ap}B(p,q)\left(1+\left(\frac{x}{b}\right)^{a}\right)^{p+q}} for x \ge 0 with a, b, p, q > 0   (B.70)

Moments:

E(X^r) = \frac{b^{r}\,\Gamma\!\left(\frac{r}{a}+p\right)\Gamma\!\left(q-\frac{r}{a}\right)}{\Gamma(p)\Gamma(q)}   (B.71)


E(X) = \frac{b\,\Gamma\!\left(\frac{1}{a}+p\right)\Gamma\!\left(q-\frac{1}{a}\right)}{\Gamma(p)\Gamma(q)}   (B.72)

\mathrm{var}(X) = b^{2}\,\frac{\Gamma\!\left(\frac{2}{a}+p\right)\Gamma\!\left(q-\frac{2}{a}\right)}{\Gamma(p)\Gamma(q)} - b^{2}\left(\frac{\Gamma\!\left(\frac{1}{a}+p\right)\Gamma\!\left(q-\frac{1}{a}\right)}{\Gamma(p)\Gamma(q)}\right)^{2}   (B.73)

Tail Weight: Heavy-tailedness cannot be confirmed algebraically. There exist combinations of values of a, b, p and q for which the distribution does not have a heavy tail.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. Convergence of the estimation algorithm is generally difficult to obtain.

B.12 Generalized Pareto Distribution - κ, θ, τ

Probability Density Function:

f_X(x) = \frac{\Gamma(\kappa+\tau)}{\Gamma(\kappa)\Gamma(\tau)}\,\frac{\theta^{\kappa}x^{\tau-1}}{(x+\theta)^{\kappa+\tau}} for x > 0 with \kappa, \tau, \theta > 0   (B.74)

Moments:

E(X^r) = \theta^{r}\,\frac{\Gamma(\tau+r)\Gamma(\kappa-r)}{\Gamma(\kappa)\Gamma(\tau)} for -\tau < r < \kappa   (B.75)

E(X) = \frac{\theta\tau}{\kappa-1}   (B.76)

\mathrm{var}(X) = \frac{\theta^2\tau(\kappa+\tau-1)}{(\kappa-1)^2(\kappa-2)}   (B.77)

\mathrm{skewness}(X) = \frac{2(2\tau+\kappa-1)}{\kappa-3}\sqrt{\frac{\kappa-2}{\tau(\kappa+\tau-1)}}   (B.78)

Tail Weight: The distribution has a heavy tail, but this depends on the values of \kappa and \tau.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. The algorithm yields convergence provided that suitable initial values are chosen. The accuracy of these estimates improves with increasing sample sizes.


B.13 Gumbel Distribution - α, ξ

Probability Density Function:

f(x) = \frac{1}{\alpha}\exp\left\{-e^{-\frac{x-\xi}{\alpha}} - \frac{x-\xi}{\alpha}\right\} for x \in \mathbb{R} with \xi \in \mathbb{R} and \alpha > 0   (B.79)

Moments:

E(X) = \xi + 0.57721566\,\alpha   (B.80)
\mathrm{var}(X) = 1.64493407\,\alpha^2   (B.81)
\mathrm{skewness}(X) \approx 1.13955   (B.82)
\mathrm{kurtosis}(X) = \frac{27}{5}   (B.83)

Tail Weight: No heavy tail.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. Convergence of the estimation algorithm is highly dependent on the choice of initial values for α and ξ. Accuracy of the estimates increases with increased sample sizes. When the value of α is large relative to the value of ξ, the accuracy of the estimate for ξ decreases.

B.14 Inverse Chi-square Distribution - ν

Probability Density Function:

f_X(x) = \frac{1}{2^{\nu/2}\,\Gamma\!\left(\frac{\nu}{2}\right)}\left(\frac{1}{x}\right)^{\frac{\nu}{2}+1}e^{-\frac{1}{2x}} for x > 0 with \nu > 0   (B.84)

Moments:

E(X^r) = \frac{\Gamma\!\left(\frac{\nu}{2}-r\right)}{2^{r}\,\Gamma\!\left(\frac{\nu}{2}\right)}   (B.85)

E(X) = \frac{1}{\nu-2}   (B.86)

\mathrm{var}(X) = \frac{2}{(\nu-2)^2(\nu-4)}   (B.87)

\mathrm{skewness}(X) = \frac{2^{5/2}\sqrt{\nu-4}}{\nu-6}   (B.88)


Tail Weight: No heavy tail.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. Generally the algorithm reaches convergence easily with reasonable accuracy of the estimate for ν.

B.15 Inverse Gamma Distribution - α, θ

Probability Density Function:

f_X(x) = \frac{\left(\frac{\theta}{x}\right)^{\alpha}e^{-\frac{\theta}{x}}}{x\,\Gamma(\alpha)} for x > 0 with \alpha, \theta > 0   (B.89)

Moments:

E(X^r) = \theta^{r}\,\frac{\Gamma(\alpha-r)}{\Gamma(\alpha)}   (B.90)

E(X) = \frac{\theta}{\alpha-1}   (B.91)

\mathrm{var}(X) = \frac{\theta^2}{(\alpha-2)(\alpha-1)^2}   (B.92)

\mathrm{skewness}(X) = \frac{4\sqrt{\alpha-2}}{\alpha-3}   (B.93)

Tail Weight: No heavy tail.

Parameter Estimation:
Method-of-Moments:

\hat{\alpha} = \frac{\bar{x}^2 + 2s_x^2}{s_x^2}, and   (B.94)

\hat{\theta} = \frac{\bar{x}^3 + s_x^2\,\bar{x}}{s_x^2}   (B.95)

Maximum likelihood: Numerical optimization of the log-likelihood function is used. Generally the algorithm reaches convergence easily with reasonable accuracy of the parameter estimates, but it is sensitive to the choice of initial values.


B.16 Inverse Gaussian Distribution - µ, λ

Probability Density Function:

f_T(t) = \sqrt{\frac{\lambda}{2\pi t^3}}\;e^{-\frac{\lambda(t-\mu)^2}{2\mu^2 t}} for t > 0 with \mu, \lambda > 0   (B.96)

Moment Generating Function:

M_T(\tau) = \frac{e^{\frac{\lambda}{\mu}}}{\exp\left\{\sqrt{\left(\frac{\lambda}{\mu}\right)^2 - 2\lambda\tau}\right\}}   (B.97)

Moments:

E(T^{-r}) = \frac{E(T^{r+1})}{\mu^{2r+1}}   (B.98)

E(T) = \mu   (B.99)

\mathrm{var}(T) = \frac{\mu^3}{\lambda}   (B.100)

\mathrm{skewness}(T) = 3\sqrt{\frac{\mu}{\lambda}}   (B.101)

Tail Weight: No heavy tail.

Parameter Estimation:
Method-of-moments:

\hat{\mu} = \bar{x}, and   (B.102)

\hat{\lambda} = \frac{\bar{x}^3}{s_x^2}   (B.103)

Maximum likelihood: Numerical optimization of the log-likelihood function is used. Generally the algorithm reaches convergence easily with reasonable accuracy of the parameter estimates. The method-of-moments estimates can be used as initial values for the iterative estimation algorithm.

B.17 Kappa Family (Two-parameter) Distribution - α, β

Probability Density Function:

f_X(x) = \frac{x^{-(\alpha+1)}}{\beta^{-\alpha}}\left(1+\left(\frac{x}{\beta}\right)^{-\alpha}\right)^{-\left(\frac{1}{\alpha}+1\right)} for x \ge 0 and \alpha, \beta > 0   (B.104)


Moments:

E(X^r) = \frac{\beta^{r}\,\Gamma\!\left(\frac{r+1}{\alpha}\right)\Gamma\!\left(\frac{\alpha-r}{\alpha}\right)}{\alpha\,\Gamma\!\left(\frac{\alpha+1}{\alpha}\right)}   (B.105)

E(X) = \frac{\beta}{\alpha}\,B\!\left(\frac{2}{\alpha},\,1-\frac{1}{\alpha}\right)   (B.106)

\mathrm{var}(X) = \left(\frac{\beta}{\alpha}\right)^{2}\left(\alpha B\!\left(\frac{3}{\alpha},\,1-\frac{2}{\alpha}\right) - B\!\left(\frac{2}{\alpha},\,1-\frac{1}{\alpha}\right)^{2}\right)   (B.107)

\mathrm{skewness}(X) = \frac{\alpha^{2}B\!\left(\frac{4}{\alpha},\,1-\frac{3}{\alpha}\right) - 3\alpha B\!\left(\frac{2}{\alpha},\,1-\frac{1}{\alpha}\right)B\!\left(\frac{3}{\alpha},\,1-\frac{2}{\alpha}\right) + 2B\!\left(\frac{2}{\alpha},\,1-\frac{1}{\alpha}\right)^{3}}{\left(\alpha B\!\left(\frac{3}{\alpha},\,1-\frac{2}{\alpha}\right) - B\!\left(\frac{2}{\alpha},\,1-\frac{1}{\alpha}\right)^{2}\right)^{3/2}}   (B.108)

Tail Weight: Heavy tail, based on a decreasing hazard rate function.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function for the three-parameter version is used. If the underlying distribution is the two-parameter version, this will be suggested by the parameter estimates. In most cases it is difficult to obtain convergence of the estimation algorithm, as it is very sensitive to the choice of initial values.

B.18 Kappa Family (Three-parameter) Distribution - α, β, θ

Probability Density Function:

f_X(x) = \frac{\alpha\theta\,x^{\alpha\theta-1}}{\beta^{\alpha\theta}\left(1+\left(\frac{x}{\beta}\right)^{\theta}\right)^{\alpha+1}} for x \ge 0 with \theta, \beta, \alpha > 0   (B.109)

Moments:

E(X^r) = \beta^{r}\,\frac{\Gamma\!\left(\frac{r}{\theta}+\alpha\right)\Gamma\!\left(1-\frac{r}{\theta}\right)}{\Gamma(\alpha)}   (B.110)

E(X) = \beta\,\frac{\Gamma\!\left(\frac{1}{\theta}+\alpha\right)\Gamma\!\left(1-\frac{1}{\theta}\right)}{\Gamma(\alpha)}   (B.111)

\mathrm{var}(X) = \left(\frac{\beta}{\Gamma(\alpha)}\right)^{2}\Gamma(\alpha)\,\Gamma\!\left(\frac{2}{\theta}+\alpha\right)\Gamma\!\left(1-\frac{2}{\theta}\right) - \left(\frac{\beta}{\Gamma(\alpha)}\right)^{2}\Gamma\!\left(\frac{1}{\theta}+\alpha\right)^{2}\Gamma\!\left(1-\frac{1}{\theta}\right)^{2}   (B.112)

Tail Weight: Heavy tail, based on a decreasing hazard rate function.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. In most cases it is difficult to obtain convergence of the estimation algorithm, as it is very sensitive to the choice of initial values.

B.19 Loggamma Distribution - a, λ

Probability Density Function:

f_X(x) = \frac{\lambda^{a}}{\Gamma(a)}\,(\ln(x))^{a-1}\,x^{-(\lambda+1)} for x > 1 with a, \lambda > 0   (B.113)

Moments:

E(X^r) = \left(\frac{\lambda}{\lambda-r}\right)^{a}   (B.114)

E(X) = \left(\frac{\lambda}{\lambda-1}\right)^{a}   (B.115)

\mathrm{var}(X) = \left(\frac{\lambda}{\lambda-2}\right)^{a} - \left(\frac{\lambda}{\lambda-1}\right)^{2a}   (B.116)

\mathrm{skewness}(X) = \frac{\left(\frac{\lambda}{\lambda-3}\right)^{a} - 3\left(\frac{\lambda}{\lambda-1}\right)^{a}\left(\frac{\lambda}{\lambda-2}\right)^{a} + 2\left(\frac{\lambda}{\lambda-1}\right)^{3a}}{\left(\left(\frac{\lambda}{\lambda-2}\right)^{a} - \left(\frac{\lambda}{\lambda-1}\right)^{2a}\right)^{3/2}}   (B.117)


Tail Weight: There are parameter values for which the distribution does not have a heavy tail.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. Depending on the choice of initial values of λ and a, the algorithm performs well, obtaining convergence with reasonable accuracy.

B.20 Logistic Distribution - θ, ξ

Probability Density Function:

f_X(x) = \frac{1}{\theta}\,\frac{e^{\frac{x-\xi}{\theta}}}{\left(1+e^{\frac{x-\xi}{\theta}}\right)^{2}} for x \in \mathbb{R} with -\infty < \xi < \infty and \theta > 0   (B.118)

Moments:

E(X) = \xi   (B.119)

\mathrm{var}(X) = \frac{\pi^2\theta^2}{3}   (B.120)

\mathrm{skewness}(X) = 0   (B.121)

B.21 Log-logistic Distribution - α, β

Probability Density Function:

f_X(x) = \frac{\beta\left(\frac{x}{\alpha}\right)^{\beta}}{x\left(1+\left(\frac{x}{\alpha}\right)^{\beta}\right)^{2}} for x > 0 with \alpha, \beta > 0   (B.122)

Moments:

E(X^r) = \alpha^{r}\,\Gamma\!\left(1+\frac{r}{\beta}\right)\Gamma\!\left(1-\frac{r}{\beta}\right)   (B.123)

E(X) = \alpha\,\Gamma\!\left(1+\frac{1}{\beta}\right)\Gamma\!\left(1-\frac{1}{\beta}\right)   (B.124)

\mathrm{var}(X) = \alpha^{2}\left(\Gamma\!\left(1+\frac{2}{\beta}\right)\Gamma\!\left(1-\frac{2}{\beta}\right) - \left(\Gamma\!\left(1+\frac{1}{\beta}\right)\Gamma\!\left(1-\frac{1}{\beta}\right)\right)^{2}\right)   (B.125)


Tail Weight: Heavy tail, based on a decreasing hazard rate function.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. In some instances it is difficult to obtain convergence of the estimation algorithm, as it is sensitive to the choice of initial values of α and β.

B.22 Lognormal Distribution - µ, σ

Probability Density Function:

f_X(x) = \frac{1}{x\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\frac{(\ln(x)-\mu)^2}{\sigma^2}} for x > 0 with -\infty < \mu < \infty and \sigma > 0.   (B.126)

Moments:

E(X^r) = e^{\mu r + \frac{1}{2}r^2\sigma^2}   (B.127)

E(X) = e^{\mu+\frac{1}{2}\sigma^2}   (B.128)

\mathrm{var}(X) = e^{2\mu+\sigma^2}\left(e^{\sigma^2}-1\right)   (B.129)

\mathrm{skewness}(X) = \left(e^{\sigma^2}+2\right)\sqrt{e^{\sigma^2}-1}   (B.130)

Tail Weight: Heavy-tailed.

Parameter Estimation:
Maximum likelihood:

\hat{\mu} = \frac{1}{n}\sum_{j=1}^{n}\ln(x_j)   (B.131)

and

\hat{\sigma}^2 = \frac{1}{n}\sum_{j=1}^{n}\left(\ln(x_j) - \frac{1}{n}\sum_{j=1}^{n}\ln(x_j)\right)^{2}   (B.132)
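Since (B.131)-(B.132) are just the normal MLEs applied to ln(x), they can be checked directly; a Python sketch with parameter values and seed assumed for the example:

```python
import numpy as np

rng = np.random.default_rng(6)
mu, sigma = 1.0, 0.5
x = rng.lognormal(mu, sigma, size=200000)

log_x = np.log(x)
mu_hat = log_x.mean()                      # (B.131)
sigma2_hat = np.mean((log_x - mu_hat)**2)  # (B.132)
```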

B.23 Normal Distribution - µ, σ

Probability Density Function:

f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}   (B.133)

Moment Generating Function:

M_X(t) = e^{\mu t + \frac{t^2\sigma^2}{2}}   (B.134)

Moments:

E(X) = \mu   (B.135)
\mathrm{var}(X) = \sigma^2   (B.136)
\mathrm{skewness}(X) = 0   (B.137)
\mathrm{kurtosis}(X) = 3   (B.138)

B.24 Pareto Type II (Lomax) Distribution - κ, θ

Probability Density Function:

f_X(x) = \frac{\kappa}{\theta}\left(1+\frac{x}{\theta}\right)^{-(\kappa+1)} for x > 0 with \theta, \kappa > 0   (B.139)

Moments:

E(X^r) = \theta^{r}\,\frac{\Gamma(r+1)\Gamma(\kappa-r)}{\Gamma(\kappa)}   (B.140)

E(X) = \frac{\theta}{\kappa-1}   (B.141)

\mathrm{var}(X) = \frac{\theta^2\kappa}{(\kappa-1)^2(\kappa-2)}   (B.142)

\mathrm{skewness}(X) = \frac{2(\kappa+1)}{\kappa-3}\sqrt{\frac{\kappa-2}{\kappa}}   (B.143)

Tail Weight: Heavy-tailed, but not for values of κ < 2.

Parameter Estimation:
Method-of-moments:

\hat{\kappa} = \frac{2s_x^2}{s_x^2 - \bar{x}^2}, and   (B.144)

\hat{\theta} = \bar{x}\,\frac{s_x^2 + \bar{x}^2}{s_x^2 - \bar{x}^2}   (B.145)

Maximum likelihood: Numerical optimization of the log-likelihood function is used. The estimation algorithm has good convergence properties if the method-of-moments estimates are used as initial values. The accuracy of the estimates improves with increasing sample size.
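The moment estimators (B.144)-(B.145) require s² > x̄², i.e. κ > 2. A Python sketch — numpy's pareto sampler draws from a unit-scale Lomax, so multiplying by θ matches the parameterization in (B.139); κ, θ, the sample size and the seed are assumptions of the example:

```python
import numpy as np

rng = np.random.default_rng(7)
kappa, theta = 5.0, 2.0
x = theta * rng.pareto(kappa, size=500000)  # Lomax(kappa) with scale theta

m, s2 = x.mean(), x.var()
kappa_hat = 2.0 * s2 / (s2 - m**2)         # (B.144)
theta_hat = m * (s2 + m**2) / (s2 - m**2)  # (B.145)
```

The heavy tail makes these moment estimates noisy even at large sample sizes, which is why they serve mainly as initial values for maximum likelihood.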


B.25 Rayleigh Distribution - θ

Probability Density Function:

f_X(x) = \frac{2x}{\theta^2}\,e^{-\left(\frac{x}{\theta}\right)^2} for x > 0 with \theta > 0   (B.146)

Moments:

E(X) = \frac{\theta\sqrt{\pi}}{2}   (B.147)

\mathrm{var}(X) = \frac{\theta^2}{4}(4-\pi)   (B.148)

\mathrm{skewness}(X) = \frac{2\pi^{3/2} - 6\sqrt{\pi}}{(4-\pi)^{3/2}}   (B.149)

\mathrm{kurtosis}(X) = \frac{32 - 3\pi^2}{(4-\pi)^2}   (B.150)

Tail Weight: No heavy tail.

Parameter Estimation:
Method-of-moments:

\hat{\theta} = \frac{2\bar{X}}{\sqrt{\pi}}   (B.151)

Maximum likelihood:

\hat{\theta} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}X_i^2}   (B.152)
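Both estimators can be checked by simulation. numpy's rayleigh sampler uses scale σ = θ/√2 relative to the density in (B.146) — an assumption of this sketch, as are θ, the sample size and the seed:

```python
import numpy as np

rng = np.random.default_rng(8)
theta = 3.0
x = rng.rayleigh(scale=theta / np.sqrt(2.0), size=200000)

theta_mme = 2.0 * x.mean() / np.sqrt(np.pi)  # method-of-moments (B.151)
theta_mle = np.sqrt(np.mean(x**2))           # maximum likelihood estimate
```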

B.26 Singh-Maddala Distribution - a, b, q

Probability Density Function:

f_X(x) = \frac{qa}{x}\left(\frac{x}{b}\right)^{a}\left(1+\left(\frac{x}{b}\right)^{a}\right)^{-(q+1)} for x \ge 0 with a, b, q > 0.   (B.153)

Moments:

E(X^r) = \frac{b^{r}}{(q-1)!}\,\Gamma\!\left(\frac{r}{a}+1\right)\Gamma\!\left(q-\frac{r}{a}\right)   (B.154)

E(X) = \frac{b}{(q-1)!}\,\Gamma\!\left(\frac{1}{a}+1\right)\Gamma\!\left(q-\frac{1}{a}\right)   (B.155)

\mathrm{var}(X) = \frac{b^{2}}{(q-1)!}\,\Gamma\!\left(\frac{2}{a}+1\right)\Gamma\!\left(q-\frac{2}{a}\right) - \left(\frac{b\,\Gamma\!\left(\frac{1}{a}+1\right)\Gamma\!\left(q-\frac{1}{a}\right)}{(q-1)!}\right)^{2}   (B.156)

Tail Weight: A decreasing hazard rate function indicates a heavy tail.

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. It is difficult to obtain convergence of the estimation algorithm.

B.27 Skew-normal Distribution - µ, σ, λ (Centered Parameterization)

Probability Density Function:

f(x) = \frac{2\sqrt{1-b^2\delta^2}}{\sigma}\,\phi\!\left(\frac{\sqrt{1-b^2\delta^2}}{\sigma}(x-\mu)+b\delta\right)\Phi\!\left(\lambda\left(\frac{\sqrt{1-b^2\delta^2}}{\sigma}(x-\mu)+b\delta\right)\right) for x \in \mathbb{R} with \mu, \lambda \in \mathbb{R}   (B.157)

with \Phi(\cdot) and \phi(\cdot) the cumulative distribution and probability density functions of the standard Normal distribution, where b = \sqrt{\frac{2}{\pi}} and \delta = \frac{\lambda}{\sqrt{1+\lambda^2}}.

Moments:

E(X) = \mu   (B.158)
\mathrm{var}(X) = \sigma^2   (B.159)
\gamma_1 = \frac{b\,\delta^3(2b^2-1)}{(1-b^2\delta^2)^{3/2}}   (B.160)

Tail Weight: No heavy tail.


Parameter Estimation:
Method-of-moments:

\hat{\mu} = \bar{x}   (B.161)

\hat{\sigma} = s_x, and   (B.162)

\hat{\lambda} = \frac{\hat{\delta}}{\sqrt{1-\hat{\delta}^2}},   (B.163)

where s_x is the sample standard deviation, and

\hat{\delta} = \sqrt{\frac{\hat{\gamma}_1^{2/3}}{b^2\hat{\gamma}_1^{2/3} + b^{2/3}(2b^2-1)^{2/3}}}

with

\hat{\gamma}_1 = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^3}{s_x^3}   (B.164)

Maximum likelihood: Numerical optimization of the log-likelihood function is used.

B.28 Snedecor’s F Distribution - ν1, ν2

Valid for values of x > 0 and ν1, ν2 > 0.

Moments:

E(X^r) = \left(\frac{\nu_2}{\nu_1}\right)^{r}\frac{\Gamma\!\left(\frac{\nu_1}{2}+r\right)\Gamma\!\left(\frac{\nu_2}{2}-r\right)}{\Gamma\!\left(\frac{\nu_1}{2}\right)\Gamma\!\left(\frac{\nu_2}{2}\right)}   (B.165)

E(X) = \frac{\nu_2}{\nu_2-2}   (B.166)

\mathrm{var}(X) = \frac{2\nu_2^2(\nu_1+\nu_2-2)}{\nu_1(\nu_2-2)^2(\nu_2-4)}   (B.167)

\mathrm{skewness}(X) = \frac{\frac{\nu_2^3(\nu_1+2)(\nu_1+4)}{\nu_1^2(\nu_2-2)(\nu_2-4)(\nu_2-6)} - 3\frac{\nu_2^3(\nu_1+2)}{\nu_1(\nu_2-2)^2(\nu_2-4)} + 2\frac{\nu_2^3}{(\nu_2-2)^3}}{\left(\frac{2\nu_2^2(\nu_1+\nu_2-2)}{\nu_1(\nu_2-2)^2(\nu_2-4)}\right)^{3/2}}   (B.168)

Tail Weight: A decreasing hazard rate function indicates a heavy tail.

© University of Pretoria

APPENDIX B. SUMMARY OF CONTINUOUS DISTRIBUTIONS 350

Parameter Estimation:
Method-of-moments:

\hat{\nu}_1 = \frac{2\bar{x}^2}{(s_x^2+\bar{x}^2)(2-\bar{x}) - \bar{x}^2}   (B.169)

and

\hat{\nu}_2 = \frac{2\bar{x}}{\bar{x}-1}   (B.170)

Maximum likelihood: Numerical optimization of the log-likelihood function is used. Convergence of the estimation algorithm is reached reasonably easily when using the method-of-moments estimates as initial values. These estimates are, however, not always very accurate.

B.29 Two-parameter Exponential Distribution - η, θ

Probability Density Function:

f_X(x) = \frac{1}{\theta}\,e^{-\frac{x-\eta}{\theta}} for x > \eta with -\infty < \eta < \infty and \theta > 0   (B.171)

Moment Generating Function:

M_X(t) = \frac{e^{t\eta}}{1-t\theta}   (B.172)

Moments:

E(X) = E(Y+\eta) = \theta + \eta   (B.173)
\mathrm{var}(X) = \theta^2   (B.174)
\mathrm{skewness}(X) = 2   (B.175)
\mathrm{kurtosis}(X) = 9   (B.176)

Tail Weight: No heavy tail.

Parameter Estimation:
Maximum likelihood:

\hat{\eta} = X_{(1)}, and   (B.177)

\hat{\theta} = \bar{X} - X_{(1)}   (B.178)
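A quick numeric check of (B.177)-(B.178) in Python, with the parameter values, sample size and seed assumed for the example:

```python
import numpy as np

rng = np.random.default_rng(9)
eta, theta = 5.0, 2.0
x = eta + rng.exponential(theta, size=100000)

eta_hat = x.min()               # (B.177): the sample minimum
theta_hat = x.mean() - x.min()  # (B.178)
```

The sample minimum overshoots η by roughly θ/n on average, so the bias vanishes quickly as n grows.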


B.30 Weibull Distribution - β, θ

Probability Density Function:

f_X(x) = \frac{\beta}{\theta^{\beta}}\,x^{\beta-1}e^{-\left(\frac{x}{\theta}\right)^{\beta}} for x > 0 with \beta, \theta > 0   (B.179)

Moments:

E(X^r) = \theta^{r}\,\Gamma\!\left(\frac{r}{\beta}+1\right) with r > -\beta   (B.180)

E(X) = \theta\,\Gamma\!\left(1+\frac{1}{\beta}\right)   (B.181)

\mathrm{var}(X) = \theta^2\left(\Gamma\!\left(1+\frac{2}{\beta}\right) - \Gamma\!\left(1+\frac{1}{\beta}\right)^{2}\right)   (B.182)

\mathrm{skewness}(X) = \frac{\Gamma\!\left(1+\frac{3}{\beta}\right) - 3\Gamma\!\left(1+\frac{1}{\beta}\right)\Gamma\!\left(1+\frac{2}{\beta}\right) + 2\Gamma\!\left(1+\frac{1}{\beta}\right)^{3}}{\left(\Gamma\!\left(1+\frac{2}{\beta}\right) - \Gamma\!\left(1+\frac{1}{\beta}\right)^{2}\right)^{3/2}}   (B.183)

Tail Weight: Heavy tail if \beta \in (0,1).

Parameter Estimation: Maximum likelihood: Numerical optimization of the log-likelihood function is used. The algorithm generally converges, given that suitable initial values for the parameters are chosen.
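A simulation check of the mean formula (B.181), with β, θ, the sample size and the seed assumed for the example (numpy's weibull sampler is unit-scale, so it is multiplied by θ):

```python
import math
import numpy as np

rng = np.random.default_rng(10)
beta, theta = 1.5, 2.0
x = theta * rng.weibull(beta, size=200000)

mean_theory = theta * math.gamma(1.0 + 1.0 / beta)  # (B.181)
mean_sample = x.mean()
```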


Bibliography

[1] A. Abtahi, J. Behboodian, and M. Sharafi. A general class of univari- ate skew distributions considering Stein’s lemma and infinite divisibil- ity. Metrika, 75(2):193–206, 2012.

[2] S. Adeyemi and M. Ojo. A genralization of the Gumbel distribution. Kragujevac Journal of Mathematics, 25:19–29, 2003.

[3] D. Akdemir and A. K. Gupta. A matrix variate skew distribution. European Journal of Pure and Applied Mathematics, 3(2):128–140, 2010.

[4] R. Ambagaspitiya. Ultimate ruin probability in the Sparre-Andersen model with dependent claim sizes and claim occurrence times. Insur- ance: Mathematics and Economics, 44(3):464–472, 2009.

[5] R. Arellano-Valle, H. G´omez,and F. Quintana. Statistical inference for a general class of asymmetric distributions. Journal of Statistical Planning and Inference, 128(2):427–443, 2005.

[6] E. Arjas. The claims reserving problem in non-life insurance: Some structural ideas. Astin Bulletin, 19(2):139–152, 1988.

[7] S. Asmussen, F. Avrin, and M. Usabel. Erlangian approximations for finite-horizon ruin probabilities. Astin Bulletin, 32(2):267–281, 2002.

[8] A. Azzalini. A class of distributions which includeds the normal ones. Scandinavian Journal of Statistics, 12(2):171–178, 1985.

[9] A. Azzalini and A. Capitanio. The Skew-Normal and Related Families. Cambridge University Press, 2014.

[10] A. Azzalini and M. Genton. Robust likelihood methods based on the skew-t and related distributions. International Statistical Review, 76(1):106–129, 2008.

[11] L. Bain and M. Engelhardt. Introduction to Probability and Mathe- matical Statistics. Wiley, 1999.

352

© University of Pretoria

BIBLIOGRAPHY 353

[12] T. Bali and P. Theodossiou. Risk measurement performance of alterna- tive distribution functions. Journal of Risk and Insurance, 75(2):411– 437, 2008.

[13] R. Barlow, A. Marshall, and F. Proschan. Properties of probability distributions with monotone hazard rates. The Annals of Mathemat- ical Statistics, 34(2):375–389, 1963.

[14] N. Barrera, M. Galea, S. Torres, and M. Villal´on. Class of skew-distributions: Theory and applications in biology. Statistics, 40(4):365–375, 2006.

[15] R. Bartle. The Elements of Real Analysis, chapter 33: Uniform Con- vergence and Infinite Integrals, page 283. John Wiley and Sons Inc., 1964.

[16] J. Beirlant, P. Vynckier, and J. Teugels. Tail index estimation, pareto quantile plots and regression diagnostics. Journal of the American Statistical Association, 91(436):1659–1667, 1996.

[17] J. Bernardo and A. Smith. Bayesian Theory. Wiley, 1993.

[18] A. Birnbaum. On the analysis of factorial experiments without repli- cation. Technometrics, I(4):9999, 1959.

[19] J. Birnbaum and J. Saunders. A new family of life distributions. Tech- nometrics, 33(1):51–60, 1991.

[20] P. J. Boland. Statistical and probabilistic methods in actuarial science. CRC Press, 2007.

[21] E. Bradlow, B. Hardie, and P. Fader. Bayesian inference for the nega- tive binomial distribution via polynomial expansions. Journal of Com- putational and Graphical Statistics, 11(1):189–201, 2002.

[22] M. Braun, P. Fader, E. Bradlow, and H. Kunreuther. Modeling the pseudodeductible in insurance claims decisions. Management Science, 52(8):1258–1272, 2006.

[23] M. Brown. Further monotonicity properties for specialized renewal processes. The Annals of Probability, 9(5):891–895, 1981.

[24] M. Bryson. Heavy-tailed distributions: Properties and tests. Techno- metrics, 16(1):61–68, 1974.

[25] K. Burnecki, G. Kukla, and R. Weron. Property insurance loss distri- butions. Physica, A(287):269–278, 2000.

© University of Pretoria

BIBLIOGRAPHY 354

[26] I. Burr. Cumulative frequency functions. Annals of Mathematical Statistics, 13:215–232, 1942.

[27] F. Chen, J. Zhu, and Z. Li. Upper bounds for the ruin probabilities of the entrance-based risk model. Communications in Statistics - Theory and Methods, 37(16):2634–2652, 2008.

[28] Y. Chen and C. Su. Finite time ruin probability with heavy- tailed insurance and financial risks. Statistics and Probability Letters, 76(16):1812–1820, 2006.

[29] C. Chiang. Stochastic processes. Encyclopedia of Biostatistics, 1998.

[30] S. Choi and R. Wette. Maximum likelihood estimation of the pa- rameters of the gamma distribution and their bias. Technometrics, 11(4):683–690, 1969.

[31] S. Coles. An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics, 2001.

[32] K. Cooray and M. Ananda. Modeling actuarial data with a composite lognormal-pareto model. Scandinavian Actuarial Journal, 5:321–334, 2005.

[33] H. Cossette, P. Gaillardetz, E. Marceau, and J. Rioux. On two dependent individual risk models. Insurance: Mathematics and Economics, 30:153–166, 2002.

[34] C. Dagum. A new model of personal income distribution: specification and estimation. Économie Appliquée, 30:413–437, 1977.

[35] C. Dagum. The generation and distribution of income, the Lorenz curve and the Gini ratio. Économie Appliquée, 33:327–367, 1980.

[36] C. Daniel. Use of half-normal plots in interpreting two-level experi- ments. Technometrics, 1(4):311–341, 1959.

[37] D. Davis. An analysis of some failure data. Journal of the American Statistical Association, 47(258):113–150, 1952.

[38] R. Davis and T. Mikosch. Extreme value theory for space-time processes with heavy-tailed distributions. Stochastic Processes and their Applications, 118(4):560–584, 2008.

[39] M. Denuit, X. Maréchal, S. Pitrebois, and J. Walhin. Actuarial Modelling of Claim Counts: Risk Classification, Credibility and Bonus-Malus Systems. John Wiley and Sons, Ltd, Chichester, UK, 2007.

[40] M. Diasparra and R. Romera. Inequalities for the ruin probability in a controlled discrete-time risk process. European Journal of Operational Research, 204:496–504, 2010.

[41] O. Dickerson, S. Katti, and A. Hofflander. Loss distributions in non-life insurance. The Journal of Insurance, 28(3):45–54, 1961.

[42] D. Dickson and H. Waters. The distribution of the time to ruin in the classical risk model. Astin Bulletin, 32(2):299–313, 2002.

[43] D. C. M. Dickson. Insurance risk and ruin. International Series on Actuarial Science. Cambridge: Cambridge University Press, 2005.

[44] F. Domma. Asymptotic distribution of the maximum likelihood estimators of the parameters of the right-truncated Dagum distribution. Communications in Statistics - Simulation and Computation, 36(6):1187–1199, 2007.

[45] S. Dubey. Compound gamma, beta and F distributions. Metrika, 16(1):28–31, 1970.

[46] R. Elandt. The folded normal distribution: Two methods of estimating parameters from moments. Technometrics, 3(4):551–562, 1961.

[47] P. Embrechts, R. Frey, and H. Furrer. Stochastic Processes in Insurance and Finance. To find, 1999.

[48] P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling Extremal Events for Insurance and Finance. Springer-Verlag, 1996.

[49] P. Embrechts and H. Schmidli. Ruin estimation for a general insurance risk model. Advances in Applied Probability, 26:404–422, 1994.

[50] P. England and R. Verrall. Stochastic claims reserving in general insurance. British Actuarial Journal, 8(III):443–544, 2002.

[51] Z. Fabián. New measures of central tendency and variability of continuous distributions. Communications in Statistics - Theory and Methods, 37(2):159–174, 2008.

[52] T. Ferguson. A Bayesian analysis of some nonparametric problems. The Annals of Statistics, 1(2):209–230, 1973.

[53] R. Fisher. On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London, Series A, 222:309–368, 1922.

[54] R. Fisher and L. Tippett. Limiting forms of the frequency distribution of the largest or smallest member of a sample. Mathematical Proceedings of the Cambridge Philosophical Society, 24:180–190, 1928.

[55] J. Folks and R. Chhikara. The inverse Gaussian distribution and its statistical applications - a review. Journal of the Royal Statistical Society. Series B (Methodological), 40(3):263–289, 1978.

[56] S. Foss, D. Korshunov, and S. Zachary. An Introduction to Heavy-Tailed and Subexponential Distributions. Springer, 2011.

[57] P. Giudici and S. Figini. Applied Data Mining for Business and Industry, Second Edition. John Wiley and Sons Ltd, 2009.

[58] G. Goerg. Lambert W random variables - a new family of generalized skewed distributions with applications to risk estimation. The Annals of Applied Statistics, 5(3):2197–2230, 2011.

[59] J. Greenwood, J. Landwehr, N. Matalas, and J. Wallis. Probability weighted moments: Definition and relation to parameters of several distributions expressable in inverse forms. Water Resources Research, 15:1049–1054, 1979.

[60] D. Guégan and J. Zhang. Change analysis of a dynamic copula for measuring dependence in multivariate financial data. Quantitative Finance, 10(4):421–430, 2010.

[61] R. Gupta and D. Kundu. Generalized logistic distributions. Journal of Applied Statistical Science, 18(1):51–66, 2010.

[62] S. Gupta, A. Qureishi, and B. Shah. Best linear unbiased estimators of the parameters of the logistic distribution using order statistics. Technometrics, 9(1):43–56, 1967.

[63] E. Hashorva. On the asymptotic distribution of certain bivariate reinsurance treaties. Insurance: Mathematics and Economics, 40(2):200–208, 2007.

[64] E. Hashorva and J. Hüsler. Estimation of tails and related quantities using the number of near-extremes. Communications in Statistics - Theory and Methods, 34(2):337–349, 2005.

[65] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer, 2001.

[66] C. Hewitt and B. Lefkowitz. Methods for fitting distributions to insurance loss data. Proceedings of the Casualty Actuarial Society, 66(125):139–160, 1979.

[67] L. Horvath and E. Willekens. Estimates for the probability of ruin starting with a large initial reserve. Insurance: Mathematics and Economics, 5:285–293, 1986.

[68] J. Hosking, J. Wallis, and E. Wood. Estimation of the generalized extreme-value distribution by the method of probability-weighted moments. Technometrics, 27(3):251–261, 1985.

[69] J. Jawitz. Moments of truncated continuous univariate distributions. Advances in Water Resources, 27:269–281, 2004.

[70] A. Jenkinson. The frequency distribution of the annual maximum (or minimum) values of meteorological elements. Quarterly Journal of the Royal Meteorological Society, 81:158–171, 1955.

[71] L. Jingzhi, W. Kaiyong, and W. Yuebao. Finite-time ruin probability with NQD dominated varying-tailed claims and NLOD inter-arrival times. Journal of Systems Science and Complexity, 22(3):407–414, 2009.

[72] C. Jordan. Calculus of finite differences. Chelsea Publishing Company, New York, 1960.

[73] B. Kim and H. Kim. Moments of claims in a Markovian environment. Insurance: Mathematics and Economics, 40(3):485–497, 2007.

[74] C. Kleiber. Dagum vs. Singh-Maddala income distributions. Economics Letters, 53:265–268, 1996.

[75] S. Klugman, H. Panjer, and G. Willmot. Loss Models, Third edition, pages 34–40, 333. John Wiley and Sons Inc., 2008.

[76] S. Kotz and S. Nadarajah. Extreme Value Distributions: Theory and Applications. Imperial College Press, 2000.

[77] D. Kundu, N. Kannan, and N. Balakrishnan. On the hazard function of Birnbaum-Saunders distribution and associated inference. Computational Statistics and Data Analysis, 52:2692–2702, 2008.

[78] A. Lemonte, F. Cribari-Neto, and K. Vasconcellos. Improved statistical inference for the two-parameter Birnbaum-Saunders distribution. Computational Statistics and Data Analysis, 51:4656–4681, 2007.

[79] F. Leone, L. Nelson, and R. Nottingham. The folded normal distribu- tion. Technometrics, 3(4):543–550, 1961.

[80] Q. Li and J. Racine. Nonparametric Econometrics: Theory and Practice. Princeton University Press, 2007.

[81] F. Lotti and E. Santarelli. Industry dynamics and the distribution of firm sizes: A nonparametric approach. Southern Economic Journal, 70(3):443–466, 2004.

[82] D. Luenberger. Investment Science, chapter 11: Models of Asset Dynamics, pages 308–309. Oxford University Press Inc., 1998.

[83] T. Mack. Distribution-free calculation of the standard error of chain ladder reserve estimates. Astin Bulletin, 23(2):213–225, 1993.

[84] T. Mack. Which stochastic model is underlying the chain ladder method? Insurance: Mathematics and Economics, 15:133–138, 1994.

[85] W. Mielke. Another family of distributions for describing and analyzing precipitation data. Journal of Applied Meteorology, 12:275–280, 1973.

[86] W. Mielke and E. Johnson. Three parameter kappa distribution: Maximum likelihood estimates and likelihood ratio tests. Monthly Weather Review, 101:701–707, 1973.

[87] D. Montgomery. Statistical Quality Control: A Modern Introduction, 6th edition, International Student Version. John Wiley and Sons, Inc, 2009.

[88] A. Müller. Ordering of risks: A comparative study via stop-loss transforms. Insurance: Mathematics and Economics, 17(1):215–222, 1996.

[89] S. Nadarajah and S. Kotz. Skew models I. Acta Applicandae Mathematicae, 98(1):1–28, 2007.

[90] S. Nadarajah and S. Kotz. Skew models II. Acta Applicandae Mathematicae, 98(1):29–46, 2007.

[91] S. Nadarajah and S. Bakar. New composite models for the Danish fire insurance data. Scandinavian Actuarial Journal, 1:1–8, 2012.

[92] S. Nadarajah and S. Kotz. Skew distributions generated from different families. Acta Applicandae Mathematicae, 91(1):1–37, 2006.

[93] Ö. Önalan. Financial risk management with normal inverse Gaussian distributions. International Research Journal of Finance and Economics, 38:104–115, 2010.

[94] A. Otten and M. Van Montfort. Maximum likelihood estimation of the general extreme-value distribution parameters. Journal of Hydrology, 47:187–192, 1980.

[95] V. Pacakova and B. Linda. Simulations of the extreme losses in non-life insurance. E and M Ekonomie A Management, 12(4):97–103, 2009.

[96] E. Pairman. Tables of the Digamma and Trigamma Functions. Cambridge University Press, 1954.

[97] P. Palmitesta and C. Provasi. Aggregation of dependent risks using the Koehler-Symanowski copula function. Computational Economics, 25(1–2):189–205, 2005.

[98] H. Park and H. Sohn. Parameter estimation of the generalized extreme value distribution for structural health monitoring. Probabilistic Engineering Mechanics, 21:366–376, 2006.

[99] M. Parmenter. Theory of Interest and Life Contingencies with Pension Applications: A Problem-Solving Approach, Third Edition. ACTEX Publications, Winsted, Connecticut, 1999.

[100] A. Pewsey. Problems of inference for Azzalini’s skew-normal distribution. Journal of Applied Statistics, 27(7):859–870, 2000.

[101] V. Preda and R. Ciumara. On composite models Weibull-Pareto and lognormal-Pareto. A comparative study. Romanian Journal of Economic Forecasting, 2:32–46, 2006.

[102] A. Renshaw and R. Verrall. A stochastic model underlying the chain-ladder technique. British Actuarial Journal, 4(IV):903–923, 1998.

[103] J. Rieck and J. Nedelman. A log-linear model for the Birnbaum-Saunders distribution. Technometrics, 33(1):51–60, 1991.

[104] T. Rolski. Point processes. Encyclopedia of Actuarial Science, 2006.

[105] T. Rolski, H. Schmidli, V. Schmidt, and J. Teugels. Stochastic Processes for Insurance and Finance. Wiley, 1999.

[106] M. Schäl. Stochastic optimization for the ruin probability. PAMM, 3(1):17–19, 2003.

[107] H. Schmidli. On minimizing the ruin probability by investment and reinsurance. The Annals of Applied Probability, 12(3):890–907, 2002.

[108] K. Schmidt and M. Zocher. The Bornhuetter-Ferguson principle. Variance, 2:85–110, 2008.

[109] J. Shuster. On the inverse Gaussian distribution function. Journal of the American Statistical Association, 63(324):1514–1516, 1968.

[110] D. Sondermann. Reinsurance in arbitrage-free markets. Insurance: Mathematics and Economics, 10(3):191–202, 1991.

[111] J. Stewart. Calculus, Early Transcendentals, Fifth Edition. Thomson Learning, 2003.

[112] A. Steyn, C. Smit, S. Du Toit, and C. Strasheim. Modern Statistics for Practice, chapter 5: Measures of Spread, Symmetry and Kurtosis, page 136. J.L. van Schaik Publishers, 1996.

[113] P. Tadikamalla. A look at the Burr and related distributions. International Statistical Review, 48(3):337–344, 1980.

[114] M. Taksar. Optimal risk and dividend distribution control models for an insurance company. Mathematical Methods of Operations Research, 51:1–42, 2000.

[115] Q. Tang. Asymptotic ruin probabilities in finite horizon with subexponential losses and associated discount factors. Probability in the Engineering and Informational Sciences, 20(1):103–113, 2006.

[116] Q. Tang, G. Wang, and K. Yuen. Uniform tail asymptotics for the stochastic present value of aggregate claims in the renewal risk model. Insurance Mathematics and Economics, 46(2):362–370, 2010.

[117] S. Teodorescu. On the truncated composite lognormal-Pareto model. Mathematical Reports, 12(62):71–84, 2010.

[118] J. Teugels. The class of subexponential distributions. The Annals of Probability, 3(6):1000–1011, 1975.

[119] S. Wang. A framework for pricing financial and insurance risks. Astin Bulletin, 32(2):213–234, 2002.

[120] X. Wei and Y. Hu. Ruin probabilities for discrete time risk models with stochastic rates of interest. Statistics and Probability Letters, 78(6):707–715, 2008.