Mälardalen University
Department of Mathematics and Physics
Code: MdH-IMa-2005:018

MASTER THESIS IN MATHEMATICS / APPLIED MATHEMATICS

Modelling Insurance Claim Sizes using the Mixture of Gamma & Reciprocal Gamma Distributions

by Ying Ni

DEPARTMENT OF MATHEMATICS AND PHYSICS
MÄLARDALEN UNIVERSITY
SE-721 23 VÄSTERÅS, SWEDEN

Master thesis in mathematics / applied mathematics
Date: 2005-12-20
Project name: Modelling Insurance Claim Sizes using the Mixture of Gamma & Reciprocal Gamma Distributions
Author: Ying Ni
Supervisors: Dmitrii Silvestrov, Anatoliy Malyarenko
Examiner: Dmitrii Silvestrov
Comprising: 20 points

Abstract

A gamma-reciprocal gamma mixture distribution is used to model the claim size distribution. It is specially designed to handle both ordinary and extremely large claims. The gamma distribution, which has the lighter tail, is meant to model the frequent small and moderate claims, while the reciprocal gamma distribution covers the large but infrequent claims. We begin by introducing the gamma distribution and motivating its role in modelling small and moderate claim sizes, follow with a discussion of the reciprocal gamma distribution and its application to modelling large claim sizes, and finally present the mixture of gamma and reciprocal gamma as the generally applicable statistical model. Two parameter estimation techniques, the method of moments and maximum likelihood estimation, are provided to fit the single gamma, single reciprocal gamma, and general mixture models. We explain in detail how the moment estimation equations and likelihood equations involved are solved, and we also evaluate the proposed moment estimation algorithm.
Two Java applications, the Java Mixture Model Simulation Program and the Java Mixture Model Estimation Program, were developed to facilitate the analysis. The former simulates samples that follow the mixture distribution given known parameters; the latter estimates the parameters for simplified cases given samples.

Keywords: Gamma distribution; Reciprocal gamma distribution; Mixture of gamma and reciprocal gamma; Claim size distribution; Large claim size distribution; Method of moments; Maximum likelihood estimation; Estimating the parameters of gamma; Estimating the parameters of reciprocal gamma

Notation & Abbreviations

Notation
Pr(…)      the probability of …
Γ(α)       gamma function, $\Gamma(\alpha) = \int_0^\infty x^{\alpha-1} e^{-x}\,dx$
ψ(α)       digamma function, $\psi(\alpha) = \frac{d}{d\alpha} \ln \Gamma(\alpha)$
ψ'(α)      trigamma function
γ(α; x)    "lower" incomplete gamma function, $\gamma(\alpha; x) = \int_0^x t^{\alpha-1} e^{-t}\,dt$
Γ(α; x)    "upper" incomplete gamma function, $\Gamma(\alpha; x) = \int_x^\infty t^{\alpha-1} e^{-t}\,dt$
L          likelihood function
ln L       log likelihood function
$\chi^2_\alpha$   chi-square distribution with α degrees of freedom

Abbreviations
cdf        cumulative distribution function
mgf        moment-generating function
MLE        maximum likelihood estimation, maximum likelihood estimator, or maximum likelihood estimate
R-gamma    reciprocal gamma (distribution)
pdf        probability density function

Table of Contents

1. Introduction
2. The Gamma Distribution
2.1. Definition, Moments & Properties
2.2. Applications in Claim Size Modelling
2.3. Estimation of Parameters
2.3.1. Method of Moments Estimation
2.3.2. Maximum Likelihood Estimation
3. The Reciprocal Gamma Distribution
3.1. Definition, Moments & Properties
3.2. Applications in Claim Size Modelling
3.3. Estimation of Parameters
3.3.1. Method of Moments Estimation
3.3.2. Maximum Likelihood Estimation
4. The Mixture of Gamma & Reciprocal Gamma
4.1. The Mixture of Gamma & Reciprocal Gamma
4.2. Estimation of Parameters
4.2.1. Method of Moments Estimation
4.2.2. Maximum Likelihood Estimation
5. Simulation Studies
5.1. Simulation
5.2. Estimation
6. Conclusion
7. References
8. Appendixes
Appendix A: Java Mixture Model Simulation Program - Users' Guide
Appendix B: Java Mixture Model Estimation Program - Users' Guide
Appendix C: Java Source Code: MixtureModelSimulation.java
Appendix D: Java Source Code: MixtureModelEstimation.java
Appendix E: Java Source Code: Newton_Solver.java

1. Introduction

It is not unusual that, in addition to a large number of "ordinary" policies, an insurance portfolio also contains a small number of policies under which much larger claim sizes are likely. Those large claims usually account for the greatest part of the indemnities paid by the insurer. It is therefore essential for the actuary to have a good model that takes those extremely large claims into consideration. In this paper, we introduce a new and generally applicable statistical model for individual insurance claim sizes, the mixture of gamma and reciprocal gamma model, which is specially designed to handle both ordinary and extremely large claims. This model uses the gamma distribution as the first component, with probability p, and the reciprocal gamma distribution as the second, with probability 1 − p.
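In symbols, the mixture just described can be sketched as follows. The names $f_{G}$ and $f_{RG}$ for the gamma and reciprocal gamma densities are illustrative shorthand introduced here (an assumption, not necessarily the notation used later in the thesis); the component densities themselves are specified in sections 2 and 3.

```latex
% Two-component mixture density for a claim size x > 0.
% f_G and f_RG are hypothetical names for the gamma and
% reciprocal gamma pdfs; p is the mixing probability.
f(x) = p\, f_{G}(x) + (1 - p)\, f_{RG}(x), \qquad 0 \le p \le 1 .
```

Because each component density integrates to one, so does the weighted sum, and hence the mixture is itself a proper density.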
The gamma distribution has the lighter tail and is meant to model the frequent, small claims, while the reciprocal gamma distribution covers the large but infrequent claims. If we set p = 1, the general model reduces to the single gamma model, which is important in its own right as a model for ordinary claim sizes, that is, data without heavy tails. At p = 0, it reduces to the single reciprocal gamma model, whose heavy tail lets it model the extremely large claims.

We shall discuss the statistical properties of the two component distributions, the gamma and the reciprocal gamma, and their advantages in modelling the two types of claims, namely the ordinary and the extremely large. We shall also be interested in the estimation of parameters: parameters are seldom known a priori, so they need to be estimated from claim data before the proposed distribution can be applied to any particular actuarial problem. The method of moments and maximum likelihood estimation are applied to these problems. Both estimation techniques are illustrated in detail for the gamma, the reciprocal gamma, and finally their mixture.

The aim of this paper is to motivate the general mixture model, specify and study its two components, and discuss the related parameter estimation techniques. The remainder of the paper proceeds as follows. In section 2, we describe the gamma distribution, its applications in claim size modelling, and the estimation of its parameters using the method of moments and maximum likelihood estimation. In section 3, we repeat the procedure for the reciprocal gamma distribution. The general mixture is presented in section 4, where the two estimation techniques for this relatively complicated case are also discussed.
Section 5 illustrates the simulation of the relevant random variables with the help of our Java Mixture Model Simulation Program, and evaluates the proposed moment estimation algorithm for simplified cases using the Java Mixture Model Estimation Program. Finally, section 6 concludes.

2. The Gamma Distribution

2.1 Definition, Moments & Properties

Probability density function (pdf)

The gamma distribution is a well-known, flexible continuous distribution that can model individual claim size. It has the pdf

$$f(x \mid \alpha, \lambda) = \frac{\lambda^{\alpha} x^{\alpha-1} e^{-\lambda x}}{\Gamma(\alpha)}, \qquad 0 \le x < \infty,\ \alpha > 0,\ \lambda > 0, \tag{2.1}$$

where Γ(α) is the gamma function defined by

$$\Gamma(\alpha) = \int_0^\infty x^{\alpha-1} e^{-x}\,dx. \tag{2.2}$$

When α = 1, direct integration verifies that Γ(1) = 1, and (2.1) becomes the exponential distribution. Integration by parts verifies that Γ(α + 1) = αΓ(α) for any α > 0. In particular, when n is a positive integer, Γ(n) = (n − 1)!; for example, Γ(6) = 5! = 120. When α is a positive integer, the gamma distribution is also called the Erlang distribution. If α is not an integer, Γ(α) can be found approximately by interpolating between the factorials, or from tables and software programs.

Figures 1a and 1b show the pdf for selected combinations of the parameters α and λ. Figures 1a, 1b and 2 were produced using MATLAB.

Figure 1a. Examples of the gamma pdf with fixed λ = 0.5 and various α

Figure 1b. Examples of the gamma pdf with fixed α = 3 and various λ

Alternatively, the gamma distribution can be parameterized in terms of a shape parameter α (the same α as in (2.1)) and a scale parameter β = 1/λ; the parameter λ in (2.1) is the inverse of the scale parameter and is sometimes called the rate parameter:

$$f(x \mid \alpha, \beta) = \frac{x^{\alpha-1} e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \qquad 0 \le x < \infty,\ \alpha > 0,\ \beta > 0. \tag{2.3}$$

Both (2.1) and (2.3) are commonly used; however, (2.1) is the form used throughout this paper and in the attached Java programs.

Cumulative