A New Family of Generalized Distributions

Total Page:16

File Type:pdf, Size:1020Kb

A New Family of Generalized Distributions Journal of Statistical Computation & Simulation Vol. 00, No. 00, August 2009, 1{17 A new family of generalized distributions Gauss M. Cordeiroa and M¶ariode Castrob¤ aUniversidade Federal Rural de Pernambuco, Departamento de Estat¶³stica e Inform¶atica, 52171-900, Recife-PE, Brasil; bUniversidade de S~aoPaulo, Instituto de Ci^encias Matem¶aticas e de Computa»c~ao,Caixa Postal 668, 13560-970, S~aoCarlos-SP, Brasil (Received December 7, 2009) Kumaraswamy [1] introduced a distribution for double bounded random processes with hy- drological applications. For the ¯rst time, based on this distribution, we describe a new family of generalized distributions (denoted with the pre¯x \Kw") to extend the normal, Weibull, gamma, Gumbel, inverse Gaussian distributions, among several well-known distri- butions. Some special distributions in the new family such as the Kw-normal, Kw-Weibull, Kw-gamma, Kw-Gumbel and Kw-inverse Gaussian distribution are discussed. We express the ordinary moments of any Kw generalized distribution as linear functions of probability weighted moments of the parent distribution. We also obtain the ordinary moments of order statistics as functions of probability weighted moments of the baseline distribution. We use the method of maximum likelihood to ¯t the distributions in the new class and illustrate the potentiality of the new model with an application to real data. Keywords: gamma distribution; Kumaraswamy distribution; moments; normal distribution; order statistics; Weibull distribution AMS Subject Classi¯cation: 62E10; 62F03; 62F05; 62F10 1. Introduction Beta distributions are very versatile and a variety of uncertainties can be usefully modeled by them. Many of the ¯nite range distributions encountered in practice can be easily transformed into the standard beta distribution. In econometrics, many times the data are modeled by ¯nite range distributions. Generalized beta distributions have been widely studied in statistics and numerous authors have developed various classes of these distributions. Eugene et al. [2] proposed a general class of distributions for a random variable de¯ned from the logit of the beta random variable by employing two parameters whose role is to introduce skewness and to vary tail weight. Following the work of Eugene et al. [2], who de¯ned the beta normal distribution, Nadarajah and Kotz [3] introduced the beta Gumbel distribution, Nadarajah and Gupta [4] proposed the beta Fr¶echet distribution and Nadarajah and Kotz [5] worked with the beta exponential distribution. However, all these works lead to some mathematical di±culties because the beta distribution is not fairly tractable and, in particular, its cumulative distribution function (cdf) involves the incomplete beta function ratio. The paper by Kumaraswamy [1] proposed a new probability distribution for dou- ble bounded random processes with hydrological applications. The Kumaraswamy's distribution appears to have received considerable interest in hydrology and related areas, see [6{9]. ¤Corresponding author. Email: [email protected] ISSN: 0094-9655 print/ISSN 1563-5163 online © 2009 Taylor & Francis DOI: 10.1080/0094965YYxxxxxxxx http://www.informaworld.com 2 G. M. Cordeiro and M. de Castro In reliability and life testing experiments, many times the data are modeled by ¯nite range distributions. See, for example, [10]. We start with the Kumaraswamy's distribution (called from now on the Kw distribution) on the interval (0; 1), having the probability density function (pdf) and the cdf with two shape parameters a > 0 and b > 0 de¯ned by f(x) = a b xa¡1(1 ¡ xa)b¡1 and F (x) = 1 ¡ (1 ¡ xa)b: (1) The density function in (1) has many of the same properties as the beta distribution but has some advantages in terms of tractability. The Kw distribution does not seem to be very familiar to statisticians and has not been investigated systematically in much detail before, nor has its relative in- terchangeability with the beta distribution has been widely appreciated. However, in a very recent paper, Jones [11] explored the background and genesis of the Kw distribution and, more importantly, made clear some similarities and di®erences between the beta and Kw distributions. For example, the Kw densities are also unimodal, uniantimodal, increasing, decreasing or constant depending in the same way as the beta distribution on the values of its parameters. He highlighted sev- eral advantages of the Kw distribution over the beta distribution: the normalizing constant is very simple; simple explicit formulae for the distribution and quantile functions which do not involve any special functions; a simple formula for random variate generation; explicit formulae for L-moments and simpler formulae for mo- ments of order statistics. Further, according to Jones [11], the beta distribution has the following advantages over the Kw distribution: simpler formulae for moments and moment generating function; a one-parameter sub-family of symmetric distri- butions; simpler moment estimation and more ways of generating the distribution via physical processes. Consider starting from a parent continuous distribution function G(x). A natural way of generating families of distributions on some other support from a simple starting parent distribution with pdf g(x) = dG(x)=dx is to apply the quantile function to a family of distributions on the interval (0; 1). We now combine the works of Eugene et al. [2] and Jones [11] (see also [12]) to construct a new class of Kw generalized (Kw-G) distributions. From an arbitrary parent cdf G(x), the cdf F (x) of the Kw-G distribution is de¯ned by F (x) = 1 ¡ f1 ¡ G(x)agb; (2) where a > 0 and b > 0 are two additional parameters whose role is to introduce skewness and to vary tail weights. Because of its tractable distribution function (2), the Kw-G distribution can be used quite e®ectively even if the data are censored. Correspondingly, the density function of this family of distributions has a very simple form f(x) = a b g(x) G(x)a¡1f1 ¡ G(x)agb¡1; (3) whereas the density of the beta-G distribution is given by 1 f(x) = g(x)G(x)a¡1 f1 ¡ G(x)gb¡1 ; (4) B(a; b) where B(¢; ¢) denotes the beta function. The basic di®erence (except for a scale multiplier) between (3) and (4) is the power of G(x) inside the braces. Clearly, for b = 1 both densities are identical. Journal of Statistical Computation & Simulation 3 The new density (3) has an advantage over the class of generalized beta distri- butions due to Eugene et al. [2], since it does not involve any special function. For each continuous name distribution (here name denotes the name of the parent dis- tribution), we can associate the Kw-name distribution with two extra parameters a and b from the cdf G(x) and pdf g(x) of the name distribution whose density function is de¯ned by formula (3). Special Kw generalized distributions can be generated as follows: the Kw-normal (KwN) distribution is obtained by taking G(x) in formula (3) to be the distribution function of the normal distribution. Analogously, the Kw-Weibull (KwW ), Kw- gamma (KwGa) and Kw-Gumbel (KwGu) distributions are obtained by taking G(x) to be the cdf of the Weibull, gamma and Gumbel distributions, respectively, among several others. Hence, each new Kw-G distribution can be obtained from a speci¯ed G distribution. The Kw distribution is a special case of the Kw-G distri- bution with G being the uniform distribution on [0; 1], whereas the G distribution is the distribution corresponding to a = b = 1. With a = 1, the Kw-G distribu- tion coincides with the beta-G distribution generated by the beta(1; b) distribution. Furthermore, for b = 1 and a being an integer, the Kw-G is the distribution of the maximum of a random sample of size a from G. One major bene¯t of the Kw family of generalized distributions is its ability of ¯tting skewed data that can not be properly ¯tted by existing distributions. In this article we deal with formula (3) in some generality. The mathematical properties of the Kw generalized family are usually much simpler to derive than those of the class of generalized beta distributions proposed by Eugene et al. [2]. Even if g(x) is a symmetric function around 0, then f(x) will not be a symmetric distribution even when a = b. From (1), if u is sampled from the uniform (0,1) distribution, then G¡1(f1 ¡ (1 ¡ u)1=bg1=a) is drawn from the Kw-G distribution. The paper is outlined as follows. Section 2 provides some special distributions in the Kw generalized family. In Section 3, we derive general expansions for the density of the Kw-G distribution as a function of the parent density g(x) multi- plied by power series in G(x) depending if a is integer or real non-integer. We can easily apply these expansions to several Kw-G distributions. Probability weighted moments (PWMs) are expectations of certain functions of a random variable and they can be de¯ned for any random variable whose ordinary moments exist. In Section 4, we derive two simple expansions for the moments of any Kw-G distri- bution as linear functions of PWMs of the G distribution which are valid if a is integer or real non-integer. We derive in Section 5 some expansions for the density of order statistics of the class of Kw-G distributions. In Section 6, PWMs are ob- tained for this class. Section 7 provides an alternative formula for moments of order statistics of the Kw-G distribution. The L-moments are also given in this section. Some inferential tools are discussed in Section 8. A real data set is analyzed by some distributions in the Kw-G family in Section 9.
Recommended publications
  • Concentration and Consistency Results for Canonical and Curved Exponential-Family Models of Random Graphs
    CONCENTRATION AND CONSISTENCY RESULTS FOR CANONICAL AND CURVED EXPONENTIAL-FAMILY MODELS OF RANDOM GRAPHS BY MICHAEL SCHWEINBERGER AND JONATHAN STEWART Rice University Statistical inference for exponential-family models of random graphs with dependent edges is challenging. We stress the importance of additional structure and show that additional structure facilitates statistical inference. A simple example of a random graph with additional structure is a random graph with neighborhoods and local dependence within neighborhoods. We develop the first concentration and consistency results for maximum likeli- hood and M-estimators of a wide range of canonical and curved exponential- family models of random graphs with local dependence. All results are non- asymptotic and applicable to random graphs with finite populations of nodes, although asymptotic consistency results can be obtained as well. In addition, we show that additional structure can facilitate subgraph-to-graph estimation, and present concentration results for subgraph-to-graph estimators. As an ap- plication, we consider popular curved exponential-family models of random graphs, with local dependence induced by transitivity and parameter vectors whose dimensions depend on the number of nodes. 1. Introduction. Models of network data have witnessed a surge of interest in statistics and related areas [e.g., 31]. Such data arise in the study of, e.g., social networks, epidemics, insurgencies, and terrorist networks. Since the work of Holland and Leinhardt in the 1970s [e.g., 21], it is known that network data exhibit a wide range of dependencies induced by transitivity and other interesting network phenomena [e.g., 39]. Transitivity is a form of triadic closure in the sense that, when a node k is connected to two distinct nodes i and j, then i and j are likely to be connected as well, which suggests that edges are dependent [e.g., 39].
    [Show full text]
  • Use of Statistical Tables
    TUTORIAL | SCOPE USE OF STATISTICAL TABLES Lucy Radford, Jenny V Freeman and Stephen J Walters introduce three important statistical distributions: the standard Normal, t and Chi-squared distributions PREVIOUS TUTORIALS HAVE LOOKED at hypothesis testing1 and basic statistical tests.2–4 As part of the process of statistical hypothesis testing, a test statistic is calculated and compared to a hypothesised critical value and this is used to obtain a P- value. This P-value is then used to decide whether the study results are statistically significant or not. It will explain how statistical tables are used to link test statistics to P-values. This tutorial introduces tables for three important statistical distributions (the TABLE 1. Extract from two-tailed standard Normal, t and Chi-squared standard Normal table. Values distributions) and explains how to use tabulated are P-values corresponding them with the help of some simple to particular cut-offs and are for z examples. values calculated to two decimal places. STANDARD NORMAL DISTRIBUTION TABLE 1 The Normal distribution is widely used in statistics and has been discussed in z 0.00 0.01 0.02 0.03 0.050.04 0.05 0.06 0.07 0.08 0.09 detail previously.5 As the mean of a Normally distributed variable can take 0.00 1.0000 0.9920 0.9840 0.9761 0.9681 0.9601 0.9522 0.9442 0.9362 0.9283 any value (−∞ to ∞) and the standard 0.10 0.9203 0.9124 0.9045 0.8966 0.8887 0.8808 0.8729 0.8650 0.8572 0.8493 deviation any positive value (0 to ∞), 0.20 0.8415 0.8337 0.8259 0.8181 0.8103 0.8206 0.7949 0.7872 0.7795 0.7718 there are an infinite number of possible 0.30 0.7642 0.7566 0.7490 0.7414 0.7339 0.7263 0.7188 0.7114 0.7039 0.6965 Normal distributions.
    [Show full text]
  • The Exponential Family 1 Definition
    The Exponential Family David M. Blei Columbia University November 9, 2016 The exponential family is a class of densities (Brown, 1986). It encompasses many familiar forms of likelihoods, such as the Gaussian, Poisson, multinomial, and Bernoulli. It also encompasses their conjugate priors, such as the Gamma, Dirichlet, and beta. 1 Definition A probability density in the exponential family has this form p.x / h.x/ exp >t.x/ a./ ; (1) j D f g where is the natural parameter; t.x/ are sufficient statistics; h.x/ is the “base measure;” a./ is the log normalizer. Examples of exponential family distributions include Gaussian, gamma, Poisson, Bernoulli, multinomial, Markov models. Examples of distributions that are not in this family include student-t, mixtures, and hidden Markov models. (We are considering these families as distributions of data. The latent variables are implicitly marginalized out.) The statistic t.x/ is called sufficient because the probability as a function of only depends on x through t.x/. The exponential family has fundamental connections to the world of graphical models (Wainwright and Jordan, 2008). For our purposes, we’ll use exponential 1 families as components in directed graphical models, e.g., in the mixtures of Gaussians. The log normalizer ensures that the density integrates to 1, Z a./ log h.x/ exp >t.x/ d.x/ (2) D f g This is the negative logarithm of the normalizing constant. The function h.x/ can be a source of confusion. One way to interpret h.x/ is the (unnormalized) distribution of x when 0. It might involve statistics of x that D are not in t.x/, i.e., that do not vary with the natural parameter.
    [Show full text]
  • A Skew Extension of the T-Distribution, with Applications
    J. R. Statist. Soc. B (2003) 65, Part 1, pp. 159–174 A skew extension of the t-distribution, with applications M. C. Jones The Open University, Milton Keynes, UK and M. J. Faddy University of Birmingham, UK [Received March 2000. Final revision July 2002] Summary. A tractable skew t-distribution on the real line is proposed.This includes as a special case the symmetric t-distribution, and otherwise provides skew extensions thereof.The distribu- tion is potentially useful both for modelling data and in robustness studies. Properties of the new distribution are presented. Likelihood inference for the parameters of this skew t-distribution is developed. Application is made to two data modelling examples. Keywords: Beta distribution; Likelihood inference; Robustness; Skewness; Student’s t-distribution 1. Introduction Student’s t-distribution occurs frequently in statistics. Its usual derivation and use is as the sam- pling distribution of certain test statistics under normality, but increasingly the t-distribution is being used in both frequentist and Bayesian statistics as a heavy-tailed alternative to the nor- mal distribution when robustness to possible outliers is a concern. See Lange et al. (1989) and Gelman et al. (1995) and references therein. It will often be useful to consider a further alternative to the normal or t-distribution which is both heavy tailed and skew. To this end, we propose a family of distributions which includes the symmetric t-distributions as special cases, and also includes extensions of the t-distribution, still taking values on the whole real line, with non-zero skewness. Let a>0 and b>0be parameters.
    [Show full text]
  • On a Problem Connected with Beta and Gamma Distributions by R
    ON A PROBLEM CONNECTED WITH BETA AND GAMMA DISTRIBUTIONS BY R. G. LAHA(i) 1. Introduction. The random variable X is said to have a Gamma distribution G(x;0,a)if du for x > 0, (1.1) P(X = x) = G(x;0,a) = JoT(a)" 0 for x ^ 0, where 0 > 0, a > 0. Let X and Y be two independently and identically distributed random variables each having a Gamma distribution of the form (1.1). Then it is well known [1, pp. 243-244], that the random variable W = X¡iX + Y) has a Beta distribution Biw ; a, a) given by 0 for w = 0, (1.2) PiW^w) = Biw;x,x)=\ ) u"-1il-u)'-1du for0<w<l, Ío T(a)r(a) 1 for w > 1. Now we can state the converse problem as follows : Let X and Y be two independently and identically distributed random variables having a common distribution function Fix). Suppose that W = Xj{X + Y) has a Beta distribution of the form (1.2). Then the question is whether £(x) is necessarily a Gamma distribution of the form (1.1). This problem was posed by Mauldon in [9]. He also showed that the converse problem is not true in general and constructed an example of a non-Gamma distribution with this property using the solution of an integral equation which was studied by Goodspeed in [2]. In the present paper we carry out a systematic investigation of this problem. In §2, we derive some general properties possessed by this class of distribution laws Fix).
    [Show full text]
  • The Probability Lifesaver: Order Statistics and the Median Theorem
    The Probability Lifesaver: Order Statistics and the Median Theorem Steven J. Miller December 30, 2015 Contents 1 Order Statistics and the Median Theorem 3 1.1 Definition of the Median 5 1.2 Order Statistics 10 1.3 Examples of Order Statistics 15 1.4 TheSampleDistributionoftheMedian 17 1.5 TechnicalboundsforproofofMedianTheorem 20 1.6 TheMedianofNormalRandomVariables 22 2 • Greetings again! In this supplemental chapter we develop the theory of order statistics in order to prove The Median Theorem. This is a beautiful result in its own, but also extremely important as a substitute for the Central Limit Theorem, and allows us to say non- trivial things when the CLT is unavailable. Chapter 1 Order Statistics and the Median Theorem The Central Limit Theorem is one of the gems of probability. It’s easy to use and its hypotheses are satisfied in a wealth of problems. Many courses build towards a proof of this beautiful and powerful result, as it truly is ‘central’ to the entire subject. Not to detract from the majesty of this wonderful result, however, what happens in those instances where it’s unavailable? For example, one of the key assumptions that must be met is that our random variables need to have finite higher moments, or at the very least a finite variance. What if we were to consider sums of Cauchy random variables? Is there anything we can say? This is not just a question of theoretical interest, of mathematicians generalizing for the sake of generalization. The following example from economics highlights why this chapter is more than just of theoretical interest.
    [Show full text]
  • Notes Mean, Median, Mode & Range
    Notes Mean, Median, Mode & Range How Do You Use Mode, Median, Mean, and Range to Describe Data? There are many ways to describe the characteristics of a set of data. The mode, median, and mean are all called measures of central tendency. These measures of central tendency and range are described in the table below. The mode of a set of data Use the mode to show which describes which value occurs value in a set of data occurs most frequently. If two or more most often. For the set numbers occur the same number {1, 1, 2, 3, 5, 6, 10}, of times and occur more often the mode is 1 because it occurs Mode than all the other numbers in the most frequently. set, those numbers are all modes for the data set. If each number in the set occurs the same number of times, the set of data has no mode. The median of a set of data Use the median to show which describes what value is in the number in a set of data is in the middle if the set is ordered from middle when the numbers are greatest to least or from least to listed in order. greatest. If there are an even For the set {1, 1, 2, 3, 5, 6, 10}, number of values, the median is the median is 3 because it is in the average of the two middle the middle when the numbers are Median values. Half of the values are listed in order. greater than the median, and half of the values are less than the median.
    [Show full text]
  • The Kumaraswamy Pareto IV Distribution
    Austrian Journal of Statistics July 2021, Volume 50, 1{22. AJShttp://www.ajs.or.at/ doi:10.17713/ajs.v50i5.96 The Kumaraswamy Pareto IV Distribution M. H. Tahir Gauss M. Cordeiro M. Mansoor M. Zubair Ayman Alzaatreh Islamia University Federal University Govt. S.E. College Govt. S.E. College American University of Bahawapur of Pernambuco Bahawalpur Bahawalpur of Sharjah Abstract We introduce a new model named the Kumaraswamy Pareto IV distribution which extends the Pareto and Pareto IV distributions. The density function is very flexible and can be left-skewed, right-skewed and symmetrical shapes. It has increasing, decreasing, upside-down bathtub, bathtub, J and reversed-J shaped hazard rate shapes. Various structural properties are derived including explicit expressions for the quantile function, ordinary and incomplete moments, Bonferroni and Lorenz curves, mean deviations, mean residual life, mean waiting time, probability weighted moments and generating function. We provide the density function of the order statistics and their moments. The R´enyi and q entropies are also obtained. The model parameters are estimated by the method of maximum likelihood and the observed information matrix is determined. The usefulness of the new model is illustrated by means of three real-life data sets. In fact, our proposed model provides a better fit to these data than the gamma-Pareto IV, gamma-Pareto, beta-Pareto, exponentiated Pareto and Pareto IV models. Keywords: Arnold's Pareto, Kumaraswamy-G class, Pareto family, reliability. 1. Introduction The Pareto distribution and its generalizations are tractable statistical models for scien- tists, economists and engineers. These distributions cover a wide range of applications espe- cially in economics, finance, actuarial science, risk theory, reliability, telecommunications and medicine.
    [Show full text]
  • A Family of Skew-Normal Distributions for Modeling Proportions and Rates with Zeros/Ones Excess
    S S symmetry Article A Family of Skew-Normal Distributions for Modeling Proportions and Rates with Zeros/Ones Excess Guillermo Martínez-Flórez 1, Víctor Leiva 2,* , Emilio Gómez-Déniz 3 and Carolina Marchant 4 1 Departamento de Matemáticas y Estadística, Facultad de Ciencias Básicas, Universidad de Córdoba, Montería 14014, Colombia; [email protected] 2 Escuela de Ingeniería Industrial, Pontificia Universidad Católica de Valparaíso, 2362807 Valparaíso, Chile 3 Facultad de Economía, Empresa y Turismo, Universidad de Las Palmas de Gran Canaria and TIDES Institute, 35001 Canarias, Spain; [email protected] 4 Facultad de Ciencias Básicas, Universidad Católica del Maule, 3466706 Talca, Chile; [email protected] * Correspondence: [email protected] or [email protected] Received: 30 June 2020; Accepted: 19 August 2020; Published: 1 September 2020 Abstract: In this paper, we consider skew-normal distributions for constructing new a distribution which allows us to model proportions and rates with zero/one inflation as an alternative to the inflated beta distributions. The new distribution is a mixture between a Bernoulli distribution for explaining the zero/one excess and a censored skew-normal distribution for the continuous variable. The maximum likelihood method is used for parameter estimation. Observed and expected Fisher information matrices are derived to conduct likelihood-based inference in this new type skew-normal distribution. Given the flexibility of the new distributions, we are able to show, in real data scenarios, the good performance of our proposal. Keywords: beta distribution; centered skew-normal distribution; maximum-likelihood methods; Monte Carlo simulations; proportions; R software; rates; zero/one inflated data 1.
    [Show full text]
  • The Kumaraswamy Generalized Gamma Distribution with Application in Survival Analysis
    The Kumaraswamy Generalized Gamma Distribution with Application in Survival Analysis Marcelino A. R. de Pascoa and Edwin M. M. Ortega1 Department of Exact Sciences Universidade de São Paulo Gauss M. Cordeiro Department of Statistics and Informatics Universidade Federal Rural de Pernambuco Abstract We introduce and study the so-called Kumaraswamy generalized gamma distribution that is ca- pable of modeling bathtub-shaped hazard rate functions. The beauty and importance of this dis- tribution lies in its ability to model monotone and non-monotone failure rate functions, which are quite common in lifetime data analysis and reliability. The new distribution has a large number of well-known lifetime special sub-models such as the exponentiated generalized gamma, exponentiated Weibull, exponentiated generalized half-normal, exponentiated gamma, generalized Rayleigh, among others. Some structural properties of the new distribution are studied. We obtain two infinite sum representations for the moments and an expansion for the generating function. We calculate the den- sity function of the order statistics and an expansion for their moments. The method of maximum likelihood and a Bayesian procedure are adopted for estimating the model parameters. The usefulness of the new distribution is illustrated in two real data sets. Keywords: Censored data; Data analysis; Generalized gamma distribution; Maximum likelihood estimation; Moment; Order statistic. 1 Introduction The gamma distribution is the most popular model for analyzing skewed data. The generalized gamma distribution (GG) was introduced by Stacy (1962) and includes as special sub-models: the exponential, Weibull, gamma and Rayleigh distributions, among others. It is suitable for modeling data with different forms of hazard rate function: increasing, decreasing, in the form of bathtub and unimodal.
    [Show full text]
  • Sampling Student's T Distribution – Use of the Inverse Cumulative
    Sampling Student’s T distribution – use of the inverse cumulative distribution function William T. Shaw Department of Mathematics, King’s College, The Strand, London WC2R 2LS, UK With the current interest in copula methods, and fat-tailed or other non-normal distributions, it is appropriate to investigate technologies for managing marginal distributions of interest. We explore “Student’s” T distribution, survey its simulation, and present some new techniques for simulation. In particular, for a given real (not necessarily integer) value n of the number of degrees of freedom, −1 we give a pair of power series approximations for the inverse, Fn ,ofthe cumulative distribution function (CDF), Fn.Wealsogivesomesimpleandvery fast exact and iterative techniques for defining this function when n is an even −1 integer, based on the observation that for such cases the calculation of Fn amounts to the solution of a reduced-form polynomial equation of degree n − 1. We also explain the use of Cornish–Fisher expansions to define the inverse CDF as the composition of the inverse CDF for the normal case with a simple polynomial map. The methods presented are well adapted for use with copula and quasi-Monte-Carlo techniques. 1 Introduction There is much interest in many areas of financial modeling on the use of copulas to glue together marginal univariate distributions where there is no easy canonical multivariate distribution, or one wishes to have flexibility in the mechanism for combination. One of the more interesting marginal distributions is the “Student’s” T distribution. This statistical distribution was published by W. Gosset in 1908.
    [Show full text]
  • Lecture 2 — September 24 2.1 Recap 2.2 Exponential Families
    STATS 300A: Theory of Statistics Fall 2015 Lecture 2 | September 24 Lecturer: Lester Mackey Scribe: Stephen Bates and Andy Tsao 2.1 Recap Last time, we set out on a quest to develop optimal inference procedures and, along the way, encountered an important pair of assertions: not all data is relevant, and irrelevant data can only increase risk and hence impair performance. This led us to introduce a notion of lossless data compression (sufficiency): T is sufficient for P with X ∼ Pθ 2 P if X j T (X) is independent of θ. How far can we take this idea? At what point does compression impair performance? These are questions of optimal data reduction. While we will develop general answers to these questions in this lecture and the next, we can often say much more in the context of specific modeling choices. With this in mind, let's consider an especially important class of models known as the exponential family models. 2.2 Exponential Families Definition 1. The model fPθ : θ 2 Ωg forms an s-dimensional exponential family if each Pθ has density of the form: s ! X p(x; θ) = exp ηi(θ)Ti(x) − B(θ) h(x) i=1 • ηi(θ) 2 R are called the natural parameters. • Ti(x) 2 R are its sufficient statistics, which follows from NFFC. • B(θ) is the log-partition function because it is the logarithm of a normalization factor: s ! ! Z X B(θ) = log exp ηi(θ)Ti(x) h(x)dµ(x) 2 R i=1 • h(x) 2 R: base measure.
    [Show full text]