Distributions.jl Documentation Release 0.6.3
Recommended publications
Suitability of Different Probability Distributions for Performing Schedule Risk Simulations in Project Management
2016 Proceedings of PICMET '16: Technology Management for Social Innovation. J. Krige Visser, Department of Engineering and Technology Management, University of Pretoria, Pretoria, South Africa.

Abstract--Project managers are often confronted with the question on what is the probability of finishing a project within budget or finishing a project on time. One method or tool that is useful in answering these questions at various stages of a project is to develop a Monte Carlo simulation for the cost or duration of the project and to update and repeat the simulations with actual data as the project progresses. The PERT method became popular in the 1950's to express the uncertainty in the duration of activities. Many other distributions are available for use in cost or schedule simulations. This paper discusses the results of a project to investigate the output of schedule simulations when different distributions, e.g. triangular, normal, lognormal or betaPert, are used to express ...

The Project Evaluation and Review Technique (PERT), in conjunction with the Critical Path Method (CPM), were developed in the 1950's to address the uncertainty in project duration for complex projects [11], [13]. The expected value or mean value for each activity of the project network was calculated by applying the beta distribution and three estimates for the duration of the activity. The total project duration was determined by adding all the duration values of the activities on the critical path. The Monte Carlo simulation (MCS) provides a distribution for the total project duration and is therefore more useful as a method or tool for decision making.
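The contrast between the PERT point estimate and a Monte Carlo schedule simulation can be sketched in a few lines of Python. This is only an illustration, not the paper's method: the weighting (a + 4m + b) / 6 is the classical PERT mean and is an assumption here, since the excerpt does not state the formula; the three activities and their estimates are hypothetical; and the triangular distribution is just one of the candidate distributions the abstract mentions.

```python
import random

def pert_mean(optimistic, most_likely, pessimistic):
    # Classical PERT expected duration from three estimates (assumed formula).
    return (optimistic + 4.0 * most_likely + pessimistic) / 6.0

def simulate_total_duration(activities, n_runs=10_000, seed=0):
    # Monte Carlo: draw each critical-path activity from a triangular
    # distribution and add the draws to get one total project duration.
    rng = random.Random(seed)
    totals = []
    for _ in range(n_runs):
        # random.triangular takes (low, high, mode).
        totals.append(sum(rng.triangular(a, b, m) for a, m, b in activities))
    return totals

# Hypothetical critical path: (optimistic, most likely, pessimistic) in days.
critical_path = [(2.0, 4.0, 8.0), (3.0, 5.0, 10.0), (1.0, 2.0, 4.0)]

pert_total = sum(pert_mean(*activity) for activity in critical_path)
totals = simulate_total_duration(critical_path)
deadline = 14.0  # hypothetical deadline in days
prob_on_time = sum(t <= deadline for t in totals) / len(totals)
```

The PERT calculation yields a single number (`pert_total`), whereas the simulation yields a whole sample of totals from which a quantity such as `prob_on_time` can be read off, which is the advantage the excerpt attributes to Monte Carlo simulation.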
Arcsine Laws for Random Walks Generated from Random Permutations with Applications to Genomics
Applied Probability Trust (4 February 2021)

Xiao Fang, The Chinese University of Hong Kong; Han Liang Gan, Northwestern University; Susan Holmes, Stanford University; Haiyan Huang, University of California, Berkeley; Erol Peköz, Boston University; Adrian Röllin, National University of Singapore; Wenpin Tang, Columbia University.

Postal address (Fang): Department of Statistics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. Postal address (Gan): Department of Mathematics, Northwestern University, 2033 Sheridan Road, Evanston, IL 60208. Current address (Gan): University of Waikato, Private Bag 3105, Hamilton 3240, New Zealand.

Abstract: A classical result for the simple symmetric random walk with 2n steps is that the number of steps above the origin, the time of the last visit to the origin, and the time of the maximum height all have exactly the same distribution and converge when scaled to the arcsine law. Motivated by applications in genomics, we study the distributions of these statistics for the non-Markovian random walk generated from the ascents and descents of a uniform random permutation and a Mallows(q) permutation and show that they have the same asymptotic distributions as for the simple random walk.
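The classical result quoted in the abstract can be checked exactly for a short walk. The sketch below enumerates every simple symmetric walk of 2n steps and compares the distribution of the last visit to the origin with the discrete arcsine law u(k) u(n-k), where u(k) = C(2k, k) / 4^k. That closed form is the standard textbook statement and is an assumption here, since the excerpt does not write it out; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import product
from math import comb

def last_visit_distribution(n):
    # Exact distribution of the time 2k of the last visit to the origin,
    # obtained by enumerating all 2**(2n) equally likely paths.
    counts = [0] * (n + 1)
    for steps in product((1, -1), repeat=2 * n):
        position, last = 0, 0
        for t, step in enumerate(steps, start=1):
            position += step
            if position == 0:
                last = t
        counts[last // 2] += 1
    total = 2 ** (2 * n)
    return [Fraction(c, total) for c in counts]

def u(k):
    # Probability that the walk is at the origin at time 2k.
    return Fraction(comb(2 * k, k), 4 ** k)

def discrete_arcsine(n):
    # Discrete arcsine law for index k = 0..n (assumed closed form).
    return [u(k) * u(n - k) for k in range(n + 1)]

enumerated = last_visit_distribution(4)
closed_form = discrete_arcsine(4)
```

Exhaustive enumeration grows as 4^n, so it is only practical for small n; for larger walks one would sample paths instead.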
Final Paper (PDF)
Analytic Method for Probabilistic Cost and Schedule Risk Analysis. Final Report, 5 April 2013.

Prepared for: National Aeronautics and Space Administration (NASA), Office of Program Analysis and Evaluation (PA&E), Cost Analysis Division (CAD). Felecia L. London, Contracting Officer, NASA Goddard Space Flight Center, Procurement Operations Division, Office for Headquarters Procurement, 210.H. Phone: 301-286-6693. Fax: 301-286-1746. Contract Number: NNHl OPR24Z. Order Number: NNH12PV48D.

Prepared by: Raymond P. Covert, Covarus, LLC, under subcontract to Galorath Incorporated.

Table of Contents
1 Executive Summary
2 Introduction
2.1 Probabilistic Nature of Estimates
2.2 Uncertainty and Risk
2.2.1 Probability Density and Probability Mass
2.2.2 Cumulative Probability
2.2.3 Definition of Risk
2.3
Package 'Distributional'
Package ‘distributional’
February 2, 2021
Title: Vectorised Probability Distributions
Version: 0.2.2
Description: Vectorised distribution objects with tools for manipulating, visualising, and using probability distributions. Designed to allow model prediction outputs to return distributions rather than their parameters, allowing users to directly interact with predictive distributions in a data-oriented workflow. In addition to providing generic replacements for p/d/q/r functions, other useful statistics can be computed including means, variances, intervals, and highest density regions.
License: GPL-3
Imports: vctrs (>= 0.3.0), rlang (>= 0.4.5), generics, ellipsis, stats, numDeriv, ggplot2, scales, farver, digest, utils, lifecycle
Suggests: testthat (>= 2.1.0), covr, mvtnorm, actuar, ggdist
RdMacros: lifecycle
URL: https://pkg.mitchelloharawild.com/distributional/, https://github.com/mitchelloharawild/distributional
BugReports: https://github.com/mitchelloharawild/distributional/issues
Encoding: UTF-8
Language: en-GB
LazyData: true
Roxygen: list(markdown = TRUE, roclets=c('rd', 'collate', 'namespace'))
RoxygenNote: 7.1.1

R topics documented: autoplot.distribution, cdf, density.distribution, dist_bernoulli, dist_beta, dist_binomial, dist_burr, dist_cauchy, dist_chisq, dist_degenerate, dist_exponential, dist_f, dist_gamma, dist_geometric, dist_gumbel, dist_hypergeometric, dist_inflated, dist_inverse_exponential, dist_inverse_gamma
Statistical Distributions in General Insurance Stochastic Processes
By Jan Hendrik Harm Steenkamp. Submitted in partial fulfilment of the requirements for the degree Magister Scientiae in the Department of Statistics, in the Faculty of Natural & Agricultural Sciences, University of Pretoria, Pretoria, January 2014. © University of Pretoria

I, Jan Hendrik Harm Steenkamp, declare that the dissertation, which I hereby submit for the degree Magister Scientiae in Mathematical Statistics at the University of Pretoria, is my own work and has not previously been submitted for a degree at this or any other tertiary institution. SIGNATURE: DATE: 31 January 2014

Summary: A general insurance risk model consists of an initial reserve, the premiums collected, the return on investment of these premiums, the claims frequency and the claims sizes. Except for the initial reserve, these components are all stochastic. The assumption of the distributions of the claims sizes is an integral part of the model and can greatly influence decisions on reinsurance agreements and ruin probabilities. An array of parametric distributions are available for use in describing the distribution of claims. The study is focussed on parametric distributions that have positive skewness and are defined for positive real values. The main properties and parameterizations are studied for a number of distributions. Maximum likelihood estimation and method-of-moments estimation are considered as techniques for fitting these distributions. Multivariate numerical maximum likelihood estimation algorithms are proposed together with discussions on the efficiency of each of the estimation algorithms based on simulation exercises. These discussions are accompanied with programs developed in SAS PROC IML that can be used to simulate from the various parametric distributions and to fit these parametric distributions to observed data.
Free Infinite Divisibility and Free Multiplicative Mixtures of the Wigner Distribution
Victor Pérez-Abreu and Noriyoshi Sakuma. Comunicación del CIMAT No I-09-0715/ -10-2009 (PE/CIMAT).

Free Infinite Divisibility of Free Multiplicative Mixtures of the Wigner Distribution. Victor Pérez-Abreu, Department of Probability and Statistics, CIMAT, Apdo. Postal 402, Guanajuato Gto. 36000, Mexico. Noriyoshi Sakuma, Department of Mathematics, Keio University, 3-14-1, Hiyoshi, Yokohama 223-8522, Japan. March 19, 2010.

Abstract: Let I∗ and I be the classes of all classical infinitely divisible distributions and free infinitely divisible distributions, respectively, and let Λ be the Bercovici-Pata bijection between I∗ and I. The class type W of symmetric distributions in I that can be represented as free multiplicative convolutions of the Wigner distribution is studied. A characterization of this class under the condition that the mixing distribution is 2-divisible with respect to free multiplicative convolution is given. A correspondence between symmetric distributions in I and the free counterpart under Λ of the positive distributions in I∗ is established. It is shown that the class type W does not include all symmetric distributions in I and that it does not coincide with the image under Λ of the mixtures of the Gaussian distribution in I∗. Similar results for free multiplicative convolutions with the symmetric arcsine measure are obtained. Several well-known and new concrete examples are presented.

AMS 2000 Subject Classification: 46L54, 15A52. Keywords: Free convolutions, type G law, free stable law, free compound distribution, Bercovici-Pata bijection.
Location-Scale Distributions
Location–Scale Distributions: Linear Estimation and Probability Plotting Using MATLAB. Horst Rinne. Copyright: Prof. em. Dr. Horst Rinne, Department of Economics and Management Science, Justus–Liebig–University, Giessen, Germany.

Contents
Preface
List of Figures
List of Tables
1 The family of location–scale distributions
1.1 Properties of location–scale distributions
1.2 Genuine location–scale distributions: A short listing
1.3 Distributions transformable to location–scale type
2 Order statistics
2.1 Distributional concepts
2.2 Moments of order statistics
2.2.1 Definitions and basic formulas
2.2.2 Identities, recurrence relations and approximations
2.3 Functions of order statistics
3 Statistical graphics
3.1 Some historical remarks
3.2 The role of graphical methods in statistics
3.2.1 Graphical versus numerical techniques
3.2.2 Manipulation with graphs and graphical perception
3.2.3 Graphical displays in statistics
3.3 Distribution assessment by graphs
3.3.1 PP–plots and QQ–plots
3.3.2 Probability paper and plotting positions
3.3.3 Hazard plot
3.3.4 TTT–plot
4 Linear estimation: Theory and methods
4.1 Types of sampling data
4.2 Estimators based on moments of order statistics
4.2.1 GLS estimators
4.2.1.1 GLS for a general location–scale distribution
4.2.1.2 GLS for a symmetric location–scale distribution
4.2.1.3 GLS and censored samples
Section III Non-Parametric Distribution Functions
DRAFT 2-6-98. DO NOT CITE OR QUOTE

LIST OF ATTACHMENTS
ATTACHMENT 1: Glossary
ATTACHMENT 2: Probabilistic Risk Assessments and Monte-Carlo Methods: A Brief Introduction
  Tiered Approach to Risk Assessment
  The Origin of Monte-Carlo Techniques
  What is Monte-Carlo Analysis?
  Random Nature of the Monte Carlo Analysis
  For More Information
ATTACHMENT 3: Distribution Selection
  Section I Introduction
    Monte-Carlo Modeling Options
    Organization of Document
  Section II Parametric Methods
    Activity I: Selecting Candidate Distributions
      Make Use of Prior Knowledge
      Explore the Data
        Summary Statistics
        Graphical Data Analysis
        Formal Tests for Normality and Lognormality
    Activity II: Estimation of Parameters
Fixed-K Asymptotic Inference About Tail Properties
Journal of the American Statistical Association. ISSN: 0162-1459 (Print), 1537-274X (Online). Journal homepage: http://www.tandfonline.com/loi/uasa20

Fixed-k Asymptotic Inference About Tail Properties. Ulrich K. Müller and Yulong Wang, Department of Economics, Princeton University, Princeton, NJ.

To cite this article: Ulrich K. Müller & Yulong Wang (2017) Fixed-k Asymptotic Inference About Tail Properties, Journal of the American Statistical Association, 112:519, 1334-1343, DOI: 10.1080/01621459.2016.1215990. To link to this article: https://doi.org/10.1080/01621459.2016.1215990. Accepted author version posted online: 12 Aug 2016. Published online: 13 Jun 2017.

Abstract: We consider inference about tail properties of a distribution from an iid sample, based on extreme value theory. All of the numerous previous suggestions rely on asymptotics where eventually, an infinite number of observations from the tail behave as predicted by extreme value theory, enabling the consistent estimation of the key tail index, and the construction of confidence intervals using the delta method or other classic approaches. In small samples, however, extreme value theory might well provide good approximations for only a relatively small number of tail observations.

Article history: Received February; accepted July. Keywords: Extreme quantiles; tail conditional expectations.
Fisher Information Matrix for Gaussian and Categorical Distributions
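Jakub M. Tomczak, November 28, 2012

1 Notations

Let $x$ be a random variable. Consider a parametric distribution of $x$ with parameters $\theta$, $p(x|\theta)$. The continuous random variable $x \in \mathbb{R}$ can be modelled by normal distribution (Gaussian distribution):

\[ p(x|\theta) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left\{ -\frac{(x-\mu)^{2}}{2\sigma^{2}} \right\} = \mathcal{N}(x|\mu,\sigma^{2}), \tag{1} \]

where $\theta = (\mu \;\; \sigma^{2})^{\mathrm{T}}$. A discrete (categorical) variable $x \in \mathcal{X}$, where $\mathcal{X}$ is a finite set of $K$ values, can be modelled by categorical distribution (footnote 1: we use the 1-of-K encoding [1]):

\[ p(x|\theta) = \prod_{k=1}^{K} \theta_{k}^{x_{k}} = \mathrm{Cat}(x|\theta), \tag{2} \]

where $0 \le \theta_{k} \le 1$, $\sum_{k} \theta_{k} = 1$. For $\mathcal{X} = \{0, 1\}$ we get a special case of the categorical distribution, Bernoulli distribution,

\[ p(x|\theta) = \theta^{x} (1-\theta)^{1-x} = \mathrm{Bern}(x|\theta). \tag{3} \]

2 Fisher information matrix

2.1 Definition

The Fisher score is determined as follows [1]:

\[ g(\theta, x) = \nabla_{\theta} \ln p(x|\theta). \tag{4} \]

The Fisher information matrix is defined as follows [1]:

\[ F = \mathbb{E}_{x}\left[ g(\theta, x)\, g(\theta, x)^{\mathrm{T}} \right]. \tag{5} \]

2.2 Example 1: Bernoulli distribution

Let us calculate the Fisher matrix for Bernoulli distribution (3). First, we need to take the logarithm:

\[ \ln \mathrm{Bern}(x|\theta) = x \ln \theta + (1-x) \ln(1-\theta). \tag{6} \]

Second, we need to calculate the derivative:

\[ \frac{d}{d\theta} \ln \mathrm{Bern}(x|\theta) = \frac{x}{\theta} - \frac{1-x}{1-\theta} = \frac{x-\theta}{\theta(1-\theta)}. \tag{7} \]

Hence, we get the following Fisher score for the Bernoulli distribution:

\[ g(\theta, x) = \frac{x-\theta}{\theta(1-\theta)}. \tag{8} \]

The Fisher information matrix (here it is a scalar) for the Bernoulli distribution is as follows:

\[ \begin{aligned} F &= \mathbb{E}_{x}\left[ g(\theta, x)\, g(\theta, x) \right] \\ &= \mathbb{E}_{x}\left[ \frac{(x-\theta)^{2}}{(\theta(1-\theta))^{2}} \right] \\ &= \frac{1}{(\theta(1-\theta))^{2}} \left\{ \mathbb{E}_{x}\left[ x^{2} - 2x\theta + \theta^{2} \right] \right\} \\ &= \frac{1}{(\theta(1-\theta))^{2}} \left\{ \mathbb{E}_{x}[x^{2}] - 2\theta\, \mathbb{E}_{x}[x] + \theta^{2} \right\} \\ &= \frac{1}{(\theta(1-\theta))^{2}} \left\{ \theta - 2\theta^{2} + \theta^{2} \right\} \\ &= \frac{1}{(\theta(1-\theta))^{2}}\, \theta(1-\theta) \\ &= \frac{1}{\theta(1-\theta)}. \end{aligned} \tag{9} \]

2.3 Example 2: Categorical distribution

Let us calculate the Fisher matrix for categorical distribution (2).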
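The Bernoulli result can be checked numerically. The following Python sketch evaluates the expectation of the squared score directly over the two outcomes x = 0 and x = 1 and compares it with the closed form 1 / (θ(1 − θ)) from the derivation above. The function names are illustrative and do not come from the source.

```python
def bernoulli_score(theta, x):
    # Fisher score of the Bernoulli distribution: (x - theta) / (theta (1 - theta)).
    return (x - theta) / (theta * (1.0 - theta))

def fisher_information_exact(theta):
    # E_x[g(theta, x)^2], summed over x in {0, 1} with
    # P(x = 1) = theta and P(x = 0) = 1 - theta.
    outcomes = ((1, theta), (0, 1.0 - theta))
    return sum(p * bernoulli_score(theta, x) ** 2 for x, p in outcomes)

def fisher_information_closed_form(theta):
    # Closed form obtained at the end of the derivation.
    return 1.0 / (theta * (1.0 - theta))

def expected_score(theta):
    # Sanity check: the score has zero mean under the model.
    return theta * bernoulli_score(theta, 1) + (1.0 - theta) * bernoulli_score(theta, 0)
```

For any θ strictly between 0 and 1 the two Fisher information functions should agree up to floating-point error, and `expected_score` should be numerically zero.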
Categorical Distributions in Natural Language Processing Version 0.1
Version 0.1. MURAWAKI Yugo, 12 May 2016.

Categorical distribution

Suppose random variable x takes one of K values. x is generated according to categorical distribution Cat(θ), where θ = (0.1, 0.6, 0.3) for the three values R, Y, G. In many task settings, we do not know the true θ and need to infer it from observed data x = (x1, ..., xN). Once we infer θ, we often want to predict new variable x0. NOTE: In Bayesian settings, θ is usually integrated out and x0 is predicted directly from x.

Categorical distributions are a building block of natural language models

N-gram language model (predicting the next word)
POS tagging based on a Hidden Markov Model (HMM)
Probabilistic context-free grammar (PCFG)
Topic model (Latent Dirichlet Allocation (LDA))

Example: HMM-based POS tagging

Tag sequence: BOS DT NN VBZ VBN EOS. Word sequence: the sun has risen.

Let K be the number of POS tags and V be the vocabulary size (ignore BOS and EOS for simplicity). The transition probabilities can be computed using K categorical distributions (θ_DT^TRANS, θ_NN^TRANS, ...), with the dimension K. For example, θ_DT^TRANS = (0.21, 0.27, 0.09, ...) for the next tags NN, NNS, ADJ, ... Similarly, the emission probabilities can be computed using K categorical distributions (θ_DT^EMIT, θ_NN^EMIT, ...), with the dimension V. For example, θ_NN^EMIT = (0.012, 0.002, 0.005, ...) for the words sun, rose, cat, ...

Outline

Categorical and multinomial distributions
Conjugacy and posterior predictive distribution
LDA (Latent Dirichlet Allocation) as an application
Gibbs sampling for inference

Categorical distribution: 1 observation

Suppose θ is known.
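The inference problem described in these slides can be sketched in Python: sample observations from Cat(θ), estimate θ by relative frequencies, and predict a new observation with θ integrated out. The excerpt does not spell out either estimator, so both are assumptions here: the maximum-likelihood estimate n_k / N, and the posterior predictive (n_k + α) / (N + Kα) under a symmetric Dirichlet(α) prior, which is the standard conjugate result. The function names and the value of α are hypothetical.

```python
import random
from collections import Counter

def sample_categorical(theta, n, seed=0):
    # Draw n observations from Cat(theta); values are indices 0..K-1.
    rng = random.Random(seed)
    return rng.choices(range(len(theta)), weights=theta, k=n)

def mle(observations, K):
    # Maximum-likelihood estimate of theta: relative frequencies.
    counts = Counter(observations)
    n = len(observations)
    return [counts[k] / n for k in range(K)]

def posterior_predictive(observations, K, alpha=1.0):
    # Probability of each value for a new observation, with theta
    # integrated out under a symmetric Dirichlet(alpha) prior (assumed).
    counts = Counter(observations)
    n = len(observations)
    return [(counts[k] + alpha) / (n + K * alpha) for k in range(K)]

theta = [0.1, 0.6, 0.3]  # the example parameter from the slides (R, Y, G)
x = sample_categorical(theta, n=1000)
theta_hat = mle(x, K=3)
predictive = posterior_predictive(x, K=3, alpha=1.0)
```

With few observations the two estimates differ noticeably, because the prior pseudo-counts keep unseen values from receiving probability zero; with many observations they converge.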
Handbook on Probability Distributions
R powered. R-forge project. Handbook on probability distributions. R-forge distributions Core Team, University Year 2009-2010. LaTeX powered. Mac OS' TeXShop edited.

Contents
Introduction
I Discrete distributions
1 Classic discrete distribution
2 Not so-common discrete distribution
II Continuous distributions
3 Finite support distribution
4 The Gaussian family
5 Exponential distribution and its extensions
6 Chi-squared's distribution and related extensions
7 Student and related distributions
8 Pareto family
9 Logistic distribution and related extensions
10 Extreme Value Theory distributions
III Multivariate and generalized distributions
11 Generalization of common distributions
12 Multivariate distributions
13 Misc
Conclusion
Bibliography
A Mathematical tools

Introduction

This guide is intended to provide a quite exhaustive (at least as far as I can) view on probability distributions. It is constructed in chapters of distribution family with a section for each distribution. Each section focuses on the triptych: definition - estimation - application. Ultimate bibles for probability distributions are Wimmer & Altmann (1999), which lists 750 univariate discrete distributions, and Johnson et al. (1994), which details continuous distributions. In the appendix, we recall the basics of probability distributions as well as "common" mathematical functions, cf. section A.2.

For all distributions, we use the following notations:
• X a random variable following a given distribution,
• x a realization of this random variable,
• f the density function (if it exists),
• F the (cumulative) distribution function,
• P(X = k) the mass probability function in k,
• M the moment generating function (if it exists),
• G the probability generating function (if it exists),
• φ the characteristic function (if it exists).

Finally, all graphics are done with the open source statistical software R and its numerous packages available on the Comprehensive R Archive Network (CRAN).