Package 'Distributional'
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
5.1 Convergence in Distribution
556: MATHEMATICAL STATISTICS I CHAPTER 5: STOCHASTIC CONVERGENCE The following definitions are stated in terms of scalar random variables, but extend naturally to vector random variables defined on the same probability space with measure P . Forp example, some results 2 are stated in terms of the Euclidean distance in one dimension jXn − Xj = (Xn − X) , or for se- > quences of k-dimensional random variables Xn = (Xn1;:::;Xnk) , 0 1 1=2 Xk @ 2A kXn − Xk = (Xnj − Xj) : j=1 5.1 Convergence in Distribution Consider a sequence of random variables X1;X2;::: and a corresponding sequence of cdfs, FX1 ;FX2 ;::: ≤ so that for n = 1; 2; :: FXn (x) =P[Xn x] : Suppose that there exists a cdf, FX , such that for all x at which FX is continuous, lim FX (x) = FX (x): n−!1 n Then X1;:::;Xn converges in distribution to random variable X with cdf FX , denoted d Xn −! X and FX is the limiting distribution. Convergence of a sequence of mgfs or cfs also indicates conver- gence in distribution, that is, if for all t at which MX (t) is defined, if as n −! 1, we have −! () −!d MXi (t) MX (t) Xn X: Definition : DEGENERATE DISTRIBUTIONS The sequence of random variables X1;:::;Xn converges in distribution to constant c if the limiting d distribution of X1;:::;Xn is degenerate at c, that is, Xn −! X and P [X = c] = 1, so that { 0 x < c F (x) = X 1 x ≥ c Interpretation: A special case of convergence in distribution occurs when the limiting distribution is discrete, with the probability mass function only being non-zero at a single value, that is, if the limiting random variable is X, then P [X = c] = 1 and zero otherwise. -
Benford's Law As a Logarithmic Transformation
Benford’s Law as a Logarithmic Transformation J. M. Pimbley Maxwell Consulting, LLC Abstract We review the history and description of Benford’s Law - the observation that many naturally occurring numerical data collections exhibit a logarithmic distribution of digits. By creating numerous hypothetical examples, we identify several cases that satisfy Benford’s Law and several that do not. Building on prior work, we create and demonstrate a “Benford Test” to determine from the functional form of a probability density function whether the resulting distribution of digits will find close agreement with Benford’s Law. We also discover that one may generalize the first-digit distribution algorithm to include non-logarithmic transformations. This generalization shows that Benford’s Law rests on the tendency of the transformed and shifted data to exhibit a uniform distribution. Introduction What the world calls “Benford’s Law” is a marvelous mélange of mathematics, data, philosophy, empiricism, theory, mysticism, and fraud. Even with all these qualifiers, one easily describes this law and then asks the simple questions that have challenged investigators for more than a century. Take a large collection of positive numbers which may be integers or real numbers or both. Focus only on the first (non-zero) digit of each number. Count how many numbers of the large collection have each of the nine possibilities (1-9) as the first digit. For typical J. M. Pimbley, “Benford’s Law as a Logarithmic Transformation,” Maxwell Consulting Archives, 2014. number collections – which we’ll generally call “datasets” – the first digit is not equally distributed among the values 1-9. -
1988: Logarithmic Series Distribution and Its Use In
LOGARITHMIC SERIES DISTRIBUTION AND ITS USE IN ANALYZING DISCRETE DATA Jeffrey R, Wilson, Arizona State University Tempe, Arizona 85287 1. Introduction is assumed to be 7. for any unit and any trial. In a previous paper, Wilson and Koehler Furthermore, for ~ach unit the m trials are (1988) used the generalized-Dirichlet identical and independent so upon summing across multinomial model to account for extra trials X. = (XI~. X 2 ...... X~.)', the vector of variation. The model allows for a second order counts ~r the j th ~nit has ~3multinomial of pairwise correlation among units, a type of distribution with probability vector ~ = (~1' assumption found reasonable in some biological 72 ..... 71) ' and sample size m. Howe~er, data. In that paper, the two-way crossed responses given by the J units at a particular generalized Dirichlet Multinomial model was used trial may be correlated, producing a set of J to analyze repeated measure on the categorical correlated multinomial random vectors, X I, X 2, preferences of insurance customers. The number • o.~ X • of respondents was assumed to be fixed and Ta~is (1962) developed a model for this known. situation, which he called the generalized- In this paper a generalization of the model multinomial distribution in which a single is made allowing the number of respondents m, to parameter P, is used to reflect the common be random. Thus both the number of units m, and dependency between any two of the dependent the underlying probability vector are allowed to multinomial random vectors. The distribution of vary. The model presented here uses the m logarithmic series distribution to account for the category total Xi3. -
Notes on Scale-Invariance and Base-Invariance for Benford's
NOTES ON SCALE-INVARIANCE AND BASE-INVARIANCE FOR BENFORD’S LAW MICHAŁ RYSZARD WÓJCIK Abstract. It is known that if X is uniformly distributed modulo 1 and Y is an arbitrary random variable independent of X then Y + X is also uniformly distributed modulo 1. We prove a converse for any continuous random variable Y (or a reasonable approximation to a continuous random variable) so that if X and Y +X are equally distributed modulo 1 and Y is independent of X then X is uniformly distributed modulo 1 (or approximates the uniform distribution equally reasonably). This translates into a characterization of Benford’s law through a generalization of scale-invariance: from multiplication by a constant to multiplication by an independent random variable. We also show a base-invariance characterization: if a positive continuous random variable has the same significand distribution for two bases then it is Benford for both bases. The set of bases for which a random variable is Benford is characterized through characteristic functions. 1. Introduction Before the early 1970s, handheld electronic calculators were not yet in widespread use and scientists routinely used in their calculations books with tables containing the decimal logarithms of numbers between 1 and 10 spaced evenly with small increments like 0.01 or 0.001. For example, the first page would be filled with numbers 1.01, 1.02, 1.03, . , 1.99 in the left column and their decimal logarithms in the right column, while the second page with 2.00, 2.01, 2.02, 2.03, ..., 2.99, and so on till the ninth page with 9.00, 9.01, 9.02, 9.03, . -
Parameter Specification of the Beta Distribution and Its Dirichlet
%HWD'LVWULEXWLRQVDQG,WV$SSOLFDWLRQV 3DUDPHWHU6SHFLILFDWLRQRIWKH%HWD 'LVWULEXWLRQDQGLWV'LULFKOHW([WHQVLRQV 8WLOL]LQJ4XDQWLOHV -5HQpYDQ'RUSDQG7KRPDV$0D]]XFKL (Submitted January 2003, Revised March 2003) I. INTRODUCTION.................................................................................................... 1 II. SPECIFICATION OF PRIOR BETA PARAMETERS..............................................5 A. Basic Properties of the Beta Distribution...............................................................6 B. Solving for the Beta Prior Parameters...................................................................8 C. Design of a Numerical Procedure........................................................................12 III. SPECIFICATION OF PRIOR DIRICHLET PARAMETERS................................. 17 A. Basic Properties of the Dirichlet Distribution...................................................... 18 B. Solving for the Dirichlet prior parameters...........................................................20 IV. SPECIFICATION OF ORDERED DIRICHLET PARAMETERS...........................22 A. Properties of the Ordered Dirichlet Distribution................................................. 23 B. Solving for the Ordered Dirichlet Prior Parameters............................................ 25 C. Transforming the Ordered Dirichlet Distribution and Numerical Stability ......... 27 V. CONCLUSIONS........................................................................................................ 31 APPENDIX................................................................................................................... -
Severity Models - Special Families of Distributions
Severity Models - Special Families of Distributions Sections 5.3-5.4 Stat 477 - Loss Models Sections 5.3-5.4 (Stat 477) Claim Severity Models Brian Hartman - BYU 1 / 1 Introduction Introduction Given that a claim occurs, the (individual) claim size X is typically referred to as claim severity. While typically this may be of continuous random variables, sometimes claim sizes can be considered discrete. When modeling claims severity, insurers are usually concerned with the tails of the distribution. There are certain types of insurance contracts with what are called long tails. Sections 5.3-5.4 (Stat 477) Claim Severity Models Brian Hartman - BYU 2 / 1 Introduction parametric class Parametric distributions A parametric distribution consists of a set of distribution functions with each member determined by specifying one or more values called \parameters". The parameter is fixed and finite, and it could be a one value or several in which case we call it a vector of parameters. Parameter vector could be denoted by θ. The Normal distribution is a clear example: 1 −(x−µ)2=2σ2 fX (x) = p e ; 2πσ where the parameter vector θ = (µ, σ). It is well known that µ is the mean and σ2 is the variance. Some important properties: Standard Normal when µ = 0 and σ = 1. X−µ If X ∼ N(µ, σ), then Z = σ ∼ N(0; 1). Sums of Normal random variables is again Normal. If X ∼ N(µ, σ), then cX ∼ N(cµ, cσ). Sections 5.3-5.4 (Stat 477) Claim Severity Models Brian Hartman - BYU 3 / 1 Introduction some claim size distributions Some parametric claim size distributions Normal - easy to work with, but careful with getting negative claims. -
Iam 530 Elements of Probability and Statistics
IAM 530 ELEMENTS OF PROBABILITY AND STATISTICS LECTURE 4-SOME DISCERETE AND CONTINUOUS DISTRIBUTION FUNCTIONS SOME DISCRETE PROBABILITY DISTRIBUTIONS Degenerate, Uniform, Bernoulli, Binomial, Poisson, Negative Binomial, Geometric, Hypergeometric DEGENERATE DISTRIBUTION • An rv X is degenerate at point k if 1, Xk P X x 0, ow. The cdf: 0, Xk F x P X x 1, Xk UNIFORM DISTRIBUTION • A finite number of equally spaced values are equally likely to be observed. 1 P(X x) ; x 1,2,..., N; N 1,2,... N • Example: throw a fair die. P(X=1)=…=P(X=6)=1/6 N 1 (N 1)(N 1) E(X) ; Var(X) 2 12 BERNOULLI DISTRIBUTION • An experiment consists of one trial. It can result in one of 2 outcomes: Success or Failure (or a characteristic being Present or Absent). • Probability of Success is p (0<p<1) 1 with probability p Xp;0 1 0 with probability 1 p P(X x) px (1 p)1 x for x 0,1; and 0 p 1 1 E(X ) xp(y) 0(1 p) 1p p y 0 E X 2 02 (1 p) 12 p p V (X ) E X 2 E(X ) 2 p p2 p(1 p) p(1 p) Binomial Experiment • Experiment consists of a series of n identical trials • Each trial can end in one of 2 outcomes: Success or Failure • Trials are independent (outcome of one has no bearing on outcomes of others) • Probability of Success, p, is constant for all trials • Random Variable X, is the number of Successes in the n trials is said to follow Binomial Distribution with parameters n and p • X can take on the values x=0,1,…,n • Notation: X~Bin(n,p) Consider outcomes of an experiment with 3 Trials : SSS y 3 P(SSS) P(Y 3) p(3) p3 SSF, SFS, FSS y 2 P(SSF SFS FSS) P(Y 2) p(2) 3p2 (1 p) SFF, FSF, FFS y 1 P(SFF FSF FFS ) P(Y 1) p(1) 3p(1 p)2 FFF y 0 P(FFF ) P(Y 0) p(0) (1 p)3 In General: n n! 1) # of ways of arranging x S s (and (n x) F s ) in a sequence of n positions x x!(n x)! 2) Probability of each arrangement of x S s (and (n x) F s ) p x (1 p)n x n 3)P(X x) p(x) p x (1 p)n x x 0,1,..., n x • Example: • There are black and white balls in a box. -
The Length-Biased Versus Random Sampling for the Binomial and Poisson Events Makarand V
Journal of Modern Applied Statistical Methods Volume 12 | Issue 1 Article 10 5-1-2013 The Length-Biased Versus Random Sampling for the Binomial and Poisson Events Makarand V. Ratnaparkhi Wright State University, [email protected] Uttara V. Naik-Nimbalkar Pune University, Pune, India Follow this and additional works at: http://digitalcommons.wayne.edu/jmasm Part of the Applied Statistics Commons, Social and Behavioral Sciences Commons, and the Statistical Theory Commons Recommended Citation Ratnaparkhi, Makarand V. and Naik-Nimbalkar, Uttara V. (2013) "The Length-Biased Versus Random Sampling for the Binomial and Poisson Events," Journal of Modern Applied Statistical Methods: Vol. 12 : Iss. 1 , Article 10. DOI: 10.22237/jmasm/1367381340 Available at: http://digitalcommons.wayne.edu/jmasm/vol12/iss1/10 This Regular Article is brought to you for free and open access by the Open Access Journals at DigitalCommons@WayneState. It has been accepted for inclusion in Journal of Modern Applied Statistical Methods by an authorized editor of DigitalCommons@WayneState. Journal of Modern Applied Statistical Methods Copyright © 2013 JMASM, Inc. May 2013, Vol. 12, No. 1, 54-57 1538 – 9472/13/$95.00 The Length-Biased Versus Random Sampling for the Binomial and Poisson Events Makarand V. Ratnaparkhi Uttara V. Naik-Nimbalkar Wright State University, Pune University, Dayton, OH Pune, India The equivalence between the length-biased and the random sampling on a non-negative, discrete random variable is established. The length-biased versions of the binomial and Poisson distributions are discussed. Key words: Length-biased data, weighted distributions, binomial, Poisson, convolutions. Introduction binomial distribution (probabilities), which was The occurrence of so-called length-biased data appropriate at that time. -
Bivariate Burr Distributions by Extending the Defining Equation
DECLARATION This thesis contains no material which has been accepted for the award of any other degree or diploma in any university and to the best ofmy knowledge and belief, it contains no material previously published by any other person, except where due references are made in the text ofthe thesis !~ ~ Cochin-22 BISMI.G.NADH June 2005 ACKNOWLEDGEMENTS I wish to express my profound gratitude to Dr. V.K. Ramachandran Nair, Professor, Department of Statistics, CUSAT for his guidance, encouragement and support through out the development ofthis thesis. I am deeply indebted to Dr. N. Unnikrishnan Nair, Fonner Vice Chancellor, CUSAT for the motivation, persistent encouragement and constructive criticism through out the development and writing of this thesis. The co-operation rendered by all teachers in the Department of Statistics, CUSAT is also thankfully acknowledged. I also wish to express great happiness at the steadfast encouragement rendered by my family. Finally I dedicate this thesis to the loving memory of my father Late. Sri. C.K. Gopinath. BISMI.G.NADH CONTENTS Page CHAPTER I INTRODUCTION 1 1.1 Families ofDistributions 1.2· Burr system 2 1.3 Present study 17 CHAPTER II BIVARIATE BURR SYSTEM OF DISTRIBUTIONS 18 2.1 Introduction 18 2.2 Bivariate Burr system 21 2.3 General Properties of Bivariate Burr system 23 2.4 Members ofbivariate Burr system 27 CHAPTER III CHARACTERIZATIONS OF BIVARIATE BURR 33 SYSTEM 3.1 Introduction 33 3.2 Characterizations ofbivariate Burr system using reliability 33 concepts 3.3 Characterization ofbivariate -
The Multivariate Normal Distribution=1See Last Slide For
Moment-generating Functions Definition Properties χ2 and t distributions The Multivariate Normal Distribution1 STA 302 Fall 2017 1See last slide for copyright information. 1 / 40 Moment-generating Functions Definition Properties χ2 and t distributions Overview 1 Moment-generating Functions 2 Definition 3 Properties 4 χ2 and t distributions 2 / 40 Moment-generating Functions Definition Properties χ2 and t distributions Joint moment-generating function Of a p-dimensional random vector x t0x Mx(t) = E e x t +x t +x t For example, M (t1; t2; t3) = E e 1 1 2 2 3 3 (x1;x2;x3) Just write M(t) if there is no ambiguity. Section 4.3 of Linear models in statistics has some material on moment-generating functions (optional). 3 / 40 Moment-generating Functions Definition Properties χ2 and t distributions Uniqueness Proof omitted Joint moment-generating functions correspond uniquely to joint probability distributions. M(t) is a function of F (x). Step One: f(x) = @ ··· @ F (x). @x1 @xp @ @ R x2 R x1 For example, f(y1; y2) dy1dy2 @x1 @x2 −∞ −∞ 0 Step Two: M(t) = R ··· R et xf(x) dx Could write M(t) = g (F (x)). Uniqueness says the function g is one-to-one, so that F (x) = g−1 (M(t)). 4 / 40 Moment-generating Functions Definition Properties χ2 and t distributions g−1 (M(t)) = F (x) A two-variable example g−1 (M(t)) = F (x) −1 R 1 R 1 x1t1+x2t2 R x2 R x1 g −∞ −∞ e f(x1; x2) dx1dx2 = −∞ −∞ f(y1; y2) dy1dy2 5 / 40 Moment-generating Functions Definition Properties χ2 and t distributions Theorem Two random vectors x1 and x2 are independent if and only if the moment-generating function of their joint distribution is the product of their moment-generating functions. -
Loss Distributions
Munich Personal RePEc Archive Loss Distributions Burnecki, Krzysztof and Misiorek, Adam and Weron, Rafal 2010 Online at https://mpra.ub.uni-muenchen.de/22163/ MPRA Paper No. 22163, posted 20 Apr 2010 10:14 UTC Loss Distributions1 Krzysztof Burneckia, Adam Misiorekb,c, Rafał Weronb aHugo Steinhaus Center, Wrocław University of Technology, 50-370 Wrocław, Poland bInstitute of Organization and Management, Wrocław University of Technology, 50-370 Wrocław, Poland cSantander Consumer Bank S.A., Wrocław, Poland Abstract This paper is intended as a guide to statistical inference for loss distributions. There are three basic approaches to deriving the loss distribution in an insurance risk model: empirical, analytical, and moment based. The empirical method is based on a sufficiently smooth and accurate estimate of the cumulative distribution function (cdf) and can be used only when large data sets are available. The analytical approach is probably the most often used in practice and certainly the most frequently adopted in the actuarial literature. It reduces to finding a suitable analytical expression which fits the observed data well and which is easy to handle. In some applications the exact shape of the loss distribution is not required. We may then use the moment based approach, which consists of estimating only the lowest characteristics (moments) of the distribution, like the mean and variance. Having a large collection of distributions to choose from, we need to narrow our selection to a single model and a unique parameter estimate. The type of the objective loss distribution can be easily selected by comparing the shapes of the empirical and theoretical mean excess functions. -
Fisher Information Matrix for Gaussian and Categorical Distributions
Fisher information matrix for Gaussian and categorical distributions Jakub M. Tomczak November 28, 2012 1 Notations Let x be a random variable. Consider a parametric distribution of x with parameters θ, p(xjθ). The contiuous random variable x 2 R can be modelled by normal distribution (Gaussian distribution): 1 n (x − µ)2 o p(xjθ) = p exp − 2πσ2 2σ2 = N (xjµ, σ2); (1) where θ = µ σ2T. A discrete (categorical) variable x 2 X , X is a finite set of K values, can be modelled by categorical distribution:1 K Y xk p(xjθ) = θk k=1 = Cat(xjθ); (2) P where 0 ≤ θk ≤ 1, k θk = 1. For X = f0; 1g we get a special case of the categorical distribution, Bernoulli distribution, p(xjθ) = θx(1 − θ)1−x = Bern(xjθ): (3) 2 Fisher information matrix 2.1 Definition The Fisher score is determined as follows [1]: g(θ; x) = rθ ln p(xjθ): (4) The Fisher information matrix is defined as follows [1]: T F = Ex g(θ; x) g(θ; x) : (5) 1We use the 1-of-K encoding [1]. 1 2.2 Example 1: Bernoulli distribution Let us calculate the fisher matrix for Bernoulli distribution (3). First, we need to take the logarithm: ln Bern(xjθ) = x ln θ + (1 − x) ln(1 − θ): (6) Second, we need to calculate the derivative: d x 1 − x ln Bern(xjθ) = − dθ θ 1 − θ x − θ = : (7) θ(1 − θ) Hence, we get the following Fisher score for the Bernoulli distribution: x − θ g(θ; x) = : (8) θ(1 − θ) The Fisher information matrix (here it is a scalar) for the Bernoulli distribution is as follows: F = Ex[g(θ; x) g(θ; x)] h (x − θ)2 i = Ex (θ(1 − θ))2 1 n o = [x2 − 2xθ + θ2] (θ(1 − θ))2 Ex 1 n o = [x2] − 2θ [x] + θ2 (θ(1 − θ))2 Ex Ex 1 n o = θ − 2θ2 + θ2 (θ(1 − θ))2 1 = θ(1 − θ) (θ(1 − θ))2 1 = : (9) θ(1 − θ) 2.3 Example 2: Categorical distribution Let us calculate the fisher matrix for categorical distribution (2).