Spectral Barriers in Certification Problems

Total Page:16

File Type:pdf, Size:1020Kb

Spectral Barriers in Certification Problems Spectral Barriers in Certification Problems by Dmitriy Kunisky A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy Department of Mathematics New York University May, 2021 Professor Afonso S. Bandeira Professor Gérard Ben Arous © Dmitriy Kunisky all rights reserved, 2021 Dedication To my grandparents. iii Acknowledgements First and foremost, I am immensely grateful to Afonso Bandeira for his generosity with his time, his broad mathematical insight, his resourceful creativity, and of course his famous reservoir of open problems. Afonso’s wonderful knack for keeping up the momentum— for always finding some new angle or somebody new to talk to when we were stuck, for mentioning “you know, just yesterday I heard a very nice problem...” right when I could use something else to work on—made five years fly by. I have Afonso to thank for my sense of taste in research, for learning to collaborate, for remembering to enjoy myself, and above all for coming to be ever on the lookout for something new to think about. I am also indebted to Gérard Ben Arous, who, though I wound up working on topics far from his main interests, was always willing to hear out what I’d been doing, to encourage and give suggestions, and to share his truly encyclopedic knowledge of mathematics and its culture and history. On many occasions I’ve learned more tracking down what Gérard meant with the smallest aside in the classroom or in our meetings than I could manage to get out of a week on my own in the library. I couldn’t have asked for a better role model of deep expertise, precise thinking, and genuine scholarship. Thanks also to the other faculty mentors I was fortunate to have at Courant: Carlos Fernandez-Granda at the beginning of my PhD, whose patience with my endless requests for advice on an early project helped me find my footing and grow to be confident in my ideas, and Jon Niles-Weed at the end, who brought his great energy in full force to the department iv and helped me stay on track at a difficult time. In the same breath I must also mention Cris Moore, who most generously arranged a visit at the beginning of my degree when I knew less than nothing about anything, and gently guided me over the years into his terrific interdisciplinary community. Thanks to Cris for administering another dose of spectacular taste in research, imparting a major appreciation of the gastronomy of Santa Fe, sharing his physicist’s enthusiasm for a pretty calculation, and impressing upon me his conviction that a scientist is, always and first of all, a citizen. Thanks to Oded Regev and Michael Overton, two of the most thoughtful teachers I’ve had the honor of learning from over the years, for being on my thesis committee. It’s a pleasure to share my work with you. My cohort at Courant and other collaborators over the years have taught me so many little bits of wisdom, tricks, tools, ideas big and small, and I am grateful for having had the opportunity to work or banter with all of them. A separate thanks to Alex Wein, Yunzi Ding, and Jess Banks, with whom I worked on the material that has ended up in several chapters of this thesis. All that work would have been impossible without their many insights and suggestions. Thanks to my friends in New York, for their chorus of “so...how’s the math?” and their knowing smiles when I would describe once again how it’s “kind of related” to something “that’s like something else that’s somewhat useful,” for their gently steering me away from work from time to time, and for many blissfully idle evenings. And to my family, their cheerful kindness, their love and support, their effortless confidence that I was making the right decisions, their sayings and jokes and stories around the kitchen table. I’m made of those stories, and that kitchen table is home for me. Thanks finally to Eleanor: for all these happy, bright, and weightless years together, the adventures and travels and the quiet sturdy times in between, and for knowing just when to take me by the hand and lead me far away from all this nonsense. v Abstract The related tasks of certifying bounds on optimization problems and refuting unsatisfiable systems of constraints have a long history in computer science and optimization, and deep mathematical roots in the proof complexity of algebraic systems. From the algorithmic perspective, these problems differ fundamentally from ordinary optimization in that they ask not merely for a single high-quality solution, but rather for a simultaneous bound on the quality of all possible solutions. The computational complexity of certification and its connections with and distinctions from the complexity of optimization, especially in the average case, remain poorly understood for many problems. The purpose of this thesis is to study the average-case complexity of certification for con- strained principal component analysis (PCA) problems, eigenvalue-like problems optimizing quadratic forms over sets of structured vectors or low-rank matrices. As one notable ex- ample, several of our results concern the much-studied Sherrington-Kirkpatrick (SK) Hamil- tonian, a random function over the hypercube originating in the statistical physics of spin glasses. In problems of this kind, a natural certification strategy is a bound by the spec- tral norm of the relevant coefficient matrix. We provide evidence for spectral barriers in constrained PCA problems, supporting the conjecture that, for reasonable instance distribu- tions, no efficient certification algorithm can improve on the spectral bound. In the first part of the thesis, we develop tools for proving reductions from certification over instances drawn from the Gaussian orthogonal ensemble (GOE) to hypothesis testing vi in spiked matrix models. These models describe tiltings of the eigenspaces of the GOE towards planted constrained PCA solutions. We then provide evidence that these testing problems are hard via lower bounds against algorithms computing low-degree polynomials of the observations. In doing so, we develop new techniques for working with the low-degree likelihood ratio. Namely, we show that its L2 norm, which governs the efficacy of low-degree tests, is in many models an expectation of a function of a scalar overlap or inner product of two draws of the random signal observed in the model. We use this to give a simple, unified account of old and new low-degree lower bounds, with tight results for testing in models including tensor PCA, Wigner and Wishart matrix PCA of arbitrary rank, sparse PCA, and the stochastic block model. Using our results for Wishart models, we deduce the conditional hardness of better-than-spectral certification for the SK Hamiltonian, the Potts spin glass Hamiltonian, and non-negative PCA. In the second part, we investigate lower bounds against the sum-of-squares hierarchy of semidefinite programming relaxations in the hypercube setting of the SK Hamiltonian. We first study the structure of the degree 4 relaxation, giving constraints on the spectra of pseudomoments and deterministic examples from finite frame theory. We then propose a general means of extending degree 2 lower bounds to higher degrees, an alternative to the pseudocalibration framework of Barak et al. (2019). We show that this technique is exact for the deterministic lower bound of Grigoriev (2001) and Laurent (2003), and prove a conjecture of Laurent’s on the spectrum of the associated pseudomoments. Finally, we apply our extension to the SK Hamiltonian and give tight lower bounds for the degree 4 and 6 relaxations. vii Contents Dedication iii Acknowledgments iv Abstract vi List of Figures xiv List of Tables xv List of Notations xvi 1 Introduction 1 1.1 Search and Certification .................................. 1 1.2 Average-Case Computational Complexity........................ 6 1.3 A Motivating Example....................................10 1.4 Spectral Barriers .......................................15 1.5 Summary of Contributions.................................17 1.5.1 Reductions, Planting, and Low-Degree Polynomials .............18 1.5.2 The Sum-of-Squares Hierarchy..........................19 I Computationally-Quiet Planting and Low-Degree Polynomials 21 2 Spectral Planting and Reductions to Hypothesis Testing 22 viii 2.1 Hypothesis Testing .....................................24 2.1.1 Statistical-Computational Gaps..........................26 2.2 Pushout and Computationally-Quiet Planting .....................27 2.3 Tools for Reductions Under the GOE...........................34 2.4 Hard Regime in the Wishart Spiked Matrix Model...................38 2.5 An Example of Spectral Planting in Graphs.......................40 3 Preliminaries on Low-Degree Polynomial Algorithms 44 3.1 Le Cam’s Contiguity and the Second Moment Method................45 3.2 Two Roads to the Low-Degree Likelihood Ratio....................51 3.2.1 Computationally-Bounded Second Moment Method.............51 3.2.2 Appearance in Sum-of-Squares Lower Bounds ................54 3.3 Consequences of Low-Degree Lower Bounds......................57 4 Low-Degree Likelihood Ratio Analysis: Overlap Formulae 60 4.1 Gaussian Wigner Models..................................62 4.1.1 Hermite Polynomials................................66 4.1.2 Computing the Low-Degree Likelihood Ratio .................68 4.2 Exponential Families and Variance Functions .....................71 4.3 NEF-QVF Models .......................................76 4.3.1 Orthogonal Polynomials in NEF-QVFs......................77 4.3.2
Recommended publications
  • The Strongish Planted Clique Hypothesis and Its Consequences
    The Strongish Planted Clique Hypothesis and Its Consequences Pasin Manurangsi Google Research, Mountain View, CA, USA [email protected] Aviad Rubinstein Stanford University, CA, USA [email protected] Tselil Schramm Stanford University, CA, USA [email protected] Abstract We formulate a new hardness assumption, the Strongish Planted Clique Hypothesis (SPCH), which postulates that any algorithm for planted clique must run in time nΩ(log n) (so that the state-of-the-art running time of nO(log n) is optimal up to a constant in the exponent). We provide two sets of applications of the new hypothesis. First, we show that SPCH implies (nearly) tight inapproximability results for the following well-studied problems in terms of the parameter k: Densest k-Subgraph, Smallest k-Edge Subgraph, Densest k-Subhypergraph, Steiner k-Forest, and Directed Steiner Network with k terminal pairs. For example, we show, under SPCH, that no polynomial time algorithm achieves o(k)-approximation for Densest k-Subgraph. This inapproximability ratio improves upon the previous best ko(1) factor from (Chalermsook et al., FOCS 2017). Furthermore, our lower bounds hold even against fixed-parameter tractable algorithms with parameter k. Our second application focuses on the complexity of graph pattern detection. For both induced and non-induced graph pattern detection, we prove hardness results under SPCH, improving the running time lower bounds obtained by (Dalirrooyfard et al., STOC 2019) under the Exponential Time Hypothesis. 2012 ACM Subject Classification Theory of computation → Problems, reductions and completeness; Theory of computation → Fixed parameter tractability Keywords and phrases Planted Clique, Densest k-Subgraph, Hardness of Approximation Digital Object Identifier 10.4230/LIPIcs.ITCS.2021.10 Related Version A full version of the paper is available at https://arxiv.org/abs/2011.05555.
    [Show full text]
  • Computational Lower Bounds for Community Detection on Random Graphs
    JMLR: Workshop and Conference Proceedings vol 40:1–30, 2015 Computational Lower Bounds for Community Detection on Random Graphs Bruce Hajek [email protected] Department of ECE, University of Illinois at Urbana-Champaign, Urbana, IL Yihong Wu [email protected] Department of ECE, University of Illinois at Urbana-Champaign, Urbana, IL Jiaming Xu [email protected] Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA, Abstract This paper studies the problem of detecting the presence of a small dense community planted in a large Erdos-R˝ enyi´ random graph G(N; q), where the edge probability within the community exceeds q by a constant factor. Assuming the hardness of the planted clique detection problem, we show that the computational complexity of detecting the community exhibits the following phase transition phenomenon: As the graph size N grows and the graph becomes sparser according to −α 2 q = N , there exists a critical value of α = 3 , below which there exists a computationally intensive procedure that can detect far smaller communities than any computationally efficient procedure, and above which a linear-time procedure is statistically optimal. The results also lead to the average-case hardness results for recovering the dense community and approximating the densest K-subgraph. 1. Introduction Networks often exhibit community structure with many edges joining the vertices of the same com- munity and relatively few edges joining vertices of different communities. Detecting communities in networks has received a large amount of attention and has found numerous applications in social and biological sciences, etc (see, e.g., the exposition Fortunato(2010) and the references therein).
    [Show full text]
  • Spectral Algorithms (Draft)
    Spectral Algorithms (draft) Ravindran Kannan and Santosh Vempala March 3, 2013 ii Summary. Spectral methods refer to the use of eigenvalues, eigenvectors, sin- gular values and singular vectors. They are widely used in Engineering, Ap- plied Mathematics and Statistics. More recently, spectral methods have found numerous applications in Computer Science to \discrete" as well \continuous" problems. This book describes modern applications of spectral methods, and novel algorithms for estimating spectral parameters. In the first part of the book, we present applications of spectral methods to problems from a variety of topics including combinatorial optimization, learning and clustering. The second part of the book is motivated by efficiency considerations. A fea- ture of many modern applications is the massive amount of input data. While sophisticated algorithms for matrix computations have been developed over a century, a more recent development is algorithms based on \sampling on the fly” from massive matrices. Good estimates of singular values and low rank ap- proximations of the whole matrix can be provably derived from a sample. Our main emphasis in the second part of the book is to present these sampling meth- ods with rigorous error bounds. We also present recent extensions of spectral methods from matrices to tensors and their applications to some combinatorial optimization problems. Contents I Applications 1 1 The Best-Fit Subspace 3 1.1 Singular Value Decomposition . .3 1.2 Algorithms for computing the SVD . .7 1.3 The k-means clustering problem . .8 1.4 Discussion . 11 2 Clustering Discrete Random Models 13 2.1 Planted cliques in random graphs .
    [Show full text]
  • Diagonal Estimation with Probing Methods
    Diagonal Estimation with Probing Methods Bryan J. Kaperick Dissertation submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree of Master of Science in Mathematics Matthias C. Chung, Chair Julianne M. Chung Serkan Gugercin Jayanth Jagalur-Mohan May 3, 2019 Blacksburg, Virginia Keywords: Probing Methods, Numerical Linear Algebra, Computational Inverse Problems Copyright 2019, Bryan J. Kaperick Diagonal Estimation with Probing Methods Bryan J. Kaperick (ABSTRACT) Probing methods for trace estimation of large, sparse matrices has been studied for several decades. In recent years, there has been some work to extend these techniques to instead estimate the diagonal entries of these systems directly. We extend some analysis of trace estimators to their corresponding diagonal estimators, propose a new class of deterministic diagonal estimators which are well-suited to parallel architectures along with heuristic ar- guments for the design choices in their construction, and conclude with numerical results on diagonal estimation and ordering problems, demonstrating the strengths of our newly- developed methods alongside existing methods. Diagonal Estimation with Probing Methods Bryan J. Kaperick (GENERAL AUDIENCE ABSTRACT) In the past several decades, as computational resources increase, a recurring problem is that of estimating certain properties very large linear systems (matrices containing real or complex entries). One particularly important quantity is the trace of a matrix, defined as the sum of the entries along its diagonal. In this thesis, we explore a problem that has only recently been studied, in estimating the diagonal entries of a particular matrix explicitly. For these methods to be computationally more efficient than existing methods, and with favorable convergence properties, we require the matrix in question to have a majority of its entries be zero (the matrix is sparse), with the largest-magnitude entries clustered near and on its diagonal, and very large in size.
    [Show full text]
  • A Guide on Probability Distributions
    powered project A guide on probability distributions R-forge distributions Core Team University Year 2008-2009 LATEXpowered Mac OS' TeXShop edited Contents Introduction 4 I Discrete distributions 6 1 Classic discrete distribution 7 2 Not so-common discrete distribution 27 II Continuous distributions 34 3 Finite support distribution 35 4 The Gaussian family 47 5 Exponential distribution and its extensions 56 6 Chi-squared's ditribution and related extensions 75 7 Student and related distributions 84 8 Pareto family 88 9 Logistic ditribution and related extensions 108 10 Extrem Value Theory distributions 111 3 4 CONTENTS III Multivariate and generalized distributions 116 11 Generalization of common distributions 117 12 Multivariate distributions 132 13 Misc 134 Conclusion 135 Bibliography 135 A Mathematical tools 138 Introduction This guide is intended to provide a quite exhaustive (at least as I can) view on probability distri- butions. It is constructed in chapters of distribution family with a section for each distribution. Each section focuses on the tryptic: definition - estimation - application. Ultimate bibles for probability distributions are Wimmer & Altmann (1999) which lists 750 univariate discrete distributions and Johnson et al. (1994) which details continuous distributions. In the appendix, we recall the basics of probability distributions as well as \common" mathe- matical functions, cf. section A.2. And for all distribution, we use the following notations • X a random variable following a given distribution, • x a realization of this random variable, • f the density function (if it exists), • F the (cumulative) distribution function, • P (X = k) the mass probability function in k, • M the moment generating function (if it exists), • G the probability generating function (if it exists), • φ the characteristic function (if it exists), Finally all graphics are done the open source statistical software R and its numerous packages available on the Comprehensive R Archive Network (CRAN∗).
    [Show full text]
  • Probability Statistics Economists
    PROBABILITY AND STATISTICS FOR ECONOMISTS BRUCE E. HANSEN Contents Preface x Acknowledgements xi Mathematical Preparation xii Notation xiii 1 Basic Probability Theory 1 1.1 Introduction . 1 1.2 Outcomes and Events . 1 1.3 Probability Function . 3 1.4 Properties of the Probability Function . 4 1.5 Equally-Likely Outcomes . 5 1.6 Joint Events . 5 1.7 Conditional Probability . 6 1.8 Independence . 7 1.9 Law of Total Probability . 9 1.10 Bayes Rule . 10 1.11 Permutations and Combinations . 11 1.12 Sampling With and Without Replacement . 13 1.13 Poker Hands . 14 1.14 Sigma Fields* . 16 1.15 Technical Proofs* . 17 1.16 Exercises . 18 2 Random Variables 22 2.1 Introduction . 22 2.2 Random Variables . 22 2.3 Discrete Random Variables . 22 2.4 Transformations . 24 2.5 Expectation . 25 2.6 Finiteness of Expectations . 26 2.7 Distribution Function . 28 2.8 Continuous Random Variables . 29 2.9 Quantiles . 31 2.10 Density Functions . 31 2.11 Transformations of Continuous Random Variables . 34 2.12 Non-Monotonic Transformations . 36 ii CONTENTS iii 2.13 Expectation of Continuous Random Variables . 37 2.14 Finiteness of Expectations . 39 2.15 Unifying Notation . 39 2.16 Mean and Variance . 40 2.17 Moments . 42 2.18 Jensen’s Inequality . 42 2.19 Applications of Jensen’s Inequality* . 43 2.20 Symmetric Distributions . 45 2.21 Truncated Distributions . 46 2.22 Censored Distributions . 47 2.23 Moment Generating Function . 48 2.24 Cumulants . 50 2.25 Characteristic Function . 51 2.26 Expectation: Mathematical Details* . 52 2.27 Exercises .
    [Show full text]
  • Public-Key Cryptography in the Fine-Grained Setting
    Public-Key Cryptography in the Fine-Grained Setting B B Rio LaVigne( ), Andrea Lincoln( ), and Virginia Vassilevska Williams MIT CSAIL and EECS, Cambridge, USA {rio,andreali,virgi}@mit.edu Abstract. Cryptography is largely based on unproven assumptions, which, while believable, might fail. Notably if P = NP, or if we live in Pessiland, then all current cryptographic assumptions will be broken. A compelling question is if any interesting cryptography might exist in Pessiland. A natural approach to tackle this question is to base cryptography on an assumption from fine-grained complexity. Ball, Rosen, Sabin, and Vasudevan [BRSV’17] attempted this, starting from popular hardness assumptions, such as the Orthogonal Vectors (OV) Conjecture. They obtained problems that are hard on average, assuming that OV and other problems are hard in the worst case. They obtained proofs of work, and hoped to use their average-case hard problems to build a fine-grained one-way function. Unfortunately, they proved that constructing one using their approach would violate a popular hardness hypothesis. This moti- vates the search for other fine-grained average-case hard problems. The main goal of this paper is to identify sufficient properties for a fine-grained average-case assumption that imply cryptographic prim- itives such as fine-grained public key cryptography (PKC). Our main contribution is a novel construction of a cryptographic key exchange, together with the definition of a small number of relatively weak struc- tural properties, such that if a computational problem satisfies them, our key exchange has provable fine-grained security guarantees, based on the hardness of this problem.
    [Show full text]
  • Robust Inversion, Dimensionality Reduction, and Randomized Sampling
    Robust inversion, dimensionality reduction, and randomized sampling Aleksandr Aravkin Michael P. Friedlander Felix J. Herrmann Tristan van Leeuwen November 16, 2011 Abstract We consider a class of inverse problems in which the forward model is the solution operator to linear ODEs or PDEs. This class admits several di- mensionality-reduction techniques based on data averaging or sampling, which are especially useful for large-scale problems. We survey these approaches and their connection to stochastic optimization. The data-averaging approach is only viable, however, for a least-squares misfit, which is sensitive to outliers in the data and artifacts unexplained by the forward model. This motivates us to propose a robust formulation based on the Student's t-distribution of the error. We demonstrate how the corresponding penalty function, together with the sampling approach, can obtain good results for a large-scale seismic inverse problem with 50% corrupted data. Keywords inverse problems · seismic inversion · stochastic optimization · robust estimation 1 Introduction Consider the generic parameter-estimation scheme in which we conduct m exper- iments, recording the corresponding experimental input vectors fq1; q2; : : : ; qmg and observation vectors fd1; d2; : : : ; dmg. We model the data for given parameters n x 2 R by di = Fi(x)qi + i for i = 1; : : : ; m; (1.1) This work was in part financially supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grant (22R81254) and the Collaborative Research and Develop- ment Grant DNOISE II (375142-08). This research was carried out as part of the SINBAD II project with support from the following organizations: BG Group, BPG, BP, Chevron, Conoco Phillips, Petrobras, PGS, Total SA, and WesternGeco.
    [Show full text]
  • A New Approach to the Planted Clique Problem
    A new approach to the planted clique problem Alan Frieze∗ Ravi Kannan† Department of Mathematical Sciences, Department of Computer Science, Carnegie-Mellon University. Yale University. [email protected]. [email protected]. 20, October, 2003 1 Introduction It is well known that finding the largest clique in a graph is NP-hard, [11]. Indeed, Hastad [8] has shown that it is NP-hard to approximate the size of the largest clique in an n vertex graph to within 1 ǫ a factor n − for any ǫ > 0. Not surprisingly, this has directed some researchers attention to finding the largest clique in a random graph. Let Gn,1/2 be the random graph with vertex set [n] in which each possible edge is included/excluded independently with probability 1/2. It is known that whp the size of the largest clique is (2+ o(1)) log2 n, but no known polymomial time algorithm has been proven to find a clique of size more than (1+ o(1)) log2 n. Karp [12] has even suggested that finding a clique of size (1+ ǫ)log2 n is computationally difficult for any constant ǫ > 0. Significant attention has also been directed to the problem of finding a hidden clique, but with only limited success. Thus let G be the union of Gn,1/2 and an unknown clique on vertex set P , where p = P is given. The problem is to recover P . If p c(n log n)1/2 then, as observed by Kucera [13], | | ≥ with high probability, it is easy to recover P as the p vertices of largest degree.
    [Show full text]
  • Lipics-CCC-2021-35.Pdf (1
    Hardness of KT Characterizes Parallel Cryptography Hanlin Ren # Ñ Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China Rahul Santhanam # University of Oxford, UK Abstract A recent breakthrough of Liu and Pass (FOCS’20) shows that one-way functions exist if and only if the (polynomial-)time-bounded Kolmogorov complexity, Kt, is bounded-error hard on average to compute. In this paper, we strengthen this result and extend it to other complexity measures: We show, perhaps surprisingly, that the KT complexity is bounded-error average-case hard if and only if there exist one-way functions in constant parallel time (i.e. NC0). This result crucially relies on the idea of randomized encodings. Previously, a seminal work of Applebaum, Ishai, and Kushilevitz (FOCS’04; SICOMP’06) used the same idea to show that NC0-computable one-way functions exist if and only if logspace-computable one-way functions exist. Inspired by the above result, we present randomized average-case reductions among the NC1- versions and logspace-versions of Kt complexity, and the KT complexity. Our reductions preserve both bounded-error average-case hardness and zero-error average-case hardness. To the best of our knowledge, this is the first reduction between the KT complexity and a variant of Kt complexity. We prove tight connections between the hardness of Kt complexity and the hardness of (the hardest) one-way functions. In analogy with the Exponential-Time Hypothesis and its variants, we define and motivate the Perebor Hypotheses for complexity measures such as Kt and KT. We show that a Strong Perebor Hypothesis for Kt implies the existence of (weak) one-way functions of near-optimal hardness 2n−o(n).
    [Show full text]
  • Handbook on Probability Distributions
    R powered R-forge project Handbook on probability distributions R-forge distributions Core Team University Year 2009-2010 LATEXpowered Mac OS' TeXShop edited Contents Introduction 4 I Discrete distributions 6 1 Classic discrete distribution 7 2 Not so-common discrete distribution 27 II Continuous distributions 34 3 Finite support distribution 35 4 The Gaussian family 47 5 Exponential distribution and its extensions 56 6 Chi-squared's ditribution and related extensions 75 7 Student and related distributions 84 8 Pareto family 88 9 Logistic distribution and related extensions 108 10 Extrem Value Theory distributions 111 3 4 CONTENTS III Multivariate and generalized distributions 116 11 Generalization of common distributions 117 12 Multivariate distributions 133 13 Misc 135 Conclusion 137 Bibliography 137 A Mathematical tools 141 Introduction This guide is intended to provide a quite exhaustive (at least as I can) view on probability distri- butions. It is constructed in chapters of distribution family with a section for each distribution. Each section focuses on the tryptic: definition - estimation - application. Ultimate bibles for probability distributions are Wimmer & Altmann (1999) which lists 750 univariate discrete distributions and Johnson et al. (1994) which details continuous distributions. In the appendix, we recall the basics of probability distributions as well as \common" mathe- matical functions, cf. section A.2. And for all distribution, we use the following notations • X a random variable following a given distribution, • x a realization of this random variable, • f the density function (if it exists), • F the (cumulative) distribution function, • P (X = k) the mass probability function in k, • M the moment generating function (if it exists), • G the probability generating function (if it exists), • φ the characteristic function (if it exists), Finally all graphics are done the open source statistical software R and its numerous packages available on the Comprehensive R Archive Network (CRAN∗).
    [Show full text]
  • Field Guide to Continuous Probability Distributions
    Field Guide to Continuous Probability Distributions Gavin E. Crooks v 1.0.0 2019 G. E. Crooks – Field Guide to Probability Distributions v 1.0.0 Copyright © 2010-2019 Gavin E. Crooks ISBN: 978-1-7339381-0-5 http://threeplusone.com/fieldguide Berkeley Institute for Theoretical Sciences (BITS) typeset on 2019-04-10 with XeTeX version 0.99999 fonts: Trump Mediaeval (text), Euler (math) 271828182845904 2 G. E. Crooks – Field Guide to Probability Distributions Preface: The search for GUD A common problem is that of describing the probability distribution of a single, continuous variable. A few distributions, such as the normal and exponential, were discovered in the 1800’s or earlier. But about a century ago the great statistician, Karl Pearson, realized that the known probabil- ity distributions were not sufficient to handle all of the phenomena then under investigation, and set out to create new distributions with useful properties. During the 20th century this process continued with abandon and a vast menagerie of distinct mathematical forms were discovered and invented, investigated, analyzed, rediscovered and renamed, all for the purpose of de- scribing the probability of some interesting variable. There are hundreds of named distributions and synonyms in current usage. The apparent diver- sity is unending and disorienting. Fortunately, the situation is less confused than it might at first appear. Most common, continuous, univariate, unimodal distributions can be orga- nized into a small number of distinct families, which are all special cases of a single Grand Unified Distribution. This compendium details these hun- dred or so simple distributions, their properties and their interrelations.
    [Show full text]