EXTREME RISK INITIATIVE - NYU SCHOOL OF ENGINEERING WORKING PAPER SERIES

On the Super-Additivity and Estimation Biases of Quantile Contributions

Nassim Nicholas Taleb∗, Raphael Douady†
∗School of Engineering, New York University
†Riskdata & C.N.R.S. Paris, Labex ReFi, Centre d'Economie de la Sorbonne

Abstract: Sample measures of top centile contributions to the total (concentration) are downward biased, unstable estimators, extremely sensitive to sample size and concave in accounting for large deviations. This makes them particularly unfit in domains with power law tails, especially for low values of the exponent. These estimators can vary over time and increase with the population size, as shown in this article, thus providing the illusion of structural changes in concentration. They are also inconsistent under aggregation and mixing distributions, as the weighted average of concentration measures for A and B will tend to be lower than that from A ∪ B. In addition, it can be shown that under such fat tails, increases in the total sum need to be accompanied by increased sample size of the concentration measurement. We examine the estimation superadditivity and bias under homogeneous and mixed distributions.

Fourth version, Nov 11 2014
arXiv:1405.1791v3 [stat.AP] 12 Nov 2014

I. INTRODUCTION

Vilfredo Pareto noticed that 80% of the land in Italy belonged to 20% of the population, and vice-versa, thus both giving birth to the power law class of distributions and the popular saying 80/20. The self-similarity at the core of the property of power laws [1] and [2] allows us to recurse and reapply the 80/20 to the remaining 20%, and so forth until one obtains the result that the top percent of the population will own about 53% of the total wealth.

It looks like such a measure of concentration can be seriously biased, depending on how it is measured, so it is very likely that the true ratio of concentration of what Pareto observed, that is, the share of the top percentile, was closer to 70%, hence changes year-on-year would drift higher to converge to such a level from larger samples. In fact, as we will show in this discussion, for, say, wealth, more complete samples resulting from technological progress, and also larger population and economic growth, will make such a measure converge by increasing over time, for no other reason than expansion in sample space or aggregate value.

The core of the problem is that, for the class of one-tailed fat-tailed random variables, that is, bounded on the left and unbounded on the right, where the random variable $X \in [x_{\min}, \infty)$, the in-sample quantile contribution is a biased estimator of the true value of the actual quantile contribution.

Let us define the quantile contribution

$$\kappa_q = q \, \frac{\mathbb{E}[X \mid X > h(q)]}{\mathbb{E}[X]}$$

where $h(q) = \inf\{h \in [x_{\min}, +\infty) : \mathbb{P}(X > h) \le q\}$ is the exceedance threshold for the probability $q$.

For a given sample $(X_k)_{1 \le k \le n}$, its "natural" estimator $\hat{\kappa}_q \equiv \frac{q\text{th percentile}}{\text{total}}$, used in most academic studies, can be expressed as

$$\hat{\kappa}_q \equiv \frac{\sum_{i=1}^{n} \mathbb{1}_{X_i > \hat{h}(q)} X_i}{\sum_{i=1}^{n} X_i}$$

where $\hat{h}(q)$ is the estimated exceedance threshold for the probability $q$:

$$\hat{h}(q) = \inf\Big\{h : \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}_{X_i > h} \le q\Big\}$$

We shall see that the observed variable $\hat{\kappa}_q$ is a downward biased estimator of the true ratio $\kappa_q$, the one that would hold out of sample, and such bias is in proportion to the fatness of tails and, for very fat tailed distributions, remains significant, even for very large samples.

II. ESTIMATION FOR UNMIXED PARETO-TAILED DISTRIBUTIONS

Let $X$ be a random variable belonging to the class of distributions with a "power law" right tail, that is:

$$\mathbb{P}(X > x) \sim L(x)\, x^{-\alpha} \tag{1}$$

where $L : [x_{\min}, +\infty) \to (0, +\infty)$ is a slowly varying function, defined as $\lim_{x \to +\infty} \frac{L(kx)}{L(x)} = 1$ for any $k > 0$.

There is little difference for small exceedance quantiles (<50%) between the various possible distributions such as Student's t, Lévy α-stable, Dagum [3],[4], Singh-Maddala distribution [5], or straight Pareto.

For exponents $1 \le \alpha \le 2$, as observed in [6], the law of large numbers operates, though extremely slowly. The problem is acute for $\alpha$ around, but strictly above, 1 and severe, as it diverges, for $\alpha = 1$.

A. Bias and Convergence

1) Simple Pareto Distribution: Let us first consider $\varphi_\alpha(x)$, the density of an α-Pareto distribution bounded from below by $x_{\min} > 0$, in other words: $\varphi_\alpha(x) = \alpha\, x_{\min}^{\alpha}\, x^{-\alpha-1}\, \mathbb{1}_{x \ge x_{\min}}$, and $\mathbb{P}(X > x) = \left(\frac{x_{\min}}{x}\right)^{\alpha}$. Under these assumptions, the cutpoint of exceedance is $h(q) = x_{\min}\, q^{-1/\alpha}$ and we have:

$$\kappa_q = \frac{\int_{h(q)}^{\infty} x\, \varphi_\alpha(x)\, dx}{\int_{x_{\min}}^{\infty} x\, \varphi_\alpha(x)\, dx} = \left(\frac{h(q)}{x_{\min}}\right)^{1-\alpha} = q^{\frac{\alpha-1}{\alpha}} \tag{2}$$

If the distribution of $X$ is α-Pareto only beyond a cut-point $x_{\text{cut}}$, which we assume to be below $h(q)$, so that we have $\mathbb{P}(X > x) = \left(\frac{\lambda}{x}\right)^{\alpha}$ for some $\lambda > 0$, then we still have $h(q) = \lambda\, q^{-1/\alpha}$ and

$$\kappa_q = \frac{\alpha}{\alpha - 1}\, \frac{\lambda}{\mathbb{E}[X]}\, q^{\frac{\alpha-1}{\alpha}}$$

The estimation of $\kappa_q$ hence requires that of the exponent $\alpha$ as well as that of the scaling parameter $\lambda$, or at least its ratio to the expectation of $X$.

Table I shows the bias of $\hat{\kappa}_q$ as an estimator of $\kappa_q$ in the case of an α-Pareto distribution for $\alpha = 1.1$, a value chosen to be compatible with practical economic measures, such as the wealth distribution in the world or in a particular country, including developed ones.¹ In such a case, the estimator is extremely sensitive to "small" samples, "small" meaning in practice $10^8$. We ran up to a trillion simulations across varieties of sample sizes. While $\kappa_{0.01} \approx 0.657933$, even a sample size of 100 million remains severely biased, as seen in the table.

Naturally the bias is rapidly (and nonlinearly) reduced for $\alpha$ further away from 1, and becomes weak in the neighborhood of 2 for a constant $\alpha$, though not under a mixture distribution for $\alpha$, as we shall see later. It is also weaker outside the top 1% centile, hence this discussion focuses on the famed "one percent" and on low values of the $\alpha$ exponent.

TABLE I: Biases of Estimator of $\kappa = 0.657933$ From $10^{12}$ Monte Carlo Realizations

$\hat{\kappa}(n)$       Mean        Median      STD across MC runs
$\hat{\kappa}(10^3)$    0.405235    0.367698    0.160244
$\hat{\kappa}(10^4)$    0.485916    0.458449    0.117917
$\hat{\kappa}(10^5)$    0.539028    0.516415    0.0931362
$\hat{\kappa}(10^6)$    0.581384    0.555997    0.0853593
$\hat{\kappa}(10^7)$    0.591506    0.575262    0.0601528
$\hat{\kappa}(10^8)$    0.606513    0.593667    0.0461397

In view of these results and of a number of tests we have performed around them, we can conjecture that the bias $\kappa_q - \hat{\kappa}_q(n)$ is "of the order of" $c(\alpha, q)\, n^{-b(q)(\alpha-1)}$, where the constants $b(q)$ and $c(\alpha, q)$ need to be evaluated. Simulations suggest that $b(q) = 1$, whatever the value of $\alpha$ and $q$, but the rather slow [...]

[...] of concavity of the concentration measure with respect to an innovation (a new sample value), whether it falls below or above the threshold. Let $A_h(n) = \sum_{i=1}^{n} \mathbb{1}_{X_i > h} X_i$ and $S(n) = \sum_{i=1}^{n} X_i$, so that $\hat{\kappa}_h(n) = \frac{A_h(n)}{S(n)}$, and assume a frozen threshold $h$. If a new sample value $X_{n+1} < h$, then the new value is

$$\hat{\kappa}_h(n+1) = \frac{A_h(n)}{S(n) + X_{n+1}}$$

The value is convex in $X_{n+1}$, so that uncertainty on $X_{n+1}$ increases its expectation. At variance, if the new sample value $X_{n+1} > h$, the new value is

$$\hat{\kappa}_h(n+1) \approx \frac{A_h(n) + X_{n+1} - h}{S(n) + X_{n+1} - h} = 1 - \frac{S(n) - A_h(n)}{S(n) + X_{n+1} - h}$$

which is now concave in $X_{n+1}$, so that uncertainty on $X_{n+1}$ reduces its value.

The competition between these two opposite effects is in favor of the latter, because of a higher concavity with respect to the variable, and also of a higher variability (whatever its measurement) of the variable conditionally to being above the threshold than to being below. The fatter the right tail of the distribution, the stronger the effect. Overall, we find that $\mathbb{E}[\hat{\kappa}_h(n)] \le \frac{\mathbb{E}[A_h(n)]}{\mathbb{E}[S(n)]} = \kappa_h$ (note that unfreezing the threshold $\hat{h}(q)$ also tends to reduce the concentration measure estimate, adding to the effect, when introducing one extra sample, because of a slight increase in the expected value of the estimator $\hat{h}(q)$, although this effect is rather negligible). We have in fact the following:

Proposition 1. Let $X = (X_i)_{i=1}^{n}$ be a random sample of size $n > \frac{1}{q}$, $Y = X_{n+1}$ an extra single random observation, and define:

$$\hat{\kappa}_h(X \sqcup Y) = \frac{\sum_{i=1}^{n} \mathbb{1}_{X_i > h} X_i + \mathbb{1}_{Y > h} Y}{\sum_{i=1}^{n} X_i + Y}$$

We remark that, whenever $Y > h$, one has:

$$\frac{\partial^2 \hat{\kappa}_h(X \sqcup Y)}{\partial Y^2} \le 0.$$

This inequality is still valid with $\hat{\kappa}_q$, as the value $\hat{h}(q, X \sqcup Y)$ doesn't depend on the particular value of $Y > \hat{h}(q, X)$.

We face a different situation from the common small sample effect resulting from high impact from the rare observation in the tails that are less likely to show up in small samples, a bias which goes away by repetition of sample runs.