Kolmogorov Complexity and Algorithmic Randomness

Total Pages: 16

File Type: PDF, Size: 1020 KB

Kolmogorov Complexity and Algorithmic Randomness
Mathematical Surveys and Monographs, Volume 220
A. Shen, V. A. Uspensky, N. Vereshchagin
American Mathematical Society, Providence, Rhode Island
10.1090/surv/220

EDITORIAL COMMITTEE: Robert Guralnick, Benjamin Sudakov, Michael A. Singer (Chair), Constantin Teleman, Michael I. Weinstein

Russian original: N. K. Vereshchagin, V. A. Uspensky, A. Shen, Kolmogorovskaya slozhnost' i algoritmicheskaya sluchaynost', MCNMO, Moscow, 2013.

2010 Mathematics Subject Classification. Primary 68Q30, 03D32, 60A99, 97K70.

For additional information and updates on this book, visit www.ams.org/bookpages/surv-220

Library of Congress Cataloging-in-Publication Data
Names: Shen, A. (Alexander), 1958- | Uspenskiĭ, V. A. (Vladimir Andreevich) | Vereshchagin, N. K., 1928-
Title: Kolmogorov complexity and algorithmic randomness / A. Shen, V. A. Uspensky, N. Vereshchagin.
Other titles: Kolmogorovskaya slozhnost' i algoritmicheskaya sluchaynost'. English
Description: Providence, Rhode Island : American Mathematical Society, [2017] | Series: Mathematical surveys and monographs ; volume 220 | Includes bibliographical references and index.
Identifiers: LCCN 2016049293 | ISBN 9781470431822 (alk. paper)
Subjects: LCSH: Kolmogorov complexity. | Computational complexity. | Information theory. | AMS: Computer science – Theory of computing – Algorithmic information theory (Kolmogorov complexity, etc.). msc | Mathematical logic and foundations – Computability and recursion theory – Algorithmic randomness and dimension. msc | Probability theory and stochastic processes – Foundations of probability theory – None of the above, but in this section. msc | Mathematics education – Combinatorics, graph theory, probability theory, statistics – Foundations and methodology of statistics. msc
Classification: LCC QA267.7 .S52413 2017 | DDC 511.3/42–dc23
LC record available at https://lccn.loc.gov/2016049293

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Permissions to reuse portions of AMS publication content are handled by Copyright Clearance Center's RightsLink service. For more information, please visit: http://www.ams.org/rightslink. Send requests for translation rights and licensed reprints to [email protected]. Excluded from these provisions is material for which the author holds copyright. In such cases, requests for permission to reuse or reprint material should be addressed directly to the author(s). Copyright ownership is indicated on the copyright page, or on the lower right-hand corner of the first page of each article within proceedings volumes.

© 2017 by the authors. All rights reserved. Printed in the United States of America. The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.
Visit the AMS home page at http://www.ams.org/

To the memory of Andrei Muchnik

Contents

Preface xi
Acknowledgments xiii
Basic notions and notation xv
Introduction. What is this book about? 1
  What is Kolmogorov complexity? 1
  Optimal description modes 2
  Kolmogorov complexity 4
  Complexity and information 5
  Complexity and randomness 8
  Non-computability of C and Berry's paradox 9
  Some applications of Kolmogorov complexity 10
Chapter 1. Plain Kolmogorov complexity 15
  1.1. Definition and main properties 15
  1.2. Algorithmic properties 21
Chapter 2. Complexity of pairs and conditional complexity 31
  2.1. Complexity of pairs 31
  2.2. Conditional complexity 34
  2.3. Complexity as the amount of information 45
Chapter 3. Martin-Löf randomness 53
  3.1. Measures on Ω 53
  3.2. The Strong Law of Large Numbers 55
  3.3. Effectively null sets 59
  3.4. Properties of Martin-Löf randomness 65
  3.5. Randomness deficiencies 70
Chapter 4. A priori probability and prefix complexity 75
  4.1. Randomized algorithms and semimeasures on N 75
  4.2. Maximal semimeasures 79
  4.3. Prefix machines 81
  4.4. A digression: Machines with self-delimiting input 85
  4.5. The main theorem on prefix complexity 91
  4.6. Properties of prefix complexity 96
  4.7. Conditional prefix complexity and complexity of pairs 102
Chapter 5. Monotone complexity 115
  5.1. Probabilistic machines and semimeasures on the tree 115
  5.2. Maximal semimeasure on the binary tree 121
  5.3. A priori complexity and its properties 122
  5.4. Computable mappings of type Σ → Σ 126
  5.5. Monotone complexity 129
  5.6. Levin–Schnorr theorem 144
  5.7. The random number Ω 157
  5.8. Effective Hausdorff dimension 172
  5.9. Randomness with respect to different measures 176
Chapter 6. General scheme for complexities 193
  6.1. Decision complexity 193
  6.2. Comparing complexities 197
  6.3. Conditional complexities 200
  6.4. Complexities and oracles 202
Chapter 7. Shannon entropy and Kolmogorov complexity 213
  7.1. Shannon entropy 213
  7.2. Pairs and conditional entropy 217
  7.3. Complexity and entropy 226
Chapter 8. Some applications 233
  8.1. There are infinitely many primes 233
  8.2. Moving information along the tape 233
  8.3. Finite automata with several heads 236
  8.4. Laws of Large Numbers 238
  8.5. Forbidden substrings 241
  8.6. A proof of an inequality 255
  8.7. Lipschitz transformations are not transitive 258
Chapter 9. Frequency and game approaches to randomness 261
  9.1. The original idea of von Mises 261
  9.2. Set of strings as selection rules 262
  9.3. Mises–Church randomness 264
  9.4. Ville's example 267
  9.5. Martingales 270
  9.6. A digression: Martingales in probability theory 275
  9.7. Lower semicomputable martingales 277
  9.8. Computable martingales 279
  9.9. Martingales and Schnorr randomness 282
  9.10. Martingales and effective dimension 284
  9.11. Partial selection rules 287
  9.12. Non-monotonic selection rules 291
  9.13. Change in the measure and randomness 297
Chapter 10. Inequalities for entropy, complexity, and size 313
  10.1. Introduction and summary 313
  10.2. Uniform sets 318
  10.3. Construction of a uniform set 321
  10.4. Uniform sets and orbits 323
  10.5. Almost uniform sets 324
  10.6. Typization trick 326
  10.7. Combinatorial interpretation: Examples 328
  10.8. Combinatorial interpretation: The general case 330
  10.9. One more combinatorial interpretation 332
  10.10. The inequalities for two and three strings 336
  10.11. Dimensions and Ingleton's inequality 337
  10.12. Conditionally independent random variables 342
  10.13. Non-Shannon inequalities 343
Chapter 11. Common information 351
  11.1. Incompressible representations of strings 351
  11.2. Representing mutual information as a string 352
  11.3. The combinatorial meaning of common information 357
  11.4. Conditional independence and common information 362
Chapter 12. Multisource algorithmic information theory 367
  12.1. Information transmission requests 367
  12.2. Conditional encoding 368
  12.3. Conditional codes: Muchnik's theorem 369
  12.4. Combinatorial interpretation of Muchnik's theorem 373
  12.5. A digression: On-line matching 375
  12.6. Information distance and simultaneous encoding 377
  12.7. Conditional codes for two conditions 379
  12.8. Information flow and network cuts 383
  12.9. Networks with one source 384
  12.10. Common information as an information request 388
  12.11. Simplifying a program 389
  12.12. Minimal sufficient statistics 390
Chapter 13. Information and logic 401
  13.1. Problems, operations, complexity 401
  13.2. Problem complexity and intuitionistic logic 403
  13.3. Some formulas and their complexity 405
  13.4. More examples and the proof of Theorem 238 408
  13.5. Proof of a result similar to Theorem 238 using Kripke models 413
  13.6. A problem whose complexity is not expressible in terms of the complexities of tuples 417
Chapter 14. Algorithmic statistics 425
  14.1. The framework and randomness deficiency 425
  14.2. Stochastic objects 428
  14.3. Two-part descriptions 431
  14.4. Hypotheses of restricted type 438
  14.5. Optimality and randomness deficiency 447
  14.6. Minimal hypotheses 450
  14.7. A bit of philosophy 452
Appendix 1. Complexity and foundations of probability 455
  Probability theory paradox 455
  Current best practice 455
  Simple events and events specified in advance 456
  Frequency approach 458
  Dynamical and statistical laws 459
  Are "real-life" sequences complex? 459
  Randomness as ignorance: Blum–Micali–Yao pseudo-randomness 460
  A digression: Thermodynamics 461
  Another digression: Quantum mechanics 463
Appendix 2. Four algorithmic faces of randomness 465
  Introduction 465
  Face one: Frequency stability and stochasticness 468
  Face two: Chaoticness 470
  Face three: Typicalness 475
  Face four: Unpredictability 476
  Generalization for arbitrary computable distributions 480
  History and bibliography 486
Bibliography 491
Name Index 501
Subject Index 505

Preface

The notion of algorithmic complexity (also sometimes called algorithmic entropy) appeared in the 1960s in between the theory of computation, probability theory, and information theory. The idea of A. N. Kolmogorov was to measure the amount of information in finite objects (and not in random variables, …
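The preface excerpt breaks off above. As a loose illustration of the idea it introduces (my sketch, not the book's), lossless compression gives a crude upper bound on how short a description of a finite string can be: a highly regular string admits a much shorter description than a random-looking one of the same length. The snippet below uses Python's standard zlib module; the variable and function names are mine.

```python
# Rough illustration (not from the book): lossless compression gives an upper
# bound on how many bits are needed to describe a finite string, which is the
# intuition behind measuring the information content of a finite object.
import os
import zlib

def compressed_size_bits(data: bytes) -> int:
    """Size in bits of a zlib-compressed description of `data` (an upper bound only)."""
    return 8 * len(zlib.compress(data, 9))

regular = b"01" * 5000           # highly regular: "repeat '01' 5000 times" is a short description
random_like = os.urandom(10000)  # incompressible with overwhelming probability

print("regular string:    ", compressed_size_bits(regular), "bits")
print("random-like string:", compressed_size_bits(random_like), "bits")
```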
Recommended publications
  • The Foundations of Solomonoff Prediction
    The Foundations of Solomonoff Prediction

    MSc Thesis for the graduate programme in History and Philosophy of Science at the Universiteit Utrecht by Tom Florian Sterkenburg, under the supervision of prof.dr. D.G.B.J. Dieks (Institute for History and Foundations of Science, Universiteit Utrecht) and prof.dr. P.D. Grünwald (Centrum Wiskunde & Informatica; Universiteit Leiden), February 2013.

    Parsifal: Wer ist der Gral?
    Gurnemanz: Das sagt sich nicht; doch bist du selbst zu ihm erkoren, bleibt dir die Kunde unverloren. – Und sieh! – Mich dünkt, daß ich dich recht erkannt: kein Weg führt zu ihm durch das Land, und niemand könnte ihn beschreiten, den er nicht selber möcht' geleiten.
    Parsifal: Ich schreite kaum, doch wähn' ich mich schon weit.
    Richard Wagner, Parsifal, Act I, Scene I

    Voor mum

    Abstract

    R.J. Solomonoff's theory of Prediction assembles notions from information theory, confirmation theory and computability theory into the specification of a supposedly all-encompassing objective method of prediction. The theory has been the subject of both general neglect and occasional passionate promotion, but of very little serious philosophical reflection. This thesis presents an attempt towards a more balanced philosophical appraisal of Solomonoff's theory. Following an in-depth treatment of the mathematical framework and its motivation, I shift attention to the proper interpretation of these formal results. A discussion of the theory's possible aims turns into the project of identifying its core principles, and a defence of the primacy of the unifying principle of Universality supports the development of my proposed interpretation of Solomonoff Prediction as the statement, to be read in the context of the philosophical problem of prediction, that in a universal setting, there exist universal predictors.
  • Computational Complexity
    In 1965, the year Juris Hartmanis became Chair of the new CS Department at Cornell, he and his colleague Richard Stearns published the paper On the computational complexity of algorithms in the Transactions of the American Mathematical Society. The paper introduced a new field and gave it its name. Immediately recognized as a fundamental advance, it attracted the best talent to the field. Theoretical computer science was immediately broadened from automata theory, formal languages, and algorithms to include computational complexity. As Richard Karp said in his Turing Award lecture, "All of us who read their paper could not fail to realize that we now had a satisfactory formal framework for pursuing the questions that Edmonds had raised earlier in an intuitive fashion—questions about whether, for instance, the traveling salesman problem is solvable in polynomial time." Hartmanis and Stearns showed that computational problems have an inherent complexity, which can be quantified in terms of the number of steps needed on a simple model of a computer, the multi-tape Turing machine. In a subsequent paper with Philip Lewis, they proved analogous results for the number of tape cells used.

    […] equivalences and separations among complexity classes, fundamental and hard open problems, and unexpected connections to distant fields of study. An early example of the surprises that lurk in the structure of complexity classes is the Gap Theorem, proved by Hartmanis's student Allan […]

    [Sidebar diagram, "Computational complexity": the Kleene hierarchy (RE, co-RE, recursive) and the classes EXPSPACE, NEXPTIME, EXPTIME, PSPACE = IP, the polynomial hierarchy, NP, co-NP, P, NLOGSPACE, LOGSPACE. Caption: this diagram shows how the field believes complexity classes look; it is known that P is different from EXPTIME, but there is no proof that NP ≠ P and PSPACE ≠ P.]
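    As a compact restatement of the relations the sidebar diagram refers to (standard facts stated from general background, not taken from the excerpt itself):

```latex
% Standard relations among the classes named in the diagram; of the strict
% separations one might hope for, only P != EXPTIME is known unconditionally.
\[
\mathsf{LOGSPACE} \subseteq \mathsf{NLOGSPACE} \subseteq \mathsf{P} \subseteq \mathsf{NP}
  \subseteq \mathsf{PSPACE} = \mathsf{IP} \subseteq \mathsf{EXPTIME}
  \subseteq \mathsf{NEXPTIME} \subseteq \mathsf{EXPSPACE},
\qquad \mathsf{P} \subsetneq \mathsf{EXPTIME}.
\]
```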
  • 3 Autocorrelation
    3 Autocorrelation

    Autocorrelation refers to the correlation of a time series with its own past and future values. Autocorrelation is also sometimes called "lagged correlation" or "serial correlation", which refers to the correlation between members of a series of numbers arranged in time. Positive autocorrelation might be considered a specific form of "persistence", a tendency for a system to remain in the same state from one observation to the next. For example, the likelihood of tomorrow being rainy is greater if today is rainy than if today is dry. Geophysical time series are frequently autocorrelated because of inertia or carryover processes in the physical system. For example, the slowly evolving and moving low pressure systems in the atmosphere might impart persistence to daily rainfall. Or the slow drainage of groundwater reserves might impart correlation to successive annual flows of a river. Or stored photosynthates might impart correlation to successive annual values of tree-ring indices.

    Autocorrelation complicates the application of statistical tests by reducing the effective sample size. Autocorrelation can also complicate the identification of significant covariance or correlation between time series (e.g., precipitation with a tree-ring series). Autocorrelation implies that a time series is predictable, probabilistically, as future values are correlated with current and past values. Three tools for assessing the autocorrelation of a time series are (1) the time series plot, (2) the lagged scatterplot, and (3) the autocorrelation function.

    3.1 Time series plot

    Positively autocorrelated series are sometimes referred to as persistent because positive departures from the mean tend to be followed by positive departures from the mean, and negative departures from the mean tend to be followed by negative departures (Figure 3.1).
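    A minimal sketch of the third tool mentioned above, the autocorrelation function, estimated at a few lags for a persistent series; this example and the helper autocorr are mine (assuming numpy is available), not part of the excerpt.

```python
# Minimal sketch (not from the excerpt): estimate the autocorrelation function
# of a persistent (AR(1)-like) series at a few lags, using numpy only.
import numpy as np

def autocorr(x: np.ndarray, lag: int) -> float:
    """Sample autocorrelation of x at the given positive lag."""
    x = x - x.mean()
    return float(np.sum(x[lag:] * x[:-lag]) / np.sum(x * x))

rng = np.random.default_rng(0)
n, phi = 1000, 0.7                      # phi > 0 gives positive persistence
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

for lag in (1, 2, 5):
    print(f"lag {lag}: r = {autocorr(x, lag):+.2f}")   # roughly phi**lag
```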
  • The Bayesian Approach to Statistics
    THE BAYESIAN APPROACH TO STATISTICS
    ANTHONY O'HAGAN

    INTRODUCTION

    By far the most widely taught and used statistical methods in practice are those of the frequentist school. The ideas of frequentist inference, as set out in Chapter 5 of this book, rest on the frequency definition of probability (Chapter 2), and were developed in the first half of the 20th century. This chapter concerns a radically different approach to statistics, the Bayesian approach, which depends instead on the subjective definition of probability (Chapter 3). In some respects, Bayesian methods are older than frequentist ones, having been the basis of very early statistical reasoning as far back as the 18th century. Bayesian statistics as it is now understood, however, dates back to the 1950s, with subsequent development in the second half of the 20th century. Over that time, the Bayesian approach has steadily gained ground, and is now recognized as a legitimate alternative to the frequentist approach. This chapter is organized into three sections. […] the true nature of scientific reasoning. The final section addresses various features of modern Bayesian methods that provide some explanation for the rapid increase in their adoption since the 1980s.

    BAYESIAN INFERENCE

    We first present the basic procedures of Bayesian inference.

    Bayes's Theorem and the Nature of Learning

    Bayesian inference is a process of learning from data. To give substance to this statement, we need to identify who is doing the learning and what they are learning about.

    Terms and Notation

    The person doing the learning is an individual
  • An Algorithmic Approach to Information and Meaning: A Formal Framework for a Philosophical Discussion
    An Algorithmic Approach to Information and Meaning: A Formal Framework for a Philosophical Discussion*

    Hector Zenil, [email protected], Institut d'Histoire et de Philosophie des Sciences et des Techniques (Paris 1/ENS Ulm/CNRS), http://www.algorithmicnature.org/zenil

    Abstract: I'll survey some of the aspects relevant to a philosophical discussion of information taking into account the developments of algorithmic information theory. I will propose that meaning is deep in Bennett's logical depth sense, and that algorithmic probability may provide the stability for a robust algorithmic definition of meaning, taking into consideration the interpretation and the receiver's own knowledge encoded in the story of a message.

    Keywords: information content; meaning; algorithmic probability; algorithmic complexity; logical depth; philosophy of information; information theory.

    * Presented at the Interdisciplinary Workshop: Ontological, Epistemological and Methodological Aspects of Computer Science, Philosophy of Simulation (SimTech Cluster of Excellence), Institute of Philosophy, at the Faculty of Informatics, University of Stuttgart, Germany, July 7, 2011.

    1 Introduction

    Information can be a cornerstone for interpreting all kinds of world phenomena as it can constitute the basis for the description of objects. While it is legitimate to study ideas and concepts related to information in their broadest sense, that the use of information outside formal contexts amounts to misuse cannot and should not be overlooked. It is not unusual to come across surveys and volumes devoted to information (in the larger sense) in which the mathematical discussion does not venture beyond the state of the field as Shannon [30] left it some 60 years ago.
  • A Short History of Computational Complexity
    The Computational Complexity Column
    by Lance FORTNOW, NEC Laboratories America, 4 Independence Way, Princeton, NJ 08540, USA
    [email protected], http://www.neci.nj.nec.com/homepages/fortnow/beatcs

    Every third year the Conference on Computational Complexity is held in Europe, and this summer the University of Aarhus (Denmark) will host the meeting July 7-10. More details at the conference web page http://www.computationalcomplexity.org. This month we present a historical view of computational complexity written by Steve Homer and myself. This is a preliminary version of a chapter to be included in an upcoming North-Holland Handbook of the History of Mathematical Logic edited by Dirk van Dalen, John Dawson and Aki Kanamori.

    A Short History of Computational Complexity
    Lance Fortnow (NEC Research Institute, 4 Independence Way, Princeton, NJ 08540) and Steve Homer (Computer Science Department, Boston University, 111 Cummington Street, Boston, MA 02215)

    1 Introduction

    It all started with a machine. In 1936, Turing developed his theoretical computational model. He based his model on how he perceived mathematicians think. As digital computers were developed in the 40's and 50's, the Turing machine proved itself as the right theoretical model for computation. Quickly though we discovered that the basic Turing machine model fails to account for the amount of time or memory needed by a computer, a critical issue today but even more so in those early days of computing. The key idea to measure time and space as a function of the length of the input came in the early 1960's by Hartmanis and Stearns.
  • A Review of Graph and Network Complexity from an Algorithmic Information Perspective
    entropy (Review)
    A Review of Graph and Network Complexity from an Algorithmic Information Perspective
    Hector Zenil 1,2,3,4,5,*, Narsis A. Kiani 1,2,3,4 and Jesper Tegnér 2,3,4,5
    1 Algorithmic Dynamics Lab, Centre for Molecular Medicine, Karolinska Institute, 171 77 Stockholm, Sweden; [email protected]
    2 Unit of Computational Medicine, Department of Medicine, Karolinska Institute, 171 77 Stockholm, Sweden; [email protected]
    3 Science for Life Laboratory (SciLifeLab), 171 77 Stockholm, Sweden
    4 Algorithmic Nature Group, Laboratoire de Recherche Scientifique (LABORES) for the Natural and Digital Sciences, 75005 Paris, France
    5 Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
    * Correspondence: [email protected] or [email protected]
    Received: 21 June 2018; Accepted: 20 July 2018; Published: 25 July 2018

    Abstract: Information-theoretic-based measures have been useful in quantifying network complexity. Here we briefly survey and contrast (algorithmic) information-theoretic methods which have been used to characterize graphs and networks. We illustrate the strengths and limitations of Shannon's entropy, lossless compressibility and algorithmic complexity when used to identify aspects and properties of complex networks. We review the fragility of computable measures on the one hand and the invariant properties of algorithmic measures on the other, demonstrating how current approaches to algorithmic complexity are misguided and suffer from similar limitations to traditional statistical approaches such as Shannon entropy. Finally, we review some current definitions of algorithmic complexity which are used in analyzing labelled and unlabelled graphs. This analysis opens up several new opportunities to advance beyond traditional measures.
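    A toy sketch (mine, not the paper's) contrasting two of the measures the abstract names, Shannon entropy (here of the degree distribution) and lossless compressibility (of the adjacency matrix), on a ring lattice versus a comparably sparse random graph; all function names are assumptions of this sketch.

```python
# Toy sketch (not from the paper): two graph-complexity proxies named in the
# abstract, Shannon entropy of the degree distribution and lossless
# compressibility of the adjacency matrix, on a ring lattice vs. a random graph.
import math
import random
import zlib
from collections import Counter

def degree_entropy(adj):
    """Shannon entropy (bits) of the empirical degree distribution."""
    degrees = [sum(row) for row in adj]
    n = len(degrees)
    return -sum((c / n) * math.log2(c / n) for c in Counter(degrees).values())

def compressed_bits(adj):
    """Bits needed for the zlib-compressed 0/1 adjacency matrix (a compressibility proxy)."""
    flat = bytes(cell for row in adj for cell in row)
    return 8 * len(zlib.compress(flat, 9))

def ring(n):
    """Cycle graph: node i is linked to its two neighbours."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][(i + 1) % n] = adj[(i + 1) % n][i] = 1
    return adj

def random_graph(n, p, seed=0):
    """Erdos-Renyi-style G(n, p) graph."""
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj

n = 64
for name, g in [("ring lattice", ring(n)), ("random graph", random_graph(n, 2 / (n - 1)))]:
    print(f"{name}: degree entropy = {degree_entropy(g):.2f} bits, "
          f"compressed adjacency = {compressed_bits(g)} bits")
```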
  • Introduction to Bayesian Inference and Modeling Edps 590BAY
    Introduction to Bayesian Inference and Modeling, Edps 590BAY
    Carolyn J. Anderson, Department of Educational Psychology
    © Board of Trustees, University of Illinois, Fall 2019

    Overview
    ◮ What is Bayes theorem
    ◮ Why Bayesian analysis
    ◮ What is probability?
    ◮ Basic steps
    ◮ A little example
    ◮ History (not all of the 705+ people that influenced development of the Bayesian approach)
    ◮ In-class work with probabilities
    Depending on the book that you select for this course, read either Gelman et al. Chapter 1 or Kruschke Chapters 1 & 2.

    Main References for Course
    Throughout the course, I will take material from:
    ◮ Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., & Rubin, D.B. (2014). Bayesian Data Analysis, 3rd Edition. Boca Raton, FL: CRC/Taylor & Francis.**
    ◮ Hoff, P.D. (2009). A First Course in Bayesian Statistical Methods. NY: Springer.**
    ◮ McElreath, R.M. (2016). Statistical Rethinking: A Bayesian Course with Examples in R and Stan. Boca Raton, FL: CRC/Taylor & Francis.
    ◮ Kruschke, J.K. (2015). Doing Bayesian Data Analysis: A Tutorial with JAGS and Stan. NY: Academic Press.**
    ** There are e-versions of these from the UofI library. There is a version of McElreath, but I couldn't get it from the UofI e-collection.

    Bayes Theorem
    A whole semester on this?
    p(θ|y) = p(y|θ) p(θ) / p(y)
    where
    ◮ y is data, a sample from some population.
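    A small numerical sketch of the theorem as written above (mine, not from the slides), for a coin whose unknown bias θ is restricted to three candidate values; the variable names are assumptions of the example.

```python
# Small numerical sketch (not from the slides): Bayes' theorem
# p(theta | y) = p(y | theta) p(theta) / p(y) for a coin bias theta
# restricted to three candidate values, after observing y = 7 heads in 10 flips.
from math import comb

thetas = [0.3, 0.5, 0.7]
prior = {t: 1 / 3 for t in thetas}             # uniform prior p(theta)
heads, flips = 7, 10

def likelihood(theta):                          # p(y | theta), binomial
    return comb(flips, heads) * theta**heads * (1 - theta)**(flips - heads)

evidence = sum(likelihood(t) * prior[t] for t in thetas)    # p(y)
posterior = {t: likelihood(t) * prior[t] / evidence for t in thetas}

for t in thetas:
    print(f"p(theta={t} | y) = {posterior[t]:.3f}")
```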
  • Autocorrelation
    Autocorrelation
    David Gerbing, School of Business Administration, Portland State University
    January 30, 2016

    Autocorrelation

    The difference between an actual data value and the forecasted data value from a model is the residual for that forecasted value.

    Residual: e_i = Y_i − Ŷ_i

    One of the assumptions of the least squares estimation procedure for the coefficients of a regression model is that the residuals are purely random. One consequence of randomness is that the residuals would not correlate with anything else, including with each other at different time points. A value above the mean, that is, a value with a positive residual, would contain no information as to whether the next value in time would have a positive residual, or negative residual, with a data value below the mean.

    For example, flipping a fair coin yields random flips, with half of the flips resulting in a Head and the other half a Tail. If a Head is scored as a 1 and a Tail as a 0, and the probability of both Heads and Tails is 0.5, then calculate the value of the population mean as:

    Population Mean: µ = (0.5)(1) + (0.5)(0) = 0.5

    The forecast of the outcome of the next flip of a fair coin is the mean of the process, 0.5, which is stable over time. What are the corresponding residuals? A residual value is the difference of the corresponding data value minus the mean. With this scoring system, a Head generates a positive residual from the mean, µ:

    Head: e_i = 1 − µ = 1 − 0.5 = 0.5

    A Tail generates a negative residual from the mean:

    Tail: e_i = 0 − µ = 0 − 0.5 = −0.5

    The error terms of the coin flips are independent of each other, so if the current flip is a Head, or if the last 5 flips are Heads, the forecast for the next flip is still µ = 0.5.
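    The coin-flip argument above can be checked numerically; the following is a minimal sketch (mine, not from the excerpt) that simulates fair flips, forms the ±0.5 residuals around µ = 0.5, and confirms their lag-1 correlation is near zero.

```python
# Minimal sketch (not from the excerpt): residuals of fair coin flips around the
# process mean mu = 0.5, and their near-zero correlation with the previous flip.
import random

random.seed(1)
flips = [random.randint(0, 1) for _ in range(100_000)]    # Head = 1, Tail = 0
mu = 0.5
residuals = [x - mu for x in flips]                        # +0.5 for Heads, -0.5 for Tails

# Sample lag-1 correlation of the residuals: independence keeps it close to 0.
n = len(residuals)
lag1 = sum(residuals[t] * residuals[t - 1] for t in range(1, n)) / sum(r * r for r in residuals)
print("mean of flips:", sum(flips) / n)                    # close to mu = 0.5
print("lag-1 correlation of residuals:", round(lag1, 4))   # close to 0
```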
  • Mild Vs. Wild Randomness: Focusing on Those Risks That Matter
    Mild vs. Wild Randomness: Focusing on those Risks that Matter
    Benoit Mandelbrot & Nassim Nicholas Taleb
    Forthcoming in The Known, the Unknown and the Unknowable in Financial Institutions, Frank Diebold, Neil Doherty, and Richard Herring, editors, Princeton: Princeton University Press.

    Conventional studies of uncertainty, whether in statistics, economics, finance or social science, have largely stayed close to the so-called "bell curve", a symmetrical graph that represents a probability distribution. Used to great effect to describe errors in astronomical measurement by the 19th-century mathematician Carl Friedrich Gauss, the bell curve, or Gaussian model, has since pervaded our business and scientific culture, and terms like sigma, variance, standard deviation, correlation, R-square and Sharpe ratio are all directly linked to it. The traditional Gaussian way of looking at the […]

    […] with empirical sense¹. Granted, it has been tinkered with using such methods as complementary "jumps", stress testing, regime switching or the elaborate methods known as GARCH, but while they represent a good effort, they fail to remediate the bell curve's irremediable flaws. The problem is that measures of uncertainty using the bell curve simply disregard the possibility of sharp jumps or discontinuities. Therefore they have no meaning or consequence. Using them is like focusing on the grass and missing out on the (gigantic) trees. In fact, while the occasional and unpredictable large deviations are rare, they cannot be dismissed as "outliers" because, cumulatively, their impact in the long term is so dramatic. The good news, especially for practitioners, is that the fractal model is both intuitively and computationally simpler than the Gaussian. It too has been around since the sixties, which makes us wonder why it was not implemented before.
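    A rough numerical illustration of the point about disregarded large deviations (my sketch, not the authors'; the Pareto tail index of 3 is an arbitrary choice): the chance of a move beyond k typical units under the bell curve versus under a fat-tailed alternative.

```python
# Rough illustration (not from the article): tail probabilities P(|X| > k)
# for a standard Gaussian versus a heavy-tailed Pareto-type variable.
# Far-tail events are vanishingly rare under the bell curve but not here.
import math

def gaussian_tail(k: float) -> float:
    """P(|Z| > k) for a standard normal Z."""
    return math.erfc(k / math.sqrt(2))

def pareto_tail(k: float, alpha: float = 3.0) -> float:
    """P(|X| > k) for a Pareto-type variable with tail index alpha and scale 1."""
    return min(1.0, k ** (-alpha))

for k in (2, 4, 6, 10):
    print(f"k = {k:2d}:  Gaussian {gaussian_tail(k):.2e}   fat-tailed {pareto_tail(k):.2e}")
```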
  • Random Variables and Applications
    Random Variables and Applications, OPRE 6301

    Random Variables

    As noted earlier, variability is omnipresent in the business world. To model variability probabilistically, we need the concept of a random variable. A random variable is a numerically valued variable which takes on different values with given probabilities.

    Examples:
    • The return on an investment in a one-year period
    • The price of an equity
    • The number of customers entering a store
    • The sales volume of a store on a particular day
    • The turnover rate at your organization next year

    Types of Random Variables

    Discrete Random Variable: one that takes on a countable number of possible values, e.g.,
    • total of roll of two dice: 2, 3, ..., 12
    • number of desktops sold: 0, 1, ...
    • customer count: 0, 1, ...

    Continuous Random Variable: one that takes on an uncountable number of possible values, e.g.,
    • interest rate: 3.25%, 6.125%, ...
    • task completion time: a nonnegative value
    • price of a stock: a nonnegative value

    Basic Concept: Integer or rational numbers are discrete, while real numbers are continuous.

    Probability Distributions

    "Randomness" of a random variable is described by a probability distribution. Informally, the probability distribution specifies the probability or likelihood for a random variable to assume a particular value. Formally, let X be a random variable and let x be a possible value of X. Then, we have two cases.

    Discrete: the probability mass function of X specifies P(x) ≡ P(X = x) for all possible values of x.

    Continuous: the probability density function of X is a function f(x) such that f(x) · h ≈ P(x < X ≤ x + h) for small positive h.
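    A minimal sketch of the discrete case (mine, not from the notes), building the probability mass function of X = total of two fair dice by enumeration and checking that the probabilities sum to one.

```python
# Minimal sketch (not from the notes): the probability mass function of a
# discrete random variable X = total of two fair dice, built by enumeration.
from collections import Counter
from fractions import Fraction

outcomes = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]
counts = Counter(d1 + d2 for d1, d2 in outcomes)
pmf = {x: Fraction(c, 36) for x, c in counts.items()}      # P(X = x)

assert sum(pmf.values()) == 1                               # probabilities sum to one
print("P(X = 7) =", pmf[7])                                 # 1/6, the most likely total
print("support:", sorted(pmf))                              # 2, 3, ..., 12
```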
  • A Philosophical Treatise of Universal Induction
    Entropy 2011, 13, 1076-1136; doi:10.3390/e13061076
    OPEN ACCESS | entropy, ISSN 1099-4300, www.mdpi.com/journal/entropy

    Article: A Philosophical Treatise of Universal Induction
    Samuel Rathmanner and Marcus Hutter*
    Research School of Computer Science, Australian National University, Corner of North and Daley Road, Canberra ACT 0200, Australia
    * Author to whom correspondence should be addressed; E-Mail: [email protected].
    Received: 20 April 2011; in revised form: 24 May 2011 / Accepted: 27 May 2011 / Published: 3 June 2011

    Abstract: Understanding inductive reasoning is a problem that has engaged mankind for thousands of years. This problem is relevant to a wide range of fields and is integral to the philosophy of science. It has been tackled by many great minds ranging from philosophers to scientists to mathematicians, and more recently computer scientists. In this article we argue the case for Solomonoff Induction, a formal inductive framework which combines algorithmic information theory with the Bayesian framework. Although it achieves excellent theoretical results and is based on solid philosophical foundations, the requisite technical knowledge necessary for understanding this framework has caused it to remain largely unknown and unappreciated in the wider scientific community. The main contribution of this article is to convey Solomonoff induction and its related concepts in a generally accessible form with the aim of bridging this current technical gap. In the process we examine the major historical contributions that have led to the formulation of Solomonoff Induction as well as criticisms of Solomonoff and induction in general. In particular we examine how Solomonoff induction addresses many issues that have plagued other inductive systems, such as the black ravens paradox and the confirmation problem, and compare this approach with other recent approaches.