Algorithmic Data Analytics, Small Data Matters and Correlation Versus Causation Arxiv:1309.1418V9 [Cs.CE] 26 Jul 2017
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
The Foundations of Solomonoff Prediction
The Foundations of Solomonoff Prediction MSc Thesis for the graduate programme in History and Philosophy of Science at the Universiteit Utrecht by Tom Florian Sterkenburg under the supervision of prof.dr. D.G.B.J. Dieks (Institute for History and Foundations of Science, Universiteit Utrecht) and prof.dr. P.D. Gr¨unwald (Centrum Wiskunde & Informatica; Universiteit Leiden) February 2013 ii Parsifal: Wer ist der Gral? Gurnemanz: Das sagt sich nicht; doch bist du selbst zu ihm erkoren, bleibt dir die Kunde unverloren. – Und sieh! – Mich dunkt,¨ daß ich dich recht erkannt: kein Weg fuhrt¨ zu ihm durch das Land, und niemand k¨onnte ihn beschreiten, den er nicht selber m¨ocht’ geleiten. Parsifal: Ich schreite kaum, - doch w¨ahn’ ich mich schon weit. Richard Wagner, Parsifal, Act I, Scene I iv Voor mum v Abstract R.J. Solomonoff’s theory of Prediction assembles notions from information theory, confirmation theory and computability theory into the specification of a supposedly all-encompassing objective method of prediction. The theory has been the subject of both general neglect and occasional passionate promotion, but of very little serious philosophical reflection. This thesis presents an attempt towards a more balanced philosophical appraisal of Solomonoff’s theory. Following an in-depth treatment of the mathematical framework and its motivation, I shift attention to the proper interpretation of these formal results. A discussion of the theory’s possible aims turns into the project of identifying its core principles, and a defence of the primacy of the unifying principle of Universality supports the development of my proposed interpretation of Solomonoff Prediction as the statement, to be read in the context of the philosophical problem of prediction, that in a universal setting, there exist universal predictors. -
An Algorithmic Approach to Information and Meaning a Formal Framework for a Philosophical Discussion∗
An Algorithmic Approach to Information and Meaning A Formal Framework for a Philosophical Discussion∗ Hector Zenil [email protected] Institut d'Histoire et de Philosophie des Sciences et des Techniques (Paris 1/ENS Ulm/CNRS) http://www.algorithmicnature.org/zenil Abstract I'll survey some of the aspects relevant to a philosophical discussion of information taking into account the developments of algorithmic information theory. I will propose that meaning is deep in Bennett's logical depth sense, and that algorithmic probability may provide the stability for a robust algorithmic definition of meaning, taking into consideration the interpretation and the receiver's own knowledge encoded in the story of a message. Keywords: information content; meaning; algorithmic probability; algorithmic complexity; logical depth; philosophy of information; in- formation theory. ∗Presented at the Interdisciplinary Workshop: Ontological, Epistemological and Methodological Aspects of Computer Science, Philosophy of Simulation (SimTech Clus- ter of Excellence), Institute of Philosophy, at the Faculty of Informatics, University of Stuttgart, Germany, July 7, 2011. 1 1 Introduction Information can be a cornerstone for interpreting all kind of world phenom- ena as it can constitute the basis for the description of objects. While it is legitimate to study ideas and concepts related to information in their broad- est sense, that the use of information outside formal contexts amounts to misuse cannot and should not be overlooked. It is not unusual to come across surveys and volumes devoted to information (in the larger sense) in which the mathematical discussion does not venture beyond the state of the field as Shannon [30] left it some 60 years ago. -
On the Normality of Numbers
ON THE NORMALITY OF NUMBERS Adrian Belshaw B. Sc., University of British Columbia, 1973 M. A., Princeton University, 1976 A THESIS SUBMITTED 'IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in the Department of Mathematics @ Adrian Belshaw 2005 SIMON FRASER UNIVERSITY Fall 2005 All rights reserved. This work may not be reproduced in whole or in part, by photocopy or other means, without the permission of the author. APPROVAL Name: Adrian Belshaw Degree: Master of Science Title of Thesis: On the Normality of Numbers Examining Committee: Dr. Ladislav Stacho Chair Dr. Peter Borwein Senior Supervisor Professor of Mathematics Simon Fraser University Dr. Stephen Choi Supervisor Assistant Professor of Mathematics Simon Fraser University Dr. Jason Bell Internal Examiner Assistant Professor of Mathematics Simon Fraser University Date Approved: December 5. 2005 SIMON FRASER ' u~~~snrllbrary DECLARATION OF PARTIAL COPYRIGHT LICENCE The author, whose copyright is declared on the title page of this work, has granted to Simon Fraser University the right to lend this thesis, project or extended essay to users of the Simon Fraser University Library, and to make partial or single copies only for such users or in response to a request from the library of any other university, or other educational institution, on its own behalf or for one of its users. The author has further granted permission to Simon Fraser University to keep or make a digital copy for use in its circulating collection, and, without changing the content, to translate the thesislproject or extended essays, if technically possible, to any medium or format for the purpose of preservation of the digital work. -
The Number of Solutions of Φ(X) = M
Annals of Mathematics, 150 (1999), 1–29 The number of solutions of φ(x) = m By Kevin Ford Abstract An old conjecture of Sierpi´nskiasserts that for every integer k > 2, there is a number m for which the equation φ(x) = m has exactly k solutions. Here φ is Euler’s totient function. In 1961, Schinzel deduced this conjecture from his Hypothesis H. The purpose of this paper is to present an unconditional proof of Sierpi´nski’sconjecture. The proof uses many results from sieve theory, in particular the famous theorem of Chen. 1. Introduction Of fundamental importance in the theory of numbers is Euler’s totient function φ(n). Two famous unsolved problems concern the possible values of the function A(m), the number of solutions of φ(x) = m, also called the multiplicity of m. Carmichael’s Conjecture ([1],[2]) states that for every m, the equation φ(x) = m has either no solutions x or at least two solutions. In other words, no totient can have multiplicity 1. Although this remains an open problem, very large lower bounds for a possible counterexample are relatively easy to obtain, the latest being 101010 ([10], Th. 6). Recently, the author has also shown that either Carmichael’s Conjecture is true or a positive proportion of all values m of Euler’s function give A(m) = 1 ([10], Th. 2). In the 1950’s, Sierpi´nski conjectured that all multiplicities k > 2 are possible (see [9] and [16]), and in 1961 Schinzel [17] deduced this conjecture from his well-known Hypothesis H [18]. -
PRIME-PERFECT NUMBERS Paul Pollack [email protected] Carl
PRIME-PERFECT NUMBERS Paul Pollack Department of Mathematics, University of Illinois, Urbana, Illinois 61801, USA [email protected] Carl Pomerance Department of Mathematics, Dartmouth College, Hanover, New Hampshire 03755, USA [email protected] In memory of John Lewis Selfridge Abstract We discuss a relative of the perfect numbers for which it is possible to prove that there are infinitely many examples. Call a natural number n prime-perfect if n and σ(n) share the same set of distinct prime divisors. For example, all even perfect numbers are prime-perfect. We show that the count Nσ(x) of prime-perfect numbers in [1; x] satisfies estimates of the form c= log log log x 1 +o(1) exp((log x) ) ≤ Nσ(x) ≤ x 3 ; as x ! 1. We also discuss the analogous problem for the Euler function. Letting N'(x) denote the number of n ≤ x for which n and '(n) share the same set of prime factors, we show that as x ! 1, x1=2 x7=20 ≤ N (x) ≤ ; where L(x) = xlog log log x= log log x: ' L(x)1=4+o(1) We conclude by discussing some related problems posed by Harborth and Cohen. 1. Introduction P Let σ(n) := djn d be the sum of the proper divisors of n. A natural number n is called perfect if σ(n) = 2n and, more generally, multiply perfect if n j σ(n). The study of such numbers has an ancient pedigree (surveyed, e.g., in [5, Chapter 1] and [28, Chapter 1]), but many of the most interesting problems remain unsolved. -
A Review of Graph and Network Complexity from an Algorithmic Information Perspective
entropy Review A Review of Graph and Network Complexity from an Algorithmic Information Perspective Hector Zenil 1,2,3,4,5,*, Narsis A. Kiani 1,2,3,4 and Jesper Tegnér 2,3,4,5 1 Algorithmic Dynamics Lab, Centre for Molecular Medicine, Karolinska Institute, 171 77 Stockholm, Sweden; [email protected] 2 Unit of Computational Medicine, Department of Medicine, Karolinska Institute, 171 77 Stockholm, Sweden; [email protected] 3 Science for Life Laboratory (SciLifeLab), 171 77 Stockholm, Sweden 4 Algorithmic Nature Group, Laboratoire de Recherche Scientifique (LABORES) for the Natural and Digital Sciences, 75005 Paris, France 5 Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia * Correspondence: [email protected] or [email protected] Received: 21 June 2018; Accepted: 20 July 2018; Published: 25 July 2018 Abstract: Information-theoretic-based measures have been useful in quantifying network complexity. Here we briefly survey and contrast (algorithmic) information-theoretic methods which have been used to characterize graphs and networks. We illustrate the strengths and limitations of Shannon’s entropy, lossless compressibility and algorithmic complexity when used to identify aspects and properties of complex networks. We review the fragility of computable measures on the one hand and the invariant properties of algorithmic measures on the other demonstrating how current approaches to algorithmic complexity are misguided and suffer of similar limitations than traditional statistical approaches such as Shannon entropy. Finally, we review some current definitions of algorithmic complexity which are used in analyzing labelled and unlabelled graphs. This analysis opens up several new opportunities to advance beyond traditional measures. -
A Philosophical Treatise of Universal Induction
Entropy 2011, 13, 1076-1136; doi:10.3390/e13061076 OPEN ACCESS entropy ISSN 1099-4300 www.mdpi.com/journal/entropy Article A Philosophical Treatise of Universal Induction Samuel Rathmanner and Marcus Hutter ? Research School of Computer Science, Australian National University, Corner of North and Daley Road, Canberra ACT 0200, Australia ? Author to whom correspondence should be addressed; E-Mail: [email protected]. Received: 20 April 2011; in revised form: 24 May 2011 / Accepted: 27 May 2011 / Published: 3 June 2011 Abstract: Understanding inductive reasoning is a problem that has engaged mankind for thousands of years. This problem is relevant to a wide range of fields and is integral to the philosophy of science. It has been tackled by many great minds ranging from philosophers to scientists to mathematicians, and more recently computer scientists. In this article we argue the case for Solomonoff Induction, a formal inductive framework which combines algorithmic information theory with the Bayesian framework. Although it achieves excellent theoretical results and is based on solid philosophical foundations, the requisite technical knowledge necessary for understanding this framework has caused it to remain largely unknown and unappreciated in the wider scientific community. The main contribution of this article is to convey Solomonoff induction and its related concepts in a generally accessible form with the aim of bridging this current technical gap. In the process we examine the major historical contributions that have led to the formulation of Solomonoff Induction as well as criticisms of Solomonoff and induction in general. In particular we examine how Solomonoff induction addresses many issues that have plagued other inductive systems, such as the black ravens paradox and the confirmation problem, and compare this approach with other recent approaches. -
Random Number Generation Using Normal Numbers
Random Number Generation Using Normal Numbers Random Number Generation Using Normal Numbers Michael Mascagni1 and F. Steve Brailsford2 Department of Computer Science1;2 Department of Mathematics1 Department of Scientific Computing1 Graduate Program in Molecular Biophysics1 Florida State University, Tallahassee, FL 32306 USA AND Applied and Computational Mathematics Division, Information Technology Laboratory1 National Institute of Standards and Technology, Gaithersburg, MD 20899-8910 USA E-mail: [email protected] or [email protected] or [email protected] URL: http://www.cs.fsu.edu/∼mascagni In collaboration with Dr. David H. Bailey, Lawrence Berkeley Laboratory and UC Davis Random Number Generation Using Normal Numbers Outline of the Talk Introduction Normal Numbers Examples of Normal Numbers Properties Relationship with standard pseudorandom number generators Normal Numbers and Random Number Recursions Normal from recursive sequence Source Code Seed Generation Calculation Code Initial Calculation Code Iteration Implementation and Results Conclusions and Future Work Random Number Generation Using Normal Numbers Introduction Normal Numbers Normal Numbers: Types of Numbers p I Rational numbers - q where p and q are integers I Irrational numbers - not rational I b-dense numbers - α is b-dense () in its base-b expansion every possible finite string of consecutive digits appears I If α is b-dense then α is also irrational; it cannot have a repeating/terminating base-b digit expansion I Normal number - α is b-normal () in its base-b expansion -
No Free Lunch Versus Occam's Razor in Supervised Learning
View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by The Australian National University No Free Lunch versus Occam's Razor in Supervised Learning Tor Lattimore1 and Marcus Hutter1;2;3 Research School of Computer Science 1Australian National University and 2ETH Z¨urich and 3NICTA ftor.lattimore,[email protected] 15 November 2011 Abstract The No Free Lunch theorems are often used to argue that domain specific knowledge is required to design successful algorithms. We use algorithmic in- formation theory to argue the case for a universal bias allowing an algorithm to succeed in all interesting problem domains. Additionally, we give a new al- gorithm for off-line classification, inspired by Solomonoff induction, with good performance on all structured (compressible) problems under reasonable as- sumptions. This includes a proof of the efficacy of the well-known heuristic of randomly selecting training data in the hope of reducing the misclassification rate. Contents 1 Introduction 2 2 Preliminaries 3 3 No Free Lunch Theorem 4 4 Complexity-based classification 8 5 Discussion 12 A Technical proofs 14 Keywords Supervised learning; Kolmogorov complexity; no free lunch; Occam's razor. 1 1 Introduction The No Free Lunch (NFL) theorems, stated and proven in various settings and domains [Sch94, Wol01, WM97], show that no algorithm performs better than any other when their performance is averaged uniformly over all possible problems of a particular type.1 These are often cited to argue that algorithms must be designed for a particular domain or style of problem, and that there is no such thing as a general purpose algorithm. -
Normality and the Digits of Π
Normality and the Digits of π David H. Bailey∗ Jonathan M. Borweiny Cristian S. Caludez Michael J. Dinneenx Monica Dumitrescu{ Alex Yeek February 15, 2012 1 Introduction The question of whether (and why) the digits of well-known constants of mathematics are statistically random in some sense has long fascinated mathematicians. Indeed, one prime motivation in computing and analyzing digits of π is to explore the age-old question of whether and why these digits appear \random." The first computation on ENIAC in 1949 of π to 2037 decimal places was proposed by John von Neumann to shed some light on the distribution of π (and of e)[8, pp. 277{281]. Since then, numerous computer-based statistical checks of the digits of π, for instance, so far have failed to disclose any deviation from reasonable statistical norms. See, for instance, Table1, which presents the counts of individual hexadecimal digits among the first trillion hex digits, as obtained by Yasumasa Kanada. By contrast, ∗Lawrence Berkeley National Laboratory, Berkeley, CA 94720. [email protected]. Supported in part by the Director, Office of Computational and Technology Research, Division of Mathemat- ical, Information, and Computational Sciences of the U.S. Department of Energy, under contract number DE-AC02-05CH11231. yCentre for Computer Assisted Research Mathematics and its Applications (CARMA), Uni- versity of Newcastle, Callaghan, NSW 2308, Australia. [email protected]. Distinguished Professor King Abdul-Aziz Univ, Jeddah. zDepartment of Computer Science, University of Auckland, Private Bag 92019, Auckland, New Zealand. [email protected]. xDepartment of Computer Science, University of Auckland, Private Bag 92019, Auckland, New Zealand. -
Mathematical Constants and Sequences
Mathematical Constants and Sequences a selection compiled by Stanislav Sýkora, Extra Byte, Castano Primo, Italy. Stan's Library, ISSN 2421-1230, Vol.II. First release March 31, 2008. Permalink via DOI: 10.3247/SL2Math08.001 This page is dedicated to my late math teacher Jaroslav Bayer who, back in 1955-8, kindled my passion for Mathematics. Math BOOKS | SI Units | SI Dimensions PHYSICS Constants (on a separate page) Mathematics LINKS | Stan's Library | Stan's HUB This is a constant-at-a-glance list. You can also download a PDF version for off-line use. But keep coming back, the list is growing! When a value is followed by #t, it should be a proven transcendental number (but I only did my best to find out, which need not suffice). Bold dots after a value are a link to the ••• OEIS ••• database. This website does not use any cookies, nor does it collect any information about its visitors (not even anonymous statistics). However, we decline any legal liability for typos, editing errors, and for the content of linked-to external web pages. Basic math constants Binary sequences Constants of number-theory functions More constants useful in Sciences Derived from the basic ones Combinatorial numbers, including Riemann zeta ζ(s) Planck's radiation law ... from 0 and 1 Binomial coefficients Dirichlet eta η(s) Functions sinc(z) and hsinc(z) ... from i Lah numbers Dedekind eta η(τ) Functions sinc(n,x) ... from 1 and i Stirling numbers Constants related to functions in C Ideal gas statistics ... from π Enumerations on sets Exponential exp Peak functions (spectral) .. -
Cryptographic Analysis of Random
CRYPTOGRAPHIC ANALYSIS OF RANDOM SEQUENCES By SANDHYA RANGINENI Bachelor of Technology in Computer Science and Engineering Jawaharlal Nehru Technological University Anantapur, India 2009 Submitted to the Faculty of the Graduate College of the Oklahoma State University in partial fulfillment of the requirements for the Degree of MASTER OF SCIENCE July, 2011 CRYPTOGRAPHIC ANALYSIS OF RANDOM SEQUENCES Thesis Approved: Dr. Subhash Kak Thesis Adviser Dr. Johnson Thomas Dr. Sanjay Kapil Dr. Mark E. Payton Dean of the Graduate College ii TABLE OF CONTENTS Chapter Page I. INTRODUCTION ......................................................................................................1 1.1 Random Numbers and Random Sequences .......................................................1 1.2 Types of Random Number Generators ..............................................................2 1.3 Random Number Generation Methods ..............................................................3 1.4 Applications of Random Numbers .....................................................................5 1.5 Statistical tests of Random Number Generators ................................................7 II. COMPARISION OF SOME RANDOM SEQUENCES ..........................................9 2.1 Binary Decimal Sequences ................................................................................9 2.1.1 Properties of Decimal Sequences ..........................................................10 • Frequency characteristics .....................................................................10