Philosophical Applications of Computational Learning Theory
Chomskyan Innateness and Occam's Razor

Graduation thesis, Philosophy of Computer Science
R. M. de Wolf
Erasmus Universiteit Rotterdam
Department of Theoretical Philosophy
Supervisor: Dr. G. J. C. Lokhorst
July

Contents

Preface
Introduction to Computational Learning Theory
    Introduction
    Algorithms that Learn Concepts from Examples
    The PAC Model and Efficiency
    PAC Algorithms
    Sample Complexity
    Time Complexity
    Representations
    Polynomial-Time PAC Learnability
    Some Related Settings
    Polynomial-Time PAC Predictability
    Membership Queries
    Identification from Equivalence Queries
    Learning with Noise
    Summary
Application to Language Learning
    Introduction
    A Brief History of the Chomskyan Revolutions
    Against Behaviorism
    Transformational-Generative Grammar
    Principles-and-Parameters
    The Innateness of Universal Grammar
    Formalizing Languages and Grammars
    Formal Languages
    Formal Grammars
    The Chomsky Hierarchy
    Simplifying Assumptions for the Formal Analysis
    Language Learning Is Algorithmic Grammar Learning
    All Children Have the Same Learning Algorithm
    No Noise in the Input
    PAC Learnability Is the Right Analysis
    Formal Analysis of Language Learnability
    The PAC Setting for Language Learning
    A Conjecture
    The Learnability of Type Languages
    The Learnability of Type Languages
    The Learnability of Type Languages
    Of What Type Are Natural Languages?
    A Negative Result
    k-Bounded Context-Free Languages Are Learnable
    Simple Deterministic Languages are Learnable
    The Learnability of Type Languages
    Finite Classes Are Learnable
    What Does All This Mean?
    Wider Learning Issues
    Summary
Kolmogorov Complexity and Simplicity
    Introduction
    Definition and Properties
    Turing Machines and Computability
    Definition
    Objectivity up to a Constant
    Non-Computability
    Relation to Shannon's Information Theory
    Simplicity
    Randomness
    Finite Random Strings
    Infinite Random Strings
    An Interesting Random String
    Gödel's Theorem Without Self-Reference
    The Standard Proof
    A Proof Without Self-Reference
    Randomness in Mathematics
    Summary
Occam's Razor
    Introduction
    Occam's Razor
    History
    Applications in Science
    Applications in Philosophy
    Occam and PAC Learning
    Occam and Minimum Description Length
    Occam and Universal Prediction
    On the Very Possibility of Science
    Summary
A Proof of Occam's Razor (PAC Version)
Summary and Conclusion
    Computational Learning Theory
    Language Learning
    Kolmogorov Complexity
    Occam's Razor
    Conclusion
List of Symbols
Bibliography
Index of Names
Index of Subjects

Preface

What is learning? Learning is what makes us adapt to changes and threats, and what allows us to cope with a world in flux. In short, learning is what keeps us alive. Learning has strong links to almost any other topic in philosophy: scientific inference, knowledge, truth, reasoning, logic, language, anthropology, behaviour, ethics, good taste, aesthetics, and so on. Accordingly, it can be seen as one of the quintessential philosophical topics, and thus an appropriate topic for a graduation thesis.

Much can be said about learning, too much to fit in a single thesis. Therefore this thesis is restricted in scope, dealing only with computational learning theory, often abbreviated to COLT. Learning seems so simple: we do it every day, often without noticing it. Nevertheless, it is obvious that some fairly complex mechanisms must be at work when we learn. COLT is the branch of Artificial Intelligence that deals with the computational properties and limitations of such mechanisms. The field can be seen as the intersection of Machine Learning and complexity theory.

COLT is a very young field (the publication of Valiant's seminal paper may be seen as its birth), and it appears that virtually none of its many interesting results are known to philosophers. Some philosophical work has referred to complexity theory, for instance [Che], and some has referred to Machine Learning, for instance [Tha], but as far as I know, thus far no philosophical use has been made of the results that have sprung from
COLT. For instance, at the time of writing of this thesis, the Philosopher's Index (a database containing most major publications in philosophical journals or books) contains no entries whatsoever that refer to COLT or to its main model, the model of PAC learning; there is hardly any reference to Kolmogorov complexity either. Stuart Russell devotes two pages to PAC learning [Rus], and James McAllister devotes one page to Kolmogorov complexity as a quantitative measure of simplicity [McA], but both do not provide more than a sketchy and superficial explanation.

As its title already indicates, the present thesis tries to make contact between COLT and philosophy. The aim of the thesis is threefold. The first and most shallow goal is to obtain a degree in philosophy for its author. The second goal is to take a number of recent results from computational learning theory, insert them in their appropriate philosophical context, and see how they bear on various ongoing philosophical discussions. The third and most ambitious goal is to draw the attention of philosophers to computational learning theory in general. Unfortunately, the traditional reluctance of philosophers (in particular those of a non-analytical strand) to use formal methods, as well as their inaptness with formal methods once they have outgrown that reluctance, makes this goal rather hard to attain. Nevertheless, it is my opinion that computational learning theory (or, for that matter, complexity theory as a whole) has much to offer to philosophy. Accordingly, this thesis may be seen as a plea for the philosophical relevance of computational learning theory as a whole.

The thesis is organized as follows. In the first chapter, we will give an overview of the main learning setting used in COLT. We will stick to the rigorous formal definitions as used in COLT, but supplement them with a lot of informal and intuitive comment in order to make them accessible and readable for philosophers. After that introductory chapter, the second chapter applies results from COLT to Noam Chomsky's ideas about language learning and the innateness of linguistic biases. The third chapter gives an introduction to the theory of Kolmogorov complexity, which provides us with a fundamental measure of simplicity. Kolmogorov complexity does not belong to computational learning theory proper (its invention in fact predates COLT), but its main application for us lies in Occam's Razor. This fundamental maxim tells us to go for the simplest theories consistent with the data, and is highly relevant in the context of learning in general and scientific theory construction in particular. The fourth chapter provides different formal settings in which some form of the razor can be mathematically justified, using Kolmogorov complexity to quantitatively measure simplicity. Finally, we end with a brief chapter that summarizes the thesis in non-technical terms.

Let me end this preface by expressing my gratitude to a number of people who contributed considerably to the contents of this thesis. First of all, I would of course like to thank my thesis advisor Gert-Jan Lokhorst (one of those philosophers who, like myself, do not hesitate to invoke formal definitions and results whenever these might be useful) for his many comments and suggestions. Secondly, many thanks should go to Shan-Hwei Nienhuys-Cheng, who put me on the track of inductive learning in the first place by inviting me to join her in writing a book on inductive logic programming [NW], a book which eventually took us more than two and a half years to finish. A chapter of that book actually provided the basis for a large part of the first chapter of the present thesis. Finally, I would like to thank Jeroen van Rijen for his many helpful comments, Peter Sas and Peter Grünwald for some references on linguistics, and Paul Vitányi for his course on learning and Kolmogorov complexity.

Chapter: Introduction to Computational Learning Theory

Introduction

This