The Logic of Probability Theory


Deriving Bayesian Statistics from Boolean Algebra
William Sipes
July 14, 2010

Probability Theory as Extended Logic

• George Boole
• The Laws of Thought
• Algebraic expression of Aristotelian logical propositions
• Full title: An Investigation of the Laws of Thought, on Which Are Founded the Mathematical Theories of Logic and Probabilities (1854)

• Cox and Jaynes
• The Algebra of Probable Inference (1961)
• Probability Theory: The Logic of Science (2003)
• Boolean logic and three desiderata necessitate Bayesian probability

Boolean Algebra

• Finite field
• Commutative ring with respect to the operations of exclusive disjunction (addition) and conjunction (multiplication)
• Equivalence classes [0] and [1] (representing FALSE and TRUE)
• Foundation of computer science

Operations:

• Disjunction
• Conjunction
• Negation

Though there are three distinct operations, it can be shown that any combination of two that includes negation is complete.

Truth tables:

A   B   A and B   A or B
F   F      F        F
F   T      F        T
T   F      F        T
T   T      T        T

A   not A
F     T
T     F

Laws:

• Idempotence
• Associativity
• Commutativity
• De Morgan's Laws
• Double Negation

Bayesian Probability

• Conditional probability
• Differs from frequentist approaches
• Based on prior distributions
• Models can be updated with new information

• Conjunction: A and B
• Conditional: A given B

The Desiderata

1) Representation of plausibility with real numbers
2) Qualitative correspondence with common sense
3) Structural consistency

These uniquely determine the allowable operations of all probabilistic theory. The derivation assumes that the functions involved are twice differentiable.

Product Rule

1. Require that any plausibilities obey all of the desiderata simultaneously, unambiguously, and completely
2. Assume that there is a functional relation for conjoined propositions
3. Argue (using the requirements imposed by the desiderata) that there is only one form for this rule
4. Derive the form of the rule using differential equations

• Most basic assumption
• Only functional form that does not degenerate when tested at "extremes"
• Key feature of familiar probability theory

Sum Rule

1. Using the product rule, derive a function that relates propositions and their negations
2. Impose the conditions of the desiderata to derive another functional equation
3. Argue analytically about the functional equation
4. Reduce the functional equation to a differential equation
5. This leaves a functional relation for complementary plausibilities

Further Developing the Theory

• Conditional probability and the sum rule give a definition of independence
• The product rule can be used to derive Laplace's definition of probability (frequentist)
• Demonstrate agreement with Kolmogorov's axioms of probability
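
The Boolean laws and the product and sum rules named in the slides can be spot-checked mechanically. The following is a minimal Python sketch, not the derivation from the talk: it assumes the usual forms P(AB|C) = P(A|BC) P(B|C) for the product rule and P(A|C) + P(not A|C) = 1 for the sum rule, and it only confirms that one small finite distribution obeys them. The names boolean_laws_hold, prob, cond and the table JOINT are hypothetical, and the numbers in JOINT are made up for illustration.

```python
from fractions import Fraction
from itertools import product

BOOLS = (False, True)


def boolean_laws_hold() -> bool:
    """Exhaustively check the listed Boolean laws over all truth assignments."""
    for a, b, c in product(BOOLS, repeat=3):
        checks = [
            (a and a) == a and (a or a) == a,                 # idempotence
            ((a and b) and c) == (a and (b and c)),           # associativity (and)
            ((a or b) or c) == (a or (b or c)),               # associativity (or)
            (a and b) == (b and a) and (a or b) == (b or a),  # commutativity
            (not (a and b)) == ((not a) or (not b)),          # De Morgan
            (not (a or b)) == ((not a) and (not b)),          # De Morgan
            (not (not a)) == a,                               # double negation
            # Completeness of {negation, conjunction}: "or" built from them.
            (a or b) == (not ((not a) and (not b))),
        ]
        if not all(checks):
            return False
    return True


# Hypothetical joint plausibilities P(A=a, B=b | C); illustrative numbers only.
JOINT = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 5),
    (False, True): Fraction(1, 10),
    (False, False): Fraction(2, 5),
}


def prob(event) -> Fraction:
    """P(event | C): total weight of the (a, b) outcomes where event holds."""
    return sum((p for (a, b), p in JOINT.items() if event(a, b)), Fraction(0))


def cond(event, given) -> Fraction:
    """P(event | given, C), computed as P(event and given | C) / P(given | C)."""
    return prob(lambda a, b: event(a, b) and given(a, b)) / prob(given)


def A(a, b):
    return a


def B(a, b):
    return b


def NOT_A(a, b):
    return not a


if __name__ == "__main__":
    print("Boolean laws hold:", boolean_laws_hold())
    # Product rule, both factorizations of P(AB | C).
    print("P(A|B) P(B) =", cond(A, B) * prob(B))
    print("P(B|A) P(A) =", cond(B, A) * prob(A))
    # Sum rule for complementary propositions, conditioned on B.
    print("P(A|B) + P(not A|B) =", cond(A, B) + cond(NOT_A, B))
```

Because cond is defined as a ratio, the first product-rule factorization holds by construction here. The informative checks are that both factorizations agree (which is Bayes' theorem rearranged) and that complementary plausibilities sum to one. Exact fractions are used so that the equalities are not blurred by floating-point rounding.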