Logic and Probability Lecture 1: Probability As Logic
Wesley Holliday & Thomas Icard
UC Berkeley & Stanford
August 11, 2014, ESSLLI, Tübingen

Overview

Logic as a theory of:
- truth-preserving inference
- proof / deduction
- consistency
- rationality
- definability
- ...

Probability as a theory of:
- uncertain inference
- induction
- learning
- rationality
- information
- ...

Some questions and points of contact:
- In what ways might probability be said to extend logic?
- How do probability and various logical systems differ on what they say about rational inference?
- Can logic be used to gain a better understanding of probability, say through standard techniques of formalization, definability, questions of completeness, complexity, etc.?
- Can the respective advantages of logical representation and probabilistic learning and inference be combined to create more powerful reasoning systems?

Course Outline

- Day 1: Probability as (Extended) Logic
- Day 2: Probability, Nonmonotonicity, and Graphical Models
- Day 3: Beyond Boolean Logic
- Day 4: Qualitative Probability
- Day 5: Logical Dynamics

Interpretations of Probability

- Frequentist: Probabilities are about 'limiting frequencies' of events.
- Propensity: Probabilities are about physical dispositions, or propensities, of events.
- Logical: Probabilities are determined objectively via a logical language and some additional principles, e.g., of 'symmetry'.
- Bayesian: Probabilities are subjective and reflect an agent's degree of confidence concerning some event.

We will mostly remain neutral, focusing on mathematical questions, but will note when the interpretation is critical.

A measurable space is a pair (W, E) where
- W is an arbitrary set;
- E is a σ-algebra over W, i.e., a subset of ℘(W) closed under complement and countable union.

A probability space is a triple (W, E, μ), with (W, E) a measurable space and μ : E → [0, 1] a measure function satisfying:
- μ(W) = 1;
- μ(E ∪ F) = μ(E) + μ(F), whenever E ∩ F = ∅.

A random variable from (W, E, μ) to another measurable space (V, F) is a function X : W → V such that X⁻¹(F) ∈ E for every F ∈ F. Then μ induces a natural distribution for X, typically written

  P(X = v) = μ({w ∈ W : X(w) = v}).

Suppose we have a propositional logical language L generated as follows:

  φ ::= A | B | ... | φ ∧ φ | ¬φ

We can define a probability P : L → [0, 1] directly on L, requiring:
- P(φ) = 1, for any tautology φ;
- P(φ ∨ ψ) = P(φ) + P(ψ), whenever ¬(φ ∧ ψ) is a tautology.

Equivalent set of requirements:
- P(φ) = 1, for any tautology φ;
- P(φ) ≤ P(ψ), whenever φ → ψ is a tautology;
- P(φ) = P(φ ∧ ψ) + P(φ ∧ ¬ψ).
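One standard way to obtain such a P is to fix a distribution over truth assignments and let P(φ) be the total weight of the assignments satisfying φ. Below is a minimal Python sketch of this construction (the tuple encoding of formulas, the atoms A and B, and the toy weights are our own illustration, not from the lecture); it spot-checks the requirements above on a small model.

```python
from itertools import product

# Formulas as nested tuples: ("atom", "A"), ("not", f), ("and", f, g), ("or", f, g).
def holds(formula, valuation):
    """Evaluate a formula at a valuation (a dict mapping atoms to True/False)."""
    tag = formula[0]
    if tag == "atom":
        return valuation[formula[1]]
    if tag == "not":
        return not holds(formula[1], valuation)
    if tag == "and":
        return holds(formula[1], valuation) and holds(formula[2], valuation)
    if tag == "or":
        return holds(formula[1], valuation) or holds(formula[2], valuation)
    raise ValueError(f"unknown connective: {tag}")

def all_valuations(atoms):
    """All truth assignments over the given atoms (the 'world-states')."""
    for bits in product([True, False], repeat=len(atoms)):
        yield dict(zip(atoms, bits))

def prob(formula, dist):
    """P(formula): total weight of the world-states at which the formula is true."""
    return sum(weight for world, weight in dist if holds(formula, world))

# A toy distribution over the four world-states for atoms A and B.
weights = [0.5, 0.2, 0.2, 0.1]                    # non-negative, summing to 1
dist = list(zip(all_valuations(["A", "B"]), weights))

A, B = ("atom", "A"), ("atom", "B")
taut = ("or", A, ("not", A))                      # A ∨ ¬A
phi, psi = A, ("and", B, ("not", A))              # ¬(phi ∧ psi) is a tautology

assert abs(prob(taut, dist) - 1.0) < 1e-9         # tautologies get probability 1
assert abs(prob(("or", phi, psi), dist)
           - (prob(phi, dist) + prob(psi, dist))) < 1e-9      # additivity for incompatible disjuncts
assert abs(prob(phi, dist)
           - (prob(("and", phi, psi), dist)
              + prob(("and", phi, ("not", psi)), dist))) < 1e-9  # P(phi) = P(phi∧psi) + P(phi∧¬psi)
print("requirements verified on this toy model")
```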
It is then easy to show:
- P(φ) = 0, for any contradiction φ;
- P(¬φ) = 1 − P(φ);
- P(φ ∨ ψ) = P(φ) + P(ψ) − P(φ ∧ ψ);
- a propositional valuation sending atoms to 1 or 0 is a special case of a probability function;
- a probability on L gives rise to a standard probability measure over 'world-states', i.e., maximally consistent sets of formulas from L.

With this setup we can easily define, e.g., conditional probability:

  P(φ | ψ) = P(φ ∧ ψ) / P(ψ),

provided P(ψ) > 0. With this it is easy to prove Bayes' Theorem:

Theorem (Bayes). P(φ | ψ) = P(ψ | φ) P(φ) / P(ψ).

Today's Theme: Probability as Logic

Three senses in which probability has been claimed as part of logic:
1. Logical Deduction and Probability Preservation
2. Consistency and Fair Odds
3. Patterns of Plausible Reasoning

PART I: Logical Deduction and Probability Preservation

Suppose we can show Γ ⊢ φ in propositional logic, but the assumptions γ ∈ Γ are themselves uncertain. Knowing the probabilities of the formulas in Γ, what can we conclude about the probability of φ?

Example (Adams & Levine). A telephone survey is made of 1,000 people as to their political party affiliation. Suppose 629 people report being Democrats, with some possibility of error in each case. What can we say about the probability of the following statement? "At least 600 people are Democrats."

Proposition (Suppes 1966). If Γ ⊢ φ, then for all P we have P(¬φ) ≤ Σ_{γ ∈ Γ} P(¬γ).

Example (Kyburg's 1965 'Lottery Paradox'). Imagine a fair lottery with n tickets. Let γ_1, ..., γ_n be the sentences

  γ_i : "ticket i is the winner."

Each sentence ¬γ_i might seem reasonable, since it has probability (n−1)/n. However, putting these all together rapidly decreases probability. Indeed,

  P(⋀_{i ≤ n} ¬γ_i) = 0.

This demonstrates that the (logical) conclusion of all these individually reasonable premises, namely that none of the tickets will win the lottery, has probability 0.

Definition (Suppes Entailment). Γ ⊨_s φ iff for all P: P(¬φ) ≤ Σ_{γ ∈ Γ} P(¬γ).

Definition (Adams Entailment). Γ ⊨_a φ iff for all ε > 0 there is a δ > 0 such that for all P: if P(¬γ) < δ for all γ ∈ Γ, then P(¬φ) < ε.

Lemma. Γ ⊨_s φ iff Γ ⊨_a φ.
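A quick numerical check of Kyburg's lottery example and of Suppes' bound (a Python sketch; the representation of worlds as ticket indices is our own):

```python
from fractions import Fraction

n = 1000                                    # number of lottery tickets
worlds = set(range(n))                      # world i: "ticket i is the winner"
weight = Fraction(1, n)                     # the lottery is fair

def prob(event):
    """P(event) for an event given as a set of winning-ticket indices."""
    return weight * len(event)

# ¬γ_i holds at every world except world i.
not_gamma = [worlds - {i} for i in range(n)]

# Each premise ¬γ_i is individually very probable ...
assert all(prob(e) == Fraction(n - 1, n) for e in not_gamma)

# ... but their conjunction ("no ticket wins") is impossible.
conjunction = set.intersection(*not_gamma)
assert prob(conjunction) == 0

# Suppes' bound for Γ = {¬γ_1, ..., ¬γ_n} and φ their conjunction:
# P(¬φ) ≤ Σ_i P(γ_i) = n · (1/n) = 1, which is correct but uninformative.
suppes_bound = sum(prob(worlds - e) for e in not_gamma)
assert prob(worlds - conjunction) <= suppes_bound == 1
print("P(¬γ_i) =", Fraction(n - 1, n), "; P(⋀ ¬γ_i) =", prob(conjunction),
      "; Suppes bound on P(¬φ) =", suppes_bound)
```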
Theorem (Adams 1966). Classical logic is sound and complete for Suppes/Adams entailment. More precisely, the following are equivalent:
(a) Γ ⊢ φ
(b) Γ ⊨ φ
(c) Γ ⊨_s φ
(d) Γ ⊨_a φ

Proof. (a) ⇔ (b) is just standard completeness.

For (b) ⇒ (c), suppose Γ ⊭_s φ, so there is a P such that

  1 − P(φ) > Σ_{γ ∈ Γ} P(¬γ).

From this it follows that

  1 > Σ_{γ ∈ Γ} P(¬γ) + P(φ) ≥ P(γ_1 ∧ ... ∧ γ_n → φ),

for any subset {γ_1, ..., γ_n} ⊆ Γ. Since all tautologies must have probability 1, no such implication γ_1 ∧ ... ∧ γ_n → φ is a tautology, and hence, by compactness, Γ ⊭ φ.

For (c) ⇒ (d), suppose Γ ⊨_s φ, and let ε > 0. If Γ is finite, choose

  δ = ε / |Γ|.

Then, whenever P(¬γ) < δ for all γ ∈ Γ, we have

  P(¬φ) ≤ Σ_{γ ∈ Γ} P(¬γ) < Σ_{γ ∈ Γ} δ = ε.

Hence P(¬φ) < ε, and Γ ⊨_a φ. The case of infinite Γ is left as an exercise.

Finally, for (d) ⇒ (b), suppose Γ ⊭ φ. Take a valuation witnessing Γ ⊭ φ; it is itself a probability function P with P(γ) = 1 for all γ ∈ Γ, so P(¬γ) = 0 < δ for every δ > 0, while P(φ) = 0, so P(¬φ) = 1 ≥ ε for, say, ε = 1/2. Hence Γ ⊭_a φ. (Note that essentially the same argument shows (c) ⇒ (b).) Thus we have shown (a) ⇔ (b) ⇒ (c) ⇒ (d) ⇒ (b). □

Example. In the telephone survey example, suppose each person's statement has probability 0.999. Then the best upper bound we can get on the probability of the negation of the conclusion is

  629 × 0.001 = 0.629.

That is, our best lower bound on the probability that there are at least 600 Democrats is 0.371. This seems weak.

Definition. Where Γ ⊨ φ, we say Γ' ⊆ Γ is essential if Γ − Γ' ⊭ φ.

Definition. Given Γ ⊨ φ, the degree of essentialness of γ ∈ Γ is given by

  E(γ) = 1 / |S_γ|,

where |S_γ| is the size of the smallest essential set containing γ.
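These two definitions can be computed by brute force for small premise sets. The following Python sketch does exactly that (the tuple encoding of formulas and the toy instance Γ = {p, q}, φ = p ∨ q are our own illustration, not from the lecture).

```python
from itertools import product, combinations

# Formulas as nested tuples: ("atom", a), ("not", f), ("and", f, g), ("or", f, g).
def holds(formula, valuation):
    tag = formula[0]
    if tag == "atom":
        return valuation[formula[1]]
    if tag == "not":
        return not holds(formula[1], valuation)
    if tag == "and":
        return holds(formula[1], valuation) and holds(formula[2], valuation)
    return holds(formula[1], valuation) or holds(formula[2], valuation)   # "or"

def entails(premises, conclusion, atoms):
    """Γ ⊨ φ, checked by brute force over all valuations of the given atoms."""
    for bits in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, bits))
        if all(holds(g, v) for g in premises) and not holds(conclusion, v):
            return False
    return True

def degree_of_essentialness(gamma, premises, conclusion, atoms):
    """E(γ) = 1/|S_γ|, with S_γ a smallest essential subset containing γ;
    a subset Γ' is essential when Γ − Γ' no longer entails φ."""
    for k in range(1, len(premises) + 1):
        for subset in combinations(premises, k):
            rest = [g for g in premises if g not in subset]
            if gamma in subset and not entails(rest, conclusion, atoms):
                return 1 / k      # smallest essential set containing γ has size k
    return 0                      # degenerate case: no essential subset exists (φ a tautology)

# Toy instance: Γ = {p, q}, φ = p ∨ q.
p, q = ("atom", "p"), ("atom", "q")
premises, conclusion, atoms = [p, q], ("or", p, q), ["p", "q"]
for g, name in [(p, "p"), (q, "q")]:
    print(f"E({name}) =", degree_of_essentialness(g, premises, conclusion, atoms))
# Dropping either premise alone still leaves the other entailing p ∨ q, so neither
# singleton is essential; the smallest essential set containing each premise is
# {p, q}, giving E(p) = E(q) = 0.5.
```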