
Probability, Interpretations of1

1 I thank Branden Fitelson for extremely helpful comments.

1. INTRODUCTION

'Interpreting probability' is a commonly used but misleading name for a worthy enterprise. The so-called 'interpretations of probability' would be better called 'analyses of various concepts of probability', and 'interpreting probability' is the task of providing such analyses. Normally, we speak of interpreting a formal system, that is, attaching familiar meanings to the primitive terms in its axioms and theorems, usually with an eye to turning them into true statements. However, there is no single formal system that is 'probability', but rather a host of such systems. To be sure, Kolmogorov's axiomatization, which we will present shortly, has achieved the status of orthodoxy, and it is typically what philosophers have in mind when they think of 'probability theory'. Nevertheless, several of the leading 'interpretations of probability' fail to satisfy all of Kolmogorov's axioms, yet they have not lost their title for that. Moreover, various other quantities that have nothing to do with probability do satisfy Kolmogorov's axioms, and thus are interpretations of it in a strict sense: mass, length, area, volume (each suitably normalized), and indeed anything that falls under the scope of measure theory. Nobody ever seriously considers these to be 'interpretations of probability', however, because they do not play the right role in our conceptual apparatus. Instead, we will be concerned here with various probability-like concepts that purportedly do. Be all that as it may, we will follow common usage and drop the cringing scare quotes in our survey of what philosophers have taken to be the chief interpretations of probability.

Whatever we call it, the project of finding such interpretations is an important one. Probability is virtually ubiquitous. It plays a role in almost every branch of science. It finds its way, moreover, into much of philosophy. In epistemology, the philosophy of mind, and cognitive science, we see states of opinion being modeled by subjective probability functions, and changes of opinion being modeled by the updating of such functions. Since probability theory is central to decision theory and game theory, it has ramifications for various theories in ethics and political philosophy. It figures prominently in such staples of metaphysics as causation and laws of nature. It appears again in the philosophy of science, in the analysis of the confirmation of theories and of scientific explanation, and in the philosophy of specific scientific theories, such as quantum mechanics, statistical mechanics, and genetics. It can even take center stage in the philosophy of logic, the philosophy of language, and the philosophy of religion. Thus, problems in the foundations of probability bear at least indirectly, and sometimes directly, upon central scientific and philosophical concerns. The interpretation of probability is arguably the most important such foundational problem.

2. KOLMOGOROV'S PROBABILITY CALCULUS

Probability theory was inspired by games of chance in 17th century France and inaugurated by the Fermat-Pascal correspondence. However, its axiomatization had to wait until Kolmogorov's classic book (1933). Let Ω be a non-empty set ('the universal set'). A sigma-field (or sigma-algebra) on Ω is a set F of subsets of Ω that has Ω as a

member, and that is closed under complementation (with respect to Ω) and countable union. Let P be a function from F to the real numbers obeying:

1. (Non-negativity) P(A) ≥ 0 for all A ∈ F.
2. (Normalization) P(Ω) = 1.
3. (Finite additivity) P(A ∪ B) = P(A) + P(B) for all A, B ∈ F such that A ∩ B = ∅.

Call P a probability function, and (Ω, F, P) a probability space. We could instead attach probabilities to members of a collection S of sentences of a formal language, closed under truth-functional combinations, with the following counterpart axiomatization:

I. P(A) ≥ 0 for all A ∈ S.
II. If T is a logical truth (in classical logic), then P(T) = 1.
III. P(A ∨ B) = P(A) + P(B) for all A ∈ S and B ∈ S such that A and B are logically incompatible.

It is controversial whether we should strengthen finite additivity, as Kolmogorov does:

3′. (Countable additivity) If A1, A2, A3, … is a countable collection of (pairwise) disjoint sets, each ∈ F, then P(A1 ∪ A2 ∪ A3 ∪ …) = P(A1) + P(A2) + P(A3) + …

The conditional probability of A given B is then taken to be given by the ratio of unconditional probabilities:

P(A|B) = P(A ∩ B) / P(B), provided P(B) > 0.

There are other axiomatizations that give up normalization; that give up countable additivity, and even additivity; that allow probabilities to take infinitesimal values (positive, but smaller than every positive real number); that allow probabilities to be vague (interval-valued, or more generally sets of numerical values); and that take conditional probability to be primitive. For now, however, when we speak of 'the probability calculus', we will mean Kolmogorov's approach. Given certain probabilities as inputs, the axioms and theorems allow us to compute various further probabilities. However, apart from the assignment of 1 to the universal set and 0 to the empty set, they are silent regarding the initial assignment of probabilities. For guidance with that, we need to turn to the interpretations of probability.

3. CRITERIA OF ADEQUACY

First, however, let us list some criteria of adequacy for such interpretations. We begin by following Salmon (19xx, 64):

Admissibility. We say that an interpretation of a formal system is admissible if the meanings assigned to the primitive terms in the interpretation transform the formal axioms, and consequently all the theorems, into true statements. A fundamental requirement for probability concepts is to satisfy the mathematical relations specified by the calculus of probability…

Ascertainability. This criterion requires that there be some method by which, in principle at least, we can ascertain values of probabilities. It merely expresses the fact that a concept of probability will be useless if it is impossible in principle to find out what the probabilities are…

Applicability. The force of this criterion is best expressed in Bishop Butler's famous aphorism, "Probability is the very guide of life."…

It might seem that the criterion of admissibility goes without saying: 'interpretations' of the probability calculus that assigned to P the interpretation 'the number of hairs on the head of' or 'the political persuasion of' would obviously not even be in the running, because they would render the axioms and theorems so obviously false. The word 'interpretation' is often used in such a way that 'admissible interpretation' is a pleonasm. Yet it turns out that the criterion is non-trivial, and indeed if taken seriously would rule out several of the leading interpretations of probability! As we will see, some of them fail to satisfy countable additivity; for others (certain propensity interpretations) the status of at least some of the axioms is unclear; and one of them (unconstrained subjectivism) violates all of the axioms. Nevertheless, we regard them as genuine candidates. It should be remembered, moreover, that Kolmogorov's is just one of many possible axiomatizations, and there is not universal agreement on which is 'best' (whatever that might mean). Thus, there is no such thing as admissibility tout court, but rather admissibility with respect to this or that axiomatization. It would be unfortunate if, perhaps out of an overdeveloped regard for history, one felt obliged to reject any interpretation that did not obey the letter of Kolmogorov's laws, and that was thus 'inadmissible'. Indeed, Salmon's preferred axiomatization differs from Kolmogorov's.2 In any case, if we found an inadmissible interpretation that did a wonderful job of meeting the criteria of ascertainability and applicability, then we should surely embrace it. So let us turn to those criteria.

It is a little unclear in the ascertainability criterion just what "in principle" amounts to, though perhaps some latitude here is all to the good. Understood charitably, and to avoid trivializing it, it presumably excludes omniscience. On the other hand, understanding it in a way acceptable to a strict empiricist or a verificationist may be too restrictive. 'Probability' is apparently, among other things, a modal concept, plausibly outrunning that which actually occurs, let alone that which is actually sensed.

Most of the work will be done by the applicability criterion. We must say more (as Salmon indeed does) about what sort of a guide to life probability is supposed to be. Mass, length, area and volume are all useful concepts, and they are 'guides to life' in various ways (think how critical distance judgments can be to survival); moreover, they are admissible and ascertainable, so presumably it is the applicability criterion that will rule them out. Perhaps it is best to think of applicability as a cluster of criteria, each of which is supposed to capture something of probability's distinctive conceptual roles; moreover, we should not require that all of them be met. They include:

Non-triviality: an interpretation should make non-extreme probabilities at least a conceptual possibility. For example, suppose that we interpret 'P' as the truth function: it assigns the value 1 to all true sentences, and 0 to all false sentences. Then trivially, all the axioms come out true, so this interpretation is admissible. We would hardly count it as an adequate interpretation of probability, however, and so we need to exclude it. It is essential to probability that, at least in principle, it can take intermediate values. All of the interpretations that we will present meet this criterion, so we will discuss it no more.

Applicability to frequencies: an interpretation should render perspicuous the relationship between probabilities and (long-run) frequencies. Among other things, it should make clear why, by and large, more probable events occur more frequently than less probable events.

Applicability to rational belief: an interpretation should clarify the role that probabilities play in constraining the credences of rational agents. Among other things, knowing that one event is more probable than another, a rational agent will be more confident about the occurrence of the former event.

Applicability to ampliative inference: an interpretation will score bonus points if it illuminates the distinction between 'good' and 'bad' ampliative inferences, while explicating why both fall short of deductive inference.

The next criterion may be redundant, given our list so far, but including it will do no harm:

Applicability to science: an interpretation should illuminate paradigmatic uses of probability in science (for example, in quantum mechanics and statistical mechanics).

Perhaps there are further metaphysical desiderata that we might impose on interpretations. For example, there appear to be connections between probability and modality. Events with positive probability can happen, even if they don't. Some authors (e.g. Carnap 19xx, Jackson 19xx) also insist on the converse condition that only events with positive probability can happen, although this is more controversial. Be that as it may, our list is already long enough to help in our assessment of the leading interpretations on the market.

2 It turns out that the axiomatization that Salmon gives (p. 59) is inconsistent, and thus that by his lights no interpretation could be admissible. His axiom A2 states: "If A is a subclass of B, P(A, B) = 1" (read this as 'the probability of B, given A, equals 1'). Let I be the empty class; then for all B, P(I, B) = 1. But his A3 states: "If B and C are mutually exclusive, P(A, B ∪ C) = P(A, B) + P(A, C)." Then for any X, P(I, X ∪ ¬X) = P(I, X) + P(I, ¬X) = 1 + 1 = 2, which contradicts his normalization axiom A1. This problem is easily remedied (simply add the qualification in A2 that A is non-empty), but it is instructive. It suggests that we ought not take the admissibility criterion too seriously. After all, Salmon's subsequent discussion of the merits and demerits of the various interpretations, as judged by the ascertainability and applicability criteria, still stands, and that is where the real interest lies.

4. THE MAIN INTERPRETATIONS

4.1 CLASSICAL PROBABILITY

The classical interpretation owes its name to its early and august pedigree. Championed by Laplace, and found even in the works of Pascal, Bernoulli and Leibniz, it assigns probabilities in the absence of any evidence, or in the presence of symmetrically balanced evidence. The guiding idea is that in such circumstances, probability is shared equally among all the possible outcomes, so that the classical probability of an event is simply the fraction of the total number of possibilities in which the event occurs. It seems especially well suited to games of chance, which by their very design create such circumstances—for example, the classical probability of a fair die landing with an even number showing up is 3/6. It is often presupposed (usually tacitly) in textbook probability puzzles.

Here is a classic statement of the classical interpretation by Laplace (1814):

The theory of chance consists in reducing all the events of the same kind to a certain number of cases equally possible, that is to say, to such as we may be equally undecided about in regard to their existence, and in determining the number of cases favorable to the event whose probability is sought. The ratio of this number to that of all the cases possible is the measure of this probability, which is thus simply a fraction whose numerator is the number of favorable cases and whose denominator is the number of all the cases possible.

There are numerous questions to be asked about this formulation. When are events of the same kind? Intuitively, 'heads' and 'tails' are equally likely outcomes of tossing a fair coin; but if their kind is 'ways the coin could land', then 'edge' should presumably be counted alongside them. Is the "certain number of cases" always finite? Laplace's talk of "the ratio of this number to that of all the cases possible" suggests that it is. What, then, of probabilities in infinite spaces? Apparently, irrational-valued probabilities such as 1/√2 are automatically eliminated, and thus the truth of a theory such as quantum mechanics that posits them is ruled out a priori. Who are "we", who "may be equally undecided"? Different people may be equally undecided about different things (as it might be, Joe is that way regarding the sixth decimal place of pi, while Jo is not). It seems, then, that Laplace's is best regarded as a quasi-subjectivist interpretation, a statement of the probability assignment of a rational agent in an epistemically neutral position. But then the proposal risks sounding empty: for what is it for an agent to be "equally undecided" about a set of cases, other than assigning them equal probability?

This brings us to one of the key objections to Laplace's account. Central to it is the notion of "equally possible" cases. But either this is a category mistake (for 'possibility' does not come in degrees), or it is circular (for what is meant is really 'equally probable'). The notion is finessed by the so-called 'principle of indifference', a coinage due to Keynes. It states that whenever there is no evidence favoring one possibility over another, then they have the same probability.

Enter Bertrand's paradoxes. They all turn on alternative parametrizations of a given problem that are non-linearly related to each other. The following example (adapted from van Fraassen 1989) nicely illustrates how Bertrand-style paradoxes work. A factory produces cubes with side-length between 0 and 1 foot; what is the probability that a randomly chosen cube has side-length between 0 and 1/2 a foot? The tempting answer is 1/2, as we imagine a process of production that is uniformly distributed over side-length. But the question could have been given an equivalent restatement: A factory produces cubes with face-area between 0 and 1 square foot; what is the probability that a randomly chosen cube has face-area between 0 and 1/4 square feet? Now the tempting answer is 1/4, as we imagine a process of production that is uniformly distributed over face-area. And it could have been restated equivalently again: A factory produces cubes with volume between 0 and 1 cubic foot; what is the probability that a randomly chosen cube has volume between 0 and 1/8 cubic feet? Now the tempting answer is 1/8, as we imagine a process of production that is uniformly distributed over volume. What, then, is the probability of the event in question?
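To make the conflict vivid, here is a minimal Python sketch; the three uniform production processes are just the three tempting assumptions described above, and the sampling is merely illustrative.

```python
import random

def prob_side_at_most_half(parametrization, trials=100_000):
    """Estimate P(side-length <= 1/2 foot) when the factory's output is
    uniformly distributed over the given parametrization."""
    hits = 0
    for _ in range(trials):
        u = random.random()     # uniform draw on [0, 1]
        if parametrization == "side":
            side = u            # uniform over side-length
        elif parametrization == "area":
            side = u ** 0.5     # uniform over face-area: side = sqrt(area)
        elif parametrization == "volume":
            side = u ** (1/3)   # uniform over volume: side = volume^(1/3)
        else:
            raise ValueError(parametrization)
        if side <= 0.5:
            hits += 1
    return hits / trials

for p in ("side", "area", "volume"):
    print(p, round(prob_side_at_most_half(p), 2))
# Expected (analytic) values: 1/2, 1/4, 1/8; one and the same event,
# three different 'indifferent' answers.
```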

It is sometimes said that Bertrand-style paradoxes involve different ways of carving up the space of possibilities, or words to that effect. That is incorrect. In the cube example (and in general), the space of possibilities is exactly the same in each case. For example, the very same possibility is variously labeled 'the cube's side-length is 1/3', 'the cube's faces have area 1/9' and 'the cube's volume is 1/27'. By contrast, a genuine case of carving up the space of possibilities in different ways is this: the possible outcomes of a die toss are {1, 2, 3, 4, 5, 6}, or alternatively {1, not-1}. The classical theory is then precariously poised to deliver conflicting values for the probability of the die landing 1, namely 1/6 and 1/2. What the Bertrand-style paradoxes do involve (unlike the die case) are different ways of 'equally weighting' the very same set of possibilities.

It is worth pointing out that the paradoxes always involve infinite—indeed, uncountable—probability spaces. But going back to Laplace's formulation, and our initial critique of it, it is unclear whether the classical theory even applies in such cases. However one enumerates the sets of possibilities in question, it cannot be by straightforward counting. (Presumably it is by integration.) It seems, then, that Bertrand's classic 'refutations' of the classical interpretation—supposedly showing that it is inconsistent—involve exactly those cases where the interpretation, at least as originally formulated, fails to apply. Of course, that only drives home the point that the interpretation is incomplete; but that is a different point. It is best to think of Bertrand's paradoxes as posing difficulty for the principle of indifference, which can be separated from Laplace's particular formulation.

Classical probabilities are only finitely additive (de Finetti 1974). This is not to say that they violate countable additivity, for the two sides of the countable additivity equation, which involve infinite collections of events, are always undefined for Laplacean classical probabilities. It would be more careful to say that classical probabilities fail to satisfy countable additivity—but presumably that still implies that they are not admissible with respect to Kolmogorov's axioms. They are ascertainable, assuming that the space of possibilities can be determined in principle. They bear a relationship to the credences of rational agents; the concern is that the relationship is vacuous, and that rather than constraining the credences of a rational agent in an epistemically neutral position, they merely record them.

Without supplementation, the classical theory makes no contact with frequency information. However the coin happens to land in a sequence of trials, the possible outcomes remain the same. Indeed, even if we have strong empirical evidence that the coin is biased towards heads with probability, say, 0.6, it is hard to see how the unadorned classical theory can accommodate this fact—for what now are the ten possibilities, six of which are favorable to heads? Laplace does supplement the theory with his Rule of Succession: "Thus we find that an event having occurred successively any number of times, the probability that it will happen again the next time is equal to this number increased by unity divided by the same number, increased by two units." That is:

Pr(success on the (N+1)st trial | N consecutive successes) = (N+1)/(N+2).

Thus, inductive learning is possible.
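As a quick numerical illustration, here is a minimal sketch of the Rule of Succession (the run lengths are arbitrary):

```python
from fractions import Fraction

def rule_of_succession(n_successes: int) -> Fraction:
    """Laplace's Rule of Succession: probability of success on the next
    trial, given n consecutive successes (and no failures) so far."""
    return Fraction(n_successes + 1, n_successes + 2)

for n in (0, 1, 10, 100):
    print(n, rule_of_succession(n))  # 1/2, 2/3, 11/12, 101/102
```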
We must ask, however, whether such learning can be captured once and for all by such a simple formula, the same for all domains. We will return to this question when we discuss the logical interpretation below.

Science apparently invokes at various points probabilities that look classical. Bose-Einstein statistics, Fermi-Dirac statistics, and Maxwell-Boltzmann statistics each arise by considering the ways in which particles can be assigned to states, and then applying the principle of indifference to different subdivisions of the set of alternatives. The trouble is that Bose-Einstein statistics apply to some particles (e.g. photons) and not to others, Fermi-Dirac statistics apply to different particles (e.g. electrons), and Maxwell-Boltzmann statistics do not apply to any known particles. None of this can be determined a priori, as the classical interpretation would have it. Moreover, the classical theory purports to yield probability assignments in the face of ignorance. But as Fine (1973) writes: "If we are truly ignorant about a set of alternatives, then we are also ignorant about combinations of alternatives and about subdivisions of alternatives. However, the principle of indifference when applied to alternatives, or their combinations, or their subdivisions, yields different probability assignments" (170). This brings us to one of the chief points of controversy regarding the classical interpretation. Critics accuse the principle of indifference of extracting information from ignorance. Proponents reply that it rather codifies the way in which such ignorance should be epistemically managed—for anything other than an equal assignment of probabilities would represent the possession of some knowledge. Critics counter-reply that in a state of complete ignorance, it is better to assign vague probabilities (perhaps vague over the entire [0, 1] interval), or to eschew the assignment of probabilities altogether.
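To see how the three statistics arise from applying indifference to different case-sets, consider a minimal sketch; the two-particle, three-state setup is an arbitrary illustrative choice.

```python
from itertools import product, combinations_with_replacement, combinations

particles, states = 2, 3

# Maxwell-Boltzmann: distinguishable particles; every assignment of particles
# to states counts as a separate equiprobable case.
mb = list(product(range(states), repeat=particles))

# Bose-Einstein: indistinguishable particles; only occupation numbers matter,
# and any number of particles may share a state.
be = list(combinations_with_replacement(range(states), particles))

# Fermi-Dirac: indistinguishable particles, at most one per state.
fd = list(combinations(range(states), particles))

for name, cases in (("Maxwell-Boltzmann", mb),
                    ("Bose-Einstein", be),
                    ("Fermi-Dirac", fd)):
    both_in_state_0 = sum(1 for c in cases if all(s == 0 for s in c))
    print(f"{name}: {len(cases)} equiprobable cases, "
          f"P(both particles in state 0) = {both_in_state_0}/{len(cases)}")
# 9, 6, and 3 cases respectively; indifference over different case-sets
# yields different probabilities (1/9, 1/6, 0).
```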

4.2 LOGICAL PROBABILITY

Logical theories of probability retain the classical interpretation's guiding idea that probabilities can be determined a priori by an examination of the space of possibilities. However, they generalize it in two important ways: the possibilities may be assigned unequal weights, and probabilities can be computed whatever the evidence may be, symmetrically balanced or not. Indeed, the logical interpretation, in its various guises, seeks to codify in full generality the degree of support or confirmation that a piece of evidence E confers upon a given hypothesis H, which we may write as c(H, E). In doing so, it can be regarded also as generalizing deductive logic and its notion of implication to a complete theory of inference equipped with the notion of 'degree of implication' that relates E to H. It is often called the theory of 'inductive logic', though this is another misnomer: there is no requirement that E be in any sense 'inductive' evidence for H. 'Non-deductive logic' would be a better name, although even that overlooks the fact that deductive logic's relations of implication and incompatibility are also accommodated as extreme cases in which the confirmation function takes the values 1 and 0 respectively. Nevertheless, what is significant is that the logical interpretation provides a framework for induction.

Early proponents of logical probability include Keynes (1921), W. E. Johnson (1932), and Jeffreys (1939). However, by far the most systematic study of logical probability was by Carnap. His formulation of logical probability begins with the construction of a formal language. In (1950) he considers a class of very simple languages consisting of a finite number of logically independent monadic predicates (naming properties) applied to countably many individual constants (naming individuals) or variables, and the usual logical connectives. The strongest (consistent) statements that can be made in a given language describe all of the individuals in as much detail as the expressive power of the language allows. They are conjunctions of complete descriptions of each individual, each description itself a conjunction containing exactly one occurrence (negated or unnegated) of each predicate of the language. Call these strongest statements state descriptions. Any measure m(–) over the state descriptions automatically extends to a measure over all sentences, since each sentence is equivalent to a disjunction of state descriptions; m in turn induces a confirmation function c(–, –):

c(h, e) = m(h & e) / m(e).

There are obviously infinitely many candidates for m, and hence c, even for very simple languages. Carnap argues for his favored measure "m*" by insisting that the only thing that significantly distinguishes individuals from one another is some qualitative difference, not just a difference in labeling. A structure description is a maximal set of state descriptions, each of which can be obtained from another by some permutation of the individual names. m* assigns each structure description equal measure, which in turn is divided equally among its constituent state descriptions. It gives greater weight to homogeneous state descriptions than to heterogeneous ones, thus 'rewarding' uniformity among the individuals in accordance with putatively reasonable inductive practice. It can be shown that the induced c* allows inductive learning from experience—as, annoyingly, do infinitely many other candidate confirmation functions.
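To make m* and c* concrete, here is a minimal sketch for a toy language with two individuals and a single predicate F (an illustrative choice, not one of Carnap's own examples), checking that the induced c* 'learns from experience':

```python
from fractions import Fraction
from itertools import product
from collections import defaultdict

# Toy language (assumed for illustration): individuals a, b and one predicate F.
individuals = ["a", "b"]
state_descriptions = list(product([True, False], repeat=len(individuals)))

# Two state descriptions belong to the same structure description iff one can
# be obtained from the other by permuting individuals; with one predicate,
# that means they have the same number of F-individuals.
structures = defaultdict(list)
for sd in state_descriptions:
    structures[sum(sd)].append(sd)

# m*: equal measure for each structure description, divided equally among
# its constituent state descriptions.
m_star = {}
for members in structures.values():
    for sd in members:
        m_star[sd] = Fraction(1, len(structures)) / len(members)

def m(event):
    """Measure of a sentence, represented as the set of state descriptions
    (possible worlds) in which it is true."""
    return sum(m_star[sd] for sd in event)

def c_star(h, e):
    """Confirmation c*(h, e) = m*(h & e) / m*(e)."""
    return m(h & e) / m(e)

Fa = {sd for sd in state_descriptions if sd[0]}   # 'a is F'
Fb = {sd for sd in state_descriptions if sd[1]}   # 'b is F'
print(m(Fb))           # prior measure of 'b is F': 1/2
print(c_star(Fb, Fa))  # after observing 'a is F': 2/3, i.e. learning from experience
```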
Carnap claims that c* nevertheless stands out for being simple and natural. He later generalizes his confirmation function to a continuum of functions cλ. Define a family of predicates to be a set of predicates such that, for each individual, exactly one member of the set applies, and consider first-order languages containing a finite number of families. Carnap (1963) focuses on the special case of a language containing only one-place predicates. He lays down a host of axioms concerning the confirmation function c, including those induced by the probability calculus itself, various axioms of symmetry (for example, that c(h, e) remains unchanged under permutations of individuals, and of predicates of any family), and axioms that guarantee undogmatic inductive learning, and long-run convergence to relative frequencies. They imply that, for a family {Pj}, j = 1, ..., k, k > 2:

cλ(individual s+1 is Pj, sj of the first s individuals are Pj) = (sj + λ/k) / (s + λ),

where λ is a positive real number. The higher the value of λ, the less impact evidence has: induction from what is observed becomes progressively more swamped by a classical-style equal assignment to each of the k possibilities regarding individual s + 1. (A sketch at the end of this subsection illustrates the effect.) The problem remains: what is the correct setting of λ, or said another way, how 'inductive' should the confirmation function be? Also, it turns out that for any such setting, a universal statement in an infinite universe always receives zero confirmation, no matter what the (finite) evidence. Many find this counterintuitive, since laws of nature with infinitely many instances can apparently be confirmed. Earman (1992) discusses prospects for avoiding the unwelcome result.

Significantly, Carnap's various axioms of symmetry are hardly logical truths. More seriously, we cannot impose further symmetry constraints that are seemingly just as plausible as Carnap's, on pain of inconsistency—see Fine (1973, p. 202). Goodman taught us: that the future will resemble the past in some respect is trivial; that it will resemble the past in all respects is contradictory. And we may continue: that a probability assignment can be made to respect some symmetry is trivial; that one can be made to respect all symmetries is contradictory. This threatens the whole program of logical probability. Another Goodmanian lesson is that inductive logic must be sensitive to the meanings of predicates, strongly suggesting that a purely syntactic approach such as Carnap's is doomed. Scott and Krauss (1966) use model theory in their formulation of logical probability for richer and more realistic languages than Carnap's. Still, finding a canonical language seems to many to be a pipe dream, at least if we want to analyze the "logical probability" of any argument of real interest—either in science, or in everyday life.

Logical probabilities are admissible, and, assuming a rich enough language, ascertainable. They fail on the applicability criteria in very much the same way that the classical theory did, and for very much the same reasons. In connection with the 'applicability to science' criterion, a further point, due to Lakatos, is worth noting. The degree of confirmation of a hypothesis depends on the language in which the hypothesis is stated and over which the confirmation function is defined.
But scientific progress often brings with it a change in scientific language (for example, the addition of new predicates and the deletion of old ones), and such a change will bring with it a change in the corresponding c-values. Thus, the growth of science may overthrow any particular confirmation theory. There is something of the snake eating its own tail here, since logical probability was supposed to explicate the confirmation of scientific theories.
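As promised above, here is a minimal sketch of the λ-continuum formula, with arbitrary evidence numbers, showing how larger λ pulls the value away from the observed frequency and toward the classical-style equal assignment:

```python
from fractions import Fraction

def c_lambda(s_j: int, s: int, k: int, lam: Fraction) -> Fraction:
    """Carnap's lambda-continuum: confirmation that individual s+1 has
    attribute P_j, given that s_j of the first s individuals have P_j,
    for a family of k attributes."""
    return (s_j + lam / k) / (s + lam)

# Illustrative evidence: 8 of the first 10 individuals are P_j, with k = 2.
# lambda = 0 is included only as the limiting 'straight rule' case.
for lam in (Fraction(0), Fraction(2), Fraction(10), Fraction(100)):
    print(lam, c_lambda(8, 10, 2, lam))
# 4/5, 3/4, 13/20, 29/55: as lambda grows, the value is pulled from the
# observed frequency toward the equal assignment 1/k = 1/2.
```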

4.3 SUBJECTIVE PROBABILITY

We may characterize subjectivism (also known as personalism and subjective Bayesianism) with the slogan: 'Probability is degree of belief'. We identify probabilities with degrees of confidence, sometimes known as credences, of suitable agents. Thus, we really have many interpretations of probability here, as many as there are doxastic states of suitable agents: we have Aaron's degrees of belief, Abel's degrees of belief, Abigail's degrees of belief, …, or better still, Aaron's degrees of belief-at-time-t1, Aaron's degrees of belief-at-time-t2, Abel's degrees of belief-at-time-t1, … Of course, we must ask what makes an agent 'suitable'. Unconstrained subjectivism places no constraints on the agents—anyone goes, and hence anything goes. Various studies by psychologists (e.g. Kahneman and Tversky 19xx) show that people commonly violate the usual probability calculus in spectacular ways. We clearly do not have here an admissible interpretation (with respect to any probability calculus), since there is no limit to what agents might assign—negative probabilities, infinite probabilities, probabilities that are not additive, and so on. Indeed, one wonders what would make them assignments of probabilities, as opposed to just utterances of numbers. More interesting, however, is the claim that the suitable agents must be, in a strong sense, rational. Various subjectivists want to assimilate probability to logic, regarding probability as the logic of partial belief. A rational agent is required to be logically consistent, now taken in a broad sense. These subjectivists argue that this implies that the agent obeys the axioms of probability (with at least finite additivity), and that subjectivism is thus (to this extent) admissible. But before we can present this argument, we must say more about what degrees of belief are.

The betting interpretation

Subjective probabilities, in turn, are traditionally analyzed in terms of betting behavior. Here is a classic statement by de Finetti (19xx):

Let us suppose that an individual is obliged to evaluate the rate p at which he would be ready to exchange the possession of an arbitrary sum S (positive or negative) dependent on the occurrence of a given event E, for the possession of the sum pS; we will say by definition that this number p is the measure of the degree of probability attributed by the individual considered to the event E, or, more simply, that p is the probability of E (according to the individual considered; this specification can be implicit if there is no ambiguity). (1980, p. 62)

This boils down to the following analysis: Your degree of belief in E is p iff p units is the price at which you would buy or sell a bet that pays 1 unit if E, 0 if not E.
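On the usual expected-value reading of this analysis, p is the price at which such a bet has zero expected net gain for an agent whose degree of belief in E is p; here is a minimal sketch with illustrative numbers:

```python
def expected_gain(degree_of_belief: float, price: float, stake: float = 1.0) -> float:
    """Expected net gain from buying, at the given price, a bet that pays
    `stake` if E occurs and 0 otherwise, for an agent whose degree of
    belief in E is `degree_of_belief`."""
    return degree_of_belief * (stake - price) + (1 - degree_of_belief) * (0 - price)

p = 0.7  # illustrative degree of belief in E
for price in (0.6, 0.7, 0.8):
    print(price, round(expected_gain(p, price), 2))
# Positive below p, zero at p, negative above p: p is the agent's fair price.
```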

De Finetti presupposes that, for any E, there is exactly one such price. This presupposition may fail. There may be no such price—you may refuse to bet on E at all (perhaps unless coerced, in which case your genuine opinion about E may not be revealed), or your selling price may differ from your buying price, as may occur if your probability for E is vague. There may be more than one such price—you may find a range of such prices acceptable, as may also occur if your probability for E is vague. For now, however, let us waive these concerns, and turn to an argument that uses the betting interpretation purportedly to show that rational degrees of belief must conform to the probability calculus (with at least finite additivity).

The Dutch Book argument

A Dutch book is a series of bets, each of which the agent regards as fair, but which collectively guarantee her loss, however the world turns out. De Finetti (1937) proves that if your subjective probabilities violate the probability calculus, then you are susceptible to a Dutch book. For example, suppose that you violate the additivity axiom by assigning P(A ∨ B) < P(A) + P(B), where A and B are mutually exclusive. Then a bookie could buy from you a bet on A ∨ B for P(A ∨ B), and sell you bets on A and B individually for P(A) and P(B) respectively. He pockets an initial profit of P(A) + P(B) − P(A ∨ B), and retains it whatever happens. Equally important, and often neglected, is the converse theorem: if your subjective probabilities conform to the probability calculus, then no Dutch book can be made against you (Kemeny 1955); your probability assignments are then said to be coherent. In a nutshell, conformity to the probability calculus is necessary and sufficient for coherence.3
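Here is a minimal numeric sketch of the exploit just described, with illustrative incoherent prices for mutually exclusive A and B:

```python
# Illustrative incoherent prices for mutually exclusive A and B:
p_A, p_B, p_AorB = 0.3, 0.3, 0.5   # violates P(A or B) = P(A) + P(B)

# The bookie buys the agent's bet on (A or B) at her price, and sells her
# bets on A and on B at her prices. Each bet pays 1 unit if it wins.
def bookie_profit(a_occurs: bool, b_occurs: bool) -> float:
    payout_on_AorB = 1.0 if (a_occurs or b_occurs) else 0.0
    payout_on_A = 1.0 if a_occurs else 0.0
    payout_on_B = 1.0 if b_occurs else 0.0
    # Bookie receives p_A + p_B up front and the (A or B) payout he bought;
    # he pays p_AorB up front and pays out on the A and B bets he sold.
    return (p_A + p_B - p_AorB) + payout_on_AorB - payout_on_A - payout_on_B

for world in [(False, False), (True, False), (False, True)]:  # A, B exclusive
    print(world, round(bookie_profit(*world), 2))
# Profit is 0.1 in every possible world: a guaranteed loss for the agent.
```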

3 Some authors simply define 'coherence' as conformity to the probability calculus.

Problems with the betting interpretation

But let us return to the betting interpretation. We have here an operational definition of subjective probability, and indeed it inherits some of the difficulties of operationalism in general, and of behaviorism in particular. For example, you may have reason to misrepresent your true opinion, or to feign having opinions that in fact you lack, by making the relevant bets (perhaps to exploit an incoherence in someone else's betting prices). Moreover, as Ramsey (19xx) points out, placing the very bet may alter your state of opinion. Trivially, it does so regarding matters involving the bet itself (e.g., you suddenly increase your probability that you have just placed a bet). Less trivially, placing the bet may change the world, and hence your opinions, in other ways (betting at high stakes on the proposition 'I will sleep well tonight' may suddenly turn you into an insomniac). And then the bet may concern an event such that, were it to occur, you would no longer value the pay-off the same way. (During the August 11, 1999 solar eclipse in the UK, a man placed a bet that would have paid a million pounds if the world came to an end.) These problems stem largely from taking literally the notion of entering into a bet on E, with its corresponding payoffs; yet as it stands, it is unclear how else we are supposed to understand the analysis. The problems are avoided by identifying your degree of belief in a proposition with the betting price you regard as fair, whether or not you enter into such a bet; see Howson and Urbach (19xx). Still, the fair price of a bet on E appears to measure the wrong quantity: not your probability that E will be the case, but rather your probability that E will be the case and that the prize will be paid, which may be rather less—for example, if E is unverifiable. Weatherson (19xx) argues that this commits proponents of the betting interpretation to an underlying intuitionistic (as opposed to classical) logic. This in turn raises the question: if subjectivism is supposed to be an interpretation of the probability calculus, how exactly should the sentential version of the normalization axiom, with its reference to 'logical truth', be understood? Likewise, what exactly is the additivity axiom, with its reference to 'logical incompatibility', that is supposedly interpreted by the subjectivist? Absent answers to these questions, it is premature to claim that subjectivism is an admissible interpretation of Kolmogorov's axioms (understood sententially).

De Finetti speaks of "an arbitrary sum" as the prize of the bet on E. It is just as well that he does not speak of dollar or pound or lire amounts, as some subjectivists do, as this would hold the analysis—again, taken literally—hostage (trivially) to the existence of these monetary systems, and (more interestingly) to their corresponding 'graininess'. Dollars, for instance, can be divided no more finely than units of 1/100, so any probability measurement using a dollar as prize would be imprecise beyond the second decimal place, and propositions that ought to receive different probabilities would wind up getting the same (e.g., a logical contradiction and 'a fair coin lands heads 8 times in a row'). Still, there are difficulties with de Finetti's "arbitrary sums". Whatever they are, they had better be infinitely divisible, or else the problem of imprecision will still arise.
More significantly, if utility is not a linear function of such sums, then the size of the prize will make a difference to the putative probability: winning a dollar means more to a pauper than it does to Bill Gates, and this may be reflected in their betting behaviors in ways that have nothing to do with their genuine probability assignments.

De Finetti responds to this problem by suggesting that the prizes be kept small; that, however, only creates the opposite problem that agents may be reluctant to bother about trifles, as Ramsey points out. Better, then, to let the prizes be measured in utilities: after all, utility is infinitely divisible, and utility is a linear function of utility.

Probabilities and utilities

Utilities (desirabilities) of outcomes, their probabilities, and rational preferences are all intimately linked. The Port Royal Logic showed how utilities and probabilities together determine rational preferences; de Finetti's betting interpretation derives probabilities from utilities and rational preferences; von Neumann and Morgenstern (1944) derive utilities from probabilities and rational preferences. And most remarkably, Ramsey (1926) derives both probabilities and utilities from rational preferences alone. First, he defines a proposition to be ethically neutral—relative to an agent and an outcome—if the agent is indifferent between having that outcome when the proposition is true and when it is false. The idea is that the agent doesn't care about the ethically neutral proposition as such—it is a means to an end that he might care about, but it has no intrinsic value. Now, there is a simple test for determining whether, for a given agent, an ethically neutral proposition N has probability 1/2. Suppose that the agent prefers A to B. Then N has probability 1/2 iff the agent is indifferent between the gambles:

A if N, B if not;
B if N, A if not.

Ramsey assumes that it does not matter what the candidates for A and B are. We may assign arbitrarily to A and B any two real numbers u(A) and u(B) such that u(A) > u(B), thought of as the desirabilities of A and B respectively. Given various assumptions about the richness of the preference space, and certain 'consistency assumptions', he can define a real-valued utility function of the outcomes A, B, etc.—in fact, various such functions will represent the agent's preferences. He is then able to define equality of differences in utility for any outcomes over which the agent has preferences. It turns out that ratios of utility-differences are invariant—the same whichever representative utility function we choose. This fact allows Ramsey to define degrees of belief as ratios of such differences. For example, suppose the agent is indifferent between A and the gamble "B if X, C otherwise". Then her degree of belief in X, P(X), is given by:

P(X) = (u(A) − u(C)) / (u(B) − u(C)).

(A numerical illustration appears at the end of this subsection.) Ramsey shows that degrees of belief so derived obey the probability calculus (with finite additivity). He calls what results "the logic of partial belief", and indeed he opens his essay with the words "In this essay the Theory of Probability is taken as a branch of logic...". Ramsey avoids some of the objections to the betting interpretation, but not all of them. Notably, the essential appeal to gambles again raises the concern that the wrong quantities are being measured. And his account has new difficulties. It is unclear what facts about agents fix their preference rankings. It is also dubious that consistency requires one to have a set of preferences as rich as Ramsey requires, or that one can find ethically neutral propositions of probability 1/2. This places strain on Ramsey's claim to assimilate probability theory to logic.

Savage (1954) likewise derives probabilities and utilities from preferences among options that are constrained by certain putative 'consistency' principles. For a given set of such preferences, he generates a class of utility functions, each an affine transformation of the other, and a unique probability function; together these are said to 'represent' the agent's preferences. Jeffrey (1966) refines the method further. The result is a theory of decision according to which rational choice maximizes 'expected utility', a certain probability-weighted average of utilities. Some of the difficulties with the behavioristic betting analysis of degrees of belief can now be resolved by moving to an analysis of degrees of belief that is functionalist in spirit. According to Lewis (1986a, 1994), an agent's degrees of belief are represented by the probability function belonging to a utility function/probability function pair that best rationalizes her behavioral dispositions, rationality being given a decision-theoretic analysis. It is a striking fact that agents with preferences that satisfy such-and-such conditions can be represented by a utility/probability function pair. However, this falls short of the claim that they must be so represented. Said another way, the fact that there exists a probabilistic representation of a suitable agent does not preclude there also existing non-probabilistic representations of the very same agent. Indeed, Zynda (2001) shows that this is the case. The question thus arises: why should the probabilistic representation be favored over the non-probabilistic?

There is another deep issue that underlies all of these accounts of subjective probability. They all presuppose the existence of necessary connections between desire-like states and belief-like states, rendered explicit in the connections between preferences and probabilities. In response, one might insist that such connections are at best contingent, and indeed can easily be imagined to be absent. Think of an idealized Zen Buddhist monk, devoid of any preferences, who dispassionately surveys the world before him, forming beliefs but no desires. Once desires enter the picture, they may also have unwanted consequences. For example, how does one separate an agent's enjoyment or disdain for gambling from the value she places on the gamble itself? As Ramsey puts it, "The difficulty is like that of separating two different co-operating forces" (xx). One might 'overvalue' a gamble because it provides insurance against an unwanted outcome, or because of its desirable consequences.
(The manager of a basketball team might secretly bet against his team at an inflated price for the first reason, or, in a public show of bravado, on his team at an inflated price for the second reason.) There is no telling how many forces might enter into the overall desirability of a gamble, only one of which is the probabilistic force that we seek.

The betting interpretation apparently makes subjective probabilities ascertainable. It is unclear that the derivation of them from preferences does—for it is unclear that an agent's full set of preferences is ascertainable (even to himself or herself!). Here a lot of weight may need to be placed on the 'in principle' qualification in that criterion. The expected utility representation makes it virtually analytic that an agent should be guided by probabilities—after all, the probabilities are her own, and they are fed into the formula for expected utility in order to determine what it is rational for her to do. But do they function as a good guide to life? Subjectivists in the style of de Finetti recognize no rational constraints on subjective probabilities beyond conformity to the probability calculus (and a certain rule for how probabilities change under the impact of new evidence known as 'conditionalizing'). This permissiveness licenses doxastic states that we would normally call crazy. Thus, you could assign probability 0.999 to this sentence ruling the universe, while upholding subjectivism—provided, of course, that you assign probability 0.001 to this sentence not ruling the universe, and that your other probability assignments all conform to the probability calculus. Such probabilistic coherence plays much the same role for degrees of belief that consistency plays for ordinary, all-or-nothing beliefs. What the subjectivist lacks is an analogue of truth, some yardstick for distinguishing the 'veridical' probability assignments from the rest (such as the 0.999 one above). To the extent that truth is an indispensable guide to life, it seems that the subjectivist needs something more.
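As promised above, here is a minimal numeric sketch of Ramsey's ratio definition of degree of belief, with arbitrary utility numbers:

```python
def ramsey_degree_of_belief(u_A: float, u_B: float, u_C: float) -> float:
    """Degree of belief in X for an agent who is indifferent between
    getting A for sure and the gamble 'B if X, C otherwise'."""
    return (u_A - u_C) / (u_B - u_C)

# Arbitrary utilities on some representative scale: u(C)=0, u(B)=10, u(A)=7.
p = ramsey_degree_of_belief(u_A=7.0, u_B=10.0, u_C=0.0)
print(p)  # 0.7

# Sanity check: with this degree of belief, the gamble's expected utility
# equals u(A), which is exactly what the agent's indifference requires.
print(p * 10.0 + (1 - p) * 0.0)  # 7.0

# Ratios of utility-differences are invariant under positive affine rescaling
# (here u' = 2u + 5), so the derived degree of belief does not depend on
# which representative utility function is chosen.
print(ramsey_degree_of_belief(2*7.0 + 5, 2*10.0 + 5, 2*0.0 + 5))  # 0.7
```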

4.4 FREQUENCY INTERPRETATIONS

Gamblers, actuaries and scientists have long understood that relative frequencies bear an intimate relationship to probabilities. Frequency interpretations posit the most intimate relationship of all: identity. Thus, we might identify the probability of 'heads' on a certain coin with the number of heads in a suitable sequence of tosses of the coin, divided by the total number of tosses. A simple version of frequentism, which we will call finite frequentism, attaches probabilities to events or attributes in a finite reference class in such a straightforward manner: the probability of an attribute A in a finite reference class B is the relative frequency of actual occurrences of A within B. Thus, finite frequentism bears certain structural similarities to the classical interpretation, insofar as it gives equal weight to each of a set of events, simply counting how many of them are 'favorable' as a proportion of the total. The crucial difference, however, is that where the classical interpretation counted all the possible outcomes of a given experiment, finite frequentism counts actual outcomes. It is thus congenial to those with empiricist scruples. It was developed by Venn (1876), who, in his discussion of the proportion of births of males and females, concludes: "probability is nothing but that proportion" (p. 84, his emphasis). Other notable finite frequentists include Russell (19xx) and Braithwaite (19xx).

Finite frequentism gives another operational definition of probability, and its problems begin there. For example, just as we want to allow that our thermometers could be ill-calibrated, and could thus give misleading measurements of temperature, so we want to allow that our 'measurements' of probabilities via frequencies could be misleading, as when a fair coin lands heads 9 out of 10 times. More than that, it seems to be built into the very notion of probability that such misleading results can arise. Indeed, in many cases, misleading results are guaranteed. Starting with a degenerate case: according to the finite frequentist, a coin that is never tossed, and that thus yields no actual outcomes whatsoever, lacks a probability for heads altogether; yet a coin that is never measured does not thereby lack a diameter. Perhaps even more troubling, a coin that is tossed exactly once yields a relative frequency of heads of either 0 or 1, whatever its bias. Famous enough to merit a name of its own, this is the so-called 'problem of the single case'. In fact, many events are most naturally regarded as not merely unrepeated, but in a strong sense unrepeatable—the 2000 presidential election, the final game of the 2001 NBA play-offs, the Civil War, Kennedy's assassination, certain events in the very early history of the universe. Nonetheless, it seems natural to think of non-extreme probabilities attaching to some, and perhaps all, of them. Worse still, some cosmologists regard it as a genuinely chancy matter whether our universe is open or closed (apparently certain quantum fluctuations could, in principle, tip it one way or the other), yet whatever it is, it is 'single-case' in the strongest possible sense. The problem of the single case is particularly striking, but we really have a sequence of related problems: 'the problem of the double case', 'the problem of the triple case' ...
Every coin that is tossed exactly twice can yield only the relative frequencies 0, 1/2 and 1, whatever its bias… A finite reference class of size n, however large n is, can only produce relative frequencies at a certain level of 'grain', namely 1/n. Among other things, this rules out irrational probabilities; yet our best physical theories say otherwise. Furthermore, there is a sense in which any of these problems can be transformed into the problem of the single case. Suppose that we toss a coin a thousand times. We can regard this as a single trial of a thousand-tosses-of-the-coin experiment. Yet we do not want to be committed to saying that that experiment yields its actual result with probability 1.

The problem of the single case is that the finite frequentist fails to see intermediate probabilities in various places where others are inclined to see them. There is also the converse problem: the frequentist sees intermediate probabilities in various places where others are inclined not to. Our world has myriad different entities, with myriad different attributes. We can group them into still more sets of objects, and then ask with which relative frequencies various attributes occur in these sets. Many such relative frequencies will be intermediate; the finite frequentist automatically identifies them with intermediate probabilities. But it would seem that whether or not they are genuine probabilities, as opposed to mere tallies, depends on the case at hand. Bare ratios of attributes among sets of disparate objects may lack the sort of modal force that one might expect from probabilities. I belong to the reference class consisting of myself, the Eiffel Tower, the southernmost sandcastle on Santa Monica Beach, and Mt Everest. Two of these four objects are less than 7 ft tall, a relative frequency of 1/2; moreover, we could easily extend this class, preserving this relative frequency (or, equally easily, not). Yet it would be odd to say that my probability of being less than 7 ft tall, relative to this reference class, is 1/2, even though it is perfectly acceptable (if uninteresting) to say that 1/2 of the objects in the reference class are less than 7 ft tall.

Frequentists (notably Venn, and Reichenbach 19xx among others), partly in response to some of the problems above, have gone on to consider infinite reference classes, identifying probabilities with limiting relative frequencies of events or attributes therein. Thus, we require an infinite sequence of trials in order to define such probabilities. But what if the actual world does not provide an infinite sequence of trials of a given experiment? In that case, some frequentists (e.g., Venn, Reichenbach) identify probability with a hypothetical or counterfactual limiting relative frequency. We are to imagine hypothetical infinite extensions of an actual sequence of trials; probabilities are then what the limiting relative frequencies would be if the sequence were so extended. Note that at this point we have left empiricism behind. A modal element has been injected into frequentism with this invocation of a counterfactual; moreover, the counterfactual may involve a radical departure from the way things actually are, one that may even require the breaking of laws of nature. (Think what it would take for the coin in my pocket,

which has only been tossed once, to be tossed infinitely many times—never wearing out, and never running short of people willing to toss it!)

Limiting relative frequencies, we have seen, must be relativized to a sequence of trials. Herein lies another difficulty. Consider an infinite sequence of the results of tossing a coin, as it might be H, T, H, H, H, T, H, T, T, … Suppose for definiteness that the corresponding relative frequency sequence for heads, which begins 1/1, 1/2, 2/3, 3/4, 4/5, 4/6, 5/7, 5/8, 5/9, …, converges to 1/2. By suitably reordering these results, we can make the sequence converge to any value in [0, 1] that we like. (If this is not obvious, consider how the relative frequency of even numbers among positive integers, which intuitively 'should' converge to 1/2, can instead be made to converge to 1/4 by reordering the integers with the even numbers in every fourth place, as follows: 1, 3, 5, 2, 7, 9, 11, 4, 13, 15, 17, 6, …; a sketch at the end of this subsection computes exactly this.) To be sure, there may be something natural about the ordering of the tosses as given—for example, it may be their temporal ordering. But there may be more than one natural ordering. Imagine the tosses taking place on a train that shunts backwards and forwards on tracks that are oriented west-east. Then the spatial ordering of the results from west to east could look very different. Why should one ordering be privileged over others?

A well-known objection to any version of frequentism is that relative frequencies must be relativized to a reference class. Consider a probability concerning myself that I care about—say, my probability of living to age 80. I belong to the class of males, the class of non-smokers, the class of philosophy professors who have two vowels in their surname, … Presumably the relative frequency of those who live to age 80 varies across (most of) these reference classes. What, then, is my probability of living to age 80? It seems that there is no single frequentist answer. Instead, there is my probability-qua-male, my probability-qua-non-smoker, and so on. This is an example of the so-called reference class problem for frequentism (although it can be argued that analogues of the problem arise for the other interpretations as well). As we have seen in the previous paragraph, the problem is only compounded for limiting relative frequencies: probabilities must be relativized not merely to a reference class, but to a sequence within the reference class. We might call this the reference sequence problem.

The beginnings of a solution to this problem would be to restrict our attention to sequences of a certain kind, those with certain desirable properties. For example, there are sequences for which the limiting relative frequency of a given attribute does not exist; Reichenbach thus excludes such sequences. Von Mises gives us a more thoroughgoing restriction to what he calls collectives—hypothetical infinite sequences of attributes (possible outcomes) of specified experiments that meet certain requirements. Call a place-selection an effectively specifiable method of selecting indices of members of the sequence, such that the selection or not of the index i depends at most on the first i − 1 attributes. The axioms are:

Axiom of Convergence: the limiting relative frequency of any attribute exists.

Axiom of Randomness: the limiting relative frequency of each attribute in a collective ω is the same in any infinite subsequence of ω which is determined by a place selection.

[The axiom of randomness implies the axiom of convergence.]
The probability of an attribute A, relative to a collective ω, is then defined as the limiting relative frequency of A in ω. Note that a constant sequence such as H, H, H, …, in which the limiting relative frequency is the same in any infinite subsequence, trivially satisfies the axiom of randomness—a perhaps unwelcome result. Collectives are abstract mathematical objects that are not empirically instantiated, but that are nonetheless posited by von Mises to explain the stabilities of relative frequencies in the behavior of actual sequences of outcomes of a repeatable random experiment. Church (1940) renders precise the notion of a place selection as a recursive function. Nevertheless, the reference sequence problem remains: probabilities must always be relativized to a collective, and for a given attribute such as 'heads' there are infinitely many. Von Mises embraces this consequence, insisting that the notion of probability only makes sense relative to a collective. In particular, he regards single-case probabilities as "nonsense". Some critics believe that rather than solving the problem of the single case, this merely ignores it.

Let us take stock. Finite relative frequencies only arise where there are finite sequences of trials. They are, of course, finitely additive; however, they fail to satisfy countable additivity, since neither side of the equation is defined (another point of analogy to classical probabilities). Limiting relative frequencies violate countable additivity (Birkhoff 1940, Ch. XII, §6; de Finetti 1972, §5.22). Indeed, the domain of definition of limiting relative frequency is not even a field (de Finetti 1972, §5.8). So relative frequencies do not provide an admissible interpretation of Kolmogorov's axioms. Finite frequentism has no trouble meeting the ascertainability criterion, as finite relative frequencies are in principle easily determined. The same cannot be said of limiting relative frequencies. On the contrary, any finite sequence of trials (which, after all, is all we ever see) puts literally no constraint on the limit of an infinite sequence; still less does an actual finite sequence put any constraint on the limit of an infinite hypothetical sequence, however fast and loose we play with the notion of 'in principle' in the ascertainability criterion.

Offhand, it might seem that the frequentist interpretations resoundingly meet the applicability to frequencies criterion. Finite frequentism meets it all too well, while limiting relative frequentism meets it in the wrong way. If anything, finite frequentism makes the connection between probabilities and frequencies too tight, as we have already observed. A fair coin that is tossed a million times is very unlikely to land heads exactly half the time; one that is tossed a million and one times is even less likely to do so! Limiting relative frequentism fails to connect probabilities with finite frequencies. It connects them with limiting relative frequencies, of course, but again too tightly: for even in infinite sequences, the two can come apart. (A fair coin could land heads forever, even if it is highly unlikely to do so.) To be sure, science has much interest in finite frequencies, and indeed working with them is much of the business of statistics. Whether it is interested in highly idealized, hypothetical extensions of actual sequences, and relative frequencies therein, is another matter.
The applicability to rational opinion criterion goes much the same way: it is clear that rational opinion is guided by finite frequency information, unclear that it is guided by information about the limits of hypothetical infinite sequences.

4.5 PROPENSITY INTERPRETATIONS

Like the frequency interpretations, propensity interpretations locate probability 'in the world' rather than in our heads or in logical abstractions. Probability is thought of as a physical propensity, or disposition, or tendency of a given type of physical situation to yield an outcome of a certain kind, or to yield a long-run relative frequency of such an outcome. This view was motivated by the desire to make sense of single-case probabilities, such as 'the probability that this radium atom decays in 1600 years is 1/2'. Indeed, Popper (1957) advocates propensity as an account of such quantum mechanical probabilities.

Popper develops the theory further in (1959a). For him, a probability p of an outcome of a certain type is a propensity of a repeatable experiment to produce outcomes of that type with limiting relative frequency p. For instance, when we say that a coin has probability 1/2 of landing heads when tossed, we mean that we have a repeatable experimental set-up—the tossing set-up—which has a propensity to produce a sequence of outcomes in which the limiting relative frequency of heads is 1/2 (a toy simulation of this picture is sketched at the end of this passage). According to some critics, with its heavy reliance on limiting relative frequency, this position risks collapsing into von Mises-style frequentism.

Giere (1973), on the other hand, explicitly allows single-case propensities, with no mention of frequencies: the probability of an outcome on a given occasion is just the propensity of that particular experimental set-up to produce that outcome on that occasion. This, however, creates the opposite problem to Popper's: how, then, do we get the desired connection between probabilities and frequencies?

It is thus useful to follow Gillies (2000) in distinguishing long-run propensity theories and single-case propensity theories: "A long-run propensity theory is one in which propensities are associated with repeatable conditions, and are regarded as propensities to produce in a long series of repetitions of these conditions frequencies which are approximately equal to the probabilities. A single-case propensity theory is one in which propensities are regarded as propensities to produce a particular result on a specific occasion." (822). Gillies offers a long-run (though not infinitely long-run) propensity theory; Miller (19xx) and Fetzer (19xx) offer single-case propensity theories.

It seems that those theories that tie propensities to frequencies do not provide an admissible interpretation of the probability calculus, for the same reasons that relative frequencies do not. It is prima facie unclear whether single-case propensity theories obey the probability calculus or not. To be sure, one can stipulate that they do so, perhaps using that stipulation as part of the implicit definition of propensities. Still, it remains to be shown that there really are such things—stipulating what a witch is does not suffice to show that witches exist. Indeed, to claim, as Popper does, that an experimental arrangement has a tendency to produce a given limiting relative frequency of a particular outcome, presupposes a kind of stability or uniformity in the workings of that arrangement (for the limit would not exist in a suitably unstable arrangement). But this is the sort of 'uniformity of nature' presupposition that Hume argued could not be known either a priori or empirically.
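The long-run picture lends itself to a simple simulation. The Python sketch below assumes (and this is precisely the sort of assumption at issue in the surrounding discussion) that the tossing set-up behaves like independent trials with a fixed chance of heads; on that assumption the running relative frequency settles down near that chance. The function and parameter names are illustrative only.

```python
# Toy simulation of the long-run propensity picture, assuming the set-up
# behaves like independent tosses with a fixed chance of heads (here 0.5).
# The stabilization of the running relative frequency is what the limit
# theorems discussed next are about.
import random

def running_frequency_of_heads(n_tosses, chance_of_heads=0.5, seed=0):
    """Relative frequency of heads after each toss, as a list."""
    rng = random.Random(seed)
    heads, freqs = 0, []
    for k in range(1, n_tosses + 1):
        heads += rng.random() < chance_of_heads
        freqs.append(heads / k)
    return freqs

freqs = running_frequency_of_heads(100_000)
for k in (10, 100, 1_000, 10_000, 100_000):
    print(k, round(freqs[k - 1], 4))   # drifts toward 0.5 as k grows
```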
Now, appeals can be made to limit theorems—so-called 'laws of large numbers'—whose content is roughly that under suitable conditions, such limiting relative frequencies almost certainly exist, and equal the single-case propensities. Still, these theorems make assumptions (e.g., that the trials are independent and identically distributed) whose truth again cannot be known, and must merely be postulated.

Part of the problem here, say critics, is that we do not know enough about what propensities are to adjudicate these issues. There is some property of this coin-tossing arrangement such that this coin would land heads with a certain limiting frequency, say, or with a certain long-run frequency (approximately). But as Hitchcock (forthcoming) points out, "calling this property a 'propensity' of a certain strength does little to indicate just what this property is." Said another way, propensity accounts are accused of giving empty accounts of probability, à la Molière's 'dormitive virtue'. Similarly, Gillies objects to single-case propensities on the grounds that statements about them are untestable, and that they are "metaphysical rather than scientific" (825). Some might level the same charge even against long-run propensities, which are supposedly distinct from the testable relative frequencies. This suggests that the propensity account has difficulty meeting the applicability to science criterion.

Some propensity theorists (e.g. Giere) liken propensities to physical magnitudes such as electrical charge that are the province of science. But Hitchcock observes that the analogy is misleading. We can only determine the general properties of charge—that it comes in two varieties, that like charges repel, and so on—by empirical investigation. What investigation, however, could tell us whether or not propensities are non-negative, normalized and additive?

More promising, perhaps, is the idea that propensities are to play certain theoretical roles, and that these roles place constraints on the way propensities must behave, and hence on what they could be (in the style of the Ramsey/Lewis/'Canberra plan' approach to theoretical terms). The trouble here is that these roles may pull in opposite directions, overconstraining the problem. The first role, according to some, constrains them to obey the probability calculus (with finite additivity); the second role, according to others, constrains them to violate it.

On the one hand, propensities are said to constrain rational credences in a way codified by the so-called 'Principle of Direct Probability', refined and made famous to philosophers by David Lewis (19xx) under the name 'the Principal Principle'. Roughly, the principle is that rational credences strive to 'track' propensities, so that if a rational agent knows the propensity of a given outcome, her degree of belief in that outcome will be the same. More generally, where 'Cr' is the subjective probability function of a rational agent, and 'Pr' is the propensity function, for any X,

(*) Cr(X | Pr(X) = x) = x.

For example, my degree of belief that this coin toss lands heads, given that its propensity of landing heads is 3/4, is 3/4. (*) underpins an argument that whatever they are, propensities must obey the usual probability calculus (with finite additivity); after all, it is argued, rational credences, which are guided by them, do. On the other hand, Humphreys (19xx) gives an influential argument that propensities do not obey the probability calculus.
The idea is that the probability calculus implies Bayes' theorem, which allows us to 'invert' a conditional probability:

P(A|B) = P(B|A)P(A) / P(B), provided P(B) > 0.

Yet at least some propensities seem to be measures of 'causal tendencies', and much as the causal relation is asymmetric, so these propensities supposedly do not invert. Suppose we have a test for an illness that occasionally gives false positives and false negatives. A given sick patient may have a (non-trivial) propensity to give a positive test result, but it apparently makes no sense to say that a given positive test result has a (non-trivial) propensity to have come from a sick patient (a worked numerical version of the inversion appears at the end of this subsection). 'Humphreys' paradox', as it is known, has prompted Fetzer and Nute (19xx) to offer a "probabilistic causal calculus" which looks quite different to Kolmogorov's calculus. Thus, we have an argument that whatever they are, propensities must not obey the usual probability calculus.4

Perhaps all this shows that the notion of 'propensity' bifurcates: on the one hand, there are propensities that bear an intimate connection to relative frequencies and rational credences, and that obey the probability calculus (with finite additivity); on the other hand, there are causal propensities that behave rather differently. In that case, there would be still more interpretations of probability than have previously been recognized.
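For concreteness, here is the inversion applied to the test example, with illustrative numbers that are not from the text: a 1% base rate of illness, a 90% chance that a sick patient tests positive, and a 5% chance that a well patient does. The formal inversion is straightforward; Humphreys' worry concerns its propensity reading, not the arithmetic.

```python
# Bayes' theorem applied to the illness-test example, with hypothetical
# numbers (base rate, sensitivity and false-positive rate are assumptions).
def prob_sick_given_positive(base_rate, p_pos_given_sick, p_pos_given_well):
    """P(sick | positive), computed via Bayes' theorem and total probability."""
    p_positive = p_pos_given_sick * base_rate + p_pos_given_well * (1 - base_rate)
    return p_pos_given_sick * base_rate / p_positive

print(round(prob_sick_given_positive(0.01, 0.9, 0.05), 3))   # ≈ 0.154
```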

5. CONCLUSION: FUTURE PROSPECTS?

It should be clear from the foregoing that there is still much work to be done regarding the interpretation of probability. Each interpretation that we have canvassed seems to capture some crucial insight into it, yet falls short of doing complete justice to it. Perhaps the full story about probability is something of a patchwork, with partially overlapping pieces. In that sense, the above interpretations might be regarded as complementary, although to be sure each may need some further refinement.

My bet, for what it is worth, is that we will retain at least three distinct notions of probability: one quasi-logical, one objective, and one subjective. There are already signs of the rehabilitation of classical and logical probability, and in particular the principle of indifference, by authors such as Stove (1986), Bartha (forthcoming), Festa (1993) and Maher (2000, forthcoming). Relevant here may also be advances in complexity theory (see Fine 1973, Li and Vitanyi 1997). These theories have already proved to be fruitful in the study of randomness (Kolmogorov 19xx, Martin-Löf 19xx), which obviously is intimately related to the notion of probability. Refinements of our understanding of randomness, in turn, should have a bearing on the frequency interpretations (recall von Mises' appeal to randomness in his definition of 'collective'), and on propensity accounts (especially those that make explicit ties to frequencies). Given the supposed connection between propensities and causation adumbrated in the previous section, powerful causal modeling techniques developed by authors such as Pearl (2000) and Spirtes, Glymour and Scheines (1993) may also play a role here.

An outgrowth of frequentism is Lewis' (1986, 1994) account of chance. It runs roughly as follows. The laws of nature are those regularities that are theorems of the best theory: the true theory of the universe that best balances simplicity, strength, and likelihood (that is, the probability of the actual course of history, given the theory). If any of the laws are probabilistic, then the chances are whatever these laws say they are. Now, it is somewhat unclear exactly what 'simplicity' and 'strength' consist in, and exactly how they are to be balanced. Perhaps insights from statistics may be helpful here: approaches to statistical model selection, and in particular the 'curve-fitting' problem, that attempt to codify simplicity and its trade-off with strength—e.g., the Akaike Information Criterion (see Forster and Sober 1994), the Bayesian Information Criterion (see Kieseppä forthcoming), Minimum Description Length theory (see Rissanen 1999) and Minimum Message Length theory (see Wallace and Dowe 1999).

We may expect that further criteria of adequacy for subjective probabilities will be developed—perhaps refinements of 'scoring rules' (Winkler 1996), and more generally, candidates for playing a role for subjective probability analogous to the role that truth plays for belief. Here we may come full circle. For belief is answerable both to logic and to objective facts. A refined account of degrees of belief may be answerable both to a refined quasi-logical and a refined objective notion of probability. Well may we say that probability is a guide to life; but the task of understanding exactly how and why it is so has still to be completed, and will surely prove to be a guide to future theorizing about it.

Alan Hájek
California Institute of Technology

4 It should be noted that Gillies argues that Humphreys' paradox does not force non-Kolmogorovian propensities on us.

REFERENCES [to be continued]

Bertrand, J. (1889): Calcul des Probabilités, 1st edition, Gauthier-Villars.
Carnap, Rudolf (1950): Logical Foundations of Probability, University of Chicago Press.
Carnap, Rudolf (1963): "Replies and Systematic Expositions", in P. A. Schilpp (ed.), The Philosophy of Rudolf Carnap, Open Court, La Salle, Ill., 966-998.
De Finetti, Bruno (1937): "Foresight: Its Logical Laws, Its Subjective Sources", translated in Kyburg and Smokler (1964).
Fine, Terrence (1973): Theories of Probability, Academic Press.
Gaifman, Haim (1988): "A Theory of Higher Order Probabilities", in Brian Skyrms and William L. Harper (eds.), Causation, Chance, and Credence, Kluwer.
Giere, R. N. (1973): "Objective Single-Case Probabilities and the Foundations of Statistics", in P. Suppes et al. (eds.), Logic, Methodology and Philosophy of Science IV, North Holland, 467-483.
Gillies, Donald (2000): "Varieties of Propensity", British Journal for the Philosophy of Science 51, 807-835.
Goldstein, Michael (1983): "The Prevision of a Prevision", Journal of the American Statistical Association 77, 822-830.
Goodman, Nelson (1983): Fact, Fiction, and Forecast, 4th ed., Harvard University Press.
Hájek, Alan (forthcoming): "What Conditional Probability Could Not Be", Synthese.
Hild (forthcoming): Introduction to The Concept of Probability: A Reader, MIT Press.
Jaynes, E. T. (1968): "Prior Probabilities", Institute of Electrical and Electronics Engineers Transactions on Systems Science and Cybernetics SSC-4, 227-241.
Jeffreys, Harold (1939): Theory of Probability; reprinted in Oxford Classics in the Physical Sciences series, Oxford University Press, 1998.
Johnson, W. E. (1932): "Probability: The Deductive and Inductive Problems", Mind 49, 409-423.
Keynes, J. M. (1921): Treatise on Probability, Macmillan, London; reprinted 1962, Harper and Row, New York.
Laplace, Pierre Simon de (1814): Essai Philosophique sur les Probabilités, Paris; translated into English as A Philosophical Essay on Probabilities, New York, 1952.
Lewis, David (1980): "A Subjectivist's Guide to Objective Chance", in R. C. Jeffrey (ed.), Studies in Inductive Logic and Probability, Vol. II, University of California Press, 263-293; reprinted in Philosophical Papers, Volume II, Oxford University Press.
Lewis, David (1994b): "Humean Supervenience Debugged", Mind 103, 473-490.

Popper, Karl (1959a): "The Propensity Interpretation of Probability", British Journal for the Philosophy of Science 10, 25-42.
Ramsey, F. P. (1926): "Truth and Probability", in Foundations of Mathematics and other Essays; reprinted in Kyburg and Smokler (1964), and in D. H. Mellor (ed.), Philosophical Papers, Cambridge University Press, Cambridge, 1990.
Reichenbach, Hans (1949): The Theory of Probability, University of California Press.
Rosenkrantz, R. D. (1981): Foundations and Applications of Inductive Probability, Ridgeview Publishing Company.
Savage, L. J. (1954): The Foundations of Statistics, John Wiley, New York.
Skyrms, Brian (1994): "Bayesian Projectibility", in D. Stalker (ed.), Grue!, Open Court, 241-262.
Sober, Elliott (2000): Philosophy of Biology, 2nd ed., Westview Press.
Strevens, Michael (forthcoming): Bigger than Chaos, Harvard University Press.
van Fraassen, Bas (1984): "Belief and the Will", Journal of Philosophy 81, 235-256.
van Fraassen, Bas (1989): Laws and Symmetry, Clarendon Press, Oxford.
van Fraassen, Bas (1995): "Belief and the Problem of Ulysses and the Sirens", Philosophical Studies 77, 7-37.
Venn, John (1876): The Logic of Chance, 2nd ed., Macmillan and Co.
von Mises, Richard (1957): Probability, Statistics and Truth, revised English edition, New York.
