
The Paradox of Confirmation: New Challenges for

Standard Bayesian Solutions

by

Justin M. Dallmann

A Thesis submitted to the Faculty of Graduate Studies of

The University of Manitoba in partial fulfilment of the requirements of the degree of

MASTER OF ARTS

Department of Philosophy

University of Manitoba

Winnipeg

Copyright ©2009 by Justin M. Dallmann

Library and Archives Canada, Published Heritage Branch

395 Wellington Street, Ottawa ON K1A 0N4, Canada

ISBN: 978-0-494-60374-1

NOTICE:

The author has granted a non-exclusive license allowing Library and Archives Canada to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell theses worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats.

The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

In compliance with the Canadian Privacy Act some supporting forms may have been removed from this thesis.

While these forms may be included in the document page count, their removal does not represent any loss of content from the thesis.

THE UNIVERSITY OF MANITOBA

FACULTY OF GRADUATE STUDIES

COPYRIGHT PERMISSION

The Paradox of Confirmation: New Challenges for Standard Bayesian Solutions

BY

Justin M. Dallmann

A Thesis/Practicum submitted to the Faculty of Graduate Studies of The University of

Manitoba in partial fulfillment of the requirement of the degree

MASTER OF ARTS

Justin M. Dallmann © 2009

Permission has been granted to the University of Manitoba Libraries to lend a copy of this thesis/practicum, to Library and Archives Canada (LAC) to lend a copy of this thesis/practicum, and to LAC's agent (UME/ProQuest) to microfilm, sell copies and to publish an abstract of this thesis/practicum.

This reproduction or copy of this thesis has been made available by authority of the copyright owner solely for the purpose of private study and research, and may only be reproduced and copied as permitted by copyright laws or with express written authorization from the copyright owner.

1 Front Matter/Prefatory Pages

1.1 Abstract

In this paper I demonstrate one way that Bayesian confirmation theory can contribute to an epistemology of several key metaphysical concepts found at the core of the natural sciences. I then use the results to generalize the Paradox of Confirmation in a way that, I contend, undermines the standard Bayesian solution to the Paradox as well as the more recent refinements proposed in (Howson & Urbach, 2006), (Fitelson, 2006), (Fitelson & Hawthorne, Forth.), and (Vranas, 2004). The formal results presented serve as yet another illustration of the inadequacy of the standard Bayesian solutions to the Paradox of Confirmation, prompting Bayesians of all stripes to reject the standard solution. As an upshot, this thesis also demonstrates that it is imprudent to ignore metaphysical phenomena when constructing a theory of confirmation.

1.2 Acknowledgements

I would like to thank the Social Sciences and Humanities Research Council of Canada, the Donald Vernon Snider Memorial Foundation and the Province of Manitoba for their generous support in the form of a Canadian Graduate Scholarship, Donald Vernon Snider Memorial Award and Manitoba Graduate Scholarship, respectively.

1.3 Dedication

To all those from whom I've learned,

my wife Amanda, and

my rabbit Bunny.

Contents

1 Front Matter/Prefatory Pages

1.1 Abstract

1.2 Acknowledgements

1.3 Dedication

2 Introduction

3 Confirming the Metaphysical

3.1 Bayesian Accounts of Confirmation

3.2 The Argument from Entailment

3.3 A Case Study: Causation

4 A Generalization of Hempel's Paradox

4.1 Non-Standard Solutions

4.2 'The' Standard Solution

4.3 The Generalization

4.4 A Counterargument to the Generalized Paradox

4.5 Weakenings of the Standard Solution

4.6 The Obvious 'Solution'

5 Conclusion

6 Appendix

References

2 Introduction

Bayesianism, as it will be discussed within the context of this thesis, is the view that: (i) our credences come in degrees, (ii) conformity to the probability calculus is a rationality constraint on our credences, (iii) we are to update our credences by conditionalizing on our available evidence, and (iv) the support a theory garners from a given piece of evidence is a function of our subjective credence in the theory before we have learned that the evidence obtains and our credence in the theory after the evidence has been learned.

On most (subjective) Bayesian views, one's level of initial credence in a hypothesis is relatively unconstrained. All that is demanded is that one's initial credences are probabilistically consistent. As a trivial example, no agent who assigns a probability or credence of 1 to some hypothesis $h$ could also assign to its negation $\neg h$ a credence greater than 0.

As far as this overly simplified picture is concerned, the Bayesian framework is amenable to metaphysical enquiry. Nothing in it bars the assignment of positive credence to hypotheses involving cause and effect, dispositions, or claims couched in counterfactuals. However, as a purely sociological observation, the majority of practising Bayesians for the most part steer clear of such matters. Discussions of Bayesian confirmation theory are no exception to this rule. One of the main strands of this thesis will be to examine the Bayesian confirmation project in light of the metaphysics of science.

To this end, the thesis commences with a brief preliminary exposition of Bayesian confirmation theory in §3.1. In §3.2, I concern myself with ways that Bayesian confirmation theory can contribute to an epistemology of several key metaphysical concepts found at the core of the natural sciences. I then use the results to generalize the Paradox of Confirmation in §4. After an exposition of the standard Bayesian response to the Paradox of Confirmation in §4.2, I contend that the generalization of the Paradox undermines this solution to the Paradox as well as the recent refinements of the solution proposed in (Howson & Urbach, 2006), (Fitelson, 2006), (Fitelson & Hawthorne, Forth.), and (Vranas, 2004). Several possible responses are canvassed and rejected in §4.4 (especially subsection 4.4.1). In §4.4.3, I present the upshot of the novel problems introduced by the preceding sections and try to draw out some general lessons for confirmation theory. Section 4.5 looks at, and rejects, recent weakenings of the standard solution and §4.6 contains the presentation and refutation of a novel solution to the Paradox. Finally, I conclude in §5 that the formal results presented in this thesis serve as yet another illustration of the inadequacy of the standard Bayesian solution to the Paradox of Confirmation. Hence, Bayesians of all stripes should reject the standard solution. The thesis demonstrates the imprudence of ignoring metaphysical phenomena when constructing a theory of confirmation.

3 Confirming the Metaphysical

3.1 Bayesian Accounts of Confirmation

Several Bayesian measures of confirmation, $c[h, e]$, have been proposed and defended in the literature. Where $h$ denotes some hypothesis and $e$ the evidence, a few of the most popular such measures include:1

• The Difference: $d[h, e] = P[h \mid e] - P[h]$;

• The Log-Ratio: $r[h, e] = \ln(P[h \mid e] / P[h])$;

• The Log-Likelihood-Ratio: $l[h, e] = \ln(P[e \mid h] / P[e \mid \neg h])$.2

Despite the diversity of measures, it is worth pointing out that if d [h, e] > 0, then r[h, e] > 0 and l[h,e] > 0 (as is shown in the Appendix §6, Theorem 1). Hence, to show that some evidence e lends a positive degree of support to h it is sufficient to show that d [h, e] > 0. The proofs in this thesis will thus content themselves with demonstrations that hold for d [, ] whenever this sufficiency condition will deliver the requisite results for any of the above noted confirmation measures. Moreover, any future reference to a confirmation measure, c [, ], will assume that c [, ] = d [, ] unless otherwise stated.
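The agreement in sign between the three measures can be illustrated numerically. The following is only a sketch, not part of the thesis's formal apparatus: the function name `confirmation_measures` and the particular probability values are assumptions chosen for illustration, and the computation presupposes that all the probabilities involved lie strictly between 0 and 1.

```python
import math

def confirmation_measures(p_h, p_e_given_h, p_e_given_not_h):
    """Return (d, r, l) for a hypothesis h and evidence e.

    Hypothetical helper: inputs are the prior P[h] and the two
    likelihoods P[e | h] and P[e | not-h], all assumed to be in (0, 1).
    """
    # Total probability of the evidence.
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    # Bayes' theorem gives the posterior P[h | e].
    p_h_given_e = p_e_given_h * p_h / p_e
    d = p_h_given_e - p_h                        # the Difference
    r = math.log(p_h_given_e / p_h)              # the Log-Ratio
    l = math.log(p_e_given_h / p_e_given_not_h)  # the Log-Likelihood-Ratio
    return d, r, l

# Illustrative (assumed) values: e is more likely given h than given not-h.
d, r, l = confirmation_measures(p_h=0.3, p_e_given_h=0.9, p_e_given_not_h=0.4)
print(d > 0, r > 0, l > 0)  # prints: True True True
```

A single numerical case proves nothing general, of course; the general result is the one referred to as Theorem 1 of the Appendix.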

1See (Fitelson, 1999) for a more comprehensive list of measures as well as their historical background. Notable proponents of each measure are easy to come by. Proponents of $d$ include Earman, Eells, Gillies, Jeffrey, and Rosenkrantz. Horwich, Mackie and Milne endorse $r$. Finally, a short list of advocates of $l$ includes the likes of Fitelson, Good and Pearl. The nomenclature and presentation of the confirmation measures in this section are essentially that of Fitelson and Hawthorne (Forth.).

2Most authors usually assume that confirmation, as well as subjective probability assignments, are relative to an agent's background knowledge $k$. I will also make this standard assumption since I think that it is standard for good reason. However, for ease of presentation, $k$ will be suppressed in future premises except those cited from other texts. Hence, any assignment of the form $P[h]$ is best regarded as shorthand for $P[h \mid k]$, $P[h \mid e]$ as shorthand for $P[h \mid e \cdot k]$, $c[h, e]$ as shorthand for $c[h, e \mid k]$, and similarly for the rest.

3.2 The Argument from Entailment

It is part of a venerable tradition in the philosophy of science that analyses in their nascent stages are foremost informed by the sciences themselves. In confirmation theory, the rough template adduced from scientific practice is also sharpened by examining simple probability models involving urns, cards, dice and coins; each imposing constraints on the confirmation function. Certainly, the clarity and precision of these austere cases have done much to promote understanding. Nevertheless, analyses that content themselves with the examination of famous episodes from the history of science and simplistic probability models unduly restrict the range of evidence available to them. In this section, I examine a structural constraint on the confirmation function that stems from the often overlooked metaphysics of science.

In particular, I will examine the structure of a group of related objects at the core of scientific theory and practice: dispositions, natural laws, counterfactuals, and causation. For the remainder of this paper, I will refer to this loose collection of associated objects as 'the objects of natural analysis' or, more simply, as ONAs.3

One of the reasons for singling out ONAs when discussing theories of confirmation is that, in addition to being directly relevant to scientific practice, prominent analyses of ONAs entail several verifiable consequences. This result imposes constraints on confirmation functions that will be examined and exploited to generalize the Paradox of Confirmation in §4. The following argument establishes how the common structure of ONAs renders them confirmation conducive:

3It is worth pointing out that the robust ontological status of such properties is the subject of heated debate. One notable philosopher who has developed an anti-realist position is Van Fraassen (1980). However, even those who deny the existence of such objects should admit that they do appear in scientific discourse and that they do so in a way that constrains the way in which theorizing is done (as Van Fraassen indeed does). For my current purposes it will be sufficient that research programmes be constrained in this way.

Argument 1. The Argument from Entailment

1. Deterministic ONA statements generally entail their corresponding counterfactual generalizations.

2. Any counterfactual generalization, $\forall x(Ax \mathrel{\Box\!\!\rightarrow} Bx)$, entails the corresponding material generalization $\forall x(Ax \supset Bx)$ (by §6, Theorem 4).

3. Any material generalization, $\forall x(Ax \supset Bx)$, entails its corresponding instances, $\neg Aa \vee Ba$ (for any given $a$).

4. If (1), (2) and (3), then (5) (by the transitivity of entailment).

5. ONA statements generally entail the instances of their corresponding generalizations.

6. If $h$ entails $e$ (for non-trivial $h$, $e$), then $P[h \mid e] > P[h]$ (by §6, Theorem 2).

7. If (5) and (6), then (8).4

8. ONA statements gain a confirmation boost when instances of their entailed material generalizations have been directly established (from modus ponens on (7) and the conjunction of (5) and (6)).5 ∎
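Premise (6) carries the probabilistic weight of the argument. The following is a brief sketch of why it holds, on the assumption that "non-trivial" means $P[h] > 0$ and $P[e] < 1$; the thesis's own proof is Theorem 2 of the Appendix, which may proceed differently.

```latex
\begin{align*}
P[h \mid e] &= \frac{P[h \wedge e]}{P[e]} \\
            &= \frac{P[h]}{P[e]}
               && \text{since $h$ entails $e$, so $P[h \wedge e] = P[h]$} \\
            &> P[h]
               && \text{since $P[h] > 0$ and $0 < P[h] \le P[e] < 1$.}
\end{align*}
```

On the difference measure this gives $d[h, e] = P[h \mid e] - P[h] > 0$, which is the confirmation boost claimed in (8).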

From a Bayesian point of view, premise (1) of Argument 1 is perhaps the least trivial, hence it requires the most defence. That is not to say that premise (1) is

4This premise is in some need of qualification, for one could certainly come up with some measure of confirmation on which (5) and (6) do not entail (8). Nevertheless, it certainly holds for the difference measure of confirmation $c[h, e] = P[h \mid e \cdot k] - P[h \mid k]$ that is most frequently assumed in the literature and hence for the popular measures discussed in §3.1. See (Fitelson, 1999) though for a more detailed discussion of confirmation function choice.

5Note that the 'directly' part is important here. Learning something that entails an instance of a material generalization is not guaranteed to confirm that generalization without further background constraints. See (Good, 1967) for a compelling Bayesian counterexample to embracing the so-called "special consequence condition" as it applies in this context.

without substantive intuitive appeal. If $\alpha$ causes $\beta$, or if $\delta$ is disposed to $\Gamma$ when $\Theta$ed, then it seems like $\beta$ would be the case if $\alpha$ were, or if $\delta$ were $\Theta$ed, $\Gamma$ would be the case. The link between ONA statements and counterfactuals is so tight that it has spawned various attempts at reductive analyses of the former in terms of the latter.6 However, even if the connection between the two is not a strict reduction of one in terms of the other, the above argument retains its potency. As long as deterministic ONA statements entail certain corresponding counterfactual generalizations most of the time it will still be the case that ONA statements are often confirmation conducive. In other words, the conclusion that ONA statements are confirmable is robust. A lone counterexample to counterfactual theories of deterministic ONAs does not undermine the argument for the remaining ONAs. Every counterfactual supporting ONA statement is an ONA statement to which the argument applies. So far then, given that most counterexamples to a strict correspondence between counterfactuals and deterministic ONA statements have come at great labour, are often quite contrived and are generally "special cases", these loose considerations do a great deal to bolster arguments in the spirit of Argument 1 above. Nevertheless, the following case study provides substantial additional support for premise (1) of Argument 1, which is better still.

3.3 A Case Study: Causation

Since the Argument from Entailment will provide the basis for the generalization of the Paradox of Confirmation that follows in §4, I will briefly pause to bolster the argument's non-trivial premise. In order to provide support for premise (1) I will focus attention on a well discussed ONA: causation. There are good reasons for

6See for instance (Choi, 2008; Hitchcock, 2001; Lewis, 1986a).

thinking that all analyses of deterministic causation, even ones that do not attempt to reduce causal claims to counterfactual claims, entail the requisite counterfactual generalization.

The brand of causation most directly applicable to the problem at hand is general or type-relating causation. Invocations of general causation, like that smoking causes lung cancer, for instance, are claims that hold of their objects generally rather than for any particular object.7 A general deterministic causal relation then is a relation that holds between types, like the above example, except that the antecedent cause brings about its effect determinately. That submitting carbon materials to 60 kilobars of pressure at a temperature of 1000°C will cause that material to become a diamond, is perhaps an example of general deterministic causation.

The varieties of analyses of causation are many. Some accounts incorporate counterfactual machinery directly, while in others the connection between counterfactuals and causation is less obvious. Historically, counterfactuals began to factor into the debates surrounding general causation when it was realized that even if no object satisfied the causal property, for example if no one were to smoke, it would still be the case that the causal relation holds, i.e. that smoking causes lung cancer. This suggests a crude necessary condition for deterministic causation, namely that for all x, if the deterministic cause-type were to be instantiated by x, then the effect-type would come to be instantiated by x. It is for this reason that classical analyses of causation in terms of constant observations of events, one following the other, usually fail miserably. Constant conjunction theories of causation simply do not get the modal facts right.8 It must be pointed out that any successful analysis of

7I wish to remain noncommittal about the types of objects that causal relations relate and will talk sometimes of events causing other events and other times of particulars causing change in other particulars. Nothing will bear on this point as my usages will be intertranslatable.

8For a classical formulation of the constant-conjunction view of causation see (Hume, 1988).

causation must be much more subtle than the crude necessary connection sketched above.9 To restate a point made above, this is not a defence of a counterfactual analysis of causation but rather of the more modest claim that in many deterministic cases the requisite counterfactuals will hold. In what follows only the cases in which the said counterfactuals hold will be needed to generalize the paradox of confirmation.

That being said, I would like to turn our attention now to analyses of singular deterministic causation in order to draw on how specific accounts of causation that are popular in the literature entail the requisite counterfactuals. Singular deterministic causation can be said to occur when a particular event $C$ causes another particular event $E$. For example, it is a case of singular deterministic causation when the event of hitting a specific cue ball results in a specific eight ball falling into the corner pocket. I turn to accounts of singular causation in defence of premise (1), a premise about what I have been calling counterfactual generalizations, for several reasons. First, analyses of singular causation have been developed in depth and along numerous lines. Such development is a valuable source of information regarding causation in general since general accounts are made out of the same "building blocks" so to speak.10 Second, any case of singular causation, $C$ causes $E$, for which a corresponding counterfactual, $C \mathrel{\Box\!\!\rightarrow} E$, holds also entails a corresponding counterfactual generalization. Just let $A$ be the property of being event $C$ and $B$ be the property of being event $E$; then the generalization $\forall x(Ax \mathrel{\Box\!\!\rightarrow} Bx)$ will hold as required.

9For instance, it would be nice to have a unified account of deterministic and probabilistic causation and it seems that the conditional fails to hold in cases where causes bring about their effects only a certain proportion of the time. See (Lewis, 1986b) for a discussion of a counterfactual account of probabilistic causation and some difficulties associated with it. In any case, no matter how the jury rules on indeterministic counterfactual accounts, the link between causation and counterfactuals in deterministic cases is suggestive.

10Some have argued that the bifurcation between singular and general causation is deep (Sober, 1984; Hitchcock, 2003). The arguments are convincing. Be that as it may, any proposed analysis of general causation in the literature makes use of some of the same basic building blocks appealed to by theories of singular causation. Though the details differ, positive analyses of either type of causation fall under the broad rubric of probabilistic theories, process theories, or counterfactual theories.

Turning now to accounts of singular causation, we find that the causal analysis literature can be roughly divided along three lines: counterfactual analyses (including recent manipulability analyses), process analyses, and probabilistic analyses. I will take each in turn.

3.3.1 Counterfactual analyses

Their name is already a sure indicator that these accounts will entail certain counterfactuals, and indeed they do. Consider first a rough sketch of David Lewis' classic account. On this view, $e$ depends causally on $c$ iff both (i) if $c$ were to occur then $e$ would occur, and (ii) if $c$ were not to occur then $e$ would not occur (Lewis, 1986a, p. 166). A finite sequence of actual particular events $c, d, e, \ldots$ is a causal chain whenever "$d$ depends causally on $c$, $e$ on $d$, and so on throughout" (Lewis, 1986a, p. 167). Finally, Lewis defines "one event [... to be] a cause of another iff there exists a causal chain leading from the first to the second" (Lewis, 1986a, p. 167). Going back through the definitions we see that if $c$ causes $e$, then at least two (and often more) counterfactuals, $(\phi \mathrel{\Box\!\!\rightarrow} \psi)$ and $(\neg\phi \mathrel{\Box\!\!\rightarrow} \neg\psi)$, must hold for propositions $\phi$ and $\psi$.11

Similarly, recent "manipulability" analyses of causation straightforwardly imply corresponding counterfactuals. Roughly, and oversimplifying a little, on these

11Though Lewis has made several substantive revisions to his account, all of them make heavy use of counterfactuals in a way that validates (1). See (Lewis, 2000, 1986b) for the details of said revisions.

accounts causes are modelled by an ordered pair $\langle \mathcal{V}, \mathcal{E} \rangle$, where (i) $\mathcal{V}$ is a set of variables (one for each causally relevant event in the system) that take as values the occurrence or nonoccurrence of each event, and (ii) $\mathcal{E}$ is a set of structural equations that encode the causal relations of the system.12 $\mathcal{E}$ can in turn be further divided into two non-empty subsets $\mathcal{E}_X$ and $\mathcal{E}_N$. $\mathcal{E}_X$ are the equations with exogenous variables on the left hand side, where exogenous variables are variables that are set to a value from outside of the causal system under examination and do not depend on the values of the other elements of $\mathcal{V}$, i.e. $X = x$. $\mathcal{E}_N$ are the equations in which endogenous variables appear on the left hand side. A variable is endogenous if changing the values of the other variables in its equation can only be accomplished by fixing exogenous variables in $\mathcal{V}$ that do not appear in the equation being considered (i.e. an endogenous variable $Z$ will always be such that $Z = f_Z(X, Y, \ldots, W)$ for variables $X, \ldots, W$). These equations encode counterfactuals of the form "If it were the case that $X = x, Y = y, \ldots, W = w$, then it would be the case that $Z = f_Z(x, y, \ldots, w)$" (Hitchcock, 2001, p. 280). In Hitchcock's words, "$\mathcal{E}$ is a set of fundamental equations from which all other counterfactuals [that are true of the causal system] may be derived" (Hitchcock, 2001, p. 283). Causes are then defined in terms of the event variables $\mathcal{V}$ of the causal system, the counterfactual relations between them, $\mathcal{E}$, and the active causal routes that arise in such systems (the details of which are unimportant for the purpose at hand).13 What should be clear is that this account entails causal counterfactuals of the desired kind, since they are what make up half of the ordered causal model pairs $\langle \mathcal{V}, \mathcal{E} \rangle$.
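To make the idea that structural equations encode counterfactuals concrete, here is a minimal sketch. It is not Hitchcock's formalism: the helper names `solve` and `counterfactual`, the variable names, and the toy cue-ball system are all assumptions introduced for illustration, with 1 and 0 standing for the occurrence and nonoccurrence of an event.

```python
def solve(exogenous, equations):
    """Compute every variable from exogenous settings and structural equations.

    `equations` maps each endogenous variable to a function of the values
    computed so far; the dict's order is assumed to respect causal order.
    """
    values = dict(exogenous)
    for name, f in equations.items():
        values[name] = f(values)
    return values

def counterfactual(exogenous, equations, intervention):
    """Evaluate 'if the intervened variables were set thus, then ...'."""
    exo = dict(exogenous)
    eqs = dict(equations)
    for name, value in intervention.items():
        if name in exo:
            # Exogenous variables are simply reset from outside the system.
            exo[name] = value
        else:
            # An endogenous variable's equation is replaced by a constant.
            eqs[name] = lambda values, value=value: value
    return solve(exo, eqs)

# Toy system (hypothetical): S = cue ball struck, C = balls collide,
# E = eight ball drops into the corner pocket.
exogenous = {"S": 1}
equations = {
    "C": lambda v: v["S"],  # the collision occurs iff the cue ball is struck
    "E": lambda v: v["C"],  # the eight ball drops iff the collision occurs
}

actual = solve(exogenous, equations)
contrary = counterfactual(exogenous, equations, {"S": 0})
print(actual["E"], contrary["E"])  # prints: 1 0
```

Reading off `contrary["E"]` corresponds to evaluating the counterfactual "if the cue ball had not been struck, the eight ball would not have dropped", which is the kind of conditional the equations are said to encode.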

12The notation I will use throughout this brief exposition of "manipulability" analyses is (Hitchcock, 2001)'s. Other expositions can be found in (Hitchcock, 2007; Woodward, 2003; Woodward & Hitchcock, 2003).

13See (Hitchcock, 2001) for further details of the account.

3.3.2 Process analyses

That process analyses support counterfactuals is less obvious than in the case of the preceding "counterfactual accounts", but they nevertheless do support counterfactuals. To fix the discussion, I will outline a process analysis of causation roughly along the lines of (Salmon, 1984)'s well developed account. To quote Salmon at length:

Let $P_1$ and $P_2$ be two processes14 that intersect with one another at the space-time point $S$, which belongs to the histories of both. Let $Q$ be a characteristic that process $P_1$ would exhibit throughout an interval (which includes subintervals on both sides of $S$ in the history of $P_1$) if the intersection with $P_2$ did not occur; let $R$ be a characteristic that process $P_2$ would exhibit throughout an interval (which includes subintervals on both sides of $S$ in the history of $P_2$) if the intersection with $P_1$ did not occur. Then, the intersection of $P_1$ and $P_2$ at $S$ constitutes a causal interaction if [there exist $Q$ and $R$ such that]:

1. $P_1$ exhibits the characteristic $Q$ before $S$, but exhibits a modified characteristic $Q'$ throughout an interval immediately following $S$; and

2. $P_2$ exhibits the characteristic $R$ before $S$, but it exhibits a modified characteristic $R'$ throughout an interval immediately following $S$. (Salmon, 1984, p. 171, emphasis mine)

In other words, a cause and effect relationship between two processes can only be said to obtain if it is the case that if either one of the processes had not been, then the modification of the other process (the effect) would not have occurred. Moreover, part of the definition of 'process' makes it something which would exhibit a change indefinitely were an interaction to occur. Hence, process analyses (or Salmon's archetypical formulation thereof) incorporate counterfactual conditionals of the appropriate kind.15

14Processes are intervals of events or "lines" of space-time rather than "points" of space-time on Salmon's account (Salmon, 1984, p. 139). Moreover, they are to be distinguished from what he calls "pseudo-processes" which, unlike genuine processes, do not transmit their structure. In other words, genuine processes are such that some local change at point P would change at least one of its characteristics at all future points without further interactions (Salmon, 1984, p. 148). Pseudo-processes on the other hand are merely local changes to a process. See (Salmon, 1984, p. 141-8) for further details.

3.3.3 Probabilistic analyses

These analyses all state, roughly, that an event $C$ is the cause of an event $E$ if and only if the occurrence or non-occurrence of $C$ is probabilistically relevant to the occurrence or non-occurrence of $E$. Probabilistic relevance can be parsed in numerous ways, for example the occurrence of $C$ may be considered probabilistically relevant to the occurrence of $E$ if $P[E \mid C] \neq P[E]$; or $P[C \wedge E] > P[C]\,P[E]$; or $C$ "screens off" all other variables from $E$; ...; and the list of refinements goes on. Few, if any, of these conditions are stated in counterfactual terms. However, I claim that the interpretation of the probabilities mentioned in these analyses must support counterfactual claims in order to do the work required of them in the sciences16 and it is beyond question that probabilities play a foundational role within the sciences.17

I will not provide a complete and systematic argument for the modal dimension of

15Interestingly, Salmon notes that the counterfactuals that occur in his theory can be established empirically (Salmon, 1984, p. 149-50). From this fact he infers that they are sufficiently objective to be unproblematic. On the current analysis this is correct but may be generalized since most counterfactuals may be confirmed via the material conditionals that they imply. See §4 for further details set within a broadly Bayesian picture of confirmation.

16I do not claim any originality for the statement that probabilities, at least as they appear in the sciences, do not admit of non-modal reduction. In fact, the hope of non-modal analyses of physical probabilities has all but been abandoned in the literature. See, for example, (Fraassen, 1980; Fetzer, 1981; Gillies, 2000).

17For example, probabilities are ineliminable in studies of radioactive particle decay, quantum mechanics (i.e. the "Born Rules" that probabilistically link states to measurement outcomes), the laws of the special sciences, etc.

physical probabilities since many arguments of that type already appear in the literature.18 However, an outline of a few of the basic reasons that theories of physical probabilities end up going modal is easy enough to state.

Firstly, it is hard not to read modality off of the axioms of probability themselves. The base definition of probability function that appears in the scientific literature specifies $P[\cdot]$ as a function from members of a sigma field $\mathcal{F}$ of subsets of a set of possibly unactualized events to the unit interval $[0, 1]$ that satisfies a version of the Kolmogorov axioms with countable additivity. Though models of such a structure could be populated solely with actual events, thereby avoiding some modal commitments, such is rarely the case. Indeed, in cases where non-denumerable event spaces are called for, as they often are in practice, this cannot be the case.19 Take models that satisfy the "Born rules" of quantum mechanics as an example. These rules tell us the values of conditional probabilities of the form $P[(Q \in E) \mid (Q \text{ is measured in a system of state } S)]$, the probability that a physical quantity $Q$ lies in an event space $E$ given that $Q$ is measured on a system in state $S$. In the specified models, possible measurements over which the corresponding probability function is defined are non-denumerably many. Models of these rules cannot solely contain actual events since then each of the non-denumerably many measurement events must actually occur for every one of the non-denumerably many

18Again I refer the reader to the thorough arguments that can be found in (Fraassen, 1980; Fetzer, 1981; Gillies, 2000; Salmon, 1984).

19A probability function cannot assign non-zero probability to non-denumerably many pairwise disjoint events without contradicting the Kolmogorov axioms. A fortiori, the probability that every event in a non-denumerable class of disjoint events obtains is zero. In other words, the probability that some event in the field is merely possible equals unity.

states.20 But, this cannot be the case. As long as measurements take some finite amount of time there will have to be unactualized measurements since (even if the universe lasts forever) the maximum number of events that could be measured will be denumerable.21 It seems then that in order to make sense of the conditional probability that $Q$ lies in $E$ given a merely possible measurement in state $S$ we have to make use of the counterfactual 'if $Q$ were measured in state $S$, then it would be the case that the probability of $Q$ being in $E$ is the value given by the "Born rules"'.
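The footnoted claim that a probability function cannot assign non-zero probability to non-denumerably many disjoint events admits a short justification. The following is a standard textbook sketch rather than anything drawn from the thesis itself; the index notation $A_i$, $S_n$ and $F$ is assumed here for illustration, with $\{A_i\}$ a family of pairwise disjoint events.

```latex
\[
S_n = \{\, i : P[A_i] > \tfrac{1}{n} \,\}, \qquad
1 \;\ge\; P\Bigl[\,\bigcup_{i \in F} A_i\Bigr] \;=\; \sum_{i \in F} P[A_i] \;>\; \frac{|F|}{n}
\quad \text{for every non-empty finite } F \subseteq S_n ,
\]
\[
\text{so } |S_n| < n, \quad \text{and} \quad
\{\, i : P[A_i] > 0 \,\} \;=\; \bigcup_{n \ge 1} S_n \ \text{ is countable.}
\]
```

Only finite additivity is used in the middle step; the conclusion is that at most denumerably many members of any disjoint family can receive positive probability.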

Now, let us apply what we have learned to the general discussion of probabilistic analyses of causation. The semantics of physical probabilities require the use of counterfactuals in their interpretation when merely possible events are conditionalized on.22 For a model to be adequate for certain uses that appear in the everyday practice of science, its field will have to contain some unactualized events. Hence, probabilities conditional on these events are to be interpreted counterfactually. But then a general account of conditional probabilities will have to be interpreted counterfactually. Combine this with the fact that, on probabilistic analyses of causation, probabilistic relevance between a cause $C$ and an effect $E$ is stated in terms of the conditional probability $P[E \mid C]$, and we have it that probabilistic analyses of causation entail the requisite counterfactuals quite generally.

20 This is not quite right, since by the downward Löwenheim-Skolem theorem, if there is a model (possibly with a non-denumerable domain) that satisfies some set of sentences $\Gamma$, then there is a denumerable model that satisfies $\Gamma$. However, such a model would not be the intended model. Given the type of model that has been fixed for this example, the domain could not consist solely of actual events. Moreover, the sentence '$\exists x(x \text{ is an event} \wedge x \text{ is not actual})$' would still be true on this model as long as we add the predicates 'actual' and 'non-actual' to our language. (These last two points will be argued presently.)

21 This discussion of the "Born rules" in relation to probabilities and in particular unactualized events draws heavily on (Fraassen, 1980, p. 169-178).

22 Of course, even if the field contained only actualized events, the requisite counterfactuals would still hold since $(A \wedge B)$ entails $(A \mathrel{\Box\!\!\rightarrow} B)$. I am indebted to Chris Tillman for bringing this point to my attention.

3.3.4 The upshot

Though I have far from examined how each ONA statement supports or entails its corresponding counterfactual, i.e. rigorously established premise (1), I have given ample evidence for the premise as it pertains to one of the best studied ONAs: causation. In addition, I have examined some general reasons for thinking the premise plausible and argued that the thrust of the argument in which the premise is embedded only requires that the premise hold most of the time.

Summing up, the use of ONAs shouldn't be ruled out of "serious" science on the grounds that they cannot be confirmed. ONA statements entail their confirmation-conducive material cousins. Hence, by Theorem 2 of §6, the Appendix, ONA statements are also confirmation conducive. The metaphysics of the objects of natural analysis yield important information about how the structure of our subjective credence functions should be constrained. Hence, from a Bayesian point of view, the metaphysics of science in part determines how our choice of confirmation function behaves. Let us now turn our attention to how these observations relate to the Paradox of Confirmation.

4 A Generalization of Hempel's Paradox

The Paradox of Confirmation, as it is usually formalized, consists of two seemingly innocuous premises:

(NC) Nicod's Condition, which states that universal generalizations, $\forall x(Ax \supset Bx)$, are confirmed by their positive instances,23 $(Aa \wedge Ba)$, for any object $a$; and

23 It is important to make a distinction between what I have up until now been calling 'instances of a material conditional' and what I am now calling 'positive instances of a material conditional.' The former are statements of the form $\ulcorner \neg Aa \vee Ba \urcorner$ while the latter are statements of the form $\ulcorner Aa \wedge Ba \urcorner$.

(EQUIV) The Equivalence Condition, which states that a piece of evidence e confirms (disconfirms) a hypothesis h iff e confirms (disconfirms) all hypotheses that are logically equivalent to h.

From these two plausible premises follows the paradoxical conclusion that for any predicates $A$, $B$ and any object $a$, $(\overline{A}a\,\overline{B}a)$ confirms the hypothesis $\forall x(Ax \supset Bx)$.24

To see this, note that $(\overline{A}a\,\overline{B}a)$ is a positive instance of the universal generalization $\forall x(\overline{B}x \supset \overline{A}x)$ and so confirms it by (NC). But this generalization is logically equivalent to $\forall x(Ax \supset Bx)$; hence, by the equivalence condition, $(\overline{A}a\,\overline{B}a)$ confirms $\forall x(Ax \supset Bx)$.
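The logical step in this derivation can be checked mechanically. The following Python sketch is an illustration only and not part of the original argument: the function names and the three-object domain are assumptions made for the example. It enumerates every way of assigning the predicates A and B to a small finite domain and checks that the generalization and its contrapositive never differ in truth value, and that a non-A, non-B object is a positive instance of the contrapositive.

```python
from itertools import product

def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q

def all_A_are_B(world):
    """Truth value of 'for all x, Ax implies Bx' in a finite world."""
    return all(implies(is_a, is_b) for is_a, is_b in world)

def all_nonB_are_nonA(world):
    """Truth value of the contrapositive generalization in the same world."""
    return all(implies(not is_b, not is_a) for is_a, is_b in world)

def worlds(n):
    """Every way of assigning A and B to n objects."""
    cells = list(product([False, True], repeat=2))
    return list(product(cells, repeat=n))

# Hypothetical domain size, chosen only to keep the enumeration small.
DOMAIN_SIZE = 3
equivalent = all(
    all_A_are_B(w) == all_nonB_are_nonA(w) for w in worlds(DOMAIN_SIZE)
)

def is_positive_instance_of_contrapositive(is_a, is_b):
    """A positive instance of the contrapositive is a non-B that is a non-A."""
    return (not is_b) and (not is_a)

print(equivalent)
print(is_positive_instance_of_contrapositive(False, False))
```

The check covers only finite domains of the chosen size; the equivalence itself is a theorem of first-order logic and does not depend on the enumeration.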

4.1 Non-Standard Solutions

Before moving on to the family of Bayesian views that are the target of this thesis, it is worth mentioning some of the non-standard solutions that have been advanced in the literature. These solutions fall into roughly three camps, the first two corresponding to denials of the premises in the Paradox, while the third accepts the argument as sound but tries to explain away its paradoxicality. There are thus non-equivalence solutions, solutions that deny Nicod's condition and solutions that explain away the counterintuitive results. I will take each in turn.

4.1.1 Against Equivalence

There are many different solutions that focus on the equivalence condition. I will only outline a few paradigmatic approaches here, presenting particular views as examples. Prima facie, there are at least two ways that one could focus on the equivalence condition. One could deny (EQUIV), that is, deny that a piece of evidence e confirms (disconfirms) a hypothesis h iff e confirms (disconfirms) all hypotheses h' that are logically equivalent to h. This route seems implausible, at least for the non-trivial hypotheses that are at issue in the usual paradoxes of confirmation. For two such hypotheses to be logically equivalent is, in some strong sense, for them to have the same content c. But if some piece of evidence supports c, then it supports any expression whose content is exhausted by c. Hence, that evidence supports any logically equivalent hypothesis. Alternatively, one could deny that (EQUIV) applies to the hypothesis h and the proposed hypothesis h'. Cohen (1987) and Clarke (2009) provide recent exemplifications of this approach.

24 The symbolic notation in this and the remaining sections is that common to treatments of probability and lends itself well to readability in proof. The symbolization is as follows: sentences of the form $\ulcorner \Phi \text{ and } \Psi \urcorner$ will be symbolized as $\ulcorner \Phi\Psi \urcorner$; $\ulcorner \Phi \text{ or } \Psi \urcorner$ as $\ulcorner \Phi \vee \Psi \urcorner$; $\ulcorner \text{If } \Phi \text{ then } \Psi \urcorner$ as $\ulcorner \Phi \supset \Psi \urcorner$; and $\ulcorner \text{not-}\Phi \urcorner$ as $\ulcorner \overline{\Phi} \urcorner$.

Cohen begins by noting that when speakers utter expressions like 'All As are Bs', they usually wish to convey that there is at least one A and that it along with the rest of the As are Bs.25 Moreover, he holds that if there are no As then what is conveyed lacks a truth value. Cohen then defines a presuppositional operator in order to capture what he takes to be the content of 'All As are Bs'. This truth-functional operator (*) maps true statements to true, indeterminate statements to the intermediate value i and falsehoods to i. He then takes expressions of the form 'All As are Bs' to be correctly captured in the formalism by '$\forall x(*(Ax) \supset Bx)$'. Likewise, 'All non-Bs are non-As' is formalized as '$\forall x(*(\overline{B}x) \supset \overline{A}x)$'. But given the definition of '*', these two formalizations are not equivalent. Hence, (EQUIV) does not apply and the paradox is dissolved.26

25 He offers no argument for this interpretation in (Cohen, 1987) besides a few plausible examples but cites Strawson's defence of a similar position in (Strawson, 1952) approvingly.

26It must be noted that the above sketch is only half of Cohen's view. He goes on to offer his own theory of confirmation on which Nicod's condition (NC) also fails. However, I believe that the above captures his solution insofar as it pertains to (EQUIV). See (Cohen, 1987) for the relevant details.

Other authors have argued that laws of nature of the form 'All As are Bs' do not entail their contrapositive.27 However, such a defence is harder to motivate than presuppositional accounts since it does not explain why we find the Paradox irksome in the case of material generalizations. On the other hand, these views are typically motivated by a dissatisfaction with the Bayesian insistence that confirmation be measured in terms of personal betting odds. On this count, I am at least sympathetic with their view and will offer further reasons to think that there is a gap between probabilistic confirmation and actual evidential relevance.

4.1.2 Against Nicod's Condition

Another way to sidestep the Paradox of Confirmation is to deny that Nicod's condition (NC), the condition that universal statements are confirmed by their instances, holds in full generality. Such manoeuvres are now commonplace. For instance, Quine (1969) argues that evidence phrased in terms of non-natural kinds, like non-blackness or non-ravenity, is not confirmation conducive, thereby blocking the inference to the paradoxical conclusion. However, (Maher, 2004, p. 78) contains an argument that places even the restriction of (NC) to natural kinds into question. Just consider some natural kind predicate which was thought previously to be uninstantiated, perhaps '... is a unicorn', and consider the hypothesis 'All unicorns are uni-horned.' Now finding a uni-horned unicorn would make it less incredible to find a non-uni-horned unicorn. Thus, we have a counterexample to (NC) that employs only natural kinds.

Objecting to (NC) along different lines, Howson & Urbach (2006, p. 102) paraphrase a convincing counterexample due to Rosenkrantz (1977, p. 35) as follows:

27See, for example, (Harre, 1970; Aronson, 1989) and possibly (Quine, 1969).

Three people leave a party, each with a hat. The hypothesis that none of the three has his own hat is confirmed, according to Nicod, by the observation that person 1 has person 2's hat and by the observation that person 2 has person 1's hat. But since the hypothesis concerns only three, particular people, the second observation must refute the hypothesis, not confirm it.
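The structure of this counterexample can be made explicit by brute-force enumeration. The sketch below is illustrative only: it assumes, as the quoted passage does not state, that all six ways of distributing the three hats are initially equiprobable, and the function names are hypothetical. It computes the probability of the hypothesis before any observation, after the first observation, and after both.

```python
from fractions import Fraction
from itertools import permutations

PEOPLE = (1, 2, 3)

# takes[i] is the owner of the hat that person i leaves with.
# Assumption: every assignment of hats is equally likely beforehand.
ASSIGNMENTS = [dict(zip(PEOPLE, hats)) for hats in permutations(PEOPLE)]

def nobody_has_own_hat(takes):
    """The hypothesis: none of the three has his own hat."""
    return all(takes[i] != i for i in PEOPLE)

def first_observation(takes):
    """Person 1 has person 2's hat."""
    return takes[1] == 2

def second_observation(takes):
    """Person 2 has person 1's hat."""
    return takes[2] == 1

def probability(hypothesis, evidence=lambda takes: True):
    """Probability of the hypothesis among assignments compatible with the evidence."""
    compatible = [t for t in ASSIGNMENTS if evidence(t)]
    favourable = [t for t in compatible if hypothesis(t)]
    return Fraction(len(favourable), len(compatible))

prior = probability(nobody_has_own_hat)
after_first = probability(nobody_has_own_hat, first_observation)
after_both = probability(
    nobody_has_own_hat,
    lambda t: first_observation(t) and second_observation(t),
)
print(prior, after_first, after_both)
```

On this assumption the first observation raises the probability of the hypothesis, while adding the second drives it to zero, since person 3 is then left with his own hat.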

The above arguments provide criticisms of Nicod's Condition that do not rely on the Bayesian apparatus.28 Hence, one needn't be a Bayesian to adopt them. It may of course be possible to give these criticisms a Bayesian gloss. But even without appeal to Bayesian techniques, they suffice to undercut the Paradox of Confirmation as stated at the beginning of this section (§4). Taken without a Bayesian gloss, these counterexamples are non-standard solutions to the Paradox, in a sense to be defined presently.

4.2 'The' Standard Solution

The suggested solutions to the Paradox of Confirmation are multifarious both in scope and in kind. In this thesis, my focus will only be on Bayesian solutions to the paradox that are of the "canonical" type. These solutions typically accept the paradoxical conclusion but qualify it in a way that eliminates its bite. A peculiarity of these views is that they reject the reasoning that originally led to the paradoxical conclusion but still embrace the problematic result due to their undergirding probabilistic framework.

In order to reach the desired conclusion, Bayesians usually argue that negative instances $(\overline{A}a\,\overline{B}a)$ confirm the usual hypotheses $\forall x(Ax \supset Bx)$ only to a negligible degree. Thus their strategy is to explain away the intuition that the negative instances of a hypothesis do not confirm. To spell out the details using a concrete example, let the predicate 'A' be the '... is a raven' predicate, 'B' be the predicate '... is black', and consider the customary hypothesis $h = \forall x(Ax \supset Bx)$, that all ravens are black. If we let $e = (\overline{A}a\,\overline{B}a)$, what the Bayesian needs to show is that $c[h, e] = \varepsilon$, for acceptable confirmation function(s), $c[\,\cdot\,,\,\cdot\,]$, and some negligibly small $\varepsilon$. In order to derive the desired result Bayesians usually assume that the number of non-black objects dwarfs the number of ravens and that $P[Aa \mid \overline{B}a] = P[Aa]$. The desired result can easily be shown to follow for standard confirmation functions $c[\,\cdot\,,\,\cdot\,]$.29

28 Its setup does, however, rely upon background information implicit in the party story. Note also that it requires two iterations of (NC).

Summing up, I will call a Bayesian solution to the Paradox of Confirmation 'standard' if (i) confirmation is defined probabilistically using one of the common measures outlined above in §3.1; (ii) it is assumed that drawing a non-black thing from the population is more probable than drawing a raven; (iii) samples are drawn from the domain without constraint; (iv) the conclusion reached is that either (iv.a) the statement that a is a non-black, non-raven, confirms the hypothesis that all ravens are black but only to a minute degree, or (iv.b) the statement that a is a non-black, non-raven, confirms the hypothesis that all ravens are black less than the statement that a is a black raven.

Howson & Urbach (2006), Fitelson (2006), Fitelson & Hawthorne (Forth.) and Vranas (2004) all provide recent examples of the standard solution that are exceedingly similar to the presentation given in Theorem 3 of the Appendix §6. However, its

29 See (Vranas, 2004) and (Fitelson, 1999) for discussion, and §6, Proposition 3, of this thesis for a demonstration of the desired result for the confirmation function $c[h, e] = P[h \mid e] - P[h]$. My proof draws on that of (Vranas, 2004). It should be noted that in the same paper Vranas argues convincingly that the independence assumption, $P[Aa \mid \overline{B}a] = P[Aa]$, used in the standard solution is unfounded. For a reply see (Howson & Urbach, 2006) and for some plausible weakenings of the disputed assumption see (Fitelson & Hawthorne, Forth.). (As Brad C. Johnson pointed out in a personal communication, it is questionable whether being a raven should be considered independent of colour in the first place.)

lineage stretches back much further. For example, G. H. Alexander (1958) proposes a probabilistic solution satisfying (i)-(iv) that could be seen as a precursor to more recent Bayesian analyses. His solution proceeds by an examination of cases for which it is plausible that $P[\overline{A}a\,\overline{B}a \mid h] > P[AaBa \mid h]$ and for which the probability of either piece of evidence given $\overline{h}$ is smaller still.30 He notes that the values that he stipulated are plausible whenever the $\overline{B}$ things greatly outnumber the $A$ things and claims that this restriction is met for most of the hypotheses that we investigate (Alexander, 1958, p. 231-2). Alexander's general strategy is to note the probability of drawing an A-thing, $a$, and the probability of drawing a B-thing, $b$. Then we may state the probabilities of drawing any of the four possible AB-outcomes in the following table when $h = \forall x(Ax \supset Bx)$, $\overline{h}$ = there is no correlation between As and Bs:

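$$
\begin{array}{l|cccc}
 & AB & A\overline{B} & \overline{A}B & \overline{A}\,\overline{B} \\ \hline
(\overline{h}) & ab & a(1-b) & (1-a)b & (1-a)(1-b) \\
(h) & a & 0 & (b-a) & (1-b)
\end{array}
$$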

Given the assumption (ii), that $a < b <$ [...], it follows that $P[AB \mid h]/P[AB \mid \overline{h}] > P[\overline{A}\,\overline{B} \mid h]/P[\overline{A}\,\overline{B} \mid \overline{h}] > 0$ as required. In modern terms, his argument is that for a large variety of common cases, in particular the ones for which $P[\overline{B}a] \gg P[Aa]$ for randomly selected $a$, the measure $l[h, e] = \ln(P[e \mid h]/P[e \mid \overline{h}])$ is such that $l[h, AaBa] \gg l[h, \overline{A}a\,\overline{B}a]$.31 Of course, Alexander's solution has the additional drawback that it is only reasonable insofar as we know the distribution of As and Bs. Worse, it assumes that A and B are independent when h fails to hold. This unduly

30 Alexander does not refer to these probabilities using the Bayesian terminology, nor does he explicitly invoke Bayes' Theorem in his demonstrations.

31 It is interesting to note that Alexander does not assume that $P[Ba \mid h] = P[Ba]$ like many current analyses. This assumption is disputed in (Horwich, 1982; Maher, 1999; Vranas, 2004) and has been used in a solution to the Paradox as recently as (Howson & Urbach, 2006).

strong assumption appears false, since even if h fails to hold it appears that there is still a strong correlation between As and Bs in many cases including the ravens case.
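Alexander's table lends itself to a direct numerical check. The sketch below is a reconstruction under stated assumptions rather than Alexander's own presentation: the outcome labels (with "~" marking negation), the function names and the particular values of a and b are all hypothetical, chosen so that B-things are uncommon and A-things rarer still. It rebuilds both rows of the table, confirms that each sums to one, and computes the likelihood ratio and its logarithm for a black raven and for a non-black non-raven.

```python
from fractions import Fraction
from math import log

OUTCOMES = ("AB", "A~B", "~AB", "~A~B")  # "~" marks negation

def table(a, b):
    """Rows of the table for P(A) = a and P(B) = b, with a <= b."""
    under_h = dict(zip(OUTCOMES, (a, 0, b - a, 1 - b)))
    under_not_h = dict(
        zip(OUTCOMES, (a * b, a * (1 - b), (1 - a) * b, (1 - a) * (1 - b)))
    )
    return under_h, under_not_h

def likelihood_ratio(outcome, a, b):
    """P[outcome | h] / P[outcome | not-h], with not-h read as independence."""
    under_h, under_not_h = table(a, b)
    return under_h[outcome] / under_not_h[outcome]

def l_measure(outcome, a, b):
    """Natural logarithm of the likelihood ratio."""
    return log(float(likelihood_ratio(outcome, a, b)))

# Hypothetical values: 1% of objects are A, 20% are B.
a, b = Fraction(1, 100), Fraction(1, 5)
print(likelihood_ratio("AB", a, b), likelihood_ratio("~A~B", a, b))
print(l_measure("AB", a, b), l_measure("~A~B", a, b))
```

With these values the two ratios are 1/b and 1/(1 - a), so the comparison turns entirely on whether b is smaller than 1 - a, which is one way of seeing how much work assumption (ii) does.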

Good's (1960) provides another classic example of a standard solution. In it he claims that the 'Bayes factor', P[

Hence, Good's solution satisfies (i). To "solve" the Paradox Good assumes that (ii) holds, i.e. that drawing a non-black thing from the population is more probable than drawing a raven (or a crow to use his terminology) when he proceeds thusly:

Now suppose that there are N objects that might be seen at any moment, of which c are crows and b are black, and that the N objects each have probability 1/N of being seen. Let $H_i$ be the hypothesis that there are i white crows, and suppose that the hypotheses $H_1, H_2, \ldots, H_c$ are initially equiprobable. Then, if we happen to see a black crow, the Bayes factor in favour of H is [...] i.e. about 2 if the number of crows in existence is known to be large. But the factor if we see a white shoe is only

$$= (N-b) \div \max\left(N-b-\tfrac{1}{2}c-\tfrac{1}{2},\ \tfrac{1}{2}(N-b-1)\right),$$

and thus exceeds unity by only about $(1/2)c/(N-b)$ if $N-b$ is large compared with c. Thus the weight of evidence for H provided by the sight of a white shoe is positive, but is small if the number of crows is known to be small compared with the number of non-black objects. (Good, 1960, p. 147-8)32

Note that in Good's example samples are drawn from the domain without constraint, i.e. (iii) holds. Note also that the conclusion drawn is the quantitative conclusion (iv.a) that the sighting of a non-black, non-crow, confirms the hypothesis that all crows are black but only to a minute degree. Hence, Good's solution is standard.
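The orders of magnitude in Good's example can be reproduced with exact arithmetic. The following sketch is one reading of the quoted setup, not Good's own derivation, and it makes several assumptions that should be flagged: H is taken to be the hypothesis of no white crows, the alternatives $H_i$ for i from 1 to c are weighted equally, every non-white crow is black, and c does not exceed N - b. The function names and population figures are hypothetical.

```python
from fractions import Fraction

def bayes_factor(likelihood_under_h, likelihoods_under_alternatives):
    """P[e | H] divided by the equally weighted average of P[e | H_i]."""
    average = sum(likelihoods_under_alternatives) / len(likelihoods_under_alternatives)
    return likelihood_under_h / average

def good_factors(N, b, c):
    """Bayes factors in favour of H for a black crow and for a white shoe.

    Assumes c <= N - b, so that every H_i with 1 <= i <= c is possible.
    Under H_i there are c - i black crows and N - b - i non-black non-crows.
    """
    alternatives = range(1, c + 1)
    black_crow = bayes_factor(
        Fraction(c, N), [Fraction(c - i, N) for i in alternatives]
    )
    white_shoe = bayes_factor(
        Fraction(N - b, N), [Fraction(N - b - i, N) for i in alternatives]
    )
    return black_crow, white_shoe

# Hypothetical population: a million objects, 200,000 black, 1,000 crows.
N, b, c = 1_000_000, 200_000, 1_000
black_crow, white_shoe = good_factors(N, b, c)
print(float(black_crow))
print(float(white_shoe - 1), float(Fraction(c, 2 * (N - b))))
```

On this reading the factor for a black crow comes out as 2c/(c - 1), which is close to 2 for large c, and the factor for a white shoe agrees with the first argument of the max in the quoted expression.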

Following Alexander and Good, J. L. Mackie's work is another precursor to the standard view. In accordance with (ii), Mackie claims that $P[\overline{B}a] > P[Aa]$ is a necessary and sufficient condition for $AaBa$ to confirm $\forall x(Ax \supset Bx)$ more than $\overline{A}a\,\overline{B}a$ does33 when $P[e \mid h] - P[e] > 0$ is considered sufficient for positive confirmation (as is the case for the choices of $c[h, e]$ given in (i)) (Mackie, 1963, p. 267).34

He also argues that the comparative conclusion follows when samples are drawn without constraint from the population at large, though he also considers two stage procedures (Mackie, 1963, p. 276). Insofar as Mackie agrees with (i)-(iv), he may correctly be considered a standard-solutioner.

The solutions to the Paradox of the Ravens advanced in (Alexander, 1958; Good, 1960; Mackie, 1963; Vranas, 2004; Howson & Urbach, 2006; Fitelson, 2006; Fitelson & Hawthorne, Forth.), as well as the mass of other solutions that respect (i)-(iv), are standard. Solutions like these make up the majority of the responses to the Paradox of Confirmation. In the next section, I advance a novel generalization

32 Good's equation fails without the addition of the $\tfrac{1}{2}$ term, erroneously left out of the original article. If left out, the expression simplifies to 2 and there is no need to take the large number of crows into account. I would like to thank Brad Johnson for bringing this to my attention.

33Technically this is not a necessary condition but merely a sufficient one. See (Vranas, 2004; Fitelson & Hawthorne, Forth.) for an extended discussion of this point.

34 Mackie dubs the condition that $P[e \mid h] - P[e] > 0$ 'The Inverse Principle' and argues that its satisfaction can serve as the basis for a definition of confirmation (Mackie, 1963, p. 277) in accordance with measures $d[h, e]$ and $r[h, e]$ of (i).

of the Paradox of Confirmation that applies to these standard solutions.

4.3 The Generalization

We are now in a position to show that Bayesians who accept the standard solution to the paradox of confirmation are susceptible to a generalized version of the paradox. By the standard Bayesian solution to the paradox of confirmation, negative instances of material conditionals often confirm their respective conditional. Now let $P_{\overline{A}\,\overline{B}}[\,\cdot\,]$ be the probability distribution arrived at by conditionalizing on such a negative instance of some particular universal generalization, $UG = \forall x(Ax \supset Bx)$, let $\varepsilon$ be the amount of additional probability $(\overline{A}a\,\overline{B}a)$ lends $UG$, and consider the amount of confirmation that $(\overline{A}a\,\overline{B}a)$ bestows on a corresponding ONA statement. For concreteness, let the ONA statement and negative instance in question be such that the standard solution's assumptions apply, i.e. that our credence distributions for the conditional entailed are such that $P[Aa] \ll P[\overline{B}a]$ and $P[\overline{B}a \mid h] = P[\overline{B}a]$. For example, we could let the ONA be the causal relation that submitting carbon materials to 60 kilobars of pressure at a temperature of 1000°C will cause that material to become a diamond, and let the negative instance be the observation of a brown bookcase. The amount of confirmation in question is then

$$c[ONA, \overline{A}a\,\overline{B}a] = P_{\overline{A}\,\overline{B}}[ONA] - P[ONA].$$

Applying the law of total probability on the partition created by $UG$, $\overline{UG}$ to $P_{\overline{A}\,\overline{B}}[ONA]$ we obtain

$$\left(P_{\overline{A}\,\overline{B}}[ONA \mid UG]\,P_{\overline{A}\,\overline{B}}[UG] + P_{\overline{A}\,\overline{B}}[ONA \mid \overline{UG}]\,P_{\overline{A}\,\overline{B}}[\overline{UG}]\right) - P[ONA].$$

But since $(ONA \models UG)$,35 it follows that $P_{\overline{A}\,\overline{B}}[ONA \mid \overline{UG}]$ is zero and we have

$$\left(P_{\overline{A}\,\overline{B}}[ONA \mid UG]\,P_{\overline{A}\,\overline{B}}[UG] + 0\right) - P[ONA].$$

Expanding $P[ONA]$ using the law of total probability on the same partition and noting again that $P[ONA \mid \overline{UG}] = 0$, our equation can be written as

$$P_{\overline{A}\,\overline{B}}[ONA \mid UG]\,P_{\overline{A}\,\overline{B}}[UG] - P[ONA \mid UG]\,P[UG]. \qquad (*)$$

Now, for the most part, our credence distributions will be such that if $(\overline{A}a\,\overline{B}a)$ is relevant to the ONA at all, it will be relevant through its effects on $UG$, so that $P_{\overline{A}\,\overline{B}}[ONA \mid UG] = P[ONA \mid UG]$. After all, if we learn that the universal generalization $UG$ obtains, we have learned all of the information that $(\overline{A}a\,\overline{B}a)$ could have given us about our ONA statement and more. Hence, we should expect that $UG$ screens off $(\overline{A}a\,\overline{B}a)$ from the ONA statement. But even if this is not the case for our personal probabilities, given that $(\overline{A}a\,\overline{B}a)$ serves at most to eliminate irrelevant members of the sample space, conditionalizing on it will not decrease $P[ONA \mid UG]$, the probability of our ONA statement given $UG$. So $(*)$ is greater than or equal to $P[ONA \mid UG]\,P_{\overline{A}\,\overline{B}}[UG] - P[ONA \mid UG]\,P[UG]$, which by our definitions is

$$P[ONA \mid UG] \cdot (P[UG] + \varepsilon) - P[ONA \mid UG]\,P[UG],$$

for $\varepsilon > 0$. By simple algebra, the $P[ONA \mid UG]\,P[UG]$ terms cancel each other out, thus completing the demonstration that $(\overline{A}a\,\overline{B}a)$ confirms its related ONA by a degree of at least (the positive number)

$$P[ONA \mid UG] \cdot \varepsilon \qquad \text{(C-ONA)}$$

Hence, the standard assumptions that Bayesians make in their solution to the paradox of confirmation entail that the seemingly irrelevant proposition expressed by $(\overline{A}a\,\overline{B}a)$ positively confirms its corresponding ONA statements. According to their solution then, the proposition expressed by 'a is a brown bookcase' confirms the propositions expressed by statements like 'If I were free tonight, I would be out dancing', 'carbon materials submitted to 60 kilobars of pressure at a temperature of 1000°C have a disposition to become diamonds', or 'travelling at speeds in excess of the speed of sound causes a sonic boom'! But the $(\overline{A}a\,\overline{B}a)$ propositions arrived at by observing brown bookcases do not confirm such statements. While accepting the paradoxical conclusion that negative instances confirm their corresponding generalizations is mildly disquieting when the generalizations are material, it is far more unsettling, if not outright disturbing, when the generalizations are replaced by ONA statements.

35 See the defence of premise (1), Argument 1 for details. Note that the A and B terms of both the negative instances and the corresponding material generalization may not be the same as the terms in the ONA. Rather, they may merely be related via the analysis of the ONA.
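The algebra behind the lower bound of this section can be illustrated numerically. The sketch below is a toy check under stated assumptions, not a proof: it presupposes that the ONA statement entails UG (so that the ONA has probability zero when UG is false), and the function name and all numerical credences are hypothetical. It computes the difference between the posterior and prior probability of the ONA statement and compares it with the probability of the ONA given UG multiplied by the boost that the negative instance gives to UG.

```python
from fractions import Fraction

def ona_confirmation(p_ug, boost, p_ona_given_ug, p_ona_given_ug_after=None):
    """Posterior minus prior probability of the ONA statement.

    Assumes the ONA entails UG, so P[ONA] = P[ONA | UG] * P[UG] both before
    and after conditionalizing. If no posterior conditional credence is given,
    UG is assumed to screen the evidence off from the ONA.
    """
    if p_ona_given_ug_after is None:
        p_ona_given_ug_after = p_ona_given_ug
    prior = p_ona_given_ug * p_ug
    posterior = p_ona_given_ug_after * (p_ug + boost)
    return posterior - prior

# Hypothetical credences.
p_ug = Fraction(1, 2)            # prior probability of UG
boost = Fraction(1, 1000)        # extra probability the negative instance lends UG
p_ona_given_ug = Fraction(3, 5)  # probability of the ONA given UG

lower_bound = p_ona_given_ug * boost
with_screening = ona_confirmation(p_ug, boost, p_ona_given_ug)
without_screening = ona_confirmation(p_ug, boost, p_ona_given_ug, Fraction(7, 10))
print(lower_bound, with_screening, without_screening)
```

When UG screens off the evidence the confirmation equals the bound exactly; when conditionalizing raises the probability of the ONA given UG, the confirmation exceeds it.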

4.4 A Counterargument to the Generalized Paradox

In order to evade the generalized paradox above, Bayesians that adopt the standard solution to the paradox might try to generalize their solution along familiar lines.

They might attempt to explain away the above results as follows:

Argument 2. A Generalization of the Standard Solution.

9. We may assume the Bayesian framework as it applies to confirmation since it boasts an impressive track record and is fairly intuitive in its own right. Moreover, both the assumption that (i) the probability that an object drawn at random from the population is a raven given that it is non-black is equal to the probability that that object is a raven, and the assumption that (ii) there are far more non-black things than ravens, are reasonable enough.

10. If (9), then (11) (by §6, Proposition 3).

11. The degree of paradoxical confirmation in the canonical case is in fact minute. The standard solution provides a satisfactory explanation of the paradoxical confirmation of universal generalizations by their negative instances.

12. The degree of paradoxical confirmation entailed by (9) in the generalized paradox is also minute (by Equation C-ONA above, the fact that $\varepsilon$ is minute and the fact that $P[ONA \mid UG] \leq 1$).

13. If (12), then the confirmation bestowed upon ONA statements is acceptable (by the reasoning in premises (9)-(11)).

The above argument attempts to show that on the assumption that the standard solution is successful, the generalization of the standard solution will be too.

4.4.1 Problems for the Generalized Standard Solution

The general thrust of this argument pushes from the success and intuitiveness of the Bayesian program against the counter-intuitiveness of the Paradox as quantitatively reduced by Bayesian methods. It must be noted that the paradox has not been eliminated, only reduced. As Hempel pointed out, a "small paradox" is still a problem (Hempel, 1965, p. 48). Hence, in order for the solution to be successful, both the Bayesian framework and the assumptions needed to establish (11) must have enough initial plausibility and theoretic utility to withstand the anomaly. In order for the generalized solution to be successful, the assumptions in (9) need to have enough initial plausibility and theoretic utility to withstand not only the "small paradox", but also the additional burden of strange ONA confirmation.

It must also be remembered that neither (EQUIV) nor (NC) can be called upon to provide support for the Bayesian conclusion since neither needs to be invoked in the Bayesian demonstration of the conclusion. Indeed, the latter principle, (NC), should be flat out rejected from a Bayesian point of view.36 I have argued that Bayesians must accept that the positive instances of, not ONA statements, but rather the material generalizations entailed by ONA statements, confirm their corresponding ONAs. The intuitive (but false) Nicod's Condition cannot lend any support for the "intuitive modus ponens" that is at work here.37 Bayesians can rely on the much less intuitive correlate of (NC) spelled out above which factors into the demonstration of (C-ONA). However, one should be more leery in this case of letting it bear much weight.

The remaining premises are also tendentious. Vranas (2004) has argued convincingly that there is a lacuna in the justification for the strong independence assumption at play in premise (9). In particular, Vranas' extensive survey of the literature has revealed that scarcely any argument whatsoever, and certainly no satisfactory argument, has been advanced in favour of the independence assumption, $P[Aa \mid \overline{B}a] = P[Aa]$, used in the standard Bayesian solution to the Paradox of Confirmation.

36I must thank Jean-David Lefrance of the City University of New York for bringing this point to my attention with respect to (NC). He presented this point as a commentator for an ancestor of this thesis.

37See the corresponding section of this thesis, §4.1.2: Against Nicod's Condition, for the relevant arguments.

The "correct" Bayesian framework of confirmation is also a matter of live debate (Fitelson, 1999). Hence, the intuitive support garnered by any given measure, which could in turn be transferred to the conclusion, is partial at best. One could perhaps argue that there is some overarching Bayesian framework that is well supported and that framework will transfer its support through the premises of Argument 2 since the most common confirmation measures can make use of the standard solution.38 One worry for this line of argument is that there are some (non-standard) measures for which the standard solution fails and, given that these measures exist, it is not clear how one could carve out an overarching Bayesian framework in a non-circular manner. Worse, Argument 2 is valid only if we hold the measure constant throughout the premises and that measure will only be partially supported in the face of other viable alternatives. These considerations do not constitute a knock-down argument against the support of (9) but they surely weaken the support it transfers to Argument 2's conclusion.

Though I have not seen any denial of premise (10) in the literature, counterexamples to it too may be generated on the grounds that its antecedent lacks the requisite breadth to establish its consequent. The argument in which premise (10) is embedded provides a resolution to the paradox phrased in terms of the canonical predicates '... is a raven' and '... is black'. This would be unproblematic if the predicates were mere place holders in the argument; however, they are not. Facts about the denotation of the predicates are employed to establish the antecedent of (10) in a way that is not guaranteed to generalize and, in fact, does not generalize.39 To see the force of this worry let us step away from the paradox as framed by ornithology and consider it from the point of view of an entomologist. Let 'A' be the predicate '... is a metapleural gland holder',40 'B' be the predicate '... is an ant' and let our domain of interest be the class of insects $I$. Now assume that $\forall x(Ax \supset Bx)$ is the hypothesis h we wish to confirm and let us compare the degree to which an instance, $(AaBa)$, of h bestows confirmation on h with the degree to which a negative instance $(\overline{A}a\,\overline{B}a)$ confirms h. It is an empirical fact that just over half of the insects on the planet observed so far have been metapleural-gland-holding ants, hence $P[AaBa] > 0.5$ and $P[Aa\,\overline{B}a]$ is small; h is well confirmed. All metapleural-gland-holders observed so far have been ants and all but a few species of ants are metapleural-havers. But now we can define a large class of fairly realistic constraints on our probability measure given the facts about the case at hand for which $P[h \mid AaBa] < P[h \mid \overline{A}a\,\overline{B}a]$41 and so on most Bayesian measures of confirmation $c[h, AaBa] < c[h, \overline{A}a\,\overline{B}a]$. But this is absurd. Even if we are willing to accept that seemingly irrelevant observations slightly confirm a hypothesis, it is another matter entirely to have a view that entails that seemingly irrelevant observations provide better confirmation for a hypothesis than its positive instances for a large class of cases. Note also that the problem is general; it is not due to some peculiarity of the chosen universe of discourse. If we were to formulate a hypothesis in terms of some ubiquitous fundamental matter (like "dark energy" or "dark matter") we would obtain similar results without choosing a restricted domain (i.e. $I$).42

38 Theorem (1) of the appendix, §6, shows that the standard solution can be employed for any of the measures advanced in section 3.1 of this thesis.

39 I must here point out that Roger Clarke has very recently come to the same conclusion in his work (Clarke, 2009). His example is structurally similar to the one I am about to give. However, my presentation was developed independently of Clarke's and was presented at the Annual CUNY Graduate Conference, 2009, before its publication.

40 Metapleurals are glands thought to be unique to ants that produce and secrete an antibiotic agent that prevents bacteria and fungus from developing both on the ants themselves and within their nests (Hölldobler & Wilson, 1990).

41 See §6, Proposition 5 for the details.
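A toy calculation shows how the reversal described above can arise. The sketch below is not the construction of §6, Proposition 5, which is not reproduced here; it is an independent illustration under loudly hypothetical assumptions. It supposes a population of 100 objects with fixed numbers of As and Bs, a single rival hypothesis on which k of the As are non-Bs, a prior of 9/10 for h, and random sampling from the whole population. The function names and every figure are invented for the example.

```python
from fractions import Fraction

def cell_counts(N, n_a, n_b, k):
    """Counts of the four outcomes when n_a objects are A, n_b are B and k As are non-B."""
    ab = n_a - k
    a_nb = k
    na_b = n_b - ab
    na_nb = N - ab - a_nb - na_b
    return {"AB": ab, "A~B": a_nb, "~AB": na_b, "~A~B": na_nb}

def posterior_h(prior, N, n_a, n_b, k, outcome):
    """P[h | outcome], where h says k = 0 and the only rival says exactly k As are non-B."""
    like_h = Fraction(cell_counts(N, n_a, n_b, 0)[outcome], N)
    like_rival = Fraction(cell_counts(N, n_a, n_b, k)[outcome], N)
    return prior * like_h / (prior * like_h + (1 - prior) * like_rival)

prior = Fraction(9, 10)

# Raven-like case (hypothetical): 5 As, 20 Bs, rival posits 2 non-B As.
raven_positive = posterior_h(prior, 100, 5, 20, 2, "AB")
raven_negative = posterior_h(prior, 100, 5, 20, 2, "~A~B")

# Ant-like case (hypothetical): 55 As, 60 Bs, rival posits 10 non-B As.
ant_positive = posterior_h(prior, 100, 55, 60, 10, "AB")
ant_negative = posterior_h(prior, 100, 55, 60, 10, "~A~B")

print(raven_positive, raven_negative)
print(ant_positive, ant_negative)
```

In the raven-like case the positive instance yields the higher posterior, while in the ant-like case, where As outnumber non-Bs, the ordering flips and the negative instance does better.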

Taken together the above shortcomings present a serious problem for the Bayesian who endorses the standard solution to the Paradox of Confirmation. On the other hand, their general strategy is not lacking in prima facie plausibility. I turn therefore to some strategies to develop the standard proposal along lines not yet ruled out in the hopes of getting to the underlying source of the appeal.

4.4.2 A Reinforcement of the Generalized Solution via Backwards Induction.

In what follows I will develop what I take to be one of the strongest lines of argument that proponents of the standard solution and its generalized extension have at their disposal to ground their position. The said argument will proceed by backwards induction on the impact of confirmatory instances in a way that will be made clearer shortly. However, any brief cause for celebration at the Bayesian camp is short lived since I go on to show that the given argument in favour of the standard solution fails to hold. The criticism will bring to light additional conditions that,

42 It is worth mentioning that there is at least one well-known but non-standard solution to the Ravens Paradox that sidesteps this worry. (Earman, 1992, p. 69-73) shows that given the standard assumptions about the distribution of ravens and non-black things in the world, (i) the probability of the ravens hypothesis h, given as evidence that an object a drawn from the population of ravens is found to be black, is larger than the probability of h given as evidence that another object p drawn from the population of non-black things turns out to be a non-raven, exactly when (ii) the probability that a is a non-black thing given that it was drawn from the population of ravens and $\neg h$ holds is larger than the probability that p is a raven given that it was drawn from the population of non-black things and $\neg h$ holds. Simplifying slightly, what Earman has argued is that sampling from the sub-population that is more likely to produce falsifiers (the class of ravens in this case) provides more confirmation than does sampling from the sub-population that is less likely to produce falsifying evidence. Now if you add to this conclusion the premise that falsificationism is a compelling methodology, then it turns out that the ants case above is not counterintuitive since, in this case, one is more likely to find a falsifier to the metapleural-gland hypothesis among the non-ants. Though this strategy does sidestep the worry posed by the ants hypothesis, its credibility depends on how compelling one finds the falsificationist methodology. I leave it to the reader to decide one way or the other. No matter the side on which one falls, the solution does not bear on what follows and is in any case distinct from the "standard" solution under examination. What is more, it might be thought that its "two stage" sampling process does not bear on cases where objects are verified for both ravenity and blackness in the same trial.

were they to hold, would provide a means of answering the generalized paradox; but I end the subsection by arguing that these additional assumptions do not obtain.

One might propose to bolster the standard solution to the Ravens Paradox by contemplating some (very) rough and ready observations: that probabilities, which are measure functions, are sensitive to "class size" and that our judgements about "class size" are often informed by relative frequencies. To illustrate how these ideas apply, consider again the universal hypothesis $h = \forall x(Ax \supset Bx)$. One way to completely verify h is to round up all of the As and see if they are Bs. This is uncontroversial. Another way would be to round up all of the non-B objects and check if any of them is an A; after all, if we have looked at all of the non-B objects without coming across a single A, then anything that is an A must be in the pile of Bs. But this is just a limit case of induction. If we look at all of the non-black objects, save one, it seems that we will still be justified in our conclusion, $\forall x(Ax \supset Bx)$, just a little less justified than we were in the limit case. It also seems that we can repeat this step again and again, each time losing a bit of justification. What all of this seems to suggest is that confirmation should be related to the size of the class examined as compared to the total class size, i.e. related to the relative frequencies observed. This rough argument far from shows that confirmation is equal to the relative frequency, but it may be thought to bolster the case of those who claim that negative instances of a hypothesis are confirmatory.43

Unfortunately for the proponent of the standard solution (and its extension to

43 Solutions along these lines are proposed in the literature as early as (Mackie, 1963). Many since have followed suit. Unfortunately, the above explanation still leaves open the exact relation between observed relative frequencies and confirmation. In particular, it does little to assuage the worry for Bayesian confirmation theory raised in connection with the entomology example. It still would seem odd for a Bayesian to count it amongst the virtues of her theory that sometimes observations that are seemingly irrelevant to a hypothesis provide more confirmation to that hypothesis than its positive instances. This latter point is also discussed in (Laetz, 2007).

the generalized case), this type of backwards induction is prone to serious counterexample. For instance, consider the hypothesis $h_s$, that all snakes reside outside of Ireland. Now assume that the evidence

$$c[h, k \cdot e_-] > c[h, k \cdot (e_- - n)]. \tag{$\ddagger$}$$

Let n be the observation of a reptile that, according to our background knowledge k, has as its evolutionary ancestor some species of snake or other. Suppose further that this species has ecological needs similar to those of many types of snake and that its ancestor is found wherever it is found in all other observed ecosystems. With these details properly spelled out, it seems that removing this observation from our evidence set $e_-$ fails to remove justification from $h_s$. Hence, $c[h_s, k \cdot e_-] < c[h_s, k \cdot (e_- - n)]$, violating ($\ddagger$). Stated more broadly, the removal of a negative instance n of a generalization h from the evidence e need not diminish

the support of h.44 Similarly, if our evidence $e_+$ is the sum of positive instances of our hypothesis $h_s$, removing a single positive instance p from our evidence set need not diminish the support for $h_s$. For instance, if according to our background knowledge k there is a mass of snakes on a plane headed for Ireland and we let p

44 As noted in (Clarke, 2009, pp. 11-12), the specification of the case details must be relegated to the background evidence for these counterexamples to function. If the additional information about the snake in question were included in n then n would no longer be a simple positive instance. A fortiori, such counterexamples cannot be formulated if the background knowledge k is tautological as in Hempel and Carnap's theories of confirmation.

be the observation statement that one of those snakes resides outside of Ireland, it follows that p is a positive instance of the hypothesis that all snakes reside outside of Ireland. However, if knowing that p has any effect on $h_s$, it diminishes its support

when conditionalized upon.45 Therefore, $c[h_s, k \cdot e_+] < c[h_s, k \cdot (e_+ - p)]$. In either case, backwards-induction-type defences of the standard solution fail to hold in full generality. The strategy might even be questioned in the ravens example, since observing a raven that is a borderline shade of black might countersupport the hypothesis that all ravens are black. Moreover, when the counterexamples are taken into consideration the overall ineffectiveness of the strategy becomes apparent. There does not appear to be a relevant set of nontrivial constraints that could separate the cases for which the backwards induction argument is sound from the cases where it is not. Hence, the strategy will be relatively unhelpful even when we can find cases to which it applies.

It should also be noted that the above explanation does not apply as cleanly to the generalized solution to the Paradox of Confirmation. First of all, observing all of an ONA's (actual) negative instances rarely provides it with maximal confirmation. Hence, the initial premise that we can fully justify our hypothesis by examining the right class of actual instances fails in the generalized case. The finitistic character of the standard solution to the "Ravens" Paradox also invites a false sense of comfort. Indeed, the narrowness of the standard example is most evident in light of the Generalized Paradox. The range of an ONA statement can be non-denumerable even

45 In the above argument I suggest counterfactually removing an observation from the set of total observations. There is extensive literature on the difficulties inherent to such counterfactual excisions. So much so in fact that they are grouped under the common heading of "the old evidence problem". I do not wish to take a stand here on any particular method of excision. However, I am willing to claim that such counterfactual assessments are possible since they are ubiquitous in both everyday speech and even in more rigorous modes of enquiry. For an overview of the "old evidence" literature see (Howson & Urbach, 2006).

when the actual world only contains a finite number of its negative instances.46 In such domains, the simplistic justification for the standard view via a rough form of backwards induction fails: the justification of a hypothesis provided by every instance but one need not differ from the justification of a hypothesis provided by every instance but two. Indeed, the phrase 'every instance but ...' begs elucidation in transfinite cases, since probabilities in non-countable domains require careful definition of both field and measure.

However, the above cardinality worry is merely superficial. Those who wish to intuitively ground an extension of the standard solution to the generalized paradox are not without the resources to give a similar "intuitive justification". If conditionalizing on the proposition expressed by '$\forall x(Ax \supset Bx)$' affords the appropriate ONA statements some support, then we can bolster the generalized solution using considerations similar to those used to support the standard solution. For if learning that the universal generalization holds in fact provides support for an ONA analysis, and it is the case that we have learned it via thorough examination of either its negative or positive instances directly, it would be strange to think that it was only in virtue of a singular final observation that support for the ONA statement accrued, observations up until that point failing to incrementally support the ONA statement. Rather, it seems that there should still be incremental support for the ONA statement in question without the final observation, perhaps just a little less than there would be if the proposition expressed by '$\forall x(Ax \supset Bx)$' were verified completely. Now, as we noted above, the fact that $\forall x(Ax \supset Bx)$ holds can be verified by observing every A and seeing that each is a B or by observing each of the non-Bs

46 For instance, current theory seems to underwrite the idea that for any observable state there is a non-denumerable set of possible underlying quantum states. By the counterfactual nature of ONA statements this seems to imply that the merely possible states in near enough worlds may also falsify the hypothesis even if none do at the actual world.

and noting that they are all non-As. Putting these observations together we should expect that $(\neg Aa \cdot \neg Ba)$47 incrementally confirms that $\forall x(Ax \supset Bx)$, which in turn lends support to the appropriate ONA statements.

The above intuitive outline has a form similar to the considerations advanced in favour of the original solution to the paradox. The differences between the two are (i) that in the original case the amount of support garnered by observing all of the universal generalization's negative instances was total rather than incremental; and (ii) that the defence of the generalized solution depends upon the additional assumption that learning that $\forall x(Ax \supset Bx)$ provides a boost in confirmation to its corresponding ONA statement.

To my mind, the first of these two discrepancies seems innocuous. However, there may be some problems lurking for (ii). In particular, if one has the intuition that instances unrelated to an ONA do not confirm statements of that ONA, then it is not clear that learning that $\forall x(Ax \supset Bx)$ will provide positive incremental support for ONA statements that entail it. We intuitively think that confirming the underlying connection between an ONA statement's terms is an essential part of ONA statement confirmation, whereas the connection is less tight in the case of a mere universal statement. It is a fairly uncontroversial fact that ONA statements admit of positive support. For example, appropriate controlled experiments often serve to incrementally confirm ONA statements. But it must be noted that in cases like these, negative instances of their entailed universal generalizations play no role. It is then natural to think that learning that $\forall x(Ax \supset Bx)$ is neither necessary nor sufficient for confirming AB-relating ONA statements. Learning that $\forall x(Ax \supset Bx)$ holds in the process of confirming an AB-relating ONA should in many cases be viewed as

47 Or, if one doesn't think that the generalization less (AaBa) alone will diminish confirmation we could use V(x

a mere byproduct of ONA statement confirmation that is inessential to the confirmation of the ONA statement in question. But, if learning that $\forall x(Ax \supset Bx)$ does not necessarily support its ONA statements, then the generalized solution to the ravens paradox cannot be bolstered using the above line of reasoning.

One should be leery of putting too much weight on the observation that $c[\mathrm{ONA}, \mathrm{UG}] > 0$ may fail to hold in full generality, since whenever the ONA statement in question entails the corresponding universal generalization UG, that $c[\mathrm{ONA}, \mathrm{UG}] > 0$ holds is a theorem of the probability calculus.48 Hence, to deny that $c[\mathrm{ONA}, \mathrm{UG}] > 0$ holds in this circumstance is to entirely give up on there being a probabilistic notion of confirmation49, a rather radical thesis. On the other hand, maybe the preceding observations culminate in showing that a rather radical view of confirmation is called for. Maybe confirmation is not probabilistic and therefore should not be analyzed as though it were. However, even if the more extreme thesis that confirmation is not probabilistic fails to hold, a sufficient case has still been mounted against the defence of the standard view via backwards induction.

4.4.3 Getting our Ravens in a Row

Let us take stock of what we have learned in the attempt to generalize the solution to the Paradox of Confirmation. The argument began by listing some intuitive premises from which it was derived that the counterintuitive confirmation in the standard solution was minute. The argument then proceeded to show that the same set of premises entailed that the counterintuitive confirmation in the generalized case was also minute. I noted that many, if not all, of the intuitive premises failed to hold in full generality and perhaps fail in the case at hand. I then presented and assuaged worries that might have been raised for the justification of the Generalized Paradox due to differences in the cardinality between the justification of the standard solution and its generalized counterpart.

48 The former observation follows from trivial application of Theorem 2 whenever the probabilities of the ONA statement and UG fall strictly between 0 and 1.

49 Or, at least, to endorse this conclusion is to give up on any probabilistic notion of confirmation that employs a traditional axiomatization of the probability calculus. See (Sylvan & Nola, 1991) for a defence of non-traditional theories of probability in relation to problems that arise in confirmation theory.

The upshot to take away from this discussion is that the flow of explanation in the standard defence threatens to be turned on its head when expanded to solve the Generalized Paradox. The strong intuitions we have in favour of (NC) and (EQUIV) are not appealed to by the Bayesian statement of the Paradox; hence they cannot be relied upon to bolster their solution. The intuitiveness and fruitfulness of the Bayesian framework itself does, I think, outweigh the "minute" degree of unexpected confirmation for certain universal generalizations by their negative instances.50 However, since the confirmation of ONAs by their negative instances is prima facie much stranger than in the case of a universal generalization, that it is shown to follow from the well-supported Bayesian framework may not be enough to assuage the worry. Firm intuitions may transfer support across a conditional with a mildly surprising conclusion, but the modus ponens inference risks changing into a case of modus tollens when the conclusion becomes more radical. Of course, like many arguments from intuition, these results are not decisive, but that does not mean that they are without substantive pull. A convincing solution to the Paradox of Confirmation should not create as adverse a reaction as the Paradox itself.

50 Though, as we have observed, the "minuteness" of the confirmation that negative instances afford their hypotheses is a function of both the domain on which they operate and the hypothesis in question. That such confirmation is minute is a peculiarity of the Ravens example that does not easily generalize.

4.5 Weakenings of the Standard Solution

In generating the generalization of the paradox of confirmation, as well as in my various critiques of the standard solution, I have put some standard Bayesian assumptions to work. The assumptions that Bayesians employ in their solution might be thought to be more than a little heavy handed: assumptions about our credence weightings, strong independence assumptions, et cetera. It might therefore be wondered whether the Bayesian who adopts the standard tack can weaken her assumptions and still obtain the same results. Moreover, if she can obtain similar results by imposing weaker constraints, will this solution evade the problems generated by the stronger assumptions employed in the Standard Solution to the Paradox of Confirmation?

In this section I will examine these questions. In particular, I will flesh out a weakening of the standard solution found in (Fitelson & Hawthorne, Forth.), which provides necessary and sufficient conditions to solve the Paradox of the Ravens that are in the spirit of the standard solution but involve fewer substantive assumptions. After briefly stating their results and examining the assumptions employed to secure the said results, I go on to demonstrate that their solution to the Paradox still succumbs to the novel problems that I introduced above. In particular, I argue that the Generalized Paradox still goes through on the weakened assumptions, as does the problem of the ants inversion of §4.4.1.

Let us begin with some preliminary definitions and assumptions made in (Fitelson & Hawthorne, Forth.). Let k be our background knowledge and h be the hypothesis that all ravens are black, $\forall x(Ax \supset Bx)$. Fitelson and Hawthorne assume that the following non-triviality assumptions hold:

$$0 < P[h \mid Ba \cdot Aa \cdot k] < 1; \qquad 0 < P[h \mid \neg Ba \cdot \neg Aa \cdot k] < 1 \tag{TA}$$

They then define p to be the likelihood that a turns out to be black given that both h is false and a is a raven, q to be how much more likely it is that a be a non-black thing than a raven given that h is false, and r to be how much more likely it is that a be a non-black thing than a raven given that h is true. In symbolese,

$$p = P[Ba \mid Aa \cdot \neg h \cdot k]$$

$$q = P[\neg Ba \mid \neg h \cdot k] \,/\, P[Aa \mid \neg h \cdot k]$$

$$r = P[\neg Ba \mid h \cdot k] \,/\, P[Aa \mid h \cdot k].$$

Given (TA), Fitelson and Hawthorne show that $0 < (1 - p) < q$. Moreover, they deduce that $q - (1 - p) > p \cdot r$ is a necessary and sufficient condition for the observation of a black raven, $(Ba \cdot Aa)$, to support h more than the observation of a non-black non-raven, $(\neg Ba \cdot \neg Aa)$. Again, in symbolese,

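$$\frac{P[Ba \cdot Aa \mid h \cdot k] \,/\, P[Ba \cdot Aa \mid \neg h \cdot k]}{P[\neg Ba \cdot \neg Aa \mid h \cdot k] \,/\, P[\neg Ba \cdot \neg Aa \mid \neg h \cdot k]} > 1 \;\leftrightarrow\; q - (1 - p) > p \cdot r \tag{E-COND}$$ 51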

How does this help the standard solution? In what way can this result be employed to refine the solution? Well, if we assume some standard strong independence assumptions, $P[\neg Ba \mid h] = P[\neg Ba]$ and $P[Aa \mid h] = P[Aa]$, it follows that q = r. Combine this with the assumption that the probability of seeing a non-black object is greater than the probability that that object is a raven, i.e. that q = r > 1 is satisfied, and $q - (1 - p) > p \cdot r$ also holds, as required. Note that if we had to make the same disputed independence assumptions that were employed in the strong standard solution, then this so-called "weakened solution" would be no improvement

51 For ease I will suppress the assumption that all probability assignments are relative to back- ground knowledge k for the remainder of the section.

after all, since it would rely on all of the same presumptions that we had earlier deemed to be unwarranted. However, the result will still obtain when the standard assumptions are relaxed: q = r > 1 can hold even when $P[\neg Ba \mid h] = P[\neg Ba]$ fails, and that q = r > 1 holds is alone sufficient to guarantee the desired conclusion.

It might be wondered whether it is realistic to assume that q = r. However, here too Fitelson and Hawthorne have a response. Since (TA) implies that $0 < (1 - p) < q$, and in the case of the ravens q is arguably quite large, it follows that $q - (1 - p)$ is just barely smaller than q itself. Hence, it remains plausible to assume that $q - (1 - p)$ is bigger than the fraction $p \cdot r$ of r, and so (E-COND) is satisfied, providing a weakened resolution of the paradox.
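To make the condition concrete, here is a minimal numerical sketch of (E-COND). The function name `e_cond_holds` and the particular values of p, q and r are illustrative assumptions; nothing in the text fixes them. The sketch only shows that, when q = r is large and p lies strictly between 0 and 1, the inequality comes out true.

```python
from fractions import Fraction

def e_cond_holds(p, q, r):
    """Return True when q - (1 - p) > p * r, i.e. when (E-COND) is satisfied."""
    return q - (1 - p) > p * r

# Hypothetical ravens-style values (assumptions, not taken from the text):
p = Fraction(9, 10)   # probability that a is black, given a is a raven and h is false
q = Fraction(1000)    # non-black things far more probable than ravens, given not-h
r = Fraction(1000)    # the same ratio given h, so that q = r > 1

# The consequence of (TA) cited in the text: 0 < 1 - p < q.
ta_consequence = 0 < 1 - p < q

# (E-COND): 1000 - 1/10 > 9/10 * 1000
ravens_ok = e_cond_holds(p, q, r)
```

Exact fractions are used so that the comparison is not affected by floating-point rounding.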

In fact, this solution sidesteps several of the oft-cited problems with the standard Bayesian solution. As we have just mentioned, this solution does not require that $P[\neg Ba \mid h] = P[\neg Ba]$. Sidestepping this assumption is an important move for more than just the reason that no successful arguments have been given for it. For, if we assume that $P[\neg Ba \mid h] = P[\neg Ba]$,52 $P[Aa \mid h] = P[Aa]$, $P[h] < P[h \mid Aa \cdot Ba]$, and $P[h] < P[h \mid \neg Aa \cdot \neg Ba]$, it follows that the observation of a black non-raven disconfirms the hypothesis that all ravens are black! This result has been criticized by many authors as the main failing of the standard solution.53 However, Vranas points out that the standard assumptions also imply that the amount of disconfirmation $(\neg Aa \cdot Ba)$ bestows on h is minute (Vranas, 2004, fn. 19).54 Hence, it would seem that

52 This assumption is fairly orthodox among standard-solutioners. (Howson & Urbach, 2006) is an example of an author we have discussed here that makes this assumption; for a more extensive list see (Vranas, 2004). It is also worth noting that, on the assumption that there are far more non-black things than ravens, one doesn't need this independence condition to prove that the degree of confirmation $c[h, \neg Aa \cdot \neg Ba]$ is minute (see (Vranas, 2004, p. 548) and Proposition 3 of this work, which is based on Vranas' result).

53 For a list of authors of said opinion see (Vranas, 2004, p. 551).

54 See Proposition 6. of the Appendix §6 for a reproduction of Vranas' result.

standard-solutioners could advance a line of argument that is similar to their solution to the paradox, namely, that the (dis)confirmation only seems counterintuitive because we are apt to confuse minute (dis)confirmation with no (dis)confirmation at all. Whether Vranas' solution is found satisfying or not, if it is possible to undercut an objection at the outset by weakening or eliminating some of the unnecessary assumptions presumed to be at play, it is clearly desirable to do so. Fitelson and Hawthorne have succeeded at accomplishing this end.

Hence, I readily admit that the steps that Fitelson and Hawthorne have taken are important steps forward for proponents of probabilistic solutions to the Paradox of Confirmation. However, their refined solution to the Paradox does little to assuage the worries that I have raised for the programme. First of all, their view still assumes that negative instances, $(\neg Ba \cdot \neg Aa)$, of the Ravens hypothesis confirm it, a key premise in my generalization of the paradox. Nor have they said anything that would rule out any supplementary assumption that I make in generalizing the Paradox of Confirmation. Hence my argument that standard-solutioners must allow ONA statements to be confirmed by seemingly unrelated instances will apply equally well to weakened-solutioners like Fitelson and Hawthorne. Moreover, their solution is still susceptible to the worry that I raise in connection with the hypothesis 'all metapleural gland holders are ants'. The weakened view still entails that negative instances will counterintuitively support a hypothesis more than its positive instances do. In fact, Fitelson and Hawthorne's necessary and sufficient condition (E-COND) provides an independent argument for the counterintuitive claim that the observation of a non-metapleural-gland-having non-ant confirms the hypothesis that all metapleural gland havers are ants better than the observation of a metapleural-gland-having ant.

The ants argument parallels that given by Fitelson and Hawthorne in favour of the more intuitive ravens conclusion. First note that, as in the ravens case, $r \approx q$, but unlike in the ravens case q < 1. Given the preceding it follows that $q(1 - p) < (1 - p)$, so $q - pq < 1 - p$ and hence $q - (1 - p) < pq$, thereby violating (E-COND), the necessary and sufficient condition for non-paradoxical confirmation. Again, observations of non-metapleural-gland-having non-ants provide more support for the hypothesis that all metapleural gland holders are ants than observations of metapleural-gland-having ants. Far from providing an intuitive means for dealing with the ants inversion, the weakened solution to the Paradox of Confirmation provides a quick route to the strange result.
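The algebra behind the inversion can be spelled out step by step. The following is a sketch under the idealizing assumption that r = q exactly (the text claims only that r is approximately q), together with 0 < p < 1 and 0 < q < 1:

```latex
\begin{align*}
q < 1 \ \text{and}\ 1 - p > 0
  &\;\Rightarrow\; q(1 - p) < 1 - p \\
  &\;\Rightarrow\; q - pq < 1 - p \\
  &\;\Rightarrow\; q - (1 - p) < pq \\
  &\;\Rightarrow\; q - (1 - p) < p \cdot r \qquad (\text{since } r = q \text{ by assumption}).
\end{align*}
```

The last line is the strict reverse of the right-hand side of (E-COND), so on these assumptions the condition cannot be met.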

The gravity and scope of the problems that I have raised for standard solutions to the Ravens Paradox should now be clear. Unless one of the plausible initial probability assignments is rejected, standard solutions must arrive at paradoxical conclusions on pain of probabilistic incoherence. Take another weakened 'solution' to the Ravens Paradox advanced by Hawthorne and Fitelson, for example. They show that for any predicates A, B and a randomly selected individual a, the premises (1) and (2) in the following argument allow one to conclude that $(Aa \cdot Ba)$ supports the hypothesis $h = \forall x(Ax \supset Bx)$ more than $(\neg Aa \cdot \neg Ba)$. More formally:

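Argument 3. Fitelson and Hawthorne's Second Weakened Solution. If

1. $P[h \mid \neg Ba] < P[h \mid Aa]$.

2. $P[\neg Ba] > P[Aa]$ or r > 1 or q > 1.

Then $$\frac{P[Ba \cdot Aa \mid h] \,/\, P[Ba \cdot Aa \mid \neg h]}{P[\neg Ba \cdot \neg Aa \mid h] \,/\, P[\neg Ba \cdot \neg Aa \mid \neg h]} > 1.$$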

However, here again, on the assumptions at work in the ants inversion, some premises must be rejected. (1) is at the very least contentious and, as we have seen in our discussion of the first weakened solution, the disjunction in (2) is arguably false. Moreover, the same non-triviality assumptions (TA) at play in the first solution are at play here, allowing the Paradox of Confirmation to be generalized. The puzzles that have been introduced are both robust and general. The weakened standard solutions in the literature are just as prone to new paradox.

4.6 The Obvious 'Solution'

There is another line of argument with prima facie appeal that we have yet to examine. The standard Bayesian line has been to accept the paradoxical result that $c[\forall x(Ax \supset Bx), \neg Aa \cdot \neg Ba] > 0$ and to explain it away. As we have seen, certain unsettling conclusions can be generated against those who adopt this strategy. It therefore seems natural to wonder why we should accept the paradoxical confirmation statement to begin with. To quote Vranas's grandmother, "it's called a 'paradox' because its conclusion is absurd" (Vranas, 2004, p. 546), so why not undercut the paradox from the start by assuming that $h = \forall x(Ax \supset Bx)$ is independent of $(\neg Aa \cdot \neg Ba)$, so that $P[h \mid \neg Aa \cdot \neg Ba] = P[h]$ and $c[h, \neg Aa \cdot \neg Ba] = 0$?

As far as I know, this strategy has not been discussed in the literature. Why has this obvious move failed to garner attention? One reason perhaps is that this assumption is tantamount to denying the intuitively plausible Nicod's condition. However, Bayesians are already committed to denying (NC) and, as I have argued above, for good reason. Hence, that this strategy rejects (NC) shouldn't stand in the way of Bayesians who wish to adopt this solution. Another reason that Bayesians might not want to pursue this strategy is that it is inconsistent with the other strong assumptions that they have made, namely: $P[\neg Ba \mid h] = P[\neg Ba]$ and $P[\neg Ba] \gg P[Aa]$.55 However, recent criticism of these assumptions in (Vranas, 2004) and (Fitelson & Hawthorne, Forth.) should at the very least make us leery of adopting them without substantive argument. Furthermore, these assumptions seem less firm than our intuition that the proposition expressed by 'a is a white shoe' does not confirm the statement that all ravens are black. It seems reasonable to assume that if one can embrace the dominant intuition, instead of having to explain it away, one should. To endorse this solution is to act in concert with this sound methodological maxim. Therefore, for the remainder of this section we will submit this solution to a long overdue examination.

55 That these assumptions contradict the claim that $c[h, \neg Aa \cdot \neg Ba] = 0$ follows from Proposition (3), since it shows that the assumptions entail that $c[h, \neg Aa \cdot \neg Ba] = \varepsilon > 0$ (§6). See (Maher, 1999; Vranas, 2004) for historical review and critical discussion of this problem.

I take it that the intuition that Bayesian solutions must try to explain away is the intuition that non-black non-ravens are confirmationally irrelevant to the hypothesis that all ravens are black. In symbols, the intuition that $c[h, \neg Aa \cdot \neg Ba] = P[h \mid \neg Aa \cdot \neg Ba] - P[h] = P[h] - P[h] = 0$. What follows from an analysis that embraces this intuition? Well, for starters, the argument for the Generalization of the Paradox of Confirmation, at least as I have outlined it, will not go through. The argument relied crucially on the fact that standard Bayesian solutions embrace the paradoxical confirmation result (§4). Indeed, the final upshot of the generalization was that the degree of confirmation that a negative instance $e = \neg Aa \cdot \neg Ba$ bestows upon a corresponding ONA statement is at least $P[\mathrm{ONA} \mid \mathrm{UG}] \cdot \varepsilon$, with $\varepsilon$ equal to the amount of additional confirmation e lends the intermediary universal generalization $h = \forall x(Ax \supset Bx)$. However, since $\varepsilon = 0$ on the Obvious Solution, no paradoxical confirmation of ONA statements has been shown.

Unfortunately, the solution still succumbs to the Ants Inversion, as nothing in my demonstration relied upon the (in)dependence of h on $\neg Aa \cdot \neg Ba$ (§6, Proposition 5). The Obvious Solution still entails that the observation of a non-metapleural-gland-having non-ant bestows more confirmation upon the hypothesis that all metapleural gland havers are ants than the proposition expressed by 'a is a metapleural-gland-having ant'. This solution, like the others canvassed, falls short of full generality.

This failure seems all the more clear when we consider cases in which it seems plausible that negative instances confirm their generalizations. Consider the hypothesis 'all marbles in the sack at time t are red' and a case where you have placed 25 objects selected at random from the collection of marbles and peas into a sack at time t. Suppose further that you have since drawn 24 non-red non-marbles out of the sack. Then it would seem that drawing another non-red non-marble would positively support the hypothesis. Since such cases exist, it would seem that principled criteria for the application of the independence strategy need to be advanced by its proponents. A proponent of the independence solution had better be able to explain when the independence condition fails to apply. Hopefully such an explanation would be made on the basis of some principled distinction that will help sort out how it might (or might not) apply to other cases as well.
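The sack case can be made vivid with a toy calculation. The sketch below rests on modelling assumptions that are not in the text: each of the 25 objects is taken to be, independently, a non-red marble with some fixed probability `p_bad` (here the hypothetical value 1/2 times 1/3), and the hypothesis is true just in case no object in the sack is a non-red marble. The function name is likewise hypothetical.

```python
from fractions import Fraction

def posterior_all_marbles_red(total, observed, p_bad):
    """Probability that no object in the sack is a non-red marble, given that
    `observed` of the `total` objects have been drawn and found to be
    non-red non-marbles (toy model: objects are independent)."""
    # Only the objects still in the sack can falsify the hypothesis.
    return (1 - p_bad) ** (total - observed)

# Hypothetical inputs: P(marble) = 1/2 and P(non-red | marble) = 1/3.
p_bad = Fraction(1, 2) * Fraction(1, 3)

before = posterior_all_marbles_red(25, 24, p_bad)  # after 24 draws
after = posterior_all_marbles_red(25, 25, p_bad)   # after the 25th draw
gain = after - before                              # support added by the last draw
```

On this model the final non-red non-marble raises the probability of the hypothesis from 5/6 to 1, so the negative instance is far from confirmationally irrelevant; other modelling choices would change the numbers but not the direction of the effect.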

Another serious problem with the Obvious Solution is that the strong independence assumptions that it makes are notoriously difficult to provide positive arguments for. As Vranas notes with respect to the assumption that $P[\neg Ba \mid h] = P[\neg Ba]$, even if one were to refute the claim that one's personal probability of finding a non-B thing should go up or down upon learning $h = \forall x(Ax \supset Bx)$, it would still not follow that $P[\neg Ba \mid h] = P[\neg Ba]$. At most, what would follow is that one is permitted to have a rational personal probability assignment such that the equality holds (Vranas, 2004, p. 550). Of course such a demonstration would also be compatible with it being permissible that one's personal probability of $\neg Ba$ could fluctuate up or down after learning that h holds. The above point also holds with respect to the independence assumption of the Obvious Solution; just replace every instance of $\neg Ba$ with an instance of $\neg Aa \cdot \neg Ba$ in the preceding.

Thus, despite the Obvious Solution's success with respect to the Generalization of the Paradox, it is far from being anything like a satisfactory solution. It succumbs to many of the problems that befall the standard solution in addition to the family of problems that arise as a result of stipulating a strong independence assumption as a normative constraint on belief functions.

5 Conclusion

In sum, I have argued that the objects of natural analysis are confirmation conducive within the Bayesian confirmation framework. I then put the argument to work, effectively generalizing the Paradox of Confirmation and undermining its canonical solution. I take it that what we should take away from the failure of the standard solution is that the metaphysics of science should not be taken for granted when we construct theories of confirmation.

If we have learned anything from Bayesian analyses of confirmation, and here I am on par, it is that the specification of background information is crucial to our concept of confirmation. Any satisfactory theory of confirmation needs to take into account our antecedent pool of facts, some of which are facts of the kind that the metaphysics of science studies. Metaphysical notions, and especially ONAs, are fertile grounds for testing confirmation theories since their domains are wider than those of their more mundane counterparts. Hence, I must disagree with Fitelson and Hawthorne when they say that "we think that [metaphysical facts, especially ones about natural kinds, have] few (if any) confirmational consequences. Confirmation is a logical or epistemological relation, which may or may not align neatly with metaphysical notions like causation or law-likeness" (Fitelson & Hawthorne, Forth., fn. 9). On the contrary, our background knowledge of ONAs places significant and robust constraints on our choice of probability function and any realistic confirmation function (whether or not confirmation should be analyzed in terms of subjective probability).

One might try to sidestep worries caused by esoteric entities by pointing to science's nominalistic preferences. However, until we cleanly underpin scientific discourse (which seems to blatantly call upon ONAs) in a parsimonious way, it is irresponsible to invoke preferences where arguments are called for. Until the triumph of anti-realism, it seems that we should want to provide a notion of confirmation that is robust enough to handle the basic metaphysical notions that are seemingly explicit parts of both scientific theory and practice. A good place to start then would be by not ignoring metaphysics when formalizing our epistemology.

6 Appendix

Theorem 1. Let the probabilities of $h$, $e$, $e$ given $h$ and $e$ given $\bar{h}$ be non-trivial (i.e. let $0 < P[h], P[e], P[e \mid h], P[e \mid \bar{h}] < 1$). If $d[h, e] > 0$, then $r[h, e] > 0$ and $l[h, e] > 0$.

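Proof. Assume $d[h, e] > 0$. Then

\begin{align*}
P[h \mid e] - P[h] > 0 &\Rightarrow \ln(P[h \mid e]) > \ln(P[h]) && \text{(by monotonicity of $\ln$)} \\
&\Rightarrow \ln(P[h \mid e] / P[h]) > 0 && \text{(by algebra)} \\
&\Rightarrow r[h, e] > 0 && \text{(def of $r[\,,]$)}.
\end{align*}

And

\begin{align*}
l[h, e] &= \ln(P[e \mid h] / P[e \mid \bar{h}]) && \text{(def of $l[h, e]$)} \\
&= \ln\left(\frac{P[h \mid e]\, P[e]}{P[h]\, P[e \mid \bar{h}]}\right) && \text{(by Bayes Rule)} \\
&> \ln(P[e]) - \ln(P[e \mid \bar{h}]) && \text{(by $r[h, e] > 0$)} \\
&= \ln\left(\frac{P[\bar{h}]\, P[e]}{P[\bar{h} \mid e]\, P[e]}\right) && \text{(by Bayes Rule)} \\
&= \ln(P[\bar{h}]) - \ln(P[\bar{h} \mid e]).
\end{align*}

But

\begin{align*}
d[h, e] > 0 &\Rightarrow (1 - P[\bar{h} \mid e]) - (1 - P[\bar{h}]) > 0 \\
&\Rightarrow P[\bar{h}] > P[\bar{h} \mid e] \\
&\Rightarrow \ln(P[\bar{h}]) - \ln(P[\bar{h} \mid e]) > 0.
\end{align*}
•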
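As an informal sanity check on Theorem 1 (not a substitute for the proof), the following Python sketch computes the three measures from a joint distribution over h and e, taking d = P[h | e] - P[h], r = ln(P[h | e] / P[h]) and l = ln(P[e | h] / P[e | not-h]). The function names, argument names and all numerical values are illustrative assumptions and do not come from the text.

```python
import math
import random


def measures(p_he, p_hne, p_nhe, p_nhne):
    """Return (d, r, l) for a joint distribution over h and e.

    The four arguments are the probabilities of the cells
    h & e, h & not-e, not-h & e, not-h & not-e (hypothetical names).
    """
    total = p_he + p_hne + p_nhe + p_nhne
    assert abs(total - 1.0) < 1e-9, "cell probabilities must sum to 1"
    p_h = p_he + p_hne
    p_e = p_he + p_nhe
    p_h_given_e = p_he / p_e
    p_e_given_h = p_he / p_h
    p_e_given_not_h = p_nhe / (p_nhe + p_nhne)
    d = p_h_given_e - p_h                        # difference measure
    r = math.log(p_h_given_e / p_h)              # log-ratio measure
    l = math.log(p_e_given_h / p_e_given_not_h)  # log-likelihood-ratio measure
    return d, r, l


def count_counterexamples(trials=1000, seed=0):
    """Count sampled distributions with d clearly positive but r <= 0 or l <= 0."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        # Strictly positive weights keep every probability non-trivial.
        weights = [rng.uniform(0.01, 1.0) for _ in range(4)]
        scale = sum(weights)
        d, r, l = measures(*(w / scale for w in weights))
        # The small threshold guards against floating-point noise near d = 0.
        if d > 1e-9 and not (r > 0 and l > 0):
            failures += 1
    return failures


d, r, l = measures(0.3, 0.1, 0.2, 0.4)
print(d > 0, r > 0, l > 0)
print(count_counterexamples())
```

A random sweep of this kind can only fail to find counterexamples; it is the proof above that establishes the result for every non-trivial distribution.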

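Theorem 2. If $h$ entails $e$, $0 < P[h]$ and $0 < P[e] < 1$, then $P[h \mid e] > P[h]$.

Proof. Assume that $h$ entails $e$, $P[h] > 0$ and $0 < P[e] < 1$. Then

\begin{align*}
P[h \mid e] &> P[h \mid e]\, P[e] && \text{(by $P[e] < 1$)} \\
&= \frac{P[h \wedge e]}{P[e]}\, P[e] && \text{(df of $P[h \mid e]$, $P[e] > 0$)} \\
&= P[h \wedge e] && \text{(by algebra)} \\
&= P[e \mid h]\, P[h] && \text{(df of $P[e \mid h]$, $P[h] > 0$ and algebra)} \\
&= 1 \cdot P[h] && \text{(by $[h \Rightarrow e]$)}.
\end{align*}
•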
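Theorem 2 can be illustrated in the same spirit. When h entails e, P[h and e] = P[h], so P[h | e] = P[h] / P[e]; the sketch below simply evaluates that quotient. The function name and the sample values are hypothetical.

```python
def posterior_given_entailed_evidence(p_h, p_e):
    """P[h | e] when h entails e, so that P[h and e] = P[h].

    Entailment also forces P[h] <= P[e]; inputs that violate the
    conditions of the theorem are rejected.
    """
    if not (0.0 < p_h <= p_e < 1.0):
        raise ValueError("need 0 < P[h] <= P[e] < 1")
    return p_h / p_e


for p_h, p_e in [(0.2, 0.5), (0.01, 0.99), (0.3, 0.3)]:
    posterior = posterior_given_entailed_evidence(p_h, p_e)
    # The posterior exceeds the prior in every admissible case.
    print(p_h, p_e, posterior, posterior > p_h)
```

Dividing a positive prior by a number strictly below one always increases it, which is all the theorem asserts.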

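Proposition 3. For non-trivial $h = \forall x(Ax \supset Bx)$ and $e = \bar{A}a\bar{B}a$, if $P[Aa] \ll P[\bar{B}a]$ and $P[\bar{B}a \mid h] = P[\bar{B}a]$, then $P[\forall x(Ax \supset Bx) \mid \bar{A}a\bar{B}a] - P[\forall x(Ax \supset Bx)] = \varepsilon$ for some minute $\varepsilon > 0$.

Proof. Assume that $P[Aa] \ll P[\bar{B}a]$ and $P[\bar{B}a \mid h] = P[\bar{B}a]$, and that $h$, $e$ are non-trivial.

Remark: First note that if $P[Aa] \ll P[\bar{B}a]$, then $P[\bar{A}a \mid \bar{B}a] = 1 - \delta$ for some minute $\delta > 0$.

Then, given our assumptions, it follows that:

\begin{align*}
P[h \mid e] - P[h] &= P[h] \left( \frac{P[\bar{B}a \mid h] / P[\bar{B}a]}{P[\bar{A}a \mid \bar{B}a]} - 1 \right) && \text{(by Bayes' Rule and algebra)} \\
&= P[h] \left( \frac{1}{P[\bar{A}a \mid \bar{B}a]} - 1 \right) && \text{(by $P[\bar{B}a \mid h] = P[\bar{B}a]$)} \\
&= P[h] \left( \frac{1}{1 - \delta} - 1 \right) && \text{(by 'Remark')} \\
&= \varepsilon > 0 && \text{(since $0 < \delta \ll 1$, while $P[h] \cdot (\tfrac{1}{1} - 1) = 0$)}.
\end{align*}
•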
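To see the size of the effect in Proposition 3, the following Python sketch builds a small joint distribution over h, Aa and Ba and compares P[h | e] - P[h], computed directly for e = "a is a non-black non-raven", with the closed form P[h](1/(1 - delta) - 1), where delta = P[Aa | not-Ba]. Every number is an assumption chosen for illustration (in particular the prior of 0.5 and the cell probabilities), selected so that the probability that a is non-black is the same with and without h.

```python
# Illustrative joint distribution over (h, Aa, Ba) for a single object a.
# Every number here is an assumption made for the example.
P_H = 0.5  # prior probability of h: "all ravens are black"

# P[cell | h] and P[cell | not-h]; a cell is (is_raven, is_black).
GIVEN_H = {
    (True, True): 0.001,
    (True, False): 0.0,      # h rules out non-black ravens
    (False, True): 0.099,
    (False, False): 0.9,
}
GIVEN_NOT_H = {
    (True, True): 0.001,
    (True, False): 0.0001,
    (False, True): 0.099,
    (False, False): 0.8999,
}


def prob(event):
    """Probability of the (h, is_raven, is_black) triples satisfying event."""
    total = 0.0
    for h, table in ((True, GIVEN_H), (False, GIVEN_NOT_H)):
        weight = P_H if h else 1.0 - P_H
        for (is_raven, is_black), p in table.items():
            if event(h, is_raven, is_black):
                total += weight * p
    return total


# e: a is a non-black non-raven.
p_e = prob(lambda h, a, b: not a and not b)
p_h_and_e = prob(lambda h, a, b: h and not a and not b)
direct = p_h_and_e / p_e - P_H  # P[h | e] - P[h], computed directly

# delta = P[Aa | not-Ba], so that P[not-Aa | not-Ba] = 1 - delta.
delta = prob(lambda h, a, b: a and not b) / prob(lambda h, a, b: not b)
via_formula = P_H * (1.0 / (1.0 - delta) - 1.0)

print(direct, via_formula)
```

With these assumed numbers the two quantities agree and are positive but tiny, which is the sense of "minute" at issue in the proposition.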

Theorem 4. $\forall x(Ax \mathrel{\Box\!\!\to} Bx)$ entails $\forall x(Ax \supset Bx)$.

DISCLAIMER In order to sidestep concerns that arise in the metaphysics of modality when considering issues of quantification, the following proof will not be carried out within any particular semantic or syntactic system. Fortunately, accepting the proof will only commit us to the mildest modal constraints. All that we will assume is that (1) when evaluated at a world $w$, the universal quantifier '$\forall$' ranges over at least the objects of $w$ (whatever 'the objects of $w$' turns out to mean) and that (2) for any conditional $\ulcorner \phi \mathrel{\Box\!\!\to} \psi \urcorner$ true at a world $w$, at all worlds which are, ceteris paribus, the same as $w$ and at which $\ulcorner \phi \urcorner$ holds, $\ulcorner \psi \urcorner$ is true also. This condition holds even in the weakest standard system of conditional logic C.56

Sketch of Proof. For the loose semantic argument, assume that $\forall x(Ax \mathrel{\Box\!\!\to} Bx)$ holds at a world $w$. Then by (1) it follows that $(Aa \mathrel{\Box\!\!\to} Ba)$ is true there for all $a$ of $w$. By (2), at all worlds which are, ceteris paribus, the same as $w$ and at which $Aa$ holds, $Ba$ is true also. Hence, assuming $w$ is, ceteris paribus, the same as $w$, it follows that if $Aa$ holds at $w$ then $Ba$ is true at $w$, so $(Aa \supset Ba)$ holds at $w$ for all $a$ of $w$. Thus, $\forall x(Ax \supset Bx)$ holds at $w$ as we wanted to show.

Correspondingly, a syntactic argument would run roughly as follows:

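\begin{align*}
\forall x(Ax \mathrel{\Box\!\!\to} Bx) &\Rightarrow (A\phi \mathrel{\Box\!\!\to} B\phi) && \text{(universal instantiation)} \\
&\Rightarrow (A\phi \supset B\phi) && \text{(conditional elimination)} \\
&\Rightarrow \forall x(Ax \supset Bx) && \text{(universal generalization)}.
\end{align*}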

Either way, the inference should be uncontroversial. •

56 See (Nute, 1984; Priest, 2001) for further discussion of standard conditional logics.

Proposition 5. Let $h$ be '$\forall x(Mx \supset Ax)$' and $P[h] > 0$.

Then if (i) $0 < P[\bar{M}a\bar{A}a] < P[MaAa] < 1$ and (ii) $P[Ma \mid h] < P[\bar{A}a \mid h]$ in accordance with the facts about metapleural-gland-holders and ants in our example, it follows that $P[h \mid MaAa] < P[h \mid \bar{M}a\bar{A}a]$.

Proof. Let (i) and (ii) hold.

Then $P[Ma \mid h] < P[\bar{A}a \mid h] < (P[MaAa] / P[\bar{M}a\bar{A}a])\, P[\bar{A}a \mid h]$, which implies

\begin{align*}
\frac{P[Ma \mid h]\, P[h]}{P[MaAa]} < \frac{P[\bar{A}a \mid h]\, P[h]}{P[\bar{M}a\bar{A}a]}
&\Rightarrow \frac{P[MaAa \mid h]\, P[h]}{P[MaAa]} < \frac{P[\bar{M}a\bar{A}a \mid h]\, P[h]}{P[\bar{M}a\bar{A}a]} && \text{(by the content of $h$)} \\
&\Rightarrow P[h \mid MaAa] < P[h \mid \bar{M}a\bar{A}a] && \text{(by Bayes Rule)}.
\end{align*}
•

Proposition 6. Vranas' (2004) proof that for non-trivial $h = \forall x(Ax \supset Bx)$, $e = (\bar{A}aBa)$:

If $P[\bar{B}a] \gg P[Aa]$, $P[Ba] \gg P[Aa]$, $P[Ba] / P[\bar{B}a]$ is non-minute, $P[Aa \mid h] = P[Aa]$ and $P[Ba \mid h] = P[Ba]$, then $c[h, e] = P[h \mid e] - P[h] = \varepsilon$, for some minute $\varepsilon < 0$.

Proof. First note that if $P[Aa \mid h] = P[Aa]$ and $P[Ba \mid h] = P[Ba]$, then

\begin{align*}
P[h \mid Ba\bar{A}a] &= P[Ba\bar{A}a \mid h]\, P[h] / P[Ba\bar{A}a] \\
&= (P[Ba \mid h] - P[Aa \mid h])\, P[h] / P[Ba\bar{A}a] \\
&= (P[Ba] - P[Aa])\, P[h] / P[Ba\bar{A}a].
\end{align*}

It follows that

\begin{align*}
(P[h \mid Ba\bar{A}a] - P[h])\, P[Ba\bar{A}a] / P[h] &= P[Ba] - P[Aa] - P[Ba\bar{A}a] \\
&= P[BaAa] - P[Aa] \\
&= -P[\bar{B}aAa].
\end{align*}

So we have it that $P[h \mid e] - P[h] < 0$. Finally, we note that $c[h, e]$ is minute since

\begin{align*}
c[h, e] &= P[h \mid e] - P[h] \\
&= -P[h]\, P[\bar{B}aAa] / P[Ba\bar{A}a] \\
&= -P[h]\, P[Aa \mid \bar{B}a]\, P[\bar{B}a] / (P[Ba] - P[BaAa]) \\
&= -P[h]\, P[Aa \mid \bar{B}a]\, P[\bar{B}a] / (P[Ba]\, [1 - P[Aa \mid Ba]]),
\end{align*}

which is "minute if the background knowledge ensures that $P[Aa \mid Ba]$ and $P[Aa \mid \bar{B}a]$ are minute but $P[Ba] / P[\bar{B}a]$ is non-minute (there are overwhelmingly more non-black objects and overwhelmingly more black objects than ravens, but at most many - not overwhelmingly - more non-black than black objects)" (Vranas, 2004, fn. 19). •
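The identity at the heart of Proposition 6 can also be checked numerically. The Python sketch below uses a flat joint table over (h, Aa, Ba), with values assumed purely for illustration and chosen so that P[Aa | h] = P[Aa] and P[Ba | h] = P[Ba]. It compares P[h | e] - P[h], for e = "a is a black non-raven", with -P[h] P[Aa and not-Ba] / P[not-Aa and Ba].

```python
# Illustrative joint distribution; keys are (h, is_raven, is_black).
# All values are assumptions chosen so that both independence
# conditions of the proposition hold.
JOINT = {
    (True, True, True): 0.5 * 0.001,
    (True, True, False): 0.0,            # h rules out non-black ravens
    (True, False, True): 0.5 * 0.2,
    (True, False, False): 0.5 * 0.799,
    (False, True, True): 0.5 * 0.0009,
    (False, True, False): 0.5 * 0.0001,
    (False, False, True): 0.5 * 0.2001,
    (False, False, False): 0.5 * 0.7989,
}


def prob(event):
    """Probability of the (h, is_raven, is_black) triples satisfying event."""
    return sum(p for key, p in JOINT.items() if event(*key))


p_h = prob(lambda h, a, b: h)

# Both gaps should vanish: P[Aa | h] - P[Aa] and P[Ba | h] - P[Ba].
raven_gap = prob(lambda h, a, b: h and a) / p_h - prob(lambda h, a, b: a)
black_gap = prob(lambda h, a, b: h and b) / p_h - prob(lambda h, a, b: b)

# e: a is a black non-raven.
p_e = prob(lambda h, a, b: b and not a)
c_direct = prob(lambda h, a, b: h and b and not a) / p_e - p_h
c_formula = -p_h * prob(lambda h, a, b: a and not b) / p_e

print(raven_gap, black_gap)
print(c_direct, c_formula)
```

Under these assumed values the two expressions coincide and are slightly negative, in line with the minute disconfirmation derived above.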

References

Alexander, H. G. (1958, November). The paradoxes of confirmation. The British

Journal for the Philosophy of Science, 9(35), 227-233.

Aronson, J. L. (1989, April). The bayesians and the raven paradox. Nous, 23(2),

221-240.

Choi, S. (2008, October). Dispositional properties and counterfactual conditionals.

Mind, 117 (468), 795-841.

Clarke, R. (2009, April). "The ravens paradox" is a misnomer. Synthese. (Published online 30 April 2009. DOI: 10.1007/s11229-009-9560-6)

Cohen, Y. (1987, March). Ravens and relevance. Erkenntnis, 26(2), 153-179.

Earman, J. (1992). Bayes or bust: A critical examination of bayesian confirmation

theory. Cambridge, Massachusetts: The MIT Press.

Fetzer, J. (1981). Scientific knowledge: Causation, explanation, and corroboration

(Vol. 69). Dordrecht: D. Reidel.

Fitelson, B. (1999, September). The plurality of bayesian measures of confirmation

and the problem of measure sensitivity. Philosophy of Science, 66 (Proceedings

Supplement), S362-S378.

Fitelson, B. (2006, February). The paradox of confirmation. Philosophy Compass,

1. (URL: http://dx.doi.org/10.1111/j.1747-9991.2005.00011.x)

Fitelson, B., & Hawthorne, J. (Forth.). How bayesian confirmation theory handles

the paradox of the ravens. In J. Fetzer & E. Eells (Eds.), The place of probability

in science. Chicago: Open Court.

Fraassen, B. C. van. (1980). The scientific image. New York: Clarendon Oxford

University Press.

Gillies, D. (2000, December). Varieties of propensity. British Journal for the Philosophy of Science, 51(4), 807-853.

Good, I. J. (1960, August). The paradox of confirmation. The British Journal for

the Philosophy of Science, 11 (42), 145-149.

Good, I. J. (1967). The white shoe is a red herring. British Journal for the Philosophy

of Science, 17(4), 322.

Harré, R. (1970). The principles of scientific thinking. London: Macmillan.

Hempel, C. G. (1965). Aspects of scientific explanation and other essays in the philosophy of science. New York: The Free Press.

Hitchcock, C. (2001, June). The intransitivity of causation revealed in equations

and graphs. The Journal of Philosophy, 98(6), 273-299.

Hitchcock, C. (2003). Causal generalizations and good advice. In H. E. Kyburg &

M. Thalos (Eds.), Probability is the very guide of life (chap. 10). Chicago: Open

Court.

Hitchcock, C. (2007, October). Prevention, preemption, and the principle of sufficient reason. Philosophical Review, 116(4), 495-532.

Hölldobler, B., & Wilson, E. O. (1990). The ants. Cambridge: Belknap Press.

Horwich, P. (1982). Probability and evidence. Cambridge: Cambridge University

Press.

Howson, C., & Urbach, P. (2006). Scientific reasoning: The bayesian approach. Peru,

Illinois: Open Court.

Hume, D. (1988). An enquiry concerning human understanding. Amherst:

Prometheus Books.

Laetz, B. (2007, October). Does the bayesian solution to the paradox of confirmation

really support bayesianism? (Paper presented at the annual Western Canadian

Philosophical Association Congress.)

Lewis, D. K. (1986a). Causation. In Philosophical papers (Vol. II, p. 159-172). New York: Oxford University Press.

Lewis, D. K. (1986b). Postscripts to 'causation'. In Philosophical papers (Vol. II,

p. 172-213). New York: Oxford University Press.

Lewis, D. K. (2000, April). Causation as influence. Journal of Philosophy, 97(4),

182-197.

Mackie, J. L. (1963). The paradox of confirmation. British Journal for the Philosophy

of Science, 13(52), 256-277.

Maher, P. (1999, March). Inductive logic and the ravens paradox. Philosophy of

Science, 66(1), 50-70.

Maher, P. (2004). Probability captures the logic of scientific confirmation. In

C. Hitchcock (Ed.), Contemporary debates in the philosophy of science (chap. 3).

Oxford: Blackwell Publishing.

Nute, D. (1984). Conditional logic. In D. Gabbay & F. Guenthner (Eds.), Handbook

of philosophical logic: Extensions of classical logic (Vol. II, chap. 2). Dordrecht:

Kluwer.

Priest, G. (2001). An introduction to non-classical logic. Cambridge: Cambridge

University Press.

Quine, W. V. O. (1969). Natural kinds. In Ontological relativity and other essays.

New York: Columbia University Press.

Rosenkrantz, R. (1977). Inference, method, and decision: Towards a bayesian philosophy of science. Dordrecht: Reidel.

Salmon, W. C. (1984). Scientific explanation and the causal structure of the world.

New Jersey: Princeton University Press.

Sober, E. (1984). Two concepts of cause. PSA: Proceedings of the Biennial Meeting

of the Philosophy of Science Association, 2, 405-424.

Strawson, P. F. (1952). Introduction to logical theory. London: Methuen.

Sylvan, R., & Nola, R. (1991). Confirmation without paradoxes. In G. Schurz & G. J. W. Dorn (Eds.), Advances in scientific philosophy: Essays in honour of Paul Weingartner (p. 5-44). Amsterdam: Rodopi.

Vranas, P. B. M. (2004, September). Hempel's raven paradox: A lacuna in the

standard bayesian solution. British Journal for the Philosophy of Science, 55(3),

545-560.

Woodward, J. (2003). Making things happen: A theory of causal explanation. Oxford: Oxford University Press.

Woodward, J., & Hitchcock, C. (2003, March). Explanatory generalizations, part i:

A counterfactual account. Nous, 37(1), 1-24.
