Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science
12-14 June 2012; Mugla, Turkey
Version: July 28, 2012 philsci-archive.pitt.edu
Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science 12-14 June 2012; Mugla, Turkey
This conference volume was automatically compiled from a collection of papers deposited in PhilSci-Archive in conjunction with Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science (12-14 June 2012; Mugla, Turkey).
PhilSci-Archive offers a service to those organizing conferences or preparing volumes to allow the deposit of papers as an easy way to circulate advance copies of papers. If you have a conference or volume you would like to make available through PhilSci-Archive, please send an email to the archive’s academic advisors at [email protected].
PhilSci-Archive is a free online repository for preprints in the philosophy of science offered jointly by the Center for Philosophy of Science and the University Library System, University of Pittsburgh, Pittsburgh, PA
Compiled on July 28, 2012
This work is freely available online at:
http://philsci-archive.pitt.edu/view/confandvol/confandvol2012sqfcmugla12jun2012.html
All documents available from PhilSci-Archive may be protected under U.S. and foreign copyright laws, and may not be reproduced without permission. Table of Contents
Page
Paulo Abrantes, Culture and Transitions in Individuality...... 1
Michael Bradie, The Moral Lives of Animals...... 19
Zal´an Gyenis and R´edei Mikl´os, Defusing Bertrand’s Paradox. ... 26
G´abor Hofer-Szab´oand P´eter Vecserny´es, Bell inequality and com- mon causal explanation in algebraic quantum field theory. .... 44
Brian L. Keeley, What kinds of kind are the senses? ...... 70
H. Kochiras, Newton on Matter and Space in De gravitatione et aequipondio fluidorum...... 101
Dan Nesher, G¨odel on Truth and Proof...... 118
John D. Norton, Einstein as the Greatest of the Nineteenth Century Physicists...... 142
Hern´an Pringe, From Kantian schematism to the system of ex- perience’s invariants: the coordination of concepts and spatio- temporal objects in Cassirer’s philosophy...... 152
David Schrader, Living Together in an Ecological Community. ... 174
Drozdstoj Stoyanov, The Pitt model of trans-disciplinary validity: challenges and prospects...... 193
Serife Tekin, Theorizing Looping Effects: Lessons from Cognitive Sciences...... 200
Derek Turner, The Relaxed Forces Strategy for Testing Natural State Theories: The Case of the ZFEL...... 234
Jan Wolenski, The Problem of Philosophical Assumptions and Con- sequences of Science...... 247
Gereon Wolters, Ambivalence and Conflict: Catholic Church and Evolution...... 261
i Seventh Quadrennial Fellows Conference of the Center for PhilosophyofScience -ii-
Zilh˜ao Ant´onio, Moore’s Problem...... 269
H. Kochiras, Newton on Matter and Space in De gravitatione et aequipondio fluidorum...... 283
Arto O. Siitonen, Reichenbach: from Kantianism to Logical Empiri- cism...... 296 12-14 June 2012; Mugla, Turkey -1-
CULTURE AND TRANSITIONS IN INDIVIDUALITY
Paulo C. Abrantes*
Several biologists and philosophers have been arguing, for a while now, that a Darwinian evolutionary dynamics might take place not only in the distribution of phenotypic traits in a particular kind of population, but also in the very dimensions that are used to track those, bringing about new kinds of populations, given certain special circumstances. These "major" evolutionary transitions have sometimes been described as transitions in individuality. In this depiction, natural selection (maybe combined with other causes) often brings about new kinds of individuals, whose evolutionary dynamics takes place in a novel way. This topic became a big concern since the groundbreaking works of Buss (1987), Maynard-Smith and Szathmáry (1997), and Michod (1999). Godfrey-Smith's 2009 book follows this trend by emphasizing that "evolutionary processes are themselves evolutionary products" (2009, 15). One of the chief thesis he puts forth, by pushing population thinking even further, is that a transition in individuality is fully accomplished when a new, "paradigmatic", Darwinian population emerges. In collective entities, where there are nested populations embodied in one individual, the higher and the lower level populations follow different evolutionary paths during a major transition: the latter ones usually change their Darwinian status from a "paradigmatic" to a "marginal" one. This process of "de-Darwinization" of the lower level populations - as Godfrey-Smith describes the evolutionary transition taking place at that level (Ibid., 100) -, can be tracked by significant changes in the values of a set of parameters that describe their evolutionary dynamics or "evolvability" (Ibid., 41). The process of de-Darwinization of the populations of cells that make up multicellular organisms is a well-known case. In this paper, I want to investigate
* Department of Philosophy and Institute of Biological Sciences, University of Brasília, Brazil. [email protected] Published version: http://www.cfh.ufsc.br/~nel/rumos.html
Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -2-
2
whether it is fruitful to describe the role that culture begins to play at some point in the Hominin lineage - arguably that of the emergence of a new inheritance system on top of the genetic inheritance system and coevolving with it -, as being a transition in individuality.
(I) REPRESENTING DARWINIAN DYNAMICS
Godfrey-Smith criticizes, in his book, previous attempts to give an abstract "summary" of the essential elements that are required for describing evolution in Darwinian terms (2009, 17). His way to open a new trail in what he calls the "classical approach" is to start with a "minimal concept" of a Darwinian population - which just requires that there be variation in the traits of individuals in a population that affect their reproduction and that part of this variation be heritable.1
The 'minimal concept' - associated with a "kind of change", evolution by natural selection - is permissive and includes much more than the paradigmatic cases of Darwinian populations (Godfrey-Smith, 2011, 67). To avoid the pitfalls of those attempts in the classical tradition, he aims to describe not only the purportedly paradigmatic cases of Darwinian populations, but also go into the marginal cases, that don't have all the features of the former ones. The particular way a kind of population located in this spectrum evolves depends on further features that are not specified by the minimal concept, requiring new parameters to describe its dynamics. In other words, the minimal concept provides just a "set up" and has to be complemented with "middle-level" theories or models to take into account the diversity of living beings and, more generally, of systems whose dynamics can be fruitfully described in populational-Darwinian terms (Ibid., 39; cf. 31).
Starting with the minimal concept as a scaffolding, Godfrey-Smith proposes a "spatial" representation in which the chief features of Darwinian populations, concerning their evolvability, are quantified in order to tell paradigmatic from marginal cases. This
1 Godfrey-Smith criticizes the "replicator approach" proposed by Dawkins and Hull among others, and takes the "classical approach", embraced for instance by Lewontin, as the starting point of his own proposal of an abstract representation for a Darwinian populational dynamics, that might be applied to different kinds of systems, not restricted to the biological realm (Godfrey-Smith, 2009, p. 31-6). 12-14 June 2012; Mugla, Turkey -3-
3
representation is also used to depict evolutionary transitions as well, as being trajectories in that space. Different kinds of Darwinian populations, associated with different kinds of individuals, are located in different places in the Darwinian hyperspace (as I will, henceforth, be calling this representation) given the values these populations score in a set of parameters that are briefly described below:
H - fidelity in inheritance
C - continuity2
S - relationship between fitness and intrinsic properties
V - abundance of variation
- reproductive competition3
Besides those, Godfrey-Smith emphasizes the relevance of three reproduction-related parameters (see Figure 1), summing up an eight-dimensional hyperspace:
B - bottleneck
G - reproductive specialization of the parts in a collective entity4
I - overall integration of the collective entity
2 The meaning of the parameter C can be grasped by using the idea of a fitness landscape. If it is rugged, small variations in the system's properties lead to big variations in fitness.This situation corresponds to a low value of the parameter C; in a landscape like this, the population can be easily trapped in a local fitness peak and not be able to cross a valley and to evolve towards a higher fitness peak on the landscape. The way the population might possibly evolve is, in this case, not Continuous, being as a result more susceptible to drift.
3 The parameter measures the degree in which the reproductive success of one individual in a population affects the reproductive success of another one in the same population.
4 The parameter G is modeled on the Germ/Soma reproductive specialization in multicellular organisms. Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -4-
4
Figure 1: The Darwinian hyperspace with just three dimensions representing the reproduction-related parameters B, G and I. Several organisms are located in this space given their coordinates along these dimensions (From Godfrey-Smith, 2009, p. 95).
In the framework proposed by Godfrey-Smith, Darwinian populations have ontological priority, so to speak, vis-à-vis Darwinian individuals: "...the population-level concept comes first" (2009, 6). Therefore, any attempt to apply those parameters to track possible transitions in individuality associated with cultural change has first to address the question about what kinds of Darwinian populations might exist in this domain. This is the main topic of the next section. Afterwards, I will evaluate if it is fruitful to apply Godfrey-Smith's representation for telling paradigmatic from marginal Darwinian 12-14 June 2012; Mugla, Turkey -5-
5
populations in cultural evolution.5
(II) DARWINIAN POPULATIONS IN THE CULTURAL DOMAIN
Godfrey-Smith argues that there are "several ways" in which Darwinian populations can be represented in the cultural realm (2009, 151). He distinguishes two "options" I will be
naming in this paper 'BP' and 'CP'. They are first presented in an 'individualistic' way (BPi
and CPi). Godfrey-Smith suggests that there are also group-level descriptions (BPg and
CPg) of Darwinian populations in this domain (see Table 1): "... we have two cross-cutting distinctions, one concerning the type of thing that makes up the population, and hence the associated notion of reproduction, and the other concerning the level at which the population exists" (Godfrey-Smith, 2009, 151).
Level Individualistic Group-level
Type of thing
BP (biological) BPi - agents having cultural BPg - groups having cultural phenotypes phenotypes
CP (cultural) CPi - cultural variants CPg - cultural variants' (memes) bundles (memeplexes)
Table1- Darwinian populations in the cultural realm
Individualistic descriptions
BPi) In this option, the population is made up of "ordinary biological individuals" with different cultural phenotypes. Reproduction in this case is ordinary biological reproduction:
5 This paper is part of a larger project in which I am attempting to figure out how fruitful might be to apply the whole set of parameters of Godfrey-Smith's Darwinian hyperspace to track a possible transition in individuality that could have happened in the Hominin lineage, associated with cultural change. Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -6-
6
"When people reproduce, their offspring often resemble the parents with respect to these features, as a consequence of teaching and imitation (...) It is not a new application of the theory, in fact, but an ordinary one" (Godfrey-Smith, 2009, 150).
As far as inheritance is concerned, in the BPi case we have just vertical transmission of cultural variants (or memes, if you like), through teaching and imitation.6
CPi) In the second individualistic option, cultural variants themselves make up a
(Darwinian) population. In the previous BPi option, the population is made up by the bearers of cultural variants. Here, cultural variants themselves make up the focal population and there is replication of cultural variants. I will come back later to the
modality of reproduction associated with a CPi-like population.
Group-level descriptions
The "two options" previously described are individualistic in character but Godfrey-Smith makes explicit that there are group-level populations, as well, of biological and cultural "types of things":
"It could be argued that human groups have cultural phenotypes that are transmitted to offspring groups (...), or that group-level cultural variants themselves (such as forms of political organization) may make up a pool of reproducing entities" (Godfrey-Smith, 2009, 151).
We end up with four kinds of populations in the cultural realm: at an 'individualistic' level, the populations are either composed of biological individuals (agents, for short) with
cultural phenotypes (BPi) or made up by the cultural variants themselves (CPi). At the group-level, either groups of agents with different cultural phenotypes (cultural groups, for
short) make up the population (BPg), or bundles of cultural variants (something akin to
what memeticists call "memeplexes") themselves constitute the population (CPg).
One might ask whether the kinds of Darwinian populations in each of the four cases (BPi,
6 I prefer to use the expression 'cultural variants' that is more neutral, not committing myself to the properties usually attributed to memes. 12-14 June 2012; Mugla, Turkey -7-
7
BPg, CPi, CPg), admitting individualistic and group-level descriptions, are paradigmatic or marginal. To address this question we should locate each case in the proposed Darwinian hyperspace.
In the following, I will focus on the BPg case. The chief question I want to address is whether this group-level population is paradigmatically Darwinian or just marginal.
After presenting the BPg option, Godfrey-Smith mentions Henrich and Boyd's 1998 paper on the role played by a conformist bias in human evolution. I highlight this reference here because this transmission bias will be discussed at length in this paper.
The BPg kind of Darwinian population is central to Richerson and Boyd's theory of human evolution, a particular brand of gene-culture coevolution theories. My bet is that their "dual inheritance" theory helps to shed light on some of the topics Godfrey-Smith addresses in his book, related to cultural evolution. And the other way around: Godfrey-Smith's way to represent transitions in individuality as trajectories in an abstract Darwinian hyperspace helps to develop further some aspects of Richerson and Boyd's theory.7
(III) HOW IS BPg LOCATED IN THE DARWINIAN HYPERSPACE?
Taking for granted the conceptual framework presented above, I want to put forth once more the chief questions I will be addressing in this paper: Might human groups with
different cultural phenotypes be Darwinian individuals? Do we have in BPg a paradigmatic or a marginal Darwinian population?
To tackle these questions, we must apply Godfrey-Smith's procedure, that is, we must
locate the BPg population in the Darwinian hyperspace, by roughly indicating its coordinates along the eight dimensions presented above. This is a much bigger project than I will be able to accomplish in this paper. I will focus here on just a few of those parameters
7 Boyd and Richerson share with Godfrey-Smith, furthermore, some more general points of view that invite the kind of approximation between their work I am exploring in this paper. First of all, they agree in pointing to population thinking as the most central aspect of Darwinism. They are also suspicious about the replicator approach (especially in the cultural domain) and argue that replicators are not necessary for evolution by natural selection to take place. They all embrace also a multilevel approach to natural selection. Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -8-
8
(and point to some relationships between them) and look at how cultural groups fare in these dimensions of the Darwinian hyperspace.
De-Darwinization in BPg
The parameter V measures the abundance of variation in a population. How much variation, and of what kind, is required to fuel an evolutionary dynamics at the level of
groups of an BPg type? Since we are dealing here with collective entities, we have to look also at how the population inside each group fares regarding the V parameter. In the human case, at least, the relevant lower level population is made up of agents exchanging cultural information in a social network. How abundant is the variation at this lower level, compared to the variation we find in the population of cultural groups?
If we take as a model the already mentioned case of multicellularity, there is a suppression of variation at the lower level population of cells that make up the organism: they are very similar in their intrinsic, genotipic properties.8
In the case of collectives, Godfrey-Smith describes an evolutionary transition as a combination of processes taking place simultaneously in nested populations, at several levels, that constitute the new individual. The evolutionary trajectory that represents the emergence of a new paradigmatic Darwinian population at the level of collectives in the hyperspace, and the simultaneous trajectory taken by the population of members of these collectives run in opposite directions.
Using Godfrey-Smith's expression, those members are "de-Darwinized" in different aspects, including V. In other words, in a major transition, the lower level population
8 The lower level populations in multicellular organisms, taken as a model for a collective entity, have also other features I will not be fully addressing in this paper: a) there is a division of labor between somatic and reproductive parts (cells, in this case); b) the latter are sequestered very early in the development of the organism and, therefore, are shielded from the evolutionary activity that happens in the population of somatic cells during the life of the organism; c) there is often a bottleneck in the modality of reproduction they instantiate; in the clear-cut cases, the development starts with a single cell, a condition that scores the highest value in the parameter (B = 1), and this is the reason why the population is quite uniform in their intrinsic properties (genotypic, in the multicellularity case). 12-14 June 2012; Mugla, Turkey -9-
9
usually changes its status from paradigmatic to marginal when the transition concludes.
Reasoning the same way in the case of a population of agents making up a cultural group, we should expect that this population is, to some extent, de-Darwinized in the transition towards a paradigmatic Darwinian population of cultural groups.
If we focus on the parameter V, when an evolutionary transition is achieved the population of group-member agents displays less variation (in the agent's intrinsic properties), compared to the population of cultural groups.9
Why should we expect de-Darwinization of the lower level population when it comes to group-level phenomena? There is always the risk of subversion, by free-riders, of the cooperation and division of labor that maintains the integrity of the group (Godfrey-Smith, 2009, 101; 123). Therefore, mechanisms for leveling the fitness of altruists, on the one hand, and the fitness of selfish agents, on the other hand, have to be put in place for cooperation to be preserved.
Furthermore, variation at the group-level should be enhanced and kept (despite migration
9 One might ask what would be intrinsic properties in BPg-like populations (at the low and high levels). This is relevant for the definition of the parameter S, as described by Godfrey-Smith (see above). This issue is not my focus in this paper and I will just offer some crude intuitions here. At the level of groups with different cultural phenotypes, we would expect, in an evolutionary transition, that these groups score higher values in the parameter S as well, that is, that their fitness becomes (more) related to their intrinsic properties (in other words, that their fitness Supervenes on the latter properties when the transition concludes). Maybe, it is better to say that group-level intrinsic properties emerge in an evolutionary transition (the same for fitness as a property at this level). It is plausible, therefore, to consider those cultural variants that distinguish a group phenotype from that of another group as being intrinsic properties of that group. If conformism and other biases are in place - as well as moral aggression and other mechanisms for suppressing cultural variation -, we have, as a consequence, a fairly uniform population at the level of the group-members' population. In a transition, we expect that the fitness of a group-member will be increasingly dependent on the fitness of the cultural group, what can possibly be interpreted as a suppression of S at the level of the group-member's population (since location in a particular group can be interpreted as an extrinsic property of a group-member). Much more has to be done to establish fruitful relations between S, V, H and the
reproduction-related parameters for each level in an BPg-like collective entity. Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -10-
10
etc.) for group selection to have strength, at the same time that (behavioral) variation inside the groups has to be suppressed.
What is at stake is the intensity of selection at the cultural group-level, which arguably has been non-negligible in human evolution at least. In Richerson and Boyd's dual inheritance theory for human evolution, psychological biases like conformism play a central role in supressing variation inside each cultural group, at the same time that these biases increase variation between these groups and maintain this variation along the time. These processes, going on simultaneously at both levels, would achieve a transition towards a Darwinian 10 population of groups with different cultural phenotypes (BPg).
We are touching here upon the problem of the evolution of cooperation, also discussed by Godfrey-Smith (2009, p. 115;163-4). What would be the analogues, in the cultural domain, of the ways of avoiding subversion we find in the biological domain?
We know that just kin selection and reciprocal altruism are not enough to support cooperation in groups whose members are not genetically-related and/or in large groups.11 Richerson and Boyd point, therefore, to other mechanisms of "variation supression" (to use Godfrey-Smith's expression): moralistic aggression and symbolic markers. Through these mechanisms, cultural groups score higher values in the parameter V and selection at the group-level becomes stronger. By the same token, cultural groups achieve a tight integration, that is, they score higher values in Godfrey-Smith's parameter I.12
How could those mechanisms for promoting cooperation in human groups have evolved?
10 Besides the conformist bias, Boyd and Richerson argue for the relevance of other transmission biases in the transmission of cultural variants: the model bias and the content bias. We discuss at length the role these biases play in their theory in Abrantes & Almeida, 2011.
11 Richerson & Boyd, 2005; Abrantes & Almeida, 2011. Cf. Godfrey-Smith, 115.
12 The former discussion suggests that through moral aggression each group "takes control over the lives and activities of [cultural agents, in this case], especially with respect to their reproduction" (2009, 124). This is one of the ways, pointed out by Godfrey-Smith, in which lower level populations in collectives are de-Darwinized (in their reproductive output also). I am not sure whether he would accept this interpretation of
the quoted passage in the context of BPg-like populations. 12-14 June 2012; Mugla, Turkey -11-
11
Even though this question will not be thoroughly addressed in this paper, I will say a few other things on the role of transmission biases in the next section.13
(IV) RULES FOR UPDATING BEHAVIOR AND DARWINIAN POPULATIONS
In the chapter on "Cultural evolution" of his 2009 book Godfrey-Smith engages himself in modeling the dynamics of a population of behaviors when a particular rule, among several possibilities, is followed by the agents for updating their behavior (2009,159-60). He investigates, especially, the evolutionary implications of the following rules that might be used in this context: 'imitate your best neighbor' (IBN), 'copy the common' and 'best response'.
An agent that follows the IBN rule looks around his or her neighbors (in a local interaction) and compare their behaviors for their payoffs; the agent then chooses to imitate the behavior that gets effectively the highest payoff. A best response rule is "smarter" than IBN since the agent not only looks around for her neighbors actual behaviors but is able to find out what would have been the most appropriate behavior given their circumstances. The agent embraces the behavior that, in Godfrey-Smith words, "would have been the most appropriate overall response to the behaviors produced by the individual's neighbors on the previous time-step" (2009,157).
The 'copy the common' rule is a kind of conformist rule: the agent imitates the behavior that is more common among those to which it is exposed.
Godfrey-Smith argues that IBN can support a Darwinian dynamics in the population of behaviors, but not the 'copy the common' rule. His argument is based on two assumptions:
13 The emphasis Godfrey-Smith puts on integration (the parameter I) in his account of the requirements for a paradigmatic darwinian population, can contribute to develop further dual inheritance theories. In my view, Hodgson and Knudsen (2010, p. 163-4) rigthly point out that a concern with social structure is lacking in Richerson and Boyd`s theory, for instance, and that we need more than psychological biases to deal adequately with the problem of the evolution of cooperation in human social groups. For an in depth discussion of the issue of cooperation, in the context of dual inheritance theory, see Abrantes & Almeida, 2011. Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -12-
12
1) IBN is success-driven but not conformism. After all, in the first case the agent imitates the behavior that gets the highest payoff among those to which it is exposed. An agent that conforms is not, for whatever reason, in a position to evaluate the payoffs of the behaviors to which it is exposed, since the most common behavior is not necessarily the fitter one given the circumstances.14 If we accept this assumption, IBN would be a "smarter" rule than the copy the common rule.
2) If the agents in the population follow the 'copy the common' rule, then we can't expect a Darwinian dynamics in the population of behaviors, since the behaviors that are imitated by the agents do not have single 'parent' behaviors. This rule does not give rise, therefore, to a lineage of behaviors: "... any given behavior will not have a single 'parent' behavior on the previous time-step" (Godfrey-Smith, ibid.,157).
He argues that if the IBN rule is followed instead by the agents,
"A particular instance of a behavior might, through successive events of imitation, be the ancestor of a branching tree of descendant behaviors, spreading through the population. Each behavioral instance is transitory, but if successful it may be causally responsible for other behaviors of the same kind. Behaviors themselves in this system are replicators" (Ibid., 157).
Godfrey-Smith concludes, assuming (1) and (2), that a conformist rule cannot give rise to a Darwinian change in the pool of behaviors themselves (Ibid., 160).
In what follows, I will object to the first assumption of the argument reconstructed above. I will not address the second assumption, since I have not much to say about reproduction and inheritance in this paper, despite their indisputable relevance in demarcating different kinds of populational dynamics.
Another perspective on behavior updating rules
In his discussion of various rules for updating behavior, Godfrey-Smith is clearly focusing
14 I will put aside, for now, the issue of the psychological requirements for being able to do this kind of appraisal. 12-14 June 2012; Mugla, Turkey -13-
13
just on what I have called the CPi case (see Table 1), that is, on the effects of following a particular rule in the dynamics of a population of behaviors (or, else, on the population of cultural variants that cause these behaviors). There is, however, another perspective that can be taken into account when addressing the evolutionary effects of following these rules,
by changing the focus to the BPg case instead. What is now at stake is the evolutionary dynamics of a population of groups with different cultural phenotypes, whenever a particular rule is followed by the members of those groups.
So that groups with different cultural phenotypes make up a (less marginal) Darwinian population, the agents that are members of these groups should follow a conformist rule, contrary to Godfrey-Smith's own expectations. I antecipated the argument supporting this thesis in the last section: a conformist rule leads to higher values of V for the population of cultural groups.
Furthermore, I suspect that the effects of the IBN rule on the dynamics of behaviors internal to a particular cultural group might endanger its cohesion, which is not the case if the copy the common rule is embraced by the agents.15 In other words, an IBN rule might lead to higher values of V in the population of group-member behaviors, whereas the copy the common rule obviously favors a lower V for this population.
At the same time, I argued before that a conformist rule for updating behavior conveys higher values of the parameter V for the population of groups (that is, this population becomes more diversified as far as culture is concerned). As a consequence, they become more isolated from each other, since cultural variation builds up barriers for migration (language is very effective in this regard). In addition, this situation enhances the strength
15 Another point that can be made is that "smarter" rules such as IBN and the best response rule presuppose that the agent is able to appraise which of her neighbors' behaviors has the best payoff under the prevailing environmental conditions. Very often, however, an agent is not able to do this - to appraise whether a particular behavior, to which it is exposed, is adaptive or not - and the best bet is to imitate the most common behavior in the group. An alternative would be for the agent to rely on individual learning, which can be a very risky strategy if, for whatever reason, the environment is informationally translucent for the agent. For the distinction between informationally opaque, transparent and translucent environments, see Sterelny, 2003. Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -14-
14
of selection at the group-level, as I had the opportunity to emphasize before.
In other words, following a copy the common rule de-Darwinizes the group-member's population, as far as the abundance of behavioral variation is concerned.16 A conformist bias - and maybe other biases too, besides enforcement mechanisms such as moralistic aggression -, might also reduce reproductive competition among the members of a particular group: this population scores a lower value in the parameter .17 Therefore, we have the conditions for a more cooperative interaction between the members of a particular cultural group. Competition switches from the level of group-members to the group-level population, where V is higher. By the same token, we should also expect a stronger selection at the cultural group-level whenever a conformist bias shapes social learning at the lower level of group-members.
Godfrey-Smith (2009, p. 157-8) makes it clear that models which address behavior updating rules, such as those built by Skirms, are attempts to simulate the conditions under
which cooperation could have evolved. The group-level BPg point of view I am suggesting in this section, points to a scenario in which a conformist bias is one of the chief elements that favored the evolution of cooperation in human cultural groups. Richerson and Boyd, among others, offered reconstructions along these lines, as I mentioned before.
Concerning the issue of the evolution of rules for updating behaviors, Godfrey-Smith says
16 Possibly we might also have a de-Darwinization not only regarding V, but also in reproduction-related parameters as well, for the group-member's population. At the same time, a transition towards a more paradigmatic population at the cultural group-level is taking place, as far as the latter parameters are concerned. To argue thoroughly for this thesis is beyond the scope of this paper.
17 One might ask about what is being reproduced here. The CPi and CPg cases correspond to populations of cultural variants, therefore the latter are the entities being reproduced. Given Godfrey-Smith's distinctions between different kinds of reproducers, it would seem straightforward to classify this kind of reproduction using the categories of formal and scaffolded reproducers, but he is not clear about it (2009, p. 79, 154-5; cf.
Dennett, 2011). It is even more complicated to conceive the modality of reproduction involved in the BPg case. Godfrey-Smith claims that there is no clear-cut (paradigmatic) reproduction in this case, which implies that we can't attribute to cultural groups the status of full individuals. I will argue against this claim at the end of the paper. 12-14 June 2012; Mugla, Turkey -15-
15
in passing:
"So evolution can build agents who use social experience to influence their choices in a number of ways. It is a striking fact that some of these ways, including IBN, can generate a new Darwinian population in the pool of behaviors themselves. But evolution may or may not build such agents. And it may build them initially and then build something beyond them - suppose biological evolution produced a sequence of successively "smarter" rules in a population: first copy-the-common, then IBN, then a best-response rule. The pool of behaviors is initially non-Darwinian, becomes Darwinian, and then becomes non-Darwinian again" (Ibid., 160).
Godfrey-Smith does not develop this scenario further in his book, but I want to point out
that it refers, again, to the CPi case (see Table1).
My focus on the BPg case points, instead, to a more constrained scenario, in which the evolution of a copy the common rule (arguably in the Hominin lineage) is much more probable than the evolution of other rules, given the environmental conditions that prevailed during the Pleistocene (Boyd & Richerson, 2005). I would guess also that an IBN rule has a higher cost for the agent in those environmental conditions.18
From the point of view I am taking here, a conformist rule might be success-driven, after all, and it can be shown that it is able to support, actually, a Darwinian dynamics at the
biological group-level population (BPg).
There is a large amount of litterature on the role conformism might have played in Hominin
18 Besides the point I made before concerning the effects on the parameter V of following the IBN rule, my intuition is that, compared to the conformist rule, the costs of following the IBN rule are higher: we have to consider the cost of the psychological machinery required for the evaluation of the payoffs and, in addition, to take into account the (cost of) risk of imitating a behavior that is not the most adaptive, given the environment in which the population has been living (refer also to the point I made in footnote 15 concerning informationally translucent environments). This is a situation in which intuition can mislead and mathematical modeling is indispensable to compare the various scenarios. Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -16-
16
groups and on the conditions under which it might have evolved.19 According to several models built by Richerson and Boyd, among others, the evolution of imitation as a social learning modality is closely related to the evolution of a conformist rule for updating behaviors (the equivalent to what Godfrey-Smith calls a 'copy the common' rule). Social learning by imitation enhances the fitness of the agent when certain environmental conditions prevail: those conditions in which the environment is neither too unstable – which would favor, instead, individual learning – nor very stable - which would favor an innate behavior. These models give plausibility to a scenario in which a conformist transmission bias and high-fidelity imitation evolved in the very same environmental conditions. Therefore, a conformist bias has been probably selected for at the group-level, and one of its effects was a de-Darwinization of the lower level population, as I argued above.20
CONCLUSION
The arguments presented in the previous sections - inspired by some of the theses defended by dual inheritance theorists -, suggest that a population of groups with different cultural phenotypes might be more paradigmatically Darwinian than Godfrey-Smith is willing to acknowledge in his 2009 book. It is true, however, that the points I make in this paper are
restricted to just a few dimensions of the Darwinian hyperspace. The BPg-like population might (still) be a marginal one, as far as other dimensions of this hyperspace are taken into account, especially those quantified by the reproduction-related parameters. Godfrey-Smith is explicit about what is at stake here:
"Darwinian language is often applied to social groups and communities in such a way that the focus is on persistence of a group as contrasted with extinction, or growth as opposed to shrinkage (...) In this book I treat Darwinian processes involving growth and persistence
19 Henrich & Boyd, 1998; Boyd & Richerson, 2005; Abrantes & Almeida, 2011. Hodgson & Knudsen argue for a replicator approach on tackling this issue (2010, esp. 140, 159-165). I emphasized at the beginning of the paper the reasons why Godfrey-Smith rejects this approach (see also 2009, p. 110-11).
20 Another possible scenario would be one in which a conformist bias coevolved with a capacity for high-fidelity imitation. We discuss some of those models in Abrantes & Almeida, 2011; Abrantes, 2011. 12-14 June 2012; Mugla, Turkey -17-
17
without reproduction as marginal cases (...) So "cultural group selection" of a significant kind requires differential reproduction, not just differential persistence, even though the border between these is vague" (Ibid., 151-2; cf. 118-9).
Taking this stance, Godfrey-Smith is skeptical about the possibility of talking about reproduction in the case of cultural groups. My intuition, instead, is that it might be fruitful to come up with modalities of reproduction suitable to cultural groups, such as persistence. This strategy is compatible with the "permissive attitude" (2009, 91) he embraces along the book in other hard cases and concerning other parameters of the Darwinian hyperspace.21
Further work has to be done to argue more forcefully in favor of the thesis that the emergence of cultural groups in the Hominin lineage might have been a transition in individuality. This is an speculative scenario, albeit plausible, suggested by Godfrey-Smith's novel approach to the issue of transitions. It is an empirical matter how far we have been going along any of those possible evolutionary paths.22
BIBLIOGRAPHY
Abrantes, P. Methodological issues in the dual inheritance account of human evolution. In: Darwin´s Evolving Legacy. Martínez Contreras J. & Ponce de León A. (eds.). México: Siglo XXI - Universidad Veracruzana, 2011, p. 127-143.
Abrantes, P. ; Almeida, F. Evolução Humana: a teoria da dupla herança. In: Abrantes, P. (org.), Filosofia da Biologia. Rio Grande do Sul: ARTMED, 2011, p. 261-295.
Boyd, R.; Richerson, P. The origin and evolution of cultures. Oxford: Oxford University Press, 2005.
21 Refer also to the above footnotes 12 and 17. For an argument along a similar line, see Dennett, 2011.
22 I am grateful to Peter Godfrey-Smith for several conversations we had in Harvard University in 2009, which helped me to clarify and further work out some of the topics I address in this paper. Any mistakes in it are my own responsibility, of course. Versions of this paper have been presented at the 2011 ISHPSSB Meeting (Salt Lake City) and at the VII International Principia Symposium (Florianópolis, 2011). I am grateful to the Brazilian Research Agency (CNPq) for the scholarship that made possible my stay in Cambridge and my attending those Conferences. Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -18-
18
Buss, L. W. The evolution of individuality. Princeton (NJ): Princeton University Press, 1987.
Dennett, D. Homunculi rule: reflections on `Darwinian populations and natural selection` by Peter Godfrey Smith. Biology & Philosophy, v. 26, p. 475–488, 2011 .
Godfrey-Smith, P. Darwinian populations and natural selection. Oxford: Oxford University Press, 2009.
Henrich, J.; Boyd, R. The Evolution of Conformist Transmission and the Emergence of Between-Group Differences . Evolution and Human Behavior, v. 19, p. 215–241, 1998.
Hodgson, G. M. ; Knudsen, T. Darwin's conjecture: the search for general principles of social & economic evolution. Chicago: The University of Chicago Press, 2010.
Maynard-Smith, J. ; Szathmáry, E. The major transitions in evolution. Oxford: Oxford University Press, 1997.
Michod, R. E. Darwinian dynamics: evolutionary transitions in fitness and individuality. Princeton (NJ): Princeton University Press, 1999.
Richerson, P. ; Boyd, R. Not by genes alone: how culture transformed human evolution. Chicago: The University of Chicago Press, 2005.
Sterelny, K. Thought in a hostile world. Malden (MA): Blackwell, 2003. 12-14 June 2012; Mugla, Turkey -19-
The Moral Lives of Animals Mugla, Turkey June 2012 1 Introduction
Do animals lead moral lives? What exactly might be meant by claiming that they do and how might we be able to establish that fact? This is the focus of the following, programmatic, paper. My aim is to establish a framework for answering these questions and suggest a direction for further investigation.
Much of the literature on animals and morals focuses on the moral status of animals. Do they need to be considered in our moral calculations and if so how? To what extent does the moral status of animals suggest or dictate human attitudes towards them and human practices with respect to them? To borrow a phrase from Peter Singer the question is should the circle that encompasses the moral community of human beings be expanded to include some if not all animals? If so, what criteria are relevant for determining who is or is not to be included in this expanded circle? Utilitarians opt for the capacity for feel pain, Kantians and neo-Kantians opt for evidence of some degree of rationality or reflective capacity and virtue theorists, I suppose, would opt for some evidence of the manifestation of virtue. The point is that each of these approaches reflects what might be called an ‘anthropocentric perspective’ insofar as the key underlying question seems to be what the implications of including or excluding animals in the ‘moral circle’ are for us? They are anthropocentric in another sense as well in that who counts as morally relevant is determined by criteria that are set by some understanding of human conceptions of morality.
My approach is somewhat different. The question I am interested in exploring is this: To what extent can we get a handle on the moral lives of animals from the perspective of the animals themselves? Does it make any sense and, if so, what sense, to talk of animals as leading moral lives independently of questions about how and whether to factor them into our moral deliberations? In terms of Singer’s ‘expanding circle’ metaphor we may put the question in the following way: Is there one moral circle that encompasses all those who warrant moral consideration or are there perhaps a number of (possibly overlapping) circles centered around different focal points? Does it, for example, make sense to talk of a moral community of wolves or elephants where the norms of these communities and the criteria for membership are determined by and reflections of the social dynamics of the respective groups? In contrast to the traditional anthropocentric perspective this approach might be labeled ‘speciocentric.’
The plan of the paper is as follows. Section 2 is a brief summary of the main empirical and theoretical considerations that suggest that at least some non-human animals lead moral lives that can be appropriately characterized from a speciocentric point of view. Section 3 raises two questions that need to be addressed if the project of attributing moral lives to animals is to get off the ground. Section 4 explores the sense in which animals might be construed as moral agents. This discussion draws on some recent work by Geoffrey Sayre-McCord on the nature of normativity.i Section 5 is a discussion of a moral version of what is known as the ‘logical problem’ in the theory of mind literature. This material draws on some recent work by Robert Lurz.ii
The moral lives of animals – 5-31-12 draft Page 1 Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -20-
2. The empirical and theoretical background
Here I briefly summarizie material that is dealt with more fully in ‘The moral life of animals.’iii The general empirical and theoretical support for attributing moral sensibilities to animals derives from three sources: evolutionary theory, neuroscience and cognitive ethology.
The argument from evolution has its roots in the work of Charles Darwin and George Romanes.iv The basic idea is that human beings, other mammals and even more distant lineages have a shared evolutionary history. This history records the development of the shared underlying biological mechanisms that give rise to psychological and affective states. Different lineages may manifest those characteristics in different ways but the implication is that the differences between lineages are differences in degree and not differences in kind. The attribution of mental and affective states to animals was blacked by the rise and dominance of behaviorism in the first half of the 20th century. However, developments in neuroscience in the past 40 years have led some to challenge the behaviorist paradigm that rejects all attributions of mental or affective states to animals as anthropomorphism gone wild.
The evidence from neuroscience is extensive although the implications for attributing mental and affective states to non-human animals are still somewhat controversial. Two of the major figures advancing the view that the neuroscience strongly supports the view that animals do have minds and experience affects are Paul Maclean and Jaak Panksepp. The basic idea of Maclean’s ‘triune brain hypothesis’ is that the evolved mammalian brain can be conveniently represented as the product of 3 developmental stages: A primitive reptilian brain located in the basal ganglia, an old mammalian brain located in the limbic system, and a new mammalian brain located in the neocortex.v The triune brain thesis argues for deep homologies between the brains of animals and the brains of human beings. Neurological evidence points to deep structural similarities between the ancient brain systems that we share with other animals. In particular, the ancient structures are the neural source of basic qualitative feels or affects. Jaak Panksepp has identified seven primary limbic emotional action systems which, he argues, are the basis of animal responsiveness and lie at the foundation of both emotional and cognitive states. In addition to this shared affective neurostructure, he has recently argued that mammals share brain structures that constitute what he calls “proto-selves” and “core selves.” Further study, he suggests, may reveal the basis for attributing a sense of self to a wide range of animals. It stands to reason, he argues, that animals with brain structures similar to those in humans not only react in ways that make them appear to have qualitative experiences similar to those of humans when the homologous brain structures are stimulated, but also that they do in fact have those experiences.vi The unregenerate behaviorists among you may object that the attribution of affects to non-human animals is unjustified anthropomorphism. Frans de Waal, among others, however, argues that it is not. de Waal argues that there is a double standard at work.vii On the one hand, researchers take cognitive differences to justify the non-attribution of emotional and mental capacities to animals while, on the other hand, they ignore evolutionary evidence that suggests that animals and human beings have shared inherited brain structures associated with emotional and mental capacities. de Waal labels this blind spot “Anthropodenial,” which he characterizes as the a priori rejection of the importance of the fact that although non-human animals are not human, humans are animals.
The moral lives of animals – 5-31-12 draft Page 2 12-14 June 2012; Mugla, Turkey -21-
The third line of relevant scientific findings comes from investigations by cognitive ethologists. In their book Wild Justice, Marc Bekoff and Jessica Pierce argue from the perspective of cognitive ethology that animals exhibit behaviors that are best interpreted as manifestations of empathy, cooperation, and a sense of fairness. In essence, “animals have morality.”viii Bekoff and Pierce understand morality to be “a suite of interrelated other-regarding behaviors that cultivate and regulate complex interactions within social groups.”ix However, these behaviors do not constitute morality in themselves; a certain level of cognitive and emotional sophistication is necessary. Bekoff and Pierce’s approach is data-driven, and they emphasize the need and importance of expanding research beyond non-human primates to other social mammals: Hunting predators such as wolves, coyotes, and lions, as well as elephants, mice, rats, meerkats, and whales, among others. In addition, they emphasize the importance of studying animals in their natural habitats and not merely in the confines of laboratories where they are often asked to perform in accordance with the interests of animal behaviorists, which may or may not reflect the interests of the animals themselves. Where is the line to be drawn between animals that evince morality in this limited sense and those that do not? Bekoff and Pierce suggest that the line is shifting as more empirical evidence becomes available and as our philosophical understanding of what it means to be moral is modulated by reflection on the scientific data. Although their focus is on social mammals, there is a widening body of evidence that suggests that some birds have the wherewithal to constitute a moral community, in the sense of relevant emotions, co-operation, and the like. Although they argue that the data strongly support the attribution of morality to animals, Bekoff and Pierce also argue that what constitutes morality has to be understood as species specific. Thus, what counts as morality for human beings may not apply well to wolves, for instance. Nevertheless, they argue, the fact that human standards of morality are not appropriate for wolves does not mean that wolves do not possess some sense of moral relationships that is exhibited in their own manifestations of empathy, cooperation and a sense of fairness. The net effect is that there is not one sense of moral community and that we humans, as prototypical moral agents, may expand our understanding of morality to include some organisms and exclude others. The proper way to understand animal morality, they suggest, is to see that there are a number of distinct species-specific moral communities. Within these diverse communities, what counts as moral needs to be attuned to the characteristic features of the species themselves. Indeed, even within species, different communities may develop different social practices, so that what is acceptable in one wolf pack, for example, may not be acceptable in another.
These considerations, taken together, are compelling support for the claim that at least some animals, especially the social animals, have moral lives. The evidence is compelling but not conclusive. Putting aside behaviorist qualms there are still significant hurdles to be overcome before we can be confident in concluding that animals are moral creatures in their own right. To these qualms we now turn.
3 Two questions
There is an extensive literature on the dual questions of whether animals have minds and whether, if they do, they have a ‘theory of mind.’ There are two fundamental issues: (1) Do animals have minds?, and (2) Given that they do, are they capable of attributing mental states to others and acting on those attributions? Parallel questions can be raised with respect to the moral
The moral lives of animals – 5-31-12 draft Page 3 Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -22-
lives of animals. (1m) Do animals have moral lives, that is, are they motivated by ‘moral’ considerations, properly understood? (2m) Given that they are, can they attribute moral motivations to others and act accordingly?
Robert Lurz, in a recent book, has identified two fundamental issues that need to be addressed in order to be in a position to answer questions about the mindreading capabilities of animals. One is theoretical and one is experimental. Parallel issues have to be addressed in order to be in a position to answer questions about the moral lives of animals.
For our problem, the theoretical issue is this: What does it mean to attribute moral lives to animals? In particular, what does it mean to attribute moral motivations to animals? The empirical issue is this: How best can we test for the existence moral sensibilities and moral motivations in animals?
I do not have a good answer to either of these two questions but I think we can make some headway in identifying the key questions that need to be answered and in identifying what is the proper perspective for answering them.
4. Levels of agency
To the extent that we attribute psychological and moral states to animals they are, in some sense, persons and not merely biological organisms. What, then, does it mean to attribute personhood to animals? We can adopt either an anthropocentric or a speciocentric perspective on this question. From an anthropocentric point of view, something is a person if it has a sufficient number of properties that make it an entity like us. I am not sure what constellation of properties this includes but the fact that in some legal sense corporations can be persons shows that the applicability of the concept is not limited to living beings. The central ideas that legitimate the extension of the idea of personhood to corporations, for instance, are notions of agency and responsibility. Corporations can act as (legal) agents and can be held (legally) responsible for their actions. However, what degree of agency and responsibility they possess is conferred upon them by human beings and their social practices. What about the moral agency of animals? Is that to be construed as merely derivative as well? A speciocentric perspective would reject this way of understanding what it means for an animal to be a moral agent. To the extent that animals lead moral lives (as opposed to being merely factors in our moral calculations) we must be able to construe them as moral agents in their own right. Can this be done?
In a recent paper responding to claims in the literature that attribute moral agency to animals, Geoffrey Sayre-McCord asks ‘Just what is it to be a moral agent?’.x In effect, what is the nature of normativity? In his analysis, Sayre-McCord identifies several levels of ‘agency’ where an agent is understood to be something capable of representing its environment and acting on the basis of those representations (p. 5). These are, in order of increasing sophistication, (1) ‘stimulus-response agents who ‘represent the world as being a certain way and then respond directly (p. 5);’ (2) ‘planning agents,’ which are basically stimulus-response agents with the extra capacity to identify alternative courses of action and act in accordance with some plan of action. Sayre-McCord characterizes these agents as ‘decision-theoretical’ agents whose behaviors can be adequately modeled by decision theory (p. 5); (3) ‘strategic agents’ are agents who attribute
The moral lives of animals – 5-31-12 draft Page 4 12-14 June 2012; Mugla, Turkey -23-
designs and plans to others and act accordingly. Their behavior can be modeled by game theory; (4) ‘norm-governed agents’ are ‘strategic agents . . . [who] introduce rules for behavior with which they are disposed to conform and disposed to enforce in various ways (p. 7);’ and finally, (5) ‘rational agents,’ that is, strategic norm-governed agents who are ‘able to represent the different options as better or worse, as right or wrong, or as justified or not and . . . [are] able to act on the basis of such normative representations (p. 7).’
On Sayre-McCord’s account, truly moral agents need to be able to have a capacity for second-order reflection on first order states. That is, truly moral agents need not only follow norms but be capable of recognizing that they are following norms and be capable of using this reflective insight to guide their actions. This is a high bar for non-human animals to pass. Whether they are capable of passing it depends on how sophisticated their mental and psychological capacities are. Many who are willing to allow that some animals have sophisticated psychological states are reluctant to attribute reflective second-order capacities to them. Sayre-McCord, for one, allows that some animals are capable of rising to the level of norm-governed agents but he resists the attributing any rational, and hence, truly moral, agency to them.
The distinction between norm-governed agents and rational agents roughly parallels Kant’s distinction between acting in accordance with duty and acting from duty. Indeed, this is the central theme of Sayre-McCord’s analysis (p. 2) . His main project is to provide a Kantian account of rational agency freed from the metaphysical baggage of Kant’s own account (p. 2) For our present purposes, the question is: ‘Is norm-governed agency good enough for non-human moral agency? It is clear that many cognitive ethologists see the structured behavior of social animals as manifesting norm governed behavior. Some, perhaps sympathetic to the idea that such behavior doesn’t rise to the Kantian level of moral agency, are content to qualify such animals as ‘proto-moral’ beings. I don’t want to haggle over labels here but merely want to suggest that the resistance to qualifying animals as ‘truly’ moral may reflect a subtle anthropocentric bias. If we view human morality as one manifestation of a shared evolved set of social enabling mechanisms, then the peculiar feature of rational reflectivity, as Sayre-McCord understands it, looks more like a refinement of a capacity that is shared among many lineages rather than as a defining characteristic. If so then we can tentatively accept norm-governed agency as moral agency enough and move on to the empirical question of how to establish whether any animals do live moral lives, so understood.
5. The logical problem
Turning to the question of how to empirically test whether or not animals lead moral lives, we confront what has been labeled in the mind reading literature as the ‘logical problem.’ The mind reading problem is this: Is there any way to empirically distinguish between (1) animals that are mind readers, that is, animals that act in light of their attribution of intentional states to others, and (2) animals that are acting on behavioral cues but who do not attribute intentional states to others? The problem arises, in part, because, in the absence of language, the attribution of mind reading to animals is determined solely by their behavioral responses to environmental situations. In a recent book, Robert Lurz argues that all previous experimental results that suggest that some animals are mind readers are compromised by a failure to rule out
The moral lives of animals – 5-31-12 draft Page 5 Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -24-
the hypothesis that the observed behaviors can be explained equally well by a ‘behavior-reading’ hypothesis to the effect that the animals are responding to behavioral cues and are not attributing mental states to either other conspecifics or to the experimenters.xi
Some take this failure to be able to discriminate between cases of mind-reading and cases of behavior-reading to be an insurmountable barrier to the unequivocal attribution of mind reading to organisms that cannot communicate their thoughts and intentions through the use of language that is intelligible to us. Lurz, however, argues that it is possible to design experiments that will be able to discriminate between the two hypotheses and he proposes several, as yet untested designs, that he claims will yield different predictions depending upon whether the tested animals are mind-readers or not. I do not want to pursue this here but rather to formulate the analogous problem for determining whether or not non-human animals lives live moral lives.
The logical problem for the moral lives question boils down to this: Is it possible to experimentally distinguish between animals that are acting in accordance with moral norms and animals that are behaving as if they were but for whom no moral considerations, per se, are relevant? If we allow, for the sake of argument, that morally motivated animals are norm- governed in Sayre-McCord’s sense then what we want to know is whether the animals behavior is directed by (first-order) moral motivations or whether the characterization of their behavior as norm-governed is imposed upon their behavior by the ethological investigators.
This problem dogs much, if not all, of the cognitive ethology data that suggests that many social animals exhibit behaviors that can be interpreted as a result of the animals acknowledging and enforcing social and moral norms within their respective communities. Unlike Lurz, I do not have any good sense that these alternative accounts are empirically distinguishable. If they are not, then the claim that animals lead moral lives will remain in limbo despite the suggestive evidence from evolutionary considerations and the neuroscientific data. However, I am persuaded by the work of the cognitive ethologists that any decisive conclusions one way or the other must be the result of investigations in situ where experiments and observations are set up to reflect the conditions and expectations of the animals under investigation and not the expectations of alien investigators (that is, us).
The moral lives of animals – 5-31-12 draft Page 6 12-14 June 2012; Mugla, Turkey -25-
i Geoffrey Sayre McCord, “Rational Agency and the Nature of Normative Concepts,” http://philosophy.unc.edu/people/faculty/geoffrey-sayre-mccord/on-line-papers/Rational%20Agency%20and%20the %20Nature%20of%20Normative%20Concepts.pdf
ii Robert W, Lurz, Mindreading Animals: The Debate over What Animals Know about Other Minds (Cambridge MA: MIT Press, 2011)
iii Michael Bradie, “The Moral Life of Animals,” in Tom Beauchamp and R. G. Frey (eds.) The Oxford Handbook of Animal Ethics (Oxford: Oxford University Press, 2011)
iv Charles Darwin, On The Origin of Species: A Facsimile of The First Edition, edited by Ernst Mayr (Cambridge, MA: Harvard University Press, 2000); Darwin, The Descent of Man and selection in relation to sex (Princeton: Princeton University Press, 1981); Darwin, The Expression of The Emotions in Man and Animals, 3rd ed., (Oxford: Oxford University Press, 1998); George Romanes, Animal Intelligence (London: Kegan Paul, Trench & Co., 1882); Romanes, Mental Evolution in Animals (London: Kegan Paul, Trench & Co., 1885)
v Paul MacLean, The Triune Brain in Evolution (New York: Plenum Press, 1990).
vi Jaak Panksepp, “Affective consciousness: Core Emotional Feelings in Animals and Humans,” Consciousness and Cognition 14 (2005): 30-80.
vii Frans de Waal, Primates and Philosophers: How Morality Evolved (Princeton: Princeton University Press, 2006), pp. 61f.
viii Marc Bekoff and Jessica Pierce, Wild Justice: The Moral Lives of Animals (Chicago: The University of Chicago Press, 2009), p. 1.
ix Bekoff and Pierce, Wild Justice, p. 7.
x Sayre-McCord, p. 2
xi Lurz, Mindreading Animals, ch. 2. Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -26-
Defusing Bertrand’s Paradox
Zal´anGyenis Department of Mathematics and its Applications Central European University N´adoru. 9. H-1051 Budapest, Hungary [email protected]
Mikl´osR´edei Department of Philosophy, Logic and Scientific Method London School of Economics and Political Science Houghton Street, London WC2A 2AE, UK [email protected] July 2, 2012
Abstract The classical interpretation of probability together with the Principle of Indiffer- ence are formulated in terms of probability measure spaces in which the probability is given by the Haar measure. A notion called Labeling Irrelevance is defined in the category of Haar probability spaces, it is shown that Labeling Irrelevance is violated and Bertrand’s Paradox is interpreted as the very proof of violation of La- beling Invariance. It is shown that Bangu’s attempt [2] to block the emergence of Bertrand’s Paradox by requiring the re-labeling of random events to preserve ran- domness cannot succeed non-trivially. A non-trivial strategy to preserve Labeling Irrelevance is identified and it is argued that, under the interpretation of Bertrand’s Paradox suggested in the paper, the paradox does not undermine either the Princi- ple of Indifference or the classical interpretation and is in complete harmony with how mathematical probability theory is used in the sciences to model phenomena. It also is argued however that the content of the Principle of Indifference cannot be specified in such a way that it can establish the classical interpretation of probability as descriptively accurate, predictively successful or rational.
1 The main claims
Bertrand’s Paradox, published first in [3], is regarded a classical problem in con- nection with the classical interpretation of probability based on the Principle of Indifference, and it continues to attract interest [17], [22], [2], [20] in spite of al- leged resolutions that have been suggested in the large and still growing literature discussing the issue ([12] and [16] are perhaps the most well-known suggestions for
1 12-14 June 2012; Mugla, Turkey -27-
resolutions; the Appendix in [16] contains a brief summary of a number of typical views of the Paradox). It is not the aim of this paper to offer yet another “resolution” or criticize the ones available; rather, we suggest a new interpretation of Bertrand’s Paradox and analyze its relation to the classical interpretation of probability. The interpretation proposed here should make clear that Bertrand’s Paradox cannot be “resolved” – not because it is an unresolvable, genuine paradox but because there is nothing to be resolved: the “paradox” simply states a provable, non-trivial mathematical fact, a fact which is perfectly in line both with the correct intuition about how probability theory should be used to model phenomena and with how probability theory is in fact applied in the sciences. The key idea of the interpretation to be developed here is that the category of probability measure spaces with an infinite set of random events for which a clas- sical interpretation of probability based on the Principle of Indifference can be meaningfully formulated is the one in which the set X of elementary events is a compact topological group, the Boolean algebra S representing the set of random events is the set of Borel subsets of X and the probability measure pH is the (nor- malized) Haar measure on S. After stating the General Classical Interpretation in terms of the probability measure space (X, S, pH ) together with the Principle of Indifference, we will define a notion called Labeling Irrelevance in this cat- egory of measure spaces: Labeling Irrelevance expresses the intuition that the specific way the random events are named is irrelevant from the perspective of the value of their probability understood according to the classical interpretation. It will be shown that Labeling Irrelevance does not hold in this category of proba- bility measure spaces and we interpret Bertrand’s Paradox as stating this provable mathematical fact. This interpretation makes it possible to formulate precisely the extra condition on re-labelings that ensures that re-labelings do preserve the probabilities of events; the condition is an expression of the demand that re-labelings do not affect our epistemic status about the elementary events. We also will show that the recent attempt by Bangu [2] to block the emergence of Bertrand’s Paradox by requiring re-labelings to preserve randomness cannot succeed non-trivially. The interpretation will also make it clear that Bertrand’s Paradox does not affect the Principle of Indifference and does not, in and by itself, undermine the classical interpretation of probability – the classical interpretation, the Principle of Indifference and Labeling Irrelevance are independent ideas. This is not to say that the classical interpretation is maintainable however; the main problem with it is that it gives the impression that it is possible to infer empirically correct probabilities from an abstract principle stating some sort of epistemic neutrality. It would be a mystery if this were possible, but we will argue in the final section that this is not possible and does not in fact happen in applications of probability theory.
2 The elementary classical interpretation of probability
Betrand’s Paradox appeared at a time when probability theory had already pro- gressed from the purely combinatorial phase involving only a finite number of ran- dom events to the period when it got intertwined with calculus. This development
2 Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -28-
began in the early 18th century with the appearance of limit theorems (theorem of large numbers, Bernoulli 1713, and central limit theorem, de Moivre 1733, [8]); yet, by the late 19th century the theory had not yet reached the maturity that would have made the mathematical foundations of the theory clear and transpar- ent. This was clearly recognized by Hilbert, who, in his famous lecture in Paris in 1900, mentioned the need of establishing probability theory axiomatically as one of the important open problems (Hilbert’s 6th problem [27], [26][p. 32-36]). Hilbert’s call was answered only in 1933, when Kolmogorov firmly anchored probability the- ory within measure theory [13]. (See [7] for the history of some of the major steps leading to the Kolmogorovian axioms.) In the measure theoretic approach probability theory is a triplet (X, S, p), where X is the set of elementary random events, S (the set of general random events) is a Boolean σ algebra of certain subsets of X and p (the probability) is a countably additive measure from S into the unit interval [0, 1]. Typically, one also needs random variables to describe certain features of the phenomenon to be described probabilistically: A (real valued) random variable f is a measurable function f from X into the set of real numbers R; measurability being the requirement that the inverse image f −1(d) of any Borel set d in R belongs to S. The measurability requirement entails that the distribution of a random variable d 7→ p(f −1(d)) is well-defined, the distribution of f is in fact the probability measure p ◦ f −1 on B(R) defined as (p ◦ f −1)(d) = p(f −1(d)) for all Borel sets d ∈ B(R). The number p(f −1(d)) is the probability that f takes its value in d. Note that the events also can be regarded as random variables: an element A in S can be identified with the characteristic (also called: indicator) function χA of the set A (see e.g. [19] for the mathematical notions of measure theoretic probability). The significance of probability theory being part of measure theory is that foundational-conceptual problems of probability theory, such as Bertrand’s Para- dox, can best be analyzed in terms of measure theoretic concepts. With few ex- ceptions, the papers on Bertrand’s Paradox typically do not aim at providing an analysis on this level of abstraction however, and, as a result, the precise nature of the paradox remains less clear than it should be. One such exception is Shackel’s paper [22], which raises the issue of “Getting the level of abstraction right” [22][p. 156] explicitly. But the level of abstraction suggested by Shackel is a bit too high. To see why, we recall first the classical interpretation of probability together with the Principle of Indifference in measure theoretic terms. The elementary version of the classical interpretation of probability concerns the probability space (Xn, P(Xn), pu), where Xn is a finite set containing n number of random events and the full power set P(Xn) of Xn represents the set of all events. The probability measure pu is determined by the requirement that the probability pu(A) be equal to the ratio of the “number of favorable cases to the number of all cases”: number of elements in the set {xi : xi ∈ A} pu(A) = (1) n This is equivalent to saying that pu is the probability measure that is uniform on the set of elementary events. While it is not always stated and emphasized explicitly, it also is part of the classical interpretation what we call here the Interpretive Link: that the numbers pu(A) are related to something non-mathematical. Without such an interpretive link, the classical interpretation is not an interpretation of proba- bility at all: the numbers pu(A) defined by (1) are just pure, simple mathematical relations. There are two standard Interpretive Links: The Frequency Link
3 12-14 June 2012; Mugla, Turkey -29-
and the Degree of Belief Link. We formulate here the first only, the latter will be discussed briefly in section 7. Thus we have the following specification of the classical interpretation: Elementary Classical Interpretation: In case of a finite number of elementary events the probabilities of events are given by the measure pu that is uniform on the set of elementary events and (Frequency Link:) the numbers pu(A) will be (approximately) equal to the relative frequency of A occurring in a series of trials producing elementary random events from Xn. Notice the future tense in the above formulation: it is this reference for future random trials that distinguishes the classical interpretation (with the Frequency Link) from the frequency interpretation, in which the ensemble of elementary ran- dom events determining A’s relative frequency must be specified before one can talk about probabilities (cf. [25][p. 24]). The classical interpretation so formulated is not maintainable however: simple examples (such as throwing a loaded die) show that it is only under special circum- stances that pu(A) is indicative of the frequencies with which A will occur in trials. This is what the Principle of Indifference is supposed to express. To state this principle we reformulate first the condition (1). Let Πn be the group of permuta- tions of the n element set {1, 2, . . . n} and π ∈ Πn be a permutation. Then the probability measure pu on P(Xn) which is uniform on Xn is determined uniquely by the condition
for every π ∈ Πn one has: pu({xi}) = pu({xπ(i)}) for all i ∈ {1, 2, . . . n} (2)
Elementary Principle of Indifference: If the permutation group Πn expresses epistemic indifference about the elementary random events in Xn, then the (Ele- mentary) Classical Interpretation is correct. Thus the (Elementary) Principle of Indifference states that the (elementary version of the) classical interpretation of probability is maintainable only if one is epistemically neutral in some sense about the elementary events. For now, we leave it open how to specify the content of the “epistemic neutrality”, we will return to the issue of epistemic neutrality in section 7.
3 The general classical interpretation of prob- ability in terms of Haar measures
Bertrand’s Paradox is typically regarded as an argument against the universal ap- plicability of the Principle of Indifference: Bertrand’s Paradox type arguments are intended to show that applying the Principle of Indifference can lead to assigning different probabilities to the same event. Both the original version of the argument and the numerous simplified versions of it involve an (uncountably) infi- nite number of elementary random events however. But then it is not obvious at all how one can apply the Principle of Indifference because the formulation of it in the previous section looses its meaning if the set of elementary events is not finite: there is no permutation group in the infinite case with respect to which one could require invariance of the measure yielding the “right” probabilities; equivalently: there is no probability measure on an infinite S that would be uniform on the infi- nite set X of elementary events. What is then the Principle of Indifference in
4 Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -30-
connection with such infinite probability spaces? Without answering this question in suitable generality, Bertrand’s Paradox cannot be properly discussed in measure theoretic concepts. Shackel’s paper [22], which aims at an analysis of Bertrand’s Paradox in abstract measure theoretic terms, realizes the importance of this question but does not offer a convincing specification of the Principle of Indifference: Shackel just assumes a measure µ on S and stipulates that the probabilities p(A) be given by µ as p(A) = µ(A)/µ(X) (“Principle of indifference for continuum sized sets” [22][p. 159]). But there are infinitely many measures µ on S that could in principle be taken as ones that define a probability p. Which one should be singled out that yields a p that could in principle be interpreted as expressing epistemic indifference about elements in X? This crucial question remains unanswered in [22]. It is clear that without some further structure on an infinite X it is not possible to single out any probability measure on S and hence it is impossible to formulate an indifference principle on such a measurable space. The formulation of the Ele- mentary Principle of Indifference in terms of the permutation group Πn gives a hint about what kind of structure is needed in the more general case however: It is a natural idea to try to replace the permutation group Πn by another group G to be interpreted as expressing epistemic neutrality and hope that the elements g of G determine a function αg : X → X (an action on X) in such a way that if one requires the analogue of (2) by postulating ∗ ∗ for all g ∈ G : p (A) = p (αg[A]) for all A ∈ S (3)
then the above condition (3) determines a unique probability measure p∗ on S, just like in the case of a finite number of events. Problem is that for a general measurable space (X, S) with a continuum sized X there is no guarantee in general that a G exist leading to a p∗ – much less that it leads to a unique p∗. There is however such a guarantee under some additional assumptions: If X itself is a topological group satisfying certain conditions. If X is a locally compact abelian topological group, or a not necessarily abelian but compact topological group, then there exists a unique (up to multiplication by a constant) measure (called: the Haar measure) pH on (the Borel sets of) X which is invariant with respect to the group action. Furthermore, if X is compact then the measure pH is normalized and pH is then a probability measure. (The Appendix collects some elementary facts about the Haar measure; equation (29) in the Appendix formulates the invariance of the Haar measure precisely). The canonical example of an unbounded Haar measure is the Lebesgue measure on the real line: the Lebesgue measure is the unique measure on the real line that is invariant with respect to the real numbers as an additive group – the group action is the shift on the real line. The same holds for the Lebesgue measure on Rn. The normalized restrictions of the Lebesgue measure on Rn to bounded, compact subsets of Rn are thus distinguished by the feature that they originate from a shift-invariant measure; moreover, the Lebesgue measure on any interval [a, b] also can be regarded as Haar measure in its own right and the same holds for n Rn sets ×i [ai, bi] in (cf. Appendix). Both the original Bertrand’s Paradox and the simplified versions of it take the normalized restriction of the Lebesgue measure to some bounded, compact sets in Rn (n = 1, 2) as the measure that expresses the Principle of Indifference. This amounts to interpreting (more or less tacitly) the group that generates the Lebesgue measure as a symmetry expressing epistemic neutrality about the elementary events.
5 12-14 June 2012; Mugla, Turkey -31-
Thus in general, the group action on X determined by X itself as a group can play the role of the action of the permutation group on Xn, and the Haar measure pH on a compact X is the analogue of the uniform distribution on Xn if a non-zero uniform distribution on the elements X does not exist, which is the case if X is an infinite set. Note that taking the Haar measure as the analogue of the uniform distribution is also justifiable using maximum entropy techniques (see [11]). In what follows, (X, S, pH ) stands for a probability measure space in which X is a compact topological group with continuous group action, S is the Borel σ algebra on X and pH is the Haar measure on S. In the terminology of these group and measure theoretic notions the general classical interpretation of probability and the related principle of indifference can be consistently formulated generally as follows: General Classical Interpretation: If X is a compact topological group, then the probabilities of the events are given by the Haar measure pH on (the Borel sets of) X and (Frequency Link:) the numbers pu(A) will be (approximately) equal to the relative frequency of A occurring in a series of trials producing elementary random events from X. General Principle of Indifference: If X is a compact topological group and if the group action expresses epistemological indifference about the elementary random events in X, then the General Classical Interpretation is correct.
4 Labeling Irrelevance
Part of the intuition ingrained in the classical interpretation of probability is what can be called Labeling Irrelevance. Intuitively, the Labeling Irrelevance states that from the perspective of the values of the probabilities it does not matter how the events are named: re-naming them should not change their probability. To formulate this idea precisely, we need the notion of re-labeling (re-naming) first: ′ ′ ′ If (X, S, pH ) and (X , S , pH ) are two probability spaces describing the same phe- nomenon then the map h: X → X′ is called a re-labeling if it is a bijection between X and X′ and both h and its inverse h−1 are measurable, i.e. it holds that
′ h[A] ∈ S for all A ∈ S (4) − ′ h 1[B] ∈ S for all B ∈ S (5)
(Here h[A] = {h(x): x ∈ A} and h−1[A′] = {h−1(x′): x′ ∈ A′}.) Note that without the measurability condition required of h it can happen that a general event A ∈ S has probability but its re-named version h[A] does not – in this case h cannot be called re-naming of random events (and similarly for h′). Labeling Irrelevance is the claim that from the perspective of probabilities (un- derstood in the spirit of the classical interpretation), naming is irrelevant; that is to ′ ′ ′ say, if (X, S, pH ) and (X , S , pH ) are two probability spaces and h is a re-labeling between X and X′ then it holds that
′ pH (h[A]) = pH (A) for all A ∈ S (6) −1 ′ ′ ′ ′ ′ pH (h [A ]) = pH (A ) for all A ∈ S (7)
Recall (see e.g. [1][p. 3]) that two probability measure spaces (X, S, p) and (X′, S′, p′) are called isomorphic if there are sets Y ∈ S and Y ′ ∈ S′ such that p(Y ) = 0 = p′(Y ′) and there exists a bijection f :(X \ Y ) → (X′ \ Y ′) such that both f and its inverse
6 Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -32-
f −1 are measurable and such that both f and f −1 preserve the measure p and p′, respectively; i.e. (8)-(9) below hold:
′ p (f[A]) = p(A) for all A ∈ S (8) − ′ ′ ′ ′ ′ p(f 1[A ]) = p (A ) for all A ∈ S (9)
The function f is called then an isomorphism between the probability measure spaces. Labeling Irrelevance can therefore be expressed compactly by saying
Labeling Irrelevance: Any re-labeling between probability spaces (X, S, pH ) and ′ ′ ′ (X , S , pH ) is an isomorphism between these probability spaces.
5 General Bertrand’s Paradox
Labeling Irrelevance is obviously a very strong claim and Bertrand’s paradox can be interpreted as the proof that it cannot be maintained in general (see below). But why would one think that Labeling Irrelevance holds in the first place? The answer is: because Labeling Irrelevance does hold for an infinite number of probability spaces: for probability spaces with any finite number elementary events probabilities of which are given by the uniform probability measure. A bijection h ′ between two finite sets Xn and X = Xm of elementary events exists if and only if the sets Xn and Xm have the same number of elements, n = m, and this entails that the two uniform distributions on those equivalent sets will assign the same probability to A and h[A] (and to A′ and h−1[A′]) – no Bertrand’s Paradox can arise in this case. Since the intuition about probability theory was shaped historically by situations involving only a finite number of random events, it is not surprising that Labeling Irrelevance became part of the intuition about probability. It turns out however that this intuition is a poor guide if the set of elementary events is not finite: This is precisely what Bertrand’s Paradox shows, general form of which is the following statement: ′ ′ ′ General Bertrand Paradox: Let (X, S, pH ) and (X , S , pH ) be probability spaces with compact topological groups X and X′ having an infinite number of elements ′ and pH , pH being the respective Haar measures on the Borel σ algebras S and ′ ′ S of X and X . Then Labeling Irrelevance does not hold for (X, S, pH ) and ′ ′ ′ (X , S , pH ) in the sense that • either there is no re-labeing between X and X′; • or, if there is a re-labeling between X and X′, then there also exists a re- labeling that violates Labeling Irrelevance. The General Bertrand’s Paradox is a trivial consequence of the following non- trivial theorem in measure theory: Proposition 1 ([24], [21]). If X is an infinite, compact topological group with the Haar measure pH on the Borel σ algebra S of X, then there exists an autohomeo- morphism θ of X and an open set E in S such that pH (θ[E]) 6= pH (E). By definition an autohomeomorphism θ of X is a bijection from X into X such that both θ and its inverse θ−1 are continuous. Since continuous functions are Borel measurable, an autohomeomorphism is a re-labeling: a re-labeling of X in terms of ′ ′ ′ its own elements. Assume now that (X, S, pH ) and (X , S , pH ) are two probability ′ spaces with infinite, compact topological groups X and X and Haar measures pH
7 12-14 June 2012; Mugla, Turkey -33-
′ ′ ′ and pH . If h: X → X is a re-labeling between X and X then either h is an isomorphism between the probability spaces (i.e. preserves the probability in the sense of (6)-(7)) or it is not. If it is not, then Labeling Invariance is violated by h. If h does preserve the probability (and is thus an isomorphism between (X, S, pH ) ′ ′ ′ and (X , S , pH )) then by Proposition 1 there exists an autohomeomorphism θ on X and there exists an open set E ∈ S such that pH (θ[E]) 6= pH (E). This means that for the re-labeling given by the composition h ◦ θ we have ′ ′ pH ((h ◦ θ)[E]) = pH (h[θ[E]]) = pH (θ[E]) 6= pH (E) (10) so the re-labeling h ◦ θ violates (6) and thus h ◦ θ violates Labeling Invariance. In either case Labeling Invariance is violated. Furthermore, the autohomeomor- phism ensured by Proposition 1 provides a re-labeling of the elementary set of events of any infinite compact group in terms of its own elementary events in such a way that the Haar measure yielding the probabilities of the events in the spirit of the classical interpretation are not preserved under the re-labeling. The General Bertrand’s Paradox is thus a general feature of infinite probability measure spaces with the Haar measure yielding the probabilities, and note that it says more than the original Bertrand’s Paradox, which only claimed that there exist Haar measures and re-labelings that violate Labeling Irrelevance: The General Bertrand’s Paradox says that no two Haar probability spaces can satisfy Label- ing Irrelevance; i.e. if there is at all a re-labeling between two probability spaces ′ ′ ′ ′ (X, S, pH ) and (X , S , pH ) with infinite X and X then there is also re-labeling be- tween these spaces that violates Labeling Invariance, and for any space (X, S, pH ) with an infinite X there exists a space (namely itself) and a self-re-labeling of (X, S, pH ) that violates Labeling Invariance. Thus Bertrand’s 1888 Paradox can be viewed as the specific “Lebesgue measure case” of a mathematical theorem that was proved in full generality in 1993 only. We close this section by giving an explicit, elementary example of violation of Labeling Invariance; this example will be referred to in the next section. In a well-defined sense (explained in Remark 1) the example is general. Example Let [a, b] and [c, d] be two closed intervals of the real numbers and
([a, b], S[a,b], p[a,b]) and ([c, d], S[c,d], p[c,d])
be the two probability spaces with p[a,b] and p[c,d] being the normalized Lebesgue measures on the intervals [a, b] and [c, d], with S[a,b] and S[c,d] being the Borel measur- able sets of the respective intervals. Elementary algebraic calculation and reasoning show that one can choose the parameters α, β and γ in the definition of the simple quadratic map h defined on the real line by h(x) = αx2 + βx + γ (11) in such a way that h maps [a, b] to [c, d] bijectively and both h and its inverse are continuous hence (Borel) measurable. Thus (the restriction to [a, b] of) h is a re-labeling between ([a, b], S[a,b], p[a,b]) and ([c, d], S[c,d], p[c,d]). Specifically, the pa- rameters below have this feature d − c α = (12) (b − a)2 d − c β = −2a (13) (b − a)2 d − c γ = a2 + c (14) (b − a)2
8 Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -34-
Furthermore, if ǫ is a real number such that [a, a + ǫ] ⊆ [a, b] then ǫ p ([a, a + ǫ]) = [a,b] b − a
d−c 2 and since h takes [a, a + ǫ] into [c, c + (b−a)2 ǫ ] one has
1 d − c 2 p[c,d](h [a, a + ǫ] ) = c + 2 ǫ £ ¤ d − c³ (b − a) ´ It is clear then that for many ǫ
ǫ 1 d − c 2 p[a,b]([a, a + ǫ]) = 6= c + 2 ǫ = p[c,d](h [a, a + ǫ] ) (15) b − a d − c³ (b − a) ´ £ ¤ which is a violation of Labeling Irrelevance.
Remark 1. Note that the above example is typical in the following sense: A proba- bility measure space is called a standard probability space if X is a complete, separa- ble metric space and S is the Borel σ algebra of X. Standard, non-atomic probability spaces are isomorphic to ([a, b], L[a,b], p[a,b]) with some interval [a, b] where L[a,b] is the algebra of Lebesgue measurable sets in [a, b] (see [1][Chapter 1, p. 3]). Hence the above example gives a large number of re-labelings that violate Labeling Ir- relevance in the category of spaces (X, S, pH ) with X being a complete, separable metric space. This covers all the spaces that occur in connection with Bertrand’s Paradox.
6 Attempts to save Labeling Irrelevance
One may attempt to defend Labeling Irrelevance by trying to block the emergence of Bertrand’s Paradox. The previous section makes it clear what the possible strate- gies are to achieve this: One can impose some extra condition on re-labelings that entails either that re-labelings satisfying the extra conditions do not exist (Strategy A) or that the re-labelings satisfying the additional conditions force the re-labelings to be isomorphisms of the probability spaces (Strategy B). Although not formu- lated in this terminology, Bangu’s recent attempt [2] is an example of Strategy A. We show below that Bangu’s suggestion for Strategy A is ambiguous however and that resolving the ambiguity makes it either a trivial case of Strategy B or is unsuc- cessful. A successful implementation of Strategy B is to say that it is unreasonable to expect a re-labeling to preserve probabilities unless the re-labeling also preserves our epistemic status with respect to the elementary events: after all, the Principle of Indifference states that pH is the correct probability only if the group structure of X expresses epistemic neutrality. So the following stipulation is in the spirit of the Principle of Indifference: ′ ′ ′ Definition: The re-labeling h between probability spaces (X, S, pH ) and (X , S , pH ) preserves the epistemic status if it is a group isomorphism between X and X′. ′ Since the probability measures pH and pH are completely determined by the respective group actions, re-labelings that preserve the epistemic status are isomor- phisms between the measure spaces, hence no Bertrand’s Paradox can arise with respect to such re-labelings; furthermore, not every re-labeling is a group isomor- phism – thus this strategy works in a non-trivial way.
9 12-14 June 2012; Mugla, Turkey -35-
Bangu’s suggestion is that one should only expect Labeling Irrelevance to hold for bijections that “preserve randomness” – this is his Assumption R – Bertrand’s paradox is only a paradox in his view if Labeling Irrelevance is violated by re- labelings satisfying the randomness condition, which, he claims, has not been shown and the burden of proof is on those who claim such re-labelings exist. It is clear from the wording of his paper that he conjectures that no such proof can be given, i.e. that no randomness preserving re-labelings exist that violate Labeling Irrelevance (i.e. that he is following Strategy A). As Bangu also points out, the notion of randomness is notoriously both vague and rich: the adjective “random” can be applied to different entities (events, processes, dynamics, ensembles etc.), it can come in the form of a pre-theoretical informal intuition, in form of precise mathematical definitions, and it also can come in de- grees. Thus one has to be very careful and specific when it comes to the problem of whether “randomness is preserved” under a re-labeling of the elementary events. Bangu leaves it deliberately open in what sense precisely “randomness” might not be invariant under re-labeling of the random events; hence his suggestion remains somewhat vague. No matter what kind of notion of randomness one has in mind, if it is to be relevant for probabilistic modeling of a phenomenon, then it must be expressible in terms of probabilities, since the basic principle guiding the modeling of phenomena by probability theory is the maxim: Distribution Relevance: “A property is probability theoretical if, and only if, it is describable in terms of a distribution” [15][p. 171]. In the spirit of Distribution Relevance one can take the position that ran- domness of a phenomenon expressed by “randomness” of the random variables that describe the phenomenon are encoded in the distribution of the random variables. Consequently, under this interpretation of randomness, if one is given two proba- bility models (X, S, p) and (X′, S′, p′) of a given phenomenon and h: X → X′ is a re-labeling between (X, S, p) and (X′, S′, p′), then h preserves the randomness of the two probabilistic descriptions if and only if it holds that if f : X → R is any ′ −1 random variable in (X, S, p) with distribution p ◦ f −1 then the distribution p′ ◦ f in (X′, S′, p′) of the re-named random variable f ′ = f ◦ h−1 coincides with p ◦ f −1:
′ − − − p ◦ (f ◦ h 1) 1 (d) = p ◦ f 1 (d) for all d ∈ B(R) (16) ¡ ¢ ¡ ¢ and conversely: for every random variable g′ : X′ → R which is the re-named version − of a random variable g = g′ ◦ h in (X, S, p) it holds that the distribution p′ ◦ g′ 1 in (X′, S′, p′) of g′ and the distribution p ◦ g−1 of g = g′ ◦ h in (X, S, p) coincide:
′ − ′ ′− (p ◦ (g ◦ h) 1)(d) = (p ◦ g 1)(d) for all d ∈ B(R) (17)
Since the random events themselves are random variables, the two equations (16)- (17) must hold for every characteristic function χA (A ∈ S) in place of f and every ′ ′ ′ characteristic function χA′ (A ∈ S ) in place of g as well, so this requirement of preserving randomness amounts to the demand that the following two equations hold:
′ p (h[A]) = p(A) for all A ∈ S (18) − ′ ′ ′ ′ ′ p(h 1[A ]) = p (A ) for all A ∈ S (19)
which is precisely Labeling Irrelevance (eqs. (6)-(7)). So, if “preserving random- ness by re-labeling” in Assumption R is understood in the spirit of Distribution
10 Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -36-
Relevance as conditions (16)-(17) then the only randomness-preserving re-labelings are the isomorphisms and no Bertrand paradox can arise indeed – requiring preserv- ing randomness in this sense is equivalent to the requirement that the re-labelings are isomorphism, Strategy A, so interpreted, is trivial. One can try to argue that this is an extremely strong interpretation of “pre- serving randomness” and that randomness also can be interpreted differently as ex- pressed by some other property Φ(p) of the probability measure p. For instance, one has the intuition that a probability measure sharply concentrated on a single point in X is far less “random”, it represents much more certainty by having zero vari- ance than a probability distribution that has a large variance. The usual (Shanon) entropy of a probability measure also can be taken as a measure of “randomness” of the phenomenon that the probability model describes [4][p. 61-62]. Thus one can interpret the requirement of “preserving randomness under re-labeling” in As- sumption R in different ways depending on what property Φ one chooses: ′ ′ ′ Assumption R[Φ]: If (X, S, pH ) and (X , S , pH ) are two probability spaces and h is a re-labeling between X and X′ then we say that Assumption R[Φ] is satisfied ′ if both Φ(pH ) and Φ(pH ) hold. It is clear then that if there is a property Φ of randomness of a probability mea- ′ ′ ′ sure and there exists probability spaces (X, S, pH ) and (X , S , pH ) with a re-labeling h: X → X′ such that Assumption R[Φ] is satisfied but Labeling Irrelevance is violated by h then Bertrand’s paradox re-emerges. The variance and the entropy are such properties: Consider the probability spaces ([a, b], S[a,b], p[a,b]) and ([c, d], S[c,d], p[c,d]) described in the Example in section 5. The variance σ(p[a,b]) of the normalized Lebesgue measure p[a,b] on any interval [a, b] is by definition equal to
b b 2 2 1 2 1 (b − a) σ(p[a,b]) = x dx − xdx = (20) Za b − a h Za b − a i 12
and the entropy E(p[a,b]) of p[a,b] is by definition
b E(p[a,b]) = − x log(x)dx = log(b − a) (21) Za It follows then that if b − a = d − c = t then
t2 σ(p ) = σ(p ) = (22) [a,b] [c,d] 12 E(p[a,b]) = E(p[a,b]) = log(t) (23)
On the other hand, the map h defined in the Example remains a re-labeling even if b − a = d − c and Labeling Irrelevance is violated by this map because for b − a = d − c = t eq. (15) entails that for many ǫ we have
ǫ 1 1 2 p[a,b]([a, a + ǫ]) = 6= c + ǫ = p[c,d](h [a, a + ǫ] ) (24) t t ³ t ´ £ ¤
Thus Bertrand’s paradox re-emerges: The probability space ([c, d], S[c,d], p[c,d]) can be regarded as a re-named version of the probability space ([a, b], S[a,b], p[a,b]) via the re-labeling h defined by (11) and (12)-(14), furthermore, if b − a = d − c then this re-labeling satisfies Assumption R[Φ] with Φ being the variance or entropy, and because of (15) h violates Labeling Irrelevance (6)-(7).
11 12-14 June 2012; Mugla, Turkey -37-
One also can try to question Distribution Relevance. But if one gives up Distribution Relevance and interprets “randomness” in a way that makes ran- domness not expressible exclusively in terms of the distributions involved, then the appropriately modified Assumption R constrains even less the emergence of Bertrand’s Paradox. Rowbottom and Schackle [20] take Assumption R to be (a technically undefined) “unpredictability” and argue (informally) that there are re- labelings that preserve “unpredictability” and which are not isomorphisms, contrary to what Bangu [2] seems to conjecture. As a technically more explicit example, as- R sume that a dynamic {αt : t ∈ } is given on ([a, b], S[a,b], p[a,b]) and a dynamic ′ R ′ {αt : t ∈ } is given on ([c, d], S[c,d], p[c,d]), where αt and αt are one parameter groups of measure preserving maps on [a, b] and [c, d] respectively. As randomness ′ of the dynamical systems ([a, b], S[a,b], p[a,b]), {αt}) and ([c, d], S[c,d], p[c,d]), {αt}) one can take the randomness of the respective dynamics such as ergodicity, or mixing, which are not expressible in terms of p[a,b] and p[c,d] only. Given the re-labeling h between ([a, b], S[a,b], p[a,b]) and ([c, d], S[c,d], p[c,d]) described in the Example in section 5 that violates Labeling Irrelevance one can then specify the dynamics ′ {αt} and {αt} in such a way that they are both ergodic, [4][p. 34], generating a ′ Bertrand’s Paradox, or in such a way that {αt} is ergodic whereas {αt} is not, which would be a violation of preserving randomness (Assumption R) hence not a case of Bertrand’s Paradox (according to Bangu’s requirement) – anything is possible under such a dynamical interpretation of randomness. Thus the emergence of Bertrand’s paradox cannot be blocked in a non-trivial way by requiring the paradoxical examples to satisfy the randomness test and showing that they cannot pass this test: unless one requires in effect that the re-labeling be an isomorphism, Bertrand’s Paradox emerges: If Distribution Relevance is accepted and randomness is interpreted as measured by the variance or entropy of the probability measures then elementary examples can be given that show violation of Labeling Irrelevance. If Distribution Relevance is abandoned then the randomness requirement can be satisfied even more easily.
7 Comments on the classical interpretation
While Bertrand’s Paradox shows that Labeling Irrelevance cannot be maintained in general, this does not undermine, in and by itself, either the classical interpreta- tion of probability or the Principle of Indifference: It is clear from the discussion in the previous sections that the Principle of Indifference and Labeling Irrel- evance are independent ideas: One can in principle maintain the classical interpre- tation based on the Principle of indifference and reject Labeling Invariance completely or restricting it to the domain in which it holds: in the category of prob- ability measure spaces with a finite number of random events, or to re-labelings that preserve the epistemic status. Thus Bertrand’s Paradox is defused; however this is not to be taken as defence of the classical interpretation. The classical interpretation is deeply problematic for simple, non-technical reasons that are related to the general issue of how one should view the status of probability theory. One has to distinguish applications of probability theory from interpretations of probability as this latter term is used in philosophy of science. Probability theory is part of pure mathematics in the first place. In an application of probability theory one relates the mathematical elements in a triplet (X, S, p) to non-mathematical
12 Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -38-
entities. This involves two tasks: Event Interpretation To specify what the elements in X and S stand for. Truth Interpretation To clarify when the proposition “p(A)=r” is true/false. In an application, probability theory thus becomes a mathematical model of a cer- tain phenomenon that is external to mathematics. A probability measure space is a good model of the phenomenon if it has two features: descriptive accuracy and predictive success. Descriptive accuracy means that under the fixed specification of the Event and Truth Interpretations propositions such as p(A) = r are true about events that have been observed in the past. Predictive success means that the prob- abilistic propositions p(A) = r will be true in future observations. It is clear that both descriptive correctness and predictive success are robustly empirical features; hence, whether a probability space is a good model is a question that can be an- swered only on the basis of empirical considerations. This is of course not new, there is nothing peculiar or mysterious about probabilistic modeling, probabilistic scientific theories are just like any scientific theory from this perspective. 1 The mathematical notion of isomorphism between probability measure spaces is in complete harmony with the application of probability theory – and so is the General Bertrand Paradox: The Event Interpretation and Truth Interpretation are conceptually different issues, the former does not determine the latter, and, ac- cordingly, two probability spaces are defined to be isomorphic if two conditions are satisfied: the random events in the two spaces are connected by a re-labeling and the re-labeling preserves the probabilities. From the perspective of the notion of isomorphisms of probability spaces finite probability spaces with the uniform prob- ability measure just happen to have the “contingent” feature that in this category re-labelings are isomorphisms; in this case the re-labelings contain enough informa- tion to make them isomorphisms. Interpretations of probability are typical classes of applications of probability theory, classes consisting of applications that possess some common features, which the interpretation isolates and analyzes. The main problem with the Classical In- terpretation (understood with the amendment of the Principle of Indifference) is that it disregards the empirical character of the applications of probability the- ory and gives the impression that descriptive accuracy and predictive success in applications are based on (and can be ensured by referring to) an priori-flavored principle that expresses some sort of epistemic indifference about random events. But this is not possible, which is shown by the difficulty (often pointed out in con- nection with the Principle of Indifference [9]) that it is unclear how to specify the precise content of “epistemic neutrality” in such a way that the Principle of Indifference does not become circular and holds nevertheless: The Principle of Indifference holds only if epistemic neutrality does entail that the probabilities of the events given by the uniform probability measure will be equal to the frequencies of events in actual trials producing elementary random events, and such a conclu- sion cannot be validly based on a priori considerations – if it could, the Principle of Indifference would have solved the problem of induction.
1Although Marinoff [16] does not emphasize the empirical aspect of probabilistic modeling, his resolu- tion of Bertrand’s Paradox is essentially in the spirit of probabilistic modeling described here: Marinoff distinguishes different types of random generators representing different types of randomness and notes that, depending on which random generator produces the random events featuring in a Bertrand Paradox type situation, one obtains different probability distributions – there is nothing paradoxical about this.
13 12-14 June 2012; Mugla, Turkey -39-
One might say that the classical interpretation and the Principle of Indiffer- ence should be taken not with the Frequency Link but with the Degree of Belief Link, according to which pH should be viewed as representing degrees of belief [5], [17]. To assess the viability of such an interpretation of the classical interpretation one has to distinguish two further specifications of the notion of degree of belief: descriptive and normative. In the descriptive interpretation the claim is that pH does represent the degree of belief of a particular person (or a specific group of people) about random events happening if the persons are epistemologically neutral about the events. Whatever the precise content of this epistemological neutrality, this descriptive interpretation of the degrees of belief is again an empirical claim about the thinking and behavior of certain people, which may or may not be true; testing it (including testing if the people in question have degrees of belief indeed) is a matter for empirical psychology – but this interpretation has little to do with how probability theory is applied in the sciences. In the normative interpretation pH is declared to stand for the rational degrees of belief of an abstract person (agent) if the agent is epistemologically neutral about the elementary events. In this case one has to ask in what sense and why pH represents rational degrees of belief? One answer can be that pH is rational if (X, S, pH ) is a good model of a certain phenomenon in the sense described earlier in this section and a rational agent’s belief better be in harmony with the probabilities provided by a good model. This interpretation of rationality of pH is essentially the content of the Principle Principle [14] and, while it is very natural, one should realize that pH features in it in two roles: (i) standing for the degree of belief and (ii) representing some extra-mental, non-degree-of-belief-type quantities (for instance frequencies or some other dimensionless physical quantities [23]) with which the degrees of belief are required to be equal. Thus this interpretation reduces the Degree of Belief Link to another Interpretive Link and thereby the rationality (or otherwise) of an agent’s degree of belief is made again dependent on empirical matters. But then it does not matter from the perspective of rationality of the degrees of belief whether the agent is epistemically neutral about the elementary events or not, because the correctness of the probabilistic model is an empirical matter that cannot be ensured on the basis of an a priori neutrality, and probability measures different from pH can very well be rational if they satisfy the Principle Principle and the probabilistic model is good. Another possible specification of rationality of the agent’s degrees of belief can be that they are consistent, i.e. that pH satisfies the axioms of probability. Obviously, this does not single out pH as the only rational probability. In sum: Bertrand’s Paradox interpreted as violation of Labeling Irrelevance does not undermine the classical interpretation of probability understood with the Principle of Indifference, and violation of Labeling Irrelevance is in complete harmony with how mathematical probability theory is used in the sciences to model phenomena; yet, irrespective of Bertrand’s Paradox, the content of the Principle of Indifference cannot be specified in such a way that it can establish the classi- cal interpretation of probability as descriptively accurate, predictively successful or rational.
Appendix
This Appendix recalls some elementary facts about the Haar measure. Standard references for the Haar measure are [18] and [10][Chapter XI.], for a more recent presentation see [6].
14 Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -40-
X is called a topological group with multiplication (x, y) 7→ x · y and inverse x 7→ x−1 if the map (x, y) 7→ x−1 · y is continuous (x, y ∈ X). A measure p on the Borel algebra S of the group X is called left invariant (respectively right invariant) with respect to the group action if eq. (25) (respectively eq. (26)) below hold
p(A) = p(xA) for all x ∈ XA ∈ S (25) p(A) = p(Ax) for all x ∈ XA ∈ S (26)
where for an x ∈ X, the sets xA and Ax are defined by
xA = {x · y : y ∈ A} (27) Ax = {y · x : y ∈ A} (28)
The measure p is called invariant if it is both left and right invariant, i.e. if
p(A) = p(xA) = p(Ax) for all x ∈ XA ∈ S (29)
L On any locally compact topological group there exists both a left pH and a right R pH invariant Haar measure and they are unique up to multiplication by a constant. The left and right invariant Haar measures are in general different. Since both the left and Haar measure is unique up to constant multiplication, and since for . L any x ∈ X the measure px(A) = pH (Ax) is again a left invariant measure, there L exists a real number ∆(x) such that px(A) = ∆(x)pH (A). The map x 7→ ∆(x) is called the modular function of the group. If ∆(x) = 1 for all x, then the groups are called unimodular; for unimodular groups the left and right invariant Haar measures coincide and yield an invariant measure. Compact and locally compact abelian groups are unimodular. The Haar measure is bounded if and only if X is compact – the Haar measure is then a probability measure. The canonical examples of unbounded Haar measures are the Lebesgue measure on the real line and the Lebesgue measure on Rn. It is shown below that the nor- Rn n malized restrictions of the Lebesgue measure on to subsets of the form ×i [ai, bi) in Rn also can be regarded as Haar measures in their own right with respect to a n compact group G. This entails that the Lebesgue measure on the closed set ×i [ai, bi] also can be viewed as a Haar measure with respect to G because the Lebesgue mea- n n sure space over ×i [ai, bi) and over ×i [ai, bi] are isomorphic. (Note that G is not the shift; it cannot be since shifted subsets of [0, 1) are not necessarily subsets of [0, 1) and the group of “shifts modulo 1” do not form a topological group due to discontinuity of the “shift modulo 1” operation.) Since [0, 1) can be mapped onto [a, b) by a continuous linear bijection connecting the (normalized) Lebesgue mea- sures on the intervals [0, 1) and [a, b), to see how the Lebesgue measure on [a, b) is a Haar measure in its own right, it is enough to see how the (normalized) Lebesgue measure p[0,1) on the interval [0, 1) emerges as a Haar measure. Let
S1 = z ∈ C : |z| = 1 © ª be the unit circle on the complex plane. As S1 is a compact topological subgroup of C with the multiplication of complex numbers as the group operation, there exists 1 a normalized Haar measure pH on S . The exponential function f defined by
f : [0, 1) → S1, f(t) = e2πit
is a continuous and continuously invertible bijection between the unit interval [0, 1) and the unit circle S1; hence both f and its inverse are measurable. We claim that
15 12-14 June 2012; Mugla, Turkey -41-
f is a measure theoretic isomorphism between the interval [0, 1) with the Lebesgue 1 measure on it and S with the measure pH on it; i.e. that
−1 pH = p[0,1) ◦ f (30)
To verify (30), by the uniqueness of Haar measures, it is enough to show that −1 −1 p[0,1) ◦ f is a Haar measure, i.e. that p[0,1) ◦ f is invariant with respect to the group operation in S1, which is the multiplication of complex numbers. Since the exponential function f turns addition of real numbers into multiplication of complex numbers, for B ⊂ S1 and z ∈ C we have
− f 1(B · z) = f[B] + t mod 1 (31)
where the translation Y 7→ Y + t mod 1 (32) is the standard shift of set Y ⊂ [0, 1) by t followed by “pulling back” into [0, 1) the part of Y that is shifted out of the bounds of [0, 1); formally:
Y + t mod 1 = Y ∩ [0, 1 − t) + t ∪ Y ∩ [1 − t, 1) − (1 − t) ¡ ¢ ¡ ¢ p[0,1) is translation invariant on [0, 1) in the sense that for any measurable set A ⊆ [0, 1) and 0 ≤ t < 1 we have
p[0,1)(A) = p[0,1)(A + t mod 1),
so we have
−1 −1 −1 pH (B · z) = p[0,1)(f (B · z)) = p[0,1)(f (B) + t mod 1) = p[0,1)(f (B)) = pH (B)
n n The Lebesgue measure p[0,1) on the n-dimensional cube [0, 1) also can be regarded n as a Haar measure: one can consider the Haar measure pH on the n-dimensional torus T n = S1 × S1 × · · · × S1 (n times) which is a compact topological subgroup of Cn with the coordinate-wise multipli- cation of complex numbers as group operation. Put
n n 2πit0 2πitn f : [0, 1) → T , f(t0, . . . , tn) = e , . . . , e ¡ ¢ Then f is a continuous and continuously invertible bijection and, applying the previous argument in each coordinates, one concludes
n n −1 pH = p[0,1) ◦ f
References
[1] J. Aaronson. An Introduction to Infinite Ergodic Theory, volume 50 of Math- ematical Surveys and Monographs. American Mathematical Society, Rhode Island, 1997. [2] S. Bangu. On Bertrand’s Paradox. Analysis, 70:30–35, 2010. [3] J.L.F. Bertrand. Calcul de Probabilit´es. Gauthier-Vilars, Paris, 1888.
16 Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -42-
[4] P. Billingsley. Ergodic Theory and Information. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, New York, London Sydney, 1965. [5] P. Castell. A consistent restriction of the Principle of Indifference. The British Journal for the Philosophy of Science, 49:387–395, 1998. [6] A. Deitmar and S. Echterhoff. Principles of Harmonic Analysis. Universitext. Springer, New York, 2009. [7] J. Doob. The development of rigor in mathematical probability theory (1900- 1950). American Mathematical Monthly, pages 586–595, 1996. [8] H. Fischer. A History of the Central Limit Theorem: From Classical to Modern Probability Theory. Sources and Studies in the History of Mathematics and Physical Sciences. Springer, New Yor, Dordrecht, Heidelberg, London, 2011. [9] A. H´ajek. Interpretations of probability. The Stanford Encyclope- dia of Philosophy (Summer 2012 Edition), Edward N. Zalta (ed.), http://plato.stanford.edu/archives/sum2012/entries/probability-interpret/, 2012. accessed May 29, 2012. [10] P. Halmos. Measure Theory. D. Van Nostrand, New York, 1950. [11] P. Harremo¨es. Maximum entropy on compact groups. Entropy, 11:222–237, 2009. [12] E. Jaynes. The Well Posed Problem. Foundations of Physics, 4:477–492, 1973. [13] A.N. Kolmogorov. Grundbegriffe der Wahrscheinlichkeitsrechnung. Springer, Berlin, 1933. English translation: Foundations of the Theory of Probability, (Chelsea, New York, 1956). [14] D. Lewis. A subjectivist’s guide to objective chance. In Philosophical Papers, vol. II, pages 83–132. Oxford University Press, Oxford, 1986. [15] M. Lo´eve. Probability Theory. D. Van Nostrand, Princeton, Toronto, London, Melbourne, 3rd edition, 1963. [16] L. Marinoff. A resolution of Bertrand’s Paradox. Philosophy of Science, 61:1– 24, 1994. [17] J.M. Mikkelson. A resolution of the wine/water paradox. The British Journal for the Philosophy of Science, 55:137–145, 2004. [18] L. Nachbin. The Haar Integral. D. Van Nostrand, Princeton, NJ, 1965. [19] J.S. Rosenthal. A First Look at Rigorous Probability Theory. World Scientific, Singapore, 2006. [20] D.W. Rowbottom and N. Schackel. Bangu’s random thoughts on Bertrand’s Paradox. Analysis, 70:689–692, 2010. [21] W. Rudin. Autohomeomorphisms of compact groups. Topology and its Appli- cations, 52:69–70, 1993. [22] N. Shackel. Bertrand’s Paradox and the Principle of Indifference. Philosophy of Science, 74:150–175, 2007. [23] L.E. Szab´o.Objective probablity-like things with and without objective inde- terminism. Studies in the History and Philosophy of Modern Physics, 38:626– 634, 2007.
17 12-14 June 2012; Mugla, Turkey -43-
[24] E. K. van Douwen. A compact space with a measure that knows which sets are homeomorphic. Advances in Mathematics, 52:1–33, 1984. [25] R. von Mises. Probability, Statistics and Truth. Dover Publications, New York, 2nd edition, 1981. Originally published as ‘Wahrscheinlichkeit, Statistik und Wahrheit’ (Springer, 1928). [26] J. von Plato. Creating Modern Probability. Cambridge Studies in Probability , Induction and Decision Theory. Cambridge University Press, Cambridge, 1994. [27] A. Wightman. Hilbert’s 6th problem. In F.E Browder, editor, Mathematical Developments Arising from Hilbert Problems: Proceedings, volume 28 of Pro- ceedings of Symposia in Pure Mathematics, pages 147–240. American Mathe- matical Society, 1983.
18
Seventh Quadrennial Fellows Conference of the Center for Philosophy of Science -44-