Florida State University Libraries

Electronic Theses, Treatises and Dissertations The Graduate School

2017

Philosophical, Historical, and Empirical Investigations into the Concept of Biological Fitness

Peter Takacs

FLORIDA STATE UNIVERSITY

COLLEGE OF ARTS AND SCIENCES

PHILOSOPHICAL, HISTORICAL, AND EMPIRICAL INVESTIGATIONS INTO THE

CONCEPT OF BIOLOGICAL FITNESS

By

PETER TAKACS

A Dissertation submitted to the Department of Philosophy in partial fulfillment of the requirements for the degree of Doctor of Philosophy

2017

Peter Takacs defended this dissertation on November 15, 2017. The members of the supervisory committee were:

Michael Ruse Professor Directing Dissertation

Joseph Travis University Representative

Michael Bishop Committee Member

James Justus Committee Member

The Graduate School has verified and approved the above-named committee members, and certifies that the dissertation has been approved in accordance with university requirements.


I dedicate this work of scholarly toil to my wife (Shannon) and children (Elek and Zsofia), without whom this project would either have been completed long ago or not at all. Their unconditional love and patience continue to exceed even my loftiest expectations. They have always been the most welcome “distraction” for this aspiring academic. I would also like to thank Michael Ruse and his wife Lizzie. Michael’s mentoring and friendship have not only forever changed my understanding of the world, but also serve as a constant reminder of how the modern Academy could and should operate. Finally, I would like to extend my love and gratitude to my parents. As immigrants to this country who spoke not a word of English upon arrival more than half a century ago, they somehow managed to create a comfortable and nurturing environment for themselves and their children. Their genuine toils remain a constant source of perspective. The encouragement that they have provided throughout my studies is likewise humbling. My debt to them is simply beyond measure.

TABLE OF CONTENTS

Abstract

1. INTRODUCTION

2. IS ORGANISMAL FITNESS A METAPHYSICAL EXCRESCENCE?

3. HONEST PROPENSITIES: IS THERE A CRACK IN THE FOUNDATION?

4. EVOLUTIONARY THEORY AND THE CHALLENGE OF EVO-DEVO

5. CONCLUSION

References

Biographical Sketch

ABSTRACT

While undeniably one of the central explanatory concepts in biology, fitness is deployed in an ambiguous or even inconsistent manner by evolutionary biologists as well as philosophers. This sort of foundational confusion is a plea for conceptual clarity and has, thereby, presented a wonderful opportunity for philosophers of science to ply their trade. After engaging with the topic, however, several influential philosophers of science (e.g., Mohan Matthen, Denis Walsh, and Andre Ariew) and biologists (Richard Lewontin and Massimo Pigliucci) have reached the conclusion that biological fitness is not in fact the cause of natural selection but instead a mere statistical artifact or redescription of systematic transgenerational change. It is, as they see matters, a label best reserved for abstract trait types rather than the organisms that bear such traits. This poses a serious challenge to the working intuitions of most biologists and many philosophers of biology. Moreover, it is but one of many challenges to the explanatory and ontological primacy of natural selection in recent memory. For at least three decades, some practitioners in the burgeoning subdiscipline of evolutionary developmental biology have been outspoken in insisting that the tools of population biology are insufficient for describing or explaining observations of adaptive evolutionary change both past and present. In this dissertation, I examine these recent challenges to orthodox conceptions of fitness and natural selection, as well as the rejoinders given in defense. Ultimately, I defend a conception of fitness as a probabilistic dispositional property (i.e., a propensity) of token organisms that causes natural selection.

CHAPTER 1

INTRODUCTION

Along with other concepts such as genotype, phenotype, population, species, natural selection, and adaptation, the notion of biological fitness (hereafter simply “fitness”) maintains a central role within the explanatory framework of evolutionary population biology. In population genetics, natural selection, for instance, is defined as systematic change in allele frequency due to observed differences in the relative fitness of extant genotypes (or allelotypes). In quantitative genetics, by contrast, it is nonrandom change in the mean phenotypic value that is of primary concern, where attributions of “nonrandomness” are made on the basis of whether a trait variant tends to enhance (or diminish) its bearer’s overall ability to deal with environmental selection pressures (i.e., fitness).
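The population-genetic usage just described can be made concrete with a minimal sketch. The Python fragment below assumes the simplest textbook case, which is not drawn from the text itself: a single haploid locus with two alleles, constant relative fitnesses, an effectively infinite population, and no mutation, migration, or drift. The function and parameter names are illustrative only.

```python
def next_allele_frequency(p, w_a1, w_a2):
    """Frequency of allele A1 after one generation of selection.

    p     -- current frequency of A1 (between 0 and 1)
    w_a1  -- relative fitness assumed for carriers of A1
    w_a2  -- relative fitness assumed for carriers of A2
    """
    # Mean fitness of the population weights each allele's fitness
    # by its current frequency.
    mean_fitness = p * w_a1 + (1 - p) * w_a2
    # A1's share of the next generation is its fitness-weighted share now.
    return p * w_a1 / mean_fitness


def trajectory(p0, w_a1, w_a2, generations):
    """List of A1 frequencies from generation 0 through `generations`."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_allele_frequency(freqs[-1], w_a1, w_a2))
    return freqs
```

On these assumptions, equal fitnesses leave the allele frequency unchanged, whereas any fitness difference, however slight, produces a systematic shift in one direction. Calling `trajectory(0.01, 1.05, 1.0, 200)`, for example, traces an initially rare allele with a five percent advantage as it rises toward fixation.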

This apparent uniformity of opinion surrounding fitness in various sub-disciplines of evolutionary biology belies a deep conceptual as well as methodological problem. For, despite its indisputable explanatory importance, the notion of fitness frequently reveals itself to be little more than an ill-understood explanatory posit. Take, by way of example, the following quotation from a prominent primer of ecological genetics: “Fitness is the key concept in understanding natural selection, because selection is caused by differences in fitness among individuals with different phenotypes. Unfortunately, fitness is very difficult to define.”1 As if this claim were not revealing enough, the ambiguity of fitness is further evinced in the following:

The y-axis [on a three-dimensional adaptive landscape] depicts fitness, which is a fundamental but difficult concept. In general, fitness refers to the ability of an organism to survive, reproduce, and thus have descendants in future generations. Unfortunately, fitness is difficult to define more specifically so that it can be measured and understood more clearly. Therefore, a number of different definitions of fitness have been proposed, each with strengths and weaknesses.2

1 Op.67. Conner, J.K. and D.L. Hartl. A Primer of Ecological Genetics. Sinauer Associates: 2004.
2 Ibid., p.191. Emphasis added.

Renowned evolutionary theorist Stephen C. Stearns has even made light of this situation with a tongue-in-cheek definition of the concept: “Fitness: something everyone understands but no one can define precisely.”3

An inquisitive reader might rightly wonder how a concept central to explanation in evolutionary biology remains so poorly understood. How has it come to be that the notion of fitness is routinely deployed in an ambiguous or even inconsistent manner by evolutionary biologists and philosophers alike? Foundational confusion of this sort is a plea for conceptual clarity and, thereby, presents a wonderful opportunity for philosophers of science to ply the tricks of their trade. Addressing this general worry, one which is conceptual or philosophical in nature, is the focus of this project. And, as always, the best place to begin is by examining the historical underpinnings of current controversy.

The Nineteenth Century

There is no mystery surrounding the origin of the concept. Prior to Charles Darwin’s On the Origin of Species by Means of Natural Selection; or the Preservation of Favoured Races in the Struggle for Life (1859), judgments pertaining to the fitness of individuals were made almost exclusively by way of reference to a given organism’s species membership. In large part, this was due to the then prevailing idea that species were “fixed” at the moment of divine creation. All species would accordingly remain unchanged throughout the history of the Earth within this worldview. Barring the occasional (divinely ordained) geological catastrophe, the Earth was pictured as a more or less static mosaic of environmental roles or niches, each one filled by God with its proper inhabitants. Species were immutable because they were divinely designed with the ability to optimally exploit their respective environments. This presupposition supposedly accounted for the appearance of adaptation, the overriding phenomenon for which explanation was sought. It is very much in this vein that the noted comparative anatomist and paleontologist Georges Cuvier (1769-1832) confidently claimed that “All organs of an animal form a single system, the parts of which hang together, and act and re-act upon one another; and no modifications can appear in one part without bringing about corresponding modifications in all the rest.”4

3 Stearns, S.C. Life-history tactics: a review of the ideas. Quarterly Review of Biology, 51: 3-47. 1976.

The underlying sentiment was unmistakable: any dramatic modifications would sully God’s creation and make the organism somehow unfit for its unique role, purpose, or function.5 The apparent “fittingness” of an individual organism within its local environmental circumstances depended critically on how closely it resembled the archetypal form of its species. Fitness thus pertained to species fitness or general “adaptedness,” God’s engineered design for each species, rather than any particular individual’s unique ability to navigate the vicissitudes of its surroundings.

This view is especially clear in the work of English clergyman William Paley, whose Natural Theology or Evidences of the Existence and Attributes of Deity (1802) had a tremendous formative influence on Darwin’s thinking. Darwin agreed with Paley that the outstanding challenge was one of how to explain adaptation or the appearance of design.

How have all those exquisite adaptations of one part of the organisation to another part, and to the conditions of life, and of one distinct organic being to another being, been perfected? We see these beautiful co-adaptations most plainly in the woodpecker and missletoe; and only a little less plainly in the humblest parasite which clings to the hairs of a quadruped or feathers of a bird; in the structure of the beetle which dives through the water; in the plumed seed which is wafted by the gentlest breeze; in short, we see beautiful adaptations everywhere and in every part of the organic world.6

4 Cuvier, G. Histoire des Progres des Sciences naturelles depuis. Vol.1, 310. 1789.
5 Further evidence for this worldview can be traced to the then prevailing use of the label “monstrosity.” It was typically applied to individuals who exhibited variations, abnormalities, or deformities of different sorts, especially malformed fetuses and the like. For more on the fascinating history of biological abnormality or deformity see Lorraine Daston and Katharine Park’s Wonders and the Order of Nature, 1150-1750. MIT Press, Cambridge, MA: 1998.

How has it come to be, for instance, that the mammalian eye has such intricate and purposive structure? Paley’s answer was that it was divinely designed by an omnibeneficent creator to perform a specific function (veridical visual perception) that contributes to human flourishing.

There was, for him, “divine selection” for the design of the mammalian eye at the very moment of creation. It is with this imposition of extramundane design, however, that Darwin would eventually part company with the natural theologians.

Darwin’s exodus from English natural theology was in no small part a direct response to two problems that accompany the belief that species were the fixed products of divine selection.

First, the rapidly increasing number of fossil remains revealed that the morphological structures of extant species more closely resembled those of spatially- and temporally-proximate extinct species than the remains of extinct species in geographically distant lands. This presents a problem if one believes, as those opposed to the transmutation of species did, that species in seemingly similar selective environments should exhibit a near resemblance in their (divine) design. Since such species face similar challenges to survival and reproduction, they should also evince the phenotypic traits that best enable them to respond to such selection pressures. A beneficent, all-knowing, and all-powerful God would presumably bestow optimal design upon his creations. But having more than one “optimal” design in any particular type of selective environment, no matter the spatial separation among the areas identified as being of a type, could be seen as indicating redundancy or indecision uncharacteristic of such a gifted architect. And yet this is precisely what is observed across geographically separate habitats that appear to exert very similar selective pressures.

6 Op.60. Darwin, C. R. (1859). On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life. London: John Murray. [1st edition]

Second, there was the obvious fact that the traits exhibited by some species seemed to be ill-suited to the performance of critical functions. Think of a sea turtle’s nesting practices, for example. Would-be sea turtle mothers must laboriously crawl across sandy terrain to lay their eggs. Once they locate a suitable nesting site, these turtles must dig holes in which to deposit their eggs. If we did not know any better, it might be safe to assume that a supposedly omniscient, omnipotent, and omni-beneficent deity optimized sea turtle morphology for digging such nests. As most readers no doubt already know, this is anything but the case. These turtles must rather awkwardly dig with the very same flippers that they use to gracefully traverse their marine habitat.

The modern conception of fitness, which comes to us by way of Darwin’s On the Origin of Species (1859), changed this species-archetype-relative conception of fitness. Evolutionism, the belief that species change gradually over time, clearly preceded Darwin’s grand insight. It was he, however, who discovered the first plausible mechanism for such evolutionary change.

According to Darwin, adaptations were forged via a “struggle for existence” between the conspecifics who comprise a population as well as among members of different species with similar requirements:

I should premise that I use the term Struggle for Existence in a large and metaphorical sense, including dependence of one being on another, and including (which is more important) not only the life of the individual, but success in leaving progeny. Two canine animals in a time of dearth, may be truly said to struggle with each other which shall get food and live. But a plant on the edge of a desert is said to struggle for life against the drought, though more properly it should be said to be dependent on the moisture. A plant which annually produces a thousand seeds, of which on an average only one comes to maturity, may be more truly said to struggle with the plants of the same and other kinds which already clothe the ground. The missletoe is dependent on the apple and a few other trees, but can only in a far-fetched sense be said to struggle with these trees, for if too many of these parasites grow on the same tree, it will languish and die. But several seedling missletoes, growing close together on the same branch, may more truly be said to struggle with each other. As the missletoe is disseminated by birds, its existence depends on birds; and it may metaphorically be said to struggle with other fruit-bearing plants, in order to tempt birds to devour and thus disseminate its seeds rather than those of other plants. In these several senses, which pass into each other, I use for convenience sake the general term of struggle for existence.7

Differential survival (“life of the individual”) and reproduction (“success in leaving progeny”) are the outcomes of this unrelenting struggle. But mere differences in survival and reproduction can lead to any number of varying and even potentially maladaptive results; they are, without further qualifications, effectively directionless. There is no guarantee that differential reproduction alone will lead to the sorts of cumulative organismal adaptation (appearances of design) that cry out for explanation.

Darwin’s theory wanted an analogue for the directed selection provided by the God of natural theology. To fill this role, he famously posited the mechanism of natural selection:

How will the struggle for existence, discussed too briefly in the last chapter, act in regard to variation? Can the principle of selection, which we have seen is so potent in the hands of man, apply in nature? I think we shall see that it can act most effectually. Let it be borne in mind in what an endless number of strange peculiarities our domestic productions, and, in a lesser degree, those under nature, vary; and how strong the hereditary tendency is. Under domestication, it may be truly said that the whole organisation becomes in some degree plastic. Let it be borne in mind how infinitely complex and close-fitting are the mutual relations of all organic beings to each other and to their physical conditions of life. Can it, then, be thought improbable, seeing that variations useful to man have undoubtedly occurred, that other variations useful in some way to each being in the great and complex battle of life, should sometimes occur in the course of thousands of generations? If such do occur, can we doubt (remembering that many more individuals are born than can possibly survive) that individuals having any advantage, however slight, over others, would have the best chance of surviving and of procreating their kind? On the other hand, we may feel sure that any variation in the least degree injurious would be rigidly destroyed. This preservation of favourable variations and the rejection of injurious variations, I call Natural Selection.8

7 Ibid., p. 62.
8 Ibid., pp. 80-81.

Darwin uses the example of artificial selection to show how dramatically and effectively humans have altered domesticated breeds by resort to existing variation. He thereby maintains the notion of design, albeit one that is already a step removed from any type of divine origin. Domesticated animals are more likely to bear traits that humans deem valuable. This helps ensure the survival of a species at the expense of its members who lack the desired traits. Darwin, then, makes plain that artificial selection is nothing but a special case of natural selection. The traits that breeders select, even if not those beneficial to organisms in a wild population, are nevertheless extant variations upon which to select. Breeder’s selection becomes just another form of environmental selection pressure.

With this maneuver, Darwin thought that the final semblance of derived intentionality or anthropomorphically-imposed purpose had been removed from evolutionary theorizing. Yet the much sought-after explanation of adaptation was still safely in hand. Mere differential survival and reproduction become, in essence, the unconscious selection of those “individuals having any advantage, however slight, over others.” The presence of a favorable variation typically enhances an organism’s chances of surviving and reproducing. Having favorable variations makes some organisms “fitter than” others. Assuming these advantages are inheritable and that the environmental challenges remain similar, organisms that exhibit favorable variations tend to increase their representation in the future. There is consequently selection in the direction of accumulated favorable variations. Darwin, to his credit, recognized that this process can lead to highly complex adaptive traits (e.g., the mammalian eye) and even speciation when transpiring over long periods of time.

Not everyone was enamored with Darwin’s analogy between natural and artificial selection, nor for that matter with the notion of selection generally. Alfred Russel Wallace (1823-1913), who along with Darwin is credited with cofounding the theory of evolution via natural selection, was somewhat surprisingly among Darwin’s most vocal critics on this front. Wallace, unlike Darwin, was not beholden to the tradition of natural theology. He agreed that explaining the appearance of design was crucial, but he also realized how invoking the notion of selection, whether artificial or natural, runs the risk of reintroducing (or perhaps failing to eliminate) the possibility of conscious selection or divine design. This was potentially a serious impediment to the broad acceptance of differential survival and reproduction due to variations in fitness as the ultimate cause of adaptive evolutionary change.

One of the strongest arguments which have been adduced to prove the original and permanent distinctness of species is, that varieties produced in a state of domesticity are more or less unstable, and often have a tendency, if left to themselves, to return to the normal form of the parent species; and this instability is considered to be a distinctive peculiarity of all varieties, even of those occurring among wild animals in a state of nature, and to constitute a provision for preserving unchanged the originally created distinct species.9

The worry, for Wallace, is that artificial selection of the sort then practiced by animal breeders often does not culminate in the creation of new species. While there are undoubtedly many morphological differences among dog breeds that are due to human intervention, these breeds are still considered types of dogs or members of the species Canis familiaris. As soon as breeders relent in their selection, selected traits are prone to revert or regress to common form. Moreover, speciation was just as often an outcome to be avoided in the breeding context. It would be a nightmare scenario for a livestock breeder who, say, sought to increase the milk production of his dairy cattle and instead selected for a generation of sterile hybrids.

9 Op.53. Darwin, C. R. and A. R. Wallace. (1858). On the tendency of species to form varieties; and on the perpetuation of varieties and species by natural means of selection. Journal of the Proceedings of the Linnean Society of London. Zoology 3 (20 August): 46-50.

Wallace subsequently draws a sharp distinction between variation under domestication and variation in the wild:

It will be observed that this argument rests entirely on the assumption, that varieties occurring in a state of nature are in all respects analogous to or even identical with those of domestic animals, and are governed by the same laws as regards their permanence or further variation. But it is the object of the present paper to show that this assumption is altogether false, that there is a general principle in nature which will cause many varieties to survive the parent species, and to give rise to successive variations departing further and further from the original type, and which also produces, in domesticated animals, the tendency of varieties to return to the parent form.

In spite of Darwin’s using artificial selection to great rhetorical effect, Wallace notes how cases of artificial selection are usually trotted out as counterexamples to macroevolution and speciation of the form both he and Darwin espoused. Selection was accordingly cast as the last relic of an outmoded conception of design.

Along this line, he encouraged Darwin to drop the term ‘natural selection’ in favor of Herbert Spencer’s (1864) expression “survival of the fittest.”10 This phrasing had, according to Wallace, a much more direct, empirical ring to it once one appreciates that the survival of those with favorable variations (i.e., the fittest) comes by way of the environment’s selecting against or exterminating those with the most unfavorable variations. Darwin was receptive to Wallace’s suggestion. By 1868, in The Variation of Animals and Plants under Domestication, Darwin makes a rather candid concession: “This preservation, during the battle for life, of varieties which possess any advantage in structure, constitution, or instinct, I have called Natural Selection; and Mr. Herbert Spencer has well expressed the same idea by the Survival of the Fittest. The term ‘natural selection’ is in some respects a bad one, as it seems to imply conscious choice; but this will be disregarded after a little familiarity.”11 Darwin adopted Spencer’s phrase in the fifth edition of On the Origin of Species (1869), where it was implied to mean “better designed for an immediate, local environment.”12 Until very recently, this was the edition that nearly everyone read, including the architects of what would become the modern or neo-Darwinian synthesis of the early twentieth century. Darwin was always reluctant to do away with the metaphor of selection, whereas Wallace and Spencer thought the metaphor at best a distraction. This is indicative of the fact that Darwin had a substantive account of selection as a force, the vera causa (true cause) of adaptive evolution. By what means does this force or mechanism operate? Differences in fitness or, as Darwin would phrase it, the “chance of surviving and of procreating their kind.”13

10 Darwin Correspondence Project, “Letter no. 5140,” accessed on 13 June 2017, http://www.darwinproject.ac.uk/DCP-LETT-5140.

Darwin recognized that typological-thinking of the kind underlying the idea that species will forever remain fixed by divine design was an explanatory obstacle. The only way to reconcile observations of extant with extinct biological form or function was to recognize evolution via natural selection. Environmental circumstances change even if gradually, and the best way for species to survive continuous change is for them to adapt. Adaptation, in turn, requires change or variation. But variation is, by definition, departure from similarity or normalcy. It is perhaps on this point that Darwin’s genius becomes most apparent. He understood that the bearers of such variation can be none other than the individual organisms comprising a population. Perhaps even more importantly, he also recognized that all members of a population vary in some respect. Ernst Mayr (1904-2005), renowned evolutionary biologist and architect of the modern synthesis, captures what is arguably Darwin’s key insight:

11 Op. 7. Darwin, C. R. (1868). The variation of animals and plants under domestication. London: John Murray. First edition, first issue. Volume 1.
12 Gould, S.J. Darwin’s Untimely Burial: Despite reports to the contrary, the theory of natural selection remains alive and well. Natural History, 85: 24-30. 1976 (October).
13 Darwin makes numerous references to variation in a population as a vera causa. This is evident throughout On the Origin of Species (1859). By way of example, note the following remark on page 482 in the conclusion: “Several eminent naturalists have of late published their belief that a multitude of reputed species in each genus are not real species; but that other species are real, that is, have been independently created. This seems to me a strange conclusion to arrive at. They admit that a multitude of forms, which till lately they themselves thought were special creations, and which are still thus looked at by the majority of naturalists, and which consequently have every external characteristic feature of true species, --they admit that these have been produced by variation, but they refuse to extend the same view to other and very slightly different forms. Nevertheless they do not pretend that they can define, or even conjecture, which are the created forms of life, and which are those produced by secondary laws. They admit variation as a vera causa in one case, they arbitrarily reject it in another, without assigning any distinction in the two cases. The day will come when this will be given as a curious illustration of the blindness of preconceived opinion.”

The assumptions of population thinking are diametrically opposed to those of the typologist. The populationist stresses the uniqueness of everything in the organic world. What is true for the human species—that no two individuals are alike—is equally true for all other species of animals and plants […] All organisms and organic phenomena are composed of unique features and can be described collectively only in statistical terms. Individuals, or any kind of organic entities, form populations of which we can determine the arithmetic mean and the statistics of variation. Averages are merely statistical abstractions, only the individuals of which the populations are composed have reality.14

Each population, being a constituent subset of the species taken as a whole, faces different selective pressures. One combination of traits may be selected for in environment X but another combination of traits could prove better even in an ever so slightly modified environment Y.

Attributions of fitness consequently do not depend on a comparison against some archetypal member of a species, but are instead relative to the extant variations (or combinations thereof) within a population in a variable environment.

In one fell swoop, the paradoxical coexistence of suboptimal trait design and morphological resemblance across space and time was demystified. Variation became the norm.

Following Darwin’s work, variation was no longer conceived of as unfavorable accidental deviation, nor was it necessarily indicative of corruption, deformity, error, or monstrosity. It was at the root of the bewildering diversity of adaptive form and function in the organismic world and, hence, acknowledged as the fodder for speciation.

14 Op. xix-xx. Facsimile of the first edition of Charles Darwin’s Origin of Species.

The notion of fitness was concurrently reconstrued as an individual organism’s ability to survive and reproduce. It became an individual organismic property, one upon which natural selection effects adaptive evolutionary change. Differences in this ability were accordingly attributed to the different combinations of trait variants exhibited by individuals in a population.

Combinations of trait variants have differing rates of success when it comes to navigating challenges posed by the environment, including competition for resources with conspecifics.

Systematic ecological success typically increases representation in subsequent generations, usually at the expense of extant competitors, and is the kernel of long-term evolutionary change or speciation.

The Twentieth Century

While this Darwinian conception of fitness remains with us even now, two related conceptual difficulties became evident in the work of early population geneticists. On one hand, there was growing confusion over exactly which entities, trait types (e.g., genotypes and phenotypes) or token organisms, should be considered the proper referents of fitness-value ascriptions. On the other hand, there was the issue of whether actual reproductive contribution (i.e., lifetime number of offspring) alone exhausts the notion of organismal fitness.

In addressing the shifting conceptions of selection and fitness one must begin with the work of statistician and biologist Ronald A. Fisher (1890-1962). It is by now well known how his foundational work in population genetics resolved the impasse between Mendelian geneticists and biometricians and thus established a unified conceptual framework for evolutionary biology.15 Mendelian geneticists such as William Bateson (1861-1926) and Hugo de Vries (1848-1935) were committed to the discrete variation patterns and the laws of inheritance exhibited by Mendelian genes. Biometricians such as Karl Pearson (1857-1936) and Walter Frank “Raphael” Weldon (1860-1906) were focused on the measurement and statistical analysis of continuous variation exhibited by populations. Mendelians insisted that the continuous variations measured by biometricians were too insignificant to account for the evolution of new species. Biometricians countered that discrete units of heredity, like genes, could not explain the appearance of continuous variation in real populations. Fisher ingeniously demonstrated how continuous variation among phenotypic traits of the type measured by biometricians could be produced by the combined action of many discrete genes and thus be the result of Mendelian inheritance. This was, for Fisher and his followers, an unambiguous vindication of Darwin’s theory of evolution via the mechanism of natural selection.

15 See especially Ruse (1996, 2003), Bowler (2003), Larson (2004), and Provine (1971).
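The reconciliation credited to Fisher can be illustrated with a short simulation. The sketch below is not a reconstruction of Fisher's own derivation. It assumes a deliberately simplified additive model in which every locus is diploid, every allele copy adds either 0 or 1 to the trait value with equal effect, loci are independent, and environmental influences are ignored. All names and default values are hypothetical.

```python
import random


def simulate_trait_values(num_loci, num_individuals, allele_freq=0.5, seed=0):
    """Trait values under an assumed purely additive Mendelian model.

    Each individual carries two allele copies at each of `num_loci` loci.
    A copy is the trait-increasing allele with probability `allele_freq`
    and then adds 1 to the trait value; otherwise it adds 0.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    values = []
    for _ in range(num_individuals):
        copies = 2 * num_loci
        value = sum(1 for _ in range(copies) if rng.random() < allele_freq)
        values.append(value)
    return values


def distinct_classes(values):
    """Number of different trait values observed in the sample."""
    return len(set(values))
```

With a single locus the sample falls into at most three discrete classes, as the Mendelians would expect. With fifty loci the same discrete mechanism yields dozens of finely graded classes clustered around the mean, which is the kind of near-continuous distribution the biometricians measured.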

In Fisher’s hands, then, the Darwinian conceptions of selection and fitness became even more entrenched. The general argumentative strategy in his masterwork The Genetical Theory of Natural Selection (1930) was, after all, “to introduce only the most basic assumptions about inheritance that were not available to Darwin and that can, now that they have been secured by recent genetical research, vindicate Darwin’s theory.”16 Beyond this basic directive and his ideological adherence to Darwin, one must also take note of how results in statistical mechanics and the legacy of James Clerk Maxwell (via his teacher James Jeans’ work on the theory of gases) influenced Fisher’s thinking. In the kinetic theory of gases, “several molecules are conceived to move freely in all directions with greatly varying velocities” but with a “statistical result that is a perfectly definite measurable pressure.”17 Whatever idiosyncratic differences there may be among particular molecules in a sample of gas can be averaged over to accurately calculate pressure. In similar fashion, when focusing on human populations, the “agencies acting at large amidst a multitude of random causes,” any one being the predominant influence on some particular individual, “nevertheless determine the progress or decadence as a whole.”18 The common feature shared by selection theory and gas theory is consequently the reliability and predictability of the outcome when the individuals are numerous and the causes acting upon them independent. It was precisely this commitment to statistical regularity that enabled Fisher to unveil the statistical effects in a mixed population of a large number of (supposedly independent) Mendelian factors and, thereby, resolve the dispute between Mendelians and biometricians.

16 Op. 245. Hodge, M.J.S. “Biology and Philosophy (Including Ideology): A Study of Fisher and Wright” in The Founders of Evolutionary Genetics: A Centenary Reappraisal. Ed. Sahotra Sarkar. Kluwer Academic Publishers. Dordrecht, The Netherlands. 1992.

17 Op. 60. Fisher, R.A. and C.S. Stock, Cuenot on Preadaptation. A Criticism. Eugenics Review: 7, 46-61. 1915.

The analogy between statistical regularity in thermodynamics and that found in population biology comes apart at a critical juncture, however. According to the second law of thermodynamics, any realized physical system will tend towards a state of maximum entropy in which disordered states are maximally probable and the energy available for work is lost. Organisms, conceived of as the outcome of continuous and cumulative selection, exemplify highly improbable, ordered states that endure. In essence, organismal unity through time and the fidelity of reproduction appear to run counter to the second law of thermodynamics.19 Fisher argued that the only way to reconcile this paradox was to embrace natural selection as the primary “counterentropic” factor combating the second law’s bleak trajectory. Organisms bearing combinations of traits that better adapt them to deal with environmental contingency will tend to survive and reproduce at higher rates and eventually displace less favorable trait combinations. Selection would drive populations toward ever higher fitness when environmental conditions remain approximately uniform. This was encapsulated by Fisher’s “fundamental theorem of natural selection,” which states that “in any species at any time, the rate of change of fitness ascribable to natural selection is equal to the additive genetic variance in fitness at that time.” A direct consequence of this is that “natural selection […] at all times acts to increase the fitness of the species to live under the conditions that existed an instant earlier.”20

18 Ibid., pp. 60-61.

19 Obviously, organisms endure by way of increasing entropy in the surrounding system.
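A rough sense of why the theorem ties the rate of fitness increase to a variance can be had from a deliberately simplified textbook case, which is not Fisher's own derivation. Assume asexual types $i$ with frequencies $p_i$ and constant Malthusian (per capita growth rate) fitnesses $m_i$; these symbols are introduced here only for illustration. In this special case all variance in fitness is additive, so the total variance stands in for the additive genetic variance of the theorem.

```latex
% Assumed model: continuous-time selection among asexual types with constant m_i.
\begin{align*}
\bar{m} &= \sum_i p_i m_i, \qquad
\frac{dp_i}{dt} = p_i \,(m_i - \bar{m}) \\[4pt]
\frac{d\bar{m}}{dt}
  &= \sum_i m_i \frac{dp_i}{dt}
   = \sum_i p_i m_i (m_i - \bar{m}) \\
  &= \sum_i p_i m_i^{2} - \bar{m}^{2}
   = \operatorname{Var}(m) \;\ge\; 0
\end{align*}
```

Because a variance cannot be negative, mean fitness in this toy model never decreases under selection alone, and it stops increasing exactly when the variance in fitness has been used up.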

The problem was that the process of becoming better adapted or “fitter” necessarily consumes existing additive genetic variation as it selects against suboptimal trait combinations. Fisher did not want to introduce higher mutation rates or concede that mutations might have more extreme phenotypic effects (e.g., saltations) than previously imagined. These were, for him, ad hoc theoretical fallbacks with the potential to detract from selection’s primacy as the operative evolutionary force. Instead, he suggested that more than enough latent genetic variation existed in natural populations. The latency, of course, could come by way of being phenotypically “hidden away” in the form of recessive alleles that constitute heterozygote genotypes. The phenotypic effects of such alleles could be revealed and exploited for use in future generations because these can escape the culling effects of selection.

For present purposes, it is worth noting the lengths to which Fisher is willing to go just to maintain the primacy of selection. He obviously recognizes the need for mutations, but admits only the bare minimum. He routinely downplays the importance of gene flow and random genetic drift by emphasizing large population sizes and long-term outcomes across populations constituting a species. Natural selection is depicted as all-pervasive, nearly all-powerful, and of undeniable explanatory import in the face of the second law of thermodynamics. Notice, too, that it is not just the variation (or the statistical notion of variance) exhibited by one selectively non-neutral trait or genetic locus that is under consideration; additive genetic variance in fitness (i.e., across all selectively relevant loci) is paramount.21

20 Op. 131-132. Price, G.R. (1972) ‘Fisher’s “Fundamental Theorem” Made Clear’. Annals of Human Genetics, 36: 129-140. Fisher’s original formulation (1930) is stated as follows: “The rate of increase in fitness of any organism at any time is equal to its genetic variance in fitness at that time.”

All of the foregoing demonstrates how Fisher was the quintessential Darwinian selectionist. The phenomenon of adaptation had to be explained and the best way to do so was to posit the mechanism of selection as acting on heritable fitness differences. There nevertheless remains considerable latitude for (mis)interpretation regarding the entities that are the objects of fitness attributions. We need look no further than his original statement of the fundamental theorem for evidence of this: “The rate of increase in fitness of any organism at any time is equal to its genetic variance in fitness at that time.” It pertains directly to the “fitness of any organism,” which suggests that token individuals are the proper targets of fitness ascriptions.22 Yet, in the very same work (1930), Fisher replaces the term ‘organism’ in the expression “fitness of any organism” with the term ‘species’.23 Needless to say, this substitution has profound ontological implications regarding the scope and efficacy of natural selection.

Fisher’s ideas about natural selection and fitness would find their most potent critic in the American geneticist Sewall Wright (1889-1988). For Wright, as for Fisher, Darwin, and Wallace before him, evolution had to be conceived of as cumulative change rather than mere change. In other words, explaining adaptation was at the forefront of Wright’s approach. The basic division with Fisher about how best to do so can be glimpsed in Wright’s claim that “fixation in some respects is as important as variation in others.”24 For Fisher, variation-inducing factors such as mutation, migration, and drift were entropic and thus ran counter to selection. Only by downplaying the significance of such factors could one ensure the cumulative and adaptive nature of evolutionary change. According to Wright, however, Fisher’s “common treatment of organisms as mosaics of unit characters (Fisher’s norm)” when considering their evolution was insufficient. Selection cannot act on each of these “unit characters” or Mendelian genes independently. While selection can produce great character divergences, as in William Castle’s (1867-1962) experiments with strains of black and white hooded rats, it can also bring unwanted traits and even infertility. Genes taken in combination, as they always are when attached to alleles at other gene loci on the same chromosome, often have unpredictable effects. These make the strong selection of individual traits, though occasionally efficacious, also notoriously unreliable.25

21 On this point, see A.W.F. Edwards (1994) and W.J. Ewens (1989).

22 Edwards, op. cit., pages 450-451. He therein argues for a rewording in which ‘organism’ would be replaced by ‘population’ or possibly ‘species’.

23 See the Summary at the end of chapter 2 of Fisher’s The Genetical Theory of Natural Selection (1930).

24 Op. 142-143. Wright, S. (1931) Evolution in Mendelian Populations. Genetics 16, 97-159.

Wright argued that genotypes relate to phenotypes “by a very complex network of biochemical and developmental reactions, with each character usually affected by many gene substitutions, each substitution having many pleiotropic effects and the intervening processes involving nonadditive interactions.”26 Furthermore, to view these simply as “entropic” or “distorting” factors along the lines suggested by Fisher is a mistake. For it is nothing other than the existence of these selectively arranged entanglements or “linkages” which enables the accumulation of naturally selected variations and the fidelity of transmission from one generation to another. Without gene linkage, as often exemplified by the nearness (in terms of number of loci) between genes on a chromosome, the recombinational effects of sexual reproduction in a panmictic population would dissolve genetic combinations. Evolution becomes a far more intelligible process, according to Wright, if based on natural selection among interaction systems rather than among alleles at each locus separately.

25 Crow, J.F. (1990) Sewall Wright’s Place in Twentieth-Century Biology. Journal of the History of Biology 23, 57-89.

26 Hodge, op. cit., p. 267.

Powerful as it has been in crafting organismal diversity, natural selection must assume a role alongside other evolutionary factors like drift, mutation, and migration for Wright. Selection acts only within the genetic and developmental constraints that unify individual organisms and identify them as members of the same species. Its capacity to bring about change at any time is severely curtailed by the products (genetic linkages) of its prior activity. Wright thus recognized that natural selection is “a complicated reciprocal process.”27 In a large, subdivided population, there is “a continually shifting differentiation” among the local races “intensified by local differences in selection but occurring under uniform and static conditions” and inevitably producing “indefinitely continuing, irreversible, adaptive and much more rapid evolution of species.”28 Highly localized adaptation occurs in small subpopulations, leading to what are sometimes called “local fitness optima” or “locally stable equilibria.” Each of these optima, equilibria, or “adaptive peaks” exhibits greatly diminished genetic variation due to increased rates of inbreeding and often intense selection.

As with Fisher, then, Wright faced the problem of how to account for the appearance of design or cumulative, continuous adaptation in what are prima facie selectively unfavorable circumstances. But Wright had the theoretical resources to overcome this obstacle. In his “shifting balance” theory, (sub)populations of a species which occupied “higher adaptive peaks” (ones exhibiting trait combinations which had higher average rates of survival and especially reproduction) were more likely to increase in size and subsequently send forth emigres to infiltrate adjacent populations. Because these invading immigrants would on average be fitter than the members of the resident population for which they were destined, they would likely change (via successful mating) the genetic makeup of the global population in such a way as to make it better adapted to local circumstances. What small subpopulations taken singly thus lacked in genetic variation was compensated for collectively by extant genetic variation in the populations comprising the species at large. In metaphorical terms, migration could drag (sub)populations occupying lower adaptive peaks across maladaptive valleys onto the slopes of (“zones of attraction for”) higher adaptive peaks. Wright had no need of Fisher’s speculations about there being ample “latent” variation; genetic variation was manifest if one could but widen one’s gaze.

27 Op. 929. Wright, S. (1948) ‘Evolution, Organic’ in Encyclopedia Britannica., 14th ed. Revised, vol.8, pp.915-929.

28 Op. 272 in Hodge (1992).

Even though Wright reduces the global efficacy of selection by emphasizing the need for other factors, the importance of fitness to any account of adaptive evolutionary change is unambiguous. It is, after all, individuals from populations with higher mean fitness (i.e., “higher adaptive peaks”) that tend to export migrants and thus enable the adaptation of an entire species. Even Wright’s late (1980) work emphasizes such organismic selection:

A person considering evolution in terms of Mendelian heredity for the first time is likely to think that it is merely a question of how the best alleles in all loci, relative to a given ecological niche, come to be assembled. After a moment's consideration, it is fairly obvious, even on the simplistic assumption of a one-to-one relation between gene and unit character, that the selective value of any one of the unit characters will depend on the others with which it is combined.29

His views on the proper targets of fitness ascriptions are nonetheless ambiguous. He had no level of selection higher than the individual organism because he could not envisage any sufficiently persistent and integrated entities above the level of individuals in the hierarchy of organisms.30 This position leaves open the possibility of ascribing fitness to any form of organization below the level of individual organisms. In particular, Wright (1980) entertained the possibility of “genic” selection among genes on the genomes contained within somatic cells.31 Setting aside the technical mathematical possibility of this approach, it is a peculiar view in that it licenses attributions of fitness to abstract trait types (genotypes) rather than individuals or token organisms.

29 Op. 827. Wright, S. (1980) Genic and Organismic Selection. Evolution 34:5, 825-843.

30 See Hodge (1992) on this point.

31 Wright, op. cit.

Other pivotal figures of the modern evolutionary synthesis were likewise less than transparent when it came to the status of fitness and fitness-value ascriptions. By any account, Theodosius Dobzhansky (1900-1975) and George G. Simpson (1902-1984) were among the primary architects of the neo-Darwinian theory of evolution, which, for better or worse, is arguably still regnant orthodoxy despite its articulation more than half a century ago.32 Within their works, careful reading reveals passages where the concept of fitness figures prominently. In his highly influential early work, Genetics and the Origin of Species (1951, 3rd ed.), Dobzhansky defines Darwinian fitness as “the relative capacity of the carriers of a given genotype to transmit their genes to the gene pool of the following generations.”33 Even as late as 1970 (just five years before his death), Dobzhansky makes the following remark in a discussion about the multifarious differences among members of a population that influence the actual contribution that the carriers of a given genotype make to the gene pool: “This contribution, relative to the contribution of other genotypes in the same population, is a measure of the Darwinian fitness of a given genotype.”34 It seems as though Dobzhansky clearly favors trait types (genotypes) as the proper referents of fitness ascriptions, which rules out actual offspring contribution (at least of a single organism) as constituting fitness. But when taking note of the context in which this occurs and the language that he uses, especially the expression “relative capacity of the carriers of a given genotype,” ambiguity remains. If the capacity is a property of the carriers (i.e., individuals) and those carriers are composed of combinations of differing alleles for the many genes that are subject to selection, then one can be forgiven for reading his statement as pertaining to the fitness of token organisms.

32 If in doubt about this claim, see “Does Evolutionary Theory Need a Rethink? Researchers are Divided Over What Processes Should be Considered Fundamental” in Nature, Vol.514, 9 October 2014.

33 Op. 78, my emphasis.

34 Op. 101. Dobzhansky, T. (1970) Genetics of the Evolutionary Process. Columbia University Press. New York, NY.

Simpson, too, was occasionally prone to less than precise expression on the issue of fitness. For instance, he (1949) articulates the distinction between the synthetic theory and Darwin’s notion of selection as follows:

It must, however, be noted that the modern concept of natural selection […] is not quite the same as Darwin’s. He recognized the fact that natural selection operated by differential reproduction, but he did not equate the two. In the modern theory natural selection is differential reproduction plus the complex interplay in such reproduction of heredity, genetic variation, and all the other factors that affect selection and determine its results.35

While clearly recognizing that a complex interplay of heredity, genetic variation, and all other factors affecting selection determines actual response to natural selection, this passage could be interpreted as suggesting that the interplay in question is so extraordinarily complicated that realized fitness (or mere differential reproduction) is about as good an approximation as we will ever have for fitness simpliciter. Such a reading would, after all, mesh with one of the oft-cited boons of population genetics, namely that its mathematical approach enabled practitioners to “black box” the daunting complexity of the underlying biochemical instantiations of heredity that Wright and others emphasized. It is nevertheless clear that Simpson’s conception of Darwinian fitness was probably much more akin to what we now think of as expected fitness than it was to mere differential reproduction. For instance, in the very same work (1949) he claims that “the correlation between those having more offspring, and therefore really favored by natural selection and those best adapted, or best adapting to change is neither perfect nor invariable, only approximate and usual.”36 Here, too, the question about the proper referents of fitness value ascriptions remains unanswered since the expression “those best adapted” could be read as equivalent to the expression “the fittest individuals.”

35 Op. 268. Simpson, G.G. (1949) The Meaning of Evolution: A Study of the History of Life and Its Significance for Man. Yale University Press. New Haven, CT.

At minimum, the ambiguity of the architects of modern evolutionary theory is a plausible if not likely source of what would come to be known as the “naïve view” of fitness as actual reproductive contribution. This becomes all the more convincing when one notes the significant chronological overlap with philosophers like Karl Popper (1902-1994), who was undoubtedly familiar with their work.37

A Philosophical Turn

What often happens when biologists are confronted with these conceptual discrepancies or inconsistencies is that they instinctively defer to measures of fitness that are deemed most unobjectionable or orthodox. Actual lifetime reproductive contribution or “realized fitness” is usually assumed to be the most suitable measure of fitness in this vein. Suggesting otherwise seems nothing short of ludicrous to some biologists since relative fitness values must be attributed on the basis of just such evidence.

This knee-jerk reaction, often fostered by the commendable intention of keeping philosophical analyses empirically grounded, has nevertheless made for considerable confusion. Adhering to this naïve view of fitness is what, for example, drove Karl Popper (1974) to articulate a particularly resilient but misguided worry that has come to be designated “the tautology problem.” This worry can be articulated in the following manner: If there is nothing more to the notion of fitness than realized fitness, then the claim that ‘fitter organisms reproduce in greater number than their less-fit conspecifics’ is equivalent to ‘the organisms that reproduce more outreproduce those which reproduce less.’ As stated, this is an obvious tautology that is devoid of empirical content and thus explanatory import. It clearly fails to account for why any of the organisms reproduce in greater (or lesser) numbers than their conspecific competitors. For Popper, committed as he was to falsification as the criterion for demarcating scientific theories, (mis)reading ‘fitness’ as synonymous with ‘realized reproduction’ rendered generalizations about selection unfalsifiable. The generalization ‘fitter organisms reproduce in greater number than less-fit conspecifics’ apparently left no room for counterinstances. If asked about why any particular organism X had more offspring than competing conspecifics, the answer was always ready at hand: X is fitter than (read “had more offspring than”) the ones that were less fit (read “had fewer offspring than”). More (less) fit organisms could never have fewer (more) offspring than their less (more) fit counterparts. As there was allegedly no way to refute this generalization, one that figures so prominently in evolutionary theorizing, Popper wrongly concluded that Darwinism was more of a “metaphysical research programme” than a scientific enterprise.38

36 Ibid., p. 221.

37 Popper, K. (1974) Darwinism as a metaphysical research programme. In: Schilpp PA (ed.) The Philosophy of Karl Popper, pp. 133–143. New York and Chicago. Open Court Press: 1974.

The problems with defining fitness in terms of actual survival and reproductive output are less than self-evident. One can begin to understand the shortcomings, however, by noting that this naïve view makes a driving explanatory project in evolutionary biology appear untenable or possibly unmotivated; namely, the identification of adaptation(s). Central to any examination of the process of forging adaptations is a distinction between the effects of selective processes (i.e., traits bearing the appearance of design) and the effects of random processes like drift. The crucial distinction between these processes and their respective effects evaporates if mere differential reproduction suffices to determine fitness. Natural selection is systematic differential reproduction due to the relatively better design of favored organisms. Such “favored” organisms do not always reproduce in greater numbers than conspecifics of lesser design. Actual reproductive success in finite populations is inevitably influenced by factors that are explanatorily irrelevant from the standpoint of evolutionary theorizing. When selectively “less favorable” organisms occasionally reproduce in greater number, however, they must, according to the view under scrutiny, be deemed “more fit.” Even worse, the trait variants they exemplify must, then, be considered somehow “better adapted.” All of this flies in the face of evolutionary biologists and their considerable expertise regarding the study of such systems since they are overwhelmingly inclined to attribute such events to sampling error and the like.39

38 Ibid., p. 133.
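The point about finite populations can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions rather than a model drawn from the literature cited here: two types start at equal frequency, type A has a hypothetical expected reproductive advantage (`w_a = 1.2` against `w_b = 1.0`), and the next generation is formed by binomial sampling of parents. The function name and all parameter values are invented for the example.

```python
import random

def less_fit_wins_fraction(pop_size, w_a=1.2, w_b=1.0, replicates=500, seed=0):
    """Fraction of replicate generations in which the type with the LOWER
    expected fitness (B) nevertheless leaves more offspring than type A.

    Assumption: both types start at frequency 0.5, and each of pop_size
    offspring independently draws a type-A parent with probability p_a.
    """
    rng = random.Random(seed)
    p_a = 0.5 * w_a / (0.5 * w_a + 0.5 * w_b)
    wins_for_b = 0
    for _ in range(replicates):
        offspring_a = sum(rng.random() < p_a for _ in range(pop_size))
        if offspring_a < pop_size - offspring_a:
            wins_for_b += 1
    return wins_for_b / replicates

small = less_fit_wins_fraction(10)     # tiny population: chance looms large
large = less_fit_wins_fraction(2000)   # large population: chance washes out
print(small, large)
```

In the small population the less favored type out-reproduces its rival in a substantial share of runs, whereas in the large population it almost never does. On the naïve view each such run would have to count as a case in which type B was "fitter," even though the expected advantage built into the simulation never changed.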

Another way to see how the naïve view is mistaken is to point out that it conflates evidence for the presence or value of a probabilistic property with the property itself.40 This view of fitness offers an implicit definition of fitness. As it defines fitness in terms of actual reproduction or realized fitness, it must commit to something like the following: In a particular population of species S, an organism identified by the set of selectively relevant variations has a fitness value of W if and only if it survives and reproduces W offspring. Note that such a strong (definitional) identity claim makes it impossible for an organism to have a fitness value of W if it does not actually survive and contribute W offspring to subsequent generations. But this seems to set much too high a standard for bearing a probabilistic property like fitness.

39 This contention originates in Richard Burian (1983). It is reiterated on pages 54-80 in The Epistemology of Development, Evolution, and Genetics: Selected Essays. Cambridge University Press. Cambridge, UK: 2005. MJS Hodge (1987) makes much the same point but in much greater detail. Hodge, MJS (1987) Natural selection as a causal, empirical, and probabilistic theory. In: Krüger L., Gigerenzer G., and Morgan MS (eds). The Probabilistic Revolution, pp. 233–270. Cambridge, MA: The MIT Press.

40 See Beatty and Finsen (1989) for more on what they call the “operationalist fallacy.”

To see why this is so, it is informative to examine a different sort of case. Take an example involving three fair coins and a mechanized coin-tossing device.41 To claim that the coins are “fair” is just to claim (i) that there is no reason for believing that any of the coins have intrinsic properties that make it considerably more likely that they land heads rather than tails or vice versa and (ii) that each of the trials involves the same proximate mechanism (e.g., the coin-tossing device) in similar environmental conditions. The first coin is tossed 100,000 times and subsequently destroyed. The observed frequency of heads was 0.5. The second coin is tossed 35 times and, then, also destroyed. The frequency of heads in this second trial was 0.2. Upon its initial toss, the third coin is melted in midair by a stray laser beam, so the frequency of heads was obviously 0.0. Taking the actual or observed frequencies in each of these trials as determining the presence (or absence) of a probabilistic property such as the propensity to come up heads on exactly half of the total tosses, we are forced to conclude that only one of the otherwise identical coins is actually fair. The other two must consequently be “biased” or possibly subject to a coin-tossing trial which violates the initial assumptions licensing the application of the term ‘fair’ to a coin. Either way, such a conclusion should strike most readers as patently absurd. One of the purportedly biased coins was, after all, tossed an odd number of times. As such, there was no way that it could have realized its “fairness.” The other purportedly “unfair” coin was destroyed during the first toss, which made the actual frequency of heads in that trial equal to 0. Does anyone genuinely believe that such observed frequencies exhaust the propensity of those particular coins to come up heads? Perhaps, but the onus of proof clearly resides with those who would claim as much.
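The three-coin case can be restated as a short sketch. The helper names `naive_is_fair` and `simulate_heads` are hypothetical, and the third coin's interrupted toss is recorded as one trial with zero heads only because the text assigns it a frequency of 0.0. The simulated coins are assigned a propensity of 0.5 by stipulation, mirroring the stipulation that the coins are fair.

```python
import random

def naive_is_fair(heads, tosses):
    """Naive verdict: a coin is fair iff its observed frequency of heads is exactly 0.5."""
    return heads / tosses == 0.5

# Observed records from the example, as (heads, tosses).
records = {"coin_1": (50_000, 100_000), "coin_2": (7, 35), "coin_3": (0, 1)}
verdicts = {name: naive_is_fair(h, n) for name, (h, n) in records.items()}

# With 35 tosses, no possible number of heads gives a frequency of exactly 0.5.
odd_can_look_fair = any(naive_is_fair(h, 35) for h in range(36))

def simulate_heads(tosses, propensity=0.5, seed=0):
    """Count heads for a coin whose propensity to land heads is stipulated."""
    rng = random.Random(seed)
    return sum(rng.random() < propensity for _ in range(tosses))

# Five short runs and one long run of a coin that is fair by construction.
short_runs = [simulate_heads(35, seed=s) / 35 for s in range(5)]
long_run = simulate_heads(100_000, seed=1) / 100_000

print(verdicts)           # only coin_1 passes the naive test
print(odd_can_look_fair)  # False: an odd number of tosses can never "look fair"
print(short_runs, long_run)
```

The short runs of a coin that is fair by construction all fail the naive test, while the long run comes out very close to 0.5. The observed frequency is thus evidence about the propensity, and better evidence as the number of tosses grows, but it is not the propensity itself.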

41 This case is a variant of one given by Elliott Sober (1984) and Matthen and Ariew (2002).

Why are most of us squeamish about observed or actual frequencies determining the truth-value of ascriptions involving probabilistic properties? The answer to this question lies in the presumption that the coin-tossing trials supposedly involved coins of the same type. By stipulation, the coins were deemed fair. We want to know what would have happened had we the opportunity to toss that type of coin a very large (or perhaps infinite) number of times, or what would have happened if we had a very large (or perhaps infinite) number of that type of coin to toss a finite number of times. The thought remains that were it not for random or idiosyncratic mitigating circumstances (e.g., a stray laser beam or an odd number of tosses) the purportedly “biased” coins might have realized their “fairness.” We maintain that such circumstances confound our efforts to gauge the actual biases of these coins rather than illuminate them and are, thereby, somehow explanatorily irrelevant. They are the result of sampling error or epistemic limitations.

Reconsidering the case of biological fitness in light of the foregoing intuitions, we can now begin to unravel the shortcomings associated with defining fitness in terms of actual survival and reproduction.42 Before doing so, however, a few words about the expectations surrounding scientific explanation are in order.

Speaking in very general terms, scientific explanations seek to answer questions pertaining to why and how certain empirical phenomena (i.e., entities, processes, activities, states of affairs, events, etc.) persist. In attempting to explain such phenomena, scientific theories and the models that are thought to comprise them often appeal to law-like generalizations or patterns of association relating properties of different types or classes. Henry B.D. Kettlewell’s now famous study of industrial melanism offers us a case in point.43 He wished to understand why the melanic form (carbonaria) of the peppered moth (Biston betularia) tends to predominate and eventually replace the standard light form (typica) in industrial areas containing soot-covered trees. As it turned out, the darker melanic form was less likely to be preyed upon by birds than its light counterpart. Dark coloration proved to be an effective form of camouflage in industrially polluted areas. Accordingly, an explanatory generalization was (at least tacitly) corroborated: Dark-colored peppered moths in soot-polluted woods tend to survive and reproduce in greater numbers than their light-colored counterparts and, hence, increase in relative frequency over time (ceteris paribus, of course). Similarly, light coloration explains the decline of the standard light-colored moths in the bird-infested, polluted woods. Such strategic appeal to law-like generalizations or recognizable patterns in our attempts to understand natural phenomena is desirable even if not strictly necessary. It is, for instance, an important characteristic of hypothesis formation that one’s tentative and testable explanations maintain a level of referential generality so as to not constrain prospective explanatory import for other populations or study systems. Were it not for this aspect of generality it is not clear how we could meet the requirement of independent testing or corroboration of prior findings.

42 For a more recent attempt to resuscitate the idea that fitness is equivalent to actual reproductive output, see: Otsuka, J., Turner, T., Allen, C. and Lloyd, E. A. [2011]: ‘Why the Causal View of Fitness Survives’, Philosophy of Science, 78, pp. 209–24. I will address the problems that this view faces later in this treatise.

Unfortunately, this generally desirable feature of scientific explanation leads to problems when the explanatory adequacy of fitness is thought to be reducible to actual survival and reproduction. Fitness is supposed to be more than a moniker appended post hoc to those organisms that happen to survive and subsequently reproduce. It is usually asserted as the cause of an organism’s survival and reproduction rather than a mere consequence of or response to selection.

43 Majerus, Michael E. N. (2005), The rise and fall of the carbonaria form of the peppered moth, in Fellowes M. D. E., Holloway G. J., Rolff J., "In Insect evolutionary ecology", Quarterly Review of Biology, 78: 399–418.

Perhaps the classic counterexample to fitness as mere differential reproduction is presented by Michael Scriven (1959).44 He presents readers with a thought-experiment involving identical twins. The fictional twins are out for a walk when one of them is struck down by lightning. If actual survival and reproduction are all that there is to the notion of fitness, then the surviving twin has a positive fitness value, say equal to 1, while his electrocuted sibling has a fitness value equal to 0 (or a positive value considerably less than 1). As in the case of coin-tossing described above, these value ascriptions strike many as unpalatable since the identical twins are, by stipulation, physically identical with respect to all of their intrinsic properties. Not only are Scriven’s twins physical duplicates, they are supposedly subject to the same set of systematic selection pressures in their environment. Even if one allows for some difference, as we would if extending consideration to actual “identical” twins, the differences at issue would simply be construed as being of the sort that are “invisible to” or likely inconsequential for natural selection, such as unexpressed genetic differences (e.g., induced mutations to introns) or unreliably inherited features of the local external environment.

The point of Scriven’s example for present purposes is just that there are no reliably inherited intrinsic or extrinsic properties which make it more or less likely for a given member of this species (H. sapiens) in this particular population (of two members) to be struck by lightning.

44 Scriven, M. (1959). “Explanation and Prediction in Evolutionary Theory.” Science, 130: 477-482.

45 If you are one of those who can’t shake the thought that lightning selects against individuals with greater height or those who like to golf during thunderstorms, then substitute a falling tree for lightning as the cause of death in Scriven’s tale.

Lightning appears to be randomly distributed if anything is; it’s a quintessentially chancy event not unlike winning the lottery.45 The twins must accordingly be ascribed to the same explanatory reference class. But, then, the notion of fitness appears devoid of any genuine explanatory and predictive import since there is no selectively relevant, heritable variation.46

Now, if the foregoing is indeed an accurate rendering of the difficulties associated with what might be called the “naïve” notion of fitness (as mere differential survival and reproduction or realized fitness), then a dilemma arises. One can either hold fast to an unsophisticated and counterintuitive notion of fitness as a mere descriptor of actual survival and reproduction or abandon it in favor of a novel approach with genuine explanatory and predictive power. The former option, at least as presented thus far, is clearly untenable since it strips the notion of its causal, explanatory, and predictive efficacy.47 It is probably safe to assume that no biologist or philosopher would accept this nihilistic outcome as satisfactory. To do so would be tantamount to claiming that fitness has no (scientific) role in modern evolutionary theorizing, which it would seem is anything but the case. The second option is, therefore, thrust upon us. The primary difficulty with the latter alternative involves what might be called its “empirical sensitivity.”

How can fitness ascriptions remain “sensitive to” or account for observed instances of survival and reproduction but resist being defined merely in terms of such actual frequency data?

46 Lewontin, R. (1970). The Units of Selection. Annual Review of Ecology and Systematics, 1: 1-18.

47 This caricature neglects a somewhat recent twist in the debate. A cadre of philosophers (Denis Walsh, Mohan Matthen, Andre Ariew, and Tim Lewens) and philosophically-sophisticated biologists (Richard Lewontin, Massimo Pigliucci) has recently argued that fitness can be stripped of its causal power but nevertheless retain its explanatory and predictive utility. I will address this issue, as per Walsh’s (2010) work, in the next chapter.

48 See the following: (1) Popper, K. R. The Propensity Interpretation of the Calculus of Probability, and the Quantum Theory. In: S. Körner (ed.): Observation and Interpretation. Academic Press Inc., Butterworths Scientific Publications, pp. 65–70. 1957. (2) Popper, K. R. The Propensity Interpretation of Probability. British Journal for the Philosophy of Science: 10, No. 37, 25–42. 1959.

A philosophically-insightful resolution was first articulated in the late 1970s if not somewhat earlier. Adapting Karl Popper’s conception of propensity,48 Robert Brandon (1978) and John Beatty and Susan Finsen (1979) independently argued that fitness is best conceived of as a probabilistic dispositional property. In a later collaboration (1984), Brandon and Beatty summarize the upshot of this maneuver:

On the propensity interpretation, fitness is a probabilistic disposition or ability explicated in terms of expected rather than actual reproductive success (in the mathematical sense of 'expected value'). Inasmuch as the connections between an entity's dispositions or abilities and its actual behaviors are causal connections rather than analytic connections, the propensity interpretation of 'fitness' allows for genuinely explanatory accounts of differential reproduction in terms of differential fitness […] Informally, 'fitness' is defined in terms of abilities to reproduce, not in terms of actual reproductive success. (33-34)49

This proposal sidesteps the explanatory worries that accompany the naïve view’s commitment to defining fitness simply in terms of realized fitness. On the propensity interpretation, fitness-value ascriptions to individuals are always issued on the basis of trait-type (or trait-variant) membership. Thinking back to Kettlewell’s study, for example, a fitness-value ascription for any member of the population would make reference to that particular individual’s coloration, as being either melanic or light. Only after averaging over the reproductive contributions of all the known individuals bearing a trait-variant, say melanic, in a given population could any token melanic member of that type be baptized with what Kettlewell considered a reasonably precise and thus interesting fitness value.50 So the strength of this approach is that individuals bearing a specific trait-variant can exemplify an expected fitness-value without necessarily realizing the calculated fitness-value for its reference (trait-variant) class. It is, after all, only over repeated trials of the same (or relevantly similar) type that we should expect actual fitness values to converge upon the calculated mathematical expectation.
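The gap between an expected fitness value and any single realized outcome can be illustrated with a small simulation. What follows is only a sketch under stated assumptions: the probability weightings in `PROPENSITY` are invented for the purpose and describe no real population, and the function names are hypothetical. Two tokens of the same type (Scriven's twins, in effect) share one expected value yet may realize different outcomes, while the average over many tokens approaches the expectation.

```python
import random

# Hypothetical propensity for a single trait variant (assumed values):
# number of lifetime offspring -> probability of that outcome.
PROPENSITY = {0: 0.2, 1: 0.3, 2: 0.3, 3: 0.2}

# Expected fitness: each possible outcome weighted by its probability.
expected_fitness = sum(k * p for k, p in PROPENSITY.items())


def realized_offspring(rng):
    """Draw one realized reproductive outcome for a token individual of this type."""
    outcomes = list(PROPENSITY)
    weights = list(PROPENSITY.values())
    return rng.choices(outcomes, weights=weights, k=1)[0]


def mean_realized(n_individuals, rng):
    """Average realized offspring number over n_individuals of the same type."""
    total = sum(realized_offspring(rng) for _ in range(n_individuals))
    return total / n_individuals


if __name__ == "__main__":
    rng = random.Random(1959)  # fixed seed so that a run is repeatable
    twin_a, twin_b = realized_offspring(rng), realized_offspring(rng)
    print(f"expected fitness of the type: {expected_fitness:.2f}")
    print(f"realized outcomes for two tokens of the type: {twin_a}, {twin_b}")
    for n in (10, 1_000, 100_000):
        print(f"mean realized fitness over {n} individuals: {mean_realized(n, rng):.3f}")
```

Nothing in the sketch settles whether the weightings are causal propensities or mere statistical summaries; it shows only the arithmetic relationship between expectation and realization on which both readings agree.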

49 Brandon, R. and J. Beatty (1984). The Propensity Interpretation of ‘Fitness’: No Interpretation is No Substitute. Philosophy of Science, 51(2): 342-357.

50 This example is obviously oversimplified. To accurately estimate the fitness of any (non-lethal) variant not only requires normalizing relative to alternative extant variants, but also taking into account the selection for and heritability of every other selectively non-neutral trait. I will expand this important point in subsequent chapters.

The propensity interpretation of fitness and its consequences will prove so important in subsequent discussion that it is well worth briefly reviewing how population biologists go about measuring fitness within this framework. Population biologists begin with census data on the individuals in a population. They gather data on the number of viable offspring each individual in a population actually leaves over the course of its lifetime. It is on this basis of actual offspring contribution that individual fitness is eventually calculated. As already mentioned, however, individual fitness is the expected number of offspring that a given member of the population under study would on balance leave over the course of its life. “Individuals” in this sense are identified via descriptions that assign them to reference classes according to the particular trait-types (or variants) they happen to bear. To calculate the requisite expectations, one must first establish a range of outcomes in terms of the number of possible offspring that an individual in the population could leave. From census data, probability weightings are associated with each possible reproductive outcome. These probability weightings reflect how often each of the possible outcomes is realized by a given type of organism. Each outcome value must be multiplied by its associated probability weighting. All of the products are, then, summed to yield the “weighted sum” or expected number of offspring for a particular type of organism. This is, of course, repeated for each extant trait-type or variant so that comparisons can be drawn. At the culmination of this process, the population biologist is confronted with differences in expected fitness for the trait-variants under investigation. Determining what accounts for the observed differences is one of the more stimulating questions in evolutionary biology. And by far the most promising approach to getting at the answer(s) begins by assuming that variations in trait-type cause differential fitness.
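The weighted-sum procedure just described can be made concrete with a brief sketch. The census figures below are invented for illustration and are not Kettlewell's data; the names `census`, `expected_offspring`, and `relative` are likewise assumptions of the example. For each trait-variant, the relative frequency of every observed offspring number serves as its probability weighting, and the expectation is the sum of outcome times weighting.

```python
from collections import Counter


def expected_offspring(offspring_counts):
    """Expected offspring number for one trait variant.

    Each distinct outcome is weighted by how often it was realized
    in the census, and the weighted outcomes are summed.
    """
    n = len(offspring_counts)
    weightings = {k: count / n for k, count in Counter(offspring_counts).items()}
    return sum(outcome * weight for outcome, weight in weightings.items())


# Hypothetical census: lifetime viable offspring per individual,
# grouped by the trait variant each individual bears (invented numbers).
census = {
    "melanic": [0, 2, 2, 3, 3, 3, 4, 5],
    "light": [0, 0, 1, 1, 2, 2, 3, 3],
}

# Repeat the calculation for every extant variant so comparisons can be drawn.
expected = {variant: expected_offspring(counts) for variant, counts in census.items()}

# Normalize against the fittest variant to obtain relative fitness values.
fittest = max(expected.values())
relative = {variant: value / fittest for variant, value in expected.items()}

for variant in census:
    print(f"{variant}: expected {expected[variant]:.2f}, relative {relative[variant]:.2f}")
```

With these invented figures the melanic variant's expectation is 2.75 offspring and the light variant's is 1.5. The normalization step is the simplest one available; as the accompanying note on oversimplification indicates, a serious estimate would also have to account for selection on, and the heritability of, every other non-neutral trait.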

Having summarized the major problems with the naïve definition of fitness and introduced what has become the orthodox philosophical account of fitness as a propensity, a curious reader might well wonder why a concept so central to explanation in evolutionary biology still lacks a unified account. The current predicament is in part due to difficulties that emerged for the formal (mathematical) rendition of the propensity interpretation. Subsequent chapters will examine these mathematical and ontological worries at length. Another reason for scientific ambivalence regarding the concept is almost certainly the significant explanatory and predictive success in evolutionary population biology that has taken place despite the theoretical confusion surrounding fitness. In the remainder of this introduction I will address this “devil-may-care” indifference.

51 Laudan, L. (1981) A Confutation of Convergent Realism. Philosophy of Science. Vol. 48, No. 1, pp. 19-49.

52 It should be noted that Darwin’s pangenesis, with its commitment to gemmules, was designed to deal with the problem of the (limited) inheritance of acquired characteristics. If understood in its proper historical context, as a way to incorporate the more plausible elements of “Lamarckian” inheritance based on use and disuse, then his theory regains some plausibility. Interested readers should consult Michael Ruse’s highly informative Charles Darwin (2008, Blackwell Great Minds Series) and Janet Browne’s magisterial, two-volume (1995, 2002) treatment.

If the history of science has revealed anything at all it is surely that the various sciences can achieve a good deal of empirical success without veridical knowledge of the entities or processes they investigate.51 Examples are, of course, manifold. Despite his brilliant explanation for the diversity of form and function in the biological realm by means of evolution via natural selection, Charles Darwin knew almost nothing about the underlying hereditary mechanisms that are required for selection to effect such colossal change. We need look no further here than The Variation of Animals and Plants under Domestication (1868) in which Darwin posited the existence of imaginary particles of inheritance or “gemmules” to account for the reliability with which characteristics are transmitted from ancestral (parental) generations to descendants (offspring)52:

It is universally admitted that the cells or units of the body increase by self-division, or proliferation, retaining the same nature, and that they ultimately become converted into the various tissues and substances of the body. But besides this means of increase I assume that the units throw off minute granules which are dispersed throughout the whole system; that these, when supplied with proper nutriment, multiply by self-division, and are ultimately developed into units like those from which they were originally derived. These granules may be called gemmules. They are collected from all parts of the system to constitute the sexual elements, and their development in the next generation forms a new being; but they are likewise capable of transmission in a dormant state to future generations and may then be developed. (1868, 369-70)

Darwin’s theory of pangenesis was subsequently shown to be incorrect on the grounds that there is nothing which actually corresponds to Darwinian gemmules in modern cell theory. Particles containing only the “plans” for individual phenotypic traits, ones which could congregate to form sex cells and yet retain the capacity for expression in future generations, do not exist. Thanks to relatively recent advances, we now know that every cell, with the exception of gametes in sexual organisms, carries a full complement of DNA or what might be called a “total plan” for the development of any tissue type and thus a new being. Nevertheless, Darwin’s explanation of the diversity of biological form and function is still considered a unifying schema within biology.

Examples like this supposedly demonstrate that evolutionary biologists need not necessarily concern themselves with whatever biological reality underpins their seemingly successful explanations and the concepts figuring centrally therein. Richard Levins, a prominent mathematical ecologist at Harvard School of Public Health, was among the earliest to articulate this issue of “sacrifices” or “tradeoffs” between the metatheoretical virtues of modeling. In his “The Strategy of Model Building in Population Biology” (1966), he claims that there are three general qualities which we would like our formal models to satisfy. Ideally, a model should (i) generate accurate and precise predictions, (ii) reflect reality by taking into account known parameters and trends, and (iii) be applicable across a number of different organisms or systems.

The problem is that no single model can simultaneously meet all three criteria. Models can, at best, meet two of the three criteria and are thus classified according to the sacrifices they make.53

The conclusion that many have drawn is that there are many roads that lead to empirical success, at least one of which altogether bypasses the need to accurately reflect reality (option (ii) above).54

Levins’ work has undoubtedly influenced the thinking of many biologists, some of whom, in turn, rest content with an intuitive grasp of fitness. Fitness is just whatever measure makes for accurate enough predictions in an evolving system. But predictive efficacy alone, while occasionally sufficing for general qualitative inferences about which trait types will increase (or correspondingly decrease) their relative frequency or about whether the mean trait value (or statistical variance surrounding it) will shift, often fails to provide important information about the magnitude of such directional change.55 Modern evolutionary biology is a statistically-saturated enterprise. Most biologists are savvy to conflations of statistical correlation and causation. Reliable correlation is best thought of as a prerequisite for controlled experimental testing or possibly the accumulation of more observational data, not a substitute for it. Only the latter are taken as definitive evidence for the presence of causal relationships within this branch of science. Were biologists not somewhat interested in the biological reality that underpins their theoretical posits (e.g., fitness), they would not feel so compelled to test the predictions of statistical models.

53 Levins, Richard (1966) "The Strategy of Model Building in Population Biology", American Scientist, 54:421-431.

54 Though I draw on Levins as a source of such thinking, he in no way intends for such a modeling strategy to be used in isolation. He provides a prescient argument that such modeling strategies should be used in tandem.

55 There are important instances when no such shifts or changes to numerical measures occur but where biologists would claim that important fitness differences nevertheless exist. Heritability can affect the response to selection.

When scientists emphasize predictive efficacy at the expense of accurate description or comprehensive explanation, ontological apathy is not necessarily the motive. It is probably fair to assume that most biologists care deeply about biological reality or the ontological structure of their study systems. Whether they know it or not, most of them subscribe to scientific realism in the philosophical sense, which holds that the objects of scientific knowledge exist independently of the minds or acts of scientists and that scientific theories are true of that objective (mind-independent) world (Papineau 1996). For them, then, the strategy of “black-boxing” is more commonly deployed as a promissory note for future clarification. This happens in the area of conservation biology, for instance. Conservation biologists are often confronted by wild populations with dwindling numbers. Their primary goal is to preserve and hopefully restore such populations to stable densities in a manner that does not sacrifice the well-being of other native organisms in the ecosystem. The sheer number of uncontrolled (and typically uncontrollable) variables in such studies can be staggering. Further complications arise from the fact that the complexity of the interactions between such factors can easily outstrip our capacity to measure and calculate the import of these potential explanatory factors. Consequently, conservation biologists must temporarily set aside any pretensions to accurate or complete description. While this move is no doubt less than ideal, it is certainly prudent in the sense that saving endangered species often requires acting as quickly as possible to alleviate further decline.

56 Pearson, K. (1901) On Lines and Planes of Closest Fit to Systems of Points in Space. Philosophical Magazine. Vol. 2, No. 11: 559-572.

The use of principal components analysis (PCA) by many conservation biologists clearly reveals this commitment. Originally developed by Karl Pearson (1901),56 PCA is a statistical tool that is used for both exploratory data analysis and the generation of predictive models. It uses orthogonal transformation to convert a set of observations of possibly correlated variables into a set of uncorrelated variables, the values of which are termed “principal components.” Each of the resulting statistically independent components receives a score based on the amount of variability in the data for which it happens to account. Accordingly, the first principal component provides the best (linear) account of the variance in the observational data, while each succeeding principal component would in turn provide the best account of the variance in the data under the constraint that it is uncorrelated with or independent of (i.e., “orthogonal to”) the preceding components. From the conservation biologist’s perspective, the upshot of this procedure comes from the requirement that the number of principal components must be less than or equal to the number of original variables. More often than not, the number of relevant variables in such an analysis is dramatically reduced, which brings in its wake a welcome reduction of possible causal or explanatory factors that might be manipulated.57, 58
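How PCA reduces the number of candidate factors can be seen in a minimal sketch. Everything specific here is an assumption made for illustration: the habitat measurements are invented, the function names are hypothetical, and the eigenvectors are found by simple power iteration with deflation rather than by the routines a statistical package would use. The sketch computes the covariance matrix of the observations, extracts orthogonal directions in order of the variance they account for, and reports each component's share of the total.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix for observations (rows) of p variables (columns)."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue of a symmetric matrix and a unit eigenvector."""
    p = len(matrix)
    start = [1.0 / (j + 1) for j in range(p)]  # arbitrary, deliberately asymmetric start
    length = math.sqrt(sum(x * x for x in start))
    v = [x / length for x in start]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left to extract
            break
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(v[i] * mv[i] for i in range(p)), v  # Rayleigh quotient, direction


def principal_components(rows):
    """Return [(variance, direction), ...] ordered from most to least variance."""
    cov = covariance_matrix(rows)
    p = len(cov)
    components = []
    for _ in range(p):
        variance, direction = leading_eigenpair(cov)
        components.append((variance, direction))
        # Deflation: subtract the variance already accounted for, so the next
        # component is sought among directions orthogonal to this one.
        cov = [[cov[i][j] - variance * direction[i] * direction[j]
                for j in range(p)] for i in range(p)]
    return components


# Hypothetical site measurements (invented): canopy cover, litter depth, moisture index.
observations = [
    (2.0, 4.1, 1.0), (3.0, 6.0, 0.8), (4.0, 7.9, 1.3),
    (5.0, 10.2, 0.9), (6.0, 11.8, 1.1), (7.0, 14.1, 1.2),
]

components = principal_components(observations)
total_variance = sum(variance for variance, _ in components)
explained = [variance / total_variance for variance, _ in components]

for rank, share in enumerate(explained, start=1):
    print(f"principal component {rank}: {share:.1%} of total variance")
```

Because the first two invented variables rise and fall together, the first component accounts for the great majority of the variance, which is the sense in which three measured variables collapse into one candidate factor. Note that the procedure identifies directions of variation only; it does not by itself say which, if any, underlying factor is causally responsible.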

57 Shaw, P.J.A. (2003) Multivariate statistics for the Environmental Sciences. Hodder Arnold.

58 With dramatic increases in computational capabilities over the last half decade or so, most conservation biologists have shifted from linear transformation to “nonparametric multidimensional scaling” (NMSA), a non-linear form of statistical analysis that does not rely on a priori assumptions about the statistical independence of potential causal factors. This shift in no way affects the point being made here since the basic motivation for adopting NMSA still involves reducing (often dramatically) the number of possible explanatory factors. I would like to thank Brian Inouye for bringing this point to my attention.

Neither ontological apathy nor promissory notes will suffice as a way to avoid further inquiry into the concept of fitness. The crux of their shortcoming becomes evident if one focuses on the role that the notion is thought to play in explanations of adaptive transgenerational change. Fitness, especially the difference in fitness (or “relative fitness”) among conspecific organisms in a particular population, is thought to be the cause of natural selection and, hence, adaptive evolutionary change. Most biologists would claim that it is a particular organism’s “fit to the environment” or “ability to endure under a common suite of selection pressures” that accounts for whatever success it has when it comes to survival and proliferation. When biologists point out that some organisms of a particular kind outcompete others of that kind because they were fitter, they clearly acknowledge that the difference in competitive ability is due to (or the result of) differences in fitness. And they routinely cite fitness as being responsible for non-random changes in populations.59 What is perhaps even more telling than the sheer prevalence of these causal locutions is the fact that biologists model and measure fitness. Their doing so implies that fitness is a measurable property and that some sorts of biological entities bear this property. Within biological circles, the orthodox assumption is one that takes fitness to be a property of individual organisms.

The prevalence of causal locutions and continuing attempts to measure and model fitness demonstrate the poverty of any view which holds that the ontological basis of fitness is either indeterminate or irrelevant from a practical standpoint. Ontological apathy of this ilk contravenes actual biological practice and should consequently be disregarded. Black-boxing, which issues a promissory note for further clarification and explanation, fares better because it does not rule out the need for a determinate ontological basis. But explanatory promises typically go hand-in-hand with the expectation of fulfillment. The sooner, the better in this case since the edifice of population biology seems ill-equipped to provide the much sought-after clarification that biologists and philosophers of biology desire. This weak form of black-boxing, in essence, simply echoes the present call for a philosophically astute examination of fitness.

59 Ramsey, Grant; and Pence, Charles H (June 2013) Fitness: Philosophical Problems. In: eLS. John Wiley & Sons, Ltd: Chichester. DOI: 10.1002/9780470015902.a0003443.pub2

Fitness is a Janus-faced concept. It is routinely employed in an inconsistent or ambiguous fashion. My overarching claim in this work, however, is not that population biologists have gone astray in their work. Nor is it that their findings are necessarily compromised because they lack a clear and cogent analysis of fitness. Quite to the contrary, it is the astounding success of evolutionary biology despite the absence of such a rigorous definition that leaves us wondering and deserves further investigation. This alone should pique the interest of those who might like to better understand why or how science in general, even if not biology in particular, succeeds.

When concepts which figure centrally in various successful subdisciplines of science are revealed to be unclear or difficult to define and consequently measure, the time is perhaps ripe for philosophical enquiry.

CHAPTER 2

IS ORGANISMAL FITNESS A METAPHYSICAL EXCRESCENCE?

The ontological status of organismal or trait fitness has been a topic of heated debate in the philosophy of biology. On one side of the issue there are those who claim that fitness is a causally efficacious, probabilistic dispositional property (i.e., a propensity) of the individual organisms comprising a population. For ease of future reference, let us refer to this position as the “propensity interpretation of fitness” (or PIF). Opponents of the propensity interpretation contend that fitness is a merely statistical property of trait types, one which is explanatorily but not causally efficacious. Denis Walsh, one of the architects of the statistical interpretation, has recently (2010) argued that the causal commitments of the orthodox view entail a probabilistically non-benign version of Simpson's paradox and ultimately the violation of a principle in decision theory known as the “Sure Thing Principle.” Were this the case, it would constitute a fatal result for the orthodox view since causal claims must in general conform to the directive of this principle. But Jun Otsuka, Trin Turner, Colin Allen, and Elisabeth Lloyd (2011) counter that Walsh’s argument relies on mistaken usage and misunderstanding of the mathematical models that John Gillespie (1972, 1974) uses to calculate fitness. In this paper, I argue that Walsh has indeed overstated the case against the orthodox view. However, it is not clear that his mistakes lie precisely where Otsuka et al. take them to be. I begin by briefly sketching out the relevant differences between the two competing positions with respect to the concept of fitness and its explanatory role in theoretical population biology. This is followed by a brief review of the pivotal distinction between probabilistically “pathological” and “benign” instances of Simpson's paradox, and a careful examination of the problem case that supposedly stymies the orthodox view. I conclude with a rejoinder based on practical as well as theoretical grounds.

Background

To thwart early worries about the explanatory vacuity of evolutionary theory, Robert Brandon (1978) and John Beatty and Susan Finsen (1979) independently argued that biological fitness is best conceived of as a probabilistic dispositional property or propensity. Fitness, they argued, should be explicated in terms of mathematical expectation rather than actual reproductive success. This construal had several notable advantages. First, it was in accord with the formal models then deployed by evolutionary population biologists. Most (if not all) of the models used to determine fitness require averaging over the demographic survival probabilities and offspring contributions of individuals identified qua their exhibiting the same trait type or suite of trait types. Weighted averages are central to such predictive models. While formal models do not impose ontological commitments upon those that employ them, they nevertheless provide insight as to the nature of explicit or implicit metaphysical commitments. This is especially so when it comes to biologists’ intuitions about differences in fitness or “relative fitness.” Fitness differences among conspecific organisms in a population are typically thought to be the cause of natural selection and, hence, adaptive evolutionary change. Most biologists would unhesitatingly claim that it is a particular organism’s “fit to the environment” or “ability to endure under a common suite of selection pressures” that accounts for whatever success it has when it comes to survival and proliferation. When biologists point out that some organisms of a particular kind outcompete others of that kind because they were fitter, they acknowledge that the difference in competitive ability is due to or the result of differences in fitness. And they routinely cite fitness as being responsible for non-random changes in populations.60

The second advantage of the propensity interpretation of fitness is best seen through a philosophical lens. The connection between an organism’s ability to survive and reproduce (i.e., its “fitness” or “adaptedness”) and its actually doing so is causal (synthetic) rather than logical (analytic) on the propensity view. Ascribing a scalar fitness value to a particular organism in virtue of its membership in a subclass of a population (i.e., qua its exemplification of a trait variant subject to selection pressure) does not guarantee that the organism in question realizes its fitness any more than the ascription of fragility (a deterministic dispositional property) to a glass vase necessitates its shattering. As the vase has the dispositional property irrespective of its actually shattering, so too can a token organism of type X have the fitness value of n (≠ 0) even if it perishes before it has the opportunity to reproduce. Fitness value ascriptions, after all, are averages taken over all the individuals that belong to a particular partition of the population. These are more than mere redescriptions or labels applied only to the organisms that happen to survive and reproduce. The propensity interpretation of fitness paved the way for genuinely explanatory and informative accounts of differential reproduction in terms of differential fitness.61

60 Ramsey, Grant; and Pence, Charles H (June 2013) Fitness: Philosophical Problems. In: eLS. John Wiley & Sons, Ltd: Chichester. DOI: 10.1002/9780470015902.a0003443.pub2

61 Brandon, R. and John Beatty. The Propensity Interpretation of ‘Fitness’: No Interpretation is No Substitute. Philosophy of Science. Vol. 51, No. 2. 1984.

62 Matthen and Ariew (2002); Walsh, Lewens, and Ariew (2002); Ariew and Lewontin (2004); Ariew and Ernst (2006);

The propensity interpretation of fitness was for the most part well and good until a cadre of philosophers argued that taking fitness to be a causally efficacious property involves a fundamental misunderstanding.62 The pivotal distinction is that between the prevailing conception of fitness, one which they refer to as “vernacular fitness” or “Darwinian fitness” (occasionally “ecological fitness”), and a purportedly more parsimonious conception that they propose, which goes by the label “predictive fitness” or “(merely) statistical fitness.”63 Matthen and Ariew (2002) describe the commonsense or orthodox intuition about organismal fitness as follows:

An organism’s ability to “survive and reproduce” arises from its traits. To the extent that relatively “advantageous” traits can be inherited by an organism’s descendants, they will be reproduced and retained in the population at a higher rate than less optimal ones (…) This much is central to common-sense analysis, and for many this notion of an organism’s overall competitive advantage traceable to heritable traits is at the heart of the theory of natural selection. Recognizing this, we shall call this measure of an organism’s selective advantage its vernacular fitness.64

63 That ‘vernacular fitness’ and ‘Darwinian fitness’ are synonymous is evident in André Ariew and Richard C. Lewontin’s “The Confusions of Fitness” in British Journal for the Philosophy of Science, Vol. 55 (2004), 347-363. The expression ‘ecological fitness’ is also a common substitute for ‘vernacular fitness’. See especially Ariew and Ernst (2006). Brandon and Ramsey (2007) employ the modifier “merely” on the grounds that the application of statistical methods is admitted by all parties to the philosophical debate and does nothing to settle going concerns.

64 Op. 56. Mohan Matthen and André Ariew. “Two Ways of Thinking about Fitness and Natural Selection.” The Journal of Philosophy. Vol. 99, No. 2 (Feb. 2002). Pages 55-83.

65 Ibid.

66 This is adequacy condition “C” for Ariew and Ernst (2006).

In contrast, predictive fitness is merely “a statistical measure of evolutionary change, the expected rate of increase (normalized relative to others) of a gene, a trait, or an organism’s representation in future generations […] suitably quantified and normalized.”65 Vernacular fitness is typically conceived of as the property of organisms which figures in general dynamical explanations for transgenerational change in the representation of trait types. The fitness of a trait type (relative to alternative extant character state-types) is accordingly a function of or supervenes on the properties of individual organismal fitnesses of the members in a population within their local environments.66 The vernacular conception of fitness gives credence to the propensity interpretation which holds that fitness is a probabilistic dispositional property of individual organisms, while the predictive conception envisions fitness as a statistical measure or property of a trait type that is itself an abstraction.

To forestall any confusion, it should be noted that the claim made by advocates of the merely statistical view is not that the notion of fitness has proven irrelevant, should be dispensed with, or serves no purpose whatsoever in modern evolutionary theorizing. That would be far too strong; for it would fly directly in the face of how evolutionary population biologists even now utilize the concept. Fitness can be used as a form of shorthand explanation when it comes to forecasting population trajectories. Controversy arises only when metaphysical or ontological assumptions about the causal efficacy of fitness are foisted upon the biologists who deploy the notion. Walsh (2010), for instance, is unmistakably clear on this point: “These excess [causal] commitments are mere metaphysical excrescences and should be removed from our interpretation of evolutionary theory.”67

Walsh, one of the primary architects of the statistical interpretation of fitness, has recently put forth a reductio ad absurdum argument that threatens to undermine the propensity interpretation of fitness (hereafter PIF). In its most general form, Walsh’s master argument can be sketched as a disjunctive syllogism:

(P1) Either organismal fitness is the cause of systematic (adaptive) transgenerational change in trait frequencies or it is a mere statistical (i.e., acausal) redescription of such change.

(P2) The vernacular conception of fitness and its philosophical heir, the PIF, argue that organismal fitness is the causal property which accounts for natural selection.

(P3) Fitness cannot be construed as the cause of systematic transgenerational change in trait frequencies; doing so violates a fundamental principle of causal decision theory known as the “Sure Thing Principle.”

(C) The merely statistical interpretation of fitness, therefore, proves to be a more appropriate conception of fitness.

67 Walsh, p. 168.

The devil, of course, is always in the details. I shall examine the finer points of Walsh’s argument below. Understanding his contention, however, requires a brief digression into decision theory.

Simpson’s Paradox: the basics

Simpson’s paradox is a well-known phenomenon in statistics (Yule 1903, Simpson 1951) which occurs when the association between two variables of interest, say X and Y, is reversed in each of the subpopulations that result from partitioning a population at large. Partitioning, of course, proceeds by accommodating additional information upon which the global population is subdivided. By way of generic example,68 suppose that the variables X and Y are positively correlated in a population. This is just to say that Pr(Y|X) > Pr(Y|~X) when the population is taken as a whole. Assume that this population is then further partitioned by the variable W. Now there are two subpopulations in which to examine the probabilistic relationship between X and Y: one created by assuming the factor W and another that is its contrast class (~W). Since the relationship of interest is stipulated as that between the variables X and Y, the appropriate comparisons are those between Pr(Y|X & W) and Pr(Y|~X & W), on the one hand, and between Pr(Y|X & ~W) and Pr(Y|~X & ~W), on the other. In such a case it is possible for the positive global correlation between X and Y to be inverted in each resulting subpopulation such that Pr(Y|X & W) < Pr(Y|~X & W) and Pr(Y|X & ~W) < Pr(Y|~X & ~W).
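The reversal just described is easy to exhibit with concrete numbers. The following sketch uses hypothetical counts, invented purely for illustration and not drawn from any study, in which X is positively associated with Y in the pooled population yet negatively associated with Y within both W and ~W. The dictionary layout and the helper `pr_y` are likewise illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical counts (invented for illustration): for each cell,
# (number of individuals exhibiting Y, total individuals in the cell).
counts = {
    ("X", "W"): (80, 100),
    ("~X", "W"): (9, 10),
    ("X", "~W"): (2, 10),
    ("~X", "~W"): (30, 100),
}


def pr_y(x, w=None):
    """Return Pr(Y | x) pooled over W, or Pr(Y | x & w) if w is given."""
    cells = [cell for (xi, wi), cell in counts.items()
             if xi == x and (w is None or wi == w)]
    with_y = sum(y for y, _ in cells)
    total = sum(n for _, n in cells)
    return Fraction(with_y, total)


# Pooled population: X is positively associated with Y.
print(pr_y("X"), ">", pr_y("~X"))                # 41/55 > 39/110
# Within each cell of the partition the association is inverted.
print(pr_y("X", "W"), "<", pr_y("~X", "W"))      # 4/5 < 9/10
print(pr_y("X", "~W"), "<", pr_y("~X", "~W"))    # 1/5 < 3/10
```

The reversal arises here because most X individuals fall in W, where Y is common regardless of X, while most ~X individuals fall in ~W, where Y is rare.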

Insofar as the foregoing reversal is seen as a purely statistical phenomenon, one which takes the association in question (between variables X and Y) as one of mere correlation, no problems or paradoxes arise. Difficulties ensue only if such associations are also claimed to be causal in nature. Not only is this etiological commitment central to the PIF, it is also precisely the inference that many, if not most, statistical analyses in population biology seek.69 But if you are inclined to draw causal inferences on the basis of data drawn from the aforementioned global population, then it seems as though you find yourself in the rather undesirable state of having to conclude that X is a causal factor that positively contributes to the realization of Y even though X diminishes the probability of Y in a more information-rich partitioning of the population by W and ~W. This is a glaring inconsistency from the causal point of view,70 one that can only be tolerated on pain of irrationality.

68 Astute readers might recognize this example as a generalized version of the Paradox of the Perplexing Painkiller.

Some instances of Simpson’s paradox are clearly “benign” and can be quickly rectified.

A troublesome reversal of probabilistic inequalities can occasionally be prevented just by taking steps to (i) normalize the data or (ii) assure that partitioning variables are independent. It is well known that Simpson’s reversals can occur as the result of partitions that generate subpopulations with unequal numbers of data points. The probabilities of association drawn from subpopulations with larger sample size more heavily influence global probabilities than those drawn from subpopulations with smaller sample size. Such situations can skew the global probabilities or percentages used to determine the association between variables such as X and Y. Remedy is easy enough to come by in these situations. Sample size can be normalized by providing common denominators for the ratios that are used to generate the percentages or probabilities and compute weights in the representation of probabilities as weighted averages. Once the denominators in more information rich partitions (e.g., W and ~W) are normalized, recalculated global probabilities of association might “invert” so as to agree with those found in the subpopulations.
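The role of unequal weights can be made explicit by writing each global probability as a weighted average of the subpopulation probabilities. The sketch below uses hypothetical conditional probabilities and cell sizes (invented for illustration) and compares the raw weighting with a normalized one in which W and ~W receive equal weight for both X and ~X. Equal weighting is only one possible normalization and is assumed here for simplicity; the function and variable names are illustrative.

```python
from fractions import Fraction as F

# Hypothetical subpopulation probabilities Pr(Y | x & w) and cell sizes.
pr = {("X", "W"): F(4, 5), ("X", "~W"): F(1, 5),
      ("~X", "W"): F(9, 10), ("~X", "~W"): F(3, 10)}
size = {("X", "W"): 100, ("X", "~W"): 10,
        ("~X", "W"): 10, ("~X", "~W"): 100}


def global_pr(x, weights):
    """Weighted average of Pr(Y | x & w) over w, using the given weights."""
    return sum(weights[w] * pr[(x, w)] for w in ("W", "~W"))


def raw_weights(x):
    """Weights Pr(w | x) implied by the unequal cell sizes."""
    total = size[(x, "W")] + size[(x, "~W")]
    return {w: F(size[(x, w)], total) for w in ("W", "~W")}


equal = {"W": F(1, 2), "~W": F(1, 2)}

raw_x, raw_not_x = global_pr("X", raw_weights("X")), global_pr("~X", raw_weights("~X"))
norm_x, norm_not_x = global_pr("X", equal), global_pr("~X", equal)

print(raw_x > raw_not_x)    # True: raw weighting favors X
print(norm_x < norm_not_x)  # True: equal weighting agrees with the subpopulations
```

With these numbers the raw global probabilities are 41/55 and 39/110, whereas the normalized ones are 1/2 and 3/5, so the global inequality "inverts" to match the direction found within W and ~W.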

69 Often this assumption is what licenses further experimentation to validate the outcome of statistical analyses.

70 Strictly speaking, it is a metaphysical impossibility and the set of causal commitments (beliefs) is incoherent.

As for determining the independence of partitioning variables so as to counter an apparent Simpson’s reversal, one has to check for the presence of confounding factors.

Partitioning by subpopulation membership, as by W or ~W in our foregoing example, should not make for a difference in the value of the supposed causal variable (X) between subpopulations.71

Put another way, partitioning by any legitimate variable will not make it such that a data point initially assigned membership in one class (e.g., X) need consequently be reassigned as a member of the contrast class (e.g., ~X). If partitioning violates this measure, it is a clear case of illegitimate averaging that generates the inversion of probabilities characteristic of Simpson’s paradox. While avoiding this type of “benign” Simpson’s reversal often requires auxiliary information about the scenario and factors under investigation, such knowledge can often be attained in practice.
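One rough way to operationalize this check, offered here as an interpretive assumption rather than as the procedure any of the cited authors employ, is to compare how the putative cause X is distributed across the cells of the partition. If the proportion of X individuals differs markedly between W and ~W, the partitioning variable is associated with X and is a candidate confounder. The counts and function names below are hypothetical.

```python
from fractions import Fraction

# Hypothetical numbers of individuals in each cell (invented for illustration).
cell_size = {("X", "W"): 100, ("~X", "W"): 10,
             ("X", "~W"): 10, ("~X", "~W"): 100}


def pr_x_given(w):
    """Proportion of individuals in subpopulation w that fall in class X."""
    in_x = cell_size[("X", w)]
    total = in_x + cell_size[("~X", w)]
    return Fraction(in_x, total)


def partition_is_balanced():
    """True if X is equally represented in W and ~W (no association)."""
    return pr_x_given("W") == pr_x_given("~W")


print(pr_x_given("W"), pr_x_given("~W"))  # 10/11 1/11
print(partition_is_balanced())            # False: W is associated with X
```

An exact equality test is used only because the counts are stylized; with sampled data one would expect a statistical test of association instead.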

The point of this short foray through decision theory and “benign” cases of Simpson’s paradox is just to set the stage for Walsh’s alleged coup de grâce. The probabilistic associations and reversals heretofore discussed are of the type that the proponent of PIF, committed as s/he is to drawing causal inferences on the basis of statistical correlations, can readily accommodate. It is for this reason that Walsh labels these cases as “benign.” His argument that the vernacular conception of fitness and PIF are inadequate hinges on there being at least one “pathological” case of Simpson’s paradox for which his adversaries have no cogent reply. This is the topic of the next section.

It’s a Sure Thing that PIF is Pathological

To generate a pathological instance of Simpson’s paradox for PIF, Walsh begins by adopting geneticist John Gillespie’s (1974; 1975; 1977) equation for fitness (w):

$$w_i = \mu_i - \frac{\sigma_i^2}{n}$$

Here, $\mu$ is the mean number of offspring and $\sigma^2$ is the variance in number of offspring, while $n$ is the population size. The subscript $i$ signifies the $i$th genotype. He then asks us to consider the following hypothetical situation involving two competing genotypes with the following parameter values:

Genotype G1: $\mu_1 = 0.99$, $\sigma_1^2 = 0.2$

Genotype G2: $\mu_2 = 1.01$, $\sigma_2^2 = 0.4$

71 The quintessential example of this sort of benign reversal is the Ungentle Unguent.

Walsh further stipulates that this hypothetical population is constituted by fourteen six-member subpopulations and exhibits a global population total of 84 members. Granting this, Gillespie’s model delivers the verdict that G1 is fitter than G2 in each of the fourteen subpopulations j that collectively exhaust the global population:

$$w_{j,1} = 0.99 - \frac{0.2}{6} = 0.9567 \;>\; w_{j,2} = 1.01 - \frac{0.4}{6} = 0.9433$$

Regarding the population taken as a whole, however, this inequality is reversed; the fitness of G2 exceeds that of G1:

$$w_1 = 0.99 - \frac{0.2}{84} = 0.9876 \;<\; w_2 = 1.01 - \frac{0.4}{84} = 1.005$$

Walsh’s example of a population experiencing within-generation variance in reproductive output is a clear case of Simpson’s paradox since the genotype (G1) that is fitter in every subpopulation is not the most fit overall in the global population. The decisive question, of course, is whether this instance of Simpson’s paradox is “pathological” rather than “benign” for a causal interpretation.
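The arithmetic behind the two inequalities can be checked directly. The following is a minimal sketch that evaluates the fitness equation given above (mean offspring number minus variance divided by population size) for the stated parameter values at n = 6 and n = 84; the function and variable names are illustrative and not taken from Gillespie or Walsh.

```python
def gillespie_fitness(mean_offspring, offspring_variance, population_size):
    """Fitness as mean offspring number minus variance over population size."""
    return mean_offspring - offspring_variance / population_size


# Parameter values stipulated in the hypothetical case.
G1 = {"mu": 0.99, "var": 0.2}
G2 = {"mu": 1.01, "var": 0.4}

for n in (6, 84):
    w1 = gillespie_fitness(G1["mu"], G1["var"], n)
    w2 = gillespie_fitness(G2["mu"], G2["var"], n)
    fitter = "G1" if w1 > w2 else "G2"
    print(f"n={n}: w1={w1:.4f}, w2={w2:.4f}, fitter={fitter}")
# n=6: w1=0.9567, w2=0.9433, fitter=G1
# n=84: w1=0.9876, w2=1.0052, fitter=G2
```

Nothing changes between the two rows except the value of n, which is why the population size parameter becomes the focus of the discussion that follows.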

Can the standard remedies for benign instances of Simpson’s paradox be applied to prevent the reversal of inequalities Walsh presents? Normalizing the data is clearly of no help.

The fourteen hypothetical subpopulations that exhaust the population as a whole have exactly the same size (nj=6). Simpson’s reversal cannot in this case be due to the uncorrected influence of unequal sample (subpopulation) size. What of the possibility that partitioning the global population into fourteen six-member subpopulations somehow introduces confounding factors?

Barring miraculous intervention or spontaneous back mutation, events the likes of which Walsh’s scenario explicitly precludes, the criterion for partitioning does not allow the reassignment of a token member of genotype G1 (G2) as a member of its contrast class G2 (G1).

Assignment to one of the fourteen subpopulations is accordingly independent in the sense that such partitioning has no influence on a member’s genotypic identity. Moreover, the relative fitness relationship was stipulated at the outset as being the same in every subpopulation.

Random subgroup assignment and composition (i.e., the number of G1s and G2s in each subpopulation) do not differentially affect the value of genotype relative fitness within each subpopulation. In fact, there is no way to compose the hypothetical population out of subpopulations of n ≤ 10 that would fail to produce the same reversal (Walsh 2010, 164-65).
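Because the two genotypes differ only in their stipulated means and variances, the ordering of their fitness values under the equation above depends on n alone. The sketch below, which uses exact rational arithmetic and illustrative names, scans subpopulation sizes from 1 to 84 and records where the ordering changes. It is a check on the arithmetic of the stipulated values, not a reconstruction of Walsh’s own derivation.

```python
from fractions import Fraction

MU1, VAR1 = Fraction("0.99"), Fraction("0.2")
MU2, VAR2 = Fraction("1.01"), Fraction("0.4")


def fitness_gap(n):
    """w1 - w2 under the equation in the text; positive means G1 is fitter."""
    return (MU1 - VAR1 / n) - (MU2 - VAR2 / n)


g1_fitter = [n for n in range(1, 85) if fitness_gap(n) > 0]
tied = [n for n in range(1, 85) if fitness_gap(n) == 0]
g2_fitter = [n for n in range(1, 85) if fitness_gap(n) < 0]

print(g1_fitter)                      # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(tied)                           # [10]
print(g2_fitter[0], g2_fitter[-1])    # 11 84
```

On these values the gap equals 0.2/n − 0.02, so G1 is strictly fitter for n below 10, the two genotypes tie at exactly n = 10, and G2 is fitter for every larger n up to the global size of 84.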

The standard diagnostic checks for benign cases of Simpson’s paradox apparently support Walsh’s contention that there is at least one pathologically paradoxical case for proponents of PIF. But, as the qualifier “apparently” suggests, we should not be too hasty. The force of Walsh’s alleged counterexample to PIF resides in the satisfaction of two suppositions.

The first is that probabilistic fitness relations between competing genotypes be interpreted as causal relations. There is, of course, another option. These probabilistic relations could be taken as merely statistical associations, in which case the probabilistic incompatibilities between the subpopulation level and the global level would be a familiar and unproblematic mathematical artifact. For advocates of PIF, committed as they are to a causal interpretation of fitness and selection, this option is clearly unavailable. The second supposition is that the biological realm does not admit of such causal reversals. Even a single case involving a genuine causal reversal of the foregoing variety would suffice to show that a set of causal commitments like those exposed by Walsh’s counterexample can be coherent.

In spite of its prima facie plausibility, the second supposition founders on counterexamples. Evolutionary biology provides many instances of fitness reversals when multiple levels of selection are operative. Most prominent among these are cases involving altruism (Sober and Wilson 1998). Altruistic individuals typically have lower fitness than their selfish counterparts within the small groups that together comprise a large but subdivided population. As there is within-group selection against altruists, one might well think that their relative frequency should decrease in the subsequent generations. But as groups with more altruistic individuals grow faster than those composed of selfish ones, among-group selection favors altruists. Their frequent small within-group losses can be offset by large infrequent gains of the group to which they belong. The probabilistic inequality reversal which generally characterizes Simpson’s paradox also happens to identify well-known examples of selection for biological altruism. Insofar as the mathematical formalisms (e.g., Price’s equation (1970, 1972)) within which fitness reversals are detected are metaphysically neutral, these do not foist an ontological framework upon us. A proponent of PIF can thus maintain that it is no less plausible to assume the operation of multiple levels of selection than to deny the efficacy of causal processes driving selection (Otsuka et al. 2011). The causal interpretation of fitness and selection consequently remains a live possibility.
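A small numerical illustration may help to fix ideas. The sketch below assumes a deliberately simple and hypothetical fitness scheme (every group member gains a benefit proportional to the local frequency of altruists, and altruists alone pay a fixed cost) in two equal-sized groups. These parameter values and names are invented for illustration and are not taken from Sober and Wilson. The final lines partition the global change in altruist frequency into an among-group and a within-group term in the manner of Price’s equation, assuming equal group sizes.

```python
from fractions import Fraction as F

# Hypothetical parameters: baseline offspring number, cost paid by each altruist,
# and benefit to every group member per unit of local altruist frequency.
BASE, COST, BENEFIT = 10, 1, 5
groups = [(20, 80), (80, 20)]        # (altruists, selfish) in two equal-sized groups


def offspring(altruists, selfish):
    """Offspring counts of each type after one generation within a group."""
    p = F(altruists, altruists + selfish)       # altruist frequency in the group
    return altruists * (BASE - COST + BENEFIT * p), selfish * (BASE + BENEFIT * p)


def freq(altruists, selfish):
    """Frequency of altruists among the given counts."""
    return F(altruists) / (altruists + selfish)


before = [freq(a, s) for a, s in groups]
after = [freq(*offspring(a, s)) for a, s in groups]
totals = [sum(col) for col in zip(*groups)]
totals_next = [sum(col) for col in zip(*(offspring(a, s) for a, s in groups))]
global_before, global_after = freq(*totals), freq(*totals_next)

# Price-style partition of the global change into among- and within-group terms.
w = [sum(offspring(a, s)) / (a + s) for a, s in groups]     # group mean fitness
w_bar = sum(w) / len(w)
among = (sum(wk * pk for wk, pk in zip(w, before)) / len(w) - w_bar * global_before) / w_bar
within = sum(wk * (p1 - p0) for wk, p0, p1 in zip(w, before, after)) / len(w) / w_bar

print(global_before, global_after)    # 1/2 31/60
print(among, within)                  # 3/100 -1/75
```

Altruists decline within each group (from 1/5 to 5/27 and from 4/5 to 26/33) yet rise from 1/2 to 31/60 overall, and the two terms of the partition sum exactly to that global change of 1/60. The partition itself is silent on whether either term picks out a causal process.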

As some Simpson’s reversals are not only unproblematic but also seemingly necessary for the detection of (what many accept as) indisputably evolutionary processes, Walsh needs another means by which to force the hand of PIF theorists. For the sake of argument, then, he grants that there may be causal processes within a population which exhibit reversals of the probabilistic inequalities that describe their etiological structure. If so, however, PIF theorists require a principled means by which to prioritize among the causal factors that are supposedly at work. Borrowing from Judea Pearl’s (2000) work, Walsh argues that any set of causal claims is subject to a decision-theoretic constraint known as the Sure Thing Principle (hereafter STP):

(STP) An action C that increases the probability of event E in each subpopulation increases the probability of E in the population as a whole, provided that the action does not change the distribution of the subpopulations. (Pearl 2000, 181)

Another way to comprehend this point is to restate it as a constraint on genuine causal processes, namely that such processes must be description-independent (Walsh 2007, 292-93). More formally, a genuine causal relationship between two factors in a population, say X and Y, must be such that, if Pr(Y|X) > Pr(Y|~X), then there can be no other legitimate factor W by which to partition the population as a whole which would reverse or negate the probabilistic inequality (i.e., Pr(Y|X) ≤ Pr(Y|~X)) without also making it the case that a data point initially assigned membership in one class (X) need consequently be reassigned as a member of the contrast class (~X).
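The reasoning behind the STP can be sketched with the law of total probability. The notation $S_k$ for the subpopulations and $\pi_k$ for their weights is introduced here for illustration, and the condition $\Pr(S_k \mid C) = \Pr(S_k \mid \neg C)$ is one reading of the proviso that the action "does not change the distribution of the subpopulations"; it should be taken as a sketch under that assumption rather than as Pearl’s own formulation.

```latex
\begin{align*}
\Pr(E \mid C)      &= \sum_k \Pr(E \mid C, S_k)\,\Pr(S_k \mid C), \\
\Pr(E \mid \neg C) &= \sum_k \Pr(E \mid \neg C, S_k)\,\Pr(S_k \mid \neg C).
\end{align*}
% Assume \Pr(S_k \mid C) = \Pr(S_k \mid \neg C) = \pi_k for every k. Then
\[
\Pr(E \mid C) - \Pr(E \mid \neg C)
  = \sum_k \pi_k \bigl[\Pr(E \mid C, S_k) - \Pr(E \mid \neg C, S_k)\bigr] > 0
\]
% whenever every bracketed difference is positive, since the weights
% \pi_k are non-negative and sum to one.
```

On this reading, a reversal between the subpopulation inequalities and the global inequality is possible only if the weights differ between $C$ and $\neg C$, which is exactly what the proviso rules out.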

This is a subtle move on Walsh’s part, one that should not go unappreciated. From a metaphysical standpoint, he rightly concedes that there can be (and usually is) more than one causal process at work in any case of systematic transgenerational change to relative genotype frequency. A moment’s reflection reveals that no one with a proper understanding of evolution via natural selection and ecological processes should dispute this point. Yet it does not follow from this concession that every causal factor that somehow influences selective change is thereby of explanatory interest to population biologists. A particular squirrel’s having consumed just one more molecule of H2O than any of its extant conspecific competitors is exceedingly unlikely to make a significant difference to its prospects for survival. Such idiosyncratic facts are safely ignored by population biologists. They are interested in “common differences” or combinations of trait types that exhibit heritable variation of the sort that affects survival and reproduction in what is assumed to be a more or less uniform (or at least randomly experienced) set of environmental conditions. It is thus with respect to the epistemic project of explanation in population biology that Walsh presses the need for priority among competing causal commitments. Propensity theorists must grant explanatory privilege to the fitness distribution for the population as a whole in order to correctly predict subsequent change to genotype frequencies in Walsh’s case. But, then, they cannot predict or explain why G2 regularly loses to G1 in any unstructured (undivided) population. This is a fact that demands explanation in Walsh’s hypothetical case as well as in instances of group selection for altruism. The challenge for advocates of PIF becomes one of how to articulate a coherent causal interpretation of fitness that explains why, within each subpopulation, fitness distribution causes G1 to increase over G2, while, in the population overall, fitness distribution causes G2 to increase over G1 (Walsh 2010, 166-67).

Confronted with this challenge, the strategy adopted by many advocates of multilevel selection is to argue that there are two distinct types of selective processes at work: one that operates within each subpopulation and another that operates among subpopulations. For ease of reference, I will refer to these as “within-group” and “among-group” selection, respectively. In Walsh’s hypothetical population, within-group selection tends to increase the frequency of genotype G1 relative to that of G2, while among-group selection tends to increase the relative frequency of G2 overall. Now if these selective processes were truly independent, there would be no violations of the STP since there would be distinct types of fitness. Independence between two causal processes minimally requires the in-principle possibility of intervening in or manipulating one causal process while leaving the other causal process unchanged (Woodward 2003). This condition goes unmet in the present case. It is impossible to manipulate the overall fitness distribution and, thereby, among-group selection without intervening on some within-group fitness distribution. The resulting change to some within-group fitness distribution or other would subsequently affect within-group selection. Among-group selection consequently supervenes on within-group selection (Sober and Shapiro 2007). It cannot be independent since it is nothing more than the aggregate of within-group selection. If both of these selective processes are causal, then the relationship between them is subject to STP. But, as Walsh’s case reveals, they clearly violate STP when jointly assumed. As a consequence, they cannot both be explanatory causes (Walsh 2010, 167-68).

A multilevel strategy that assumes independent causal processes cannot provide us with an understanding of the population dynamics as exhibited in Walsh’s example. How can this be so? Such a strategy, after all, suffices to explain the selective dynamics within each hypothetical subpopulation. Moreover, this approach correctly predicts the selective dynamics for the hypothetical population taken as a whole if it is reconceived of as undivided. But in either case the population under examination is obviously depicted as other than it (hypothetically) is.

Restricting explanatory focus solely to subpopulation dynamics demands principled reasons for taking the set of subpopulations as an appropriate target of selective explanation. What, we might ask, justifies or motivates labeling these fourteen six-member groups ‘subpopulations’ rather than populations simpliciter? And when such reasons (e.g., being subject to a similar suite of selection pressures) are provided, it quickly becomes evident that projections on the basis of subpopulation-fitness distribution go unmet or are “inverted” for the entire population. The multilevel approach, at least with respect to Walsh’s case, offers an explanans in search of an explanandum. It cannot correctly explain selective dynamics on both levels (subpopulation and global) concurrently without recourse to a fluctuating conception of fitness.

A Rejoinder

Let us pause to summarize the argument as it stands. In Walsh’s hypothetical case, partitioning by subgroup membership does not change the stipulated relative fitness distribution between extant genotypes: G1 is fitter than G2 in every subpopulation, j1…j14. But the increased probability of an effect (increased relative frequency of G1) in each subpopulation ji does not carry over to the population as a whole, where G2 is projected to increase due to its greater fitness. This pathological Simpson’s reversal presents a blatant violation of the Sure Thing Principle. So, the set of causal commitments must be incoherent as it stands. In order to reestablish coherence, PIF theorists are subsequently left with a dilemma. On one horn, they are faced with the prospect of establishing explanatory priority among causal claims in a principled manner. On the other horn, they must realign probabilistic inequalities by making the direction of the global population inequality correspond to that observed for the subpopulation level or vice versa.

Otsuka et al. (2011) have taken the latter course. In defense of the causal interpretation of selection and the PIF, they call into question the relevancy of Walsh’s case. Two closely related stipulations are set aside as targets for scrutiny: (i) the choice of Gillespie’s (1974, 1975, 1977) equation for the calculation of fitness, and (ii) the manner in which subpopulation structure is imposed upon the population as a whole. Let us examine each of these in turn.

Gillespie’s equations for the calculation of fitness have been the subject of intense discussion for some time in the philosophy of biology (Sober 2001). It is largely as the result of a greater appreciation of Gillespie’s stochastic models that philosophical work on the conceptual status of fitness (and related processes like selection and drift) has been revitalized. Be that as it may, Gillespie’s models have long since been superseded by more sophisticated mathematical models. For instance, as Otsuka et al. point out, “Steven Frank and Montgomery Slatkin (1990) have developed a general model that has Gillespie’s model as a special case” (2011, 210-11). The choice of Gillespie’s model as representative of fitness equations generally is thus at best idiosyncratic.

That Gillespie’s model applies only to a restricted range of cases does not, however, provide adequate reason to discount Walsh’s use of it. After all, Walsh needs but one pathological case of Simpson’s paradox to stymie the causal interpretation. What makes his use of Gillespie’s model suspect is that this equation explicitly parameterizes population size ($n$), where it occurs in the denominator of the second term ($\sigma^2/n$) on the right-hand side of the equation. This feature is essential to Walsh’s argument. Unrestricted manipulation of this parameter is what enables him to generate the Simpson’s reversal between probabilistic inequalities for the subpopulations (in which n = 6) and the global population (in which n = 84).

Yet the more general models (e.g., Frank and Slatkin 1990) that subsume Gillespie’s model as a special case do not include a parameter for population size. The key issue accordingly becomes one of whether Walsh’s unrestricted manipulation of the population size parameter is warranted.

Contra Walsh, Otsuka et al. argue that subpopulation partitioning by size, as indicated by n, is not an arbitrary matter of descriptive fiat (2011, 212). There are constraints upon the value that this parameter can take even in Gillespie’s original model. In his model, the parameter n is presumably held constant by a density-regulating process that determines how many juveniles survive and reproduce. The strength of this process depends on environmental factors such as habitat condition, abundance and quality of food sources, number of predators, etcetera. It is thus an objective property of surrounding environmental conditions, rather than our abstract conception of population, that determines relevant population size. Population size matters to fitness precisely because it serves as an indicator of such concrete selection pressures.

Describing, defining, or partitioning an actual population is more than a matter of biological or philosophical whimsy. If the fitness of an individual that lives in a population of six is less than that of an individual living in a population of eighty-four, it is because they are subject to different density-regulating processes that arise from different sets of environmental factors (Otsuka et al. 2011, 212).

Notice that the foregoing criticism grants Walsh the use of Gillespie’s model. And for good reason, since greater explanatory scope is a metatheoretical virtue or value (McMullin 1982) that can at times be less than decisive in contests between theories.72 The fact that central equations of Newtonian mechanics can be derived from Einsteinian relativity does not dissuade us from using Newton’s laws of motion to predict the macroscopic behavior of many phenomena. Predictions made on the basis of Newton’s laws coincide with those made by the more elaborate equations of Einsteinian relativity in a limited (but practically relevant) range of cases. Moreover, Newton’s laws are simply easier to apply in practice. In much the same fashion, Gillespie’s (1974) equation for fitness is easier to comprehend than Frank and Slatkin’s (1990) model and applies equally well in a limited range of cases. Although Otsuka et al. make it seem as though it is by sheer good will that they grant Walsh his choice of model, this is anything but the case. They must concede the use of Gillespie’s model on the grounds that its idiosyncrasy does not alone make for discursive impropriety.

72 This assumes a semantic account of theory whereupon theories are variegated entities comprised of mathematical or graphical models, methodological procedures, pictorial representations, as well as logical arguments.

What about Otsuka et al.’s case for constraints on the parameter for population size? Here their argument fares much better. Walsh claims that “[i]t is legitimate for biologists to investigate the dynamics of whole populations and their subpopulations; howsoever the latter are demarcated” (2010, 165). For him, then, any partitioning of the population is permissible. But sensible partitioning subdivides a population only on the basis of further theoretically relevant information. Theoretically relevant partitioning variables in population genetics must meet at least three conditions. First, partitioning must yield actual subpopulations or contrast classes as proper subsets. Without such variation, as in variants of a genotype, there can be no natural selection. Second, the variation exhibited by these subpopulations must somehow influence the probabilistic relationship between variables of explanatory interest. This is just to restate the well-worn demand for “distinctions that make a difference” which is at the heart of the statistical relevance model for explanation (Salmon 1984). Third, the background conditions against which partitioning proceeds must be considered more or less uniform or randomly experienced by the organisms that constitute a population. ‘Background conditions’ has two senses here. In the first sense, it refers to intrinsically uniform aspects of the organisms that constitute a population. Such characteristics exhibit no variation but nevertheless serve to identify each of these individuals as belonging to a type (species). Whereas the first sense of ‘background conditions’ refers to aspects that are intrinsic to population members, the second sense emphasizes extrinsic or relational features between the external environment and the organisms in a population.

Otsuka et al. take issue with Walsh’s presumption that the second sense of this third condition is satisfied by his counterexample. They find it totally unsurprising that an organism with a particular genotype which lives in a population of eighty-four members should have a fitness value that differs from what it would have in a six-member population. There are, as they see matters, different sets of environmental factors that instantiate the density-regulating processes in these two situations. The extrinsic background conditions in the two situations would accordingly be different, perhaps even radically so. If this is correct, Walsh has unknowingly introduced a new partitioning variable into the probabilistic relationship between fitness and the relative increase of one genotype over its extant competitor by dividing what was assumed to be a uniform set of background conditions.

A more formal approach can bring clarity to Otsuka et al.’s contention. Borrowing from Walsh’s example, let us establish H1 as the hypothesis that genotype G1 increases in relative frequency over genotype G2. The null hypothesis H0 would, then, be that the relative frequency of G1 shows no such increase. Let Sj be a partitioning variable that indicates (sub)population structure, where the subscript j designates some subpopulation or other. W indicates a uniform relative fitness distribution. The variable which stands for background conditions of the extrinsic sort that ground density-regulating processes must also be made explicit. Let us label this variable E. Given the parameter values for the mean number of offspring and variance in offspring number that he provides for each genotype, Walsh wants it to be the case that the following reversal of probabilistic inequalities holds:

Global Population: Pr(H1|W, E) < Pr(H0|W, E)

Subpopulation: Pr(H1|W, Sj, E) > Pr(H0|W, Sj, E)

How the partitioning variable Sj is conceived of here is of the utmost importance. It is proposed as a structural property of a population that is either realized (Sj) or unrealized (~Sj) and, hence, suppressed within the expression of the global inequality. But ‘~Sj’ is ambiguous; it admits of several mutually exclusive interpretations. It could mean the total absence of any (sub)population structure as might be the case for a mere aggregate of individuals. An alternative interpretation would have it such that the negation can be applied only to instantiations of j. When j=1, for example, ‘~Sj’ would then be shorthand for the subpopulation structures exemplified by the extant subpopulations other than that designated as S1. Yet another interpretation might characterize ‘~Sj’ as a general catch-all term, the extension of which is a non-empty contrast class. On such a reading, it would indicate not that there is no population structure, but rather that what structure there is differs from that exhibited by any of the subgroups S1…S14 in Walsh’s example.

Otsuka et al.’s argument suggests that Walsh must bend to a reading of the negated structure variable as a general catch-all term. Interpreting the negated variable as the absence of any structure whatsoever is not conducive to Walsh’s ends. On such a reading, the global population is equivalent to an “oversized subpopulation” in the sense that the selective dynamics would favor an increase in the relative frequency of genotype G1 over G2. This scenario would not generate the Simpson’s reversal of probabilistic inequalities that Walsh requires. What of the second interpretation, whereupon the negated structural partitioning variable reflects a contrast class consisting only of extant subpopulations? No matter which of these subpopulations instantiates the subscript variable j, the relative fitness relationship between G1 and G2 would remain consistent across all thirteen remaining subpopulations in the contrast class. Here, again, there is no resulting Simpson’s reversal. Consequently, the only interpretation that survives to generate the desired Simpson’s reversal is one upon which what structure there is differs from that exhibited by any of the subgroups S1…S14.

Unfortunately, the reversal licensed by the foregoing interpretation comes with a cost that Walsh cannot afford to pay. We know that, whatever structure this unique population has, it is unlike that exhibited by any division of it into subpopulations of n = 6. This population has a size of n = 84. The concrete external conditions that make explicit expression of the parameter for population size relevant and important to calculations of fitness are accordingly different.

Members of this population experience environmental conditions that enable the survival of 84 cohabiting (i.e., undivided) conspecifics. As the structure of the population as a whole is unique (~Sj = ~(S1 ∨ S2 ∨ … ∨ S14)), let us rename it ‘S15’. The reversal of probabilistic inequalities that is actually engendered, contra Walsh’s claim, is consequently not in violation of the Sure Thing Principle:

Population 15: Pr(H1|W2, ~Sj, ~E) < Pr(H0|W2, ~Sj, ~E)

Populations 1-14: Pr(H1|W1, Sj, E) > Pr(H0|W1, Sj, E)

The structural variable changes the distribution of the subpopulations in this case and thus fails to satisfy the second provision of the STP. Partitioning by Sj makes it the case that an individual genotype cannot experience conditions ~E and so should not be held to expectations based on the relative fitness distribution W2 (whereupon G2 is fitter than G1). But if individuals are not subject to the expectations of relative fitness distribution W2, then there is no reason to believe Pr(H1) < Pr(H0) within subpopulations experiencing E. It thus makes no sense to calculate two different fitness values (one within-group and another for the population as a whole) for one and the same individual.

It is a given that changes to relevant background conditions (i.e., differing sets of selection pressures) can result in relative fitness differences. Changes to structure (population size) entail a change to background conditions, and these concrete external conditions determine the fitness distribution. There is no way to tease apart the partitioning variable for structure from its corresponding fitness distribution in this case; ~Sj is inextricably bound to W2, while Sj is likewise bound to W1. This should not surprise. In any population of finite size drift will play some role in selective dynamics. And the magnitude of drift is inversely proportional to population size. If there is no more to structure than sheer population size, then the partitioning variable Sj (or ~Sj) is merely a constitutive component of fitness made explicit. It is simply inappropriate to think of these probabilistic inequalities as pertaining to one and the same population under differing descriptions since a uniform set of background conditions against which to partition is absent.
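The dependence of the fitness ordering on population size can be made concrete with a minimal sketch. It assumes Gillespie's equation in the form "mean minus variance divided by n". The means and variances below are hypothetical values chosen purely for illustration, and the function names are likewise illustrative; only the sizes n = 6 and n = 84 come from the text.

```python
def gillespie_fitness(mean_offspring, variance, n):
    """Fitness as mean reproductive output minus variance scaled by population size."""
    return mean_offspring - variance / n

# Hypothetical genotype parameters (assumed for illustration only).
G1 = {"mean": 0.99, "variance": 0.2}
G2 = {"mean": 1.01, "variance": 0.4}

def fitter_genotype(n):
    """Return the label of the genotype with the higher fitness at population size n."""
    w1 = gillespie_fitness(G1["mean"], G1["variance"], n)
    w2 = gillespie_fitness(G2["mean"], G2["variance"], n)
    return "G1" if w1 > w2 else "G2"

# n = 6 is the subpopulation size and n = 84 the undivided population size.
for n in (6, 84):
    print(n, fitter_genotype(n))
```

Under these assumed values the ordering flips between the two sizes, which illustrates why the structural variable cannot be separated from the fitness distribution it helps to determine.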

Unwarranted Constraints on Population Size

Otsuka et al.’s reply appears damaging indeed to Walsh’s case against the PIF and a causal interpretation of selection. But is the rejoinder fatal? That depends on whether the case for constraints on the population size parameter in Gillespie’s model is cogent. In what follows I will argue that Otsuka et al.’s reasoning is less than decisive. Let me stress at the outset, however, that arguing as such does not necessarily entail a defense of Walsh’s acausalist interpretation of fitness.

It cannot be denied that there is much empirical and theoretical support for Otsuka et al.’s plea for constraints on the population size parameter. One of the basic questions in population genetics involves estimating the amount of allelic differentiation (heterozygosity) among the subpopulations constituting a metapopulation (Wright 1965). This is a prerequisite for determining whether natural selection and other evolutionary factors are operative. Doing so demands ascertaining the actual magnitude of genetic drift, which of course depends on population size. But a simple count of the number of individuals in a subpopulation accurately estimates the magnitude of drift only when (i) there is an equal sex ratio, (ii) sexual and natural selection are inoperative, and (iii) subpopulation size remains constant in each generation. These are unrealistic assumptions. To determine the magnitude of genetic drift in actual cases that violate these assumptions, the concept of effective population size (Ne) is needed. Effective population size is the size of an idealized population (i.e., one satisfying the aforementioned unrealistic assumptions) that would experience the same magnitude of genetic drift as the actual population of interest (Conner and Hartl 2004, 62; Nunney 1991). Ne reflects features of the environment that organisms experience, ones which significantly affect subsequent evolutionary dynamics. It requires measurement and careful estimation. Inaccurate estimates of it render theoretical models completely inapplicable to real populations. It would seem as though the value of the population size parameter, contra Walsh, should not be arbitrarily established (Otsuka et al. 2011).

The need for effective population size is especially apparent in cases that involve sexual selection. The occurrence of sexual selection violates idealizing assumption (ii) above, which stipulates the absence of selection, sexual or otherwise. To suggest that selection is inoperative is just to say that each individual has an equal probability of successfully contributing gametes to the next generation. When, by way of example, only a few dominant males account for most of the mating during a generation, the majority of males do not have an equal chance of contributing to the subsequent generation. In such a case, effective population size is decreased because the males that do not mate are effectively eliminated from the population. Variance in reproductive success among individuals that exceeds random expectations reduces effective population size because some individuals are overrepresented and others are underrepresented (Conner and Hartl 2004).

Variance in reproductive success is commonplace in nature. For instance, Clutton-Brock, Albon, and Guinness (1988) found over four times greater variance in lifetime reproductive success for males (Vm=41.9, Nm=33) than for females (Vf=9.1, Nf=35) in red deer (Cervus elaphus). In cases where sexual selection is operative, Ne can be estimated with the following equation:

\[ N_e = \frac{8 N_a}{V_m + V_f + 4} \]

Using Clutton-Brock et al.’s numbers in place of the variables yields:

\[ N_e = \frac{8 \times (33 + 35)}{41.9 + 9.1 + 4} = 9.89 \]

Due to the variance in reproductive success, effective population size for this particular population of red deer is thus about 1/7 the actual population size (Na=68). As the strength of sexual selection increases, so, too, does variance in mating success. This, in turn, reduces the effective population size. Examples such as this show just how important it is to temper Walsh’s contention that “It is legitimate for biologists to investigate the dynamics of whole populations and their subpopulations; howsoever the latter are demarcated” (Walsh 2010, 165). Neither biologists nor philosophers should have the freedom to use just any population size to instantiate the parameter n in Gillespie’s model when there is strong sexual selection.
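The arithmetic behind the red deer estimate can be checked with a short sketch. The function name is illustrative rather than drawn from any library, and the inputs are simply the figures reported above for Clutton-Brock et al.'s population.

```python
def effective_size_sexual_selection(n_actual, var_males, var_females):
    """Ne = 8 * Na / (Vm + Vf + 4), the estimator quoted in the text."""
    return 8 * n_actual / (var_males + var_females + 4)

n_males, n_females = 33, 35          # Nm and Nf from the red deer study
n_actual = n_males + n_females       # Na = 68
ne = effective_size_sexual_selection(n_actual, var_males=41.9, var_females=9.1)

print(round(ne, 2))             # effective population size
print(round(ne / n_actual, 3))  # ratio of effective to actual size
```

The ratio of effective to actual size comes out at roughly 0.145, which is the "about 1/7" figure cited in the text.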

Must biologists restrict themselves to using effective population size in the manner that Otsuka et al. suggest? The case for this claim is anything but given in spite of its prima facie plausibility. There are at least two issues confronting their contention, one motivated by practical or methodological obstacles and the other based on theoretical concerns.

Even if it is accepted that effective population size is the appropriate instantiation of the population size parameter in Gillespie’s model, there are well known methodological obstacles to obtaining accurate estimates. These obstacles become apparent when examining populations that fluctuate in size. Extreme decreases in population size are typically called “bottlenecks.” A founder event is a particularly important type of bottleneck, one in which a new population is created by a small number of colonists. By chance alone, allele frequencies in a small group of colonists often diverge significantly from those in the population from which they originated. Moreover, the initially small size of the colonizing group usually makes for additional genetic drift.

To calculate the effective population size for populations that experience extreme fluctuations, the harmonic mean population size is employed:

\[ \frac{1}{N_e} = \frac{1}{t}\left(\frac{1}{N_1} + \frac{1}{N_2} + \cdots + \frac{1}{N_t}\right) \]

Nt refers to the actual population size in generation t. Small numbers, such as those that occur during a founder event, have a very strong influence on harmonic mean values. By way of generic example, suppose that a population has sizes 1000, 10, and 1000 over three successive generations. The arithmetic mean population size is accordingly 670. Effective population size, as per the aforementioned equation, is but a meager 29. This large discrepancy unveils a practical problem for trying to determine the effects of fluctuations on Ne: we need to know the history of population sizes in order to discount the inflated effect(s) of bottlenecks in our calculations of effective population size and thus the magnitude of genetic drift. Unfortunately, this type of data is very difficult to obtain and examples of studies that have managed to collect such data are hard to come by.73 Even when there are data on the actual population sizes historically realized by a population, calculations almost certainly overestimate the value of Ne since these calculations include actual population sizes. More realistic estimates of Ne could be given if estimates of Ne for each year were known and, then, substituted into the harmonic mean equation in place of actual population sizes. But obtaining such detailed data over a number of years is even more difficult than acquiring data on actual population size, which is taxing enough.

73 For good examples see Peter and Rosemary Grant’s studies of Galapagos finches, E.B. Ford’s studies of the moth Panaxia dominula, and Clutton-Brock et al.’s work with red deer.
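The contrast between the arithmetic and harmonic means in the bottleneck example (sizes 1000, 10, and 1000) can be reproduced with a minimal sketch. The function names are illustrative only; the calculation follows the harmonic mean equation given above.

```python
def arithmetic_mean_size(sizes):
    """Ordinary average of the per-generation population sizes."""
    return sum(sizes) / len(sizes)

def harmonic_mean_ne(sizes):
    """Effective size from 1/Ne = (1/t) * sum(1/N_i) over t generations."""
    t = len(sizes)
    return 1 / (sum(1 / n for n in sizes) / t)

sizes = [1000, 10, 1000]   # three successive generations with one bottleneck

print(arithmetic_mean_size(sizes))       # arithmetic mean
print(round(harmonic_mean_ne(sizes), 1)) # harmonic mean (effective size)
```

A single generation of ten individuals is enough to pull the effective size down to roughly 29, even though the arithmetic mean is 670.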

The worry here presented is obviously more of a problem in practice than it is in principle. Nevertheless, it calls for a change in methodology that favors Walsh’s call for more liberal manipulation of the population size parameter. When biologists do not have data on historically realized population sizes, let alone effective population sizes, they must make educated guesses as to the value of this parameter. There are several different ways to approach this task. One could census a particular population over future generations and infer that the population sizes realized prior to census were not on average terribly divergent. Yet another “a posteriori” approach would have one use census data drawn from different populations of the same species or even closely related species (e.g., congeners) with similar life history requirements for the purpose of estimation. Alternatively, when detailed information about the survival requirements for a particular population is in hand, including information regarding the prevalence of common selection pressures, it is possible to construct mathematical models and simulations. Such “a priori” modeling does not require preexisting intergenerational census data.

Within this explanatory framework, models and simulations are typically judged on how well they predict or fit subsequently acquired data, usually discounted by the number of parameters a model happens to include (e.g., Akaike Information Criterion scores). This general approach is common to both maximum likelihood and Bayesian analyses, which inquire as to the likelihood of observational data given alternative competing models. It should also be noted that these three different approaches to the problem of estimating effective population size can be used in conjunction.

What is a biologist to do when confronted with a newly discovered population for which there is no previous census data? Suppose that this population consists of just six individuals, each of whom exemplifies one of two extant competing genotypes, G1 and G2, and happens to be of the same species as the population cited in Walsh’s case (n=84). Should population biologists resist entertaining or appropriating the dynamics of Walsh’s subpopulations (n=6) because of the arbitrary way they were conceived? Of course not. Science does not grind to a halt. Scientists should be granted the use of any (reasonable) means at their disposal (Feyerabend 1975). The model that our biologists develop on the basis of Walsh’s perhaps ill-conceived subpopulations might prove informative even though inapplicable to the subpopulations that informed its construction.

In partial defense of Walsh, then, it should be acknowledged that a moratorium on investigating the dynamics of population subparts is not justified by the fact that evolutionary biologists are generally interested in explaining (global) population-level evolutionary phenomena. Investigating trait (genotype) dynamics within the subpopulations needs to incorporate the size of these subdivisions, which in turn makes possible a reversal of the fitness values assigned to the traits (Ramsey 2013), albeit perhaps not in one and the same population.

A related worry emanates from the theoretical arena. Effective population size is derived from data on actual population size. It is often presumed that the actual population sizes recorded reflect equilibrium population size. That this is a standard assumption is evident in what typically follows, namely an investigation of equilibrium dynamics. Population biologists aim to determine the character of these equilibrium values. They want to know whether such equilibria are stable, partially stable, or unstable in nature. The methods for determining as much are well known and typically call for sophisticated applications of calculus and linear algebra.

Applications of calculus involve the study of what happens in the wake of small perturbations to equilibrium values. While preexisting data can inform the construction of formal models, models (in population biology) must also face the “tribunal of experience” in the form of independent evidence. Such independent evidence usually comes by way of data generated via controlled laboratory studies or field experiments. Especially in the laboratory setting, a biologist has the ability to manipulate parameter values well beyond values observed in “wild” populations. And it behooves any practicing ecologist or evolutionary biologist to do so. For the number of offspring born in subsequent generations rarely equals the projected expectations for equilibrium population size.

The commonplace occurrence of more (or sometimes fewer) offspring being born than can possibly survive in a particular environmental locale makes the study of non-equilibrium dynamics paramount. A good example of the need and appreciation for non-equilibrium dynamics can be gleaned from population ecology. In this area, experimental treatments are often concentrated near the low end of the population density gradient to capture the transition from exponential to density-dependent recruitment (Miller and Inouye 2011). The transition is crucial for it signals the population densities at which limiting factors and natural selection take effect.74 The location (i.e., range of population densities), functional form, and magnitude of the transition can provide significant insight into the nature of the selection pressures that drive evolutionary population dynamics. For present purposes, though, it should be noted that this transition need not occur at values anywhere near the actual (equilibrium) population size or even the effective population size. Studying selective dynamics at non-equilibrium population sizes is not only incidentally informative, it is often necessary.

74 That this example is taken from population ecology and its focus on recruitment is no reason to discount its relevance to evolutionary considerations. One can, for example, transform recruitment into average female lifetime reproductive success (i.e., a proxy for female fitness) just by dividing total recruitment (N_{t+1}) by the number of females in the preceding parental generation (W_f = N_{t+1}/F_t).

Theoretical concerns like the foregoing provide reason for reconsidering restrictions on the permissible values for the population size parameter in Gillespie’s model. Otsuka et al. no doubt realize that effective population size is an idealization. It indicates the size of a counterfactual population with stable size and equal sex ratio, which does not experience any form of selection. Nor would they deny that use of idealizations is justifiably ubiquitous across the sciences. It is nevertheless worth delving into what this commitment implies.

Idealization assumes that the applicability of a formal model or truth of an explanatory generalization depends on the realization of antecedent conditions that are not actually realized or perhaps only counterfactually realizable (Cartwright 1983, 1999). Granting this point does not entail that idealized models and explanations are thereby inapplicable to actual situations. By contraposition, it follows instead that the failure of a law-like generalization or predictive model often provides us with good reason to believe that antecedent conditions have somehow gone unmet, often for theoretically interesting reasons. In population biology, antecedent conditions for the applicability of law-like generalizations (e.g., Hardy-Weinberg equilibrium) consist of the absence of evolutionary mechanisms or underpinning ecological circumstances. Violations of these antecedent conditions, then, imply the operation or presence of at least one evolutionary mechanism in an actual population (Elgin and Sober 2002, 443-444). But beyond the inference that one or more evolutionary mechanism is at work, questions about the number, direction, magnitude, and duration of evolutionary factors linger. Deviations from equilibrium inform answers to these subsequent questions only if equilibrium values have been accurately estimated.

One of the ways that Gillespie’s equation for fitness might produce the wrong result occurs when the parameter for population size is incorrectly instantiated. As Otsuka et al. effectively show, this can happen when effective population size, an ideal value based on what is presumed to be equilibrium population size, is not used to instantiate it. What they fail to appreciate is that models incorporating effective population size can lead biologists astray even when effective population size is correctly estimated. This can happen when non-equilibrium dynamics go unexamined, which is tantamount to ignoring the evolutionary dynamics around population densities that are neither equilibrium values nor derived ideal values. Mathematically, this is exemplified by the occurrence of what are called “basins of attraction.” What grounds this mathematical phenomenon ontologically is typically the existence of pleiotropy and epistasis.

Conclusion

Walsh (2010) has argued that the propensity interpretation of fitness (PIF) and the causal interpretation of selection that accompanies it can only be tolerated on pain of irrationality. By way of counterexample, he proposes a hypothetical population in which any uniform causal interpretation of fitness, or rather a fitness distribution describing relative fitness between two extant genotypes, leads to incompatible probabilistic expectations concerning subsequent relative frequency of the genotypes in question. The Sure Thing Principle (STP) stipulates that a set of causal commitments cannot admit such inconsistency unless partitioning variables change the composition of the subpopulations initially formed on the basis of an independent explanatory variable. Walsh claims that partitioning by subpopulation size alone does not violate this provision and, thereby, his counterexample allegedly constitutes a genuinely pathological instance of Simpson’s paradox. Otsuka et al. (2011) counter that not just any change to the population size parameter is permissible; for some changes make the resulting calculations of fitness inapplicable to actual populations. They suggest constraints on the population size parameter such as restricting its instantiation to the value of the effective population size (Ne). However, this restriction is surely too strong. I have provided both practical and theoretical reasons for a more liberal approach to the population size parameter in Gillespie’s equation for fitness and elsewhere.

My arguments for the free manipulation of this parameter do not entail the legitimacy of Walsh’s counterexample or a defense of his acausal interpretation of fitness. The liberal approach to the population size parameter is grounded in what are ultimately pragmatic methodological concerns. Subparts (subpopulations) of a population can, when examined, show fitness distributions which are inconsistent with expectations for relative genotype frequency in the population that they collectively and exhaustively constitute. However, the fitness values calculated for these subpopulations have theoretical (explanatory) relevance only insofar as they inform our understanding of the evolutionary dynamics for populations of similar size that are not subpopulations (proper subsets) of some global population. When restricted to a single population, only the fitness values calculated for the population as a whole (globally) and the corresponding selective process should be read in a causal fashion. This is a difference that must not be overlooked. And it is the reason why we should ultimately side with Otsuka et al. against Walsh in this dispute.

CHAPTER 3

HONEST PROPENSITIES: IS THERE A CRACK IN THE FOUNDATION?

While answering early concerns about explanatory circularity, the propensity interpretation of fitness as originally articulated by Robert Brandon (1978) and Susan K. Mills and John Beatty (1979) foundered upon issues pertaining to the measurement of fitness as a scalar value in the face of demographic and environmental stochasticity (Gillespie 1977, Beatty and Finsen 1989, Brandon 1990, Sober 2001). Staunch critics of the propensity interpretation (Matthen and Ariew 2002; Walsh, Lewens, and Ariew 2002; Ariew and Lewontin 2004; Ariew and Ernst 2006) and self-conscious proponents of propensities (Abrams 2007, 2009; Millstein 2006) alike have taken the inability of the propensity interpretation to overcome such measurement problems as a decisive reason for concluding that fitness cannot remain an explanatorily relevant concept when construed as a probabilistic dispositional property of individual organisms. By reconfiguring the mathematical foundations of the propensity interpretation, Pence and Ramsey (2013) have effectively countered the aforementioned measurement worries. But, as I will argue, the new formal foundation they have laid for the propensity interpretation risks losing sight of the basic reference class (a biological population) for which explanation of adaptation is typically sought. Furthermore, there is at least one alternative mathematical model which does all the work that Pence and Ramsey’s model is designed to do but with a much sparser ontology. Taken in tandem, these critical points demonstrate that the new formal model is neither sufficient nor necessary. They do not, however, undermine the propensity account of fitness since an alternative formal model is on offer.

Background

A philosophically-insightful resolution to the problem of explanatory vacuity75 in evolutionary population biology was articulated in the late 1970s if not somewhat earlier. Adapting Karl Popper’s conception of propensity,76 Robert Brandon (1978) and John Beatty (1979, along with Susan K. Mills) independently argued that biological fitness is best conceived of as a probabilistic dispositional property or propensity. Brandon and Beatty (1984) nicely summarize the upshot of this maneuver:

On the propensity interpretation, fitness is a probabilistic disposition or ability explicated in terms of expected rather than actual reproductive success (in the mathematical sense of 'expected value'). Inasmuch as the connections between an entity's dispositions or abilities and its actual behaviors are causal connections rather than analytic connections, the propensity interpretation of 'fitness' allows for genuinely explanatory accounts of differential reproduction in terms of differential fitness […] Informally, 'fitness' is defined in terms of abilities to reproduce, not in terms of actual reproductive success. (33-34)77

This proposal sidesteps the explanatory worries that accompany a naïve definition of fitness in terms of realized fitness. On such an oversimplified view, there is supposedly nothing more to the notion of fitness than actual reproductive success. But the claim that ‘fitter organisms reproduce in greater number than less-fit conspecifics’ is then equivalent to ‘the organisms that reproduce more outreproduce those which reproduce less.’ This tautology is obviously devoid of empirical content and thus explanatory import; for it clearly fails to account for why any of the organisms reproduce in greater (or lesser) number than their conspecific competitors. Actual lifetime reproductive success cannot constitute fitness if the notion of fitness is to maintain its central explanatory (causal) role in evolutionary population biology.

75 Here, “explanatory vacuity” is a deliberately vague way of acknowledging the mostly misguided worries (Popper 1974) that motivated philosophical work on the concept of fitness. These typically fall under the moniker “the tautology problem.”

76 See the following: (1) Popper, K. R. The Propensity Interpretation of the Calculus of Probability, and the Quantum Theory. In: S. Körner (ed.): Observation and Interpretation. Academic Press Inc., Butterworths Scientific Publications, pp. 65–70. 1957. (2) Popper, K. R. The Propensity Interpretation of Probability. British Journal for the Philosophy of Science: 10, No. 37, 25–42. 1959.

77 Brandon, R. and J. Beatty (1984). The Propensity Interpretation of ‘Fitness’: No Interpretation is No Substitute. Philosophy of Science, 51(2): 342-357.

The propensity interpretation holds that fitness-value ascriptions are always issued on the basis of membership in a class defined by reference to a trait-type or trait-variant. Henry B.D. Kettlewell’s famous study of industrial melanism offers a simple illustration.78 He wished to understand why the melanic form (carbonaria) of the peppered moth Biston betularia tends to predominate and eventually replace the standard light form (typica) in industrial areas containing soot-covered trees. In Kettlewell’s study a fitness-value ascription for any member of the population defers to a particular individual’s coloration, as being either melanic or light. Only after averaging over the reproductive contributions of all the known individuals bearing a trait-variant, say melanic, in a given population would any token member of that melanic type be ascribed a reasonably precise and thus interesting fitness value.79 The advantage of this approach is that individuals bearing a specific trait-variant can exemplify an expected fitness-value without necessarily manifesting the calculated fitness-value for their reference (trait-variant) class. Only over many repeated trials of the same or relevantly similar type would we expect fitness values to converge upon the calculated mathematical expectation.

The mathematical model that captures this insight was introduced and later refined by Brandon (1978, 1990):

\[ A(O,E) = \sum_i P(Q_i^{OE})\, Q_i^{OE} \qquad (1) \]

It has become the standard formal model in the philosophical literature. ‘A(O,E)’ should be read as ‘the “adaptedness” (i.e., fitness) of organism O in environment E’. Each ‘$Q_i^{OE}$’ is a possible number of offspring, the whole number value of which can range (theoretically) from 0 to ∞. ‘$P(Q_i^{OE})$’ is the probability associated with a possible number of offspring being realized, a weighting which is typically generated by way of accumulated statistical data.

78 Majerus, Michael E. N. (2005), The rise and fall of the carbonaria form of the peppered moth, in Fellowes M. D. E., Holloway G. J., Rolff J., "In Insect evolutionary ecology", Quarterly Review of Biology, 78: 399–418

79 This example is obviously oversimplified. To accurately estimate the fitness of any (non-lethal) variant not only requires normalizing relative to alternative extant variants, but also taking into account the selection for and heritability of every other selectively non-neutral trait.

At the time of its introduction, this formal model provided a more rigorous way to understand what is perhaps the most basic explanatory relationship within evolutionary biology, namely the “being better adapted than” or “fitter than” relation within a population (Bouchard and Rosenberg 2004). All systems subject to the evolutionary process are arguably governed by the common expectation that if organism X is better adapted than organism Y in environment E, then (probably) X will have more (sufficiently similar) offspring than Y in E (Brandon 1978).

Brandon would later (1990) call this basic assumption “the principle of natural selection” on the grounds that it must be presupposed when explaining the dynamics of any evolutionary system.

Nothing that I have said about the formal model thus far implicates what is arguably its most important contribution. The model’s rigor is due in no small part to its measurement theoretic scale. Not only does its application allow us to determine whether one type of organism is fitter than another, it also permits inferences about just how much fitter a conspecific competitor is. Since the model is formulated on a ratio scale80 rather than an ordinal scale we can ascertain the precise magnitudes of the relative fitness differences under investigation. This, of course, allows for generating and testing predictions.

80 There is a non-arbitrary, unique, and biologically realistic zero value; namely, having a fitness value of ‘0’. Moreover, the geometric and harmonic means are permissible measures of central tendency. Brandon’s later (1990) modifications to the original formal model make higher mathematical moments explicit. The necessity for this shift is discussed below.

Problems for the PIF

It is this latter aspect of the formal model for the PIF that has also proven to be its greatest vulnerability. As the model returns exact scalar values upon being supplied with statistical data, demonstrating its insufficiency requires only that one present biologically plausible cases in which the measures of fitness turn out to be imprecise (i.e., falling outside conventional thresholds for statistical error) or highly unrealistic. Three types of cases exploit the opportunity to do so and have thereby plagued the propensity interpretation. I will follow Pence and Ramsey (2013) and refer to these cases as (i) the (mathematical) moments problem, (ii) the delayed selection problem, and (iii) the timing of offspring problem. Let us examine each of these.81

The so-called “moments problem” arises because model (1) determines fitness by computing the weighted arithmetic mean of possible offspring for an organismal (trait) type.

Imagine a hypothetical fitness distribution for a population of asexual organisms in which there are two mutually exclusive and exhaustive variants of a selectively non-neutral trait-type: blue or red. Let us stipulate that there are ten members exemplifying each color variant and, hence, an initial population of just twenty individuals. Census data indicates that blue individuals always contribute four offspring apiece to the subsequent generation, whereas red individuals within a generation never fail to contribute either two or six offspring apiece with equal probability. With these values in hand, expected fitness can be calculated as follows:

Fitness of Blue Form = A(OB,E) = (1.0)4 = 4

Fitness of Red Form = A(OR,E) = (0.5)2 + (0.5)6 = 4

81 I draw heavily on Pence and Ramsey’s (2013) succinct description of these three problems for the propensity interpretation. See especially pages 856-858.

Notice that both forms have the same (arithmetic) average number of possible offspring. There is thus no prima facie reason to believe that one form would reproduce in greater number than its competitor. According to the basic formal model for the PIF that is on offer, we should conclude that the trait in question exhibits selective neutrality and may be at the mercy of chancy, random, or drift-like circumstances.
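The equality of the two expectations can be verified with a minimal sketch of model (1). The dictionary representation of an offspring distribution and the function name are illustrative assumptions; the probabilities and offspring numbers are those stipulated for the blue and red forms.

```python
def adaptedness(offspring_distribution):
    """Expected offspring number: sum of P(Q_i) * Q_i over possible offspring counts.

    offspring_distribution maps each possible offspring number Q_i
    to its probability P(Q_i).
    """
    return sum(p * q for q, p in offspring_distribution.items())

blue = {4: 1.0}          # always four offspring
red = {2: 0.5, 6: 0.5}   # two or six offspring with equal probability

print(adaptedness(blue))  # expected offspring, blue form
print(adaptedness(red))   # expected offspring, red form
```

Both calls return the same expectation of four offspring, which is exactly why the weighted arithmetic mean alone cannot distinguish the two forms.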

This conclusion is more than a bit dubious if for no other reason than that the trait in question, color variation, was stipulated at the outset as selectively non-neutral. In other words, we (in our guise as biologists) already know or at least strongly suspect that variation in this trait somehow affects the chances of survival and reproductive success in a particular population and environment. Perhaps a local predator with acute color vision preferentially preys on one color form rather than another. What we (still in our guise as biologists) would like to discover is the direction and magnitude of the difference that color variation effects in this population. The proponent of PIF as thus far formalized holds that simple mathematical expectations supposedly exhaust the notion of fitness. When such expectations turn out to be of equal value, as they do in the going example, there is no variation in fitness as fodder for natural selection and thus no systematic evolutionary change. But it is simply wrongheaded for a proponent of the PIF to maintain that the trait in question makes no difference. The formal model for the PIF, at least as initially formulated, must give way to the first-hand knowledge that biologists have of a system in a case like this.

Matters are not quite as bleak as they first appear for the PIF. It has been known for some time that calculations of individual fitness are sensitive to higher mathematical moments of the offspring distribution such as variance, skew, and kurtosis.82 This much was already explicit in John Gillespie’s (1974) general equation for the fitness of a trait:

$$\omega_i = \mu_i - \frac{\sigma_i^2}{n}$$

Here, $\omega_i$ is the fitness of a trait, $\mu_i$ is the arithmetic mean of the distribution in reproductive output, $\sigma_i^2$ is the distribution’s variance, and $n$ is the population size. Other things being equal, when two variants have the same expected number of offspring but differ with respect to variance in possible offspring, the organism (variant type) with lower variance will be more fit.
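
To see the variance penalty at work, the following is a minimal Python sketch that plugs the blue and red offspring distributions from the running example into Gillespie’s expression. Two assumptions are mine rather than the source’s: that n can be set to the initial population of twenty, and that the red form’s variance can be treated as the relevant variance term. The function names are hypothetical and the numbers are purely illustrative.

```python
from fractions import Fraction


def mean_and_variance(distribution):
    """Return (mean, variance) for a {offspring_count: probability} mapping."""
    mean = sum(p * k for k, p in distribution.items())
    variance = sum(p * (k - mean) ** 2 for k, p in distribution.items())
    return mean, variance


def gillespie_fitness(distribution, population_size):
    """Mean offspring number discounted by variance / n (Gillespie 1974)."""
    mean, variance = mean_and_variance(distribution)
    return mean - variance / population_size


# Offspring distributions from the running example.
blue = {4: Fraction(1)}                       # always four offspring
red = {2: Fraction(1, 2), 6: Fraction(1, 2)}  # two or six, equally likely
n = 20                                        # assumed: initial population size

blue_fitness = gillespie_fitness(blue, n)  # 4 - 0/20
red_fitness = gillespie_fitness(red, n)    # 4 - 4/20
print(blue_fitness, red_fitness)           # prints: 4 19/5
```

Both forms share a mean of 4, yet the red form’s variance of 4 lowers its value to 3.8 on this reckoning, which is the direction of difference that the unadorned arithmetic mean fails to register.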

Equation (1) can accordingly be retooled so as to evade the aforementioned criticism and fall in line with the intuitions of practicing biologists. This becomes evident if we reexamine the foregoing example in the light of stochastic variation between generations. Let us make the additional simplifying assumptions that generations do not overlap (i.e., progenitors die very shortly after reproduction) and that coloration is perfectly heritable. To determine the expected fitness of a trait type is just to speculate about representation of the trait type in the offspring population. The expected frequency of each trait type in the subsequent generation of offspring can be calculated in the following manner:

OBLUE = 0.5(40/60 + 40/100) = 0.533

ORED = 0.5(20/60 + 60/100) = 0.467

As these calculations can make for some confusion, it is worth saying just a few words in the way of clarification. Remember that each of the ten blue members of the initial population will contribute 4 offspring, so there will be 40 blue individuals in the offspring population irrespective of the reproductive outcomes for the red form. The probability weighting of 0.5 is attached to each of the possible population outcomes (60 or 100 individuals) because we do not know beforehand whether the red members will contribute 2 or 6 offspring; each outcome is stipulated as equally likely or effectively random. Put more directly, the probability weighting indicates that the total offspring contribution of the blue form could with equal probability constitute either 0.67 (40/60) or 0.40 (40/100) of the subsequent offspring population, depending on the actualized offspring contribution for members of the red form. Although both forms began with the same frequency and had the same expected fitness values, their expected frequencies diverged in the subsequent generation. In this case, and in accord with Gillespie’s original model, the blue form with lower variance is expected to increase in frequency.83

82 Gillespie, J.H. 1973. Natural selection with varying selection coefficients - a haploid model. Genetical Research 21: 115-120. Gillespie, J.H. 1974. Natural selection for within-generation variance in offspring. Genetics 76: 601-606. Gillespie, J.H. 1977. Natural selection for variances in offspring numbers: a new evolutionary principle. American Naturalist 111: 1010-1014.
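
The expected-frequency calculation for the blue and red forms can be reproduced with a short Python sketch. It assumes only what the example stipulates (ten parents of each form, four offspring per blue parent, and two or six offspring per red parent with equal probability); the variable names are my own.

```python
from fractions import Fraction

blue_parents, red_parents = 10, 10
blue_offspring_each = 4
# Every red parent in a generation has 2 offspring, or every one has 6.
red_scenarios = {2: Fraction(1, 2), 6: Fraction(1, 2)}

expected_blue = Fraction(0)
expected_red = Fraction(0)
for red_offspring_each, probability in red_scenarios.items():
    blue_total = blue_parents * blue_offspring_each  # always 40
    red_total = red_parents * red_offspring_each     # 20 or 60
    population = blue_total + red_total              # 60 or 100
    # Weight each form's share of the offspring population by the
    # probability of the scenario in which that share arises.
    expected_blue += probability * Fraction(blue_total, population)
    expected_red += probability * Fraction(red_total, population)

print(expected_blue, float(expected_blue))  # 8/15, about 0.533
print(expected_red, float(expected_red))    # 7/15, about 0.467
```

Exact fractions are used so that the divergence between the two forms cannot be mistaken for a rounding artifact.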

Drawing on Gillespie’s results, Brandon (1990) suggested a correction factor to the original formal model (1) by way of the following modification:

$$A(O,E) = \sum_i P(Q_i^{OE})\, Q_i^{OE} - f(E,\sigma^2) \qquad (2)$$

The new element ‘$f(E,\sigma^2)$’ is supposed to “denote some function of the variance in offspring number for a given type, $\sigma^2$, and of the pattern of variation.”84 The correction factor was expressly designed to compensate for the influence of higher mathematical moments and, thereby, restore what are otherwise “invisible” fitness differences.

Despite the temporary reprieve granted by corrections for higher mathematical moments, there are two remaining considerations that ultimately undermine the foregoing formal models of PIF. The first worry stems from the propensity theorist’s commitment to fitness as a probabilistic dispositional property of token organisms or individuals. To maintain that there can in fact be fitness differences between organisms bearing trait variants with equivalent mathematical expectations but unequal variance, the propensity theorist had to shift away from predicting how many offspring an individual exemplifying a trait type would on average contribute and toward calculating the frequency of the trait type in a subsequent generation. While apparently harmless, this maneuver turns out to be anything but. Calculating relative frequency requires reference to the composition of the population as a whole. Even if we are certain about how many offspring an organism of type X would have and the number of type X organisms there are in the current generation, we cannot deduce the frequency of type X in a subsequent generation unless we also have such information about all of the other competing types in the population. Now no one would deny that we must take account of the common environmental selection pressures upon a population undergoing the process of natural selection. Nor would anyone deny that calculating the fitness of an individual bearing a particular trait variant requires statistical methods for averaging over the survival and fecundity of conspecifics with the same trait type. But propensity theorists should be wary of demanding that the fitness-determination for a particular trait variant (or token organism that bears such a variant) also requires making reference to the reproductive fates of competing variants. This is tantamount to claiming that the propensity of a fair coin to land heads when tossed is not an intrinsic feature of the coin nor even an extrinsic (but highly localized) feature of a causal network consisting of the coin, the tossing device, and the surrounding environmental conditions, but rather a property of the population of coins to which it belongs. This cuts against the well-worn assumption that a (fair) coin’s chance of landing heads on any particular toss is independent of other trials. It appears as though advocates of the propensity theory must make a similarly worrisome commitment when calculating individual fitness in cases that involve stochastic variation between generations or even frequency-dependent selection.

83 With relatively minor changes, the foregoing example is drawn from Sober (2001).

84 Op. 20 of Brandon (1990).

Countering that competing trait variants and their relative frequencies can be reconceived as “part of the selective environment” would be of no help to the propensity theorist in the aforementioned case. Such a move effectively splits the population under investigation into two distinct subpopulations: one consisting of blue organisms with red organisms considered part of the selective milieu rather than conspecific competitors in a common arena and vice versa. One problem with this maneuver is that each of the consequent subpopulations is now (by definition) faced with a unique set of selection pressures instead of common background conditions.

Statistical differences in reproduction and survival are to be expected when conspecific organisms, even those bearing exactly the same trait type, are posed with significantly different environmental challenges. In such circumstances it makes little sense to ask the sorts of questions (Is blue coloration adaptive? How strongly is red coloration selected against?) that typically motivate evolutionary population biologists. Such questions not only presuppose the existence of ecological problems common to the population as a whole; they suppose that alternative trait variants or “strategies” within a population exist for solving them. It is the set of ecological problems commonly confronted by members of a population that distinguishes a particular collection of organisms as a unit worthy of investigation. To think otherwise is to fall back into an outdated mode of essentialist thinking the likes of which had to be overcome by Charles Darwin and his contemporaries en route to modern evolutionary theorizing.85

85 On this point, see especially Mayr (1959), Sober (1980), and Ruse (1987). While most philosophers of biology would concede as much, it is by no means an uncontroversial claim. Michael Devitt (2008) has been the most vocal defender of essentialism, especially as it pertains to species identification.

A closely related difficulty is that there would be no relevant variation for the trait in question, coloration, within the resulting subpopulations. Cumulative, adaptive evolutionary change via natural selection is exceedingly improbable without heritable variation. Any systematic differences in survival and fecundity within each of these subpopulations would, therefore, be due to the effects of selection on traits other than coloration.

The “delayed selection” problem highlights still another difficulty for any conception of fitness based solely on offspring production. It is by now well known that there are mutations which make all (or at least most) of an organism’s offspring sterile. James Crow and Motoo Kimura (1956) were the first to report on the grandchildless mutation in some species of Drosophila. This mutation does not affect the actual number of offspring produced by an individual in the parental generation and, hence, is not reflected in calculations of fitness that span less than three generations (i.e., parent, offspring, grand-offspring). Moreover, such mutations often permit those that bear them to live what might otherwise be considered a normal life. Offspring fecundity is clearly compromised, while offspring viability is not.

One might well think that version (2) of the propensity model could be revised yet again so as to include a correction factor for expected number of grand-offspring rather than offspring. Doing so would solve the problem as generated by the grandchildless mutation. Unfortunately, there are other such mutations, as for example in the nematode C. elegans, which can end in sterility dozens of generations later (Ahmed and Hodgkin 2000). The worry accordingly transforms into one of when to consider descendant contribution irrelevant to calculations of fitness. Even if the formal model is based on the current maximal number of generations for all extant species, future evolution could presumably change the required number of generations.86 To accommodate this worry, the once-corrected version of the propensity model (Brandon 1990) would need yet another correction factor.

86 Op. 857 of Pence and Ramsey (2013).
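
The dependence of the verdict on the chosen generational horizon can be made vivid with a small Python sketch. The fecundity schedules below are hypothetical numbers of my own choosing, not data from Crow and Kimura or from Ahmed and Hodgkin; the sketch assumes an asexual lineage with non-overlapping generations founded by a single individual.

```python
def descendants_by_generation(fecundity_schedule):
    """Return lineage sizes in generations 1, 2, ... for one founder.

    fecundity_schedule[g] is the number of offspring left by each member
    of generation g (generation 0 being the founder).
    """
    sizes = []
    current = 1
    for per_capita in fecundity_schedule:
        current *= per_capita
        sizes.append(current)
    return sizes


# Hypothetical schedules: the mutant's offspring are sterile.
wild_type = descendants_by_generation([3, 3, 3])       # [3, 9, 27]
grandchildless = descendants_by_generation([3, 0, 0])  # [3, 0, 0]

# A hypothetical late-acting mutation that sterilizes generation 12.
late_normal = descendants_by_generation([2] * 13)
late_sterile = descendants_by_generation([2] * 12 + [0])

print(wild_type[0] == grandchildless[0])    # True: same offspring count
print(wild_type[1], grandchildless[1])      # 9 0: grand-offspring differ
print(late_normal[:12] == late_sterile[:12])  # True: identical for 12 generations
```

Any fixed horizon of k generations can be outflanked by a mutation whose effects appear at generation k + 1, which is the regress described in the text.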

The “timing of offspring” problem, not unlike the delayed selection problem, is one that plagues the propensity model even when corrected for the effects of higher mathematical moments. If organisms of a type are disposed to reproduce earlier than their conspecific competitors, they will be more fit, ceteris paribus.87 Natural selection can systematically shift the timing of reproduction without necessarily affecting the number of offspring produced. The form of the problem should by now be familiar since this type of shift allows for a situation in which alternative trait variants in a population have equal initial representation and equivalent expected fitness values but nevertheless systematically diverge in their subsequent frequency. Perhaps the easiest way to comprehend the advantage given to those with shorter generation times is just to note that, given any specified duration of time, early-reproducers will occasionally cycle through more but never fewer generations than competing conspecific late-reproducers. For a somewhat more concrete interpretation of the advantage that might accrue to early reproduction we need look no further than proverbial English advice: “The early bird catcheth the worm.”88 Other things being equal, offspring that emerge early typically face less opposition when it comes to the exploitation of resources since intraspecific density-dependence has yet to peak.89

When taken together, the slew of problems heretofore discussed presents a daunting obstacle for those who favor the propensity interpretation. An advocate could nevertheless maintain that properly corrected versions of the formal model for idiosyncratic populations, although difficult to come by, will eventually succeed where the others have failed.90 As Pence and Ramsey (2013) have argued, however, the problem with this stance is that it changes the original formal model into “an equation schema describing a ‘family’ of models: the exact nature of the propensity in a given case can only be specified once the details of the population are determined.”91 What was at first a single equation applicable to any biological system becomes a potentially infinite disjunction of equations, one for every possible population subject to the vicissitudes of evolution via selection.

87 Ibid., p.858.

88 The first written record of this occurs in John Ray’s A Collection of English Proverbs (1670, 1678).

89 This is one plausible scenario wherein early emergence enhances the prospects for survival and reproduction. The author acknowledges that counterexamples are easy enough to generate. The second mouse, for instance, may very well get the cheese.

It is less than clear whether such an unwieldy disjunction could assume a role in the general causal structure of natural selection. To see why this is so, it bears repeating that the original motivation for analyzing the notion of fitness was that it or something closely akin to it (e.g., Brandon’s “adaptedness”) seems to figure centrally in explanations of adaptation and ultimately speciation for any biological system whatsoever. If the equations become as system-specific as suggested here, then evolutionary theorists are faced with the unwelcome possibility that a satisfactory explanation for the evolutionary dynamics of one population of species X might very well resist extrapolation to other populations of the very same species. In light of these worries it appears as though the additional refinements (i.e., correction factors) required unrepentantly nudge the plausibility of a satisfactory formal model off the page, at least insofar as it purports to be an explanation of the general causal structure of evolutionary change.

A New Formal Foundation for the Propensity Interpretation

The foregoing discussion demonstrates just how bleak the prospects of a satisfactory formalization for the propensity interpretation of fitness were. But there must be some interpretation for this central explanatory notion. There are but three options. One is to abandon the propensity interpretation of fitness wholesale and rely instead on a completely different framework for understanding this all-important concept. The obvious contender is “the (merely) statistical interpretation of fitness” whereupon fitness is explanatory but acausal (Matthen and Ariew (2002); Walsh, Lewens, and Ariew (2002); Ariew and Lewontin (2004); Ariew and Ernst (2006)). Another option is to acknowledge the inadequacies of the mathematical formalisms currently on offer and argue that these shortcomings do not undermine the philosophical interpretation of fitness as a propensity. This option is tantamount to admitting that no single formal model is necessary for all biological systems subject to evolution via natural selection. Such a move takes Brandon’s (1990) proposal for correction factors to the extreme.92 Yet a third possibility is to acknowledge the necessity of a formal model for fitness that meets the objections noted above and, then, strive to generate such a model.

90 A commitment to the semantic conception of theories is consistent with this sort of response. Peter Godfrey-Smith’s recent classic Darwinian Populations and Natural Selection (2009) is a good example of the trend that takes a dim view of there being any interesting explanatory generalizations which pertain to all evolving systems.

91 Op. 857.

Pence and Ramsey (2013) opt for the third alternative on the grounds that (i) the (merely) statistical interpretation denies the causal efficacy and explanatory adequacy of fitness as well as related evolutionary processes like selection and drift and (ii) a philosophical understanding of fitness without a corresponding formal model lacks the requisite clarity and rigor needed to bridge philosophical theory with biological practice. They have subsequently discarded previous formulations and begun anew with the aid of sophisticated mathematical models.

Drawing on results in a research program known as “adaptive dynamics” (Tuljapurkar and Orzack 1980; Caswell 1989; Tuljapurkar 1989, 1990), Pence and Ramsey (2013) offer the following general model:

$$F = \exp\left( \lim_{t \to \infty} \frac{1}{t} \int_{\omega \in \Omega} \Pr(\omega) \cdot \ln\bigl(\phi(\omega,t)\bigr)\, d\omega \right) \qquad (3)$$

92 On my reading, it is doubtful that Robert Brandon would promote a potentially infinite disjunction of models as a satisfactory explanation of the general causal structure of evolutionary change.

There is a good deal of heavy mathematical machinery here, so let us begin by unpacking this equation. It is of the utmost importance to note that this model is meant as a definition of individual fitness (here ‘F’), as in the fitness of a token organism within a population. The symbol ‘ω’ figures as the fundamental notion. When subscripted, it signifies a particular “daughter population” (or lineage) to which an organism with genome, G, in environment, E, might give rise.93 The guiding idea is actually more straightforward than this wording suggests.

Take any token organism in a population. There are many different ways in which that organism’s life could unfold. It might, for example, give rise to many progeny or it could leave relatively few. It could just as easily leave none at all due to its succumbing to severe malnutrition or predation prior to reproduction. There are also many ways that a particular organism could produce one and the same number of offspring. For instance, it might reproduce at an earlier rather than later time in its life. The same goes for any of the offspring or even grand-offspring of our focal organism. Radical contingency of this sort holds for any possible descendants whatsoever. So the total number of “daughter populations” or “possible lives” is, strictly speaking, uncountable. And the set containing all such daughter populations (here ‘Ω’) is the state (sample) space in which ω is but one member.

The function φ(ω, t) takes a particular point ω in the sample space (Ω) and a time t to the number of the focal organism o’s progeny living at time t on that outcome. Taking the natural log (ln) of this function transforms the expected output of this function, one given in terms of o’s progeny at time t within ω, into a measure of the time needed for o’s lineage (or daughter population) to attain this level of growth. Along with each possible daughter population, ωi, in the total state space, Ω, there also corresponds a measure of the probability, Pr(ω), that it will be the one actualized for a focal organism. This measure is designed to mirror the probability weightings that were assigned to possible offspring contributions one generation in the future from the original formulation of the PIF (see Equation 1).94 It is the aspect of Equation 3 that is sensitive to empirical data and allows for a nonsymmetrical probability distribution over areas within the state space established by integration. These probability weightings are assigned to daughter populations or lineages rather than mere possible offspring contributions one generation hence.

93 Op. 860-861 of Pence and Ramsey (2013).

The values computed via Equation 3 thus reflect total descendant contribution over what is, at least theoretically, an infinite amount of time and thus countless generations. This much is clear from the mathematical notation ($\lim_{t \to \infty}$) indicating that the value of interest is that taken in the long run, at its infinite limit. Exponentiation (exp) in the model simply transforms the temporal values (i.e., natural logarithms of the function φ) into expected growth rates (logarithm base e).

This general model, or simplifications of it for specific cases, can readily deal with the three objections that have plagued the PIF as formalized in terms of Equation 1. The “mathematical moments” problem capitalized on the transgenerational changes in mean fitness that can accrue because of differences in the variance, skew, or kurtosis of distributions of possible offspring among competing trait variants. Such fitness differences were “invisible” to the original formalization of the PIF, as it focused solely on arithmetic mean values. Equation 3, however, is explicitly designed to incorporate these features of a statistical sample and, then, apply them to calculate individual fitness over a very long time scale.95
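
A finite, discrete stand-in for Equation 3 may help to show why a log-based long-run measure registers variance where the arithmetic mean does not. The Python sketch below is my own simplification rather than Pence and Ramsey’s procedure: it assumes that time is counted in generations, that the limit is replaced by a fixed horizon, that each outcome ω is a sequence of independent per-generation growth factors, and that the blue and red forms from the earlier example supply those factors. The function name is hypothetical.

```python
import math
from itertools import product


def long_run_fitness(multipliers, generations):
    """Finite, discrete approximation of Equation 3.

    multipliers: {per-generation growth factor: probability}.
    Each sequence of factors is one outcome (omega); its probability is
    Pr(omega) and the product of its factors plays the role of phi(omega, t).
    """
    expected_log = 0.0
    for sequence in product(multipliers.items(), repeat=generations):
        probability = math.prod(p for _, p in sequence)
        lineage_size = math.prod(m for m, _ in sequence)
        expected_log += probability * math.log(lineage_size)
    # Divide by elapsed time, then exponentiate, as in Equation 3.
    return math.exp(expected_log / generations)


blue = {4: 1.0}          # always quadruples
red = {2: 0.5, 6: 0.5}   # doubles or sextuples with equal probability

f_blue = long_run_fitness(blue, generations=10)
f_red = long_run_fitness(red, generations=10)
print(round(f_blue, 3), round(f_red, 3))
```

Under these assumptions the value for the blue form is 4, while the value for the red form is the geometric mean of 2 and 6 (the square root of 12, roughly 3.46). The two forms have identical arithmetic means, yet the higher-variance form receives the lower long-run value.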

94 The continuous analog of the weighted average, Equation 1, would look as follows: $F = \int_{\omega \in \Omega} \Pr(\omega) \cdot \phi(\omega,T)\, d\omega$, wherein T is established as a time point one generation in the future. See page 861 of Pence and Ramsey (2013) for further details.

95 Op. 874-875 in Pence and Ramsey (2013).

How Equation 3 accommodates such differences becomes clear if one focuses on a simple example like that mentioned in our discussion of the moments problem, where conspecifics bearing two competing trait variants have the same arithmetic mean fitness but diverge in their variances. Suppose, as before, that this species reproduces asexually and has discrete generations. For organisms of type O1 and their descendants, there is a probability of 1.0 that they will produce 1 offspring each generation. For organisms of type O2 and their descendants, there is a probability of 0.5 that they will produce 2 offspring each generation and a probability of 0.5 that they will produce 0 offspring each generation. In other words, O2 organisms are equally likely to have either 0 or 2 offspring each generation. These conspecifics have the same expected number of offspring ([1.0(1)] = [0.5(0) + 0.5(2)] = 1). As time approaches ∞, the probability of an O1 organism having 1 offspring will be 1.0. In contrast, organisms of type O2 and their descendants will run a 50% risk of extinction with each passing generation. In the long run, then, extinction is nearly certain for O2. This demonstrates the superior fitness of O1 via accommodating the difference in variance. Similar reasoning shows that Equation 3 is sensitive to differences in even higher mathematical moments (e.g., skew and kurtosis) for calculations of individual fitness.96
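
The claim that extinction is nearly certain for O2 in the long run can be checked with a standard branching-process calculation. The Python sketch below is an illustration that I am supplying, not a computation found in Pence and Ramsey: it assumes that every individual reproduces independently according to the stated distribution, and it iterates the probability generating function to obtain the chance that a lineage founded by one individual has died out by a given generation. The function name is hypothetical.

```python
def extinction_probability(offspring_distribution, generations):
    """Chance that a single-founder lineage is extinct by `generations`.

    offspring_distribution: {offspring_count: probability}.
    Uses the recursion q_next = sum_k p_k * q**k, starting from q = 0.
    """
    q = 0.0
    for _ in range(generations):
        q = sum(p * q ** k for k, p in offspring_distribution.items())
    return q


o1 = {1: 1.0}          # exactly one offspring, every generation
o2 = {0: 0.5, 2: 0.5}  # zero or two offspring, equally likely

for horizon in (1, 2, 10, 100, 1000):
    print(horizon,
          extinction_probability(o1, horizon),
          round(extinction_probability(o2, horizon), 4))
```

On these assumptions the O1 lineage never goes extinct, whereas the O2 lineage is extinct with probability 0.5 after one generation, 0.625 after two, and with probability approaching 1 as the horizon grows. Note that the 50% figure in the text applies most directly to a lineage that currently consists of a single individual.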

How does Equation 3 manage to circumvent the “delayed selection” problem? Remember, certain mutations, such as the “grandchildless” mutation in Drosophila, demonstrate how a single-generation timeframe is insufficient for determinations of fitness. For even an organism that survives to produce many offspring can have low fitness (i.e., leave relatively few descendants) if its immediate descendants turn out to be inviable or infertile. Since Equation 3 calculates individual fitness in the long run, however, it accounts for the fitness differences among offspring, grand-offspring, and ever more distant degrees of genetic separation. The fitness of individuals is determined “via their descendant pool ‘at infinity’.”97 Doing so enables Equation 3 and the revamped PIF to accommodate an unlimited amount of variability in survival (viability) and fecundity (fertility), the primary components of fitness.

96 See Beatty and Finsen (1989) and Abrams (2009).

Besides dispatching the purported problems associated with higher mathematical moments and delayed selection, Pence and Ramsey’s model also provides a way to resolve the “timing of offspring” problem. When conspecifics differ in their generation times, those with shorter generation times will be fitter, ceteris paribus. The original formalization of the PIF (Equation 1) treats competing heritable trait variants that relate to the timing of reproduction in terms of lifetime reproductive success. It assumes an average number of reproductive events or generations experienced by typical members of a species regardless of the competing trait variants in question. Equation 1 thus stipulates that a member of a species will experience n mating events during the course of its life. In doing so, however, it automatically removes from consideration the duration of time required to realize n mating events.

It is not the multigenerational character of Equation 3 that helps dispel this problem. The revamped model does away with number of generations in favor of time simpliciter. A toy example shows why this is necessary. Imagine that we have two types of competing organism, say O1 and O2, the only difference between them being that O1 experiences two reproductive events in the time that it takes for O2 to experience just one. In other words, an O1 organism will have twice as many offspring as an O2 organism when O2 finishes reproducing. Notice that both O1 and O2 will have the same number of offspring, grand-offspring, and so on, when these are counted generation by generation. If discrete generation times are used instead of continuous temporal duration (time) when calculating fitness, these two competing variants would be considered equally fit. Consequently, the benefits that usually accrue to early reproducers or heritable variations for shorter generation time are overlooked by models like Equation 1 that utilize generational timeframes. Using temporal duration, in contrast, clearly demonstrates a higher fitness value for O1. Over any given duration of time, the early-reproducer O1 will occasionally cycle through more but never fewer generations than the competing late-reproducer O2. Yet again, Equation 3 prevails where Equation 1 fell short.

97 Op. 874 of Pence and Ramsey (2013).
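
The contrast between counting by generation and counting by elapsed time can be sketched in a few lines of Python. The numbers are hypothetical and chosen only for illustration: both types are assumed to leave two offspring per generation, with O1 completing a generation in one time unit and O2 in two; the function name is my own.

```python
def lineage_size(elapsed_time, generation_time, offspring_per_generation):
    """Living descendants of one founder after `elapsed_time`.

    Assumes asexual reproduction and non-overlapping generations, so the
    lineage multiplies once per completed generation.
    """
    completed_generations = elapsed_time // generation_time
    return offspring_per_generation ** completed_generations


# Counted by generation, the two types are indistinguishable:
# three generations take O1 three time units and O2 six.
same_generation = (lineage_size(3, 1, 2), lineage_size(6, 2, 2))

# Counted by elapsed time, the early reproducer pulls ahead.
o1_size = lineage_size(elapsed_time=4, generation_time=1, offspring_per_generation=2)
o2_size = lineage_size(elapsed_time=4, generation_time=2, offspring_per_generation=2)

print(same_generation)   # (8, 8)
print(o1_size, o2_size)  # 16 4
```

A measure indexed to generations returns the same value for both types, whereas a measure indexed to time does not, which is the feature of the revamped model that the text credits with resolving the problem.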

Cracks in the New Foundation

Via the introduction of Equation 3, Pence and Ramsey “have explicated the PIF itself via a very extensive picture of the success of an organism o, by considering all the possible daughter populations to which it might give rise” (p.863, 2013). The point of this “very extensive picture” is, of course, to “capture all causal influences which might impact the future fate of an organism within a population” (p.862, 2013). In defense of this approach they claim that any solution to the generality problem in the philosophy of biology (how fitness figures causally in explanatory generalizations pertaining to the evolution of any biological system subject to selection) requires a formal model that takes a long-term view.98 Only “in the long run” or “in the infinite limit,” then, can the influence of all causal factors be captured.

Taken at face value, there is nothing objectionable about the “broad brush” approach that Pence and Ramsey adopt. It effectively counters objections leveled against earlier formalizations of the PIF and does so in a way that connects recent mathematical advances in theoretical population biology with philosophical musings. In passing, however, they make a point of noting that “Equation [3] is the density-independent, non-chaotic limit of this more sophisticated work” in adaptive dynamics (p.863, 2013). In other words, their Equation 3 is but a special case of an even more general model in a research program that accounts for difficulties that accompany density-dependent population growth, chaotic population dynamics, and non-static environments. The maximally general model alluded to would invoke the broadest possible explanatory scope so as to cover all possible causal influences that could affect a focal organism’s success. No population composition or structure would go uncounted. What is truly on offer, then, is nothing short of a “God’s-eye-view” of fitness.

98 See their reply to Objection #3 on page 871.

It is important to understand just how Equation 3 achieves this unparalleled generality.

There are basically two different ways in which a formal model might attain the level of generality that Pence and Ramsey desire. One way involves knowing a priori that the idealizing assumptions which inform a formal model rarely if ever hold true. This is the case with what are sometimes called “null models” in scientific parlance. The Hardy-Weinberg equilibrium in population genetics is a prime example. It is a principle which states that the gene frequencies in a population will remain constant from one generation to the next in the absence of disturbing factors (migration, mutation, selection, drift, and nonrandom mating). But disturbing factors routinely occur in natural populations, so this equilibrium is rarely if ever realized. All natural populations are finite in number, which introduces an element of drift. Mating is often anything but random. Fitness differences abound and, hence, selection is ubiquitous. The idealization nevertheless remains useful because it allows for the identification and measurement of changes in genetic variation as departures from this equilibrium state.
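
The role of the Hardy-Weinberg principle as a null model can be illustrated with a brief Python sketch. It assumes a single locus with two alleles (labeled A and a for convenience) and an arbitrary starting allele frequency of 0.3; the function names are hypothetical.

```python
def hardy_weinberg_genotypes(p):
    """Genotype frequencies after random mating, given frequency p of allele A."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


def allele_frequency(genotypes):
    """Frequency of allele A: all of AA plus half of the heterozygotes."""
    return genotypes["AA"] + 0.5 * genotypes["Aa"]


p = 0.3  # arbitrary illustrative starting frequency
genotypes = hardy_weinberg_genotypes(p)
next_p = allele_frequency(genotypes)

print(genotypes)
print(p, next_p)  # unchanged, up to floating-point rounding
```

In the absence of disturbing factors the allele frequency returned for the next generation equals the starting frequency, so any observed departure from that baseline is a measurable signal that some disturbing factor is at work.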

Another way to increase generality or broaden the explanatory scope of a model is by collecting all the known and unknown factors that influence a dynamic system under an “error term.” Errors due to the inaccuracy or imprecision of measurement, accidental oversight, and even the deliberate bracketing or limitation of explanatory factors can all be accommodated via such catch-all terms. The necessity and utility of error terms in the statistical decomposition of evolutionary change is undeniable. The most general statement of evolutionary change, the Price equation (1970, 1972), clearly evinces this:

$$\Delta\bar{z} = \frac{\mathrm{Cov}(w,z)}{\bar{w}} + \frac{E(w\,\Delta z)}{\bar{w}}$$

It partitions total evolutionary change, change in the mean phenotype value ($\Delta\bar{z}$), into two components. The first component ($\mathrm{Cov}(w,z)/\bar{w}$) provides an abstract expression of natural selection, while the second component ($E(w\,\Delta z)/\bar{w}$) subsumes all other evolutionary processes.99 This second component, sometimes called the environmental or unexplained variance, “encompasses everything not included in the expression for selection […] every possible force that might arise and that is not accounted for by the particular expression for selection” (Frank 2012, 1010).
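
That the Price equation is an exact bookkeeping identity, with the second term absorbing whatever the selection term leaves out, can be verified numerically. The Python sketch below uses a made-up population of four parents; the phenotype values, fitnesses, and offspring phenotypes are hypothetical, and the function name is my own.

```python
def price_decomposition(z, w, z_offspring):
    """Return (total change, selection term, transmission term).

    z[i]: phenotype of parent i; w[i]: its fitness (offspring number);
    z_offspring[i]: mean phenotype of parent i's offspring.
    """
    n = len(z)
    w_bar = sum(w) / n
    z_bar = sum(z) / n
    # Mean phenotype of the offspring generation, each parent's offspring
    # weighted by how many of them there are.
    z_bar_next = sum(wi * zo for wi, zo in zip(w, z_offspring)) / sum(w)
    cov_wz = sum((wi - w_bar) * (zi - z_bar) for wi, zi in zip(w, z)) / n
    exp_w_dz = sum(wi * (zo - zi) for wi, zi, zo in zip(w, z, z_offspring)) / n
    return z_bar_next - z_bar, cov_wz / w_bar, exp_w_dz / w_bar


# Hypothetical toy data.
z = [1.0, 2.0, 3.0, 4.0]
w = [1, 2, 2, 3]
z_offspring = [1.1, 2.0, 2.9, 4.2]

total, selection, transmission = price_decomposition(z, w, z_offspring)
print(total, selection + transmission)  # equal, up to floating-point rounding
```

Whatever values are supplied, the two terms sum to the total change in mean phenotype. The decomposition does not predict anything by itself; it only sorts the change into a selection component and a residual.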

Now, it is fairly obvious that Pence and Ramsey do not want to cast Equation 3 in the role of a “null model.” They make this clear when they say, “In fact, in adaptive dynamics, a variation of [Equation 3] is argued to be the optimal predictor of the fates of populations” (p.864, 2013). The “variation” mentioned here refers to the model from which Equation 3 is derived, one that accounts for chaotic population dynamics, non-static environments, and density-dependence, among other complexities. Consequently, the broad explanatory scope of Equation 3 is achieved in a manner not unlike that in the Price equation, namely, by virtue of lumping together all possible causal influences which might be responsible for deviation from expected values.

There is, however, a vital difference. Pence and Ramsey’s model determines the fitness of a focal organism in a population by averaging over all possible scenarios and, thereby, all causal influences that an organism and its descendants could experience. This deliberately removes the need for an explicit term for unexplained variance since all possible sources of residual error are reflected somewhere in the state space (Ω) over which the equation for individual fitness ranges. There is, in effect, no genuine “error” at all because there will be no deviation from expectations in the infinite long run.

99 Frank, S.A. 2012. Review: Natural Selection. IV. The Price Equation. Journal of Evolutionary Biology 25: 1002-1019.

Two aspects (subsets) of Ω’s composition must not go unrecognized. The first aspect worth noting is that the set containing all potential but unactualized causal influences must be included in this expansive list of possibilities. These possibilities figure in Equation 3 since finite beings know not what the future will hold. The second aspect of Ω that demands attention features within the contrast class consisting of actualized causal factors. Among these factors there are influences typically deemed “unsystematic” in the sense that exhibiting traits with heritable variations affecting fitness has no bearing on the likelihood of an organism experiencing such influences.

These features of the sample space Ω provide reasons for scrutiny of Pence and Ramsey’s preferred formulation. That Equation 3 ranges over what will forever be unrealized causal influences raises worries about whether such an expansive, all-encompassing ontological framework (i.e., interpretation of Ω) is required for current calculations of fitness or conceptualizations thereof. This concern speaks directly to the necessity of Equation 3 and, thereby, encourages the search for formal models with more nominal ontological commitments. A second worry disputes the sufficiency of Equation 3 as a definition of fitness. While Equation 3 undoubtedly captures all actualized causal influences, it seems to lose a feature of the concept that many philosophers and most biologists deem indispensable: the distinction between relevant and irrelevant causes. I will develop each of these worries below, beginning with the latter.

As already mentioned, the state space (Ω) over which Equation 3 ranges includes all possible causal influences. Among the manifest causal influences in the infinite limit would be those that practicing evolutionary population biologists consider “haphazard” or “random” and which result in “indiscriminate” or “unsystematic” evolutionary change.100 Such labels are not mere artifacts of scientific convention; they have a principled basis. Some causes (e.g., an asteroid impact) can affect survivorship or fecundity without regard to the characteristics of an organism. Exhibiting traits that significantly affect rates of survival and reproduction has absolutely no bearing on the likelihood of an organism experiencing such influences.101 To these we can add causally efficacious traits which exhibit no heritable variation.102 The phenomenon of phenotypic plasticity presents a case in point. This occurs when one genotype has the capacity to produce more than one phenotype in response to different environmental conditions. A phenotypically plastic trait can exhibit variation that is subject to strong selection pressure.103 But the frequency of the genotype responsible for such a plastic phenotype cannot change (ceteris paribus) in the absence of heritable variants. Also included in this growing menagerie of wayward causal factors are traits which show heritable variation of the kind that is not subject to any selection pressure, as often evinced by erratic and seemingly inexplicable fluctuations in frequency.104

The point of noting these causal but selectively neutral phenomena is just to demonstrate that population biologists routinely distinguish between “relevant” and “irrelevant” causal influences when evaluating the success (i.e., fitness or “adaptedness”) of an organism. They assume that selectively-neutral causal factors randomly influence or are experienced in equal measure by competing conspecifics. Despite their causal contribution, such factors are explanatorily irrelevant from the standpoint of theorizing about the dynamics of evolving populations. These form the background conditions against which partitioning of the population for further analysis proceeds. Explanatorily relevant trait variants within these background conditions are the “common differences” that are, for instance, included within the first component of the Price equation, $\mathrm{Cov}(w,z)/\bar{w}$, as an abstract expression of natural selection.105

100 For more on “indiscriminate sampling,” see the following: (i) Beatty, J. 1984. Chance and Natural Selection. Philosophy of Science 51: 183-211. (ii) Hodge, M.J.S. 1987. Natural Selection as a Causal, Empirical, and Probabilistic Theory. In The Probabilistic Revolution, edited by L. Krüger. MIT Press. (iii) Millstein, R.L. 2002. Are Random Drift and Natural Selection Conceptually Distinct? Biology and Philosophy 17 (1): 33-53.

101 These causes certainly feature among the “thousand natural shocks” to which Sterelny and Kitcher (1988), borrowing from Shakespeare’s Hamlet, refer.

102 Incidentally, such traits could also exert indirect causal influence via developmental constraint.

103 Work on freshwater snails (Physa virgata) is especially persuasive. For details, see Langerhans, R.B. and T.J. DeWitt. 2002. Plasticity constrained: Over-generalized induction cues cause maladaptive phenotypes. Evolutionary Ecology Research 4 (6): 857-870.

104 Color polymorphisms occurring in environments that lack light would be a good example. A somewhat more controversial example involves synonymous nucleotide substitutions.

The crucial explanatory distinction between relevant and irrelevant causes has no principled basis for Pence and Ramsey. Their absolute or “God’s-eye-view” of fitness in the infinite limit (i.e., Equation 3) includes all actualized causal influences on the fitness of an organism without any consideration for whether those factors are explanatorily relevant. Any causal factor, no matter how minuscule its influence, is by default explanatorily relevant because it figures in the scalar-valued return from Equation 3. Contrast this with the conceptual resources available to a “fully completed” version of the Price equation for a hypothetical biological population, one in which the unexplained variance in the system has been reduced to zero (i.e., $E(w\Delta z)/\bar{w} \to 0$). With such resources at our disposal, we can still sensibly pose the following question about our hypothetical population: Is some causal factor X explanatorily relevant to our theoretical (evolutionary) considerations? Answering the question requires only that we turn to the (fully expanded) abstract expression of natural selection in our equation, $\mathrm{Cov}(w,z)/\bar{w}$, and then examine whether the factor X figures as one of the component traits ($z_i$) associated with fitness in our completed definition. Note that the distinction between relevant and irrelevant causal influences on fitness remains a sensible one even in the long run (i.e., when unexplained variance becomes nonexistent). This feature is important because it allows for the preclusion of highly idiosyncratic causal influences and factors with minuscule effects that do not perturb the central tendencies of evolving populations. The most important insights of science often come from understanding what does not matter, so that we can say what does matter.106

105 For a good example and discussion of how this approach can be generalized for any number of traits ($z_i$), see the following: Lande, R. and S.J. Arnold. 1983. The measurement of selection on correlated characters. Evolution 37: 1210-1226.

There is no more crucial ontological juncture in population biology than that which occurs when background conditions are determined. For it is only against a carefully circumscribed list of such conditions or criteria that we can identify an aggregate of superficially similar individuals as a genuine population, the fundamental unit of study in evolutionary biology. In attempting to explain adaptive evolution and predict its course, population biologists look for systematic transgenerational change in gene frequency or mean phenotype values in a population. Insight into relevant selection pressures accompanies determinations about the traits that are necessary for explaining adaptive change. If, for example, we discover that color variation (camouflage) systematically affects the number of offspring produced, then we search for predators whose preferences are based on acute color vision. It is the reciprocal nature of these two elements in evolutionary theorizing (the set of environmental selection pressures and the suite of selectively nonneutral trait types) that enables biologists to make headway when sorting through the dizzying array of heritable variations that organisms exhibit. Biologists focus on some traits while ignoring others at least partly by way of identifying a set of selection pressures or “design problems”107 that the individuals purportedly comprising a population share. Branding such causal factors “background conditions” is what permits the use of fitness value ascriptions acquired in one population as predictors of selective dynamics in other populations of the same species. But the absolute fitness that Pence and Ramsey’s model renders apparently removes the need for populations altogether. There remain no principled means by which to identify the relevant background conditions that unify a population and, thereby, determine the relative fitness of competing conspecifics.

106 Weyl, H. 1983. Symmetry. Princeton University Press, Princeton, NJ.

Let us now turn our attention to the necessity of Equation 3 as a definition of fitness. This equation, it should be remembered, returns an exact scalar value for individual fitness in the infinite limit. As the previous criticism aimed at the sufficiency of Pence and Ramsey’s approach reveals, there is considerable tension involved with this insofar as it does not require the identification (composition) of actual populations. It provides us with fitness simpliciter or what I have called an absolute “God’s-eye-view” of fitness. Relative fitness accordingly becomes an epiphenomenon of aggregate absolute individual fitness (i.e., the F-values from Equation 3).108 Since every single individual has an associated fitness-value, any population structure and fitness distribution can purportedly be regained just by selecting the right set of individuals.

A mathematical model taken on its own does not, however, yield a unique interpretation or entail any ontological conclusions. Nor, for that matter, is there only one formal model available for describing a phenomenon.109 Models are non-linguistic entities that can be presented in terms of variables and rules or laws describing the functions operating on those variables.110 The state of a formal model refers to any specific set of values for the variables in a model. The collection of all the states of a model composes the model’s state space. The functional rules of a theory describe the possible relationships of interaction, coexistence, and temporal succession of the variables in the model. By doing so they describe trajectories through the state space for the model (Lloyd 1988, p.19). All models are to some extent ideal structures (Cartwright 1983). As such, they must be interpreted before finding an application. Interpretation involves proposing that some features of a model (i.e., some of the variables and functions) correspond to certain features of the world. The claims for empirical correspondence then become subject to verification and refutation.

107 The label “design problems” is drawn from Daniel Dennett (1995). The author of this essay is fully aware of the problems that accompany this notion as it is deployed in analyses of fitness.

108 Bouchard and Rosenberg (2004) take a somewhat similar approach when using pairwise comparisons between conspecifics in a population to calculate fitness. Notice, however, that such an approach derives fitness values by comparing an individual against all extant members of the population taken one at a time. It is thus a measure of relative fitness rather than absolute fitness.

What features must figure in Pence and Ramsey’s interpretation of Equation 3? Theories about evolutionary change are concerned with changes in allele, strategy, or trait distributions (Fisher 1930; Maynard-Smith 1982; Lande 1982). A common approach of all such theories is to assess the performance of alleles, strategies, and traits by estimating their fitness. “This approach defines fitness as the expected representation of a replicating entity within a population at some distant point in the future” (Coulson et al. 2006, p.547).111 It is noteworthy that this assumption is also fundamental to the research program in mathematical biology known as adaptive dynamics (Metz et al. 1992), the sophisticated formal work on which Pence and Ramsey draw.

109 Millstein, R.L., R. Skipper, and M.R. Dietrich. 2009. (Mis)interpreting Mathematical Models: Drift as a Physical Process. Philosophy and Theory in Biology. DOI: http://dx.doi.org/10.3998/ptb.6959004.0001.002

110 What follows is taken from the account of models as proposed by the semantic approach to theories.

111 Coulson, T., T.G. Benton, P. Lundberg, S.R.X. Dall, B.E. Kendall, and J.-M. Gaillard. 2006. Estimating Individual Contributions to Population Growth: Evolutionary Fitness in Ecological Time. Proceedings of the Royal Society B 273: 547-555. I draw very heavily on Coulson et al. in the remainder of this paper.

Three nontrivial theoretical and ontological commitments are evinced by this interpretation. The first commitment is that fitness, as construed by proponents of adaptive dynamics as well as in Equation 3, is considered a long-term measure. The second commitment makes clear that fitness must measure relative performance. A third commitment holds to the existence of replicating entities whose evolutionary success is at issue.

Many empirical tests of evolutionary theory use generation-based proxies for fitness to characterize long-term individual performance. If for no other reason, the difficulties associated with collecting data on transgenerational performance often necessitate doing so. Pence and Ramsey, for reasons already noted, opt instead for infinite temporal duration or expectations in the limit to derive scalar fitness values. This allows them to avoid the pitfalls that accompany the “timing of offspring” problem, for instance. Their proposal differs significantly, however, from the way that most population biologists invoke “long-term” estimation in measures of individual fitness. Most such estimates are temporally bounded by the actual lifespan of an organism even though fitness is theoretically considered a measure of relative performance in the long run. This occurs, for example, when biologists deploy lifetime reproductive success as a measure of fitness. There may be many opportunities for reproduction during a lifespan (e.g., iteroparous species) or just one (e.g., semelparous species). Many cohorts of offspring can temporally overlap with one another and progenitors alike, with obvious consequences for long-term (lifetime) performance. Biologists tend to think of such measures as being “long-term” in the limited sense that a lifetime (generation) can include more than a single reproductive event or unchanging set of conditions from which to gauge the success of a focal organism. Moreover, generation-based proxies for individual fitness do not assign fitness values directly to token organisms; they measure the contribution of heritable variations that affect survival and reproduction. Proxies taken in combination (e.g., selection gradients) then allow for a measure of what individual viability in the very long-term (i.e., beyond the actual lifespan) might be for a hypothetical organism exhibiting a particular combination of trait variants. It is somewhat difficult to square this with Pence and Ramsey’s claim that lifetime reproductive success is one among “[s]everal short-term measures of fitness [that] are particularly common in the biological literature, as they are easy to estimate and can be derived from readily available empirical data” (2013, p.866). A measure of fitness as lifetime reproductive success can be considered “short-term” and “easy to estimate” only when contrasted with the likes of Equation 3, for which the reproductive success of subsequent generations of offspring must also be assessed.

That most biologists understand “long-term” estimation in a more limited or narrow sense does not give us reason to immediately discount Pence and Ramsey’s account. Perhaps biologists are simply mistaken. More worrisome, though, is that Pence and Ramsey’s definition hankers after the notion of relative fitness. In adaptive dynamics as well as Equation 3, fitness is defined as the expected representation of a replicating entity within a population at some distant point in the future. But if any claim has the status of an axiom in evolutionary biology, it is surely that individuals (i.e., token organisms) do not evolve or re-present themselves in future generations. What, then, are we to make of the “replicating entities” whose representation in the limit is at stake? Pence and Ramsey would surely concede that the recurring entities in question are allelotypes and corresponding genotypes, types of strategies, or phenotypes. In any case, the replicating entities are to be conceived of as types rather than tokens.

They must limit the class of trait types to those that differentially affect survival and reproduction “within a population.” It is, after all, the relative performance of replicating entities that interests population biologists. Focus must be restricted to the trait types that make for differences in the survival and reproduction of conspecific competitors in a more or less uniform environment (i.e., sharing a common suite of selection pressures). There is a plethora of traits that have some, however minuscule, causal influence on the future representation of types. Not all of these are relevant for explaining the general adaptive trajectories of evolving populations. The well-worn distinction between discriminate and indiscriminate sampling processes to which biologists routinely refer takes this for granted. Some traits accordingly constitute the set of background conditions against which the trajectory and magnitude of variations that are “difference-makers” are to be assessed.112

Notice that nothing in the foregoing yet requires consideration of token organisms taken in their own right. Only the existence of selectively relevant trait types has been supposed. Long-term performance is depicted as a matter of transgenerational representation. As individuals cannot endure beyond their own lifetimes, the replicating entities at stake must be characterized as types rather than tokens. Even the long-term performance of a trait type, though, is of little interest when assessed in isolation. Evolutionary biologists seek explanations of and projections for the trajectories of evolving populations. This requires appeal to the frequencies of competing variants. Transgenerational frequency change indicates the relative fitness of extant trait types. Put succinctly, trait fitness has seemingly displaced the need for a measure of individual fitness. It is just the average reproductive contribution of all the individual organisms that exhibit a trait variant (Sober 2001). Individual reproductive output is merely evidence used in a statistical procedure for determining trait fitness.113 This arrangement is clearly at odds with the central ontological assumption of the PIF and its interpretation of Equation 3, which holds that individual organisms bear the probabilistic dispositional property of fitness.

112 If not already clear from the choice of terminology, the account of causation presupposed here can be found in James Woodward’s Making Things Happen: A Theory of Causal Explanation (2003).

113 To think otherwise is to fall prey to what Mills and Beatty (1979) call “the operationalist fallacy.”

One possible way for propensity theorists to avoid this unwelcome predicament begins by pointing out that the metaphysical grounding for this dispositional property (fitness) supposedly lies in a combination of properties: the collection of selectively nonneutral traits that affect the overall tendency of an organism to survive and reproduce. This rejoinder concedes that trait fitnesses are proxy measures of individual performance when taken one at a time. They are typically used in lieu of a direct way to measure individual fitness. When taken in combination, however, as an intra-organismic suite of interacting variations, they can provide us with a more informative descriptive apparatus by means of which to conceptually recast individual organisms by reference to the relevant variations they exemplify.

The gist of this rejoinder can be made clear with a toy example. Assume that we have a population of organisms whose adaptive evolution can be explained and predicted by reference to just three phenotypes114: coloration (C), shape (S), and size (Z). Suppose that each of these three trait types admits of two mutually exclusive (i.e., qualitatively discrete) variants: C1, C2, S1, S2, Z1, Z2. Any token organism in this population qua its membership must accordingly fall under no more than one of eight ($2^3 = 8$) complete, complex descriptions: C1S1Z1, C1S1Z2, C1S2Z1, C1S2Z2, C2S1Z1, C2S1Z2, C2S2Z1, C2S2Z2. Such descriptively-mediated identification exhausts what it is to be type-cast as “an evolutionary individual,” or an entity capable of being re-presented in future generations. Only by accident do these descriptions uniquely identify any particular individual; complex property types can be instantiated by one or more organisms.115

114 Imagine that an analysis using selection gradients shows the three traits in question as explaining all the observed variance in fitness.
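The combinatorics of the toy example, and the sense in which a complex description is a type that several organisms can share, can be sketched as follows. The function names and the offspring counts used in any call are hypothetical and purely illustrative.

```python
from collections import defaultdict
from itertools import product

# The three trait types and their two variants, as stipulated in the toy example.
TRAITS = {"C": ("C1", "C2"), "S": ("S1", "S2"), "Z": ("Z1", "Z2")}


def complex_descriptions(traits=TRAITS):
    """All complete descriptions, one variant per trait type (2**3 = 8 here)."""
    return ["".join(combo) for combo in product(*traits.values())]


def average_output_by_description(organisms):
    """Average offspring count per complex description.

    `organisms` is an iterable of (description, offspring_count) pairs.
    Several token organisms may share one description, so the average
    attaches to the type rather than to any particular organism.
    """
    totals = defaultdict(lambda: [0, 0])
    for description, offspring in organisms:
        totals[description][0] += offspring
        totals[description][1] += 1
    return {d: total / count for d, (total, count) in totals.items()}
```

Swapping one organism for another with the same description and the same offspring count leaves every average unchanged.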

Yet Pence and Ramsey thumb their noses at any proposal that seeks to base individual fitness estimates on measures of trait fitnesses. Here is what they say on the matter:

Trait fitness, however, is commonly understood in two different ways. First, we have trait fitness as the average fitness of all individual organisms that bear a given trait […] Second, we have trait fitness as a prediction of future trait prevalence: the quantity that lets us predict the frequency of a trait in the next generation given its current frequency […] If the first of these two definitions is adopted, then trait fitness is straightforwardly parasitic on individual fitness, and a model of individual fitness must be provided to make sense of the fitness of traits. If the second definition is adopted, however, then we are dealing with quite a different quantity than the one modelled here. Trait fitness in this second sense relies on individual fitness as well, but also includes factors such as heritability. Thus, under either of the standard ways of understanding trait fitness, individual fitness is in some sense foundational. Trait fitness values are either directly derived from individual fitness values, or individual fitness values are a component of trait fitness. Because of this, we are justified in simply providing a model of individual fitness as the foundational concept in the PIF. (2013, p.872)

Their claim about the first sense of ‘trait fitness’ applies directly to the proposed strategy of reconstructing individual fitness via a complex of traits. Reconstruing any token organism as an evolutionary individual (e.g., being identified by the statistical partition C1S1Z1 in the aforementioned population) still makes this organism’s actual reproductive success relevant only qua its exhibiting a complex trait type. Simple trait fitness has consequently been displaced by averages taken over the reproductive contribution of all the individuals that exemplify a complex trait type. Trait fitness thereby retains ontological as well as explanatory priority, albeit in a somewhat more sophisticated form. Pence and Ramsey’s resistance, then, is not misguided on this score.

115 The case in which there is only a single organism exemplifying one of the eight descriptions is a special case that can be glossed over since another conspecific could just as well have been the referent of that description. It is a matter of contingent identity.

What is truly at issue is ‘trait fitness’ in Pence and Ramsey’s second sense, as a predictor of future prevalence based on current representation. In the toy example that was used to elucidate the strategy of recasting token organisms as evolutionary individuals, there was deliberately no mention of exact trait fitness values for phenotypic variants of the traits C, S, and Z. These traits were simply stipulated as thoroughly explaining and effectively predicting the evolution of a population. Obtaining such information about the targets of selection usually requires estimating selection gradients, which measure direct selection on each trait after multiple regression. “In multiple regression, a single dependent variable is regressed on multiple independent variables simultaneously, and the effects of correlations among the independent variables are controlled for statistically” (Conner and Hartl 2004, 201-202). Selection gradients thus measure the (slope of the) relationship between fitness and a particular trait while removing the effect of indirect selection. The effects of indirect selection can be removed by controlling for phenotypic correlations with other measured traits. Phenotypic correlations are correlations that occur among traits within a generation. These provide insight into adaptive evolutionary (transgenerational) change only insofar as they reflect the presence of an underlying genetic architecture that exhibits corresponding genetic correlations. It is thus to phenomena such as epistasis and pleiotropy that Pence and Ramsey allude when they mention the heritability of a trait.
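The difference between direct and indirect selection can be sketched in code. This is a minimal, standard-library sketch of the multiple-regression approach under simple assumptions: quantitative traits, relative fitness as the dependent variable, and traits standardized to unit variance. The function names are hypothetical, and the sketch is not a reconstruction of any particular published analysis.

```python
def _standardize(column):
    """Center a trait column and scale it to unit (population) variance."""
    mu = sum(column) / len(column)
    sd = (sum((v - mu) ** 2 for v in column) / len(column)) ** 0.5
    return [(v - mu) / sd for v in column]


def _solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def _relative(fitness):
    w_bar = sum(fitness) / len(fitness)
    return [w / w_bar for w in fitness]


def selection_gradients(trait_columns, fitness):
    """Partial regression slopes of relative fitness on each standardized trait."""
    rel = _relative(fitness)
    cols = [[1.0] * len(rel)] + [_standardize(c) for c in trait_columns]
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * y for p, y in zip(ci, rel)) for ci in cols]
    return _solve(xtx, xty)[1:]  # drop the intercept


def selection_differentials(trait_columns, fitness):
    """Covariance of relative fitness with each standardized trait (direct plus indirect)."""
    rel = _relative(fitness)
    return [
        sum(r * v for r, v in zip(rel, _standardize(c))) / len(rel)
        for c in trait_columns
    ]
```

If fitness depends on only one of two correlated traits, the second trait still shows a nonzero differential while its gradient is approximately zero. That residue is the indirect selection the regression is meant to remove.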

By their own admission, then, they seek to model a quantity that precedes (“is a component of”) trait fitness in the second predictive sense noted above. It follows directly that measures of individual fitness must bracket considerations of heritability. This is a telling move.

How are fitness values from Equation 3 supposed to optimally predict the fate of any population when the model gives no explicit consideration to the power of selection in effecting a coordinated response? The key to answering this question and understanding Pence and Ramsey’s conception of fitness via Equation 3 is to recognize that fitness is “maximally sensitive to” but no longer dependent on actual reproductive output at any one point in time. It has been reconstrued as a purely intragenerational measure of fitness, one designed to capture survivability or viability across any possible environmental contingencies. At any moment in time short of the infinite long run, it is an optimal predictor insofar as it takes account of all known data on reproductive performance and assumes either (i) that there are no genetic correlations to impede the efficacy of selection or (ii) that any genetic correlations that could affect the response to selection would be shared among the members of a population.

Any deviations from expectations derived on the basis of actual reproductive output are accordingly attributed to changes in what is assumed to be a static selective environment (e.g., as in the topography of the adaptive landscape in Sewall Wright’s sense). The “static” structure of this environment is based (mathematically) on the average relative reproductive contributions of the extant individuals that bear specific versions (variants) of selectively nonneutral trait types. As such, exactly when an organism reproduces and how many conspecifics are in its vicinity are usually glossed over. But viability and fecundity can change considerably with the addition or subtraction of individuals who exemplify the selectively non-neutral trait types. To recognize this fact is to acknowledge the explanatory import of ecological factors such as absolute number of individuals in a population, frequency-dependent selection, and density-dependent selection (Waxman and Gavrilets 2005). Linking such biologically realistic, short-term ecological factors with long-term evolutionary considerations is precisely what lies at the heart of the research program known as adaptive dynamics.

How, then, can propensity theorists recover the explanatory importance of the individual organism? The explanatory importance of the individual cannot be reduced merely to its realized fecundity. Actual reproductive contribution can, at best, be considered evidence upon which to infer the direction and magnitude of fitness. Recasting token organisms as “evolutionary individuals” (entities whose causal and explanatory contributions depend exclusively on the complex trait-type descriptions satisfied) fares no better for propensity theorists. For the fitness of complex trait types is still explanatorily and ontologically (causally) prior to the individuals that bear them. Individuals are no more than “bundles of relevant trait types” on such a view. For the PIF and Equation 3 to be successful, individuals must remain ontologically and theoretically fundamental.

Appreciating factors typically associated with ecological change holds the key. The ubiquity and importance of frequency- and density-dependent selection suggest that when, how, and into what sort of population an individual is introduced are pivotal for predicting the dynamics of an evolving population. Conspecifics exhibiting the same selectively relevant trait types can influence the dynamics of a population in very different ways even when they make identical reproductive contributions. As a generic example, imagine two asexual individuals, X1 and X2, each of which leaves ten offspring over the course of its life. Suppose that the only difference between these two is that X1 contributes all of its offspring when the population contains few individuals at relatively low density, while X2 contributes all of its offspring only after the population has reached its local environment’s carrying capacity (i.e., many individuals at high density). In such a case, X1’s contribution to the average relative fitness of its trait type will be positive since the population is presumably experiencing something like exponential growth while initially exploiting its environment. X2, in contrast, makes its contribution at a time when it would be best for all conspecifics in this population, even those exemplifying the most fit variants, if reproduction were to cease. X2’s contribution thus diminishes the average fitness that both it and X1 exemplify.

A biologically realistic example of the foregoing variety can be seen in iteroparous species. Members of an iteroparous species have many reproductive cycles over the course of their lifetimes. There can be substantial variation in generation times between cohorts and even among individuals within a cohort in such species (Kruuk et al. 1999). This type of demographic stochasticity, when taken in conjunction with environmental variation during the lifespan that may influence the performance of an individual in a specific genotypic or phenotypic state in a year, can eventuate in dramatic differences in lifetime reproduction and thus the fitness of a trait.

The heretofore discussed problems present an obstacle to long-term theoretical measures that seek to reconcile predicted with observed evolutionary change in the wild. Pence and Ramsey’s formalization, to their credit, was developed with such difficulties in mind. It is, however, worth asking whether such an extensive picture of the causal network affecting a focal organism is necessary. In other words, are there any more parsimonious alternatives to their rather “bloated” ontology?

Coulson et al. (2006) have put forth an alternative formal model. They label their methodological approach “de-lifing” on the grounds that it estimates individual fitness via calculating how a population would have performed with a focal individual removed over some arbitrary time step. The procedure involves “retrospectively removing the individual and any offspring that it produced between time t and t+1 that were still alive at time t+1 from the data and recalculating population growth.” They label this quantity “individual performance” (denoted ξ). For each individual within a population at each time, ξ is removed and population growth recalculated. With actual population size denoted by N, it is easy enough to calculate population growth with individual i’s contribution removed (ωt(-i)):

$$\omega_{t(-i)} = \frac{N_{t+1} - \xi_{t(i)}}{N_t - 1} \tag{4}$$

An individual’s contribution to population growth, denoted pti, can, then, be calculated by subtracting ωt(-i) from actual population growth ωt(i). “This approach takes population growth directly and asks how each individual contributed to it directly” (Coulson et al. 2006, 548-549).
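The calculation just described can be illustrated with a minimal sketch. It is not Coulson et al.’s own code. It assumes a closed, asexual population in which each individual’s performance ξ is 1 if it survived the time step (otherwise 0) plus the number of its offspring still alive at t+1, so that the population size at t+1 is simply the sum of the ξ values, and it takes actual population growth to be the ratio of population size at t+1 to population size at t. The function name and the example data are hypothetical.

```python
from typing import List


def delifing_contributions(xi: List[float]) -> List[float]:
    """Return each individual's contribution to population growth.

    xi[i] is the performance of individual i over the step t -> t+1:
    1 if it survived (else 0) plus its offspring still alive at t+1.
    Assumption: the population is closed, so N_{t+1} is the sum of xi.
    """
    n_t = len(xi)
    if n_t < 2:
        raise ValueError("need at least two individuals at time t")
    n_next = sum(xi)          # N_{t+1} under the closed-population assumption
    w_t = n_next / n_t        # actual population growth over the time step
    contributions = []
    for x in xi:
        # Equation 4: growth recalculated with individual i "de-lifed"
        w_without_i = (n_next - x) / (n_t - 1)
        contributions.append(w_t - w_without_i)
    return contributions


# Hypothetical data: five individuals alive at time t.
xi = [0, 1, 1, 3, 5]
p = delifing_contributions(xi)
print(p)  # [-0.5, -0.25, -0.25, 0.25, 0.75]
```

Under these assumptions the contributions sum to zero, which makes plain that the measure is relative: an individual contributes positively only insofar as it outperforms the realized population average.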

There are several noteworthy features to this approach. First, Coulson et al.’s model assumes that the time interval chosen is shorter than the amount of time required for offspring to reproduce. In other words, their chosen measure, not unlike Pence and Ramsey’s equation 3, is a non-generational measure of individual performance. This choice is justified on the grounds that selection is a continuous process which operates on the distribution of phenotypic traits within a population at any and every point in time. Consideration of time in shorter intervals than a generation enables their proposed model to likewise avoid some of the difficulties associated with demographic stochasticity (e.g., Pence and Ramsey’s “timing of offspring” problem).

Second, the full equation for pti includes a correction factor which takes account of the number of competitors in a population.116 This step recognizes how an individual’s reproductive contribution to population growth is greater when the population is small rather than large. As population size increases, the maximum potential contribution an individual can make in any one time step goes down. Third and perhaps most pertinent, the statistic pti enables the calculation of a weighted sum across individuals within the same state, no matter how the “state” in question is conceived. Being in the same state can, thereby, involve organisms sharing the same trait value or variant (e.g., size), belonging to the same cohort (i.e., same age class), or any other arrangement. The relative reproductive contribution that organisms within any specific state make to population growth can accordingly be determined.

116 For technical details, see page 549 of Coulson et al. (2006).
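The second and third features can be illustrated with a short sketch. It is an illustration under stated assumptions, not Coulson et al.’s full equation: it uses only the uncorrected difference between actual growth and growth with the focal individual removed, assumes a closed population, and summarizes each state by the mean contribution of its members. The function names, state labels, and numbers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def contribution(xi_i: float, n_t: int, n_next: float) -> float:
    """Actual growth minus growth recalculated without individual i."""
    w_t = n_next / n_t
    w_without_i = (n_next - xi_i) / (n_t - 1)
    return w_t - w_without_i


def mean_contribution_by_state(records: List[Tuple[str, float]]) -> Dict[str, float]:
    """Average the individual contributions within each state.

    records holds one (state, xi) pair per individual alive at time t.
    Assumption: the population is closed, so N_{t+1} is the sum of xi.
    """
    n_t = len(records)
    n_next = sum(xi for _, xi in records)
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for state, xi in records:
        totals[state] += contribution(xi, n_t, n_next)
        counts[state] += 1
    return {state: totals[state] / counts[state] for state in totals}


# Hypothetical population in which "state" is a size class.
records = [("large", 3), ("large", 5), ("small", 0), ("small", 1), ("small", 1)]
by_state = mean_contribution_by_state(records)

# The same performance (survival plus ten surviving offspring) counts for far
# less in a large, stationary population than in a small, growing one.
p_low_density = contribution(11, n_t=10, n_next=20)
p_high_density = contribution(11, n_t=1000, n_next=1000)
```

Because the denominator grows with population size, an individual with the same ξ registers a contribution of 1.0 in the small population and roughly 0.01 in the large one, which is the density effect described above. Any other grouping (cohort, genotype, territory status) could be substituted for the size classes without changing the procedure.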

While Pence and Ramsey’s model shares some if not all of the foregoing features, there is one crucial difference. “The method [of Coulson et al.] does not estimate what the consequences would be on the performance of other individuals” (2006, 548). This is where the similarity between the two models ends. Differences would surely arise if a particularly dominant individual or territory holder were removed from a population. Dominance hierarchies and territory tenure could be significantly altered. But these consequences are beside the point since estimates of relative individual performance (i.e., fitness as per the measure pti) are based directly on realized population growth, not on what would have happened in a counterfactual environment unlike that which was actually experienced by all members of the population.

Conclusion

On Coulson et al.’s approach, long-term measures of organismal performance (fitness) are established on the basis of the actual influence (via survival and fecundity) that an individual exerts on realized population growth. This influence is readily corrected for the vicissitudes of frequency- and density-dependence as well as non-generational timeframes. Better still, for propensity theorists, the realized performance measure calculated via pti for each token organism is what ultimately grounds the trait fitness values that collectively function as long-term predictors of evolutionary change. The individual thus retains ontological as well as explanatory priority. All of this is accomplished without recourse to an interpretation that requires considering what might have been or may come to be for an individual conceived of as somehow divorced from competing, cohabiting conspecifics. Such speculations can be avoided because the actual population is recognized as inseparable from the individual when it comes to calculations of relative fitness. As this approach apparently shares all the strengths of Pence and Ramsey’s model (Equation 3) without any of its drawbacks, notably its rather dubious reliance on the state space Ω containing all possible causal influences, the “de-lifing” model of Coulson et al. constitutes a much-improved formal model for the PIF.

CHAPTER 4

EVOLUTIONARY THEORY AND THE CHALLENGE OF EVO-DEVO

In On the Origin of Species (1859) Charles Darwin held fast to a conception of natural selection as the operative force (vera causa) behind adaptive evolutionary change. The explanatory primacy of selection as a causal mechanism has not gone unquestioned since. Sewall Wright’s criticism of R.A. Fisher, for instance, stressed the inclusion of factors (e.g., genetic linkages, migration, and drift) in addition to selection. Others, notably Stephen J. Gould (1983) and his highly influential colleague Richard Lewontin (1979), were until recently particularly vociferous in their lamentations about how the neo-Darwinian or modern synthesis of the mid-twentieth century prevented evolutionary biologists from considering other plausible evolutionary mechanisms.117 Even these able critics, however, never went so far as to deny selection some significant role in crafting adaptations and phylogenetic change.

How times have changed. These are the opening paragraphs of a recent paper in the leading journal Nature.

Charles Darwin conceived of evolution by natural selection without knowing that genes exist. Now mainstream evolutionary theory has come to focus almost exclusively on genetic inheritance and processes that change gene frequencies.

Yet new data pouring out of adjacent fields are starting to undermine this narrow stance. An alternative vision of evolution is beginning to crystallize, in which the processes by which organisms grow and develop are recognized as causes of evolution (Laland et al. 2014).118

117 Gould, S.J. 1986. “The Hardening of the Modern Synthesis” in Dimensions of Darwinism: Themes and Counterthemes in Twentieth-Century Evolutionary Theory. Ed. Marjorie Grene. Cambridge University Press. Gould, S. J., and R. C. Lewontin. 1979. The spandrels of San Marco and the Panglossian paradigm: a critique of the adaptationist programme. Proc. R. Soc. Lond. B. 205:581–598.

118 Op. 161. “Does evolutionary theory need a rethink? Researchers are divided over what processes should be considered fundamental.” Nature, Vol.514. 9 October 2014. As this piece is written in a point-counterpoint format, it should be noted that only the following figures subscribe to the sentiment in the above quotation: Kevin Laland, Tobias Uller, Marc Feldman, Kim Sterelny, Gerd B. Müller, Armin Moczek, Eva Jablonka, and John Odling-Smee.

The words are penned by a group of evolutionists who think, essentially, that the traditional synthesis of Darwinian natural selection and Mendelian (more recently molecular) genetics, often known as neo-Darwinism, is an exhausted paradigm, to use familiar language. Whatever the merits of the theory in the past, its time is over. We need a new approach to evolutionary questions. We must stop focusing on small random variations, where the creative factor in evolutionary change is natural selection, and start focusing on the variations produced by the multiple new factors of which we are now becoming aware – factors that bring in organismal development, something that neo-Darwinism ignores or treats as hidden in a black box. We must see that the creativity of evolution is to be found in development and that natural selection has at most a minor role, that of clearing up the detritus when the true factors of change have done their work.

The story that SET [“Standard Evolutionary Theory” or neo-Darwinism] tells is simple: new variation arises through random genetic mutation; inheritance occurs through DNA; and natural selection is the sole cause of adaptation, the process by which organisms become well-suited to their environments. In this view, the complexity of biological development — the changes that occur as an organism grows and ages — are of secondary, even minor, importance.

In our view, this ‘gene-centric’ focus fails to capture the full gamut of processes that direct evolution. Missing pieces include how physical development influences the generation of variation (developmental bias); how the environment directly shapes organisms’ traits (plasticity); how organisms modify environments (niche construction); and how organisms transmit more than genes across generations (extra-genetic inheritance). For SET, these phenomena are just outcomes of evolution. For the EES [“Extended Evolutionary Synthesis”], they are also causes.119

These authors focus specifically on so-called “evolutionary development” or “evo-devo” as the key to understanding both the failure of neo-Darwinism and the need for and nature of the extended synthesis. Here, then, is the rationale for this paper. Although I ultimately find the more radical claims for evo-devo wanting, the journey there is certainly more interesting than the destination.

119 Op. cit., p.162.

Historical Context

The import of this burgeoning area of biology and philosophy of biology is perhaps best understood from an historical vantage point (Laubichler and Maienschein 2006, 2009). With the rediscovery of Mendelian particulate heredity around the turn of the twentieth century, evolutionary biology assumed an overwhelmingly genetical orientation, what some historians have referred to as the ‘‘eclipse of Darwinism’’ (Bowler 1983). The possibility of “hard” (particulate) inheritance initially helped Darwinians counter a longstanding objection due to Fleeming Jenkin (1867), who argued that even beneficial mutations would inevitably be “swamped by” or “blended with” a majority of extant wild type traits. Were it so, natural selection would eventually be left without fodder (variation) and thus impotent as an ultimate explanation of adaptive biological diversity. While undoubtedly an effective response to one sharp critic, appreciation of Mendelian particulate heredity raised a more potent problem: how to explain the observation that quantitative characters, such as human height or weight, typically exhibit continuous variation (i.e., a “normal” distribution with a bell-shaped curve). When taken as applying exclusively to the effects of alleles at a single genetic locus, particulate inheritance seemed to imply that the realization of a phenotype was likewise an all-or-nothing affair (i.e., discontinuous). Offspring, it was thought, would exhibit either the phenotypic form associated with bearing the dominant allele at a genetic locus or the alternative phenotype associated with having the recessive allele; a range of intermediate phenotypes for (or variants of) a particular trait was accordingly supposed to be nigh impossible. Yet continuous variation was ubiquitous. It was scattered throughout the biological realm, undeniable even in humans, and of special interest to evolutionary theorists who saw selection as working incrementally on small mutations to shift mean trait values in populations over time. Not until the mathematical work of pivotal figures such as Ronald Fisher (1918, 1930), Sewall Wright (1932), and J.B.S. Haldane (1932), but especially Fisher’s 1918 paper “The Correlation between Relatives on the Supposition of Mendelian Inheritance,” would continuous phenotypic variation be reconciled with Mendelian genetics. Smoothing of the continuous distribution was “ensured by the combination of the additive effects of several loci and the blurring effect of environmental variation” (Pigliucci and Müller 2010, 6).
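The reconciliation can be made concrete with a deliberately simplified sketch. It is not Fisher’s 1918 model. It assumes biallelic loci with equal and purely additive effects, independent assortment, random mating, and no environmental variation, so that a phenotype is just the count of “plus” alleles an individual carries. The function name and parameter values are hypothetical.

```python
from math import comb
from typing import List


def additive_phenotype_distribution(n_loci: int, p: float = 0.5) -> List[float]:
    """Exact frequencies of phenotype classes under the stated assumptions.

    A diploid individual carries 2 * n_loci alleles, each a "plus" allele with
    probability p, so the phenotype (the plus-allele count) is binomially
    distributed over 2 * n_loci + 1 discrete classes.
    """
    n_alleles = 2 * n_loci
    return [
        comb(n_alleles, k) * p**k * (1 - p) ** (n_alleles - k)
        for k in range(n_alleles + 1)
    ]


one_locus = additive_phenotype_distribution(1)    # [0.25, 0.5, 0.25]
ten_loci = additive_phenotype_distribution(10)    # 21 classes, bell-shaped

for label, dist in (("1 locus", one_locus), ("10 loci", ten_loci)):
    print(label, len(dist), "classes; most common class has frequency",
          round(max(dist), 4))
```

With a single locus there are only three phenotype classes. With ten loci there are twenty-one closely spaced classes whose frequencies already trace a bell-shaped curve, and any environmental “blurring” of the sort mentioned in the quotation would fill the remaining gaps between classes.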

Nonrandom transgenerational changes in gene frequency within and even across populations subsequently became the hallmark of evolutionary change via natural selection. But the role of selection, Darwin’s primary mechanism of change, was restricted to the task of filtering what were thought to be evolutionarily insignificant micromutations. This much-diminished role for selection follows most directly from evolutionary biologists’ interest in the fates of quantitative phenotypic traits in populations. For such polygenic traits, multiple genes converge to result in a single phenotype. This involves a critical shift in focus from single loci and the linear development of particular phenotypes to what might be called “gene networks.”

The implications of this shift could not have been more profound. For if one considers that (i) a phenotype is typically determined by many genes, each having potentially more than one allele, and (ii) one and the same gene can play a role in the development of several different phenotypic traits, the potential impact upon organismal fitness of mutations to any single gene within a network increases exponentially. But mutations are inevitable. How, then, does the individual organism, understood as a well-integrated functional unit exhibiting many-to-many relations between genotype and phenotype, manage genetic change and novelty? If organisms are to survive such change and populations adapt on the basis of it, the impact of most genetic changes (mutations) must be slight indeed, or so it was inferred.

Macromutations (saltations) of the sort speciation seemed to require (e.g., Richard Goldschmidt’s (1940) ‘‘hopeful monsters’’) were correspondingly seen as an exceedingly improbable explanation for the variety of flora and fauna evinced by extant organisms and in the fossil record. Of particular importance in this vein was the observation that random and sometimes highly unrepresentative samples of large base populations could be established and maintained as the result of geographical isolation. When exposed to novel suites of selection pressures, such small isolated populations (colonies) could resist the effects of ‘‘swamping’’ (regression to the mean values for traits in the larger population), and adapt to local environmental conditions. Theodosius Dobzhansky (1951) and others had already discovered ample unexpressed genetic variation in populations (e.g., Drosophila melanogaster) for just such an unexpected confluence of events. Given enough time in isolation, a colony could thus adapt to such an extent that its individual members would no longer have the capacity to interbreed with members of the source population from which they initially diverged, a process which came to be known as allopatric speciation (Mayr 1942). The architects of the Modern Synthesis were confident that the cumulative selection of seemingly insignificant micromutations or unexpressed genetic variants sufficed for speciation and, thereby, evolution.

Philosophical Context

Through the lens of the pocket history provided in the previous section, it is of the utmost importance to note that ‘‘population thinking’’ predominates. The primary architects of the Modern Synthesis (e.g., Fisher, Haldane, Wright, Dobzhansky, Mayr, Simpson, Stebbins, and Rensch) recognized that evolutionary biology in all of its seemingly disparate disciplinary guises had a common core: it is concerned with the fates of populations, changes in the makeup of aggregates, collectives, or ensembles. As within statistical thermodynamics, where explanatory interest in the respective histories of token particles gives way to the robust macrophenomenal properties of particle aggregations, so, too, within evolutionary biology; the idiosyncratic biographies of individual organisms in a population fall by the explanatory wayside. It is “common differences” or variations in type that are of interest to those attempting to explain the trajectory of particular traits in a population over time (Gayon 1998; Sterelny and Kitcher 1988). The role of token organisms was stricken from evolutionary biology in the interests of unifying the theoretical foundations of biology and professionalizing the life sciences (Smocovitis 1996; Ruse 1996).

It is precisely with the stipulation that population-level thinking comprises the core of evolutionary biology that evolutionary developmental biology (hereafter ‘‘evo-devo’’) and its philosophical admirers have taken issue. Note at the outset that members of this contingent need not deny the necessity or even the importance of population thinking to evolutionary biology. Rather, evo-devo inspired worries typically arise with respect to the sufficiency of population thinking as a proper unifying framework or explanatory paradigm for evolutionary biology (Griffiths and Knight 1998).

These worries can be specified by focusing on the justifications offered for the ‘‘black boxing’’ of individual organisms that accompanied the so-called ‘‘hardening’’ of the modern synthesis (Gould 1983). Natural selection was presumed to act only on adult phenotypes and thus indirectly on the genetic underpinnings of such phenotypes. Assuming as much imposed constraints on the scope and efficacy of selection. The genetic and organismal levels were deemed explanatorily sufficient to account for all evolutionary change; macroevolutionary changes were simply microevolutionary changes ‘‘writ large’’ or ‘‘extrapolated.’’ Individual development or ontogeny was correspondingly neglected on the grounds that the factors and processes associated with development are either highly conserved (i.e., show little if any intraspecific variation) or already expressed in some fashion by the adult phenotype. When focusing on developmental factors and processes within a particular population, developmental processes and their products can appear quite robust in the sense that they often exhibit remarkable constancy even in the face of events like nucleotide sequence substitutions. Upon closer inspection, however, such stability or regularity often reveals itself as a mere artifact of pragmatic or heuristic maneuvering. Developmental factors and processes are generally relegated to the realm of background conditions against which causal claims about genes for particular phenotypic traits are evaluated. A trait, say X, is caused by a gene, say Y, only against a constant background of supporting factors or conditions, without which X would not be present even if Y were present. By stipulation, these supporting conditions are not expressed on the phenotypic level. Nor need they occur on the genetic level. An intermediate level of causal interactions and processes is implicitly assumed (Robert 2004).

The following line of reasoning similarly motivates the introduction of an intermediate level of developmental interaction, but this time its scope is confined to differentiation within a token organism rather than variation in a population. For the vast majority of multicellular organisms, all cells come with an entire complement of chromosomes. All the cells of such higher organisms consequently carry the same genes. But if this is correct, then the differences between types of cells such as muscle and nerve cells cannot solely be due to differences in genes. Some other factor(s) must be involved in determining the fate of particular cells. Something else must control the development of nerves and muscles, bone and sinew, heart and liver, leaves and petals. Moreover, whatever that something turns out to be, it must be reliably inherited. Development is, after all, regular. Were it not so, like would not beget like. In short, the Mendelian chromosome (gene) theory fails to account for the system of inheritance that controls organismal development; it leaves us wanting a theory of ontogenesis (Burian 2005, 183–184).

Whatever developmental factors distinguish cell types might also serve as a basis for selectively significant distinctions among the members of a particular population, among populations of conspecifics, and even among species. In other words, developmental factors could be the targets of selection and, thereby, precursors to speciation. The eventual key to discovering this something-the-architects-knew-not-what was to look between populations and among species. Recognizing that the invariance of developmental processes within a population was a mere heuristic or pragmatic ideal soon made between-population variation in such processes a genuine possibility. This was but the first step. The truly illuminating discoveries were ones that could only be achieved via comparison across species. Such comparisons revealed a surprising number of conserved transcription factors, many in the form of homeobox or “Hox” genes (Carroll 2005). Hox genes and their protein products play a critical regulatory role in the establishment of the body plan of invertebrates and vertebrates alike by way of activating or suppressing gene expression. A particularly striking example of the functional conservation of homeobox proteins can be found in the work of Lutz et al. (1996), who showed how an insect (fruit fly) can develop and function normally when its vertebrate (chicken) ortholog gHoxb-1 is put in place of the fly Hox gene Drosophila lab for labial development. For some, ‘‘[t]hese results suggest that phenotypic differences and phenotypic evolution are more the result of changes in the expression patterns of genes than they are of novel genes [or mutations]’’ (Laubichler 2010, 205). This discovery would not have been possible were it not for a crucial shift in perspective from “intrapopulation-intraspecies thinking” to “interpopulation-interspecies thinking.”

Some philosophers and evolutionary biologists have taken the shift in perspective that evo-devo encourages to be a powerful challenge or antidote to the hegemony of the Modern Synthesis. Those who adopt this perspective (e.g., Wagner, Chiu, and Laubichler 2000) occasionally prefer the sobriquet ‘‘devo-evo’’ (developmental evolution). They ‘‘see the current theory of evolution as incomplete and seek to modify it, or even replace it, with a theory grounded in development’’ (Hall 2000, 177–178). In its more extreme incarnations, evo-devo challenges the very notion that the gene is the fundamental unit of information and evolutionary change. Developmental Systems Theory (hereafter ‘‘DST’’) provides an illuminating example in this regard. Pioneered by the likes of Susan Oyama (1985, Oyama et al. 2001), Paul E. Griffiths and Russell D. Gray (1994, 1997, 2004, 2005), and Eva M. Neumann-Held (1999), the primary contention is that the factors required for regular organismal development are not contained within the genome sequence alone. Cellular and even extracellular factors are supposed to have just as much of a claim on the containment of pivotal information: ‘‘The very structure of genes is deeply context dependent, caused by such processes as mRNA processing and mRNA editing […] The phenomenon of mRNA processing shows that, in the process of gene expression, DNA is not a unique carrier of developmental information’’ (Robert 2004, 76).

The crux of DST and perhaps devo-evo boils down to a claim about information; namely, that ‘‘developmental information does not pre-exist individual ontogenies but rather emerges from the interactions of dispersed developmental resources of various kinds’’ (Robert 2004, 113). Information, in this basic causal sense, refers to reliable co-variation between a sender and a receiver along a specified channel of communication. While genes undeniably carry information about a phenotype insofar as they reliably co-vary with a phenotype, there are also many cases in which genes fail to reliably co-vary with phenotypes because of the many-to-many relationship between genes and complex phenotypes. Moreover, if reliable co-variation suffices to distinguish important or privileged evolutionary information from redundant or irrelevant factors, then there are numerous non-genetic (i.e., cellular and extra-cellular) factors that must be identified as equally significant. Consequently, the neo-Darwinian emphasis on the transmission of genetic information from one generation to another is dropped in favor of a picture whereupon developmental information is constructed anew every generation. Such processes generate the relatively reliable reproduction of type and also introduce variation of potential evolutionary significance.

Critical Evaluation

As I have shown, claims made by those who are in any way sympathetic to evo-devo can range from the seemingly mundane to the biologically heterodox. It is well worth pausing for a moment to take stock. The range of rhetoric can be better appreciated by gauging the strength of the claims being made against the paradigm or framework established during the Modern Synthesis (hereafter “MS”). In this regard my historical approach suggests a somewhat oversimplified spectrum along which the various manifestations of evo-devo could be assigned a place. On one end of this spectrum, there are those who see the emergence of evo-devo over the last thirty-five years or so as an “elaboration upon” concerns that were central to the MS in ways that are consistent with the methods and designs of the architects. This would clearly be the most theoretically conservative position since it casts recent work in developmental biology and its sister-disciplines as involving the analysis and refinement of earlier ideas or problems. Moving subtly to the left, one might come upon a position that is quite similar to its more conservative alternative. Unlike the aforementioned position, however, proponents of this slightly more “liberal” (as in liberated from the hegemony of the MS) position would argue that the emergence of evo-devo is clearly an “extension of” the MS into new areas (e.g., genomics) with novel concepts or problems (e.g., evolvability). Note that there are no clarion calls here for upheaval on the basis of there being inexplicable anomalies or conceptual inconsistencies with the central tenets of the MS. Still farther in this direction would be a position just to the left of center that stakes out a claim for those who currently fall under the sobriquet “devo-evo.” They, after all, want to square the theory of adaptive evolution with the “spottiness of” or “fits and starts in” paleontological data, the robust conservation of genetic regulatory factors (e.g., Hox genes), constraints on the efficacy of selective explanation, and the maintenance of heritable variation in the face of seemingly ubiquitous filtering selection. The position known as DST would land further to the left since even staunch proponents of devo-evo might shrink at the thought of claiming that developmental information is constructed anew every generation.

Capping the liberal end of this spectrum would be a position which appears at first glance to be so radical that one is tempted to exclude it altogether. This so-called “pluralist” position (Craig 2015) holds that it makes not a whit of difference to ask whether evo-devo presents phenomena, problems, or questions that are inconsistent with the MS. Why? Well, the argument goes, there was no substantive theoretical unification in the first place. What unification or hegemony occurred was in large part due to the architects’ efforts to legitimate biology as a professional science on par with physics and chemistry. If one reads this contention charitably, as one should, it is not quite as radical as it first seems. It has a notable pedigree in the philosophy of science that typically goes by the moniker “the semantic conception of theories,” whereupon theories are depicted as collections of models or methods (Suppes 1960; Van Fraassen 1980).

Utilizing this hypothetical spectrum, I can now clarify the targets of my critique. Very few would dispute that evo-devo must at minimum be taken as elaborating on or extending the central tenets of the Modern Synthesis. Something like a hybrid of these two more “conservative” positions is arguably the most appropriate way to understand the emergence and impact of evo-devo (Ruse 2006a, 2006b, 2007). As the pluralist approach noted above delves into myriad issues in the general philosophy of science that extend well beyond the scope of this essay, I will not engage with it directly. I shall instead set it aside because, although it is undeniably a viable philosophical approach, most historical accounts of the relevant period (1918-1950) speak against the plausibility of construing the Modern Synthesis as merely a sociological artifact of physics envy. Moreover, the pluralist position simply sidesteps the question that motivates this essay and the extensive literature that addresses it; one cannot establish a new paradigm in the absence of a predecessor. Focus will thus be restricted to the more radical strains of evo-devo: DST and devo-evo. It is, after all, practitioners of these two positions that actively call for a paradigm shift in our understanding.

Three strands of thinking are woven together to make the conceptual fabric for radical evo-devo: (i) emphasis on a causal or mechanistic account of ontogeny; (ii) a more inclusive notion of hereditary information; (iii) a commitment to the idea that extra-genetic developmental factors explain phylogenetic relationships among higher taxa. No party to this debate seriously questions the importance of (i), especially in the wake of discoveries such as conserved regulatory factors (Hox genes). Discoveries due to (i), argue proponents of radical evo-devo, strongly suggest (ii) on the grounds that there are cellular, extra-cellular, and even external (to the organism) environmental factors that reliably co-vary with the development of phenotypes. Furthermore, these extragenetic factors can vary in how they affect the expression or suppression of existing genetic information. Environmentally-induced variation in such regulatory factors can effect large-scale phenotypic changes of the sort that supposedly account for speciation and possibly higher taxonomic differentiation or so-called “major evolutionary transitions” (Szathmáry and Maynard Smith 1995; Calcott and Sterelny 2011). This clearly evinces (iii) above. For advocates of radical evo-devo, it is “ontogeny that creates phylogeny” (Garstang 1929; Hall 2007). The gradual accumulation and filtering of micromutations via natural selection is no longer the primary or even a sufficient creative mechanism behind the bewildering array of biological form and function. Adaptation and speciation ultimately depend on the effect of extragenetic developmental factors.

In spite of its more grandiose claims, it is still less than clear that radical evo-devo necessarily presents a decisive challenge to the central tenets of the modern evolutionary synthesis. There is an historical precedent for my resistance as well as an empirically-informed philosophical argument. Let us examine each of these in turn.

The historical counterargument begins by pointing out that evo-devo has often been touted as “revolutionary” on the grounds that its disciplinary predecessors and their respective tools of investigation were those descended from the Entwicklungsmechanik tradition of experimental embryology established by Wilhelm Roux, Hans Driesch, and others (Love and Raff 2003, 327–330; Griesemer 2007) during the nineteenth century. Roux’s well-known experiment, for instance, involved killing one of the cells of a two-cell-stage frog blastula to test how cells differentiate as development progresses. He discovered that isolated blastomeres do not in fact produce complete embryos. This ran counter to Driesch’s findings in experiments with sea urchin embryos, where isolated blastomeres went on to produce perfectly viable embryos (Hall 2007). In both cases “[t]he ability to do embryonic physiology provided the impetus for evaluating embryos for their own sake in order to uncover embryological mechanisms, rather than studying embryos solely for their evolutionary content” (Hall 2007, p.472). This tradition emphasized proximate mechanisms of development and amassed impressive empirical successes, which continue to this day under the rubric of developmental genetics.

But this pedigree is usually emphasized at the expense of the investigative background against which it emerged; namely, that of comparative evolutionary embryology as practiced by Karl Ernst von Baer, Ernst Haeckel, Alexander Kowalevsky, Carl Gegenbaur, and Francis Maitland Balfour (Nyhart 1995). Contra experimental embryology, this tradition investigated ontogenesis precisely because it held out the promise of ultimate explanation(s). Its key issues were phylogenetic relationships, the origin of evolutionary innovations, and the evolutionary significance of developmental constraints. More importantly, it is just these issues at the intersection between evolution and development that continue to motivate many of today’s evolutionary developmental biologists. What better testimony of this could there be than the discovery and intensive investigation of homeobox genes since the 1980s (Carroll 2005)? If a disciplinary pedigree including comparative embryology, heterochrony theory, and comparative evolutionary phylogenetic embryology is acknowledged as the historical precursor to contemporary evo-devo, then the purportedly revolutionary nature of evo-devo is nothing but an illusion (Love and Raff 2003, 327–330).

The empirico-philosophical reaction to the more extravagant claims of devo-evo and DST takes issue with the contention that the contextualization of the gene (and the information it carries) has gone unappreciated. That DNA does not contain all the relevant developmental and evolutionary information is and has been widely agreed upon (Robert 2004, 118). But the fact that it does not contain all of the requisite information is no reason whatsoever to conclude that it contains none. It seems more plausible to assume that stretches of DNA contain at least some basic developmental information. Developmental factors, in contrast, do not contain any DNA information. This informational asymmetry suggests some form of rudimentary explanatory privilege or priority for DNA. Even if not the primary locus of activity, genes remain part and parcel of a developing organism. A stretch of DNA or a “molecular gene” could accordingly be functionally defined as having the capacity to generate an amino acid sequence when placed in particular cellular and extracellular matrices of expression.

That selection can act on non-genetic aspects of organisms should also come as little surprise. So long as maternal effects, cytological and transcription factors, and epigenetic regulatory networks are somewhat reliably reproduced, show variation, and contribute in some way to the fitness of individual organisms bearing them, selection can come into play. Here it should be kept in mind that the earlier a constraint evolved and the more developmentally entrenched it is, the less likely it is to be subject to extensive change due to selection. Any heritable variation in a developmental factor that is not fitness-diminishing would have to meet exceedingly specific informational requirements so as to avoid disrupting the developmental cascade that results in a fully-fledged organism. Developmental factors would be genetically constrained to a very high degree, although not immune to the effects of selection (Gilbert et al. 1996; Gilbert 2003a, 2003b, 2006, 2007; Ruse 2006a, 2006b, 2007).

That extragenetic matrices of expression are more or less reliably reproduced so as to regulate development and ensure that “like begets like” is a critical feature of the foregoing discussion. It is such regularity that enables talk not only of cataclysmic (macroevolutionary) phenotypic differentiation but also of “constraints” upon evolution. Much heat but little light has issued from claims for a developmental revolution based on devo-evo’s purportedly unique promise to overcome this “limit” to neo-Darwinian selective explanation. The inference that the less-than-sympathetic are supposed to make at the mere mention of constraints, and especially of “developmental constraints,” is clear enough: we had better acknowledge the need for mechanisms other than random genetic mutation, selection, drift, and migration or we will be unable to explain how or why populations manage to maintain enough accessible variation to, for example, stave off occasional evolutionary “bottlenecks” and the like.

This argument from constraints is unconvincing. Perhaps the best rejoinder is that it is possible to adequately describe the phenomenon of evolutionary constraints in terms that make no reference whatsoever to developmental factors. The tools of population genetics already provide us with the means to do so, particularly in the form of a mathematical abstraction known as the “adaptive landscape” (Wright 1932; Gavrilets 2004) or more recently “phenotype configuration space” (Stadler 1996; Rice 2004). These abstractions are expressly designed to capture the complex relationship between genotype space and phenotype (or fitness) space. The key to understanding this relationship is twofold. First, one must recognize that the relationship between genotype space (the set of possible genotypes) and phenotypes is a many-to-few relation, which means simply that each phenotype can be realized by a large number of genotypes. Synonymous substitutions at the third position in codons make for a simple but adequate example, as they do not induce changes to the resulting amino acid. Second, it must be appreciated that not all phenotypes have exactly the same number of coding genotypes. If these two assumptions are granted, which it seems they must, then it follows (from probability theory alone) that phenotypes which can be realized by many more genotypes are more likely to evolve than phenotypes that can be realized by only a few genotypes. Put colloquially, the former are “more easily found” by random mutation, where accessibility is determined by the number of different genotype mutations that can realize a phenotype. In much the same manner, the accessibility relation holds between ancestral and derived phenotypes, since not all phenotypes will be accessible from each other phenotype with equal probability (Wagner 2006). The point of import here, again, is just that it is possible to conceptualize constraints upon evolution without mentioning the “developmental constraints” or “developmental types” that adherents of devo-evo or DST see as indispensable. “Developmental constraints [merely] reflect limited mutational accessibility from an ancestral genotype/phenotype” (Wagner 2006).
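The many-to-few point can be made concrete with the codon example mentioned above. The following is a minimal sketch, not a model drawn from Wagner (2006) or from the population-genetic literature cited here. It assumes the standard genetic code, treats each codon as a "genotype" and each amino acid (or stop signal) as a "phenotype," and assumes that codons are sampled uniformly at random, which is a strong simplification. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons ("genotypes") to an amino acid or stop ("phenotype").
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def genotypes_per_phenotype(table):
    """Count how many codons realize each amino acid (many-to-few map)."""
    return Counter(table.values())


def synonymous_third_position_fraction(table):
    """Fraction of single third-position substitutions that leave the
    encoded amino acid (or stop signal) unchanged."""
    synonymous = 0
    total = 0
    for codon, aa in table.items():
        for base in BASES:
            if base == codon[2]:
                continue  # not a substitution
            total += 1
            if table[codon[:2] + base] == aa:
                synonymous += 1
    return synonymous / total


def chance_of_hitting(table):
    """Probability that a uniformly random codon encodes each phenotype.
    Uniform sampling is an illustrative assumption only."""
    counts = genotypes_per_phenotype(table)
    return {aa: n / len(table) for aa, n in counts.items()}


if __name__ == "__main__":
    counts = genotypes_per_phenotype(CODON_TABLE)
    print("codons for leucine (L):", counts["L"])
    print("codons for tryptophan (W):", counts["W"])
    print("synonymous third-position fraction:",
          synonymous_third_position_fraction(CODON_TABLE))
```

On this toy reading, a phenotype realized by six codons is six times as likely to be "found" by a blind draw as one realized by a single codon, which is all that the argument from probability theory requires.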

The obvious reply is to point out that it is precisely this differential accessibility or “topography of the landscape” (Wagner 2006) that requires explanation, an explanation which cannot, whether in practice or principle, come by way of the genetic level alone. It is less than obvious, however, that it is impossible (in principle) to provide a sufficient explanation solely in genetic and phenotypic terms, especially if it is granted that DNA encodes at least some developmental information. The truth of this contention can be assessed only after evolutionary biology is completed or significantly matured. For all that we have discovered, however, it must be admitted that knowledge claims pertaining to the actual course of evolution and the virtually innumerable factors responsible for its realization still merit significant scrutiny.

It is fascinating and potentially informative to ponder the implications of turning questions about hereditary information “inside out” (so to speak). Since the Modern Synthesis, developmental factors have been cast in the role of stable “triggering conditions” for the expression of genotypes. Now, of course, there is the suggestion that macroevolutionary creativity falls within the province of these supposed “triggering conditions.” This is perhaps the case. But even if developmental processes have had such profound effects, the fact that they are so highly conserved across taxa could be taken as indicating nothing more than their early appearance and selective retention throughout what is undeniably a highly contingent diachronic process.

Conclusion

As soon as Charles Darwin published his Origin of Species in 1859, critics were proposing alternatives to natural selection. Indeed, his staunchest British supporter, Thomas Henry Huxley, argued for jumps from one species to another (“saltations”) and his most resolute American supporter, Asa Gray, argued for divine guidance behind the new variations in evolutionary change. This kind of alternative thinking continues down to the present. It is fueled by many factors, including religion, but clearly for many (very secular) scientists the idea that something so simple, so unthinking, could cause such complexity of adaptation is just too great for their imaginations. I very much doubt that my contribution is going to end this rival tradition. But my claim is that natural selection could do the job at the time of Darwin and it can do the job today as well as for the foreseeable future. Of course, there are fascinating new discoveries about the nature of variation and how it expresses itself. None of this would trouble Darwin, nor need it trouble us.

CHAPTER 5

CONCLUSION

Explaining adaptation or “the appearance of design” in nature and, thereby, predicting the course of evolution unarguably remain a motivation par excellence for population biologists. More so than ever there is a clarion call for investigating the interactions between short-term ecological phenomena concerning population dynamics and long-term evolutionary considerations which emphasize the study of adaptation as both product and process. Investigation under monikers such as “ecological genetics” and “evolutionary ecology” is all the rage. Despite the greater sophistication of recent work, natural selection has maintained its importance as the explanatory mechanism of adaptive evolution. The concept of fitness, when understood as the variation in individual performance that generates both ecological and evolutionary change, likewise remains pivotal. The ontological status of fitness, however, remains uncertain.

The introductory chapter of this dissertation details the ongoing dialectic about selection and fitness that began shortly before Darwin introduced selection as the means by which to explain the diversity of biological form and function in the natural world. Darwin’s immediate precursors, steeped as many of them were in English natural theology, believed that adaptation was evidence for the existence of divine selection. Species were “specially created” as optimally fitted to pre-established niches. An individual organism’s fitness was accordingly judged by how well it measured up to what was considered a species archetype or exemplar. Darwin and Wallace famously severed this connection between organismal fitness and the “fittingness” of species to their respective niches. For them, selection was best conceived of as a mechanism or force, the vera causa of adaptation and ultimately speciation. But it was no simple amalgam of all causal influences or the “thousand natural shocks” to which the members of any population are naturally subject. Rather, selection was from its inception a mechanism that effected its ends (adaptation and speciation) via a proper subset of extant organismal traits in a population. Only heritable traits whose variations differentially affected the prospects of survival were subject to selection. Insofar as these variations were heritable, they could constitute the basis for cumulative selection of the sort that gives rise to new species. Species essentialism was torn asunder. Variation, along with its statistical analogue variance, became paramount for evolutionary theory following Darwin. Fitness was transformed into a property of individual organisms reconceived of as variations in the struggle for existence against a common suite of environmental pressures.

Already with Darwin’s contemporaries, disagreements about how best to understand selection were emerging. Wallace, the cofounder of the theory of evolution via natural selection, was of course dubious about the use of the term ‘natural selection’ to describe differential survival and reproduction. He thought that it invoked premeditated selection by a rational agent and thus the sort of divine intelligence assumed by proponents of natural theology. Other worries would emerge only somewhat later among the architects of the modern or neo-Darwinian synthesis, notably Fisher and Wright. Their concerns were directed primarily at the targets and efficacy of natural selection. Fisher was convinced that allelotypes and genotypes were the proper targets of selection. Selection was ever-present and all-encompassing for him. Other factors like drift, mutation, and migration simply distorted otherwise adaptive trajectories. These were the “entropic” factors that supposedly accounted for most of the unexplained (i.e., environmental) variance in statistical analyses of selection. Wright famously countered that such purported “distortions” were in fact far more pervasive and necessary than Fisher believed. Linkages due to pleiotropy and epistasis, for example, helped to ensure the accumulation of naturally selected variations and the fidelity of transmission from one generation to another. Migration and drift, therefore, gained significant explanatory importance. Adaptive evolution can be explained only by placing these factors right alongside selection. But taking these factors into account meant prioritizing the entire organism or its entire genome as the proper target of selection.

Even this much-abbreviated historical overview demonstrates that two tightly intertwined issues were emerging by the middle of the twentieth century. One involves the nature and efficacy of natural selection. Is selection a dynamical force or mechanism that somehow “drives” (a causal notion) more fit conspecifics to increase their relative representation over time? Or is it better thought of as a post hoc redescription of differential reproduction? The second issue concerns the ontological status of fitness. Fitness is often spoken of as though it is a property that causes selection and, thereby, cumulative adaptive evolution. The fundamental assumption for Darwin and his supporters was that this property is one borne by individual organisms. But is it individuals or the variations that they bear (i.e., the trait types) which are best conceived of as having fitness values? After all, biologists typically speak of a token individual as having a fitness value only in virtue of its exemplifying certain trait variants. Moreover, it is combinations of trait types rather than individuals that are the focus of heritability and, thereby, representation in subsequent generations.

Notice that an answer to the second issue presupposes a position on the first. If you believe that fitness is a label that applies to traits but not (or only derivatively to) individuals, then you are committed to a view of selection as an acausal or merely descriptive (statistical) notion. Intergenerational data on reproductive outcomes must be used in determining trait fitness values, which can in turn be used to ascribe individual fitness values. Fitness is, consequently, something to be assessed “after the fact.” It is no longer just a matter of intragenerational viability as when depicted as a property of token individuals.

It is with the intention of dispelling the foregoing perplexities that biologically-savvy philosophers first proposed the propensity interpretation of fitness. As a propensity, fitness is considered a probabilistic dispositional property of individuals. The metaphysical grounding for this disposition allegedly lies in the collection of traits that exhibit heritable variation and differentially affect survival and reproduction. An individual can be assigned a mathematical expectation, the scalar value of which is calculated as the average reproductive contribution of all the organisms that exhibit each of the selectively nonneutral trait type(s). This mathematical representation is to be interpreted as indicating how likely an organism is to survive and reproduce given the suite of variations it happens to exhibit. The organism need not, however, manifest the associated value since fitness is a probabilistic disposition. This conceptual maneuver was supposed to accommodate worries that arise from the fact that the fittest do not always survive or reproduce.
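The idea of fitness as a mathematical expectation can be sketched numerically. The example below is a toy illustration under stated assumptions, not a reconstruction of any published propensity model. The offspring distribution and the census records are invented for the purpose, and the function names are hypothetical.

```python
from statistics import mean


def expected_offspring(distribution):
    """Mathematical expectation of offspring number: the sum of k * P(k).

    `distribution` maps an offspring count k to its probability.
    The probabilities are assumed (hypothetically) to be known.
    """
    total_probability = sum(distribution.values())
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(k * p for k, p in distribution.items())


def trait_type_average(records, trait):
    """Average realized offspring number among bearers of `trait`.

    `records` is a list of (trait, realized_offspring) pairs, standing in
    for census data on organisms that exhibit a given trait type.
    """
    outputs = [offspring for t, offspring in records if t == trait]
    if not outputs:
        raise ValueError("no individuals bear this trait")
    return mean(outputs)


if __name__ == "__main__":
    # Hypothetical disposition: 20% chance of 0, 30% of 1, 50% of 2 offspring.
    disposition = {0: 0.2, 1: 0.3, 2: 0.5}
    print("expected offspring:", expected_offspring(disposition))

    # Hypothetical census: realized outputs need not match the expectation.
    census = [("A", 2), ("A", 0), ("a", 1), ("A", 4)]
    print("average among A-bearers:", trait_type_average(census, "A"))
```

The sketch separates the two quantities that the propensity interpretation relates: an expectation assigned to an individual, and the realized outputs of trait bearers from which such expectations are estimated. An individual with a high expectation may still leave no offspring, which is the sense in which the disposition need not be manifested.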

The propensity account of fitness does not come without a host of problems, however. Chapter 2 of this work examines the arguments of one prominent critic of the propensity interpretation of fitness, Dennis Walsh. He contends that fitness is a merely statistical property of trait types, one which is predictively but not causally efficacious. Walsh (2010) argues that the causal commitments of the propensity view, namely that organismal fitness causes systematic (adaptive) transgenerational change in trait frequency, entail a probabilistically non-benign version of Simpson's paradox and ultimately the violation of a principle in decision theory known as the “Sure Thing Principle.” Were this the case, it would constitute a fatal result for the orthodox view since causal claims must in general conform to the directive of this principle. In partial defense of the propensity interpretation, Jun Otsuka et al. (2011) correctly counter that Walsh’s argument relies on a misunderstanding of the mathematical models that Gillespie (1972, 1974) uses to calculate fitness. I argue that, while Walsh has indeed overstated the case against the propensity view, it is not clear that his mistakes lie precisely where Otsuka et al. take them to be.

I briefly sketch out the relevant differences between the two competing positions with respect to the concept of fitness and its explanatory role in theoretical population biology. This is followed by a brief review of the pivotal distinction between probabilistically “pathological” and “benign” instances of Simpson's paradox, and a careful examination of the problem case that supposedly stymies the propensity view. I conclude with a rejoinder based on practical and theoretical grounds against taking effective population size as the sole measure for estimation of individual fitness.
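For readers unfamiliar with Simpson's paradox, the reversal at its core can be shown with a small numerical sketch. The counts below are hypothetical and are not taken from Walsh (2010), Otsuka et al. (2011), or Gillespie's models. They describe two invented trait types observed in two invented habitats, and the sketch takes no stand on whether such a case is “pathological” or “benign.”

```python
from fractions import Fraction

# Hypothetical (survivors, total) counts for two trait types in two habitats.
COUNTS = {
    "A": {"habitat_1": (81, 87), "habitat_2": (192, 263)},
    "B": {"habitat_1": (234, 270), "habitat_2": (55, 80)},
}


def within_habitat_rates(counts):
    """Survival rate of each trait type within each habitat."""
    return {
        trait: {h: Fraction(s, n) for h, (s, n) in habitats.items()}
        for trait, habitats in counts.items()
    }


def pooled_rates(counts):
    """Survival rate of each trait type with the habitats pooled."""
    pooled = {}
    for trait, habitats in counts.items():
        survivors = sum(s for s, _ in habitats.values())
        total = sum(n for _, n in habitats.values())
        pooled[trait] = Fraction(survivors, total)
    return pooled


if __name__ == "__main__":
    within = within_habitat_rates(COUNTS)
    pooled = pooled_rates(COUNTS)
    for habitat in ("habitat_1", "habitat_2"):
        print(habitat, "A:", float(within["A"][habitat]),
              "B:", float(within["B"][habitat]))
    print("pooled A:", float(pooled["A"]), "B:", float(pooled["B"]))
```

With these invented counts, type A has the higher survival rate within each habitat taken separately, yet type B has the higher rate once the habitats are pooled, because the two types are unevenly distributed across habitats.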

As both proponents and opponents of the propensity interpretation of fitness have pointed out, many of the problems that plague the account are mathematical in nature. Pence and Ramsey (2013) detail the problematic mathematical implications of three generic (but biologically realistic) scenarios: (i) the mathematical moments problem; (ii) the delayed selection problem; (iii) the timing of offspring problem. They draw on highly sophisticated mathematical work in a research program known as adaptive dynamics to generate a new formal model for the propensity view. Their model offers a formal estimate of absolute individual fitness, which I dub the “God’s eye view of fitness” on the grounds that fitness is defined as the total descendant contribution of an individual in the infinite limit. The details of the model allow it to resolve the conceptual inconsistencies which resulted from estimating individual fitness as a probability weighted arithmetic average (i.e., a mathematical expectation) of reproductive output. It also synchronizes philosophical intuitions about propensities with current work in theoretical population biology. Be that as it may, I argue in Chapter 3 that this revamped version of the propensity interpretation is not without serious flaws. On one hand, it provides an insufficient account of fitness since it lacks a principled means by which to articulate the fundamental distinction between theoretically relevant (discriminate) and irrelevant (indiscriminate) sampling processes which is central to evolutionary population biology. On the other hand, the ontological interpretation of their mathematical model requires a hugely inflated ontology (sample space) with which to capture all possible causal influences on future representation. As I show, there is at least one available model for individual fitness whose interpretation does not require such long-term considerations and, hence, circumvents this bloated ontology. This alternative, which estimates fitness via a mathematical procedure known as “de-lifing” (Coulson et al. 2006), assesses future contingencies regarding representation on the basis of actual reproductive output and of how population-level parameters would have been altered were it not for the reproductive contributions of an individual. If this is indeed a viable alternative, the necessity of Pence and Ramsey’s model and the ontological interpretation for which they argue is potentially undermined.
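The counterfactual logic behind “de-lifing” can be sketched in a few lines. What follows is a simplified reading of the idea described above, not the full procedure of Coulson et al. (2006). It assumes that each individual's performance over one time step is summarized by a single number (here, hypothetically, its own survival plus the recruits it contributes) and that population growth is the sum of performances divided by the number of individuals. The function name and the sample data are invented.

```python
def delifed_contributions(performances):
    """Contribution of each individual to population growth over one step.

    Each contribution is the observed growth rate minus the growth rate
    recomputed with that individual (and its performance) removed.
    `performances` holds one non-negative number per individual; treating
    it as "survival plus recruits" is an assumption of this sketch.
    """
    n = len(performances)
    if n < 2:
        raise ValueError("need at least two individuals")
    total = sum(performances)
    observed_growth = total / n
    contributions = []
    for xi in performances:
        growth_without_i = (total - xi) / (n - 1)
        contributions.append(observed_growth - growth_without_i)
    return contributions


if __name__ == "__main__":
    # Hypothetical performances for four individuals over one time step.
    performances = [1, 3, 0, 2]
    for xi, p in zip(performances, delifed_contributions(performances)):
        print(f"performance {xi}: contribution {p:+.4f}")
```

Under these assumptions each contribution equals the individual's deviation from the mean performance divided by one less than the population size, so the contributions sum to zero across the population. The point of the sketch is only that the quantity is computed from realized outputs within a single time step, with no appeal to descendants in the infinite limit.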

Perhaps the most vociferous challenge to the status and efficacy of selection has come from practitioners within the somewhat recently emerged field of evolutionary developmental biology or “evo-devo.” They contend that the neo-Darwinian synthesis is an exhausted paradigm, and a better approach to evolutionary questions is already at hand. Evolutionary biologists must stop focusing on small random variations, where the creative factor in evolutionary change is natural selection, and start focusing on the variations produced by the multiple new factors of which we are now becoming aware: factors that bring in organismal development, something that neo-Darwinism ignores or treats as hidden in a black box. We must see that the creativity of evolution is to be found in development and that natural selection has at most a minor role, that of clearing up the detritus when the true factors of change have done their work. A direct corollary of selection’s role being so diminished is that fitness is demoted or at minimum dramatically altered. Fitness is, after all, the fodder for selection.

Chapter 4 is an empirically-, historically-, and philosophically-informed critique of such claims. Special attention is given to the more radical or extravagant claims emanating from within evo-devo, namely, the claims of those who subscribe to some form of developmental evolution (“devo-evo”) or developmental systems theory (“DST”). The central commitment among those who adhere to these research programs is that developmental information does not pre-exist individual ontogenies but rather emerges from the interactions of dispersed developmental resources of various kinds. It must obviously be conceded that DNA does not carry all developmental information. But the fact that it does not contain all the requisite information provides no reason whatsoever to conclude that it contains none. It seems plausible that stretches of DNA contain at least some basic developmental information. Developmental factors, in contrast, do not contain any DNA information. This informational asymmetry suggests some form of rudimentary explanatory privilege or priority for DNA. Moreover, so long as maternal effects, cytological and transcription factors, and epigenetic regulatory networks are somewhat reliably reproduced, show variation, and contribute in some way to the fitness of individual organisms bearing them, selection can come into play. I conclude that the more radical claims of evo-devo provide no compelling reasons for conceptual upheaval. When examined closely and understood properly, their contentions can actually be construed as lending further support to prevailing views about the scope and efficacy of selection as a creative mechanism and the theoretical resilience of fitness.

The contents of this dissertation demonstrate that the concept of fitness as well as the mechanism of natural selection remain as pivotal to evolutionary theory today as they were in Darwin’s time. It does not, of course, follow that these aspects of evolutionary theorizing are uncontroversial. Anything but. Vexing questions remain, especially when it comes to understanding the concept of fitness. This is perhaps most clearly evinced by the work of Walsh and his colleagues (Ariew and Matthen) who argue that fitness makes sense only as applied to traits. For them, fitness is merely a statistical redescription, an acausal epiphenomenon. It is just a matter of “bookkeeping.” Setting aside specifics of content and argumentative cogency, this challenge is noteworthy in that it finds capable apologists who, unlike some of the more radical exponents of evo-devo, are largely sympathetic to the central tenets of the neo-Darwinian paradigm. Nor are their worries due solely to flights of philosophical fancy. For more than four decades now (see Stearns 1975), theoretical biologists have fought among themselves about how to properly conceptualize fitness. Pence and Ramsey (2013), using work in adaptive dynamics, show that a philosophically and mathematically unified account of fitness as a property of individual organisms is possible. But does the level of abstraction required by their model ultimately undermine its applicability? While I have urged that it does, the case is anything but closed since a purely philosophical account of fitness as a propensity remains a live option. At any rate, the agenda for theoretical biologists and philosophers of biology alike has been set, and it almost certainly involves ontological considerations insofar as the interpretations of competing mathematical models are in question.

REFERENCES

Abrams, M. (2007). Fitness and Propensity's Annulment? Biology and Philosophy 22(1): 115-130.

Abrams, M. (2009). What determines biological fitness? The problem of the reference environment. Synthese, 166(1): 21–40.

Ahmed, S. and J. Hodgkin (2000). MRT-2 Checkpoint Protein is Required for Germline Immortality and Telomere Replication in C. elegans. Nature 403: 159–64.

Ariew, A. and Z. Ernst (2009). What fitness can’t be. Erkenntnis, 71(3), 289–301.

Ariew, A. and R.C. Lewontin (2004). The confusions of fitness. British Journal for the Philosophy of Science, 55, 347–363.

Beatty, J. (1984). Chance and Natural Selection. Philosophy of Science 51:183-211.

Beatty, J. and S. Finsen (1989). Rethinking the Propensity Interpretation: A Peek Inside Pandora’s Box. In: Ruse M. (ed.) What the Philosophy of Biology Is. Nijhoff International Philosophy Series, vol 32. Springer, Dordrecht: 17-30.

Bouchard, F. and A. Rosenberg (2004). Fitness, Probability and the Principles of Natural Selection. The British Journal for the Philosophy of Science, 55: 693–712.

Bowler, P. (1983). The eclipse of Darwinism: Anti-Darwinian evolution theories in the decades around 1900. Baltimore: Johns Hopkins University Press.

Bowler, P. (2003). Evolution: The History of an Idea. University of California Press.

Brandon, R.N. (1978). Adaptation and Evolutionary Theory. Studies in the History and Philosophy of Science 9:181-206.

Brandon, R. N. (1990). Adaptation and Environment, Princeton: Princeton University Press.

Brandon, R.N. and J. Beatty (1984). The Propensity Interpretation of ‘Fitness’: No Interpretation is No Substitute. Philosophy of Science, 51(2): 342-357.

Browne, J.E. (1995). Charles Darwin: vol. 1 Voyaging, London: Jonathan Cape.

Browne, J.E. (2002). Charles Darwin: vol. 2 The Power of Place, London: Jonathan Cape.

Burian, R.M. (1983). Adaptation. In Dimensions of Darwinism, edited by M. Grene. Cambridge: Cambridge University Press.

Burian, R.M. (2005). The Epistemology of Development, Evolution, and Genetics: Selected Essays. Cambridge University Press. Cambridge, UK.

Calcott, B. and Sterelny, K. (2011). The Major Transitions in Evolution Revisited. Cambridge, MA: The MIT Press.

Carroll, S. B. (2005). Endless forms most beautiful: The new science of Evo Devo. New York: Norton.

Cartwright, N. (1983). How the Laws of Physics Lie. Oxford University Press.

Cartwright, N. (1999). The Dappled World: A Study of the Boundaries of Science. Cambridge University Press.

Caswell, H. (1989). Matrix Population Models. Sunderland, MA: Sinauer Associates.

Clutton-Brock, T.H., S.D. Albon, and F.E. Guinness (1988). Reproductive success in male and female red deer. In T.H. Clutton-Brock, ed. Reproductive Success. The University of Chicago Press, Chicago, IL: 325-343.

Conner, J.K. and D.L. Hartl (2004). A Primer of Ecological Genetics. Sinauer Associates.

Coulson, T. et al. (2006). Estimating Individual Contributions to Population Growth: Evolutionary Fitness in Ecological Time. Proceedings of the Royal Society B, 273: 547-555.

Craig, L. (2015). Neo-Darwinism and Evo-Devo: An Argument for Theoretical Pluralism in Evolutionary Biology. Perspectives on Science, 23(3), 243-279.

Crow, J.F. (1990). Sewall Wright’s Place in Twentieth-Century Biology. Journal of the History of Biology, 23: 57-89.

Crow, J. and M. Kimura (1956). Some Genetic Problems in Natural Populations. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability 4: 1–22.

Cuvier, G. (1789). Histoire des Progres des Sciences naturelles depuis. Vol.1, 310.

Darwin, C. R. (1859). On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life. London: John Murray. [1st edition]

Darwin, C. R. (1868). The variation of animals and plants under domestication. London: John Murray. First edition, first issue. Volume 1.

Darwin, C. R. and A. R. Wallace. (1858). On the tendency of species to form varieties; and on the perpetuation of varieties and species by natural means of selection. Journal of the Proceedings of the Linnean Society of London. Zoology 3 (20 August): 46-50.

Darwin Correspondence Project, “Letter no. 5140,” accessed on 13 June 2017, http://www.darwinproject.ac.uk/DCP-LETT-5140.

Daston, L. and K. Park (1998). Wonders and the Order of Nature, 1150-1750. MIT Press, Cambridge, MA.

Dennett, D. (1995). Darwin’s Dangerous Idea. New York: Simon and Schuster.

Devitt, M. (2008). Resurrecting Biological Essentialism. Philosophy of Science 75(3): 344-382.

Dobzhansky, T. (1951). Genetics and the origin of species (3rd ed.). New York: Columbia University Press.

Dobzhansky, T. (1970). Genetics of the Evolutionary Process. Columbia University Press. New York, NY.

Edwards, A.W.F. (1994). The Fundamental Theorem of Natural Selection. Biological Reviews, 69(4): 443-474.

Elgin, M. and E. Sober (2002). Cartwright on Explanation and Idealization. Erkenntnis, 57: 441-450.

Endler, John A. (1986). Natural Selection in the Wild. Princeton, NJ: Princeton University Press.

Ewens, W.J. (1989). An Interpretation and Proof of the Fundamental Theorem. Theoretical Population Biology, 36(2): 167-180.

Feyerabend, P. (1975). Against Method: Outline of an Anarchist Theory of Knowledge. Verso. London.

Fisher, R.A. (1918). The Correlation between Relatives on the Supposition of Mendelian Inheritance. Transactions of the Royal Society of Edinburgh, 52, 399-433.

Fisher, R.A. (1930). The Genetical Theory of Natural Selection. Oxford: Clarendon Press.

Fisher, R.A. and C.S. Stock (1915). Cuenot on Preadaptation. A Criticism. Eugenics Review: 7, 46-61.

Frank, S.A. (2012). Review: Natural Selection. IV. The Price Equation. Journal of Evolutionary Biology 25, 1002-1019.

Garstang, W. (1929). The Origin and Evolution of Larval Forms. Report of the 96th Meeting of the British Association for the Advancement of Science: 77-98.

Gavrilets S. (2004). Fitness Landscapes and the Origin of Species. Princeton University Press, Princeton and Oxford.

Gayon, J. (1998). Darwinism’s struggle for survival: Heredity and the hypothesis of natural selection. Cambridge, UK: Cambridge University Press.

Gilbert, S. F. (2003a). Evo-devo, devo-evo, and devgen-popgen. Biology and Philosophy, 18, 347–352.

Gilbert, S. F. (2003b). The morphogenesis of evolutionary developmental biology. International Journal of Developmental Biology, 47, 467–477.

Gilbert, S. F. (2006). The generation of novelty: The province of developmental biology. Biological Theory, 1, 209–212.

Gilbert, S. F. (2007). Michael Ruse: Bare-Knuckle fighting: Evo-Devo versus natural selection. Biological Theory, 2(1), 74–75.

Gilbert, S. F., J.M. Opitz, and R.A. Raff (1996). Resynthesizing evolutionary and developmental biology. Developmental Biology, 173, 357–372.

Gillespie, J.H. (1974). Natural Selection for within-Generation Variance in Offspring Number. Genetics 76(3): 601-606.

Gillespie, J.H. (1975). Natural Selection for within-Generation Variance in Offspring Number II: Discrete Haploid Models. Genetics 81(2): 403-413.

Gillespie, J.H. (1977). Natural Selection for Variances in Offspring Numbers: A New Evolutionary Principle. American Naturalist 111(981): 1010-1014.

Godfrey-Smith, P. (2009). Darwinian Populations and Natural Selection. Oxford University Press.

Goldschmidt, R. (1940). The Material Basis of Evolution. New Haven: Yale University Press.

Gould, S.J. (1976). Darwin’s Untimely Burial: Despite reports to the contrary, the theory of natural selection remains alive and well. Natural History, 85: 24-30.

Gould, S. J. (1983). The hardening of the modern synthesis. In M. Grene (Ed.), Dimensions of Darwinism. Cambridge: Cambridge University Press.

Griesemer, J. (2007). Tracking organic processes: Representations and research styles in classical embryology and genetics. In M. D. Laubichler & J. Maienschein (Eds.), From embryology to evo-devo: A history of developmental evolution (pp. 375–433). Cambridge, MA: MIT Press.

Griffiths, P. E. (2006). Function, homology, and character individuation. Philosophy of Science, 73(1), 1–25.

Griffiths, P. E. and R.D. Gray (1994). Developmental systems and evolutionary explanation. Journal of Philosophy, xci (6), 277–305.

Griffiths, P. E. and R.D. Gray (1997). Replicator II: Judgement day. Biology and Philosophy, 12(4), 471–492.

Griffiths, P. E. and R.D. Gray (2004). The developmental systems perspective: Organism-environment systems as units of evolution. In M. Pigliucci & K. Preston (Eds.), Phenotypic Integration: Studying the ecology and evolution of complex phenotypes (pp. 409–431). Oxford, UK: Oxford University Press.

Griffiths, P. E. and R.D. Gray (2005). Three ways to misunderstand developmental systems theory. Biology and Philosophy, 20, 417–425.

Griffiths, P. E. and R.D. Knight (1998). What is the developmentalist challenge? Philosophy of Science, 65, 253–258.

Hall, B. K. (2000). Guest editorial: Evo-devo or devo-evo—does it matter? Evolution and Development, 2, 177–178.

Hall, B. K. (2007). Tapping Many Sources: The Adventitious Roots of Evo-Devo in the Nineteenth Century. In M.D. Laubichler and J. Maienschein (Eds.), From Embryology to Evo-Devo: A History of Developmental Evolution (pp. 467-498). Cambridge, MA: The MIT Press.

Hodge, M.J.S. (1987). Natural Selection as a Causal, Empirical, and Probabilistic Theory. In L. Krüger, G. Gigerenzer, and M.S. Morgan (Eds.), The Probabilistic Revolution (pp. 233–270). Cambridge, MA: The MIT Press.

Hodge, M.J.S. (1992). “Biology and Philosophy (Including Ideology): A Study of Fisher and Wright” in The Founders of Evolutionary Genetics: A Centenary Reappraisal. Ed. Sahotra Sarkar. Kluwer Academic Publishers. Dordrecht, The Netherlands. 1992.

Jenkin, F. (1867). Review of ‘The origin of species’, The North British Review, 46, 277-318.

Kruuk, L.E.B., J. Slate, and A.J. Wilson (2008) New answers for old questions: the evolutionary quantitative genetics of wild animal populations. Annual Review of Ecology, Evolution, and Systematics, 39: 525-548.

Laland, K. et al. (2014). Does evolutionary theory need a rethink? Researchers are divided over what processes should be considered fundamental. Nature, 514, 161-164.

Lande, R. (1982). A quantitative theory of life history evolution. Ecology 63: 607-615.

Lande, R. and S.J. Arnold (1983). The measurement of selection on correlated characters. Evolution 37: 1210-1226.

Langerhans, R.B. and T.J. DeWitt (2002). Plasticity constrained: Over-generalized induction cues cause maladaptive phenotypes. Evolutionary Ecology Research, 4(6): 857–870.

Larson, E. (2004). Evolution: The Remarkable History of a Scientific Theory. Random House Publishing Group.

Laubichler, M. D. (2010). Evolutionary developmental biology offers a significant challenge to the Neo-Darwinian paradigm. In F. J. Ayala & R. Arp (Eds.), Contemporary Debates in the Philosophy of Biology (pp. 199–212). Malden, MA: Wiley-Blackwell.

Laubichler, M. D., & Maienschein, J. (Eds.). (2006). From embryology to evo-devo: A history of developmental evolution. Cambridge, MA: The MIT Press.

Laubichler, M. D., & Maienschein, J. (Eds.). (2009). Form and function in developmental evolution. Cambridge, UK: Cambridge University Press.

Laudan, L. (1981). A Confutation of Convergent Realism. Philosophy of Science. Vol. 48, No. 1, pp.19-49.

Levins, R. (1966). "The Strategy of Model Building in Population Biology", American Scientist, 54:421-431.

Lewontin, R. (1970). The Units of Selection. Annual Review of Ecology and Systematics, 1:1-18.

Lloyd, E. (1994). The Structure and Confirmation of Evolutionary Theory. Princeton University Press. Princeton, NJ.

Love, A. C., & Raff, R. A. (2003). Knowing your ancestors: themes in the history of evo-devo. Evolution and Development, 5, 327–330.

Lutz, B. et al. (1996). Rescue of Drosophila labial null mutant by the chicken ortholog Hoxb-1 demonstrates that the function of Hox genes is phylogenetically conserved. Genes & Development, 10, 176-184.

Majerus, M.E.N. (2005). The rise and fall of the carbonaria form of the peppered moth. In M.D.E. Fellowes, G.J. Holloway, and J. Rolff (Eds.), Insect Evolutionary Ecology. Quarterly Review of Biology, 78: 399–418.

Matthen, M. and A. Ariew (2002). Two ways of thinking about fitness and natural selection. Journal of Philosophy, 99(2), 55–83.

Maynard Smith, J. (1982). Evolution and the Theory of Games. Cambridge University Press.

Mayr, E. (1942). Systematics and the Origin of Species, from the Viewpoint of a Zoologist. Cambridge: Harvard University Press

McMullin, E. (1982). Values in Science. Proceedings of the Biennial Meeting of the Philosophy of Science Association, 2: 3-28.

Metz, J.A.J., R.M. Nisbet, and S.A.H. Geritz (1992). How Should We Define “Fitness” for General Ecological Scenarios? Trends in Ecology & Evolution, 7: 198–202.

Miller, T.E.X. and B. Inouye (2011). Confronting two-sex demographic models with data. Ecology, 92(11): 2141-2151.

Mills, S.B. and J. Beatty (1979). The Propensity Interpretation of Fitness. Philosophy of Science 46: 263-286.

Millstein, R.L. (2002). Are Random Drift and Natural Selection Conceptually Distinct? Biology and Philosophy 17(1): 33-53.

Millstein, R.L. (2006). Natural Selection as a Population-Level Causal Process. The British Journal for the Philosophy of Science 57(4): 627-653.

Millstein, R.L., R. Skipper, and M.R. Dietrich (2009). (Mis)interpreting Mathematical Models: Drift as a Physical Process. Philosophy and Theory in Biology. DOI: http://dx.doi.org/10.3998/ptb.6959004.0001.002

Mayr, E. (1959). Typological versus Population Thinking. In Evolution and Anthropology: A Centennial Reappraisal (pp.409-412). The Anthropological Society of Washington. Washington.

Neumann-Held, E. M. (1999). The gene is dead—long live the gene! Conceptualizing genes the constructionist way. In P. Koslowsky (Ed.), Sociobiology and bioeconomics: The theory of evolution in biological and economic theory (pp. 105–137). Berlin, Germany: Springer.

Nunney, L. (1991). The Influence of Age Structure and Fecundity on Effective Population Size. Proceedings of the Royal Society of London B: Biological Sciences, 246(1315): 71-76.

Nyhart, L.K. (1995). Biology Takes Form: Animal Morphology and the German Universities, 1800–1900. Chicago, IL: University of Chicago Press.

Otsuka, J. et al. (2011). ‘Why the Causal View of Fitness Survives’, Philosophy of Science, 78: 209–24.

Oyama, S. (1985). The Ontogeny of Information. Cambridge: Cambridge University Press.

Oyama, S., P.E. Griffiths, and R.D. Gray (Eds.). (2001). Cycles of contingency: Developmental Systems and Evolution. Cambridge, MA: MIT Press.

Papineau, D. (1996). The Philosophy of Science. Oxford Readings in Philosophy. Oxford, U.K.

Pearson, K. (1901). On Lines and Planes of Closest Fit to Systems of Points in Space. Philosophical Magazine. Vol. 2, No. 11: 559-572.

Price, G.R. (1972). ‘Fisher’s “Fundamental Theorem” Made Clear’. Annals of Human Genetics, 36: 129-140.

Paley, W. (1809). Natural Theology: or, Evidences of the Existence and Attributes of the Deity. 12th edition London: Printed for J. Faulder.

Pearl, J. (2000). Causality. Cambridge: Cambridge University Press.

Pigliucci, M. and G.B. Müller (2010). Evolution—the extended synthesis. Cambridge, MA: The MIT Press.

Popper, K. R. (1957). The Propensity Interpretation of the Calculus of Probability, and the Quantum Theory. In: S. Körner (ed.): Observation and Interpretation. Academic Press Inc., Butterworths Scientific Publications, pp. 65–70.

Popper, K. R. (1959). The Propensity Interpretation of Probability. British Journal for the Philosophy of Science: 10, No. 37: 25–42.

Popper, K. (1974). Darwinism as a metaphysical research programme. In: Schilpp PA (ed.) The Philosophy of Karl Popper, pp. 133–143. New York and Chicago. Open Court Press.

Price, G.R. (1970). Selection and covariance. Nature. 227 (5257): 520–521.

Price, G.R. (1972). Extension of Covariance Selection Mathematics. Annals of Human Genetics 35:485–90.

Provine, W. (1971). The Origins of Theoretical Population Genetics. Chicago, IL: University of Chicago Press.

Ramsey, G. (2013). Organisms, traits, and population subdivisions: two arguments against the causal conception of fitness? British Journal for the Philosophy of Science, 64: 589-608.

Ramsey, G. and C.H. Pence (2013). Fitness: Philosophical Problems. In: eLS. John Wiley & Sons, Ltd: Chichester. DOI: 10.1002/9780470015902.a0003443.pub2

Ray, J. (1670). A Collection of English Proverbs. Nabu Press. Charleston, SC. 2010.

Robert, J. S. (2004). Embryology, Epigenesis, and Evolution: Taking Development Seriously. Cambridge, UK: Cambridge University Press.

Rice, S.H. (2004). Evolutionary Theory: Mathematical and Conceptual Foundations. Sunderland, MA: Sinauer Associates

Ruse, M. (1987). Biological species: natural kinds, individuals, or what? British Journal for the Philosophy of Science 38: 225-242.

Ruse, M. (1996). Monad to Man: The Concept of Progress in Evolutionary Biology. Cambridge, Massachusetts: Harvard University Press, 1996.

Ruse, M. (2003). Darwin and Design: Does Evolution have a Purpose? Cambridge, Mass.: Harvard University Press.

Ruse, M. (2006a). Forty years as a philosopher of biology: Why evo-devo makes me still excited about my subject. Biological Theory, 1, 35–37.

Ruse, M. (2006b). Scott F. Gilbert? The generation of novelty: The province of developmental biology, Bare-Knuckle fighting: Evo-Devo versus natural selection. Biological Theory, 1(4), 402–403.

Ruse, M. (2007). Scott F. Gilbert—second to the right, straight on till morning. Biological Theory, 2(2), 181–182.

Ruse, M. (2008). Charles Darwin. Blackwell Great Minds Series. Wiley-Blackwell.

Salmon, W. (1984). Scientific Explanation and the Causal Structure of the World. Princeton University Press. Princeton, NJ.

Scriven, M. (1959). “Explanation and Prediction in Evolutionary Theory.” Science, 130: 477-482.

Shaw, P.J.A. (2003). Multivariate statistics for the Environmental Sciences. Hodder Arnold.

Simpson, E.H. (1951). The Interpretation of Interaction in Contingency Tables. Journal of the Royal Statistical Society, Series B. 13: 238–241.

Simpson, G.G. (1949). The Meaning of Evolution: A Study of the History of Life and Its Significance for Man. Yale University Press. New Haven, CT.

Smocovitis, V. B. (1996). Unifying Biology: The Evolutionary Synthesis and Evolutionary Biology. Princeton, NJ: Princeton University Press.

Sober, E. (1980). Evolution, Population Thinking, and Essentialism. Philosophy of Science 47(3): 350-383.

Sober, E. (1984). The Nature of Selection: Evolutionary Theory in Philosophical Focus. University of Chicago Press. Chicago, IL.

Sober, E. (2001). The Two Faces of Fitness. In R. Singh, D. Paul, C. Krimbas, and J. Beatty (eds.), Thinking about Evolution: Historical, Philosophical, and Political Perspectives, Cambridge: Cambridge University Press: 309-321.

Sober, E. and L. Shapiro (2007). Epiphenomenalism - the Dos and the Don'ts. In G. Wolters and P. Machamer (eds), Studies in Causality: Historical and Contemporary, University of Pittsburgh Press. Pittsburgh, PA: 235-264.

Sober, E. and D.S. Wilson (1998). Unto Others: The Evolution and Psychology of Unselfish Behavior. Harvard University Press. Cambridge, MA.

Stadler, P.F. (1996). Landscapes and their correlation functions. Journal of Mathematical Chemistry. 20: 1–45.

Stearns, S.C. (1975). Life-history tactics: a review of the ideas. Quarterly Review of Biology, 51: 3-47.

Sterelny, K. and P. Kitcher (1988). The Return of the Gene. The Journal of Philosophy, 85 (7), 339-361.

Suppes, P. (1960). A Comparison of the Meaning and Uses of Models in Mathematics and the Empirical Sciences. Synthese, 12, 287-301.

Szathmáry, E. and J. Maynard Smith (1995). The Major Transitions in Evolution. Oxford, UK: Oxford University Press.

Tuljapurkar, S. (1989). An Uncertain Life: Demography in Random Environments. Theoretical Population Biology 35: 227–94.

Tuljapurkar, S. (1990). Population Dynamics in Variable Environments. New York: Springer-Verlag.

Tuljapurkar, S. D. and S.H. Orzack (1980). Population Dynamics in Variable Environments I: Long-Run Growth Rates and Extinction. Theoretical Population Biology 18: 314–42.

Tuljapurkar, S. D., J-M Gaillard, and T. Coulson (2009). From Stochastic Environments to Life Histories and Back. Philosophical Transactions: Biological Sciences, 364(1523): 1499-1509.

van Fraassen, B.C. (1980). The Scientific Image. Oxford, UK: Oxford University Press.

Wagner, G.P. (2007). How wide and deep is the divide between population genetics and developmental evolution? Biology and Philosophy, 22(1), 145-153.

Wagner, G. P., Chiu, C.-H., & Laubichler, M. (2000). Developmental evolution as a mechanistic science: The inference from developmental mechanisms to evolutionary processes. American Zoologist, 40, 819–831.

Walsh, D.M. (2007). The Pomp of Superfluous Causes: The Interpretation of Evolutionary Theory. Philosophy of Science 74: 281–303.

Walsh, D. M. (2010). Not a sure thing: fitness, probability, and causation. Philosophy of Science, 77, 147–171.

Walsh, D. M., T. Lewens, and A. Ariew (2002). The trials of life: Natural selection and random drift. Philosophy of Science, 69(3), 452–473.

Waxman, D. and S. Gavrilets (2005). Journal of Evolutionary Biology 18: 1139–1154

Weyl, H. (1983). Symmetry. Princeton University Press, Princeton, NJ.

Woodward, J. (2003) Making Things Happen: A Theory of Causal Explanation. Oxford Studies in the Philosophy of Science. Oxford University Press.

Wray, G.A. (2010). Integrating Genomics into Evolutionary Theory. In M. Pigliucci and G.B. Müller (Eds.), Evolution: the extended synthesis (pp. 97-116). Cambridge, MA: The MIT Press.

Wright, S. (1931). Evolution in Mendelian Populations. Genetics 16, 97-159.

Wright, S. (1932). The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proceedings of the Sixth International Congress of Genetics (pp. 356–366).

Wright, S. (1948). Evolution, Organic. In Encyclopedia Britannica, 14th ed. (revised), vol. 8: 915-929.

Wright, S. (1965). The Interpretation of Population Structure by F-Statistics with Special Regard to Systems of Mating. Evolution. 19 (3): 395–420.

Wright, S. (1980). Genic and Organismic Selection. Evolution 34(5): 825-843.

Yule, G.U. (1903). Notes on the Theory of Association of Attributes in Statistics. Biometrika. 2 (2): 121–134

BIOGRAPHICAL SKETCH

Peter Takacs received a B.A. in philosophy with a minor in English Literature from the University of West Florida in 2002. He subsequently pursued and acquired an M.A. through the History and Philosophy of Science Program in 2010 and an M.S. through the Department of Biological Science (Ecology and Evolution) in 2016, both at Florida State University. The work for this doctorate in Philosophy was completed while concurrently pursuing these other graduate degrees, maintaining a marriage, raising two children, and working nearly full-time.
