
Cosmological Model Selection and Akaike’s Criterion

A thesis presented to

the faculty of

the College of Arts and Sciences of Ohio University

In partial fulfillment

of the requirements for the degree of

Master of Arts

Christopher S. Arledge

August 2015

© 2015 Christopher S. Arledge. All Rights Reserved.


This thesis titled

Cosmological Model Selection and Akaike’s Criterion

by

CHRISTOPHER S. ARLEDGE

has been approved for

the Department of Philosophy

and the College of Arts and Sciences by

Philip Ehrlich

Professor of Philosophy

Robert Frank

Dean, College of Arts and Sciences


ABSTRACT

ARLEDGE, CHRISTOPHER S., M.A., August 2015, Philosophy

Cosmological Model Selection and Akaike’s Criterion

Director of Thesis: Philip Ehrlich

Contemporary cosmology is teeming with model underdetermination, and cosmologists are looking for methods with which to relieve some of this underdetermination. One such method that has found its way into cosmology in recent years is the Akaike Information Criterion (AIC). The criterion is meant to select the model that loses the least amount of information in its approximation of the data, and furthermore AIC shows a preference for simplicity by containing a penalty term that penalizes models with excessive complexity. The principal aim of this paper is to investigate some of the strengths and weaknesses of AIC against two philosophical backdrops in order to determine its usefulness in cosmological model selection. The backdrops or positions against which AIC will be assessed are I) realist and II) antirealist.

It will be argued that on both of these positions there is at least one feature of AIC that proves problematic for the satisfaction of the aims of the position.


ACKNOWLEDGEMENTS

I would like to express my gratitude to Philip Ehrlich for his invaluable help during the composition of this thesis. I’d also like to thank Yoichi Ishida for his helpful comments. I would like to thank Jordan Shonberg and Ryan Ross for their help in making the thesis more readable. Finally I would like to extend a special thanks to John Norton for his willingness to be on the committee and for his insightful comments and criticisms.


TABLE OF CONTENTS

Abstract

Acknowledgments

1. Introduction

2. Akaike Information Criterion

3. Philosophical Positions

4. Limiting Features of AIC

5. Conclusion

References


1. INTRODUCTION

Contemporary physical cosmology is rife with underdetermination. What underdetermination amounts to is the claim that for some set of empirical data x there are multiple theories that can each provide a good account of x, and yet each theory is equally well supported on the basis of x.1 Underdetermination is often discussed in the context of scientific theories. But cosmologists are faced with a slightly different sort of underdetermination, namely underdetermination of cosmological models. In model underdetermination, it is the various models built out of the foundational theories that are underdetermined and not the foundational theories themselves. So in cosmology the foundational theories of General Relativity (GR) and Quantum Mechanics (QM) are taken for granted, and it is the models constructed out of these theories that face the challenge of underdetermination (Butterfield 2012, 2014). A prime example of model underdetermination in cosmology is that of dark energy modeling (which models the acceleration of the expansion factor of the universe). Presently there are no fewer than nine mutually incompatible dark energy models in competition with one another.2 The available evidence is insufficient to offer an empirical distinction between the models, though nothing inherent in the models prevents future evidence from providing such a distinction. Another example of cosmological model underdetermination is one in which relativistic dark matter models and modified gravity models compete for primacy in accounting for the rotation curves of spiral galaxies and other related phenomena.3

1 The term “account” expresses the ability of a theory T to save the phenomena with regard to a particular data set x.

2 Several of these models are already considered less viable than others. For instance, the cosmological constant model is considered much more viable than the Dvali-Gabadadze-Porrati model, which is a model formulated on brane-world assumptions (cf. Li et al. 2010).

The extent to which cosmological models are underdetermined depends on the conception of underdetermination invoked. One conception is that of Pierre Duhem (1954), who advocates a kind of holist underdetermination. On this view a hypothesis H cannot be tested in isolation, since there is always a body of auxiliary hypotheses Hn that surrounds H. Therefore, when an experiment fails to bear out the predictions of H, it need not be the case that H is falsified, since it could always be one of the auxiliary hypotheses that is the troublemaker. Consider an experiment in which a telescope is used to test some prediction made by an astronomical theory. If the prediction is not borne out, it does not follow that the astronomical theory has been falsified, since it could be the optical theory on which the telescope is built or some other auxiliary hypothesis that is actually the problem. Hence for any given experiment, it always remains underdetermined which hypothesis has actually been falsified.

On another conception, theories or models are underdetermined relative to the evidence that is currently available, meaning that none of the present evidence can better support one of the competing theories over another. However, this conception of underdetermination does not preclude the possibility of future evidence providing better support to one of the competitors. The theories are therefore underdetermined in practice. Proponents of this conception include Larry Laudan and Jarrett Leplin (1991). Laudan and Leplin argue that because our experimental methods and our extra-empirical assumptions change with time, it is unwarranted to conclude that any two theories said to be empirically equivalent at time T will remain equivalent at some future time T1 (Stanford 2013). Hence two theories might appear to be empirical equivalents at present but may be shown in the future to be empirically disparate.

3 Of course, the juxtaposition of relativistic dark matter models and modified gravity models may ultimately result in theory underdetermination, as the modified gravity models draw GR into question. Since this paper is concerned with model selection, however, the underdetermination of the foundational theories underlying these models will not be treated.

The two conceptions of underdetermination presented above make universal claims that may be seen as overzealous. Accordingly, on a third conception, the underdetermination of theories or models is treated on a case-by-case basis. Some theories or models will have empirical equivalents that cannot be distinguished by any possible amount of evidence. Bas van Fraassen (1980, 46-69) contrasts formulations of Newton’s theory that differ only with regard to the velocity of the solar system relative to absolute space. Since any given constant velocity of the solar system with respect to absolute space would be observationally indistinguishable, no possible body of evidence will be able to resolve this underdetermination. On the other hand, some theories or models will be underdetermined with respect to the currently available data. Future data collection may show that one theory or model accounts for the data better than its competitor(s). A fairly recent example is the competition between the big bang and steady state models of the universe in the mid-20th century. Initially both models accounted for the observed data (e.g. Hubble’s law). However, in the 1960s the discovery of the Cosmic Microwave Background radiation (CMB), which was predicted by the big bang model, showed that the steady state model could no longer account for the data when the CMB is included. Prior to the 1960s the two models were considered empirically equivalent, but after 1960 they were shown to be empirically inequivalent, with greater support provided to the big bang model.

Whether cosmological model underdetermination is of the second or the third kind, the point is clear: cosmologists need a method (or methods) to resolve some of the underdetermination. Various proposals have been made, ranging from parameter fitting to Bayesian Inference (BI) (Mukherjee and Parkinson 2008; Wandelt et al. 2013; Watkinson et al. 2012; Weinberg 2013).4 In recent years, however, cosmologists have begun to make use of a model selection criterion known as the Akaike Information Criterion (AIC) (Biesiada 2007; Godłowski and Szydłowski 2005; Li et al. 2010; Szydłowski et al. 2006; Tan and Biswas 2012). AIC differs from parameter estimation and BI in that it is an information-theoretic selection criterion. This means that AIC selects for models that lose the least amount of information in the approximation of the generating model (that is, the data-generating process).5 Thus, AIC provides cosmologists with an alternative way to reduce some of the rampant model underdetermination.

AIC has also attracted the attention of a number of philosophers of science, including Sober and Forster (1994), Kieseppä (2001a, 2001b, 2003), McAllister (2007), Mikkelson (2006), Mulaik (2001), and Myrvold and Harper (2002). Their analyses seek to determine whether or not AIC is a viable model selection criterion for use in scientific practice. Some, such as Malcolm Forster and Elliott Sober (1994), find that AIC reliably selects models having the most predictive accuracy. Gregory Mikkelson (2006), on the other hand, claims that AIC selects models that have the most verisimilar (i.e. approximately true) parameter values.

4 Parameter fitting is a method whereby an individual parameter (rather than a set of parameters or a model) is fitted (i.e. regression analyzed) to the data so as to determine the value of the parameter that most accurately predicts the evidence in question. BI refers to the method of inference that uses Bayes’ Theorem to calculate the posterior probability that a model is the best approximation, given a prior probability distribution over the model and a set of empirical data.

5 Referring to the generating process as a generating “model” is somewhat misleading because the process is not in fact a model; models are intended to approximate the process in question. Hence reference to the generating process as a model is a heuristic device meant to aid in the comparison of the constructed models with the process generating the data.

The endorsement of AIC is not unanimous, however. Mulaik (2001) and McAllister (2007) voice concerns about the reliability of the selections made by AIC. McAllister, for instance, highlights cases from meteorology, cosmology, and endocrinology in which multiple models account equally well for the data. In such cases, McAllister suggests, what is relevant in choosing a particular model is the research goals of the scientist. AIC cannot account for these goals and is, on his view, an inadequate model selection criterion.6

The principal aim of this paper is to investigate some of the strengths and weaknesses of AIC against two philosophical backdrops in order to determine its usefulness in cosmological model selection. The backdrops or positions against which AIC will be assessed are I) realist and II) antirealist. It will be argued that on both of these positions there is at least one feature of AIC that proves problematic for the satisfaction of the aims of the position. Three features will be of particular interest. The first is a strictly formal feature, known as asymptotic inconsistency, which entails that as the data set increases towards infinity, the probability of error does not decrease to an ignorable value.

6 Mulaik’s (2001) criticism is of a more formal nature, but Kieseppä (2003) notes a mathematical error in Mulaik’s argument, and so a full discussion of his objection is not useful here.

The second feature concerns the existence of parameter degeneracies in cosmology.7 Parameter degeneracy (or parameter underdetermination) is present when differing parameters or vastly different parameter values are equally consistent with the same set of observational data. Such degeneracy is often a consequence of parameters in the models that are unconstrained by the data. AIC has no way to discriminate between constrained and unconstrained parameters. This poses problems for the reliability of the selection process, since counting parameters unconstrained by the data can result in the selection of a model that is not the best fit to the data. The third and final feature of AIC is directly related to the second. It concerns AIC’s inability to predict the constraints placed upon the problematic unconstrained parameters by future data collection. This third feature exacerbates the problem of parameter degeneracies and further hinders the satisfaction of cosmologists’ aims through the use of AIC.

Cosmologists should therefore take these three features into account in assessing the results of model selections and use caution in the application of AIC.
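The worry about unconstrained parameters can be made concrete with a small numerical sketch (not drawn from the thesis; the data and models here are invented for illustration). Two least-squares models fit the same data identically, but the second carries a parameter the data cannot constrain, and AIC’s 2K penalty charges for it anyway:

```python
import numpy as np

def gaussian_aic(rss, n, k):
    """AIC = -2 ln(Lmax) + 2K for least squares with Gaussian errors,
    where the maximized log-likelihood reduces to a function of the
    residual sum of squares: ln(Lmax) = -n/2 * (ln(2*pi*rss/n) + 1)."""
    log_lmax = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    return -2 * log_lmax + 2 * k

rng = np.random.default_rng(0)
n = 200
x = np.linspace(0, 1, n)
y = 2.0 + 3.0 * x + rng.normal(0, 0.1, n)

# Model A: y = a + b*x  (2 parameters, both constrained by the data)
XA = np.column_stack([np.ones(n), x])
beta_a, *_ = np.linalg.lstsq(XA, y, rcond=None)
rss_a = np.sum((y - XA @ beta_a) ** 2)

# Model B: adds a parameter c multiplying a quantity the data never
# probe (a column of zeros) -- any value of c fits equally well.
XB = np.column_stack([np.ones(n), x, np.zeros(n)])
beta_b, *_ = np.linalg.lstsq(XB, y, rcond=None)
rss_b = np.sum((y - XB @ beta_b) ** 2)

aic_a = gaussian_aic(rss_a, n, k=2)
aic_b = gaussian_aic(rss_b, n, k=3)
print(aic_b - aic_a)  # ~2: the unconstrained parameter is penalized anyway
```

Because the maximized likelihoods are equal, the AIC difference reduces to the difference in penalty terms: the unconstrained parameter costs the second model two AIC units even though the data say nothing about it.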

Section 2 will offer a more formal elaboration of AIC. Section 3 will provide an explication of the various philosophical backdrops against which AIC will be evaluated. This section will also survey various cosmologists and their views concerning the use of AIC in cosmological model selection. Section 4 will make more explicit the features of AIC that pose problems for the satisfaction of cosmologists’ aims. This section will include examples from physical cosmology that highlight these features. Finally, in section 5 I will review the conclusions reached and advocate for the use of caution in the application of AIC in cosmological model selection.

7 Technically, features 2 and 3 are not features of the formalism of AIC. Rather, they are features of AIC as applied to specific situations in cosmology. Therefore, in some scenarios features 2 and 3 are absent. However, since I am primarily concerned with situations in which features 2 and 3 are present, I will refer to them as features of AIC.


2. AKAIKE INFORMATION CRITERION

Akaike’s criterion can be understood in the following way: suppose there is some set of experimental data x generated by some hitherto unknown process g(x), which will be referred to as the “generating model.” The aim of the model selection process is to select the model from a set of candidate models that most accurately approximates (fits) the generating model. Because AIC is developed in terms of information theory, determining which model is the best fit (i.e. best approximation) of g(x) requires determining the amount of information lost by each of the candidate models. What exactly “information” amounts to is the subject of debate and largely depends on context.

For instance, Pieter Adriaans (2013) notes at least six species of information. The best conception of information in the context of model selection is Fisher information, which is a measure of the amount of information that g(x) contains about some specified parameter θ (Adriaans 2013). Therefore, to say that a model loses information in its approximation of g(x) is to say that the model does not contain the full amount of information g(x) contains concerning the relevant θ. The measure of the information discrepancy between a particular model f(x) and g(x) is known as the Kullback-Leibler (K-L) divergence. Consequently, in order to determine which of the candidate models best fits g(x), an estimator of the expected K-L divergence of each model is needed.
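For intuition, when the generating distribution is known the K-L divergence can be computed directly. A minimal sketch over discrete distributions (the distributions are invented for illustration; in real model selection g(x) is unknown, which is why an estimator such as AIC is needed):

```python
import numpy as np

def kl_divergence(g, f):
    """D_KL(g || f) = sum_i g_i * ln(g_i / f_i): the expected information
    lost when f is used to approximate the generating distribution g."""
    g, f = np.asarray(g, float), np.asarray(f, float)
    mask = g > 0  # terms with g_i = 0 contribute nothing
    return float(np.sum(g[mask] * np.log(g[mask] / f[mask])))

g = [0.5, 0.3, 0.2]      # "generating" distribution
f1 = [0.45, 0.35, 0.20]  # candidate close to g
f2 = [0.10, 0.10, 0.80]  # candidate far from g

print(kl_divergence(g, f1))  # small
print(kl_divergence(g, f2))  # much larger
print(kl_divergence(g, g))   # 0.0: no information is lost
```

The divergence is zero only when the approximating distribution matches the generating one, and grows as the approximation worsens; the "best fit" model is the one with the least divergence.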

AIC can function as the needed estimator of the expected K-L divergence of the models in the candidate set. The estimator can be defined in the following way: AIC =def -2 ln(Lmax) + 2K, where Lmax represents the maximized likelihood function and K represents the number of adjustable parameters in the model. An important portion of the formalism is the 2K term, which functions as a penalty on free parameters. This penalization typically results in models with fewer parameters (i.e. statistically simpler models) being selected over models with a greater number of parameters (i.e. statistically complex models). This penalization can pose problems for AIC in the presence of unconstrained parameters. Such problems will be explicated further in section 4.

There is a final point to be made concerning the formalism of AIC. The absolute value of the estimated expected K-L divergence for any particular model is unimportant in the context of model selection, because cosmologists are not evaluating a single model in isolation but choosing among multiple models in a candidate set. What matters are the models’ estimated divergence values in comparison with one another. The comparison allows for the determination of which model in the candidate set has the least expected K-L divergence and is therefore the best approximation of g(x).
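In practice this comparative use is often expressed through AIC differences, ΔAIC_i = AIC_i − AIC_min. A sketch with an invented candidate set (the likelihood values and parameter counts are hypothetical, not taken from any cosmological analysis):

```python
def aic(log_lmax, k):
    """AIC = -2 ln(Lmax) + 2K."""
    return -2.0 * log_lmax + 2.0 * k

# Hypothetical candidates: (name, maximized log-likelihood, free parameters)
candidates = [
    ("Model 1", -540.2, 1),
    ("Model 2", -539.8, 2),  # tiny likelihood gain, one extra parameter
    ("Model 3", -531.0, 5),  # large likelihood gain, several extra parameters
]

scores = {name: aic(ll, k) for name, ll, k in candidates}
best = min(scores.values())
for name, s in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: AIC = {s:.1f}, dAIC = {s - best:.1f}")
```

Model 2’s small likelihood gain over Model 1 fails to cover the 2-unit cost of its extra parameter, while Model 3’s much larger gain does. Only the differences between the scores carry meaning; the absolute values do not.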


3. PHILOSOPHICAL POSITIONS

The evaluation of the strengths and shortcomings of AIC can take place against several philosophical backdrops, each with differing aims. This matters because, depending on the cosmologist’s aims, AIC may fare better or worse in facilitating their satisfaction. Therefore, before a proper analysis of AIC can be made, an explication of these positions is needed.

I will divide the positions against which the evaluation of AIC can be made into two categories: realist and antirealist. Those familiar with the realism/antirealism debate will note that these categorizations are markedly broad. But as Chakravartty observes, “It is only perhaps a slight exaggeration to say that scientific realism is characterized differently by every author who discusses it…” (2011). As such, a detailed explication of each of the various sorts of realism and antirealism is well beyond the scope of this paper.8 The two categories can be understood in a naïve way as follows:

Scientific Realism (SR): a philosophical position that holds that the aim of science is to produce true descriptions of the world, both observable and unobservable.

Scientific Antirealism (SA): a philosophical position characterized by a denial that the aim of science is to produce true descriptions of the world.9

8 Chakravartty (2011) offers a good overview of the nuances of particular positions and of the realism/antirealism debate in general.

9 Some particular species of SR might be entity realism (Hacking 1982; Cartwright 1983) or structural realism (Worrall 1989). Some species of SA might be instrumentalism (Carnap 1950) or pragmatism (James 1907). Constructive Empiricism (van Fraassen 1980) also falls under SA, since the position denies that science aims to produce true descriptions of unobservable phenomena. One position, Arthur Fine’s Natural Ontological Attitude (Fine 1996), is difficult to characterize according to the above categories. Fine’s view attempts to find the beliefs held in common between SR and SA and to reject any of the other auxiliary beliefs about the metaphysical or epistemic import of the common beliefs. Fine thus seems to tread the line between the two categories.

The primary distinction between the two views is the role that truth plays in their respective aims.10 In the context of model selection the two positions also differ in their interpretation of what “best fit” denotes. On the SR position, the model that is the best fit will be the model that is either true or the most verisimilar (i.e. closest to the truth). On the SA position, best fit will indicate something else, such as predictive accuracy or some other sort of pragmatic consideration.

Since SR and SA are contrasting positions, it is natural to wonder which position is correct or which position scientists ought to take. There is a plethora of arguments for and against SR and SA.11 They need not be addressed here, but it is important to note that there is no unanimity among philosophers or scientists as to which position ought to be adopted. As such, AIC is used as a statistical tool to satisfy the aims of both SR and SA, depending upon the particular cosmologist’s position.

10 There are metaphysical and semantic elements to the realism/antirealism debate. But the features of AIC to be explicated below do not address the metaphysics or semantics of cosmology. The real issue is whether or not cosmologists should believe that the model selections have met their scientific aims, which of course is an epistemic issue. Hence I have confined the discussion of realism/antirealism to the epistemic dimension.

11 Two notable examples include the no-miracles argument (Putnam 1975) for SR and the pessimistic induction (Laudan 1981) against SR. A more recent argument against SR is Stanford’s argument from unconceived alternatives (2006).

Since AIC is being used in some form or another to help relieve some of the model underdetermination in cosmology, an examination of what cosmologists think they are accomplishing in their use of AIC is in order. Some cosmologists, like Andrew Liddle, perceive AIC as being used to select a model that offers a true description of the underlying physics that the model is said to represent. Liddle writes:

Dimensional consistency does not seem to bother most statisticians, as they are typically seeking models, which can explain the data and have some predictive power, rather than expecting to represent some underlying truth. Indeed, they commonly quote statistician George Box: ‘All models are wrong, but some are useful.’ The problem of dimensional consistency is therefore mitigated, because they do not expect the set of models to remain static as the dataset evolves. Cosmologists, however, are probably not yet willing to concede that they might be looking for something other than absolute truth specified by a finite number of parameters. (2008, 3)

While this passage does not make explicit mention of AIC, Liddle is addressing a formal aspect of the criterion (i.e. asymptotic inconsistency, to be detailed below), which he finds problematic in the context of cosmological model selection because cosmologists are searching for “absolute truth.” Such a generalization is perhaps unwarranted since cosmologists will have their own individual research aims. However, this passage indicates that in using AIC cosmologists are generally seeking to satisfy the aims of SR.

Eric Linder and Ramon Miquel also view cosmologists’ use of AIC as attempting to satisfy SR aims. They write “physicists do not regard their models as just useful summaries of data, but as fundamental descriptions of the data based upon physical principles. The parameters in our models have (or should have) deep physical meanings and are not just concise ways of representing the data. Model efficiency takes a back seat to physical fidelity” (2008, 2316).12

12 Once again, AIC is not explicitly mentioned. But this passage comes in the midst of a discussion concerning the merits and shortcomings of the model selection paradigm as a whole, a paradigm of which AIC is certainly an integral part. Also, the use of phrases such as “physical fidelity” and “fundamental descriptions” does not entail that the authors are talking about the truth of the models. But the downplaying of the efficiency of the models suggests that their predictive ability is not as important as their ability to provide a physically verisimilar account of the data. Accordingly, I think it reasonable to conclude that Linder and Miquel are advocating some sort of realist account.


On the other side of the philosophical coin, a number of cosmologists make use of AIC primarily in satisfying SA aims. For instance, Szydłowski et al. claim, “Moreover, in the notion of true models do not believe information theories because the model by definition is only an approximation to unknown physical reality: there is no true model of the Universe that perfectly reflect large structure of spacetime, but some of them are useful” (2006, 3). This statement strongly resembles the famous remark of the statistician George Box that “all models are wrong, but some are useful” (Box and Draper 1987). While the assertion of Szydłowski et al. is meant to apply to information theory more broadly, AIC is based on information theory and is therefore subject to their assertion (assuming it is correct). What exactly “useful” means here is unclear, but the denial that information criteria (i.e. AIC) are truth-conducive indicates SA aims.

Davis et al. also view the usage of AIC as satisfying some sort of SA aim when they write, “Thus information criteria alone can at most say that a more complex model is not necessary to explain the data,” and, “Information criteria provide a valuable way to get a relative ranking of the viability of scenarios, using a statistical analysis that gives strong weight to the most simplistic model that fits the observations. This does not mean that the simplest model is always correct, rather that more complex and flexible models are not (yet) necessary” (2007, 717). Hence AIC is useful in determining the number of parameters necessary to explain the data, as well as in ranking the various models according to plausibility. This does not imply that AIC is truth-conducive, however. Davis et al. are asserting the pragmatic or instrumental value of AIC and nothing more. Shi et al. agree with Davis et al. when they write, “…AIC remains useful because it gives an upper limit to the number of parameters that should be included” (2012, 2453). “Included” here refers to the inclusion of particular models in the model selection process. If AIC can set an upper limit on the number of parameters, then certain models will be excluded from the selection process. This exclusion is of considerable value, as it reduces the number of candidate models, but it says nothing concerning the truth-conducivity of AIC. In fact, that is precisely the point that Davis et al. and Shi et al. mean to emphasize: the usefulness of AIC is limited to narrowing the number of necessary parameters and providing a relative weighting of the candidate models; nothing more. This of course indicates SA aims.

4. LIMITING FEATURES OF AIC

Now that a broad characterization of the philosophical positions of cosmologists has been made, AIC can be evaluated according to those positions. In doing so I would like to highlight three features of AIC that are particularly salient in the context of cosmology. They are: I) asymptotic inconsistency, II) parameter degeneracies, and III) the inability of AIC to appropriately predict constraints upon parameters by future data collection. I contend that each of these features provides reason to be cautious in the application of AIC to cosmological model selection, since each poses challenges to the satisfaction of cosmologists’ aims. This assertion holds for both SR and SA, though more strongly for SR.

Of the three limitations mentioned above, the asymptotic or dimensional inconsistency (AI) of AIC offers the most trouble for the satisfaction of SR aims (though this inconsistency does not appear to be particularly problematic for the satisfaction of SA aims, as will be shown below). One reason for its potency is that AI is a strictly formal feature, not contingent upon any factor external to AIC itself. Hence it is present in every instance of model selection, not just those in which some set of external conditions obtains. In a 1984 paper, R. L. Kashyap provides a proof establishing that AIC is asymptotically inconsistent. AI entails that as the number of points N in the data set approaches infinity, the probability of error is not reduced to a negligible value. This means that the probability of selecting an erroneous model remains significant.13 An erroneous selection amounts to selecting an overfitted model, that is, a model that offers a description of the random fluctuations or errors (noise) in the data set rather than of the data set itself. This is in opposition to a consistent criterion such as the Bayesian Information Criterion (BIC), where as N tends to infinity the criterion selects the best fit model in the candidate set with probability 1 and this selection is regarded as the true model.14

13 “Significant” here does not refer to statistical significance but rather to the colloquial sense of being worthy of attention.
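The contrast with BIC can be seen directly in the penalty terms. On the standard definitions, AIC’s penalty is a constant 2K while BIC’s is K ln N, so BIC’s per-parameter charge grows with the sample size, which is what lies behind its consistency. A small sketch (the sample sizes are arbitrary; 157 and 397 echo the data-set sizes mentioned below):

```python
import math

def aic_penalty(k):
    """AIC = -2 ln(Lmax) + 2K: each free parameter always costs 2."""
    return 2 * k

def bic_penalty(k, n):
    """BIC = -2 ln(Lmax) + K ln(N): the cost per parameter grows with N."""
    return k * math.log(n)

k = 5  # free parameters, typical of a viable dark energy model
for n in (10, 157, 397, 10**6):
    print(n, aic_penalty(k), round(bic_penalty(k, n), 1))
```

For any N above e² ≈ 7.4 the BIC penalty exceeds AIC’s and keeps growing, which is why BIC, unlike AIC, drives the probability of selecting an overfitted model to zero as N tends to infinity.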

So what does AI mean for cosmologists? For cosmologists with SR aims, AI severely lessens the confidence that can be placed in the selected model.15 The reason is that AI ensures that the probability of selecting an overfitted model remains relatively high. Since overfitted models model noise rather than the actual data, overfitted models are false. Therefore, the probability of selecting a false model remains rather high. This being the case, the selections made by AIC will often not satisfy the aims of SR, because the general truth-conducivity of the selections is undercut by AI. This result is not so troubling to cosmologists with SA aims. Because the aims of SA are not concerned with truth, the fact that the model models noise rather than actual data is problematic only insofar as the other aims are not satisfied. For instance, if modeling the noise produced accurate predictions or provided some utility, then the aims of SA may be satisfied. Conversely, if modeling the noise resulted in poor predictions or little to no utility, then the aims of SA may not be satisfied.16

14 Of course, such consistent criteria assume that the true model is part of the candidate set, which may or may not be the case. If not, then the criterion will select a false model with probability 1. The skeptic might worry that no candidate set contains the true model but only false models, and hence that every model selection, even with infinite N, will result in a false model being selected with probability 1. This is certainly a possibility, but whether or not it is an actuality is difficult to determine, because we do not have access to future models or future candidate sets. So this skeptical worry is inconclusive at best. It is therefore possible that even if all past model selections have resulted in false selections, future selections will be truth-conducive.

15 This conception of confidence is not to be confused with the statistical notion of a confidence interval or a confidence level. A confidence interval estimates the frequency with which some given parameter value will be located within some interval, and the confidence level specifies that frequency. Hence confidence intervals are not epistemic notions. Bayesians, however, make use of a conception of confidence that is meant to represent degrees of belief. It is something like the Bayesian conception that I employ.

One possible response to this line of argument is to claim that though AIC is inconsistent at the limit, the data sets that cosmologists actually have are much smaller, and hence the inconsistency does not come into effect for these smaller data sets. In fact, there is a small-sample correction for AIC that is intended for this purpose. The corrected AIC (denoted AICc) makes use of the smaller value of N in order to bring the results of AIC into agreement with a consistent criterion (namely BIC). The idea is that if AICc is in agreement with BIC, then AI is no longer a problem, because the results agree with those of a consistent criterion. The problem with this response is that AICc is only applicable when N is remarkably small and K is relatively large. In fact, Davis et al. (2007) note that AICc is only applicable when N/K < 40. This requirement, however, is very often not satisfied in cosmology. For instance, in dark energy modeling, the models considered most plausible do not have over 5 free parameters (cf. Szydłowski et al. 2006; Davis et al. 2007; Li et al. 2010). Furthermore, the data sets used in the evaluations of these models have N anywhere from 157 to 397. Hence, almost every instance of model selection with dark energy models results in the correction term disappearing, and so the selection is made by the standard AIC, which is inconsistent.[17]
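The criteria just discussed can be sketched in a few lines using their standard forms: AIC = -2 ln L + 2K, AICc = AIC + 2K(K+1)/(N-K-1), and BIC = -2 ln L + K ln N. Plugging in the sample sizes quoted above shows how quickly the small-sample correction term becomes negligible (the log-likelihood is set to zero here purely to isolate the penalty terms).

```python
import math

def aic(log_like, k):
    """AIC = -2 ln L + 2K."""
    return -2.0 * log_like + 2.0 * k

def aic_c(log_like, k, n):
    """Small-sample corrected AIC: adds 2K(K+1)/(N-K-1) to AIC."""
    return aic(log_like, k) + 2.0 * k * (k + 1) / (n - k - 1)

def bic(log_like, k, n):
    """BIC = -2 ln L + K ln N (a consistent criterion)."""
    return -2.0 * log_like + k * math.log(n)

# Typical dark-energy fits from the text: K <= 5 parameters, N = 157..397.
for n in (157, 397):
    for k in (3, 5):
        corr = aic_c(0.0, k, n) - aic(0.0, k)
        print(f"N={n} K={k}  N/K={n / k:5.1f}  AICc - AIC = {corr:.3f}")
```

Even in the most favorable case quoted (N = 157, K = 5, so N/K is about 31), the correction is under 0.4, far smaller than the 2K penalty itself.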

[16] It is unclear whether modeling noise can actually produce accurate predictions or utility. Often it does not. I do not mean to assert that it can, but rather to make the counterfactual point that if modeling noise can produce accurate predictions and utility, then the aims of SA would be satisfied. This counterfactual point does not hold for SR, since the SR aims concern truth and not merely prediction or utility.

[17] An exception to this might be when N = 157 and models with 3-5 parameters are evaluated. However, because BIC is much more strict in its penalization of excess parameters, it is unlikely that models with 3-5 parameters will be favored by BIC in the first place, and so the results of the AICc selection would not align with the BIC results.

Therefore, though a correction for small sample sizes exists, most cosmological modeling does not meet its requirements, and hence the problem of inconsistency is not subverted.

Another way to get around the problem of AI might be for cosmologists to constantly alter their models as the data set grows. However, as Liddle (2008) notes, cosmologists trying to satisfy the aims of SR desire their models to remain reasonably static as the data set grows. This makes sense because if, as Liddle suggests, "cosmologists…are probably not yet willing to concede that they might be looking for something other than absolute truth specified by a finite number of parameters" (2008), then the constant fluctuation of the models would not allow cosmologists to determine whether or not their models actually offer the true description that is sought after. In order for cosmologists to assess the veracity of their models with respect to a data set, the model must remain static as the data set grows. So this response is not a viable one for cosmologists who accept SR, and AI cannot be subverted in this manner. As for cosmologists who accept SA, this response is superfluous, since AI isn't much of a problem for them anyway.

A second feature of AIC that proves problematic for its use in cosmological model selection is the criterion's inadequacy in dealing with parameter degeneracies. Within cosmology there are a number of scenarios wherein models with differing parameters or parameter values equally account for the same observational data (Godłowski and Szydłowski 2005; Howlett et al. 2012; Efstathiou and Bond 1999; Minakata et al. 2008). In philosophical language it might be said that there is parameter underdetermination, but in keeping with the current physics literature, this problem will be referred to as parameter degeneracy.

What is responsible for parameter degeneracies is that certain parameters (or values of parameters) are unconstrained by the data. This means that the data offers no information with which to discriminate between competing parameters or differing values of the same parameter. For instance, consider the case of supernova neutrino fluxes, the flows of neutrinos emitted from supernovae. Each neutrino flux has a particular energy spectrum (that is, the distribution of possible energy values the neutrinos can take and how many neutrinos actually take on each value). Evidence shows that, regarding two neutrino energy spectra (Eean), each of the spectra can take on diverging values (designated the true and fake values) that are observationally indistinguishable[18] within an error margin of 0.01 (Minakata et al. 2008). We can therefore say that Eean is a degenerate parameter. The degeneracy is not an artifact of any particular formalism or model. Minakata et al. (2008) make use of two flux models in their analysis, known respectively as the Garching model (or parameterization) and the Modified Fermi-Dirac model (or parameterization). Though the models are different from one another (the details are unimportant), both models can have observationally indistinguishable degeneracies regarding the Eean parameter. They therefore conclude that the continuous degeneracy of Eean is "...a robust feature of the reconstruction analysis of supernovae parameters" (Minakata et al. 2008, 14). The parameter degeneracy of supernova neutrino fluxes is therefore model independent and is an element of the current supernova data set. The same sort of story could be told for situations such as the degeneracy of the universal spatial curvature parameter (Ωk) with respect to the Cosmic Microwave Background (CMB) anisotropies (Efstathiou and Bond 1999, 75). Different values for the Ωk parameter result in different spatial curvatures of the universe (i.e. different geometrical models), and several of the values that Ωk can take on are observationally indistinguishable with regard to the current CMB anisotropy data. The degeneracy of parameters such as Ωk and the Hubble parameter (H) with respect to the power spectrum of the CMB anisotropies is yet another example (Howlett et al. 2012). In these examples, Eean, Ωk, and H are instances of parameters that are unconstrained by the relevant data, and are therefore degenerate.[19]

[18] The degeneracy applies to the observation of the spectra as seen on Earth. The two spectra parameter values are considered to contain a "…large difference in the primary neutrino spectra at the SN core…" yet nevertheless "…the true and the fake spectra agree with each other within better than a 1% level over a wide energy range" (Minakata et al. 2008). The notation Eean is mine and is meant to denote the energy spectrum of electron antineutrinos, which are the subject of Minakata et al.'s discussion.

Parameter degeneracy can pose problems for the use of AIC in cosmology. The unconstrained parameters of a model should not be taken into account in the model selection process. The reason for this is that model selection takes place in the context of a particular data set. If the relevant data set does not offer any information with which to locate a precise value of a parameter or to distinguish between competing parameters, then the parameter(s) in question is not useful in determining the model's fit to the relevant data. Since AIC selects the model f(x) with the least expected K-L divergence between f(x) and g(x), the unconstrained parameter(s) do not contribute to the calculation of the expected K-L divergence, because no information about the unconstrained parameter(s) is apparently contained in g(x).[20] For example, in deciding whether the Garching model or the Modified Fermi-Dirac model is the better model of supernova neutrino fluxes, the unconstrained parameter Eean should not be taken into account in the selection process. Whether the value of Eean is true or fake is irrelevant to the fitting of the Garching model or the Modified Fermi-Dirac model to the supernova data, because the effect of the value of Eean on the model is observationally indiscernible. Accordingly, the supernova data apparently contain no information regarding the value of Eean.

[19] I would like to point out that degeneracy is relative to a particular data set. For instance, the Ωk parameter may be degenerate with respect to the CMB anisotropies but could be constrained by some other data set. So when a parameter is designated as degenerate, it does not mean that the parameter cannot be constrained in any way. It simply means that for the given data set, no information constrains the specified parameter. This notion of relative degeneracy is acceptable because AIC model selection is always relative to a particular data set.
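What an unconstrained parameter looks like in practice can be sketched with a toy fit (entirely my construction; the model, parameter names, and noise level are illustrative assumptions, not Minakata et al.'s models): a nuisance parameter whose effect on the prediction is far below the observational error leaves the goodness-of-fit essentially unchanged across wildly different values.

```python
import numpy as np

def model(x, a, eps):
    """Hypothetical toy model: the nuisance parameter `eps` perturbs the
    prediction by at most 2e-4, far below the 0.05 observational error,
    mimicking a parameter left unconstrained (degenerate) by the data."""
    return a * x + 1e-4 * np.sin(eps * x)

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 200)
y = 2.0 * x + rng.normal(0.0, 0.05, x.size)   # simulated data, sigma = 0.05

def chi2(a, eps):
    """Chi-square goodness-of-fit for parameter values (a, eps)."""
    r = y - model(x, a, eps)
    return float(np.sum((r / 0.05) ** 2))

# Two wildly different values of eps (a "true" and a "fake" value) fit the
# data essentially equally well: the data carry no information about eps.
print(chi2(2.0, 0.1), chi2(2.0, 50.0))
```

The two chi-square values differ by a negligible amount, which is exactly the sense in which the likelihood, and hence the K-L-based comparison, is blind to the degenerate parameter.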

[20] "Apparent" is used here since it may turn out upon future data collection that the unconstrained parameter(s) becomes constrained, and thus g(x) is then known to contain information about the parameter(s).

The parameter penalty term of AIC (i.e. the 2K term) nevertheless treats these unconstrained parameters as though they were constrained. This entails that models are penalized for their unconstrained parameter(s), even though the parameter(s) is not relevant to the determination of the fit between the models and the relevant data. Thus, a model may be penalized for being more complex even if the parametric source of the complexity is uninformed by the data at hand! To be fair, this feature of AIC is not problematic for the formalism itself but rather for certain applications of the formalism. AIC is a statistical tool and as such treats the parameters as strictly mathematical entities, not representations of the external physics. So we have no reason to suspect that AIC can differentiate between constrained and unconstrained parameters.

This penalization nevertheless presents a problem for cosmologists of both SR and SA flavor. If AIC is to satisfy the aims of SR, then it must be able to differentiate between the constrained and unconstrained parameters. Otherwise the truth-conducivity of the criterion would be significantly weakened. Consider the following scenario: a cosmologist is presented with two cosmological models, model φ and model ψ. Suppose model φ has two constrained parameters, while model ψ has three parameters, two constrained and one unconstrained. Now suppose that model ψ is the best fit for the data in question. However, due to the penalization that results from model ψ's unconstrained parameter, model φ is chosen as the best fit for the data when the AIC calculations are performed. Here cosmologists accepting SR are presented with a case where a less verisimilar (or false) model is selected over a more verisimilar (or true) model, only because AIC could not differentiate between constrained and unconstrained parameters. This significantly decreases the confidence in the truth-conducivity of the selection results, and the aims of SR may not be satisfied.

Cosmologists accepting SA face a similar problem. As was stated, the inability of AIC to differentiate between constrained and unconstrained parameters can result in a model with a lesser fit being selected over the best-fit model. But as we have seen, one of the differences between SR and SA is the interpretation of best fit. Hence, even though a cosmologist in the SA position is uninterested in the verisimilitude of the models, she is still interested in something like the predictive accuracy of the model, which is what goodness-of-fit is said to indicate. So a model with a lesser fit to the data has less predictive accuracy than the best-fit model. Because AIC can select a model that is a lesser fit over the actual best-fit model, a less predictive model can be selected over the more predictive model. Since this is the case, AIC may fail to meet the aims of SA. Therefore cosmologists in both the SR and SA positions must be very attentive to parameter degeneracies when using AIC and in the evaluation of the model selection results.
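The φ/ψ scenario can be put in numbers (the log-likelihoods below are hypothetical values chosen only to illustrate the mechanism): ψ fits the data slightly better, but because its third parameter is unconstrained and buys essentially no likelihood, the 2K penalty flips the selection to φ.

```python
def aic(log_like, k):
    """AIC = -2 ln L + 2K."""
    return -2.0 * log_like + 2.0 * k

# Hypothetical maximized log-likelihoods for the phi/psi scenario.
ll_phi, k_phi = -100.0, 2   # phi: two constrained parameters
ll_psi, k_psi = -99.5, 3    # psi: two constrained + one unconstrained

aic_phi = aic(ll_phi, k_phi)   # 204.0
aic_psi = aic(ll_psi, k_psi)   # 205.0

# psi has the better fit (higher log-likelihood), yet phi wins the
# selection because AIC penalizes the unconstrained third parameter.
print(aic_phi, aic_psi, "selected:", "phi" if aic_phi < aic_psi else "psi")
```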

The issue of parameter degeneracy revolves around the inclusion of unconstrained parameters in the model selection. As was mentioned previously, since AIC takes into consideration and penalizes parameters unconstrained by the data, the models selected by AIC may not be the best fit. A closely related problem is the criterion's inability to predict the constraints placed upon previously unconstrained parameters by future data collection (cf. Linder and Miquel 2010). This feature of AIC only serves to exacerbate the problem presented by parameter degeneracies.

Consider again models φ and ψ mentioned previously, each with the same parameter designations. In selecting between the two models, the unconstrained parameter of ψ is penalized by AIC, resulting in φ being selected. However, suppose future data collection (of data qualitatively similar to the initial data set) results in the placement of a constraint upon the previously unconstrained parameter of ψ. Furthermore, suppose that this new parameter constraint makes it clear that model ψ is the best fit, even with three parameters as opposed to φ's two. The AIC calculations with the new data set will result in ψ being selected as the best fit, while the prior AIC calculations resulted in φ being selected as the best fit. Because AIC cannot predict the constraints placed upon the unconstrained parameters by future data collection, it selected a model that was in fact not the best fit to the data, even though the best-fitting model was in the candidate set.[21]
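The reversal described here can be sketched with before/after numbers (again hypothetical log-likelihoods, invented only to show the flip): with the initial data the unconstrained parameter earns no fit and φ wins; once the enlarged data set constrains it, ψ's fit improves by more than the one-parameter penalty costs, and the verdict reverses.

```python
def aic(log_like, k):
    """AIC = -2 ln L + 2K."""
    return -2.0 * log_like + 2.0 * k

# Initial data set: psi's third parameter is unconstrained, so its extra
# complexity buys almost no likelihood and the 2K penalty favors phi.
before = {"phi": aic(-100.0, 2), "psi": aic(-99.8, 3)}

# Enlarged data set: the parameter is now constrained, and psi's fit
# improves by more than the one-parameter penalty (Delta 2K = 2) costs.
after = {"phi": aic(-420.0, 2), "psi": aic(-410.0, 3)}

print("before:", min(before, key=before.get))   # phi
print("after: ", min(after, key=after.get))     # psi
```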

There are examples from contemporary cosmology that make this argument more concrete. Linder and Miquel (2008) present two such examples. The first concerns the 1998 discovery by A. G. Riess et al. that the expansion of the universe is accelerating. Prior to 1998, models that included the unconstrained parameter specifying the cosmological constant (Λ) would have been penalized by criteria such as AIC for the extra parameter. Hence, the Standard Cold Dark Matter (SCDM) model with parameters Ωm and H would have performed better in model selection than a ΛCDM model with a non-zero cosmological constant. After 1998 and Riess et al.'s discovery, the Λ parameter was constrained by the new evidence. Because of this, the ΛCDM model now performs much better in model selection than does SCDM. This suggests that such future constraints on cosmological parameters can have very significant effects on the model selection process.

A second example, also given by Linder and Miquel, concerns the galaxy correlation function, which measures the clustering of galaxies relative to scale. The two-parameter power law, ξ(r) = (r/r0)^(−γ), was for a long time thought sufficient for modeling the distribution of galaxies.[22] But with the new data set collected by the Sloan Digital Sky Survey (SDSS), it has been shown that the dark matter halo model containing a greater number of parameters (most significantly the halo occupation distribution, HOD, which specifies how galaxies are clustered in their particular dark matter halos) is a better fit. Prior to the collection of the SDSS data, the power-law model would have performed better under AIC calculations due to having fewer parameters than the dark matter halo model. This would be the case even though certain parameters in the dark matter halo model were unconstrained by the data. However, the future data collection (i.e. the SDSS data) constrained the previously unconstrained HOD, which results in the dark matter halo model performing better under AIC calculations. Again, it is clear that AIC could not take into account the constraints of the future data collection.

[21] It might be asked how such a result can come about. If AIC treats parameters as mathematical entities anyway, then further physical information about one of the parameters shouldn't have an effect on how AIC treats it. When a constraint is placed on a parameter, however, it means that the data set now contains information with which to adjudicate between competing values of the parameter. So the parameter becomes relevant in determining goodness-of-fit.

[22] The parameters in this power-law model are the correlation length r0 and the power-law slope γ.
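The power-law correlation function is simple enough to write down directly; the parameter values below (r0 of 5 and γ of 1.8, in whatever length units the correlation length is measured in) are illustrative placeholders commonly quoted for the local galaxy distribution, not fits from the text.

```python
import numpy as np

def xi_power_law(r, r0, gamma):
    """Two-parameter power-law galaxy correlation function:
    xi(r) = (r / r0) ** (-gamma), with correlation length r0
    and power-law slope gamma (footnote 22)."""
    return (r / r0) ** (-gamma)

# Illustrative values: clustering is strong inside the correlation
# length (xi > 1 for r < r0) and falls off as a power law beyond it.
r = np.array([1.0, 5.0, 10.0])
print(xi_power_law(r, r0=5.0, gamma=1.8))
```

By construction ξ(r0) = 1, which is what makes r0 a natural "correlation length": it marks the scale at which clustering drops to order unity.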

A possible rejoinder to this whole line of argumentation is that cosmologists may be careful enough to include within the candidate set only models in which all the parameters are fully constrained by the data. This would seem to avoid the problem of parameter degeneracies and hence vindicate the use of AIC, and indeed it does. If cosmologists can be confident that degenerate parameters will not figure in the model selection, then the problems mentioned above do not apply, and the aims of both SR and SA may be satisfied. This victory does not come without a cost, however. It imposes a limitation on the cosmologists' choice of models, which could result in a model being left out even though that model provides a better fit to the data than any of the models in the candidate set. Therefore, cosmologists can avoid the problem of parameter degeneracies, but the solution may nevertheless lead to the selection of a sub-optimal model. Ultimately this may not be a problem so much for the use of AIC as for the methodology of candidate model selection, but the two are closely related, so the point remains.

5. CONCLUSION

The evaluation of AIC against both the SR and SA positions indicates some shortcomings and limitations of its use as a tool of model selection in cosmology. It has been shown that as the data set increases toward infinity, the probability of error remains at a significant value. The AI of AIC poses a problem for the satisfaction of SR aims by retaining a significant probability of selecting an overfitted (and hence false) model that models noise rather than data. Such a result is much less troubling for the satisfaction of SA aims, since such aims are concerned with something other than truth, and overfitted models satisfying such aims is not ruled out by AI.

The existence of parameter degeneracies and the inability of AIC to discriminate between constrained and unconstrained parameters also present a problem for the satisfaction of cosmologists' aims. The fact that AIC does not differentiate between these types of parameters can lead to the selection of a model that is not the best fit, even when the best-fit model is available. Such a result can be problematic for both the SR and SA positions, since a central difference between the two in this situation is the interpretation of goodness-of-fit. To make the situation even more problematic, the inability of AIC to predict the constraints placed by future data sets on previously unconstrained parameters may result in the selection of models that are later shown to be a lesser fit to the data, which undermines the aims of both the SR and SA positions.

The three features explicated above warrant caution in the application of AIC as a tool in model selection. In light of AI, it is advised that cosmologists check the selected model against the data to determine whether or not an overfitted model has been selected. Furthermore, cosmologists must be aware of the presence of degenerate parameters in their models in order to counterbalance AIC's equal treatment of constrained and unconstrained parameters. These precautions should go some way toward counteracting these limiting features of AIC. However, the question remains whether, in light of these features, the retention of AIC as a tool of model selection in cosmology is really desirable. It may be the case that AIC is not worth keeping in the toolbox and that cosmologists should prefer a criterion to which AI and the problem of parameter degeneracies do not apply.

As a final aside, it may be asked what the scope of my conclusion is. Does my conclusion apply to sciences other than physical cosmology? This is a difficult question to answer. One question that must be asked is whether or not parameter degeneracies exist in the particular branch of science. If so, then the second and third deficiencies may very well apply to the application of AIC in that branch of science, and the same cautious attitude ought to be adopted. The aims of the scientists must also be made explicit. I suspect that the positions of SR and SA also exist in most if not all branches of science. If this is the case, then those scientists who accept SR will have to contend with AI, which has been shown to be quite problematic for the satisfaction of SR aims. In order to determine if my conclusion applies, these features of the branch of science must be determined. Therefore, it is quite possible that my evaluation applies to a number of branches of science. However, I can make no global claim, due to ignorance of the relevant conditions for each branch of science.


REFERENCES

Adriaans, Pieter. 2013. “Information.” In The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. Stanford, CA: Stanford University. http://plato.stanford.edu/archives/fall2013/entries/information/

Akaike, Hirotugu. 1973. “Information Theory and an Extension of the Maximum Likelihood Principle.” In 2nd International Symposium on Information Theory, ed. B.N. Petrov and F. Csaki, 267-81. Budapest: Akademiai Kiado.

Akaike, Hirotugu. 1974. "A New Look at the Statistical Model Identification." IEEE Transactions on Automatic Control 19:716-723.

Biesiada, Marek. 2007. “Information-theoretic Model Selection Applied to Supernovae Data.” Journal of Cosmology and Astroparticle Physics, 2007.

Butterfield, Jeremy. 2012. "Under-determination in Cosmology: an Invitation." Aristotelian Society Supplementary Volume 86:1-18.

Butterfield, Jeremy. 2014. “On Under-determination in Cosmology.” Studies in the History and Philosophy of Modern Physics 46:52-69.

Carnap, Rudolph. 1950. “Empiricism, Semantics and Ontology.” Revue Internationale de Philosophie 4:20-40.

Cartwright, Nancy. 1983. How the Laws of Physics Lie. Oxford: Clarendon.

Chakravartty, Anjan. 2011. “Scientific Realism.” In The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. Stanford, CA: Stanford University. http://plato.stanford.edu/archives/win2013/entries/scientific-realism/

Davis, T. M., et al. 2007. “Scrutinizing Exotic Cosmological Models Using Essence Supernova Data Combined with Other Cosmological Probes.” The Astrophysical Journal 666:716-725.

Duhem, Pierre. 1954. The Aim and Structure of Physical Theory. Trans. P.W. Wiener. Princeton, NJ: Princeton University Press.


Fine, Arthur. 1996. The Shaky Game: Einstein, Realism and the Quantum Theory. 2nd ed. Chicago: Chicago University Press.

Forster, Malcolm, and Elliot Sober. 1994. “How to Tell When Simpler, More Unified, or Less ad hoc Theories Will Provide More Accurate Predictions.” The British Journal for the Philosophy of Science 45:1-35.

Godłowski, Włodzimierz, and Marek Szydłowski. 2005. “How Many Parameters in Cosmological Models with Dark Energy.” Physics Letters B 623:10-16.

Hacking, Ian. 1982. "Experimentation and Scientific Realism." Philosophical Topics 13:71-87.

James, William. 1907. Pragmatism. Cambridge, MA: Harvard University Press.

Kashyap, R.L. 1980. "Inconsistency of the AIC Rule for Estimating the Order of Autoregressive Models." IEEE Transactions on Automatic Control 25:996-998.

Kieseppä, I. A. 2001a. "Statistical Model Selection Criteria and Bayesianism." Philosophy of Science 68:141-152.

Kieseppä, I. A. 2001b. "Statistical Model Selection Criteria and the Philosophical Problem of Underdetermination." The British Journal for the Philosophy of Science 56:761-794.

Kieseppä, I. 2003. “AIC and Large Samples.” Philosophy of Science 70:1265-1276.

Laudan, Larry. 1981. "A Confutation of Convergent Realism." Philosophy of Science 48:19-48.

Laudan, Larry, and Jarrett Leplin. 1991. "Empirical Equivalence and Underdetermination." Journal of Philosophy 88:449-472.

Li, Miao, Xiao-Dong Li, and Xin Zang. 2010. "Comparison of Dark Energy Models: a Perspective from the Latest Observational Data." SCIENCE CHINA Physics, Mechanics and Astronomy 53:1631-1645.

Liddle, Andrew. 2004. “How Many Cosmological Parameters?” Monthly Notices of the Royal Astronomical Society 351:49-53.


Liddle, Andrew. 2008. “Information Criteria for Astrophysical Model Selection.” Monthly Notices of the Royal Astronomical Society 377:74-78.

Linder, Eric V., and Ramon Miquel. 2008. "Tainted Evidence: Cosmological Model Selection versus Fitting." International Journal of Modern Physics D 17:2315-2324.

McAllister, James W. 2007. "Model Selection and the Multiplicity of Patterns in Empirical Data." Philosophy of Science 74:884-894.

Mukherjee, Pia, and David Parkinson. 2008. “Cosmological Model Selection.” International Journal of Modern Physics A 23:787-802.

Myrvold, Wayne C., and William L. Harper. 2002. "Model Selection, Simplicity and Scientific Inference." Philosophy of Science 69:135-149.

Putnam, Hilary. 1975. Mathematics, Matter and Method. Cambridge: Cambridge University Press.

Riess, Adam G., et al. 1998. “Observational Evidence from Supernovae for an Accelerating Universe and a Cosmological Constant.” The Astronomical Journal 116:1009-1038.

Stanford, Kyle. 2006. Exceeding Our Grasp: Science, History and the Problem of Unconceived Alternatives. Oxford: Oxford University Press.

Stanford, Kyle. 2013. "Underdetermination of Scientific Theory." In The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. Stanford, CA: Stanford University. http://plato.stanford.edu/archives/win2013/entries/scientific-underdetermination/

Szydłowski, Marek, Aleksandra Kurek, and Adam Krawiec. 2006. “Top Ten Accelerating Cosmological Models.” Physics Letters B 642:171-178.

Tan, M. Y., and Rahul Biswas. 2012. "The Reliability of the Akaike Information Criterion Method in Cosmological Model Selection." Monthly Notices of the Royal Astronomical Society 419:3292-3303.

Van Fraassen, Bas. 1980. The Scientific Image. Oxford: Oxford University Press.

Wandelt, Benjamin D., Jens Jasche, and Guilhem Lavaux. 2012. "Robust, Data-driven Inference in Non-linear Cosmostatistics." In Statistical Challenges in Modern Astronomy V, ed. E.D. Feigelson, 27-40. New York, NY: Springer.

Worrall, John. 1989. "Structural Realism: The Best of Both Worlds?" Dialectica 43:99-124.

Watkinson, Catherine, Andrew Liddle, Pia Mukherjee, and David Parkinson. 2012. “Optimizing Future Dark Energy Surveys for Model Selection Goals.” Monthly Notices of the Royal Astronomical Society 424:313-324.

Weinberg, Martin D. 2012. "Parameter Estimation and Model Selection in Extragalactic Astronomy." In Statistical Challenges in Modern Astronomy V, ed. E.D. Feigelson, 101-116. New York, NY: Springer.

