
focus

Dicing with the unknown

There are many things that I am uncertain about, says Tony O’Hagan. Some are merely unknown to me, while others are unknowable. This article is about different kinds of uncertainty, and how the distinction between them impinges on the foundations of Probability and Statistics.

Two kinds of uncertainty

There are things that I am uncertain about simply because I lack knowledge, and in principle my uncertainty might be reduced by gathering more information. Others are subject to random variability, which is unpredictable no matter how much information I might get; these are the unknowables. The two kinds of uncertainty have been debated by philosophers, who have given them the names epistemic uncertainty (due to lack of knowledge) and aleatory uncertainty (due to randomness).

Examples of aleatory uncertainty are familiar to students of probability, and include the outcomes of tossing dice and drawing cards from a shuffled pack. In statistics, aleatory uncertainty is present in almost all data that we obtain, due to random variability between the members of a population that we sample from, or to random errors.

Examples of epistemic uncertainty are all around us. I am uncertain about the atomic weight of zinc, about the population of the city of Paris, and about whether the river Thames froze over in London during the winter of 1600–1601. At least for the first two of these, my uncertainty could be resolved by looking in a suitable reference book. It may be that there is no such source of information about the river Thames in 1600–1601, but in principle this is a question that might be resolved by historical investigation. Epistemic uncertainty about any given question varies from one person to another. For instance, I have negligible uncertainty about my height. Someone who has seen me might be able to guess reasonably accurately but could not be certain. Someone who knows only that I am a British male would have more uncertainty.

The distinction between aleatory and epistemic uncertainties is valuable in many areas where it is important to appreciate which uncertainties are potentially reducible by further investigation. But it is easy to see how much more fundamental it should be for statisticians, for whom uncertainty is the very raison d’être.

The theory of Statistics rests on describing uncertainties by using probability. A probability near 1 represents an event that is almost certain to occur, while a probability near 0 represents one that is almost certain not to occur. As we move away from these extremes towards the probability of ½, there is increasing uncertainty. Here, however, is where we meet the fundamental dichotomy between the two principal theories of Statistics: the frequentist and Bayesian theories. One characterisation of the difference between these two schools of statistical theory is that frequentists do not accept that epistemic uncertainty can be described or measured by probabilities, while Bayesians are happy to use probabilities to quantify any kind of uncertainty.

Two kinds of probability

Delving more deeply, the root of this disagreement lies in what probability means. Almost everyone who encounters probability for the first time in their education will be taught it using aleatory uncertainties, like the familiar examples of tossing dice or coins. And the way probability will be taught is that it represents the long-run frequency with which the event in question occurs if it is repeated an indefinite number of times, and this is accordingly known as the frequency definition of probability. The nature of random events is that they are, at least conceptually, repeatable in this way. Epistemic uncertainties are generally not.

If probability is to encompass epistemic uncertainty it needs another definition.

In Bayesian statistics the probability of a proposition simply represents a degree of

132 september2004

belief in the truth of that proposition. Notice in passing that we tend to use the word “proposition” (a statement that is either true or false) rather than “event” (something which may or may not occur) when discussing this kind of probability, since “event” has connotations of randomness and repeatability. A proposition might simply assert that an event will occur, but it may also refer to a statement with epistemic uncertainty.

The degree-of-belief interpretation of probability is sometimes referred to as personal probability or subjective probability because, as noted already, different people may have different degrees of uncertainty about a proposition. It is the “subjectivity” of this approach to probability that is most objected to by followers of the frequentist theory. Bayesians steadfastly defend this definition, and question the extent to which their methods are any more subjective than frequentist practice, or indeed scientific practice generally. However, that debate is beyond the scope of this article!

Two kinds of statistics

To see the implications of the frequentist position on probability, it is enough to note that uncertainty about parameters in statistical models is almost invariably epistemic. If, for instance, I were conducting experiments to measure empirically the atomic weight of zinc, the unknown parameter would be that atomic weight. I cannot consider zinc as a randomly chosen element. Indeed, it is particularly zinc that I wish to know about. In a similar way, the aim of nearly all statistical analysis is to learn about parameters, and so to reduce our epistemic uncertainty about them. Since frequentist statistics does not and cannot quantify that uncertainty with probabilities, conventional statistical inferences (such as significance tests and confidence intervals) never make probability statements about parameters.

Yet the recipients of those conventional inferences almost universally interpret them as making probability statements about the parameters. When the null hypothesis is rejected with a p-value of 0.05, this is widely misunderstood as saying that there is only a 0.05 chance that the null hypothesis is true. If told that (3.2, 5.7) is a 95% confidence interval for a certain parameter, the interpretation that there is a 95% probability that the parameter lies between 3.2 and 5.7 is extremely common. It is enough to realise that our uncertainty about parameters is epistemic to appreciate that these have to be false interpretations.

The p-value of 0.05 and the confidence coefficient of 0.95 are aleatory probability statements about the data. The p-value says that in repeated sampling (creating an indefinite sequence of sets of data of the type being analysed), if the null hypothesis were really true we would reject it in only 5% of those experiments. The confidence interval says that if this confidence interval were computed from each of the same infinite sequence of data sets, then 95% of those intervals would contain the true value of the parameter.

Neither of them says anything about the chance that the null hypothesis is true, or that the parameter lies in the interval, for these data. If we condition on the single set of data in front of us, there is no randomness in the problem, and so no frequentist probabilities can be stated.

In contrast, Bayesian inference does make probability statements about parameters. It can do so because the epistemic uncertainty in parameters can be quantified using the Bayesian’s personal probability. Indeed, Bayesian inference describes how the acquisition of data modifies (and usually reduces) the uncertainty about a parameter, from “prior” uncertainty to “posterior” uncertainty. The Bayesian equivalent of a significance test asserts the probability that the null hypothesis is true. In the same way, the Bayesian analogue of the confidence interval (usually called a credible interval) has exactly the interpretation that is so often erroneously attributed to the frequentist confidence interval.

And two kinds of statistician

On a personal note, I have been both kinds of statistician in my career. It was my experience, as a young statistician, of analysing data and producing frequentist tests and confidence intervals for other scientists that convinced me that the Bayesian approach is the right one for statistical analysis. I had great difficulty persuading the scientists not to misinterpret the frequentist inferences I was giving them. And it was clear to me that this was because the correct interpretation was of no use to them. Frequentist inferences make only indirect statements about parameters, and can only be interpreted in terms of repeated sampling. Bayesian inferences directly answered the scientists’ questions, making statements that were unambiguously about the parameters they wanted to learn about. Since that time (more than 30 years ago now), I have been an enthusiastic advocate and practitioner of the Bayesian approach.

Every statistician needs to understand the difference between the frequentist and Bayesian theories of statistics, and every practising statistician must (at least implicitly) choose between them. And whether something is unknown or unknowable, whether its uncertainty is due to fundamentally unpredictable randomness or to potentially resolvable lack of knowledge, turns out to lie at the heart of the debate.

Tony O’Hagan is a Professor of Statistics at the University of Sheffield. His research is in the theory and applications of Bayesian statistics. He has been involved in numerous applications, particularly in environmental statistics, asset management and health.

Six of one and half a dozen of the other

If I toss an ordinary coin, my probability that it will land Heads is 0.5. Suppose now that I have a bag of poker chips, and I know only that some are red and some are green. I have no reason to suppose that there are equal numbers of red and green chips. Indeed, almost certainly one colour will be more abundant than the other, but I have no idea which colour that will be, or how much more abundant it will be than the other colour. If one chip is to be pulled out of the bag my probability that it will be red is 0.5.

Now surely my uncertainties in the coin toss and the poker chip draw are different (the coin toss being very familiar and the bag of poker chips full of uncertainty), so why do I give both events the same probability? It is true that the uncertainties are different. The uncertainty about the coin toss is purely aleatory, whereas there is clearly epistemic uncertainty about the make-up of the bag of chips. Nevertheless, for a single coin toss and a single poker chip all the uncertainty is quantified in a single probability, that of Heads or red.

The difference emerges when I consider a sequence of tosses of that coin, and a sequence of chips drawn from the bag. My uncertainty about the coin tosses is still purely aleatory. No matter how many times I toss the coin, my uncertainty about getting Heads on the next toss is the same, and is expressed by a probability of 0.5. On the other hand, as I draw chips from the bag my epistemic uncertainty about its composition reduces, and my probability for the next chip being red changes according to the chips I have now seen.

The epistemic uncertainty lies in the proportion of red chips that I will see if I continue to pull chips from the bag until they are all removed. That proportion could be anything between 0 and 1. This is my unknown parameter, and it is this that I learn about as chips are drawn from the bag. For the coin, though, as I keep tossing it I know that the proportion of Heads will converge to 0.5. There is no epistemic uncertainty in the coin tossing, and no unknown parameter to learn about.

The whole purpose of Statistics is to learn from data, so there is epistemic uncertainty in all statistical problems. The uncertainty in the data themselves is both aleatory, because they are subject to random sampling or observation errors, and epistemic, because there are always unknown parameters to learn about.
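The contrast between the coin and the bag of poker chips can be made concrete with a small sketch. It is only an illustration under assumptions that the article does not state: the unknown proportion of red chips is given a uniform prior, and the draws are treated as exchangeable, which leads to Laplace's rule of succession. The function names and the sequence of draws are hypothetical.

```python
from fractions import Fraction


def prob_next_red(reds_seen: int, greens_seen: int) -> Fraction:
    """Probability that the next chip drawn is red.

    Assumption (not from the article): a uniform prior on the unknown
    proportion of red chips, which gives Laplace's rule of succession,
    (reds + 1) / (draws + 2).
    """
    return Fraction(reds_seen + 1, reds_seen + greens_seen + 2)


def prob_next_heads(heads_seen: int, tails_seen: int) -> Fraction:
    """Probability of Heads for a coin taken to be fair.

    The uncertainty is purely aleatory, so past tosses change nothing.
    """
    return Fraction(1, 2)


# Hypothetical sequence of draws: R = red chip, G = green chip.
draws = "RRGRRRGR"

reds = greens = 0
history = [prob_next_red(reds, greens)]  # before any chip is seen
for chip in draws:
    if chip == "R":
        reds += 1
    else:
        greens += 1
    history.append(prob_next_red(reds, greens))

for n, p in enumerate(history):
    print(f"after {n} chips: P(next is red) = {p}, P(next toss is Heads) = {prob_next_heads(0, 0)}")
```

Before any chip is seen both probabilities are 1/2, as in the sidebar. After the eight hypothetical draws the probability for red has moved to 7/10, while the probability for Heads stays at 1/2 however many tosses are observed. A different prior would give different numbers but the same qualitative behaviour.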
