Appendix A Statistical Approaches, Probability Interpretations, and the Quantification of Standards of Proof
A.1 Relevance Versus Statistical Fluctuations

The term statistics is often used to mean data or information, that is, numerical quantities based on some type of observations. Statistics in this report is used in a more specific sense to refer to methods for planning scientific studies, collecting data, and then analyzing, presenting, and interpreting the data once they are collected. Much statistical methodology has as its purpose the understanding of patterns or forms of regularity in the presence of variability and errors in observed data:

If life were stable, simple, and routinely repetitious, there would be little need for statistical thinking. But there would probably be no human beings to do statistical thinking, because sufficient stability and simplicity would not allow the genetic randomness that is a central mechanism of evolution. Life is not, in fact, stable or simple, but there are stable and simple aspects to it. From one point of view, the goal of science is the discovery and elucidation of these aspects, and statistics deals with some general methods of finding patterns that are hidden in a cloud of irrelevancies, of natural variability, and of error-prone observations or measurements (Kruskal, 1978b:1078).

When the Lord created the world and people to live in it - an enterprise which, according to modern science, took a very long time - I could well imagine that He reasoned with Himself as follows: "If I make everything predictable, these human beings, whom I have endowed with pretty good brains, will undoubtedly learn to predict everything, and they will thereupon have no motive to do anything at all, because they will recognize that the future is totally determined and cannot be influenced by any human action. On the other hand, if I make everything unpredictable, they will gradually discover that there is no rational basis for any decision whatsoever and, as in the first case, they will thereupon have no motive to do anything at all. Neither scheme would make sense. I must therefore create a mixture of the two. Let some things be predictable and let others be unpredictable. They will then, amongst many other things, have the very important task of finding out which is which" (Schumacher, Small is Beautiful, 1975).

This view of statistics has inherent in it the need for some form of inference from the observed data to a problem of interest. There are several schools of statistical inference, each of which sets forth an approach to statistical analysis, reporting, and interpretation. The two most visible such schools are (1) the proponents of the Bayesian or "subjective" approach and (2) the frequentist or "objective" approach. The key difference between them is in terms of how they deal with the interpretation of probability and of the inferential process.

A statistical analyst, whether a Bayesian or a frequentist, must ultimately confront the relevance of the patterns discerned from a body of data for the problem of interest. Whether in the law or in some other field of endeavor, statistical analysis and inference are of little use if they are not designed to inform judgments about the issue that generated the statistical inquiry. Thus perhaps a better title for this introductory section might have been "Discovering Relevance in the Presence of Statistical Fluctuations."
This appendix describes the two basic approaches to drawing inferences from statistical data and attempts to link the probabilistic interpretation of such inferences to the formalities of legal decision making.

As one turns from the use of statistics in both the data and the methodological senses, i.e., in the plural and the singular, in various scientific fields to its use in the law, there will clearly be occasions when statistics has little to offer. In a somewhat different context Bliley (1983) notes:

Our tendency to rely on statistics is understandable. Statistics offer to our minds evidence which is relatively clear and easy to understand. By comparison, arguments about the rights of individuals, the natural authority of the family, and the legitimate powers of government are complex and do not provide us with answers quickly or easily. But we must remember that these are questions which are at the root of our deliberations. Until these questions are answered, statistics are of limited use. Statisticians could not have written the Declaration of Independence or the Constitution of the United States, and statisticians should not be depended upon to interpret them.

A.2 Views on Statistical Inference

Rule 401 of the Federal Rules of Evidence provides the following definition of relevance:

"Relevant evidence" means evidence having any tendency to make the existence of any fact that is of consequence to the determination of the action more probable or less probable than it would be without the evidence.

From this rule one might infer that the court wishes and expects to have its judgments about facts at issue expressed in terms of probabilities. Such a situation is tailor-made for the application of Bayes' Theorem and the Bayesian form of statistical inference in connection with evidence. Thus some have characterized the Bayesian approach as what the court needs and often thinks it gets. Rule 401 has traditionally been thought of, in much less sweeping terms, as being applicable only to the admission of evidence and not to its use in reaching judgments, even though the evidence is admitted to aid the fact finder in reaching judgments.

Our review of the use of statistical assessments as evidence in the courts suggests that, with some special exceptions, such as the use of blood group analyses in paternity cases, what the court actually gets is a frequentist or "classical" approach to statistical inference. Such an approach is difficult to reconcile with the language of Rule 401 (but see the discussion in Lempert, 1977, and Tiller, 1983). The key weapon in the frequentist's arsenal of statistical techniques is the test of significance, whose result is often expressed in terms of a p-value. There is a tendency for jurists, lawyers, and even expert statistical witnesses to misinterpret p-values as if they were probabilities associated with facts at issue (see the discussion of this issue in the case study in Section 2.6).

In this section, we describe the two approaches to statistical inference: the Bayesian approach and the frequentist approach. Then we discuss the interpretation of p-values and their relationship, or lack thereof, to the tail values that are interpretable as probabilities in a Bayesian context.
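As a preview of the distinction that discussion turns on, the following minimal sketch (not part of the original report) computes a frequentist p-value and a Bayesian posterior probability for the same hypothetical data set; the counts, the point alternative, and the 50/50 prior are all illustrative assumptions chosen only to show that the two quantities answer different questions.

```python
# Illustrative sketch (assumptions, not from the report): contrast a p-value
# with a posterior probability of the null hypothesis for the same data.
from math import comb

n, k = 10, 8            # hypothetical data: 8 "successes" in 10 trials
p0, p1 = 0.5, 0.9       # null hypothesis rate and a hypothetical point alternative

def binom_pmf(j, n, p):
    """Binomial probability of exactly j successes in n trials."""
    return comb(n, j) * p**j * (1 - p)**(n - j)

# Frequentist p-value: probability, under the null, of data at least this
# extreme, P(X >= k | p = p0).  It is a statement about the data.
p_value = sum(binom_pmf(j, n, p0) for j in range(k, n + 1))

# Bayesian posterior probability of the null, assuming prior P(H0) = 0.5 and
# the single point alternative p1.  It is a statement about the hypothesis.
prior_h0 = 0.5
like_h0 = binom_pmf(k, n, p0)
like_h1 = binom_pmf(k, n, p1)
posterior_h0 = prior_h0 * like_h0 / (prior_h0 * like_h0 + (1 - prior_h0) * like_h1)

print(f"p-value   P(X >= {k} | H0) = {p_value:.3f}")      # about 0.055
print(f"posterior P(H0 | data)   = {posterior_h0:.3f}")   # about 0.185
```

Under these particular assumptions the data would be "significant" at roughly the 0.05 level, yet the posterior probability of the null hypothesis is about 0.18; reading the p-value as the probability of a fact at issue would therefore be a misinterpretation of the kind described above.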
The Basic Bayesian Approach

Suppose that the fact of interest is denoted by G. In a criminal case, G would be the event that the accused is legally guilty, and for specificity we speak of the event G as guilt and of its complement Ḡ as innocence or nonguilt. Following Lindley (1977) and Fienberg and Kadane (1983), we let H denote all of the evidence presented up to a certain point in the trial, and E denote new evidence. We would like to update our assessment of the probability of G given the evidence H, i.e., P(G | H), now that the new evidence E is available. This is done via Bayes' Theorem, which states that:

P(G | E&H) = P(G | H) P(E | G&H) / [P(G | H) P(E | G&H) + P(Ḡ | H) P(E | Ḡ&H)].

All of these probabilities represent subjective assessments of individual degrees of belief. As a distinguished Bayesian statistician once remarked: coins don't have probabilities, people do. It is also important to note that for the subjectivist there is a symmetry between unknown past facts and unknown future facts. An individual can have probabilities for either.

The well-known relationship of Bayes' Theorem is more easily expressed in terms of the odds on guilt, i.e., the ratio of the probability of guilt, G, to the probability of innocence (or nonguilt), Ḡ:

odds(G | E&H) = [P(E | G&H) / P(E | Ḡ&H)] × odds(G | H).

Lempert (1977, p.1023) notes that

This formula describes the way knowledge of a new item of evidence (E) would influence a completely rational decision maker's evaluation of the odds that a defendant is guilty (G). Since the law assumes that a factfinder should be rational, this is a normative model; that is, the Bayesian equation describes the way the law's ideal juror evaluates new items of evidence.

In this sense Bayes' Theorem provides a normative approach to the issue of relevance. One might now conclude that when the likelihood ratio P(E | G&H) / P(E | Ḡ&H) for an item of evidence differs from one, the evidence is relevant in the sense of Rule 401. Lempert (1977) refers to this as logical relevance.

Now the way to incorporate assessments of the evidence in a trial is clear. The trier of fact (a judge or juror) begins the trial with an initial assessment of the probability of guilt, P(G), and then updates this assessment, using either form of Bayes' Theorem given above, with each piece of evidence. At the end of the trial, the posterior estimate of G given all of the evidence is used to reach a decision. In the case of a jury trial, the jurors may have different posterior assessments of G, and these will need to be combined or reconciled in some way.
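To make the sequential use of these formulas concrete, the following short sketch, which is not part of the original report, applies the odds form of Bayes' Theorem to a hypothetical prior and three hypothetical likelihood ratios; every numerical value and function name is an assumption made for illustration.

```python
# Illustrative sketch (all numbers are assumptions): a hypothetical rational
# juror updating the odds on guilt G with successive items of evidence.

def posterior_probability(p_g, p_e_given_g, p_e_given_not_g):
    """Probability form: P(G | E&H) from P(G | H) and the two likelihoods."""
    numerator = p_g * p_e_given_g
    return numerator / (numerator + (1.0 - p_g) * p_e_given_not_g)

def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form: posterior odds = likelihood ratio x prior odds."""
    return likelihood_ratio * prior_odds

def odds_to_prob(odds):
    """Convert odds on guilt, P(G)/P(not-G), back to a probability."""
    return odds / (1.0 + odds)

# Hypothetical prior assessment of guilt before any evidence is heard.
p_g = 0.01
odds = p_g / (1.0 - p_g)

# Hypothetical likelihood ratios P(E | G&H) / P(E | not-G&H) for three items
# of evidence; a ratio of exactly 1 would leave the odds unchanged.
likelihood_ratios = [50.0, 4.0, 0.8]

for lr in likelihood_ratios:
    odds = posterior_odds(odds, lr)
    print(f"LR = {lr:5.1f} -> P(G | evidence so far) = {odds_to_prob(odds):.3f}")

# The probability form gives the same answer for a single item of evidence,
# e.g. P(G | H) = 0.01, P(E | G&H) = 0.5, P(E | not-G&H) = 0.01 (so LR = 50):
print(posterior_probability(0.01, 0.5, 0.01))   # about 0.336, as above
```

Note that an item of evidence whose likelihood ratio equals one leaves the odds unchanged, which is the computational counterpart of the logical-relevance criterion just described.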