Consequences of Prejudice Against the Null Hypothesis

Psychological Bulletin 1975, Vol. 82, No. 1, 1-20 Consequences of Prejudice Against the Null Hypothesis Anthony G. Greenwald Ohio State University The consequences of prejudice against accepting the null hypothesis were examined through (a) a mathematical model intended to stimulate the research-publication process and (b) case studies of apparent erroneous rejections of the null hypothesis in published psychological research. The input parameters for the model characterize investigators' probabilities of selecting a problem for which the null hypothesis is true, of reporting, following up on, or abandoning research when data do or do not reject the null hypothesis, and they characterize editors' probabilities of publishing manuscripts concluding in favor of or against the null hypothesis. With estimates of the input parameters based on a questionnaire survey of a sample of social psychologists, the model output indicates a dysfunctional research-publication system. Particularly, the model indicates that there may be relatively few publications on problems for which the null hypothesis is (at least to a reasonable approximation) true, and of these, a high proportion will erroneously reject the null hypothesis. The case studies provide additional support for this conclusion. Accordingly, it is concluded that research traditions and customs of discrimination against accepting the null hypothesis may be very detrimental to research progress. Some procedures that can help eliminate this bias are prescribed. In a standard college dictionary (Webster's no value," "insignificant," and presumably New World, College Edition, 1960), null is "invalid." My aims here are to document this defined as "invalid; amounting to nought; of state of affairs, to examine its consequences no value, effect, or consequence; insignifi- for the archival accumulation of scientific cant." In statistical hypothesis testing, the knowledge, and lastly, to make a positive case null hypothesis most often refers to the hy- for the formulation of more potent and ac- pothesis of no difference between treatment ceptable null hypotheses as a part of an over- effects or of no association between variables. all research strategy. Interestingly, in the behavioral sciences, re- Because of my familiarity with its litera- searchers' null hypotheses frequently satisfy ture, most of the illustrative material I use is the nonstatistical definition of null, being "of drawn from social psychology. This should not be read as an implication that the prob- Preparation of this report was facilitated by grants lems being discussed are confined to social from National Science Foundation (GS-3050) and psychology. I suspect they are equally char- US. Public Health Service (MH-20527-02). Although they should not be held responsible for positions acteristic of other behavioral science fields espoused herein, I am very grateful to the following that are lacking in well-established organizing for providing comments on earlier drafts: Mari R. theoretical systems. Tones. Paul Isaac, David Bakan, Timothy C. rock, Bibb ~atani,Thomas M. Ostrom, Hanan C. Selvin, Martin Fishbein, Zick Rubin, and Richard The A. Zeller. My paraphrasing of some widespread be- Requests for reprints should be sent to Anthony G. Greenwald, Department of Psychology, Ohio State liefs of behavioral scientists concerning the University, 404C West 17th Avenue, Columbus, Ohio appears below. Some partial 432 lo. sources for the content of this listing are 2 ANTHONY G. GREENWALD Festinger (1953, pp. 142-143), Wilson and respond to those in the preceding listing.) Miller (l964), Aronson and Carlsmith (1969, Briefly stated, these attacks are: p. 21), and Mills (1969, pp. 442-448). 1. The notion that you cannot prove the 1. Given the characteristics of statistical null hypothesis is true in the same sense that analysis procedures, a null result is only a it is also true that you cannot prove any exact basis for uncertainty. Conclusions about rela- (or point) hypothesis. However, there is no tionships among variables should be based reason for believing that an estimate of some only on rejections of null hypotheses. parameter that is near a zero point is less 2. Little knowledge is achieved by finding valid than an estimate that is significantly out that two variables are unrelated. Science different from zero. Currently available Baye- advances, rather, by discovering relationships sian techniques (eg, Phillips, 1973) allow between variables. methods of describing acceptability of null 3. If statistically significant effects are ob- hypotheses. tained in an experiment, it is fairly certain 2. The point is commonly made that the- that the experiment was done properly. ories predict relationships between variables; 4. On the other hand, it is inadvisable to therefore, finding relationships between vari- place confidence in results that support a null ables (i.e., non-null results) helps to confirm hypothesis because ,there are too many ways theories and thereby to advance science. This (including incompetence of the researcher), argument ignores the fact that scientific ad- other than the null hypothesis being true, for vance is often most powerfully achieved by obtaining a null result. rejecting theories (cf. Platt, 1964). A major Given the existence of such beliefs among strategy for doing this is to demonstrate that behavioral science researchers, it is not sur- relationships predicted by a theory are not prising that some observers have arrived at obtained, and this would often require ac- conclusions such as: ceptance of a null hypothesis. 3. I am aware of no reason for thinking Many null hypotheses tested by classical procedures that a statistically significant rejection of a are scientifically preposterous, not worthy of a null hypothesis is an appropriate basis for as- moment's credence even as approximations. (Ed- wards, 1965, pp. 401-402) suming that the conceptually intended variables were manipulated or measured validly. It [the null hypothesis] is usually formulated for The significant result (barring Type I error) the express purpose of being rejected. (Siegel, 1956, does indicate that some relationship or effect P. 7) was observed, but that is all it indicates. The researcher who would claim that his data Refutations of Null Hypothesis show a relationship between two variables "Cultural Truisms" should be as clearly obliged to show that I am sure that many behavioral science re- those variables are the ones intended as should searchers endorse the beliefs previously enu- the researcher who would claim that his data merated but would have difficulty in provid- show the absence of a relationship. ing a rational defense for these beliefs should 4. Perhaps the most damaging accusation they be strongly attacked. That is, these atti- against the null hypothesis is that incom- tudes toward the null hypothesis may have petence is more likely to lead to erroneous some of ithe characteristics of cultural truisms nonsignificant, "negative," or null results than as described by McGuire (1964). Cultural to erroneous significant or "positive" results. truisms are beliefs that are so widely and un- There is some substance to this accusation- questioningly held that their adherents (a) when the incompetence has the effect of in- are unlikely ever to have heard them being troducing noise or unsystematic error into attacked and may therefore (b) have diffi- data. Examples of this sort of incompetence culty defending them against an attack. If I are the use of unreliable paper-and-pencil am correct, the reader will have difficulty de- measures, conducting research in a "noisy" fending the preceding beliefs against the fol- setting (i.e., one with important extraneous lowing attacks (the numbered paragraphs cor- variables uncontrolled), unreliable apparatus PREJUDICE AGAINST THE NULL HYPOTHESIS functioning, inaccurate placement of record- jection of the null hypothesis and continuing ing or stimulating electrodes, random errors to revise until the null hypothesis is (at last! ) in data recording or transcribing, and making rejected or until the problem is abandoned too few observations. These types of incom- without publication; (f) failing to report ini- petence are often found in the work of the tial data collections (renamed as "pilot data'" novice researcher and are proper cause for or "false starts") in a series of studies that caution in accepting null findings as adequate eventually leads to a prediction-confirming re- evidence for the absence of effects or relation- jection of the null hypothesis; (g) failing to ships. Some other very common types of in- detect data analysis errors when an analysis competence are much more likely to produce has rejeoted the null hypothesis by miscom- false positive or significant results. These putation, while vigilantly checking and re- types of incompetence result in the introduc- checking computations if the null hypothesis tion of systematic errors into data collection. has not been rejected; and (h) using stricter Examples of such sources of artifact (cf. editorial standards for evaluating manuscripts Rosenthal & Rosnow, 1969) are experimenter that conclude in favor of, rather than against, bias, inappropriate demand characteristics, the null hypothesis. nonrandom sampling, invalid or contaminated Perhaps the enumeration of the items on manipulations or measures, systematic ap- this list will arouse sufficient recognition of paratus malfunction (e.g., errors in calibra- symptoms

Consequences of Prejudice Against the Null Hypothesis

Experimental Evidence During a Global Pandemic

Sensitivity and Specificity. Wikipedia. Last Modified on 16 November 2013

A Randomized Control Trial Evaluating the Effects of Police Body-Worn

Why Replications Do Not Fix the Reproducibility Crisis: a Model And

Which Findings Should Be Published?∗

1 Maximizing the Reproducibility of Your Research Open

Paradoxical Conclusions in Randomised Controlled Trials with 'Doubly Null

(Salmo Salar) with Amoebic Gill Disease (AGD) Chlo

The Significance of Null Results in Physics Education Research

Using Bias Analysis to Address Misclassification in Epidemiology

Null Hypothesis Significance Testing: a Review of an Old and Continuing Controversy

ORBITA and Coronary Stents: a Case Study in the Analysis and Reporting