Hill's Criteria for Causality

Encyclopedia of Biostatistics, Online © 2005 John Wiley & Sons, Ltd. This article is © 2005 John Wiley & Sons, Ltd. This article was published in the Encyclopedia of Biostatistics in 2005 by John Wiley & Sons, Ltd. DOI: 10.1002/0470011815.b2a03072

Despite philosophic criticisms of inductive inference, inductively oriented causal criteria have commonly been used to make such inferences. If a set of necessary and sufficient causal criteria could be used to distinguish causal from noncausal associations in observational studies, the job of the scientist would be eased considerably. With such criteria, all the concerns about the logic or lack thereof in causal inference could be forgotten: it would only be necessary to consult the checklist of criteria to see if a relation were causal. We know from philosophy that a set of sufficient criteria does not exist [3, 6]. Nevertheless, lists of causal criteria have become popular, possibly because they seem to provide a road map through complicated territory.

A commonly used set of criteria was proposed by Sir Austin Bradford Hill [1]; it was an expansion of a set of criteria offered previously in the landmark Surgeon General’s report on Smoking and Health [11], which in turn were anticipated by the inductive canons of John Stuart Mill [5] and the rules of causal inference given by Hume [3]. Hill suggested that the following aspects of an association be considered in attempting to distinguish causal from noncausal associations: strength, consistency, specificity, temporality, biologic gradient, plausibility, coherence, experimental evidence, and analogy. The popular view that these criteria should be used for causal inference makes it necessary to examine them in detail:

Strength

Hill’s argument is essentially that strong associations are more likely to be causal than weak associations because, if they could be explained by some other factor, the effect of that factor would have to be even stronger than the observed association and therefore would have become evident (see Cornfield’s Inequality). Weak associations, on the other hand, are more easily explained by undetected biases. To some extent this is a reasonable argument, but, as Hill himself acknowledged, the fact that an association is weak does not rule out a causal connection. A commonly cited counterexample is the relation between cigarette smoking and cardiovascular disease.

Counterexamples of strong but noncausal associations are also not hard to find; any study with strong confounding illustrates the phenomenon. For example, consider the strong but noncausal relation between Down syndrome and birth rank, which is confounded by the relation between Down syndrome and maternal age. Of course, once the confounding factor is identified, the association is diminished by adjustment for the factor. These examples remind us that a strong association is neither necessary nor sufficient for causality, nor is weakness necessary nor sufficient for absence of causality. In addition to these counterexamples, we have to remember that neither relative risk nor any other measure of association is a biologically consistent feature of an association; as described by many authors [4, 7], it is a characteristic of a study population that depends on the relative prevalence of other causes. A strong association serves only to rule out hypotheses that the association is entirely due to one weak unmeasured confounder or other source of modest bias.

Consistency

Consistency refers to the repeated observation of an association in different populations under different circumstances. Lack of consistency, however, does not rule out a causal association, because some effects are produced by their causes only under unusual circumstances. More precisely, the effect of a causal agent cannot occur unless the complementary component causes act, or have already acted, to complete a sufficient cause. These conditions will not always be met. Thus, transfusions can cause HIV infection but they do not always do so: the virus must also be present. Tampon use can cause toxic shock syndrome, but only when other conditions are met, such as presence of certain bacteria. Consistency is apparent only after all the relevant details of a causal mechanism are understood, which is to say very seldom. Even studies of exactly the same phenomena can be expected to yield different results simply because they differ in their methods and random errors. Consistency serves only to rule out hypotheses that the association is attributable to some factor that varies across studies.

Specificity

The criterion of specificity requires that a cause leads to a single effect, not multiple effects. This argument has often been advanced to refute causal interpretations of exposures that appear to relate to myriad effects, especially by those seeking to exonerate smoking as a cause of lung cancer. The criterion is wholly invalid, however. Causes of a given effect cannot be expected to lack other effects on any logical grounds. In fact, everyday experience teaches us repeatedly that single events or conditions may have many effects. Smoking is an excellent example: it leads to many effects in the smoker. The existence of one effect does not detract from the possibility that another effect exists. Thus, specificity does not confer greater validity to any causal inference regarding the exposure effect. Hill’s discussion of this criterion for inference is replete with reservations, and many authors regard this criterion as useless and misleading [8, 9].

Temporality

Temporality refers to the necessity that the cause precede the effect in time. This criterion is unarguable, insofar as any claimed observation of causation must involve the putative cause C preceding the putative effect D. It does not, however, follow that a reverse time order is evidence against the hypothesis that C can cause D. Rather, observations in which C followed D merely show that C could not have caused D in these instances; they provide no evidence for or against the hypothesis that C can cause D in those instances in which it precedes D.

Biologic Gradient

Biologic gradient refers to the presence of a monotone (unidirectional) dose–response curve. We often expect such a monotonic relation to exist. For example, more smoking means more carcinogen exposure and more tissue damage, hence more carcinogenesis. Such an expectation is not always present, however. The somewhat controversial topic of alcohol consumption and mortality is an example. Death rates are higher among nondrinkers than among moderate drinkers, but ascend to the highest levels for heavy drinkers. Because modest alcohol consumption can have beneficial effects on serum lipid profiles, such a J-shaped dose–response curve is at least biologically plausible.

Conversely, associations that do show a monotonic trend in disease frequency with increasing levels of exposure are not necessarily causal; confounding can result in a monotonic relation between a noncausal risk factor and disease if the confounding factor itself demonstrates a biologic gradient in its relation with disease. The noncausal relation between birth rank and Down syndrome mentioned above shows a biologic gradient that merely reflects the progressive relation between maternal age and the occurrence of Down syndrome.

Thus the existence of a monotonic association is neither necessary nor sufficient for a causal relation. A nonmonotonic relation only conflicts with those causal hypotheses specific enough to predict a monotonic dose–response curve.

Plausibility

Plausibility refers to the biologic plausibility of the hypothesis, an important concern but one that is far from objective or absolute. Sartwell [9], emphasizing this point, cited the remarks of Cheever, in 1861, who was commenting on the etiology of typhus before its mode of transmission (via body lice) was known:

"It could be no more ridiculous for the stranger who passed the night in the steerage of an emigrant ship to ascribe the typhus, which he there contracted, to the vermin with which bodies of the sick might be infested. An adequate cause, one reasonable in itself, must correct the coincidences of simple experience."

What was to Cheever an implausible explanation turned out to be the correct explanation, since it was indeed the vermin that caused the typhus infection. Such is the problem with plausibility: it is too often not based on logic or data, but only on prior beliefs. This is not to say that biological knowledge should be discounted when evaluating a new hypothesis, but only to point out the difficulty in applying that knowledge.

The Bayesian approach to inference attempts to deal with this problem by requiring that one quantify, on a probability (0 to 1) scale, the certainty that one has in prior beliefs, as well as in new hypotheses. This quantification displays the dogmatism or open-mindedness of the analyst in a public fashion, with certainty values near 1 or 0 betraying a strong commitment of the analyst for or against a hypothesis. It can also provide a means of testing those quantified beliefs against new evidence [2]. Nevertheless, the Bayesian approach cannot transform plausibility into an objective causal criterion.

Coherence

Taken from the Surgeon General’s report on Smok-

Although experimental tests can be much stronger than other tests, they are not as decisive as often thought, because of difficulties in interpretation. For example, one can attempt to test the hypothesis that malaria is caused by swamp gas by draining swamps in some areas and not in others to see if the malaria rates among residents are affected by the draining. As predicted by the hypothesis, the rates will drop in the areas where the swamps are drained.
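The point made under Strength, that a strong association can be entirely noncausal and shrinks once the confounder is adjusted for, can be sketched numerically. The following is a minimal illustration only: the counts are invented for this sketch (they are not data from the article or from any study), the names `STRATA`, `crude_risk_ratio`, and `mantel_haenszel_risk_ratio` are hypothetical, and the Mantel-Haenszel summary risk ratio is used here as one common adjustment method, not as the method the article prescribes. Birth rank plays the role of the exposure and maternal age the role of the confounder.

```python
# Hypothetical counts, invented for illustration only (not real data).
# For each maternal-age stratum: (births, cases) among high-birth-rank
# ("exposed") and low-birth-rank ("unexposed") births.
STRATA = {
    "younger mothers": {"high_rank": (1000, 1), "low_rank": (9000, 9)},
    "older mothers": {"high_rank": (9000, 90), "low_rank": (1000, 10)},
}


def crude_risk_ratio(strata):
    """Risk ratio ignoring maternal age: pool all strata, then compare risks."""
    exposed_births = sum(s["high_rank"][0] for s in strata.values())
    exposed_cases = sum(s["high_rank"][1] for s in strata.values())
    unexposed_births = sum(s["low_rank"][0] for s in strata.values())
    unexposed_cases = sum(s["low_rank"][1] for s in strata.values())
    return (exposed_cases / exposed_births) / (unexposed_cases / unexposed_births)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel summary risk ratio, adjusted for the stratifying factor."""
    numerator = 0.0
    denominator = 0.0
    for s in strata.values():
        n1, a = s["high_rank"]  # exposed births, exposed cases
        n0, b = s["low_rank"]   # unexposed births, unexposed cases
        total = n1 + n0
        numerator += a * n0 / total
        denominator += b * n1 / total
    return numerator / denominator


if __name__ == "__main__":
    print(f"crude risk ratio:    {crude_risk_ratio(STRATA):.2f}")
    print(f"adjusted risk ratio: {mantel_haenszel_risk_ratio(STRATA):.2f}")
```

In this constructed example the risk within each maternal-age stratum is the same for high and low birth rank, so the adjusted ratio equals 1, while the crude ratio is well above 1 only because high-rank births are concentrated among older mothers. Real data would rarely separate this cleanly.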
