PUBLIC HEALTH MATTERS

Causation and Causal Inference in Epidemiology

Kenneth J. Rothman, DrPH, Sander Greenland, MA, MS, DrPH, C Stat

Concepts of cause and causal inference are largely self-taught from early learning experiences. A model of causation that describes causes in terms of sufficient causes and their component causes illuminates important principles such as multicausality, the dependence of the strength of component causes on the prevalence of complementary component causes, and interaction between component causes.

Philosophers agree that causal propositions cannot be proved, and find flaws or practical limitations in all philosophies of causal inference. Hence, the role of logic, belief, and observation in evaluating causal propositions is not settled. Causal inference in epidemiology is better viewed as an exercise in measurement of an effect rather than as a criterion-guided process for deciding whether an effect is present or not. (Am J Public Health. 2005;95:S144-S150. doi:10.2105/AJPH.2004.059204)

What do we mean by causation? Even among those who study causation as the object of their work, the concept is largely self-taught, cobbled together from early experiences. As a youngster, each person develops and tests an inventory of causal explanations that brings meaning to perceived events and that ultimately leads to more control of those events.

Because our first appreciation of the concept of causation is based on our own direct observations, the resulting concept is limited by the scope of those observations. We typically observe causes with effects that are immediately apparent. For example, when one turns a light switch to the "on" position, one normally sees the instant effect of the light going on. Nevertheless, the causal mechanism for getting a light to shine involves more than turning a light switch to "on." Suppose a storm has downed the electric lines to the building, or the wiring is faulty, or the bulb is burned out; in any of these cases, turning the switch on will have no effect. One cause of the light going on is having the switch in the proper position, but along with it we must have a supply of power to the circuit, good wiring, and a working bulb. When all other factors are in place, turning the switch will cause the light to go on, but if one or more of the other factors is lacking, the light will not go on.

Despite the tendency to consider a switch as the unique cause of turning on a light, the complete causal mechanism is more intricate, and the switch is only one component of several. The tendency to identify the switch as the unique cause stems from its usual role as the final factor that acts in the causal mechanism. The wiring can be considered part of the causal mechanism, but once it is put in place, it seldom warrants further attention. The switch, however, is often the only part of the mechanism that needs to be activated to obtain the effect of turning on the light. The effect usually occurs immediately after turning on the switch, and as a result we slip into the frame of thinking in which we identify the switch as a unique cause. The inadequacy of this assumption is emphasized when the bulb goes bad and needs to be replaced. These concepts of causation that are established empirically early in life are too rudimentary to serve well as the basis for scientific theories. To enlarge upon them, we need a more general conceptual model that can serve as a common starting point in discussions of causal theories.

SUFFICIENT AND COMPONENT CAUSES

The concept and definition of causation engender continuing debate among philosophers. Nevertheless, researchers interested in causal phenomena must adopt a working definition. We can define a cause of a specific disease event as an antecedent event, condition, or characteristic that was necessary for the occurrence of the disease at the moment it occurred, given that other conditions are fixed. In other words, a cause of a disease event is an event, condition, or characteristic that preceded the disease event and without which the disease event either would not have occurred at all or would not have occurred until some later time. Under this definition it may be that no specific event, condition, or characteristic is sufficient by itself to produce disease. This is not a definition, then, of a complete causal mechanism, but only a component of it. A "sufficient cause," which means a complete causal mechanism, can be defined as a set of minimal conditions and events that inevitably produce disease; "minimal" implies that all of the conditions or events are necessary to that occurrence. In disease etiology, the completion of a sufficient cause may be considered equivalent to the onset of disease. (Onset here refers to the onset of the earliest stage of the disease process, rather than the onset of signs or symptoms.) For biological effects, most and sometimes all of the components of a sufficient cause are unknown.¹

For example, tobacco smoking is a cause of lung cancer, but by itself it is not a sufficient cause. First, the term smoking is too imprecise to be used in a causal description. One must specify the type of smoke (e.g., cigarette, cigar, pipe), whether it is filtered or unfiltered, the manner and frequency of inhalation, and the onset and duration of smoking. More importantly, smoking, even defined explicitly, will not cause cancer in everyone. Apparently, there are some people who, by virtue of their genetic makeup or previous experience, are susceptible to the effects of smoking, and others who are not. These susceptibility factors are other components in the various causal mechanisms through which smoking causes lung cancer.

Figure 1 provides a schematic diagram of sufficient causes in a hypothetical individual. Each constellation of component causes represented in Figure 1 is minimally sufficient to produce the disease; that is, there is no redundant or extraneous component cause. Each one is a necessary part of that specific causal

S144 | Public Health Matters | Peer Reviewed | Rothman and Greenland    American Journal of Public Health | Supplement 1, 2005, Vol 95, No. S1

mechanism. A specific component cause may play a role in one, two, or all three of the causal mechanisms pictured.

FIGURE 1. Three sufficient causes of disease. (Figure labels: One Causal Mechanism; Single Component Cause.)

MULTICAUSALITY

The model of causation implied by Figure 1 illuminates several important principles regarding causes. Perhaps the most important of these principles is self-evident from the model: A given disease can be caused by more than one causal mechanism, and every causal mechanism involves the joint action of a multitude of component causes. Consider as an example the cause of a broken hip. Suppose that someone experiences a traumatic injury to the head that leads to a permanent disturbance in equilibrium. Many years later, the faulty equilibrium plays a causal role in a fall that occurs while the person is walking on an icy path. The fall results in a broken hip. Other factors playing a causal role for the broken hip could include the type of shoe the person was wearing, the lack of a handrail along the path, a strong wind, or the body weight of the person, among others. The complete causal mechanism involves a multitude of factors. Some factors, such as the person's weight and the earlier injury that resulted in the equilibrium disturbance, reflect earlier events that have had a lingering effect. Some causal components are genetic and would affect the person's weight, gait, behavior, recovery from the earlier trauma, and so forth. Other factors, such as the force of the wind, are environmental. It is a reasonably safe assertion that there are nearly always some genetic and some environmental component causes in every causal mechanism. Thus, even an event such as a fall on an icy path leading to a broken hip is part of a complicated causal mechanism that involves many component causes.

The importance of multicausality is that most identified causes are neither necessary nor sufficient to produce disease. Nevertheless, a cause need not be either necessary or sufficient for its removal to result in disease prevention. If a component cause that is neither necessary nor sufficient is blocked, a substantial amount of disease may be prevented. That the cause is not necessary implies that some disease may still occur after the cause is blocked, but a component cause will nevertheless be a necessary cause for some of the cases that occur. That the component cause is not sufficient implies that other component causes must interact with it to produce the disease, and that blocking any of them would result in prevention of some cases of disease. Thus, one need not identify every component cause to prevent some cases of disease. In the law, a distinction is sometimes made among component causes to identify those that may be considered a "proximate" cause, implying a more direct connection or responsibility for the outcome.²

STRENGTH OF A CAUSE

In epidemiology, the strength of a factor's effect is usually measured by the change in disease frequency produced by introducing the factor into a population. This change may be measured in absolute or relative terms. In either case, the strength of an effect may have tremendous public health significance, but it may have little biological significance. The reason is that given a specific causal mechanism, any of the component causes can have strong or weak effects. The actual identity of the constituent components of the causal mechanism amounts to the biology of causation. In contrast, the strength of a factor's effect depends on the time-specific distribution of its causal complements in the population. Over a span of time, the strength of the effect of a given factor on the occurrence of a given disease may change, because the prevalence of its causal complements in various causal mechanisms may also change. The causal mechanisms in which the factor and its complements act could remain unchanged, however.

INTERACTION AMONG CAUSES

The causal pie model posits that several causal components act in concert to produce an effect. "Acting in concert" does not necessarily imply that factors must act at the same time. Consider the example above of the person who sustained trauma to the head that resulted in an equilibrium disturbance, which led, years later, to a fall on an icy path. The earlier head trauma played a causal role in the later hip fracture; so did the weather conditions on the day of the fracture. If both of these factors played a causal role in the hip fracture, then they interacted with one another to cause the fracture, despite the fact that their time of action is many years apart. We would say that any and all of the factors in the same causal mechanism for disease interact with one another to cause disease. Thus, the head trauma interacted with the weather conditions, as well as with other component causes such as the type of footwear, the absence of a handhold, and any other conditions that were necessary to the causal mechanism of the fall and the broken hip that resulted. One can view each causal pie as a set of interacting causal components. This model provides a biological basis for a concept of


interaction distinct from the usual statistical view of interaction.

SUM OF ATTRIBUTABLE FRACTIONS

Consider the data on rates of head and neck cancer according to whether people have been cigarette smokers, alcohol drinkers, or both (Table 1). Suppose that the differences in the rates all reflect causal effects. Among those people who are smokers and also alcohol drinkers, what proportion of the cases is attributable to the effect of smoking? We know that the rate for these people is 12 cases per 100 000 person-years. If these same people were not smokers, we can infer that their rate of head and neck cancer would be 3 cases per 100 000 person-years. If this difference reflects the causal role of smoking, then we might infer that 9 of every 12 cases, or 75%, are attributable to smoking among those who both smoke and drink alcohol. If we turn the question around and ask what proportion of disease among these same people is attributable to alcohol drinking, we would be able to attribute 8 of every 12 cases, or 67%, to alcohol drinking.

Table 1. Hypothetical Rates of Head and Neck Cancer (Cases per 100 000 Person-Years) According to Smoking Status and Alcohol Drinking

                      Alcohol Drinking
Smoking Status        No        Yes
Nonsmoker             1         3
Smoker                4         12

How can we attribute 75% of the cases to smoking and 67% to alcohol drinking among those who are exposed to both? We can because some cases are counted more than once. Smoking and alcohol interact in some cases of head and neck cancer, and these cases are attributable both to smoking and to alcohol drinking. One consequence of interaction is that we should not expect that the proportions of disease attributable to various component causes will sum to 100%.

A widely discussed (though unpublished) paper from the 1970s, written by scientists at the National Institutes of Health, proposed that as much as 40% of cancer is attributable to occupational exposures. Many scientists thought that this fraction was an overestimate, and argued against this claim. One of the arguments used in rebuttal was as follows: x percent of cancer is caused by smoking, y percent by diet, z percent by alcohol, and so on; when all these percentages are added up, only a small percentage, much less than 40%, is left for occupational causes. But this rebuttal is fallacious, because it is based on the naive view that every case of disease has a single cause, and that two causes cannot both contribute to the same case of cancer. In fact, since diet, smoking, asbestos, and various occupational exposures, along with other factors, interact with one another and with genetic factors to cause cancer, each case of cancer could be attributed repeatedly to many separate component causes. The sum of disease attributable to various component causes thus has no upper limit.

A single cause or category of causes that is present in every sufficient cause of disease will have an attributable fraction of 100%. Much publicity attended the pronouncement in 1960 that as much as 90% of cancer is caused by environmental factors. Since "environment" can be thought of as an all-embracing category that represents nongenetic causes, which must be present to some extent in every sufficient cause, it is clear on a priori grounds that 100% of any disease is environmentally caused. Thus, Higginson's estimate of 90% was an underestimate.

Similarly, one can show that 100% of any disease is inherited. MacMahon cited the example of yellow shanks, a trait occurring in certain strains of fowl fed yellow corn. Both the right set of genes and the yellow-corn diet are necessary to produce yellow shanks. A farmer with several strains of fowl, feeding them all only yellow corn, would consider yellow shanks to be a genetic condition, since only one strain would get yellow shanks, despite all strains getting the same diet. A different farmer, who owned only the strain liable to get yellow shanks, but who fed some of the birds yellow corn and others white corn, would consider yellow shanks to be an environmentally determined condition because it depends on diet. In reality, yellow shanks is determined by both genes and environment; there is no reasonable way to allocate a portion of the causation to either genes or environment. Similarly, every case of every disease has some environmental and some genetic component causes, and therefore every case can be attributed both to genes and to environment. No paradox exists as long as it is understood that the fractions of disease attributable to genes and to environment overlap.

Many researchers have spent considerable effort in developing heritability indices, which are supposed to measure the fraction of disease that is inherited. Unfortunately, these indices only assess the relative role of environmental and genetic causes of disease in a particular setting. For example, some genetic causes may be necessary components of every causal mechanism. If everyone in a population has an identical set of the genes that cause disease, however, their effect is not included in heritability indices, despite the fact that having these genes is a cause of the disease. The two farmers in the example above would offer very different values for the heritability of yellow shanks, despite the fact that the condition is always 100% dependent on having certain genes.

If all genetic factors that determine disease are taken into account, whether or not they vary within populations, then 100% of disease can be said to be inherited. Analogously, 100% of any disease is environmentally caused, even those diseases that we often consider purely genetic. Phenylketonuria, for example, is considered by many to be purely genetic. Nonetheless, the mental retardation that it may cause can be prevented by appropriate dietary intervention.

The treatment for phenylketonuria illustrates the interaction of genes and environment to cause a disease commonly thought to be purely genetic. What about an apparently purely environmental cause of death such as death from an automobile accident? It is easy to conceive of genetic traits that lead to psychiatric problems such as alcoholism, which in turn lead to drunk driving and consequent fatality. Consider another more extreme environmental example, being killed by lightning. Partially heritable psychiatric conditions can influence whether someone will take shelter during a lightning storm; genetic traits such as


athletic ability may influence the likelihood of being outside when a lightning storm strikes; and having an outdoor occupation or pastime that is more frequent among men (or women), and in that sense genetic, would also influence the probability of getting killed by lightning. The argument may seem stretched on this example, but the point that every case of disease has both genetic and environmental causes is defensible and has important implications for research.

MAKING CAUSAL INFERENCES

Causal inference may be viewed as a special case of the more general process of scientific reasoning, about which there is substantial scholarly debate among scientists and philosophers.

Impossibility of Proof

Vigorous debate is a characteristic of modern scientific philosophy, no less in epidemiology than in other areas. Perhaps the most important common thread that emerges from the debated philosophies stems from 18th-century empiricist David Hume's observation that proof is impossible in empirical science. This simple fact is especially important to epidemiologists, who often face the criticism that proof is impossible in epidemiology, with the implication that it is possible in other scientific disciplines. Such criticism may stem from a view that experiments are the definitive source of scientific knowledge. Such a view is mistaken on at least two counts. First, the nonexperimental nature of a science does not preclude impressive scientific discoveries; the myriad examples include plate tectonics, the evolution of species, planets orbiting other stars, and the effects of cigarette smoking on human health. Even when they are possible, experiments (including randomized trials) do not provide anything approaching proof, and in fact may be controversial, contradictory, or irreproducible. The cold-fusion debacle demonstrates well that neither physical nor experimental science is immune to such problems.

Some experimental scientists hold that epidemiologic relations are only suggestive, and believe that detailed laboratory study of mechanisms within single individuals can reveal cause-effect relations with certainty. This view overlooks the fact that all relations are suggestive in exactly the manner discussed by Hume: even the most careful and detailed mechanistic dissection of individual events cannot provide more than associations, albeit at a finer level. Laboratory studies often involve a degree of observer control that cannot be approached in epidemiology; it is only this control, not the level of observation, that can strengthen the inferences from laboratory studies. Furthermore, such control is no guarantee against error. All of the fruits of scientific work, in epidemiology or other disciplines, are at best only tentative formulations of a description of nature, even when the work itself is carried out without mistakes.

Testing Competing Epidemiologic Theories

Biological knowledge about epidemiologic hypotheses is often scant, making the hypotheses themselves at times little more than vague statements of causal association between exposure and disease, such as "smoking causes cardiovascular disease." These vague hypotheses have only vague consequences that can be difficult to test. To cope with this vagueness, epidemiologists usually focus on testing the negation of the causal hypothesis, that is, the null hypothesis that the exposure does not have a causal relation to disease. Then, any observed association can potentially refute the hypothesis, subject to the assumption (auxiliary hypothesis) that biases are absent.

If the causal mechanism is stated specifically enough, epidemiologic observations under some circumstances might provide crucial tests of competing non-null causal hypotheses. On the other hand, many epidemiologic studies are not designed to test a causal hypothesis. For example, epidemiologic data related to the finding that women who took replacement estrogen therapy were at a considerably higher risk for endometrial cancer were examined by Horwitz and Feinstein, who conjectured a competing theory to explain the association: they proposed that women taking estrogen experienced symptoms such as bleeding that induced them to consult a physician. The resulting diagnostic workup led to the detection of endometrial cancer at an earlier stage in these women, as compared with women not taking estrogens. Many epidemiologic observations could have been and were used to evaluate these competing hypotheses. The causal theory predicted that the risk of endometrial cancer would tend to increase with increasing use (dose, frequency, and duration) of estrogens, as for other carcinogenic exposures. The detection bias theory, on the other hand, predicted that women who had used estrogens only for a short while would have the greatest risk, since the symptoms related to estrogen use that led to the medical consultation tend to appear soon after use begins. Because the association of recent estrogen use and endometrial cancer was the same in both long-term and short-term estrogen users, the detection bias theory was refuted as an explanation for all but a small fraction of endometrial cancer cases occurring after estrogen use.

The endometrial cancer example illustrates a critical point in understanding the process of causal inference in epidemiologic studies: many of the hypotheses being evaluated in the interpretation of epidemiologic studies are noncausal hypotheses, in the sense of involving no causal connection between the study exposure and the disease. For example, hypotheses that amount to explanations of how specific types of bias could have led to an association between exposure and disease are the usual alternatives to the primary study hypothesis that the epidemiologist needs to consider in drawing inferences. Much of the interpretation of epidemiologic studies amounts to the testing of such noncausal explanations.

THE DUBIOUS VALUE OF CAUSAL CRITERIA

In practice, how do epidemiologists separate out the causal from the noncausal explanations? Despite philosophic criticisms of inductive inference, inductively oriented causal criteria have commonly been used to make such inferences. If a set of necessary and sufficient causal criteria could be used to distinguish causal from noncausal relations in epidemiologic studies, the job of the scientist would be eased considerably. With such


criteria, all the concerns about the logic or lack thereof in causal inference could be forgotten: it would only be necessary to consult the checklist of criteria to see if a relation were causal. We know from philosophy that a set of sufficient criteria does not exist. Nevertheless, lists of causal criteria have become popular, possibly because they seem to provide a road map through complicated territory.

Hill's Criteria

A commonly used set of criteria was proposed by Hill; it was an expansion of a set of criteria offered previously in the landmark surgeon general's report on smoking and health, which in turn were anticipated by the inductive canons of John Stuart Mill and the rules given by Hume.

Hill suggested that the following aspects of an association be considered in attempting to distinguish causal from noncausal associations: (1) strength, (2) consistency, (3) specificity, (4) temporality, (5) biological gradient, (6) plausibility, (7) coherence, (8) experimental evidence, and (9) analogy. These criteria suffer from their inductivist origin, but their popularity demands a more specific discussion of their utility.

1. Strength. Hill's argument is essentially that strong associations are more likely to be causal than weak associations because, if they could be explained by some other factor, the effect of that factor would have to be even stronger than the observed association and therefore would have become evident. Weak associations, on the other hand, are more likely to be explained by undetected biases. To some extent this is a reasonable argument but, as Hill himself acknowledged, the fact that an association is weak does not rule out a causal connection. A commonly cited counterexample is the relation between cigarette smoking and cardiovascular disease: one explanation for this relation being weak is that cardiovascular disease is common, making any ratio measure of effect comparatively small compared with ratio measures for diseases that are less common. Nevertheless, cigarette smoking is not seriously doubted as a cause of cardiovascular disease. Another example would be passive smoking and lung cancer, a weak association that few consider to be noncausal.

Counterexamples of strong but noncausal associations are also not hard to find; any study with strong confounding illustrates the phenomenon. For example, consider the strong but noncausal relation between Down syndrome and birth rank, which is confounded by the relation between Down syndrome and maternal age. Of course, once the confounding factor is identified, the association is diminished by adjustment for the factor. These examples remind us that a strong association is neither necessary nor sufficient for causality, nor is weakness necessary or sufficient for absence of causality. Furthermore, neither relative risk nor any other measure of association is a biologically consistent feature of an association; as described above, such measures of association are characteristics of a given population that depend on the relative prevalence of other causes in that population. A strong association serves only to rule out hypotheses that the association is entirely due to one weak unmeasured confounder or other source of modest bias.

2. Consistency. Consistency refers to the repeated observation of an association in different populations under different circumstances. Lack of consistency, however, does not rule out a causal association, because some effects are produced by their causes only under unusual circumstances. More precisely, the effect of a causal agent cannot occur unless the complementary component causes act, or have already acted, to complete a sufficient cause. These conditions will not always be met. Thus, transfusions can cause HIV infection but they do not always do so: the virus must also be present. Tampon use can cause toxic shock syndrome, but only rarely when certain other, perhaps unknown, conditions are met. Consistency is apparent only after all the relevant details of a causal mechanism are understood, which is to say very seldom. Furthermore, even studies of exactly the same phenomena can be expected to yield different results simply because they differ in their methods and random errors. Consistency serves only to rule out hypotheses that the association is attributable to some factor that varies across studies.

One mistake in implementing the consistency criterion is so common that it deserves special mention. It is sometimes claimed that a literature or set of results is inconsistent simply because some results are "statistically significant" and some are not. This sort of evaluation is completely fallacious even if one accepts the use of significance testing methods: The results (effect estimates) from the studies could all be identical even if many were significant and many were not, the difference in significance arising solely because of differences in the standard errors or sizes of the studies. Furthermore, this fallacy is not eliminated by "standardizing" estimates.

3. Specificity. The criterion of specificity requires that a cause leads to a single effect, not multiple effects. This argument has often been advanced to refute causal interpretations of exposures that appear to relate to myriad effects, for example, by those seeking to exonerate smoking as a cause of lung cancer. Unfortunately, the criterion is invalid as a general rule. Causes of a given effect cannot be expected to lack all other effects. In fact, everyday experience teaches us repeatedly that single events or conditions may have many effects. Smoking is an excellent example; it leads to many effects in the smoker, in part because smoking involves exposure to a wide range of agents. The existence of one effect of an exposure does not detract from the possibility that another effect exists.

On the other hand, Weiss convincingly argued that specificity can be used to distinguish some causal hypotheses from noncausal hypotheses, when the causal hypothesis predicts a relation with one outcome but no relation with another outcome. Thus, specificity can come into play when it can be logically deduced from the causal hypothesis in question.

4. Temporality. Temporality refers to the necessity for a cause to precede an effect in time. This criterion is inarguable, insofar as any claimed observation of causation must involve the putative cause C preceding the putative effect D. It does not, however, follow that a reverse time order is evidence against the hypothesis that C can cause D. Rather, observations in which C followed D merely show that C could not have caused D in these instances; they provide no evidence for or against the hypothesis that C can cause D in those instances in which it precedes D.

5. Biological gradient. Biological gradient refers to the presence of a unidirectional dose-response curve. We often expect such a


monotonic relation to exist. For example, more smoking means more carcinogen exposure and more tissue damage, hence more opportunity for carcinogenesis. Some causal associations, however, show a single jump (threshold) rather than a monotonic trend; an example is the association between DES and adenocarcinoma of the vagina. A possible explanation is that the doses of DES that were administered were all sufficiently great to produce the maximum effect from DES. Under this hypothesis, for all those exposed to DES, the development of disease would depend entirely on other component causes.

Alcohol consumption and mortality is another example. Death rates are higher among nondrinkers than among moderate drinkers, but ascend to the highest levels for heavy drinkers. There is considerable debate about which parts of the J-shaped dose-response curve are causally related to alcohol consumption and which parts are noncausal artifacts stemming from confounding or other biases. Some studies appear to find only an increasing relation between alcohol consumption and mortality, possibly because the categories of alcohol consumption are too broad to distinguish different rates among moderate drinkers and nondrinkers.

Associations that do show a monotonic trend in disease frequency with increasing levels of exposure are not necessarily causal; confounding can result in a monotonic relation between a noncausal risk factor and disease if the confounding factor itself demonstrates a biological gradient in its relation with disease. The noncausal relation between birth rank and Down syndrome mentioned in part 1 above shows a biological gradient that merely reflects the progressive relation between maternal age and Down syndrome occurrence.

These examples imply that the existence of a monotonic association is neither necessary nor sufficient for a causal relation. A nonmonotonic relation only refutes those causal hypotheses specific enough to predict a monotonic dose-response curve.

6. Plausibility. Plausibility refers to the biological plausibility of the hypothesis, an important concern but one that is far from objective or absolute. Sartwell, emphasizing this point, cited the 1861 comments of Cheever on the etiology of typhus before its mode of transmission (via body lice) was known: "It could be no more ridiculous for the stranger who passed the night in the steerage of an emigrant ship to ascribe the typhus, which he there contracted, to the vermin with which bodies of the sick might be infested. An adequate cause, one reasonable in itself, must correct the coincidences of simple experience."¹⁷ What was to Cheever an implausible explanation turned out to be the correct explanation, since it was indeed the vermin that caused the typhus infection. Such is the problem with plausibility: it is too often not based on logic or data, but only on prior beliefs. This is not to say that biological knowledge should be discounted when evaluating a new hypothesis, but only to point out the difficulty in applying that knowledge.

The Bayesian approach to inference attempts to deal with this problem by requiring that one quantify, on a probability (0 to 1) scale, the certainty that one has in prior beliefs, as well as in new hypotheses. This quantification displays the dogmatism or open-mindedness of the analyst in a public fashion, with certainty values near 1 or 0 betraying a strong commitment of the analyst for or against a hypothesis. It can also provide a means of testing those quantified beliefs against new evidence. Nevertheless, the Bayesian approach cannot transform plausibility into an objective causal criterion.

7. Coherence. Taken from the surgeon general's report on smoking and health,¹¹ the term coherence implies that a cause-and-effect interpretation for an association does not conflict with what is known of the natural history and biology of the disease. The examples Hill gave for coherence, such as the histopathologic effect of smoking on bronchial epithelium (in reference to the association between smoking and lung cancer) or the difference in lung cancer incidence by gender, could reasonably be considered examples of plausibility as well as coherence; the distinction appears to be a fine one. Hill emphasized that the absence of coherent information, as distinguished, apparently, from the presence of conflicting information, should not be taken as evidence against an association being considered causal. On the other hand, presence of conflicting information may indeed refute a hypothesis, but one must always remember that the conflicting information may be mistaken or misinterpreted.

8. Experimental evidence. It is not clear what Hill meant by experimental evidence. It might have referred to evidence from laboratory experiments on animals, or to evidence from human experiments. Evidence from human experiments, however, is seldom available for most epidemiologic research questions, and animal evidence relates to different species and usually to levels of exposure very different from those humans experience. From Hill's examples, it seems that what he had in mind for experimental evidence was the result of removal of some harmful exposure in an intervention or prevention program, rather than the results of laboratory experiments. The lack of availability of such evidence would at least be a pragmatic difficulty in making this a criterion for inference. Logically, however, experimental evidence is not a criterion but a test of the causal hypothesis, a test that is simply unavailable in most circumstances. Although experimental tests can be much stronger than other tests, they are often not as decisive as thought, because of difficulties in interpretation. For example, one can attempt to test the hypothesis that malaria is caused by swamp gas by draining swamps in some areas and not in others to see if the malaria rates among residents are affected by the draining. As predicted by the hypothesis, the rates will drop in the areas where the swamps are drained. As Popper emphasized, however, there are always many alternative explanations for the outcome of every experiment. In this example, one alternative, which happens to be correct, is that mosquitoes are responsible for malaria transmission.

9. Analogy. Whatever insight might be derived from analogy is handicapped by the inventive imagination of scientists who can find analogies everywhere. At best, analogy provides a source of more elaborate hypotheses about the associations under study; absence of such analogies only reflects lack of imagination or experience, not falsity of the hypothesis.

Is There Any Use for Causal Criteria?

As is evident, the standards of epidemiologic evidence offered by Hill are saddled with reservations and exceptions. Hill himself was ambivalent about the utility of these "viewpoints" (he did not use the word criteria in the paper). On the one hand, he asked, "In what circumstances can we pass from this observed


association to a verdict of causation?" Yet despite speaking of verdicts on causation, he disagreed that any "hard-and-fast rules of evidence" existed by which to judge causation. This conclusion accords with the views of Hume, Popper, and others that causal inferences cannot attain the certainty of logical deductions. Although some scientists continue to promulgate causal criteria as aids to inference, others argue that it is actually detrimental to cloud the inferential process by considering checklist criteria.¹⁹ An intermediate, refutationist approach seeks to transform the criteria into deductive tests of causal hypotheses.²⁰,²¹ Such an approach avoids the temptation to use causal criteria simply to buttress pet theories at hand, and instead allows epidemiologists to focus on evaluating competing causal theories using crucial observations.

CRITERIA TO JUDGE WHETHER SCIENTIFIC EVIDENCE IS VALID

Just as causal criteria cannot be used to establish the validity of an inference, there are no criteria that can be used to establish the validity of data or evidence. There are methods by which validity can be assessed, but this assessment would not resemble anything like the application of rigid criteria. Some of the difficulty can be understood by taking the view that scientific evidence can usually be viewed as a form of measurement. If an epidemiologic study sets out to assess the relation between exposure to tobacco smoke and lung cancer risk, the results can and should be framed as a measure of causal effect, such as the ratio of the risk of lung cancer among smokers to the risk among nonsmokers. Like any measurement, the measurement of a causal effect is subject to measurement error. For a scientific study, measurement error encompasses more than the error that we might have in mind when we attempt to measure the length of a piece of carpet. In addition to statistical error, the measurement error subsumes problems that relate to study design, including subject selection and retention, information acquisition, and uncontrolled confounding and other sources of bias. There are many individual sources of possible error. It is not sufficient to characterize a study as having or not having any of these sources of error, since nearly every study will have nearly every type of error. The real issue is to quantify the errors. As there is no precise cutoff with respect to how much error can be tolerated before a study must be considered invalid, there is no alternative to the quantification of study errors to the extent possible.

Although there are no absolute criteria for assessing the validity of scientific evidence, it is still possible to assess the validity of a study. What is required is much more than the application of a list of criteria. Instead, one must apply thorough criticism, with the goal of obtaining a quantified evaluation of the total error that afflicts the study. This type of assessment is not one that can be done easily by someone who lacks the skills and training of a scientist familiar with the subject matter and the scientific methods that were employed. Neither can it be applied readily by judges in court, nor by scientists who either lack the requisite knowledge or who do not take the time to penetrate the work.

About the Authors

Kenneth J. Rothman is with the Boston University Medical Center, Boston, Mass. Sander Greenland is with the University of California, Los Angeles.

Requests for reprints should be sent to Kenneth J. Rothman, DrPH, Boston University School of Public Health, Department of Epidemiology, 715 Albany St, Boston, MA 02118 (e-mail: [email protected]).

This article was accepted November 18, 2004.

Contributors

Kenneth J. Rothman and Sander Greenland participated equally in the planning and writing of this article.

Acknowledgments

This work is largely abridged from chapter 2 of Modern Epidemiology, 2nd ed., by K. J. Rothman and S. Greenland, Lippincott, Williams & Wilkins, 1998, and chapter 2 of Epidemiology: An Introduction by K. J. Rothman, Oxford University Press, 2002.

References

1. Rothman KJ. Causes. Am J Epidemiol. 1976;104:587-592.
2. Honoré A. Causation in the Law. In: Zalta EN, ed. Stanford Encyclopedia of Philosophy. Winter 2001 ed. Stanford, Calif: Stanford University; 2001. Available at: http://plato.stanford.edu/archives/win2001/entries/causation-law.
3. Rothman KJ, Greenland S. Modern Epidemiology. Philadelphia, Pa: Lippincott; 1998: chap 18.
4. Higginson J. Proportion of cancer due to occupation. Prev Med. 1980;9:180-188.
5. Efron E. Apocalyptics: Cancer and the Big Lie. How Environmental Politics Controls What We Know About Cancer. New York, NY: Simon and Schuster; 1984.
6. Higginson J. Population studies in cancer. Acta Unio Internat Contra Cancrum. 1960;16:1667-1670.
7. MacMahon B. Gene-environment interaction in human disease. J Psychiatr Res. 1968;6:393-402.
8. Hogben L. Nature and Nurture. London, England: Williams and Norgate; 1933.
9. Horwitz RI, Feinstein AR. Alternative analytic methods for case-control studies of estrogens and endometrial cancer. N Engl J Med. 1978;299:1089-1094.
10. Hill AB. The environment and disease: association or causation? Proc R Soc Med. 1965;58:295-300.
11. Smoking and Health: Report of the Advisory Committee to the Surgeon General of the Public Health Service. Washington, DC: US Department of Health, Education, and Welfare; 1964. Public Health Service Publication No. 1103.
12. Mill JS. A System of Logic, Ratiocinative and Inductive. 5th ed. London, England: Parker, Son and Bowin; 1862. Cited in Clark DW, MacMahon B, eds. Preventive and Community Medicine. 2nd ed. Boston, Mass: Little, Brown; 1981:chap 2.
13. Hume D. A Treatise of Human Nature. (Originally published in 1739.) Oxford University Press edition, with an Analytical Index by L. A. Selby-Bigge, published 1888. Second edition with text revised and notes by P. H. Nidditch, 1978.
14. Rothman KJ, Poole C. A strengthening programme for weak associations. Int J Epidemiol. 1988;17(suppl):955-959.
15. Smith GD. Specificity as a criterion for causation: a premature burial? Int J Epidemiol. 2002;31:710-713.
16. Weiss NS. Can the specificity of an association be rehabilitated as a basis for supporting a causal hypothesis? Epidemiology. 2002;13:6-8.
17. Sartwell P. On the methodology of investigations of etiologic factors in chronic diseases: further comments. J Chron Dis. 1960;11:61-63.
18. Popper KR. The Logic of Scientific Discovery. New York, NY: Harper & Row; 1959 (first published in German in 1934).
19. Lanes SF, Poole C. "Truth in packaging?" The unwrapping of epidemiologic research. J Occup Med. 1984;26:571-574.
20. Maclure M. Popperian refutation in epidemiology. Am J Epidemiol. 1985;121:343-350.
21. Weed D. On the logic of causal inference. Am J Epidemiol. 1986;123:965-979.
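
The view that a study result is a measurement subject to error, taken in the section on judging scientific evidence, can be illustrated with a minimal sketch. It computes a risk ratio and an approximate confidence interval using the standard large-sample (Wald) formula on the log scale. The function name and the cohort counts are hypothetical, invented only for illustration; they are not taken from the article or from any study.

```python
import math


def risk_ratio_with_ci(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Return (risk ratio, lower limit, upper limit) for cohort count data.

    The interval is the usual large-sample Wald interval computed on the
    log scale; z=1.96 corresponds to an approximate 95% interval.
    """
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed

    # Approximate standard error of log(RR) for cumulative-incidence data.
    se_log_rr = math.sqrt(
        1 / cases_exposed - 1 / n_exposed + 1 / cases_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts (assumption): 90 cases among 10,000 smokers,
# 10 cases among 10,000 nonsmokers.
rr, lower, upper = risk_ratio_with_ci(90, 10_000, 10, 10_000)
print(f"Risk ratio = {rr:.1f}, approximate 95% CI {lower:.1f} to {upper:.1f}")
```

The interval in this sketch reflects statistical (random) error only. Errors from subject selection and retention, information acquisition, and uncontrolled confounding are not captured by it and would have to be quantified separately, which is the point the text makes about total study error.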

S150 | Public Health Matters | Peer Reviewed | Rothman and Greenland | American Journal of Public Health | Supplement 1, 2005, Vol 95, No. S1