Psychological Methods Copyright 2001 by the American Psychological Association, Inc. 2001. Vol. 6, No. 2, 115-134 1082-989X/01/S5.00 DOI: 10.1037//1082-989X.6.2.115

Estimating and Testing and in Within-Subject Designs

Charles M. Judd David A. Kenny University of Colorado at Boulder University of Connecticut

Gary H. McClelland University of Colorado at Boulder

Analyses designed to detect mediation and moderation of treatment effects are increasingly prevalent in research in psychology. The mediation question concerns the processes that produce a treatment effect. The moderation question concerns factors that affect the magnitude of that effect. Although analytic procedures have been reasonably well worked out in the case in which the treatment varies between participants, no systematic procedures for examining mediation and moderation have been developed in the case in which the treatment varies within participants. The authors present an analytic approach to these issues using estimation.

The issues of mediation and moderation have re- condition). Assuming that a performance difference is ceived considerable attention in recent years in both found, one might plausibly hypothesize different me- basic and applied research (Baron & Kenny, 1986; diating mechanisms for this effect. The new curricu- James & Brett, 1984; Judd & Kenny, 198Ib; Mac- lum might increase students' interest in the subject Kinnon & Dwyer, 1993). In addition to knowing matter; it might cause students to study harder outside whether a particular intervention has an effect, the of class; or it might convey the material more clearly. researcher typically wants to know about factors that These are alternative reasons why the performance affect the magnitude of that effect (i.e., moderation) difference is found, that is, alternative mediators of and mechanisms that produce the effect (i.e., media- the treatment effect. The researcher might also be in- tion). Such knowledge helps in both theory develop- terested in factors that affect the magnitude of the ment and intervention application. difference between performance following the old To illustrate the difference between mediation and curriculum and performance following the new one. moderation, consider a design in which a researcher is That difference might be larger or smaller for differ- interested in whether students who are taught with a ent types of students or in different types of class- new curriculum (the treatment condition) show higher rooms or when taught by different kinds of teachers. performance on a subsequent standardized test than All of these then are potential moderators of the treat- students taught under the old curriculum (the control ment effect. It is possible that the same variable may serve as both a mediator and a moderator. For instance, study Charles M. Judd and Gary H. McClelland, Department of time might serve both roles. First, as a mediator, the Psychology, University of Colorado at Boulder; David A. new curriculum might lead to higher performance be- Kenny, Department of Psychology, University of Connecti- cause it causes students to study more. Second, as a cut. moderator, the treatment might be especially effective Preparation of this article was partially supported by for students who spend more time studying. National Institute of Mental Health Grants R01 MH45049 and R01 MH51964. Procedures for assessing mediation and moderation Correspondence concerning this article should be ad- have been relatively well worked out through ordinary dressed to Charles M. Judd, Department of Psychology, least squares regression and analysis of pro- University of Colorado, Boulder, Colorado 80309-0345. cedures. Mediation is assessed through a four-step Electronic mail may be sent to [email protected]. procedure (Baron & Kenny, 1986; Judd & Kenny,

115 116 JUDD, KENNY, AND McCLELLAND

1981b) that involves testing whether the treatment than dealing with concomitant variables that vary effect is reduced in magnitude when the mediator is within participants (but see Khattree & Naik, 1995). controlled. The test of moderation involves a test of Certainly, recent literature on multilevel modeling the statistical between the treatment vari- and hierarchical linear modeling (Bryk & Rauden- able and the moderator, examining whether a variable bush, 1992; Kenny, Kashy, & Bolger, 1998; Snijders that is a product of the two predicts the outcome vari- & Bosker, 1999) has considered such time-vary ing able over and above the treatment and moderator concomitant variables, but again this literature has not (Aiken & West, 1991; McClelland & Judd, 1993). explicitly addressed the conditions under which these These analytic procedures, however, apply only to variables serve mediating and moderating roles. the situation in which the treatment variable varies Interestingly, in spite of an absence of a formal between experimental units, with some units or par- literature on within-subject mediation and modera- ticipants receiving one treatment and others receiving tion, there are several empirical articles that have at- another. However, treatment comparisons often in- tempted to evaluate mediational models with within- volve within-subject comparisons, examining every subject treatment variables (e.g., Fiske, Kenny, & experimental unit or participant in each treatment con- Taylor, 1982; Neuberg, 1989). In general, these ar- dition. For example, as a of evaluating the ticles have explored mediation by analyzing differ- effects of mood differences on judgment, the same ence scores in ordinary least squares regression mod- participants might be observed both in a positive els. But there has been no formal work that has mood condition and in a negative one. Or to evaluate developed the rationale for the use of such an ap- a new remedy for chronic migraine headaches, the proach. same participant might one day take a placebo pill and The purpose of this article is to provide a formal the next day the new remedy pill (or vice versa). A basis for estimating and testing mediation and mod- design in which the independent variable varies eration effects in within-subject designs, wherein the within participants rather than between them is par- observations under one level of the treatment variable are not independent of observations under the other ticularly useful when the effects of the independent 2 variable are relatively transitory and the independent level(s). Although much of what we present could be variable is easily varied. Such a design can often re- developed within the context of multilevel modeling sult in a dramatic increase in statistical power relative or hierarchical linear modeling procedures, we made to between-subjects designs, because variance due to the decision to explore these issues in the context of stable individual differences contributes to the error ordinary least squares regression for a number of rea- variance in testing between-subjects effects but not to sons. First, many researchers who are interested in the error variance in within-subject analyses. As a questions of mediation and moderation are likely to be result, within-subject designs often involve smaller much more conversant with and samples than those typically used in between-subjects designs. Surprisingly, there is very little work that has formally considered issues of mediation and modera- 1 Traditionally, in the analysis of variance literature, co- tion in within-subject designs (for an exception, see variates have been seen as variables assumed to be uncor- Judd, McClelland, & Smith, 1996). related with the between-subjects treatment variable. They In the analysis of variance literature, a variety of are included in an to increase statis- authors have dealt with the use of covariates in tical power. Because this assumption is not made for po- within-subject designs (for instance, Bock, 1975; Hu- tential mediators and moderators, we refer to them as con- itema, 1980; Khattree & Naik, 1995; Myers, 1979; comitant variables rather than as covariates. 2 Winer, Brown, & Michels, 1991), and some of this Although we generally refer to the designs we are con- literature is relevant.1 But this literature has not ex- sidering as within-subject designs, in fact the procedures plicitly addressed the conditions under which covari- that we develop are widely applicable whenever observa- tions are linked or nonindependent of each other because ates or, more generally, concomitant variables may be they come from the same participant, the same family, the considered to be either mediators or moderators of a same classroom, the same setting, or whatever induces de- within-subject treatment effect. In addition, this lit- pendence. In general, the designs considered are designs in erature has largely confined itself to concomitant vari- which the factor that causes the dependence, and under ables that are measured only once (i.e., between- which observations are nested, is crossed with the indepen- subject variables in within-subject designs) rather dent or treatment variable of interest. WITHIN-SUBJECT MEDIATION AND MODERATION 117 regression procedures that use ordinary least squares and a placebo pill another. The variables Y{ and Y2 are estimation than with multilevel modeling procedures. measures of pain taken after the administration of the Second, within-subject designs typically involve rela- placebo or drug. In a later section, we present hypo- tively few participants. As a result, the large-sample thetical data from such a design and provide illustra- assumptions underlying some estimation procedures tive analyses of these data. used in multilevel modeling may be untenable. Third, We start by considering the simple case without a the detailed examination of the within-subject media- potential mediator or moderator. We present more de- tion and moderation models we offer holds indepen- tail here than necessary to provide a basis for later dent of what statistical methods are used to estimate generalization. Then we introduce a concomitant vari- the parameters of those models. The analyses that we able that may serve as a mediator or moderator. We outline involve the use of difference scores in regres- first consider the case of a stable concomitant vari- sion models, thus providing a more formal basis for able, assumed to represent a stable individual- the procedures that researchers have occasionally differences attribute or other variable that varies be- used on an ad hoc basis. tween units but not across treatment levels. Finally, The within-subject case requires us to address a set we consider the case of a varying concomitant vari- of issues that are typically not addressed in the be- able, assumed to vary within units across treatment tween-subject case. Doing so will make clear certain levels as well as between units. assumptions underlying the definition and assessment Although we discuss estimation, in general we de- of mediation and moderation, both in within-subject fine models in terms of population parameters. These and between-subject designs. are designated with Greek letters, whereas their esti- mates are in the Roman alphabet. The Basic Model Case 1: No Concomitant Variable

We start with a relatively simple research design in Treatment effects in the case of between-subjects which there are two treatment conditions being com- experimental designs are estimated by asking whether pared. At a later point, we generalize our approach to the value of Y differs between units exposed to more complex designs involving multiple levels of a one treatment and those exposed to the other. In the within-subject treatment variable. In the two- within-subject case, however, we have Yl and Y2 mea- condition case, each unit (e.g., participant, respon- sured on each unit, because each unit is observed in dent, or case) is observed twice, once in each condi- each treatment condition. Formally, each response is 3 modeled as tion, and these observations are denoted Yl and Y2. The treatment effect is equal to the difference in the population between the mean of K, and the mean of (1) K2- We assume, for now, that there are no confound- ing factors between the observations, so the difference In these equations, Y and K are the Y measurements between the means is an unbiased estimate of the true u 2/ in each treatment condition for the fth participant. The treatment effect. Accordingly, following the model of as are expected values of each Y, and the ss are errors causal inference developed by Rubin (1974, 1978) or residuals for each observed Y value. According to and Holland (1986, 1988; see also West, Biesanz, & these equations, we are assuming that all observations Pitts, 2000), we assume both temporal stability (i.e., within a treatment condition are equal, except for er- the value of an observation does not depend on when the treatment is delivered) and causal transience (i.e., the effect of a treatment or of a prior measurement 3 does not persist over time). These are strong assump- In the text, we exclude the subscripts that designate unit tions that may not be met in many applications of or participant. These are given in the equations, however. 4 In many within-subject designs in psychology, re- within-subject designs in psychological research.4 searchers may use a Latin square design, counterbalancing We occasionally illustrate our discussion by refer- the order of treatment exposure across participants. This is ring to a hypothetical research design involving a drug commonly thought to yield unbiased estimates of the treat- versus placebo treatment comparison. In the design, ment effect when temporal stability cannot be assumed. At participants who chronically suffer from migraine a later point in this article, we examine the implications of headaches are given an experimental drug one day relaxing this temporal stability assumption in greater detail. 118 JUDD, KENNY, AND McCLELLAND ror, to a constant. In other words, the true scores are extent to each individual's two Y scores. We consider all equal within each condition. We assume that the two cases depending on whether the two within- errors or residuals (es) share a multivariate normal condition slopes are equal to each other, that is, distribution, each with a mean of zero and a common whether or not (3U = (32i- m variance, and, given dependence, a nonzero covari- Parallel functions. If pu = (32i Equation 4, ance between residuals from the same unit (typically then the two linear functions for the Ys, one in each of positive). the two conditions, are parallel. This situation is de-

If a,0 = a20, then there is no treatment difference picted in Figure 1 with two participants, P and Q. Pl on average, although there may be differences be- and 2, are the data points for these participants in tween the Ys for individual units as a result of eh not Condition 1, and P2 and Q2 are their data points in equaling e2/. Assuming that alo + a20, the treatment Condition 2. Accordingly, in this and subsequent fig- difference for each unit equals the difference between ures, both y, and Y2 are graphed on the common the preceding two equations: ordinate axis. Because each participant has only a single X value, both data points for a given participant (2) YDi = have exactly the same location on the jc-axis. In ad- It is assumed that the treatment difference, except for dition, with only two data points in each condition, we error, is constant for all units. The mean difference assume no error, so that all data points lie exactly on the two linear functions. The bottom line, labeled ~ a io (3) y, :X, represents the linear function in Condition 1; the is the treatment effect. In a sample, it is estimated as Y2'.X line is the function in Condition 2. The treatment effect for each participant is then the vertical distance the difference between the sample mean of Yi: and the between the two data points for each participant, rep- sample mean of Y2i, both computed across the partici- pants in the sample. resented by the two vertical dashed lines: between P, A test of whether this mean treatment effect differs and P2 and between Ql and Q2. Importantly, because from zero is provided equivalently by a paired- or of the equal slopes in the two conditions, these two dependent-samples t test, a one-factor repeated mea- within-subject treatment effects are equal to each sures analysis of variance, or a single-sample t test other. With X centered in the figure so that the mean testing whether the mean difference score differs from X score for the two participants equals zero, the solid zero. In our example, the mean pain difference be- vertical line represents the treatment effect on aver- tween the two experimental conditions estimates the overall treatment effect, and each participant is as- sumed to manifest this same pain difference, except for error. Case 2: A Stable Concomitant Variable Instead of assuming that the true scores are constant :X within each conditions, it may be more realistic to assume that they are related to one or more concomi- tant variables. We first consider the case of a single stable concomitant variable, X, assumed to be an in- dividual-differences attribute or some other variable that does not vary over the time frame of the experi- ment. In the context of our example, this variable 0 might be the participant's age or gender. Figure 1. Treatment effect (dashed vertical line) unrelated The Ys are now modeled as a function of this con- to a stable concomitant variable. P and Q are two units comitant variable: measured on the concomitant X variable and on Yl and Y2 in Treatments 1 and 2. The treatment effect (solid vertical line) is the distance between the two lines relating the two Y (4) scores, respectively, to the unvarying concomitant variable X. The parallel slopes indicate a constant treatment effect In other words, the X variable contributes to some that does not depend on X. WITHIN-SUBJECT MEDIATION AND MODERATION 119 age. And note once again that this is equal to the tions in Equation 4. For each unit, the difference be- vertical distance between the two linear functions. tween the two observations, one in Condition 2 and Nonparallel functions. Now consider what hap- one in Condition 1, equals pens if the two within-condition slopes are not equal = (P - 3 ) + (P - to each other, that is, (3 + (3 . This situation, again YD, = 20 10 21 n 21 + (e - e,,-), (5) with two participants, P and Q, is depicted in Figure 2/ 2. Here the slope of the Y2'.X linear function is more which represents the treatment effect for each unit positive than the slope for the X, :X linear function. In (plus error). There are two components to this treat- other words, X relates to Y in Condition 2 more ment effect, one that is common to all units, (320 - (310, strongly than it does in Condition 1. Again, the treat- and one that varies between units, (f321 - (3, ,)X,. If the ment difference for each participant is represented by two within-condition slopes are equal to each other, the dashed vertical line between the two Y:X func- PH = ^2i' men the second component of the treat- tions, at that participant's particular X value. Given ment effect equals zero. Hence, the treatment differ- the slopes in the figure, participant P has a smaller X ence is the same for all units. On the other hand, if the value than participant Q, and so the treatment differ- two within-condition slopes are unequal, then the ence for P is smaller than that for Q. Again, with X treatment effect depends on the value of X,. This de- centered, the mean X equals zero, and the average fines moderation. The magnitude of the treatment ef- treatment difference is represented by the solid verti- fect depends on X,. Equivalently, the Y:X slopes are cal line at this point on the Jt-axis. It necessarily equals different in the two conditions. the average of the two within-subject treatment dif- Estimation. Moderation due to a stable X is esti- ferences. mated by regressing the Y difference on X, thus esti- Figures 1 and 2 differ in that the within-condition mating the model in Equation 5: slopes are equal in the former and unequal in the + YDI - bo b\X; + eYDi. (6) latter. The situation in Figure 2 represents moderation of the within-subject treatment effect. With unequal The slope for X in this will exactly within-condition slopes, the relationship between Y equal the difference in the two slopes if separate re- and X depends on condition. Equivalently, the mag- gression models are estimated within each condition. nitude of the treatment effect—that is, the vertical Accordingly, a test of whether this slope differs from distance between the two functions—depends on the zero provides a statistical test of moderation. In other value of X. words, moderation of the within-subject treatment dif- Algebraically, the situation is captured by taking ference due to a stable X variable is indicated if X the difference between the two within-condition func- predicts the Y difference. For ease of interpretation, it is helpful in estimation to center X by deviating each unit's X score from the mean of X, as we did in Figures 1 and 2. If X has been centered, then the estimated Y difference is given by

YD, = b0 + bj(X, - X) + eYDi, (7) and the estimated mean difference equals

YD = Y2- Yj = bo + b^X- (8) or

YD = Y2 - F, = bo, (9) which equals the estimated treatment effect for a unit with an average value of X. Thus, with centered X, the estimated intercept equals the average treatment ef- fect.5 Figure 2. Treatment effect (dashed vertical line) moder- ated by a stable concomitant variable. P and Q and the regression lines are as in Figure 1. The nonparallel slopes indicate a varying treatment effect that depends on the con- 5 At first, there may appear to be an inconsistency be- comitant variable X. tween the -1,1 coding for the within-subject difference here 120 JUDD, KENNY, AND McCLELLAND

In the between-subjects case, moderation is tested now have X} in the first treatment condition and X2 in by computing a product term between a dichotomous the other, and these X scores, measured after the ad- treatment variable and the moderator and testing ministration of each treatment, may be influenced by whether that product variable predicts the outcome the treatment variable. As we did for Y, we make the once both the treatment variable and the moderator strong assumptions from Rubin and Holland of tem- have been partialed. In other words, in the between- poral stability and causal transience in these X mea- subjects case, moderation is shown by the familiar sures. Thus, any mean difference in X between the practice of testing whether an interaction, computed two treatment conditions is assumed to be a causal as a partialed product variable, predicts the outcome. effect of treatment. In addition, we assume that the X Given this context, moderation in the within-subject variables are causally prior to the Y variables in each case may seem a bit strange, because we are not form- treatment condition. ing a product variable and so one might wonder Just as we did in the case of the Y variables, we can whether an interaction is being tested or not. In fact, develop a model for the X variables: the model given in Equation 5 does imply an interac- X» = *io + ei,-. tion, because X predicts the treatment difference for (1Q) each unit only if (32, ^ Pin that is, only if the effect of A" on Y depends on whether the observation is in From these equations, we can derive the treatment Treatment Condition 1 or Treatment Condition 2. difference for each unit: In the context of our example, suppose the con- XDi = %2i ~XU = <>20 - ^lo) + (S2i ~ eu)- comitant variable was the participant's age. If the (11) within-condition relationships between pain judg- And the mean treatment effect in X equals ments and age are different, then moderation of the within-subject treatment effect by age is indicated. An assessment of whether such moderation is present In the context of our example, if the Y variables are would be provided by testing whether age is a pre- measures of subjective pain on each day, following dictor of each participant's difference between his or the administration of the experimental drug or the her pain ratings in the two conditions. If age does placebo, X might be the degree to which a certain predict the pain difference, then the magnitude of the hormone, presumed to be released by the administra- treatment difference depends on the participant's age. tion of the experimental drug, is found in the blood Equivalently, the relationship between pain and age is system on each of the 2 days. different in the two conditions. Each of the Y variables can be modeled as a func- Case 3: A Varying Concomitant Variable tion of the X variable measured in the same condi- tions:6 We now allow there to be variation in X within ,,. = 810 + s, , participants across the two treatment conditions. Just (13) as we have two measures of Y for each participant (7, in one treatment condition and Y2 in the other), we These two equations are very similar in form to those we presented for the case of the stable concomitant and the -.5 and .5 contrast coding recommended by Judd, McClelland, and Culhane (1995) and West, Aiken, and 6 As already stated, we are assuming that X in each con- Krull (1996) to estimate the treatment difference in the be- dition is causally prior to Y in each condition. Thus, in tween-subjects case. However, any apparent inconsistency modeling the X effects, we did not include Y variables as is entirely due to the difference between slope and intercept predictors. However, we do include X in modeling the Y estimates. If we were to compute a regression equation for variables. In addition, because of our assumptions of tem- each participant to estimate each participant's treatment ef- poral stability and causal transience, we can assume that in fect as a slope, it would equal (Z\,1})/Z\,2 = (-.5K, + the population the X in one condition exerts no causal effect .5y2)/(.25 + .25) = Y2 - Y,. This of course parallels the on the Y in the other. In other words, each Y is a function of between-subjects regression for_which the slope, using the the X from that same condition alone and is uninfluenced by -.5 and .5 codes, equals Y2 - Y,. Note also that the inter- the X measured in the other condition. At a later point at cept in a within-subject regression would equal the average which we consider additional complexities, we examine de- of the two observations. signs in which this assumption cannot reasonably be made. WITHIN-SUBJECT MEDIATION AND MODERATION 121 variable, the difference being that the X predictor vari- equals X2i. The total treatment effect for this unit able varies between the two treatment conditions; that equals the difference between Y2i and Ylt (the vertical is, there are two Xs, not one. Again, interpretation dashed line). But note that these two different Y val- depends on whether the within-condition slopes are ues, one in Condition 1 and one in Condition 2, are equal, that is, whether 8n = 822. We consider each associated with different X values (and the difference case in turn. between the two Xs is represented by the horizontal Parallel Junctions. If 8n = 822, then the two lin- dashed line). Part of the difference in the two Y values ear functions in Conditions 1 and 2 are parallel. Fig- for this participant derives from the fact that higher X ure 3 presents a graph of these two parallel slopes. values in both conditions lead to higher Y values, and The top sloping line is the function in Condition 2 the X value for the observation in Condition 1 is lower relating Y2 to X2. The lower line graphs the function in than the X value for the observation from this unit in Condition 1. As in the earlier two figures, both Ys are Condition 2. In other words, of the total treatment plotted on the common ordinate. Unlike the earlier difference between Y2i and Yu for this unit, part of it figures, where there was only one X, now both Xs are is due to the X difference between the two conditions, plotted on the common abscissa. Again, the important and part of it is a treatment difference over and above assumption in this figure is that the two slopes are that. The treatment difference that persists over and identical, that is, 8n = 822. above the X difference in the two conditions is indi- We have plotted the two observations from a single cated by the solid vertical line between the two lines, unit in Figure 3, participant Q, with observation <2i located above the participant's mean X value. from the first condition and observation Q2 from the The Y difference for any unit (e.g., unit Q in Figure second. Both observations lie perfectly on the two 3) equals the difference between the two functions in functions; accordingly, for purposes of illustration, we Equation 13: are assuming that their Y scores are measured without Y Y Y error.7 Note that the two observations from this unit DI = 2i ~ u = not only are at two different locations on the v-axis or z,- - £„•)• (14) ordinate axis (corresponding to their two different Y Assuming parallel slopes in the two conditions —tha t scores), but they also occupy two different locations is, 8U = 822 — this reduces to on the x-axis, because we are not presuming that Xu ($20 ~ ^10) + 2(- - Xu) + (e2l- - £„•), (15)

where 8, l is the common slope in the two conditions. The total treatment effect is then equal to the sum of two different components: (820 - 810) and 8U (X2, - XH). The first of these is the vertical distance between the two lines in the figure, at the mean level of X for the unit. The second of these two components repre- sents the part of the treatment difference in the Ys that is attributable to the X difference for that unit. In essence, the vertical distance between the two lines, at the value of the mean X for the unit, represents the treatment effect in the Ys over and above or adjusting XQ for the X difference between the two conditions for that unit. It represents an adjusted treatment differ- Figure 3. Treatment effect adjusted within units by ence for the unit, adjusting for the within-unit differ- changes in the concomitant variable. Q is a unit measured ences between the two Xs. twice on both the outcome variable Y and the concomitant variable X after Treatments 1 and 2. The total treatment effect is the difference between Yl and Y2 (the dashed ver- tical line). The adjusted treatment effect (the solid vertical 7 This is only an assumption of convenience to better line) is the distance between the two lines relating each Y illustrate the situation for this particular unit in the figure. In score, respectively, to its concomitant X variable at the general, there is assumed to be residual variation in all mean value of X for that unit. observations. 122 JUDD, KENNY, AND McCLELLAND

With multiple units, interpretation depends on whether or not there is a treatment effect for X on average. For simplicity, we first consider the case of no X treatment effect. Figure 4 presents two different participants, P and Q. As in Figure 3, unit Q has a positive X2i - X,,- difference. Given the positive rela- tionship between X and Y in the two conditions, this means that Qs adjusted treatment difference, over and above his X difference, is smaller than his total or unadjusted treatment difference. That is, the vertical distance between the two lines at his mean A" value is smaller than the total difference between his Y values. On the other hand, the adjustment for participant P runs in exactly the opposite direction; her X2i - Xu Figure 5. Treatment effect adjusted within units by difference is negative, whereas her total treatment ef- changes in the concomitant X with between-units mediation. fect, Y2i - Yu, is slightly positive. As a result, partici- P and Q, the regression lines, and the within-unit adjust- pant P's adjusted treatment difference is larger than ments are as in Figure 4. The difference in the means of X, the unadjusted difference. and X2 implies that the treatment effect adjusting for the For there to be no average treatment effect in X, the average change in the concomitant variable (the solid ver- degree of the adjustments must be exactly the same, tical line) is smaller than the average of the two unadjusted treatment effects (the two dashed vertical lines). except for sign, for P and Q. That is, the X2i - Xu differences are equal, but of opposite signs, for the of the Y treatment difference within each unit, there is two. Hence, the net adjustment collapsing across both no aggregate adjustment in the magnitude of the treat- units is exactly zero. Accordingly, the mean X- ment difference when X is controlled. adjusted treatment effect, portrayed by the length of Finally, consider Figure 5, again with two units, P the solid vertical line between the two functions at the and Q. In this case, both units have a positive X - X,, centered value of zero on the X axis, exactly equals 2i difference. Hence, on average across the two units, the mean of the two unadjusted treatment effects for there is a treatment difference in X as well as in Y. the two units. Although there is individual adjustment Accordingly, the two Xs are at different locations on the joint abscissa (which has been centered at their joint mean). Given a positive relationship between X

Q2 and Y in each condition, the adjustment in the mag- nitude of the treatment effect is such that the total treatment effect for each unit, Y2i - Yu, is larger than the treatment effect adjusted for the X difference. As a result, the average adjusted treatment difference, represented by the solid vertical line at the joint mean of the two Xs, is smaller than the average unadjusted Pi treatment effect. The situation that is graphically displayed in Figure X1 5 represents mediation. It is captured by the fact that X2 (a) there are, on average, treatment differences in both X and Y; (b) X and Y are related in each condition (we Figure 4. Treatment effect adjusted within units by assume that X has been scaled to have a positive re- changes in the concomitant X without between-units media- tion. P and Q are units measured on both X and Y in Treat- lationship with Y); and (c) the treatment effects in Y ments 1 and 2. On average across the two units, the means and X are in the same direction (assuming the scaling 8 of X, and X2 are equal; they both can be centered at zero of X just specified). In terms of our example, treat- without loss of generality. The average treatment effect ad- justed within-unit (the solid vertical line) equals the average of the two unadjusted treatment effects (the two dashed 8 It is possible that the mean X difference between the two vertical lines). conditions is in the opposite direction from the mean Y WITHIN-SUBJECT MEDIATION AND MODERATION 123

ment differences in hormone levels in the blood equal, moderation of the treatment effect is once again would be a mediator of treatment differences in pain indicated. Moderation, as shown earlier, has two judgments if (a) pain judgments are related to blood equivalent definitions: (a) The X:Y relationship de- hormone levels in each condition (scaled so that the pends on condition, and (b) the magnitude of the treat- relationship is positive) and (b) there are mean treat- ment effect depends on the value of X (in this case, ment differences in both variables in the same direc- assuming one common value). tion. In this case, the adjusted treatment effect in pain We now relax the assumption that XH = X2;, con- judgments would be smaller than the unadjusted one, tinuing to allow unequal within-condition slopes, so once those effects are estimated controlling for blood that the model as specified in Equation 14 holds: hormone levels. The specifics of the estimation are discussed later. = Y2i - - (820 - 810) 822X2, - Nonparallel functions. Let us return to the two + (e2i - E!,. equations relating Y and X in each condition (Equation It is useful to reexpress this equation in terms of the 13): Yu = 810 + 8UX, eu and Y2i = 520 + 522X2, corresponding difference and the sum of the two X + e2,. Their difference tells us about the treatment variables. This amounts to a 45° rotation of the two difference in Y for each unit (Equation 14): original ;t-axes. The preceding difference equation 9 Y Y 8 8 8 can be equivalently expressed as YDi = 2i ~ \i = (°20 ~ lo) + 22*2/ ~ ll^li + (e2/ - EU). 822-8n 8n +822 In our discussion so far, we have been assuming

that the two slopes are homogenous, that is, 5U = + (e2l--e1/), (17) 822. Now, however, we want to examine the conse- quences if these two within-condition slopes are not where XSi = Xu + X2i and XDi = X2i - Xu. In this full equal to each other, 5n + 822. To examine this situ- model, allowing unequal within-condition slopes as ation, for pedagogical purposes we initially make the well as X differences within participants, there are unrealistic assumption that there are no treatment dif- three different components to each unit's treatment ferences in X, either on average or at the level of effect in Y, ignoring error. individual units, that is, XH = X2i. Then the treatment First, there is a component that is common to all difference in Y specified in Equation 14 becomes units, 820 - 810. Its magnitude is unaffected by indi- vidual differences in the Xs. YD, = = (520 - 610) + (822 - SU)X,. Second, there is a component of each unit's treat- + (B2i - B,,.), (16) ment effect that varies as a function of the sum of each unit's two X scores, [(8 - 8, !)/2]X . Its contribution where X, is either Xu or X2i, as they are assumed to be 22 S( equal to each other. This model is identical to Equa- to the unit's treatment effect will be large depending tion 5 when we were considering the case of the stable on the magnitude of the condition differences in the concomitant variable. It captures what we earlier de- Y:X slopes. Assuming that X is scaled to have a posi- fined as moderation of the within-subject treatment tive within-condition relationship with Y and assum- effect and was graphically depicted in Figure 2. Ac- cordingly, when the within-condition slopes are un- 9 Any orthogonal basis can be transformed by an orthogo- nal matrix to another orthogonal basis (Cliff, 1987, pp. 320- 326). Solving difference, even given the scaling of X to have a positive Xs relationship with Y in each condition. In such a case, we *\ n n/*, might talk about inconsistent mediation, wherein the direc- X•J£ L-i uU2 tion of the indirect or mediated effect is opposite from the for X, and X2 and then substituting these values into the direction of the unmediated effect. In this case, the average original difference equations yields the reexpression. Such adjusted treatment difference in Y, over and above the treat- transformations have long been used for mathematical con- ment difference in X, would actually be larger than the total venience but often have substantive importance in factor treatment difference in Y. Although inconsistent mediation analysis and elsewhere. For a substantive example using the is possible, empirically it seems less likely than the case of 45° rotation to the sum and difference, see Coombs, consistent mediation described in the text. Coombs, and McClelland (1975). 124 JUDD, KENNY, AND McCLELLAND ing that 822 > 8,,, larger treatment differences will be the first, and this means that the magnitude of the found for units having larger average (or sum of) X treatment difference for each unit, represented by the values across the two conditions. This second com- dashed vertical line for each unit, increases as the ponent reflects moderation: With different average X unit's average X value increases. Thus, unit Q shows values, we get different treatment effects. Equiva- a larger treatment effect than does unit P, over and lently, the relationships between Y and X are not ho- above any within-unit adjustment due to differences mogeneous in the two conditions. in the unit's two X values. The final component of each unit's treatment effect Second, each unit has different X values in the two is that due to the difference between the two Xs, [(822 different conditions, and there are positive relation- + 8, ,)/2]XD,. It will be large whenever the sum of the ships between Y and X in both conditions. Hence, part two within-condition slopes is large and when there of the Y difference for each unit is due to the X dif- are large differences in the unit's X values. This com- ference between conditions. Adjusting for the X dif- ponent reflects the individual adjustment due to dif- ference within each unit leaves each unit's adjusted ferences between X2i and X,,- for each unit. When treatment difference (i.e., the vertical distance be- there is a large X difference in the same direction as tween the two lines at each unit's mean X). In addi- the Y difference, and X and Y are highly related, then tion, as in Figure 5, averaging across the two units, part of the individual's treatment effect in Y is due to there is an average treatment effect in X, such that the the individual's treatment effect in X. And if, on av- mean X value in Condition 2 is higher than the mean erage, across units, there is a mean treatment effect in X value in Condition 1. Given that the treatment effect X, then mediation is indicated. in Y is in the same direction and that the within- Figure 6 captures the full complexity, again for two condition relationships between Y and X are positive, perfectly measured participants, P and Q. First, there then on average, across units, there is mediation due is moderation of the treatment effect according to to X: The overall treatment effect in Y is partly due to each unit's average or typical X value in the two con- the treatment difference in X and is reduced when the ditions. This is revealed by the unequal slopes be- X difference is controlled. tween the two conditions, with the Y:X slope being As in previous figures, we have centered the jc-axis steeper in Condition 2 than in Condition 1. Thus, X so that the joint mean of X,, and X2i equals zero. The relates to Y more strongly in the second condition than solid vertical line at this point represents the average adjusted treatment difference, over and above any X difference, averaged across the two units and allowing for moderation. Note that it is smaller than the aver- X2 age of the two unadjusted treatment effects. Estimation. Having discussed this full model, we can now turn to estimation. We need to estimate three different models. First, Equations 2 and 11 should be estimated, testing for overall treatment effects in both Y and X. Second, the full model portrayed in Equation 14 should be estimated. However, the equivalent for- mulation, given in Equation 17, is much more ame- X1 nable to interpretation. Accordingly, one can estimate a single regression model in which the difference in X, 0 X X2 2 the two Y scores for each unit is regressed on two Figure 6. Between-units mediation of treatment effect predictors, the sum of that unit's X scores and the with moderation of treatment effect due to concomitant X. difference in that unit's X scores (with the Y and X All points and lines are as in Figure 5, except that the differences taken in the same direction): nonparallel slopes indicate a varying adjusted treatment ef- fect depending on each unit's average X score. Mediation is YDi = d2XD (18) shown by the fact that the average adjusted treatment effect (the solid vertical line) is smaller than the average of the two The regression coefficients in this model provide es- unadjusted treatment effects (the two dashed vertical lines). timates of the parameters in Equation 17. Thus, d0 Moderation is shown by the fact that the adjusted treatment estimates the component of the treatment difference effect depends on the average of X, and X2 for a given unit. that is constant across all units, 820 - 810. For most WITHIN-SUBJECT MEDIATION AND MODERATION 125 applications, we recommend centering the Xsi predic- Table 1 tor in the model, as we did in Figure 6. If Xsi has been Data Used in Example centered, then the estimated intercept, dQ, estimates Y, Y2 A #1 H2 the mean treatment effect over and above any media- 73 61 59 37 33 tion of that treatment effect due to X. In addition, this 57 55 60 30 28 estimate is at the mean value of XSi, allowing mod- 57 61 56 30 36 eration as a function of the magnitude of Xsi. If there 67 49 69 31 30 is an overall treatment effect in both Y and X, but the 80 75 67 37 35 intercept in Equation 18 is not different from zero 56 60 51 33 34 (given XSi centered), then complete mediation is in- 72 73 56 38 35 dicated. 81 65 60 43 29 The regression coefficient for the sum predictor 61 59 58 33 31 variable, d , estimates the difference in the two 67 48 61 20 17 l 74 64 58 43 41 within-condition slopes, (8 - 8 )/2. Accordingly, it 22 n 70 55 64 34 27 estimates the degree to which the treatment effect in Y 83 68 58 41 39 is moderated by a unit's average X score. Assuming 62 61 60 35 30 that X has been scaled to relate positively to Y in each 49 55 56 32 32 condition, then a positive value for this regression 74 79 63 35 37 coefficient indicates larger treatment effects for units 73 60 61 36 38 with larger average X scores. 46 51 59 25 24 Finally, the regression coefficient for the difference 60 55 54 26 25 predictor variable, d2, estimates the degree to which 93 71 62 46 39 each unit's treatment difference in Y is due to that unit's X difference. It represents the within-unit ad- justment in the magnitude of the Y difference due to judgments. We analyze hypothetical data to illustrate the X difference. If this regression coefficient is sig- our proposed analysis strategy. nificant (in a positive direction, given the scaling of Data from this example are presented in Tables 1 X), and if there is a mean difference in X between the and 2. Variables Yl and Y2 are the pain judgment two conditions, then mediation of the treatment effect scores for each participant in the two conditions (F, in in Y is indicated. the placebo condition and Y2 in the drug condition, both measures taken on a hypothetical 100-point scale In summary, first, the slope d{ for the sum XSi (= in which higher numbers indicate greater levels of X,, + X2,) estimates (822 - 8n)/2, half of the differ- ence of the within-condition slopes for the concomi- pain). Three other variables are measured. The first, tant variable X, and indicates moderation of the treat- A, is the participant's age, assumed to be unchanging ment effect in Y depending on the level of the X sum. for all participants during the course of the study. The second and third, Hl and H2, are measures of levels of Second, the slope d2 for the difference XDi (= X2/ - the hypothetical hormone in the blood of each partici- Xu) estimates (822 + S,,)/2, the average of the two within-condition slopes for the concomitant variable pant, measured after the administration of either the X, and indicates mediation of the treatment effect in Y, placebo or the drug. The hormone variables have been assuming a mean difference between the two Xs. scaled so that lower numbers indicate higher levels of Third, the intercept d0 estimates 820 - 810, which is 10 (assuming Xs, centered ) (a) the average treatment 10 effect over and above mediation, if d2 differs from It is a useful default practice to mean-center predictor zero, and (b) that treatment effect for a unit with an variables in moderator models so that zero will be a mean- average sum of the X values, if d} differs from zero. ingful value (Aiken & West, 1991). However, we specifi- cally do not do so for the difference predictor because its uncentered zero value is meaningful; it represents no dif- Illustration ference in X between the two treatment conditions. The intercept then equals the treatment effect in Y at the average Our example has been a hypothetical study involv- level of the Xs (assuming that the X sum is centered) and ing a drug treatment versus placebo comparison, ex- when the X difference equals zero (i.e., acting as if there amining effects of these treatments on chronic pain were no treatment difference in X). 126 JUDD, KENNY, AND McCLELLAND

Table 2 have more negative pain differences between Condi- Correlations, Means, and Standard Deviations Used tion 2 (drug) and Condition 1 (placebo). in Example The interpretation of the parameter estimates in this Variable 1 2 3 4 5 model is made simpler if age is centered from its i.y, — mean (A' = A - A): YDi = -6.50 - 1.06 A'. In this equation, the intercept equals the predicted treatment 2. Y2 .64 — 3. A .42 .06 — difference at the mean age (i.e., A' = 0), and it is 4.H, .74 .70 .10 — identical to the estimated treatment difference when 5.H2 .48 .71 -.05 .78 — age was not included as a predictor. A test of whether M 67.75 61.25 59.60 34.25 32.00 this treatment difference differs from zero yields f(18) SD 11.89 8.58 4.21 6.41 5.96 = 3.50, p < .01. Although this test of the mean treat- ment effect has one fewer degrees of freedom than the test of the treatment difference when age was not the hormone in the blood, and thus they are scaled to included as a predictor, it provides a more powerful have positive relationships with the pain variables: test of the overall treatment effect because age is Higher levels of hormone in each condition coincide strongly related to the treatment difference. with lower levels of pain. These variables are all mea- Turning to the two hormone variables, we want to sured for 20 hypothetical participants.'l know whether the hormone levels in the two condi- The overall effect of treatment on pain judgments is tions both mediate and moderate the treatment differ- ence in pain judgments. Initially, we regress each pain estimated by the mean difference between Y2 and Yl, which equals -6.50. A test of the null hypothesis that measure on the hormone measure from the same con- the true difference is zero yields /(19) = -3.15, p < dition: YU = 20.51 + 1.38 //,, and Y2i = 28.54 + 1.02 .01. Hence, in these data there is a statistically sig- H2i. As expected in the present research context, hor- nificant effect of the drug relative to the placebo, with mone levels in each condition are significantly related lower levels of pain reported in the drug condition. to pain levels, with higher hormone levels associated Next, we estimate this treatment effect considering with lower judgments of pain. (Remember that lower age as the concomitant variable and then considering scores on Y indicate less pain and lower scores on H hormone levels as the concomitant variables. indicate higher hormone levels.) When the pain difference is regressed on the hor- Regressing each pain score on age yields Yu = mone sum (Hs) and difference (HD), the following -3.39 +1.19 A,, and Y2i = 53.64 + 0.13 A,, In the first 2 estimate results:' YDi = 0.55 - 0.07//s, + \.22HDi. In equation, A is a marginally significant predictor of Yl, ?(18) = 1.98, p < .10; in the second equation, the this model, the hormone level difference is a signifi- slope for A is not significantly different from zero, cant predictor of the pain difference, t(ll) = 2.66, r(18) = 0.27, p> .50. p < .05, but the hormone sum is not. To aid interpre- When the pain difference, YD (Y2 - 3j), is regressed on A, the following equation results: YDi = 57.03 - 1.06 A. Here the slope for A is significantly different 1' Both types of concomitant variables (age and hormone from zero, f(18) = 2.36, p < .05. It equals the differ- level) could be included in one analysis. For simplicity of ence in the two slopes from the separate Y2 and Yl exposition, we treat them separately, just as we have done equations just described (i.e., -1.06 = .13 - 1.19). throughout. 12 Thus, the test of whether this slope differs from zero Given Equation 16 presented earlier, it may come as a shows equivalently that the slope for A in the Y2 equa- surprise that the intercept and slopes in this estimated model tion differs from the slope for A in the y, equation. As are not exact functions of the intercepts and slopes in the a result, we can conclude that there is a significant two equations estimated in each treatment condition. The algebra underlying Equation 16 presumes that the two Age x Drug Condition interaction, meaning that age crossed slopes (effect of H in one condition on P in the (A) relates more strongly to pain judgments (older other) are exactly zero, which they will be in the population participants report greater pain) in the placebo condi- if the assumptions that we made in the earlier section hold. tion than in the drug condition. Equivalently, we can Even though in the present data those crossed effects are not say that age moderates the treatment difference, with significant, they do not exactly equal zero. We discuss the larger differences between the drug and placebo con- role of these crossed slopes in a later section devoted to ditions for older participants; that is, older participants complications. WITHIN-SUBJECT MEDIATION AND MODERATION 127 tation, as previously explained, we reestimate this experimental situations, wherein the assumptions un- model after centering the sum variable: YDi = -3.77 derlying causal inference laid out by Rubin (1974, - 0.07 H'Si + 1.22 HDi, where H's is the centered sum. 1978) and Holland (1986, 1988) are very unlikely to Differences in hormone levels can be said to me- be met. diate the treatment difference in pain if two conditions are met. First, there must be a condition difference in Within-Subject Independent Variables With hormone levels that is in the same direction as the More Than Two Levels condition difference in pain (assuming that hormone The models developed earlier for the case of two levels have been scaled to have a positive relationship repeated measures readily generalize to studies with with pain judgments). The mean hormone level in the three or more levels of a within-subject independent drug condition is 32.00; in the placebo condition, it is variable. For example, a study of a treatment effect 34.25. The difference between these two is signifi- might include two different drugs being evaluated as cant, f(19) = 2.45, p < .05. Second, the hormone level well as a placebo control condition. Or a study of difference must be predictive of the pain difference, mood effects might involve comparisons among a as we have shown it to be. positive, negative, and neutral mood induced in par- The intercept in the preceding model in which the ticipants at three different times. And in these cases sum predictor has been centered, -3.77, estimates the there might also be a single concomitant variable mean difference in pain judgments between the two measuring a stable characteristic or a varying con- treatment conditions over and above the hormone dif- comitant variable measured in each treatment condi- ference, or acting as if the hormone difference were tion. With less detail, we show the generalization zero. Thus, it represents the residual treatment differ- from two to three levels of the within-subject treat- ence over, and above mediation. In this case, that re- ment variable and describe the general principles that sidual difference remains marginally significantly dif- also apply to more than three repeated measures. ferent from zero (p < .10). Nevertheless, it is First consider the case of three repeated measures necessarily less than the total or unmediated treatment without concomitant variables: difference because of the fact that the hormone dif- ference is a significant predictor in this model and there is a mean hormone difference between the two conditions. Y2i = (19) In conclusion, with these hypothetical data, we Y have evidence of a significant treatment difference in 3i = <*30 • pain judgments, with the age of the participant mod- For illustrative purposes, we assume that Yu and Y2i erating that difference. Older participants show a are measures of the dependent variable after admin- larger treatment effect. In addition, there is a signifi- istration of one of two drugs, whereas Y3i comes from cant treatment effect on levels of the blood hormone the placebo control condition. that serves as a mediator of the treatment difference in Procedures for assessing the treatment effect for pain. Over and above the treatment difference in hor- two repeated measures generalize to orthogonal con- mone levels, the residual pain difference between the trasts for three or more repeated measures. For ex- two treatment conditions is marginally significant. ample, the researcher might want to compare the two There is no evidence that average hormone levels of drug conditions with the placebo control condition the participant moderate the treatment effect. (l,l,-2)13 to determine whether there is an overall effect of the drugs. Then he or she might want to Elaborations and Complications compare the two drug conditions with each other (1,-1,0). We use these two contrasts for illustrative Having laid out the underlying logic and estimation purposes; many other pairs of orthogonal contrasts of mediation and moderation in within-subject de- are, of course, also possible. To obtain the treatment signs, we now turn to two different complicating is- sues. The first is that many within-subject treatment comparisons involve independent variables having 13 We represent contrasts with their coefficients. The more than two levels. The second is that within- contrast (1,1 ,-2) represents Y}, + Y2i - 2 Y3i, and the contrast subject comparisons are frequently conducted in non- (1,-1,0) represents Yti - Y2i. 128 JUDD, KENNY, AND McCLELLAND difference coded by each contrast, we take the corre- the relationship between X and Y varies across the sponding differences among the Ys: levels of treatment specified in the Y contrast. Equiva- lently, X moderates the magnitude of the treatment effect specified in the contrast, such that larger or 10 ' "20 - 2a30) + (e](- + e-u - 2e3/), smaller contrast differences are associated with larger (20) YD2i = Y\i- Y2i = ("10 - «2 or smaller X scores. In addition, if X has been cen- tered, the intercept estimates the contrast effect at the In other words, the two treatment differences of in- average level of X. terest, except for error, are constant for all of the Just as we did in the case of two treatment levels, observational units. The mean treatment effects ac- we might also have an X variable whose values vary cording to these two contrasts are then across the treatment conditions. If we assume, as we 2a30, did in the case of two treatments, that this X variable (21) manifests stability and causal transience, then treat- = a10 ~ a 20- ment differences in it are assumed to be causal effects, Next consider the case of a single stable concomi- as they are in the outcome variable. In this case, we tant variable. That is, can model each Y as a function of the X measured in that condition:

(22)

Y2i = (25)

Y Then the contrast differences are given by i = 830 e3,.

Y = Y Y Given these models, we can calculate the differences D\i \i 2i ~ due to each of the contrasts of interest: = (Pio + £20 - 2P30) + (Pn + P21 - 2p31)X, + (e,,. + e2,-2e3l.), (23) YDli~ Y2i ~ + Y =Y Y ~ (810 ^20 ~ 28 ) + §! [X,,. + 8 X ,. - 2833X3,. D2i \i~ 2i 30 22 2 + (eu + 82, - 2e3i) (26) = (Pio - P20) + (Pn - P2i)*,- + (e» - e2l-).

Now treatment differences according to each contrast YD2i~ Yli~ Y2i have two components (other than error). First, there is = (810 - S20) + 8UX,,. - 822X2,. + (e,(- - e2/). a constant effect for all units (represented by the in- As in the case of two treatment levels, these models tercept differences in the preceding equations). Sec- can be equivalently reexpressed as two models, with ond, the magnitude of each contrast difference de- three predictors in each model: the sum of the Xs, the pends on the unit's X value. Just as was the case when difference in the Xs according to the first contrast the independent variable had only two levels, if the weights (1,1, -2), and the difference in the Xs accord- relationships between Y and X are homogeneous = men tne ing to the second contrast weights (1,-1,0): within conditions (i.e., (3,, = (32i PsiX contrast effects do not depend on or are not moderated by X. In this case, the estimates of the contrast effects 8n + 822-: are the same whether or not X is controlled. However, ^30) Xs, if the relationships are not homogeneous, then the h2833 ^11 -8;22 contrast effects are moderated by X. The average con- Dli trast effects are (27)

YD2i — = (P,o + P20 - 2P30) + (Pn + P2i - (24) >io 820) V-YD2 = M-n - M-re = (Pio - P2o) + (Pn - Estimation is accomplished by regressing each con- + e trast difference on the X variable. A significant re- Xn,,+ %D2i ( li gression coefficient associated with X indicates that WITHIN-SUBJECT MEDIATION AND MODERATION 129

These unwieldy expressions reduce considerably if tenable. Instead, to overcome the threats to internal we assume that all within-condition relationships be- validity that violations of this assumption imply (e.g., tween X and Y are homogeneous, that is 8t, = 822 = history, maturation, regression toward the mean, in-

8J33, - strumentation, and testing threats; Campbell & Stan- ley, 1963; Judd & Kenny, 1981a), alternative orders of treatment administration are typically used, and 811+822 participants are randomly assigned to these alternative = (810 + 820-2830) + •^Dli orders. + (e,,- + s2(- — 2s3,-) (28) In still other cases, within-subject analyses are fre- quently used in cases in which causal inference is clearly impossible; that is, "treatments" are in no 8n +822 = (810-820) sense under the control of the experimenter, and ex- perimental manipulation of their levels is not conceiv- In other words, homogeneity implies that the contrasts able. Consider a researcher who is interested in gen- among the 7s are a function not of the sum of the Xs der differences in subjective happiness. He or she but only of the contrast among them that corresponds recruits married couples for the research and asks the to the contrast among the Fs that is being modeled. As spouses about their subjective happiness. To assess a result, whenever the sum of the Xs is found to be a gender differences in happiness in these couples, it is significant predictor of a Y contrast, moderation of entirely appropriate to conduct a repeated measures that treatment contrast by the Xs is implied. If there is analysis of variance or paired samples t test to ask a mean contrast difference for both the Xs and the 7s, whether the mean happiness of the men is different then the X contrast being predictive of the Y contrast from the mean happiness of the women. But clearly implies mediation. In this case, part of the overall the Rubin and Holland assumptions make no sense in contrast difference in the Ys can be attributed to the this context. Although we can ask whether the mean overall contrast difference in the Xs. difference is different from zero, causal inference to The preceding conclusions generalize to situations gender is impossible. having more than three levels of a within-subject And yet, it is entirely possible in such a situation treatment variable. Estimation in such cases involves that one might be interested in knowing whether some defining a series of orthogonal contrast scores among concomitant variable (stable across the two individu- both the Fs and the Xs. Then the Y contrasts are re- als in the couple, e.g., length of time they have been gressed on the sum of the Xs and the full set of cor- married) moderates the within-couple gender differ- responding contrast differences among the Xs. Mod- ence in subjective happiness. Perhaps couples who eration of a given contrast difference is indicated if have been married longer are less likely to show gen- the sum of the Xs is predictive of that given Y contrast. der differences in subjective happiness. Certainly, one Mediation is indicated if the corresponding X contrast can ask about this within-couple moderation using the is significant, having already shown mean differences procedures developed earlier, even though we would in both the Y and X contrasts of interest. not want to refer to it as moderation of a "treatment Complications When Assumptions Are Relaxed effect." Similarly, we could focus on some other concomi- Following the lead of Rubin (1974, 1978) and Hol- tant variable that possibly varies between gender land (1986, 1988), we have made the strong assump- within a couple (such as overall mental health) and tions of temporal stability and causal transience. In ask whether differences in mental health "mediate" fact, within-subject experimental designs are fre- the gender difference in subjective happiness. In this quently used in psychological research wherein these context, it seems that we should be particularly cau- assumptions are unlikely. There is considerable con- tious in using the term mediation because there seems sensus that such designs should not be used in cases in to be an implicit assumption in this term that an "ef- which causal transience is violated (i.e., treatment ef- fect" is "due to" the mediator. But certainly one can fects carry over and may affect observations at later reframe the question, avoiding the term mediation, points in time). However, the assumption of temporal and conduct analyses to ask whether the gender dif- stability (that responses are stable over time in the ference in subjective happiness is reduced once we absence of treatment differences) is frequently not control for gender differences in mental health. In 130 JUDD, KENNY, AND McCLELLAND light of this discussion, it seems appropriate to con- i ~ *2i Y\ i sider the ways in which the models that we have = (820 - 810) + (821 - 8, ,)XU + (822 outlined need to be modified once we move outside of (30) the experimental context. In general, analyses either without a concomitant So too is the model in which the difference in the Ys variable or with a single stable concomitant variable is modeled as a function of the sum of the Xs and the are identical to the models presented earlier. That is, difference in the Xs: the overall mean difference in Y between the two Y Y "treatment" levels (e.g., male vs. female spouses) is D — 2 ' , ,-812 estimated as the mean difference in Y. In addition, that = (820-S10) X•S, i difference can be regressed on some concomitant 8,, + 8 -8 , -8 variable, X, that is stable across the two "treatment 22 2 12 levels" (e.g., length of marriage of the couple) to de- termine whether it moderates the magnitude of the (31) difference in Y. If it does, then that means that X Now the effect of the sum of the Xs on the Y differ- relates to Y differently in the two "treatment levels." ence will be large to the extent that the two within- In terms of our example, it might be the case that level effects differ and to the extent that the two female spouses report, on average, significantly less crossed effects differ (in the same direction). The ef- subjective happiness than the male spouses. But this fect of the X difference on the Y difference will be difference may be less pronounced among couples large to the extent that the crossed effects are smaller who have been married longer than among more re- than the two within effects (assuming all 8s are posi- cently married couples. This would imply that length tive). of marriage relates differently to the men's self- The estimation of "mediation" and moderation can reports of subjective happiness than to women's self- still be accomplished by regressing the Y difference reports. on the X sum and X difference. If X serves as a mod- In the case of an X variable that varies across "treat- erator, then the X sum should predict the Y difference. ment levels," however, the models necessarily be- Equivalently, moderation implies that within and come a bit more complicated, because it may be the crossed effects of the X in one "treatment level" are case that the X in one "treatment level" exerts effects different from the within and crossed effects in the on the Y in the other level. Again, consider our ex- other. If the X difference predicts the Y difference, ample. Suppose we were interested in each spouse's then X can be said to "mediate" the Y difference, overall mental health status as both a potential "me- assuming that there is a meaningful mean difference diator" and moderator of the spousal difference in in X as well as in Y. Finally, if the X sum has been subjective happiness. It seems likely that each centered as a predictor, then the intercept in this re- spouse's mental health not only influences his or her gression model estimates the mean residual "treat- own subjective happiness but also the subjective hap- ment effect" in Y when the "treatment effect" in X piness of his or her marriage partner. In the previous equals zero, that is, over and above any difference in models with a varying X, we could make the assump- X. To clarify the nature of the within and crossed tion that such crossed effects were zero. Now we no effects in this context, we also recommend estimating longer can. Accordingly, each Y must be modeled as the separate Y models, regressing each Y on the two a function not only of X in the same "treatment level" Xs (one from the same "treatment level" and one from but also of X in the other "level": the other). In terms of the example that we have been devel- oping, the female-male difference in subjective hap- (29) Y 8 piness within each couple would be regressed on the 2i = 20 centered sum of the two mental health scores and the In these models, 8,, and 822 represent the within-level female-male difference in the two mental health effects, whereas 8:2 and 821 represent the crossed ef- scores. If greater within-couple differences in happi- fects. ness are associated with lower average mental health Given these models, the model for the difference scores for the two individuals, then the sum predictor between the two Ys is more complicated: would have a significant regression coefficient. This WITHIN-SUBJECT MEDIATION AND MODERATION 131 would imply moderation. If greater within-couple dif- be predictive of the Y difference. Equivalently, this ferences in happiness are associated with greater means that X relates to Y more strongly in one of the within-couple differences in mental health status, and treatment conditions than the other. In the estimated if on average there was a female-male difference in model, if X has been centered, then the intercept will mental health status in the same direction as the sub- equal the overall or average treatment effect, equiva- jective happiness mean difference, then mental health lent to the estimated treatment effect assessed inde- status might be said to "mediate" the subjective hap- pendently of moderation. piness difference. That is, part of the gender differ- When the concomitant variable is affected by the ence in happiness might be attributable to the gender treatment, then it may serve as a mediator of the treat- difference in mental health status. This interpretation, ment effect if it is causally prior to Y. In this case, X of course, depends heavily on an implicit causal is measured in each treatment condition, and there is model suggesting that mental health status is some- on average a mean difference in X between the two how causally prior to subjective happiness. In addi- conditions. This concomitant variable, measured in tion, we again issue the warning here that talking each treatment condition, may also serve as a mod- about "mediation" in this situation, where we clearly erator of the treatment effect, with treatment effects of cannot talk about causal effects of a treatment vari- different magnitudes depending on a participant's to- able, is highly suspect. At best, perhaps, we could tal or average X score. argue that gender differences are found in both mental To assess both mediation and moderation due to a health and subjective happiness and that, when the concomitant variable that varies between treatment former difference is controlled, the magnitude of the conditions, one regresses the Y difference on both the latter one is diminished. In the absence of clear con- X sum and the X difference. Assuming that there is an ditions that establish the causal effect of a treatment overall treatment effect on X and that the X difference variable, "mediational" analyses can ultimately do predicts the Y difference, mediation of the treatment nothing to help make the causal arguments that are effect in Y by X is indicated (assuming that X and Y implicit in the terminology used.14 are scaled to have a positive relationship and the treat- ment effects in Y and X are in the same direction). If Summary and Conclusion the X sum is predictive of the Y difference, then X also serves as a moderator of the treatment effect and, In this concluding section, we have three main equivalently, X relates to Y differently in the two treat- goals. First, we want to summarize the analytic rec- ment conditions. Finally, if the X sum has been cen- ommendations we have made for assessing mediation tered, then the intercept in this regression equation and moderation in within-subject designs. Second, in will equal the magnitude of the residual treatment addition to the complications discussed in the previ- difference in Y, over and above mediation due to X ous section, we want to mention some caveats and and at the mean value of X. In other words, it will concerns that need to be raised whenever the analyses equal the portion of the mean treatment effect that is we have outlined are conducted. Finally, we conclude not mediated through X. with some general comments about ways in which the In cases in which the treatment variable has more assessment of mediation and moderation in within- than two levels, this analytic approach can be accom- subject designs permits us to think more clearly about these processes and how they are assessed in between- subjects designs. 14 Given the content of the example we have been pur- Summary of Analyses suing, it makes sense to link the "mediation" and modera- tion model we have been developing with the actor-partner The assessment of overall treatment effects in effect model that Kenny and colleagues have developed within-subject designs involves estimating the mean (Kashy & Kenny, 2000; Kenny, 1996). That model is in fact given by Equation 29. In our model, wherein the difference difference between the two treatment conditions in Y, is regressed on the sum and the difference, the regression the outcome variable. To assess whether the magni- coefficient for the difference estimates the extent to which tude of this treatment effect is moderated by some the partner effects are smaller than the actor effects. The stable concomitant variable, X, the Y difference is coefficient for the sum estimates the extent to which the regressed on that concomitant variable. If the magni- effects on one target (both actor effect and partner effect) tude of the treatment effect depends on X, then X will are greater than the effects on the other target. 132 JUDD, KENNY, AND McCLELLAND plished by computing contrast differences in both Y have outlined could be conducted within a structural and X. One then focuses on the analysis of individual equation approach that allows for multiple indicators Y contrasts among the treatment conditions, regress- and controls for errors of measurement. Such an ap- ing these on the X sum as well as the set of X contrast proach would need to model the means of the vari- differences. Mediation of a particular contrast differ- ables as well as their and covariances (see ence in the Ys is indicated if there is a parallel X Kline, 1998, pp. 293-307). difference and if the Y contrast difference is predicted by that same X contrast difference. Concluding Comments Our approach, as just summarized, relies exclu- Assessment of mediation and moderation in within- sively on estimates provided by ordinary least squares subject designs suggests some refinement in our regression models. Estimation can be conducted thinking about these processes in both these designs through any general purpose multiple regression pro- and designs in which the treatment variable varies gram. As we noted in the introduction, much of what between participants. We conclude with two such we have presented could also have been developed thoughts. within the context of multilevel modeling, using pro- First, the within-subject analysis with a varying X grams that rely on other estimation strategies (e.g., that can serve as both a moderator and a mediator is HLM or Proc Mixed in SAS). We chose the ordinary conducted by assessing whether the magnitude of the least squares approach simply because most psycho- treatment difference in Y depends on two different logical researchers are much more familiar with re- linear functions of the Xs: their sum and their differ- gression and analysis of variance procedures than ence. Moderation is indicated if the treatment differ- with multilevel modeling. ence in Y depends on the X sum. Mediation is indi- Further Concerns cated if the treatment difference in Y depends on the X difference, assuming overall treatment effects in In the initial sections of this article, we made a both X and Y. Thus, in a deeper sense, both modera- series of assumptions when we laid out the basic tion and mediation suggest that the magnitude of the model. These included that any treatment differences Y treatment difference varies and is predictable from in Y and X were causal in nature and were estimated some function of the Xs. Although we are convinced without bias. An additional assumption is important of the utility of thinking about moderation and me- if, in fact, one wishes to use the proposed analyses to diation as fundamentally different processes, they argue for mediation. It must be the case that X is both imply treatment effects varying in magnitude in causally prior to Y; that is, changes in X induced by predictable ways as a function of a concomitant vari- the treatment cause changes in Y between the two able. treatment conditions. Because X is a measured vari- Second, our analyses suggest that when X varies able rather than one that is manipulated in the designs between treatment conditions and Y is a function of we have been considering, this assumption cannot be that X in each condition, the full model necessitates critically examined without additional research in including both the X sum and the X difference as which X is actually manipulated and its effects on Y predictors. To assess the impact of one of these with- assessed experimentally. out the other also included in the model is to risk Implicitly, we have also been assuming that both Y biased assessments of either mediation or modera- and X are measured without error. Random errors of tion.15 The same presumably should apply in be- measurement, particularly in the concomitant X vari- tween-subjects assessments of mediation and modera- able, will result in biased assessments of both media- tion. Furthermore, in cases in which crossed effects tion and moderation. In the case of moderation due to may exist (i.e., the X in one treatment condition af- a single stable X, measurement error in X will attenu- fects Y in the other), analyses in between-subject de- ate the estimate of moderation. In the case of a con- comitant X that varies between the two treatment con- ditions, the situation is more complicated because 15 If the Xs in the two conditions have equal variances, it both mediation and moderation are assessed in the can be shown that their sum and difference are uncorrelated same model. Random errors of measurement in this (Kenny, 1979, p. 21). Accordingly, in this equal variance case can lead to either an underestimation or an over- case, there will be no bias if estimation includes only the estimation of these effects. All of the analyses that we sum or only the difference as predictor. WITHIN-SUBJECT MEDIATION AND MODERATION 133 signs may be misleading because such crossed effects Huitema, B. E. (1980). The analysis of covariance and cannot be estimated. In this sense, one can think of alternatives. New York: Wiley Interscience. within-subject designs as containing considerably James, L. R., & Brett, J. M. (1984). Mediators, moderators, more information than between-subjects designs. Or, and tests for mediation. Journal of Applied Psychology, equivalently, one can think of between-subjects de- 69, 307-321. signs as within-subject designs with half of the data Judd, C. M., & Kenny, D. A. (198la). Estimating the effects from each participant randomly missing. of social interventions. New York: Cambridge University As with any analytic approach to data, the models Press. we have outlined should be applied only after careful Judd, C. M., & Kenny, D. A. (1981b). Process analysis: consideration of their appropriateness and of the as- Estimating mediation in treatment evaluations. Evalua- sumptions that underlie them. Mediation and modera- tion Review, 5, 602-619. tion of treatment effects are clearly important analytic Judd, C. M., McClelland, G. H., & Culhane, S. E. (1995). issues to be addressed in both between-subjects and Data analysis: Continuing issues in the everyday analysis within-subject designs. And we hope to have clarified of psychological data. Annual Review of Psychology, 46, methods in the latter case. But these analyses make 433-465. sense only if the many assumptions that underlie them Judd, C. M., McClelland, G. H., & Smith, E. (1996). Test- can realistically be made. ing treatment by covariate interactions when treatment varies within subjects. Psychological Methods, 1, 366- References 378. Kashy, D. A., & Kenny, D. A. (2000). The analysis of data Aiken, L. S., & West, S. G. (1991). Multiple regression: from dyads and groups. In H. T. Reis & C. M. Judd Testing and interpreting interactions. Newbury Park, (Eds.), Handbook of research methods in social and CA: Sage. personality psychology (pp. 451-477). Cambridge, Baron, R. M., & Kenny, D. A. (1986). The moderator- England: Cambridge University Press. mediator variable distinction in social psychological re- Kenny, D. A. (1979). Correlation and causality. New York: search: Conceptual, strategic, and statistical consider- Wiley Interscience. ations. Journal of Personality and Social Psychology, 51, Kenny, D. A. (1996). Models of interdependence in dyadic 1173-1182. research. Journal of Social and Personal Relationships, Bock, R. D. (1975). Multivariate statistical methods in 13, 279-294. behavioral research. New York: McGraw-Hill. Kenny, D. A., Kashy, D. A., & Bolger, N. (1998). Data Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical analysis in social psychology. In D. T. Gilbert, S. T. linear models: Applications and data analysis methods. Fiske, & G. Lindzey (Eds.), The handbook of social Newbury Park, CA: Sage. psychology (4th ed., Vol. 1, pp. 233-268). Boston: Campbell, D. T., & Stanley, J. C. (1963). Experimental and McGraw-Hill. quasi-experimental designs for research. Chicago: Rand Khattree, R., & Naik, D. N. (1995). Applied multivariate McNally. with SAS software. Gary, NC: SAS Institute. Cliff, N. (1987). Analyzing multivariate data. San Diego, Kline, R. B. (1998). Principles and practice of structural CA: Harcourt Brace Jovanovich. equation modeling. New York: Guilford Press. Coombs, L. C., Coombs, C. H., & McClelland, G. H. MacKinnon, D. P., & Dwyer, J. H. (1993). Estimating (1975). Preference scales for number and sex of children. mediated effects in prevention studies. Evaluation Re- Population Studies, 29, 273-298. view, 17, 144-158. Fiske, S. T., Kenny, D. B., & Taylor, S. E. (1982). Struc- McClelland, G. H., & Judd, C. M. (1993). Statistical diffi- tural models for the mediation of salience effects on culties of detecting interactions and moderator effects. attribution. Journal of Experimental Social Psychology, Psychological Bulletin, 114, 376-390. 18, 105-127. Myers, J. L. (1979). Fundamentals of experimental design Holland, P. W. (1986). Statistics and causal inference. Jour- (3rd ed.). Boston: Allyn & Bacon. nal of the American Statistical Association, 81, 945-960. Neuberg, S. L. (1989). The goal of forming accurate Holland, P. W. (1988). Causal inference, path analysis, and impressions during social interactions: Attenuating the recursive structural equation models. In G. C. Clogg impact of negative expectancies. Journal of Personality (Ed.), Sociological methodology 1988 (pp. 449-484). and Social Psychology, 56, 374-386. Washington, DC: American Sociological Association. Rubin, D. B. (1974). Estimating causal effects of treatments 134 JUDD, KENNY, AND McCLELLAND

in randomized and nonrandomized studies. Journal inference and generalization in field settings: Experimen- of Educational Psychology, 66, 688-701. tal and quasi-experimental designs. In H. T. Reis & C. M. Rubin, D. B. (1978). for causal effects: Judd (Eds.), Handbook of research methods in social and The role of randomization. Annals of Statistics, 6, 34-58. personality psychology (pp. 40-84). Cambridge, Snijders, T. A. B., & Bosker, R. J. (1999). Multilevel analy- England: Cambridge University Press. sis: An introduction to basic and advanced multilevel Winer, B. J., Brown, D. R., & Michels, K. M. (1991). modeling. London: Sage. Statistical principles in experimental design (3rd ed.). West, S. G., Aiken, L. S., & Krull, J. L. (1996). Experimen- New York: McGraw-Hill. tal personality designs: Analyzing categorical by continu- ous variable interactions. Journal of Personality, 64, Received February 3, 1999 1-48. Revision received January 5, 2001 West, S. G., Biesanz, J. E., & Pitts, S. C. (2000). Causal Accepted January 8, 2001 •

AMERICAN PSYCHOLOGICAL ASSOCIATION

SUBSCRIPTION CLAIMS INFORMATION Today's Date:

We provide this form to assist members, institutions, and nonmember individuals with any subscription problem. With the appropriate information we can begin a resolution. If you use the services of an agent, please do NOT duplicate claims through them and directly to us. PLEASE PRINT CLEARLY AND IN INK IF POSSIBLE.

PRINT FULL NOME OR KEY NAME OF INSTITUTION MEMBER OR CUSTOMER NUMBER (MAY BE FOUND ON ANY PAST ISSUE LABEL)

ADDRESS DATE YOUR ORDER WAS MAILED (OR PHONED)

PREPAID CHECK CHARGE CITY STATE/COUNTRY ZIP CHECK/CARD CLEARED DATE:_

(If possible, send a copy, front and back, of your cancelled check to help us in our YOUR NAME AND PHONE NUMBER research of your claim.) ISSUES: MISSING DAMAGED

TITLE VOLUME OR YEAR NUMBER OR MONTH

- (TO BE FILLED OUT BY APA STAFF) - DATE RECEIVED:, DATEOFACTION:_ ACTION TAKEN:_ IN V. NO. & DATE: STAFF NAME: LABEL NO. & DATE:.

Send this form to APA Subscription Claims, 750 First Street, NE, Washington, DC 20002-4242 or FAX a copy to (202) 336-5568. PLEASE DO NOT REMOVE. A PHOTOCOPY MAY BE USED.