Moderation Analysis by Elaine Eisenbeisz
Vol. 14, No. 2, February 2018
“Happy Trials to You”

Making Sense of Biostatistics: Moderation Analysis
By Elaine Eisenbeisz

This column continues last month’s discussion of mediation. The simplest research studies compare treatment arms and endpoints. But often, there are one or more lurking variables that might explain some or all of the effects between the treatment arms and endpoints. These variables are called “effect modifiers” because, well, that is what they do. Variables like gender, age, ethnicity, baseline measurements, or any other variables that are associated with an outcome could be influencing the outcomes, in addition to, or more than, the treatment.

Moderators are one of many types of effect modifiers. A moderator is a variable that, when included in an analysis, affects the strength of the relationship between a treatment and an endpoint. A moderator can be either continuous or categorical.

A good way to think about the difference between a mediator variable and a moderator variable is:

Mediators speak to how or why an effect between a treatment and outcome occurs. Mediators happen within the intervention-to-endpoint process. Mediators explain the relationship between two other variables.

Moderators influence the strength of a relationship between the treatment and outcome. They might even determine when certain effects hold at all, depending on the levels of the moderator.

Like mediators, moderators differ from confounders. A confounder is something that occurs outside the intervention-to-endpoint process. Often, the randomization process can account for confounders. In contrast, with moderators, there is a different relationship between the treatment and outcome at different levels of the moderator. The effect of moderators can’t be randomized out of a study because moderators are inherent to the relationship between the treatment and the outcome.
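To make "a different relationship at different levels of the moderator" concrete, the regression form that this column introduces with Figure 1B can be written out as a rough sketch. The intercept \beta_0 and the error term \varepsilon are assumptions added here for completeness, as is the pairing of \beta_1, \beta_2, and \beta_3 with X, Z, and XZ in the order the column lists them.

```latex
\begin{aligned}
Y &= \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 XZ + \varepsilon \\
  &= \beta_0 + (\beta_1 + \beta_3 Z)\,X + \beta_2 Z + \varepsilon
\end{aligned}
```

Under these assumptions, the effect of X on Y is \beta_1 + \beta_3 Z, so it changes with Z whenever \beta_3 is not zero. If \beta_3 = 0, the effect of X is \beta_1 at every level of Z and there is no moderation.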
In a study of the effect of a diet drug on weight loss, age, gender, and co-morbidities can moderate the effect through their impact on metabolic rate, lifestyle, body image, etc. Randomization can prevent these factors from distorting the statistical analysis, but it cannot explain their effects on the results; for that, you need to analyze effect modifiers like moderators. In other words, you can randomize for age and also consider its impact as a modifier. By understanding the effects of modifiers, you might identify an important sub-population, discover the real cause of an outcome, or estimate the impact of inadequate randomization.

Figure 1 compares a straightforward treatment-to-outcome model to a simple moderated model. With multiple moderators, the models get complicated, so we will keep it simple today and look at a model with just one moderating variable.

Subscribe free at www.firstclinical.com
© 2018 First Clinical Research and the Author(s)

Figure 1A. A simple model with treatment effect (X) on outcome/endpoint (Y). The moderating effect (Z) influences the strength of the relationship between X and Y.

Figure 1B. The model of a regression analysis that includes the two predictors X and Z, along with the interaction between X and Z (XZ), on the outcome (endpoint) Y. The effects in this regression model are measured with the unstandardized (raw) coefficients (β1, β2, β3). (Diagrams from Fairchild and MacKinnon, 2009.)

Checking for moderation is a fairly straightforward four-step process:

1. Create a term for the interaction of the X and Z variables. To make an interaction term, make a new variable that is the product of the X and Z variables. Just multiply the X and Z variables together to make XZ.

2. Add the XZ variable into the regression model with your other predictors, including X and Z themselves.

3. If the coefficient of the XZ variable is statistically significant, then you have evidence of moderation.

4. If the interaction is significant, indicating moderation, then make a graph to better see what is happening.

a) If Z is categorical and X is also categorical, make a graph of the mean outcome (Y) for each level of X at each level of Z.

b) If Z is categorical and X is continuous, plot the regression line of Y on X at each level of the moderator.

c) If Z is continuous, you could, in theory, have infinitely many values of Z at which to plot X, so you’d have an infinite number of regression lines. Plotting all these lines is not practical, so determine from your knowledge or other sources the best values of Z to use, and plot the regression lines for X at those values.

d) If you don’t have a set of values in mind for Z, then a common convention that usually works is to plot the regression line for X at three levels of Z: (1) the mean of Z, (2) one standard deviation above the mean of Z, and (3) one standard deviation below the mean of Z.

e) Finally, if both the Z and X variables are continuous, use the conventions of (c) and (d) above for both the Z and X variables.

When you plot the interaction, you will see on your graphs that the means or slopes for X are not the same at each value of Z. Visually, the lines will be anything but parallel. One line might increase while another decreases. They might even cross each other. They will be different!

Figure 2 shows an example of an interaction effect in a study of college students that tested the effect of negative social contacts (X), for instance, arguments with friends, on the number of drinks those students have at home (Y). The moderator (Z) is whether the student drinks to cope (“DTC”). In this study, Z is a categorical variable (high DTC vs. low DTC) and X is a continuous variable, so the authors chose seven points of X to plot their lines:

Figure 2. Negative social contacts (e.g., arguments with friends) are associated with increased drinking at home for college students who say they drink to cope (e.g., to forget about problems). In contrast, negative social contacts are unrelated to drinking at home for students who do not say they drink to cope. (Mohr et al., 2005, borrowed from Dr. Adam Butler: https://sites.uni.edu/butlera/courses/org/modmed/moderator_mediator.htm)

Caveat (revisited)

As with the caveat on mediation analysis, moderation analysis can support the hypothesis that moderation is present, but it cannot prove its presence. The reason is that any number of other models could be built to show other associations with other moderators. Additional research might further support the hypothesis that the moderation exists, but statistics deals with likelihood, not certainty.

References

An easy-to-read blog post about moderation and graphing the effects, from my colleague Karen Grace-Martin at The Analysis Factor: “3 Tips to Make Interpreting Moderation Effects Easier.” https://www.theanalysisfactor.com/3-tips-interpreting-moderation/

Baron, R. M., & Kenny, D. A. (1986). “The moderator-mediator variable distinction in social psychological research: Conceptual, strategic and statistical considerations.” Journal of Personality and Social Psychology, 51, 1173–1182.

Fairchild, A. J., & MacKinnon, D. P. (2009). “A General Model for Testing Mediation and Moderation Effects.” Prevention Science: The Official Journal of the Society for Prevention Research, 10(2), 87–99. http://doi.org/10.1007/s11121-008-0109-6

Software: PROCESS macro by Andrew Hayes (SPSS and SAS). http://www.processmacro.org/index.html

Author

Elaine Eisenbeisz, BS, is the Owner & Principal Statistician of Omega Statistics. Contact her at 1.951.461.7226 or [email protected].
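As a worked illustration of the four-step check described in this column, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for illustration: the data are simulated, the "true" coefficients are invented, the helper names `invert` and `fit_moderation` are hypothetical, and the significance test uses a large-sample normal approximation rather than an exact t-test. It is not the PROCESS macro and not the author's own procedure.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(k)] for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def fit_moderation(x, z, y):
    """Ordinary least squares for Y = b0 + b1*X + b2*Z + b3*XZ.

    Returns (coefficients, standard errors), each ordered [b0, b1, b2, b3].
    """
    # Steps 1 and 2: build the product term XZ and include it with X and Z.
    rows = [[1.0, xi, zi, xi * zi] for xi, zi in zip(x, z)]
    n, k = len(rows), 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    se = [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]
    return beta, se


# Simulated data (an assumption, not from any study): X is a 0/1 treatment
# indicator, Z is a continuous moderator, and the invented interaction is 0.8.
random.seed(0)
n = 500
x = [float(i % 2) for i in range(n)]
z = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [1.0 + 0.5 * xi + 0.3 * zi + 0.8 * xi * zi + random.gauss(0.0, 1.0)
     for xi, zi in zip(x, z)]

beta, se = fit_moderation(x, z, y)

# Step 3: test the XZ coefficient (normal approximation, reasonable for large n).
z_stat = beta[3] / se[3]
p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z_stat)))

# Step 4: effect of X on Y at the mean of Z and one SD either side of it.
z_mean, z_sd = mean(z), stdev(z)
slopes = {
    "mean - 1 SD": beta[1] + beta[3] * (z_mean - z_sd),
    "mean": beta[1] + beta[3] * z_mean,
    "mean + 1 SD": beta[1] + beta[3] * (z_mean + z_sd),
}

print("interaction coefficient:", round(beta[3], 3), "p-value:", p_value)
for level, slope in slopes.items():
    print("effect of X at Z =", level, "->", round(slope, 3))
```

The three values in `slopes` are the quantities one would plot under the mean and plus or minus one standard deviation convention. In a real analysis, a statistics package would normally replace the hand-rolled least squares and would report an exact t-test for the interaction term.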