
Multilevel Modeling

Simon Sherry1 and Anna MacKinnon2
1Department of , Dalhousie University, Halifax, NS, Canada
2Department of Psychology, McGill University, Montreal, QC, Canada

Synonyms

Hierarchical linear modeling (HLM); Mixed-effects modeling; Random-coefficient regression modeling; Random-effects modeling

Definition

Multilevel modeling is a data analysis technique used to analyze nested data. Nested data refers to data wherein units of analysis at one level are nested within units of analysis at higher levels. Multilevel data are observed in cross-sectional designs which sample individuals nested within groups. An example of this type of multilevel data is patients (level 1) nested within hospitals (level 2). Multilevel data are also found in repeated measures designs (e.g., multiwave longitudinal or experience sampling designs) which sample repeated reports nested within individuals. An example of this type of multilevel data is an experience sampling study where repeated reports of pain (level 1) are nested within individuals (level 2). In multilevel modeling, level thus refers to the structure of the data. The lower level (level 1) represents the most detailed unit of analysis and has the greatest number of data points. Level 2 represents the higher level within which level 1 observations are nested.

Description

Why Is Multilevel Modeling Necessary?

Multilevel modeling is necessary because nested data structures violate the assumption of independence required by traditional, single-level data analysis techniques such as ordinary multiple regression. That is, in nested data, observations at the lower level (level 1) are not independent. For example, individuals (level 1) sampled from the same neighborhood (level 2) may be more similar than individuals sampled from a different neighborhood. Single-level data analysis techniques often fail to take into account such a nested data structure and either ignore the nested data structure (violating the assumption of independence) or collapse across the levels of the nested data structure (ignoring potentially meaningful variability in the data). Violating the assumption of independence may result in underestimation of standard errors and inflation of type I error rates.

In contrast, multilevel modeling allows for data to be analyzed at one level while accounting for variance at other levels. Maximum likelihood algorithms are typically used in multilevel analyses, which allow for simultaneous estimation of multiple error terms. As a result, standard errors are more accurate, and type I error rates are not inflated. In addition, multilevel modeling enables unique types of analyses. Multilevel analyses are similar to single-level regression analyses, where intercepts and slopes are calculated. However, unlike single-level regression analyses, multilevel modeling permits cross-level analyses, wherein a level 2 predictor is used to predict a level 1 outcome. For example, an investigator may test if the neighborhood people live in (level 2) predicts their obesity (level 1). Although such computations are complex, there are many software programs which perform multilevel modeling, including HLM, LISREL, MLwiN, MPlus, R, SAS, SPSS, and Stata.

Considerations When Using Multilevel Modeling

Power. Although sample sizes at both levels warrant consideration, in general, sample size at the higher level has a greater influence on power than sample size at the lower level. For example, in experience sampling studies, the number of participants (level 2) has a greater influence on power than the number of reports per participant (level 1).
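The point about power can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions, not a method from this entry: it assumes a balanced two-level design with a random intercept only, in which the sampling variance of the estimated grand mean is tau2/J + sigma2/(J*n), where J is the number of level 2 units, n is the number of level 1 reports per unit, tau2 is the between-unit variance, and sigma2 is the within-unit variance. The function name and all numeric values are hypothetical.

```python
import math


def grand_mean_se(n_groups, n_per_group, tau2, sigma2):
    """Standard error of the grand mean in a balanced random-intercept design.

    Assumes variance = tau2 / J + sigma2 / (J * n), with J = n_groups
    (level 2 units) and n = n_per_group (level 1 reports per unit).
    """
    variance = tau2 / n_groups + sigma2 / (n_groups * n_per_group)
    return math.sqrt(variance)


# Hypothetical variance components chosen only for illustration.
tau2, sigma2 = 1.0, 4.0

baseline = grand_mean_se(50, 10, tau2, sigma2)            # 50 participants, 10 reports each
more_participants = grand_mean_se(100, 10, tau2, sigma2)  # double level 2 sample size
more_reports = grand_mean_se(50, 20, tau2, sigma2)        # double level 1 sample size

print(f"baseline SE:          {baseline:.4f}")
print(f"doubled participants: {more_participants:.4f}")
print(f"doubled reports:      {more_reports:.4f}")
```

Under these assumptions, adding participants shrinks both variance terms, whereas adding reports per participant shrinks only the within-unit term, so the standard error falls further when level 2 sample size is doubled. This concerns the precision of the intercept; effects of purely level 1 predictors may respond differently to the two sample sizes.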

Intraclass correlation. It is also necessary to test if multilevel modeling is even necessary. Multilevel modeling is not necessary if there is no variation at higher levels. Variation at higher levels may be computed using the intraclass correlation coefficient. The intraclass correlation coefficient measures the degree to which the lower level units (level 1) belonging to the same higher level unit (level 2) are dependent or clustered. Larger intraclass correlation coefficients indicate more dependence or clustering at higher levels. The occurrence of dependence or clustering at higher levels indicates it is important to use multilevel modeling to protect against inflation of type I error rates and to capture variability at higher levels of the nested data structure.

Missing data. Multilevel modeling is often used to analyze repeated measures data, where missing data are common. Because multilevel analyses typically use maximum likelihood algorithms, participants with missing data may be included in analyses. In multilevel modeling, results are weighted by the amount of data contributed by each participant. That is, participants who provide more data contribute more to the results than participants who provide less data.

Fixed and random effects. Another consideration in multilevel modeling is whether effects are fixed or random. With random effects, the outcome-predictor relationship varies across level 2 units. That is, the slope and the intercept of the regression line are assumed to vary across level 2 units. With fixed effects, the variables of interest do not vary across level 2 units; the slope and the intercept are the same for all level 2 units. If random effects are modeled, the results are assumed to generalize to the population from which cases were sampled, whereas, if fixed effects are modeled, the results are confined to the cases studied. However, random effects models typically require greater sample sizes and may be more complicated to interpret.

Centering. Variables in multilevel modeling may be centered around the group mean (i.e., the mean of the level 1 observations within each level 2 unit) or centered around the grand mean (i.e., the mean of all level 1 observations across all level 2 units). For example, in an experience sampling study involving repeated reports (level 1) nested within individuals (level 2), group mean centering is conceptually equivalent to creating variables that are relative to the individual’s own mean based on his/her repeated reports, whereas grand mean centering is conceptually equivalent to creating variables that are relative to the overall mean of all reports provided by all the individuals in the study. Centering aids in interpretation of the results and the choice of centering affects the estimates computed. Decisions regarding centering should be made on a theoretical basis.

Autocorrelation. In repeated measures designs, autocorrelation is an issue. Because of the repeated nature of the data, residual errors in repeated measures data may be correlated (i.e., autocorrelation). The simplest and the most common autocorrelation structure is the first-order autoregressive error structure in which reports closer together in time are more strongly correlated than reports further apart in time. Some software programs (e.g., MPlus or SAS) also model more complex error structures to better account for autocorrelation.

Advantages of Multilevel Modeling

Nested data commonly arise in the field of behavioral medicine. By taking into account the nested structure of the data, multilevel modeling provides more contextualized analyses. For example, multilevel modeling may be used to test neighborhood effects on individuals’ obesity, partner effects on patients with cardiovascular disease, and peer influences on adolescents’ risky health behaviors. Multilevel modeling may also be used in repeated measures designs, including multiwave longitudinal and experience sampling studies of health behaviors (e.g., smoking, diet, exercise, and medication adherence), chronic illnesses (e.g., pain, diabetes, and HIV), and physiological processes (e.g., cardiovascular reactivity and neuroendocrine levels).

There are several advantages to using multilevel modeling to analyze repeated measures data. Multilevel modeling is able to analyze unbalanced designs, including unequally spaced data and missing data. Using multilevel modeling, it is possible to simultaneously estimate within-person and between-person effects. For example, a researcher could study if on days when a participant experiences more stress, he or she smokes more compared to days when he or she experiences less stress (a within-person effect). This effect may then be tested to see if it generalizes across all participants in the study (a between-person effect) or to see whether between-person differences, such as personality traits, moderate the relationship between stress and smoking (a cross-level interaction).

Limitations of Multilevel Modeling

Multilevel modeling is an advanced statistical technique, which requires a solid grounding in statistics. Increasingly, resources are available to support researchers using multilevel modeling (e.g., Bickel, 2007; Field, 2009; Hox, 2010; Raudenbush & Bryk, 2002). Specialized software is also usually needed to conduct multilevel modeling. However, increasingly, mainstream software also performs multilevel analyses.

New Applications and Developments

Multilevel modeling is not limited to regression analyses. In recent years, researchers are combining multilevel modeling with other data analysis techniques. For example, multilevel modeling may be used in testing moderation, mediation, path models, structural models, growth curves, and meta-analyses. By integrating these data analysis techniques, multilevel modeling is able to analyze a wider variety of research questions.

Conclusion

Multilevel modeling is needed to appropriately analyze the nested data structures that often occur in research on behavioral medicine. Careful consideration of the above issues is critical to appropriately using this data analysis technique.

Cross-References

▶ Missing Data
▶ Multivariate Analysis
▶

References and Readings

Bickel, R. (2007). Multilevel analysis for applied research: It’s just regression! New York: Guilford.
Field, A. (2009). Discovering statistics using SPSS. Thousand Oaks, CA: Sage.
Hox, J. (2010). Multilevel analysis: Techniques and applications. Mahwah, NJ: Lawrence Erlbaum.
Raudenbush, S., & Bryk, A. (2002). Hierarchical linear models. London: Sage.

Multiple Regression

▶

Multiple Risk Factor Intervention Trial (MRFIT)

Jonathan Newman
Columbia University, New York, NY, USA

Definition

The Multiple Risk Factor Intervention Trial (MRFIT) was a large, randomized primary prevention trial to test the effect of multiple interventions to reduce the risk of premature coronary heart disease (CHD) in 12,866 men, age 35–57, with one or more of three risk factors (hypertension, hyperlipidemia, or cigarette smoking) without a prior history of CHD. The trial was conducted in 22 clinical centers in the United States. MRFIT was conducted by the National Institutes of Health (NIH) and National Heart, Lung, and Blood Institute and was massive in scope, screening 356,222 men for the desired study population. These risk factors were chosen because they are modifiable, and there was an expectation (largely unproven at the time) that reduction of these factors should have beneficial results on the development of premature CHD. A subsample of 3,110 men was recruited to participate in the Behavior Pattern Study, which