Generalized Linear Mixed Models

Reproduced from the Encyclopedia of Statistics in Behavioral Science. John Wiley & Sons, Ltd. ISBN: 0-470-86080-4.

Introduction

Generalized linear models (GLMs) represent a class of fixed effects regression models for several types of dependent variables (i.e., continuous, dichotomous, counts). McCullagh and Nelder [32] describe these in great detail and indicate that the term 'generalized linear model' is due to Nelder and Wedderburn [35], who described how a collection of seemingly disparate statistical techniques could be unified. Common GLMs include linear regression, logistic regression, and Poisson regression.

There are three specifications in a GLM. First, the linear predictor, denoted as $\eta_i$, of a GLM is of the form

$$\eta_i = \mathbf{x}_i'\boldsymbol{\beta}, \tag{1}$$

where $\mathbf{x}_i$ is the vector of regressors for unit $i$ with fixed effects $\boldsymbol{\beta}$. Then, a link function $g(\cdot)$ is specified which converts the expected value $\mu_i$ of the outcome variable $Y_i$ (i.e., $\mu_i = E[Y_i]$) to the linear predictor $\eta_i$:

$$g(\mu_i) = \eta_i. \tag{2}$$

Finally, a specification for the form of the variance in terms of the mean $\mu_i$ is made. The latter two specifications usually depend on the distribution of the outcome $Y_i$, which is assumed to fall within the exponential family of distributions.

Fixed effects models, which assume that all observations are independent of each other, are not appropriate for analysis of several types of correlated data structures, in particular, for clustered and/or longitudinal data (see Clustered Data). In clustered designs, subjects are observed nested within larger units, for example, schools, hospitals, neighborhoods, workplaces, and so on. In longitudinal designs, repeated observations are nested within subjects (see Longitudinal Data Analysis and Repeated Measures Analysis of Variance). These are often referred to as multilevel [16] or hierarchical [41] data (see Linear Multilevel Models), in which the level-1 observations (subjects or repeated observations) are nested within the higher level-2 observations (clusters or subjects). Higher levels are also possible; for example, a three-level design could have repeated observations (level-1) nested within subjects (level-2) who are nested within clusters (level-3).

For analysis of such multilevel data, random cluster and/or subject effects can be added into the regression model to account for the correlation of the data. The resulting model is a mixed model including the usual fixed effects for the regressors plus the random effects. Mixed models for continuous normal outcomes have been extensively developed since the seminal paper by Laird and Ware [28]. For nonnormal data, there have also been many developments, some of which are described below. Many of these developments fall under the rubric of generalized linear mixed models (GLMMs), which extend GLMs by the inclusion of random effects in the predictor. Agresti et al. [1] describe a variety of social science applications of GLMMs; [12], [33], and [11] are recent texts with a wealth of statistical material on GLMMs.

Let $i$ denote the level-2 units (e.g., subjects) and let $j$ denote the level-1 units (e.g., nested observations). The focus will be on longitudinal designs here, but the methods apply to clustered designs as well. Assume there are $i = 1, \ldots, N$ subjects (level-2 units) and $j = 1, \ldots, n_i$ repeated observations (level-1 units) nested within each subject. A random-intercept model, which is the simplest mixed model, augments the linear predictor with a single random effect for subject $i$,

$$\eta_{ij} = \mathbf{x}_{ij}'\boldsymbol{\beta} + \nu_i, \tag{3}$$

where $\nu_i$ is the random effect (one for each subject). These random effects represent the influence of subject $i$ on his/her repeated observations that is not captured by the observed covariates. These are treated as random effects because the sampled subjects are thought to represent a population of subjects, and they are usually assumed to be distributed as $\mathcal{N}(0, \sigma_\nu^2)$. The parameter $\sigma_\nu^2$ indicates the variance in the population distribution, and therefore the degree of heterogeneity of subjects.

Including the random effects, the expected value of the outcome variable, which is related to the linear predictor via the link function, is given as

$$\mu_{ij} = E[Y_{ij} \mid \nu_i, \mathbf{x}_{ij}]. \tag{4}$$

This is the expectation of the conditional distribution of the outcome given the random effects. As a result, GLMMs are often referred to as conditional models in contrast to the marginal generalized estimating equations (GEE) models (see Generalized Estimating Equations (GEE)) [29], which represent an alternative generalization of GLMs for correlated data (see Marginal Models for Clustered Data).

The model can be easily extended to include multiple random effects. For example, in longitudinal problems, it is common to have a random subject intercept and a random linear time-trend. For this, denote $\mathbf{z}_{ij}$ as the $r \times 1$ vector of variables having random effects (a column of ones is usually included for the random intercept). The vector of random effects $\mathbf{v}_i$ is assumed to follow a multivariate normal distribution with mean vector $\mathbf{0}$ and variance–covariance matrix $\boldsymbol{\Sigma}_v$ (see Catalogue of Probability Density Functions). The model is now written as

$$\eta_{ij} = \mathbf{x}_{ij}'\boldsymbol{\beta} + \mathbf{z}_{ij}'\mathbf{v}_i. \tag{5}$$

Note that the conditional mean $\mu_{ij}$ is now specified as $E[Y_{ij} \mid \mathbf{v}_i, \mathbf{x}_{ij}]$, namely, in terms of the vector of random effects.

Dichotomous Outcomes

Development of GLMMs for dichotomous data has been an active area of statistical research. Several approaches, usually adopting a logistic or probit regression model (see Probits) and various methods for incorporating and estimating the influence of the random effects, have been developed. A review article by Pendergast et al. [37] discusses and compares many of these developments.

The mixed-effects logistic regression model is a common choice for analysis of multilevel dichotomous data and is arguably the most popular GLMM. In the GLMM context, this model utilizes the logit link, namely

$$g(\mu_{ij}) = \operatorname{logit}(\mu_{ij}) = \log\left[\frac{\mu_{ij}}{1 - \mu_{ij}}\right] = \eta_{ij}. \tag{6}$$

Here, the conditional expectation $\mu_{ij} = E(Y_{ij} \mid \mathbf{v}_i, \mathbf{x}_{ij})$ equals $P(Y_{ij} = 1 \mid \mathbf{v}_i, \mathbf{x}_{ij})$, namely, the conditional probability of a response given the random effects (and covariate values).

This model can also be written as

$$P(Y_{ij} = 1 \mid \mathbf{v}_i, \mathbf{x}_{ij}, \mathbf{z}_{ij}) = g^{-1}(\eta_{ij}) = \Psi(\eta_{ij}), \tag{7}$$

where the inverse link function $\Psi(\eta_{ij})$ is the logistic cumulative distribution function (cdf), namely $\Psi(\eta_{ij}) = [1 + \exp(-\eta_{ij})]^{-1}$. A nicety of the logistic distribution, one that simplifies parameter estimation, is that the probability density function (pdf) is related to the cdf in a simple way, as $\psi(\eta_{ij}) = \Psi(\eta_{ij})[1 - \Psi(\eta_{ij})]$.

The probit model, which is based on the standard normal distribution, is often proposed as an alternative to the logistic model [13]. For the probit model, the normal cdf and pdf replace their logistic counterparts. A useful feature of the probit model is that it can be used to yield tetrachoric correlations for the clustered binary responses, and polychoric correlations for ordinal outcomes (discussed below). For this reason, in some areas, for example familial studies, the probit formulation is often preferred to its logistic counterpart.

Example

Gruder et al. [20] describe a smoking-cessation study in which 489 subjects were randomized to one of three conditions: control, discussion, or social support. Control subjects received a self-help manual and were encouraged to watch twenty segments of a daily TV program on smoking cessation, while subjects in the two experimental conditions additionally participated in group meetings and received training in support and relapse prevention. Here, for simplicity, these two experimental conditions will be combined. Data were collected at four telephone interviews: postintervention, and 6, 12, and 24 months later. Smoking abstinence rates (and sample sizes) at these four timepoints were 17.4% (109), 7.2% (97), 18.5% (92), and 18.2% (77) for the control condition. Similarly, for the combined experimental condition they were 34.5% (380), 18.2% (357), 19.6% (337), and 21.7% (295) for these timepoints.

Two logistic GLMMs were fit to these data: a random-intercept model, and a model with a random intercept and a random linear trend of time (see Growth Curve Modeling). These models were estimated using SAS PROC NLMIXED with adaptive quadrature. For these, it is the probability of smoking abstinence, rather than smoking, that is being modeled. Fixed effects included a condition term (0 = control, 1 = experimental), time (coded 0, 1, 2, and 4 for the four timepoints), and the condition by time interaction. Results for both models are presented in Table 1. Based on a likelihood-ratio test, the model with random intercept and linear time trend is preferred over the simpler random intercept model.

Table 1  Smoking cessation study: smoking status (0 = smoking, 1 = not smoking) across time (N = 489), GLMM logistic parameter estimates (Est.), standard errors (SE), and P values

                                             Random intercept model       Random int and trend model
Parameter                                    Est.      SE      P value    Est.      SE      P value
Intercept                                    −2.867    .362    .001       −2.807    .432    .001
Time                                         .113      .122    .36        −.502     .274    .07
Condition (0 = control; 1 = experimental)    1.399     .379    .001       1.495     .415    .001
Condition by Time                            −.322     .136    .02        −.331     .249    .184
Intercept variance                           3.587     .600               3.979     1.233
Intercept Time covariance                                                 .048      .371
Time variance                                                             1.428     .468
−2 log likelihood                            1631.0                       1594.7

Note: P values not given for variance and covariance parameters (see [41]).

This example shows that the significance of model terms can depend on the structure of the random effects. Thus, one must decide upon a reasonable model for the random effects as well as for the fixed effects. A commonly recommended approach for this is to perform a sequential procedure for model selection.
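As an illustration of the logistic inverse link in equation (7) and the cdf/pdf identity that follows it, the sketch below evaluates the conditional probability of abstinence under the random-intercept model, using the fixed-effect estimates from Table 1. It is only a minimal sketch: the function and dictionary names are hypothetical and are not part of the original analysis, which was carried out in SAS PROC NLMIXED.

```python
import math

def logistic_cdf(eta):
    """Inverse logit link: Psi(eta) = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def logistic_pdf(eta):
    """Logistic density via the identity psi(eta) = Psi(eta) * (1 - Psi(eta))."""
    p = logistic_cdf(eta)
    return p * (1.0 - p)

# Fixed-effect estimates from the random-intercept column of Table 1.
BETA = {
    "intercept": -2.867,
    "time": 0.113,
    "condition": 1.399,
    "condition_by_time": -0.322,
}

def conditional_probability(time, condition, nu=0.0):
    """P(Y_ij = 1 | nu_i, x_ij): linear predictor of eq. (3) passed through eq. (7)."""
    eta = (BETA["intercept"]
           + BETA["time"] * time
           + BETA["condition"] * condition
           + BETA["condition_by_time"] * time * condition
           + nu)
    return logistic_cdf(eta)

# Time is coded 0, 1, 2, 4; condition is 0 = control, 1 = experimental.
for condition in (0, 1):
    probs = [conditional_probability(t, condition) for t in (0, 1, 2, 4)]
    print(condition, [round(p, 3) for p in probs])
```

These probabilities are subject-specific: they apply to a subject whose random intercept equals the value passed as `nu` (zero by default), not to the population average, so they need not match the observed abstinence rates.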
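The contrast between the conditional mean of equation (4) and a marginal (population-averaged) mean can also be checked numerically. The sketch below is not taken from the original article: it averages the conditional probability over a normally distributed random intercept with a simple midpoint rule, using the intercept and intercept variance of the random-intercept model in Table 1 for a control subject at the first timepoint. The function names, grid size, and integration limits are illustrative assumptions.

```python
import math

def logistic_cdf(eta):
    """Inverse logit link: Psi(eta) = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def marginal_probability(eta_fixed, sigma2, n_points=4000, z_max=8.0):
    """Average Psi(eta_fixed + nu) over nu ~ N(0, sigma2) with a midpoint rule."""
    sigma = math.sqrt(sigma2)
    width = 2.0 * z_max / n_points
    total = 0.0
    for k in range(n_points):
        z = -z_max + (k + 0.5) * width  # grid point on the standard-normal scale
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += logistic_cdf(eta_fixed + sigma * z) * density * width
    return total

eta_control_t0 = -2.867  # intercept, random-intercept model (Table 1)
sigma2_nu = 3.587        # intercept variance, random-intercept model (Table 1)

p_conditional = logistic_cdf(eta_control_t0)  # subject with nu_i = 0
p_marginal = marginal_probability(eta_control_t0, sigma2_nu)
print(f"conditional at nu = 0: {p_conditional:.3f}, marginal: {p_marginal:.3f}")
```

With a negative linear predictor, the averaged probability lies closer to 0.5 than the probability for a subject at $\nu_i = 0$, which is one reason coefficients from conditional and marginal models are not directly comparable.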
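The likelihood-ratio comparison of the two models can be reproduced from the −2 log likelihood values in Table 1. The sketch below assumes a naive chi-square reference distribution with 2 degrees of freedom (one for the time variance, one for the intercept-time covariance); the text does not state which reference distribution was used. Because the time variance is tested at the boundary of its parameter space, this naive p value is generally regarded as conservative. The variable and function names are hypothetical.

```python
import math

# -2 log likelihood values from Table 1.
deviance_random_intercept = 1631.0
deviance_random_trend = 1594.7

# Extra parameters in the larger model: time variance and intercept-time covariance.
extra_parameters = 2

lr_statistic = deviance_random_intercept - deviance_random_trend

def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square with 2 df, which is exactly exp(-x / 2)."""
    return math.exp(-x / 2.0)

p_value = chi2_sf_2df(lr_statistic)
print(f"LR = {lr_statistic:.1f} on {extra_parameters} df, naive p = {p_value:.2e}")
```

The statistic is far out in the tail of the reference distribution, which agrees with the stated preference for the model with a random intercept and a random time trend.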
