Mixed Models - General
Recommended publications
-
Linear Mixed-Effects Modeling in SPSS: an Introduction to the MIXED Procedure
Technical report. Linear Mixed-Effects Modeling in SPSS: An Introduction to the MIXED Procedure. © 2005 SPSS Inc. All rights reserved. LMEMWP-0305
Table of contents: Introduction; Data preparation for MIXED; Fitting fixed-effects models; Fitting simple mixed-effects models; Fitting mixed-effects models; Multilevel analysis; Custom hypothesis tests; Covariance structure selection; Random coefficient models; Estimated marginal means; References; About SPSS Inc.
Introduction. The linear mixed-effects models (MIXED) procedure in SPSS enables you to fit linear mixed-effects models to data sampled from normal distributions. Recent texts, such as those by McCulloch and Searle (2000) and Verbeke and Molenberghs (2000), comprehensively review mixed-effects models. The MIXED procedure fits models more general than those of the general linear model (GLM) procedure, and it encompasses all models in the variance components (VARCOMP) procedure. This report illustrates the types of models that MIXED handles. We begin with an explanation of simple models that can be fitted using GLM and VARCOMP, to show how they are translated into MIXED. We then proceed to fit models that are unique to MIXED. The major capabilities that differentiate MIXED from GLM are that MIXED handles correlated data and unequal variances. Correlated data are very common in such situations as repeated measurements of survey respondents or experimental subjects. MIXED extends repeated measures models in GLM to allow an unequal number of repetitions. -
Multilevel and Mixed-Effects Modeling
Statistics with Stata, p. 386: [...] However the residuals pass tests for white noise (p = .65), and a plot of observed and predicted values shows a good visual fit (Figure 12.19).

    predict uahhat2
    label variable uahhat2 "predicted from volcanoes, solar, ENSO & CO2"
    predict uahres2, resid
    label variable uahres2 "UAH residuals from ARMAX model"
    wntestq uahres2, lags(25)

    Portmanteau test for white noise
    Portmanteau (Q) statistic = 21.7197
    Prob > chi2(25)           =  0.6519

[Figure 12.19]

Multilevel and Mixed-Effects Modeling

Mixed-effects modeling is basically regression analysis allowing two kinds of effects: fixed effects, meaning intercepts and slopes meant to describe the population as a whole, just as in ordinary regression; and also random effects, meaning intercepts and slopes that can vary across subgroups of the sample. All of the regression-type methods shown so far in this book involve fixed effects only. Mixed-effects modeling opens a new range of possibilities for multilevel models, growth curve analysis, and panel data or cross-sectional time series.

Albright and Marinova (2010) provide a practical comparison of mixed-modeling procedures found in Stata, SAS, SPSS and R with the hierarchical linear modeling (HLM) software developed by Raudenbush and Bryk (2002; also Raudenbush et al. 2005). More detailed explanation of mixed modeling and its correspondences with HLM can be found in Rabe-Hesketh and Skrondal (2012). Briefly, HLM approaches multilevel modeling in several steps, specifying separate equations (for example) for level 1 and level 2 effects.
-
Generalized Linear Models
Generalized Linear Models

A generalized linear model (GLM) consists of three parts.
i) The first part is a random variable giving the conditional distribution of a response $Y_i$ given the values of a set of covariates $X_{ij}$. In the original work on GLMs by Nelder and Wedderburn (1972) this random variable was a member of an exponential family, but later work has extended beyond this class of random variables.
ii) The second part is a linear predictor, $\eta_i = \alpha + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_k X_{ik}$.
iii) The third part is a smooth and invertible link function $g(\cdot)$ which transforms the expected value of the response variable, $\mu_i = E(Y_i)$, and is equal to the linear predictor: $g(\mu_i) = \eta_i = \alpha + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_k X_{ik}$.

As shown in Tables 15.1 and 15.2, both the general linear model that we have studied extensively and the logistic regression model from Chapter 14 are special cases of this model. One property of members of the exponential family of distributions is that the conditional variance of the response is a function of its mean, $V(\mu)$, and possibly a dispersion parameter $\phi$. The expressions for the variance functions for common members of the exponential family are shown in Table 15.2. Also, for each distribution there is a so-called canonical link function, which simplifies some of the GLM calculations, which is also shown in Table 15.2.

Estimation and Testing for GLMs

Parameter estimation in GLMs is conducted by the method of maximum likelihood. As with logistic regression models from the last chapter, the generalization of the residual sums of squares from the general linear model is the residual deviance, $D_m = 2(\log L_s - \log L_m)$, where $L_m$ is the maximized likelihood for the model of interest, and $L_s$ is the maximized likelihood for a saturated model, which has one parameter per observation and fits the data as well as possible.
-
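The residual deviance defined above can be checked numerically. The following is a minimal sketch, assuming a Poisson response and an intercept-only model (whose maximum likelihood fit is the sample mean); the data values and the function names `poisson_loglik` and `residual_deviance` are hypothetical, chosen only for illustration.

```python
import math

def poisson_loglik(y, mu):
    """Poisson log-likelihood: sum of y*log(mu) - mu - log(y!)."""
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
               for yi, mi in zip(y, mu))

def residual_deviance(y, mu_model):
    """D_m = 2 (log L_s - log L_m); the saturated model sets mu_i = y_i.

    Assumes every y_i > 0 so that log(y_i) is defined.
    """
    return 2.0 * (poisson_loglik(y, y) - poisson_loglik(y, mu_model))

# Hypothetical counts (illustration only).
y = [2, 3, 6, 7, 8, 9, 10, 12, 15]

# Intercept-only Poisson model: the fitted mean is the sample mean.
mean_y = sum(y) / len(y)
mu_null = [mean_y] * len(y)

d_null = residual_deviance(y, mu_null)

# Closed form of the Poisson deviance, for comparison.
d_closed = 2.0 * sum(yi * math.log(yi / mi) - (yi - mi)
                     for yi, mi in zip(y, mu_null))
```

The two quantities `d_null` and `d_closed` should agree up to rounding, and the deviance of the saturated model against itself is zero, which is the sense in which it "fits the data as well as possible".
-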
Mixed Model Methodology, Part I: Linear Mixed Models
Mixed Model Methodology, Part I: Linear Mixed Models. Technical Report, January 2015. DOI: 10.13140/2.1.3072.0320. Jean-Louis Foulley, Institut de Mathématiques et de Modélisation de Montpellier (I3M), Université de Montpellier 2 Sciences et Techniques, Place Eugène Bataillon, 34095 Montpellier Cedex 05.
Citation: Foulley, J.L. (2015). Mixed Model Methodology. Part I: Linear Mixed Models. Technical Report, e-print: DOI: 10.13140/2.1.3072.0320
1 Basics on linear models. 1.1 Introduction. Linear models form one of the most widely used tools of statistics from both theoretical and practical points of view. They encompass regression and analysis of variance and covariance, and presenting them within a unique theoretical framework helps to clarify the basic concepts and assumptions underlying all these techniques and the ones that will be used later on for mixed models. There is a large amount of highly documented literature, especially textbooks, in this area from both theoretical (Searle, 1971, 1987; Rao, 1973; Rao et al., 2007) and practical points of view (Littell et al., 2002). -
Robust Bayesian General Linear Models
www.elsevier.com/locate/ynimg NeuroImage 36 (2007) 661–671. Robust Bayesian general linear models. W.D. Penny, J. Kilner, and F. Blankenburg, Wellcome Department of Imaging Neuroscience, University College London, 12 Queen Square, London WC1N 3BG, UK. Received 22 September 2006; revised 20 November 2006; accepted 25 January 2007. Available online 7 May 2007.
We describe a Bayesian learning algorithm for Robust General Linear Models (RGLMs). The noise is modeled as a Mixture of Gaussians rather than the usual single Gaussian. This allows different data points to be associated with different noise levels and effectively provides a robust estimation of regression coefficients. A variational inference framework is used to prevent overfitting and provides a model order selection criterion for noise model order. This allows the RGLM to default to the usual GLM when robustness is not required. The method is compared to other robust regression methods and applied to synthetic data and fMRI. © 2007 Elsevier Inc. All rights reserved.
[...] them from the data (Jung et al., 1999). This is, however, a non-automatic process and will typically require user intervention to disambiguate the discovered components. In fMRI, autoregressive (AR) modeling can be used to downweight the impact of periodic respiratory or cardiac noise sources (Penny et al., 2003). More recently, a number of approaches based on robust regression have been applied to imaging data (Wager et al., 2005; Diedrichsen and Shadmehr, 2005). These approaches relax the assumption underlying ordinary regression that the errors be normally (Wager et al., 2005) or identically (Diedrichsen and Shadmehr, 2005) distributed. In Wager et al. -
Generalized Linear Models Outline for Today
Generalized linear models

Outline for today
• What is a generalized linear model
• Linear predictors and link functions
• Example: estimate a proportion
• Analysis of deviance
• Example: fit dose-response data using logistic regression
• Example: fit count data using a log-linear model
• Advantages and assumptions of glm
• Quasi-likelihood models when there is excessive variance

Review: what is a (general) linear model
A model of the following form: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \text{error}$
• Y is the response variable
• The X's are the explanatory variables
• The β's are the parameters of the linear equation
• The errors are normally distributed with equal variance at all values of the X variables
• Uses least squares to fit model to data, estimate parameters
• lm in R
The predicted Y, symbolized here by µ, is modeled as $\mu = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots$

What is a generalized linear model
A model whose predicted values are of the form $g(\mu) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots$
• The model still includes a linear predictor (to the right of "=")
• g(µ) is the "link function"
• Wide diversity of link functions accommodated
• Non-normal distributions of errors OK (specified by link function)
• Unequal error variances OK (specified by link function)
• Uses maximum likelihood to estimate parameters
• Uses log-likelihood ratio tests to test parameters
• glm in R

The two most common link functions
• Log, used for count data: $\eta = \log \mu$. The inverse function is $\mu = e^{\eta}$.
• Logistic or logit, used for binary data: $\eta = \log \frac{\mu}{1 - \mu}$. The inverse function is $\mu = \frac{e^{\eta}}{1 + e^{\eta}}$.
In all cases log refers to natural logarithm (base e).
-
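The two link functions and their inverses can be written out directly. This is a minimal sketch using only the Python standard library; the function names are hypothetical and not taken from the slides (which use `glm` in R).

```python
import math

def log_link(mu):
    """Log link for count data: eta = log(mu), natural logarithm."""
    return math.log(mu)

def inv_log_link(eta):
    """Inverse of the log link: mu = exp(eta)."""
    return math.exp(eta)

def logit_link(mu):
    """Logit link for binary data: eta = log(mu / (1 - mu))."""
    return math.log(mu / (1.0 - mu))

def inv_logit_link(eta):
    """Inverse of the logit link: mu = exp(eta) / (1 + exp(eta))."""
    return math.exp(eta) / (1.0 + math.exp(eta))

# Round trips: applying a link and then its inverse recovers mu.
mu_count = 4.2        # a hypothetical expected count
mu_prob = 0.3         # a hypothetical expected proportion
back_count = inv_log_link(log_link(mu_count))
back_prob = inv_logit_link(logit_link(mu_prob))
```

The inverse link maps any real-valued linear predictor back to a valid mean: a positive number for the log link, a value strictly between 0 and 1 for the logit link.
-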
R2 Statistics for Mixed Models
Kansas State University Libraries, New Prairie Press. Conference on Applied Statistics in Agriculture, 2005 - 17th Annual Conference Proceedings. R2 STATISTICS FOR MIXED MODELS. Matthew Kramer, Biometrical Consulting Service, ARS (Beltsville, MD), USDA. This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Recommended Citation: Kramer, Matthew (2005). "R2 STATISTICS FOR MIXED MODELS," Conference on Applied Statistics in Agriculture. https://doi.org/10.4148/2475-7772.1142
Abstract. The R2 statistic, when used in a regression or ANOVA context, is appealing because it summarizes how well the model explains the data in an easy-to-understand way. R2 statistics are also useful to gauge the effect of changing a model. Generalizing R2 to mixed models is not obvious when there are correlated errors, as might occur if data are georeferenced or result from a designed experiment with blocking. Such an R2 statistic might refer only to the explanation associated with the independent variables, or might capture the explanatory power of the whole model. In the latter case, one might develop an R2 statistic from Wald or likelihood ratio statistics, but these can yield different numeric results. -
Linear Mixed Models and Penalized Least Squares
Linear mixed models and penalized least squares Douglas M. Bates Department of Statistics, University of Wisconsin–Madison Saikat DebRoy Department of Biostatistics, Harvard School of Public Health Abstract Linear mixed-effects models are an important class of statistical models that are not only used directly in many fields of applications but also used as iterative steps in fitting other types of mixed-effects models, such as generalized linear mixed models. The parameters in these models are typically estimated by maximum likelihood (ML) or restricted maximum likelihood (REML). In general there is no closed form solution for these estimates and they must be determined by iterative algorithms such as EM iterations or general nonlinear optimization. Many of the intermediate calculations for such iterations have been expressed as generalized least squares problems. We show that an alternative representation as a penalized least squares problem has many advantageous computational properties including the ability to evaluate explicitly a profiled log-likelihood or log-restricted likelihood, the gradient and Hessian of this profiled objective, and an ECME update to refine this objective. Key words: REML, gradient, Hessian, EM algorithm, ECME algorithm, maximum likelihood, profile likelihood, multilevel models 1 Introduction We will establish some results for the penalized least squares representation of a general form of a linear mixed-effects model, then show how these results specialize for particular models. 
The general form we consider is
$$y = X\beta + Zb + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma^2 I), \qquad b \sim \mathcal{N}(0, \sigma^2 \Omega^{-1}), \qquad \epsilon \perp b \qquad (1)$$
where $y$ is the $n$-dimensional response vector, $X$ is an $n \times p$ model matrix for the $p$-dimensional fixed-effects vector $\beta$, $Z$ is the $n \times q$ model matrix for the $q$-dimensional random-effects vector $b$ that has a Gaussian distribution with mean 0 and relative precision matrix $\Omega$ (i.e., $\Omega$ is the precision of $b$ relative to the precision of $\epsilon$), and $\epsilon$ is the random noise assumed to have a spherical Gaussian distribution. (Preprint submitted to Journal of Multivariate Analysis, 25 September 2003.)
-
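One consequence of model (1) that the excerpt does not spell out is the marginal distribution of the response. The short derivation below is a sketch that relies only on the stated assumptions: the random effects and the noise are Gaussian, independent of each other, with covariances $\sigma^2 \Omega^{-1}$ and $\sigma^2 I$.

```latex
\begin{align*}
\mathrm{E}(y) &= X\beta + Z\,\mathrm{E}(b) + \mathrm{E}(\epsilon) = X\beta, \\
\mathrm{Var}(y) &= Z\,\mathrm{Var}(b)\,Z^{T} + \mathrm{Var}(\epsilon)
               = \sigma^{2}\left( Z\,\Omega^{-1} Z^{T} + I \right), \\
y &\sim \mathcal{N}\!\left( X\beta,\; \sigma^{2}\left( Z\,\Omega^{-1} Z^{T} + I \right) \right).
\end{align*}
```

The cross terms vanish in the variance because $\epsilon \perp b$, so the random effects enter the marginal model only through the covariance matrix of $y$.
-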
Lecture 2: Linear and Mixed Models
Lecture 2: Linear and Mixed Models. Bruce Walsh lecture notes, Introduction to Mixed Models, SISG (Module 12), Seattle, 17 – 19 July 2019.

Quick Review of the Major Points
The general linear model can be written as $y = Xb + e$
• y = vector of observed dependent values
• X = design matrix: observations of the variables in the assumed linear model
• b = vector of unknown parameters to estimate
• e = vector of residuals (deviation from model fit), $e = y - Xb$

The solution for b depends on the covariance structure (= covariance matrix) of the vector e of residuals.

Ordinary least squares (OLS)
• OLS: $e \sim \mathrm{MVN}(0, \sigma^2 I)$
• Residuals are homoscedastic and uncorrelated, so that we can write the covariance matrix of e as $\mathrm{Cov}(e) = \sigma^2 I$
• The OLS estimate: $\mathrm{OLS}(b) = (X^T X)^{-1} X^T y$

Generalized least squares (GLS)
• GLS: $e \sim \mathrm{MVN}(0, V)$
• Residuals are heteroscedastic and/or dependent
• $\mathrm{GLS}(b) = (X^T V^{-1} X)^{-1} X^T V^{-1} y$

BLUE
• Both the OLS and GLS solutions are also called the Best Linear Unbiased Estimator (or BLUE for short)
• Whether the OLS or GLS form is used depends on the assumed covariance structure for the residuals
  – Special case of $\mathrm{Var}(e) = \sigma_e^2 I$: OLS
  – All others, i.e., $\mathrm{Var}(e) = R$: GLS

Linear Models
One tries to explain a dependent variable y as a linear function of a number of independent (or predictor) variables. A multiple regression is a typical linear model. Here e is the residual, or deviation between the true value observed and the value predicted by the linear model.
-
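The OLS and GLS estimators can be compared on a small example. The sketch below uses only the Python standard library and makes two simplifying assumptions that are not in the lecture notes: the design has exactly two columns (an intercept and one predictor), so a 2 × 2 inverse suffices, and the residual covariance matrix V is diagonal (heteroscedastic but uncorrelated residuals). All function names and data values are hypothetical.

```python
def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def inv2(m):
    """Inverse of a 2 x 2 matrix (assumed nonsingular)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def ols(X, y):
    """OLS(b) = (X^T X)^{-1} X^T y for a two-column design matrix."""
    Xt = transpose(X)
    ycol = [[v] for v in y]
    beta = matmul(inv2(matmul(Xt, X)), matmul(Xt, ycol))
    return [row[0] for row in beta]

def gls_diag(X, y, variances):
    """GLS(b) = (X^T V^{-1} X)^{-1} X^T V^{-1} y with V = diag(variances)."""
    # For a diagonal V, X^T V^{-1} just divides column i of X^T by variance i.
    Xt_Vinv = [[x / v for x, v in zip(row, variances)]
               for row in transpose(X)]
    ycol = [[v] for v in y]
    beta = matmul(inv2(matmul(Xt_Vinv, X)), matmul(Xt_Vinv, ycol))
    return [row[0] for row in beta]

# Design matrix: intercept column plus one predictor (hypothetical data).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y_exact = [1.0, 3.0, 5.0, 7.0]      # lies exactly on y = 1 + 2x
y_noisy = [1.1, 2.9, 5.2, 6.8]      # perturbed version

b_ols = ols(X, y_exact)
b_gls = gls_diag(X, y_exact, [1.0, 1.0, 4.0, 4.0])
b_ols_noisy = ols(X, y_noisy)
b_gls_equal = gls_diag(X, y_noisy, [2.0, 2.0, 2.0, 2.0])
```

When the data lie exactly on a line, both estimators recover the same coefficients whatever V is. With equal variances, GLS reduces to OLS because the common variance cancels, which matches the special case $\mathrm{Var}(e) = \sigma_e^2 I$ in the notes.
-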
A Simple, Linear, Mixed-Effects Model
Chapter 1. A Simple, Linear, Mixed-effects Model. In this book we describe the theory behind a type of statistical model called mixed-effects models and the practice of fitting and analyzing such models using the lme4 package for R. These models are used in many different disciplines. Because the descriptions of the models can vary markedly between disciplines, we begin by describing what mixed-effects models are and by exploring a very simple example of one type of mixed model, the linear mixed model. This simple example allows us to illustrate the use of the lmer function in the lme4 package for fitting such models and for analyzing the fitted model. We also describe methods of assessing the precision of the parameter estimates and of visualizing the conditional distribution of the random effects, given the observed data.
1.1 Mixed-effects Models. Mixed-effects models, like many other types of statistical models, describe a relationship between a response variable and some of the covariates that have been measured or observed along with the response. In mixed-effects models at least one of the covariates is a categorical covariate representing experimental or observational "units" in the data set. In the example from the chemical industry that is given in this chapter, the observational unit is the batch of an intermediate product used in production of a dye. In medical and social sciences the observational units are often the human or animal subjects in the study. In agriculture the experimental units may be the plots of land or the specific plants being studied. -
Stat 714 Linear Statistical Models
STAT 714 LINEAR STATISTICAL MODELS, Fall 2010, Lecture Notes. Joshua M. Tebbs, Department of Statistics, The University of South Carolina.

Contents
1 Examples of the General Linear Model
2 The Linear Least Squares Problem
  2.1 Least squares estimation
  2.2 Geometric considerations
  2.3 Reparameterization
3 Estimability and Least Squares Estimators
  3.1 Introduction
  3.2 Estimability
    3.2.1 One-way ANOVA
    3.2.2 Two-way crossed ANOVA with no interaction
    3.2.3 Two-way crossed ANOVA with interaction
  3.3 Reparameterization
  3.4 Forcing least squares solutions using linear constraints
4 The Gauss-Markov Model
  4.1 Introduction
  4.2 The Gauss-Markov Theorem
  4.3 Estimation of σ² in the GM model
  4.4 Implications of model selection
    4.4.1 Underfitting (Misspecification)
    4.4.2 Overfitting
  4.5 The Aitken model and generalized least squares
5 Distributional Theory
  5.1 Introduction
  5.2 Multivariate normal distribution
    5.2.1 Probability density function
    5.2.2 Moment generating functions
    5.2.3 Properties
    5.2.4 Less-than-full-rank normal distributions
    5.2.5 Independence results
    5.2.6 Conditional distributions
  5.3 Noncentral χ² distribution
  5.4 Noncentral F distribution
  5.5 Distributions of quadratic forms
  5.6 Independence of quadratic forms
  5.7 Cochran's Theorem
-
Computational Methods for Mixed Models
Computational methods for mixed models. Douglas Bates, Department of Statistics, University of Wisconsin – Madison. June 21, 2021.
Abstract. The lme4 package provides R functions to fit and analyze several different types of mixed-effects models, including linear mixed models, generalized linear mixed models and nonlinear mixed models. In this vignette we describe the formulation of these models and the computational approach used to evaluate or approximate the log-likelihood of a model/data/parameter value combination.
1 Introduction. The lme4 package provides R functions to fit and analyze linear mixed models, generalized linear mixed models and nonlinear mixed models. These models are called mixed-effects models or, more simply, mixed models because they incorporate both fixed-effects parameters, which apply to an entire population or to certain well-defined and repeatable subsets of a population, and random effects, which apply to the particular experimental units or observational units in the study. Such models are also called multilevel models because the random effects represent levels of variation in addition to the per-observation noise term that is incorporated in common statistical models such as linear regression models, generalized linear models and nonlinear regression models. We begin by describing common properties of these mixed models and the general computational approach used in the lme4 package. The estimates of the parameters in a mixed model are determined as the values that optimize an objective function: either the likelihood of the parameters given the observed data, for maximum likelihood (ML) estimates, or a related objective function called the REML criterion.