Generalized Linear Mixed Models

Florian Jaeger

An Introduction to Linear and Mixed Models
Day 1

February 4, 2010

Overview

I Class 1:
  I (Re-)Introducing Ordinary Regression
  I Comparison to ANOVA
  I Linear Mixed Models
  I Generalized Linear Mixed Models
  I Trade-offs & Motivation
  I How to get started
I Class 2:
  I Common Issues in Regression Modeling (Mixed or not)
  I Solutions
I Please ask/add to the discussion any time!

Acknowledgments

I I've incorporated (and modified) a couple of slides prepared by:
  I Victor Kuperman (Stanford)
  I Roger Levy (UCSD)
  ... with their permission (naturally!)
I I am also grateful for feedback from:
  I Austin Frank (Rochester)
  I Previous audiences of similar workshops at CUNY, Haskins, Rochester, Buffalo, UCSD, MIT.
I For more materials, check out:
  I http://www.hlp.rochester.edu/
  I http://wiki.bcs.rochester.edu:2525/HlpLab/StatsCourses
  I http://hlplab.wordpress.com/ (e.g. multinomial mixed models code)

Generalized Linear Models

Goal: model the effects of predictors (independent variables) X on a response (dependent variable) Y.

The picture (a graphical model): each response yi depends on its predictor value xi and on the shared model parameters θ.

    Predictors:        x1  x2  ···  xn
    Response:          y1  y2  ···  yn
    Model parameters:  θ

Reviewing GLMs

Assumptions of the generalized linear model (GLM):

I Predictors {Xi} influence Y through the mediation of a linear predictor η;
I η is a linear combination of the {Xi}:

      η = α + β1 X1 + ··· + βN XN        (linear predictor)

I η determines the predicted mean µ of Y:

      η = g(µ)        (link function)

I There is some noise distribution of Y around the predicted mean µ of Y:

      P(Y = y; µ)
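These three ingredients (linear predictor, link function, noise distribution) can be sketched in a few lines of code. This is an illustrative toy, not from the slides; the function names and parameter values are made up:

```python
import random

def linear_predictor(alpha, betas, xs):
    """eta = alpha + beta_1 * x_1 + ... + beta_N * x_N"""
    return alpha + sum(b * x for b, x in zip(betas, xs))

def predicted_mean(eta, inverse_link=lambda e: e):
    """mu = g^{-1}(eta); the default is the identity link, g(mu) = mu."""
    return inverse_link(eta)

random.seed(0)
eta = linear_predictor(0.5, [1.0, -2.0], [2.0, 0.25])  # 0.5 + 2.0 - 0.5 = 2.0
mu = predicted_mean(eta)                               # 2.0 under the identity link
y = random.gauss(mu, 0.1)  # one draw from the noise distribution P(Y = y; mu)
print(eta, mu, y)
```

Swapping in a different inverse link and a different noise distribution yields the other members of the GLM family.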


Reviewing Linear Regression

Linear regression, which underlies ANOVA, is a kind of generalized linear model.

I The predicted mean is just the linear predictor (the identity link):

      η = g(µ) = µ

I Noise is normally (= Gaussian) distributed around 0 with standard deviation σ:

      ε ∼ N(0, σ)

I This gives us the traditional linear regression equation:

      Y = α + β1 X1 + ··· + βn Xn + ε

  where α + β1 X1 + ··· + βn Xn is the predicted mean µ = η and ε is the noise, ∼ N(0, σ).
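A quick simulation illustrates the identity-link case (a sketch with arbitrary made-up parameter values): generate data from Y = α + βX + ε and recover α and β by least squares, which for Gaussian noise coincides with maximum likelihood.

```python
import random

random.seed(0)
alpha, beta, sigma = 2.0, 0.5, 0.1               # arbitrary "true" values
n = 10000
x = [random.uniform(0, 10) for _ in range(n)]
y = [alpha + beta * xi + random.gauss(0, sigma) for xi in x]

# Closed-form least-squares estimates (= ML under Gaussian noise):
xbar, ybar = sum(x) / n, sum(y) / n
beta_hat = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
            / sum((xi - xbar) ** 2 for xi in x))
alpha_hat = ybar - beta_hat * xbar
print(alpha_hat, beta_hat)  # close to the true 2.0 and 0.5
```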

Reviewing Logistic Regression

Logistic regression, too, is a kind of generalized linear model.

I The linear predictor:

      η = α + β1 X1 + ··· + βn Xn

I The link function g is the logit transform:

      E(Y) = p = g⁻¹(η)  ⇔
      g(p) = ln( p / (1 − p) ) = η = α + β1 X1 + ··· + βn Xn        (1)

I The distribution around the mean is taken to be binomial.
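Equation (1) can be checked numerically: the logit maps probabilities in (0, 1) onto the whole real line (where the linear predictor lives), and its inverse maps back. A minimal sketch:

```python
import math

def logit(p):
    """g(p) = ln(p / (1 - p))"""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """g^{-1}(eta) = 1 / (1 + exp(-eta))"""
    return 1.0 / (1.0 + math.exp(-eta))

# logit and inv_logit are inverses, and p = 0.5 maps to eta = 0:
for p in (0.01, 0.25, 0.5, 0.9):
    assert abs(inv_logit(logit(p)) - p) < 1e-12
print(logit(0.5))       # 0.0
print(inv_logit(2.0))   # ~0.88: a linear predictor of 2 means p of about 0.88
```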

Reviewing GLMs

Other generalized linear models:

I Beta-binomial model (for low count data, for example)
I Ordered and unordered multinomial regression.
I ...
Determining the parameters

I How do we choose the parameters (model coefficients) {βi} and σ?
I We find the best ones.
I There are two major approaches (deeply related, yet different) in widespread use:
  I The principle of maximum likelihood: pick parameter values that maximize the probability of your data Y, i.e. choose {βi} and σ that make the likelihood P(Y | {βi}, σ) as large as possible.
  I Bayesian inference: put a probability distribution on the model parameters and update it on the basis of what parameters best explain the data:

        P({βi}, σ | Y) = P(Y | {βi}, σ) · P({βi}, σ) / P(Y)

    where P(Y | {βi}, σ) is the likelihood and P({βi}, σ) is the prior.
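The maximum-likelihood principle can be illustrated with a brute-force sketch (made-up parameter values; real fitting software uses far smarter optimizers than a grid search): simulate Gaussian-noise data, then keep the (α, β) pair on a grid with the highest log-likelihood.

```python
import math, random

random.seed(0)
alpha, beta, sigma = 1.0, 2.0, 0.5               # arbitrary "true" values
x = [random.uniform(-1, 1) for _ in range(200)]
y = [alpha + beta * xi + random.gauss(0, sigma) for xi in x]

def log_likelihood(a, b, s):
    """log P(Y | a, b, s) under independent Gaussian noise."""
    return sum(-0.5 * math.log(2 * math.pi * s * s)
               - (yi - (a + b * xi)) ** 2 / (2 * s * s)
               for xi, yi in zip(x, y))

# Scan candidate (a, b) values and keep the most likely pair:
grid = [i / 10 for i in range(-30, 31)]          # -3.0 .. 3.0 in steps of 0.1
a_hat, b_hat = max(((a, b) for a in grid for b in grid),
                   key=lambda ab: log_likelihood(ab[0], ab[1], sigma))
print(a_hat, b_hat)  # near the true (1.0, 2.0)
```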

Penalization, Regularization, etc.

I Modern models are often fit using maximization of likelihood combined with some sort of penalization, a term that 'punishes' high model complexity (high values of the coefficients).
I cf. Baayen, Davidson, and Bates (2008) for a nice description.
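The idea can be sketched in two lines (illustrative only; the quadratic, ridge-style penalty and the lambda_ weight are assumptions here, not the specific penalty any particular package uses):

```python
def penalized_objective(log_lik, betas, lambda_=1.0):
    """Log-likelihood minus a complexity penalty on the coefficients."""
    return log_lik - lambda_ * sum(b * b for b in betas)

# A model that fits slightly better but needs much larger coefficients
# can lose to a simpler model once the penalty is applied:
print(penalized_objective(-100.0, [0.5, 0.5]))   # -100.5
print(penalized_objective(-99.0, [3.0, -2.0]))   # -112.0
```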

The Linear Model

I Let's start with the Linear Model (linear regression, multiple linear regression).
A simple example

I You are studying word RTs in a lexical-decision task:

      tpozt — Word or non-word?
      house — Word or non-word?
Data: Lexical decision RTs

I Data set based on Baayen et al. (2006; available through the languageR library in the free statistics program R).
Data: Lexical decision RTs

I Lexical decisions from 79 concrete nouns, each seen by 21 subjects (1,659 observations).
I Outcome: log lexical decision latency RT
I Inputs:
  I factors (e.g. NativeLanguage: English or Other)
  I continuous predictors (e.g. Frequency).

> library(languageR)
> head(lexdec[,c(1,2,5,10,11)])
  Subject       RT NativeLanguage Frequency FamilySize
1      A1 6.340359        English  4.859812  1.3862944
2      A1 6.308098        English  4.605170  1.0986123
3      A1 6.349139        English  4.997212  0.6931472
4      A1 6.186209        English  4.727388  0.0000000
5      A1 6.025866        English  7.667626  3.1354942
6      A1 6.180017        English  4.060443  0.6931472

A simple example

I A simple model: assume that Frequency has a linear effect on average (log-transformed) RT, and that trial-level noise is normally distributed.
I If xij is the Frequency value on observation ij, our simple model is

      RTij = α + β xij + εij,    εij ∼ N(0, σ)

I We need to draw inferences about α, β, and σ
  I e.g., "Does Frequency affect RT?" → is β reliably non-zero?

Reviewing GLMs: A simple example

      RTij = α + β xij + εij,    εij ∼ N(0, σ)

I Here's a translation of our simple model into R:

> l = glm(RT ~ 1 + Frequency, data=lexdec,
+         family="gaussian")
[...]
             Estimate Std. Error t value Pr(>|t|)
(Intercept)    6.5887   0.022296 295.515   <2e-16 ***   ← α̂
Frequency     -0.0428   0.004533  -9.459   <2e-16 ***   ← β̂
> sqrt(summary(l)[["dispersion"]])
[1] 0.2353127                                           ← σ̂

Linear Model with just an intercept

I The intercept is a predictor in the model (usually one we don't care about).
→ A significant intercept indicates that it is different from zero.

> l.lexdec0 = lm(RT ~ 1, data=lexdec)
> summary(l.lexdec0)
[...]
Residuals:
     Min       1Q   Median       3Q      Max
-0.55614 -0.17048 -0.03945  0.11695  1.20222

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.385090   0.005929    1077   <2e-16 ***
[...]

NB: Here, the intercept encodes the overall mean.

Visualization of Intercept Model

[Figure "Predicting Lexical Decision RTs": scatter plot of response latency (in log-transformed msecs, roughly 6.0-7.5) against case index (0-1500).]

Case Index Generalized Linear Linear Model with one predictor Mixed Models Florian Jaeger

> l.lexdec1 = lm(RT ~ 1 + Frequency, data=lexdec)

- Classic geometrical interpretation: finding the slope for the predictor that minimizes the squared error.
  NB: Never forget the directionality in this statement (the error in predicting the outcome is minimized, not the distance from the line).
  NB: Maximum likelihood (ML) fitting is the more general approach, as it extends to other types of Generalized Linear Models. ML is identical to least-squared error for Gaussian errors.
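The least-squares claim can be checked directly. A minimal sketch, assuming the lexdec data from R's languageR package (the data set used throughout these slides):

```r
# The one-predictor least-squares slope has a closed form;
# lm() should recover the same numbers.
library(languageR)                # provides the lexdec data
x <- lexdec$Frequency
y <- lexdec$RT
beta  <- cov(x, y) / var(x)       # slope minimizing squared error
alpha <- mean(y) - beta * mean(x) # intercept through the means
c(alpha, beta)
coef(lm(RT ~ 1 + Frequency, data = lexdec))  # same intercept and slope
```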

Frequency effect on RT: Predicting Lexical Decision RTs

[Figure: scatter plot of response latency (in log-transformed msecs) against word frequency (log-transformed, 2-8), with the fitted regression line.]

Linearity Assumption

NB: Like AN(C)OVA, the linear model assumes that the outcome is linear in the coefficients (linearity assumption).
- This does not mean that the outcome and the input variable have to be linearly related (cf. previous page).
- To illustrate this, consider that we can back-transform the log-transformed Frequency (→ transformations may be necessary).

Predicting Lexical Decision RTs

[Figure: the same fit on back-transformed scales: response latency (in msecs, 500-2000) against raw word frequency (0-2000).]

Adding further predictors

- FamilySize is the number of words in the morphological family of the target word.
- For now, we are assuming two independent effects.

> l.lexdec1 = lm(RT ~ 1 + Frequency + FamilySize,
+               data=lexdec)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  6.563853   0.026826 244.685  < 2e-16 ***
Frequency   -0.035310   0.006407  -5.511 4.13e-08 ***
FamilySize  -0.015655   0.009380  -1.669   0.0953 .

Question

- On the previous slide, is the interpretation of the output clear?
- What is the interpretation of the intercept?
- How much faster is the most frequent word expected to be read compared to the least frequent word?
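One way to answer the last question; a hedged sketch, assuming lexdec (languageR) and the two-predictor model from the previous slide:

```r
# The expected difference is the Frequency slope times the observed
# frequency range (on the log-msec scale of RT).
library(languageR)
m <- lm(RT ~ 1 + Frequency + FamilySize, data = lexdec)
f <- range(lexdec$Frequency)           # least vs. most frequent word
diff(f) * coef(m)["Frequency"]         # predicted change in log RT
# In msecs, holding FamilySize at its median:
nd <- data.frame(Frequency = f, FamilySize = median(lexdec$FamilySize))
diff(exp(predict(m, nd)))              # negative: high-frequency words faster
```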

Frequency and Morph. Family Size: Predicting Lexical Decision RTs

[Figure: 3D regression surface of response latency (in log-transformed msecs) over word frequency (log-transformed, 1-8) and number of morph. family members (log-transformed, 0-3.5).]

Continuous and categorical predictors

> l.lexdec1 = lm(RT ~ 1 + Frequency + FamilySize +
+               NativeLanguage, data=lexdec)

- Recall that we're describing the output as a linear combination of the predictors.
  → Categorical predictors need to be coded numerically.
- The default is dummy/treatment coding for regression (cf. sum/contrast coding for ANOVA).
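To see the numeric coding R uses by default, a small sketch (assuming lexdec from languageR, where NativeLanguage has the levels English and Other):

```r
library(languageR)
contrasts(lexdec$NativeLanguage)  # treatment coding: English = 0, Other = 1
# The design-matrix column the regression actually sees:
head(model.matrix(~ NativeLanguage, data = lexdec))
# The ANOVA-style alternative mentioned above:
contr.sum(2)                      # sum coding: levels coded +1 / -1
```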

Adding Native Language: Predicting Lexical Decision RTs

[Figure: the 3D regression surface over word frequency (log-transformed) and number of morph. family members (log-transformed), split into Native Speakers (red) and Non-Native Speakers (blue).]

Question

- Remember that a Generalized Linear Model predicts the mean of the outcome as a linear combination.
- In the previous figure, what does 'mean' mean here?

Interactions

- Interactions are products of predictors.
- Significant interactions tell us that the slope of a predictor differs for different values of the other predictor.

> l.lexdec1 = lm(RT ~ 1 + Frequency + FamilySize +
+               NativeLanguage + Frequency:NativeLanguage,
+               data=lexdec)

Residuals:
     Min       1Q   Median       3Q      Max
-0.66925 -0.14917 -0.02800  0.11626  1.06790

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)             6.441135   0.031140 206.847  < 2e-16
Frequency              -0.023536   0.007079  -3.325 0.000905
FamilySize             -0.015655   0.008839  -1.771 0.076726
NativeLanguageOther     0.286343   0.042432   6.748 2.06e-11
Frequency:NatLangOther -0.027472   0.008626  -3.185 0.001475

Question

- On the previous slide, how should we interpret the interaction?
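One way to unpack it, using the coefficients printed above (a sketch; under treatment coding the reference level is the native-English group):

```r
# Coefficients copied from the interaction model's output.
b.freq <- -0.023536   # Frequency slope for native speakers (reference level)
b.int  <- -0.027472   # Frequency:NativeLanguageOther
b.freq + b.int        # Frequency slope for non-native speakers: -0.051008
# The interaction thus says the frequency effect is roughly twice as
# strong for non-native speakers.
```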

Interaction: Frequency & Native Language (Predicting Lexical Decision RTs)

[Figure: regression surfaces including the Frequency × NativeLanguage interaction, for Native Speakers (red) and Non-Native Speakers (blue).]

Linear Model vs. ANOVA

- Shared with ANOVA:
  - Linearity assumption (though many types of non-linearity can be investigated)
  - Assumption of normality, but as part of a more general framework that extends to other distributions in a conceptually straightforward way.
  - Assumption of independence
  NB: ANOVA is a linear model with categorical predictors.
- Differences:
  - Generalized Linear Model:
    - Consistent and transparent way of treating continuous and categorical predictors.
    - Regression encourages a priori explicit coding of hypotheses → reduction of post-hoc tests → decrease of family-wise error rate.

Hypothesis testing in psycholinguistic research

- Typically, we make predictions not just about the existence, but also the direction of effects.
- Sometimes, we're also interested in effect shapes (non-linearities, etc.)
- Unlike in ANOVA, regression analyses reliably test hypotheses about effect direction, effect shape, and effect size without requiring post-hoc analyses if (a) the predictors in the model are coded appropriately and (b) the model can be trusted.
  - cf. tomorrow

Generalized Linear Mixed Models

- Experiments don't have just one participant.
- Different participants may have different idiosyncratic behavior.
- And items may have idiosyncratic properties, too.
  → Violations of the assumption of independence!
  NB: There may even be more clustered (repeated) properties, and clusters may be nested (e.g. subjects within dialects within languages).
- We'd like to take these into account, and perhaps investigate them.
  → Generalized Linear Mixed or Multilevel Models (a.k.a. hierarchical, mixed-effects).

Recall: Generalized Linear Models

The picture:

[Diagram: a graphical-model view of the Generalized Linear Model. Observed predictors x1, x2, ..., xn generate responses y1, y2, ..., yn; all responses depend on the shared model parameters θ.]

Generalized Linear Mixed Models

[Diagram: the mixed-model extension. Cluster-specific parameters b1, b2, ..., bM ("random effects") are drawn from a distribution whose parameters Σ_b govern inter-cluster variability. Within each cluster i, predictors xi1, ..., xini generate responses yi1, ..., yini, which also depend on the shared parameters θ ("fixed effects").]

Mixed Linear Model

Back to our lexical-decision experiment:
- A variety of predictors seem to affect RTs, e.g.:
  - Frequency
  - FamilySize
  - NativeLanguage
  - Interactions
- Additionally, different participants in your study may also have:
  - different overall decision speeds
  - differing sensitivity to e.g. Frequency.
- You want to draw inferences about all these things at the same time.

Mixed Linear Model

- Random effects, starting simple: let each participant i have idiosyncratic differences in reaction times (RTs):

  RT_ij = α + β x_ij + b_i + ε_ij,   with b_i ∼ N(0, σ_b) and noise ε_ij ∼ N(0, σ)
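The model above can be simulated and re-fit; a sketch with made-up parameter values (none taken from the slides), using lme4's lmer():

```r
# Simulate RT_ij = alpha + beta*x_ij + b_i + e_ij and check that lmer()
# approximately recovers the (invented) parameters.
set.seed(1)
n.subj <- 21; n.obs <- 79               # sizes echo the lexdec data
alpha <- 6.5; beta <- -0.04             # fixed effects (made up)
sigma.b <- 0.15; sigma <- 0.18          # variance components (made up)
subj <- rep(1:n.subj, each = n.obs)
x    <- runif(n.subj * n.obs, 2, 8)     # a frequency-like predictor
b    <- rnorm(n.subj, 0, sigma.b)       # by-subject intercept adjustments
rt   <- alpha + beta * x + b[subj] + rnorm(n.subj * n.obs, 0, sigma)
library(lme4)
summary(lmer(rt ~ 1 + x + (1 | subj)))  # estimates should be close to the above
```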

Mixed linear model with one random intercept

- Idea: Model the distribution of subject differences as deviations from the grand mean.
- Mixed models approximate this deviation by fitting a normal distribution.
- Grand mean reflected in the ordinary intercept
  → By-subject mean can be set to 0
  → The only parameter fit from the data is the variance.

> lmer.lexdec0 = lmer(RT ~ 1 + Frequency +
+                     (1 | Subject), data=lexdec)
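A sketch of pulling these pieces out of the fitted model (assuming lexdec from languageR, with lme4 loaded):

```r
library(languageR); library(lme4)
lmer.lexdec0 <- lmer(RT ~ 1 + Frequency + (1 | Subject), data = lexdec)
fixef(lmer.lexdec0)    # grand-mean intercept and the Frequency slope
VarCorr(lmer.lexdec0)  # the one extra parameter: the by-subject variance
                       # (plus the residual variance)
```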

Interpretation of the output

  RT_ij = α + β x_ij + b_i + ε_ij,   with b_i ∼ N(0, σ_b) and noise ε_ij ∼ N(0, σ)

- Interpretation parallel to ordinary regression models:

Formula: RT ~ 1 + Frequency + (1 | Subject)
   Data: lexdec
   AIC  BIC logLik deviance REMLdev
-844.6 -823  426.3     -868  -852.6
Random effects:
 Groups   Name        Variance Std.Dev.
 Subject  (Intercept) 0.024693 0.15714
 Residual             0.034068 0.18457
Number of obs: 1659, groups: Subject, 21

Fixed effects:
             Estimate Std. Error t value
(Intercept)  6.588778   0.026981  244.20
Frequency   -0.042872   0.003555  -12.06

MCMC-sampling

- The t-value is anti-conservative
  → MCMC-sampling of coefficients to obtain non-anti-conservative estimates

> pvals.fnc(lmer.lexdec0, nsim = 10000)
$fixed
            Estimate MCMCmean HPD95lower HPD95upper  pMCMC Pr(>|t|)
(Intercept)   6.5888   6.5886     6.5255     6.6516 0.0001        0
Frequency    -0.0429  -0.0428    -0.0498    -0.0359 0.0001        0

$random
    Groups        Name Std.Dev. MCMCmedian MCMCmean HPD95lower HPD95upper
1  Subject (Intercept)   0.1541     0.1188   0.1205     0.0927     0.1516
2 Residual                0.1809     0.1817   0.1818     0.1753     0.1879

[Figure: posterior density plots of the MCMC samples for the Subject (Intercept) and residual (sigma) standard deviations, and for the (Intercept) and Frequency coefficients.]

Interpretation of the output

- So many new things! What is the output of the linear mixed model?
  - estimates of coefficients for fixed and random predictors.
  - predictions = fitted values, just as for an ordinary regression model.

> cor(fitted(lmer.lexdec0), lexdec$RT)^2
[1] 0.4357668

Mixed models vs. ANOVA

- Mixed models inherit all advantages from Generalized Linear Models.
- Unlike the ordinary linear model, the linear mixed model now acknowledges that there are slower and faster subjects.
- This is done without wasting k − 1 degrees of freedom on k subjects. We only need one parameter!
- Unlike with ANOVA, we can actually look at the random differences (→ individual differences).

Mixed models with one random intercept

- Let's look at the by-subject adjustments to the intercept. These are called Best Linear Unbiased Predictors (BLUPs).
- BLUPs are not fitted parameters. Only one degree of freedom was added to the model. The BLUPs are estimated a posteriori based on the fitted model:

  P(b_i | α̂, β̂, σ̂_b, σ̂, X)

- The BLUPs are the conditional modes of the b_i: the choices that maximize the above probability.
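The shrinkage behavior of the BLUPs can be seen directly; a sketch assuming lexdec (languageR) and the lmer.lexdec0 fit from above:

```r
library(languageR); library(lme4)
lmer.lexdec0 <- lmer(RT ~ 1 + Frequency + (1 | Subject), data = lexdec)
# Raw by-subject deviations from the grand mean vs. the model's BLUPs:
raw  <- with(lexdec, tapply(RT, Subject, mean) - mean(RT))
blup <- ranef(lmer.lexdec0)$Subject[, "(Intercept)"]
# The BLUPs are pulled ("shrunken") toward zero relative to the raw means.
plot(raw, blup, xlab = "raw by-subject deviation", ylab = "BLUP")
abline(0, 1, lty = 2)
```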

NB: By-subject adjustments are assumed to sum to zero, but they don't necessarily do so exactly (here: −2.3E-12).

> head(ranef(lexdec.lmer0))
$Subject
    (Intercept)
A1 -0.082668694
A2 -0.137236138
A3  0.009609997
C  -0.064365560
D   0.022963863
...

Mixed models with one random intercept

- Observed and fitted values of by-subject means:

> p = exp(as.vector(unlist(coef(lmer.lexdec0)$Subject)))
> text(p, as.character(unique(lexdec$Subject)), col = "red")
> legend(x=2, y=850, legend=c("Predicted", "Observed"),
+        col=c("blue", "red"), pch=1)

[Figure: "Subject as random effect": predicted (blue) and observed (red) by-subject mean response latencies in msecs, plotted by subject index.]

Mixed models with more random intercepts

- Unlike with ANOVA, the linear mixed model can accommodate more than one random intercept, if we think this is necessary/adequate.
- These are crossed random effects.

> lexdec.lmer1 = lmer(RT ~ 1 + (1 | Subject) + (1 | Word),
+                     data = lexdec)
> ranef(lexdec.lmer1)
$Word
         (Intercept)
almond   0.0164795993
ant     -0.0245297186
apple   -0.0494242968
apricot -0.0410707531
...
$Subject
    (Intercept)
A1 -0.082668694
A2 -0.137236138
A3  0.009609997

Mixed models with more random intercepts

- Shrinkage becomes even more visible for fitted by-word means.

[Figure: "Word as random effect": observed (v) and shrunken fitted (^) by-word mean response latencies in msecs, plotted by word index.]

Mixed models with random slopes

- Not only the intercept, but any of the slopes (of the predictors) may differ between individuals.
- For example, subjects may show different sensitivity to Frequency:

> lmer.lexdec2 = lmer(RT ~ 1 + Frequency +
+                     (1 | Subject) + (0 + Frequency | Subject) +
+                     (1 | Word),
+                     data=lexdec)
Random effects:
 Groups   Name        Variance   Std.Dev.
 Word     (Intercept) 0.00295937 0.054400
 Subject  Frequency   0.00018681 0.013668
 Subject  (Intercept) 0.03489572 0.186804
 Residual             0.02937016 0.171377
Number of obs: 1659, groups: Word, 79; Subject, 21

Fixed effects:
             Estimate Std. Error t value
(Intercept)  6.588778   0.049830  132.22
Frequency   -0.042872   0.006546   -6.55

Mixed models with random slopes

- The BLUPs of the random slope reflect the by-subject adjustments to the overall Frequency effect.

> ranef(lmer.lexdec2)
$Word
         (Intercept)
almond   0.0164795993
ant     -0.0245297186
...
$Subject
     (Intercept)    Frequency
A1 -0.1130825633 0.0020016500
A2 -0.2375062644 0.0158978707
A3 -0.0052393295 0.0034830009
C  -0.1320599587 0.0143830430
D   0.0011335764 0.0038101993
I  -0.1416446479 0.0029889156
...

Mixed model vs. ANOVA

- A mixed model with random slopes for all its predictors (incl. random intercept) is comparable in structure to an ANOVA.
- Unlike ANOVA, random effects can be fit for several grouping variables in one single model.
  → More power (e.g. Baayen, 2004; Dixon, 2008).
- No nesting assumptions need to be made (for examples of nesting in mixed models, see Barr, 2008 and his blog). As in the examples so far, random effects can be crossed.
- Assumptions about the variance-covariance matrix can be tested:
  - No need to rely on assumptions (e.g. sphericity).
  - Can test whether a specific random effect is needed (model comparison).

Random Intercept, Slope, and Covariance

- Random effects (e.g. intercepts and slopes) may be correlated.
- By default, R fits these covariances, introducing additional degrees of freedom (parameters).
- Note the simpler syntax:

> lmer.lexdec3 = lmer(RT ~ 1 + Frequency +
+                     (1 + Frequency | Subject) +
+                     (1 | Word),
+                     data=lexdec)

Random Intercept, Slope, and Covariance

Random effects:
 Groups   Name        Variance   Std.Dev. Corr
 Word     (Intercept) 0.00296905 0.054489
 Subject  (Intercept) 0.05647247 0.237639
          Frequency   0.00040981 0.020244 -0.918
 Residual             0.02916697 0.170783
Number of obs: 1659, groups: Word, 79; Subject, 21

Fixed effects:
             Estimate Std. Error t value
(Intercept)  6.588778   0.059252  111.20
Frequency   -0.042872   0.007312   -5.86

- What do such covariance parameters mean?

Covariance of random effects: An example

[Figure: "Random Effect Correlation": by-subject Frequency BLUPs plotted against by-subject (Intercept) BLUPs; the strong negative correlation (−0.918) is clearly visible.]

Plotting Random Effects: Example

- Plotting random effects sorted by magnitude of the first BLUP (here: the intercept) and with the posterior variance-covariance of the random effects conditional on the estimates of the model parameters and on the data:

> dotplot(ranef(lmer.lexdec3, postVar=TRUE))

[Figure: dotplot of the by-subject (Intercept) and Frequency BLUPs, subjects sorted by the magnitude of their intercept BLUP.]

Plotting Random Effects: Example

- Plotted without forcing scales to be identical:

> dotplot(ranef(lmer.lexdec3, postVar=TRUE),
+         scales = list(x =
+         list(relation = 'free')))[["Subject"]]

[Figure: the same by-subject dotplot, now with separate x-axis scales for the (Intercept) and Frequency panels.]

Plotting Random Effects: Example

- Plotting observed against theoretical quantiles:

[Figure: quantile-quantile plots of the by-subject (Intercept) and Frequency BLUPs against standard normal quantiles.]

Is the Random Slope Justified?

- One great feature of Mixed Models is that we can assess whether a certain random effect structure is actually warranted given the data.
- Just as nested ordinary regression models can be compared (cf. stepwise regression), we can compare models with nested random effect structures.
- Here, model comparison shows that the covariance parameter of lmer.lexdec3 significantly improves the model compared to lmer.lexdec2 with both the random intercept and slope for subjects, but no covariance parameter (χ²(1) = 21.6, p < 0.0001).
- The random slope overall is also justified (χ²(2) = 24.1, p < 0.0001).
→ Despite the strong correlation, the two random effects for subjects are needed (given the fixed effect predictors in the model).
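These model comparisons are likelihood-ratio tests against a χ² reference distribution (in R, typically obtained with anova(model1, model2)). As a quick sanity check (my addition, not from the slides), the reported p-values can be reproduced from the closed-form χ² upper tail for 1 and 2 degrees of freedom:

```python
import math

# Sketch (my addition): the model comparisons above are likelihood-ratio
# tests with a chi-squared reference distribution. For df = 1 and df = 2
# the upper tail has a closed form, so the reported p-values can be checked.
def chi2_sf(x, df):
    """P(X > x) for X ~ chi-squared with df degrees of freedom (df in {1, 2})."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise NotImplementedError(df)

p_cov = chi2_sf(21.6, df=1)     # test of the covariance parameter
p_slope = chi2_sf(24.1, df=2)   # test of the random slope (+ covariance)
assert p_cov < 0.0001 and p_slope < 0.0001   # matches "p < 0.0001"
```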

Interactions

> lmer.lexdec4b = lmer(RT ~ 1 + NativeLanguage * (
+                      Frequency + FamilySize + SynsetCount +
+                      Class) +
+                      (1 + Frequency | Subject) + (1 | Word),
+                      data=lexdec)
[...]
Fixed effects:
                             Estimate Std. Error t value
(Intercept)                  6.385090   0.030425  209.86
cNativeEnglish              -0.155821   0.060533   -2.57
cFrequency                  -0.035180   0.008388   -4.19
cFamilySize                 -0.019757   0.012401   -1.59
cSynsetCount                -0.030484   0.021046   -1.45
cPlant                      -0.050907   0.015609   -3.26
cNativeEnglish:cFrequency    0.032893   0.011764    2.80
cNativeEnglish:cFamilySize   0.018424   0.015459    1.19
cNativeEnglish:cSynsetCount -0.022869   0.026235   -0.87
cNativeEnglish:cPlant        0.082219   0.019457    4.23

Interactions

> p.lmer.lexdec4b = pvals.fnc(lmer.lexdec4b,
+                             nsim=10000, withMCMC=T)
> p.lmer.lexdec4b$fixed
                    Estimate MCMCmean HPD95lower HPD95upper  pMCMC Pr(>|t|)
(Intercept)           6.4867   6.4860     6.3839     6.5848 0.0001   0.0000
NativeLanguageOther   0.3314   0.3312     0.1990     0.4615 0.0001   0.0000
Frequency            -0.0211  -0.0210    -0.0377    -0.0048 0.0142   0.0156
FamilySize           -0.0119  -0.0120    -0.0386     0.0143 0.3708   0.3997
SynsetCount          -0.0403  -0.0401    -0.0852     0.0050 0.0882   0.0920
Classplant           -0.0157  -0.0155    -0.0484     0.0181 0.3624   0.3767
NatLang:Frequency    -0.0329  -0.0329    -0.0515    -0.0136 0.0010   0.0006
NatLang:FamilySize   -0.0184  -0.0184    -0.0496     0.0109 0.2416   0.2366
NatLang:SynsetCount   0.0229   0.0230    -0.0297     0.0734 0.3810   0.3866
NatLang:Classplant   -0.0822  -0.0825    -0.1232    -0.0453 0.0001   0.0000

Visualizing an Interaction

[Figure: predicted RT (log scale, ~6.2–6.8) as a function of Frequency (2–8), plotted separately for NativeLanguage = English and Other; the Frequency slopes differ between the two groups.]

MCMC

[Figure: densities of the MCMC samples ("Posterior Values") for each model parameter: the fixed effects and their interactions, the Word and Subject random-intercept standard deviations, and sigma.]

Mixed Logit Model

- So, what do we need to change if we want to investigate, e.g., a binary (categorical) outcome?

Recall that ...

... logistic regression is a kind of generalized linear model.

- The linear predictor:

      η = α + β₁X₁ + ··· + βₙXₙ

- The link function g is the logit transform:

      E(Y) = p = g⁻¹(η)  ⇔
      g(p) = ln(p / (1 − p)) = η = α + β₁X₁ + ··· + βₙXₙ    (2)

- The distribution around the mean is taken to be binomial.

Mixed Logit Models

- Mixed Logit Models are a type of Generalized Linear Mixed Model.
- More generally, one advantage of the mixed model approach is its flexibility. Everything we learned about mixed linear models extends to other types of distributions within the exponential family (binomial, multinomial, poisson, beta-binomial, ...).
- Caveat: There are some implementational details (depending on your stats program, too) that may differ.

An example

- The same model as above, but now we predict whether participants' answer to the lexical decision task was correct.
- Outcome: Correct vs. incorrect answer (binomial outcome)
- Predictors: same as above

> lmer.lexdec.answer4 = lmer(Correct == "correct" ~ 1 +
+                            NativeLanguage * (
+                            Frequency + FamilySize + SynsetCount +
+                            Class) +
+                            (1 + Frequency | Subject) + (1 | Word),
+                            data=lexdec, family="binomial")

NB: The only difference is the outcome variable and the family (the assumed noise distribution), which now is binomial (we didn't specify it before because "gaussian" is the default).

Mixed Logit Output

[...]
   AIC   BIC logLik deviance
   495 570.8 -233.5      467
Random effects:
 Groups  Name        Variance Std.Dev. Corr
 Word    (Intercept) 0.78368  0.88526
 Subject (Intercept) 2.92886  1.71139
         Frequency   0.11244  0.33532  -0.884
Number of obs: 1659, groups: Word, 79; Subject, 21

Fixed effects:
                            Estimate Std. Error z value Pr(>|z|)
(Intercept)                   4.3612     0.3022  14.433  < 2e-16 ***
cNativeEnglish                0.2828     0.5698   0.496  0.61960
cFrequency                    0.6925     0.2417   2.865  0.00417 **
cFamilySize                  -0.2250     0.3713  -0.606  0.54457
cSynsetCount                  0.8152     0.6598   1.235  0.21665
cPlant                        0.8441     0.4778   1.767  0.07729 .
cNativeEnglish:cFrequency     0.2803     0.3840   0.730  0.46546
cNativeEnglish:cFamilySize   -0.2746     0.5997  -0.458  0.64710
cNativeEnglish:cSynsetCount  -2.6063     1.1772  -2.214  0.02683 *
cNativeEnglish:cPlant         1.0615     0.7561   1.404  0.16035
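The estimates above live in log-odds (logit) space; to interpret them as probabilities we apply the inverse logit. A small sketch of the back-transformation (my addition, not from the slides), using the intercept from the output:

```python
import math

# Sketch (my addition): mixed logit fixed effects live in log-odds space;
# the inverse logit maps them back to probabilities.
def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

# the intercept 4.3612 from the output above:
p_correct = inv_logit(4.3612)
print(round(p_correct, 3))   # -> 0.987: answers are overwhelmingly correct

# near such a ceiling, a fixed logit difference corresponds to a much
# smaller probability difference than it would near p = 0.5:
assert inv_logit(0.0) == 0.5
assert inv_logit(5.0) - inv_logit(4.0) < inv_logit(1.0) - inv_logit(0.5)
```

This nonlinearity is why the same interaction looks different when plotted in logit space and in probability space, as on the following slides.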

Interaction in logit space

[Figure: predicted log-odds of a correct answer (~2–6) as a function of Frequency (2–8), plotted separately for NativeLanguage = English and Other.]

Interaction in probability space

[Figure: the same interaction back-transformed to probability space: predicted P(correct) (~0.86–1.00) as a function of Frequency (2–8), in separate panels for NativeLanguage = English and Other.]

Why not ANOVA?

- ANOVA over proportions has several problems (cf. Jaeger, 2008 for a summary):
  - Hard to interpret output
  - Violated assumption of homogeneity of variances

Why not ANOVA?

- These problems can be addressed via transformations, weighted regression, etc. But why should we do this if there is an adequate approach that does not need fudging and has more power?

Summary

- There are a lot of issues we have not covered today (by far most of these are not particular to mixed models, but apply equally to ANOVA).
- The mixed model approach has many advantages:
  - Power (especially on unbalanced data)
  - No assumption of homogeneity of variances
  - Random effect structure can be explored and understood
  - Extendability to a variety of distributional families
  - Conceptual transparency
  - Effect direction, shape, and size can be easily understood and investigated
→ You end up getting another perspective on your data.

Modeling schema

[Figure: schema of the modeling workflow.]

Two Methods

      RT_ij = α + βx_ij + ε_ij,    where ε_ij ∼ N(0, σ) is the noise term

- How do we fit the parameters β_i and σ (choose model coefficients)?
- There are two major approaches (deeply related, yet different) in widespread use:
  - The principle of maximum likelihood: pick parameter values that maximize the probability of your data Y:
    choose {β_i} and σ that make the likelihood P(Y | {β_i}, σ) as large as possible
  - Bayesian inference: put a probability distribution on the model parameters and update it on the basis of what parameters best explain the data:

        P({β_i}, σ | Y) = P(Y | {β_i}, σ) · P({β_i}, σ) / P(Y)

    where P(Y | {β_i}, σ) is the likelihood and P({β_i}, σ) the prior.
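For the ordinary linear model the maximum-likelihood view has a concrete payoff: with Gaussian noise, maximizing the likelihood in (α, β) is the same as minimizing squared error, so the least-squares estimates maximize P(Y | {β_i}, σ). A numeric sketch on simulated data (my addition, not from the slides):

```python
import math
import random

# Sketch (my addition): with Gaussian noise, the least-squares estimates
# maximize the likelihood. We simulate data, compute the closed-form
# least-squares fit, and check that nudging the slope or intercept away
# from it lowers the log likelihood.
random.seed(0)
alpha, beta, sigma = 500.0, 12.0, 40.0
x = [random.randint(1, 8) for _ in range(200)]
y = [alpha + beta * xi + random.gauss(0, sigma) for xi in x]

def loglik(a, b, s):
    return sum(-0.5 * math.log(2 * math.pi * s * s)
               - (yi - a - b * xi) ** 2 / (2 * s * s)
               for xi, yi in zip(x, y))

# closed-form least-squares slope and intercept
mx, my = sum(x) / len(x), sum(y) / len(y)
b_hat = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
a_hat = my - b_hat * mx

best = loglik(a_hat, b_hat, sigma)
assert best > loglik(a_hat, b_hat + 1.0, sigma)  # any perturbation of the
assert best > loglik(a_hat, b_hat - 1.0, sigma)  # slope or the intercept
assert best > loglik(a_hat + 5.0, b_hat, sigma)  # lowers the likelihood
```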

Fitting Mixed Models

      RT_ij = α + βx_ij + b_i + ε_ij,    where b_i ∼ N(0, σ_b) and ε_ij ∼ N(0, σ)

- A couple of caveats about current implementations:
  - To avoid biased variance estimates, linear mixed models are sometimes fit with Restricted Maximum Likelihood.
  - There are no known analytic solutions to the likelihood formula of mixed logit models. Instead, the Laplace Approximation is used, which, however, provides a decent approximation (Harding and Hausman, 2007). In modern implementations, this approximation can be improved (at the cost of increased computational cost).
  - Finally, as for all models/analyses, statistics are only a tool; whether we can trust our results depends on how carefully we use these tools → tomorrow's lecture.

A simulated example

Generalized Linear Model Graphical Model View ∼N(0,σb ) Noise∼N(0,σ) Theory z}|{ z}|{ RTij = α + βxij + bi + ij Linear Model An Example Geometrical Intuitions I Simulation of trial-level data can be invaluable for Comparison to ANOVA

achieving deeper understanding of the data Generalized Linear ## simulate some data Mixed Model > sigma.b <- 125 # inter-subject variation larger than Graphical Model View > sigma.e <- 40 # intra-subject, inter-trial variation > alpha <- 500 Linear Mixed > beta <- 12 Model > M <- 6 # number of participants Getting an Intuition > n <- 50 # trials per participant Understanding More > b <- rnorm(M, 0, sigma.b) # individual differences Complex Models > nneighbors <- rpois(M*n,3) + 1 # generate num. neighbors Mixed Logit > subj <- rep(1:M,n) Models > RT <- alpha + beta * nneighbors + # simulate RTs! b[subj] + rnorm(M*n,0,sigma.e) # Summary Extras Fitting Models A Simulated Example Generalized Linear A simulated example Mixed Models Florian Jaeger


[Figure: simulated RT (ms) plotted against # Neighbors, one point per trial; horizontal lines mark the subject intercepts (alpha + b[subj])]

▶ Participant-level clustering is easily visible.
▶ This reflects the fact that the (simulated) inter-participant variation (125ms) is larger than the (simulated) inter-trial variation (40ms).
▶ The (simulated) effect of neighborhood density is also visible.
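The clustering claim can also be checked numerically rather than visually. Below is a minimal Python translation of the R simulation above (numpy assumed; variable names mirror the R code), comparing the spread of the per-subject mean RTs to the spread of trials around their own subject's mean:

```python
import numpy as np

rng = np.random.default_rng(42)
sigma_b, sigma_e = 125.0, 40.0  # inter- vs. intra-subject sd (ms)
alpha, beta = 500.0, 12.0       # intercept and neighborhood effect
M, n = 6, 50                    # participants, trials per participant

b = rng.normal(0.0, sigma_b, M)         # individual differences
nneighbors = rng.poisson(3, M * n) + 1  # number of neighbors >= 1
subj = np.repeat(np.arange(M), n)       # subject index per trial
RT = alpha + beta * nneighbors + b[subj] + rng.normal(0.0, sigma_e, M * n)

# Spread of per-subject means vs. spread of trials around their own mean:
subj_means = np.array([RT[subj == i].mean() for i in range(M)])
within = RT - subj_means[subj]
print(subj_means.std(ddof=1), within.std(ddof=1))
```

With these settings the between-subject spread typically dwarfs the within-subject spread, which is exactly the clustering the plot shows.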
