<<

Statistics 203: Introduction to Regression and Analysis of Generalized Linear Models I

Jonathan Taylor

- p. 1/18 Today's class

● Today's class ■ . ● Generalized linear models ● example ■ ● Binary outcomes Generalized linear models. ● transform ■ ● Binary regression . ● Link functions: binary regression ● Link function inverses: binary regression ● Odds ratios & logistic regression ● Link & variance fns. of a GLM ● Binary (again) ● Fitting a binary regression GLM: IRLS ● Other common examples of GLMs ● Deviance ● Binary deviance ● Partial deviance tests 2 ● Wald χ tests

- p. 2/18 Generalized linear models

● Today's class ■ All models we have seen so far deal with continuous ● Generalized linear models ● Binary regression example outcome variables with no restriction on their expectations, ● Binary outcomes ● Logit transform and (most) have assumed that and variance are ● Binary regression ● Link functions: binary unrelated (i.e. variance is constant). regression ● Link function inverses: binary ■ Many outcomes of interest do not satisfy this. regression ● Odds ratios & logistic ■ regression Examples: binary outcomes, Poisson count outcomes. ● Link & variance fns. of a GLM ● Binary (again) ■ A Generalized (GLM) is a model with two ● Fitting a binary regression GLM: IRLS ingredients: a link function and a . ● Other common examples of GLMs ◆ The link relates the of the observations to ● Deviance ● Binary deviance predictors: linearization ● Partial deviance tests 2 ◆ ● Wald χ tests The variance function relates the means to the .

- p. 3/18 Binary regression example

● Today's class ■ A local health clinic sent fliers to its clients to encourage ● Generalized linear models ● Binary regression example everyone, but especially older persons at high risk of ● Binary outcomes ● Logit transform complications, to get a flu shot in time for protection against ● Binary regression ● Link functions: binary an expected flu epidemic. regression ● Link function inverses: binary ■ In a pilot follow-up study, 50 clients were randomly selected regression ● Odds ratios & logistic and asked whether they actually received a flu shot. regression ● Link & variance fns. of a GLM ■ ● Binary (again) In addition, were collected on their age and their health ● Fitting a binary regression GLM: IRLS awareness. ● Other common examples of GLMs ■ Here is the data. ● Deviance ● Binary deviance ● Partial deviance tests 2 ● Wald χ tests

- p. 4/18 Binary outcomes

● Today's class ■ Suppose outcome Y is a 0-1 , then ● Generalized linear models i ● Binary regression example ● Binary outcomes µ E Y π . ● Logit transform i = ( i) = i ● Binary regression ● Link functions: binary ■ regression Variance function ● Link function inverses: binary regression ● Odds ratios & logistic Var(Yi) = πi(1 − πi) = µi(1 − µi) regression ● Link & variance fns. of a GLM ● Binary (again) Variance is related to mean! ● Fitting a binary regression GLM: IRLS ■ A convenient way to model the dependence of Y on ● Other common examples of i GLMs ● Deviance covariates Xi1, . . . , Xi,p−1 is through the logit transform. ● Binary deviance ● Partial deviance tests 2 p−1 ● Wald χ tests logit(πi) = β0 + βj Xij j=1 X

- p. 5/18 Logit transform

● Today's class ■ Logit transform: ● Generalized linear models ● Binary regression example ● Binary outcomes π ● Logit transform logit π , ● Binary regression ( ) = log ∈ (−∞ +∞) ● Link functions: binary 1 − π regression   ● Link function inverses: binary ■ Inverse: regression x ● Odds ratios & logistic −1 e regression logit (x) = ∈ (0, 1). ● Link & variance fns. of a GLM x ● Binary (again) 1 + e ● Fitting a binary regression ■ GLM: IRLS Derivative: ● Other common examples of GLMs ● Deviance d 1 1 ● Binary deviance logit(π) = = . ● Partial deviance tests dπ π(1 − π) V (π) 2 ● Wald χ tests Note: special relation between derivatives and variance function – more on this next lecture.

- p. 6/18 Binary regression

● Today's class ■ Logistic regression model: ● Generalized linear models ● Binary regression example ● Binary outcomes logit(E(Yi)) = logit(πi) ● Logit transform ● Binary regression p−1 ● Link functions: binary regression ● Link function inverses: binary = β0 + βj Xij regression ● Odds ratios & logistic j=1 regression X ● Link & variance fns. of a GLM ■ ● Binary (again) regression model: ● Fitting a binary regression GLM: IRLS ● Other common examples of p−1 GLMs −1 ● Deviance Φ (E(Yi)) = β0 + βj Xij ● Binary deviance ● Partial deviance tests j=1 2 ● Wald χ tests X where Φ is CDF of N(0, 1), i.e. Φ(t) = pnorm(t). ■ In each case, Var(Yi) = πi(1 − πi) but the regression model is different. ■ Here is an example

- p. 7/18 Link functions: binary regression 6 4 logit probit 2 0 f(pi) −2 −4 −6

0.0 0.2 0.4 0.6 0.8 1.0

pi - p. 8/18 Link function inverses: binary regression 1.0 0.8 logit−inv probit−inv 0.6 f(pi) 0.4 0.2 0.0

−4 −2 0 2 4

pi - p. 9/18 Odds ratios & logistic regression

● Today's class ■ For any event A and any P ● Generalized linear models ● Binary regression example ● Binary outcomes P(A) ● Logit transform ODDS(A) = . ● Binary regression P A ● Link functions: binary 1 − ( ) regression ● Link function inverses: binary ■ regression In the logistic regression model with outcome Y ● Odds ratios & logistic regression ● ODDS Y . . . , X x , . . . Link & variance fns. of a GLM ( = 1| j = j + 1 ) βj ● Binary (again) = e ● Fitting a binary regression ODDS(Y = 1| . . . , Xj = xj , . . . ) GLM: IRLS ● Other common examples of GLMs is the (multiplicative) change in odds if variable X increases ● Deviance j ● Binary deviance by 1: βj is known as the for . ● Partial deviance tests e Xj 2 ● Wald χ tests ■ If Xj ∈ {0, 1} is dichotomous, and P(Y = 1| . . . ) is small (rare event hypothesis) then group with Xj = 1 are approximately eβj more likely to have event, all other parameters being the same.

- p. 10/18 Link & variance fns. of a GLM

● Today's class ■ If ● Generalized linear models ● Binary regression example k ● Binary outcomes E ● Logit transform ηi = g( (Yi)) = g(µi) = β0 + βj Xij ● Binary regression j=1 ● Link functions: binary regression X ● Link function inverses: binary then g is called the link function for the model. regression ● Odds ratios & logistic ■ regression If ● Link & variance fns. of a GLM E ● Binary (again) Var(Yi) = φ · V ( (Yi)) = φ · V (µi) ● Fitting a binary regression GLM: IRLS for φ > 0 and some function V , then V is the called variance ● Other common examples of GLMs ● Deviance function for the model. ● Binary deviance ■ ● Partial deviance tests “Canonical”reference: Generalized Linear Models, 2 ● Wald χ tests McCullagh and Nelder.

- p. 11/18 Binary (again)

● Today's class ■ For a logistic model, ● Generalized linear models ● Binary regression example ● Binary outcomes g(µ) = logit(µ), V (µ) = µ(1 − µ). ● Logit transform ● Binary regression ● Link functions: binary ■ For a , regression ● Link function inverses: binary regression −1 ● Odds ratios & logistic g(µ) = Φ (µ), V (µ) = µ(1 − µ). regression ● Link & variance fns. of a GLM ● Binary (again) ● Fitting a binary regression GLM: IRLS ● Other common examples of GLMs ● Deviance ● Binary deviance ● Partial deviance tests 2 ● Wald χ tests

- p. 12/18 Fitting a binary regression GLM: IRLS

● Today's class ■ Algorithm ● Generalized linear models ● Binary regression example 1. Initialize: set µi = 0.999 or 0.001 depending on whether ● Binary outcomes ● Logit transform Yi = 1 or 0. ● Binary regression 0 ● Link functions: binary 2. Compute Z = g(µ ) + g (µ )(Y − µ ). regression i b i i i i ● Link function inverses: binary −1 0 2 regression 3. Use weights Wi = g (µi) V (µi) to regress Z onto X’s to ● Odds ratios & logistic regression get β using WLS.b b b ● Link & variance fns. of a GLM ● Binary (again) −1 b p b ● Fitting a binary regression 4. Compute µi = g β0 + Xijβj . GLM: IRLS j=1 ● Other common examples of b GLMs 5. Repeat steps 2-4 until convergence.  ● Deviance b P b ● Binary deviance ■ b ● Partial deviance tests Approximate distribution 2 ● Wald χ tests β ∼ N(β, φ(XtW X)−1). ■ If φ has to be estimated, a simple choice is Pearson’s X 2: b c n 1 (Y − µ )2 φ = i i . n − p V (µ ) i=1 i X b b - p. 13/18 b Other common examples of GLMs

● Today's class ■ Standard multiple : g µ µ, Var µ ● Generalized linear models ( ) = ( ) = 1 ● Binary regression example ■ ● Binary outcomes Linear regression with variance tied to mean, for example: ● Logit transform 2 ● Binary regression g(µ) = µ, Var(µ) = µ . ● Link functions: binary ■ regression Poisson log-linear models: g(µ) = log(µ), Var(µ) = µ. ● Link function inverses: binary regression ● Odds ratios & logistic regression ● Link & variance fns. of a GLM ● Binary (again) ● Fitting a binary regression GLM: IRLS ● Other common examples of GLMs ● Deviance ● Binary deviance ● Partial deviance tests 2 ● Wald χ tests

- p. 14/18 Deviance

● Today's class ■ Instead of , models are fit on the basis of scaled ● Generalized linear models ● Binary regression example deviance, analogous to SSE when errors are not Gaussian. ● Binary outcomes ● Logit transform ■ ● Binary regression This is not completely clear from the IRLS algorithm, but it is ● Link functions: binary very close to the Newton-Raphson algorithm. Algorithm regression ● Link function inverses: binary often called Fisher scoring. regression ● Odds ratios & logistic ■ regression ● Link & variance fns. of a GLM ● Binary (again) DEV (µ, Y ) = −2 log L(µ, Y ) + −2 log L(Y, Y ) ● Fitting a binary regression GLM: IRLS ● Other common examples of where µ is a location estimator for Y (usually in an GLMs ● Deviance – more next lecture). ● Binary deviance ■ 2 ● Partial deviance tests If Y is Gaussian with independent N µ , σ entries 2 ( i ) ● Wald χ tests n 1 DEV (µ, Y ) = (Y − µ )2 σ2 i i i=1 X

- p. 15/18 Binary deviance

● Today's class ■ If Y is a vector of independent 0-1 random variables ● Generalized linear models ● Binary regression example ● Binary outcomes n ● Logit transform ● Binary regression DEV (µ, Y ) = −2 Yi log µi + (1 − Yi) log(1 − µi) ● Link functions: binary regression i=1 ! ● Link function inverses: binary X regression ■ ● Odds ratios & logistic Uses the facts regression ● Link & variance fns. of a GLM ● Binary (again) 0 Yi = 1 ● Fitting a binary regression lim −2(Yi log µi + (1 − Yi) log(1 − µi)) = GLM: IRLS µi↑1 Y ● Other common examples of (∞ i = 0 GLMs ● Deviance ● Binary deviance 0 Yi = 0 ● Partial deviance tests Y µ Y µ 2 lim −2( i log i + (1 − i) log(1 − i)) = ● Wald χ tests µi↓0 (∞ Yi = 1

- p. 16/18 Partial deviance tests

● Today's class ■ As in multiple regression to test that some subset ● Generalized linear models ● Binary regression example H0 : βi1 = · · · = βil = 0 we fit a full and a reduced model. ● Binary outcomes ● Logit transform ■ Asymptotic theory tell us that ● Binary regression ● Link functions: binary regression ● Link function inverses: binary DEV (R) − DEV (F ) 2 2 regression DEVχ = ∼ χdfR−dfF . ● Odds ratios & logistic φ regression ● Link & variance fns. of a GLM ● Binary (again) If deviance is unavailable, Pearson’s X 2 is substituted, φ is ● Fitting a binary regression b GLM: IRLS 2 ● Other common examples of analogous to σ . GLMs ● ■ 2 Deviance Reject H0 if DEVχ2 is larger than χ . b ● Binary deviance dfR−dfF ,1−α ● Partial deviance tests 2 ● Wald χ tests

- p. 17/18 Wald χ2 tests

● Today's class ■ 2 ● Generalized linear models Test can also be done using a Wald χ which does not fit a ● Binary regression example full and reduced model. ● Binary outcomes ● Logit transform ■ 2 ● Binary regression Wald χ to test Cβ = 0: ● Link functions: binary regression t −1 −1 t −1 t ● Link function inverses: binary W ALDχ2 = Cβ(φ · C(X W X) C ) (Cβ) . regression ● Odds ratios & logistic regression ■ 2 Reject H0 : Cβ = 0 if W ALD 2 is larger than χ ● Link & variance fns. of a GLM b b χ #rowsbC,1−α ● Binary (again) ● Fitting a binary regression (assuming C is full rank). GLM: IRLS ● Other common examples of GLMs ● Deviance ● Binary deviance ● Partial deviance tests 2 ● Wald χ tests

- p. 18/18