<<

Generalized Linear Models

Generalized Linear Models

MIT 18.655

Dr. Kempthorne

Spring 2016

1 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Outline

1 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models

2 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models

Data: (yi , xi ), i = 1,..., n where yi : response variable T xi = (xi,1,..., xi,p) : p explanatory variables

p Linear predictor: For β = (β1, . . . , βp) ∈ R : rp xi β = j=1 xi,j βj

Probability Model: {yi } independent, canonical exponential r.v.’s: η y −A(η ) Density: p(yi | ηi ) = e i i i h(x) • : µi = E [Yi ] = A(ηi ) Link Function g(·): g(µi ) = xi β ˆ ˆ With estimate β: xi β = gˆ(µi ). • −1 T Canonical Link Function: g(µi ) = ηi = [A ] (µi ) = xi β 3 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Matrix Notation ⎡⎤⎡ ⎤ y1 x1,1 x1,2 ··· x1,p ⎡ ⎤ β1 ⎢ y2 ⎥ ⎢ x2,1 x2,2 ··· x2,p ⎥ . y = ⎢ ⎥ X = ⎢ ⎥ β = ⎢ . ⎥ ⎢ . ⎥ ⎢ ...... ⎥ ⎣ . ⎦ ⎣ . ⎦ ⎣ .. . . ⎦ βp yn xn,1 xn,2 ··· xp,n

⎡ ⎤ ⎡ • ⎤ µ1 A(η1) ⎢ • ⎥ ⎢ µ2 ⎥ ⎢ A(η2) ⎥ • E [y] = µ = ⎢ . ⎥ = ⎢ . ⎥ = A(η) ⎢ . ⎥ ⎢ . ⎥ ⎣ ⎦ ⎣ . ⎦ µ • n A(ηn) Examples: θi yi ∼ Bernoulli(θi ): ηi = log( ) 1−θi yi ∼ Poisson(λi ): ηi = log(λi ) yi ∼ Gaussian(µi , 1) : ηi = µi 4 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Outline

1 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models

5 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models

Log- for Generalized Linear Model rn i(β) = i=1(ηi yi − A(ηi ) + log[h(yi )]) ∝ ηT y − 1T A(η) = g(µ)T y − 1T A(η) = [Xβ]T y − 1T A(η) (for Canonical Link) Note: T (y) = XT y is sufficient when g(·) is canonical

Maximum Likelihood Estimation of β Solve for {βm, m = 1, 2,...} iteratively: • • •• 0 = i(βm+1) = i(βm) + (βm+1 − βm)i(βm) •• • −1 =⇒ βm+1 = βm + [−i(βm)] i(β m) “Fisher Scoring Algorithm” ⇐⇒ Newton-Raphson

Pr ˆ βm −−−→ β (the MLE)

6 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models

For Canonical• Link ∂ rn i(β) = ∂β [(i=1 ηi yi − A(ηi ) + log[h(yi )])] rn ∂ = i=1 ∂β [(ηi yi − A(ηi ) + log[h(yi )])] � � � � rn ∂ ∂ηi = i=1 [(ηi yi − A(ηi ) + log[h(yi )])] ∂ ηi ∂β � • � � T � rn ∂xi β = yi − A(ηi ) i=1 ∂ β =rn (y − µ ) x = rn x (y − µ ) = [X]T (y − µ) i=1 i i i i=1 i i i •• • � • � � ∂xT β � i(β) = ∂ [i(β)] = ∂ [rn y − A(η ) i ] ∂βT ∂βT i=1 i i ∂β � • � = ∂ [r n y − A(η ) x ] ∂βT i =1 i i i � • � = rn − ∂ [ A(η )] x ] i=1 ∂βT i i � • � rn ∂ ∂ηi = − [A(ηi )] T xi ] i=1 ∂ηi ∂β � •• � � •• � = rn −[A (η )] x ] ∂ηi = rn −[A (η )] x xT i=1 i i ∂βT i=1 i i i = XT WX

7 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models

For Canonical• Link ∂ rn i(β) =∂β [ i=1 (ηi yi − A(ηi ) + log[h(yi )])] = XT (y − µ)

•• • �• � �∂xT β �  i(β) = ∂ [i(β)] = ∂ [r n y − A(η )]i ∂βT ∂βT i=1 i i ∂β •• rn � � T = i=1 −[A(ηi )] xi xi = XT WX •• where W = Cov(y) is diagonal with Wi,i = A(ηi ) = Var[yi ]

8 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models

Iteratively Re-weighted Least Squares Interpretation • Given βm, and µˆ(βm) = A(Xβm), •• • −1 βm+1 = βm + [−i(βm)] i(β m) T −1 T = βm + [X WX] [X] (y − µˆ(βm))

Δ = (βm+1 − βm ) is solved as the WLS regression of y∗ = [y − µˆ(βm)] on X∗ = WX using Σ∗ = Cov(y∗) = W

The WLS estimate of Δ is given by: ˆ −1 −1 T −1 Δ = [X∗Σ∗ X∗] X∗ Σ∗ y∗ T −1 −1 T −1 = [X WW WX] X WW [y − µˆ(βm)] T −1 T = [X WX] X [y − µˆ(βm)]

9 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Outline

1 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models

10 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Logistic Regression for Binary Responses

Binomial : Yi ∼ Binomial(mi , πi ), i = 1,..., k. Log-Likelihood Function � �  k πi i(π1, . . . , πk ) = [(Yi log ) + mi log(1 − πi )] i=1 1−πi

Covariates: {xi , i = 1,..., k}

Logistic Regression : T ηi = log[πi /(1 − π − i)] = xi β

W = Cov(Y) = diag(mi πi (1 − πi )), (k × k matrix)

11 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Outline

1 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models

12 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models

Likelihood Ratio Tests for Generalized Linear Models Consider testing • H0 : µ = µ0 = A(Xβ0). vs HAlt : µ = µ∗ (general n-vector) • e.g., µ∗ = A(X∗β∗) with X∗ = In and β∗ = η. Suppose y is in interior of convex support of {y : p(y | η) > 0}. • −1 Then ηˆ = η(y) = A (y) is the MLE under HAlt The Likelihood Ratio Test of H0 vs HAlt : 2 log λ = 2[i (η(Y )) − i(η(µ0))] [ T = 2 [η(y) − η(µ0)] y − [A(η(y)) − A(η(µ0))] = “Deviance” Between y and µ0

13 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models

Deviance Formulas for Distributions

r 2 2 Gaussian :(i yi − µi ) /σ0 r Poisson : 2i [yi log(yi /µˆi ) − (yi − µˆi )] r Binomial : 2 i [yi log(yi /µˆi ) + (mi − yi ) log[(mi − yi )/(mi − µˆi )]

14 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Outline

1 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models

15 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Vector Generalized Linear Models

Data: (yi , xi ), i = 1,..., n where yi : a q-dimensional response vector variable T xi = (xi,1,..., xi,p) : p explanatory variables

Probability Model: The conditional distributions of each yi given xi is of the form p(y | x; B, φ) = f (y, η1, . . . , ηM , φ) for some known function f (·), where B = [β1β2 ··· βM ] is a p × M matrix of unknown regression coefficients.

M Linear Predictors: For j = 1,..., M, the jth linear predictor is T rp ηj = ηj (x) = βj x = k=1 Bkj xk , T where x = (x1,..., xp) with x1 = 1 when there is an intercept.

16 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Matrix Notation

T ⎤⎡ ⎡ T ⎡⎤ ⎤ y1 x1 x1,1 x1,2 ··· x1,p T T ⎢ y 2 ⎥ ⎢ x 2 ⎥ ⎢ x2,1 x2,2 ··· x2,p ⎥ Y = ⎢ ⎥ X = ⎢ ⎥ = ⎢ ⎥ ⎢ . ⎥ ⎢ . ⎥ ⎢ ...... ⎥ ⎣ . ⎦ ⎣ . ⎦ ⎣ .. . . ⎦ T T yn xn xn,1 xn,2 ··· xp,n

B = [β1|β2| · · · βM ]

Link Function: For each observation i E [yi ] = µi (q × 1 vectors) and T g(µi ) = ηi = B xi (M × 1 vectors) where: ⎛ ⎛⎞ T ⎞ η1(xi ) β1 xi ⎜ . ⎟ ⎜ . ⎟ T ηi = η(xi ) = ⎝ . ⎠ = ⎝ . ⎠ = B xi T ηM (xi ) βmxi

17 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Multivariate Models T η yi −A(ηi ) Density: p(yi | ηi ) = e i h(yi ) • Mean Function: µi = E[yi ] = A(ηi ) T Link Function g(·): g(µi ) = B xi • −1 T Canonical Link Function: g(µi ) = ηi = [A ] (µi ) = B xi

Examples: yi ∼ Multinomial(mi , πi,1, . . . , πi,M+1) rM+1 πi,j ≥ 0, j = 1,..., M + 1 and j=1 πi,j = 1.

yi ∼ M-Variate-Gaussian(µi , Σ0) M µi ∈ R and Σ0 known (M × M)

Case Study: Applying Generalized Linear Models. Note: Reference Yee (2010) on the VGAM Package for R

18 MIT 18.655 Generalized Linear Models MIT OpenCourseWare http://ocw.mit.edu

18.655 Mathematical Spring 2016

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.