Generalized Linear Models
Generalized Linear Models
MIT 18.655
Dr. Kempthorne
Spring 2016
1 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Outline
1 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models
2 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Generalized Linear Model
Data: (yi , xi ), i = 1,..., n where yi : response variable T xi = (xi,1,..., xi,p) : p explanatory variables
p Linear predictor: For β = (β1, . . . , βp) ∈ R : rp xi β = j=1 xi,j βj
Probability Model: {yi } independent, canonical exponential r.v.’s: η y −A(η ) Density: p(yi | ηi ) = e i i i h(x) • Mean Function: µi = E [Yi ] = A(ηi ) Link Function g(·): g(µi ) = xi β ˆ ˆ With estimate β: xi β = gˆ(µi ). • −1 T Canonical Link Function: g(µi ) = ηi = [A ] (µi ) = xi β 3 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Matrix Notation ⎡⎤⎡ ⎤ y1 x1,1 x1,2 ··· x1,p ⎡ ⎤ β1 ⎢ y2 ⎥ ⎢ x2,1 x2,2 ··· x2,p ⎥ . y = ⎢ ⎥ X = ⎢ ⎥ β = ⎢ . ⎥ ⎢ . ⎥ ⎢ ...... ⎥ ⎣ . ⎦ ⎣ . ⎦ ⎣ .. . . ⎦ βp yn xn,1 xn,2 ··· xp,n
⎡ ⎤ ⎡ • ⎤ µ1 A(η1) ⎢ • ⎥ ⎢ µ2 ⎥ ⎢ A(η2) ⎥ • E [y] = µ = ⎢ . ⎥ = ⎢ . ⎥ = A(η) ⎢ . ⎥ ⎢ . ⎥ ⎣ ⎦ ⎣ . ⎦ µ • n A(ηn) Examples: θi yi ∼ Bernoulli(θi ): ηi = log( ) 1−θi yi ∼ Poisson(λi ): ηi = log(λi ) yi ∼ Gaussian(µi , 1) : ηi = µi 4 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Outline
1 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models
5 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models
Log-Likelihood Function for Generalized Linear Model rn i(β) = i=1(ηi yi − A(ηi ) + log[h(yi )]) ∝ ηT y − 1T A(η) = g(µ)T y − 1T A(η) = [Xβ]T y − 1T A(η) (for Canonical Link) Note: T (y) = XT y is sufficient when g(·) is canonical
Maximum Likelihood Estimation of β Solve for {βm, m = 1, 2,...} iteratively: • • •• 0 = i(βm+1) = i(βm) + (βm+1 − βm)i(βm) •• • −1 =⇒ βm+1 = βm + [−i(βm)] i(β m) “Fisher Scoring Algorithm” ⇐⇒ Newton-Raphson
Pr ˆ βm −−−→ β (the MLE)
6 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models
For Canonical• Link ∂ rn i(β) = ∂β [(i=1 ηi yi − A(ηi ) + log[h(yi )])] rn ∂ = i=1 ∂β [(ηi yi − A(ηi ) + log[h(yi )])] � � � � rn ∂ ∂ηi = i=1 [(ηi yi − A(ηi ) + log[h(yi )])] ∂ ηi ∂β � • � � T � rn ∂xi β = yi − A(ηi ) i=1 ∂ β =rn (y − µ ) x = rn x (y − µ ) = [X]T (y − µ) i=1 i i i i=1 i i i •• • � • � � ∂xT β � i(β) = ∂ [i(β)] = ∂ [rn y − A(η ) i ] ∂βT ∂βT i=1 i i ∂β � • � = ∂ [r n y − A(η ) x ] ∂βT i =1 i i i � • � = rn − ∂ [ A(η )] x ] i=1 ∂βT i i � • � rn ∂ ∂ηi = − [A(ηi )] T xi ] i=1 ∂ηi ∂β � •• � � •• � = rn −[A (η )] x ] ∂ηi = rn −[A (η )] x xT i=1 i i ∂βT i=1 i i i = XT WX
7 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models
For Canonical• Link ∂ rn i(β) =∂β [ i=1 (ηi yi − A(ηi ) + log[h(yi )])] = XT (y − µ)
•• • �• � �∂xT β � i(β) = ∂ [i(β)] = ∂ [r n y − A(η )]i ∂βT ∂βT i=1 i i ∂β •• rn � � T = i=1 −[A(ηi )] xi xi = XT WX •• where W = Cov(y) is diagonal with Wi,i = A(ηi ) = Var[yi ]
8 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models
Iteratively Re-weighted Least Squares Interpretation • Given βm, and µˆ(βm) = A(Xβm), •• • −1 βm+1 = βm + [−i(βm)] i(β m) T −1 T = βm + [X WX] [X] (y − µˆ(βm))
Δ = (βm+1 − βm ) is solved as the WLS regression of y∗ = [y − µˆ(βm)] on X∗ = WX using Σ∗ = Cov(y∗) = W
The WLS estimate of Δ is given by: ˆ −1 −1 T −1 Δ = [X∗Σ∗ X∗] X∗ Σ∗ y∗ T −1 −1 T −1 = [X WW WX] X WW [y − µˆ(βm)] T −1 T = [X WX] X [y − µˆ(βm)]
9 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Outline
1 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models
10 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Logistic Regression for Binary Responses
Binomial Data: Yi ∼ Binomial(mi , πi ), i = 1,..., k. Log-Likelihood Function � � k πi i(π1, . . . , πk ) = [(Yi log ) + mi log(1 − πi )] i=1 1−πi
Covariates: {xi , i = 1,..., k}
Logistic Regression Parameter: T ηi = log[πi /(1 − π − i)] = xi β
W = Cov(Y) = diag(mi πi (1 − πi )), (k × k matrix)
11 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Outline
1 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models
12 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models
Likelihood Ratio Tests for Generalized Linear Models Consider testing • H0 : µ = µ0 = A(Xβ0). vs HAlt : µ = µ∗ (general n-vector) • e.g., µ∗ = A(X∗β∗) with X∗ = In and β∗ = η. Suppose y is in interior of convex support of {y : p(y | η) > 0}. • −1 Then ηˆ = η(y) = A (y) is the MLE under HAlt The Likelihood Ratio Test Statistic of H0 vs HAlt : 2 log λ = 2[i (η(Y )) − i(η(µ0))] [ T = 2 [η(y) − η(µ0)] y − [A(η(y)) − A(η(µ0))] = “Deviance” Between y and µ0
13 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models
Deviance Formulas for Distributions
r 2 2 Gaussian :(i yi − µi ) /σ0 r Poisson : 2i [yi log(yi /µˆi ) − (yi − µˆi )] r Binomial : 2 i [yi log(yi /µˆi ) + (mi − yi ) log[(mi − yi )/(mi − µˆi )]
14 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Outline
1 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models
15 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Vector Generalized Linear Models
Data: (yi , xi ), i = 1,..., n where yi : a q-dimensional response vector variable T xi = (xi,1,..., xi,p) : p explanatory variables
Probability Model: The conditional distributions of each yi given xi is of the form p(y | x; B, φ) = f (y, η1, . . . , ηM , φ) for some known function f (·), where B = [β1β2 ··· βM ] is a p × M matrix of unknown regression coefficients.
M Linear Predictors: For j = 1,..., M, the jth linear predictor is T rp ηj = ηj (x) = βj x = k=1 Bkj xk , T where x = (x1,..., xp) with x1 = 1 when there is an intercept.
16 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Matrix Notation
T ⎤⎡ ⎡ T ⎡⎤ ⎤ y1 x1 x1,1 x1,2 ··· x1,p T T ⎢ y 2 ⎥ ⎢ x 2 ⎥ ⎢ x2,1 x2,2 ··· x2,p ⎥ Y = ⎢ ⎥ X = ⎢ ⎥ = ⎢ ⎥ ⎢ . ⎥ ⎢ . ⎥ ⎢ ...... ⎥ ⎣ . ⎦ ⎣ . ⎦ ⎣ .. . . ⎦ T T yn xn xn,1 xn,2 ··· xp,n
B = [β1|β2| · · · βM ]
Link Function: For each observation i E [yi ] = µi (q × 1 vectors) and T g(µi ) = ηi = B xi (M × 1 vectors) where: ⎛ ⎛⎞ T ⎞ η1(xi ) β1 xi ⎜ . ⎟ ⎜ . ⎟ T ηi = η(xi ) = ⎝ . ⎠ = ⎝ . ⎠ = B xi T ηM (xi ) βmxi
17 MIT 18.655 Generalized Linear Models Linear Predictors and Link Functions Maximum Likelihood Estimation Generalized Linear Models Logistic Regression for Binary Responses Likelihood Ratio Tests Vector Generalized Linear Models Multivariate Exponential Family Models T η yi −A(ηi ) Density: p(yi | ηi ) = e i h(yi ) • Mean Function: µi = E[yi ] = A(ηi ) T Link Function g(·): g(µi ) = B xi • −1 T Canonical Link Function: g(µi ) = ηi = [A ] (µi ) = B xi
Examples: yi ∼ Multinomial(mi , πi,1, . . . , πi,M+1) rM+1 πi,j ≥ 0, j = 1,..., M + 1 and j=1 πi,j = 1.
yi ∼ M-Variate-Gaussian(µi , Σ0) M µi ∈ R and Σ0 known (M × M)
Case Study: Applying Generalized Linear Models. Note: Reference Yee (2010) on the VGAM Package for R
18 MIT 18.655 Generalized Linear Models MIT OpenCourseWare http://ocw.mit.edu
18.655 Mathematical Statistics Spring 2016
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.