CHAPTER 12: Generalized Linear Models and Poisson Regression

The Model

Generalized linear models

• Extensions of traditional linear models (e.g., the logistic regression model)
• Allow
  (i) the population mean to depend on a linear predictor through a nonlinear link function
  (ii) the response distribution to be any member of the exponential family
• Three building blocks:
  – Responses $y_1, y_2, \ldots, y_n$ following the same distribution from the exponential family
  – A linear predictor $x_i'\beta$ formed from the coefficients $\beta$ and the predictors $x_1, x_2, \ldots, x_p$
  – A monotone link function: $g(\mu_i) = x_i'\beta$

Examples

• Standard linear regression model: $g(\mu) = \mu$, $y \sim$ Normal
• Logistic regression model: $g(\mu) = \ln\frac{\mu}{1-\mu}$, $y \sim$ Bernoulli (or binomial)
• Poisson regression model: $g(\mu) = \ln\mu$, $y \sim$ Poisson

Poisson regression model

• Used when the response represents count data
• Examples: the number of daily equipment failures, weekly traffic fatalities, monthly insurance claims, ...
• Poisson distribution:

$$P(Y = y) = \frac{\mu^y}{y!} e^{-\mu}, \qquad y = 0, 1, 2, \ldots$$

• $E(Y) = V(Y) = \mu > 0$

Poisson regression model (cont.)

• Link function: $g(\mu) = \ln\mu = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p$
• Then $\mu = \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p) > 0$
• Interpretation: a unit change in $x_1$ changes the mean by

$$100\,\frac{\exp[\beta_0 + \beta_1(x_1 + 1) + \beta_2 x_2 + \cdots + \beta_p x_p] - \exp[\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p]}{\exp[\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p]} = 100[\exp(\beta_1) - 1] \text{ percent}$$

For example, $\beta_1 = 0.1$ implies that a unit increase in $x_1$ raises the mean by $100[e^{0.1} - 1] \approx 10.5$ percent.

Estimation of the Parameters in the Poisson Regression Model

Maximum likelihood estimation

• Likelihood function:

$$L(\beta \mid y_1, y_2, \ldots, y_n) = \prod_{i=1}^{n} \frac{\mu_i^{y_i}}{y_i!} \exp(-\mu_i)$$

• Log-likelihood function:

$$\ln L(\beta \mid y_1, y_2, \ldots, y_n) = c + \sum_{i=1}^{n} y_i \ln\mu_i - \sum_{i=1}^{n} \mu_i,$$

where $c = -\sum_{i=1}^{n} \ln(y_i!)$
• Use the Newton-Raphson method

Maximum likelihood estimation (cont.)

• Derivative with respect to each mean:

$$\frac{\partial \ln L}{\partial \mu_i} = \frac{y_i}{\mu_i} - 1 = \frac{y_i - \mu_i}{\mu_i}$$

• Note: $\partial \mu_i / \partial \beta = \mu_i x_i$, since $\mu_i = \exp(x_i'\beta)$
• Hence, by the chain rule,

$$\frac{\partial \ln L}{\partial \beta} = \sum_{i=1}^{n} \frac{\partial \ln L}{\partial \mu_i}\,\frac{\partial \mu_i}{\partial \beta} = \sum_{i=1}^{n} \frac{y_i - \mu_i}{\mu_i}\,\mu_i x_i = \sum_{i=1}^{n} (y_i - \mu_i) x_i$$

• Maximum likelihood score equations: $X'(y - \mu) = 0$, where $X$ is the design matrix with rows $x_i'$

Newton-Raphson procedure

• First derivative of the negative log-likelihood:

$$g = -\frac{\partial \ln L}{\partial \beta} = -\sum_{i=1}^{n} (y_i - \mu_i) x_i$$

• Second derivatives:

$$-\frac{\partial^2 \ln L}{\partial \beta_j\,\partial \beta_{j^*}} = -\frac{\partial}{\partial \beta_j}\left\{ \sum_{i=1}^{n} (y_i - \mu_i) x_{ij^*} \right\} = \sum_{i=1}^{n} \mu_i x_{ij} x_{ij^*}$$

• Hessian matrix:

$$G = \sum_{i=1}^{n} \mu_i x_i x_i'$$

• Newton-Raphson iteration:

$$\beta^* = \tilde{\beta} - [G(\tilde{\beta})]^{-1} g(\tilde{\beta})$$

Iteratively reweighted least squares (IRLS) algorithm

• Iteratively computed response:

$$z_i = x_i'\tilde{\beta} + \frac{1}{\tilde{\mu}_i}(y_i - \tilde{\mu}_i)$$

• Weighted linear regression of $z_i$ on $x_i$ with weights $w_i = \tilde{\mu}_i$
• Equivalent to the Newton-Raphson iteration, as the derivation below shows

Iteratively reweighted least squares (IRLS) algorithm (cont.)

• Weighted least squares (WLS) estimate:

$$\hat{\beta}^{\text{WLS}} = \left[\sum_{i=1}^{n} w_i x_i x_i'\right]^{-1}\left[\sum_{i=1}^{n} w_i x_i z_i\right] = \left[\sum_{i=1}^{n} \tilde{\mu}_i x_i x_i'\right]^{-1}\left[\sum_{i=1}^{n} \tilde{\mu}_i x_i z_i\right]$$

with

$$\sum_{i=1}^{n} \tilde{\mu}_i x_i z_i = \sum_{i=1}^{n} \tilde{\mu}_i x_i \left[ x_i'\tilde{\beta} + \frac{1}{\tilde{\mu}_i}(y_i - \tilde{\mu}_i) \right] = \left[\sum_{i=1}^{n} \tilde{\mu}_i x_i x_i'\right]\tilde{\beta} + \sum_{i=1}^{n} (y_i - \tilde{\mu}_i) x_i = G\tilde{\beta} - g$$

• Hence,

$$\hat{\beta}^{\text{WLS}} = G^{-1}(G\tilde{\beta} - g) = \tilde{\beta} - G^{-1}g$$
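The Newton-Raphson/IRLS update above is straightforward to implement directly. The following is a minimal NumPy sketch, not from the slides: the function name `fit_poisson_irls` and the simulated data are illustrative assumptions, and the starting value and stopping rule are one reasonable choice among many.

```python
import numpy as np

def fit_poisson_irls(X, y, n_iter=25, tol=1e-8):
    """Fit a Poisson regression (log link) by Newton-Raphson / IRLS.

    X : (n, p+1) design matrix whose first column is all ones (intercept).
    y : (n,) vector of observed counts.
    Returns the maximum likelihood estimate of beta.
    """
    beta = np.zeros(X.shape[1])           # starting value: beta~ = 0
    for _ in range(n_iter):
        mu = np.exp(X @ beta)             # mu_i = exp(x_i' beta~)
        g = -X.T @ (y - mu)               # gradient of the negative log-likelihood
        G = X.T @ (mu[:, None] * X)       # Hessian: G = sum_i mu_i x_i x_i'
        step = np.linalg.solve(G, g)      # G^{-1} g without forming the inverse
        beta = beta - step                # beta* = beta~ - G^{-1} g
        if np.max(np.abs(step)) < tol:    # stop once the update is negligible
            break
    return beta

# Simulated check with true beta = (0.5, 0.8)
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=200)
X = np.column_stack([np.ones_like(x), x])
y = rng.poisson(np.exp(0.5 + 0.8 * x))
print(fit_poisson_irls(X, y))  # should be close to [0.5, 0.8]
```

The weighted regression view gives exactly the same update: regressing $z_i = x_i'\tilde{\beta} + (y_i - \tilde{\mu}_i)/\tilde{\mu}_i$ on $x_i$ with weights $w_i = \tilde{\mu}_i$ reproduces the Newton step, which is why the two derivations above agree.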
Inference in the Poisson Regression Model

Likelihood ratio tests

• LR test statistic:

$$T = 2 \ln\frac{L(\text{full})}{L(\text{restricted})} = 2\{\ln L(\text{full}) - \ln L(\text{restricted})\}$$

• Under $H_0$, $T \sim \chi^2(df)$, where $df$ = the number of independent constraints
• Reject $H_0$ if $T > \chi^2(1-\alpha; df)$
• Can be used to test the significance of
  – An individual coefficient (a partial test)
  – Two or more coefficients simultaneously
  – All coefficients (test of overall regression)

Standard errors of the maximum likelihood estimates and Wald tests

• Estimate of the covariance matrix:

$$\hat{V}(\hat{\beta}) \cong G^{-1} = \left[\sum_{i=1}^{n} \hat{\mu}_i x_i x_i'\right]^{-1}$$

• Wald confidence intervals for $\beta_j$:

$$\hat{\beta}_j \pm (1.96)\,\text{s.e.}(\hat{\beta}_j)$$

Standard errors of the estimated mean

• Estimate of the $i$th mean: $\hat{\mu}_i = \exp(x_i'\hat{\beta})$
• Taylor series expansion:

$$\hat{\mu}_i \cong \mu_i + \left.\frac{\partial \hat{\mu}_i}{\partial \hat{\beta}}\right|_{\hat{\beta}=\beta}'(\hat{\beta} - \beta) = \mu_i + \mu_i x_i'(\hat{\beta} - \beta)$$

• Estimated variance of $\hat{\mu}_i$:

$$V(\hat{\mu}_i) \cong \mu_i^2\, x_i'\hat{V}(\hat{\beta})\,x_i \cong \hat{\mu}_i^2\, x_i'\left[\sum_{i=1}^{n} \hat{\mu}_i x_i x_i'\right]^{-1} x_i$$

• Approximate 95% confidence intervals for $\mu_i$:

$$\hat{\mu}_i \pm (1.96)\,\hat{\mu}_i \sqrt{x_i'\left[\sum_{i=1}^{n} \hat{\mu}_i x_i x_i'\right]^{-1} x_i}$$

Deviance

• Saturated model:
  – The $i$th log-likelihood contribution, $y_i \ln(\mu_i) - \mu_i$, is maximized for $\mu_i = y_i$
  – Log-likelihood function: $c + \sum_{i=1}^{n} [y_i \ln(y_i) - y_i]$
• Deviance:

$$D = 2\{\ln L(\text{saturated}) - \ln L(\text{parameterized})\} = 2\left\{\sum_{i=1}^{n} [y_i \ln(y_i) - y_i] - \sum_{i=1}^{n} [y_i \ln(\hat{\mu}_i) - \hat{\mu}_i]\right\}$$

$$= 2\sum_{i=1}^{n}\left[ y_i \ln\frac{y_i}{\hat{\mu}_i} - (y_i - \hat{\mu}_i) \right] = 2\sum_{i=1}^{n} y_i \ln\frac{y_i}{\hat{\mu}_i},$$

since $\sum_{i=1}^{n} (y_i - \hat{\mu}_i) = 0$ as long as an intercept is included in the model

Goodness of fit

• Second-order Taylor series expansion of $y \ln\frac{y}{\mu}$ around $y = \mu$:

$$y \ln\frac{y}{\mu} \cong (y - \mu) + \frac{1}{2\mu}(y - \mu)^2$$

• Pearson chi-square statistic:

$$D \cong 2\sum_{i=1}^{n}\left[ (y_i - \hat{\mu}_i) + \frac{1}{2\hat{\mu}_i}(y_i - \hat{\mu}_i)^2 - (y_i - \hat{\mu}_i) \right] = \sum_{i=1}^{n} \frac{(y_i - \hat{\mu}_i)^2}{\hat{\mu}_i} = \chi^2$$

• Goodness of fit: if $D$ (or $\chi^2$) $> \chi^2(1-\alpha;\, n-p-1)$, or if the standardized deviance $D/(n-p-1)$ or standardized Pearson chi-square statistic $\chi^2/(n-p-1)$ is much larger than 1, question the adequacy of the model

Residuals

• Deviance residuals:

$$d_i = \text{sign}(y_i - \hat{\mu}_i)\sqrt{2\left[ y_i \ln\frac{y_i}{\hat{\mu}_i} - (y_i - \hat{\mu}_i) \right]}$$

• Pearson residuals:

$$r_i = \frac{y_i - \hat{\mu}_i}{\sqrt{\hat{\mu}_i}}$$
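To tie the inference formulas together, here is a companion NumPy/SciPy sketch that computes the deviance, the Pearson chi-square statistic, both kinds of residuals, Wald intervals, and a goodness-of-fit p-value. It is illustrative, not from the slides: the helper name `poisson_diagnostics` is an assumption, and it expects a design matrix with an intercept column plus a fitted `beta_hat` (e.g., from the `fit_poisson_irls` sketch earlier).

```python
import numpy as np
from scipy import stats

def poisson_diagnostics(X, y, beta_hat):
    """Deviance, Pearson chi-square, residuals, and Wald intervals for a
    fitted Poisson regression (log link), following the formulas above."""
    n, k = X.shape                          # k = p + 1 columns incl. intercept
    mu_hat = np.exp(X @ beta_hat)           # fitted means

    # y_i ln(y_i / mu_i), with the y_i = 0 terms set to 0 (y ln y -> 0)
    with np.errstate(divide="ignore", invalid="ignore"):
        ylogy = np.where(y > 0, y * np.log(y / mu_hat), 0.0)

    # Unit deviances 2[y_i ln(y_i/mu_i) - (y_i - mu_i)] are >= 0 termwise;
    # clip tiny negatives from round-off before taking square roots.
    unit_dev = np.maximum(2.0 * (ylogy - (y - mu_hat)), 0.0)
    D = np.sum(unit_dev)                    # deviance

    chi2 = np.sum((y - mu_hat) ** 2 / mu_hat)              # Pearson chi-square
    pearson_resid = (y - mu_hat) / np.sqrt(mu_hat)         # r_i
    deviance_resid = np.sign(y - mu_hat) * np.sqrt(unit_dev)  # d_i

    # Wald 95% intervals: V(beta_hat) ~ G^{-1} with G = sum_i mu_i x_i x_i'
    G = X.T @ (mu_hat[:, None] * X)
    se = np.sqrt(np.diag(np.linalg.inv(G)))
    wald_ci = np.column_stack([beta_hat - 1.96 * se, beta_hat + 1.96 * se])

    # Goodness of fit: compare D to chi-square with n - p - 1 df
    gof_p = stats.chi2.sf(D, n - k)
    return {"deviance": D, "pearson_chi2": chi2, "wald_ci": wald_ci,
            "deviance_resid": deviance_resid, "pearson_resid": pearson_resid,
            "gof_p_value": gof_p}
```

A small `gof_p_value`, or a standardized deviance well above 1, would flag the lack of fit described above (often a sign of overdispersion in count data).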
