
Overdispersion and Quasilikelihood

Objectives:

• Mechanisms that generate overdispersion.

• Analysis with overdispersion:
  ◦ Model the distribution
  ◦ Model the variance
  ◦ Correct standard errors

• Quasilikelihood and estimating equations.

• Recall that when we used Poisson regression to analyze the seizure data, we found var(Y_i) ≈ 2.5 × µ_i.

Define: Overdispersion describes the situation above; that is, data are overdispersed when the actual var(Y_i) exceeds the GLM variance φV(µ_i).

• For binomial and Poisson models we often find overdispersion:

  1. Binomial: Y = s/m, E(Y) = µ, var(Y) > µ(1 − µ)/m.
  2. Poisson: E(Y) = µ, var(Y) > µ.

Q: How does overdispersion arise?

Q: What is the impact on the GLM regression estimate β̂ and the associated inference?

Q: What expanded methods exist that model, or correct for, overdispersion?

Seizure Data:

[Figure: squared Pearson residuals (resids^2) for the seizure data, plotted against the fitted mean (mu) and against the linear predictor (log(mu)); 3 large points not shown.]

Example: Teratology Data

◦ Low-Iron Teratology Data: Shepard, Mackler & Finch (1980); analyzed by Moore & Tsiatis (1991).

• Female rats were put on iron-deficient diets and divided into four groups. The groups received different numbers of injections to maintain iron levels:

  group 4 (normal): 3 injections (weekly for 3 weeks)
  group 3: 2 injections (days 0 and 7)
  group 2: 1 injection (day 7 or 10)
  group 1: placebo injections


• Rats were pregnant and sacrificed at 3 weeks. For each animal the total number of fetuses (N_i) and the number of dead fetuses (Y_i) were recorded.

• The hemoglobin (blood iron) levels were recorded.

• These data are typical of studies of the effects of chemical agents or dietary conditions on fetal development.

[Figure: observed proportion dead (p) versus iron, and empirical logit versus iron.]

[Figure: Pearson residuals and squared Pearson residuals versus litter size (n_i), with a smoothing spline and reference lines; 3 points dropped.]

Overdispersion?

? If there is population heterogeneity, this can introduce overdispersion.

Example: Suppose there exists a binary covariate, Zi, and that

Yi | Zi = 0 ∼ Poisson(λ0)

Yi | Zi = 1 ∼ Poisson(λ1)

P (Zi = 1) = π

E(Yi) = πλ1 + (1 − π)λ0 = µ


var(Y_i) = E[ var(Y_i | Z_i) ] + var[ E(Y_i | Z_i) ]
         = E[ λ_1 Z_i + λ_0(1 − Z_i) ] + var[ λ_1 Z_i + λ_0(1 − Z_i) ]
         = µ + (λ_1 − λ_0)² π(1 − π)

(the iterated-variance formula applies since var(Y_i | Z_i) = E(Y_i | Z_i) = λ_1 Z_i + λ_0(1 − Z_i))

Therefore, if we do not observe Zi then this omitted factor leads to increased variation.
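A quick numerical check of this calculation (a sketch with illustrative values of λ_0, λ_1, and π, not values from the notes): simulate the two-point Poisson mixture in R and compare the empirical variance with the mean.

set.seed(571)
lambda0 <- 2; lambda1 <- 8; pi1 <- 0.3
z <- rbinom( 100000, 1, pi1 )                          # unobserved class indicator
y <- rpois( 100000, ifelse( z==1, lambda1, lambda0 ) )
mean( y )   # approx mu = pi*lambda1 + (1-pi)*lambda0 = 3.8
var( y )    # approx mu + (lambda1-lambda0)^2*pi*(1-pi) = 3.8 + 7.56 = 11.36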

Impact of Model Violation

? Huber (1967) and White (1982) studied the properties of MLEs when the model is misspecified.

Setup:

1. Let F_θ be the assumed distributional family for independent data Y_i, i = 1, 2, ..., n.

2. Let θ̂_n be the MLE (based on n observations); that is, θ̂_n solves the score equations that arise from the assumption F_θ:

   Σ_{i=1}^n U_i^F(θ̂_n) = 0

3. However, the true distribution of Y_i is given by Y_i ∼ G.

Result:

1. θ̂_n → θ* such that

   lim_n (1/n) Σ_{i=1}^n E_G[ U_i^F(θ*) ] = 0

2. The estimator θ̂_n is asymptotically normal:

   √n ( θ̂_n − θ* ) → N( 0, A^{-1} B A^{-1} )

   A = − lim (1/n) Σ_{i=1}^n ∂U_i^F(θ)/∂θ |_{θ*}

   B = lim (1/n) Σ_{i=1}^n var[ U_i^F(θ) ] |_{θ*}

Note:

1. A is just the observed information (times 1/n).

2. B is just the true variance of U_i^F(θ), which may no longer be equal to minus the expected derivative of U_i^F(θ) if Y_i ∼ F_θ doesn't hold.

3. Sometimes we get lucky (or we've selected θ̂_n wisely) and θ* = θ_0 – the model misspecification doesn't hurt the consistency of θ̂_n.

4. Sometimes we get lucky and (3) holds, and A = B, and the model misspecification doesn't hurt our standard error estimates either.

Huber, White, & GLMs

Q: What does this misspecification story mean for GLMs?

A: It says that if we model the mean E(Y_i) = µ_i via a regression model with parameter β, then our estimator β̂ will converge to whatever value solves

   E_G[ U(β) ] = 0

Recall that we have

   U(β) = Σ_{i=1}^n ( ∂µ_i/∂β )^T [ a_i(φ) · V(µ_i) ]^{-1} ( Y_i − µ_i )

⇒ As long as Y_i ∼ G such that E_G(Y_i) = µ_i, our regression estimator will be consistent! We don't need the Poisson or binomial distribution for the GLM point estimate β̂ to be valid.
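As a numerical illustration of this robustness (a sketch; the parameter values are made up, not from the notes), fit a Poisson GLM to counts that are actually negative binomial:

set.seed(571)
n <- 5000
x <- rnorm( n )
mu <- exp( 0.5 + 1.0*x )
y <- rnbinom( n, size=2, mu=mu )   # var = mu + mu^2/2: overdispersed, not Poisson
fit <- glm( y ~ x, family=poisson )
coef( fit )                        # close to the true (0.5, 1.0)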

Comment: ? We obtain properties of β̂ by considering properties (expectation, variance) of U(β)...

Q: Yes, but what about the standard errors for β̂?

A: It depends... Let's consider the matrices A and B:

Teratology and Binomial Overdispersion

? Binomial model information matrix:

   A = (1/n) Σ_{i=1}^n X_i^T V_i X_i ,   V_i = diag[ N_i µ_i(1 − µ_i) ]


• Two common variance models:

1. Scale model:

   var(Y_i) = φ N_i µ_i(1 − µ_i)
   B = φ · A

2. beta-binomial:

   var(Y_i) = N_i µ_i(1 − µ_i) [ 1 + ρ(N_i − 1) ]
   B = (1/n) Σ_{i=1}^n X_i^T V_i* X_i ,   V_i* = diag{ N_i µ_i(1 − µ_i) [ 1 + ρ(N_i − 1) ] }


Scale model:

   var(β̂) = (1/n) A^{-1} (φ A) A^{-1} = (φ/n) A^{-1}

(the model-based variance inflated by the scale factor φ)

beta-binomial model:

   var(β̂) = (1/n) A^{-1} B A^{-1} ,   B = (1/n) Σ_{i=1}^n X_i^T V_i* X_i

Unspecified variance model:

   var(β̂) = (1/n) A^{-1} B̂ A^{-1} ,   with the empirical ("sandwich") estimate B̂ = (1/n) Σ_{i=1}^n U_i(β̂) U_i(β̂)^T

Teratology Analysis

# rats.q
# ----------------------------------------------------------
# PURPOSE: Analysis of overdispersed binomial data
# AUTHOR:  P. Heagerty
# DATE:    00/01/12
# ----------------------------------------------------------
#
data <- matrix( scan("rat_teratology.data"), ncol=4, byrow=T )
#
rats <- data.frame( y     = data[,2],
                    n     = data[,1],
                    iron  = data[,3]-8,
                    iron0 = (data[,3]-8)*(data[,3]>8),
                    id    = c(1:nrow(data)) )
#
##### EDA
#
p <- rats$y/rats$n
logit.p <- log( rats$y+0.5 ) - log( rats$n - rats$y + 0.5 )
#
postscript( file="rats-EDA-1.ps", horiz=F )
par( mfrow=c(2,1) )
attach( rats )
plot( iron, p )
title("Observed proportion versus Iron")
plot( iron, logit.p )
title("Empirical logit versus Iron")
detach()
graphics.off()
#
logit <- function(x){ log(x)-log(1-x) }
antilogit <- function(x){ exp(x)/(exp(x)+1) }
#
##### GLM fit
#
glmfit <- glm( cbind(y,n-y) ~ iron + iron0, family=binomial, data=rats )
resids <- residuals( glmfit, type="pearson" )
#
postscript( file="rats-EDA-2.ps", horiz=F )
par( mfrow=c(2,1) )
attach( rats )
plot( n, resids )
title("Pearson residuals versus litter size (n_i)")
plot( n, resids^2, ylim=c(0,20) )
title("Squared Pearson residuals versus litter size (n_i)")
lines( smooth.spline( n, resids^2, df=4 ) )
text( 10, 15, "(3 points dropped)", cex=0.5 )
abline( h=1, lty=2 )
abline( h=mean( resids^2, trim=0.05 ), lty=3 )
detach()
graphics.off()
#
##### Correction of standard errors
#
attach( rats )
nobs <- nrow( rats )
#
##### (1) estimate of scale parameter
phi <- sum( resids^2 )/(nobs-3)
cat( paste("phi =", round(phi,3), "\n\n") )
#
##### (2) estimate of correlation parameter
rho <- mean( ((resids^2 - 1)/(rats$n-1))[rats$n>1] )
cat( paste("rho =", round(rho,3), "\n\n") )
detach()
#
##### Calculate standard errors...
#
X <- cbind( 1, rats$iron, rats$iron0 )
mu <- fitted( glmfit )
V.mu <- rats$n * mu * (1-mu)
est.fnx <- X*(rats$y - rats$n*mu)
#
##### Fisher information
XtWX <- t(X)%*%( V.mu*X )
I0.inv <- solve( XtWX )
se.model <- sqrt( diag( I0.inv ) )
#
##### with scale parameter
se.scale <- sqrt(phi)*se.model
#
##### with beta-binomial variance
cheese <- t(X)%*%( rats$n*mu*(1-mu)*(1+rho*(rats$n-1))*X )
se.sandwich <- sqrt( diag( I0.inv %*% cheese %*% I0.inv ) )
#
##### with empirical variance of estimating function
UUt <- t(est.fnx)%*%est.fnx
se.empirical <- sqrt( diag( I0.inv %*% UUt %*% I0.inv ) )
#
beta <- glmfit$coef
out <- cbind( beta, se.model, se.scale, se.sandwich, se.empirical )
print( round( out, 4 ) )
#
summary( glmfit, cor=F )
#
quasifit <- glm( cbind(y,n-y) ~ iron + iron0,
                 family=quasi(link=logit, variance="mu(1-mu)"),
                 data=rats )
summary( quasifit, cor=F )

phi = 3.596
rho = 0.23

                beta  se.model  se.scale  se.sandwich  se.empirical
(Intercept) -1.5196    0.2405    0.4561       0.4511        0.4388
iron        -0.7708    0.0830    0.1575       0.1532        0.1700
iron0        0.5056    0.1517    0.2876       0.2842        0.2836

Call: glm(formula = cbind(y, n - y) ~ iron + iron0, family = binomial, data = rats)

Deviance Residuals:
       Min         1Q      Median        3Q       Max
 -4.932802  -1.130846  -0.1433674  1.431611  4.438619

Coefficients:
                 Value  Std. Error   t value
(Intercept) -1.5196252   0.2404516 -6.319880
iron        -0.7707906   0.0830197 -9.284430
iron0        0.5056392   0.1514921  3.337726

(Dispersion Parameter for Binomial family taken to be 1)

    Null Deviance: 482.3326 on 54 degrees of freedom
Residual Deviance: 188.1113 on 52 degrees of freedom
Number of Fisher Scoring Iterations: 4

Call: glm(formula = cbind(y, n - y) ~ iron + iron0, family = quasi(link = logit, variance = "mu(1-mu)"), data = rats)

Deviance Residuals:
       Min         1Q      Median        3Q       Max
 -4.932802  -1.130846  -0.1433674  1.431611  4.438619

Coefficients:
                 Value  Std. Error   t value
(Intercept) -1.5196252   0.4559841 -3.332627
iron        -0.7707906   0.1574357 -4.895907
iron0        0.5056392   0.2872844  1.760065

(Dispersion Parameter for Quasi-likelihood family taken to be 3.596202)

    Null Deviance: 482.3326 on 54 degrees of freedom
Residual Deviance: 188.1113 on 52 degrees of freedom
Number of Fisher Scoring Iterations: 4

Quasilikelihood

McCullagh and Nelder (1989) - Chapter 9

Model:

   E(Y_i) = µ_i(β)
   var(Y_i) = σ² V(µ_i) ,   σ² unknown

Quasilikelihood Function:

   U(µ, y) = Σ_i [ σ² V(µ_i) ]^{-1} ( Y_i − µ_i )

   Q(µ, y) = Σ_i ∫_{y_i}^{µ_i} [ σ² V(t) ]^{-1} ( Y_i − t ) dt

For example, with V(µ) = µ (Poisson-type variance), each term of Q is σ^{-2} ( Y_i log µ_i − µ_i ) plus a constant in µ: the quasi-Poisson log likelihood.

Properties:

1. E(U_i) = 0.
2. var(U_i) = [ σ² V(µ_i) ]^{-1}.
3. −E( ∂U_i/∂µ_i ) = [ σ² V(µ_i) ]^{-1}.

Quasilikelihood Estimating Equations

The quasilikelihood regression estimator β̂ for Y_i, i = 1, 2, ..., n, is obtained as the solution to the "quasi-score equations":

   0 = U(β) = D^T V^{-1} ( Y − µ )

   D(i, j) = ∂µ_i/∂β_j
   V = diag( σ² V(µ_i) )

Properties:

   I_n = (1/n) D^T V^{-1} D

   √n ( β̂ − β_0 ) → N( 0, I_n^{-1} )

Note:

(?) These properties are based only on the correct specification of the mean and variance of Y_i.
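In current R, the fit shown on the next page can be reproduced with the quasibinomial family (a sketch; rats is the data frame constructed in the teratology code above):

quasifit <- glm( cbind(y, n-y) ~ iron + iron0,
                 family=quasibinomial, data=rats )
summary( quasifit )    # same point estimates as the binomial fit;
                       # standard errors scaled by sqrt of the estimated dispersion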

Quasilikelihood Example – Teratology

Call: glm(formula = cbind(y, n - y) ~ iron + iron0, family = quasi(link = logit, variance = "mu(1-mu)"), data = rats)

Coefficients:
                 Value  Std. Error   t value
(Intercept) -1.5196252   0.4559841 -3.332627
iron        -0.7707906   0.1574357 -4.895907
iron0        0.5056392   0.2872844  1.760065

(Dispersion Parameter for Quasi-likelihood family taken to be 3.596202)

    Null Deviance: 482.3326 on 54 degrees of freedom
Residual Deviance: 188.1113 on 52 degrees of freedom

Estimating Functions

Definition: A function of data and parameters, g(Y, θ), is called an estimating function.

Definition: The estimating function g(Y, θ) is unbiased if, for all θ ∈ Θ,

   E_θ[ g(Y, θ) ] = 0 .

Definition: For an unbiased estimating function, the estimating equation g(Y, θ̂) = 0 defines an estimator θ̂. For example, g(Y, θ) = Σ_i ( Y_i − θ ) is unbiased when E(Y_i) = θ, and solving g(Y, θ̂) = 0 yields θ̂ = Ȳ.

Estimating functions form the basis for almost all of frequentist statistical estimation.

• Method of Moments – Pearson (1894):

   0 = (1/n) Σ_{i=1}^n ( Y_i^p − E[Y_i^p] ) ,   p = 1, 2, ..., m

• Maximum Likelihood – Fisher (1922):

   0 = Σ_i (∂/∂θ) log f(Y_i; θ)

Estimating Functions – Optimality

Godambe (1960): Let

   G = { g : E_θ[ g(Y, θ) ] = 0 } .

Then g* ∈ G, the score function, minimizes

   E[ ( g(Y, θ) / E[∂g/∂θ] )² ]

Standardization: g(Y, θ) / E[ ∂g/∂θ ]

Why?

• (i) g(Y, θ) and c × g(Y, θ) are equivalent.

• (ii) If E_θ[ g(Y, θ) ] = 0 then we hope that E_θ[ g(Y, θ + δ) ] is large.

Godambe and Heyde (1987): Let

   G^(1) = { g : g(Y, θ) = Σ_i a_i(θ) [ Y_i − µ_i(θ) ] }

be the class of all linear unbiased estimating functions. Then U, the quasi-score function,

   U(β) = Σ_i ( ∂µ_i/∂β )^T V_i^{-1} ( Y_i − µ_i ) ,

is optimal in the sense that:

1. Let U* = U / E[∂U/∂β] (standardized) and let g* also denote the standardized g. Then

   E[ (U*)² ] ≤ E[ (g*)² ]

2. Let s* be the standardized score function. Then

   E[ (U* − s*)² ] ≤ E[ (g* − s*)² ]

3. Equivalently, corr(U*, s*) ≥ corr(g*, s*)

EE Asymptotics

Fahrmeir and Kaufmann (1985):

• In the classical linear model with iid errors, the condition

   λ_min( Σ_i X_i^T X_i ) → ∞ ,

where λ_min denotes the minimum eigenvalue, is necessary and sufficient for either weak or strong consistency of the OLS estimator β̂. The GLM analogue is the divergence condition:

(D) λ_min I_n(β) → ∞

(C) Boundedness from below:

   F_n(β) − c F_n is positive semidefinite for β ∈ N(δ), n > n_1 = n_1(δ)

(N) Convergence and continuity:

   max_{β ∈ N(δ)} ‖ V_n(β) − I ‖ → 0 ,   V_n(β) = F_n^{-1/2} F_n(β) F_n^{-1/2}

Theorem 1: If (D) and (C) hold, then the sequence of roots { β̂_n } satisfies:

(i) asymptotic existence: P[ s_n(β̂_n) = 0 ] → 1

(ii) weak consistency: β̂_n → β_0 in probability

Theorem 2: If (D) and (N) hold, then

   F_n^{-1/2} s_n → N( 0, I ) in distribution

NOTE:

• This is for the canonical link. If the link is noncanonical then H_n(β) ≠ I_n(β) and log L may not be concave.
• F_n = F_n(β_0)
• N(δ) = { β : ‖ F_n^{1/2} ( β − β_0 ) ‖ ≤ δ }

EE Asymptotics

Small and McLeish (1994): (scalar θ) If

   U_n(θ − ε) → E_θ[ U_n(θ − ε) ] > 0
   U_n(θ + ε) → E_θ[ U_n(θ + ε) ] < 0 ,

then the WLLN ⇒ there exists a root in ( θ − ε, θ + ε ).

General EE consistency (Crowder, 1986, Theorem 3.2):

• U_n(θ) = (1/n) Σ_i U_i(θ)
• ∂S(θ_0, ε) is the boundary of a sphere around θ_0

Then consistency results if:

(R1) U_n(θ) is continuous.

(R2) information condition:

   inf_{∂S(θ_0, ε)} ( θ_0 − θ )^T E_{θ_0}[ U_n(θ) ] > 0

(R3) uniform continuity:

   sup_{∂S(θ_0, ε)} ‖ U_n(θ) − E_{θ_0}[ U_n(θ) ] ‖ → 0

Q: (R2)?

   ( θ_0 − θ )^T E_{θ_0}[ U_n(θ) ] ≈ ( θ_0 − θ )^T E_{θ_0}[ U_n(θ_0) + U_n′(θ_0)( θ − θ_0 ) ]
                                   = ( θ_0 − θ )^T { −E_{θ_0}[ U_n′(θ_0) ] } ( θ_0 − θ ) ,

using E_{θ_0}[ U_n(θ_0) ] = 0. This will be ≥ δ > 0 if the "information" matrix is positive definite. For EEs,

   (1/n) Σ_i D_i^T V_i^{-1} D_i → I_∞^(M) ;

if I_∞^(M) has its minimum eigenvalue > 0 then (R2) is satisfied.

Q: (R3)?

Crowder (1986) gives Lemmas 2.2 and 3.2 that equate (R3) to conditions on ∂U_n(θ)/∂θ:

   P[ ‖ ∂U_n(θ)/∂θ − E_θ[ ∂U_n(θ)/∂θ ] ‖ > G_η ] < η

Again, for EEs this condition can be satisfied by conditions on information matrices (both model-based and "true" information).

EEs and Asymptotic Normality

The derivation of the asymptotic distribution for the root of an estimating function is based on a simple Taylor series expansion of the estimating function:

   0 = U_n(θ̂)
     ≈ U_n(θ_0) + [ ∂U_n(θ)/∂θ |_{θ_0} ] ( θ̂ − θ_0 )

so that

   √n ( θ̂ − θ_0 ) ≈ [ −(1/n) ∂U_n/∂θ ]^{-1} (1/√n) U_n(θ_0) → N( 0, A^{-1} B A^{-1} )

EEs and Empirical Variance Estimates

In the "sandwich variance" form A^{-1} B A^{-1} we find that the middle of the sandwich is simply

   B = (1/n) Σ_{i=1}^n var[ U_i(β) ] .

We can either construct an explicit model that gives the form of B (see Exercises for the Poisson case), or we can use an empirical variance estimator:

   B̂ = (1/n) Σ_{i=1}^n U_i(β̂) U_i(β̂)^T

• Under mild regularity conditions this leads to a consistent variance estimate for β̂ without adopting an explicit variance model.
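A compact R sketch of the empirical computation for a Poisson GLM (the data frame dat with outcome y and covariate x is hypothetical; compare with the teratology code above, which does the same for the binomial model):

fit <- glm( y ~ x, family=poisson, data=dat )
X  <- model.matrix( fit )
mu <- fitted( fit )
A  <- t(X) %*% ( mu*X )                 # information: sum_i mu_i x_i x_i^T
U  <- X * ( dat$y - mu )                # rows are score contributions U_i
B  <- t(U) %*% U                        # sum_i U_i U_i^T
V  <- solve(A) %*% B %*% solve(A)       # sandwich estimate of var( beta-hat )
sqrt( diag(V) )                         # robust standard errors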

• In small or moderate samples we may choose to adopt a variance model.

• Huber (1967), White (1980, 1982), Royall (1986)


Summary

• ∃ lots of possible estimating functions to consider.

• Optimal estimating functions can be chosen from within certain classes of functions.

• Estimating functions provide semiparametric inference since means / covariances can be estimated without specification of a complete probability model.

• All (?) of frequentist estimation can be viewed via estimating functions.

References:

Guyon, X. (1995). Random Fields on a Network. Springer-Verlag.

Heyde, C.C. (1997). Quasi-likelihood and Its Application. Springer-Verlag.

Small, C.G. and McLeish, D.L. (1994). Hilbert Space Methods in Probability and Statistical Inference. Wiley.

Joint Modelling of Mean and Dispersion

McCullagh and Nelder (1989) - Chapter 10

Model:

E(Yi) = µi(β)

g(µi) = Xiβ

var(Y_i) = φ_i V(µ_i)

h(φi) = Ziγ

Estimation:

   U_1 = Σ_{i=1}^n ( ∂µ_i/∂β )^T [ φ_i V(µ_i) ]^{-1} ( Y_i − µ_i )

   U_2 = Σ_{i=1}^n ( ∂φ_i/∂γ )^T [ τ V_D(φ_i) ]^{-1} ( d_i − φ_i )

   d_i = d(Y_i, µ_i) such that E(d_i) = φ_i and var(d_i) = τ V_D(φ_i)

Example: bod.regn()
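The function bod.regn() itself is not listed in these notes; the following R sketch (with an assumed data frame dat containing outcome y, mean covariate x, and dispersion covariate z) conveys the alternating-GLM idea behind solving U_1 and U_2:

fit.mean <- glm( y ~ x, family=quasipoisson, data=dat )
for( iter in 1:10 ){
  d <- pmax( residuals( fit.mean, type="deviance" )^2, 1e-8 )   # d_i, E(d_i) approx phi_i
  fit.disp <- glm( d ~ z, family=Gamma(link=log), data=dat )    # h(phi_i) = Z_i gamma
  phi <- fitted( fit.disp )
  fit.mean <- glm( y ~ x, family=quasipoisson,
                   weights=1/phi, data=dat )                    # reweight by 1/phi_i
}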

Parametric Models for Overdispersion

Crowder (1978) considered a hierarchical, or mixture model for overdispersed binomial data.

beta-binomial:

Yi | pi ∼ binomial(Ni, pi)

pi ∼ beta(α, β)

E(Yi) = NiE(pi) = Niµ

var(Yi) = Niµ(1 − µ) [1 + ρ(Ni − 1)]

Beta-Binomial

   P( Y_i = y ) = ( N_i choose y ) B( α + y, N_i + β − y ) / B( α, β )

   µ = α / ( α + β )
   ρ = 1 / ( α + β + 1 )

• Permits likelihood-based inference when data are overdispersed.
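The moments above are easy to verify by simulation; a sketch with illustrative values of α, β, and N:

set.seed(571)
alpha <- 2; beta <- 6; N <- 10
mu  <- alpha/( alpha + beta )          # 0.25
rho <- 1/( alpha + beta + 1 )          # 1/9
p <- rbeta( 100000, alpha, beta )
y <- rbinom( 100000, N, p )
var( y )                               # approx 3.75
N*mu*(1-mu)*( 1 + rho*(N-1) )          # = 3.75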

beta-binomial:

◦ Other mixture models are possible (e.g. random intercept GLMMs), but the beta is the conjugate distribution for the binomial.

◦ Permits regression estimation via a GLM for µ.

◦ The beta-binomial variance form leads to a different type of weighting when combining ( Y_i, N_i ) than that used by adopting a scale overdispersion model ( φ N_i µ(1 − µ) ).

• Regression inference, g(µ) = Xβ, using the beta-binomial requires that the model is correct conditional on X_i (see Liang and Hanfelt, 1994, for examples of trouble!).

Example: Teratology data and beta-binomial variance.

Parametric Models for Overdispersion

For count data, the use of a mixture model to characterize extra-Poisson variation has also been suggested:

negative-binomial:

   Y_i | θ_i ∼ Poisson( θ_i )

   θ_i = exp( X_i β + ε_i )
       = exp( X_i β ) · exp( ε_i )
       = µ_i z_i

   z_i ∼ Γ( shape = δ, rate = γ ) ,   γ = δ so that E(z_i) = 1

Negative Binomial

   P( Y_i = y; µ_i, δ ) = [ Γ(δ + y) / ( Γ(δ) Γ(y + 1) ) ] ( δ / (δ + µ_i) )^δ ( µ_i / (δ + µ_i) )^y

   E(Y_i) = µ_i
   var(Y_i) = µ_i + (1/δ) µ_i²

? See pages 100-102 of Cameron and Trivedi for details.
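As a check on the parameterization (a sketch; the values of δ and µ are illustrative), R's dnbinom() in its (size, mu) form matches the pmf displayed above:

delta <- 2; mu <- 3; y <- 0:5
p1 <- dnbinom( y, size=delta, mu=mu )
p2 <- gamma(delta + y)/( gamma(delta)*gamma(y + 1) ) *
      ( delta/(delta + mu) )^delta * ( mu/(delta + mu) )^y
all.equal( p1, p2 )    # TRUE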

Negative Binomial Models

• There are two common ways that the negative binomial model is parameterized in the regression context. Consider the distribution given on the previous page, with subscript i on both µ and δ:

   P( Y_i = y; µ_i, δ_i ) = [ Γ(δ_i + y) / ( Γ(δ_i) Γ(y + 1) ) ] ( δ_i / (δ_i + µ_i) )^{δ_i} ( µ_i / (δ_i + µ_i) )^y

Q: What assumptions are used to constrain the δ_i? Note that regression is used to structure µ_i.


NB-1 Model

   E( Y_i | X_i ) = µ_i

   V( Y_i | X_i ) = µ_i + (1/δ_i) µ_i² = µ_i ( 1 + µ_i/δ_i ) = µ_i · φ ,

   assuming µ_i/δ_i is constant, so that φ = 1 + µ_i/δ_i.


NB-2 Model

   E( Y_i | X_i ) = µ_i

   V( Y_i | X_i ) = µ_i + (1/δ_i) µ_i² = µ_i ( 1 + µ_i/δ_i ) = µ_i ( 1 + α · µ_i ) ,

   assuming 1/δ_i = α.


General NB Model

   E( Y_i | X_i ) = µ_i

   V( Y_i | X_i ) = µ_i + (1/δ_i) µ_i² = µ_i ( 1 + α_i · µ_i ) ,   with h(α_i) = Z_i γ

? See pages 72-75 of Cameron and Trivedi for details.
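Before turning to STATA, note that the NB-2 model can also be fit in R with glm.nb() from the MASS package (a sketch; the data frame seizure with variables y4, trt, logA, logB is assumed to be set up as in the STATA code below):

library( MASS )
nbfit <- glm.nb( y4 ~ trt + logA + logB, data=seizure )
summary( nbfit )    # theta plays the role of delta: var = mu + mu^2/theta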

Seizure Analysis using STATA

***********************************************************
* seizure.do
***********************************************************
* PURPOSE: illustrate negative-binomial fitting
* AUTHOR:  P. Heagerty
* DATE:    15 Jan 2003
***********************************************************

*** READ DATA ***
infile id age base trt y1 y2 y3 y4 using "seizure.data"

*** Create log(baseline+0.5) and log(age)
generate logB = ln( base + 0.5 )
generate logA = ln( age )

*** POISSON REGRESSION ***
poisson y4 trt logA logB
poisgof
poisson y4 trt logA logB, robust

*** Negative Binomial (2) ***
nbreg y4 trt logA logB, dispersion( mean )

*** Negative Binomial (1) ***
nbreg y4 trt logA logB, dispersion( constant )

*** Negative Binomial with dispersion model ***
gnbreg y4 trt logA logB, lnalpha( trt )
gnbreg y4 trt logA logB, lnalpha( trt logA )

COMMAND: poisson y4 trt logA logB

Poisson regression                        Number of obs =      59
                                          LR chi2(3)    =  331.93
                                          Prob > chi2   =  0.0000
Log likelihood = -166.04363               Pseudo R2     =  0.4999

------------------------------------------------------------------------------
          y4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         trt |  -.1439681   .1023428    -1.41   0.160    -.3445563    .0566202
        logA |   .3752063   .2339703     1.60   0.109    -.0833671    .8337797
        logB |   1.199049   .0692144    17.32   0.000     1.063391    1.334706
       _cons |  -3.382639   .9001401    -3.76   0.000    -5.146881   -1.618397
------------------------------------------------------------------------------

. poisgof

Goodness-of-fit chi2 = 144.3189    Prob > chi2(55) = 0.0000

COMMAND: poisson y4 trt logA logB, robust

Poisson regression                        Number of obs =      59
                                          Wald chi2(3)  =   78.86
                                          Prob > chi2   =  0.0000
Log likelihood = -166.04363               Pseudo R2     =  0.4999

------------------------------------------------------------------------------
             |               Robust
          y4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         trt |  -.1439681   .1923317    -0.75   0.454    -.5209312     .232995
        logA |   .3752063    .329521     1.14   0.255     -.270643    1.021055
        logB |   1.199049   .1556253     7.70   0.000     .8940287    1.504069
       _cons |  -3.382639   1.273095    -2.66   0.008    -5.877859   -.8874198
------------------------------------------------------------------------------

COMMAND: nbreg y4 trt logA logB, dispersion( mean )

Negative binomial regression              Number of obs =      59
                                          LR chi2(3)    =   60.41
                                          Prob > chi2   =  0.0000
Log likelihood = -149.95938               Pseudo R2     =  0.1677

------------------------------------------------------------------------------
          y4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         trt |  -.3062519   .1624457    -1.89   0.059    -.6246395    .0121357
        logA |   .2822898    .376851     0.75   0.454    -.4563246    1.020904
        logB |   1.091251   .1088676    10.02   0.000     .8778741    1.304627
       _cons |  -2.614126   1.396506    -1.87   0.061    -5.351227    .1229751
-------------+----------------------------------------------------------------
    /lnalpha |  -1.742645   .3823873                      -2.49211   -.9931791
-------------+----------------------------------------------------------------
       alpha |   .1750568   .0669395                      .0827352    .3703973
------------------------------------------------------------------------------
Likelihood ratio test of alpha=0: chibar2(01) = 32.17  Prob>=chibar2 = 0.000

COMMAND: nbreg y4 trt logA logB, dispersion( constant )

Negative binomial (constant dispersion)   Number of obs =      59
                                          LR chi2(3)    =   54.41
                                          Prob > chi2   =  0.0000
Log likelihood = -152.96274               Pseudo R2     =  0.1510

------------------------------------------------------------------------------
          y4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         trt |  -.2040428   .1578151    -1.29   0.196    -.5133547    .1052692
        logA |   .3644169   .3464645     1.05   0.293     -.314641    1.043475
        logB |   1.128261   .1144604     9.86   0.000     .9039232      1.3526
       _cons |  -3.048889   1.356871    -2.25   0.025    -5.708307   -.3894705
-------------+----------------------------------------------------------------
    /lndelta |   .3793183   .3581039                     -.3225524    1.081189
-------------+----------------------------------------------------------------
       delta |   1.461288   .5232929                       .724298    2.948183
------------------------------------------------------------------------------
Likelihood ratio test of delta=0: chibar2(01) = 26.16  Prob>=chibar2 = 0.000

COMMAND: gnbreg y4 trt logA logB, lnalpha( trt )

Generalized negative binomial regression  Number of obs =      59
                                          LR chi2(3)    =   59.87
                                          Prob > chi2   =  0.0000
Log likelihood = -148.69141               Pseudo R2     =  0.1676

------------------------------------------------------------------------------
          y4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y4           |
         trt |  -.3049821   .1657975    -1.84   0.066    -.6299393      .019975
        logA |   .2822315   .3717642     0.76   0.448    -.4464129    1.010876
        logB |   1.040298   .1076457     9.66   0.000     .8293165     1.25128
       _cons |  -2.447882    1.34919    -1.81   0.070    -5.092244    .1964814
-------------+----------------------------------------------------------------
lnalpha      |
         trt |   1.259242   .8122707     1.55   0.121    -.3327789    2.851264
       _cons |  -2.396513    .633884    -3.78   0.000    -3.638903   -1.154124
------------------------------------------------------------------------------

COMMAND: gnbreg y4 trt logA logB, lnalpha( trt logA )

Generalized negative binomial regression  Number of obs =      59
                                          LR chi2(3)    =   59.95
                                          Prob > chi2   =  0.0000
Log likelihood = -148.64905               Pseudo R2     =  0.1678

------------------------------------------------------------------------------
          y4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y4           |
         trt |  -.3093693   .1654708    -1.87   0.062     -.633686    .0149475
        logA |   .2548674   .3818872     0.67   0.505    -.4936178    1.003353
        logB |   1.041267   .1079694     9.64   0.000     .8296512    1.252883
       _cons |  -2.357008   1.376755    -1.71   0.087    -5.055398    .3413828
-------------+----------------------------------------------------------------
lnalpha      |
         trt |   1.184617   .8542226     1.39   0.166    -.4896282    2.858863
        logA |  -.5674455   1.983359    -0.29   0.775    -4.454757    3.319867
       _cons |  -.4834258   6.702122    -0.07   0.942    -13.61934    12.65249
------------------------------------------------------------------------------

NB Likelihood and Score Equations

• Note – these models assume specific variance forms:

   var( y_i | x_i ) = µ_i + α µ_i^p

  ◦ NB-1 assumes p = 1 and var( y_i | x_i ) = µ_i · (1 + α)
  ◦ NB-2 assumes p = 2 and var( y_i | x_i ) = µ_i · (1 + α · µ_i)

• Details on the likelihood, score equations, and model-based information are presented by Cameron and Trivedi, pp. 71-75.

• The standard (common) hierarchical formulation assumes (see the simulation sketch below):

  ◦ [ Y_i | z_i, X_i ] ∼ Poisson( λ_i · z_i )
  ◦ log λ_i = X_i β
  ◦ [ z_i | X_i ] is scaled gamma with parameter δ_i
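A simulation sketch of this hierarchy (illustrative values of α and µ), confirming the NB-2 variance µ(1 + αµ):

set.seed(571)
alpha <- 0.5; mu <- 4
z <- rgamma( 100000, shape=1/alpha, rate=1/alpha )   # E(z)=1, var(z)=alpha
y <- rpois( 100000, mu*z )
var( y )                # approx 12
mu*( 1 + alpha*mu )     # = 12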

NB-2 likelihood

• The general form of the NB likelihood is given above (Negative Binomial Models).

• This model assumes

   [ z_i | X_i ] ∼ gamma( shape = 1/α, rate = 1/α )

• The resulting score equations define the MLEs β̂^(2) and α̂^(2) as the solution to:

   0 = Σ_{i=1}^n ( X_i µ_i )^T [ µ_i ( 1 + α µ_i ) ]^{-1} ( Y_i − µ_i )

   0 = Σ_{i=1}^n { (1/α²) [ log( 1 + α µ_i ) − Σ_{j=0}^{Y_i − 1} 1/( j + α^{-1} ) ] + ( Y_i − µ_i ) / ( α ( 1 + α µ_i ) ) }

NB-2 and maximum likelihood

• The information matrix takes a block-diagonal form:

   A^(2) = [ A_β^(2)    0
             0          A_α^(2) ]

• The model-based variance of β̂^(2) is therefore given by (1/n) [ A_β^(2) ]^{-1} , where

   A_β^(2) = (1/n) Σ_i ( X_i µ_i )^T [ µ_i ( 1 + α µ_i ) ]^{-1} ( X_i µ_i )

• The NB-2 model is a member of the linear exponential family (LEF) and therefore has certain robustness properties (e.g. what happens if the distribution is not NB and/or the variance is not correctly specified?)

(2) • The model-based variance of βb is therefore given by h i−1 1 (2) n Aβ where 1 X A(2) = (X µ )T [µ (1 + αµ )]−1 (X µ ) β n i i i i i i i • The NB-2 model is a member of the linear exponential family (LEF) and therefore has certain robustness properties (e.g. what happens if distribution is not NB and/or variance is not correctly specified?) & % 179 Heagerty, Bio/Stat 571 ' $ NB-1 likelihood

• This model assumes

   [ z_i | X_i ] ∼ gamma( shape = (φ − 1)^{-1} µ_i, rate = (φ − 1)^{-1} µ_i )

• The resulting score equations define the MLEs β̂^(1) and φ̂^(1) as the solution to:

   0 = Σ_{i=1}^n X_i^T (φ − 1)^{-1} µ_i [ Σ_{j=0}^{Y_i − 1} 1/( j + (φ − 1)^{-1} µ_i ) − log(φ) ]

   0 = Σ_{i=1}^n { −(φ − 1)^{-2} µ_i [ Σ_{j=0}^{Y_i − 1} 1/( j + (φ − 1)^{-1} µ_i ) − log(φ) ] + ( Y_i − µ_i ) / ( φ (φ − 1) ) }

• The information matrix does not take a block-diagonal form.

• The NB-1 model is not a member of the linear exponential family (LEF) and therefore does not have the same robustness properties.

• Note: this model assumes the heterogeneity (extra-Poisson variability) does depend on X_i, since the distribution of z_i depends on µ_i.

NB MLEs and Robustness?

• Q: What if the mean and variance models are correct but the distribution is incorrect (e.g. a ZIP model is the truth)?

• Q: What if the mean is correct but the variance model and the distribution are incorrect?

• Answer:

  ◦ The NB-2 MLE for β is consistent if the mean model is correct. Model-based standard errors are not valid if either the variance model or the distributional assumption is violated.

  ◦ The NB-1 MLE for β is inconsistent unless the data are negative binomial with the assumed variance model.


• NB-2 MLE: β̂^(2), α̂^(2)

  ◦ Assume the distribution is not NB but the mean and variance are correctly specified.
  ◦ β̂^(2) → β_0 (justify?)
  ◦ α̂^(2) → α* ≠ α_0
  ◦ Model-based standard errors are given by A_β^(2) evaluated at the MLE α̂^(2), which is biased, and therefore the model-based standard errors are not consistent.

• NB-1 MLE: β̂^(1), φ̂^(1)

  ◦ Trouble.
  ◦ β̂^(1) → β* ≠ β_0 (justify?)
  ◦ φ̂^(1) → φ* ≠ φ_0

Summary of NB Maximum Likelihood Estimation

• Different parameterizations lead to different variance assumptions and different robustness properties.

• The Huber-White results allow us to study the properties of the MLE by considering the properties of the score equations – the defining estimating equations.

• We have seen that sandwich variance estimates can be used to correct standard errors for the Poisson regression MLE β̂^(0):

  ◦ Obtain A
  ◦ Estimate B
  ◦ Report (1/n) A^{-1} B A^{-1}


• We have used a couple of options regarding estimation of B:

  ◦ Assume a variance model and estimate additional variance parameters using method-of-moments estimators.
  ◦ Use the empirical variance of U_i, the EE contributions.

• Q: What about choosing an efficient estimator?

  ◦ We have considered a given regression estimator, such as the Poisson GLM MLE β̂^(0), and used different methods to obtain a valid standard error – unbiased inference.
  ◦ We may also consider the choice of regression estimator – such as using β̂^(2) – in which case the question arises of which estimator yields the most precise estimate of β – efficient inference.