Topic 15: Maximum Likelihood Estimation, Multidimensional Estimation


Outline

• Fisher Information
• Example: Distribution of Fitness Effects, Gamma Distribution

Fisher Information

For a multidimensional parameter space $\theta = (\theta_1, \theta_2, \ldots, \theta_n)$, the Fisher information $I(\theta)$ is a matrix. As in the one-dimensional case, the $ij$-th entry has two alternative expressions, namely

$$I(\theta)_{ij} = E_\theta\left[\frac{\partial}{\partial\theta_i}\ln L(\theta|X)\,\frac{\partial}{\partial\theta_j}\ln L(\theta|X)\right] = -E_\theta\left[\frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\ln L(\theta|X)\right].$$

Rather than taking reciprocals to obtain an estimate of the variance, we find the matrix inverse $I(\theta)^{-1}$.

• The diagonal entries of $I(\theta)^{-1}$ give estimates of variances.
• The off-diagonal entries of $I(\theta)^{-1}$ give estimates of covariances.

To be precise, for $n$ observations, let $\hat\theta_{i,n}(X)$ be the maximum likelihood estimator of the $i$-th parameter. Then

$$\mathrm{Var}_\theta(\hat\theta_{i,n}(X)) \approx \frac{1}{n} I(\theta)^{-1}_{ii}, \qquad \mathrm{Cov}_\theta(\hat\theta_{i,n}(X), \hat\theta_{j,n}(X)) \approx \frac{1}{n} I(\theta)^{-1}_{ij}.$$

When the $i$-th parameter is $\theta_i$, the asymptotic normality and efficiency can be expressed by noting that the z-score

$$Z_{i,n} = \frac{\hat\theta_i(X) - \theta_i}{\sqrt{I(\theta)^{-1}_{ii}/n}}$$

is approximately a standard normal. As we saw in one dimension, we can replace the information matrix with the observed information matrix,

$$J(\hat\theta)_{ij} = -\frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\ln L(\hat\theta(X)|X).$$

Distribution of Fitness Effects

We return to the model of the gamma distribution for the distribution of fitness effects of deleterious mutations. To obtain the maximum likelihood estimate for the gamma family of random variables, write the likelihood

$$L(\alpha,\beta|\mathbf{x}) = \frac{\beta^\alpha}{\Gamma(\alpha)} x_1^{\alpha-1}e^{-\beta x_1} \cdots \frac{\beta^\alpha}{\Gamma(\alpha)} x_n^{\alpha-1}e^{-\beta x_n} = \left(\frac{\beta^\alpha}{\Gamma(\alpha)}\right)^n (x_1 x_2 \cdots x_n)^{\alpha-1} e^{-\beta(x_1+x_2+\cdots+x_n)}$$

and its logarithm

$$\ln L(\alpha,\beta|\mathbf{x}) = n(\alpha\ln\beta - \ln\Gamma(\alpha)) + (\alpha-1)\sum_{i=1}^n \ln x_i - \beta\sum_{i=1}^n x_i.$$

The score function is the vector $\left(\frac{\partial}{\partial\alpha}\ln L(\alpha,\beta|\mathbf{x}),\ \frac{\partial}{\partial\beta}\ln L(\alpha,\beta|\mathbf{x})\right)$.

Gamma Distribution

The zeros of the components of the score function determine the maximum likelihood estimators. Thus, to determine these parameters, we solve the equations

$$\frac{\partial}{\partial\alpha}\ln L(\hat\alpha,\hat\beta|\mathbf{x}) = n\left(\ln\hat\beta - \frac{d}{d\alpha}\ln\Gamma(\hat\alpha)\right) + \sum_{i=1}^n \ln x_i = 0$$

and

$$\frac{\partial}{\partial\beta}\ln L(\hat\alpha,\hat\beta|\mathbf{x}) = n\frac{\hat\alpha}{\hat\beta} - \sum_{i=1}^n x_i = 0, \quad\text{or}\quad \bar{x} = \frac{\hat\alpha}{\hat\beta}.$$

Substituting $\hat\beta = \hat\alpha/\bar{x}$ into the first equation results in the following relationship for $\hat\alpha$:

$$n\left(\ln\hat\alpha - \ln\bar{x} - \frac{d}{d\alpha}\ln\Gamma(\hat\alpha) + \overline{\ln x}\right) = 0,$$

where $\overline{\ln x} = \frac{1}{n}\sum_{i=1}^n \ln x_i$.

This equation can be solved numerically. The derivative of the logarithm of the gamma function,

$$\psi(\alpha) = \frac{d}{d\alpha}\ln\Gamma(\alpha),$$

is known as the digamma function and is called in R with digamma. For the example of the distribution of fitness effects in humans, a simulated data set (rgamma(500, 0.19, 5.18)) yields $\hat\alpha = 0.2006$ and $\hat\beta = 5.806$ as the maximum likelihood estimates.

Figure: $\ln\hat\alpha - \ln\bar{x} - \frac{d}{d\alpha}\ln\Gamma(\hat\alpha) + \overline{\ln x}$ crosses the horizontal axis at $\hat\alpha = 0.2006$.
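To make the numerical step concrete, here is a minimal R sketch, not from the original slides, that solves the score equation for $\hat\alpha$ with uniroot and the digamma function. The random seed is an assumption (the slides do not report one), so the resulting estimates will be near, but not exactly, the slides' values.

```r
# Simulated fitness-effect data as on the slides: 500 draws from gamma(0.19, 5.18)
set.seed(1)  # assumption: the slides do not give a seed
x <- rgamma(500, shape = 0.19, rate = 5.18)

# Score equation in alpha after substituting beta.hat = alpha.hat / mean(x):
# ln(alpha) - ln(xbar) - psi(alpha) + mean(ln x) = 0
alpha.score <- function(a) log(a) - log(mean(x)) - digamma(a) + mean(log(x))

alpha.hat <- uniroot(alpha.score, interval = c(0.01, 1))$root
beta.hat  <- alpha.hat / mean(x)
c(alpha.hat, beta.hat)  # close to the slides' estimates 0.2006 and 5.806
```

The bracketing interval (0.01, 1) works because the score is large and positive for small alpha and negative by alpha = 1 for data like these.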
Exercise. To determine the variance of these estimators, compute the appropriate second derivatives:

$$I_{11}(\alpha,\beta) = -\frac{\partial^2}{\partial\alpha^2}\ln L(\alpha,\beta|\mathbf{x}) = n\frac{d^2}{d\alpha^2}\ln\Gamma(\alpha), \qquad I_{22}(\alpha,\beta) = -\frac{\partial^2}{\partial\beta^2}\ln L(\alpha,\beta|\mathbf{x}) = n\frac{\alpha}{\beta^2},$$

$$I_{12}(\alpha,\beta) = -\frac{\partial^2}{\partial\alpha\,\partial\beta}\ln L(\alpha,\beta|\mathbf{x}) = -n\frac{1}{\beta}.$$

This gives the Fisher information matrix

$$I(\alpha,\beta) = n\begin{pmatrix} \frac{d^2}{d\alpha^2}\ln\Gamma(\alpha) & -\frac{1}{\beta} \\ -\frac{1}{\beta} & \frac{\alpha}{\beta^2} \end{pmatrix}, \qquad I(0.19, 5.18) = 500\begin{pmatrix} 28.983 & -0.193 \\ -0.193 & 0.007 \end{pmatrix}.$$

NB. $\psi_1(\alpha) = d^2\ln\Gamma(\alpha)/d\alpha^2$ is known as the trigamma function and is called in R with trigamma.

The inverse matrix is

$$I(\alpha,\beta)^{-1} = \frac{1}{500}\begin{pmatrix} 0.0422 & 1.1494 \\ 1.1494 & 172.5587 \end{pmatrix}.$$

Thus,

$$\mathrm{Var}(\hat\alpha) \approx 8.432\times 10^{-5}, \quad \sigma_{\hat\alpha} \approx 0.00918, \qquad \mathrm{Var}(\hat\beta) \approx 0.3451, \quad \sigma_{\hat\beta} \approx 0.5875.$$

Compare these with the standard deviations of the method of moments estimators, $\sigma_{\hat\alpha} \approx 0.02838$ and $\sigma_{\hat\beta} \approx 0.9769$: the maximum likelihood estimators are considerably more precise.

Exercise. Estimate the correlation $\rho(\hat\alpha, \hat\beta)$.

Figure: Graphs of vertical slices through the log-likelihood surface through the MLE: (top) $\hat\beta = 5.806$, (bottom) $\hat\alpha = 0.2006$.

Figure: The log-likelihood surface. The domain is $0.14 \le \alpha \le 0.24$ and $5 \le \beta \le 7$.
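As a check on the matrix arithmetic above, here is a short R sketch, an illustration rather than code from the slides, that builds the information matrix with trigamma, inverts it with solve, and numerically sketches the correlation exercise.

```r
# Fisher information matrix for the gamma(alpha, beta) model with n observations
fisher.info <- function(alpha, beta, n) {
  n * matrix(c(trigamma(alpha), -1 / beta,
               -1 / beta,        alpha / beta^2),
             nrow = 2, byrow = TRUE)
}

I    <- fisher.info(0.19, 5.18, n = 500)  # evaluated at the simulation parameters
Iinv <- solve(I)                          # inverse: approximate covariance matrix of the MLEs

sqrt(diag(Iinv))                            # standard errors; approximately 0.00918 and 0.5875
Iinv[1, 2] / sqrt(Iinv[1, 1] * Iinv[2, 2])  # rho(alpha.hat, beta.hat); approximately 0.43
```

Note that the factor of $n$ cancels in the correlation, so $\rho(\hat\alpha, \hat\beta)$ depends only on the entries of the per-observation information matrix.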