ECON 4160, Autumn Term 2017. Lecture 2
Ragnar Nymoen, University of Oslo. 31 August 2017
Outline: MLE and multivariate regression · Matrix notation · Inference · Non-linear regression · GLS and SUR (for reference)

References to Lecture 2

- HN: Ch. 6, 7, 8: matrix algebra and the multivariate regression model.
- HN: Ch. 9, about mis-specification tests for regression models for cross-section data. See also the note about standard tests of mis-specification posted under Computer Class material on the semester page.
- Lecture note 2 (posted on the semester page) is referred to in several places below.

ML estimation of the multivariate model I

- Ch. 7 in HN.
- The assumptions of the statistical model are identical to Ch. 5 in HN, but the model has two or more regressors (plus the constant term).
- Let k denote the number of regressors: k = 2 in Ch. 5 of HN and k = 1 in Ch. 3 of HN (3011), because there k counts the constant.
- The likelihood function is constructed from the assumed normality of the conditional pdf: f(Y | X_1, ..., X_k).

ML estimation of the multivariate model II

- Assume identical distributions of n independent sets of variables:

    f(Y_1, \dots, Y_n \mid X_{11}, \dots, X_{kn}) = \prod_{i=1}^{n} f(Y_i \mid X_{1i}, \dots, X_{ki})

- Assume that each of the n conditional pdfs is normal.
- By direct analogy to Ch. 5, the log-likelihood function is

    \ell_{Y_1,\dots,Y_n|X} = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\Big(Y_i - \sum_{j=1}^{k}\beta_j X_{ji}\Big)^2    (1)

- The MLEs of β_1, ..., β_k are found by applying OLS to the second part of the expression: therefore the OLS estimators of the β's are the MLEs, and vice versa (a numerical sketch of this equivalence follows below).

ML estimation of the multivariate model III

- Next, from the concentrated log-likelihood function for σ², we find the MLE of the scale parameter as

    \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}\hat{\varepsilon}_i^2

  where the residuals are, of course, from the k-variable model.

Marginal effects and location shifts I

- In multivariate regression, β_2, β_3, ..., β_k are partial effects or marginal effects when the X_j-variables (j = 2, 3, ..., k) are continuous.
- The interpretation of estimated regression coefficients as partial effects is also clear from the famous Frisch-Waugh theorem; see for example Lecture Note 2.
- The economic interpretation depends on how the raw variables have been transformed prior to the specification of the regression model (also known as the choice of functional form).
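A minimal numerical sketch (my own illustration, not from HN) of the OLS/MLE equivalence above: the OLS coefficients together with σ̂² = (1/n)Σε̂² maximize the log-likelihood in equation (1). All data and parameter values are invented.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200

# Design matrix with a constant term and two regressors (k = 3 in HN's counting)
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
beta_true = np.array([1.0, 0.5, -2.0])      # invented parameter values
y = X @ beta_true + rng.normal(scale=1.5, size=n)

# OLS coefficients and residuals
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_ols

# MLE of the scale parameter: sigma2_hat = (1/n) * sum of squared residuals
sigma2_mle = resid @ resid / n

def log_lik(beta, sigma2):
    """Normal log-likelihood as in equation (1)."""
    e = y - X @ beta
    return -n / 2 * np.log(2 * np.pi * sigma2) - (e @ e) / (2 * sigma2)

# The likelihood is highest at the OLS/MLE values; any perturbation lowers it
print(log_lik(beta_ols, sigma2_mle))            # maximum
print(log_lik(beta_ols + 0.05, sigma2_mle))     # lower
print(log_lik(beta_ols, 2.0 * sigma2_mle))      # lower
```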
Marginal effects and location shifts II

DIY Exercise 2.1: For each case below, calculate the partial derivative, the partial elasticity and the semi-elasticity of Y with respect to a small change in X_j.

       Regressand   Regressor
    a) Y            X_j
    b) ln(Y)        ln(X_j)
    c) Y            ln(X_j)
    d) Y            X_j^{-1}

- If one or more of the X's are dummies (for example 0, 0, 1, -1, 1, 0, 0, ...), also called indicator variables, the interpretation of β_j is different: a shift in the intercept, a location shift.

Marginal effects and location shifts III

DIY Exercise 2.2: Assume that we have n_1 mutually independent vectors (Y_i, Z_i), where both variables are continuous, and that we sample (independently) another n_2 mutually independent vectors (Y_i, Z_i). The two conditional expectations are

    E(Y_i \mid Z_i) = \gamma_1 + \gamma_2 Z_i,    i = 1, 2, \dots, n_1
    E(Y_i \mid Z_i) = \lambda_1 + \lambda_2 Z_i,    i = 1, 2, \dots, n_2

where γ_1, γ_2, λ_1, λ_2 are parameters.

a) How can you write this as a regression equation with one intercept and three regressors (so k = 4 in HN's counting) with the aid of an indicator variable?
b) Which of the parameters in your model equation are derivative coefficients, and which are location parameters?

Partial correlation coefficients I

- In the case of k = 3, we can calculate the multiple correlation coefficient R²,

    R^2 = \underbrace{\sum_{i=1}^{n}\hat{Y}_i^2}_{\text{"ESS"}} \Big/ \underbrace{\sum_{i=1}^{n} Y_i^2}_{\text{"TSS"}}

  and three correlation coefficients: r_{Y,X_2}, r_{Y,X_3} and r_{X_2,X_3}.
- In addition, we can calculate two partial correlation coefficients, between Y and X_2 and between Y and X_3: r_{Y,X_2|X_3} and r_{Y,X_3|X_2}.

Partial correlation coefficients II

- r_{Y,X_2|X_3} is constructed from the residuals of two simple regressions, of Y on X_3 and of X_2 on X_3, denoted \hat{\varepsilon}_{iY|X_3} and \hat{\varepsilon}_{iX_2|X_3}:

    r_{Y,X_2|X_3} = \frac{\frac{1}{n}\sum_{i=1}^{n}\hat{\varepsilon}_{iY|X_3}\,\hat{\varepsilon}_{iX_2|X_3}}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\hat{\varepsilon}_{iY|X_3}^2}\,\sqrt{\frac{1}{n}\sum_{i=1}^{n}\hat{\varepsilon}_{iX_2|X_3}^2}}

DIY Exercise 2.3: Show that r_{Y,X_2|X_3} can be expressed as

    r_{Y,X_2|X_3} = \hat{\beta}_2\,\frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\hat{\varepsilon}_{iX_2|X_3}^2}}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\hat{\varepsilon}_{iY|X_3}^2}}

where \hat{\beta}_2 is the OLS estimator of β_2 in the k = 3 model equation Y_i = β_1 + β_2 X_{2i} + β_3 X_{3i} + ε_i. (A numerical check of this identity is sketched below, after the matrix-notation slide.)

Heuristically,

    R² = "all squared simple and partial correlation coefficients",

and this is supported by the relationship in the next exercise.

DIY Exercise 2.4: Try to show that, in the case of k = 3,

    (1 - R^2) = (1 - r_{Y,X_2}^2)(1 - r_{Y,X_3|X_2}^2)

by following the hints in Exercise 7.8 in HN. A slightly different argument will be given in the answers note.

The regression model in matrix notation I

Ch. 6 and 8 in HN. Let X denote an n × k matrix with the regressors of the model

    y = X\beta + \varepsilon    (2)

where y is n × 1, ε is the n × 1 vector of disturbances, and the parameter vector β is k × 1. Notation convention: lowercase bold for data vectors, uppercase bold for data matrices.

    \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
    = \begin{bmatrix} X_{11} & X_{12} & \cdots & X_{1k} \\
                      X_{21} & X_{22} & \cdots & X_{2k} \\
                      \vdots &        &        & \vdots \\
                      X_{n1} & X_{n2} & \cdots & X_{nk} \end{bmatrix}
      \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}
    + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}
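Before continuing with the matrix algebra, here is a small numerical sketch (my own illustration, not part of the original slides) of the partial correlation construction and the identity in DIY Exercise 2.3. Variable names and parameter values are invented; the 1/n factors cancel in both ratios, so raw sums are used.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500

# Invented data with X2 and X3 deliberately correlated
x3 = rng.normal(size=n)
x2 = 0.6 * x3 + rng.normal(size=n)
y = 1.0 + 2.0 * x2 - 1.0 * x3 + rng.normal(size=n)

def resid(target, regressor):
    """Residuals from a simple regression of target on a constant and one regressor."""
    Z = np.column_stack([np.ones(n), regressor])
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    return target - Z @ coef

e_y = resid(y, x3)       # eps_hat_{iY|X3}
e_x2 = resid(x2, x3)     # eps_hat_{iX2|X3}

# Partial correlation r_{Y,X2|X3} from the residual moments
r_partial = (e_y @ e_x2) / np.sqrt((e_y @ e_y) * (e_x2 @ e_x2))

# DIY 2.3: the same number from beta2_hat in the full k = 3 regression
X = np.column_stack([np.ones(n), x2, x3])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
r_via_beta2 = beta_hat[1] * np.sqrt(e_x2 @ e_x2) / np.sqrt(e_y @ e_y)

print(r_partial, r_via_beta2)   # the two expressions coincide
```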
The regression model in matrix notation II

If we let X_i denote the i-th row of X (a 1 × k matrix), a typical row of equation (2) is

    Y_i = X_i\beta + \varepsilon_i = \sum_{j=1}^{k} X_{ij}\beta_j + \varepsilon_i,    i = 1, 2, \dots, n    (3)

Unless both the regressand and all the regressors are measured as deviations from their means, there is an intercept in the model. When we need to make this explicit, we can rewrite X as the partitioned matrix

    X = \begin{bmatrix} i & X_2 \end{bmatrix}

The regression model in matrix notation III

where

    i = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}_{n \times 1}
    \qquad
    X_2 = \begin{bmatrix} X_{12} & \cdots & X_{1k} \\
                          X_{22} & \cdots & X_{2k} \\
                          \vdots &        & \vdots \\
                          X_{n2} & \cdots & X_{nk} \end{bmatrix}_{n \times (k-1)}

ML estimator I

- By solving Exercise C to the first seminar, you will show that OLS gives the estimator

    \hat{\beta} = (X'X)^{-1} X'y    (4)

  for β.
- X' is the transpose of X, and (X'X)^{-1} is the inverse of the matrix X'X of (uncentered) moments between the regressors.
- For the inverse to exist we need rank(X'X) = k (full rank). This is the generalization of the "absence of perfect multicollinearity" condition.

ML estimator II

- Under the assumptions of the statistical regression model (IID observations, normal conditional pdf and exogenous regressors), the log-likelihood is

    \ell_{Y_1,\dots,Y_n|X} = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}(y - X\beta)'(y - X\beta)

- \hat{\beta} in (4) minimizes (y - Xβ)'(y - Xβ), meaning that it is also the MLE of β.
- By solving Exercise C to the first seminar, you can show that \hat{\beta}' = (\hat{\beta}_1 \;\; \hat{\beta}_2') and

    \hat{\beta}_2 = \big((X_2 - \bar{X}_2)'(X_2 - \bar{X}_2)\big)^{-1}(X_2 - \bar{X}_2)'y    (5)

  where \bar{X}_2 is the n × (k-1) matrix with typical column i\bar{X}_i, i = 2, ..., k (each regressor replaced by its sample mean).

ML estimator III

- Equation (5) is the generalization of the "x-deviation from mean" form of the slope coefficient in simple regression.
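To make formulas (4) and (5) concrete, a minimal numerical sketch (invented data, my own illustration): the slopes from the full normal equations coincide with the deviation-from-means form.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100

X2 = rng.normal(size=(n, 2))             # the non-constant regressors (k - 1 = 2)
X = np.column_stack([np.ones(n), X2])    # X = [ i  X2 ], i the n x 1 vector of ones
y = X @ np.array([0.5, 1.0, -1.0]) + rng.normal(size=n)

# Equation (4): beta_hat = (X'X)^{-1} X'y, via solve() instead of an explicit inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Equation (5): slopes from the regressors in deviation-from-means form
X2_dev = X2 - X2.mean(axis=0)            # X2 - X2_bar, column means subtracted
beta2_hat = np.linalg.solve(X2_dev.T @ X2_dev, X2_dev.T @ y)

print(beta_hat[1:])    # slopes from (4)
print(beta2_hat)       # identical slopes from (5)
```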