Note a Random }U Y3
Total Page:16
File Type:pdf, Size:1020Kb
Chapter 02 Ch.2 The simple regression model 1. Definition of the simple regression model The Simple Regression Model 2. Deriving the OLS estimates 3. Mechanics of OLS 4. Units of measurement & functional form y = 0 + 1x + u 5. Expected values & variances of OLSE 6. Regression through the origin Econometrics 1 Econometrics 2 2.1 Definition of the model Equation (2.1), y = 0 + 1x + u, defines the Simple Regression model. In the model, we typically refer to y as the Dependent Variable x as the Independent Variable s as parameters, and u as the error term. Econometrics 3 Econometrics 4 The Concept of Error Term A Simple Assumption for u u represents factors other than x that affect y. The average value of u, the error term, in If the other factors in u are held fixed, so that the population is 0. That is, E(u) = 0. u = 0, then y = 1x. This is not a restrictive assumption, since we can always use 0 to normalize E(u) to 0. Ex. 2.1: yield = 0 + 1fertilizer + u (2.3) u includes land quality, rainfall, etc. To draw ceteris paribus conclusions about Ex. 2.2: wage = + educ + u (2.4) how x affects y, we have to hold all other 0 1 factors (in u) fixed. u includes experience, ability, tenure, etc. Econometrics 5 Econometrics 6 Simple Regression Model 1 Chapter 02 E(y|x) as a linear function of x, where for any x Zero Conditional Mean the distribution of y is centered about E(y|x) y We need to make a crucial assumption about how u and x are related. f(y) We want it to be the case that knowing something about x does not give us any . E(y|x) = + x information about u, so that they are 0 1 completely unrelated. That is, that . E(u|x) = E(u) = 0 (2.5&2.6), which implies E(y|x) = 0 + 1x (PRF) (2.8) x1 x2 x Econometrics 7 Econometrics 8 Population regression line, sample data points 2.2 Deriving the OLSE and the associated error terms y E(y|x) = 0 + 1x y4 . Basic idea of regression is to estimate the u4{ population parameters from a sample. Let {(xi,yi): i = 1, …, n} denote a random }u y3 . 3 sample of size n from the population. y . 2 u2{ For each observation in this sample, it will be the case that y .} u1 yi = 0 + 1xi + ui. (2.9) 1 x1 x2 x3 x4 x Econometrics 9 Econometrics 10 Deriving OLSE using MM Cont. Deriving OLSE using MM To derive the OLS estimates, we need to Since u = y – 0 – 1x, we can rewrite; realize that our main assumption of E(u) = E(y – 0 – 1x) = 0 (2.12) E(u|x) = E(u) = 0 also implies that E(xu) = E[x(y – 0 – 1x)] = 0 (2.13) Cov(x,u) = E(xu) = 0 These are called moment restrictions Because Cov(X,Y) = E(XY) – E(X)E(Y) (B.27) The approach to estimation implies imposing the Now we prepare 2 restrictions to estimate s. population moment restrictions on the sample E(u) = 0 (2.10) moments. It means, a sample estimator of E(X), E(xu) = 0 (2.11) the mean of a population distribution, is simply the arithmetic mean of the sample. Econometrics 11 Econometrics 12 Simple Regression Model 2 Chapter 02 More Derivation of OLS Cont. More Derivation of OLS Given the definition of a sample mean, and We want to choose values of the parameters properties of summation, we can rewrite the first that will ensure that the sample versions of condition as follows our moment restrictions are true ˆ ˆ ˆ ˆ y 0 1x (2.16) or 0 y 1x (2.17) The sample versions are as follows: n So the OLS estimated slope is n1 y ˆ ˆ x 0 (2.14) n i 0 1 i x x y y i1 i i ˆ i1 n 1 n (2.19) n1 x y ˆ ˆ x 0 (2.15) 2 i i 0 1 i xi x i1 i1 Econometrics 13 Econometrics 14 Summary of OLS slope estimate More OLS The slope estimate is the sample Intuitively, OLS is fitting a line through the covariance between x and y divided by the sample points such that the sum of squared sample variance of x. residuals is as small as possible, hence the If x and y are positively (negatively) term is called least squares. correlated, the slope will be positive The residual, û, is an estimate of the error (negative). term, u, and is the difference between the x needs to vary in our sample. fitted line (sample regression function) and See (2.18) & Figure (2.3) the sample point. Econometrics 15 Econometrics 16 Sample regression line, sample data points and the associated estimated error terms Alternate approach to derivation y y . Given the intuitive idea of fitting a line, we 4 û { 4 can set up a formal minimization problem. yˆ ˆ ˆ x n n 2 0 1 2 ˆ ˆ uˆi yi 0 1xi (2.22) } y3 . û3 i1 i1 . y2 { û2 The first order conditions, which are the almost same as (2.14) & (2.15), û n n y .} 1 ˆ ˆ ˆ ˆ 1 yi 0 1xi 0, xi yi 0 1xi 0 i1 i1 x1 x2 x3 x4 x Econometrics 17 Econometrics 18 Simple Regression Model 3 Chapter 02 2.3 Properties of OLS Cont. Algebraic Properties Algebraic Properties of OLS 2. The sample covariance between the regressors and the OLS residuals is zero 1. The sum of the OLS residuals is zero. n Thus, the sample average of the OLS ˆ xiui 0 (2.31) residuals is zero as well. i1 n 1 n 3. The OLS regression line always goes ˆ ˆ u i 0 and thus, u i 0 (2.30) through the mean of the sample i 1 n i 1 ˆ ˆ y 0 1x Econometrics 19 Econometrics 20 Cont. Algebraic Properties Goodness-of-Fit We can think of each observation as being made It’s useful we think about how well the up of an explained part, and an unexplained part, sample regression line fits sample data. yi yˆi uˆi (2.32) Then we define the following : From (2.36), 2 y y SST (2.33) 2 SSE SSR i R 1 (2.38). ˆ 2 SST SST yi y SSE (2.34) R2 indicates the fraction of the sample ˆ 2 ui SSR (2.35) variation in yi that is explained by the Then, SST SSE SSR (2.36) model. Econometrics 21 Econometrics 22 2.4 Measurement Units & Function Form 2.5 Means & Variance of OLSE If we use the model y* = * + *x*+ u* ˆ 0 1 Now, we view i as estimators for the parameters instead of y = 0 + 1 x + u, we get i that appears in the population, which means properties of the distributions of ˆ over different ˆ * ˆ ˆ * c ˆ i 0 c0 and 1 1 random samples from the population. d where y* = c y and x* = d x. Similarly, Unbiasedness of OLS Unbiased estimator: An estimator whose expected ˆ * y x ˆ y 1 1 value (or mean of its sampling distribution) equals x y x the population value (regardless of the population where y* = ln y and x* = ln x. value). Econometrics 23 Econometrics 24 Simple Regression Model 4 Chapter 02 Cont. Unbiasedness of OLS Cont. Unbiasedness of OLS Assumption for unbiasedness In order to think about unbiasedness, we 1. Linear in parameters as y = + x + u need to rewrite our estimator in terms of 0 1 the population parameter. 2. Random sampling {(xi, yi): i = 1, 2, …, n}, Thus, y = + x + u xi xyi xi x ui i 0 1 i i ˆ (2.49),(2.52) 1 (x x)2 1 (x x)2 3. Sample variation in the xi, thus i i 2 (x x) 0 xi x i ˆ then E1 1 2 E(ui | x) 1 (2.53) 4. Zero conditional mean, E(u|x) = 0 (xi x) ˆ *we can also get E(0 ) 0 in the same way. Econometrics 25 Econometrics 26 Unbiasedness Summary Variances of the OLS Estimators The OLS estimates of 1 and 0 are Now we know that the sampling unbiased. distribution of our estimate is centered Proof of unbiasedness depends on our 4 around the true parameter. assumptions – if any assumption fails, then We want to think about how spread out this OLS is not necessarily unbiased. distribution is. Remember unbiasedness is a description of It is much easier to think about this variance the estimator – in a given sample our under an additional assumption, so assume estimate may be “near” or “far” from the 5. Var(u|x) = 2 (Homoskedasticity) true parameter. Econometrics 27 Econometrics 28 Cont. Variance of OLSE Homoskedastic Case y 2 is also the unconditional variance, called f(y|x) the error variance, since 2 2 Var(u|x) = E(u |x) - [E(u|x)] . 2 2 2 E(u|x) = 0, so = E(u |x) = E(u ) = Var(u) E(y|x) = 0 + 1x And , the square root of the error variance, is . called the standard deviation of the error. Then we can say 2 E(y|x)=0 + 1x and Var(y|x) = x1 x2 Econometrics 29 Econometrics 30 Simple Regression Model 5 Chapter 02 Heteroskedastic Case Cont. Variance of OLSE f(y|x) 2 ˆ Var ( 1 ) 2 (2.57 ) ( xi x ) .