The Simple Regression Model
Caio Vigo
The University of Kansas, Department of Economics
Fall 2019
These slides were based on Introductory Econometrics by Jeffrey M. Wooldridge (2015)
Topics
1. Definition of the Simple Regression Model
2. Deriving the Ordinary Least Squares Estimates
3. Properties of OLS on any Sample of Data
4. Units of Measurement and Functional Form
   Using the Natural Logarithm in Simple Regression
5. Expected Value of OLS
Definition of the Simple Regression Model
• What type of analysis will we do? Cross-sectional analysis.
• First step: Clearly define your population (the group you are interested in studying).
• Second step: There are two variables, \(x\) and \(y\), and we would like to "study how \(y\) varies with changes in \(x\)."
• Third step: We assume we can collect a random sample from the population of interest.
Now we will learn to write our first econometric model, derive an estimator (what's an estimator again?), and use this estimator in our sample.
Introduction
We must confront three issues:
1. How do we allow factors other than \(x\) to affect \(y\)? There is never an exact relationship between two variables.
2. What is the functional relationship between \(y\) and \(x\)?
3. How can we be sure we are capturing a ceteris paribus relationship between \(y\) and \(x\)?
Consider the following equation relating \(y\) to \(x\):
\[y = \beta_0 + \beta_1 x + u,\]
which is assumed to hold in the population of interest.
• This equation defines the simple linear regression model (or two-variable regression model, or bivariate linear regression model).
• \(y\) and \(x\) are not treated symmetrically. We want to explain \(y\) in terms of \(x\): \(x\) explains \(y\),
\[x \longrightarrow y\]
• Example: the size of a city (\(x\)) explains the number of crimes (\(y\)), not the other way around.
Terminology for Simple Regression
y                      x
Dependent Variable     Independent Variable
Explained Variable     Explanatory Variable
Response Variable      Control Variable
Predicted Variable     Predictor Variable
Regressand             Regressor
The error term
\[y = \beta_0 + \beta_1 x + u\]
This equation explicitly allows for other factors, contained in \(u\), to affect \(y\).
This equation also addresses the functional form issue (in a simple way). Namely, \(y\) is assumed to be linearly related to \(x\).
We call \(\beta_0\) the intercept parameter and \(\beta_1\) the slope parameter. These describe a population, and our ultimate goal is to estimate them.
The simple linear regression model equation
• The equation also addresses the ceteris paribus issue. In
\[y = \beta_0 + \beta_1 x + u,\]
all other factors that affect \(y\) are in \(u\). We want to know how \(y\) changes when \(x\) changes, holding \(u\) fixed.
• Let \(\Delta\) denote "change." Then holding \(u\) fixed means \(\Delta u = 0\). So
\[\Delta y = \beta_1 \Delta x + \Delta u = \beta_1 \Delta x \quad \text{when } \Delta u = 0.\]
• This equation effectively defines \(\beta_1\) as a slope, with the only difference being the restriction \(\Delta u = 0\).
Example: Yield and Fertilizer
• A model relating crop yield to fertilizer use is
\[\mathit{yield} = \beta_0 + \beta_1 \mathit{fertilizer} + u,\]
where \(u\) contains land quality, rainfall on a plot of land, and so on. The slope parameter, \(\beta_1\), is of primary interest: it tells us how yield changes when the amount of fertilizer changes, holding all else fixed.
Example: Wage and Education
\[\mathit{wage} = \beta_0 + \beta_1 \mathit{educ} + u\]
where \(u\) contains somewhat nebulous factors ("ability") but also past workforce experience and tenure on the current job.
\[\Delta \mathit{wage} = \beta_1 \Delta \mathit{educ} \quad \text{when } \Delta u = 0\]
We said we must confront three issues:
1. How do we allow factors other than \(x\) to affect \(y\)?
   Answer: \(u\)
2. What is the functional relationship between \(y\) and \(x\)?
   Answer: Linear (\(x\) has a linear effect on \(y\))
3. How can we be sure we are capturing a ceteris paribus relationship between \(y\) and \(x\)?
   Answer: By holding the other factors fixed, \(\Delta u = 0\)
• We have argued that the simple regression model
\[y = \beta_0 + \beta_1 x + u\]
addresses each of them.
Relation between u and x
To estimate \(\beta_1\) and \(\beta_0\) from a random sample, we also need to restrict how \(u\) and \(x\) are related to each other.
• Recall that \(x\) and \(u\) are properly viewed as having distributions in the population.
• What we must do is restrict the way in which \(u\) and \(x\) relate to each other in the population.
• First, we make a simplifying assumption that is without loss of generality: the average, or expected, value of \(u\) is zero in the population:
\[E(u) = 0\]
• Normalizing \(u\) has no impact on the most important parameter, \(\beta_1\).
• The presence of \(\beta_0\) in
\[y = \beta_0 + \beta_1 x + u\]
allows us to assume \(E(u) = 0\).
• If the average of \(u\) is different from zero, we just adjust the intercept, leaving the slope the same. If \(\alpha_0 = E(u)\), then we can write
\[y = (\beta_0 + \alpha_0) + \beta_1 x + (u - \alpha_0),\]
where the new error, \(u - \alpha_0\), has a zero mean.
We need to restrict the dependence between \(u\) and \(x\).
• Option 1: Uncorrelated
We could assume \(u\) and \(x\) are uncorrelated in the population:
\[\operatorname{Corr}(x, u) = 0\]
This implies only that \(u\) and \(x\) are not linearly related, which is not good enough.
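A small constructed example may help show why zero correlation alone is too weak. The sketch below assumes a hypothetical population (not taken from the slides) in which x takes the values -1, 0, 1 with equal probability and u = x^2 - 2/3. In that population u and x are uncorrelated, yet the mean of u differs across values of x.

```python
from fractions import Fraction

# Hypothetical population (an assumption for illustration only):
# x is -1, 0, or 1 with probability 1/3 each, and u = x^2 - 2/3.
xs = [Fraction(-1), Fraction(0), Fraction(1)]
prob = Fraction(1, 3)


def u_of(x):
    """Error term as an exact, nonlinear function of x."""
    return x * x - Fraction(2, 3)


# Population means of x and u.
mean_x = sum(prob * x for x in xs)
mean_u = sum(prob * u_of(x) for x in xs)

# Population covariance between x and u.
cov_xu = sum(prob * (x - mean_x) * (u_of(x) - mean_u) for x in xs)

# Because u is a function of x here, E(u | x) is just u_of(x).
cond_mean_u = {x: u_of(x) for x in xs}

print("E(u) =", mean_u)
print("Cov(x, u) =", cov_xu)
for x, m in cond_mean_u.items():
    print(f"E(u | x = {x}) = {m}")
```

The covariance is exactly zero, but E(u | x) is not constant in x, so uncorrelatedness does not deliver the stronger restriction that the conditional mean of u is the same for every x.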
• Option 2: Mean independence
The mean of the error (i.e., the mean of the unobservables) is the same across all slices of the population determined by values of \(x\).
We represent it by:
\[E(u|x) = E(u), \quad \text{all values } x,\]
and we say that \(u\) is mean independent of \(x\).
• Suppose \(u\) is "ability" and \(x\) is years of education. We need, for example,
\[E(\mathit{ability}|x = 8) = E(\mathit{ability}|x = 12) = E(\mathit{ability}|x = 16)\]
so that the average ability is the same in the different portions of the population with an 8th grade education, a 12th grade education, and a four-year college education.
• Combining \(E(u|x) = E(u)\) (the substantive assumption) with \(E(u) = 0\) (a normalization) gives
\[E(u|x) = 0, \quad \text{all values } x\]
• This is called the zero conditional mean assumption.
The PRF
• First, recall the properties of conditional expectation (see the slides with a review of probability).
• Now, take the conditional expectation of our simple linear regression model. Then, we get:
\[E(y|x) = \beta_0 + \beta_1 x + E(u|x) = \beta_0 + \beta_1 x,\]
which shows the population regression function (PRF) is a linear function of \(x\).
Figure: Example: The goal is to explain weekly consumption expenditure in terms of weekly income.
Source: Gujarati, Damodar (2002). Basic Econometrics.
Figure: Conditional probabilities of the data.
Source: Gujarati, Damodar (2002). Basic Econometrics.
Figure: Conditional distribution of expenditure for various levels of income.
Source: Gujarati, Damodar (2002). Basic Econometrics.
Figure: The Population Regression Function (PRF).
Source: Gujarati, Damodar (2002). Basic Econometrics.
• The straight line in the previous graph is the PRF, \(E(y|x) = \beta_0 + \beta_1 x\). The conditional distributions of \(y\) at three different values of \(x\) are superimposed.
• For a given value of \(x\), we see a range of \(y\) values: remember, \(y = \beta_0 + \beta_1 x + u\), and \(u\) has a distribution in the population.
• In practice, we never know the population intercept and slope.
• Assuming we know the PRF, consider this example:
Example
• Suppose for the population of students attending a university, we know the PRF:
\[E(\mathit{colGPA}|\mathit{hsGPA}) = 1.5 + 0.5\, \mathit{hsGPA}\]
• For this example, what is \(y\)? What is \(x\)? What is the slope? What's the intercept?
• If \(\mathit{hsGPA} = 3.6\), what's the expected college GPA? \(1.5 + 0.5(3.6) = 3.3\)
Deriving the OLS
• Given data on \(x\) and \(y\), how can we estimate the population parameters, \(\beta_0\) and \(\beta_1\)?
• Let \(\{(x_i, y_i): i = 1, 2, \ldots, n\}\) be a random sample of size \(n\) (the number of observations) from the population.
Derivation: (On white board)
Estimator for \(\beta_0\)
\[\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\]
Estimator for \(\beta_1\)
\[\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\text{Sample Covariance}(x, y)}{\text{Sample Variance}(x)} = \frac{S_{x,y}}{S_x^2} = \hat{\rho}_{x,y} \frac{\hat{\sigma}_y}{\hat{\sigma}_x}\]
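The whiteboard derivation is not reproduced in the slides. One standard route to the two estimators, shown here only as a sketch, minimizes the sum of squared residuals over candidate values \(b_0\) and \(b_1\) (these two symbols are introduced for illustration) and assumes the \(x_i\) are not all equal, so that \(\sum (x_i - \bar{x})^2 > 0\).

```latex
% Least squares problem over candidate intercept b_0 and slope b_1
\[
\min_{b_0,\, b_1} \; \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2
\]

% First-order condition with respect to b_0, evaluated at the minimizer
\[
\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0
\quad\Longrightarrow\quad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
\]

% First-order condition with respect to b_1
\[
\sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0
\]

% Substitute the expression for \hat{\beta}_0
\[
\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \hat{\beta}_1 \sum_{i=1}^{n} x_i (x_i - \bar{x})
\]

% Use \sum x_i (y_i - \bar{y}) = \sum (x_i - \bar{x})(y_i - \bar{y})
% and \sum x_i (x_i - \bar{x}) = \sum (x_i - \bar{x})^2
\[
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\]
```

The two first-order conditions are the sample counterparts of \(E(u) = 0\) and \(E(xu) = 0\), which is why a method of moments argument leads to the same formulas.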
Figure (image not available). Source: Gujarati, Damodar (2002). Basic Econometrics.
Interpreting the OLS Estimates
Example: Effects of Education on Hourly Wage
• Data: random sample from the US workforce population in 1976.
  wage: dollars per hour,
  educ: highest grade completed (years of education).
• The estimated equation is
\[\widehat{\mathit{wage}} = -0.90 + 0.54\, \mathit{educ}, \quad n = 526\]
• Each additional year of schooling is estimated to be worth $0.54 per hour.
The function
\[\widehat{\mathit{wage}} = -0.90 + 0.54\, \mathit{educ}\]
is the OLS (or sample) regression line.
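Statistical software reports the intercept and slope directly, but the arithmetic behind them is short. The following is a minimal sketch in plain Python that applies the two estimator formulas to a tiny sample. The function name `ols_simple` and the five data points are hypothetical and are not the wage data used in the slides.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of the OLS line for paired samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-products of deviations from the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: sum of squared deviations of x from its mean.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must vary in the sample")
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical sample, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0_hat, b1_hat = ols_simple(x, y)
print(f"intercept = {b0_hat:.2f}, slope = {b1_hat:.2f}")
```

The common factor 1/(n - 1) in the sample covariance and sample variance cancels in the ratio, so the sketch works with raw sums of deviations.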
• When we write the simple linear regression model,
\[\mathit{wage} = \beta_0 + \beta_1 \mathit{educ} + u,\]
it applies to the population, so we do not know \(\beta_0\) and \(\beta_1\).
• \(\hat{\beta}_0 = -0.90\) and \(\hat{\beta}_1 = 0.54\) are our estimates from this particular sample.
• These estimates may or may not be close to the population values. If we obtain another sample, the estimates would almost certainly change.
• If \(\mathit{educ} = 0\), the predicted wage is:
\[\widehat{\mathit{wage}} = -0.90 + 0.54(0) = -0.90\]
The predicted value is not realistic: a wage cannot be negative. This happens mainly because extrapolating outside the range covered by most of our data can produce strange predictions.
• When \(\mathit{educ} = 8\), the predicted wage is:
\[\widehat{\mathit{wage}} = -0.90 + 0.54(8) = 3.42\]
which we can think of as our estimate of the average wage in the population when \(\mathit{educ} = 8\).
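The predictions above are plain substitutions into the estimated equation. As a small sketch, the helper below (the name `predicted_wage` is hypothetical) uses the coefficients reported in the slides and evaluates the fitted line at a few education levels.

```python
# Coefficients of the estimated wage equation reported in the slides.
BETA0_HAT = -0.90
BETA1_HAT = 0.54


def predicted_wage(educ):
    """Fitted hourly wage in dollars for a given number of years of education."""
    return BETA0_HAT + BETA1_HAT * educ


wage_at_0 = predicted_wage(0)   # extrapolation far outside typical data
wage_at_8 = predicted_wage(8)
# The difference between adjacent years is the slope, whatever the starting year.
gain_per_year = predicted_wage(13) - predicted_wage(12)

print(f"educ = 0: {wage_at_0:.2f}")
print(f"educ = 8: {wage_at_8:.2f}")
print(f"gain from one more year: {gain_per_year:.2f}")
```

Because the fitted line is linear in educ, the predicted gain from one more year is the same at every education level.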
Sample Regression Function (SRF)
\[\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i, \quad i = 1, \ldots, n\]
Also known as:
• OLS Regression Line
• Sample Regression Line
• OLS Regression Function
• Estimated Equation
Population Regression Function (PRF)
Since the simple linear regression model (or just econometric model) is:
\[y_i = \beta_0 + \beta_1 x_i + u_i\]
then the PRF is:
\[E(y_i|x_i) = \beta_0 + \beta_1 x_i, \quad i = 1, 2, \ldots, n\]
Residuals
\[\hat{u}_i = y_i - \hat{y}_i, \quad i = 1, 2, \ldots, n\]
Error Term
\[u_i = y_i - E(y_i|x_i) = y_i - \beta_0 - \beta_1 x_i, \quad i = 1, 2, \ldots, n\]
Properties of OLS on any Sample of Data
• Recall that the OLS residuals are
\[\hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i, \quad i = 1, 2, \ldots, n\]
Figure (image not available). Source: Gujarati, Damodar (2002). Basic Econometrics.
• Some residuals are positive, others are negative.
• If \(\hat{u}_i\) is positive \(\Rightarrow\) the line underpredicts \(y_i\)
• If \(\hat{u}_i\) is negative \(\Rightarrow\) the line overpredicts \(y_i\)
Algebraic Properties of OLS Statistics
(1) The sum of the OLS residuals is 0:
\[\sum_{i=1}^{n} \hat{u}_i = 0\]
(2) The sample covariance between the explanatory variable and the residuals is always zero:
\[\sum_{i=1}^{n} x_i \hat{u}_i = 0\]
• Therefore the sample correlation between the \(x_i\) and the \(\hat{u}_i\) is also equal to zero.
• Because the \(\hat{y}_i\) are linear functions of the \(x_i\), the fitted values and residuals are uncorrelated, too:
\[\sum_{i=1}^{n} \hat{y}_i \hat{u}_i = 0\]
(3) The point \((\bar{x}, \bar{y})\) is always on the OLS regression line:
\[\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}\]
• That is, if we plug in the average for \(x\), we predict the sample average for \(y\).
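These algebraic properties hold in any sample, which makes them easy to check numerically. The sketch below fits the OLS line to a small hypothetical sample (the numbers are made up for illustration) and evaluates each property. The sums come out as zero up to floating-point rounding.

```python
# Hypothetical sample, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept from the usual formulas.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Property (1): residuals sum to zero.
sum_resid = sum(resid)
# Property (2): residuals are orthogonal to x and to the fitted values.
sum_x_resid = sum(xi * ui for xi, ui in zip(x, resid))
sum_fitted_resid = sum(fi * ui for fi, ui in zip(fitted, resid))
# Property (3): the line passes through (x_bar, y_bar).
line_at_x_bar = b0 + b1 * x_bar

print(f"sum of residuals:          {sum_resid:.2e}")
print(f"sum of x * residual:       {sum_x_resid:.2e}")
print(f"sum of fitted * residual:  {sum_fitted_resid:.2e}")
print(f"line at x_bar = {line_at_x_bar:.4f}, y_bar = {y_bar:.4f}")
```

Changing the data changes the estimates but not these identities, since they follow from the first-order conditions that define the OLS estimates.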
Goodness-of-Fit
Figure (image not available). Source: Gujarati, Damodar (2002). Basic Econometrics.
• For each observation, write
\[y_i = \hat{y}_i + \hat{u}_i\]
• Define:
\[\text{Total Sum of Squares} = SST = \sum_{i=1}^{n} (y_i - \bar{y})^2\]
\[\text{Explained Sum of Squares} = SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2\]
\[\text{Residual Sum of Squares} = SSR = \sum_{i=1}^{n} \hat{u}_i^2\]
Figure (image not available). Source: Gujarati, Damodar (2002). Basic Econometrics.
(Other names)
• SSR is also known as Sum of Squared Residuals or Model Sum of Residuals
• SST = TSS
• SSE = ESS
• SSR = RSS
\[\begin{aligned}
SST &= \sum_{i=1}^{n} (y_i - \bar{y})^2 \\
&= \sum_{i=1}^{n} [(y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})]^2 \\
&= \sum_{i=1}^{n} [\hat{u}_i + (\hat{y}_i - \bar{y})]^2
\end{aligned}\]
Using the fact that the fitted values and residuals are uncorrelated:
\[SST = SSE + SSR\]
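The step from the squared sum to SST = SSE + SSR can be spelled out. The following sketch expands the square and uses two algebraic properties of OLS, \(\sum \hat{u}_i = 0\) and \(\sum \hat{y}_i \hat{u}_i = 0\), to eliminate the cross term.

```latex
% Expand the square of \hat{u}_i + (\hat{y}_i - \bar{y})
\[
SST = \sum_{i=1}^{n} \hat{u}_i^2
      + 2 \sum_{i=1}^{n} \hat{u}_i (\hat{y}_i - \bar{y})
      + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2
\]

% The cross term vanishes
\[
\sum_{i=1}^{n} \hat{u}_i (\hat{y}_i - \bar{y})
  = \sum_{i=1}^{n} \hat{y}_i \hat{u}_i - \bar{y} \sum_{i=1}^{n} \hat{u}_i
  = 0 - \bar{y} \cdot 0 = 0
\]

% What remains is the decomposition
\[
SST = SSR + SSE
\]
```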
The R-Squared
Goal: We want to evaluate how well the independent variable \(x\) explains the dependent variable \(y\).
• We want to obtain the fraction of the sample variation in \(y\) that is explained by \(x\).
• We will summarize it in one number: \(R^2\) (or coefficient of determination).
• Assuming \(SST > 0\),
\[R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}\]
• Since SSE cannot be greater than SST, then:
\[0 \leq R^2 \leq 1\]
• \(R^2 = 0 \Rightarrow\) No linear relationship (between \(y_i\) and \(x_i\)).
• \(R^2 = 1 \Rightarrow\) Perfect linear relationship (between \(y_i\) and \(x_i\)).
• As \(R^2\) increases \(\Rightarrow\) the \(y_i\) get closer and closer to the OLS regression line.
We should not focus only on \(R^2\) to analyze our regression.
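As an illustration of how the sums of squares combine into the R-squared, the sketch below computes SST, SSE, and SSR for a small hypothetical sample (made-up numbers, not data from the slides) and evaluates the R-squared both ways. The function name `sums_of_squares` is an assumption introduced here.

```python
def sums_of_squares(x, y):
    """Fit the OLS line and return (SST, SSE, SSR) for paired samples x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # total
    sse = sum((fi - y_bar) ** 2 for fi in fitted)            # explained
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual
    return sst, sse, ssr


# Hypothetical sample, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

sst, sse, ssr = sums_of_squares(x, y)
r2 = sse / sst            # explained share of the variation
r2_alt = 1.0 - ssr / sst  # one minus the unexplained share

print(f"SST = {sst:.2f}, SSE = {sse:.2f}, SSR = {ssr:.2f}")
print(f"R-squared = {r2:.2f} (alternative formula: {r2_alt:.2f})")
```

Both formulas agree because SST = SSE + SSR, and the result necessarily lies between 0 and 1.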
Example (Wage)
\[\widehat{\mathit{wage}} = -0.90 + 0.54\, \mathit{educ}\]
\[n = 526, \quad R^2 = .16\]
• Therefore, years of education explains only about 16% of the variation in hourly wage.
Suggested Exercise
Below you have a random sample with 10 data points. Your observations are \((x_i, y_i)\). Find \(\hat{\beta}_0\), \(\hat{\beta}_1\), and \(R^2\).
The effects of Changing Units of Measurement on OLS Statistics
Example
salary: Annual CEO’s salary in thousands of dollars
roe: Average return on equity (measured in percentage)
\[\widehat{\mathit{salary}} = 963.19 + 18.50\, \mathit{roe}\]
\[n = 209, \quad R^2 = .01\]
• A one-unit increase in the independent variable (i.e., roe increases by one percentage point) \(\Rightarrow\) increases the predicted salary by 18.501, or $18,501.
• If we measure roe as a decimal (rather than a percent), what will happen to the intercept, slope, and \(R^2\)?
We want:
\[\mathit{roedec} = \mathit{roe}/100\]
• What if we measure salary in dollars (rather than thousands of dollars)? What will happen to the intercept, slope, and \(R^2\)?
We want:
\[\mathit{salarydol} = 1{,}000 \cdot \mathit{salary}\]
Changing Units of Measurement
• If the dependent variable \(y\) is multiplied by a nonzero constant \(c\) \(\Rightarrow\) the estimates become \(c \cdot \hat{\beta}_0\) and \(c \cdot \hat{\beta}_1\).
• If the independent variable \(x\) is multiplied by a nonzero constant \(c\) \(\Rightarrow\) the slope becomes \(\frac{1}{c} \cdot \hat{\beta}_1\).
In general, changing the units of measurement of only the independent variable does not affect the intercept.
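The rescaling rules can be checked on any sample. The sketch below uses a hypothetical five-firm sample (made-up numbers, not the CEO data from the slides) with return on equity in percent and salary in thousands of dollars, then refits after each change of units. The function name `fit` is an assumption introduced here.

```python
def fit(x, y):
    """Return (intercept, slope, r_squared) of the OLS line for x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, 1.0 - ssr / sst


# Hypothetical data: roe in percent, salary in thousands of dollars.
roe = [10.0, 12.0, 15.0, 20.0, 25.0]
salary = [900.0, 1100.0, 1000.0, 1400.0, 1300.0]

b0, b1, r2 = fit(roe, salary)

# x divided by 100 (roe as a decimal): slope times 100, intercept unchanged.
roedec = [r / 100.0 for r in roe]
b0_dec, b1_dec, r2_dec = fit(roedec, salary)

# y multiplied by 1,000 (salary in dollars): intercept and slope times 1,000.
salarydol = [1000.0 * s for s in salary]
b0_dol, b1_dol, r2_dol = fit(roe, salarydol)

print(f"original:       b0 = {b0:.3f}, b1 = {b1:.3f}, R2 = {r2:.4f}")
print(f"roe as decimal: b0 = {b0_dec:.3f}, b1 = {b1_dec:.3f}, R2 = {r2_dec:.4f}")
print(f"salary in $:    b0 = {b0_dol:.3f}, b1 = {b1_dol:.3f}, R2 = {r2_dol:.4f}")
```

In this sketch the R-squared is identical across the three fits, which matches the unchanged \(R^2\) in the CEO salary regressions.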
Example: CEO’s salary - Original Regression
\[\widehat{\mathit{salary}} = 963.19 + 18.50\, \mathit{roe}\]
\[n = 209, \quad R^2 = .01\]
Example: CEO’s salary - roe as a decimal
The new regression is:
\[\widehat{\mathit{salary}} = 963.191 + 1{,}850.1\, \mathit{roedec}\]
\[n = 209, \quad R^2 = .01\]
Example: CEO’s salary - Original Regression
\[\widehat{\mathit{salary}} = 963.19 + 18.50\, \mathit{roe}\]
\[n = 209, \quad R^2 = .01\]
Example: CEO’s salary - salary in dollars
The new regression is:
\[\widehat{\mathit{salarydol}} = 963{,}191 + 18{,}501\, \mathit{roe}\]
\[n = 209, \quad R^2 = .01\]
Using the Natural Logarithm in Simple Regression
• Recall the wage example:

Example (Wage)
$\widehat{wage} = -0.90 + 0.54\, educ$
$n = 526, R^2 = .16$

• Now, think about the econometric model and how this OLS regression function is interpreted.
• What the OLS regression line says may not match how we see the problem economically.

Possible issue: the dollar value of another year of schooling is constant.
• So the 16th year of education is worth the same as the second.
• We expect additional years of schooling to be worth more, in dollar terms, than previous years.
• How can we incorporate an increasing effect? One way is to postulate a constant percentage effect.
• We can approximate percentage changes using the natural log.
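To see how good the log approximation is, the short sketch below compares the exact percentage change with 100 times the change in the natural log. The function names and the starting value of 100 are illustrative assumptions, not part of the slides.

```python
import math

def exact_pct_change(x0, x1):
    """Exact percentage change from x0 to x1."""
    return 100.0 * (x1 - x0) / x0

def log_pct_change(x0, x1):
    """Approximate percentage change: 100 times the difference in natural logs."""
    return 100.0 * (math.log(x1) - math.log(x0))

# Compare the two measures for small and large changes from a base of 100.
for x1 in (101.0, 105.0, 120.0, 150.0):
    print(x1, exact_pct_change(100.0, x1), log_pct_change(100.0, x1))
```

The two measures are close for small changes and drift apart as the change grows, which is why the log approximation is usually described as valid for small percentage changes.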
Constant Percent Model
• Let the dependent variable be log(wage) and write a (new) simple linear regression model:
$\log(wage) = \beta_0 + \beta_1 educ + u$
• Let's define log(wage) (write it as lwage) and run a new regression.
$\widehat{lwage} = 0.58 + .08\, educ$
$n = 526, R^2 = .19$

• The estimated return to each year of education is about 8%.
• Attention: This R-squared is not directly comparable to the R-squared when wage is the dependent variable. The total variations (SSTs) in $wage_i$ and $lwage_i$ that we must explain are completely different.
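The "about 8%" reading relies on the log approximation. As a worked check, holding $u$ fixed and using the rounded slope .08 from the estimated equation, the exact percentage change implied by a log-level model can be written as:

```latex
\begin{align*}
\Delta \log(wage) &= \beta_1 \,\Delta educ \\
\%\Delta wage &= 100\left[\exp(\beta_1 \,\Delta educ) - 1\right] \\
\text{With } \hat\beta_1 = .08,\ \Delta educ = 1:\quad
\%\Delta wage &= 100\left[\exp(.08) - 1\right] \approx 8.33
\end{align*}
```

The exact figure (about 8.3%) is close to the approximation $100\hat\beta_1 = 8\%$ because the coefficient is small.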
Constant Elasticity Model
• We can use the log on both sides of the equation to get constant elasticity models. For example, if
$\log(salary) = \beta_0 + \beta_1 \log(sales) + u$
then
$\beta_1 \approx \dfrac{\%\Delta salary}{\%\Delta sales}$
• The elasticity is free of units of salary and sales.
• A constant elasticity model for salary and sales makes more sense than a constant dollar effect.

| Model | Dependent Variable | Independent Variable | Interpretation of $\beta_1$ |
|---|---|---|---|
| Level-Level | $y$ | $x$ | $\Delta y = \beta_1 \Delta x$ |
| Level-Log | $y$ | $\log(x)$ | $\Delta y = (\beta_1/100)\,\%\Delta x$ |
| Log-Level | $\log(y)$ | $x$ | $\%\Delta y = (100\beta_1)\,\Delta x$ |
| Log-Log | $\log(y)$ | $\log(x)$ | $\%\Delta y = \beta_1\,\%\Delta x$ |

• Recall the CEO salary example, but now the independent variable is sales.
$salary = \beta_0 + \beta_1 sales + u$
• Taking the log of both variables (dependent and independent) we get:

Example (CEO salary)
$\widehat{\log(salary)} = 4.82 + 0.26 \log(sales)$
$n = 209, R^2 = .21$
• The estimated elasticity of CEO salary with respect to firm sales is about .26.
• A 10 percent increase in sales is associated with a .26(10) = 2.6 percent increase in salary.
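The 2.6 percent figure uses the elasticity approximation. As a rough check, the sketch below compares it with the exact change implied by the log-log equation when $u$ is held fixed. The function name is a hypothetical helper introduced here; the coefficient .26 is the rounded estimate from the example.

```python
def exact_pct_change_loglog(beta1, pct_change_x):
    """Exact % change in y implied by log(y) = b0 + beta1*log(x), holding u fixed."""
    return 100.0 * ((1.0 + pct_change_x / 100.0) ** beta1 - 1.0)

beta1_hat = 0.26          # rounded elasticity estimate from the CEO salary example
pct_sales = 10.0          # a 10 percent increase in sales

approx = beta1_hat * pct_sales                         # elasticity approximation
exact = exact_pct_change_loglog(beta1_hat, pct_sales)  # exact implied change

print("approximate % change in salary:", approx)
print("exact % change in salary:      ", exact)
```

The exact value comes out at roughly 2.5 percent, slightly below the approximation, which is the usual pattern for increases of this size.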
The 4 Assumptions for Unbiasedness

Goal: We want to study the statistical properties of the OLS estimator.
• In order to do that, we will need to impose 4 assumptions.
Assumption SLR.1 (Linear in Parameters)
The population model can be written as
$y = \beta_0 + \beta_1 x + u$
where $\beta_0$ and $\beta_1$ are the (unknown) population parameters.
• What does linear in parameters mean?
• Example of a model that is nonlinear in parameters: on the white board.
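The white-board example is not reproduced in these slides. As a stand-in illustration (these particular models are assumptions chosen here, not necessarily the ones used in class), the first two models below are linear in parameters even though they are nonlinear in the variables, while the last two are not linear in parameters:

```latex
\begin{align*}
\text{Linear in parameters:}\quad
  & y = \beta_0 + \beta_1 x^2 + u, \qquad
    \log(y) = \beta_0 + \beta_1 \log(x) + u \\
\text{Not linear in parameters:}\quad
  & y = \frac{1}{\beta_0 + \beta_1 x} + u, \qquad
    y = \beta_0 + x^{\beta_1} + u
\end{align*}
```

What matters for SLR.1 is how $\beta_0$ and $\beta_1$ enter the equation, not how $x$ and $y$ are transformed.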
Assumption SLR.2 (Random Sampling)
We have a random sample of size $n$, $\{(x_i, y_i): i = 1, ..., n\}$, following the population model.
Assumption SLR.3 (Sample Variation in the Explanatory Variable)
The sample outcomes on $x_i$ are not all the same value.
• This is the same as saying the sample variance of $\{x_i : i = 1, ..., n\}$ is not zero.
• If in the population $x$ does not change then we are not asking an interesting question.
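One way to see why SLR.3 is needed: the usual OLS slope formula divides by the total variation in $x$, which is zero when every $x_i$ is the same. The sketch below uses made-up numbers and a hypothetical helper name `ols_slope` to illustrate this.

```python
import numpy as np

def ols_slope(x, y):
    """OLS slope: sum((x - xbar)*(y - ybar)) / sum((x - xbar)^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.max() == x.min():
        # All x values are identical, so the denominator would be zero.
        raise ValueError("SLR.3 fails: no sample variation in x")
    x_dev = x - x.mean()
    return np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)

# With variation in x the slope is well defined.
print(ols_slope([10, 12, 14, 16], [5, 6, 7, 8]))

# Without variation in x the slope cannot be computed.
try:
    ols_slope([12, 12, 12, 12], [5, 6, 7, 8])
except ValueError as err:
    print(err)
```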
Assumption SLR.4 (Zero Conditional Mean)
In the population, the error term has zero mean given any value of the explanatory variable:
$E(u|x) = 0$ for all $x$.
• Key assumption.
• We can compute the OLS estimates whether or not this assumption holds.
Goal: We want to know if $\hat\beta_1$ is unbiased for $\beta_1$, and $\hat\beta_0$ is unbiased for $\beta_0$.
• If
$E(\hat\beta_1) = \beta_1$
$E(\hat\beta_0) = \beta_0$
then the OLS estimator is unbiased.
• Demonstration: On the white board.
Unbiasedness of OLS

Theorem: Unbiasedness of OLS
Under Assumptions SLR.1 through SLR.4,
$E(\hat\beta_0) = \beta_0$ and $E(\hat\beta_1) = \beta_1$,
for any values of $\beta_0$ and $\beta_1$, i.e., $\hat\beta_0$ is unbiased for $\beta_0$, and $\hat\beta_1$ is unbiased for $\beta_1$.
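Unbiasedness is a statement about the average of the estimates over many random samples, not about any single sample. A small Monte Carlo sketch can make this concrete. Everything in it is an assumption chosen for illustration: the population parameters, the normal distributions, the sample size, and the number of replications.

```python
import numpy as np

rng = np.random.default_rng(42)
beta0, beta1 = 1.0, 0.5      # hypothetical population parameters (SLR.1)
n, reps = 100, 5000          # sample size and number of simulated samples

slopes = np.empty(reps)
intercepts = np.empty(reps)
for r in range(reps):
    x = rng.normal(10.0, 2.0, size=n)   # random sample with variation in x (SLR.2, SLR.3)
    u = rng.normal(0.0, 1.0, size=n)    # drawn independently of x, so E(u|x) = 0 (SLR.4)
    y = beta0 + beta1 * x + u
    x_dev = x - x.mean()
    slopes[r] = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)
    intercepts[r] = y.mean() - slopes[r] * x.mean()

# Individual estimates vary from sample to sample, but their averages
# should be close to the population parameters.
print("mean of slope estimates:    ", slopes.mean(), "(true value", beta1, ")")
print("mean of intercept estimates:", intercepts.mean(), "(true value", beta0, ")")
```

A simulation like this does not prove the theorem; it only shows the behavior the theorem describes under one assumed population.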
• Therefore, the four assumptions for the OLS estimator to be unbiased are:
SLR.1: (Linear in Parameters) $y = \beta_0 + \beta_1 x + u$
SLR.2: (Random Sampling)
SLR.3: (Sample Variation in $x_i$)
SLR.4: (Zero Conditional Mean) $E(u|x) = 0$
• If any of these assumptions fails, the OLS estimator will (generally) be biased.
• To be discussed in the next chapter: What are the omitted factors? Are they likely to be correlated with $x$? If so, SLR.4 fails and OLS will be biased.
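To illustrate the last point, the sketch below simulates two hypothetical populations: one where the omitted factor in $u$ is unrelated to $x$, and one where $x$ is correlated with it, so SLR.4 fails. The parameter values, the 0.8 loading on the omitted factor, and all names are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
beta0, beta1 = 1.0, 0.5      # hypothetical population parameters
n, reps = 200, 2000

def ols_slope(x, y):
    """OLS slope estimate for a simple regression of y on x."""
    x_dev = x - x.mean()
    return np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)

slopes_ok = np.empty(reps)
slopes_bad = np.empty(reps)
for r in range(reps):
    base = rng.normal(size=n)
    omitted = rng.normal(size=n)          # unobserved factor that ends up in u

    # Case A: x is unrelated to the omitted factor, so E(u|x) = 0 holds.
    x_ok = base
    slopes_ok[r] = ols_slope(x_ok, beta0 + beta1 * x_ok + omitted)

    # Case B: x is correlated with the omitted factor, so E(u|x) = 0 fails.
    x_bad = base + 0.8 * omitted
    slopes_bad[r] = ols_slope(x_bad, beta0 + beta1 * x_bad + omitted)

print("true slope:                 ", beta1)
print("average estimate, SLR.4 ok: ", slopes_ok.mean())
print("average estimate, SLR.4 bad:", slopes_bad.mean())
```

In the second case the estimates are centered well above the true slope, and drawing more samples does not remove the gap.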