Ordinary Least Squares: the Univariate Case

Clément de Chaisemartin
Majeure Economie, September 2011

Outline
1. Introduction
2. The OLS method
   - Objective and principles of OLS
   - Deriving the OLS estimates
   - Do OLS keep their promises?
3. The linear causal model
   - Assumptions
   - Identification and estimation
   - Limits
4. A simulation & applications
   - OLS do not always yield good estimates...
   - But things can be improved...
   - Empirical applications
5. Conclusion and exercises

Introduction

Objectives
Objective 1: to make the best possible guess on a variable Y based on X.
- Find a function of X which yields good predictions for Y.
- Given cigarette prices, what will cigarette sales be in September 2010 in France?
Objective 2: to determine the causal mechanism by which X influences Y.
- Ceteris paribus type of analysis: everything else being equal, how does a change in X affect Y?
- By how much does one more year of education increase an individual's wage?
- By how much would the hiring of 1,000 more policemen decrease the crime rate in Paris?
The tool we use: a data set, in which we have the wages and number of years of education of N individuals.

The OLS method: Objective and principles of OLS

What we have and what we want
For each individual in our data set, we observe his wage and his number of years of education.
Assume we have a graph such as the one below.
The relationship between the two variables seems to be linear.
We want to find the line which best describes the relationship between these variables.

[Figure: scatter plot of Wage (vertical axis, 0 to 4000) against Years of Schooling (horizontal axis, 8 to 20).]

The principle of OLS
A line is characterized by an intercept and a slope, which we denote $\hat{\alpha}$ and $\hat{\beta}$ respectively.
Idea: choose for $\hat{\alpha}$ and $\hat{\beta}$ the values which minimize $\sum_i (Y_i - \hat{\alpha} - \hat{\beta} X_i)^2$. These are the Ordinary Least Squares estimates.
Let us denote $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$. It represents the wage of individual i as predicted by our model.
We also denote $\hat{\varepsilon}_i = Y_i - \hat{Y}_i$. The $\hat{\varepsilon}_i$ are called the estimated residuals and represent the mistake made by our model when predicting individual i's wage based on his number of years of schooling.
=> The principle of OLS is merely to minimize the sum of the squared mistakes we make when we use an affine function of $X_i$ to predict $Y_i$.
Why do we take the square of $\hat{\varepsilon}_i$? Could we have used another function?

A graphical example

[Figure: scatter plot of Wage (vertical axis, 0 to 4000) against Years of Schooling (horizontal axis, 8 to 20).]

The OLS method: Deriving the OLS estimates

Finding $\hat{\alpha}$ and $\hat{\beta}$ (Theorem 1.1)
We denote $\bar{Y} = \frac{1}{N} \sum_i Y_i$ the empirical mean of $(Y_i)$, $\bar{X}$ the empirical mean of $(X_i)$, $V_e(X) = \frac{1}{N} \sum_i X_i^2 - \left( \frac{1}{N} \sum_i X_i \right)^2$ the empirical variance of $(X_i)$, and finally $cov_e(X;Y) = \frac{1}{N} \sum_i X_i Y_i - \bar{X}\bar{Y}$ the empirical covariance of $(X_i)$ and $(Y_i)$.
We want to minimize $f(\hat{\alpha};\hat{\beta}) = \sum_i (Y_i - \hat{\alpha} - \hat{\beta} X_i)^2$.
Solution: $\hat{\beta} = \frac{cov_e(X;Y)}{V_e(X)}$ and $\hat{\alpha} = \bar{Y} - \frac{cov_e(X;Y)}{V_e(X)} \bar{X}$.
Can we compute $\hat{\beta}$ from the sample?
Is there any problem with the computations? Any idea how to interpret this result?

An example
Compute $\hat{\beta}$ in this simple example:

Individual   Years of Schooling   Wage
1            5                    1000
2            5                    1500
3            10                   1000
4            15                   2000
5            15                   2500

The OLS method: Do OLS keep their promises?

Do OLS attain objectives 1 and 2?
Objective 1: find the best prediction for Y based on X, i.e. find a function $P(X_i)$ of $X_i$ which yields good predictions for $Y_i$.
Objective 2: determine the causal mechanism by which X influences Y.

OLS partially reach objective 1.
Once agreed that a good prediction is one which minimizes the sum of squared errors, OLS yield by construction the best prediction function for Y among all affine functions of X. But:
- The criterion can be challenged: one could minimize $\sum_i |\hat{\varepsilon}_i|$ instead of $\sum_i \hat{\varepsilon}_i^2$. This is not so big an issue. Quantile regression at the median minimizes $\sum_i |\hat{\varepsilon}_i|$, and its results are usually close to those of OLS.
- Even if the criterion is accepted, OLS yield the best prediction function among all affine functions of X, not among all functions of X. There might for instance exist a polynomial function of X, $\hat{\alpha}' + \hat{\beta}' X + \hat{\gamma}' X^2$, which yields errors $\hat{\varepsilon}_i'$ such that $\sum_i \hat{\varepsilon}_i'^2 < \sum_i \hat{\varepsilon}_i^2$. Not so big an issue either, see next chapter.
How can we measure the extent to which Objective 1 is reached?
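As a rough numerical check of the two points above, the sketch below applies the formulas of Theorem 1.1 to the five-individual example, then compares the sum of squared residuals of the affine fit with that of a quadratic fit. It uses only the Python standard library. The function names, and the small linear-system solver used for the quadratic fit, are illustrative assumptions and not part of the original slides.

```python
# Data from the five-individual example in the text.
years_of_schooling = [5, 5, 10, 15, 15]
wage = [1000, 1500, 1000, 2000, 2500]


def mean(values):
    return sum(values) / len(values)


def ols_fit(x, y):
    """Intercept and slope from the covariance/variance formulas of Theorem 1.1."""
    x_bar, y_bar = mean(x), mean(y)
    var_x = mean([xi ** 2 for xi in x]) - x_bar ** 2
    cov_xy = mean([xi * yi for xi, yi in zip(x, y)]) - x_bar * y_bar
    beta_hat = cov_xy / var_x
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def solve_linear_system(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting
    (hypothetical helper, only needed for the quadratic fit below)."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def quadratic_fit(x, y):
    """Least-squares coefficients (a, b, c) of a + b*x + c*x**2, via the normal equations."""
    powers = [[xi ** k for k in range(3)] for xi in x]
    gram = [[sum(p[j] * p[k] for p in powers) for k in range(3)] for j in range(3)]
    rhs = [sum(p[j] * yi for p, yi in zip(powers, y)) for j in range(3)]
    return solve_linear_system(gram, rhs)


def sum_of_squared_residuals(y, y_hat):
    return sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))


# Affine fit: the OLS estimates.
alpha_hat, beta_hat = ols_fit(years_of_schooling, wage)
linear_pred = [alpha_hat + beta_hat * xi for xi in years_of_schooling]
ssr_linear = sum_of_squared_residuals(wage, linear_pred)

# Quadratic fit: a richer family of prediction functions.
a, b, c = quadratic_fit(years_of_schooling, wage)
quadratic_pred = [a + b * xi + c * xi ** 2 for xi in years_of_schooling]
ssr_quadratic = sum_of_squared_residuals(wage, quadratic_pred)

print(alpha_hat, beta_hat)   # 600.0 100.0
print(ssr_linear)            # 700000.0
print(round(ssr_quadratic))  # smaller than ssr_linear
```

On these five points the affine fit gives a slope of 100 and an intercept of 600, with a sum of squared residuals of 700,000. The quadratic fit lowers it to about 250,000, which illustrates why OLS is the best predictor only among affine functions of X.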
The $R^2$: a measure of the quality of our predictions
$SST = \sum_i (Y_i - \bar{Y})^2$: the dispersion of wages.
$SSE = \sum_i (\hat{Y}_i - \bar{Y})^2$: the dispersion of predicted wages.
$SSR = \sum_i (Y_i - \hat{Y}_i)^2$: the sum of the squares of the errors.

$$
\begin{aligned}
SST &= \sum_i (Y_i - \bar{Y})^2 = \sum_i (Y_i - \hat{Y}_i + \hat{Y}_i - \bar{Y})^2 \\
    &= \sum_i (Y_i - \hat{Y}_i)^2 + \sum_i (\hat{Y}_i - \bar{Y})^2 + 2 \sum_i \hat{\varepsilon}_i (\hat{Y}_i - \bar{Y}) \\
    &= SSE + SSR + 2\hat{\alpha} \sum_i \hat{\varepsilon}_i + 2\hat{\beta} \sum_i \hat{\varepsilon}_i X_i - 2\bar{Y} \sum_i \hat{\varepsilon}_i .
\end{aligned}
$$

FOC1 and FOC2 are the first-order conditions of the minimization of $f$ with respect to $\hat{\alpha}$ and $\hat{\beta}$ respectively. According to FOC1, $\sum_i \hat{\varepsilon}_i = 0$; according to FOC2, $\sum_i \hat{\varepsilon}_i X_i = 0$. Therefore, $SST = SSE + SSR$.
$R^2 = \frac{SSE}{SST}$. The $R^2$ always lies between 0 and 1 (why?). It measures the share of the variance observed in the sample that our model is able to account for, i.e. the quality of our predictions for Y based on X.
However, a model with a low $R^2$ can still be helpful, and a model with a high $R^2$ can be useless.

But OLS do not necessarily reach objective 2.

[Figure: scatter plot of Wage (vertical axis, 0 to 4000) against Years of Schooling (horizontal axis, 8 to 20).]

Individuals with more schooling have higher wages. Does this imply that schooling has a causal impact on wages?
- The line can be inverted => causality could go in the other direction: reverse causality. Here, this is not an issue: higher wages cannot cause longer education because schooling takes place before labor market participation.
- Individuals with many years of schooling make more money than those with few years of schooling. But do those two groups differ only in their number of years of schooling? Probably not. For instance, those with more years of schooling might have richer parents, or might also be more clever.
=> Is this correlation between wages and education due only to the effect of education on wages, or also to the fact that those with more education are also more clever and have richer parents? This is omitted variable bias.

A causal framework

[Diagram: causal links between Parents' wage, Children's education and Children's wage.]
- Parents' wage -> Children's education: well-paid parents can afford sending their children to school, then to college and finally to university.
- Parents' wage -> Children's wage: well-paid parents have good networking skills and know how to get good positions => they can help their children.
- Children's education -> Children's wage (green cell): education increases children's productivity + ability to find a well-paid job (signalling theory).

True causal impact of education on wages = green cell.
If this framework is true, does $\hat{\beta}$, i.e. the correlation between children's education and wage, measure the green cell only? Does it overestimate or underestimate the green cell?

The linear causal model: Assumptions

Positing a linear causal model
We assume that for every individual, his income is generated according to the following model:
Income = $\alpha$ + $\beta$ × Number of Years of Education + $\varepsilon$
More formally: $Y_i = \alpha + \beta X_i + \varepsilon_i$.
$Y_i$ is the dependent variable, $X_i$ the explanatory variable, and $\varepsilon_i$ the error term: all other determinants of income (cleverness, gender...).
Assumption 1. $\beta$ measures by how much wage changes when the education of an individual increases by one year and all the other determinants of income ($\varepsilon$) remain unchanged (ceteris paribus impact of education), i.e.
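The omitted variable concern raised by the causal framework can be illustrated with a small simulation. The sketch below assumes a hypothetical data-generating process in which parents' wage raises both children's education and children's wage. Every number in it (the true effect of 100 per year of schooling, the direct parental effect of 300, the noise levels, the sample size) is an illustrative assumption and does not come from the slides or from any real data set.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N = 10_000
TRUE_BETA = 100.0      # assumed causal effect of one more year of education on wage
PARENT_EFFECT = 300.0  # assumed direct effect of parents' wage on children's wage

parents, education, wage = [], [], []
for _ in range(N):
    p = random.gauss(0, 1)               # parents' wage, standardized (hypothetical)
    x = 12 + 2 * p + random.gauss(0, 1)  # richer parents => more years of schooling
    y = 500 + TRUE_BETA * x + PARENT_EFFECT * p + random.gauss(0, 200)
    parents.append(p)
    education.append(x)
    wage.append(y)


def mean(values):
    return sum(values) / len(values)


def ols_slope(x, y):
    """Empirical covariance of (x, y) divided by empirical variance of x."""
    x_bar, y_bar = mean(x), mean(y)
    cov_xy = mean([xi * yi for xi, yi in zip(x, y)]) - x_bar * y_bar
    var_x = mean([xi ** 2 for xi in x]) - x_bar ** 2
    return cov_xy / var_x


# Regressing wage on education alone: parents' wage is left in the error term.
beta_hat = ols_slope(education, wage)

# Benchmark only feasible in a simulation: remove the parental effect first.
wage_net_of_parents = [y - PARENT_EFFECT * p for y, p in zip(wage, parents)]
beta_hat_net = ols_slope(education, wage_net_of_parents)

# Under these assumptions the population slope is 100 + 300 * 2 / 5 = 220.
print(round(beta_hat), round(beta_hat_net))
```

Under these assumed parameters, the slope from the regression of wage on education alone is close to 220 rather than 100, so it overestimates the causal effect. Once the parental channel is removed, the slope is close to 100 again. With different assumed signs or magnitudes the bias could be smaller or go the other way.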
