Limits of OLS | What is causality? | Randomized experiments | Education policies | Market for credit | Conclusion

Randomized experiments
Clément de Chaisemartin
Majeure Economie
September 2011

1 Limits of OLS
2 What is causality?
   A new definition of causality
   Can we measure causality?
3 Randomized experiments
   Solving the selection bias
   Potential applications
   Advantages & Limits
4 Education policies
   Access to education in developing countries
   Quality of education in developing countries
5 Market for credit
   The demand for credit
   Adverse Selection and Moral Hazard
6 Conclusion

Omitted variable bias (1/3)

OLS does not measure the causal impact of $X_1$ on $Y$ if $X_1$ is correlated with the residual.
Assume the true DGP is: $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$, with $\mathrm{cov}(X_1, \varepsilon) = 0$.
If you regress $Y$ on $X_1$ only,

$$\hat{\beta}_1 = \frac{\widetilde{\mathrm{cov}}(Y, X_1)}{\widetilde{V}(X_1)} = \beta_1 + \beta_2 \frac{\widetilde{\mathrm{cov}}(X_1, X_2)}{\widetilde{V}(X_1)} + \frac{\widetilde{\mathrm{cov}}(X_1, \varepsilon)}{\widetilde{V}(X_1)}$$

If $\beta_2 \neq 0$ and $\mathrm{cov}(X_1, X_2) \neq 0$, the estimator is not consistent.

Omitted variable bias (2/3)

In real life, this will happen all the time:
You never have all the determinants of $Y$ in a data set.
It is very likely that the remaining determinants of $Y$ are correlated with those already included in the regression.
Example: intelligence has an impact on wages and is correlated with education.
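The omitted variable bias formula can be checked numerically. The sketch below is only an illustration under assumed values: the coefficients (beta1 = 1, beta2 = 2), the intercept, the 0.5 loading of X1 on X2, and the function names `cov` and `simulate_ovb` are all hypothetical choices, not taken from the slides. It regresses Y on X1 alone and compares the resulting slope with beta1 plus the two extra terms of the formula.

```python
import random


def cov(a, b):
    """Empirical covariance of two equal-length sequences (divides by n)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def simulate_ovb(n=20_000, beta1=1.0, beta2=2.0, seed=0):
    """Simulate Y = 3 + beta1*X1 + beta2*X2 + eps and regress Y on X1 only.

    All numerical values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    x2 = [rng.gauss(0, 1) for _ in range(n)]       # omitted variable (e.g. ability)
    x1 = [0.5 * v + rng.gauss(0, 1) for v in x2]   # regressor, correlated with x2
    eps = [rng.gauss(0, 1) for _ in range(n)]      # residual, drawn independently of x1
    y = [3.0 + beta1 * a + beta2 * b + e for a, b, e in zip(x1, x2, eps)]

    var_x1 = cov(x1, x1)
    slope = cov(y, x1) / var_x1                    # OLS slope of Y on X1 alone
    bias = beta2 * cov(x1, x2) / var_x1            # omitted variable term
    noise = cov(x1, eps) / var_x1                  # sampling term, close to 0 for large n
    return slope, bias, noise


slope, bias, noise = simulate_ovb()
print(f"slope of Y on X1 only : {slope:.3f}")
print(f"beta1 + bias + noise  : {1.0 + bias + noise:.3f}")
print(f"omitted variable term : {bias:.3f}")
```

With these assumed values, the population bias term is beta2 * cov(X1, X2) / V(X1) = 2 * 0.5 / 1.25 = 0.8, so the slope should settle near 1.8 rather than the true beta1 = 1, however large the sample.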

Omitted variable bias (3/3)

Assume you have two groups of people: one with high levels of education, one with low levels of education.
In the group with high education, you observe that people have higher wages than in the group with lower education.
You could attribute this difference in wages to the difference in education levels only if education were the only difference between the two groups.
But those two groups probably differ on many more dimensions than education alone. The group with higher education is probably made up of people who come from wealthier families, have higher cognitive ability...
=> Because of all those omitted variables (parents' wealth, IQ...), you cannot interpret this difference in wages as the causal impact of education on wages, only as a correlation.

Is it so big an issue? (1/2)

No, if your objective is objective 1: make the best prediction of some future outcome based on present information.
In credit risk analysis, the objective is to make good predictions of who will default within one year, based on the information on customers available at the time of their application.
Whether coefficients reflect causal relationships or mere correlations is not an issue as long as the predictive power of the model is good.

Is it so big an issue? (2/2)

Yes, if your objective is objective 2: assess the causal impact of one variable on another.
Assume you are a policy maker trying to assess the efficacy of a program in order to decide whether to maintain it.
Example: a training program for the unemployed to help them find a job.
I run the following regression: $\text{found a job} = \alpha + \beta \, 1\{\text{followed training}\} + \varepsilon$.
If $\hat{\beta} > 0$, can I conclude that the training program is effective?
This question is crucial when evaluating the impact of a program, the efficacy of a new drug, the relevance of a new marketing campaign...

A new definition of causality
Defining causality

The causal impact of a program is the difference between what happens to recipients of the program and what would have happened to them if they had not received the program.
The big problem of impact evaluation is that we do not observe what would have happened to beneficiaries of a program if they had not benefited from it.
Running example: a training program offered to 12 000 unemployed people.
Among those 12 000 unemployed, 10 000 chose to participate in the training program and 2 000 declined the offer.
Objective: measure whether the program increases their chances of finding a job in less than 6 months.

A new definition of causality
The Rubin framework of potential outcomes

We consider a program (treatment) represented by a binary variable $T$. $T_i$ equals 1 if individual $i$ follows the program / receives the treatment (treated), and 0 otherwise (untreated).
   $T = 1$ if an unemployed person follows the training program.
   $T = 1$ if a sick person receives a given medicine.
Each individual $i$ has ex ante two potential outcomes, $Y_{i,1}$ and $Y_{i,0}$.
   $Y_{i,1}$ = what will happen to him if he receives the treatment.
   $Y_{i,0}$ = what will happen to him without it.
Each individual has two lives: his life when he follows the training program and his life when he does not follow it. Compare the movie "Smoking / No Smoking".
$Y_{i,1} = 1$ and $Y_{i,0} = 1$: the unemployed person finds a job in both of his lives.
Program useless: he would have found a job anyway.
Definition: the causal impact of the treatment on mister $i$ is $Y_{i,1} - Y_{i,0}$.
$Y_{i,1} - Y_{i,0}$ means: does the program change (positively or negatively) mister $i$'s life?

Can we measure causality?
Do we observe causality?

Can we compute $Y_{i,1} - Y_{i,0}$ for any individual in our sample?
The issue is that, for each individual, we observe only one of his potential outcomes.
We observe $Y_{i,1}$ for the treated but we do not observe their $Y_{i,0}$. Conversely, for the untreated, we observe their $Y_{i,0}$ but not their $Y_{i,1}$.
=> It is impossible to compute $Y_{i,1} - Y_{i,0}$ for any individual in the sample because one of the two figures is always missing. We cannot assess the impact of the program on each individual.
What we observe is $Y_i = Y_{i,1} \times T_i + Y_{i,0} \times (1 - T_i)$.

Can we measure causality?
Finding comparison groups

Since there is no hope of measuring the impact of a program on any single individual, what we can hope to achieve is to measure the impact of the program on groups of individuals.
Idea: we have a group of treated individuals, so we look for a group of untreated individuals and use it as a comparison group.
To measure the impact of the training program, we should find a group of unemployed people who did not benefit from the training program, and compare the share of people who found a job in less than 6 months in the treated group and in the comparison group.
For this comparison to yield a credible measure of the impact of the training program, the treated group and the comparison group should be similar in every respect, except that the treated group benefited from the training program while the comparison group did not.

Can we measure causality?
The average treatment effect

Since we can never measure $Y_{i,1} - Y_{i,0}$, let us try something else:
$E(Y_{i,1} \mid T_i = 1) - E(Y_{i,0} \mid T_i = 1)$: the average effect of the treatment on the treated (ATT).
We can easily estimate the first quantity from our sample: $\frac{1}{N_1} \sum_{i: T_i = 1} Y_i$, where $N_1$ is the number of treated.
Example: % of unemployed who found a job after 6 months among those who followed the treatment.
But we cannot estimate $E(Y_{i,0} \mid T_i = 1)$: the percentage of the treated who would have found a job had they not received the treatment.
Natural idea: replace it by $E(Y_{i,0} \mid T_i = 0)$, which we can estimate from our data: the percentage who found a job among the 2 000 unemployed who chose not to participate in the training program.
Good idea?

Can we measure causality?
The selection bias

In most cases this is not a good idea.
Underlying assumption: $E(Y_{i,0} \mid T_i = 1) = E(Y_{i,0} \mid T_i = 0)$.
That is, what happened to the untreated is representative of what would have happened to the treated if they had not been treated.
But the two populations are very likely not to be similar: unemployed people who enroll in a training program might be more motivated to find a new job than those who do not.
Enrollment into the program is selective: this is the selection bias.
Since those two groups differ on more than one dimension (the treated group benefited from the program but was also probably more motivated to find a job), it is impossible to know whether the difference in their placement rates after 6 months should be attributed to the fact that one group benefited from the program and the other did not, or to the fact that one group was more motivated than the other.
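The selection problem can be illustrated with a small simulation in which, unlike in real data, both potential outcomes of every individual are known. This is only a sketch under made-up assumptions: the unobserved "motivation" variable, the job-finding probabilities, the 0.10 treatment effect, the enrollment rule and the function names are hypothetical and not taken from the slides. Only the sample size of 12 000 and the rough share of participants (about 10 000) mimic the running example.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def simulate_training(n=12_000, effect=0.10, seed=0):
    """Simulate self-selected enrollment into a training program.

    All behavioural parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        motivation = rng.random()             # unobserved, uniform on [0, 1]
        p0 = 0.2 + 0.6 * motivation           # chance of finding a job without training
        u = rng.random()
        y0 = int(u < p0)                      # potential outcome without treatment
        y1 = int(u < p0 + effect)             # potential outcome with treatment
        # More motivated people enroll more often: enrollment is selective.
        t = int(rng.random() < 2 / 3 + motivation / 3)
        (treated if t else untreated).append((y0, y1))

    # ATT uses both potential outcomes of the treated (never observable in practice).
    att = mean([y1 - y0 for y0, y1 in treated])
    # Naive comparison uses only observed outcomes Y = Y1*T + Y0*(1 - T).
    naive = mean([y1 for _, y1 in treated]) - mean([y0 for y0, _ in untreated])
    # Selection bias: E(Y0 | T = 1) - E(Y0 | T = 0).
    selection_bias = (mean([y0 for y0, _ in treated])
                      - mean([y0 for y0, _ in untreated]))
    return att, naive, selection_bias, len(treated)


att, naive, selection_bias, n_treated = simulate_training()
print(f"number of treated       : {n_treated}")
print(f"true ATT                : {att:.3f}")
print(f"naive difference        : {naive:.3f}")
print(f"selection bias          : {selection_bias:.3f}")
print(f"ATT + selection bias    : {att + selection_bias:.3f}")
```

By construction, the naive difference equals the ATT plus the selection bias: adding and subtracting the average $Y_{i,0}$ of the treated splits the naive comparison into those two pieces. Because the more motivated enroll more often in this simulated population, the selection bias is positive and the naive comparison overstates the effect of the program.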