Quantile Regression and Other Distributional Estimators: Moving Beyond Mean Impacts of Policies
Marianne Bitler, Department of Economics, UC Davis and NBER

The research results described in this publication were supported in part by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under Award Number P01HD065704. The content is solely the responsibility of the author and does not necessarily represent the official views of the National Institutes of Health.

Means versus distributions
• Much of what we estimate in applied micro fields are means or conditional means. We then argue about the extent to which these estimate causal impacts.
• Another approach is to look at the overall distribution of impacts.
• Today, I'll touch on what we can learn when we do this.
• Examples from my research with various coauthors (Thad Domina, Hilary Hoynes, and Emily Penner).
• Also examples in a do file available on the INID website.

Why would we want to look at distributions? (I)
• If there is interest in post-intervention (ex post) gaps across groups, effects on distributions are the relevant object for accounting with a social welfare function. (Issues about the distribution of treatment effects (DOTE) for individuals versus what we learn with quantile treatment effects or other treatment effects on distributions are mentioned below.) For example, racial/ethnic gaps in wages, test scores ...
• If you are using an outcome with a ceiling/floor, means can be biased, but quantile regression or quantile treatment effects (QTE) away from the ceiling/floor are not (e.g., test scores).

Why would we want to look at distributions? (II)
• If effects are offsetting, the overall mean impact can be zero while there can be important treatment effect heterogeneity by subgroups, relevant for policy targeting.
Policy makers are probably happier with programs where the whole distribution benefits than with ones where there are winners and losers.
• These subgroups may be suggested by theory or by a specific institutional feature of your setting.
• But then, why not look at means within those subgroups?
  – In various experimental settings, researchers have found that even within the subgroups, the within-group heterogeneity swamps the between-group heterogeneity. This holds with QTE and a welfare experiment (Bitler, Gelbach, and Hoynes, forthcoming, ReStat).
  – "Subgroup" itself is not a fixed concept across most data sets, incorporating individuals with different characteristics (observed and unobserved). Hispanics in one state (e.g., Texas) may be quite different from those in another (e.g., those of Cuban descent in Florida) in terms of background, income, etc.

Example where effects on the distribution are interesting: Welfare experiment in the US in CT (Bitler, Gelbach, and Hoynes, AER)
• Control group faces the (old) AFDC program: a guarantee with essentially a 100% tax on earnings. So, for each dollar earned, welfare recipients lost a dollar of welfare benefits.
• Treatment group faces the (new) Jobs First program: 0% tax rate on earnings up to a cliff, then all taxed away. So, they got to keep each dollar they earned until they reached the poverty guideline; then, if they earned one more dollar, they lost the entire welfare benefit.
• Static model of labor supply predicts: no effects at the bottom of the earnings distribution, a positive effect higher up, and then a negative effect near the top.

Graph of differences between quantiles of the treatment and control distributions of earnings 1 to 7 quarters after random assignment, from Bitler, Gelbach, and Hoynes, AER
• Graph shows quantile treatment effects (QTE) at each percentile (x-axis). The y-axis measures the QTE in dollars. The dashed horizontal line is the mean treatment effect. The solid line is the QTE.
The dotted lines are 90% pointwise CIs.
• The mean would miss a lot of interesting heterogeneity that is consistent with economic theory.

[Figure: quantile treatment effects on quarterly earnings. x-axis: Quantile (10 to 90); y-axis: Quarterly impact (−600 to 1000).]

Quantile regression
• Quantile regression provides one with a way to look at effects across the distribution of an outcome.

What is quantile regression? (I)
• It is a way to see how the quantiles of the dependent variable change in response to changes in some independent variables, in a linear framework.
• Often, we estimate a multivariate regression of an outcome $Y$ on some controls $X_1, X_2, \ldots$. In the linear case, the estimated coefficients on the independent variables, $\hat{\beta}_k$, tell us the average effect on $Y$ of a 1-unit change in the variable $X_k$.

What is quantile regression? (II)
• Quantile regression instead tells us how the effect of changing $X_k$ by 1 unit varies across the conditional distribution of $Y$.
• It is also possible to undo the conditioning, and learn what changing $X_k$ does to the unconditional distribution of $Y$.

Accessible references (more technical papers/references at the end)
• Roger Koenker and Kevin Hallock. 2001. Quantile Regression. Journal of Economic Perspectives 15(4): 143–156. Also, Roger Koenker's web site has many other helpful examples, presentations, and papers. See: http://www.econ.uiuc.edu/%7Eroger/courses/
• Josh Angrist and Jörn-Steffen Pischke. Mostly Harmless Econometrics. Chapter 7.
• Markus Frölich and Blaise Melly. 2010. Estimation of Quantile Treatment Effects with Stata. The Stata Journal 10(3): 423–457.

Supporting material
• See the example program in Stata, which produces the estimates I present below, along with others I don't discuss here, and includes a bootstrapping example.
• The do file also has links to the data and relevant statistical routines.
• Will need to comment some lines out if not in a linux/unix setting (commented).
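To make the description of quantile regression above a bit more concrete, here is a minimal Python sketch. It is not taken from the do file, and the function names `check_loss` and `quantile_by_check_loss` are hypothetical. It illustrates the standard fact that the τ-th sample quantile minimizes the summed "check" loss; quantile regression replaces the single constant `q` below with a linear function of the covariates. The simulated normal data are an assumption made purely for illustration.

```python
import random

def check_loss(u, tau):
    """Check function: tau * u if u >= 0, else (tau - 1) * u."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def quantile_by_check_loss(y, tau):
    """Return the observed value that minimizes the summed check loss."""
    # A minimizer of the check loss always exists among the data points,
    # so a brute-force search over observed values is enough for a sketch.
    return min(y, key=lambda q: sum(check_loss(v - q, tau) for v in y))

rng = random.Random(1)
y = [rng.gauss(0.0, 1.0) for _ in range(501)]  # simulated outcome (assumption)
ys = sorted(y)

for tau in (0.1, 0.5, 0.9):
    q_hat = quantile_by_check_loss(y, tau)
    rank = ys.index(q_hat) + 1
    print(f"tau={tau}: minimizer={q_hat:.3f}, rank {rank} of {len(y)}")
```

With τ = 0.5 the minimizer is the sample median, which is why median regression is the quantile analogue of least squares. The brute-force search is only for transparency; the packages listed in these slides use much faster algorithms.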
Code in various packages (I)
• Stata:
  – qreg (built in, conditional quantile regression and SEs under homoskedasticity)
  – bsqreg (built in, does bootstrap SEs)
  – sqreg (simultaneously estimates numerous quantiles, can test cross-quantile restrictions)
  – ivqte (conditional and unconditional QTE with and without endogeneity, Frölich and Melly, 2010)
  – cdeco (decompositions of change in distribution of covariates, Chernozhukov, Fernandez-Val, and Melly, 2009)
  – rqdeco (decompositions, Melly, 2005, like Machado and Mata)
  – rifreg (recentered influence function regressions or a version of unconditional effects, Firpo, Fortin, and Lemieux, 2009)

Code in various packages (II)
• R:
  – quantreg package (Koenker, suite of programs)
  – iv qte (unconditional IV QTE and more, Frölich and Melly, 2009)
  – rqdeco3 (decompositions, Melly, 2005, like Machado and Mata)
  – Lots of code by Victor Chernozhukov and coauthors

Code in various packages (III)
• SAS:
  – quantreg (conditional quantile regression)

Motivation for looking at quantile regression in the context of educational interventions (I)
• Often there is a goal of increasing educational attainment and achievement.
• At the same time, there is a focus on narrowing gaps between haves and have-nots.
• Either of these suggests that looking at effects across the distribution, and not just at average effects, is important.

Motivation for looking at quantile regression in the context of educational interventions (II)
• Theoretical: For example, developmental science might suggest different effects for children of different "ability" levels.
• Tied to specific context: Interventions like exit exams may result in teachers refocusing effort on those near the passing threshold, to the possible detriment of those at the bottom or top.
• Both kinds of hypotheses suggest an analysis of effects on the distribution may be particularly meaningful.
Motivation for looking at quantile regression in general
• Much research focuses on mean differences or mean differences within subgroup.
• Yet, it seems intuitive that a program from which everyone benefits might be more desirable than one where there are winners and losers.
• Zero or small effects at the mean may obscure offsetting effects.
• This also motivates an interest in distributional estimators.

Outline
• Stylized motivating example
• Background on the potential outcomes framework, as applied to this setting
• Practical examples of applying the method to the following:
  – Unconditional quantiles in an experimental setting
  – Conditional and unconditional QTE, assuming exogeneity
  – Conditional and unconditional IV QTE, assuming exogeneity
• Other considerations

Stylized motivating example (I)
• Suppose there are two types of people who are affected by an educational intervention. They are evenly distributed in the population.
• The first group (Type 1) gains an amount δ in a measure of achievement, and the second group (Type 2) loses the same amount.
• The average effect of this intervention would be zero.

Stylized motivating example (II)
• If we could, we would like to know that this is because some gain and some lose. Quantile regression can tell us there is this variation.
• If we knew who was in each group, we probably wouldn't give the Type 2 people the intervention. But often we can't figure out a way to tell the people apart. With quantile regression, though, we might learn how observables are associated with the variation.
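The stylized example can be simulated in a few lines. The following is a minimal Python sketch, not the author's code: the standard normal untreated outcomes, the value δ = 1, the sample size, and the helper names `simulate` and `qte` are all assumptions made for illustration. It compares the difference in means with differences in sample quantiles between a treated and a control group.

```python
import random
import statistics

def simulate(n=20000, delta=1.0, seed=0):
    """Simulate control and treated outcomes for the two-type example."""
    rng = random.Random(seed)
    # Control group: untreated outcomes, standard normal (an assumption).
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Treated group: independent draws; half gain delta (Type 1),
    # half lose delta (Type 2), unrelated to the untreated outcome.
    treated = []
    for i in range(n):
        effect = delta if i % 2 == 0 else -delta
        treated.append(rng.gauss(0.0, 1.0) + effect)
    return control, treated

def qte(control, treated, percentiles=(10, 25, 50, 75, 90)):
    """Treated minus control sample quantile at each requested percentile."""
    # statistics.quantiles with n=100 returns the 99 percentile cut points.
    qc = statistics.quantiles(control, n=100)
    qt = statistics.quantiles(treated, n=100)
    return {p: qt[p - 1] - qc[p - 1] for p in percentiles}

control, treated = simulate()
mean_effect = statistics.fmean(treated) - statistics.fmean(control)
effects = qte(control, treated)

print(f"mean effect: {mean_effect:.3f}")
for p, e in effects.items():
    print(f"QTE at percentile {p}: {e:.3f}")
```

Under these assumptions the mean effect is close to zero, while the quantile differences are negative in the lower tail and positive in the upper tail, so the distribution reveals variation that the mean hides. Note that the quantile differences are not equal to ±δ: they describe how the outcome distribution shifts, not the distribution of individual treatment effects.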
Outline
• Stylized motivating example
• Background on the potential outcomes framework, as applied to this setting
• Practical examples of applying the method to the following:
  – Unconditional quantiles in an experimental setting
  – Conditional and unconditional QTE, assuming exogeneity
  – Conditional and unconditional IV QTE, assuming exogeneity
• Other considerations

Methodology: Potential outcomes model (I)
• Potential outcomes framework of Rubin, Neyman, etc.
• Assume there is a binary treatment, $D$. Let $D = 1$ if treated, and $D = 0$ otherwise.
• Let $Y_i^0$ denote person $i$'s outcome when not treated.
• Let $Y_i^1$ denote person $i$'s outcome when treated.
• Define the individual treatment effect as $\Delta_i = Y_i^1 - Y_i^0$.
• Write $Y_i = Y_i^0 + \Delta_i \cdot D_i$.

Methodology: Potential outcomes model (II)
• The evaluation problem is that we never see both outcomes for any person (Holland, 1986).