
Quantile Regression and Other Distributional Estimators: Moving Beyond Mean Impacts of Policies

Marianne Bitler, Department of Economics, UC Davis and NBER

The research results described in this publication were supported in part by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under Award Number P01HD065704. The content is solely the responsibility of the author and does not necessarily represent the official views of the National Institutes of Health.

Means versus distributions

• Much of what we estimate in applied micro fields are means or conditional means. We then argue about the extent to which these estimate causal impacts.

• Another approach is to look at the overall distribution of impacts.

• Today, I’ll touch on what we can learn when we do this.

• Examples from my research with various coauthors (Thad Domina, Hilary Hoynes, and Emily Penner).

• Also examples in a do file available on the INID website.

Why would we want to look at distributions? (I)

• If there is interest in post-intervention gaps across groups (for example, racial/ethnic gaps in wages or test scores), effects on distributions are the relevant object for accounting with a social welfare function. (Issues about the distribution of treatment effects (DOTE) for individuals versus what we learn with quantile treatment effects or other treatment effects on distributions are mentioned below.)

• If you are using an outcome with a ceiling/floor (e.g., test scores), means can be biased, but quantile regression or quantile treatment effects (QTE) away from the ceiling/floor are not.
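The ceiling point can be illustrated with a minimal Python sketch. Everything in it is invented for illustration: the scores, the ceiling of 100, and the constant effect of 10 are assumptions, not numbers from any study discussed here. The observed mean difference understates the effect, while the QTE at the median is unaffected by the ceiling.

```python
def quantile(values, pct):
    """Smallest sample value y with F(y) >= pct/100 (pct is an integer percent)."""
    ordered = sorted(values)
    rank = -(-pct * len(ordered) // 100)   # integer ceiling of pct * n / 100
    return ordered[rank - 1]

CEILING = 100       # assumed maximum score on the test
TRUE_EFFECT = 10    # assumed constant treatment effect on the latent score

control = list(range(100))                                    # scores 0..99
treated = [min(y + TRUE_EFFECT, CEILING) for y in control]    # censored at the ceiling

mean_effect = sum(treated) / len(treated) - sum(control) / len(control)
median_qte = quantile(treated, 50) - quantile(control, 50)    # away from the ceiling
top_qte = quantile(treated, 99) - quantile(control, 99)       # at the ceiling

print(mean_effect, median_qte, top_qte)
```

In this toy example the mean difference is below 10, the median QTE equals the assumed effect of 10, and only the QTE near the top is distorted by the ceiling.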

Why would we want to look at distributions? (II)

• If effects are offsetting, the overall mean impact can be zero while there is important treatment effect heterogeneity by subgroups, which is relevant for policy targeting. Policy makers are probably happier with programs where the whole distribution benefits than with ones where there are winners and losers.

• These subgroups may be suggested by theory or by a specific institutional feature of your setting.

• But then, why not look at means within those subgroups?

– In various experimental settings, researchers have found that even within subgroups, the within-group heterogeneity swamps the between-group heterogeneity. This holds with QTE and a welfare experiment (Bitler, Gelbach, and Hoynes, forthcoming, ReStat).

– “Subgroup” itself is not a fixed concept across most data sets, incorporating individuals with different characteristics (observed and unobserved). Hispanics in one state (e.g., Texas) may be quite different from those in another (e.g., those of Cuban descent in Florida) in terms of background, income, etc.

Example where effects on the distribution are interesting: Welfare experiment in the US in CT (Bitler, Gelbach, and Hoynes, AER)

• The control group faces the (old) AFDC program: a benefit guarantee with essentially a 100% tax on earnings. So, for each dollar earned, welfare recipients lost a dollar of welfare benefits.

• The treatment group faces the (new) Jobs First program: a 0% tax rate on earnings up to a cliff, then all benefits are taxed away. So, recipients kept each dollar they earned until they reached the poverty guideline; if they then earned one more dollar, they lost the entire welfare benefit.

• A static model of labor supply predicts no effects at the bottom of the earnings distribution, a positive effect higher up, and then a negative effect near the top.

Graph of differences between the treatment and control distributions of earnings 1 to 7 quarters after random assignment, from Bitler, Gelbach, and Hoynes, AER

• Graph is quantile treatment effects (QTE) at each percentile (x-axis). y-axis measures QTE in dollars. The dashed horizontal line is the mean treatment effect. The solid line is the QTE. The dotted lines are 90% pointwise CIs.

• The mean would miss a lot of interesting heterogeneity that is consistent with economic theory.

[Figure: QTE for quarterly earnings by quantile. x-axis: Quantile (10 to 90); y-axis: Quarterly impact in dollars (−600 to 1000).]

Quantile regression

• Quantile regression provides one with a way to look at effects across the distribution of an outcome

What is quantile regression? (I)

• It is a way to see how the quantiles of the dependent variable change in response to changes in some independent variables, in a linear framework.

• Often, we estimate a multivariate regression of an outcome Y on some controls X1, X2, .... In the linear case, the estimated coefficients on the independent variables, $\hat{\beta}_k$, tell us the average effect on Y of a 1 unit change in the variable Xk.

What is quantile regression? (II)

• Quantile regression instead tells us how the effect of changing Xk by 1 unit varies across the conditional distribution of Y.

• It is also possible to undo the conditioning, and learn what changing Xk does to the unconditional distribution of Y.

Accessible references (more technical papers/references at the end)

• Roger Koenker and Kevin Hallock. 2001. Quantile Regression. Journal of Economic Perspectives 15(4):143–156. Also, Roger Koenker’s web site has many other helpful examples, presentations, and papers. See: http://www.econ.uiuc.edu/%7Eroger/courses/

• Josh Angrist and Jorn-Steffen Pischke. Mostly Harmless Econometrics. Chapter 7.

• Markus Frölich and Blaise Melly. 2010. Estimation of Quantile Treatment Effects with Stata. The Stata Journal 10(3): 423–457.

Supporting material

• See the example Stata program, which produces the estimates I present below, along with others I don’t discuss here, and includes a bootstrapping example.

• The do file also has links to the data and relevant statistical routines.

• You will need to comment out some lines if not running on Linux/Unix (these lines are marked in the comments).

Code in various packages (I)

• Stata:

– qreg (built in, conditional quantile regression and SEs under homoskedasticity)

– bsqreg (built in, does bootstrap SEs)

– sqreg (simultaneously estimates numerous quantiles, can test cross-quantile restrictions)

– ivqte (conditional and unconditional QTE with and without endogeneity, Frölich and Melly, 2010)

– cdeco (decompositions of change in distribution of covariates, Chernozhukov, Fernandez-Val, and Melly, 2009)

– rqdeco (decompositions, Melly, 2005, like Machado and Mata)

– rifreg (recentered influence function regressions, a version of unconditional effects, Firpo, Fortin, and Lemieux, 2009)

Code in various packages (II)

• R:

– quantreg package (Koenker, suite of programs)

– iv qte (unconditional IV QTE and more, Frölich and Melly, 2009)

– rqdeco3 (decompositions, Melly, 2005, like Machado and Mata)

– Lots of code by Victor Chernozhukov and coauthors

Code in various packages (III)

• SAS:

– quantreg (conditional quantile regression)

Motivation for looking at quantile regression in the context of educational interventions (I)

• Often there is a goal of increasing educational attainment and achievement.

• At the same time, there is a focus on narrowing gaps between haves and have nots.

• Either of these suggests that looking at effects across the distribution, and not only at average effects, is important.

Motivation for looking at quantile regression in the context of educational interventions (II)

• Theoretical: For example, developmental science might suggest different effects for children of different “ability” levels.

• Tied to specific context: Interventions like exit exams may result in teachers refocusing effort on those near the passing threshold, to the possible detriment of those at the bottom or top.

• Both kinds of hypotheses suggest an analysis of effects on the distribution may be particularly meaningful.

Motivation for looking at quantile regression in general

• Much research focuses on mean differences or mean differences within subgroup.

• Yet, it seems intuitive that a program from which everyone benefits might be more desirable than one where there are winners and losers.

• Zero or small effects at the mean may obscure offsetting effects.

• This also motivates an interest in distributional estimators.

Outline

• Stylized motivating example

• Background on the potential outcomes framework, as applied to this setting

• Practical examples of applying the method to the following:

– Unconditional quantiles in an experimental setting

– Conditional and unconditional QTE, assuming exogeneity

– Conditional and unconditional IV QTE, assuming exogeneity

• Other considerations

Stylized motivating example (I)

• Suppose there are two types of people who are affected by an educational intervention. They are evenly distributed in the population.

• The first group (Type 1) gains an amount δ in a measure of achievement, and the second group (Type 2) loses the same amount.

• The average effect of this intervention would be zero.

Stylized motivating example (II)

• If we could, we would like to know that this is because some gain and some lose. Quantile regression can tell us there is this variation.

• If we knew who was in each group, we probably wouldn’t give the Type 2 people the intervention. But often we can’t figure out a way to tell the people apart. With quantile regression, though, we might learn how observables are associated with the variation.
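The stylized example can be simulated in a few lines of Python. This is a sketch under invented assumptions: achievement without the intervention is 0 to 99, δ is set to 10, and types alternate so they are evenly spread across the distribution. Unlike in real data, both potential outcomes are visible here for every person.

```python
def quantile(values, pct):
    """Smallest sample value y with F(y) >= pct/100 (pct is an integer percent)."""
    ordered = sorted(values)
    rank = -(-pct * len(ordered) // 100)   # integer ceiling of pct * n / 100
    return ordered[rank - 1]

DELTA = 10                    # assumed size of the gain/loss
y0 = list(range(100))         # achievement without the intervention (made up)

# Type 1 (even positions) gains DELTA; Type 2 (odd positions) loses DELTA.
y1 = [y + DELTA if i % 2 == 0 else y - DELTA for i, y in enumerate(y0)]

mean_effect = sum(y1) / len(y1) - sum(y0) / len(y0)
qte = {pct: quantile(y1, pct) - quantile(y0, pct) for pct in (5, 50, 95)}

print(mean_effect)   # 0.0
print(qte)           # {5: -5, 50: 0, 95: 4}
```

The mean effect is exactly zero, yet the QTE are negative at the bottom and positive at the top, which signals that the intervention spread out the distribution. Note that the QTE are not the individual effects, which are all either +10 or −10 in this construction.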

Methodology: Potential outcomes model (I)

• Potential outcomes framework of Rubin, Neyman, etc.

• Assume there is a binary treatment, D. Let D = 1 if treated, and D = 0 otherwise.

• Let $Y_i^0$ denote person i’s outcome when not treated.

• Let $Y_i^1$ denote person i’s outcome when treated.

• Define the individual treatment effect as $\Delta_i = Y_i^1 - Y_i^0$.

• Write $Y_i = Y_i^0 + \Delta_i \cdot D_i$.

Methodology: Potential outcomes model (II)

• The evaluation problem is that we never see both outcomes for any person (Holland, 1986). So we never see ∆i for any person.

• The outcome for person i is never observed under the counterfactual assignment.

• Effect estimates comparing means for those who take up the program and those who do not are likely biased.

Methodology: Potential outcomes model (III)

• We can get around this if we randomly assign people to either get the treatment or not get the treatment.

• Under random assignment and full compliance (everyone assigned to the offer of treatment takes it up, and no one not assigned to the offer can get it), the average treatment effect can be estimated by the difference in means of the outcome for the treatment and control groups, $\hat{\delta} = \bar{Y}^1 - \bar{Y}^0$.

• If there is non-compliance (some assigned to treatment don’t take up the offer, or some assigned to the control group take it up), this mean difference gives the effect of an offer of treatment (intent to treat). The typical approach to get treatment on the treated is to inflate the intent-to-treat effect, dividing it by the probability that the treatment offer induces participation (Bloom). The Local Average Treatment Effect interpretation says this is the average effect for compliers (Angrist, Imbens, Rubin).

Methodology: Quantile treatment effects (I)

• Recall that if Y is a random variable and F is the cumulative distribution function (CDF) of Y, then F(y) = Prob(Y ≤ y). The τth quantile of Y is given by the inverse function Q(τ) = inf{y : F(y) ≥ τ}. For example, the median is Q(.5).

• We can always estimate marginal distributions for Y in the treatment and control groups.

• We can always find the quantiles of the treatment and control groups.
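The definitions of F and Q(τ) translate directly into code. The following Python sketch uses made-up test scores; the function names `ecdf` and `quantile` are illustrative and not from any package mentioned in these slides.

```python
def ecdf(sample, y):
    """Empirical CDF: share of observations less than or equal to y."""
    return sum(1 for value in sample if value <= y) / len(sample)

def quantile(sample, tau):
    """Q(tau) = inf{y : F(y) >= tau}, searched over the observed values."""
    for y in sorted(sample):
        if ecdf(sample, y) >= tau:
            return y
    raise ValueError("tau must lie in (0, 1]")

scores = [70, 85, 60, 90, 75]     # made-up test scores
median = quantile(scores, 0.5)    # the median is Q(.5)

print(ecdf(scores, 75), median)
```

Applying the same function separately to the treatment and control samples gives the quantiles of each group.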

Methodology: Quantile treatment effects (II)

• Let $F^0(Y^0)$ denote the cumulative distribution function of the outcome when untreated, and $F^1(Y^1)$ denote the cumulative distribution function when treated.

• Random assignment implies $F^0(y_0 \mid D = 0) = F^0(y_0 \mid D = 1)$ and $F^1(y_1 \mid D = 0) = F^1(y_1 \mid D = 1)$.

Methodology: Quantile treatment effects (III)

• We can then calculate the quantile treatment effect at the τth quantile as $\Delta_Q(\tau) = Q_1(\tau) - Q_0(\tau)$, where $Q_1(\tau)$ is the τth quantile of the treatment distribution and $Q_0(\tau)$ is the τth quantile of the control distribution. For example, the median QTE is the difference in the medians.

• This object is the distributional analog to the difference in means estimate for the mean treatment effect.

Methodology: Quantile treatment effects (QTE) (IV)

• So, the qth quantile treatment effect is the horizontal distance between the graphs of $F^1$ and $F^0$.

• Equivalently, the qth quantile treatment effect (QTE) is the difference between the quantiles of the treatment and control groups (the vertical distance between the inverse CDFs).
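As a minimal sketch of this calculation, the Python below differences the deciles of two small samples. The scores are invented, and treated and control units are different people, as in an experiment.

```python
def quantile(sample, pct):
    """Smallest sample value y with F(y) >= pct/100 (pct is an integer percent)."""
    ordered = sorted(sample)
    rank = -(-pct * len(ordered) // 100)   # integer ceiling of pct * n / 100
    return ordered[rank - 1]

control = [52, 55, 58, 60, 61, 63, 66, 70, 74, 80]   # made-up scores
treated = [58, 60, 62, 63, 64, 65, 67, 70, 73, 78]   # made-up scores

# QTE at each decile: vertical distance between the two inverse CDFs.
qte = {pct: quantile(treated, pct) - quantile(control, pct)
       for pct in range(10, 100, 10)}
mean_difference = sum(treated) / len(treated) - sum(control) / len(control)

print(qte)
print(mean_difference)
```

In these invented data the QTE fall from 6 at the first decile to −1 at the ninth, while the mean difference of about 2.1 summarizes none of that pattern.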

Example of inverse CDF functions for a treatment and control group

[Figure: Test score inverse CDFs for the treatment and control groups. x-axis: Percentiles (1 to 98); y-axis: test score (60 to 120).]

Example of quantile treatment effects (vertical differences in inverse CDF functions for treatment and control group)

[Figure: Test score QTE and mean difference. x-axis: Percentiles (1 to 98); y-axis: QTE (−2 to 2).]

What do QTE estimate? (I)

• The QTE tell us how the distribution changes when assigning a treatment randomly.

• It does not tell us about the distribution of individual treatment effects without further assumptions. (That is the thing we would need to know to figure out the share of winners and losers.) See work by Heckman, Todd, Smith, and others for discussions of these interesting other parameters.

• It does tell us enough to evaluate the program from the perspective of a social planner. We can see if there are gains at the bottom or top of the distributions, and perhaps weight them differently.

What do QTE estimate? (II)

• Further, if any QTE is positive, then at least one individual treatment effect is positive, and similarly if any QTE is negative.

• With a constant effect, the distribution of treatment effects is degenerate and the ATE is the treatment effect for everyone.

• Rank preservation means that an individual’s position in the two distributions is the same. With rank preservation, the QTE are the same as the distribution of treatment effects.
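A toy Python example shows why the QTE pin down individual effects only under rank preservation. The three-person data are invented; the same marginal distributions are paired in two different ways.

```python
y0 = [1, 2, 3]                     # untreated outcomes for persons 1, 2, 3 (made up)
y1_rank_preserving = [2, 3, 4]     # everyone keeps their rank
y1_rank_reversing = [4, 3, 2]      # same marginal distribution, ranks reversed

def qte(y1, y0):
    """Differences between matching order statistics of the two distributions."""
    return [a - b for a, b in zip(sorted(y1), sorted(y0))]

def individual_effects(y1, y0):
    """Person-by-person effects; never observable in real data."""
    return [a - b for a, b in zip(y1, y0)]

print(qte(y1_rank_preserving, y0), individual_effects(y1_rank_preserving, y0))
print(qte(y1_rank_reversing, y0), individual_effects(y1_rank_reversing, y0))
```

Both cases give QTE of [1, 1, 1], but the individual effects are [1, 1, 1] under rank preservation and [3, 1, -1] under rank reversal, so the second case has a loser that the QTE cannot reveal.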

What do QTE estimate? (III)

• We can test for evidence against rank preservation (Dong and Shu, 2017; Frandsen and Lefgren, forthcoming; example in Bitler, Gelbach, and Hoynes, 2005). These tests check conditions that are necessary but not sufficient for rank preservation.

• There are techniques which produce bounds on the DOTE (Fan and Park; Heckman, Smith, and Clements).

• Not more limiting than many other approaches.

Conditional quantile regression: What do we estimate?

• With conditional quantile regression, instead of the unconditional QTE above, we estimate linear conditional quantile functions.

• Start by thinking about how to estimate the median.

How do we calculate the median?

• Could think of ordering the data, and taking the smallest value $y_{med}$ in the data where $F(y_{med}) \ge 0.5$.

• Alternatively, given a sample, the median minimizes the sum of absolute deviations from it. (This may seem surprising, but the location of the median depends only on the signs of the deviations, not on their magnitudes.)

How do we calculate the qth quantile?

• The qth quantile of the unconditional distribution can be obtained by minimizing the weighted sum of the absolute values of the residuals. The weight function is the check function, $\rho_\tau(z) = z \cdot (\tau - I(z < 0))$, with $0 < \tau < 1$, where I(.) is the indicator function.

• If the desired quantile q is above the median, the weight places a higher value on residuals above q than below q, and vice versa for quantiles below the median. The weighting function is an inverted right angle, with slope τ − 1 to the left of zero and τ to the right.
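The check function and the minimization can be verified by brute force. The Python sketch below uses made-up data and searches only over observed sample values, which is enough in this example; it is an illustration, not how production software solves the problem.

```python
def check_loss(z, tau):
    """rho_tau(z) = z * (tau - I(z < 0)): slope tau - 1 left of zero, tau right."""
    return z * (tau - (1 if z < 0 else 0))

def quantile_by_minimization(sample, tau):
    """Return the sample value that minimizes the summed check loss."""
    def objective(q):
        return sum(check_loss(y - q, tau) for y in sample)
    return min(sorted(sample), key=objective)

sample = [3, 1, 4, 1, 5, 9, 2, 6, 5]    # made-up data; sorted: 1 1 2 3 4 5 5 6 9

print(quantile_by_minimization(sample, 0.5))    # 4, the median
print(quantile_by_minimization(sample, 0.25))   # 2
print(quantile_by_minimization(sample, 0.9))    # 9
```

With τ = 0.5 the check loss is half the absolute deviation, so the minimizer is the median; other values of τ tilt the loss and move the minimizer to the corresponding quantile.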

Conditional quantile regression (I)

• Can extend this to the linear conditional quantile function $Q(\tau \mid X = x) = x'\beta(\tau)$. It is estimated as follows: $\hat{\beta}(\tau) = \arg\min_b \sum_i \rho_\tau(y_i - X_i' b)$.

• This is not something that can be done with the usual gradient methods, because the objective is not differentiable everywhere. But it can be solved as a linear programming problem, which is fast.

• This is what qreg or quantreg are doing.
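One standard way to write the quantile regression minimization as a linear program is given below. It is a textbook formulation, not necessarily the exact algorithm inside qreg or quantreg. The symbols $u_i$ and $v_i$ are introduced here for the positive and negative parts of each residual.

```latex
\min_{b,\,u,\,v} \; \tau \sum_{i} u_i + (1-\tau) \sum_{i} v_i
\quad \text{subject to} \quad
y_i = X_i' b + u_i - v_i, \qquad u_i \ge 0, \quad v_i \ge 0 .
```

At the optimum at most one of $u_i$ and $v_i$ is nonzero for each observation, so the objective equals $\sum_i \rho_\tau(y_i - X_i' b)$.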

QTE with a treatment indicator

• If there are no other Xs besides the treatment indicator, quantile regression is equivalent to differencing the quantiles of the treatment and control groups.

• If there is selection on observables (Xs), Firpo (2007) shows that one can create inverse propensity score weights from a regression of treatment status on the relevant Xs (logit, probit, others), and the weighted differences in the quantiles then yield consistent estimates of the QTE. The Xs have been integrated out.
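The reweighting idea can be sketched in Python as follows. This is not Firpo's estimator as implemented in any package: the data are invented, the true effect is set to +1 for everyone, and the propensity score is taken from cell frequencies of a single binary X rather than from a logit or probit. Exact fractions are used so the weighted quantiles are not affected by rounding.

```python
from fractions import Fraction

def weighted_quantile(values, weights, pct):
    """Smallest value whose cumulative weight share reaches pct percent."""
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    running = 0
    for value, weight in pairs:
        running += weight
        if running * 100 >= pct * total:
            return value
    return pairs[-1][0]

# Made-up rows (x, d, y). Treatment is rare when x = 0 and common when x = 1.
rows = []
for base in (1, 2, 3, 4):
    rows += [(0, 1, base + 1)] + [(0, 0, base)] * 3          # x = 0 cell
    rows += [(1, 1, base + 11)] * 3 + [(1, 0, base + 10)]    # x = 1 cell

def propensity(x):
    """Share treated among rows with this x (stand-in for a logit/probit fit)."""
    cell = [d for (xi, d, _) in rows if xi == x]
    return Fraction(sum(cell), len(cell))

def qte(pct, use_weights=True):
    """Difference in (optionally inverse-propensity-weighted) quantiles."""
    def group_quantile(d_value):
        values, weights = [], []
        for x, d, y in rows:
            if d == d_value:
                p = propensity(x)
                ipw = 1 / p if d == 1 else 1 / (1 - p)
                values.append(y)
                weights.append(ipw if use_weights else 1)
        return weighted_quantile(values, weights, pct)
    return group_quantile(1) - group_quantile(0)

print(qte(50, use_weights=False))   # naive difference in medians
print(qte(50))                      # reweighted difference in medians
```

In this construction the naive median difference is 10, because treated units mostly come from the high-outcome cell, while the reweighted difference recovers the assumed effect of 1.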

Inference in quantile regression (I)

• Can get asymptotic expressions for the SEs from quantile regression.

• Inference (SEs and testing) is easiest if there is homoskedasticity. But this isn’t a very interesting case.

• Other setups lead to various expressions for the SEs. Buchinsky (JEP) has some discussion of this.

• In some settings there is a simple alternative to deriving the analytic covariance matrix: the bootstrap, which works when there are sampling weights, clusters/blocks (like schools or PSUs), a desire to balance on the Xs, etc.

Inference in quantile regression (II)

• Intuitively, the bootstrap treats the observed data as the population, and samples from it with replacement. You re-estimate the objects of interest within each resample.

• Then, you can use the distribution of resample estimates to derive confidence intervals.
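A minimal Python sketch of this procedure for the median QTE follows. The data, the seed, and the function names are assumptions made for illustration; the sketch resamples individual observations and ignores sampling weights and clustering.

```python
import random

def median(sample):
    """Lower median: the smallest y with F(y) >= 0.5."""
    ordered = sorted(sample)
    return ordered[(len(ordered) - 1) // 2]

def bootstrap_ci_median_qte(treated, control, reps=999, seed=2024):
    """90% percentile bootstrap CI for the difference in medians."""
    rng = random.Random(seed)               # fixed seed so the sketch is reproducible
    draws = []
    for _ in range(reps):
        t = rng.choices(treated, k=len(treated))   # resample with replacement
        c = rng.choices(control, k=len(control))
        draws.append(median(t) - median(c))        # re-estimate within the resample
    draws.sort()
    lower = draws[(reps + 1) // 20 - 1]         # 50th of 999 sorted estimates
    upper = draws[19 * (reps + 1) // 20 - 1]    # 950th of 999 sorted estimates
    return lower, upper

control = [52, 55, 58, 60, 61, 63, 66, 70, 74, 80]   # made-up scores
treated = [58, 60, 62, 63, 64, 65, 67, 70, 73, 78]   # made-up scores

point_estimate = median(treated) - median(control)
ci_lower, ci_upper = bootstrap_ci_median_qte(treated, control)
print(point_estimate, ci_lower, ci_upper)
```

Because the interval comes from the sorted resample estimates, it does not have to be symmetric around the point estimate.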

Inference in quantile regression (III)

• In Stata, the analytic SEs are reported by qreg.

• In Stata, bsqreg bootstraps the SEs. This is done by resampling. Convenient numbers of resamples are 249, 499, or 999, which give round p-values.

• In Stata, sqreg allows you to estimate the quantiles simultaneously with the same covariates, which allows testing whether estimates are the same across quantiles.

• Or you can do this by hand in some settings (example below and in code).

Bootstrap example: Inference with QTE

• In Bitler, Domina, and Hoynes (2016), we look at the QTE for the effect of an offer of Head Start on test scores, using HSIS data.

• Children within Head Start centers of application are randomly assigned to an offer or not, so we also want to resample centers of HS application rather than kids (kids within a center are more similar than 2 children drawn from the population of HS applicants because, among other things, they come from the same neighborhoods).

• There are differences in attrition by T/C, and also sampling weights. We estimate propensity score weights incorporating the baseline sampling weights, and this balances attrition.

• We repeat this estimation (logits, weights) within each bootstrap replicate (where we sample the centers with replacement), say 999 times. Then for the τth quantile, we sort the bootstrapped estimates, and the 50th and 950th sorted estimates provide the bounds of a 90% confidence interval for that estimate.

• Better to do uniform CIs. Chernozhukov and coauthors provide such estimates.

Graph of QTE for PPVT test scores in 2003, for 3-year olds in HSIS

• Red line is mean difference. Solid blue line is QTE estimates for centiles 1 to 98, connected. Dashed blue lines are pointwise CIs from BS.

• Note that the CIs are not symmetric because of the bootstrapping.
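The center-level resampling behind these bootstrap CIs can be sketched in Python as follows. The rows and center labels are invented, and the sketch shows only the resampling step; in the application described above, the logits and weights would be re-estimated within each replicate before computing the QTE.

```python
import random

def resample_centers(rows, rng):
    """Draw centers with replacement and keep every child from each drawn center.

    rows: list of (center_id, offered, score) tuples.
    """
    by_center = {}
    for row in rows:
        by_center.setdefault(row[0], []).append(row)
    center_ids = sorted(by_center)
    drawn = rng.choices(center_ids, k=len(center_ids))   # resample whole centers
    return [row for center in drawn for row in by_center[center]]

rows = [                      # made-up data: two children per center
    ("A", 1, 31), ("A", 0, 28),
    ("B", 1, 40), ("B", 0, 35),
    ("C", 1, 22), ("C", 0, 25),
]

rng = random.Random(7)        # fixed seed so the sketch is reproducible
replicate = resample_centers(rows, rng)
print(replicate)
```

A center that is drawn twice contributes all of its children twice, which preserves the within-center similarity that resampling individual children would break.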

[Figure: QTE for PPVT scores in 2003, 3-year-old cohort, 90% CIs. Series: mean difference, QTE, upper end of 90% CI, lower end of 90% CI. x-axis: Percentiles (0 to 100); y-axis: −30 to 30.]

What do the quantile regression estimates mean? (I)

• The $\hat{\beta}$ are the partial derivatives of the conditional quantile function with respect to the Xs.

• Often show them as graphs for each of the Xs, sometimes with pointwise CIs and sometimes with joint (uniform) ones. The coefficient is on the Y axis and the quantile is on the X axis, like the example above. (Or they can be shown in tables.)

What do the quantile regression estimates mean? (II)

• Somewhat complicated to interpret them (compared to OLS).

• They say something about effects on the distribution, not about effects for individuals without further assumptions.

What do the conditional quantiles mean? Unconditional quantiles (III)

• One might want to say something about the unconditional distribution. This is tough. You need all the conditional quantiles to get the unconditional ones. This is very different from linear regression, where one can easily do this. One exception: if the other Xs are all incidental controls, one can deal with them through an estimated propensity score (e.g., Firpo, 2007).

• Examples of how to do this: Machado and Mata; Melly; others.

• See Firpo, Fortin, and Lemieux for another approach under some assumptions. Also Chernozhukov, Fernandez-Val, and Melly.

Instrumental variable conditional and unconditional QTE (I)

• When we leave the experimental setting, often we worry about omitted variables bias or selection or reverse causation biasing our estimates.

Instrumental variable conditional and unconditional QTE (II)

• Abadie, Angrist and Imbens (2002) extend the LATE framework of IV to conditional quantiles. Assumptions include:

– There is an instrument. (Could consider treatment to be take-up of the program, and the instrument could be offer of the program in many experimental settings.)

– Treatment assignment is independent of outcomes and takeup given the controls. (Potential outcomes not directly affected by instrument (exclusion) AND the instrument identifies causal effects.)

– Nontrivial assignment. Given Xs, the probability I take up the treatment is affected by treatment assignment.

– Requires no defiers (people who take up if in the control group, but don’t in the treatment group). (Monotonicity.)

Instrumental variable conditional and unconditional QTE (III)

• This approach gets effects for the compliers only (people who take up only if offered the treatment). The conditional QTE for the compliers can be estimated by a weighted regression. Start by estimating the probability that the instrument is 1 given the Xs. The ivqte ado used below in the examples uses a local logit.

• There are other quantile IV approaches. Some (Chernozhukov and Hansen) impose a rank similarity condition. This requires that people don’t move ranks a lot.

• Unconditional IV approach (Frölich and Melly, 2008). This identifies the unconditional distribution for compliers. It is also weighted, with different weights than Abadie, Angrist, and Imbens. The first stage is the same. It still has a LATE interpretation.

• Canned software for all of these in Stata and R, maybe SAS and Matlab.

Example (I)

• Drawn from public-use data from the paper by Josh Angrist, Kevin Lang, and Phil Oreopoulos. 2009. “Incentives and Services for College Achievement: Evidence from a Randomized Trial.” American Economic Journal: Applied Economics 1(1): 136-163. The data are posted at econ-www.mit.edu/faculty/angrist/data1/data then click on Angrist, Lang, and Oreopoulos.

Example (II)

• The goal of the Student Achievement and Retention Project intervention was to improve academic performance among college freshmen at a satellite campus of a large Canadian university. There were multiple treatment groups. We will focus on the group offered both financial incentives for good grades and academic support.

• The paper finds women use the services more than men. They find effects of the combined treatment for women but nothing for men.

Example (III)

• Awards varied by high school grade quartile. Payments were substantial.

• N = 250 for either one treatment or the other, N = 150 for getting both treatments, and N = 1006 for the control group.

• The analysis in the paper relies on a set of basic or more thorough controls.

• Outcomes of interest here include grades in the first year and later.

• Takeup was not 100%. So, OLS intent to treat (regressing outcome on offer of treatment) and treatment on the treated (from 2SLS, IV for takeup with offer) differ on average (unless both are 0).

Balance

• Always a good idea to check balance with experimental data and observational data.

• The paper reports means by treatment/control status in table 1.

• The groups seem balanced. 85 students don’t show up and are dropped. Some students don’t have fall grades and are dropped. Neither of these conditions is related to treatment status.

Unconditional quantile treatment effects

• If random assignment is done correctly, then not only should the means be balanced across treatment and control groups, but also the distributions should be balanced.

• Example do file checks balance in the means.

• If we had a pre-RA outcome, we could check to see if the distributions were the same for the treatment and control groups.

• We can simply estimate unconditional quantiles of the treatment and control groups. Their differences are the QTE of the intent to treat distributions, and they tell us the effect of being offered treatment.

Stata

• The posted code mostly uses the Stata ado “ivqte”, but I show that it gives the same results as doing the calculation by brute force.

• For example, qreg in our example for the first decile is the same as ivqte ignoring endogeneity.

• Or my by-hand unconditional estimate with no controls is the same as ivqte without controls.

• Examples are included to help you understand what the estimators do.

ivqte in Stata

• Download the ivqte package from the Stata Journal. It requires the moremata and kdens ado packages too.

• ivqte has some options you need to specify or that are preset.

• indepvars is the list of Xs for the conditional estimates. The aai option uses the Abadie, Angrist, and Imbens conditional IV QR estimator. If no instrument and no indepvars are specified, it uses the Firpo estimator. If an instrument is specified, without aai and without indepvars, it uses the Frölich and Melly estimator.

• Only works for binary IVs (can make them binary).

• Most of these estimators first estimate a propensity score. The linear option uses a local linear estimator; if it is off, a local logit is used.

• kernel selects different kernels for the propensity score estimation. Epanechnikov is the default.

• bandwidth is used if there are continuous variables in the propensity score.

• λ is a parameter ∈ (0, 1) used to smooth in the estimation of the p-score.

• There are some weighting options. positive for Frölich and Melly restricts to positive weights.

• See the Stata Journal article for more info.

Conditional QTE, regress test on treatment group assignment (offer)

• Marked Estimation A in the do file. Classical Koenker and Bassett estimator, with ivqte adjusting SEs for heteroskedasticity.

• First use qreg to get the first decile (or 10th percentile) from a quantile regression of GPA year1 on treatment assignment (sfsp) and controls (global $all2, gender, language, other Xs). The estimate is 0.0533333 and the qreg SE is 0.1228043 (line 184 of do file).

• Then use ivqte. The point estimate for the .1 quantile is identical at 0.0533333, and the ivqte SE is 0.1165126, which is OK with heteroskedasticity (line 197 of the do file, inside a loop).

• The median estimate is 0.0625 and the estimate at the 90th percentile is 0.1936, suggesting a fair amount of heterogeneity in the point estimates. This compares with an insignificant 0.091 for the mean effect.

• Because these are conditional estimates, they are comparing people at the 10th percentile of the group offered treatment and the control group within groups created by Xs.

Graph of ITT, conditional QTE, key RHS is offer of treatment, Estimation A

• The graph in the following slide shows the mean effect of 0.091 (dashed horizontal line) and a solid line connecting the 9 QTE, one for each decile.

• The pattern suggests considerable variation, although the SEs suggest none of these are statistically different from 0.

[Figure: Conditional QTE, treatment is offer, assumes exogeneity. Series: conditional QTE (RHS is offer) and mean (treatment is offer, ITT). x-axis: Decile (10 to 90); y-axis: −.2 to .3.]

Conditional QTE, regress test on treatment takeup

• This would be the treatment effect on the treated if there were no selection into treatment (probably think there is).

• Still classical quantile regression.

• Marked Estimation B in the do file. One probably wouldn’t believe this.

• First use qreg to get the first decile (or 10th percentile) from a quantile regression of GPA year1 on treatment assignment (sfsp) and controls (global $all2, gender, language, other Xs). The estimate is 0.11 (line 232 of do file).

• Then use ivqte. The point estimate for the .1 quantile is identical at 0.11 (line 246 of do file, in loop).

• Compares with OLS estimate of 0.116.

Graph of TOT, conditional QTE, key RHS is takeup of treatment, Estimation B

• The graph in the following slide shows the mean effect of 0.116 (dashed horizontal line) and a solid line connecting the 9 QTE, one for each decile.

• The pattern suggests less variation than the intent to treat did.

[Figure: Conditional QTE, treatment is takeup dummy, assumes exogeneity. Series: conditional QTE (RHS is takeup) and mean (treatment is takeup, TOT). x-axis: Decile (10 to 90); y-axis: −.2 to .3.]

Conditional IVQTE, regress test on treatment takeup (endogenous), instrument with offer of treatment

• This would be the treatment effect on the treated using the Abadie, Angrist and Imbens instrumental-variable quantile regression estimator.

• Marked Estimation C in the do file.

• Using ivqte, the point estimate for the .1 quantile is not very different from the previous one, at 0.116 (line 279 of do file, in loop).

• Compares with 2SLS estimate of 0.1196.

• Next graph shows all of the decile estimates with the mean 2SLS estimate. There is not much variation except at the top.

[Figure: Conditional IV QTE for compliers, treatment is takeup dummy. x-axis: Decile (10 to 90); y-axis: −.2 to .3.]

Moving to unconditional estimates

• I prefer unconditional estimates in general, as much of the discussion about gaps is about differences for those at the bottom of the overall distribution, not say the bottom within the conditional distributions of those with highly educated parents and uneducated parents, where the former may be fairly high up in the overall distribution.

Intent to treat, unconditional QTE, assumes exogeneity, use controls as in Firpo paper to balance Xs across T & C with inverse p-score weights.

• Now use the Firpo unconditional QTE estimator, ITT, Estimation D in do file, line 310.

• General pattern the same as conditional QTE ITT. Point estimates differ.

• At the bottom, effects are quite different from the mean.

[Figure: Unconditional QTE, treatment is offer, assumes exogeneity. x-axis: Decile (10 to 90); y-axis: −.2 to .3.]

Unconditional QTE, no controls, assumes exogeneity

• Estimation E estimates effect of the offer of treatment (intent to treat) without balancing the Xs.

• Done this way so I can show that ivqte and the “by hand” estimate differencing the inverse CDFs produce the same estimates.

• ivqte with no controls and done by hand (line 341), using pctile for the T and C groups and differencing (lines 361, 368, and 375).

• Both yield the same point estimates. SEs on the graph come from a simple bootstrap with 999 replicates. Within each decile, sort the resampled estimates. The 50th such estimate for each quantile q is the bottom of the 90% CI, and the 950th such estimate is the top of the CI. (This is code in lines 386 to 463.)

• The graph on the next page shows CIs and point estimates. We see some evidence of variation, but it is not statistically significant.

[Figure: Uncond. QTE, treatment is offer, no controls, assumes exogeneity. Series: QTE, bottom of BS 90% CI, top of BS 90% CI. x-axis: Decile (10 to 90); y-axis: −.2 to .3.]

TOT, unconditional QTE, assumes exogeneity

• Treatment on the treated if no selection into takeup. Uses ivqte. Key RHS is participation in the treatment (takeup) sfsp p. Controls are gender, group at RA in terms of HS grades, number of courses, etc. Estimation F, line 492.

[Figure: Unconditional QTE, treatment is takeup dummy, assumes exogeneity. Series: unconditional QTE and mean (treatment is takeup, TOT). x-axis: Decile (10 to 90); y-axis: −.2 to .3.]

TOT, unconditional IV QTE, Frölich and Melly

• Treatment on the treated. So key RHS is participation in sfsp p. Uses ivqte. Instrument with offer of treatment sfsp. Line 511.

• Again, big gains at the bottom, maybe a negative effect at the top.

• Some further regressions with changes in some options are below this.

[Figure: Unconditional IV QTE, treatment is takeup dummy. x-axis: Decile (10 to 90); y-axis: −.2 to .3.]

Other issues

• Can estimate quantile treatment effects in the regression discontinuity design with enough data. (Frandsen, Frölich, and Melly.)

• Challenges to doing inference in some settings with sampling weights.

• Within vs. across subgroups

• Bootstrapping won’t work if there are point masses (like earnings and some non-workers), see Chernozhukov and Fernandez-Val paper using subsampling.

Conclusions

• Examples of use of distributional approach in an experimental setting.

• With valid instruments, can extend to non-experimental setting.

• Yields effects across the distribution.

Key papers: Quantile regression and QTE through about 2012 when these notes were first written

• Koenker, Roger, and Gilbert Bassett. (1978). “Regression Quantiles.” Econometrica.

• Koenker, Roger. (2005). Quantile Regression. Cambridge University Press.

• Buchinsky, Moshe. (1994). “Changes in the U.S. Wage Structure 1963-1987: Application of Quantile Regression.” Econometrica.

• Powell, James. (1986). “Censored Regression Quantiles.” Journal of Econometrics.

• Angrist, Joshua, Victor Chernozhukov, and Ivan Fernandez-Val. (2006). “Quantile Regression under Misspecification, with an Application to the U.S. Wage Structure.” Econometrica.

• Firpo, Sergio. (2007). “Efficient Semiparametric Estimation of Quantile Treatment Effects.” Econometrica.

• Heckman, James J., Jeffrey Smith, and Nancy Clements. (1997). “Making the Most Out of Programme Evaluations and Social Experiments.” The Review of Economic Studies.

Key papers: IV quantile regression and QTE with endogeneity

• Abadie, Alberto, Joshua Angrist, and Guido Imbens. (2002). “Instrumental Variables Estimates of the Effect of Subsidized Training on the Quantiles of Trainee Earnings.” Econometrica. QTE within LATE.

• Frölich, Markus, and Blaise Melly. (2007). “Unconditional Quantile Treatment Effects under Endogeneity.” Working paper.

• Good review piece. Chernozhukov, Victor, and Christian Hansen. (2013). “Quantile Models with Endogeneity.” Annual Review of Economics.

• Other set of approaches not covered in these notes

– Chernozhukov, Victor, and Christian Hansen. (2005). “An IV Model of Quantile Treatment Effects.” Econometrica.

– Chernozhukov, Victor, and Christian Hansen. (2004). “The Impact of 401K on Savings: An IV-QR Analysis.” Review of Economics and Statistics.

– Rank preservation and rank similarity important here.

Key papers: counterfactuals and decompositions (not covered here)

• Machado, Jose, and Jose Mata. (2005). “Counterfactual Decompositions of Changes in Wage Distributions using Quantile Regression.” Journal of Applied Econometrics.

• Chernozhukov, Victor, Ivan Fernandez-Val and Blaise Melly. (2013). “Inference on Counterfactual Distributions.” Econometrica.

• Firpo, Sergio, Nicole Fortin, and Thomas Lemieux. (2009). “Unconditional Quantile Regressions.” Econometrica.