NBER WORKING PAPER SERIES
OPTIMAL TAXATION AND HUMAN CAPITAL POLICIES OVER THE LIFE CYCLE
Stefanie Stantcheva
Working Paper 21207 http://www.nber.org/papers/w21207
NATIONAL BUREAU OF ECONOMIC RESEARCH 1050 Massachusetts Avenue Cambridge, MA 02138 May 2015
I want to thank James Poterba and Ivan Werning for invaluable advice, guidance, and encouragement throughout this project. Esther Duflo and Robert Townsend provided helpful and generous support. I benefited very much from Emmanuel Saez' insightful comments. I also thank seminar participants at Berkeley, Booth, Chicago, Harvard, Michigan, MIT, Penn, Princeton, Santa Barbara, Stanford, Stanford GSB, Wharton, and Yale for their useful feedback and comments. The views expressed herein are those of the author and do not necessarily reflect the views of the National Bureau of Economic Research.
NBER working papers are circulated for discussion and comment purposes. They have not been peer-reviewed or been subject to the review by the NBER Board of Directors that accompanies official NBER publications.
© 2015 by Stefanie Stantcheva. All rights reserved. Short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.
Optimal Taxation and Human Capital Policies over the Life Cycle
Stefanie Stantcheva
NBER Working Paper No. 21207
May 2015
JEL No. H21, H23, I21, I22, I24
ABSTRACT
This paper derives optimal income tax and human capital policies in a dynamic life cycle model of labor supply and risky human capital formation. The wage is a function of both stochastic, persistent, and exogenous "ability" and endogenous human capital. Human capital is acquired throughout life through monetary expenses. The government faces asymmetric information regarding the initial ability of agents and the lifetime evolution of ability, as well as the labor supply. The optimal subsidy on human capital expenses is determined by three considerations: counterbalancing distortions to human capital investment from the taxation of wage and capital income, encouraging labor supply, and providing insurance against adverse draws from the productivity distribution. When the wage elasticity with respect to ability is increasing in human capital, the optimal subsidy involves less than full deductibility of human capital expenses on the tax base, and falls with age. I consider two ways to implement the optimum: income contingent loans, and a tax scheme that allows for a deferred deductibility of human capital expenses. Numerical results are presented that suggest that full dynamic risk-adjusted deductibility of expenses might be close to optimal, and that simple linear age-dependent policies can achieve most of the welfare gain from the second best.
Stefanie Stantcheva Department of Economics Littauer Center 232 Harvard University Cambridge, MA 02138 and NBER [email protected]
An online appendix is available at: http://www.nber.org/data-appendix/w21207
1 Introduction
Investments in human capital, in the form of both time and money, play a key role in most people’s lives. Children and young adults acquire education, and human capital accumulation continues throughout life through job training. There is a two-way interaction between human capital and the tax system. On the one hand, investments in human capital are influenced by tax policy – a point recognized early on by Schultz (1961).1 Taxes on labor income discourage investment by capturing part of the return to human capital, yet also help insure against earnings risk, thereby encouraging investment in risky human capital. Capital taxes affect the choice between physical and human capital. On the other hand, investments in human capital directly impact the available tax base and are a major determinant of the pre-tax income distribution. Policies to stimulate human capital acquisition, which vary greatly across countries, shape the skill distribution of workers – a crucial input into optimal income taxation models. This two-way feedback calls for a joint analysis of optimal income taxes and optimal human capital policies over the life cycle, which is the goal of this paper. The vast majority of optimal tax research assumes that productivity is exogenously determined, instead of being the product of investment decisions made throughout life. Therefore, this paper addresses the following questions. First, how, if at all, should the tax and social insurance system take into account human capital acquisition? Should human capital expenses be tax deductible? Second, what parameters are important for setting optimal human capital policies, such as subsidies, and how do optimal policies evolve over time? Finally, what combination of policy instruments implements the optimum? Can simple policies yield a level of welfare close to that achieved with complex systems?
Specifically, this paper jointly determines optimal tax and human capital policies over the life cycle, and incorporates essential characteristics of the human capital acquisition process. First, human capital pays off over long periods of time and thus returns are inherently uncertain: skills can be rendered obsolete by unpredictable changes in technology, industry shocks, or macroeconomic contractions. Yet, private markets for insurance against personal productivity shocks are limited. Second, there are important and growing financial costs to human capital acquisition, which can be deterrents to an efficient investment in human capital. Finally, individuals have heterogeneous intrinsic abilities, which may differentially affect their returns to human capital investment.2 Accordingly, in the model, each individual’s wage is a function of endogenous human capital and
1 “Our tax laws everywhere discriminate against human capital. Although the stock of such capital has become large and even though it is obvious that human capital, like other forms of reproducible capital, depreciates, becomes obsolete and entails maintenance, our tax laws are all but blind on these matters.” Schultz (1961), page 17.
2 The empirical evidence on this issue is reviewed in section 5.1.
stochastic “ability.” Ability, as in the standard Mirrlees (1971) income taxation model, is a comprehensive measure of the exogenous component of productivity. Agents have heterogeneous innate abilities, which are subject to persistent and privately uninsurable shocks. Throughout their lives, they can invest in human capital with risky returns by spending money. The government maximizes a standard social welfare function under asymmetric information about any agent’s ability – both its initial level and its evolution over life – as well as labor effort. This requires the imposition of incentive compatibility constraints in the dynamic mechanism designed by the government. To describe the distortions in the resulting constrained efficient allocations, the wedges, or implicit taxes and subsidies, are analyzed. Despite the complexity of the model, a very simple relation between the optimal human capital and labor wedges is derived. The implicit subsidy for human capital expenses is determined by three goals. The first is to counterbalance the distortions to human capital indirectly stemming from the labor and savings distortions. When these distortions are perfectly counterbalanced, the tax system is neutral with respect to human capital investments. I introduce the notion of a full dynamic risk-adjusted deductibility that ensures this neutrality. The second goal is to stimulate labor supply by increasing the wage, i.e., the returns to labor. The third is to redistribute and provide insurance, taking into account the differential effect of human capital on the pre-tax income of high and low ability people. When the percentage (or proportional) change in the wage of high ability agents from human capital is not larger than that of low ability agents, human capital has a positive redistributive effect on after-tax income and a positive insurance value.
It is then optimal to subsidize human capital expenses beyond simply ensuring a neutral tax system with respect to human capital expenses, i.e., beyond making human capital expenses fully tax deductible in a dynamic, risk-adjusted fashion. In this case, the human capital wedge also drifts up with age. The persistence of ability shocks directly translates into a persistence of the optimal human capital wedge over time. While the sign of the human capital subsidy is exclusively determined by the complementarity between human capital and ability, its magnitude is modulated by the strength of the insurance and redistributive motives and the persistence of ability shocks.
This paper considers two ways of implementing the constrained efficient allocations: Income Contingent Loans (ICLs), and a “Deferred Deductibility” scheme. For ICLs, the loan repayment schedules are contingent on the past history of earnings and human capital investments. In the Deferred Deductibility scheme, only part of current investment in human capital can be deducted from current taxable income. The remainder is deducted from future taxable incomes, to account for the risk and the nonlinearity of the tax schedule.
I calibrate the model based on U.S. data to illustrate the optimal policies under different assumptions regarding the complementarity between human capital and ability. When human capital has a positive redistributive or insurance value, the net stimulus to human capital is small and positive, and grows with age. It is not optimal to deviate much from a neutral system with respect to human capital, a type of “production efficiency” result and, hence, full dynamic risk-adjusted deductibility is close to optimal. Simple linear age-dependent human capital subsidies, as well as income and savings taxes, achieve almost the entire welfare gain from the full second-best optimum for the calibrations studied.
1.1 Related literature
The complex process of human capital acquisition has been studied in a long-standing literature, starting with Becker (1964), Ben-Porath (1967), and Heckman (1976). The model in this paper tries to adopt, in a stylized way, some of this literature’s main findings. The structural branch of the literature (Cunha, Heckman, Lochner, and Masterov, 2006; Cunha and Heckman, 2007, 2008) emphasizes that human capital acquisition occurs throughout life, underscoring the need for a life cycle model. Both ex ante heterogeneity in the returns to human capital and uncertainty matter. A large body of empirical work documents the importance of human capital as a determinant of earnings (Card, 1995; Goldin and Katz, 2008; Huggett and Kaplan, 2011), and the financial and other factors shaping individuals’ decisions to acquire human capital (Lochner and Monge-Naranjo, 2011). The subset of this literature which studies the interaction between ability and schooling for earnings – a crucial consideration for optimal policies in this paper – is reviewed in detail in section 5.1. On the other hand, the optimal taxation literature, dating back to Mirrlees (1971), and developed more recently by Saez (2001), Kocherlakota (2005), Albanesi and Sleet (2006), Golosov, Tsyvinski, and Werning (2006), Battaglini and Coate (2008), Scheuer (2014), Golosov, Troshkin, and Tsyvinski (2013), and Farhi and Werning (2013) typically assumes exogenous ability, thus abstracting from endogenous human capital investments. Therefore, this paper builds on the lifecycle framework in Farhi and Werning (2013), and introduces endogenous stochastic productivity as the result of human capital acquisition by agents. A series of papers, evolving from static to dynamic, have considered optimal taxation jointly with education policies. 
Bovenberg and Jacobs (2005), using a static taxation model, find that education subsidies and income taxes are “Siamese Twins” and should always be set equal to each other, which is equivalent to making human capital expenses fully tax deductible. A few subsequent static papers emphasize the importance of the complementarity between intrinsic ability and human capital (Maldonado (2008), with two types, Jacobs and Bovenberg (2011) with a continuum of types), or between risk and human capital (Da Costa and Maestri, 2007).
Several recent dynamic optimal tax papers examine the impact of taxation on human capital, with important differences from the current paper. Previous dynamic models allowed for heterogeneity across agents, but not uncertainty (Bohacek and Kapicka, 2008; Kapicka, 2013), or uncertainty, but not heterogeneity (Anderberg, 2009; Grochulski and Piskorski, 2010), which precludes a discussion of redistributive policies. Findeisen and Sachs (2012) include both heterogeneity and uncertainty, but focus on a one-shot investment during “college,” before the work life of the agent starts, with a one-time realization of uncertainty. By contrast, this paper features life cycle investment in human capital, through both expenses and time, and a progressive realization of uncertainty throughout life. A complementary analysis is Kapicka and Neira (2014), who posit a different human capital accumulation process with time investments and a fixed ability, and consider the case in which effort spent to acquire human capital is unobservable. Also complementary is the work by Krueger and Ludwig (2013), who adopt a Ramsey approach by specifying ex ante the instruments available to the government, in contrast to the Mirrlees approach adopted here, which considers an unrestricted direct revelation mechanism. In their overlapping generations general equilibrium model, “education” is a binary decision that occurs exclusively before entry into the labor market. The lifecycle analysis also addresses the issue of age-dependent taxation, as explored in Kremer (2002), Weinzierl (2011), and Mirrlees et al. (2011).3
The rest of the paper is organized as follows. Section 2 presents the dynamic lifecycle model and the full information benchmark. Section 3 sets up a recursive mechanism design program using the first-order approach. Section 4 solves for the optimal policies and interprets the results. Section 5 contains the numerical analysis. Section 6 discusses the implementation of the optimal policies using Income Contingent Loans (ICLs) and a Deferred Deductibility scheme. Section 7 concludes and discusses three alternative applications of the model: to intergenerational transfers and bequest taxation, to entrepreneurial taxation, and to health investments.
2 A Lifecycle Model of Human Capital Acquisition and Labor Supply
The economy consists of agents who live for T years, during which they work and acquire human capital. Agents who work $l_t \ge 0$ hours in period t at a wage rate $w_t$ earn a gross income $y_t = w_t l_t$. Each period, agents can build their stock of human capital by spending money. A monetary investment of amount $M_t(e_t)$ generates an increase in human capital $e_t \ge 0$. The cost function satisfies: $M_t'(e_t) > 0$, $\forall e_t > 0$; $M_t'(0) = 0$; $M_t''(e_t) \ge 0$, $\forall e_t \ge 0$. These monetary investments add to a stock of human capital acquired by expenses (“expenses” for short), $s_t$, which evolves according to $s_t = s_{t-1} + e_t$.4 Expenses can be thought of as the necessary material inputs into the production of human capital, such as books, tuition fees, or living and board costs while at college, net of the cost of living elsewhere. The disutility cost to an agent of supplying labor effort $l_t$ is $\phi_t(l_t)$. $\phi_t$ is strictly increasing and convex.
3 This paper is more generally related to the dynamic mechanism design literature, as developed by, among many others, Fernandes and Phelan (2000), Doepke and Townsend (2006), and Pavan, Segal, and Toikka (2014).
The wage rate $w_t$ is determined by the stock of human capital built until time t and stochastic ability $\theta_t$:
$$w_t = w_t(\theta_t, s_t)$$
$w_t$ is strictly increasing and concave in each of its arguments ($\frac{\partial w_t}{\partial m} > 0$, $\frac{\partial^2 w_t}{\partial m^2} \le 0$ for $m = \theta, s$). Importantly though, no restrictions are placed on the cross-partials. This formulation allows for human capital to affect the wage differently at different ages.5
Agents are born at time $t = 1$ with a heterogeneous earning ability $\theta_1$ with distribution $f^1(\theta_1)$. Earning ability in each period is private information, and evolves according to a Markov process with a time-varying transition function $f^t(\theta_t|\theta_{t-1})$, over a fixed support $\Theta \equiv [\underline{\theta}, \bar{\theta}]$. There are several possible interpretations for $\theta_t$, such as stochastic productivity or stochastic returns to human capital.
For example, with a separable wage form $w_t = \theta_t + h_t(s_t)$, for some increasing, concave function $h_t$, $\theta_t$ resembles a stochastic version of productivity from the static Mirrlees (1971) model. With a wage such as $w_t = \theta_t h_t(s_t)$, $\theta_t$ is perhaps more naturally interpreted as the stochastic return to human capital. To keep with the tradition in the literature, $\theta_t$ will be called “ability” throughout. Ability to earn income can be stochastic because of, among other things, health shocks, individual labor market idiosyncrasies, or luck.
The agent’s per period utility is separable in consumption and effort (both labor and training):
$$\tilde{u}_t(c_t, y_t, s_t; \theta_t) = u_t(c_t) - \phi_t\left(\frac{y_t}{w_t(\theta_t, s_t)}\right)$$
$u_t$ is increasing, twice continuously differentiable, and concave.
Denote by $\theta^t$ the history of ability shocks up to period t, by $\Theta^t$ the set of possible histories at t, and by $P(\theta^t)$ the probability of a history $\theta^t$, $P(\theta^t) \equiv f^t(\theta_t|\theta_{t-1}) \cdots f^2(\theta_2|\theta_1) f^1(\theta_1)$. An allocation $\{x_t\}_t$ specifies consumption, output, and expenses and training stocks for each period t, conditional on the history $\theta^t$, i.e., $x_t = \{x(\theta^t)\}_{\Theta^t} = \{c(\theta^t), y(\theta^t), s(\theta^t)\}_{\Theta^t}$. The expected lifetime utility from an allocation, discounted by a factor $\beta$, is given by:
4 The agent cannot wilfully destroy human capital, hence $e_t \ge 0$. The ability shock $\theta$ described right below can partially account for stochastic depreciation. Deterministic depreciation would enter as a scaling factor of the next period’s human capital stock ($s_{t+1}$) in all formulas.
5 Note that human capital yields an immediate benefit in the period in which it is acquired, as well as into the future. This reduces the uncertainty by one period and simplifies the optimal formulas below.
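$$U\left(\{c(\theta^t), y(\theta^t), s(\theta^t)\}\right) = \sum_{t=1}^{T} \int \beta^{t-1} \left[ u_t(c(\theta^t)) - \phi_t\left(\frac{y(\theta^t)}{w_t(\theta_t, s(\theta^t))}\right) \right] P(\theta^t)\, d\theta^t \qquad (1)$$
where, with some abuse of notation, $d\theta^t \equiv d\theta_t \ldots d\theta_1$.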
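As a rough numerical illustration of equation (1), the following Python sketch evaluates expected lifetime utility on a discretized version of the model. It is a sketch under stated assumptions, not the paper's computation: the two-point ability grid (so that integrals become sums), the transition probabilities, the functional forms (log utility, quadratic disutility, multiplicative wage), and the names `history_prob`, `lifetime_utility`, and `constant_alloc` are all hypothetical choices made for this example.

```python
import itertools
import math

# Hypothetical two-point ability grid; the paper uses a continuous support.
THETA = (1.0, 2.0)
F1 = {1.0: 0.5, 2.0: 0.5}                    # assumed initial distribution f^1
TRANS = {(1.0, 1.0): 0.7, (1.0, 2.0): 0.3,   # assumed Markov transition,
         (2.0, 1.0): 0.3, (2.0, 2.0): 0.7}   # taken to be time-invariant here


def wage(theta, s):
    """Assumed multiplicative wage w = theta * h(s), with h(s) = sqrt(1 + s)."""
    return theta * math.sqrt(1.0 + s)


def u(c):
    """Assumed utility of consumption (log)."""
    return math.log(c)


def phi(l):
    """Assumed disutility of labor (quadratic)."""
    return 0.5 * l ** 2


def history_prob(hist):
    """P(theta^t): initial probability times the chain of transitions."""
    p = F1[hist[0]]
    for prev, nxt in zip(hist, hist[1:]):
        p *= TRANS[(prev, nxt)]
    return p


def lifetime_utility(alloc, T, beta):
    """Discrete analogue of equation (1); alloc maps a history to (c, y, s)."""
    total = 0.0
    for t in range(1, T + 1):
        for hist in itertools.product(THETA, repeat=t):
            c, y, s = alloc(hist)
            flow = u(c) - phi(y / wage(hist[-1], s))
            total += beta ** (t - 1) * flow * history_prob(hist)
    return total


def constant_alloc(hist):
    """Same bundle after every history: c = e (so u = 1), no output, no expenses."""
    return (math.e, 0.0, 0.0)


print(lifetime_utility(constant_alloc, T=3, beta=0.9))
```

With the constant allocation the flow utility equals one in every period, so the result reduces to the discounted sum of ones, which gives a quick sanity check on the probability weights.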
Let $w_{m,t}$ denote the partial of the wage function with respect to argument m ($m \in \{\theta, s\}$), and $w_{mn,t}$ the second order partial with respect to arguments $(m, n) \in \{\theta, s\} \times \{\theta, s\}$. One crucial parameter is the Hicksian coefficient of complementarity between ability and human capital in the wage function at time t (Hicks, 1970; Samuelson, 1974), denoted by $\rho_{\theta s,t}$:
$$\rho_{\theta s} \equiv \frac{w_{\theta s}\, w}{w_s\, w_\theta} \qquad (2)$$
A positive Hicksian complementarity between human capital s and ability θ means that higher ability agents have a higher marginal benefit from human capital ($w_{\theta s} \ge 0$).6 Put differently, human capital compounds the exposure of the agent to stochastic ability and to risk. A Hicksian complementarity greater than 1 means that higher ability agents have a higher proportional benefit from human capital, i.e., the wage elasticity with respect to ability is increasing in human capital: $\frac{\partial}{\partial s}\left(\frac{\partial w}{\partial \theta}\frac{\theta}{w}\right) \ge 0$.7
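The link between a Hicksian complementarity above 1 and an elasticity that increases in human capital can be verified in one line. The derivation below is a sketch that assumes $\theta > 0$ and $w > 0$ (in addition to $w_\theta > 0$ and $w_s > 0$ from the text) and uses definition (2):

```latex
\frac{\partial}{\partial s}\left(\frac{w_\theta\,\theta}{w}\right)
  = \theta\,\frac{w_{\theta s}\,w - w_\theta\,w_s}{w^2}
  = \frac{\theta\,w_\theta\,w_s}{w^2}\left(\frac{w_{\theta s}\,w}{w_s\,w_\theta} - 1\right)
  = \frac{\theta\,w_\theta\,w_s}{w^2}\left(\rho_{\theta s} - 1\right)
```

Under these assumptions the leading factor is positive, so the sign of the derivative is the sign of $\rho_{\theta s} - 1$.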
A separable wage function of the form $w_t = \theta_t + h_t(s_t)$ for some function $h_t$ implies that $\rho_{\theta s,t} = 0$. A multiplicative form $w_t = \theta_t h_t(s_t)$, the one typically used in the taxation literature, implies that $\rho_{\theta s,t} = 1$. Finally, with a CES wage function of the form
$$w_t = \left[\alpha_{1t}\theta^{1-\rho_t} + \alpha_{2t}s_t^{1-\rho_t}\right]^{\frac{1}{1-\rho_t}} \qquad (3)$$
ability and human capital can be substituted one for the other at a fixed, but potentially time-varying rate: $\rho_{\theta s,t} = \rho_t$.
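The three cases above can be checked numerically. The Python sketch below approximates the partial derivatives in definition (2) with central finite differences and evaluates the coefficient for a separable, a multiplicative, and a CES wage function. The specific choice $h(s) = \sqrt{s}$, the CES weights, the evaluation point, the step size, and the function names are assumptions made for illustration only.

```python
def separable(theta, s):
    """Assumed separable wage: theta + h(s), with h(s) = sqrt(s)."""
    return theta + s ** 0.5


def multiplicative(theta, s):
    """Assumed multiplicative wage: theta * h(s), with h(s) = sqrt(s)."""
    return theta * s ** 0.5


def make_ces(rho, a1=0.5, a2=0.5):
    """CES wage as in equation (3); the weights a1, a2 are hypothetical."""
    def ces(theta, s):
        inner = a1 * theta ** (1 - rho) + a2 * s ** (1 - rho)
        return inner ** (1 / (1 - rho))
    return ces


def hicksian(w, theta, s, h=1e-3):
    """Finite-difference approximation of w_{theta s} * w / (w_s * w_theta)."""
    w0 = w(theta, s)
    w_theta = (w(theta + h, s) - w(theta - h, s)) / (2 * h)
    w_s = (w(theta, s + h) - w(theta, s - h)) / (2 * h)
    w_theta_s = (w(theta + h, s + h) - w(theta + h, s - h)
                 - w(theta - h, s + h) + w(theta - h, s - h)) / (4 * h * h)
    return w_theta_s * w0 / (w_s * w_theta)


for name, w in [("separable", separable),
                ("multiplicative", multiplicative),
                ("CES, rho = 0.5", make_ces(0.5)),
                ("CES, rho = 2", make_ces(2.0))]:
    print(name, round(hicksian(w, theta=1.5, s=2.0), 4))
```

Up to finite-difference error, the printed values should match the coefficients stated in the text: zero for the separable form, one for the multiplicative form, and $\rho_t$ for the CES form.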
3 The Planning Problem
Every period, the planner can observe an agent’s choices of output $y_t$, consumption $c_t$, and human capital $s_t$. The informational problem is that he cannot see ability $\theta_t$ in any period. This implies that, while the planner knows the wage function, he cannot know the wage realization $w_t(\theta_t, s_t)$, nor labor supply $l_t = y_t/w_t$, since those depend on the unobserved $\theta$. Put differently, when seeing a low output produced by an agent, he cannot know whether it was due to the agent’s low labor effort, or to a bad ability (and, hence, wage) shock.
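The informational problem can be illustrated with two hypothetical agents who look identical to the planner. In the Python sketch below, the multiplicative wage function, the parameter values, and the helper `implied_labor` are assumptions chosen for the example; the point is only that the same observed output and human capital are consistent with different combinations of ability and effort.

```python
import math


def wage(theta, s):
    """Assumed multiplicative wage w = theta * sqrt(1 + s)."""
    return theta * math.sqrt(1.0 + s)


def implied_labor(y, theta, s):
    """Labor supply l = y / w that rationalizes output y for a given ability."""
    return y / wage(theta, s)


s = 3.0  # observed human capital stock, identical for both agents

# Agent A: high ability, low effort. Agent B: low ability, high effort.
theta_a, l_a = 2.0, 1.0
theta_b, l_b = 1.0, 2.0

y_a = wage(theta_a, s) * l_a
y_b = wage(theta_b, s) * l_b

# The planner sees (y, s) only, and both agents produce the same pair.
print(y_a, y_b)
print(implied_labor(y_a, theta_a, s), implied_labor(y_b, theta_b, s))
```

Because output alone cannot separate the two agents, the planner has to rely on reported ability, which is why incentive compatibility constraints are needed.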
6 $\rho_{\theta s}$ is also the Hicksian complementarity coefficient between education and ability in earnings y.
7 Equivalently, the wage elasticity with respect to human capital is increasing in ability.
This technical section sets up the planning problem, starting from the sequential problem, and defining incentive compatibility. It then goes through two steps to make this problem tractable, following the recent procedure proposed for dynamic Mirrlees models by Farhi and Werning (2013), augmented here with human capital. First, a relaxed problem based on the first-order approach is written out, which replaces the full set of incentive compatibility constraints by the agent’s envelope condition. This relaxed program is then turned into a recursive dynamic programming problem through a suitable definition of state variables.
3.1 Incentive compatibility
To solve for the constrained efficient allocations, suppose that the planner designs a direct revelation mechanism, in which, each period, agents have to report their ability $\theta_t$. Denote a reporting strategy, specifying a reported type $r_t$ after each history, by $r = \{r_t(\theta^t)\}_{t=1}^{T}$. Let $R$ be the set of all possible reporting strategies and $r^t = \{r_1(\theta_1), \ldots, r_t(\theta^t)\}$ be the history of reports generated by reporting strategy r. Because output, savings, and human capital are observable, the planner can directly specify allocations as functions of the history of reports, according to some allocation rules $\{c(r^t), y(r^t), s(r^t)\}$.8 Let the continuation value after history $\theta^t$ under a reporting strategy r, denoted by $\omega^r(\theta^t)$, be the solution to:
$$\omega^r(\theta^t) = u_t(c(r^t(\theta^t))) - \phi_t\left(\frac{y(r^t(\theta^t))}{w_t(\theta_t, s(r^t(\theta^t)))}\right) + \beta \int \omega^r(\theta^{t+1}) f^{t+1}(\theta_{t+1}|\theta_t)\, d\theta_{t+1}$$
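The continuation value under truthful revelation, $\omega(\theta^t)$, is the unique solution to:
$$\omega(\theta^t) = u_t(c(\theta^t)) - \phi_t\left(\frac{y(\theta^t)}{w_t(\theta_t, s(\theta^t))}\right) + \beta \int \omega(\theta^{t+1}) f^{t+1}(\theta_{t+1}|\theta_t)\, d\theta_{t+1}$$
Incentive compatibility requires that truth-telling yields a weakly higher continuation utility than any reporting strategy r:
$$\text{(IC)}: \quad \omega(\theta_1) \ge \omega^r(\theta_1) \quad \forall \theta_1, \forall r \qquad (4)$$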
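To make condition (4) concrete, the following Python sketch checks incentive compatibility by brute force in a tiny discretized example. Everything specific here is an assumption for illustration: two periods, a two-point ability grid (integrals become sums), a time-invariant transition, log utility, quadratic disutility, a multiplicative wage, and the two example allocations `pooling` and `tempting`. It enumerates every reporting strategy and compares its continuation value with the truthful one.

```python
import itertools
import math

THETA = (1.0, 2.0)   # hypothetical two-point ability grid
T = 2                # hypothetical horizon
BETA = 0.95
TRANS = {(1.0, 1.0): 0.7, (1.0, 2.0): 0.3,   # assumed transition probabilities
         (2.0, 1.0): 0.3, (2.0, 2.0): 0.7}

# All ability histories of length 1..T; report histories live in the same set.
HISTORIES = [h for t in range(1, T + 1)
             for h in itertools.product(THETA, repeat=t)]
TRUTH = {h: h[-1] for h in HISTORIES}        # truthful reporting strategy


def wage(theta, s):
    return theta * math.sqrt(1.0 + s)        # assumed functional form


def reports(strategy, hist):
    """History of reports r^t generated by a strategy along a true history."""
    return tuple(strategy[hist[:k + 1]] for k in range(len(hist)))


def continuation_value(alloc, strategy, hist):
    """Discrete analogue of the recursion defining omega^r(theta^t)."""
    c, y, s = alloc[reports(strategy, hist)]
    flow = math.log(c) - 0.5 * (y / wage(hist[-1], s)) ** 2
    if len(hist) == T:
        return flow
    future = sum(TRANS[(hist[-1], nxt)]
                 * continuation_value(alloc, strategy, hist + (nxt,))
                 for nxt in THETA)
    return flow + BETA * future


def all_strategies():
    """Every mapping from true histories to reported types."""
    for choice in itertools.product(THETA, repeat=len(HISTORIES)):
        yield dict(zip(HISTORIES, choice))


def is_incentive_compatible(alloc, tol=1e-12):
    """Condition (4): truth-telling is weakly best for every initial type."""
    for theta1 in THETA:
        truthful = continuation_value(alloc, TRUTH, (theta1,))
        for r in all_strategies():
            if continuation_value(alloc, r, (theta1,)) > truthful + tol:
                return False
    return True


# Pooling: the same bundle (c, y, s) whatever is reported.
pooling = {h: (1.0, 1.0, 0.5) for h in HISTORIES}
# Tempting: a low report earns more consumption for the same output.
tempting = {h: (2.0 if h[-1] == 1.0 else 1.0, 1.0, 0.5) for h in HISTORIES}

print(is_incentive_compatible(pooling), is_incentive_compatible(tempting))
```

Brute-force enumeration is only feasible for very small examples, since the number of reporting strategies grows exponentially in the number of histories. This is one way to see why a first-order approach of the kind described in the text is attractive.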
Denote by $X^{IC}$ the set of allocations which satisfy incentive compatibility condition (4). To solve this dynamic problem, a version of the first-order approach is used, requiring the following assumptions:
Assumption 1 i) $\tilde{u}_t(c, y, s; \theta)$ and $\frac{\partial \phi_t(l)}{\partial l} \frac{\partial w_t(\theta, s)}{\partial \theta} \frac{l}{w_t}$ are bounded. ii) $\frac{\partial f^t(\theta_t|\theta_{t-1})}{\partial \theta_{t-1}}$ exists and is bounded.9 iii) $f^t(\theta_t|\theta_{t-1})$ has full support on $\Theta$.
8 Hours of work are determined residually by $l(r^t) = y(r^t)/w_t(\theta_t, s(r^t))$.
9 For some distributions, this derivative is not bounded and assumption 3 in Kapička (2013) could be used instead, namely that for $F^t(\theta_t|\theta_{t-1}) \equiv \int_{\underline{\theta}}^{\theta_t} f^t(\theta_s|\theta_{t-1})\, d\theta_s$, we have $\frac{\partial}{\partial \theta_{t-1}} F^t(\theta_t|\theta_{t-1}) \le 0$ and $F^t(\theta_t|\theta_{t-1})$ either concave or convex.
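Suppose the agent has witnessed a history of shocks $\theta^t$. Consider one particular deviation strategy $\tilde{r}^t$, under which he reports truthfully until period t ($\tilde{r}_s(\theta^s) = \theta_s$ $\forall s \le t-1$), and lies in period t by reporting $\tilde{r}_t(\theta^t) = \theta' \ne \theta_t$. The continuation utility under this strategy is the solution to:
$$\omega^{\tilde{r}}(\theta^t) = u_t(c(\theta^{t-1}, \theta')) - \phi_t\left(\frac{y(\theta^{t-1}, \theta')}{w_t(\theta_t, s(\theta^{t-1}, \theta'))}\right) + \beta \int \omega^{\tilde{r}}(\theta^{t-1}, \theta', \theta_{t+1}) f^{t+1}(\theta_{t+1}|\theta_t)\, d\theta_{t+1}$$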
Incentive compatibility in (4) implies that, after almost all $\theta^t$, the temporal incentive constraint holds:
$$\omega(\theta^t) = \max_{\theta'} \omega^{\tilde{r}}(\theta^t) \qquad (5)$$
Conversely, if (5) holds after all $\theta^{t-1}$ and for almost all $\theta_t$, then (4) also holds (see Kapička (2013), Lemma 1). If we take the derivative of promised utility with respect to (true) ability, there are two direct effects, namely on the wage (higher types have higher wages) and on the Markov transition $f^t(\theta_t|\theta_{t-1})$, and indirect effects on the allocation through the report. By the first-order condition of the agent, all indirect effects that affect the report and the allocation are jointly zero and only the two direct effects remain. This leads to the envelope condition of the agent: