Journal of Economic Literature 2016, 54(4), 1232–1287 http://dx.doi.org/10.1257/jel.20161153

Executive Compensation: A Modern Primer†

Alex Edmans and Xavier Gabaix*

This article studies traditional and modern theories of executive compensation, ­bringing them together under a simple unifying framework accessible to the general-interest­ reader. We analyze assignment models of the level of pay, and static and dynamic mor- al-hazard models of incentives, and compare their predictions to empirical findings. We make two broad points. First, traditional theories find it difficult to explain the data, suggesting that compensation results from “rent extraction” by CEOs. However, more modern “shareholder value” theories, that arguably better capture the CEO set- ting, do deliver predictions consistent with observed practices, suggesting that these practices need not be inefficient. Second, seemingly innocuous features of the modeling setup, often made for tractability or convenience, can lead to significant differences in the model’s implications and conclusions on the efficiency of observed practices. We close by highlighting apparent inefficiencies in executive compensation and additional directions for future research. ( JEL G38, M12, M48, M52)

1. Introduction contract theory, corporate finance, , labor economics, and income here is considerable debate on execu- inequality. One side is the “rent extraction” Ttive compensation in both the public view, which claims that current compensa- arena and academia. This debate spans sev- tion practices sharply contrast the predictions eral important topics in economics, such as of traditional agency models. Thus, contracts are not chosen by boards to maximize share- * Edmans: London , CEPR, and the holder value, but instead by the executives European Corporate Governance Institute. Gabaix: themselves to maximize their own rents. This Department of Economics, Harvard University, NBER, CEPR, and the European Corporate Governance Insti- perspective, espoused most prominently by tute. We thank Steven Durlauf (the editor), two anony- Bebchuk and Fried (2004), has been taken mous referees, Taylor Begley, David Dicks, Irem Erten, very seriously by both scholars and policy- Steve Kaplan, Stefan Lewellen, Robert Miller, Yuliy San- nikov, Fenghua Song, Luke Taylor, David Yermack, and makers, and led to major regulatory changes. especially Pierre Chaigneau for helpful comments, and In the United States, the Securities and Deepal Basak, Saptarshi Mukherjee, and Jan Starmans for Exchange Commission (SEC) mandated valuable research assistance. † Go to http://dx.doi.org/10.1257/jel.20161153 to visit increased disclosure of compensation in the article page and view author disclosure statement(s). 2006, and say-on-pay legislation was passed

1232 Edmans and Gabaix: Executive Compensation: A Modern Primer 1233 as part of the Dodd–Frank Act in 2010. In ­contracts. The theoretically optimal contract 2013, the European Union imposed caps on is typically highly nonlinear and so never bankers’ bonuses, the SEC mandated disclo- observed in reality; under a strict definition sure of the ratio of of optimal contracting, this view would be (CEO) pay to median employee pay, and immediately rejected. A second is bounded Switzerland held an ultimately unsuccess- rationality, which may lead to boards not ful referendum to limit CEO pay to twelve being aware of certain (potentially nonobvi- times the pay of the lowest worker. ous) performance measures that could the- The more modern “shareholder value” oretically improve the contract if included. view reaches a different conclusion. While it This article critically assesses the rent acknowledges that elementary agency mod- extraction versus shareholder value debate els are inconsistent with practice, it argues by analyzing newer models of executive com- that such models do not capture the specifics pensation and evaluating the extent to which of the CEO setting, since they were created they can explain observed practices. In partic- as frameworks for the principal–agent prob- ular, while recent theories have used different lem in general. For example, CEOs have a frameworks and focused on different dimen- very large effect on firm value compared to sions of the contracting problem, we present rank-and-file employees. Thus, in a com- a tractable unifying model to bring together petitive labor market, it may be optimal to the conclusions of this large literature, start- pay high wages to attract talented CEOs, ing with classic theories and then moving to and implement high effort from them even modern ones. In section 2, we begin by ana- though doing so requires paying a premium.1 lyzing the determinants of the level of pay, Newer models aim to capture the specifics of starting with neoclassical production mod- the CEO employment relationship, and can els of the firm and then moving to modern indeed generate predictions consistent with assignment models. These assignment mod- the data. Under this perspective, regulation els yield empirical predictions for how CEO will do more harm than good. pay varies cross-sectionally between firms of The “shareholder value” view broadens different sizes, and over time as the size of what is commonly referred to as the “optimal the average firm in the economy changes. contracting” view, which typically focuses on Having determined the level of pay, we the details of bilateral contracts. We use the then move to incentives. In section 3, we con- term “shareholder value” for two main rea- sider a static moral-hazard model where the sons. First, it emphasizes the need to take CEO takes an action that improves expected into account additional dimensions such as firm value, starting in section 3.1 with the market forces and competitive equilibrium. risk-neutral case and moving to risk aver- Second, in reality boards are unlikely to sion in section 3.2. While this setting appears choose the perfectly optimal contract, even quite standard, we will show that seemingly if they are concerned with shareholder value, innocuous features of the modeling setup, rather than rent extraction. One poten- often made for convenience or tractability tial reason is a preference for simplicity, (e.g., the choice between additive or mul- which may restrict them to piecewise linear tiplicative utility and production functions, and binary or continuous actions) can lead to significant differences in the model’s impli- 1 A simple model can justify high CEO pay simply by cations—and thus conclusions as to whether assuming a high level for the reservation utility, which is an exogenous parameter. Modern assignment models endog- observed practices are consistent with theory. enize the reservation utility. In particular, newer multiplicative models 1234 Journal of Economic Literature, Vol. LIV (December 2016) are able to explain some stylized facts (such actually consistent with efficiency, particu- as the relationship between incentives and larly when studying more modern theories. firm size) that traditional additive models However, empirical correlations cannot be cannot. We also discuss various frameworks interpreted as definitive proof of the share- that researchers can use to yield tractable holder value view, given the difficulties in solutions to the contracting problem, and the identifying causality. Section 5 highlights appropriate empirical measure of incentives. apparent inefficiencies in executive compen- Section 3.4 embeds the moral hazard model sation, as well as open questions for future into a market equilibrium to generate addi- research. Section 6 concludes. tional empirical implications, and section 3.5 This article aims to differ from existing discusses the evidence. Section 3.6 considers surveys of executive compensation. Core, the case of multiple signals. In contrast to Guay, and Larcker (2003) and Frydman and the Holmstrom (1979) informativeness prin- Jenter (2010) focus largely on the empirical ciple, in many firms the CEO’s pay depends evidence. Murphy (2013) provides a histor- on industry shocks outside his , which ical perspective and discusses the role of Bebchuk and Fried (2004) argue is strong institutional constraints. Edmans and Gabaix evidence that contracting is suboptimal. We (2009) focus exclusively on recent theories show that the theory does not unambigu- and use verbal descriptions, rather than a ously predict that industry shocks should be formal model. Our main contribution is to filtered out, due to other considerations in a study both traditional and modern contract- CEO setting that are absent from Holmstrom ing theories, with a particular attention to (1979). Section 3.7 allows the CEO to affect the role of modeling choices, and combine the as well as mean of firm value, their findings into a single unifying frame- by choosing the firm’s risk. It discusses how work. As with any survey, we are forced to options can encourage “good” risk taking, draw boundaries and so we focus on moral and -based compensation can deter hazard, rather than adverse selection, as “bad” risk shifting if the firm is levered. the former literature is more extensive. For Section 4 moves to a multi-period model. learning models of CEO contracts, where A dynamic setting poses several challenges either the CEO’s general ability or his spe- absent from standard single-period models: cific match quality with a firm is initially contracts that are initially optimal may lose unknown, we refer the reader to Harris and their incentive effect over time, the CEO can Holmstrom (1982), Gibbons and Murphy take myopic actions that boost short-term (1992), Holmstrom (1999), Hermalin and returns at the expense of long-run value, and Weisbach (1998, 2012), Taylor (2010, 2013), he may undo the contract by private saving. and Garrett and Pavan (2012). In addition to these complications, a dynamic setting provides the principal with additional 2. The Level of Pay opportunities: she can provide incentives through the threat of termination, and base Trends in the level of pay are perhaps the the CEO’s pay on returns in previous, as well most commonly cited statistic in support of as current, periods. the rent-extraction view. The median CEO Each section will compare the empir- in the S&P 500 earned $9.6 million in 2011 ical predictions of the theories with the (Murphy 2013), which is substantially higher evidence. Broadly speaking, we will argue than in other countries and represents a six- that many, but not all, features of observed fold increase since 1980. In contrast, the pay contracts that are frequently criticized are of the average worker has risen much more Edmans and Gabaix: Executive Compensation: A Modern Primer 1235 slowly. Figures from the Bureau of Labor The Lucas (1978) theory of the firm spe- Statistics show that CEO pay was 350 times cializes the model to apply to the pay of a that of the average worker in 2013, com- single CEO, rather than several managers. pared to 40 times in 1980, according to the The variable T​ ​ now refers to the CEO’s level Economic Policy Institute. Thus, the rapid of human capital (i.e., his talent), rather than increase in executive compensation may the number of managers. A CEO with talent​ have contributed significantly to the recent T​ hires capital and labor2 and maximizes rise in income inequality (Piketty and Saez 2003, Piketty 2014), and has potential polit- (1) W​ ​ max ​ ​ F ​(K, L, T)​ ​w​ ​ L rK, ​ T = − L − ical-economy implications. Bebchuk and K, L Fried (2004) argue that this increase is a where ​​wL​ ​ and ​r​ are the prices of labor and result of rent extraction by CEOs. Supporting capital, and the surplus W​​ T​ ​​​ is the CEO’s pay. this argument, Bebchuk, Cremers, and Peyer Consider the Cobb–Douglas production (2011) show that the fraction of CEO pay rel- function ative to total pay across the top-five executives ​ ​ ​ ​ ​ ​ αK αL is linked to lower firm value, profitability, and ​ T​ ​ ___K ___L (2) ​V ​T​​ α ​ ​​ ​ ​ ​ ​ ​ ​ ​​ ​ ​ ​ ​ ​ ​ , returns to acquisition announcements. We = (αK) (αL) study the extent to which rises in pay over where ​​ ​ ​, ​​ ​ ​, and ​​ ​ ​ represent the shares time can be explained by shareholder value αT αK αL models. In this section, we abstract from of output that go to the CEO, capital, and agency problems (which we later introduce labor, respectively, under perfect com- petition. We assume ​​​ ​ ​ ​ ​ ​ ​ ​ 1​ in section 3) and study the pay required to αT + αK + αL = attract the CEO to a firm. (constant returns to scale). The first-order­ condition of (1) with respect to ​K​ is ​​ ​ ​ __V ​ r​ αK K = 2.1 Talent as a Factor of Production i.e., ​​ __K ​ __V ​ ​. Optimizing over labor likewise ​ K​ ​ = r α __L __V One approach to determining the level of yields ​​ ​ ​ ​ ​ ​w​ ​ . Substituting into the pro- αL = L pay is to view the CEO as a factor of produc- duction function (2) gives tion separate from standard employees. Let ​ ​ ​ ​ ​ ​ the production function be αK αL ​ T​ ​ V V ​V ​T​​ α ​ ​​ ​ __ ​ ​ ​ ​​ ​ ___ ​ ​ ​ = ( r ) (​wL​ ​) ​V F​(K, L, T)​,​ = ​ ​ ​ 1 ​ ​ ______1 αT −α​ T ​ K​ ​ ​ L​ ​ T​​ ​ ​V​​ ​ .​ where V​ ​ is firm value and the factors of pro- = ​r​​ α ​ ​w​ Lα​ ​ duction are units of capital ​K​, number of workers L​ ​ (labor), and number of managers​ Solving for V​ ​, we have F T​. Each manager is paid a wage ​​w​ ​ __∂ ​ ​ . T = T ∂ ​ ​ ​ ​ ​ ​ 1/​ ​ ​ His pay is determined by the production (3) ​V ( ​r​​ αK​ ​wα​​ L​)​ − αT​ T; function, and changes in pay result from = shifts in technology. This is the perspective ​ ​ ​ V ​ ​ ​ V of most economic theories on the aggregate- K ____ αK ​ ; L ____αL .​ = r = ​wL​ ​ production function and supply of talent (see Goldin and Katz 2009 and Acemoglu and Autor 2012 for recent ­surveys). In ­particular, labor economists use this perspective to 2 An alternative formulation is for capital to hire the CEO and labor. In competitive markets, the identity of the compare the wages of, say, college graduates principal is immaterial. Here, we follow the Lucas (1978) versus high-school dropouts. formulation for ease of exposition. 1236 Journal of Economic Literature, Vol. LIV (December 2016)

From (3), a more talented CEO runs a talent T​ (m)​ for one period. The CEO’s talent larger firm, in part because he hires more increases firm value according to capital and labor: ​V, K, L​ are all linear in ​T​. (5) ​V S(n) CS(n)​​​ γ T(m), ​ His pay is given by = + ​ where ​C​ parametrizes the productivity of tal- (4) WT​ ​ V rK ​wL​ ​ L ​ T​ ​ V.​ ent and ​ ​ the elasticity of talent with respect = − − = α γ to firm size. If ​ ( ) 1​, the model exhibits γ = < The model generates the qualitative predic- constant (decreasing) returns to scale. tion that CEO pay, ​​WT​ ​​ ,​ is increasing in firm We now determine equilibrium wages, size, because a larger firm generates more sur- which requires us to allocate one CEO to plus. It also generates the quantitative predic- each firm. Let ​w(m)​ denote the equilibrium tion that his pay scales linearly with firm size. wage of a CEO with index m​ ​. Firm n​ ​, taking Various empirical studies confirm the quali- the market wage of CEOs as given, selects tative prediction that CEO pay is increasing­ CEO ​m​ to maximize its value net of wages: 3 in firm size : Baker, Jensen, and Murphy max ​ ​ CS (n​)​​ γ T(m) w(m).​ (1988, p. 609) call this relationship “the best m ​ − documented empirical regularity regarding levels of executive compensation.” However, The competitive equilibrium involves pos- the quantitative prediction that pay is linear itive assortative matching, i.e., m​ n​, and = in firm size is contradicted by the data. The so ​w (n) CS (n​)​​ γ T (n)​. Let _​​​ w ​​ ​ denote the ′ = ​ ′ N above papers find that CEO pay increases as a reservation wage of the least talented CEO power function of firm size ​​W​ ​ ​S​​ κ​, where (who is matched to firm ​n N​). Hence T ∼ ​ = a typical elasticity is ​ 1/ 3.​ Hence, the we obtain the classic assignment equation κ ≃ Lucas model needs to be refined. This is what (Sattinger 1993), also derived by Terviö assignment models do, to which we now turn. (2008) in the context of CEOs: 2.2 Assignment Models N (6) ​w(n) ​ ​ ​ CS(u​)​​ γ T (u) du _ w ​ ​ ​ .​ = − n ​ ′ + N Gabaix and Landier (2008) present a trac- ∫ table market-equilibrium model of CEO pay. Specific functional forms are required A continuum of firms and potential CEOs to proceed further. We assume a Pareto are matched together. Firm n​ ​[0, N ]​ has firm-size distribution with exponent​1/ ​: ∈ α a “baseline” size ​S(n)​ and CEO ​m ​[0, N ]​ ​S(n) A​n−α​​ ​. Using results from extreme ∈ = ​ has talent T​ (m)​. Low n​ ​ denotes a larger firm value theory, Gabaix and Landier (2008) and low m​ ​ a more talented CEO: S​ (n) 0​, use the following asymptotic value for the ′ < ​T (m) 0​. The value ​n​(​m​) can be thought spacings of the talent distribution: T​(n) ′ < 1 ′ of as the rank of the firm (CEO), or a num- B​n​​ β− ​​. These functional forms give = − ber proportional to it, such as its quantile of the wage in closed form, taking the limit as​ rank. n/N 0​: → We consider the problem faced by one N 1 particular firm. Att ​ 0​, it hires a CEO of (7) ​w(n) ​ ​ ​ ​A​​ γ BC ​u​​ −αγ+β− ​ du _ w ​​ ​ = = ∫n ​ + N ​Aγ​​ BC (​ )​ (​ )​ ______n​​ − αγ−β ​ ​N​​ − αγ−β ​​ 3 See, e.g., Roberts (1956), Cosh (1975), Baker, Jensen, = ​ [ ​ − ] and Murphy (1988), Barro and Barro (1990), Kostiuk α γ − β (1990), Rosen (1992), Joskow, Rose, and Shepard (1993), ​Aγ​​ BC (​ )​ _w ​​ ​ ______​ n​​ − αγ−β ​ .​ Rose and Shepard (1997), and Frydman and Saks (2010). + N ∼ ​ ​ α γ − β Edmans and Gabaix: Executive Compensation: A Modern Primer 1237

To interpret equation (7), we consider a is ­stronger when industry conditions are reference firm, for instance the median firm ­favorable, as talented CEOs are not only in the universe of the top 500 firms. Denote paid a greater premium, but also optimally its index n​​ ​ ​​​, and its size S​ (​n​ ​) A​n​ −α​ ​. Using ​ grow their firms to a larger size. ∗ ∗ = ∗ ​ S(n) A​n−α​​ ​, we derive: In addition, equation (8) shows that pay = ​ increases with the size of the average firm in the economy S​(​n​ ​​)​. Since a CEO’s tal- ​Aγ​​ BC (​ )​ ∗ ​w(n) ______​ n​​ − α γ−β ​ = ​ ​ ent can be applied to the entire firm, when αγ − β firms are larger, the dollar benefits from a (​ )​ ​Aγ​​ BC 1/ 1/ − αγ−β more talented CEO are higher, so there is ______​ ​A​​ α S (n​)−​​ α ​ ​ = ​ ​ (( ​ ​)​)​ more competition for talent. This is a sim- α γ − β ilar “superstars” effect to Rosen (1992). / ​Aβ​​ α BC / Moreover, the model’s closed form solutions ______S (n​)​​ γ−β α = ​ ​ ​ yield quantitative predictions. Average firm α γ − β / size increased sixfold between 1980 and ​​(S(​n​ ​) n​ α​ )​ β α BC / ______∗ ∗ ​​ ​ S (n)​​​ γ−β α 2011. When both S​(​n​ ​)​ and ​S(n)​ rise by a = ​ ​ ∗ α γ − β factor of 6, CEO pay should rise by a fac- tor of ​6 ​[ / ​( / )]​ ​ 6 6​, as ​n​ β​ BC / / × β α + γ − β α = γ = ______∗ ​ S (​n​ ​)​​ β α S (n​)​​ γ−β α .​ has been the case. The relevant benchmark = ​ ∗ ​ ​ α γ − β against which to compare the level of CEO pay is not the pay of the average worker, or Finally, we obtain CEO pay in closed form: pay of CEOs in the past, but the CEO’s cur- rent contribution to the firm. This in turn / / (8) ​w(n) D(​n​ ​)S​(​n​ ​)​​ β α S(n​)​​ γ−β α​, depends on variables such as firm size and = ∗ ∗ ​ ​ talent; while the latter is difficult to measure, where ​D(​n​ ​) C​n​ ​ T (​n​ ​)/(​ )​ is a the pay of the average worker is unlikely to ∗ = − ∗ ′ ∗ α γ − β constant independent of firm size. Similar to be a determinant. Thus, assignment models Lucas (1978), equation (8) yields the qual- suggest that regulation to mandate disclosure itative cross-sectional prediction that CEO of the ratio of CEO pay to median employee pay is increasing in firm size. However, the pay may not be useful, as median employee intuition is different: here the prediction pay is not the relevant benchmark. arises because large firms hire the most tal- However, the empirical evidence is not ented CEOs, who command the highest unambiguously in favor of assignment mod- wages. Moreover, equation (8) yields a dif- els. Nagel (2010) raises sample selection ferent quantitative prediction. It predicts a concerns and suggests alternative method- pay–firm size elasticity of ​ / ​. ologies, but Gabaix, Landier, and Sauvagnat ρ = γ − β α Gabaix and Landier (2008) calibrate using​ (2014) conclude that the results are robust 1​ (a Zipf’s law, as in Axtell 2001 and to these changes. While Gabaix and Landier α = Gabaix 1999) and ​ 1​ (constant returns to (2008) can fully explain the growth in CEO γ = scale). Since there is no clear a priori value pay from 1980 to the present, Frydman and for ​ ​, they set ​ 2 / 3​ to yield the empirical Saks (2010) find that median CEO pay was β β = pay–size elasticity of ​ 1 / 3,​ which con- relatively constant between the 1940s and ρ = trasts Lucas’s (1978) prediction of ​ 1​. early 1970s, despite firm size increasing over ρ = Baranchuk, MacDonald, and Yang (2011) this period. Gabaix and Landier (2008) dis- extend the model to endogenize firm size cuss potential explanations for this apparent and show that the pay–size relationship discrepancy. One is that the supply of talent 1238 Journal of Economic Literature, Vol. LIV (December 2016) greatly increased, which creates a downward capital.4 Writing and calibrating such a pressure on CEO wages; quantifying the model would be valuable. impact of increased supply on wages would Rather than using firm size as a proxy be a useful direction for future research. for talent, other authors have attempted to Another is that, in the early period, CEOs ­measure talent directly. Chang, Dasgupta, tended to be internally promoted rather than and Hilary (2010) infer talent from the mar- externally hired, similiar to the Japanese ket reaction to CEO departure, which they CEO market today. find is positively related to pay. To address In addition, observing that both firm size concerns that the market reaction cap- and CEO pay have trended upwards since tures perception, rather than true ability, 1980 does not imply causality. Even if causal, they show that it is negatively correlated to the positive correlation between pay and firm ­predeparture performance. Falato, Li, and size cannot be interpreted as definitive evi- Milbourn (2015) directly measure ability dence for assignment models, since it is also using a CEO’s reputational, career, and edu- potentially explainable by an as-yet unwritten cational credentials, which they corroborate rent-extraction model. For example, large by showing a positive association with future firms may have more resources, allowing the firm performance. They find that such cre- CEO to extract more . Alternatively, dentials are positively related to pay, consis- Dicks (2012) shows that the correlation can tent with talent-based models. arise if poor governance causes a small frac- 2.3 Alternative Explanations for High CEO tion of firms to overpay for talent, which then Pay forces all others to overpay as well, in order to remain competitive. This channel is also Moving to other explanations for high CEO predicted by Gabaix and Landier (2008); see pay, section 3 discusses how agency prob- Bereskin and Cicero (2013) for supportive lems may lead to the CEO being paid a pre- evidence. While these alternative explana- mium for the disutility of effort and the risk tions would generate the qualitative predic- imposed by incentives. Other papers point to tion that pay is positively correlated with firm the changing nature of the employment rela- size and average firm size, it is not yet clear tionship. Hermalin (2005) argues that tighter whether they can generate empirically con- corporate governance increases both the sistent quantitative predictions. level of effort that the CEO must exert and Another concern is that assignment mod- the risk of dismissal, and so managers demand els predict a reassignment of CEOs as rel- greater pay as a compensating differential. ative firm size changes. While external Indeed, Peters and Wagner (2014) show that poaching of CEOs has increased in recent CEO turnover risk is significantly positively years, it is still relatively rare: Cremers and associated with pay, but reject an entrench- Grinstein (2014) find that 63 percent of new ment model in which powerful CEOs enjoy CEOs are insiders. Similarly, it does not both lower turnover risk and high pay. Their appear to be the case that large changes in firm size (e.g., a firm undertaking a large acquisition) are accompanied by changes 4 In addition, Cremers and Grinstein (2014) find the relationship between pay and firm size differs little across in the acquiring CEO for non-disciplinary industries, according to the proportion of outsider CEOs reasons. This low mobility can be generated in the industry. They interpret this proportion as a poten- by a simple extension of assignment mod- tial measure of competition for CEO talent. However, this proportion may reflect small frictions that cause a firm to els to incorporate frictions, such as a cost choose an insider CEO on the margin, but are not large of firing the CEO or firm-specific human enough to meaningfully affect equilibrium pay. Edmans and Gabaix: Executive Compensation: A Modern Primer 1239 identification strategy focuses on industry on twenty leveraged buyouts.­ Due to data volatility, which (after controls) is unlikely limitations, their inferences are based on to affect CEO pay other than through turn- differences in means without controls, but over risk. In Garicano and Rossi-Hansberg they show that these changes do not occur (2006), CEOs specialize in knowledge acqui- in control firms that experienced similar sition and problem solving, leaving routine increases in leverage—i.e., it is likely con- production tasks to lower-level employees. centrated ownership, rather than leverage, Recent increases in communication tech- that explains the results. More generally, nologies (e.g., e-mail) allow the CEO to Kaplan (2012) reports that over the past specialize more in skilled tasks, thus increas- three decades, executive pay in closely held ing his pay. Frydman (2014) and Murphy firms has outpaced that in public companies. and Zabojnik (2007) argue that the increas- In addition to the value-maximization ing importance of transferable, rather than and rent-extraction views, a third perspec- firm-specific, human capital increases pay tive is that institutional constraints or prac- through expanding CEOs’ outside options. tices may have contributed to the rise of Moreover, despite the significant relation- pay. Murphy (2013) discusses the role of ship between pay and factors such as firm tax policy, accounting rules, and disclosure size, risk, and (as we will discuss in section 3) requirements: for example, the Clinton performance, Graham, Li, and Qiu (2012) Administration’s $1 million salary cap led to find that a large component is explained by many firms increasing salary to $1 million. manager fixed effects. Inclusion of these Shue and Townsend (forthcoming a) note fixed effects changes the coefficient esti- that firms tend to grant the same number of mates on other determinants; for example, options each year. Thus, when prices the firm-size coefficient falls by 40 percent. rise, the value of options increases which, These fixed effects could be a proxy for tal- together with downward rigidity in ent (consistent with their effect on the firm and bonuses, led to overall pay levels ­rising in size coefficient), for the manager’s ability to the 1990s and early 2000s. While this friction extract rent, or other factors such as manage- is indeed significant in the short run, its effect rial preferences, risk aversion, or personality. on long-run outcomes is less clear—similar Thus, a significant proportion in the variance to economics more broadly, where pricing of firm pay remains unexplained. frictions (e.g., menu costs) are important in The above explanations—talent, agency, the short run but less so in the long run. and the changing nature of the CEO’s job— are part of the shareholder value view. To 3. Static Incentives test this view more broadly, Cronqvist and Fahlenbrach (2013) study the effect on com- We now turn from determining the level pensation of firms transitioning from pub- of pay to the CEO’s incentives. This sec- lic to private ownership. Since the private tion considers a single-period moral haz- equity sponsor’s concentrated stake gives it ard model, which we extend to multiple the incentives and control rights to set pay periods in section 4. This setting has been optimally, compensation in private firms widely covered in textbooks (e.g., Bolton and should be closer to the efficient benchmark. Dewatripont 2005, Tirole 2006) and earlier Salary and bonuses actually rise upon going surveys (Prendergast 1999), but typically private, inconsistent with the notion that with additive production functions and pref- CEOs of public firms are overpaid. They erences, and often a binary effort level. We issue the caveat that their results are based will show that multiplicative specifications,­ 1240 Journal of Economic Literature, Vol. LIV (December 2016) which may be particularly relevant for a CEO value is independent of initial firm size. setting, lead to quite different conclusions for This specification is appropriate for a perk- the best empirical measure of incentives and consumption decision, if the amount of perks how incentives should vary cross-sectionally that can be consumed is independent of firm between firms. We will also show that the use size. For example, buying a $10 million cor- of a continuous versus binary action space, as porate jet reduces firm value by $10 million, well as the specification of noise before ver- regardless of S​ ​. Another is b​ (​S)​ bS​, which = sus after the action, also lead to significant yields V​ ​(a)​ S(​1 ba)​ ​: a multiplica- = + + ε differences in the model’s results. tive production function where the effect of We start with a standard principal–agent firm value is linear in firm size. Many CEO problem applied to an executive compensa- actions can be “rolled out” across the entire tion setting. The principal (board of direc- firm and thus have a greater effect in a larger tors on behalf of shareholders) hires an agent company, such as a change in strategy or a (CEO) to run the firm. The production func- program to improve production efficiency. A tion is given by ​V​(a, S, )​​, which is increasing multiplicative specification is also appropri- ε in ​a​ and ​S​. We specialize this to ate for a rent-extraction setting, if there are greater resources to extract in a larger firm.6 (9) ​V S b(​S)​ a .​ The agent is paid a wage c​​(V)​ contin- = + + ε gent upon firm value. (Note that ​c​ refers to We consider an all-equity firm for simplic- actual pay, in contrast to w​ ​ which refers to ity and discuss leverage in section 3.7. The the expected wage). We always assume lim- variable a​ 0, )​ is an action taken by the ited liability on the principal (​c(​V)​ V​): ∈ [ ∞ ≤ agent that improves expected firm value but she cannot pay out more than total firm is personally costly. Examples include effort value. In some versions of the model, we (low ​a​ represents shirking), project choice will also assume limited liability on the agent (low ​a​ involves selecting value-destructive (​c​(V)​ 0​). He has reservation utility of ​ ≥ projects that maximize private benefits), w 0​ and his objective function is given by ≥ or rent extraction (low a​​ reflects cash flow diversion.) We typically refer to ​a​ as “effort” (10) ​E U E u(v(c) g(a)) .​ [ ] = [ − ] for brevity. The variable ​​ is mean-zero ε _ noise, with interval support on ​​(_​ , ​ )​, where The function ​g​ represents the cost of effort, ε​ ε ​ the bounds may be infinite.5 Shortly after the which is increasing and weakly convex.​ agent takes his action, noise is realized, and Further, u​ is the utility function and ​v​ is the then final firm valueV ​ ​ is realized. Firm value felicity7 function, which denotes the agent’s is observable and contractible, but neither utility from cash; both are increasing and effort nor noise is individually observable. weakly concave. The functions ​g​, ​ u​, and ​v​ The function b​​(S)​​ measures the effect are all twice continuously differentiable. of effort on firm value for a firm of size ​S​. The objective function (10) contains func- One possibility is ​b(​S)​ b​, which yields tions for both utility and felicity to ­maximize = ​V​(a)​ S ba ​: an additive production = + + ε function where the effect of effort on firm 6 See Bennedsen, Perez-Gonzalez, and Wolfenzon (2010) for empirical evidence that CEOs have the same percentage effect on firm value regardless of firm size. 5 For simplicity, we assume that ​S​ is sufficiently large, 7 We note that the term “felicity” is typically used to or the probability of low ​​ is sufficiently small, thatV ​ ​ is denote one-period utility in an intertemporal model. We ε nonnegative almost surely, so we do not need to complicate use it in a nonstandard manner here to distinguish it from the model with nonnegativity constraints. the utility function ​u​. Edmans and Gabaix: Executive Compensation: A Modern Primer 1241 generality. One common assumption is ​ The principal is assumed to be ­risk neu- v(c) c​ so that E​ U E u(c g(a)) ​, tral, since shareholders are typically well-­ = [ ] = [ − ] in which case the cost of effort is pecuni- diversified. Her program is given by: ary, i.e., can be expressed as a subtraction to cash pay. This is appropriate if effort (11) max ​ ​ E V(a) c(V(a)) s.t. c( ), a [ − ] involves a financial expenditure or the · opportunity cost of foregoing an alterna- (12) E u(v(c(V(a))) g(a)) w, tive income-generating activity. Another is [ − ] ≥ ​u(x) x​, which yields ​E v(c) g(a) ​, where = [ − ] (13) a arg max ​ ​ E u(v(c(V(​aˆ ​))) g(​aˆ ​)) .​ the cost of effort is separable from the bene- ∈ [ − ] ​aˆ ​ fits of cash. This specification is reasonable if effort involves disutility, or foregoing leisure She chooses the effort level a​​ and contract ​ or private benefits. c​(V)​9 to maximize (11), expected firm value Both of the above specifications represent minus the expected wage, subject to the additive preferences. Effort of a​ ​ reduces the agent’s individual rationality or participation agent’s utility by g​ (a)​ in utils (dollars) in the constraint (IR, (12)) and incentive compati- first (second) specification. A third speci- bility constraint (IC, (13)). fication isv ​(c) ln c​, in which case (10) We begin with a first-best benchmark, = g(a) which leads to a simple optimal contract that becomes ​E​[u(​ ln ​(c​e​​ − )​ )​ ]​ ​. This specifi- cation corresponds to multiplicative prefer- is the same across all firms and thus does not have the potential to explain observed ences, where the cost of effort is increasing contracts. We then explore two departures in ​c​. Here, private benefits are a normal from the first-best that generate a meaning- good: the utility they provide is increasing in ful contract that does yield empirical predic- consumption, consistent with the treatment tions. The first is limited liability, which only of most goods and services in consumer the- leads to small variations in the optimal con- ory. This specification is also plausible under tract across firms. The second is risk aversion the ­literal interpretation of effort as forego- (section 3.3), which leads to much richer ing leisure: a day of vacation is more valuable implications. Section 3.4 embeds the con- to a richer CEO, as he has more wealth to tracting problem in a market equilibrium to enjoy during it. Thus, the CEO’s expendi- generate additional empirical implications. ture on leisure and private benefits rises in We compare all implications to the data in proportion to his wealth. Multiplicative pref- section 3.5. erences are also commonly used in macro- Under the first-best, effort is observable. economic models (e.g., Cooley and Prescott Let ​​a∗​​ ​ be the effort level that the principal 1995) to generate realistic income effects. In ​ wishes to implement. She can simply direct particular, they are necessary for labor sup- the agent to exert effort ​​a∗​​ ​, and so we can ply to be constant over time as the hourly ​ ignore the IC (13). It is easy to show that the wage rises.8

9 Here, we focus on deterministic contracts, so that 8 When the hourly wage rises, working becomes pref- there is a one-to-one mapping between firm value​V ​ and erable to leisure (the substitution effect). With multiplica- compensation c​​. An even more general model allows for tive preferences, the rise in the wage increases the agent’s stochastic contracts, where firm value of​V ​ leads to a ran- labor endowment income and thus demand for leisure (the dom amount c​ ​. Gjesdal (1982), Arnott and Stiglitz (1988), income effect), which exactly offsets the substitution effect. and Edmans and Gabaix (2011b) derive sufficient condi- With additive preferences, there is no income effect, and tions for random contracts to be suboptimal, allowing the so leisure falls to zero as the wage increases. focus on deterministic contracts. 1242 Journal of Economic Literature, Vol. LIV (December 2016)

_ agent is given a constant wage c​(​V)​ c ​, ­taking into account the cost of the contract ​ = as this leads to efficient risk sharing. The IR c(V)​ needed to implement each action a​​∗​​ ​. _ ​ (12) yields ​​ c w g(​a∗​​ )​. This will bind in Starting with the first stage, the first-order ≥ + ​ the optimal contract, and so the principal condition of the agent’s effort choice (17) is maximizes given by

(18) ​E c (V)b(S) g (​a∗​​ )​. (14) ​E V(​a∗​​ ) g(a​ ∗​​ ) w.​ [ ′ ] = ′ ​ [ ​] − ​ − Rogerson (1985), Jewitt (1988), and Carroll This defines the first-best effort level as (2012) give conditions under which the first-order condition is sufficient, and so the IC (17) can be replaced by the first-order (15) ​g ​​a​ ∗ ​ b(S).​ ′( FB ​)​ = condition (18), which greatly simplifies the problem. Throughout this paper, we assume The principal trades off the marginal that these conditions are satisfied, so that the increase in firm value from effort,​b (S)​, first-order approach is valid. with the agent’s marginal cost, g​ ​​a​ ∗ ​ ​. Thus, ​​ Given risk neutrality and unlimited liabil- ′( FB ​)​ a​ FB∗ ​​ maximizes total surplus. In turn, ​​a​ FB∗ ​​ is ity, there is no loss of generality in focusing on decreasing in the convexity of the cost of a linear contract of the form c​ (V) V​, = ϕ + θ effort. It is also increasing in firm size​S ​ if ​ where ​ ​ is the fixed wage and​ ​ is the agent’s ϕ θ b(S)​ is increasing in ​S​, since effort then has a percentage stake in firm value. Then, using greater dollar effect in a larger firm. (18), in order to implement effort of ​​a∗​​ ​, the ​ We now turn to a setting in which effort CEO’s incentives are given by is unobservable and the IC (13) must be imposed. We first assume a risk-neutral (19) ​b(S) g (​a​​ ∗)​. θ = ′ ​ agent, before moving to risk aversion. Empiricists typically measure the CEO’s 3.1 Risk-Neutral Agent incentives to improve firm value, i.e., to exert We first consider risk neutrality and addi- effort ​​a​​ ∗​. Equation (19) shows how the opti- tive preferences. We have u​(x) x​ and mal measure of incentives depends on how = ​v(c) c​ so the participation and incentive we specify the production function. When = constraints (12) and (13) specialize to it is additive (b​(S) b​), then to imple- = ment a given effort level ​​a∗​​ ​, the firm must ​ (16) ​E c(V) g(a) w set correctly the incentive measure ​​, the [ ] − ≥ θ agent’s percentage stake in firm valueV ​ ​. This and measure corresponds to the dollar change in pay for a one dollar change in firm value

(17) a arg max ​ ​ E c(V) g(​aˆ ​).​ ($–$ incentives) and is used by Demsetz and ∈ [ ] − ​aˆ ​ Lehn (1985) and Jensen and Murphy (1990), among others. Grossman and Hart (1983) show that the Hall and Liebman (1998) argue that most contracting problem can be solved in two CEO actions have a multiplicative effect on stages, which correspond to the principal’s firm value. With a multiplicative production two choice variables. She first chooses the function (​b(S) bS​), we have ​ bS​​ g (​a​​ ∗)​, = θ = ′ ​ contract ​c(V)​ that implements a given action​​ and so the relevant incentive measure is ​S​, θ a∗​​ ​ at least cost, and then the optimal ​​a∗​​ ​ the CEO’s dollar equity stake. This measure ​ ​ Edmans and Gabaix: Executive Compensation: A Modern Primer 1243 corresponds to the dollar change in pay for The IR is given by E​​ c a ​a∗​​ w​, [ | = ​]​ = a one percentage point change in firm value which yields ($–% incentives). Thus, while it is common to assume an ​w ​ c a ​a∗​​ E​ V a ​a∗​​ S.​ = [ | = ​]​= ϕ + θ [ | = ​]​= ϕ + θ additive production function for simplicity, researchers should think carefully about how to specify these functions as this choice If the CEO exerts effort ​a​, his utility is has important implications for the relevant g(a) measure of incentives—a point first noted ​E U(a) E[​c(a)​e​​ − ]​ ​ by Baker and Hall (2004). Moreover, if CEO [ ] = effort has a multiplicative effect on firm g(a) ( EV(a))​e​​ − ​ value, then CEO incentives are a quantita- = ϕ + θ tively much more important issue than his g(a) level of pay, even though the latter receives ​( S(​1 ba))​​e​​ − ​ much greater attention in the media. While a = ϕ + θ + $9.6 million salary is substantial compared to g(a) ​(w Sba)​e​​ − ​ average worker pay, relative to a $10 billion = + θ firm it constitutes 0.1 percent of firm value. Sb g(a) In contrast, if incentives are insufficient w​ 1 ____ a ​e​​ − ​ = ( + θw ) to induce the CEO to implement a major restructuring or reject a bad acquisition, the ln​ 1 ____Sb a ​ g(a) w ​e​​ ( + ​ θw )− ​. losses to shareholders could run into several = percentage points. Before moving to the second stage of The IC is ​​a∗​​ arg ma​x​ ​ E U(a) ​. At ​​a∗​​ 0​, ​ ∈ a [ ] ​ = Grossman and Hart (1983), we demonstrate this yields ​E[U (0)] 0​, i.e., ′ = the effect of multiplicative preferences, as studied by Edmans, Gabaix, and Landier S g (​a∗​​ ) (20) ___θ ​ ______′ ​​ ​ . (2009). In the general objective function (10), w = b this corresponds to ​u(x) ​e​​ x​ and ​v(c) = ln c​, which yields Thus, to implement a given effort = level ​​a​​ ⁎​,​ the firm must set correctly the __S g(a) incentive measure ​​ θw ​ ​ , i.e., the CEO’s dol- ​U E​ c ​e​​ − ​ ​.​ = [ ] lar equity stake scaled by his annual pay, or alternatively the fraction of total pay w​ ​ We normalize ​​a∗​​ 0​, and so the ​t 0​ that is in equity. It corresponds to the per- ​ = = stock price (net of CEO pay) is ​S​.10 In (9) we centage change in pay for a 1 percentage have ​b(S) bS​, i.e., a multiplicative pro- point change in firm value (%–% incen- = duction function, so that firm value at ​t 1​ tives, i.e., the elasticity of pay to firm value), = is given by as used by Murphy (1985), Gibbons and Murphy (1992), and Rosen (1992). Using ​​ ​​ I​, ​​ ​​ II​, and ​​ ​​ III​, respectively, to ​V(a) S(1 ba) .​ θ θ θ = + + ε denote %–%, $–$, and $–% incentives, we have:

I c 1 ln Pay 10 For simplicity, we assume that initial firm sizeS ​ ​ is net (21) ​​ ​ ___∂ ​ __ ​ ______Δ ​ ; θ = r w = ln Firm Value of the expected wage ​w​. ∂ Δ 1244 Journal of Economic Literature, Vol. LIV (December 2016)

II c 1 $Pay (2005) overturn Bebchuk and Fried’s (2004) (22) ​​​ ​ ___∂ ​ __ ​ ______Δ ​ ; θ = r S = $Firm Value conclusion that CEOs have weak incen- ∂ Δ tives when studying wealth– rather than III c $Pay pay–performance sensitivities. Section 3.4 (23) ​​​ ​ ___∂ ​ ______Δ ​ ​ , θ = r = ln Firm Value predicts how the three incentive measures ∂ Δ scale with firm size and section 3.5 will test where ​r V/S 1​ is the firm’s stock these predictions. These tests will shed light = − market return. In our one-period model, on whether utility and production functions incentives arise from new grants of stock are additive or multiplicative, and thus the and options, plus changes in cash pay (sal- optimal measure of incentives. ary and bonuses). Thus, these incentive We now solve for the second stage of measures are referred to as “pay–perfor­ - Grossman and Hart (1983), i.e., the opti- mance sensitivity.” In reality, CEOs are in mal effort level, returning to the case of office for multiple periods, and the vast additive preferences. If the agent exhibits majority of incentives stem from changes unlimited liability, the principal can always in the value of previously granted stock adjust fixed pay ​ ​ so that the participation ϕ and options, which swamp changes in cash constraint (16) binds. Thus, his expected pay pay (Jensen and Murphy 1990, Hall and is ​E c(V) w g(​a∗​​ )​, just as in the first- [ ] = + ​ Liebman 1998). Replacing flow compen- best, and so the principal’s objective function sation ​c​ in the numerator of expressions remains (14). As a result, she implements the (21) to (23) with the CEO’s wealth ​W​ yields first-best effort level, defined by (15). Using analogous expressions for “wealth–perfor- (15) and (18), the optimal contract satisfies mance sensitivity,” the change in the CEO’s entire wealth (including previously granted (27) ​E c (V)b(S) b(S)​. [ ′ ] = stock and options) for a change in firm performance: With a linear contract, this yields ​ 1​ and θ = so the optimal contract is given by I W 1 ln Wealth (24) ​​ ​ ____∂ ​ __ ​ ______Δ ​ ; Θ = r w = ln Firm Value (28) ​c(V) V, where​ ∂ Δ = ϕ +

II W 1 $Wealth (25) ​​​ ​ ____∂ ​ __ ​ ______Δ ​ ; (29) ​ w g(​a∗​​ ) S b(S) ​a∗​​ ​. Θ = r S = $Firm Value ϕ = + ​ − − ​ ∂ Δ The principal effectively “sells” the firmV ​ ​ III W $Wealth (26) ​​​ ​ ____∂ ​ ______Δ ​ .​ to the agent for an up-front fee of ​ ​, Θ = r = ln Firm Value −ϕ ∂ Δ chosen so that the participation con- straint (16) binds. Since the agent benefits I W 1 For example, ​​ ​​ ​ ___ ∂ ​ __ ​​ is the percent- one-for-one from any increase in firm value, Θ = r w age change in wealth for ∂a 1 percentage point he fully internalizes the benefits of effort change in the stock return, scaled by annual and the first-best effort level ​​a​ FB∗ ​​ is achieved. pay, which Edmans, Gabaix, and Landier The level of incentives is “one size fits all”: (2009) call “scaled wealth–performance sen- regardless of the cost or utility function, we sitivity.” Empirical studies should consider have ​ 1​. θ = wealth–performance sensitivities, rather In the above framework, the effort level than pay–performance sensitivities, since the ​​a​ FB∗ ​​ is chosen endogenously and so the latter only capture a small part of the CEO’s principal implements whatever effort level incentives. Indeed, Core, Guay, and Thomas is implied by ​ 1​. One simple way to θ = Edmans and Gabaix: Executive Compensation: A Modern Primer 1245 obtain meaningful contracts that do differ to buy the firm. Second, he may be risk averse across firms is to _consider a binary effort and demand a premium for ­bearing the risk decision, ​a ​_​ a ​, a ​, where the princi- associated with firm valueV​ .​ We explore ∈ { _ } pal implements ​​ a ​, as in Holmstrom and these two departures in turn and show that Tirole (1997); Edmans, Gabaix, and Landier they both lead to nontrivial contracts. (2009); Biais et al. (2010); and the textbook 3.2 Limited Liability by Tirole (2006). A similar specification is a continuous_ but bounded action space, Innes (1990) studies the case of limited ​a ​ _​ a ​, a ​, where again the principal wishes liability and risk neutrality. The optimal con- ∈ [ ] _ to implement ​​ a ​. The upper bound reflects tract is no longer linear and so we return the fact that there may be a limit to the num- to a general contract ​c(V)​. He considers ber of actions that a CEO can take_ to increase two versions of the model. In the first, the firm value. The high effort level​​ a ​ represents only restriction on the contract is limited full productive efficiency, rather than work- liability on both the principal and agent, ​ ing twenty-four hours a day. In a cash flow 0 c(V) V.​ To keep the proof simple, ≤ ≤ diversion model, full productive efficiency we normalize both w​ ​and g​ (0)​ to ​0​, although corresponds to zero stealing; in a project- these assumptions are not necessary. Denote selection model, it corresponds to taking all by f​ (V, a ) ​the_ probability density function positive net present value (NPV) projects of ​V ​0, V ​ ​ conditional on effort ​a​ and ∈ [ ] while rejecting negative NPV ones; in an assume that it satisfies the monotone likeli- effort model, it corresponds to the CEO not hood ratio property (MLRP), i.e., deliberately refraining from an action that ​f​ ​ (V, a) will improve firm value because he prefers ​​ ______a f (V, a) to shirk. Then, from equations (19) and (20),_ the optimal incentive level is ​b(S) g (​ a ) ​ b(S) θ _= ′ if utility is additive and ​​ ____θ g (​ a ) if util- is strictly increasing in ​V​. w = ′ ity is multiplicative.11 Thus, the optimal level The principal’s problem is given by of incentives ($–$, $–%, or %–%, depending _ ​ V ​ on the model specification)_ is increasing in max ​ ​ ​ ​ (V c(V)) f (V, ​a∗​​ ) dV s.t. c( ) 0 − ​ the cost of effort g​ (​ a ).​ Incentives are higher · ∫ ′ _ in firms with greater agency problems, rather ​ V ​ (30) ​​ ​ c(V) f (V, ​a∗​​ ) dV g(​a∗​​ ) w than one size fits all. 0 ​ − ​ ≥ The first-best is still achieved in the ∫ _ fixed-action setting. In reality, the first-best ​ V ​ (31) ​​ ​ c(V) ​f​ ​ (V, ​a∗​​ ) dV g (​a∗​​ ) cannot be achieved for two reasons. First, ∫0 a ​ = ′ ​ the agent may be subject to limited liability (c​ (V) 0).​ Under contract (28), the agent (32) 0 c(V) V.​ ≥ ≤ ≤ will receive a negative payoff if V​ ​ is suffi- Note that for all contracts c​( )​ satisfying ciently low, violating limited liability. Put dif- · ferently, the agent may not have enough cash the IC (31), we have _ ​ V ​ ​​​ ​ c(V) f (V, ​a∗​​ ) dV g(​a∗​​ ) 0 ​ − ​ 11 When ​a​ is a boundary action, the IC becomes an ∫ _ ​ V ​ inequality_ and a continuum of contracts will implement ​ ​ ​ ​ c(V) f (V, 0) dV g(0) a a ​. We choose the contract that involves the minimum ≥ 0 − = ∫ _ amount of incentives, as this is optimal for any nonzero ​ V ​ level of risk aversion, and so the IC continues to bind. ​ ​ ​ c(V) f (V, 0) dV 0,​ = ∫0 ≥ 1246 Journal of Economic Literature, Vol. LIV (December 2016) where the first inequality arises becausea ​​∗​​ ​ it exceeds a threshold ​​Xˆ ​, and zero otherwise. ​ maximizes the agent’s utility if the IC (31) The intuition is that the tails of the distribu- is satisfied, and the final inequality is due to tion are most informative about whether the the agent’s limited liability, i.e., ​c(V) 0​. agent has exerted effort. Thus, the optimal ≥ Thus, the IC (31) implies the IR (30) and so contract punishes the agent as much as possi- we can ignore the latter. We thus have the ble for left-tail realizations of V​ ​, and rewards following Lagrangian: him as much as possible for right-tail realiza- _ tions of V​ ​. With limited liability on the agent, ​ V ​ ​L ​ ​ ​ (V c(V)) f (V, ​a∗​​ ) dV he can receive no less than ​0 ​ for low outputs; = ∫0 − ​ with limited liability on the principal, she can _ pay no more than the entire firm valueV ​ ​ for ​ V ​ ​ ​ c(V) ​f​ ​ (V, ​a∗​​ ) dV g (​a∗​​ ) ​,​ high outputs. + λ(​ ​ 0 a ​ − ′ ​) ∫ A potentially unrealistic feature of con- which can be rewritten as tract (34) is that it is discontinuous: when ​V​ rises from X​​ˆ ​ ​ to ​​Xˆ ​, the principal’s payoff _ − ε ​ V ​ falls from X​​ˆ ​ ​ to 0​ ​. Thus, the principal may ​L ​ ​ ​ c(V) f (V, ​a∗​​ ) − ε = ∫0 ​ wish to exercise her control rights on the firm to “burn” output from X​​ˆ ​ to ​​Xˆ ​ ​ to increase ​fa​ ​ (V, ​a∗​​ ) − ε ​ 1 ​ ______​ ​ ​ dV her payoff. Alternatively, since the agent’s × − + λ f (V, ​a∗​​ ) ( ​ ) pay rises more than one-for-one around this _ ​ V ​ threshold, he may wish to borrow on his own ​ ​ ​ Vf (V, ​a∗​​ ) dV g (​a∗​​ )​. account to increase output from ​​Xˆ ​ ​ to ​​Xˆ​ + ∫0 ​ − λ ′ ​ − ε because the gain in his payoff will exceed the Pointwise optimization with respect to ​c(V)​, amount he must repay. subject to the limited liability constraint (32), To deter both actions, the second version yields the following contract: of the Innes (1990) model also assumes a monotonicity constraint: the principal’s pay- ⎧ ______​fa​ ​ (V, ​a∗​​ ) __1 off cannot fall with firm value V(​ c(V)​ is 0 if ​ − ⎪ f (V, ​a∗​​ ) < ​ nondecreasing in ​V​), so the agent’s pay can- (33) ​c​(V)​ ​⎨ ​ ​ λ​ ​ . = ​​ not increase more than one-for-one­ with firm ​fa​ ​ (V, ​a∗​​ ) 1 ⎪V if ______​ __ value. Following similar steps to above, the ⎩ f (V, ​a∗​​ ) ≥ ​ ​ λ optimal contract is very similar except that at the new cutoff X​​ ​ ​Xˆ ​, the contract jumps Due to MLRP, f​​a​ ​ (V, ​a∗​​ )/ f (V, ​a∗​​ )​ is strictly ̃ < ​ ​ˆ from ​0​ not to V​ ​, but only to V​ ​X ​, since this increasing. Thus, there exists an X​​ ​ such that − ̃ is the highest payoff that does not violate the 0 if V ​Xˆ ​ monotonicity constraint. This yields the fol- (34) ​c(​V)​ ​ ​ < ​​​​ , lowing contract: = {V if V ​Xˆ ​ ≥ ˆ 0 if V ​X ​ where ​​X ​ is the largest X​ ​ that satisfies the IC (35) ​c​(V)​ ​ ​ < ̃ ​​ ​ ​, (31), which can be rewritten = {V ​X ​ if V ​X ​ − ̃ ≥ ̃ _ ​ V ​ ​ ​ V ​fa​ ​ (V, ​a∗​​ ) dV g (​a∗​​ ).​ where ​​X ​ is again the largest ​X​ that satisfies X ​ = ′ ​ ̃ ∫ the IC (31), which can be rewritten Contract (34) is a “live-or-die” contract: _ ​ V ​ the agent receives the entire firm value ​V​ if ​ ​ (V X)​ ​f​ ​ ​(V, ​a∗​​ ) dV g (​a∗​​ ).​ ∫X − a ​​ = ′ ​ Edmans and Gabaix: Executive Compensation: A Modern Primer 1247

Contract (35) is a standard call , convex, and thus whether they should com- where the agent receives zero if V​ ​ falls below prise stock or options. a threshold ​X​, and the residual ​V ​X ​ other- − ̃ 3.3 Risk-Averse Agent wise. The intuition is similar to contract (33): for low output ​​ V ​X ​ ​, the agent receives Returning to the case of unlimited liability, ( < ̃) the lowest possible payoff (​0​); for high out- another route to a meaningful contract is to put ​​ V ​X ​ ​, he gains one-for-one with have a risk-averse agent. Under the gener- ( > ̃) any increase in ​V​, which is the maximum al-utility function (10), and returning to gen- possible gain without violating monotonicity. eral (rather than linear) contracts, the agent’s Contract (35) implies not only that the first-order condition is given by CEO should be paid exclusively with options, but also that his wealth–performance (36) E​ u ( )(v (c)c (V)b(S) g (​a∗​​ )) ​ 0.​ [ ′ · ′ ′ − ′ ​ ] = sensitivity is ​1​ ​​ for V ​X ​ ​—i.e., he gains ( > ̃) ­dollar-for-dollar with any increase in firm Even assuming a given implemented action​​ 12 value above ​​X ​. Thus, the only source of a∗​​ ​, the contracting problem remains dif- ̃ ​ variation between CEOs is the X​​ ​. ficult because equation (36) only requires ̃ It is easy to show that, when the marginal the contract to satisfy the agent’s incentive cost of effort c​ (​a∗​​ )​ rises, the strike price falls constraint on average. Even in the simplest ′ ​ in order to increase the delta of the option case in which u​​ is linear, the agent’s aver- and maintain the agent’s incentives. Hence, age expected marginal benefit from effort, even though the optimal contract is no lon- ​E v (c)c (V)b(S) ​, must equal the (known) [ ′ ′ ] ger trivial, this model does not capture much marginal cost of effort, g​ (​a∗​​ )​. There are many ′ ​ of the cross-sectional variation in real-life potential contracts that will satisfy the incen- CEO contracts. tive constraint on average, and so the prob- In reality, while CEOs are often paid with lem is complex because the principal must options in practice, they also receive salary, solve for the one contract out of this contin- bonuses, and stock. Moreover, they often uum that minimizes the expected wage. (The have very few shares compared to the num- problem is more complex if u​ ​ is nonlinear.) ber of shares outstanding, meaning that they 3.3.1 Holmstrom–Milgrom Framework gain far less than dollar-for-dollar­ with any increase in firm value. This wealth–perfor- Holmstrom and Milgrom (1987, hereafter mance sensitivity differs widely across firms, cited as HM) showed that the contracting which the above model does not capture. We problem becomes substantially simpler if now incorporate risk aversion, which leads four assumptions are made. First, the agent x to the optimal sensitivity being below 1 and exhibits exponential utility, so u​ (​x)​ ​e−η​​ ​, = − differing across CEOs. In addition to the where ​​ is the coefficient of absolute risk η strength of incentives, these models will also aversion. Second, the cost of effort is pecu- derive predictions for the optimal shape of niary, so v​(c) c​. Third, the noise ​​ is = ε contracts—whether they should be linear or Normal, i.e., ​ N​ 0, ​ ​​ 2​ ​. Fourth, they ε ∼ ( σ ) consider a multi-period model where the agent chooses his effort every instant in con- tinuous time. Under these assumptions, HM 12 If there are ​x​ existing shares outstanding and the CEO is given options on ​y​ shares, his share of firm value show that the optimal contract is linear, i.e., ​ y is ​​ _____ ​ ​ if he exercises all his options. Thus, strictly speak- c V​, and that the problem is equiv- x y = ϕ + θ + ing, he must be given infinite options to obtain a ­wealth– alent to a single-period static problem. The performance sensitivity of ​1​. intuition is that a linear contract subjects the 1248 Journal of Economic Literature, Vol. LIV (December 2016) agent to a constant incentive pressure irre- level of incentives is (see Appendix A for a spective of the history of past performance. full proof): This result suggests that incentives should be implemented purely with stock, and not non- ______1 (42) ​ 2 ​​ . linear instruments such as options. θ = 1 g ​​ ​ ___σ ​ ​ ​ + η b(S ) If we also assume a quadratic cost of effort ( ) (​g(a) 1 ​ g​a​​ 2​​) for simplicity, the principal’s Optimal incentives ​​ are a trade-off = _2 θ problem becomes between two forces. A sharper contract b(S) increases effort a​ ____θ ​ ​ and thus firm = g (37) max ​ ​ E V c value, but also increases disutility ​​ _1 ​ g ​a​ 2​ and ​a∗​​ ​, , [ − ] 2 ϕ θ the risk premium ​​ _η ​​ 2​ ​ ​​ 2​. Thus, ​ ​ is decreas- 2 2 θ σ 2 θ c __1 ​ g​a∗​​ ​ ​ w ing in risk aversion ​ ​ and risk ​​ ​​ ​ ​ as these aug- (38) s.t. E​ ​e​​ −η[​ −​ 2 ]​ ​ ​e−η​​ ​ η σ [− ] ≥ − ment the risk premium required. The effect of the cost of effort g​ ​ is more nuanced. On c __1 ​ g​​aˆ ​​​ 2​ ​ (39) ​a∗​​ ​ arg max ​ ​ E​ ​ e​​ −η[​ −​ 2 ]​ ​. ∈ [− ] the one hand, fixing​​a ∗​​ ​, the required incen- ​aˆ ​ ​a∗​​ g ​ tives are ​ ___​ ​​ and increasing in ​g​. On the Substituting for c​ V​ and V​ S θ = b(S) = ϕ + θ = + other hand, when effort is costlier to imple- b(S)a ​, the agent’s objective function sim- ment (g​ ​ is higher), the optimal effort level a​​ ∗​​ ​ + ε ​ plifies to is lower. The second effect dominates: when effort is costlier, an increase in ​​ leads to a cˆ ​(a) θ (40) ​ ​e​​ −η ​ ​, smaller rise in effort, and so the optimal ​​ − θ falls. (Since the benefit of effortb ​ ​( ) has the where ​​cˆ ​(a) (S b(S)a) _ 1 ​ g ​a​​ 2​ · = ϕ + θ + − 2 − opposite effect of the cost of effort ​g​, we dis- ​ _η ​​ 2​ ​ ​​ 2​​ is his utility from the contract. It com- 2 θ σ cuss only the latter throughout.) prises the expected wage ​ (S b(S)a)​, To find fixed pay ​ ​, we set the participation ϕ + 2θ + ϕ minus the cost of effort ​​ _1 ​ g ​a​​ ​, minus the constraint to bind (​​cˆ ​(a) w​). This yields 2 = risk premium ​​ _η ​​ 2​ ​ ​​ 2​​ that the agent requires. 2 θ σ ​( b(S))​​ 2​ This risk premium is increasing in the agent’s ​ w S __ 1 ______θ ​ __η ​​ 2​ ​ ​​ 2​ .​ risk aversion ​​, risk ​​ ​​ 2​​, and incentives ​​. ϕ = − θ − 2 g + 2 θ σ η σ θ From (40), the agent’s first-order condition The comparative statics for ​ ​ are ambig- is given by ϕ uous (see Appendix A). On the one hand, a higher cost of effort ​g​, higher risk aversion ​ b(S) (41) a∗​​ _____θ ​ ​ . ​, and higher risk ​​ ​​ 2​​ increase the required ​ = g η σ fixed pay ​ ​ as a compensating differential ϕ His effort choice is independent of risk ​​ ​​ 2​ (i.e., to ensure the IR remains satisfied). On σ and risk aversion ​ ​, since noise is additive. It the other hand, these changes also reduce η is also independent of the fixed wage​ ​, since the optimal level of incentives (from (42)), ϕ exponential utility removes wealth effects. which lowers the risk premium. Thus, ​ ​ can be adjusted to satisfy the agent’s The HM framework is attractive for a ϕ participation constraint without affecting his number of reasons. First, it derives (rather incentives. than assumes) a linear contract as being Plugging (41) into the principal’s objec- optimal. Second, it solves for not only the tive function (37) and setting the partici- optimal contract to implement a given effort pation constraint (38) to bind, the optimal level, but also the optimal effort level, i.e., Edmans and Gabaix: Executive Compensation: A Modern Primer 1249

_ both stages of Grossman and Hart (1983). in the target effort ​​ a ​, but independent of ​ ​ η Solving for the optimal effort level is valu- and ​​ ​​ 2​​, since the contract is not determined σ able not so much because empiricists test by any trade-off with these parameters. the model’s predictions for the effort level Thus, if the fixed-action model accurately (which is hard to observe), but, as will be represents reality, it has the attractive prac- made clear shortly, endogenizing the effort tical ­implication that the contract does not level leads to different predictions for the depend on the agent’s risk aversion, which contract (which is observed). Third, the fixed is typically hard to observe. It thus offers salary ​​ does not affect the agent’s effort a potential explanation for why real-world ϕ choice. Thus, changes in reservation util- contracts do not seem as complicated and ity can be simply met by varying ​ ​, without contingent on as many details of the envi- ϕ changing incentives. ronment as standard contract theories would However, HM stressed that a number of suggest. For example, section 3.5 shows that assumptions were necessary for their lin- there is no systematic relationship between earity result: exponential utility, a pecuniary incentives and risk; the textbook by Bolton cost of effort, normal noise, and continuous and Dewatripont (2005, p. 158) notes that time. Hellwig and Schmidt (2002) show “what is surprising is the relative simplicity of that linearity continues to hold in discrete observed managerial compensation packages time under two additional assumptions: the given the complexity of the incentive prob- principal does not observe the time path lem.” These details do not matter because of profits­ (only the total profit in the final the contract is_ determined by the need to period); and the agent can destroy profits induce effort ​​ a ​, rather than a trade-off with before he reports them to the principal. In risk. In addition, we now have unambigu- Appendix B we discuss the role played by the ous predictions for how increases in risk ​​ ​​ 2​ σ first three assumptions. and risk aversion ​​ affect the level of pay. η The HM model has proven extremely influ- There is now only the direct effect, that pay ential due to its tractability. Given the bene- rises as a compensating differential, but no fits of tractability, researchers have attempted indirect effect because these parameters do to achieve tractability in other settings. We not affect the optimal effort level. explore these alternative models here. Whether the endogenous or fixed-action model is more realistic depends on the set- 3.3.2 Fixed-Target Action ting. In many cases, the endogenous action b(S) In HM, the effort level a​ ____θ ​ ​ is case is more accurate as principals choose = g chosen endogenously. As described in sec- to implement less-than-full effort to save tion 3.1, an alternative specification_ is for the on wages. For example, a factory boss may action space to be bounded above by ​​ a and the_ only require a production operative to work principal to implement a fixed target action ​​ a_ ​. an eight-hour day, to avoid paying overtime. The optimal contract is now ​b(S) g​ a ​, However, for CEOs, a fixed action may be θ = which leads to very different empirical impli- more appropriate. Edmans and Gabaix cations. Now, the level of incentives ​b(S)​ (2011b) show that, if CEO effort has a mul- _θ arises from the desire to induce effort ​​ a ​, and tiplicative effect on firm value,_ implement- not any trade-off with disutility or risk. Thus, ing full productive efficiency ​​ a ​is optimal only the first effect of ​g​ exists—a higher cost if the firm is large enough. (The result also of effort_ raises the incentives required to holds for any increasing function ​b(S)​.) induce ​​ a ​—and so incentives are increasing The benefits of effort are a function of in ​g​, in contrast to HM. It is also increasing firm size; the cost of effort (a higher wage 1250 Journal of Economic Literature, Vol. LIV (December 2016) to compensate­ for risk and disutility) is a specify the noise ​ ​ as being realized before, ε function of the CEO’s reservation wage w​ ​. rather than after, the action a​ ​ is taken. This Thus, if ​S​ is sufficiently large compared to timing is also featured in models in which the ​w​, the benefits of effort dominate the trade- agent observes total cash flow before decid- off and it is optimal to induce full productive ing how much to divert (e.g., Lacker and ­efficiency regardless of ​g​, ​ ​, or ​​ ​​ 2​. For exam- Weinberg 1989; Biais et al. 2007; DeMarzo η σ ple, in a $10_ billion firm, if_ implementing and Fishman 2007) and in which he observes effort level ​​ a rather than ​​ a ​reduces firm the “state of nature” before choosing effort − ξ value by only 0.1 percent, this translates into (Harris and Raviv 1979; Sappington 1983; $10 million. If the CEO’s salary is $10 mil- Baker 1992; and Prendergast 2002). Note lion, then even if salary can be reduced by that this timing assumption does not render 50 percent_ by allowing the_ CEO to exert the CEO immune to risk, because noise is only ​​ a ​, implementing ​​ a ​remains opti- unknown when he signs his contract. EG − ξ mal. Indeed, the structural estimation of also show that the contract retains the same Margiotta and Miller (2000) finds that the form in continuous time, where noise and costs of inducing effort are substantially less effort occur simultaneously. This consistency than the benefits. The fixed-action model suggests that, if underlying reality is contin- more likely applies to CEOs than rank-and- uous time, it is best approximated in discrete file employees, who have a limited effect on time by modeling noise before effort. firm value. The timing assumption allows for signif- The overall point that we would like to icant tractability. Since the noise is known stress is not that one model is superior to the when the agent takes his action, we can other. Different models apply to different remove the expectation from his objective scenarios. Rather, we wish to highlight how function (10) to yield a contracting model’s empirical implica- tions hinge critically on the assumptions—­ (43) ​u​ v(c(S b(S)a )) g(a) ​.​ ( + + ε − ) whether we specify multiplicative versus additive production or preference func- In turn, ​u( )​ also drops out. The specific · tions, or a fixed versus continuous imple- form of u​ ​ is irrelevant—since it is monotonic, mented action. Sometimes, researchers it is maximized by maximizing its argument. may assume a binary action space or addi- This yields the first-order condition: tive functions out of convenience, but this modeling choice can lead to vastly different (44) ​v (​ c(​S b(S)​a∗​​ )) predictions. ′ + ​+ ε ​ ​

3.3.3 Noise before Action c (S b(S)​a∗​​ )b(S) g (​a∗​​ )​. × ′ + ​+ ε = ′ ​ The framework of Edmans and Gabaix (2011b, cited as EG hereafter) provides This first-order condition must hold for another way to obtain tractable con- every possible ​​, i.e., state-by-state, rather ε tracts, without the need to assume expo- than simply on average. This pins down the nential utility, a pecuniary cost of effort, slope of the contract: for all ​ ​, the agent must ε or normal noise. It considers the imple- receive a marginal felicity of g​ (​a∗​​ )​ for a mar- ′ ​ mentation of a given effort level a​​∗​​ ​, ginal increase in ​V​. Thus, for all ​V​, the con- ​ i.e., the first stage of Grossman and Hart tract must satisfy (1983), and thus is particularly_ applicable to CEOs where ​​a∗​​ a ​if the firm is large. EG ​v (c(V)) c ​(V)​ b(S) g (​a∗​​ ).​ ​ = ′ ′ = ′ ​ Edmans and Gabaix: Executive Compensation: A Modern Primer 1251

Integrating over ​0​ to V​ ​, we obtain in felicity from effort remains ​v (c)c (V)b(S) g (​a∗​​ )​, ′ ′ = ′ ​ units, and incentives are preserved regardless of ​w​ or ​ ​. Allowing for convex contracts removes g (​a∗​​ ) ε ​v​(c​(V)​)​ _____′ ​ V k,​ the need to assume a pecuniary cost of = + b(S) effort. In contrast, if ​v​ is convex, the contract for an integration constant k​ ​. This yields, in is concave. dollar terms, The contract is linear in two special cases. The first is a pecuniary cost of effort, as in HM: 1 g (​a∗​​ ) g (​a∗​​ ) (45) ​c​(V)​ ​v​​ − ​ ​ ​ _____′ ​ V k ​. when v​ (c) c​, we have c​ (V) ___ ′ ​ V k​. = ( b(S) + ) = = b(S) + With an additive production function (​ 0​), γ = The constant k​ ​ is chosen to make the partic- the CEO’s dollar incentives are linear in the ipation constraint bind, i.e., firm’s dollar valueV ​ ​; with a multiplicative production function (​ 1​), they are lin- γ = g (​a​∗) ear in the firm’s percentage return__ ​​ V ​ ​. The (46) ​E​ u​ ​ _____′ ​ ​ V k g(​a∗​​ ) ​ ​ w.​ S [ ( b(S) + − ​)] = former result echoes Lacker and Weinberg (1989), who also feature a pecuniary cost of There is a unique optimal contract. The effort and an additive production function. slope is chosen so that the incentive con- They show that the optimal contract to deter_ straint (44) holds state by state, and the all cash flow diversion (the analogy ofa ​​ ∗​​ a ​) ​= scalar k​​ is chosen so that the participation is piecewise linear. The second case is ​ constraint binds. v(c) ln c​, i.e., multiplicative preferences. = Equation (45) shows that the optimal g (​a∗​​ ) The contract is ​ln c(V) ____′ ​ V k​, and so contract is typically nonlinear. Even though = b(S) + the noise is known when the agent takes his log pay is linear in ​V ​ __V ​ ​ with an additive action, it is not irrelevant because it has the (S) potential to undo the agent’s incentives. If​ (multiplicative) production function. In both ​ is high, ​V​ and thus ​c​ will already be high; cases, the framework delivers linear con- ε a high reservation wage ​w​ increases the tracts without requiring exponential utility or required constant k​​ and thus c​​, and so has Normal noise. More broadly, the framework the same effect. If the agent exhibits dimin- allows for contracts that are convex and con- ishing marginal felicity (i.e., ​v​ is concave), cave, rather than purely linear as in HM— he has lower incentives to exert effort. Put thus, tractability can be achieved without differently, the agent does not face risk (as ​ linearity—and shows what determines the ​ is known) but distortion (as ​​ affects his optimal curvature or linearity of the contract: ε ε effort incentives). HM assume that the cost the form of ​v( )​. · of effort is in financial terms so that, like Equation (45) also clarifies the parame- the benefit of effort, it also declines with ​​, ters that do and do not matter for the con- ε and so incentives are unchanged with a lin- tract’s functional form. It depends only ear contract. EG instead address distortion on the felicity function ​v​ and the cost of by the shape of the contract: it is convex, via effort g​ ​. The functional form is independent 1 the ​​v​​ − ​​ transformation. If noise is high, the of the utility function ​u​, the reservation utility ​ contract gives a greater number of dollars for w​, and the distribution of the noise ​ ​, i.e., the ε each incremental unit of firm value c(​ (V)​), contract can be written without reference to ′ to offset the lower marginal felicity of each these parameters. These parameters will still dollar (​v (c)​). Therefore, the marginal felicity affect the contract’s slope via their impact on ′ 1252 Journal of Economic Literature, Vol. LIV (December 2016) the scalar k​ ​. However, the contract’s slope as here the version calibrated in Dittmann and well as its functional form are independent Maug (2007). End-of-period firm value is of ​u​, ​w​, and ​​ in the cases of ​v(c) c​ and given by ε = ​v(c) ln c​, where it depends only on ​g ​(a​​ ∗)​. = ′ ​ 2 This “detail independence” contrasts with V​ ​ ​V​ ​ (a) exp​ ​ R ___​ ​​ ​ ​ ​T ​ __T ​ ​,​ standard agency models where the contract T = 0 [( − σ2 ) + εσ√ ] depends on many specific features of the setting.13 where ​ N(0, 1)​ and ​​V​ ​ satisfies ​​V​​ 0​ ε ∼ 0 ′0 ​ > The above framework allows for tractable and ​​V​​ 0​. They assume that a contract ″0​ < contracts with fewer restrictions on the util- is composed of salary ​​, ​ ​ shares, and ​ ​ ϕ θ ψ ity function, cost of effort, and noise distri- options (as a fraction of shares outstand- bution, as well as nonlinear contracts. This ing), with strike price X​ ​ and maturity T​ ​.14 tractability allows it to be used in dynamic Both shares and options are paid out at the models with private saving (Edmans et al. end of the period; salary ​ ​ is paid out at the ϕ 2012) and assignment models with moral start. The CEO begins with non-firm wealth​​ hazard under risk aversion (Edmans and W0​ ​​​, which is invested at the risk-free rate ​R​. Gabaix 2011a). However, it has a number His ­end-of-period wealth is then given by of disadvantages. It requires the assumption that noise precedes the action; while applica- ​​c​ ​ ( ​W​ ​)​e​​ RT​ ​V​ ​ T = ϕ + 0 + θ T ble in some settings (e.g., a cash flow diver- sion model), it may not apply to others. It max​ ​VT​ ​ X, 0 .​ also assumes a fixed implemented actiona ​​ ∗​​ ​, + ψ { − } ​ which again may only apply in some settings In the general-utility function (10), we have​ (e.g., a CEO of a large firm). The goal of 1 ​c​ ​ −ζ u​(x)​ x​ and v​(​c​ ​) ____ T ​​, where ​​ is the this article is to provide a range of modeling T 1 = = − ζ ​ ζ frameworks, each of which may be applica- parameter of relative risk aversion. Thus, the ble under different conditions. CEO’s preferences are given by

3.3.4 Constant Relative Risk Aversion 1 ​c​ ​ −ζ and Lognormal Firm Value ​U(​c​ ​ , a) _____ T ​ g(a)​. T = 1 ​ − We have so far considered two models − ζ that yield tractable contracts at the cost of Assuming risk-neutral pricing, the CEO’s some assumptions. Other papers do not aim end-of-period pay (change in wealth) is given to achieve a tractable analytical solution, but by instead to calibrate the optimal contract, and so use fewer assumptions. Perhaps the ​ ​ ​e​​ RT​ ​V​ ​ max ​V​ ​ X, 0 ,​ πT = ϕ + θ T + ψ { T − } most commonly used framework for calibra- tion involves constant relative risk aversion with expected present value (CRRA) utility and lognormal firm value, RT studied by Lambert, Larcker, and Verrecchia ​ ​ E e​​ − ​ ​ ​ ​ ​ ​V​ ​ BS,​ π0 = ​[​ πT] = ϕ + θ 0 + ψ (1991); Hall and Murphy (2000, 2002); Hall and Knox (2004); and others. We present where ​BS​ denotes the Black–Scholes value of the option. They solve for the first stage

13 Chassang (2013) derives contracts that are relatively independent of the environment (i.e., the probability 14 In an additional analysis, Dittmann and Maug (2007) space), in a risk-neutral setting. also solve for the optimal unrestricted contract. Edmans and Gabaix: Executive Compensation: A Modern Primer 1253 of Grossman and Hart (1983), in which the an option be expensed, thus leading to an principal wishes to ­implement action ​​a∗​​ ​, accounting charge for at-the-money options. ​ and so her problem is given by However, Dittmann, Maug, and Spalt (2010) show that options can be rationalized if the

min​ 0​ ​V0​ ​ BS CEO is loss averse: since options provide , , ​ π​ = ϕ + θ​ + ψ ϕ θ ψ downside protection, they are particularly valuable to a loss-averse agent. Moreover, s.t. E U(​​W​ ​, a∗​​ ) U_ , [ T ​]​ ≥ as we will discuss in section 3.7, if the agent chooses firm risk in addition to effort, a∗​​ ​ arg max ​ E U(​​W​ ​, a) , = ​ [ T ] a 0, options may be useful to induce him to take ∈[ ∞) value-adding risky projects. W​ ​ 0, 0 1, 0. ϕ + 0 ≥ ≤ θ ≤ ψ ≥ 3.4 Incentives in Market Equilibrium Dittmann and Maug (2007) calibrate Section 3 has thus far taken the reserva- this model to a sample of 598 US CEOs. tion wage w​ ​ as given. We now endogenize w​ ​ In particular, they study whether it is more using the assignment model of Gabaix and efficient to incentivize the CEO with stock Landier (2008) to study how CEO incen- or options. Since options are riskier, $1 of tives vary across firms in market equilibrium. options is worth less to the CEO than $1 We use the Edmans, Gabaix, and Landier of stock, rendering them less effective in (2009) framework of a risk-neutral CEO, meeting the CEO’s participation constraint. multiplicative preferences, and a fixed_ target On the other hand, $1 of options provides action, as in section 3.1, with a​​∗​​ a ​. We ​ = greater incentives than $1 of stock, ren- will show that even this simple model leads dering them more effective in meeting his to predictions consistent with empirical find- incentive constraint. They find that the ings. (Edmans and Gabaix 2011a extend the first effect is dominant, suggesting that the model to risk aversion.) w optimal contract should involve only stock From (20), we have ​ ___Λ ​ ​ where ​ θ = S and not options. This prediction is shared g (​a∗​​ )​. The fixed salary ​​ is chosen Λ = ′ ​ ϕ with Holmstrom and Milgrom (1987), who so that the IR binds, i.e., ​ w S ϕ = − θ predict linear contracts, although in a dif- w(​1 )​​. Thus, the CEO in firm ​n​ is = − Λ ferent setting. Moreover, when they drop given a fixed salary ​​ ​​, and ​​ ​​ ​S​ ​ worth of ϕn θn n the restriction that the contract must be shares, with ­piecewise linear (i.e., consist of salary, stock, and options), they find that the optimal non- (47) ​​ ​S​ ​ w(n) , θn n = Λ linear contract is concave. In contrast to both frameworks, option (48) ​​​ w(n)(​1 )​, ϕn = − Λ compensation is widespread in the United States. One interpretation, consistent with where ​w(n)​ is given by equation (8) from Bebchuk and Fried (2004), is that the use of Gabaix and Landier (2008). Thus, a fraction ​ ​ Λ options indicates rent extraction: since at-the- of the equilibrium wage is paid in equity, and money options did not have to be expensed the remainder is paid in cash. until 2006, they constitute “stealth compen- We can now solve for the three incentive sation” not noticed by shareholders. Indeed, measures in equations (21)–(23) in terms of Hayes, Lemmon, and Qiu (2012) found that model primitives: the use of options fell ­substantially after FAS 123R mandated that the economic value of (49) ​​ I​ ​S​​ 0​ θ = Λ ∝ 1254 Journal of Economic Literature, Vol. LIV (December 2016)

II __w 1 1/ 3​. Larger firms hire more talented (50) ​​​ ​ ​ ​S​​ ρ− ​ ρ = θ = Λ S ∝ CEOs who command higher wages. Since III the benefits of shirking are higher, given (51) ​​​ ​ w ​S​​ ρ​. θ = Λ ∝ ​ multiplicative preferences, a higher dollar equity stake is needed to induce effort. In Equation (20) earlier suggested that, in section 3.5, we will compare these predic- a multiplicative model, the optimal incen- tions to the data. tive measure is ​​​​ I​ (%–% incentives), since Turning to the strength of incentives, θ it affects the implemented effort level. the $–$ incentives measured by Jensen and Equations (49)–(51) illustrate a related Murphy (1990) are given by ​​​​ II​ ​​​ I​ __ w ​ ​ . θ = θ S advantage: in a multiplicative model, ​​​​ I​ is Since firm sizeS ​ ​ is substantially larger than θ independent of firm size and thus compara- the CEO’s wage ​w​, $–$ incentives should be ble across firms of different size. Intuitively, low. Because firms are so large, the dollar since effort has a percentage effect on both benefits of effort are much greater than the firm value and CEO utility, it is %–% incen- disutility cost to the CEO, and so only a small tives that are relevant. Comparability across equity stake is needed to induce effort. firms of different size is useful to study which Another strand of research justifies low firms are incentivized more or less than their $–$ incentives by pointing out the disad- peers. For example, a passive investor who vantages of strong incentives. Lambert, believes that incentives are not fully priced Larcker, and Verrecchia (1991) show that in the market may wish to invest in a stock a high equity stake may induce the CEO with high CEO incentives; an activist inves- to take inefficiently low risk. Benmelech, tor may wish to target a firm with low incen- Kandel, and Veronesi (2010) assume that tives. However, if the CEO of a large firm equity incentives vest in the short term, has $2 million of equity and the CEO of a since long-term­ incentives expose the CEO smaller firm has $1 million of equity, we to risks outside his control. Then, the CEO cannot immediately conclude which CEO may conceal information that his investment is better incentivized, as dollar equity hold- opportunities have declined to keep the cur- ings should optimally increase with firm rent stock price high, even though disclosing size. Relatedly, comparability is valuable for such information will allow him to efficiently boards or compensation consultants under- disinvest. In a similar vein, Peng and Roell taking benchmarking analyses.15 (2008, 2014) and Goldman and Slezak (2006) While %–% incentives should be inde- demonstrate that high-powered incentives pendent of size, with ​ / 1/ 3​ that vest in the short term can encourage the ρ = γ − β α = as in Gabaix and Landier (2008), $–$ incen- manager to expend firm resources to manipu- tives should have a firm-size elasticity of​ late the stock price upwards. However, these 1 2 / 3​. If effort has a multiplicative disadvantages can potentially be avoided by ρ − = − effect on firm value, it has a higher dollar granting equity with long-vesting horizons. effect in a larger firm, and so a lower equity 3.5 Empirical Analyses stake is needed to induce effort. In addition, $–% incentives should have an elasticity of ​ We now turn to tests of the empirical predictions of these models. The first set of tests study the level of incentives. Motivated 15 By analogy, fund managers are compared according by traditional additive models, Jensen and to their risk-adjusted percentage returns, rather than dollar returns, as the former is comparable across funds of differ- Murphy (1990) estimated $–$ incentives ent size (assuming constant returns to scale). and showed that the CEO loses only $3.25 Edmans and Gabaix: Executive Compensation: A Modern Primer 1255 for every $1,000 loss in firm value, an effec- how the dollar equity stake ​ S​ varies with g​ ​, θ tive equity stake of only 0.325 percent. They ​ ​, and ​​ ​​ 2​​, studying the percentage equity η σ interpreted this stake as too low to be rec- stake ​ ​ will not be a precise test of the model θ onciled with optimal contracting, and thus as these parameters may vary with firm sizeS ​ ​. concluded that CEOs are “paid like bureau- In addition, equation (42) implies that crats.” However, such a conclusion hinges with a multiplicative production function critically on whether we believe CEO effort (​b(S) S​), the relevant measure of risk is ​​ __σ ​ ​ , = S has additive or multiplicative effects on firm the volatility of the firm’s percentage returns; value and CEO utility. $–$ incentives are the with an additive production function (​ 0​) γ = relevant measure only in an additive model. it is ​​, the volatility of the firm’s dollar σ As discussed above, in a multiplicative returns. model, %–% incentives are relevant and $–$ Starting with size, Jensen and Murphy incentives should optimally be low. In Hall (1990) find that $–$ incentives are even and Liebman (1998), $–% incentives are lower in large firms, perhaps because gov- relevant—i.e., dollar ownership, rather than ernance is particularly weak in these firms. percentage ownership. They overturned As discussed above, Edmans, Gabaix, and Jensen and Murphy’s conclusion by showing Landier (2009) show that, under a multipli- that dollar ownership is sizable. cative model, CEO effort has a larger dol- Separately, theory_ predicts that incen- lar effect in a bigger firm, and so a smaller __g​ a ______1 equity stake is required to induce effort. tives should be ​​ ​​ or ​​ 2 ​​ , but b(S) 1 g ​​ ​ __σ ​ ​ ​ + η(b(S )) They quantitatively predict a firm-size elas- ticity of ​ 2 / 3​, consistent with their empiri- parameters such as the cost of effort ​g​ are − cal finding of​ 0.60​. Similarly, they find that difficult to quantify. Thus, it is difficult to − evaluate whether quantitative findings on $–$ incentives are independent of firm size the level of incentives are consistent with and $–% incentives have a size elasticity of​ efficiency. Haubrich’s (1994) calibration 1/ 3​, both as predicted. Thus, a model with suggests that the seemingly low incen- multiplicative utility and production func- tives found by Jensen and Murphy (1990) tions quantitatively explains the size scalings can be optimal if the CEO is sufficiently of incentives. While these results are con- risk averse, but attaches wide confidence sistent with incentives being set optimally intervals to his conclusion, given the difficul- and the true model indeed being multipli- ties in calibration. The structural estimation cative, they could also be consistent with a of Margiotta and Miller (2000) also finds that ­non-multiplicative model and suboptimal low incentives are sufficient to induce effort, incentive setting. given the multiplicative effect of effort on We now turn to HM’s prediction that incentives ​​ are decreasing in risk ​ .​ While firm value. θ σ Given the difficulties of quantifying param- Lambert and Larcker (1987), Aggarwal and eters such as ​g​ to calculate the optimal level Samwick (1999), and Jin (2002) indeed find of incentives, incentive theories are typically a negative relationship, Demsetz and Lehn tested instead in terms of their ­cross-sectional (1985), Core and Guay (1999), Oyer and predictions—whether they vary with param- Schaefer (2005), and Coles, Daniel, and eters such as S​​, ​g​, ​ ​, and ​​ ​​ 2​ as predicted. Naveen (2006) document a positive rela- η σ Note that it is important for empirical tests tionship, and Garen (1994), Yermack (1995), to study the precise measure of incentives Bushman, Indjejikian, and Smith (1996), predicted by the theory. For example, if the Ittner, Larcker, and Rajan (1997), Conyon and theory is a multiplicative model that predicts Murphy (2000), Edmans, Gabaix, and Landier 1256 Journal of Economic Literature, Vol. LIV (December 2016)

(2009), and Cheng, Hong, and Scheinkman shown that incentives are significantly (2015) show either no relationship­ or mixed related to determinants such as firm size, results. Aggarwal and Samwick (1999) and Jin risk, and wealth, Coles and Li (2013) find (2002) study the volatility of dollar returns, that a large portion is explained by manager and the other papers study percentage fixed effects, similar to Graham, Li, and returns. Thus, the empirical evidence points Qiu (2012) who find significant manager to a weak relationship between risk and fixed effects for pay levels. Thus, a sizable incentives. The fixed-action model provides component of the variation in incentives a potential explanation: risk is second order remains unexplained. compared to the benefits of effort—it is The theories also derive predictions for incentive considerations, not risk consider- expected pay ​E c ​, often referred to as the [ ] ations, that affect the slope of the contract.16 level of pay. As discussed, firm risk and disut- The prediction that ​ ​ is decreasing in risk ility have an ambiguous effect on the level θ aversion ​ ​ is harder to test as risk aversion of pay in the HM model, but increase it in η is unobservable. Becker (2006) uses data the fixed-action model due to the required on CEO wealth, available in Sweden, as compensating differential. Garen (1994) a (negative) proxy for risk aversion under shows that pay is insignificantly increasing the assumption of decreasing absolute risk in firm risk as measured by dollar volatility, aversion. As predicted, he finds that wealth and insignificantly decreasing in percentage is positively related to both $–$ and %–% volatility. Cheng, Hong, and Scheinkman incentives.17 In addition, wealth can affect (2015) find a significant positive relationship incentives through channels other than with percentage volatility for financial firms. risk aversion. In the EG model, where the Gayle and Miller (2009) show theoretically contract is not driven by a trade-off with and empirically that larger firms are more risk aversion, the CEO’s outside option ​w​ complex to manage, and so CEOs require may include consuming his existing wealth. greater pay in return. In addition, greater Higher wealth increases ​w​ and thus the agency problems in large firms necessitate constant ​k​ (equation (46)), which in turn higher equity incentives, and thus more augments incentives (equation (45)). Intui­ pay as a risk premium. Conyon, Core, and tively, if the CEO is wealthier, his marginal Guay (2011) and Fernandes et al. (2013) utility from money is lower, and so greater compare CEO pay in the United States to incentives are required to induce him to the rest of the world, and show that the pay work. More generally, while studies have premium to US CEOs can be explained by the greater risk that they bear, rather than rent extraction. The structural estimation of 16 Prendergast (2002) provides another explanation for the weak relationship between risk and incentives. When Gayle, Golan, and Miller (2015) finds that uncertainty is low, principals assign tasks to agents and the risk premium can explain over 80 per- directly monitor them. When uncertainty is high, they del- cent of the pay differential between small egate tasks to agents and incentivize them through output. His model applies principally to rank-and-file employees, and large firms. It arises both because large since day-to-day monitoring of the CEO by directors is firms require greater incentives to address more limited. moral hazard, and also because stock returns 17 While HM assume constant absolute risk aversion utility and so risk aversion is independent of wealth, the are a poorer signal of effort in large firms. model of Sannikov (2008), analyzed in section 4.2, gener- ally predicts that incentives fall with risk aversion by the 3.6 Multiple Signals same intuition as in HM. His model allows for general-util- ity functions, and thus absolute risk aversion to be decreas- The analysis has thus far studied the sensi- ing in wealth. tivity of the manager’s pay to the performance Edmans and Gabaix: Executive Compensation: A Modern Primer 1257 of his own firm. Here, we study the extent for some function h​ ​. Condition (55) holds if to which it should depend on other signals, and only if ​V​ is a sufficient statistic for ​V, z ​ 18 { } such as industry and market conditions.­ We with respect to ​a ​a∗​​ ​. Thus, any signal ​z​, = ​ first demonstrate the Holmstrom (1979) no matter how noisy, that provides informa- informativeness principle. This principle tion incremental to V​ ​ on the agent’s effort considers the case in which, in addition to choice, should be included in the contract. firm valueV ​ ​, the principal has access to an The most common application of the infor- additional contractible signal z​​ (such as the mativeness principle to CEO pay is relative performance of peers) and studies the extent performance evaluation (RPE). Specifically, to which CEO pay c​ ​ should depend on z​ ​. The peer performance is informative about the joint density function is given by f​ (V, z, a)​, degree to which high firm value ​V​ is due to and the principal’s problem is high effort or good luck and so should enter the contract. For example, CEO pay should (52) be based on performance relative to a peer _ group, rather than using standard stock and ​ V ​ ​​max ​ ​ ​ ​ (V c(V, z)) f (V, z, ​a∗​​ ) dV dz options whose value is based on absolute c( , ) z 0 − ​ · · ∫ ∫ performance. However, Holmstrom’s result was derived s.t. assuming optimal contracts. In reality, con- tracts may not be perfectly optimal. For (53) example, a preference for simplicity can lead _ to the use of piecewise linear contracts— ​ V ​ ​ ​ ​ ​ u(c(V, z)) f (V, z, ​a∗​​ ) dV dz g(​a∗​​ ), indeed, cash, stock, and options are typically ∫z ∫0 ​ ≥ ​ used in practice.19 Dittmann, Maug, and Spalt (2013) study the effect of indexation (54) when contracts are restricted to these instru- _ ments and show that the indexation of options ​ V ​ can destroy incentives. Since an indexed ​ ​ ​ ​ u(c(V, z)) ​fa​ ​ (V, z, ​a∗​​ ) dV dz g (​a∗​​ )​. ∫z ∫0 ​ = ′ ​ option is in the money only if the stock price rises high enough to outperform the bench- Denote by ​ ​ and ​ ​ the Lagrange multipli- mark, indexation is tantamount to increasing λ μ ers for the IR (53) and IC (54), respectively. the strike price of an option and reducing Pointwise optimization yields the following the drift rate of the underlying asset. Both condition for the optimal contract c​ (V, z)​: effects reduce the option’s delta and thus his incentives. To preserve incentives, additional ​f​ ​ (V, z, ​a∗​​ ) equity must be given, and their calibration ______1 ​ ______a ​ u (c(V, z)) = λ + μ f (V, z, ​a∗​​ ) ′ ​ 18 for all ​(V, z)​. Hence, assuming that ​ 0​ Condition (55) is required to be satisfied only for effort level a​ ​a​​ ∗​; it may be that the likelihood ratio μ ≠ = ​ (i.e., the IC binds), the contract ​c(V, z)​ is not depends on z​ ​ for a different a​ ​a​​ ∗​. We thus say that (55) ≠ ​ holds if and only if ​V​ is a sufficient statistic for ​ V, z ​ with a function of z​​ if and only if the likelihood { } respect to a​ ​a​​ ∗​. Holmstrom (1979) assumes that con- ratio ​​f​ ​ (V, z, ​a∗​​ )/f (V, z, ​a∗​​ )​ does not depend = a ​ ​ dition (55) is satisfied either for alla ​​ or no a​​, and so he on z​ ​, i.e., shows that the contract is a function of ​z​ if and only if ​V​ is a sufficient statistic for​{ V, z}​ with respect to ​a​ (rather than ​a ​a∗​​ ​). ​fa​ ​ (V, z, ​a∗​​ ) = ​ (55) ______​ h(V, ​a∗​​ )​ 19 See Gabaix (2014) for a sparsity-based model where f (V, z, ​a∗​​ ) = ​ ​ agents have a preference for simplicity. 1258 Journal of Economic Literature, Vol. LIV (December 2016) shows that full indexation of all options would firm’s industry exposure optimally. DeMarzo increase compensation­ costs by 50 percent, and Kaniel (2015) and Liu and Sun (2015) on average. If firms choose the optimal pro- show that, when CEOs have relative wealth portion of options to index, average compen- concerns, it is optimal for the firm to pay him sation costs would only fall by 2.3 percent, for general industry upswings to ensure that and 75 percent of firms would choose zero his pay does not lag his industry peers. indexation. They show that indexing stock Turning to the evidence, conventional wis- also has little benefit. Chaigneau, Edmans, dom is that RPE is very rarely used in reality. and Gottlieb (2016a) study limited liability, Aggarwal and Samwick (1999) and Murphy another contracting constraint relevant in (1999) show that CEO pay is determined reality, and show that the informativeness by absolute, rather than relative, perfor- principle may no longer hold. In the stan- mance, and Jenter and Kanaan (2015) find dard Innes (1990) framework, the agent an absence of RPE in CEO firing decisions. receives zero below a threshold and gains However, more recent evidence suggests one-for-one above the threshold, which is that RPE may be more common than pre- the maximum possible without violating the viously thought. Albuquerque (2009) argues agent’s monotonicity constraint (if imposed) that relevant peers are not only firms in the or principal’s limited liability (if monotonic- same industry, but also those of similar size, ity is not imposed). Since constraints on the since common external shocks may affect contract are binding almost everywhere, the different firms in the same industry in dif- principal’s ability to use the signal is severely ferent ways—for example, increases in reg- restricted. If low firm value ​V​ is accompa- ulation may be particularly costly for small nied by a low signal z​​, she cannot punish firms. When defining firms according to both the agent further without violating limited industry and size, rather than industry alone, liability, as he is already receiving zero. Her she finds significant evidence for RPE. only degree of freedom is on the level of the Gong, Li, and Shin (2011) argue that conclu- threshold, and so a signal is only valuable if sions that RPE is rare arise from identifying it affects the optimal threshold. Moreover, RPE based on an implicit approach—assum- Chaigneau, Edmans, and Gottlieb (2016b) ing a peer group (e.g., one based on industry show that, even if a signal has strictly positive and/or size) and relevant performance mea- value because it affects the optimal cutoff, its sures, and studying whether those perfor- usage can also weaken incentives (similar to mance measures for that peer group affect Dittmann, Maug, and Spalt 2013), and so its CEO pay. These assumptions may lead to value is quantitatively small. measurement error that biases downwards In addition, other concerns can lead to the estimated use of RPE. Gong et al. study pay optimally depending on industry per- the explicit use of RPE, based on the disclo- formance. Oyer (2004) shows that, if equity sure of peer firms and performance measures is forfeited upon departure, it induces the mandated by the SEC in 2006. They find agent to stay with the firm. Since non-in- that 25 percent of S&P 1500 firms explicitly dexed equity is more valuable in high market use RPE. Similarly, rather than assuming a conditions, when the outside option is also peer group, Lewellen (2015) hand-collects higher, its retention power increases pre- the peers that firms report as their primary cisely when retention concerns are greatest. product market competitors in their 10-K fil- Gopalan, Milbourn, and Song (2010) argue ings, and finds evidence for RPE. DeAngelis that not indexing an executive to industry and Grinstein (2016) examine the actual performance induces him to choose the terms of compensation contracts and find Edmans and Gabaix: Executive Compensation: A Modern Primer 1259 that 88 percent of RPE contracts measure the agent’s utility function. Dittmann, Yu, the rank performance of the CEO relative and Zhang (2015) calibrate a model where to peers. In contrast, most empirical stud- the CEO chooses both effort and risk, and ies measure the difference between firm show that it can explain the mix of stock performance and a peer–firm benchmark, and options found empirically. However, implicitly assuming that contracts concern Carpenter (2000) and Ross (2004) show the- absolute peer-adjusted performance. Using a oretically that options may not increase the rank-based specification, they find significant manager’s risk-taking incentives: while an evidence of RPE. However, while recent evi- option has “vega” (positive sensitivity to vol- dence suggests that RPE is not rare, it is still atility), it also has “delta” (positive sensitivity far less common than the universality that to firm value). Thus, a risk-averse manager frictionless models would predict. may wish to reduce volatility in the value In addition to signals about peer perfor- of the firm and thus his options. Shue and mance, the informativeness principle implies Townsend (forthcoming b) evaluate this the- that any informative signal, no matter how oretical debate empirically by showing that noisy, should be in the optimal contract. In exogenous increases in options, resulting reality, in addition to firm value, CEO pay from their multiyear grant cycles, lead to an may depend on accounting performance increase in risk taking. measures (such as sales growth, return-on-as- The above models consider “good” risk­ sets, and growth) through taking that improves firm value. However, their impact on discretionary bonuses, per- the CEO may also have incentives to engage formance-based vesting provisions (Bettis in “bad” risk taking that reduces firm value. et al. 2010), and subjective evaluations by In particular, in a levered firm, an equity- principals (e.g., Cornelli, Kominek, and aligned manager may undertake a project Ljungqvist 2013). However, CEO pay does even if it is negative NPV because share- not appear to depend on non-accounting holders benefit from the upside, but have performance measures such as surveys on limited downside risk due to limited liability intangible assets (e.g., customer satisfaction, (Jensen and Meckling 1976). Anticipating brand strength, and employee engagement) this, creditors will demand a high cost of and the number of patent citations. These debt and/or tight covenants, to the detriment all potentially provide information over and of shareholders. above that contained in the stock price, since Edmans and Liu (2011) show that a the stock market does not immediately capi- potential solution to such risk shifting is to talize intangibles (Edmans 2011). compensate the CEO with debt as well as equity. (Such debt is referred to as “inside” 3.7 Risk Taking debt, as it is owned by the manager, rather Thus far, the CEO takes an action that than outside creditors.) Previously proposed changes the firm’s expected value, but has no remedies for risk shifting include bonuses direct effect on its risk. In Smith and Stulz for achieving solvency, or salaries and pri- (1985), the agent takes a single action that vate benefits that are forfeited in bankruptcy reduces risk via hedging. If the agent is risk (e.g., Brander and Poitevin 1992). These averse, he will engage in excessive hedging; instruments are sensitive to the incidence of in a CEO context, this corresponds to turning bankruptcy, but if bankruptcy occurs, they down positive NPV risky projects. They show pay zero regardless of liquidation value. In how options address this issue, since their contrast, inside debt yields a positive payoff convexity counterbalances the concavity­ of in bankruptcy proportional to the recovery 1260 Journal of Economic Literature, Vol. LIV (December 2016) value. Thus, it renders the manager sensitive the European Commission and the Federal to firm value in bankruptcy, and not just the Reserve have advocated debt-like compen- incidence of bankruptcy—exactly as desired sation to curb excessive risk taking. However, by creditors—and thus reduces the cost of even if the above studies can be interpreted raising debt, to the benefit of sharehold- as showing causal effects of inside debt ers. Indeed, recent empirical studies have on risk taking and borrowing conditions, shown that CEOs hold a substantial amount they do not study whether shareholders of inside debt through defined-benefit pen- benefit overall, nor whether alternative solu- sions and deferred compensation. These are tions to risk shifting would be superior. unsecured obligations that yield an equal In Smith and Stulz (1985), the firm is claim with other creditors in bankruptcy, and unlevered so there are no risk-shifting con- thus constitute inside debt. For example, cerns; the contract contains options but no Sundaram and Yermack (2007) show that debt. In Edmans and Liu (2011), the CEO is GE’s had over $100 million of risk neutral, so there is no problem of induc- inside debt when he retired in 2001. ing him to take “good” risk; the contract Since traditional contracting theories typi- contains debt, but not options. For future cally advocate only the use of equity, and dis- research, it would be interesting to incorpo- closure of pensions and especially deferred rate both leverage and risk aversion into a compensation was limited prior to a 2006 model of both effort and risk taking, to study SEC disclosure reform, Bebchuk and Fried the optimal mix of salary, stock, options, and (2004) argue that inside debt constitutes debt. “stealth compensation through retirement benefits.” However, the risk deterrence story 4. Dynamic Incentives suggests that inside debt can be consistent with optimal contracting. The disclosure of This section analyzes dynamic models of significant inside debt positions following the moral hazard. In reality, CEOs are employed SEC reform led to an increase in bond prices for several years. A dynamic setting leads to (Wei and Yermack 2011), and is associated­ additional questions, such as how to spread with lower bond yields and fewer covenants the rewards for good performance over time, (Anantharaman, Fang, and Gong 2014, using how the level and sensitivity of pay vary personal state income taxes as an instru- over time, and when the CEO quits or is ment for inside debt). Debt-aligned execu- fired. We start in section 4.1 with a tractable tives manage the firm more conservatively, ­discrete-time model that yields closed-form as measured by the firm’s lower distance to solutions, at the cost of some assumptions. default (Sundaram and Yermack 2007), and In section 4.2 we move to a continuous-time lower stock-return volatility, R&D expendi- model which typically yields numerical tures, and financial leverage (Cassell et al. solutions, but allows for departures and 2012, also using the income tax instrument). terminations. Indeed, the alignment of executives with 4.1 Dynamic Incentives: A Simple Discrete debt has gathered pace in the recent cri- Time Model sis. In 2010, American International Group tied 80 percent of highly paid employees’ We present here the discrete-time version pay to the price of its bonds, and 20 per- of the Edmans et al. (2012) model, which uses cent to the price of its stock, and UBS and the EG framework to yield tractable solu- Credit Suisse have since started paying tions. In every period t​ ​, the CEO first observes_ bonuses in bonds. The Liikanen Report of noise ​​ ​​​​ and then takes action ​​a​ ​ 0, ​ a ]​, εt t ∈ [ Edmans and Gabaix: Executive Compensation: A Modern Primer 1261 which affects terminal (period-​T​) firm value where ​​ is the agent’s discount rate, i.e., δ as follows: his impatience. His per-period utility func- tion u​ (c, a) ln c g(a)​ corresponds T ​ s 1​ ​​​(​as​​ s​)​ ​ = − ​​V​ ​ S​e​​ ∑ = +ε​ ​ .​ to (10) with v​(c) ln c​ and u​(x) x​, i.e., T = = = ­multiplicative preferences. The agent’s res- We assume that a signal about ​​a​ ​, ervation utility is ​w​. t t ​ s 1​ ​​​(​as​​ s​)​ ​ ​​V​ ​ S​e​​ ∑ = +ε​ ​​, is contractible. The incre- The principal is risk-neutral and so her t = mental information contained in ​​Vt​ ​ over and objective function is expected discounted above that contained in ​​Vt​ 1​ can be summa- terminal firm value minus expected pay: rized by the stock “return”− (57) rt​ ​ ln ​Vt​ ​ ln ​Vt​ 1​ ​a​ t​ ​ t​​ , T = − − = + ε RT Rt ​ max ​ ​ E​ ​e−​​ ​ ​VT​ ​ ​ ​ ​e−​​ ​ ​yt​ ​ ​. (​a​​, t 1, … , L), (​y​​, t 1, … , T) [ − t∑1 ] where, as in EG, the CEO observes the t = t = = noise ​​ ​​ before he takes his action ​​a​ ​. Also in She wishes to implement a target action εt t every period t​​, the principal pays the CEO sequence (​​ ​a​ t∗​ ​)​. ​​yt​ (​ ​r1​ ​, … , rt​ )​ ​​ which may depend on the entire There are two constraints to consider. history of returns. The agent consumes ​​ct​ ​ and The first is the effort constraint (EF), which saves ​​ ​y​ ​ ​c​ ​ ​​ (which may be positive or neg- ensures that the CEO does not wish to devi- ( t − t) ative) at the continuously compounded risk- ate from (​​​a​ t∗​ ​)​. We consider a local deviation free rate R​ ​. We consider two versions of the in the action ​​a​ t​ after history ​( ​r1​ ​, … , rt​ 1​, t​​).​ − ε model. In one, private saving is observed by the The effect on CEO utility should be zero: principal and so she can stipulate that ​y​ t​ ​c​ t​, = ​r​ ​ i.e., that the CEO does not engage in private 0 ​E​ ​ ​ ___ U ___∂ t ​ ___ U ​ ​.​ = t ∂ ​r​ ​ ​a​ ​ + ∂ ​a​ ​ saving. In another, private saving is unob- [∂ t ∂ t ∂ t] served. This leads to additional complications, t since the CEO may have incentives to engage Since ​ ​r​ ​ / ​a​ ​ 1​ and ​ U/ ​a​ ​ e​ ​​ −δ ​ ∂ t ∂ t = ∂ ∂ t = in a joint deviation of simultaneously shirk- ​u​​ (​c​ ​ , a​ ​)​, the EF constraint (evaluated at ​​ × a t t ing and saving—by saving, he insures against a​ ​ ​a​ ∗​ ​) is t = t ​ future income shocks, thus reducing effort incentives. Put differently, by privately saving, ___ U t (58) EF : ​Et​ ​ ​ ​ ​ ​e​​ −δ ​ ​ua​(​ ​ct​ ​ , a​ t∗​ ​)​ the CEO can achieve a different consumption ∂ ​r​ ​ = [∂ t ] profile ​​c​ t​ from the income ​​y​ t​ provided by the _ if ​a​ ∗​ ​(0, a ) ​ ; contract, thus undoing effort incentives. The t ​ ∈ contract must therefore remove his incentives ___ U t to do so. ​Et​ ​ ​ ∂ ​ ​ ​e​​ −δ ​ ​ua​(​ ​ct​ ​ , a​ t∗​ ​)​ [ ​rt​ ​] ≥ The CEO lives for ​T​ periods and retires ∂ _ after period ​L T.​ 20 His lifetime utility is if ​a​ ∗​ a ​, ≤ t ​ = T t for ​t L​. (56) ​U ​ e​​ −δ ​ u(​c​ t​ , a​ t​), ≤ = t∑1 = The second constraint is the private sav- u(c, a) ln c g(a)​, ings constraint (PS), which ensures that the = − CEO consumes his income in period ​t​, i.e., ​​ c​ t​ ​yt​ ​​​, so that he has no incentive to save 20 In the model, the principal replaces the CEO with = a new one and continues to contract optimally, but this privately. If the CEO saves a small amount​​ d​ ​ in period ​t​ and invests it until ​t 1,​ his assumption can easily be weakened. t + 1262 Journal of Economic Literature, Vol. LIV (December 2016) utility increases to the leading order by The closed-form solutions allow transpar- ___ U ____ U R ent economic implications. Equation (61) ​ ​Et​ ​ ​ ∂ ​ ​ dt​ ​ ​Et​ ​ ​ ∂ ​ ​ e​​ ​ ​dt​ ​. To deter private − [ ​ct​ ​] + [ ​ct​ 1​] ∂ ∂ + shows that time-t​​ income should be linked saving or borrowing, this change should be to the return not only in period ​t​, but also in zero to the leading order, all previous periods. Therefore, increases in​​ r​ ​​​ boost log pay in the current and all future R ______​uc​​ (​ct​ 1​ , a​ t 1​) t (59) PS : 1 ​Et​ ​ ​e​​ −δ + + ​ ​, periods equally. Since the CEO is risk averse, = [ ​u​​ (​c​ ​ , a​ ​) ] c t t it is efficient to spread the reward for good that is, the consumption Euler equation performance over the future to achieve con- should hold. If the CEO cannot engage in sumption smoothing: the “deferred reward” private saving (e.g., because the principal principle. This result was first derived by can observe saving), then instead the inverse Lambert (1983) and Rogerson (1985), who Euler equation (IEE) holds, as is standard in consider a two-period model where the agency problems with additively separable agent only chooses effort. utility (e.g., Rogerson 1985 and Farhi and We now consider how contract sensitivity Werning 2012). This is given by changes over time. Equation (62) shows that, in an infinite-horizon model T(​ ​), the → ∞ ____1 ______​uc​​ ( ​ct​ ​ , at​ ​ ) sensitivity is constant and given by (60) IEE : 1 ​Et​ ​ ​ R ​ ​. = [​e​​ −δ ​uc​​ ( ​ct​ 1​ , at​ 1​ )] ​ + + (65) ​​ ​(1 ​e​​ −δ)g (​a∗​​ ).​ θt = θ = − ​​ ′ ​ We next present the solution (a heuristic proof is in Appendix A). For simplicity, we This is intuitive: the contract must be suffi- assume a constant target action (a​​​ ∗​ ​a∗​​ ​, ciently sharp to compensate for the disutility t ​ = ​ which may_ correspond to full productive effi- of effort, which is constant. The sensitivity ciency ​​ a ​). The contract is given by to the current-period return is decreasing in t the discount rate—if the CEO is more impa- (61) ln ​ct​ ​ ln ​c0​ ​ ​ ( ​ s​​ ​rs​ ​ ​ks​ )​ ​, tient (higher ​ ​), it is necessary to reward him = + s∑1 θ + δ = today more than in the future. where ​​ ​​ and ​​k​ ​ are constants. The sensitivity ​​ If T​ ​ is finite, equation (62) shows that ​​ ​​ is θs s θt ​​ is given by increasing over time: the “increasing incen- θs tives principle.” When there are fewer peri- (62) ods over which to spread the reward for effort, the current-period reward (​ ​ut​​ / ​a​ t​ ​ t​​) ⎧ g (​a∗​​ ) ∂ ∂ = θ ⎪​ ______′ ​ ​ for s L must increase to keep the lifetime increase (T s) ≤ ​​ ​s​ ​⎨ ​1​ ​e​​ δ ​e​​ δ − ​​ ​​​​ . in utility ​ U/ ​a​ t​ constant. Other moral-haz- θ = ⎪ + ​+ ⋯ + ∂ ∂ ⎩0 for s L ard models predict increasing incentives > through different channels. Gibbons and If private saving is impossible, the constant ​​ks​ ​ Murphy (1992) generate an increasing cur- ensures that the IEE (60) holds: rent sensitivity because the lifetime increase in utility ​ U/ ​a​ t​​​ rises over time to offset fall- ​ ​​(​a∗​​ ) ∂ ∂ (63) k​ ​ R ln E​ ​e​ θ​ s +ε​ ​ .​ ing career concerns. In Garrett and Pavan s = − [ s ​]​ (2015), the current sensitivity rises over time If private saving is possible, k​​​ ​ ensures that because ​ U/ ​a​ ​​​ increases to minimize the s ∂ ∂ t the PS constraint (59) holds: agent’s informational rents. Here, ​ U/ ​a​ ​ is ∂ ∂ t constant since we have no adverse selection ​​(​a∗​​ ) (64) k​ ​ R ln E​ ​e​ −​ θ​ s +ε​ ​ . or career concerns; instead, the increase in ​ s = + [ s ​]​ Edmans and Gabaix: Executive Compensation: A Modern Primer 1263

​u​​ / ​a​ ​​​ stems from the reduction in con- ­nondecreasing over time.21 The model thus ∂ t ∂ t sumption smoothing possibilities as the CEO predicts a positive relationship between pay approaches retirement. and tenure, consistent with the common While ​​ ​​​​ depends on the model horizon, practice of seniority-based pay. Moreover, θt it is independent of whether private saving the growth rate depends on the risk to which is possible—this possibility only affects ​​kt​ ​ . the CEO is exposed, which is in turn driven Since private saving does not affect the by his incentives ​ ​ and firm volatility​ ​. This θ σ agent’s action and thus firm returns, the sen- is intuitive: greater risk increases the CEO’s sitivity of pay to returns is unchanged. From motive to engage in precautionary saving (61), the possibility of private saving alters (since, with CRRA utility, ​u (c) 0​), and ‴ > the time trend in the level of pay. The log so a rapidly rising level of pay is necessary to expected growth rate in pay is ln​ E[​ ​c​ t​ /​c​ t 1]​ ​ remove the need for him to save privately. ​ ​​ ​r​​ − ​k​ ​ ln E​ ​e​​ θt t​ ​. Furthermore, in a finite-horizon model, ​​ ​​ is = t + [ ] θt If private saving is impossible, substituting increasing over time, so the growth rate of for ​​k​ t​ using (63) yields consumption rises with tenure. That is, pay accelerates over time. ​c​ ​ To illustrate the economic forces behind ln E​ ​ ____t ​ ​ R ,​ [​ct​ 1​] − = − δ the contract, we now present a simple numerical example with ​T 3​, ​L 3​, = = which is constant over time. If and only if ​ 0​, ​​a​ ∗​ ​a∗​​ ​, and g​ (​a∗​​ ) 1​. From (62), δ = t ​ = ​ ′ ​ = the CEO is more patient than the aggregate the contract is economy (​ R​), then the growth rate is δ < positive, as is intuitive. If private saving is __​r1​ ​ ln ​c1​ ​ ​ 1​​ , possible, (64) yields = 3 + κ

__​r1​ ​ __​r2​ ​ ​ct​ ​ ​​ ​r​​ ln ​c2​ ​ ​ 2​​ , ln E​ ​ ____ ​ ​ R ln E e​​ −θ​ t t​ ​ = 3 + 2 + κ [​ct​ 1​] [ ] − = − δ + ​​ ​ ​​ ​r​​ __​r1​ ​ __​r2​ ​ __​r3​ ​ ln E​ ​e​​ θt t​ ​. ln ​c3​ ​ ​ 3​​ ,​ + [ ] = 3 + 2 + 1 + κ

t In the limit of small time intervals ​​ or, equiv- where ​​ ​t​ ​ s 1​ ​​ ​ks​ ​. An increase in ​​r1​ ​ leads ( κ = ∑ = alently, in the limit of small variance of noise ​ to a permanent increase in log consump- ​r​ ​ ​​ 2​ ​, this yields tion—it rises by __​​ 1 ​​ in all future periods. In σ ) 3 addition, the sensitivity ​ ​u​​ / ​a​ ​ increases ∂ t ∂ t ____​ct​ ​ 2 2 over time, from 1/​ 3​ to 1/​ 2​ to 1/1​ ​. The total ln E​ ​ ​ ​ R ​ ​ t​ ​ ​ ​ t​ ​ .​ [​ct​ 1​] = − δ + θ σ lifetime reward for effort ​ ​U​ ​ / ​a​ ​ is a con- − ∂ t ∂ t stant 1 in all periods. Thus, the growth rate of consumption is always higher when private saving is possi- ble. This faster upward trend means that 21 Lazear (1979) has a back-loaded wage pattern for the contract effectively saves for the agent, incentive, rather than private saving considerations (the removing the need for him to do so himself. agent is risk neutral in his model). Since the agent wishes This result is consistent with He (2012), who to ensure he receives the high future payments, he induces effort to avoid being fired. Similarly, in Yang (2010), a finds that the optimal contract under pri- ­back-loaded wage pattern induces agents to work to avoid vate saving involves a wage pattern that is the firm being shut down. 1264 Journal of Economic Literature, Vol. LIV (December 2016)

We now consider ​T 5​, so that the CEO Let ​​ ​​ ​M​​ (0)​ 0, 1)​ be inversely = i = ′i ​ ∈ [ lives after retirement. The contract is now related to the ­marginal inefficiency of manip- ulation at release lag ​i​. ​r​ ​ (66) ln ​c​ ​ __ 1 ​ ​​, If the firm is sufficiently large, the prin- 1 = 5 + κ1 cipal will wish to implement zero manipula- tion, i.e., m​​ ​ ​ 0​ ​ ​ ​t​. If the agent engages t, i = ∀ __​r1​ ​ __​r2​ ​ in a small myopic action ​​m​ ​ 0​ at time ​t​, ln ​c2​ ​ ​ 2​​, t, i ≥ = 5 + 4 + κ his utility changes to the leading order by __​r1​ ​ __​r2​ ​ __​r3​ ​ ln ​c3​ ​ ​ 3​​, ___ U ____ U = 5 + 4 + 3 + κ Et​ ​ ​ ∂ ​ ​ ​m​ t, i​ ​Et​ ​ ​ ∂ ​ ​m​ t, i​ .​ [ ​rt​ ​] − [ ​rt​ i​] ​r​ ​ ​r​ ​ ​r​ ​ ∂ ∂ + ln ​c​ ​ __ 1 __2 __3 ​ ​​, 4 = 5 + 4 + 3 + κ4 This should be weakly negative for all small​​ m​ ​ 0​. Hence, we obtain an additional no ​r​ ​ ​r​ ​ ​r​ ​ t, i ≥ ln ​c​ ​ __ 1 __ 2 __3 ​ ​​. manipulation (NM) constraint: 5 = 5 + 4 + 3 + κ5 ___ U ____ U (67) NM : ​Et​ ​ ​ ∂ ​ ​ i​​ ​Et​ ​ ​ ∂ ​ ​ Since the CEO takes no action from t​ 4​, his [ ​rt​ ​]  ≤ [ ​rt​ i​] = ∂ ∂ + pay does not depend on ​​r4​ ​ or ​​r5​ ​. However, it depends on r​​ ​ ​, ​​r​ ​, and ​​r​ ​​​ as his earlier efforts for ​t L.​ The optimal contract is now as 1 2 3 ≤ affect his wealth, from which he consumes. above, with the additional constraint (67). We apply this constraint to the five-period Short-Termism. We finally extend the model of equation (66), with H​ 1​. The = basic model to allow the agent to engage in optimal contract now changes to short-termism, and study how this possibility ​r​ ​ affects the optimal contract. Short-termism (68) ln ​c​ ​ __ 1 ​ ​​, is broadly defined to encompass any action 1 = 5 + κ1 that increases current returns at the expense __​r1​ ​ __​r2​ ​ of future returns—scrapping positive NPV ln ​c2​ ​ ​ 2​​, investments (see, for example, Stein 1988) or = 5 + 4 + κ taking negative NPV projects that generate ​r​ ​ ​r​ ​ ​r​ ​ ln ​c​ ​ __ 1 __2 __3 ​ ​​, an immediate return but weaken long-run 3 = 5 + 4 + 3 + κ3 value (such as subprime lending), earnings , and accounting manipulation. ​r1​ ​ ​r​ ​ ​r3​ ​ ​ ​1​ ​r4​ ​ ln ​c​ ​ __ __2 __ ____ ​ ​​, At time ​t​, in addition to an effort and sav- 4 = 5 + 4 + 3 + 2 + κ4 ings decision, the manager can also take a __​r1​ ​ __​r2​ ​ __​r3​ ​ ​ ____​1​ ​r4​ ​ myopic action ​​m​ t, i​​​ that increases the current ln ​c5​ ​  ​ ​5​. return to ​​r​​ ​r​ ​ ​M​ ​(​m​ ​)​ where ​​M​ ​ is a = 5 + 4 + 3 + 2 + κ t ​ = t + i t, i i concave function.′ The CEO also chooses a “release lag” i​​, which is the number of Even though the CEO retires at the end of​ periods before the negative consequences t 3​, his income depends on ​​r​ ​, otherwise = 4 of myopia become evident. The maximum he would have an incentive to boost r​​ 3​ ​ at the possible release lag is ​H T L​. Myopia expense of ​​r​ ​​​. Thus, the CEO should retain ≤ − 4 at ​t​ with release lag i​​ reduces the return at​ equity in the firm even after retirement. t i​ to ​​rt​ i​ ​rt​ i​ ​m​ t, i​, and leaves other For a general maximum release lag of ​H​, + ′ + ​ = + − returns unchanged (r​​ ​t s​ ​rt​ s​ for ​s 0, i​). the CEO should be sensitive to firm returns ′ + ​ = + ≠ Edmans and Gabaix: Executive Compensation: A Modern Primer 1265 until period ​L H​, i.e., retain equity in the into an “incentive account,” a proportion​​ + firm for ​H​ years after retirement. This result ​​​​ of which is invested in stock and the θt formalizes the verbal argument of Bebchuk ­remainder in cash, so his %–% incentives and Fried (2004), who advocate escrowing equal ​​ ​​​​ given by (62). If the stock price θt the CEO’s equity to deter him from inflating declines, so that the fraction of stock falls the stock price before retirement and then below ​​ ​​​​ , cash in the account is used to buy θt cashing out. For example, Angelo Mozilo, stock to replenish his incentives. Every year, the former CEO of Countrywide Financial, a fraction of the account vests and is paid made $129 million from stock sales in to the manager, but the remainder remains the twelve months prior to the start of the escrowed to deter myopia. Zhu (2016) shows subprime crisis. Indeed, in the aftermath that “bonus banks,” introduced in practice by of the crisis, banks such as Goldman Sachs the consulting firm Stern Stewart, are a sim- and UBS have been increasing vesting ilar way to deter myopia. Bonuses for short- horizons. An alternative remedy proposed term performance are deposited into the is to use , i.e., pay executives “bonus bank,” rather than immediately paid bonuses for good short-term performance, to the manager, and only a fraction is paid but rescind them if the performance ends up each period. Poor performance in one period, being reversed in the long term. However, which may be caused by a myopic action in a clawbacks have been very rare in practice, previous period, wipes out previously accrued perhaps because they require an active bonuses. decision by the board. Lengthening the The advantage of the above framework is vesting period of equity so that rewards are that it yields closed-form solutions that make not paid out prematurely in the first place the economic intuition transparent. However, may be a superior solution. it comes at the cost of a number of assump- The sensitivity to r​​4​ ​​​ depends on the effi- tions. First, as in the EG model, it assumes a ciency of earnings inflation ​​ ​​; in the fixed target action and that the noise precedes 1 extreme, if ​​ ​​ 0,​ myopia is impossible the action, which may be reasonable in some 1 = and so there is no need to expose the CEO settings but not others. Second, it assumes a to returns after retirement. The contract is fixed retirement date ​T​ and does not allow unchanged for t​ 3,​ that is, for the peri- for quits or firings beforehand, which is an ≤ ods in which the CEO works. Even under important limitation in a CEO setting. the original contract, there is no incen- tive to inflate earnings att ​ 1 ​or t​ 2 ​ 4.2 Dynamic Incentives in Continuous Time = = because there is no discounting, and so the negative effect of myopia on future returns This section presents the continuous-time reduces the CEO’s lifetime utility by more model of Sannikov (2008) which allows for than the positive effect on current returns quits and firings, as well as for the imple- increases it. With discounting, incentives mented effort level to be endogenized. At increase even faster over time than in the every instant ​t​, the agent takes action ​​a​ t​ absence of a myopia problem. The higher sen- (from a compact set with smallest element sitivity to future returns ensures that myopia 0) and consumes c​​ ​ t​; the framework assumes causes the CEO to lose enough in the future away private saving so we do not distinguish to counterbalance the effect of discounting. between income and consumption. His The contracts in (66) and (68) can be expected lifetime utility at date t​​ is implemented in a simple manner. Each ∞ (s t) year, the manager’s annual pay is escrowed (69) U​ ​ ​E​ ​ ​ ​ ​ e​​ −δ − ​ u(​c​ ​ , a​ ​) ds ​.​ t = t[∫t s s ] 1266 Journal of Economic Literature, Vol. LIV (December 2016)

It is the “promised utility” that the agent will some reward function. Then the value func- obtain if he exerts the path of efforts recom- tion ​Q( )​ satisfies · mended by the principal. This path of efforts is given by an adapted process (​​​a​ t∗​ ​)​. Recall (73) 0 sup ​ ​ f (​x, C)​ Q(x) = − δ that an adapted process is a process whose C value depends on the information available at time ​t​. Q (x) ​(x, C)​ _1 ​ Q (x)​ ​​ 2(​ x, C)​. The principal uses the same discount rate ​ + ′ μ + 2 ″ σ ​, and her expected utility is δ Using the martingale representation theo- ∞ (s t) rem (72), the agent’s utility process (69) can (70) Qt​ ​ ​Et​ ​ ​ ​ ​ e​​ −δ − ​ (​d ​Vs​ ​ ​c​ s​ ds)​ ​,​ = [∫t − ] be written where d​ ​V​ ​​​ (instantaneous profit before pay- ​d ​U​ ​ ​ u (c​ ​ ​ , a​ ​) ​U​ ​ ​ dt ​ ​​ d ​Z​ ​ t t = (− t t + δ t) + βt σ t ing the CEO) is assumed to be for some process ​​ ​​​​ that represents the sen- βt (71) ​d ​V​ ​ ​a​ ​ d t d ​Z​ ​ ,​ sitivity of the contract (equation (72) yields a t = t + σ t process ​​ ​ ​, and we set ​​ ​​ ​​ ​ / ​). ξ t βt = ξ t σ where ​​Zt​ ​​​ is a standard Brownian motion. This allows us to write the contract. Call ​​ To derive the optimal contract, we first a​ t∗​ ​ the time-t​​ action recommended by the recall two basic lemmas from stochastic cal- principal (it also depends on the past, and culus, proven in Appendix A. will soon be optimized upon). On the equi- librium path, d​ ​V​ ​ ​a​ ∗​ dt d ​Z​ ​, and so t = t ​ + σ t Martingale Representation Theorem.—If ​​ utility follows: ∞ (s t) Ut​ ​ ​Et​ [​ ​ t​ ​ e​​ −δ − ​ ​K​ s​ ds]​ for some adapted = ∫ (74) ​d ​Ut​ ​ ​( u ​(​c​ t​ , a​ t∗​ ​)​ ​Ut​ )​ ​ dt process ​​K​ t​, then = − + δ

​ ​t(​ d ​Vt​ ​ ​a​ t∗​ ​ dt)​.​ (72) ​d​U​ ​ ​ ​K​ ​ ​U​ ​ ​ dt ​ ​ ​ d ​Z ​ ​ , + β − t = (− t + δ t) + ξ t t This equation defines an essential part of where ​​ ​ ​​​ is some other adapted process. the contract: given the past (summarized by ξ t the promised utility U​​ t​ ​), the contract recom- Hamilton-Jacobi Bellman (HJB) Equa- mends action a​​​ t∗​ ​, and given the “surprise” tion.—Consider a stochastic process x​​​ ​ fol- ​d ​V​ ​ ​a​ ∗​ dt​, the contract specifies how t t − t ​ lowing ​d ​x​ ​ x​ ​ , C​ ​ ​ dt x​ ​ , C​ ​ ​ d ​Z​ ​ , promised utility changes: by ​d ​U​ ​ given in t = μ(​​ t t) + σ(​​ t t) t t where ​​Ct​ ​​​ is a control variable (which is (74). Here (74) features the recommended potentially multidimensional),­ and the fol- action ​​a​ t∗​ ​ (which the principal specifies), lowing optimal control problem: rather than the actual action a​​​ t​ (which the principal does not observe). ​Q(​x)​ The intuition is as follows: ​​Ut​ ​ represents = an “account” that contains the lifetime util- ∞ (s t) ity promised to the agent. If the principal sup ​ ​ ​ ​Et​ ​ ​ ​ ​ ​e​​ −δ − ​ f ​(​x​ s​ , Cs​ )​ ​ ds x​ t​ x ​, [ t | ​ = ] provides instantaneous utility to the agent, (​Cs​​)s​ t​ ∫ ≥ the account changes by ​ u ​ ​c​ ​ , a​ ∗​ ​ as less − ( t t ​)​ where control ​​Ct​ ​​​ is adapted, i.e., uses only utility is owed in the future. If firm value is information available at ​t​, and f (​​x​ ​, ​​C​ ​) is higher than expected ​​d ​V​ ​ ​a​ ∗​ dt 0 ​, the s s ( t − t ​ > ) Edmans and Gabaix: Executive Compensation: A Modern Primer 1267 account rises in value. Otherwise, it grows at and a rate ​ ​ (the ​ ​U​ ​ term). δ δ t We now move to the agent’s IC. If the (78) 1 Q (x) ​u​​(c, a) − ′ a agent chooses action a​​ ​ t​, his utility is __1 ​ Q (x) ​ ​​ ​u​​(c, a)​​ 2​ ​ ​​ 2​ 0​. + 2 ″ ∂a( a ) σ = max ​ ​ u (​c​ t​ , a​ t​) dt d ​Ut​ ​(​a​ t​)​, ​at​​ + We now need to specify the boundary con- i.e., he receives ​u (​c​ t​ , a​ t​)​ today, and will ditions. The agent’s per period reservation receive continuation utility of ​d ​Ut​ ​ (​a​ t​)​ later. utility is ​w​, and so he accepts the contract 1 Given (71) and (74), we have only if ​​U​ ​ w/ ​. Let ​​Q​ ​(x) u ​​( , 0)​ − ​ t ≤ δ 0 = − · ( w)/ V A​ denote the cost of provid- × δ + ​d ​U​ ​ ​​​ ​a​ ​ dt terms independent of ​a​ ​ ,​ ing this to the agent when the agent exerts t = βt t + t zero effort, where ​A​ is the present value of which yields the IC: the principal’s outside option, e.g., his sur- plus from hiring a new agent. The agent is

a​ ∗​ arg max ​ ​ u (​c​ ​ , a​ ​) ​ ​​ ​a​ ​ .​ employed if and only if his promised utility t ​ ∈ t t + βt t a is x​ ​​x​ ​ , x​ ​ ​​, with the following matching ∈ [ L H] The first-order condition yields and smooth pasting conditions:

(75) ​​ ​u​​ (​c​ ​ , a​ ∗​ ​),​ (79) ​Q(x) ​Q​ ​(x), βt = − a t t = 0 i.e., incentives ​​ ​​ are determined by ​​c​ ​ and ​​ Q (x) ​Q​​ (x)​ βt t ′ = ′0 ​ a​ t∗​ ​. Turning to the principal’s problem, the for state variable for the HJB stochastic pro- ​x ​x​ ​ , x​ ​ . = L H cess is ​​x​ ​ ​U​ ​, the agent’s promised util- t = t ity. The principal’s value function is ​Q(​x​ t​)​. Hence, the problem is characterized by the Using (75), the HJB equation (73) is, for ODE (76), ​​x​ L​, and ​​x​ H​. At ​​x​ L​​​, the agent is ter- all ​x​, minated because of poor performance. At x​​ ​ H​, he is terminated because he has become too

(76) 0 max ​ ​ a c Q(x) Q (x) expensive to incentivize. Intuitively, as the = c, a − − δ + ′ agent’s promised utility increases, his mar- ginal utility of money falls, and so it is harder u(c, a) x × [− + δ ] to incentivize him. We have four unknowns, x​​​ L​ , x​ H​, and the __ 1 ​ Q (x)​u​​ (c, a​)​​ 2​ ​ ​​ 2​.​ two degrees of freedom associated with a + 2 ″ a σ second-order ODE (given a value ​​x​ ​, the This gives an ordinary differential equation ODE is described by two parameters,∗ Q​ (​x​ ​)​ for Q​ (x)​. From this, the optimal c​​ and a​ ​ are and ​Q (​x​ ​​)​). We also have four equations (79).∗ ′ ∗ implicitly determined by Hence, the problem yields a solution. To sum up, the principal solves the prob- (77) ​1 Q (x) ​u​​(c, a) lem as follows. Given her value function Q​ ​, − − ′ c she finds the optimal action and consump- tion from the first-order conditions (77) and __ 1 Q″ ( ​ x) ​ ​​ ​u​​ ​(c, a)​​ 2​ ​ ​​ 2​ 0​ + 2 ∂c( a ) σ = (78). These in turn determine the optimal 1268 Journal of Economic Literature, Vol. LIV (December 2016)

1 x* 0.8 0.2 0.6 0.4 Q (x) x 0.2 x* H 0 0

Consumption c ( x ) 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 x Q0 (x) * −0.2 0.8 x 0.6 0.4 −0.4 0.2 Effort a ( x ) 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 x −0.6 Continuation value Q ( x ) 0.08 x* 0.06 −0.8 0.04

Drift ( x ) 0.02 −1 0 x 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 L 0.2 0.4 0.6 0.8 1 x x

Figure 1. Optimal Contract in Continuous Time

Source: Author replication of Sannikov (2008) with parameters u(c) _c ​​, g(a) 0.5​​a2​​ ​ 0.4a, 0.1, = √ = + δ = 1. σ = incentives from (75), and the agent’s utility effort falls; the marginal utility of additional is given by (74). ­consumption is low and so stronger mone- This problem is quite complex, and typi- tary incentives are required to induce effort. cally the solutions are numerical.22 In fig- Since the agent is now expensive to incen- ure 1, we replicate a result from Sannikov tivize, the optimal effort level falls. Other (2008). variables have non-monotonic relationships While the numerical results depend some- with promised utility. One important open what on the utility function, for the speci- question would be to take these theoretical fication in figure 1, when promised utility models to the data and determine which rises, consumption increases. As a result, functional form best describes the world. The absence of analytical solutions renders com- parative statics relatively difficult to obtain. 22 One closed-form solution is the one in the Edmans et al. (2012) setup, with a constant a​​∗​​ ​ (e.g., ​​ In many situations, greater risk aversion _ ​ a∗​​ a ​) and no outside opportunity (w​ ​). With ​ = = −∞ reduces incentives, for the same intuition as log utility (u​ (c, a) ln c g(a​)) and ​r ​, the reader can = − = ρ Holmstrom and Milgrom (1987)—incentives verify that ​c(x) D ​er ​​ x​, ​Q(x) A​er ​​ x​ B​ for some con- = = + 1 g(a) −γ are more costly, as they require the agent to stants A​ , B, D​. With CRRA ​​ u ​​ c​e−​​ ​ ​ /​1 ​ ​, the ( = ( ) ​( − γ)) 1/​1 ​ 1/​1 ​ be paid a greater risk premium. solution has the form ​c D ​x​​ ( −γ)​, Q(x) A ​x​​ ( −γ)​ B​. = = + The resulting contracts are described in section 4.1. Sannikov’s (2008) methodology has since Otherwise, the only known solutions are numerical. been used in executive compensation Edmans and Gabaix: Executive Compensation: A Modern Primer 1269

­applications. He (2012) extends the model to ­models that predict a backward-loaded wage allow for private saving. In standard models pattern to remove incentives for ­private sav- without private saving (e.g., Rogerson 1985), ing. However, to our knowledge, predictions the optimal wage profile isfront ­ loaded, that the growth rate in pay depends on the but such a profile will induce the agent to level of incentives ​ ​ and firm risk ​ ​ are as yet θ σ engage in a joint deviation of shirking and untested. Turning to the effects of short-ter- saving. He shows that the wage profile is mism, Edmans et al. (2012) predict that back loaded to deter such private saving. He firms in which the CEO has greater scope to also finds that pay does not fall upon poor engage in myopia should have longer vest- performance, but exhibits a permanent rise ing periods and also more rapidly increasing after a sufficiently good performance history. incentives over time. Consistent with the This downward rigidity is also predicted by first prediction, Gopalan et al. (2014) find Harris and Holmstrom (1982), but through a that incentives have longer horizons in firms quite different channel. Their model features with more growth opportunities and greater ­two-sided learning about the agent’s ability, R&D intensity. rather than moral hazard. Downward rigidity Turning to the predictions regarding ter- in wages insures the agent against negative mination, many models predict termination news about his ability, while wage raises after after poor performance, to deter shirking positive news ensure that he does not quit.23 ex ante, e.g., DeMarzo and Sannikov (2006), DeMarzo and Fishman (2007), Biais et al. 4.3 Empirical Analyses (2007), and Sannikov (2008).24 In particular, We now turn to tests of the empirical pre- the first three models feature limited liability, dictions of dynamic models. Lambert (1983), which reduces the principal’s ability to pun- Rogerson (1985), and Edmans et al. (2012) ish poor performance financially, thus lead- predict the “deferred reward principle”: firm ing to a role for termination. In some cases, performance should affect future, as well as such as Sannikov (2008), termination also current, pay due to consumption-smoothing arises after very good performance, as the considerations. Indeed, Boschen and Smith agent becomes too expensive to incentivize. (1995) show that firm performance has a However, historically, dismissals have been much greater effect on the NPV of future relatively rare. Murphy and Zabojnik (2007) pay than current pay. Gibbons and Murphy report turnover rates of 10 percent per year (1992) find support for the “increasing incen- in the 1970s and 1980s, which increased only tives principle,” that incentives rise over to 11 percent in the 1990s. Taylor’s (2010) time, although studying pay–performance learning model shows that the low rate of sensitivity rather than wealth–performance dismissals can only be justified if the costs sensitivity. This result is consistent with to ­shareholders of ­turnover exceed $200 both consumption-smoothing possibilities million. Turning to turnover–performance and career concerns falling with tenure. In sensitivity, Coughlan and Schmidt (1985), addition to incentives, Murphy (1986) finds Jensen and Murphy (1990), Warner, Watts, that pay increases over time, consistent with

24 Note that termination after poor performance is typi- 23 DeMarzo and Sannikov (2006) use the Sannikov cally not subgame-perfect, so moral hazard models assume (2008) framework to study optimal capital structure and that the firm can commit to terminate the CEO. Learning show that it can be implemented with standard securi- models predict subgame-perfect termination after poor ties—a credit line, long-term debt, and outside equity. performance, as such performance signals low managerial Since the agent always holds equity, the model focuses on quality (e.g., Jovanovic 1979, Taylor 2010, and Garrett and financing, rather than executive compensation. Pavan 2012). 1270 Journal of Economic Literature, Vol. LIV (December 2016) and Wruck (1988), and Weisbach (1988) find ­employment.25 Furthermore, some theo- that turnover probability is decreasing in ries predict that severance pay can be opti- performance, but the economic magnitude mal. Almazan and Suarez (2003) show that is small. Jensen and Murphy (1990) estimate it can induce the CEO to leave voluntarily that a CEO who performs in line with the when a more able replacement is available; market over two years has a 11.1 percent Inderst and Mueller (2010) demonstrate that dismissal probability; underperforming by it can deter a CEO from entrenching him- 50 percent in each year increases this proba- self by concealing negative information that bility only to 17.5 percent. Thus, even under would lead to his dismissal. One example is the aggressive assumption that the CEO to induce the CEO to accept a takeover bid, receives no severance package and is unable which typically yields a substantial premium to find alternative employment until retire- to shareholders. Manso (2011) shows that ment, Jensen and Murphy (1990) estimate severance pay is valuable to induce the CEO that incentives from dismissal are equivalent to explore new technologies, rather than to an equity stake of 0.03 percent. merely exploit existing ones. In the afore- Moreover, incentives from dismissal are mentioned model of He (2012), severance even lower if the CEO is granted severance pay leads to a ­backward-loaded wage pattern pay. In contrast to most theories, which advo- that is robust to private savings. cate that pay should be weakly monotonic in Recent studies of CEO turnover have firm performance, CEOs are often given sev- uncovered higher rates. Kaplan and Minton erance packages upon departure. Yermack (2012) aggregate both internal (board-driven)­ (2006) finds a mean contracted severance and external (through takeover and bank- pay of $0.9 million, with a mean discretion- ruptcy) turnover and find that total annual ary amount of $4.5 million; the respective turnover was 17.4 percent over 1998–2005. maximums are $36.1 and $121.1 million, A one standard-deviation fall in the industry- suggesting that these packages can be sub- adjusted stock return is associated with a stantial. Their usage is especially prevalent 3.4 percent increase in the likelihood of among dismissed CEOs, compared to those turnover. This figure is 2.1 percent for indus- who voluntarily retire, and thus appears to try performance and 1.8 percent for the per- reward CEOs for failure. However, a closer formance of the overall market. Jenter and look at the data suggests that the vast major- Lewellen (2014) find a total turnover rate of ity of “severance pay” does not stem from 11.8 percent per year, of which they estimate compensation for loss of employment, but 4.1– 4.5 percentage points (35–38 percent instead items such as unvested restricted of the total) are performance induced, i.e., shares, unexercised stock options, and would not have occurred had performance accrued pension benefits, which were prom- been good. ised and contractually obligated to the CEO Thus, more recent evidence suggests that under any state of nature. For example, the threat of job loss from poor performance out of Henry McKinnell’s ­much-criticized is significant. While these results support the $180 million severance package from Pfizer, prediction that firing occurs upon poor per- $78 million was deferred compensation formance, they do not support the prediction ($67 million contributed plus $11 ­million that it is also prompted by good ­performance. interest), $82 million was the present value In addition, moral hazard models typically of his pension plan, and $8 million was from stock options. Thus, only an incre- mental $11 million was due to the loss of 25 We thank David Yermack for this example. Edmans and Gabaix: Executive Compensation: A Modern Primer 1271 do not yield quantitative predictions for what discretionary award of $121.1 million. In the rate of firing or the sensitivity of firing addition, while Dittmann, Maug, and Spalt to performance should be, making it difficult (2013) find that indexation of options would to assess whether the observed findings are not create value for the average firm, it would optimal.26 for 25 percent of firms, and so the absence of indexation across all firms is difficult to rec- oncile with efficiency. 5. Open Questions Second, some aspects of compensation are hidden from shareholders, which is 5.1 Apparent Inefficiencies difficult to reconcile with them being set in Executive Compensation in shareholders’ interest. For example, Lie Thus far, we have argued that many fea- (2005) presents evidence that the positive tures of executive compensation that are stock returns after the disclosed grant dates frequently criticized may yet be consistent of executive stock options, first documented with efficient contracting. Examples include by Yermack (1997), arises from backdating. the level of pay, low $–$ incentives, the nega- Since options are typically granted at the tive relationship between incentives and firm money, the CEO—unbeknown to share- size, the use of stock rather than options, the holders—chooses the grant date in retro- lack of relative performance evaluation, and spect, to coincide with days on which the the use of severance pay and inside debt. stock price is low, thus justifying a low strike However, this empirical consistency does price. Similarly, recent corporate scandals not prove that compensation practices are such as Tyco uncovered executives extract- efficient. As discussed earlier, the positive ing perks that were initially unknown to correlation between pay and firm size is also shareholders. consistent with rent extraction, and may arise Third, current schemes often fail to keep because a third omitted variable drives both; pace with a firm’s changing conditions. similar concerns surround other empirical While section 3.5 argues that the CEO’s findings consistent with value-maximiza- incentives are sufficient in normal times to tion models. In addition, even accepting the induce effort, if a company encounters dif- empirical correlations discussed in this paper ficulties and its stock price falls, the delta of as supportive of value maximization in gen- his options declines, and so the options lose eral, there remain several features of com- much of their incentive effect. One remedy pensation that could be improved upon. used in practice is the repricing of out-of- First, empirical studies uncover results for the-money options (Brenner, Sundaram, and the average firm. However, even if compen- Yermack 2000; Acharya, John, and Sundaram sation is efficient on average, there may still 2000), but this is controversial, as it appears be several individual cases of rent extraction. to reward the CEO for failure. Even if the For example, while the theories discussed CEO is paid purely with stock (which always in section 4.3 may be able to justify­ the has a delta of one), the problem continues mean level of severance pay, their forces are to exist as long as the benefit of effort ​b(S)​ unlikely to be strong enough to rationalize is increasing in firm size. Intuitively, when extreme realizations, such as the maximum firm size falls, the benefits from effort are lower and so additional equity is needed to induce a given level of effort. For example, if 26 Taylor (2010) derives quantitative predictions for the rate of firing as a function of the cost of turnover to share- both the production and utility functions are holders, but in a learning model. multiplicative, the relevant measure is %–% 1272 Journal of Economic Literature, Vol. LIV (December 2016) incentives, the percentage of the CEO’s management. Edmans, Fang, and Lewellen wealth that is comprised of stock, and this (2016) study the quantity of equity sched- measure falls when the stock price declines. uled to vest in a given year, since this amount If the utility function is additive, the relevant depends on equity grants made several years ­measure is $–% incentives, the dollar value of prior and is thus likely exogenous to current the CEO’s equity, which also falls when the investment opportunities. They find that stock price declines. A simple way to replen- vesting equity is significantly negatively cor- ish incentives is to increase (reduce) the por- related with the growth rate of R&D and tion of the CEO’s salary that is given in equity capital expenditure, and positively correlated (cash) after the stock price falls; indeed, with the likelihood of meeting or narrowly Core and Guay (1999) show that firms use beating earnings targets. Johnson, Ryan, and new equity grants to move executives toward Tian (2009) show that unrestricted stock is their optimal incentive levels. Alternatively, positively correlated with corporate fraud. the incentive account discussed in section One potential solution to the potential 4.1 involves rebalancing the amount of the negative consequences of short horizons is CEO’s escrowed equity and deferred cash to extend the vesting period of equity. While to ensure he always has sufficient equity. the current debates surround the level Critically, in both cases, unlike the repricing of pay and the sensitivity of pay to perfor- of options, the CEO’s additional equity is not mance, extending the horizon of incentives given for free: it is paid for by a reduction in may be particularly valuable in overcoming cash. Thus, the CEO is reincentivized with- moral hazard. However, lengthening vesting out him being rewarded for failure. periods is not costless. First, doing so will Fourth, standard measures of CEO incen- potentially expose the CEO to risk outside tives, such as those considered in section 3, his control. Second, Laux (2012) shows the- only measure how the CEO’s wealth is oretically that, if the CEO forfeits unvested affected by changes to the current stock price. equity upon dismissal, he may engage in myo- They do not consider the extent to which the pic actions to avoid the risk of dismissal until CEO is also aligned with the ­long-term stock his equity has vested. Third, Brisley (2006) price, i.e., the horizon of incentives. The clas- demonstrates that unvested equity ties up sic managerial myopia models of Stein (1988, a significant portion of the CEO’s wealth 1989) show that short-term incentives can within the firm, and thus may cause him to lead to myopic actions. In a corporate con- turn down risky, value-creating projects. text, these actions can involve cutting R&D, We note that most of the potential reme- reducing employee training, writing loans dies—indexation (where valuable), a crack- that may become delinquent in the future, or down on perks, updating contracts, and expending corporate resources on earnings lengthening vesting periods—can be imple- management. Empirical studies of the hori- mented by shareholders (or shareholder- zon of incentives have been hindered by lack aligned boards) themselves, without the of data availability on the vesting period of an need for regulatory intervention. The issue executive’s equity, but recent studies suggest with regulation is that it is one-size-fits-all that horizons may affect behavior. Gopalan and cannot be adapted to a firm’sparticular ­ et al. (2014) use the recent change in dis- ­circumstances. For example, a minimum closure requirements to pioneer a measure vesting horizon of (say) five years for equity of the “duration” of incentives, analogous to may be too short to induce investment in the duration of a debt security. Shorter dura- growth industries, and too long (thus sub- tion incentives are correlated with earnings jecting the CEO to excessive risk) in mature Edmans and Gabaix: Executive Compensation: A Modern Primer 1273 industries. Indeed, Gopalan et al. (2014) firm overpays itsworkers ­ (e.g., due to poor find that equity “duration” is longer in firms ­corporate ­governance), this will lead to other with higher growth opportunities, long-term firms optimally doing so to remain compet- assets, and R&D intensity, suggesting that itive, even if they are well-governed. How the optimal vesting period varies according quantitatively important these possible to firm characteristics. The recent increases externalities remains an open question. in disclosure requirements, and say-on-pay legislation, are steps in the direction of allow- 5.2 Underexplored Areas ing shareholders to ensure the optimality of contracts, as both give the information and 5.2.1 Empirical Questions ability to decide whether a given pay package is appropriate in a particular context. We start with potential avenues for future Moreover, any policy to reform executive empirical analysis, before turning to ideas pay should not focus narrowly on compen- for theoretical research. There have been sation alone, but recognize the systemic several high-impact empirical studies of nature of the issue. Bolton, Scheinkman, executive compensation. Since the debate and Xiong (2006) show that shareholders about the efficiency of pay concerns mag- who wish to maximize the short-term stock nitudes, this is a field in which descriptive price may deliberately induce myopia by statistics alone are illuminating, for example voting for short-term CEO contracts. Thus, Jensen and Murphy’s (1990) and Hall and passing say-on-pay legislation alone may Liebman’s (1998) seminal works on quan- not improve value creation if shareholders tifying CEO incentives. Other studies have do not have the incentives to set pay opti- correlated CEO pay with outcomes such as mally. Reorienting shareholders to focus on firm value (e.g., Morck, Shleifer, and Vishny long-term value is critical for ensuring that 1988; McConnell and Servaes 1990; and greater shareholder power does indeed lead Himmelberg, Hubbard, and Palia 1999). to improvements in executive contracts. One However, assigning causality is very difficult, potential channel is to encourage sharehold- as there are very few instruments for CEO ers to take large stakes (e.g. through superior incentives. Even the very basic question of liquidity, or fewer disclosure requirements), whether CEO incentives positively affect since large shareholders have sufficient firm value has not yet been satisfactorily incentives to gather information on the answered. Thus, a first open question is to firm’s long-run value, rather than relying on find good instruments for, orquasi-exogenous ­ public information such as short-term profit shocks to, CEO pay, to allow identification of (Edmans 2009). the effects of incentives. There have been Regulatory intervention is, however, a limited number of attempts in this direc- valuable if externalities exist. In Bénabou tion. Palia (2001) argues that CEO experi- and Tirole (2016), competition causes ence, education, and age, and firm volatility firms to offer high incentives to screen for are instruments for executive compensation. ­high-ability managers. However, strong Core and Larcker (2002) study increases incentives also lead to managers focusing in stock ownership mandated by CEOs excessively on measurable tasks and shirking approaching the minimum levels set by pre- on unmeasurable tasks, echoing Holmstrom announced guidelines. Shue and Townsend and Milgrom (1991). Acharya and Volpin (forthcoming b) exploit the fact that options (2010) and Dicks (2012) show that com- are granted according to ­multiyear cycles as petition can lead to spillover effects: if one an instrument for option grants. Edmans, 1274 Journal of Economic Literature, Vol. LIV (December 2016)

Fang, and Lewellen (2016) and Edmans the effect on firm value of changes in these et al. (2016) analyze the ­scheduled vesting parameters, or how the possibility of myo- of equity resulting from grants made several pia or short-termism changes the contract. years prior. In addition, formal joint tests of a theory’s Second, most empirical studies have been quantitative predictions can highlight where focused on public firms in the United States, the theory fails, thus opening doors to future given the availability of the ExecuComp research. dataset. Research on compensation prac- As examples of structural approaches, tices in private firms would be particularly Gayle and Miller (2009) study the extent useful. Since private firms are likely closer to which moral hazard can explain the rise to the efficient benchmark, due to the in CEO pay. Margiotta and Miller (2000), presence of a concentrated shareholder, who find that firms would suffer large losses such data would allow a comparison with from not contracting optimally, and also similar public firms to assess whether pay estimate the gains that would arise in a first- in public firms represents rent extraction. best world where effort were observable. For example, Cronqvist and Fahlenbrach In addition, managers only require a small (2013) study firms transitioning from pub- risk premium for the risk imposed by incen- lic to private ownership. Additional studies tives—the benefits of incentives substan- investigating private firms in general (in tially outweigh the costs. Gayle and Miller addition to those that were formerly public) (2015) show that moral-hazard models in would be helpful. Another fruitful direction which managers can manipulate account- would be to study international data and ing reports better explain observed con- analyze the determinants of cross-country tracts than ones in which they cannot, and differences in CEO pay. Conyon, Core, and Gayle, Golan, and Miller (2015) decompose Guay (2011) and Fernandes et al. (2013) the sources of pay differences between are useful steps in this direction. Moreover, large and small firms. Pan (forthcoming) while data on CEO wealth (an important extends the Gabaix and Landier (2008) determinant of both risk aversion and the assignment model, which matches CEO private benefits from shirking) is typically talent with firm size, to incorporate addi- unavailable in the United States, it is some- tional dimensions of heterogeneity—for times available in other countries (see, e.g., example, more diversified firms hire CEOs Becker 2006). with more cross-industry experience and Third, structurally estimating a dynamic ­research-intensive firms hire CEOs who are moral hazard model may allow us to study more prone to innovation—and estimates questions that are difficult to answer with the importance of match specificity for pro- reduced-form approaches. For example, it ductivity. Using a learning model, Taylor may allow us to quantify several important (2010) estimates the cost of firing that determinants of the optimal contract that are would rationalize observed turnover rates; a otherwise difficult to measure empirically, similar approach may uncover whether fir- such as the CEO’s risk aversion, cost of effort, ing rates are optimal from a moral-hazard ability to engage in manipulation, and desire perspective. Similarly, while existing tests for consumption smoothing.27 Relatedly, it of the rent extraction versus shareholder can permit counterfactual analyses such as value hypotheses typically study the cross

27 Dupuy and Galichon (2014) advance the modeling Techniques such as theirs could be useful in executive and econometrics of multidimensional matching models. compensation. Edmans and Gabaix: Executive Compensation: A Modern Primer 1275 section, an analysis of time-series dynamics possibility of turnover adds ­complications to would allow us to study whether the evolu- a standard dynamic model of moral hazard. tion of pay over time is consistent with value The classic single-firm models of Lambert maximization. (1983) and Rogerson (1985) predict that the Fourth, and relatedly, modern dynamic reduction in CEO pay caused by poor perfor- contracting models have generated new mance should be spread out over all future empirical predictions that can be tested using periods, to optimize risk sharing. However, a reduced-form approach. Examples include the CEO may quit if future expected pay how the level of pay evolves over time and is low, reducing consumption smoothing whether this wage growth is increasing in possibilities.29 incentives and firm risk, how incentives Second, while the “rent extraction” view change over time, and the determinants of has been influential, its arguments have the optimal horizon of incentives. been mainly stated verbally, e.g., Bebchuk Fifth, while empirical studies have iden- and Fried (2004). It would be particularly tified a number of determinants of both useful to model the rent extraction view and the level of pay and incentives, there are compare its predictions to the data. One significant managerial fixed effects in both example is Kuhnen and Zwiebel (2009), (Graham, Li, and Qiu 2012; Coles and Li where the manager has freedom to extract 2013), suggesting that a large component perks, but doing so reduces profits and thus remains unexplained. These fixed effects shareholders’ assessment of the manager’s may result from talent, ability to extract rent, ability, which may lead to him being fired. preferences, or other characteristics. In addi- The model predicts that perk consump- tion, these studies assume separability of the tion is increasing in production uncertainty unobserved fixed effect and the other deter- (since it is easier to disguise low profits as minants. However, there may be interactions resulting from a negative shock) and the between them; for example, part of the fixed manager’s outside option (since firing is less effect may result from talent, and the impact of a concern). It is decreasing in the uncer- of talent on pay may depend on firm size. tainty about the manager’s ability, as then Future research can lead to a better under- profits have a greater effect on sharehold- standing of what these unobservable fixed ers’ assessments of his ability and thus their effects may represent, and how their effect firing decision. They find qualitative sup- may vary with other characteristics already port for these predictions, measuring hid- known to affect pay. den pay with stock options, restricted stock, and annual pay not declared as salary and 5.2.2 Theoretical Questions bonus. A further potential avenue in this We now move to open theoretical ques- line of research would be a rent extraction tions. First, most current market equilibrium model that generates quantitative predic- models are static. It would be useful to add a tions, and allows for a horse race between dynamic moral hazard problem where incen- the two viewpoints. tives can be provided not only through con- Third, existing models of CEO pay are tracts, but also by the threat of firing.28 This ­single-agent models, but CEOs work in will also allow us to understand­ what causes CEOs to move between firms. Moreover, the 29 The dynamic moral hazard models of DeMarzo and Sannikov (2006), DeMarzo and Fishman (2007), and Biais 28 See Eisfeldt and Kuhnen (2013) for a market equi- et al. (2007) assume ­risk neutrality, and so consumption librium model with CEO turnover without moral hazard. smoothing is a nonissue. 1276 Journal of Economic Literature, Vol. LIV (December 2016) teams where complementarities between and under what circumstances an implicit agents exist. As a result, their contracts affect contract can be sustained. firm value not only directly through affect- Sixth, with few exceptions, existing exec- ing the CEO’s effort, but also indirectly­ utive compensation models are rational. because the CEO’s effort level affects the Incorporating behavioral considerations has optimal effort level chosen by workers. This been successful in other fields ofcorporate ­ consideration in turn affects the optimal finance (see the survey of Baker and contract for the CEO. Separately, a team Wurgler 2013) and could be similarly fruitful­ setting allows the study of the relative wages here. Baker and Wurgler (2013) divide the of the CEO and other employees, a ques- ­behavioral corporate finance literature into tion that has been of interest to regulators. two fields—managers who are irrational Edmans, Goldstein, and Zhu (2013) analyze or have nonstandard utility functions, and these issues within a CEO setting; Che and rational managers exploiting inefficient mar- Yoo (2001), Kremer (1993), Winter (2004, kets. As an example of the former, Dittmann, 2006, 2010), and Gervais and Goldstein Maug, and Spalt (2010) show that incorpo- (2007) are analyses of contracting under rating loss aversion can explain the observed ­production complementarities in general mix of stock and options, which standard principal-agent settings. utility functions cannot. As an example of Fourth, there has been substantial theo- the latter, Bolton, Scheinkman, and Xiong retical progress on continuous-time agency (2006) show that contracts that emphasize models which allow for the contracting short-term performance may be a ratio- problem to be solved with few assumptions. nal response to speculative markets. Other However, the empirical predictions of such potential behavioral phenomena that could models are typically less clear, given the be incorporated into compensation models absence of analytical solutions, and because include bounded rationality, overconfidence numerical solutions depend on the parame- (overweighting private signals and under- ters chosen. Future research may be able to weighting public signals), and optimism identify clearer implications of these mod- (overestimating one’s own managerial ability els, in particular comparative statics on how or firm quality). incentives and turnover–performance sensi- Finally, turning to questions for both tivity should differ across firms. theoretical and empirical research, we now Fifth, contracting models assume that have quantitative theories for the level of the principal and agent decide on the rele- pay and “demand” side, given the supply of vant performance measures and a contract talent. However, we know relatively little at the start of the employment relation- on the “supply” side. Given the substantial ship. However, there is evidence that the pay premium that top executives command performance measures may be renegoti- over other skilled professions (e.g., medi- ated ex post (e.g., Morse, Nanda, and Seru cine or law), it would be interesting to study 2011), and that more than half the CEOs empirically the extent to which this pre- of S&P 500 firms do not have an explicit mium results from limited supply and, if so, employment contract, instead employ- explore theoretically why supply appears ing the CEO at will (Gillan, Hartzell, and to remain so limited—why more people do Parrino 2009). It would be interesting to not enter the business profession. A related study the optimal contract when the CEO topic is to understand better the nature of and firm wait until performance has been the scarcity of CEO talent, e.g., whether it realized before negotiating a ­sharing rule, stems from innate skills, experience, lack of Edmans and Gabaix: Executive Compensation: A Modern Primer 1277 succession planning, and so on.30 Separately, incentives, but increase the level of pay. The while learning models (outside the scope of rarity of relative performance evaluation this survey and listed at the end of the intro- appears to contradict the Holmstrom (1979) duction) have generally been developed ­informativeness principle, but we discussed and tested independently of moral hazard potential rationalizations of this practice. If ones, theories that combine both learning the principal aims to induce risk taking, as well and moral hazard, or empirical studies that as effort from the CEO, the contract should analyze the relative importance of learning be convex and generally contain options, in versus moral hazard for observed contracts, contrast to the linear contracts advocated by would be valuable. Holmstrom and Milgrom (1987). We finally discussed dynamic ­moral- hazard models. To achieve optimal risk shar- 6. Conclusion ing, the reward for good performance should This article has presented a number of be smoothed over future periods. As the value-maximization models of executive CEO approaches retirement, since there ­compensation under a unifying framework. are fewer periods over which to engage in We commenced with assignment models smoothing, the sensitivity of current pay to of the CEO labor market. More talented performance should rise. If the CEO can managers are matched with larger firms, engage in private saving, his wage profile is since their talent is scalable. Their talent typically backward loaded, to remove such also allows them to command higher wages, saving incentives. leading to quantitative predictions for the While each model has different features cross-sectional relationship between pay and and tackles different questions, we high- firm size. Since the dollar benefits of talent lighted two common themes. First, the are greater in larger firms, the model also modeling assumptions (e.g., whether pref- implies that pay should rise over time as erences or production functions are addi- average firm size grows. tive or multiplicative, whether actions are We then moved to static moral hazard continuous or discrete, and whether pri- models and showed that the correct empirical vate saving is possible) can have significant measure of incentives depends on whether impact on the model’s predictions. Second, we believe effort has additive or multiplica- we emphasized the empirical predictions tive effects on firm value and CEO utility. of the models and compared them to the Moreover, if the effect of effort scales with data. In particular, observed practices that firm size, $–$ incentives should optimally be are often interpreted as evidence of rent weaker in smaller firms. If effort is continu- extraction are also qualitatively, and some- ous and the optimal effort level is interior and times quantitatively, consistent with value endogenous, as in Holmstrom and Milgrom maximization. However, consistency with (1987), incentives should be increasing in the observed correlations does not prove that benefits of effort and decreasing in risk, risk real-life practices are optimal; instead, it aversion, and the cost of effort. In contrast, merely emphasizes caution before attempt- if effort is binary or continuous and the prin- ing to intervene by regulation. Whether cipal wishes to implement full productive observed contracts result from efficiency or efficiency, risk and risk aversion do not affect rent extraction is still an open question, and we highlighted other potential avenues for 30 For the supply of skills of general workers, see Goldin future research. We look forward to seeing and Katz (2009). this literature continue to evolve. 1278 Journal of Economic Literature, Vol. LIV (December 2016)

for some adapted process ​​ ​ ​. Combining this Appendix A: Longer Derivations ξ t with (80), we obtain (72). ∎ A.1 Proof of Two Core Results on Dynamic Incentives Rigorous proof. The following proof (after Sannikov 2008) Heuristic Proof of the Martingale is more rigorous, but less explicit about the Representation Theorem (72).—We show origins of the result. Define two proofs—one rigorous, one heuristic that shows the economic origin of the result. ∞ s V​ ​ ​ ​ ​ e​​ −δ ​ ​K​ s​ ds​ ∞ ≔ ∫0 Heuristic proof: We reason by keeping the ​dt​ terms, drop- and ​​Vt​ ​ ​Et​ [​ ​V​ ]​ ​. Then, ​​Vt​ ​ is a martin- 2 = ∞ ping the ​O​ d ​t​​ ​ ​ terms: gale. That implies that we can write d​ ​Vt​ ​ ( ) t ​​ t​ ​e​​ −δ ​ d ​Zt​ ​​​ for some adapted process ​​​ t​. = ξ t s ξ ∞ (s t) Hence, ​​Vt​ ​ V​ 0​ ​ ​ ​ ​ ​ s​ ​e​​ −δ ​ d ​Z​ s​. On the U​ ​ ​E​ ​ ​ ​ ​ e​​ −δ − ​ ​K​ ​ ds ​ = + ∫0 ξ t = t[∫t s ] other hand,

t dt + (s t) Vt​ ​ ​Et​ ​ ​V​ ​ ​ ​Et​ ​ ​ ​ ​ e​​ −δ − ​ ​K​ s​ ds ​ = [ ∞] = [∫t ] t s ∞ s ∞ (s t) ​ ​ ​ e​​ −δ ​ ​K​ ​ ds ​E​ ​ ​ ​ e​​ −δ ​ ​K​ ​ ds ​E​ ​ ​ ​ e​​ −δ − ​ ​K​ ​ ds ​ 0 s t t s t[ t dt s ] = + + ∫ + ∫ ∫ t dt s t ​K​ t​ dt o(dt) ​e​​ −δ ​ ​ ​ ​ e​​ −δ ​ ​K​ ​ ds ​e​​ −δ ​ ​U​ ​ , = + + = ∫0 s + t ∞ (​s (​t dt))​​ ​E​ ​ ​E​ ​ ​ e​​ −δ − + ​ ​K​ ​ ds ​ (s t) t[ t dt t dt s ] as ​​U​ ​ ​E​ ​ ∞​ ​ e​​ −δ − ​ ​K​ ​ ds​. Differentiating × + ∫ + t = t ∫t s with respect to ​t​, we obtain dt ​K​ t​ dt o(dt) ​e​​ −δ ​ ​Et​ ​ ​Ut​ dt​ = + + + t ​d ​Vt​ ​ ​​ t​ ​e​​ −δ ​ d ​Z​ t​ ​K​ ​ dt o(dt) = ξ = t + t t t (1 dt)​​U​ ​ ​E​ ​ d ​U​ ​ ​ o(dt) ​e​​ −δ ​ ​K​ ​ dt ​e​​ −δ ​ d ​U​ ​ ​e​​ −δ ​ ​U​ ​ dt​, + − δ ( t + t t) + = t + t − δ t ​U​ ​ ​​K​ ​ ​U​ ​ ​ dt hence, = t + ( t − δ t)

​Et​ [​ d ​Ut​ ]​ ​ o(dt)​. ​d ​Ut​ ​ ​( ​Ut​ ​ ​K​ t)​ ​ dt ​ ​ t​ d ​Z​ t​ . ​ + + = δ − + ξ ∎ Hence, Heuristic Proof of the Hamilton-Jacobi (80) 0 ​​K​ ​ ​U​ ​ ​ dt ​E​ ​ d ​U​ ​ ​. Bellman (HJB) Equation (73).—The result = ( t − δ t) + t[ t] is standard, but here we provide a heuristic As ​d ​U​ ​ ​E​ ​ d ​U​ ​ ​​ has mean zero, and the proof. We first ignore the maximization over t − t[ t] only source of randomness is d​ ​Zt​ ​, it makes ​C​. We have, as in (80), sense that it can be written

(81) ​d ​U​ ​ ​E​ ​ d ​U​ ​ ​ ​​ ​ d ​Z​ ​ 0 ​( f (​x​ t​ , Ct​ ​) ​Qt​ )​ ​ dt ​Et​ ​ d ​Qt​ ​ ​. t − t[ t] = ξ t t = − δ + [ ] Edmans and Gabaix: Executive Compensation: A Modern Primer 1279

Now, by Ito’s lemma using ​​Q​ ​ Q(​x​ ​)​, Comparative Statics for Fixed Pay .—We t = t ϕ ​​E​ ​​​ d​Q​ ​​ ​/dt Q (​x​ ​​) (x​ ​​ , C) 1_ ​ Q (​x​ ​​) have t [ t] t t 2 t 2 = ′ μ ​ + ″ ​ ​​ (​ ​x​ t​ , C)​. Hence, ​ ​​ ∗​ 4 4 2 2 2 × σ ____∂ϕ ​K​ ​ ​ 3 ​b​​ ​ ​S​​ ​ ​b​​ ​ gw​S​​ ​ ​ ​​ ​ 2S ​ ​ = η ( − ησ − ∂η ( ) 2 2 0 f (​x​ t​ , Ct​ ​) Q(​x​ t​) Q (​x​ t​) (​x​ t​ , Ct​ ​) 2 ​g​​ ​ ​ ​​ ​ S ​,​ = − δ + ′ μ + ησ ) where __ 1 ​ Q (​x​ ​) ​​ 2(​ ​x​ ​ , C​ )​ ​. + 2 ″ t σ t t ​b​​ 2​ ​ ​​ 2​ ​S​​ 2​ K​ ​ ______σ ​ .​ Next, the principal will want to maximize the η = 2 2 2 3 right-hand side over ​​C​ ​ , hence (73). 2 ​​(​b​​ ​ ​S​​ ​ g ​ ​​ )​ ​ ​ t ∎ + ησ ​ ​​ ∗ A.2 Proofs of Other Results Hence, ___​​ ∂ϕ ​ 0​ if and only if

∂η ​ > Proof of (42).—Setting the participation ​3 ​b​​ 4​ ​S​​ 4​ 2 ​b​​ 2​ g ​S​​ 3​ 2 ​g​​ 2​ ​ ​​ 2​ S constraint (38) to bind, evaluating the expec- + + ησ tation on the left-hand side, and equating the ​b​​ 2​ g ​S​​ 2​ ​ ​​ 2​ .​ exponents yields > ησ

2 2 2 Further, we have ​ E V __ 1 ​ g ​a​​ ∗ ​ __η ​​ ​ ​ ​​ ​ w.​ ϕ + θ [ ] − 2 − 2 θ σ = ​ ​​ ∗​ 4 4 2 2 2 ____ ∂ϕ ​K​ ​ ​ 3 ​b​​ ​ ​S​​ ​ ​b​​ ​ g ​S​​ (​ ​ ​​ ​ 2S)​ Substituting in (41) yields ​ = σ ( − ησ − ∂σ 2 2 (82) ​E c E V 2 ​g​​ ​ ​ ​​ ​ S ​, [ ] = ϕ + θ [ ] + ησ ) 2 ​​ b(​S)​ ​ ​ where w ______(θ ) __η ​​ 2​ ​ ​​ 2​.​ = + 2g + 2 θ σ ​b​​ 2​ ​S​​ 2​ K​ ​ ______ησ ​ .​ From (37), the principal’s objective function σ = 2 2 2 3 (​ ​b​​ ​ ​S​​ ​ g ​ ​​ )​ ​ ​ is ​E S b(S)​a∗​​ c ​. Substituting in (41) and + ησ [ + ​− ] (82) yields an objective function of ​ ​​ ∗ Hence, ___​​ ∂ϕ ​ 0​ if and only if ​ > 2 ∂σ __ 2 ______( b(S)​)​​ ​ __ 2 2 ​S ​gθ ​ ​b(S) ​​ ​ w − ​ θ ​ ​η ​​ ​​ ​ ​ ​​ .​​ ​3 ​b4​​ ​ ​S​​ 4​ 2 ​b2 ​​ ​ g ​S​​ 3​ + [ ] − 2g − 2 θ σ + The first-order condition with respect to​ ​ 2 ​g​​ 2​ ​ ​​ 2​ S ​b2 ​​ ​ g ​S​​ 2​ ​ ​​ 2​ .​ θ + ησ > ησ yields We have b(S)​​​ 2​ b(S​)​​ 2​ ______​​ [ ​] ​ ______θ ​ ​ ​​ 2 ​ 0 ____ ​ ​​ ∗​ 6 6 4 2 4 g g ∂ϕ ​ ​K​ g​ ​ ​b​​ ​ ​S​​ ​ 3 ​b​​ ​ g ​ ​​ ​ ​S​​ ​ − − ηθσ = g = ( + ησ ∂ 2 2 2 2 2 ______​ 1 ​ .​ 2 ​b ​​ ​ ​g​​ ​ ​ ​​ ​ ​S​​ (​ ​ ​​ ​ S)​ θ = 2 − ησ ησ − 1 g ____σ ​ ​ ​ + η (​ b(S )) 2 ​g ​​ 3​ ​ ​​ 2​ ​ ​​ 4​ S ​, + η σ ) 1280 Journal of Economic Literature, Vol. LIV (December 2016) where Next, consider total utility U​ ​: 2 2 K​ ​ ______​b​​ ​ ​S​​ ​ ​ .​ ​U ln ​c​ ​ ln ​c​ ​ g(​a​ ​) g(​a​ ​) g = 3 = 1 + 2 − 1 − 2 2 ​g​​ 2​ ​​ ​b​​ 2​ ​S​​ 2​ g ​ ​​ 2​ ​ ​ ( + ησ ) ​ ​​ ∗​ 2 ​ ​1​ ​r1​ ​ ​ 2​​ ​r2​ ​ g(​a1​ ​) g(​a2​ ​) Hence, ​​ ___∂ϕ ​ 0​ if and only if = θ + θ − − g ∂ > 2 ​ ​​ ​k​ ​ .​ b6 ​​ ​ ​S​​ 6​ 3 ​b4​​ ​ g ​ ​​ 2​ ​S​​ 4​ 2 ​b2 ​​ ​ ​g​​ 2​ ​ ​​ 2​ ​S​​ 3​ + κ1 + 2 + ησ + ησ ___ U 3 2 4 2 2 2 2 2 From (58), the two EF conditions are ​​E1​ ​ ​ ∂ ​ ​ 2 ​g ​​ ​ ​ ​​ ​ ​ ​​ ​ S 2 ​b ​​ ​ ​g​​ ​ ​ ​​ ​ ​S​​ ​ ​ ​​ ​ .​ U [ ​r1​ ​] + η σ > ησ ησ ∎ g (​a∗​​ )​ and ​​E​ ​ ​ ___∂ ​ ​ g (​a∗​​ )​, that is, ∂ 2 ​r​ ​ ≥ ′ ​ [∂ 2] ≥ ′ ​ Heuristic Proof of (61)–(64).—See Edmans 2 ​ ​​ g (​a∗​​ ) and ​​​ g (​a∗​​ ).​ et al. (2012) for a rigorous proof. We present θ1 ≥ ′ ​ θ2 ≥ ′ ​ a heuristic proof in a simple case that conveys the key intuition. We consider ​L T 2 ​ Intuitively (and as can be proven), the EF = = and impose the PS constraint. We wish to constraints should bind, else the CEO is show that the optimal contract is given by exposed to unnecessary risk. Combining the binding version of these constraints with (84) ​r1​ ​ (83) ln ​c​ ​ g ​(a∗​​ ) __ ​ ​​ , yields (83). 1 = ′ ​ 2 + κ1

​r1​ ​ Step 2 (Optimality of log-linear con- ln ​c​ ​ g (​a∗​​ )​ __​ ​r​ ​ ​ ​ ​​ ​k​ ​ 2 = ′ ​( 2 + 2) + κ1 + 2 tracts): We next verify that optimal contracts should be log-linear. Equation (58) yields for some constants ​​ ​​ and ​​k​ ​ that make the 1 2 ​d(ln ​c​ ​)/d ​r​ ​ g (​a∗​​ )​. The cheapest contract κ 2 2 ≥ ′ ​ participation constraint bind. involves this local EF condition binding, that is, Step 1 (Optimal log-linear contract): We first solve the problem in a restricted class (86) ​d(ln ​c​ ​)/d ​r​ ​ g (​a∗​​ ) ​​​ .​ where contracts are log-linear, that is, 2 2 = ′ ​ ≡ θ2 Integrating yields the contract (84) ​ln ​c​ ​ ​​​ ​r​ ​ ​ ​​ , 1 = θ1 1 + κ1 (87) ln ​c​ ​ ​​​ ​r​ ​ B(​r​ ​), ln ​c​ ​ ​​ ​ ​r​ ​ ​ ​​ ​r​ ​ ​ ​​ ​k​ ​ 2 = θ2 2 + 1 2 = θ21 1 + θ2 2 + κ1 + 2 where B​ (​r​ ​​)​ is a function of r​​ ​ ​, which we will for some constants ​​ ​​, ​​ ​ ​, ​​, ​​ ​​ , k​ ​. This first 1 1 θ1 θ21 θ2 κ1 2 determine shortly. It is the integration “con- step is not necessary but clarifies the eco- stant” of equation (86) viewed from time ​2​. nomics, and in more complex cases is helpful We next apply the PS constraint (64) for ​ to guess the form of the optimal contract. t 1​: First, intuitively, the optimal contract = entails consumption smoothing, that is, ​c​ ​ ​c​ ​ (88) 1 ​E​ ​ ​ __1 ​ ​ ​E​ ​ ​ ______1 ​ ​ 1[​c​ ​] 1 ​ ​​ ​r​ ​ B(​r​ ​) shocks to consumption are permanent. This = 2 = [​e​​ θ2 2+ 1 ​] observation implies ​​ ​21​ ​1​​. To prove this, θ = θ ​​r​ ​ B(​r​ ​) the PS constraint (64) yields ​E​ ​[​e​​ −θ​ 2 2​]​ c​ ​ ​e​​ − 1 ​. = 1 1 __​c1​ ​ (​​ 1​​ 21​ )​ ​ r1​ ​ 2​​ ​r2​ ​ k2​ ​ (85) 1 ​E1​ ​ ​ ​ ​ ​e​​ θ −θ​ ​ ​E1​ ​[​e​​ −θ​ −​ ​]​.​ = [​c2​ ​] = Hence, we obtain This must hold for all ​​r​ ​. Therefore, ​​ ​ ​ ​​​. (89) ln ​c​ ​ B(​r​ ​) K , ​ 1 θ21 = θ1 1 = 1 + ′ Edmans and Gabaix: Executive Compensation: A Modern Primer 1281 where the constant ​K ​ is independent of ​​r​ ​. The agent’s reservation utility ​w​ affects the ′ 1 (In this proof, ​K​, ​K ​, and ​K ​ are constants fixed salary ​ ​, which in turn has two effects ′ ″ ‴ ϕ independent of ​​r1​ ​ and ​​r2​ ​.) Total utility is on his effort choice. First, it affects his ben- efit from effort. A higher ​ ​ reduces the mar- ϕ (90) ​U ln ​c​ ​ ln ​c​ ​ K ginal utility of money v​ ​ j(V) ​ because = 1 + 2 + ″ ′ ϕ + the agent is risk averse. However,( it) does not ​​​ ​r​ ​ 2B(​r​ ​) 2K K .​ affect the marginal cost of effort, because = θ2 2 + 1 + ′ + ″ effort entails disutility rather than a financial We next apply (58) to (90) to yield 2​ B (​r​ ​) expenditure. Thus, with a linear contract, ′ 1 g (​a∗​​ ).​ Again, the cheapest contract the optimal action will depend on ​ ​. Second, ≥ ′ ​ ϕ involves this condition binding, that is, it affects the agent’s attitude towards risk ​ ​. ε ​2B (​r​ ​) g (​a∗​​ ).​ Integrating yields The noise realization affects the agent’s ′ 1 = ′ ​ benefit from effort since he is risk averse. __​r1​ ​ For example, if noise turns out to be high, (91) ​B(​r1​ ​) g (​a∗​​ ) K .​ = ′ ​ 2 + ‴ then the agent will be highly paid even with low effort; thus, the benefits from working Combining (91) with (89) yields: ​ln ​c1​ ​ ​r1​ ​ are lower: v​ ( )​ falls. Hence, the agent will g (​a∗​​ ) __ ​ ​ ​​, for another constant ​​ ​​. ′ · = ′ ​ 2 + κ1 κ1 integrate over the possible noise realizations Combining (91) with (87) yields: when making his effort choice. Since ​ ​ also ϕ lies in the marginal felicity function ​v ( )​, it ​r1​ ​ ′ · ln ​c​ ​ g (​a∗​​ ) ​ ​ __ ​r​ ​ ​ ​ ​​ ​k​ ​ ,​ affects the agent’s attitude towards risk and 2 = ′ ​ ( 2 + 2) + κ1 + 2 thus his effort choice. To remove the first effect, HM assume a for some constant ​​k​ ​. 2 pecuniary cost of effort, which corresponds to v​ (c) c​ and a general ​u( )​ in the objec- = · Appendix B: Further Detail tive function (10). Thus, the first-order con- on Holmstrom–Milgrom (1987) dition becomes This section provides further details on ​E​[u ( j(S b(S) a )) the role played by exponential utility, a pecu- ′ ϕ + + + ε niary cost of effort, and normal noise in the ( j (V) b(S) g (a))]​ 0.​ Holmstrom and Milgrom (1987) model. × ′ − ′ = In the general objective function (10), we Now, the marginal benefit of effort​ now assume u​(x) x​ so that the cost of j (V) b(S)​ and the marginal cost of effort g​ (a)​ = ′ ′ effort is non-pecuniary, and have a general are on the same footing: both lie inside the (rather than exponential) function v​( )​. · final term in parentheses. A high fixed wage​ We assume the contract can be rewritten ​ ​ reduces the benefit of effort, but also the c(V) j(V)​ where ​ ​ is the fixed com- ϕ = ϕ + ϕ cost of effort because this cost is in financial ponent of the contract and j​(V)​ is a possibly terms. However, ​ ​ still affects the attitude to ϕ non-linear function; this is without loss of risk since it is inside the ​u ( )​ term. Thus, generality since ​ ​ can be zero. The agent’s ′ · ϕ to remove this second effect, we also need x first-order condition becomes exponential utility, u​ (x) ​e−η​​ ​, so that the = − objective function (10) becomes E​ v ​ j​ S b(S) a j (V) b(S) ​ [ ′(ϕ + ( + + ε)​)​′ ] ( j(V) g(a)) ( j(V) g(a)) ​E​[ e​​ −η ϕ+ − ]​ ​ ​e​​ −ηϕ E[​ e​​ −η − ]​ ​ g (a).​ −​ = ​ −​ = ′ 1282 Journal of Economic Literature, Vol. LIV (December 2016) with first-order condition Baker, George P. 1992. “Incentive Contracts and Per- formance Measurement.” Journal of Political Econ- omy 100 (3): 598–614. (92) Baker, George P., and Brian J. Hall. 2004. “CEO Incen- tives and Firm Size.” Journal of Labor Economics 22 ( j(S b(S)a ) g(a)) (4): 767–98. ​E​ ​e​​ −η + +ε − ​ [η Baker, George P., Michael C. Jensen, and Kevin J. Mur- phy. 1988. “Compensation and Incentives: Practice ( j (S b(S) a )b(S) g (a))]​ 0.​ vs. Theory.” Journal of Finance 43 (3): 593–616. × ′ + + ε − ′ = Baker, Malcolm, and Jeffrey Wurgler. 2013. “Behav- ioral Corporate Finance: An Updated Survey.” In The fixed salary ​ ​, and thus the reservation ϕ Handbook of the Economics of Finance, Volume 2A, utility w​ ​, is now irrelevant. However, the edited by George M. Constantinides, Milton Harris, contract is still very difficult to solve as we and Rene M. Stulz, 357–424. Oxford and Amster- dam: Elsevier, North-Holland. cannot factor out the noise ​​. Again, since ε Bakija, Jon, Adam Cole, and Bradley T. Heim. 2012. the incentive constraint (92) must hold only “Jobs and Income Growth of Top Earners and the on average, there are many possible con- Causes of Changing Income Inequality: Evidence from U.S. Tax Return Data.” http://web.williams. tracts that will implement a given action a​​ ∗​​ ​. ​ edu/Economics/wp/BakijaColeHeimJobsIncomeGro The contract will depend critically on the wthTopEarners.pdf. distribution of noise ​ ​, which poses import- Baranchuk, Nina, Glenn MacDonald, and Jun Yang. ε 2011. “The Economics of Super Managers.” Review ant practical challenges as the noise distri- of Financial Studies 24 (10): 3321–68. bution is often unknown. Only with normal Barro, Jason R., and Robert J. Barro. 1990. “Pay, Per- noise are we able to calculate the expecta- formance, and Turnover of Bank CEOs.” Journal of 1 2 2 Labor Economics 8 (4): 448–81. tion, since then ​E​ ​e−ηε​​ _ ​ ​ ​​ ​ ​ ​​ ​. [ ​]​ = 2 η σ Bebchuk, Lucian A., K. J. Martijn Cremers, and Urs C. Peyer. 2011. “The CEO Pay Slice.” Journal of References ­Financial Economics 102 (1): 199–221. Bebchuk, Lucian, and Jesse Fried. 2004. Pay without Acemoglu, Daron, and David Autor. 2012. “What Does Performance: The Unfulfilled Promise of Executive Human Capital Do? A Review of Goldin and Katz’s Compensation. Cambridge, MA: Harvard University The Race between Education and Technology.” Jour- Press. nal of Economic Literature 50 (2): 426–63. Becker, Bo. 2006. “Wealth and Executive Compensa- Acharya, Viral V., Kose John, and Rangarajan K. Sund- tion.” Journal of Finance 61 (1): 379–97. aram. 2000. “On the Optimality of Resetting Execu- Bénabou, Roland, and Jean Tirole. 2016. “Bonus Cul- tive Stock Options.” Journal of Financial Economics ture: Competitive Pay, Screening, and Multitasking.” 57 (1): 65–101. Journal of Political Economy 124 (2): 305–70. Acharya, Viral V., and Paolo F. Volpin. 2010. “Corporate Benmelech, Efraim, Eugene Kandel, and Pietro Vero- Governance Externalities.” Review of Finance 14 (1): nesi. 2010. “Stock-Based Compensation and CEO 1–33. (Dis)Incentives.” Quarterly Journal of Economics Aggarwal, Rajesh K., and Andrew A. Samwick. 1999. 125 (4): 1769–820. “The Other Side of the Trade-Off: The Impact of Bennedsen, Morten, Francisco Pérez-González, and Risk on Executive Compensation.” Journal of Polit- Daniel Wolfenzon. 2010. “Do CEOs Matter?” https:// ical Economy 107 (1): 65–105. www0.gsb.columbia.edu/mygsb/faculty/research/ Albuquerque, Ana. 2009. “Peer Firms in Relative Per- pubfiles/3177/valueceos.pdf. formance Evaluation.” Journal of Accounting and Bereskin, Frederick L., and David C. Cicero. 2013. Economics 48 (1): 69–89. “CEO Compensation Contagion: Evidence from an Almazan, Andres, and Javier Suarez. 2003. “Entrench- Exogenous Shock.” Journal of Financial Economics ment and Severance Pay in Optimal Governance 107 (2): 477–93. Structures.” Journal of Finance 58 (2): 519–47. Bertrand, Marianne, and Sendhil Mullainathan. 2001. Anantharaman, Divya, Vivian W. Fang, and Guojin “Are CEOs Rewarded for Luck? The Ones without Gong. 2014. “Inside Debt and the Design of Corpo- Principals Are.” Quarterly Journal of Economics 116 rate Debt Contracts.” Management Science 60 (5): (3): 901–32. 1260–80. Bettis, Carr, John Bizjak, Jeffrey Coles, and Swamina- Arnott, Richard, and Joseph E. Stiglitz. 1988. “Ran- than Kalpathy. 2010. “Stock and Option Grants with domization with Asymmetric Information.” RAND Peformance-Based Vesting Provisions.” Review of Journal of Economics 19 (3): 344–62. Financial Studies 23 (10): 3849–88. Axtell, Robert L. 2001. “Zipf Distribution of U.S. Firm Biais, Bruno, Thomas Mariotti, Guillaume Plantin, Sizes.” Science 293 (5536): 1818–20. and Jean-Charles Rochet. 2007. “Dynamic Security Edmans and Gabaix: Executive Compensation: A Modern Primer 1283

Design: Convergence to Continuous Time and Asset Risk at Financial Firms.” Journal of Finance 70 (2): Pricing Implications.” Review of Economic Studies 839–79. 74 (2): 345–90. Coles, Jeffrey L., Naveen D. Daniel, and Lalitha Biais, Bruno, Thomas Mariotti, Jean-Charles Rochet, Naveen. 2006. “Managerial Incentives and Risk- and Stéphane Villeneuve. 2010. “Large Risks, Lim- Taking.” Journal of Financial Economics 79 (2): ited Liability, and Dynamic Moral Hazard.” Econo- 431–68. metrica 78 (1): 73–118. Coles, Jeffrey L., and Zhichuan Frank Li. 2013. “Man- Bolton, Patrick, and Mathias Dewatripont. 2005. Con- agerial Attributes, Incentives, and Performance.” tract Theory. Cambridge, MA: MIT Press. Unpublished. Bolton, Patrick, Jose Scheinkman, and Wei Xiong. Conyon, Martin J., John E. Core, and Wayne R. Guay. 2006. “Executive Compensation and Short-Termist 2011. “Are U.S. CEOs Paid More than U.K. CEOs? Behaviour in Speculative Markets.” Review of Eco- Inferences from Risk-Adjusted Pay.” Review of nomic Studies 73 (3): 577–610. Financial Studies 24 (2): 402–38. Boschen, John F., and Kimberly J. Smith. 1995. “You Conyon, Martin J., and Kevin J. Murphy. 2000. “The Can Pay Me Now and You Can Pay Me Later: The Prince and the Pauper? CEO Pay in the United Dynamic Response of Executive Compensation States and United Kingdom.” Economic Journal 110 to Firm Performance.” Journal of Business 68 (4): (467): 640–71. 577–608. Cooley, Thomas F., and Edward C. Prescott. 1995. Brander, James A., and Michel Poitevin. 1992. “Mana- “Economic Growth and Business Cycles.” In Fron- gerial Compensation and the Agency Costs of Debt tiers of Business Cycle Research, edited by Thomas Finance.” Managerial and Decision Economics 13 F. Cooley, 1–38. Princeton and Oxford: Princeton (1): 55–64. University Press. Brenner, Menachem, Rangarajan K. Sundaram, and Core, John E., and Wayne R. Guay. 1999. “The Use of David Yermack. 2000. “Altering the Terms of Execu- Equity Grants to Manage Optimal Equity Incentive tive Stock Options.” Journal of Financial Economics Levels.” Journal of Accounting and Economics 28 (2): 57 (1): 103–28. 151–84. Brisley, Neil. 2006. “Executive Stock Options: Early Core, John E., Wayne R. Guay, and David F. Larcker. Exercise Provisions and Risk-Taking Incentives.” 2003. “Executive Equity Compensation and Journal of Finance 61 (5): 2487–509. Incentives: A Survey.” Federal Reserve Bank of New Bushman, Robert M., Raffi J. Indjejikian, and Abbie York Economic Policy Review 9 (1): 27–50. Smith. 1996. “CEO Compensation: The Role of Indi- Core, John E., Wayne R. Guay, and Randall S. Thomas. vidual Performance Evaluation.” Journal of Account- 2005. “Is U.S. CEO Compensation Inefficient Pay ing and Economics 21 (2): 161–93. without Performance?” Michigan Law Review 103 Carpenter, Jennifer N. 2000. “Does Option Compensa- (6): 1142–85. tion Increase Managerial Risk Appetite?” Journal of Core, John E., and David F. Larcker. 2002. “Perfor- Finance 55 (5): 2311–31. mance Consequences of Mandatory Increases in Carroll, Gabriel. 2012. “When Are Local Incen- Executive Stock Ownership.” Journal of Financial tive Constraints Sufficient?”Econometrica 80 (2): Economics 64 (3): 317–40. 661–86. Cornelli, Francesca, Zbigniew Kominek, and Alexan- Cassell, Cory A., Shawn X. Huang, Juan Manuel San- der Ljungqvist. 2013. “Monitoring Managers: Does chez, and Michael D. Stuart. 2012. “Seeking Safety: It Matter?” Journal of Finance 68 (2): 431–81. The Relation between CEO Inside Debt Holdings Cosh, Andrew. 1975. “The Remuneration of Chief and the Riskiness of Firm Investment and Financial Executives in the United Kingdom.” Economic Jour- Policies.” Journal of Financial Economics 103 (3): nal 85 (337): 75–94. 588–610. Coughlan, Anne T., and Ronald M. Schmidt. 1985. Chaigneau, Pierre, Alex Edmans, and Daniel Gottlieb. “Executive Compensation, Management Turnover, 2016a. “The Informativeness Principle under Lim- and Firm Performance: An Empirical Investigation.” ited Liability.” Unpublished. Journal of Accounting and Economics 7 (1–3): 43–66. Chaigneau, Pierre, Alex Edmans, and Daniel Gottlieb. Cremers, K. J. Martijn, and Yaniv Grinstein. 2014. 2016b. “The Value of Information for Contracting.” “Does the Market for CEO Talent Explain Contro- Unpublished. versial CEO Pay Practices?” Review of Finance 18 Chang, Yuk Ying, Sudipto Dasgupta, and Gilles Hilary. (3): 921–60. 2010. “CEO Ability, Pay, and Firm Performance.” Cronqvist, Henrik, and Rudiger Fahlenbrach. 2013. Management Science 56 (10): 1633–52. “CEO Contract Design: How Do Strong Principals Do Chassang, Sylvain. 2013. “Calibrated Incentive Con- It?” Journal of Financial Economics 108 (3): 659–74. tracts.” Econometrica 81 (5): 1935–71. De Angelis, David, and Yaniv Grinstein. 2016. “Rela- Che, Yeon-Koo, and Seung-Weon Yoo. 2001. “Optimal tive Performance Evaluation in CEO Compensation: Incentives for Teams.” American Economic Review A Non-agency Explanation.”Unpublished. 91 (3): 525–41. DeMarzo, Peter M., and Michael J. Fishman. 2007. Cheng, Ing-Haw, Harrison Hong, and Jose A. Scheink- “Agency and Optimal Investment Dynamics.” man. 2015. “Yesterday’s Heroes: Compensation and Review of Financial Studies 20 (1): 151–88. 1284 Journal of Economic Literature, Vol. LIV (December 2016)

DeMarzo, Peter M., and Yuliy Sannikov. 2006. “Opti- Eisfeldt, Andrea L., and Camelia M. Kuhnen. 2013. mal Security Design and Dynamic Capital Structure “CEO Turnover in a Competitive Assignment in a Continuous-Time Agency Model.” Journal of Framework.” Journal of Financial Economics 109 Finance 61 (6): 2681–724. (2): 351–72. Demsetz, Harold, and Kenneth Lehn. 1985. “The Falato, Antonio, Dan Li, and Todd Milbourn. 2015. Structure of Corporate Ownership: Causes and “Which Skills Matter in the Market for CEOs? Evi- Consequences.” Journal of Political Economy 93 (6): dence from Pay for CEO Credentials.” Management 1155–77. Science 61 (12): 2845–69. Dicks, David L. 2012. “Executive Compensation and Farhi, Emmanuel, and Ivan Werning. 2012. “Capital the Role for Corporate Governance Regulation.” Taxation: Quantitative Explorations of the Inverse Review of Financial Studies 25 (6): 1971–2004. Euler Equation.” Journal of Political Economy 120 Dittmann, Ingolf, and Ernst Maug. 2007. “Lower Sal- (3): 398–445. aries and No Options? On the Optimal Structure of Fernandes, Nuno, Miguel A. Ferreira, Pedro Matos, Executive Pay.” Journal of Finance 62 (1): 303–43. and Kevin J. Murphy. 2013. “Are U.S. CEOs Paid Dittmann, Ingolf, Ernst Maug, and Oliver Spalt. More? New International Evidence.” Review of 2010. “Sticks or Carrots? Optimal CEO Compen- Financial Studies 26 (2): 323–67. sation When Managers Are Loss Averse.” Journal of Frydman, Carola. 2014. “Rising through the Ranks: Finance 65 (6): 2015–50. The Evolution of the Market for Corporate Execu- Dittmann, Ingolf, Ernst Maug, and Oliver Spalt. 2013. tives, 1936–2003.” Unpublished. “Indexing Executive Compensation Contracts.” Frydman, Carola, and Raven E. Saks. 2010. “Executive Review of Financial Studies 26 (12): 3182–224. Compensation: A New View from a Long-Term Per- Dittmann, Ingolf, Ko-Chia Yu, and Dan Zhang. 2015. spective, 1936–2005.” Review of Financial Studies 23 “How Important Are Risk-Taking Incentives in Exec- (5): 2099–138. utive Compensation?” Unpublished. Frydman, Carola, and Dirk Jenter. 2010. “CEO Com- Dupuy, Arnaud, and Alfred Galichon. 2014. “Person- pensation.” Annual Review of Financial Economics ality Traits and the Marriage Market.” Journal of 2: 75–102. Political Economy 122 (6): 1271–319. Gabaix, Xavier. 1999. “Zipf’s Law for Cities: An Expla- Edmans, Alex. 2009. “Blockholder Trading, Market nation.” Quarterly Journal of Economics 114 (3): Efficiency, and Managerial Myopia.” Journal of 739–67. Finance 64 (6): 2481–513. Gabaix, Xavier. 2014. “A Sparsity-Based Model of Edmans, Alex. 2011. “Does the Stock Market Fully Bounded Rationality.” Quarterly Journal of Econom- Value Intangibles? Employee Satisfaction and Equity ics 129 (4): 1661–710. Prices.” Journal of Financial Economics 101 (3): Gabaix, Xavier, and Augustin Landier. 2008. “Why Has 621– 40. CEO Pay Increased So Much?” Quarterly Journal of Edmans, Alex, Vivian W. Fang, and Katharina A. Economics 123 (1): 49–100. Lewellen. 2016. “Equity Vesting and Investment.” Gabaix, Xavier, Augustin Landier, and Julien Sauvag- Unpublished. nat. 2014. “CEO Pay and Firm Size: An Update after Edmans, Alex, and Xavier Gabaix. 2009. “Is CEO Pay the Crisis.” Economic Journal 124 (574): F40–59. Really Inefficient? A Survey of New Optimal Con- Garen, John E. 1994. “Executive Compensation and tracting Theories.” European Principal–Agent Theory.” Journal of Political Econ- 15 (3): 486–96. omy 102 (6): 1175–99. Edmans, Alex, and Xavier Gabaix. 2011a. “The Effect Garicano, Luis, and Esteban Rossi-Hansberg. 2006. of Risk on the CEO Market.” Review of Financial “Organization and Inequality in a Knowledge Econ- Studies 24 (8): 2822–63. omy.” Quarterly Journal of Economics 121 (4): Edmans, Alex, and Xavier Gabaix. 2011b. “Tractability 1383–435. in Incentive Contracting.” Review of Financial Stud- Garrett, Daniel F., and Alessandro Pavan. 2012. “Man- ies 24 (9): 2865–94. agerial Turnover in a Changing World.” Journal of Edmans, Alex, Xavier Gabaix, and Augustin Landier. Political Economy 120 (5): 879–925. 2009. “A Multiplicative Model of Optimal CEO Garrett, Daniel F., and Alessandro Pavan. 2015. Incentives in Market Equilibrium.” Review of Finan- “Dynamic Managerial Compensation: A Variational cial Studies 22 (12): 4881–917. Approach.” Journal of Economic Theory 159 (B): Edmans, Alex, Xavier Gabaix, Tomasz Sadzik, and Yuliy 775–818. Sannikov. 2012. “Dynamic CEO Compensation.” Gayle, George-Levi, Limor Golan, and Robert A. Journal of Finance 67 (5): 1603–47. Miller. 2015. “Promotion, Turnover, and Compensa- Edmans, Alex, Itay Goldstein, and John Zhu. 2013. tion in the Executive Labor Market.” Econometrica “Contracting with Synergies.” Unpublished. 83 (6): 2293–369. Edmans, Alex, Luis Goncalves-Pinto, Moqi Groen-Xu, Gayle, George-Levi, and Robert A. Miller. 2009. “Has and Yanbo Wang. 2016. “Strategic News Releases in Moral Hazard Become a More Important Factor in Equity Vesting Months.” Unpublished. Managerial Compensation?” American Economic Edmans, Alex, and Qi Liu. 2011. “Inside Debt.” Review Review 99 (5): 1740–69. of Finance 15 (1): 75–102. Gayle, George-Levi, and Robert A. Miller. 2015. Edmans and Gabaix: Executive Compensation: A Modern Primer 1285

“­Identifying and Testing Models of Managerial Haubrich, Joseph G. 1994. “Risk Aversion, Perfor- Compensation.” Review of Economic Studies 82 (3): mance Pay, and the Principal–Agent Problem.” Jour- 1074–1118. nal of Political Economy 102 (2): 258–76. Gervais, Simon, and Itay Goldstein. 2007. “The Positive He, Zhiguo. 2012. “Dynamic Compensation Contracts Effects of Biased Self-Perceptions in Firms.” Review with Private Savings.” Review of Financial Studies 25 of Finance 11 (3): 453–96. (5): 1494–549. Gibbons, Robert, and Kevin J. Murphy. 1992. “Opti- Hellwig, Martin F., and Klaus M. Schmidt. 2002. mal Incentive Contracts in the Presence of Career “Discrete-Time Approximations of the Holmstrom– Concerns: Theory and Evidence.” Journal of Political Milgrom Brownian-Motion Model of Intertemporal Economy 100 (3): 468–505. Incentive Provision.” Econometrica 70 (6): 2225–64. Gillan, Stuart L., Jay C. Hartzell, and Robert Parrino. Hermalin, Benjamin E. 2005. “Trends in Corporate 2009. “Explicit versus Implicit Contracts: Evidence Governance.” Journal of Finance 60 (5): 2351–84. from CEO Employment Agreements.” Journal of Hermalin, Benjamin E., and Michael S. Weisbach. Finance 64 (4): 1629–55. 1998. “Endogenously Chosen Boards of Directors Gjesdal, Frøystein. 1982. “Information and Incentives: and Their Monitoring of the CEO.” American Eco- The Agency Information Problem.” Review of Eco- nomic Review 88 (1): 96–118. nomic Studies 49 (3): 373–90. Hermalin, Benjamin E., and Michael S. Weisbach. Goldin, Claudia, and Lawrence F. Katz. 2009. The Race 2012. “Information Disclosure and Corporate Gov- between Education and Technology. Cambridge, ernance.” Journal of Finance 67 (1): 195–233. MA: Harvard University Press. Himmelberg, Charles P., R. Glenn Hubbard, and Dar- Goldman, Eitan, and Steve L. Slezak. 2006. “An Equi- ius Palia. 1999. “Understanding the Determinants of librium Model of Incentive Contracts in the Presence Managerial Ownership and the Link between Own- of Information Manipulation.” Journal of Financial ership and Performance.” Journal of Financial Eco- Economics 80 (3): 603–26. nomics 53 (3): 353–84. Gong, Guojin, Laura Yue Li, and Jae Yong Shin. 2011. Holmstrom, Bengt. 1979. “Moral Hazard and Observ- “Relative Performance Evaluation and Related Peer ability.” Bell Journal of Economics 10 (1): 74–91. Groups in Executive Compensation Contracts.” Holmstrom, Bengt. 1999. “Managerial Incentive Prob- Accounting Review 86 (3): 1007–43. lems: A Dynamic Perspective.” Review of Economic Gopalan, Radhakrishnan, Todd Milbourn, and Feng- Studies 66 (1): 169–82. hua Song. 2010. “Strategic Flexibility and the Opti- Holmstrom, Bengt, and Paul Milgrom. 1987. “Aggrega- mality of Pay for Sector Performance.” Review of tion and Linearity in the Provision of Intertemporal Financial Studies 23 (5): 2060–98. Incentives.” Econometrica 55 (2): 303–28. Gopalan, Radhakrishnan, Todd Milbourn, Fenghua Holmstrom, Bengt, and Paul Milgrom. 1991. “Multi- Song, and Anjan V. Thakor. 2014. “Duration of task Principal–Agent Analyses: Incentive Contracts, ­Executive Compensation.” Journal of Finance 69 (6): Asset Ownership, and Job Design.” Journal of Law, 2777–817. Economics, and Organization 7 (Special Issue): Graham, John R., Si Li, and Jiaping Qiu. 2012. “Man- 24–52. agerial Attributes and Executive Compensation.” Holmstrom, Bengt, and Jean Tirole. 1997. “Financial Review of Financial Studies 25 (1): 144–86. Intermediation, Loanable Funds, and the Real Sec- Grossman, Sanford J., and Oliver D. Hart. 1983. “An tor.” Quarterly Journal of Economics 112 (3): 663–91. Analysis of the Principal–Agent Problem.” Econo- Inderst, Roman, and Holger M. Mueller. 2010. “CEO metrica 51 (1): 7–45. Replacement under Private Information.” Review of Hall, Brian J., and Thomas A. Knox. 2004. “Underwa- Financial Studies 23 (8): 2935–69. ter Options and the Dynamics of Executive Pay-to- Innes, Robert D. 1990. “Limited Liability and Incen- Performance Sensitivities.” Journal of Accounting tive Contracting with Ex-Ante Action Choices.” Jour- Research 42 (2): 365–412. nal of Economic Theory 52 (1): 45–67. Hall, Brian J., and Jeffrey B. Liebman. 1998. “Are Ittner, Christopher D., David F. Larcker, and Mad- CEOs Really Paid Like Bureaucrats?” Quarterly hav V. Rajan. 1997. “The Choice of Performance Journal of Economics 113 (3): 653–91. Measures in Annual Bonus Contracts.” Accounting Hall, Brian J., and Kevin J. Murphy. 2000. “Optimal Review 72 (2): 231–55. Exercise Prices for Executive Stock Options.” Amer- Jensen, Michael C., and William H. Meckling. 1976. ican Economic Review 90 (2): 209–14. “Theory of the Firm: Managerial Behavior, Agency Hall, Brian J., and Kevin J. Murphy. 2002. “Stock Costs and Ownership Structure.” Journal of Finan- Options for Undiversified Executives.”Journal of cial Economics 3 (4): 305–60. Accounting and Economics 33 (1): 3–42. Jensen, Michael C., and Kevin J. Murphy. 1990. Harris, Milton, and Bengt Holmstrom. 1982. “A Theory “Performance Pay and Top-Management Incen- of Wage Dynamics.” Review of Economic Studies 49 tives.” Journal of Political Economy 98 (2): 225–64. (3): 315–33. Jenter, Dirk, and Fadi Kanaan. 2015. “CEO Turnover Harris, Milton, and Artur Raviv. 1979. “Optimal Incen- and Relative Performance Evaluation.” Journal of tive Contracts with Imperfect Information.” Journal Finance 70 (5): 2155–84. of Economic Theory 20 (2): 231–59. Jenter, Dirk, and Katharina Lewellen. 2014. 1286 Journal of Economic Literature, Vol. LIV (December 2016)

“­Performance-Induced CEO Turnover.” Stanford Lewellen, Stefan. 2015. “Executive Compensation and University Graduate School of Business Working Industry Peer Groups.” http://faculty.london.edu/ Paper 3054. slewellen/exec_comp.pdf. Jewitt, Ian. 1988. “Justifying the First-Order Approach Lie, Erik. 2005. “On the Timing of CEO Stock Option to Principal–Agent Problems.” Econometrica 56 (5): Awards.” Management Science 51 (5): 802–12. 1177–90. Lucas, Robert E., Jr. 1978. “On the Size Distribution Jin, Li. 2002. “CEO Compensation, Diversification, of Business Firms.” Bell Journal of Economics 9 (2): and Incentives.” Journal of Financial Economics 66 508–23. (1): 29–63. Manso, Gustavo. 2011. “Motivating Innovation.” Jour- Johnson, Shane A., Harley E. Ryan, Jr., and Yisong S. nal of Finance 66 (5): 1823–60. Tian. 2009. “Managerial Incentives and Corporate Margiotta, Mary M., and Robert A. Miller. 2000. Fraud: The Sources of Incentives Matter.” Review of “Managerial Compensation and the Cost of Moral Finance 13 (1): 115–45. Hazard.” International Economic Review 41 (3): Joskow, Paul, Nancy Rose, and Andrea Shepard. 1993. 669–719. “Regulatory Constraints on CEO Compensation.” McConnell, John J., and Henri Servaes. 1990. “Addi- Brookings Papers on Economic Activity: Microeco- tional Evidence on Equity Ownership and Corpo- nomics 1: 1–58. rate Value.” Journal of Financial Economics 27 (2): Jovanovic, Boyan. 1979. “Job Matching and the Theory 595–612. of Turnover.” Journal of Political Economy 87 (5 Part Morck, Randall, Andrei Shleifer, and Robert W. Vishny. 1): 972–90. 1988. “Management Ownership and Market Valu- Kaplan, Steven N. 2012. “Executive Compensation and ation: An Empirical Analysis.” Journal of Financial Corporate Governance in the United States: Percep- Economics 20: 293–315. tions, Facts, and Challenges.” Cato Papers on Public Morse, Adair, Vikram Nanda, and Amit Seru. 2011. Policy 2: 99–157. “Are Incentive Contracts Rigged By Powerful Kaplan, Steven N., and Bernadette A. Minton. 2012. CEOs?” Journal of Finance 66 (5): 1779–821. “How Has CEO Turnover Changed?” International Murphy, Kevin J. 1985. “Corporate Performance and Review of Finance 12 (1): 57–87. Managerial Remuneration: An Empirical Analysis.” Kaplan, Steven N., and Joshua Rauh. 2010. “Wall Street Journal of Accounting and Economics 7 (1–3): 11–42. and Main Street: What Contributes to the Rise in the Murphy, Kevin J. 1986. “Incentives, Learning, and Highest Incomes?” Review of Financial Studies 23 Compensation: A Theoretical and Empirical Investi- (3): 1004–50. gation of Managerial Labor Contracts.” RAND Jour- Kostiuk, Peter F. 1990. “Firm Size and Executive nal of Economics 17 (1): 59–76. Compensation.” Journal of Human Resources 25 (1): Murphy, Kevin J. 1999. “Executive Compensation.” In 90–105. Handbook of Labor Economics: Volume 3B, edited Kremer, Michael. 1993. “The O-Ring Theory of Eco- by Orley C. Ashenfelter and David Card, 2485–563. nomic Development.” Quarterly Journal of Econom- Amsterdam and New York: Elsevier, North-Holland. ics 108 (3): 551–75. Murphy, Kevin J. 2013. “Executive Compensation: Kuhnen, Camelia M., and Jeffrey Zwiebel. 2009. Where We Are, and How We Got There.” In Hand- “Executive Pay, Hidden Compensation and Manage- book of the Economics of Finance, Volume 2A: Cor- rial Entrenchment.” Unpublished. porate Finance, edited by George M. Constantinides, Lacker, Jeffrey M., and John A. Weinberg. 1989. “Opti- Milton Harris, and Rene M. Stulz, 211–356. Oxford mal Contracts under Costly State Falsification.” Jour- and Amsterdam: Elsevier, North-Holland. nal of Political Economy 97 (6): 1345–63. Murphy, Kevin J., and Jan Zabojnik. 2007. “Managerial Lambert, Richard A. 1983. “Long-Term Contracts and Capital and the Market for CEOs.” Unpublished. Moral Hazard.” Bell Journal of Economics 14 (2): Nagel, Gregory L. 2010. “The Effect of Labor Market 441–52. Demand on U.S. CEO Pay since 1980.” Financial Lambert, Richard A., and David F. Larcker. 1987. Review 45 (4): 931–50. “An Analysis of the Use of Accounting and Market Oyer, Paul. 2004. “Why Do Firms Use Incentives That Measures of Performance in Executive Compensa- Have No Incentive Effects?” Journal of Finance 59 tion Contracts.” Journal of Accounting Research 25: (4): 1619–50. 85–125. Oyer, Paul, and Scott Schaefer. 2005. “Why Do Some Lambert, Richard A., David F. Larcker, and Robert E. Firms Give Stock Options to All Employees?: An Verrecchia. 1991. “Portfolio Considerations in Valu- Empirical Examination of Alternative Theories.” ing Executive Compensation.” Journal of Accounting Journal of Financial Economics 76 (1): 99–133. Research 29 (1): 129–49. Palia, Darius. 2001. “The Endogeneity of Manage- Laux, Volker. 2012. “Stock Option Vesting Conditions, rial Compensation in Firm Valuation: A Solution.” CEO Turnover, and Myopic Investment.” Journal of Review of Financial Studies 14 (3): 735–64. Financial Economics 106 (3): 513–26. Pan, Yihui. Forthcoming. “The Determinants and Lazear, Edward P. 1979. “Why Is There Mandatory Impact of Executive–Firm Matches.” Management Retirement?” Journal of Political Economy 87 (6): Science. 1261–84. Peng, Lin, and Ailsa Roell. 2008. “Manipulation and Edmans and Gabaix: Executive Compensation: A Modern Primer 1287

Equity-Based Compensation.” American Economic Financial and Quantitative Analysis 20 (4): 391–405. Review 98 (2): 285–90. Stein, Jeremy C. 1988. “Takeover Threats and Mana- Peng, Lin, and Ailsa Roell. 2014. “Managerial Incen- gerial Myopia.” Journal of Political Economy 96 (1): tives and Stock Price Manipulation.” Journal of 61–80. Finance 69 (2): 487–526. Stein, Jeremy C. 1989. “Efficient Capital Markets, Inef- Peters, Florian S., and Alexander F. Wagner. 2014. ficient Firms: A Model of Myopic Corporate Behav- “The Executive Turnover Risk Premium.” Journal ior.” Quarterly Journal of Economics 104 (4): 655–69 of Finance 69 (4): 1529–63. Sundaram, Rangarajan K., and David L. Yermack. Piketty, Thomas. 2014. Capital in the Twenty-First 2007. “Pay Me Later: Inside Debt and Its Role in Century. Cambridge, MA: Harvard University Press. Managerial Compensation.” Journal of Finance 62 Piketty, Thomas, and Emmanuel Saez. 2003. “Income (4): 1551–88. Inequality in the United States, 1913–1998.” Taylor, Lucian A. 2010. “Why Are CEOs Rarely Fired? Quarterly Journal of Economics 118 (1): 1–41. Evidence from Structural Estimation.” Journal of Prendergast, Canice. 1999. “The Provision of Incen- Finance 65 (6): 2051–87. tives in Firms.” Journal of Economic Literature 37 Taylor, Lucian A. 2013. “CEO Wage Dynamics: Esti- (1): 7–63. mates from a Learning Model.” Journal of Financial Prendergast, Canice. 2002. “The Tenuous Trade-Off Economics 108 (1): 79–98. between Risk and Incentives.” Journal of Political Terviö, Marko. 2008. “The Difference That CEOs Economy 110 (5): 1071–1102. Make: An Assignment Model Approach.” American Roberts, David R. 1956. “A General Theory of Exec- Economic Review 98 (3): 642–68. utive Compensation Based on Statistically Tested Tirole, Jean. 2006. The Theory of Corporate Finance. Propositions.” Quarterly Journal of Economics 70 Princeton and Oxford: Princeton University Press. (2): 270–94. Tirole, Jean. 2009. “Cognition and Incomplete Con- Rogerson, William P. 1985. “The First-Order Approach tracts.” American Economic Review 99 (1): 265–94. to Principal–Agent Problems.” Econometrica 53 (6): Warner, Jerold B., Ross L. Watts, and Karen H. Wruck. 1357–67. 1988. “Stock Prices and Top Management Changes.” Rose, Nancy L., and Andrea Shepard. 1997. “Firm Journal of Financial Economics 20 (1–2): 461–92. Diversification and CEO Compensation: Managerial Wei, Chenyang, and David Yermack. 2011. “Investor Ability or Executive Entrenchment?” RAND Journal Reactions to CEOs’ Inside Debt Incentives.” Review of Economics 28 (3): 489–514. of Financial Studies 24 (11): 3813–40. Rosen, Sherwin. 1992. “Contracts and the Market for Weisbach, Michael S. 1988. “Outside Directors and Executives.” In Contract Economics, edited by Lars CEO Turnover.” Journal of Financial Economics 20 Werin and Hans Wijkander, 181–211. Cambridge (1–2): 431–60. and Oxford: Blackwell. Winter, Eyal. 2004. “Incentives and Discrimination.” Ross, Stephen A. 2004. “Compensation, Incentives, and American Economic Review 94 (3): 764–73. the Duality of Risk Aversion and Riskiness.” Journal Winter, Eyal. 2006. “Optimal Incentives for Sequential of Finance 59 (1): 207–25. Production Processes.” RAND Journal of Economics Sannikov, Yuliy. 2008. “A Continuous-Time Version of 37 (2): 376–90. the Principal–Agent Problem.” Review of Economic Winter, Eyal. 2010. “Transparency and Incentives Studies 75 (3): 957–84. among Peers.” RAND Journal of Economics 41 (3): Sappington, David. 1983. “Limited Liability Con­ 504–23. tracts between Principal and Agent.” Journal of Yang, Jun. 2010. “Timing of Effort and Reward: Three- Economic Theory 29 (1): 1–21. Sided Moral Hazard in a Continuous-Time Model.” Sattinger, Michael. 1993. “Assignment Models of the Management Science 56 (9): 1568–83. Distribution of Earnings.” Journal of Economic Lit- Yermack, David. 1995. “Do Award CEO erature 31 (2): 831–80. Stock Options Effectively?” Journal of Financial Shue, Kelly, and Richard R. Townsend. Forthcoming a. Economics 39 (2–3): 237–69. “Growth through Rigidity: An Explanation for the Yermack, David. 1997. “Good Timing: CEO Stock Rise in CEO Pay.” Journal of Financial Economics. Option Awards and Company News Announce- Shue, Kelly, and Richard R. Townsend. Forthcoming b. ments.” Journal of Finance 52 (2): 449–76. “How Do Quasi-random Option Grants Affect CEO Yermack, David. 2006. “Golden Handshakes: Separa- Risk-Taking?” Journal of Finance. tion Pay for Retired and Dismissed CEOs.” Journal Smith, Clifford W., and Rene M. Stulz. 1985. “The of Accounting and Economics 41 (3): 237–56. Determinants of Firms’ Hedging Policies.” Journal of Zhu, John Y. 2016. “Myopic Agency.” Unpublished. This article has been cited by:

1. Holger M. Mueller1, Paige P. Ouimet2 and Elena Simintzi3 1Stern School of Business, New York University, 44 West Fourth Street, New York, NY 10012, NBER, CEPR, and ECGI (e-mail: [email protected]) 2Kenan-Flagler Business School, University of North Carolina at Chapel Hill, Campus Box 3490, McColl Building, Chapel Hill, NC 27599 (e-mail: [email protected]) 3Sauder School of Business, University of British Columbia, 2053 Main Mall, Vancouver, BC, V6T 1Z2, Canada (e-mail: [email protected]) . 2017. Wage Inequality and Firm Growth. American Economic Review 107:5, 379-383. [Abstract] [View PDF article] [PDF with links]