EFFORT AND SELECTION EFFECTS OF PERFORMANCE
PAY IN KNOWLEDGE CREATION∗
Erina Ytsma†
It is well-documented that performance pay has positive effort and selection effects in routine, easy-to-measure tasks, but its effects in knowledge creation are much less understood. This paper studies the effects of explicit and implicit, market-based incentives commonly found in knowledge work industries in a multi-tasking model, and estimates the causal effort and selection effects of performance incentives in knowledge creation by exploiting the introduction of performance pay in German academia as a natural experiment, using a newly constructed dataset of the universe of German academics. I find that performance incentives attract more productive academics, and research quantity increases by 14 to 18%, but there is no increase in the highest quality output.
JEL J33, M52, O31
1 INTRODUCTION
Knowledge work is an important pillar of present-day economies. It has become rapidly more prevalent over the last four decades and exhibited consistent growth in occupational employment share1 (Autor, 2019). Furthermore, knowledge creation has long been considered an important driver of economic growth (Romer, 1986; Lucas, 1988). Yet much is still unclear about how to motivate knowledge workers, including how they respond to performance pay. This paper sheds
∗I would like to thank Laurence Ales, Pierre Azoulay, Oriana Bandiera, Tim Besley, Jordi Blanes-i-Vidal, Kenneth Corts, Pablo Casas-Arce, Baran Duzce, Florian Englmaier, Jeff Furman, Maitreesh Ghatak, Bob Gibbons, De Gruyter, Rosario Macera, Bentley Macleod, Michal Matejka, Bob Miller, Steve Pischke, Andrea Prat, Carol Propper, John van Reenen, Mark Schankerman, Axel Schniederjuergen, Ananya Sen, Chris Stanton, Scott Stern, Neil Thompson, Fabian Waldinger, the ministries of education of the German states and seminar and conference participants at DRUID 2016, the 31st EEA Congress, ESNASM 2017, ESEM 2017, ESEWM 17, IOEA 18, SIOE 18, BePE 2018, AEA 2019, GEIRC 2019, EAA 2019, GSE-OE 2019, AAA 2019, NBER Personnel Economics Summer Institute 2019, MAS Midyear 2020, CMU MAC 2019, MIT TIES, MIT IDE, MIT OE Lunch, Universidad Carlos III, Copenhagen Business School, Carnegie Mellon Tepper, UANDES, Universite Laval and MPI Munich for helpful comments, information or data. A previous version of this paper was titled “Career Concerns in Knowledge Creation”. †Carnegie Mellon University, Tepper School of Business. Tepper Building Room 4123, 5000 Forbes Avenue, Pittsburgh, PA 15213. e-mail: [email protected]. Phone: +1-412-268-1117. 1Knowledge work is defined here as non-routine cognitive jobs that comprise a host of intellectual tasks.
light on the effect of performance pay on knowledge creation by causally identifying the effort and selection effects of performance incentives in academia. It is by now well-understood that performance pay increases productivity in routine tasks and settings in which output is readily measurable (e.g. car window replacement, fruit picking, students’ test scores), through increases in effort or by attracting the most productive individuals (Lazear, 2000; Shearer, 2004; Bandiera, Barankay and Rasul, 2005; Leuven et al., 2011; Dohmen and Falk, 2011). However, it is not clear that performance pay would have the same effects in the context of knowledge work. For one, knowledge work generally comprises multiple, complex tasks, the output of which is often not measurable or only a noisy signal of effort. Multi-tasking problems are therefore likely to arise (Holmstrom and Milgrom, 1991; Hellmann and Thiele, 2011). Secondly, because quality dimensions such as impact and novelty are valuable outcome characteristics for many types of knowledge work, incentive systems may need to be structured differently, with a longer time-horizon and allowing for (early) exploration and experimentation (Azoulay, Graff Zivin and Manso, 2011; Manso, 2011; Ederer and Manso, 2013). Finally, knowledge workers may be particularly highly intrinsically motivated. Higher-powered extrinsic incentives may crowd out this intrinsic motivation, thus potentially reducing knowledge output (Benabou and Tirole, 2003; Bénabou and Tirole, 2006; Besley and Ghatak, 2005, 2018). In this paper, I study the effect of performance incentives on the quantity and quality of knowledge output and the productivity of knowledge workers attracted by high-powered incentives, both empirically and theoretically.
I present a simple multi-tasking model with explicit and implicit, market-based incentives commonly found in knowledge work industries and use this to derive testable implications for both average incentive effects and heterogeneous responses across ability types. I test the model’s predictions by exploiting the introduction of performance pay in German academia as a natural experiment, and using a newly constructed dataset of the universe of German academics2. The specifics of the roll-out of the performance-related pay scheme give rise to a differential incidence of performance incentives across tenure and age cohorts. This allows me to causally and separately identify the effort and selection effects in a difference-in-differences framework. The theoretical model presented in this paper builds on Gibbons and Murphy (1992) and features both explicit performance incentives (bonuses for performance on the job) and implicit, market-based incentives (wage supplements negotiated in contract talks). This combination of explicit and implicit incentives is a common feature of pay structures in knowledge work industries3. The model also incorporates multi-tasking issues as output has two dimensions, quantity and quality, the latter of which is less precisely measured (c.f. Holmstrom and Milgrom (1991)). Both output dimensions increase with effort as well as agent ability, and ability is imperfectly known by the market and the agent (i.e. there is symmetric uncertainty about agent
2This dataset is also used in Ytsma (2021). 3Market-based wages determined in contract negotiations (career concerns) and on-the-job performance bonuses are common performance incentives in academia and knowledge creation jobs more generally (Bonatti and Hörner, 2017), as well as managerial jobs (Gibbons and Murphy, 1992) and professional jobs such as in law (Ferrer, 2016), finance (Hong, Kubik and Solomon, 2000; Chevalier and Ellison, 1999) and software development (Lerner and Wulf, 2007).
ability). The market then uses output measures as signals of both effort and agent ability, to inform agent pay. This, in turn, creates incentives for the agent to exert effort. Because quality is less precisely measured, the incentives to exert effort toward quality are relatively weaker. Yet because the market takes both quantity and quality output into account to update beliefs about agent ability, incentives to exert quality effort are not absent either, and they are stronger for higher ability academics, for whom exerting effort towards both quantity and quality is assumed to be relatively less costly. In equilibrium, and relative to a flat wage, output quantity goes up unambiguously in response to performance pay, but output quality increases only if the quality output measure is sufficiently precise. These responses are not uniform across ability types. Performance pay increases quantity effort the most for the lowest ability workers, and least for workers of intermediate ability. Quality effort, on the other hand, increases the most or decreases the least for the most able workers and decreases the most for those of intermediate ability. That is, there is no simple substitution of quantity for quality; rather, the degree to which there is substitution varies across ability types and depends on the noise with which output dimensions are measured. Furthermore, the higher-powered incentives attract workers of higher ability (in expectation), since they can expect to earn more under performance pay. In order to empirically analyze the effect of performance pay in academia, I constructed a data set comprising the affiliations, research productivity measures and related information of the universe of academics in Germany by consolidating information from various, unstructured data sources.
To estimate the effort effect, I use the fact that any contract signed or renegotiated after the implementation of the reform necessarily falls under the new performance pay scheme, while any existing contract continues to fall under the old age-related pay scheme. Academics who start their first tenured affiliation before the reform therefore fall under the age-related pay scheme, while those who start their first tenured affiliation after the reform are paid according to the performance pay scheme. If the timing of the start of the first tenured affiliation is exogenous, any difference in the change of productivity from before to after the reform between academics who start their first tenured position just before the reform and those who start a first tenured position directly after the reform can be interpreted as the causal effect of performance pay on effort. I find that performance pay increases research quantity and quality-adjusted quantity by at least 14 to 18% on average4. At the same time, the average quality of publications5 decreases by 9 to 10% in response to performance pay. The response in output quantity is equivalent to treated academics publishing almost one extra paper every three years, while the decline in average quality is equivalent to a decrease of almost 0.22 in the impact factor of the journal in which the average publication appears (this is roughly equal to e.g. the difference in 2-year impact factor between the Journal of Political Economy and the American Economic Journal: Applied Economics (Clarivate Analytics, 2017)) or 3.6 fewer citations to publications. To get a better idea
4Pure research quantity is measured by the number of publications, while the impact factor-rated number of publications, and the number of citations to publications at least six years after publication are used as measures of quality-adjusted quantity. 5Measured as average impact factor rating and average number of citations per paper.
of the quality of the work produced in response to performance pay, I analyze the distribution of citations and impact factors. I find that treated academics produce more low- to medium-high-quality work, but not more of the highest quality research. These effort effects arise in response to the implicit performance incentives of the wage premiums determined in contract negotiations, with no additional significant effort response to the explicit on-the-job performance incentives. Furthermore, the effort effects are highly persistent; they do not diminish before the end of the study period eight years after implementation of the reform. I find no evidence of pre-existing trends, which lends support to the identifying parallel trends assumption of the difference-in-differences estimation, and no effect of the performance pay reform on academic effort in a placebo difference-in-differences estimation with two cohorts that tenure before the reform. The results are also robust to estimation with synthetic cohorts, where assignment to the treatment and control cohort is determined by the average age at which academics start their first tenured affiliation instead of the actual timing of the first tenured affiliation. These tests lend support to the causal interpretation of the effort effect estimates. The average effort effect estimates hide a considerable amount of heterogeneity, which helps explain the absence of an increase in the highest quality work. The greatest increase in quantity comes from the relatively less productive academics, who tend not to produce the highest quality work. High ability academics also increase the number of papers they produce, but they do not produce more of the highest quality work. Medium ability academics, finally, even decrease the number of high-quality papers they produce.
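The identification logic of the difference-in-differences design described above can be illustrated with a small simulation. The cohort sizes, baseline publication rate and 16% effect size below are hypothetical values chosen for illustration only; they are not estimates or data from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated annual publication counts for two tenure cohorts.
# "Treated" = academics whose first tenured position starts just after
# the reform; "control" = those who tenure just before it.
n = 2000
true_effect = 0.16  # hypothetical 16% effort effect, for illustration

control_pre = rng.poisson(2.0, n)   # control cohort, pre-reform period
control_post = rng.poisson(2.0, n)  # control cohort, post-reform period
treated_pre = rng.poisson(2.0, n)   # treated cohort, pre-reform period
treated_post = rng.poisson(2.0 * (1 + true_effect), n)  # treated, post

# Difference-in-differences: the change for the treated cohort minus
# the change for the control cohort nets out common time trends.
did = (treated_post.mean() - treated_pre.mean()) - \
      (control_post.mean() - control_pre.mean())
print(f"DiD estimate of the effort effect: {did:.3f} extra papers per year")
```

With a common baseline of two papers per year, the estimate recovers roughly 2.0 × 0.16 ≈ 0.32 extra papers per year, up to sampling noise; the parallel trends assumption holds here by construction, which is what the placebo and synthetic-cohort tests probe in the actual data.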
To provide further evidence on the quality and impact of the work produced in response to performance incentives, I use textual analysis techniques to construct metrics for the similarity of papers to past and future publications to gauge their novelty and impact, respectively. These metrics are based on cosine similarities of vector representations of the abstracts of papers, similar to Kelly et al. (forthcoming). The analyses confirm the quality response results; the additional work produced in response to performance pay is, on average, not the most novel or impactful. Top productivity academics produce more medium to highly novel work that generates mid- to high-level follow-on; low productivity academics produce additional papers that are just above the median in terms of both similarity to past and future work; and sub-top productivity academics produce more papers that are very novel, but garner only very little follow-on work. Finally, I estimate the selection effect of performance incentives by testing for differential changes in switching hazard rates by age and cohort for academics across average productivity levels in another difference-in-differences framework. For the first dimension of variation - tenure cohort - I use the fact that only academics who already hold a tenured affiliation before the reform select into the performance pay scheme when they change their affiliation, position or contract. Academics who start their first tenured affiliation after the reform automatically fall under the performance pay scheme from the start. Accordingly, the treated-control designation for the selection analysis is the opposite of what it was for the effort effect analysis, with academics
who hold a tenured affiliation before the reform now comprising the treated cohort. The second dimension of variation I exploit here is the age of an academic. The basic wages of the age-related and performance pay schemes compare differently at different ages. The schemes intersect only once, and wages increase with age in the age-related scheme. The performance pay scheme is therefore relatively less attractive for older academics, and selection incentives decrease with age. In line with this, I find that higher productivity increases the switching rates of treated academics less when they are older. Put differently, older academics need to be of relatively higher productivity in order for them to change affiliation or renegotiate their contract if such a change means switching to performance pay. Performance pay thus attracts more productive academics. Taken together, this paper shows that performance pay schemes commonly found in knowledge creation jobs significantly increase knowledge output quantity. The additional knowledge output produced is, however, not of the highest quality, for two reasons. Firstly, the quantity effort response is strongest for relatively less productive academics, whose work is not of the highest quality on average. Secondly, academics just below the productivity top actually decrease output quality, while the output quality of academics at the top of the productivity distribution does not change. At the same time, performance pay is effective in attracting more productive academics. Though these academics increase output quantity, they do not produce more of the highest quality work, so the two effects do not reinforce each other. Importantly, this means that the nature of the task matters for the effect performance pay has on output.
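The abstract-similarity metrics described earlier can be sketched in a few lines. The sketch below uses plain term-frequency vectors and cosine similarity; the actual metrics follow Kelly et al. (forthcoming) and use more refined (e.g. frequency-weighted) vector representations, and the three abstracts are invented placeholders, not real publications.

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between simple term-frequency vectors."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * \
           math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

# Hypothetical abstracts: novelty is proxied by LOW similarity to past
# papers, impact by HIGH similarity to future papers.
focal = "performance pay and effort in knowledge creation"
past = "age related pay schemes for professors"
future = "effort responses to performance pay incentives"

novelty_proxy = 1 - cosine_similarity(focal, past)
impact_proxy = cosine_similarity(focal, future)
print(round(novelty_proxy, 2), round(impact_proxy, 2))
```

In this toy example, the focal abstract shares little vocabulary with the past paper (high novelty proxy) and more with the future paper (higher impact proxy), which is the directional logic the paper's metrics exploit at scale.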
If tasks are complex, in that effort and output are multi-dimensional and output measures are noisy, the effort response might be more positive, or positive only, in the dimension(s) with less noisy output measures. In knowledge work and professional work, measures for output quality are likely more noisy. This paper shows that in such a setting performance pay may fail to increase high quality output and may decrease average quality. By studying the effect of performance pay on a non-routine, multi-dimensional and hard-to-measure task like knowledge creation, this paper contributes to the vast literature on performance incentives (cf. Lazear and Oyer (2012); Oyer and Schaefer (2011) for reviews), much of which has focused on more routine tasks with more precise output measures (e.g. Bandiera, Barankay and Rasul (2005); Shearer (2004); Lazear (2000)). Moreover, the (empirical) literature on incentives has mostly studied the effects of explicit performance incentives, ranging from piece-rate pay (Dohmen and Falk, 2011; Bandiera, Barankay and Rasul, 2005; Shearer, 2004; Lazear, 2000) or bonus pay (Hossain and List, 2012; Muralidharan and Sundararaman, 2011; Lavy, 2009) to tournament schemes (Leuven et al., 2011; Carpenter, Matthews and Schirm, 2010; Freeman and Gelber, 2010) and monitoring regimes (Boly, 2011; Dickinson and Villeval, 2008). This paper, on the other hand, studies the effort and selection effects of explicit and implicit performance incentives, in the form of career concerns, as these are common incentives in knowledge work and professional jobs. As such, the paper contributes to the literature on career concerns as well (Bonatti and Hörner, 2017; Miklós-Thal and Ullrich, 2016; Ferrer, 2016;
Holmström, 1999; Gibbons and Murphy, 1992; Holmström, 1982). This paper also contributes to the literature on incentives for innovation and knowledge creation as well as the literature on university governance (McCormack, Propper and Smith, 2014; Haeck and Verboven, 2012; Aghion et al., 2010) and the organization of knowledge creation (Jones, 2009; Wuchty, Jones and Uzzi, 2007; Audretsch and Feldman, 1996; Jaffe, Trajtenberg and Henderson, 1993). Many papers in the former literature look at incentives for commercializable knowledge and the commercialization of knowledge (patenting) (Hvide and Jones, 2018; Hall and Harhoff, 2012; Azoulay, Ding and Stuart, 2009; Lach and Schankerman, 2008, 2004). As a result, the academic fields that can be studied are generally restricted to the (applied) sciences, and it is frequently difficult to differentiate between changes in the production of knowledge that can be commercialized and the rates at which such knowledge is commercialized (patented). Of the papers that study incentives for academic (basic) research more generally, many focus on the effects of funding or awards (Borjas and Doran, 2015; Chan et al., 2014; Azoulay, Graff Zivin and Manso, 2011). In these settings, it is difficult to distinguish between the effect of funding itself and the incentive effect of funding or awards on output. Furthermore, it is difficult to distinguish between the effort effect of performance incentives and selection into the funding or award schemes. This paper contributes to this literature by providing causal and separate evidence of the effort and selection effects of common performance incentives in knowledge creation, across all academic fields. Finally, by studying the effect of performance incentives on the quality, impact and novelty of knowledge output, this paper also contributes to the literature on incentives for novelty and creativity.
A number of recent papers study the effects of competition in online platforms on the creativity or novelty of outputs such as logo designs (Gross, 2016), novels (Wu and Zhu, 2018) and software algorithms (Boudreau, Lacetera and Lakhani, 2011; Boudreau, Lakhani and Menietti, 2016), with mixed results. Gibbs, Neckermann and Siemroth (2017) find that rewarding employees for ideas for product and process improvements raises the quality of ideas submitted in a field experiment setting, while Erat and Gneezy (2016) find that piece-rate pay does not alter creativity, though competitive incentives reduce creativity in a lab experiment. Finally, Ederer and Manso (2013) show in another lab setting that the structure of rewards, allowing for early failure and rewarding long-term success, is important for innovation. The paper is structured as follows: the next section provides information on the institutional background and section 3 presents the theoretical model. The empirical analysis makes up section 4, with the first part focusing on the effort effect and the second part on the selection effect of performance pay. Section 5 concludes.
2 INSTITUTIONAL BACKGROUND
The German academic pay reform that I exploit as a natural experiment in this paper introduced a new pay scheme (“W-pay”) under which professors can earn performance-related bonuses
on top of a basic wage (BMBF, 2002). These performance-related bonuses can be substantial, potentially more than doubling a professor’s monthly wage. Before the reform, professors were paid according to an age-related pay scheme (“C-pay”) in which pay increased at a pre-determined rate with age (Hochschullehrerbund, 2009; Oeffentlicher Dienst, 2004; Expertenkommission, 2000).
2.1 Performance Pay (W-Pay)
There are three basic pay levels in the performance pay scheme: W1 pays a basic monthly wage of 3405.34 euro, W2 3890.03 euro and W3 4723.60 euro6 (Detmer and Preissler, 2006; Oeffentlicher Dienst, 2004). Tenured professors receive either W2 or W3 pay. Professors can earn performance bonuses on top of these basic wages in the W-pay scheme in three ways: as an attraction or retention bonus, for on-the-job performance, and for taking on management roles or tasks (BMBF, 2002). Although federal and state laws lay down the ground rules for performance pay, universities have discretion over whom to award performance bonuses to, and how much7. Many universities set out procedures for the award of bonuses in university statute supplements (Detmer and Preissler, 2006). Only tenured professors can earn substantial bonuses in the performance pay scheme. I therefore restrict attention to tenured professors when analyzing the effort and selection effects of the pay reform in this paper. The first kind of performance bonus, the attraction or retention bonus, is a wage premium that is determined as part of contract (re)negotiations and generally awarded on the basis of a professor’s qualifications and past achievements and performance, taking into account applicant pool quality and labor market tightness (Detmer and Preissler, 2005). In order to be able to negotiate such a bonus, professors are often required to show proof of an outside offer (Detmer and Preissler, 2005). The attraction and retention bonuses are thus implicit, market-based incentives. These incentives do not derive from an explicit performance contract between an academic and any particular university, but rather from the academic’s expectation of being able to influence future attraction or retention bonuses, either at the current university or another, by exerting more effort now.
Since an academic’s past performance is used to update beliefs about the academic’s ability, and labor market competition drives the (bonus) pay offered to reflect beliefs about the academic’s expected productivity, these incentives take the form of career concerns. Career concerns incentives are very common in academia and knowledge creation jobs more generally (Bonatti and Hörner, 2017; Lerner and Wulf, 2007), as well as in managerial and professional jobs (Chevalier and Ellison, 1999; Gibbons and Murphy, 1992).
6These were the basic wage levels under the performance pay scheme in former West-German states determined as of 1 August 2004. The corresponding basic wage levels in former East-German states were 92.5% of the West-German rates (Detmer and Preissler, 2006). 7See Handel (2005) for a comprehensive overview of how much discretion higher education institutes have regarding hiring and pay decisions after the reform in the different German states. Only very few states (e.g. Bremen) require the state’s minister of education to have a say in bonus negotiations (Detmer and Preissler, 2006).
In the German performance pay system, the career concerns incentives should take effect from the moment the reform is announced for those individuals that anticipate they may fall under the performance pay scheme at some point in the future, since improved performance before the implementation of the reform will increase the chance that they can command lucrative attraction or retention bonuses after implementation of the reform. I use this timing to separately identify the causal effort effect of the implicit, career concerns incentives from that of explicit performance incentives, to which I turn next. The second type of bonus introduced with the pay reform comprises bonuses for on-the-job performance. These can be awarded for performance in research, art, teaching, mentoring and supervision (BMBF, 2002). To assess research performance for instance, universities take into account the number and quality/rank of publications, external research grants, patents and research prizes, while exceptional teaching evaluations, the development of didactic methods and teaching grants and prizes can serve as evidence of special teaching achievements (Detmer and Preissler, 2005; Universitaet Regensburg, 2016; Gien, 2017). Most state laws stipulate that the performance for which on-the-job bonuses are awarded must have been effected over multiple years (often at least three years) (Handel, 2005). Universities generally have the option to award attraction or retention and on-the-job bonuses on a permanent basis, for a fixed term (initially) or even as a one-off payment (Detmer and Preissler, 2004, 2005). If bonuses are awarded for a fixed term with the option of renewal, universities frequently enter into a target agreement with the respective professor, especially if it concerns an attraction bonus for a first-time tenured professor (Detmer and Preissler, 2006).
The target agreement specifies the achievements, such as the number and type of publications and grants, that are expected of the professor in a 3- or 5-year period. If these targets are met, the attraction or retention bonus continues to be paid, either for another 3- to 5-year period, or permanently. Target agreements may also allow for partial fulfillment, such that, when a professor secures external funds below a certain threshold for instance, the bonus that they (continue to) receive is lower (Detmer and Preissler, 2006). The on-the-job bonuses and target agreements constitute explicit performance incentives of the sort commonly seen in knowledge creation, managerial and professional jobs (Lerner and Wulf, 2007; Hong, Kubik and Solomon, 2000; Gibbons and Murphy, 1992). They take effect only after the reform is implemented, when professors enter into the performance pay scheme. The third kind of bonus in the performance pay scheme comprises pay supplements for taking on management tasks or roles (BMBF, 2002). These bonuses are paid as lump-sum payments for the duration of the task or role and are therefore not performance incentives. Finally, the reform also introduced the option for professors to extract pay supplements from third-party awarded funds for research or teaching projects for the duration of such projects (BMBF, 2002). To the extent that grant application committees use an academic’s past performance to update their beliefs about the academic’s ability and chances of success, academics who anticipate falling under the performance pay scheme have an incentive to improve their
performance from the moment the reform is announced in an attempt to improve their chances of winning a grant and earning a pay supplement. The grant pay supplements thus introduce implicit performance incentives of a career concerns nature as well. The performance bonuses are not a rare occurrence. In 2006, only 23% of the professors in the performance pay scheme did not receive a performance bonus (BMI, 2007). Of the different types of bonuses, the attraction and retention bonus is the most important, both in terms of frequency of award and total amount awarded. Attraction and retention bonuses were awarded almost six times as often as on-the-job performance bonuses in 2005 and more than three times as often in 2006, and they were awarded almost six times as often as function-specific bonuses in both years (BMI, 2007). Furthermore, about 75% of the total amount of bonus pay in the performance pay scheme up until 2008 was awarded as attraction or retention bonuses (Biester, 2010).
2.2 Comparison With Age-Related Pay (C-Pay)
There are four pay levels in the age-related pay scheme (C1-C4); university professors are generally awarded C3 or C4 pay. The monthly salary in these levels increases every two years by roughly 170 euro8, from the age of 21 to the age of 49 (Hochschullehrerbund, 2009; Oeffentlicher Dienst, 2004; Expertenkommission, 2000). In contrast, the basic wage in the W-pay scheme does not vary with age, and the level is such that professors earn a higher before-bonus wage in the performance pay scheme at first, but once they get older, they would earn a higher basic wage in the C-pay system. Appendix Figure A.1 depicts the monthly wage by age for the several pay levels in the performance pay and age-related pay schemes. Thus the basic wage schedules have a single crossing point, the location of which depends on the specific pay level of an academic; the age-related wage starts to exceed the basic wage in the performance pay scheme either at age 33 or at age 43 (Oeffentlicher Dienst, 2004; Handel, 2005). Before the pay reform, professors in the highest pay level of the age-related pay scheme (C4) could earn bonuses when they received offers after their first appointment as C4-professor. These bonuses were standardized to be around 650 euro per month for the second C4-offer, and about 730 euro for the third C4-offer from another university, and roughly 75% of this if a counter-offer of the home university was accepted (Detmer and Preissler, 2006; Preissler, 2006; Dilger, 2013). By comparison, the average attraction and retention bonus in the W-pay system had already grown to 1187 euro per month in 2006, and the average on-the-job performance bonus to 1649 euro (BMI, 2007). Furthermore, only a small fraction of professors qualified for and received bonuses under the age-related pay system.
Handel (2005) for instance calculates, using data from the Ministry of Science and Culture in Niedersachsen, that only 16.5% of professors received attraction or retention bonuses in the age-related pay system. In contrast, any tenured professor in the performance pay system can receive bonuses, and in 2006 already 77% of professors in the
8Using pay tables valid as of August 2004 (Hochschullehrerbund, 2009).
performance pay scheme did receive bonuses (BMI, 2007). Consequently, only 3.55% of the total professorial pay volume was spent on attraction and retention bonuses in the age-related system before the reform, while an estimated 26% of the professorial pay volume was available for performance bonuses under the performance pay scheme immediately after the reform (Handel (2005), using data from Expertenkommission (2000)). Combined with the fact that, at most ages, the basic wage is lower in the performance pay system than in the age-related system, this means that a larger portion of professorial pay depends on performance and there is a greater spread in professorial pay in the W-pay system. The W-pay system therefore offers higher-powered performance incentives than the old, age-related pay system.
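The single-crossing comparison of the two basic wage schedules can be sketched numerically. The flat W2 wage and the roughly 170-euro biennial step come from the text, but the C-pay starting wage at age 21 is a hypothetical value chosen only so that a crossing falls in the documented age range; the actual C-pay tables differ by pay level.

```python
def age_related_wage(age: int, start_wage: float, step: float = 170.0) -> float:
    """C-pay basic wage: rises by `step` every two years from age 21 to 49."""
    increments = min(max(age - 21, 0), 49 - 21) // 2
    return start_wage + increments * step

W2_BASIC = 3890.03  # flat W-pay basic wage (euro/month), from the text
C_START = 3000.00   # hypothetical C-pay wage at age 21, for illustration

# Find the first age at which the age-related wage overtakes the flat
# W2 basic wage; with these numbers the crossing falls at age 33.
crossing_age = next(a for a in range(21, 70)
                    if age_related_wage(a, C_START) > W2_BASIC)
print(f"Age-related pay exceeds W2 basic pay from age {crossing_age}")
```

Under this hypothetical starting wage, the crossing lands at age 33, matching one of the two crossing ages the text reports; a higher starting wage or pay level shifts the crossing later, which is the sense in which the W-pay scheme is relatively less attractive for older academics.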
2.3 Implementation
The federal law introducing the new professorial pay scheme was passed by Germany’s parliament in February 2002 and applies to all public institutions of higher education. The law required all states to implement the reform within their respective jurisdictions by 1 January 2005 at the latest, and only three states (Bremen, Niedersachsen and Rheinland-Pfalz) did so before the end of 2004 (Detmer and Preissler, 2005). Hence any new or renegotiated professorial contract entered into as of 1 January 2005 falls under the performance pay scheme, while existing contracts continue to fall under the age-related pay scheme. Importantly, once a professor switches to performance pay, they can never go back to age-related pay (Detmer and Preissler, 2004). Professors do not have to wait for outside offers in order to be able to renegotiate their contract and switch to performance pay; they can opt into the performance pay scheme at any time after the reform’s implementation. Preissler (2006) however reports that few professors have exercised this option. Appendix A2 provides additional institutional details.
3 THEORETICAL FRAMEWORK
The model outlined here and presented in detail in Appendix section A1.2 illustrates how a combination of career concerns and explicit performance pay bonuses affects effort when effort and output have two dimensions: quantity and quality. The model builds on the multi-tasking model of Holmstrom and Milgrom (1991) and the career concerns model with explicit performance incentives in Gibbons and Murphy (1992). The combination of market-based wages reflecting perceived ability and explicit performance bonuses aligns with the structure of the performance pay scheme introduced in German academia, with its attraction and retention bonuses negotiated in contract talks and additional on-the-job bonuses. It is also representative of pay structures in academia and knowledge creation jobs more generally (Bonatti and Hörner, 2017), as well as managerial jobs (Gibbons and Murphy, 1992) and professional jobs such as in law (Ferrer, 2016), finance (Chevalier and Ellison, 1999; Hong, Kubik and Solomon, 2000) and software development (Lerner and Wulf, 2007). Dewatripont, Jewitt and Tirole (1999) also study a career concerns model with multitasking, but their analysis centers around total effort across
equally noisy output signals, while this model focuses on effort allocation across tasks (quantity and quality) that differ in the noise with which their output is measured.

Consider a labor market in which risk neutral principals interact with infinitely lived, risk averse agents. Each period $t = 0, 1, 2, \ldots$, agents choose effort $\vec{e}_t = (e_{p,t}, e_{q,t}) \geq 0$ and produce output $\vec{y}_t = (y_{p,t}, y_{q,t})$. Effort cannot be observed by principals, but output is observable to all market participants. Output has two dimensions, quantity $y_{p,t}$ and quality $y_{q,t}$, each of which is a noisy signal of agent ability $\theta$ and effort put towards the respective output dimension: $y_{p,t} = \theta + e_{p,t} + \varepsilon_t$ and $y_{q,t} = \theta + e_{q,t} + \nu_t$. Here $\varepsilon_t, \nu_t$ are iid sequences of normally distributed noise terms: $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$, $\nu_t \sim N(0, \sigma_\nu^2)$, $\sigma_\varepsilon^2 > 0$, $\sigma_\nu^2 > 0$. I assume that the quality measure is more noisy, $\sigma_\varepsilon^2 < \sigma_\nu^2$.

Ability is not known either by agents or principals, but there is common knowledge about the prior ability distribution. In particular, an agent's ability $\theta_i$ is an iid draw from a normal distribution with mean $m_{i,0} \in [\underline{m}, \overline{m}]$, $\underline{m} \geq 0$, and variance $\sigma_0^2 > 0$. I allow for abilities to be drawn from distributions with different means, to reflect the possibility that agents can distinguish themselves before entering the labor market, for instance during their studies.

Agents have CARA preferences with risk aversion parameter $r$. Their cost of effort is multivariate quadratic, specifically
$$c(\vec{e}_t) = \frac{1}{2}\left[\vec{e}_t - \vec{\bar{e}}\right]^T C \left[\vec{e}_t - \vec{\bar{e}}\right] = \frac{1}{2}\left[\vec{e}_t - \vec{\bar{e}}\right]^T \begin{pmatrix} c & d \\ d & c \end{pmatrix} \left[\vec{e}_t - \vec{\bar{e}}\right] \tag{1}$$
Inclusion of $\vec{\bar{e}}$ follows Holmstrom and Milgrom (1991) and ensures positive effort levels even in the absence of performance pay. These minimum effort levels capture other mechanisms and institutions that drive effort, such as intrinsic motivation or minimum output requirements (e.g. tenure requirements). For simplicity, and without loss of generality, I set $c = 1$ and, to assure concavity of the relevant optimization problems, I assume that $0 < d < 1$ so that $C$ is positive definite. This means that effort towards quantity $e_{p,t}$ and effort towards quality $e_{q,t}$ are substitutes.
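The concavity condition can be verified directly: with $c = 1$, the cost matrix has eigenvalues $1 - d$ and $1 + d$, both positive whenever $0 < d < 1$. A minimal numerical check (illustrative only; the function name is not from the paper):

```python
import numpy as np

def cost_matrix(d: float) -> np.ndarray:
    """Cost matrix C from equation (1), with c = 1."""
    return np.array([[1.0, d], [d, 1.0]])

# For any 0 < d < 1, C is positive definite: eigenvalues are 1 - d and 1 + d.
for d in (0.1, 0.5, 0.9):
    eigvals = np.linalg.eigvalsh(cost_matrix(d))
    assert np.all(eigvals > 0), f"C not positive definite for d={d}"
    print(d, np.round(eigvals, 3))
```

Positive definiteness of $C$ with a positive off-diagonal $d$ is exactly what makes the two effort dimensions substitutes: raising one effort raises the marginal cost of the other.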
I allow for the degree of substitution to vary by ability class, $d = d(m_0)$. In particular, I follow Rubin, Samek and Sheremeta (2018) in assuming that quantity and quality effort are weaker substitutes for agents in higher ability classes, $\partial d / \partial m_0 < 0$.

Under flat wage contracts, when all agents are paid the same wage regardless of output, all agents optimally provide minimal effort; $e_{p,t}^* = \bar{e}_p$ and $e_{q,t}^* = \bar{e}_q$. I contrast this against a performance pay system that comprises both explicit performance contracts and career concerns. In particular, consider a perfectly competitive labor market in which only short-term contracts are feasible. As in Gibbons and Murphy (1992), I restrict attention to linear contracts of the form $w_t(\vec{y}_t) = c_t + \vec{b}_t^T \vec{y}_t$. This not only ensures tractability, but Holmstrom and Milgrom (1991) also show that optimal contracts are linear in a setting with comparable assumptions about agent preferences and output noise. In each period, the timing of actions is as follows: principals offer a contract ($w_t$) to agents; agents pick the contract that yields the highest expected utility; agents
choose and exert effort; output materializes, principals receive the output produced by agents they employ and agents are paid according to their contract terms. Proposition 1 in Appendix 1 shows that equilibrium effort under this performance pay system is given by:
$$e_{p,t}^* = \bar{e}_{p,t} + \frac{(1-d) + r\left(\sigma_\nu^2 - d\sigma_\varepsilon^2\right)}{(1-d^2)\,D}\left(1 + r\,CC_t\,\sigma_\varepsilon^2\sigma_\nu^2\right) \tag{2}$$
$$e_{q,t}^* = \max\left[0,\; \bar{e}_{q,t} + \frac{(1-d) + r\left(\sigma_\varepsilon^2 - d\sigma_\nu^2\right)}{(1-d^2)\,D}\left(1 + r\,CC_t\,\sigma_\varepsilon^2\sigma_\nu^2\right)\right] \tag{3}$$
where $CC_t = 2\sum_{\tau=0}^{\infty} \delta^{\tau} \frac{\sigma_0^2}{\sigma_\varepsilon^2 \sigma_\nu^2 + (t+\tau)\sigma_0^2\left(\sigma_\varepsilon^2 + \sigma_\nu^2\right)}$ is the career concerns effect, $D := 1 + r\left(2\sigma_t^2 + \sigma_\varepsilon^2 + \sigma_\nu^2\right) + r^2\left(\sigma_t^2\left(\sigma_\varepsilon^2 + \sigma_\nu^2\right) + \sigma_\varepsilon^2\sigma_\nu^2\right)$ and $\sigma_t^2 := \mathrm{var}(\theta_t) = \frac{\sigma_0^2 \sigma_\varepsilon^2 \sigma_\nu^2}{\sigma_\varepsilon^2 \sigma_\nu^2 + t\sigma_0^2\left(\sigma_\varepsilon^2 + \sigma_\nu^2\right)}$. Proposition 2 in the Appendix provides testable implications for the effort and selection effects of this performance pay system, as compared to the flat wage system. I summarize these here.

First, effort towards quantity is unambiguously larger under performance pay, while quality effort is higher only for high ability agents and only if quantity and quality are sufficiently weak substitutes or the quality measure is not too noisy. If the quality measure is very noisy, no agent increases quality effort. Thus, while quantity is expected to increase unambiguously upon the introduction of performance pay, what happens to quality is ultimately an empirical question.

The second implication relates to heterogeneous effort responses. High ability agents increase quality the most or, if the output measure is very noisy, decrease it the least. Low ability agents may end up at a corner solution, where they put no effort towards quality. If the minimum effort level these agents (are required to) put towards quality absent performance pay is very low, the difference in quality effort of low ability agents between pay systems is nil or very small. As for quantity effort, if the quality measure is not too noisy, both low and high ability agents increase their effort levels the most, with intermediate ability classes the least responsive. However, if the quality measure is very noisy, such that no agent increases quality, the quantity response decreases monotonically with agent ability. Effort response heterogeneity by ability type is therefore an empirical question as well.

Third, if some, but not all, agents select into performance pay, it is only the high ability agents who do so.
The ability cut-off for such selection is higher when flat wages are higher. The same selection results apply to selection into academia and labor markets with similar incentive structures as compared to an outside option that provides a given baseline utility. Appendix 1 also shows that equivalent implications hold in the presence of career concerns incentives only. The effort and selection effects derived here are thus quite general and apply to incentive structures commonly found in knowledge work and professional jobs.
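The learning mechanism behind the career concerns effect can be made concrete: the market's residual uncertainty about ability, $\sigma_t^2 = \sigma_0^2\sigma_\varepsilon^2\sigma_\nu^2 / (\sigma_\varepsilon^2\sigma_\nu^2 + t\sigma_0^2(\sigma_\varepsilon^2 + \sigma_\nu^2))$, shrinks as more output signals are observed, so beliefs become harder to move and career concerns incentives fade with experience. A short sketch with hypothetical parameter values:

```python
def posterior_var(t, s0_sq, se_sq, sn_sq):
    """Posterior variance of ability (sigma_t^2 in the model) after observing
    t periods of the two output signals with noise variances se_sq, sn_sq."""
    return (s0_sq * se_sq * sn_sq) / (se_sq * sn_sq + t * s0_sq * (se_sq + sn_sq))

# Hypothetical parameters: prior variance 1, quality signal noisier than quantity.
s0_sq, se_sq, sn_sq = 1.0, 0.5, 2.0
for t in range(6):
    # Residual uncertainty falls monotonically in t, weakening career concerns.
    print(t, round(posterior_var(t, s0_sq, se_sq, sn_sq), 4))
```

At $t = 0$ the posterior variance equals the prior variance $\sigma_0^2$; each additional period of observed output lowers it further, which is why the model predicts the strongest career concerns responses early in the career, around the reform announcement.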
4 EMPIRICAL ANALYSIS
4.1 Data Description
In order to test the theoretical implications for the effort and selection effects of performance pay, I constructed an individual level panel data set that encompasses the affiliations of the universe of academics in German academia for the years 1999-2013, as well as their publication records from 1993 onwards. The data set also provides personal information such as the year in which an academic completed their PhD, obtained their postdoctoral qualification, and started working in academia, as well as an academic's gender, birth year and, if applicable, year of passing. All in all, the panel encompasses 50,174 academics who held a tenured position at a German public university at some point between 1999 and 2013. I restrict attention to public universities only, of which there are 89 in Germany in the years spanned by the panel (HRK, 2014). I discard higher education institutions other than universities, because I focus on research productivity as the outcome variable, and the research output of universities is incomparable to that of other higher education institutions (BMBF, 2002). I further restrict attention to academics who hold a tenured affiliation at some point in the sample period, because performance bonuses can be earned in tenured positions only. To construct the individual panel data set, I draw from three main, mostly unstructured, input data sets: Kuerschners Deutscher Gelehrten Kalender and Forschung & Lehre Magazine for affiliations, and ISI Web of Science for publications data. Kuerschners Deutscher Gelehrten Kalender (hereafter: DGK) is a comprehensive encyclopedia of academics affiliated with German universities (De Gruyter, 2013, 2006, 2008).
I use DGK as a register of the universe of academics affiliated with German universities and extract from it academics' personal information (full name, birth date, year of passing, gender) as well as professional information (academic affiliation at different points in time, start year of academic career in Germany, end year of academic career in Germany, self-reported information on career history). I supplement the information in DGK regarding the timing of affiliation changes and the obtainment of postdoctoral qualifications with information from Forschung & Lehre Magazine (hereafter: FuL) (DHV, 1999-2013). FuL is Germany's largest academic professional magazine. Every month, it publishes an overview of scholars that obtained their post-doctoral qualification (habilitation), as well as professorial offers that were extended, accepted or rejected. Finally, I extract publication records for the academics in my data set for the years 1993-2012 from the ISI Web of Science database to construct measures of research productivity (Clarivate Analytics, 1993-2012a), and I use journal impact factors taken from the ISI journal citation report (JCR) of the year of publication9 (Clarivate Analytics, 2000-2012b) to construct impact factor-rated publication measures. I match academics across the three input data sets on the basis of last name, initials and field, and discard any resulting duplicate matches. I further improve the matching by, for instance,
9Due to data availability limitations, I have ISI JCR data for the years 2000-2013 only. I therefore use the average of the impact factors from JCR 2000 through JCR 2004 to weight publications before 2000.
exploiting additional information such as the start or end date of academic careers to rule out implausible matches. Doing so yields an 83% match rate of academics whom FuL reports as having a tenured affiliation at a German university to academics listed in DGK. Differences in the spelling of names, typos and erroneous information regarding affiliation changes in FuL mostly explain the 17% that I cannot match. Where possible, I resolve such inconsistencies manually. I have direct information on the timing of half the affiliation changes10 in my panel data set from FuL. For the other half of affiliation changes, DGK provides the year of change in 23% of the cases11 and I infer the timing of the remaining affiliation changes from academic affiliations listed in DGK at different points in time, the year they passed their habilitation as well as the start and end year of their academic career in Germany recorded in DGK. A detailed description of the construction of the data set used for the analyses below is provided in Section A3 of the Appendix.
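The impact-factor weighting, including the backfill rule for pre-2000 publications described in footnote 9 (weight by the average of JCR 2000 through JCR 2004), can be sketched as follows. All data values and column names here are hypothetical, chosen only to illustrate the rule:

```python
import pandas as pd

# Hypothetical JCR table: journal-by-year two-year impact factors (2000 onward).
jcr = pd.DataFrame({
    "journal": ["J1"] * 4 + ["J2"] * 4,
    "jcr_year": [2000, 2001, 2003, 2004] * 2,
    "impact_factor": [2.0, 2.2, 2.4, 2.6, 1.0, 1.1, 1.2, 1.3],
})

pubs = pd.DataFrame({
    "author_id": [1, 1, 2],
    "journal": ["J1", "J2", "J1"],
    "pub_year": [1998, 2003, 2004],  # the 1998 paper has no same-year JCR
})

# Pre-2000 publications: weight by the average of JCR 2000 through JCR 2004.
early_if = (jcr[jcr["jcr_year"].between(2000, 2004)]
            .groupby("journal")["impact_factor"].mean()
            .rename("if_early").reset_index())

pubs = pubs.merge(jcr, left_on=["journal", "pub_year"],
                  right_on=["journal", "jcr_year"], how="left")
pubs = pubs.merge(early_if, on="journal", how="left")
pubs["weight"] = pubs["impact_factor"].where(pubs["pub_year"] >= 2000,
                                             pubs["if_early"])
print(pubs[["author_id", "journal", "pub_year", "weight"]])
```

Summing `weight` within author and (backdated) publication year then yields the impact factor-rated publication counts used as quality-adjusted quantity measures.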
4.2 Effort Effect
In order to identify the pure effort effect of the introduction of performance pay in German academia on knowledge creation, I use the fact that any contract for a professorial position at a public university in Germany signed or renegotiated as of 1 January 2005 necessarily falls under the performance pay scheme, whereas any contract signed before this date falls under the old, age-related pay scheme12. Accordingly, academics who start their first tenured affiliation before 2005 continue to fall under the age-related pay scheme, whereas academics who start their first tenured affiliation after 2004 switch to the performance pay scheme upon starting their first tenured position. If the timing of the start of the first tenured position is exogenous, the performance incentives that first-time tenured affiliates face are exogenous as well. I can then identify the causal effort effect of performance pay on knowledge creation by comparing the change in research productivity from before to after the pay reform of academics who start their first tenured affiliation before 2005 (the control group) with the change in research productivity of academics who start their first tenured affiliation as of 1 January 2005 (the treatment group). Unless indicated otherwise, I use a three-year window before and after the reform to define the treatment and control group for the analyses below in order to abstract from seniority effects. Thus the treatment group consists of academics who start their first tenured position at a public university in 2005, 2006 or 2007, while the control group consists of academics who start their first tenured affiliation at a public university in 2002, 2003 or 2004. Results are however robust to extending or reducing the cohort window (Appendix Table A.3 panels E and F). I exclude academics who hold a foreign affiliation before their first tenured affiliation in Germany to avoid
10Where at least one of the affiliations concerns a tenured position at a German university. 11This is self-reported career information and hence may introduce bias in my data set. I therefore use the information regarding affiliation changes provided in FuL whenever available. Reassuringly though, a consistency check revealed that the information in DGK regarding the timing of affiliation changes differs from that in FuL for only 5% of the individuals who change a (tenured) affiliation at least once. 12With the exception of Bremen, Niedersachsen and Rheinland-Pfalz, which introduced performance pay before this deadline (in 2003 and 2004, respectively) (Detmer and Preissler, 2005). Note that using 2005 as the uniform before-after cut-off yields a conservative measure of the effort effect, since some of the control group is, in fact, already treated before this time.
confounding the effort effect with selection effects of performance pay. The treated cohort comprises 2,844 academics, the control cohort 3,197.
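The cohort definition above is a pure function of the year of the first tenured affiliation. A minimal sketch (the function name and window bounds are taken from the text; treat the implementation as illustrative):

```python
def assign_cohort(first_tenure_year):
    """Assign an academic to the treatment or control cohort based on the year
    of their first tenured affiliation at a German public university.

    Contracts signed as of 1 January 2005 fall under performance pay, and a
    symmetric three-year window around the reform abstracts from seniority.
    """
    if 2005 <= first_tenure_year <= 2007:
        return "treatment"  # first tenured under the performance pay scheme
    if 2002 <= first_tenure_year <= 2004:
        return "control"    # first tenured under the age-related scheme
    return None             # outside the baseline cohort window

assert assign_cohort(2006) == "treatment"
assert assign_cohort(2003) == "control"
assert assign_cohort(2010) is None
```

Widening or narrowing the two windows (as in the robustness checks of Appendix Table A.3) amounts to changing the bounds in the two conditions.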
4.2.1 Descriptive Statistics
For the effort effect analysis I focus on measures of research productivity that are based on the publications of academic i in field f in year t + x_f, where x_f denotes the average publication lag in field f, rounded up to the nearest year. The average publication lags are taken from Björk and Solomon (2013) and range from 8 months for Chemistry to 18 months for Economics and Business. Correspondingly, I backdate publications by one to two years. After correcting for average publication lags I have productivity measures for (at least) 18 years, from 1993 through 2010; from 9 years before the announcement of the reform until 9 years after, and from 12 years before the implementation of the reform until 6 years after. The productivity measures take all available publication types into account, from journal articles and books to book chapters and conference proceedings. The exception is the citation measures, which cover only articles in the journals indexed by Web of Science and do not include citations to books, chapters, patents, etc. Table 1 reports summary statistics for the treatment and control cohorts. On average, academics in the treatment and control cohorts publish almost 3 papers per year. Weighting each publication by the two-year impact factor of the outlet in which it appears brings this sum to almost 10. To put this in perspective, the latest two-year impact factor ratings, available for 2017, put the top five general interest journals in economics at impact factors between 3.750 (Econometrica) and 7.863 (Quarterly Journal of Economics), while a top field journal like the Journal of Labor Economics had an impact factor of 3.607 (Clarivate Analytics, 2017). Impact factors can be considerably higher than this: Science had an impact factor of 41.058 and Nature of 41.577, for instance (Clarivate Analytics, 2017). The average total number of citations to publications from a given year is 102.
Citations data was extracted from ISI Web of Science in January 2019, so there are at least 6 years between the publication date and the time at which citations were counted for each publication. The distributions of these variables are highly skewed: the median academic does not have any publication in any given year, while the most productive academics produce orders of magnitude more work than the average academic by any of these measures.
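The field-specific backdating rule can be sketched as follows. The two lags shown are the endpoints reported from Björk and Solomon (2013); the mapping is illustrative and does not reproduce the paper's full field list:

```python
import math

# Average publication lag in months, by field (illustrative subset).
PUB_LAG_MONTHS = {"Chemistry": 8, "Economics and Business": 18}

def backdate(pub_year, field):
    """Assign a publication to the year the work was (approximately) done:
    subtract the field's average lag, rounded up to whole years (x_f)."""
    lag_years = math.ceil(PUB_LAG_MONTHS[field] / 12)
    return pub_year - lag_years

assert backdate(2006, "Chemistry") == 2005               # 8 months -> 1 year
assert backdate(2006, "Economics and Business") == 2004  # 18 months -> 2 years
```

This is why the dependent variable below is indexed Y_{i,f,t-x_f}: output observed in publication year t is attributed to the effort year t - x_f.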
4.2.2 Baseline Difference-in-Differences
I estimate the effort effect in a parametric difference-in-differences model:
$$E\left[Y_{i,f,t-x_f} \,\middle|\, X_{i,f,t}\right] = \exp\!\left[\alpha_i + \beta_1\,\mathrm{Post'02}_t \times \mathrm{Treatment}_i + \beta_2\,\mathrm{Tenure}_{i,t} \times \mathrm{Treatment}_i + \sum_{j=-7}^{7} ttt_{i,j} + \gamma_t\right] \tag{4}$$
The dependent variable, Y_{i,f,t-x_f}, is a productivity measure of academic i in field f in year t - x_f, where x_f denotes the average publication lag in field f as defined above. The Treatment variable is 1 for academics who start their first tenured affiliation at a public university in 2005, 2006 or 2007, and 0 for those who start their first tenured affiliation at a public university in 2002, 2003 or 2004 (the control cohort). The variable Post'02 is 1 as of 2002 and 0 beforehand, and the variable Tenure is 1 as of the year in which an academic starts their first tenured affiliation at a public university and 0 beforehand. The ttt_{i,j} variables are time-to-tenure dummies. They control flexibly for the seven years before and after academics start their first tenured position, as well as the tenure year itself.13 I also include individual fixed effects, α_i, and calendar year fixed effects, γ_t. Taken together, these fixed effects control flexibly for calendar year fixed effects, cohort-specific relative time fixed effects, and individual academic fixed effects.14 I estimate the model as a conditional quasi-maximum likelihood fixed-effects Poisson model,15 because the dependent variables are highly skewed with a large mass at zero and a long right tail. The corresponding estimation results are shown in Table 2. Robust standard errors, clustered at the individual level, are reported throughout. This difference-in-differences specification distinguishes two before and after periods. The Post'02 variable is included to pick up the effect of career concerns incentives that take effect as of the announcement of the reform, while the Tenure variable captures the effect of the explicit on-the-job performance bonuses that kick in upon entry into the performance pay scheme. The moment the reform is announced, the lure of future attraction and retention bonuses and, consequently, the career concerns incentives of the performance pay scheme take effect.
Because tenure-track positions generally do not exist in Germany, academics need to move to a new university and negotiate a new contract to obtain tenure. Academics who anticipate starting their first tenured affiliation in the performance pay system therefore face strong career concerns, as their pre-tenure performance can influence their tenure contract negotiations and tenure pay. I can thus identify the effort effect of career concerns from the differential change in productivity of about-to-be-tenured academics here. The incentive effect of the explicit on-the-job performance bonuses takes effect only after academics enter into the performance pay scheme. This entry into the performance pay scheme coincides with the start of the first tenured affiliation for academics in the treated cohort.16 The Post'02 × Treatment_i and Tenure × Treatment_i interaction terms taken together therefore provide a difference-in-differences estimate of the total effort effect of career concerns and explicit performance incentives in knowledge creation. Because of the
13Including 7 year-to-tenure dummies aligns with the institutional setting here, as the median number of years between the end of the PhD and the completion of the habilitation is 7, and academics are, traditionally, required to have completed their habilitation before they become eligible for a tenured affiliation. Results are however robust to including other sets of year-to-tenure dummies (Table A.3, Panels G and H). 14Note that individual fixed effects subsume academic field fixed effects here, because an academic's field is kept constant throughout. 15This is the same model as used in, for instance, Azoulay et al. (2015). Even though the dependent variables here are not all integers, Silva and Tenreyro (2006) show, using a result from Gourieroux, Monfort and Trognon (1984), that the estimator based on the Poisson likelihood function is consistent even for non-integer dependent variables, as long as the conditional mean is correctly specified. 16Universities generally announce either the number of on-the-job bonuses or the total amount of on-the-job bonus pay to be paid out in a given year at the beginning of that year (Lünstroth, 2011). These incentives thus do not just vary across universities, but by university and year. On top of that is the variation, at the individual academic level, in target agreements. Identifying the effort effect of the explicit performance incentives by exploiting cross-university variation is therefore not feasible, even aside from the potential bias due to sorting. I therefore estimate the average effort effect of explicit performance incentives here.
way the Post'02 and the Tenure variable are defined, the interactions of these variables with the Treatment variable capture persistent differential changes in the research productivity of the treated cohort relative to the control cohort. The announcement of the reform occurs at the same calendar time for all tenure cohorts, but at a different time relative to tenure. The start of the first tenured affiliation, on the other hand, occurs at the same relative time, but at a different calendar time for all cohorts. Because the specification includes individual fixed effects (which subsume cohort and group fixed effects), including all calendar time and relative time fixed effects would yield a specification that is underidentified, and result in the underweighting of long-run effects (see e.g. Borusyak and Jaravel (2017); Abraham and Sun (2018); Goodman-Bacon (2018)). Including at most 15 year-to-tenure (relative time) fixed effects for each cohort,17 and thus dropping at least three relative time fixed effects for each cohort and treatment group, prevents the underidentification problem, while estimating treatment effects relative to a control group avoids underweighting long-term effects18 (Borusyak and Jaravel, 2017). The control group pins down calendar time and relative time, so that the (restricted) treatment group-specific calendar time and relative time fixed effects (the Post'02 × Treatment_i and Tenure × Treatment_i interaction terms) in the specification estimate the effort effect of career concerns and explicit performance incentives, respectively, by allowing for differences in behavior of the treatment group around the points in calendar time and relative time when these incentives take effect.
4.2.3 Baseline Results
Table 2 shows that research quantity and quality-adjusted quantity increase in response to performance incentives, but average quality decreases. The positive and significant (at 1%) Post'02 × Treatment interaction implies that there is a persistent increase in the number of publications of the treated cohort by 18.3% after the announcement of the reform, when career concerns come into effect, relative to the control cohort.19 There is no additional increase in the number of publications from tenure onwards, when treated academics enter into the performance pay scheme and explicit performance incentives take effect. To allow for a more easily interpretable result, I also estimate a linear fixed effects version of the baseline regression, the results of which are reported in Table A.2. The results in column 1 suggest that academics in the treated cohort produce almost one extra publication every three years after the announcement of the reform compared to the control cohort.20 Moving from a pure quantity measure of productivity to measures of quality-adjusted quantity, I find comparable results (cf. columns 2
17I drop year-to-tenure fixed effects that are far from the time of treatment (here: tenure), as this normalization allows for a more stable estimation (see e.g. Borusyak and Jaravel (2017)). Furthermore, dropping only fixed effects far from tenure, and not tenure itself, makes for easier interpretation. 18The results are robust to including a non-linear function of relative time (the absolute value of time-to-tenure) instead of the restricted set of time-to-tenure fixed effects to prevent underidentification (Table A.3, Panel C), or including different sets of year-to-tenure dummies (Panels G and H in Table A.3). 19The exponentiated coefficients of the Poisson QML, minus one, can be interpreted as elasticities. 20The estimation results of the linear FE model should be interpreted with caution because the model is likely misspecified given the censored and highly skewed distribution of the dependent variables.
and 3 in Table 2). The number of publications of the treated cohort, weighted by impact-factor rating, increases by 14.2% after the announcement of the reform (significant at 1%), while the sum of citations to publications published in a given year increases by 13.8% relative to the control cohort (at 5% significance), with no further increase upon entry into the performance pay scheme. The average quality of the publications, however, decreases after the announcement of the reform. The average impact factor rating of publications produced by the treated cohort decreases by 9% (significant at 1%), and the average number of citations decreases by a marginally significant 10% relative to the control cohort (columns 4 and 5 in Table 2).21 The coefficient estimates of the equivalent linear fixed effects estimation in columns 4 and 5 of panel A in Table A.2 allow for slightly easier interpretation. The average impact factor rating of publications of treated academics after the announcement of the reform decreases by 0.216. This means, for instance, that a treated academic whose average publication before the announcement of the reform appeared in the Journal of Political Economy, which had a two-year impact factor rating of 5.247 in 2017, publishes, on average, in journals like the American Economic Journal: Applied Economics, which had an impact-factor rating of 5.028 in 2017 (Journal Citation Report 2017), after the announcement of the reform. The 10% decrease in the average number of citations in 2019 to publications by treated academics published after the reform announcement is equivalent to these publications having received on average 3.6 fewer citations than publications of the control cohort published in the same year. Additional analyses show that there is no significant decrease in either the maximum or minimum number of citations to the publications of treated academics (Panel A, Table A.1).
To get a better idea of the quality of the work produced in response to performance pay, I next analyze the distribution of citations and impact factors. I calculate the percentiles of citations and impact factor ratings separately by field and publication year and use the percentile cut-offs to generate quantile frequency variables for each author and publication year. To illustrate, if an author has three publications in a given year, one of which garners a number of citations that puts it in the top quartile of citations of publications in the same field and publication year, while the other two papers fall in the bottom quartile of citations, then the top quartile frequency variable equals 1, the bottom quartile frequency variable equals 2, and all other quantile frequency variables equal 0. I estimate the baseline model separately for all quantile frequency variables. The histograms in Figure 1 depict the resulting Post'02 × Treatment_i (grey bars) and Tenure × Treatment_i (white bars) coefficient estimates and 95% confidence intervals. These figures clearly show that treated academics produce more low to medium-high quality work, but not more of the highest quality research. In short, I find evidence of a positive and significant average effect of performance incentives
21The difference in the number of observations across columns in this table occurs for two reasons. While observations are missing for the average quality measures in years in which an academic does not have any publications, the quantity and quality-adjusted quantity variables have zero entries for those years. Further differences between columns arise because I have to drop authors for whom all observations are 0, or for whom I have only one observation in order to estimate the Poisson model.
on the total raw and quality-adjusted quantity of knowledge output. This response arises from the moment high-powered career concerns incentives take effect, with no additional increase in response to explicit performance bonuses. The effect size ranges from 14 to 18%, which is of the same order of magnitude as previous estimates of the effort response to performance incentives, albeit mostly explicit performance incentives for routine tasks, in the literature (e.g. Lazear (2000); Shearer (2004)). The extra output produced, however, is not of the highest quality, as only the number of publications produced in low to medium-high citation and impact-factor quartiles increases. Indeed, there is a significant decrease in the average quality of knowledge output of around 9 to 10% in response to the introduction of performance pay.
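The construction of the quantile frequency variables underlying Figure 1 can be sketched with pandas; quartile cut-offs are computed within field and publication year, and each author-year gets a count of papers per quartile. The data and column names are hypothetical, chosen to reproduce the worked example from the text (three papers: one top quartile, two bottom quartile):

```python
import pandas as pd

pubs = pd.DataFrame({
    "author": [1, 1, 1, 2, 2, 3],
    "field": ["Econ"] * 6,
    "pub_year": [2005] * 6,
    "citations": [40, 2, 1, 15, 8, 25],
})

# Citation quartile of each paper within its field x publication-year cell.
pubs["cite_q"] = (pubs.groupby(["field", "pub_year"])["citations"]
                  .transform(lambda s: pd.qcut(s, 4, labels=[1, 2, 3, 4])))

# Quantile frequency variables: papers per author-year in each quartile.
freq = (pubs.groupby(["author", "pub_year", "cite_q"], observed=False)
        .size().unstack("cite_q", fill_value=0)
        .add_prefix("n_q"))
print(freq)
```

Author 1 ends up with `n_q1 = 2` and `n_q4 = 1`, matching the illustration in the text; the baseline model is then estimated separately with each `n_q*` column as the dependent variable.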
4.2.4 Robustness
There is no evidence that the changes in productivity metrics are the result of strategic co-authorship behavior. The average number of co-authors on papers does not increase (Table A.1, Panel B) and the results of the baseline regression with dependent variables weighted by number of authors (so a paper with three authors counts for one-third) are very similar to the baseline results (Table A.2, Panel B). Papers also do not become significantly shorter (Table A.1, Panel B). Academics who are paid according to the age-related pay scheme can switch to the performance pay scheme after its implementation by changing affiliation or position, or by opting into the performance pay scheme while retaining the same position. Academics in the control cohort may therefore end up being treated as well. Any effort response of the control cohort would lead me to underestimate the effort effect of the treated cohort, so, if anything, the baseline results provide a conservative estimate of the effort effect. To test this, I re-estimate the baseline specification with a control group that excludes any switchers, where I label as a switcher any academic who changes affiliation, position or contract22 after implementation of the pay reform (as of 2005). The effort effect estimates for output quantity and quality-adjusted quantity are indeed larger, ranging from a 17% increase in citations to a 23% increase in the number of publications, while the estimates of the effects on average quality are qualitatively the same (Table A.2 Panel C).
The effort effect results are also robust to restricting attention to articles and proceedings papers only23 (Table A.3, Panel A); widening or narrowing the treatment and control cohort windows (Panels E and F); including a Post05 ∗ Treatmenti interaction instead of the Tenure ∗ Treatmenti interaction to control for implementation instead of entry into the performance pay scheme (Panel D); or transforming the dependent variables using the inverse hyperbolic sine transform and estimating as a panel fixed effects model (Panel B).
22I assume that, whenever academics receive an outside offer, they either accept and change affiliation, or reject and renegotiate their current contract, and consequently switch to performance pay. If there are academics who do not at least renegotiate their contract when they receive an outside offer, this overestimates switching and leads to a conservative estimate here. 23Specifically, I restrict attention to publications in the following ISI web of science categories only: “Article”, “Article: Book”, “Article: Book Chapter”, “Article: Proceedings Paper”, “Proceedings Paper”.
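The inverse hyperbolic sine used in the last robustness check, asinh(y) = log(y + sqrt(y^2 + 1)), behaves like log(2y) for large y but, unlike the log, is defined at zero. This matters here because many academic-year observations have zero publications or citations. A minimal numpy illustration (a sketch, not the paper's estimation code):

```python
import numpy as np

def asinh_transform(y):
    """Inverse hyperbolic sine: log(y + sqrt(y^2 + 1)).
    Defined at y = 0 (returns 0), unlike log(y)."""
    return np.log(y + np.sqrt(y ** 2 + 1))

pubs = np.array([0.0, 1.0, 5.0, 50.0])   # e.g. yearly publication counts
transformed = asinh_transform(pubs)       # matches numpy's built-in arcsinh
# For large y, asinh(y) is approximately log(2 * y)
```

Coefficients on a dummy regressor after this transform can therefore be read, approximately, as semi-elasticities, just as with a log-transformed outcome, while zero-output years are retained in the sample.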
Finally, in supplementary materials (Appendix section A4.2) I show the results of an alternative identification strategy, using an instrumental variables approach to estimate the effort effect of the performance pay reform for academics who switch to the performance pay scheme. I instrument for endogenous switches into performance pay (by academics who start out in the age-related pay scheme) with age and age cut-offs that align with the single crossing points of the basic wage schedules of the performance pay and age-related pay schemes (Cf. Figure A.1). The results are qualitatively similar, though concerns about the validity of the instruments suggest these results are indicative at best.
4.2.5 Validity of Identifying Assumption
The 14% to 23% increase in quantity and quality-adjusted quantity, and the 9% to 10% decrease in average quality can be interpreted as the causal effort effect of performance pay on knowledge creation if, absent the reform, the productivity of the treatment and control cohort would have evolved along parallel paths. There are a number of potential threats to identification; it could be that other events that occurred around the same time are driving the result or that the timing of tenure is endogenous. I discuss these concerns and how I address them below. Any events that occur around the time of the pay reform but that do not affect the pre- and post-reform first-time tenured cohorts differentially are not a threat to identification. For this reason, the start of the “Exzellenzinitiative”, a large funding initiative for universities and research centers as of late 2006/early 2007 (DFG, 2016), or the abolition of the professor’s privilege in 200224 (Von Proff, Buenstorf and Hummel, 2012) do not invalidate the identifying assumption. The introduction of the “Junior Professorship” in 2002, as an alternative to the habilitation path to professorships, cannot be driving the earlier results either, because the first Junior Professors become eligible for a tenured position by 2008/9 at the earliest (Lutter and Schröder, 2016). The three-year window of the treatment and control cohorts therefore does not include first-time tenured professors who completed a Junior Professorship. Nonetheless, I test for pre-existing trends to provide further assurance that other events are not driving the baseline results. To do so, I estimate the following model as a conditional QML Poisson fixed effects model:
E[Y_{i,f,t−x_f} | X_{i,f,t}] = exp[α_i + ∑_{k=1}^{15} β_k ttt_{i,k−8} ∗ Treatment_i + ∑_{j=−7}^{7} ttt_{i,j} + γ_t]   (5)
Here, ttt_{i,k−8} ∗ Treatment_i denote interactions of 15 time-to-tenure dummies (from 7 years before to 7 years after the start of the first tenured affiliation) with the treatment variable. All other variables are as before. This specification effectively aligns the relative time (time-to-tenure) for different tenure cohorts, and allows me to estimate the differences in outputs of the treated and control cohorts as they move towards and beyond the start of their first tenured affiliation. The
24Furthermore, under the professor’s privilege regime, professors own the IPR of their inventions (Hvide and Jones, 2018). The abolition of this privilege should reduce incentives to produce commercializable (patentable) knowledge.
coefficient estimates and 95% confidence intervals of the interaction terms are depicted in Figure 2 for the baseline dependent variables. The green vertical dashed line at t − 5 indicates the point in the tenure trajectory at which the announcement of the pay reform occurs for the youngest academics of the treated cohort, who start their first tenured affiliation in 2007. The orange vertical dashed line at tenure marks the time at which treated academics enter into the performance pay scheme. All five figures display a similar pattern: the interaction terms, which capture the difference in the respective output measures between the treated and control cohort, show that these cohorts start to diverge after the announcement of the reform, when the treated cohort faces higher-powered incentives. Recall further that the publications have been backdated by the average publication lags in the respective academic fields (rounded up to the nearest year), so the differential increase in the number of publications directly after the announcement of the reform is consistent with an immediate effort response to the higher-powered incentives it heralds. The response in quality-adjusted output and average quality occurs a bit later and more gradually, as expected if producing high quality research takes more time and is riskier. The absence of pre-existing trends and the clear alignment of the productivity response with the timing of the announcement of the pay reform, and hence with the moment the treated cohort starts to face higher-powered incentives, lends support to the interpretation of the differential productivity response as the causal effort effect of the performance pay reform and not the effect of another event. The analysis also underlines that the effort response to the higher-powered incentives is not a temporary response, but one that persists.
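The event-study design behind equation (5) can be illustrated with a minimal sketch: simulate a tenure-aligned panel, build time-to-tenure dummies interacted with a treatment indicator, and fit a Poisson regression by iteratively reweighted least squares. All data here are fabricated, and this is a simplified unconditional Poisson fit, not the paper's conditional QML fixed-effects estimator (which also absorbs individual effects and year effects):

```python
import numpy as np

rng = np.random.default_rng(0)

# Fabricated tenure-aligned panel: event time runs from -7 to +7 around
# tenure; treated academics get a level shift in output once the
# (simulated) reform is announced at event time -5.
n, times = 200, np.arange(-7, 8)
treated = rng.integers(0, 2, n)
rows = []
for i in range(n):
    for t in times:
        mu = np.exp(0.5 + 0.3 * treated[i] * (t >= -5))
        rows.append((treated[i], t, rng.poisson(mu)))
data = np.array(rows, dtype=float)

# Design matrix: intercept, event-time dummies (base period t = -7),
# and event-time x treatment interactions, mirroring equation (5).
cols = [np.ones(len(data))]
for t in times[1:]:
    cols.append((data[:, 1] == t).astype(float))
for t in times[1:]:
    cols.append((data[:, 1] == t).astype(float) * data[:, 0])
X, y = np.column_stack(cols), data[:, 2]

# Poisson regression via iteratively reweighted least squares (IRLS).
beta = np.zeros(X.shape[1])
for _ in range(100):
    mu = np.exp(X @ beta)
    z = X @ beta + (y - mu) / mu          # working response
    XtW = X.T * mu                        # IRLS weights W = mu
    beta_new = np.linalg.solve(XtW @ X, XtW @ z)
    if np.max(np.abs(beta_new - beta)) < 1e-10:
        beta = beta_new
        break
    beta = beta_new

# The interaction coefficients trace out the treated-control divergence:
# they should be close to zero before the simulated announcement and
# positive afterwards, as in Figure 2.
interactions = beta[len(times):]
```

Plotting these interaction coefficients against event time, with confidence bands, reproduces the logic of the pre-trends figure: divergence should appear only after the announcement.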
The absence of pre-existing trends and the persistence of the effort effect also rule out that the effect of tenure itself is driving the results. Recall also that the time-to-tenure dummies in the baseline specification control for any common productivity changes in the run-up to and following the start of first-time tenured positions. The causal interpretation of the baseline results also requires that the timing of tenure is exogenous. In particular, academics could try to get a tenured affiliation sooner after they learn of the impending reform in order to avoid the performance pay system25. The absence of pre-existing trends for the treated and control cohorts lends support to the exogeneity of tenure timing, as does a placebo pre-trends regression as in equation 5. For this placebo test, I restrict the sample to academics who start their first tenured position in 2001 to 2004 and use the cohort that starts their first tenured position at a public university in 2001 or 2002 as placebo control group and those who start their first tenured position in 2003 or 2004 as placebo treatment group. Figure 3 shows that the interaction terms are generally close to 0 and not significant, and there is no evidence of any consistent differential trends. If academics in the placebo treatment group had been able to avoid entry into the performance pay scheme by temporarily stepping up their efforts to obtain a tenured affiliation sooner, the productivity differential between the placebo treatment and control cohorts would have been positive between
25Note that attempts to delay the start of a tenured position would not be rational and are therefore not much of a concern, since an academic would delay earning the higher pay associated with a tenured position, while they can always opt into the performance pay scheme after 2005 if so preferred.
the announcement of the reform and the start of the first tenured affiliation, and possibly negative thereafter. The absence of such a pattern thus lends support to the identifying assumption of the exogeneity of the timing of the start of the first tenured affiliation. As a further test, I show that the main results go through with synthetic treatment and control cohorts, which are defined by the average age at which academics start their first tenured affiliation rather than the actual timing of the first tenured affiliation (Table A.2 Panel D and Appendix Section A4.1). As a final validity test, I estimate the tenure probability using hazard rate analysis in Appendix Section A4.3. I find no evidence that the requirements for obtaining a tenured position increase with the reform. There is thus no evidence that the cohort of academics who start their first tenured position after the reform are more productive than academics in the control cohort, so this is not driving the results either.
4.2.6 Heterogeneous responses by academic field
The baseline results show how performance pay affects research productivity on average. This section delves into the anatomy of the effort response and tests if and how the effort response differs by academic field, while the next section estimates responses by productivity quantile. Effort responses may differ by field for a number of reasons. For one, research teams in the natural and applied sciences tend to be much larger than those in the social sciences and humanities26. The response to higher-powered incentives may be larger in smaller teams, if the likelihood that all or most team members are highly incentivized is larger in smaller teams. On the other hand, in large teams the benefit of good management may have stronger effects, so that highly incentivized team leaders may be able to bring about larger changes in output. Furthermore, fields may differ in the noise in output quality measures. To the extent that there are more objective quality measures in the natural and applied sciences than in the social sciences and humanities, quality measures may be more noisy in the latter fields. The theoretical model presented above shows that greater quality measure noise is associated with a quality effort response that is less positive or more negative. To study heterogeneous effort responses by academic field, I estimate the baseline regression separately by broad academic field: natural and applied sciences, and social sciences and humanities. I classify mathematics, physics and informatics, biology, chemistry, earth sciences, pharmacology, engineering, medicine, dentistry, veterinary medicine, agricultural science and nutrition science as natural and applied sciences; and theology, philosophy and history, philology and anthropology, law, economics and other social sciences as social sciences and humanities. Figure 4 depicts the estimation results of the baseline regression for these broad fields separately.
Quantity effort increases in both broad fields in response to performance pay, by 17% in the natural and applied sciences and 31% in the social sciences and humanities. The difference between these effort responses is not statistically significant in a pooled regression with interaction
26As gauged by the average number of authors on a paper in a field (calculated over the pre-reform years 1996-2000).
terms with broad field indicators (Table A.4). Yet, while quality-adjusted measures of output increase significantly in the natural and applied sciences, there is no significant increase in these same measures in the social sciences and humanities, and the average impact factor-rating of publications decreases less in the former, though this difference is only marginally significant. That is, while I find no evidence of a significant difference in the level of the quantity effort response, there is some evidence suggesting the quality effort response may be more negative in the social sciences and humanities, which would align with quality measures being more noisy in this field.
4.2.7 Heterogeneous responses by productivity quantile
The theoretical model predicts that the quantity effort response decreases with ability type, with a potential uptick for the highest ability levels if the quality signal is not too noisy. Quality effort on the other hand is expected to increase the most in high ability agents or, if the quality signal is very noisy, decrease the least. Low ability agents are expected to decrease quality effort the most or, if they already exert the minimum possible level of quality effort, not change quality. I test these hypotheses in turn below. I determine productivity quantiles separately by academic field and treatment group on the basis of the averages of the impact factor-rated number of publications published in 1999, 2000 and 2001, using pre-announcement averages to avoid simultaneity bias. Because the productivity distributions are highly right-skewed, with the median academic not publishing a paper or receiving any citations in an average year (cf. Table 1), I look at above-median academics separately by decile and below-median academics as one group. The histograms in Panels A through F of Figure 5 depict the Post02 ∗ Treatmenti (grey bars) and Tenure ∗ Treatmenti (white bars) coefficient estimates and 95% confidence intervals of baseline regressions run separately for academics in the top five deciles and those below the median. Both low and high productivity academics increase pure quantity and quality-adjusted quantity in response to performance pay. There is a 24% increase in the number of publications and a 31% increase in impact factor-rated number of publications in response to career concerns incentives in the below-median treatment group relative to the same quantile in the control group. The top decile and 7th decile also increase the number of publications, by 22% (Top 10%) and 24% respectively, as well as the sum of citations to publications, by 32% and 39%.
There is no significant response in the 9th, 8th and 6th decile, though the lack of statistical significance in the 6th decile is likely due to the small decile size because of the aforementioned skewness in the distribution of the productivity variables. This quantity effort response constitutes a positive and significant intensive margin response for top decile academics only (Fig A.2). Conditional on having at least one publication in a given year, their number of publications increases by 20% and citations to publications by 30%. In contrast, the effort response of lower deciles is solely an extensive margin response. The probability that an academic has at least one publication in a given year increases for
below-median academics as well as in all but the highest two above-median deciles (Fig A.3). The separate quantile regressions compare responses in the same quantile of the treatment and control group but cannot test whether the effort response differs across quantiles. To test the latter, I estimate the baseline regression model augmented with interaction terms with indicator variables for the top five deciles. Table 3 shows that the differences in effort responses are, in fact, significant. The coefficient of the Post02 interaction with the Treatment variable in column 1 implies that the below-median productivity academics of the treatment group produce 32% more publications in response to career concern incentives, relative to the same quantile in the control group. The triple interactions of Post02, Treatment and indicator variables for the 10th and 9th decile imply that the effort responses of these deciles of the treated group are, respectively, 14% and 28% less than the effort response of below-median academics (significant at the 10% and 1% level, respectively). Moreover, simple Wald tests for equality of the Post02 ∗ Treatment ∗ decile interactions show that the 9th decile interaction is significantly different from all other interactions, so the quantity effort response of the sub-top productivity decile is less positive than that of both higher and lower productivity quantiles. A similar pattern emerges for the impact factor-rated number of publications, and for the sum of citations the effort response in both the 8th and 9th decile is significantly less positive than in all other quantiles. Results are robust to using deciles based on pre-announcement averages of the sum of citations, as well as to excluding academics who switch to performance pay after 2005 (Table A.5), so there is no evidence that differential selection into performance pay is driving these heterogeneous treatment effects.
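The grouping used in these quantile regressions, pooling below-median academics into one bin and splitting the above-median half into deciles 6 through 10, can be sketched as follows. The data are fabricated; in the paper, bins are computed separately by field and treatment group from pre-announcement averages, and ties in the highly skewed distribution are handled here by arbitrary rank-breaking:

```python
import numpy as np

def productivity_bins(avg_prod):
    """Bin 0 pools all below-median academics; deciles 6-10 split the
    above-median half. Ties (e.g. the many zeros in a right-skewed
    productivity distribution) are broken arbitrarily by rank."""
    ranks = np.argsort(np.argsort(avg_prod))      # 0 .. n-1
    pct = (ranks + 0.5) / len(avg_prod)           # mid-rank percentile
    bins = np.zeros(len(avg_prod), dtype=int)
    above = pct >= 0.5
    bins[above] = np.floor(pct[above] * 10).astype(int) + 1
    return bins

# Fabricated, right-skewed pre-period productivity averages (many zeros):
prod = np.array([0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 5.0, 8.0, 20.0, 120.0])
bins = productivity_bins(prod)   # [0 0 0 0 0 6 7 8 9 10]
```

In practice one would loop this over field-by-treatment-group cells, so that an academic's bin reflects their standing relative to peers in the same field and cohort.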
Taken together, these results show that the pure quantity and quality-adjusted quantity effort response is largest for relatively low productivity (below-median) academics and smallest for academics just below the top of the productivity distribution. Turning to the quality effort response, I find that academics in the sub-top productivity deciles decrease quality effort, while quality remains unchanged in lower and higher productivity quantiles. Figure 5 shows that the average number of citations decreases significantly in the 8th decile, and the intensive margin response in the sum of citations is significantly negative for the 9th decile (Fig A.3), but there is no significant change in either metric in other deciles. The response histograms in Figure Panel B show that both these sub-top productivity deciles produce fewer papers in the top citation decile, a reduction of 25% and 32% respectively, thus the quality of their work decreases. There is no evidence that other productivity deciles change quality effort. There are increases in the lowest citation quartile bin for below-median productive academics and in higher citation bins for the most productive academics, in line with their quantity effort response and commensurate with their ability class. But since there is no sign of substitution of higher citation decile bin papers for lower decile bins or vice versa, there is no evidence of changes in output quality for these productivity classes. These findings align with the theoretical predictions for a setting in which the quality signal is sufficiently noisy so that even the highest productivity agents do not increase quality effort, and the lowest productivity classes cannot reduce quality as they
already exert the minimum required level, leaving a decrease in quality effort for intermediate productivity levels only.
4.2.8 Novelty and Impact
To provide further evidence on the quality and impact of the work produced in response to performance incentives, I perform textual analysis of paper abstracts. Specifically, I calculate cosine similarity measures of the Term Frequency Inverse Document Frequency vectors of publications and comparison papers, as in Kelly et al. (forthcoming). The Term Frequency Inverse Document Frequency (TFIDF) index weights the frequency of each bigram27 w in document d, c_{w,d}/c_d, by the inverse of the frequency with which w appears in the documents of a comparison set s. The cosine similarity between the TFIDF vectors of two publications d and d̃, published at t and t̃, is

ρ_{d,t;d̃,t̃} = (TFIDF_{d,t} / |TFIDF_{d,t}|) · (TFIDF_{d̃,t̃} / |TFIDF_{d̃,t̃}|).

For each paper pair, I calculate the inverse document frequency based on the set of all papers in the same field published in all years prior to the earlier of the two publication dates (t, t̃)28. If the focal publication and comparison publication have abstracts that have no bigrams in common, the cosine similarity is 0. The more common bigrams in the focal publication and comparison
27As an illustration, the following sentence, ’Paul walks home’, has two bigrams: ’Paul walks’ and ’walks home’. 28Note that, due to the way in which publication records were extracted from ISI Web of Science, namely, filtering by publications that have at least one author with a German (work) address, the set of comparison papers are not just from the same field, but also from authors in the same country.
publication, the closer the cosine similarity is to 1. As in Kelly et al. (forthcoming), I calculate two different metrics derived from these cosine similarities: backward similarity and forward similarity. The backward similarity of focal publication d published in t is calculated as the sum of the cosine similarities between the focal document and comparison publications published in a three year window before the focal document was published, while the forward similarity is calculated as the sum of the cosine similarities between the focal publication and comparison publications published in a three year window after the focal document was published. The more bigrams in the abstract of the focal publication that have not been used in abstracts of publications published earlier, the smaller the backward similarity metric. This might point to such publications being more novel. The forward similarity metric on the other hand captures the relatedness of follow-on research and, as such, constitutes an alternative measure of impact to citations. Kelly et al. (forthcoming) find that patents with a higher ratio of forward to backward similarity are more likely highly cited and have higher market value. This lends support to the notion that the TFIDF cosine similarity measures capture novelty and impact. Iaria, Schwarz and Waldinger (2018) employ Latent Semantic Analysis (LSA) of abstracts to generate measures of similarity of papers. LSA is a machine learning technique that takes into account whether words are commonly used in similar contexts. As such, it classifies titles with words that often occur in the same context as similar, even if the words are different. Because the TFIDF metric does not ‘learn’ about similar topics, it is more likely to understate similarity, and the cosine similarity measures based on it are more likely to overstate novelty and understate the relatedness of follow-on research.
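The construction of these similarity metrics can be sketched in a few lines of pure Python. This is an illustrative implementation of bigram TFIDF vectors and their cosine similarity on fabricated abstracts; the exact term weighting and the field- and year-specific comparison sets in Kelly et al. (forthcoming) differ in detail, and a smoothed IDF is used here to avoid zero weights:

```python
import math
from collections import Counter

def bigrams(text):
    """Bigrams of a whitespace-tokenized abstract, e.g.
    'Paul walks home' -> [('paul','walks'), ('walks','home')]."""
    toks = text.lower().split()
    return list(zip(toks, toks[1:]))

def tfidf_vector(doc, prior_corpus):
    """TFIDF weights for a document's bigrams, with the inverse document
    frequency computed over a set of previously published abstracts.
    A sketch of the metric described in the text (smoothed IDF)."""
    counts = Counter(bigrams(doc))
    total = sum(counts.values())
    n = len(prior_corpus)
    doc_freq = Counter()
    for prior in prior_corpus:
        doc_freq.update(set(bigrams(prior)))
    vec = {}
    for w, c in counts.items():
        idf = math.log((1 + n) / (1 + doc_freq[w]))  # smoothed IDF
        vec[w] = (c / total) * idf
    return vec

def cosine_similarity(u, v):
    """Cosine similarity of two sparse TFIDF vectors: 0 when the
    abstracts share no bigrams, closer to 1 the more they share."""
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

# Fabricated abstracts: a prior corpus, a focal paper, a related
# comparison paper and an unrelated one.
prior = ["quantum dots emit light", "topological phases of matter"]
focal = tfidf_vector("quantum dots emit coherent light", prior)
comp = tfidf_vector("novel quantum dots emit light", prior)
unrelated = tfidf_vector("deep learning for computer vision", prior)
sim = cosine_similarity(focal, comp)
```

Backward (forward) similarity is then the sum of such pairwise similarities over comparison papers published in the three years before (after) the focal paper.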
I use metrics based on the TFIDF here because we may be particularly concerned about potential decreases in the novelty of knowledge work in response to performance pay, and the TFIDF metrics provide conservative estimates of any decreases in novelty. Because the analysis in this paper is at the individual academic level rather than at the publication or patent level, as in Kelly et al. (forthcoming), and because publications are matched to academics on the basis of i.a. name and field, I deviate from the metrics in the latter paper in two ways. First, I restrict the set of comparison publications to the same field as the focal publication. Second, for each similarity measure, I calculate quantile frequency bins in the same way as for citations above, so as to analyze the effect of performance pay on the distribution of the similarity of papers to past and future work. The novelty metric analyses align with the earlier quality response results; the additional work produced in response to performance pay is, on average, not the most novel or impactful. Panel A in Figure 7 shows that there is a significant increase in the frequency of top quartile backward cosine similarity papers as well as a marginally significant increase in the bottom decile and quartile bins of the same variable. That is, in response to performance pay, more papers that are very similar to previously published papers are produced, though there is also a slight increase in papers that are very dissimilar to prior work. At the same time, there is an increase in papers in the third quartile bin of forward similarity metrics, and hence in papers that give rise to
considerable, but not the most, follow-on research or - which would be indistinguishable - are part of a burgeoning literature (Figure 7 Panel B). Breaking down the novelty metric analysis by productivity quantiles provides further insight into the earlier heterogeneous quality response results. Top productivity academics produce more medium to highly novel work that garners mid- to high-level follow-on; low productivity academics produce additional papers that are just above the median in terms of both similarity to past and future work; and sub-top productivity academics produce more papers that are very novel, but garner only very little follow-on work (Figures A.4 and A.5).
4.3 Selection Effect
The empirical analysis has so far focused on the effort effect of performance pay. This is only one channel through which performance pay can increase output. I now turn attention to another important channel: selection. To this end, I study which academics sort into performance pay by switching from the old age-related pay scheme to the new performance pay scheme. As shown in the theoretical section, selection into academia should follow the same pattern as selection into performance pay, since the latter drives the former.
4.3.1 Non-Parametric Analysis
As a first test, I analyze the hazard and survival rates of switches to performance pay of academics with a tenured affiliation at a public university before the reform. These academics have the choice (i.e. are “at risk”) of switching pay scheme because they are initially paid according to the age-related pay scheme. They can select into the performance pay scheme by changing affiliation or position, or renegotiating their contract. Accordingly, I label any first affiliation, position or contract change29 after implementation of the pay reform (as of 2005) as a switch to performance pay (the “failure” event) and use the time until such a switch as duration variable. I count the time it takes for academics to switch from their most recent affiliation, position or contract change before the reform implementation (the at risk duration) if they are tenured, at a public university and not retired. I restrict attention to academics who start their first tenured affiliation after 1998, so that I observe their full tenured affiliation history and hence the moment they become “at risk” of switching30,31. There are 11237 such academics and I observe 1231 switches in a total of 85716 periods (years) during which these academics can switch from the
29I assume that, whenever an academic receives an offer, they either accept and change position, or reject and renegotiate their current contract. In either case, the academic switches to the new performance pay scheme if the change or renegotiation happens after the reform. If there are academics who do not at least renegotiate their contract when they receive an offer, these academics are more likely to be of a lower productivity type, and including them in the pool of switchers would reduce the estimate of the selection effect I find. Preissler (2006) reports that only a small number of professors chose to opt into the W-pay scheme without another/outside option. 30As in the effort effect analysis, I also restrict the sample to academics that do not have a foreign affiliation before their first tenured affiliation, since I do not have full affiliation and publication records for academics from abroad. 31Including academics who enter into the data set in a tenured position would introduce left truncation into the survival analysis sample. Some academics who enter the dataset in a tenured position change contracts before the reform is implemented. For those academics I know when they become at risk of switching to the performance pay. For others, who do not change contract before the reform implementation, I would not know their risk origin date and hence their risk duration. The former academics may well be systematically different from the latter, which would bias the analysis.
age-related to the performance pay system. The average incidence rate of switches is 0.014. Hence, on average, academics in this group have a 1.4% chance of switching to performance pay in any year after 2004. Figure 8 shows the Epanechnikov kernel density estimates of the hazard function for switches from age-related to performance pay for academics whose average productivity falls in the top quartile or bottom three quartiles of the average productivity distribution. I base productivity quartiles on the average of the impact factor-rated number of publications in the three pre-implementation years (2002, 2003 and 2004), calculating quartiles separately by academic field and broad tenure group (tenured before or after the reform). The hazard rate for switching to the performance pay scheme is clearly greater for top quartile academics throughout, so higher productivity academics are more likely to sort into the performance pay scheme. A log-rank test of the equality of the survival functions of top quartile academics and bottom three quartile academics rejects the equality of the survival functions at the 1% significance level32. Results are robust to using quartiles based on the sum of citations or impact factor-rated number of publications weighted by number of authors (Figure A.6).
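The log-rank comparison of the two groups' survival functions can be sketched with a small pure-numpy implementation on simulated durations: exponential switching times with different hazards, censored at a fixed horizon. The data are fabricated, not the paper's sample:

```python
import numpy as np

def logrank_test(time_a, event_a, time_b, event_b):
    """Two-sample log-rank test for equality of survival functions.
    Returns the chi-squared statistic (1 degree of freedom)."""
    times = np.unique(np.concatenate([time_a[event_a == 1],
                                      time_b[event_b == 1]]))
    obs_a, exp_a, var = 0.0, 0.0, 0.0
    for t in times:
        n_a = np.sum(time_a >= t)                      # at risk, group a
        n_b = np.sum(time_b >= t)
        d_a = np.sum((time_a == t) & (event_a == 1))   # events at t
        d_b = np.sum((time_b == t) & (event_b == 1))
        n, d = n_a + n_b, d_a + d_b
        if n < 2 or d == 0:
            continue
        obs_a += d_a
        exp_a += d * n_a / n                           # expected under H0
        var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return (obs_a - exp_a) ** 2 / var

# Simulated switching durations: group a switches twice as fast as
# group b; both censored at an 8-year observation horizon.
rng = np.random.default_rng(1)
raw_a = rng.exponential(5.0, 300)
raw_b = rng.exponential(10.0, 300)
cens = 8.0
time_a, event_a = np.minimum(raw_a, cens), (raw_a <= cens).astype(int)
time_b, event_b = np.minimum(raw_b, cens), (raw_b <= cens).astype(int)
chi2 = logrank_test(time_a, event_a, time_b, event_b)
```

With a genuine difference in hazards, the statistic comfortably exceeds the 1-degree-of-freedom critical value of 3.84 at the 5% level, mirroring the rejection reported in footnote 32.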
4.3.2 Difference-in-Differences Estimation
In order to distinguish any general switching patterns from the selection effect of performance pay, I estimate the selection effect of the introduction of performance pay parametrically in a difference-in-differences framework. Here I exploit variation along two dimensions: an academic’s tenure cohort, and age. Academics who start their first tenured position before 2005 fall under the age-related pay scheme when the reform is implemented. They switch to performance pay when they change their affiliation, position or contract after the reform takes effect in 2005. In contrast, academics who start their first tenured position after the reform implementation automatically fall under the performance pay scheme from the moment they make tenure and thus cannot switch to performance pay. Hence the cohort of academics who start their first tenured position before 2005 is the treated cohort here, while those that make tenure after 2004 make up the control cohort. Due to the single-crossing property of the basic wage schemes for age-related and performance pay (Cf. Figure A.1), selection incentives are weaker for older academics. In order to make switching worth their while, academics need to be able to more than make up for the difference in basic wage. Because this difference is larger for older academics, they will need to be of higher ability in order to want to switch. That is, compared to academics in the control group, the risk of switching increases relatively less with productivity for older academics in the treated cohort.
32The log-rank test returns a Chi-squared statistic of 8.14 (p-value 0.0043)
I estimate the following Weibull proportional hazard model to test this:
λ_{i,t} = ρ t^{ρ−1} ∗ exp[β_0 + β_1 Treat_i + β_2 Age_{i,t} + β_3 AvgProd_i + β_4 Age_{i,t} ∗ Treat_i + β_5 AvgProd_i ∗ Treat_i + β_6 AvgProd_i ∗ Age_{i,t} + β_7 AvgProd_i ∗ Age_{i,t} ∗ Treat_i + X_i + u_{i,t}]   (6)
Treat is a dummy variable that is 1 for academics who started a first tenured affiliation before 2005, and 0 for those whose first tenured affiliation started as of 2005. AvgProd is an academic’s average productivity calculated as three-year pre-implementation averages (2002-2004) of the impact factor-rated number of publications. The age variable is equal to an author’s self-reported age if known, and equal to a synthetic age otherwise. I calculate synthetic ages using the average age at habilitation, promotion or tenure. All models control for academic field fixed effects33 and are estimated for years t > 2004 and for academics who start their first tenured position after 1998 and do not hold a foreign position immediately prior. My preferred specification also controls for synthetic age at the start of the first tenured affiliation. Standard errors are robust and clustered by individual academic. The a and b columns in Panel A of Table 4 report estimation results of specifications without and with age-at-tenure as additional control, respectively. The positive and significant coefficient estimates of the interaction AvgProd ∗ Treat in column 1 imply that a one standard deviation increase in the average productivity of treated academics increases the rate at which they switch to performance pay by 10.8% to 20%34 more than for academics in the control group. This holds while controlling for the interaction term Age ∗ Treat, which, as expected, is negative. Column 2 shows that, while a one standard deviation increase in average productivity reduces the negative effect of age on switching rates by 2.6% on average, this moderating effect of productivity is 2.6% less for treated academics (compare the coefficients of AvgProd ∗ Age and AvgProd ∗ Age ∗ Treat). That is to say, treated academics need to be of even higher productivity in order to switch.
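In a Weibull proportional hazard model, the covariates scale the baseline hazard ρt^(ρ−1) multiplicatively, so a coefficient β on a covariate translates into a proportional change in the switching rate of exp(β · Δx) − 1 for a change Δx. The calculation behind footnote 34 can be reproduced directly, with the coefficient and standard deviation taken from the text:

```python
import math

def weibull_hazard(t, rho, xb):
    """Weibull proportional hazard: lambda(t) = rho * t^(rho-1) * exp(x'b)."""
    return rho * t ** (rho - 1) * math.exp(xb)

# Effect of a one-standard-deviation increase in average productivity on
# the switching rate of treated academics (footnote 34's calculation):
beta_interaction = 0.004          # coefficient of AvgProd * Treat
sd_avgprod = 25.53797             # standard deviation of AvgProd
effect = math.exp(beta_interaction * sd_avgprod) - 1
# effect is about 0.108, i.e. a roughly 10.8% higher switching rate
```

Note that the ratio of hazards for two covariate values does not depend on t; this proportionality is what makes coefficients interpretable as (log) hazard ratios regardless of duration.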
The selection effect of performance pay, net of general differential sorting patterns by age and productivity type, is thus positive and significant. The finding that performance pay attracts more productive academics is robust to estimating the model as a Cox proportional hazard model (Table A.6 Panel B) or to estimating the Weibull model with academic field strata (Table A.6 Panel C)35. Replacing the AvgProd variable in the baseline Weibull model (6) with a dummy indicating whether an academic’s pre-reform average productivity is above the median shows that having above-median productivity reduces the negative effect of an extra year of age on switching rates by 5.8% on average (Column 1c, Table A.6 Panel A). However, this moderating effect of above-median productivity is 3.8% smaller for treated academics, so that treated academics need to be even more productive to switch.
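The conversion in footnote 34 from a hazard coefficient to a percentage change in the switching rate can be reproduced directly: in a proportional hazard model, a one standard deviation increase in a covariate scales the hazard by exp(β · σ), so the percentage effect is exp(β · σ) − 1. The sketch below uses the coefficient 0.004 and standard deviation 25.53797 reported in that footnote:

```python
import math

def pct_effect(coef, sd):
    """Percentage change in a proportional hazard from a one-SD
    covariate increase: exp(coef * sd) - 1."""
    return math.exp(coef * sd) - 1.0

# Footnote 34's calculation: exp(0.004 * 25.53797) - 1
effect = pct_effect(0.004, 25.53797)
print(f"{effect:.1%}")  # -> 10.8%
```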
33 These fields are: theology; philosophy and history; social sciences; philology and cultural studies; law; economics; mathematics, physics and computer science; biology, chemistry, earth sciences and pharmaceutics; engineering; agricultural sciences, nutrition and veterinary medicine; medicine (human); dentistry.
34 Calculated as exp(0.004 ∗ 25.53797) − 1, where 25.53797 is the standard deviation of the average productivity variable here.
35 Results are robust to basing the average productivity variable on other productivity measures as well. Results are available from the author on request.
4.3.3 Validity Checks
To assess the validity of the selection effect estimation, I run placebo estimations of the aforementioned models. Academics who start their first tenured position before 2002 are defined as the placebo treatment group here, while academics who start their first tenured affiliation between 2002 and 2005 act as the control group. Both groups switch into the performance pay scheme when they change affiliation, position or contract after 2004, so they face the same selection incentives. The estimation results of the baseline placebo estimation are reported in Panel B of Table 4, with robustness checks in columns 2a, 2b and 2c of Table A.6 Panel A. Reassuringly, neither the
AvgProd ∗ Treat interaction nor the AvgProd ∗ Age ∗ Treat triple interaction is ever significant, so there is no evidence that productive academics in the placebo treatment group are more likely to select into performance pay than academics in the placebo control group. As a final check, I also estimate any changes in affiliation, position or contract switching rates from before to after the implementation of the reform for academics in the treated cohort (those whose first tenured affiliation started before 2005). Table A.7 shows that, while a one standard deviation increase in average productivity increases the likelihood of a switch by around 5.5% on average, this increase grows to 8 or 9% after the implementation of the reform. This reinforces the finding of a positive selection effect of performance pay.
5 CONCLUSION
This paper shows that performance incentives in knowledge creation give rise to greater output quantity, but not to more of the highest quality output. The theoretical model presented in this paper shows that this is what we would expect to see if output quality measures are noisy. This holds despite the fact that the performance incentives studied include implicit, career concerns incentives, which potentially allow for performance assessment over a longer time horizon and therefore a less noisy measure of quality. Even then, in the absence of readily available, precise measures of quality such as novelty or impact, principals may still have to resort to noisier signals of knowledge production quality. It would therefore be valuable if more informative measures of novelty, impact or other quality dimensions became available. This paper employs one potential measure, and with the advent of ever more powerful machine learning algorithms, the availability and precision of relevant performance metrics for knowledge creation could likely be improved. This seems a worthwhile avenue for research and development, not just for academic research, but for knowledge output more generally. Another multi-tasking issue pertains to the different dimensions of academic jobs in particular. If performance in the realm of research is easier to measure than educational performance, for instance, or if more weight is given to research output metrics, academics may have shifted effort away from teaching to focus more on research, even if incentivized on all dimensions. It would be worthwhile to assess if and how performance in teaching and the promotion of young scholars changes in response to performance pay. This is left for future research.
The paper also shows that more productive academics are more likely to sort into higher-powered incentives, so I do not find evidence of crowding out in this regard (Benabou and Tirole, 2003). The theoretical model shows that selection into academia should follow the same pattern to the extent that it is driven by performance incentives. There may be other factors that drive selection into academia or other knowledge jobs, such as risk preferences or differential opportunity costs of the generally long training trajectories required for knowledge work. More research into selection into knowledge creation would therefore be worthwhile. Academic research is an important instance of knowledge work, and understanding the effect of performance pay on both effort and selection in this particular context is valuable in its own right. Academia is also an interesting and useful setting in which to study the organization of knowledge work more generally, and as such, the findings in this paper have implications for knowledge work in other contexts. Academia does, however, have a number of characteristics that may not be present to the same degree in other contexts, and this affects the extent to which the findings in this paper carry over to other settings. The knowledge created in academia is highly visible and available to a broad audience, and academics are (expected to be) highly mobile. Both of these characteristics are conducive to career concerns. The introduction of performance pay, specifically of the implicit, career concerns incentives introduced with the reform studied in this paper, may therefore not have as strong an effect in sectors in which these conditions are not met. In those sectors, principals may have to rely more on explicit, on-the-job performance incentives. These, however, were not found to give rise to an additional significant effort response in academia.
In industries in which knowledge is confidential, it may therefore be worthwhile to contemplate publicizing (some of) the knowledge generated as a means of motivating workers; more research into the publicness of output and reputation concerns would be valuable.
References
Abraham, Sarah, and Liyang Sun. 2018. “Estimating Dynamic Treatment Effects in Event Studies with Heterogeneous Treatment Effects.” arXiv preprint arXiv:1804.05785.
Academics.de. 2016. “Tenure Track: Professor Auf Lebenszeit.”
Aghion, Philippe, Mathias Dewatripont, Caroline Hoxby, Andreu Mas-Colell, and André Sapir. 2010. “The Governance and Performance of Universities: Evidence from Europe and the US.” Economic Policy, 25(61): 7–59.
Aghion, Philippe, Peter Howitt, and David Mayer-Foulkes. 2005. “The Effect of Financial Development on Convergence: Theory and Evidence.” The Quarterly Journal of Economics, 120(1): 173–222.
Audretsch, David B, and Maryann P Feldman. 1996. “R&D Spillovers and the Geography of Innovation and Production.” The American Economic Review, 630–640.
Autor, David H. 2019. “Work of the Past, Work of the Future.” Vol. 109, 1–32.
Azoulay, Pierre, Jeffrey L Furman, Joshua L Krieger, and Fiona Murray. 2015. “Retractions.” Review of Economics and Statistics, 97(5): 1118–1136.
Azoulay, Pierre, Joshua S Graff Zivin, and Gustavo Manso. 2011. “Incentives and Creativity: Evidence from the Academic Life Sciences.” The Rand Journal of Economics, 42(3): 527–554.
Azoulay, Pierre, Waverly Ding, and Toby Stuart. 2009. “The Impact of Academic Patenting on the Rate, Quality and Direction of (Public) Research Output.” The Journal of Industrial Economics, 57(4): 637–676.
Bandiera, Oriana, Iwan Barankay, and Imran Rasul. 2005. “Social Preferences and the Response to Incentives: Evidence from Personnel Data.” The Quarterly Journal of Economics, 917–962.
Benabou, Roland, and Jean Tirole. 2003. “Intrinsic and Extrinsic Motivation.” The Review of Economic Studies, 70(3): 489–520.
Bénabou, Roland, and Jean Tirole. 2006. “Incentives and Prosocial Behavior.” American Economic Review, 96(5): 1652–1678.
Bergsdorf, Wolfgang. 2005. “Richtlinie Der Universitaet Erfurt Ueber Das Verfahren Und Die Vergabe Von Leistungsbezuegen.”
Besley, Timothy, and Maitreesh Ghatak. 2005. “Competition and Incentives with Motivated Agents.” American Economic Review, 95(3): 616–636.
Besley, Timothy, and Maitreesh Ghatak. 2018. “Prosocial Motivation and Incentives.” Annual Review of Economics, 10: 411–438.
Biester, Christoph. 2010. “Der Universitaere Metabolismus, Die Buerokratisierung Der Leistungsorientierten Verguetung in Der W-besoldung.” Online PowerPoint, accessed 6 May 2015.
Björk, Bo-Christer, and David Solomon. 2013. “The Publishing Delay in Scholarly Peer-reviewed Journals.” Journal of Informetrics, 7(4): 914–923.
BMBF. 2002. “Gesetz Zur Reform Der Professorenbesoldung.” ProfBesReformG.
BMI. 2007. “Bericht Zum Besoldungsrechtlichen Vergaberahmen Bei Der Professorenbesoldung Nach § 35 Abs. 5 Bundesbesoldungsgesetz.” Bundesministerium des Innern. Stand 29 February 2008.
Boly, Amadou. 2011. “On the Incentive Effects of Monitoring: Evidence from the Lab and the Field.” Experimental Economics, 14(2): 241–253.
Bonatti, Alessandro, and Johannes Hörner. 2017. “Career Concerns with Exponential Learning.” Theoretical Economics, 12(1): 425–475.
Borjas, George J, and Kirk B Doran. 2015. “Prizes and Productivity: How Winning the Fields Medal Affects Scientific Output.” Journal of Human Resources, 50(3): 728–758.
Borusyak, Kirill, and Xavier Jaravel. 2017. “Revisiting Event Study Designs.” Available at SSRN 2826228.
Boudreau, Kevin J, Karim R Lakhani, and Michael Menietti. 2016. “Performance Responses to Competition across Skill Levels in Rank-order Tournaments: Field Evidence and Implications for Tournament Design.” The Rand Journal of Economics, 47(1): 140–165.
Boudreau, Kevin J, Nicola Lacetera, and Karim R Lakhani. 2011. “Incentives and Problem Uncertainty in Innovation Contests: An Empirical Analysis.” Management Science, 57(5): 843–863.
Bundesgesetzblatt. 1985. “Bundesbeamtengesetz.”
Carpenter, Jeffrey, Peter Hans Matthews, and John Schirm. 2010. “Tournaments and Office Politics: Evidence from a Real Effort Experiment.” The American Economic Review, 100(1): 504–517.
Chan, Ho Fai, Bruno S Frey, Jana Gallus, and Benno Torgler. 2014. “Academic Honors and Performance.” Labour Economics, 31: 188–204.
Chevalier, Judith, and Glenn Ellison. 1999. “Career Concerns of Mutual Fund Managers.” The Quarterly Journal of Economics, 114(2): 389–432.
Clarivate Analytics. 1993-2012a. “ISI Web of Science.”
Clarivate Analytics. 2000-2012b. “Journal Citation Report.”
Clarivate Analytics. 2017. “Journal Citation Report.”
De Groot, Morris H. 1970. Optimal Statistical Decisions. McGraw-Hill.
De Gruyter. 2006. “Kuerschners Deutscher Gelehrten Kalender.” CD-ROM.
De Gruyter. 2008. “Kuerschners Deutscher Gelehrten Kalender.” CD-ROM.
De Gruyter. 2013. “Kuerschners Deutscher Gelehrten Kalender Online.”
Detmer, Hubert, and Ulrike Preissler. 2004. “Abenteuer W, Strategien, Risiken Und Chancen.” Forschung Und Lehre, (6): 308–311.
Detmer, Hubert, and Ulrike Preissler. 2005. “Die Neue Professorenbesoldung, Ein Ueberblick.” Forschung Und Lehre, (5): 256–258.
Detmer, Hubert, and Ulrike Preissler. 2006. “Die W-besoldung Und Ihre Anwendung in Den Bundeslaendern.” Beitraege Zur Hochschulforschung, 28(2): 50–65.
Dewatripont, Mathias, Ian Jewitt, and Jean Tirole. 1999. “The Economics of Career Concerns, Part II: Application to Missions and Accountability of Government Agencies.” The Review of Economic Studies, 66(1): 199–217.
DFG. 2016. “Excellence Initiative 2005-2017.”
DHV. 1999-2013. “Forschung Und Lehre.”
DHV. 2002. “Habilitationen Und Berufungen.” Forschung Und Lehre, 41.
DHV. 2014. “Forschung und Lehre - Wir Ueber Uns.”
Dickinson, David, and Marie-Claire Villeval. 2008. “Does Monitoring Decrease Work Effort?: The Complementarity between Agency and Crowding-out Theories.” Games and Economic Behavior, 63(1): 56–76.
Dilger, Alexander. 2013. “Vor- Und Nachteile Der W-besoldung.” Discussion Paper of the Institute for Organisational Economics. Westfaelische Wilhelms-Universitaet Muenster.
Dohmen, Thomas, and Armin Falk. 2011. “Performance Pay and Multidimensional Sorting: Productivity, Preferences, and Gender.” The American Economic Review, 556–590.
Ederer, Florian, and Gustavo Manso. 2013. “Is Pay for Performance Detrimental to Innovation?” Management Science, 59(7): 1496–1513.
Erat, Sanjiv, and Uri Gneezy. 2016. “Incentives for Creativity.” Experimental Economics, 19(2): 269–280.
Expertenkommission. 2000. “Bericht Der Expertenkommission - Reform Des Hochschuldienstrechts.”
Ferrer, Rosa. 2016. “The Effect of Lawyers’ Career Concerns on Litigation.”
Fitzenberger, Bernd, and Ute Schulze. 2014. “Up or Out: Research Incentives and Career Prospects of Postdocs in Germany.” German Economic Review, 15(2): 287–328.
Freeman, Richard B, and Alexander M Gelber. 2010. “Prize Structure and Information in Tournaments: Experimental Evidence.” American Economic Journal: Applied Economics, 2(1): 149–164.
Gibbons, Robert, and Kevin J Murphy. 1992. “Optimal Incentive Contracts in the Presence of Career Concerns: Theory and Evidence.” Journal of Political Economy, 100(3): 468–505.
Gibbs, Michael, Susanne Neckermann, and Christoph Siemroth. 2017. “A Field Experiment in Motivating Employee Ideas.” Review of Economics and Statistics, 99(4): 577–590.
Gien, Gabriele. 2017. “Satzung Der Katholischen Universitaet Eichstaett-ingolstadt Zur Regelung Des Verfahrens Der Bewertung Der Besonderen Leistungen Zur Vergabe Der Besonderen Leistungsbezuege.”
Goodman-Bacon, Andrew. 2018. “Difference-in-differences with Variation in Treatment Timing.” National Bureau of Economic Research.
Gourieroux, Christian, Alain Monfort, and Alain Trognon. 1984. “Pseudo Maximum Likelihood Methods: Applications to Poisson Models.” Econometrica: Journal of the Econometric Society, 701–720.
Gross, Daniel P. 2016. “Creativity under Fire: The Effects of Competition on Creative Production.” Review of Economics and Statistics, 1–49.
Haeck, Catherine, and Frank Verboven. 2012. “The Internal Economics of a University: Evidence from Personnel Data.” Journal of Labor Economics, 30(3): 591–626.
Hall, Bronwyn H, and Dietmar Harhoff. 2012. “Recent Research on the Economics of Patents.” Annual Review of Economics, 4(1): 541–565.
Handel, Kai Christian. 2005. Die Umsetzung Der Professorenbesoldungsreform in Den Bundesländern. CHE.
Harbring, Christine, Bernd Irlenbusch, and Matthias Kräkel. 2004. “Ökonomische Analyse Der Professorenbesoldungsreform in Deutschland.” 197–219.
Hellmann, Thomas, and Veikko Thiele. 2011. “Incentives and Innovation: A Multitasking Approach.” American Economic Journal: Microeconomics, 3(1): 78–128.
Hochschullehrerbund. 2009. “Die Professorengehaelter in Der W-besoldung, Art Und Umfang Von Berufungsverhandlungen.” Online.
Holmström, Bengt. 1982. “Managerial Incentive Problems: A Dynamic Perspective.” Essays in Economics and Management in Honor of Lars Wahlbeck.
Holmström, Bengt. 1999. “Managerial Incentive Problems: A Dynamic Perspective.” The Review of Economic Studies, 66(1): 169–182.
Holmstrom, Bengt, and Paul Milgrom. 1991. “Multitask Principal-Agent Analyses: Incentive Contracts, Asset Ownership, and Job Design.” Journal of Law, Economics, & Organization, 7: 24.
Hong, Harrison, Jeffrey D Kubik, and Amit Solomon. 2000. “Security Analysts’ Career Concerns and Herding of Earnings Forecasts.” The Rand Journal of Economics, 121–144.
Hossain, Tanjim, and John A List. 2012. “The Behavioralist Visits the Factory: Increasing Productivity Using Simple Framing Manipulations.” Management Science, 58(12): 2151–2167.
HRK. 2014. “Hochschulkompass.”
Huber, Bernd. 2005. “Richtlinien Der Ludwig-maximilians-universitaet Muenchen Zur Regelung Der Grundsaetze Fuer Die Vergabe Von Leistungsbezuegen.”
Hvide, Hans K, and Benjamin F Jones. 2018. “University Innovation and the Professor’s Privilege.” American Economic Review, 108(7): 1860–98.
Iaria, Alessandro, Carlo Schwarz, and Fabian Waldinger. 2018. “Frontier Knowledge and Scientific Production: Evidence from the Collapse of International Science.” The Quarterly Journal of Economics, 133(2): 927–991.
Jaffe, Adam B, Manuel Trajtenberg, and Rebecca Henderson. 1993. “Geographic Localization of Knowledge Spillovers As Evidenced by Patent Citations.” The Quarterly Journal of Economics, 108(3): 577–598.
Jones, Benjamin F. 2009. “The Burden of Knowledge and the Death of the Renaissance Man: Is Innovation Getting Harder?” The Review of Economic Studies, 76(1): 283–317.
Kelly, Bryan, Dimitris Papanikolaou, Amit Seru, and Matt Taddy. Forthcoming. “Measuring Technological Innovation Over the Long Run.”
Kräkel, Matthias. 2006. “Zur Reform Der Professorenbesoldung in Deutschland.” Perspektiven Der Wirtschaftspolitik, 7(1): 105–126.
Lach, Saul, and Mark Schankerman. 2004. “Royalty Sharing and Technology Licensing in Universities.” Journal of the European Economic Association, 2(2-3): 252–264.
Lach, Saul, and Mark Schankerman. 2008. “Incentives and Invention in Universities.” The Rand Journal of Economics, 39(2): 403–433.
Lavy, Victor. 2009. “Performance Pay and Teachers’ Effort, Productivity, and Grading Ethics.” The American Economic Review, 1979–2011.
Lazear, Edward P. 2000. “Performance Pay and Productivity.” The American Economic Review, 90(5): 1346–1361.
Lazear, Edward P, and Paul Oyer. 2012. “Personnel Economics.” The Handbook of Organizational Economics, 479.
Leitungsgremium, Universitaet Augsburg. 2005. “Grundsaetze Der Universitaet Augsburg Fuer Die Vergabe Von Leistungsbezuegen.”
Lerner, Josh, and Julie Wulf. 2007. “Innovation and Incentives: Evidence from Corporate R&D.” The Review of Economics and Statistics, 89(4): 634–644.
Leuven, Edwin, Hessel Oosterbeek, Joep Sonnemans, and Bas Van Der Klaauw. 2011. “Incentives Versus Sorting in Tournaments: Evidence from a Field Experiment.” Journal of Labor Economics, 29(3): 637–658.
Lucas, R. E. 1988. “On the Mechanics of Economic Development.” Journal of Monetary Economics, 22(1): 3–32.
Lünstroth, Pia. 2011. “Leistungslohn Und Kooptation - Eine ökonomische Analyse Der Reform Der Professorenbesoldung.” PhD diss. Universitaet Trier.
Lutter, Mark, and Martin Schröder. 2016. “Who Becomes a Tenured Professor, and Why? Panel Data Evidence from German Sociology, 1980–2013.” Research Policy, 45(5): 999–1013.
Macleod, Bentley. n.d. Beyond Price Theory. MIT Press.
Manso, Gustavo. 2011. “Motivating Innovation.” The Journal of Finance, 66(5): 1823–1860.
McCormack, John, Carol Propper, and Sarah Smith. 2014. “Herding Cats? Management and University Performance.” The Economic Journal, 124(578): F534–F564.
Miklós-Thal, Jeanine, and Hannes Ullrich. 2016. “Career Prospects and Effort Incentives: Evidence from Professional Soccer.” Management Science, 62(6): 1645–1667.
Mohr, Joachim. 2007. “Professoren, Die Vertreibung Der Weisen.” Spiegel Online.
Muralidharan, Karthik, and Venkatesh Sundararaman. 2011. “Teacher Performance Pay: Experimental Evidence from India.” The Journal of Political Economy, 119(1): 39–77.
Oeffentlicher Dienst. 2004. “Gesetz Ueber Die Erhoehung Von Dienst- Und Versorgungsbezuegen in Bund Und Laendern 2003/2004.” Last accessed 11-08-2015.
Oyer, Paul, and Scott Schaefer. 2011. “Personnel Economics: Hiring and Incentives.” Handbook of Labor Economics, 4: 1769–1823.
Preissler, U. 2006. “Zwischenbilanz Professorenbesoldung - Sichtweise Und Beratung Des Deutschen Hochschulverbandes.” Arbeitsgruppe Fortbildung im Sprecherkreis der Deutschen Universitätskanzler.
Pritchard, Rosalind. 2006. “Trends in the Restructuring of German Universities.” Comparative Education Review, 50(1): 90–112.
Romer, Paul M. 1986. “Increasing Returns and Long-run Growth.” The Journal of Political Economy, 94(5): 1002–1037.
Rubin, Jared, Anya Samek, and Roman M Sheremeta. 2018. “Loss Aversion and the Quantity–Quality Tradeoff.” Experimental Economics, 21(2): 292–315.
Schniederjuergen, Axel. 2013a. E-mail.
Schniederjuergen, Axel. 2013b. E-mail.
Shearer, Bruce. 2004. “Piece Rates, Fixed Wages and Incentives: Evidence from a Field Experiment.” The Review of Economic Studies, 71(2): 513–534.
Silva, JMC Santos, and Silvana Tenreyro. 2006. “The Log of Gravity.” The Review of Economics and Statistics, 88(4): 641–658.
Universitaet Regensburg, Universitaetsleitung. 2016. “Grundsaetze Der Universitaet Regensburg Zur Vergabe Von Leistungsbezuege.”
Von Proff, Sidonia, Guido Buenstorf, and Martin Hummel. 2012. “University Patenting in Germany before and After 2002: What Role Did the Professors’ Privilege Play?” Industry and Innovation, 19(1): 23–44.
Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data. MIT Press.
Wuchty, Stefan, Benjamin F Jones, and Brian Uzzi. 2007. “The Increasing Dominance of Teams in Production of Knowledge.” Science, 316(5827): 1036–1039.
Wu, Yanhui, and Feng Zhu. 2018. “Competition, Contracts, and Creativity: Evidence from Novel Writing in a Platform Market.”
(a) Citations (b) Impact Factor-Rated Publications
Figure 1: Number of Publications in Citation and IFR Publication Bins Notes: Histograms depict the coefficient estimates and corresponding 95% confidence intervals of the Post’02 ∗ Treatment (grey bars) and Tenure ∗ Treatment (white bars) interactions in separate regressions with citation quantile frequency variables as dependent variables in sub-figure a, and impact factor rating quantile frequency variables as dependent variables in sub-figure b. To generate these dependent variables, percentiles of citations and impact factor ratings are calculated separately by field and publication year, and the percentile cut-offs are used to generate quantile frequency variables for each author and publication year. All other specifications as in the baseline estimation of the effort effect.
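The construction of the quantile frequency variables described in these notes can be sketched in a few lines: compute percentile cut-offs from all papers in a field-year cell, then count how many of each author's papers fall in each bin. The field-year citation values and author below are hypothetical, and 4 bins stand in for the finer quantiles used in the paper:

```python
from bisect import bisect_right
from collections import Counter

def quantile_cutoffs(values, n_bins):
    """Cut-offs splitting the sorted field-year values into n_bins bins."""
    s = sorted(values)
    return [s[int(len(s) * k / n_bins)] for k in range(1, n_bins)]

def bin_counts(values, cutoffs, n_bins):
    """Count how many of an author's papers fall in each quantile bin."""
    counts = Counter(bisect_right(cutoffs, v) for v in values)
    return [counts.get(b, 0) for b in range(n_bins)]

# Hypothetical citation counts for one field-year cell; 4 bins (quartiles):
field_year_citations = [0, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
cuts = quantile_cutoffs(field_year_citations, 4)
# One author's papers in that cell, binned by the shared cut-offs:
print(bin_counts([2, 5, 89], cuts, 4))  # -> [1, 1, 0, 1]
```

Each element of the resulting vector is one of the per-author dependent variables whose regression coefficients the histograms display.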
(a) Number of Publications (b) Impact Factor-Rated Number of Publications
(c) Total Number of Citations (d) Average Impact Factor Rating
(e) Average Number of Citations
Figure 2: Pre-trends and Effect Dynamics - Treatment vs. Control Cohort Notes: Plots depict the coefficient estimates and corresponding 95% confidence intervals of the interactions of a treatment dummy and time-to-tenure fixed effects in regressions with the following dependent variables: number of publications (panel a), impact factor-rated number of publications (panel b), total number of citations (panel c), average impact factor rating (panel d), and average number of citations (panel e). All other specifications as in the baseline estimation of the effort effect.
(a) Number of Publications (b) Impact Factor-Rated Number of Publications
(c) Total Number of Citations (d) Average Impact Factor Rating
(e) Average Number of Citations
Figure 3: Pre-trends and Effect Dynamics - Placebo Experiment Notes: Figures depict the coefficient estimates and corresponding 95% confidence intervals of the interactions of a treatment dummy and time-to-tenure fixed effects in regressions with the following dependent variables: number of publications (panel a), impact factor-rated number of publications (panel b), total number of citations (panel c), average impact factor rating (panel d), and average number of citations (panel e). The sample is restricted to academics who started their first tenured affiliation at a German public university in 2001, 2002, 2003 or 2004 and did not hold a foreign affiliation immediately prior. Treatment is 1 if an academic makes tenure at a public university in 2003 or 2004 and 0 if they become tenured in 2001 or 2002 (the control group). All other specifications as in the baseline estimation of the effort effect.
(a) Natural and Applied Sciences (b) Social Sciences and Humanities
Figure 4: Heterogeneous Effort Effect by Broad Academic Field Notes: Histograms depict the coefficient estimates and corresponding 95% confidence intervals of the Post’02 ∗ Treatment (grey bars) and Tenure ∗ Treatment (white bars) interactions in separate regressions with the following dependent variables: the number of publications, the impact factor-rated number of publications, the total number of citations to all publications published in a given year, the average impact factor rating of publications published in a given year, and the average number of citations to publications published in a given year. In sub-figure a, the sample is restricted to academics in the natural and applied sciences. In sub-figure b, the sample is restricted to academics in the social sciences and humanities. Mathematics, physics and informatics, biology, chemistry, earth sciences, pharmacology, engineering, medicine, dentistry, veterinary science, agricultural science and nutrition science are classified as natural and applied sciences, while social sciences and humanities comprises theology, philosophy and history, philology and anthropology, law, economics and other social sciences. All other specifications as in the baseline estimation of the effort effect.
(a) 10th (top) Decile (b) 9th Decile
(c) 8th Decile (d) 7th Decile
(e) 6th Decile (f) Below Median
Figure 5: Heterogeneous Effort Effect by Productivity Quantile Notes: Panels are restricted to, respectively, the top five deciles (sub-figures a-e) and below median academics (sub-figure f). Productivity deciles are determined on the basis of the averages of the impact factor-rated number of publications over the three pre-announcement years 1999, 2000 and 2001, separately by academic field and treatment group. All other specifications as in the baseline estimation of the effort effect.
(a) 10th (top) Decile (b) 9th Decile
(c) 8th Decile (d) 7th Decile
(e) 6th Decile (f) Below Median
Figure 6: Heterogeneous Effort Effect - Citation Bins Notes: Results from separate regressions with citation quantile frequency variables as dependent variable. Panels are restricted to, respectively, the top five productivity deciles (sub-figures a-e) and below median productivity academics (sub-figure f). Productivity deciles are determined on the basis of the averages of the impact factor-rated number of publications over the three pre-announcement years 1999, 2000 and 2001, separately by academic field and treatment group. All other specifications as before.
(a) Backward Cosine Similarity (b) Forward Cosine Similarity
Figure 7: Cosine Similarity Distributions Notes: Results from separate regressions with backward cosine similarity quantile frequency variables as dependent variable in sub-figure a, and forward cosine similarity quantile frequency variables as dependent variables in sub-figure b. To generate these dependent variables, percentiles of backward and forward cosine similarity metrics are calculated separately by field and publication year and the percentile cut-offs are used to generate quantile frequency variables for each author and publication year. Further details in text. All other specifications as in the baseline estimation of the effort effect.
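The cosine similarity metrics underlying Figure 7 compare the text of a paper with earlier (backward) or later (forward) work via term-count vectors. A minimal sketch of the metric itself, with made-up word counts:

```python
import math
from collections import Counter

def cosine_similarity(counts_a, counts_b):
    """Cosine similarity between two sparse term-count vectors:
    dot product over the shared terms, divided by the vector norms."""
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical abstracts reduced to word counts:
paper = Counter("performance pay incentives knowledge".split())
prior = Counter("incentives pay contracts".split())
print(round(cosine_similarity(paper, prior), 3))  # -> 0.577
```

A low backward similarity indicates a paper that overlaps little with prior work (a possible novelty signal); a high forward similarity indicates that later work resembles it.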
Figure 8: Quartiles Based on Impact Factor-Rated Number of Publications Notes: Epanechnikov kernel-density estimates of the hazard function for switching to the performance pay scheme for academics in the top quartile (red line) and bottom three quartiles (blue line) of the average productivity distribution. Switches to performance pay are defined as any contract, position or affiliation change after 2005. Quartiles are determined based on three pre-implementation year averages (2002, 2003 and 2004) of the number of impact factor-rated publications. Quartiles are derived separately by academic field and tenure year. The sample is restricted to academics who held a tenured affiliation at a public university before 2005.
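The Epanechnikov kernel-density estimates in Figure 8 smooth the observed switching times. A minimal sketch of the estimator, using the kernel K(u) = 0.75(1 − u²) for |u| ≤ 1 (the sample of switching times and the bandwidth below are illustrative):

```python
def epanechnikov_kde(x, sample, bandwidth):
    """Kernel density estimate at x: (1/nh) * sum_i K((x - x_i)/h),
    with the Epanechnikov kernel K(u) = 0.75 * (1 - u**2) for |u| <= 1."""
    total = 0.0
    for xi in sample:
        u = (x - xi) / bandwidth
        if abs(u) <= 1:
            total += 0.75 * (1 - u * u)
    return total / (len(sample) * bandwidth)

# Illustrative switching times (years since 2005) and bandwidth:
times = [0.5, 1.0, 1.5, 2.0, 3.0, 4.5]
print(round(epanechnikov_kde(1.5, times, bandwidth=1.0), 4))  # -> 0.3125
```

Evaluating the estimator separately for top-quartile and bottom-three-quartile academics over a grid of durations produces smoothed curves of the kind plotted in the figure.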
Table 1: Summary Statistics
Research productivity variables N Mean SD Median Min Max
Number of publications 108363 2.907 7.134 0 0 195
IF-rated publications 108363 9.874 31.38 0 0 1082.363
Citations 108363 102.271 333.992 0 0 11086
Average IF-rating 48552 2.709 2.848 2.075 0 52.589
Average citations 48552 32.691 61.322 19.118 0 2759
Maximum citations 48552 99.164 216.016 42 0 8796
Minimum citations 48552 10.612 41.576 1 0 2759
Notes: The unit of observation is academic i. The sample is restricted to academics who started their first tenured affiliation at a German public university in 2002 to 2007 (excluding those with a foreign affiliation directly prior to this and dropping academics once they pass away or retire) and includes data from 1993 until and including 2012.
Table 2: Baseline Effort Effect Estimation
# Publications IF-rated publications Citations Average IF-rating Average citations
Post’02 * Treatment 0.168*** 0.133*** 0.129** -0.094*** -0.103*
(0.038) (0.051) (0.063) (0.035) (0.062)
Tenure * Treatment -0.011 0.026 0.017 0.025 -0.018
(0.040) (0.052) (0.064) (0.036) (0.067)
Number of Observations 83937 74326 78308 47052 47789
Number of Individuals 4671 4136 4357 3917 4110
Log Likelihood -136647.270 -338248.006 -4508200.290 -70677.097 -736179.301
Chi-squared 2863.231 2856.117 1101.473 611.818 476.042
Notes: The unit of observation is academic i. The sample is restricted to academics who started their first tenured affiliation at a German public university in 2002 to 2007 (excluding those with a foreign affiliation directly prior to this and dropping academics once they pass away or retire) and is estimated using data from the period 1993-2012. The dependent variables are, respectively, the number of publications, the impact factor-rated number of publications, the total number of citations to all publications published in a given year, the average impact factor rating of publications published in a given year, and the average number of citations to publications published in a given year. All dependent variables are defined for academic i in field f and year t, lagged by the average publication lag in field f as reported in Björk and Solomon (2013). Post’02 is 0 before 2002 and 1 thereafter, Tenure is 0 before an academic obtains their first tenured affiliation and 1 thereafter, and Treatment is 1 if an academic makes tenure at a public university in 2005, 2006 or 2007 and 0 if they make tenure at a public university in 2002, 2003 or 2004 (the control group). All specifications control for year and individual fixed effects and fifteen time-to-tenure fixed effects (from seven years before the tenure year to seven years after). Estimation as conditional quasi-maximum likelihood estimation of Poisson fixed effects models with robust standard errors clustered at the individual level.
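The Poisson quasi-maximum likelihood criterion behind Table 2 depends only on correctly specifying the conditional mean exp(x'β), which is why it is robust to overdispersion in publication counts. A minimal sketch of the per-observation objective (the count and linear index below are illustrative, with the fixed effects folded into the index):

```python
import math

def poisson_quasi_loglik(y, linear_index):
    """Per-observation Poisson quasi-log-likelihood, dropping the
    log(y!) constant: y * x'beta - exp(x'beta)."""
    return y * linear_index - math.exp(linear_index)

# Illustrative publication count and linear index x'beta:
y, xb = 3, 1.1
print(round(poisson_quasi_loglik(y, xb), 4))  # -> 0.2958
```

Summing this objective over observations and maximizing over β yields the QMLE; the clustered robust standard errors reported in the table then do not rely on the Poisson variance assumption.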
Table 3: Heterogeneous Responses - Interactions
# Publications IF-rated publications Citations Average IF-rating Average citations
Post’02 * Treatment 0.280*** 0.296** 0.249* -0.116 -0.107
(0.071) (0.116) (0.129) (0.077) (0.110)
Post’02 * Treatment * Top Decile -0.152* -0.199 -0.063 0.038 0.161
(0.087) (0.132) (0.152) (0.091) (0.140)
Post’02 * Treatment * 9th Decile -0.331*** -0.443*** -0.391** -0.016 0.014
(0.095) (0.140) (0.163) (0.096) (0.147)
Post’02 * Treatment * 8th Decile -0.152 -0.290* -0.497*** -0.008 -0.270
(0.101) (0.152) (0.182) (0.094) (0.181)
Post’02 * Treatment * 7th Decile -0.022 -0.019 -0.053 0.004 -0.222
(0.107) (0.155) (0.176) (0.125) (0.171)
Post’02 * Treatment * 6th Decile -0.094 0.134 0.252 0.263** 0.470***
(0.124) (0.217) (0.223) (0.110) (0.177)
Tenure * Treatment -0.032 0.093 0.046 0.090 -0.042
(0.079) (0.115) (0.134) (0.071) (0.109)
Tenure * Treatment * Top Decile -0.035 -0.150 -0.199 -0.098 -0.143
(0.080) (0.114) (0.139) (0.083) (0.131)
Tenure * Treatment * 9th Decile 0.118 0.030 0.152 -0.124 0.064
(0.096) (0.134) (0.172) (0.083) (0.136)
Tenure * Treatment * 8th Decile 0.071 0.020 0.170 -0.059 0.102
(0.092) (0.130) (0.165) (0.082) (0.152)
Tenure * Treatment * 7th Decile 0.021 -0.107 -0.086 0.014 0.226
(0.101) (0.139) (0.191) (0.114) (0.166)
Tenure * Treatment * 6th Decile 0.110 -0.049 -0.004 -0.217** -0.194
(0.111) (0.172) (0.199) (0.096) (0.165)
Number of Observations 80953 71540 75504 45588 46324
Number of Individuals 4505 3981 4201 3771 3966
Log Likelihood -131129.798 -321279.569 -4287805.762 -68430.475 -708085.051
Chi-squared 3329.827 3418.293 1569.162 687.096 509.727
Notes: Results are from the estimation of the baseline model augmented with dummy variables for productivity deciles and their double and triple interactions with Post'02, Tenure and Treatment. Productivity deciles are determined on the basis of the averages of the impact factor-rated number of publications over the three pre-announcement years 1999, 2000 and 2001, separately by academic field and treatment group. All other specifications are as before.
Table 4: Selection Analysis
Panel A: Treatment versus Control Panel B: Placebo
1a 1b 2a 2b 3a 3b
Treatment 0.658 0.046 0.576 -0.207 -0.790 -1.653**
(0.401) (0.426) (0.412) (0.436) (0.624) (0.667)
Age -0.126*** -0.299*** -0.131*** -0.313*** -0.165*** -0.449***
(0.007) (0.017) (0.007) (0.017) (0.010) (0.040)
Avg Productivity -0.001 -0.004* -0.038*** -0.056*** -0.003 -0.006
(0.002) (0.002) (0.015) (0.013) (0.014) (0.015)
Age * Treatment -0.028*** -0.007 -0.026*** -0.001 0.013 0.034**
(0.009) (0.009) (0.009) (0.010) (0.013) (0.014)
Avg Productivity * Treatment 0.004** 0.007*** 0.032** 0.051*** -0.008 -0.005
(0.002) (0.002) (0.015) (0.014) (0.015) (0.016)
Avg Productivity * Age 0.001*** 0.001*** 0.000 0.000
(0.000) (0.000) (0.000) (0.000)
Avg Productivity * Age * Treatment -0.001* -0.001*** 0.000 0.000
(0.000) (0.000) (0.000) (0.000)
Age at Tenure 0.192*** 0.200*** 0.293***
(0.015) (0.016) (0.038)
Constant 2.381*** 1.599*** 2.576*** 1.924*** 3.667*** 2.721***
(0.329) (0.347) (0.334) (0.350) (0.449) (0.479)
Number of Observations 80131 80131 80131 80131 51431 51431
Number of Subjects 14972 14972 14972 14972 6960 6960
Number of Switches 2435 2435 2435 2435 1099 1099
Log Likelihood -7545.484 -7404.981 -7541.479 -7394.569 -3232.134 -3178.716
Chi-squared 1365.450 1310.708 1378.937 1326.956 595.064 557.869
Rho 1.345 1.653 1.345 1.671 1.439 2.516
Notes: The unit of observation is academic i. The table reports estimation results of Weibull proportional hazard models of selection into performance pay. Any first affiliation, position or contract change after implementation of the pay reform (as of 2005) is considered a switch to performance pay (the "failure" event) and the time until such a switch is used as duration variable. Academics are "at risk" of switching after their most recent affiliation/position/contract change if they are tenured, at a public university and not retired. In Panel A, the treatment variable is 1 for academics who have made tenure before 2005 and 0 for those who make tenure afterwards. Panel B reports the results for a placebo experiment where the placebo-treatment group comprises academics who start their first tenured position before 2002, while academics who start their first tenured affiliation between 2002 and 2005 act as placebo-control group. "Avg Productivity" is calculated as the three-year pre-implementation average (2002-2004) of the impact factor-rated number of publications. The age variable is equal to an author's self-reported age if known, and equal to a synthetic age otherwise. The synthetic age is calculated using the average age at habilitation, promotion or tenure. All specifications include academic field fixed effects and are estimated for years t>2004 and for academics who start their first tenured affiliation as of 1999 (excluding those with a foreign affiliation directly prior to this). The models in the "b" columns also control for synthetic age at tenure. Standard errors are robust and clustered by individual academic.
Appendix
A1 Model of Performance Pay and Multi-Tasking
This Appendix provides a detailed exposition of the model outlined in the main text. The set-up of the model is provided in the main text and not repeated here. I first discuss the full insurance case as a benchmark before presenting the performance pay case. I then derive implications for the effort and selection effects of performance pay by comparing equilibrium behavior under performance pay to that under a flat wage system. Finally, I show that equivalent effort and selection results hold for an incentive system that features career concerns only, and that the results for selection into performance pay carry over to selection into academia and other labor markets featuring such performance pay.
A1.1 Flat Wage
Suppose that principals can offer flat wage contracts only and, moreover, that they cannot tailor wages to their beliefs about agent ability or effort. This is the case in the German age-related pay system, in which principals (universities) do not have discretion over wage contracts, and would apply to other markets in which wage contracts are similarly constrained. With a per-period discount factor δ, the expected lifetime utility of an agent is given by:
\[
U = E\left\{-\exp\left[-r\sum_{t=0}^{\infty}\delta^{t}\left(w_{t}-c(\vec{e}_{t})\right)\right]\right\}
\]
In this benchmark flat wage case, pay w_t ≥ 0 does not depend on output or effort in any period, so an agent's payoff in period t, in certainty equivalents, is:
\[
u_{t}(w_{t},\vec{e}_{t}) = E\left\{w_{t}-c(\vec{e}_{t})\right\} = w_{t}-c(\vec{e}_{t}) \qquad (7)
\]
It follows that optimal effort in any period t equals minimum effort levels:
\[
\arg\max_{\vec{e}_{t}}\left[u_{t}(w_{t},\vec{e}_{t})\right] = \arg\max_{\vec{e}_{t}}\left\{w_{t}-c(\vec{e}_{t})\right\} = \bar{\vec{e}} \qquad (8)
\]
Any differences in effort therefore reflect differences in intrinsic motivation, minimum output requirements such as tenure requirements, or other such mechanisms. Suppose the agent's outside option yields per-period utility u. Principals then need to set the wage w_t such that E{w_t − c(e_t*)} = w_t ≥ u (the cost of minimum effort being normalized to zero) in order to attract agents. In equilibrium, therefore, w_t = u.
A1.2 Career Concerns and Incentive Contracts
Consider now the case of a perfectly competitive labor market in which only short-term contracts are feasible. As in Gibbons and Murphy (1992), I restrict attention to linear contracts of the form
w_t(y_t) = c_t + b_t^T y_t. This not only ensures tractability, but Holmstrom and Milgrom (1991) also show that optimal contracts are linear in a setting with comparable assumptions about agent preferences and output noise. In each period, the timing of actions is as follows: principals offer a contract (w_t) to agents; agents pick the contract that yields the highest expected utility; agents choose and exert effort; output materializes, principals receive the output produced by the agents they employ, and agents are paid according to their contract terms. Because output is observable to all market participants, the market can use this information to update its beliefs about agent ability. Given the assumptions on ability and output noise, y_t is bivariate normal. The prior distribution of θ is normal as well, and hence so is the posterior distribution of θ. Using well-known formulas in DeGroot (1970), the mean and variance of this posterior distribution of θ, given past output (y_0, .., y_{t−1}) and conjectured effort levels (ê_0, .., ê_{t−1}), are given by, respectively,
\[
m_{t} := E\left[\theta \mid (\vec{y}_{0},\ldots,\vec{y}_{t-1});\,(\hat{\vec{e}}_{0},\ldots,\hat{\vec{e}}_{t-1})\right]
= \frac{\sigma_{\varepsilon}^{2}\sigma_{\nu}^{2}\,m_{0} + \sigma_{0}^{2}\sum_{s=0}^{t-1}\left[\sigma_{\nu}^{2}\left(y_{p,s}-\hat{e}_{p,s}\right) + \sigma_{\varepsilon}^{2}\left(y_{q,s}-\hat{e}_{q,s}\right)\right]}{\sigma_{\varepsilon}^{2}\sigma_{\nu}^{2} + t\,\sigma_{0}^{2}\left(\sigma_{\varepsilon}^{2}+\sigma_{\nu}^{2}\right)} \qquad (9)
\]
and
\[
\sigma_{t}^{2} := \operatorname{var}(\theta_{t}) = \frac{\sigma_{0}^{2}\,\sigma_{\varepsilon}^{2}\sigma_{\nu}^{2}}{\sigma_{\varepsilon}^{2}\sigma_{\nu}^{2} + t\,\sigma_{0}^{2}\left(\sigma_{\varepsilon}^{2}+\sigma_{\nu}^{2}\right)} \qquad (10)
\]
Perfect competition in the labor market implies that agents are offered contracts that will earn them their expected productivity, that is:
\[
E\left[\mathbf{1}^{T}\vec{y}_{t}\right] = \left(m_{t}+\hat{e}_{p,t}\right) + \left(m_{t}+\hat{e}_{q,t}\right) = E\left[w_{t}(\vec{y}_{t})\right] = c_{t} + \vec{b}_{t}^{T}E\left[\vec{y}_{t}\right] \qquad (11)
\]
Here 1^T denotes a 1×2 matrix of ones.^36 An agent's expected lifetime utility is then given by
\[
U = E\left\{-\exp\left[-r\sum_{t=0}^{\infty}\delta^{t}\left(c_{t}+\vec{b}_{t}^{T}\vec{y}_{t}-c(\vec{e}_{t})\right)\right]\right\}
= E\left\{-\exp\left[-r\sum_{t=0}^{\infty}\delta^{t}\left(\left(m_{t}+e_{p,t}\right)+\left(m_{t}+e_{q,t}\right)-c(\vec{e}_{t})\right)\right]\right\} \qquad (12)
\]
In any period t, effort affects the payoff in that period, through the explicit bonus b_t, as well as future wage payments, through updated beliefs about ability. If only the latter, career concerns incentives, are present, we know, following Holmström (1999) and using (9) and (12), that optimal effort is given by the following first order conditions:
\[
\frac{\partial c(\vec{e}_{t})}{\partial e_{p,t}} = \sigma_{\nu}^{2}\,CC_{t}; \qquad \frac{\partial c(\vec{e}_{t})}{\partial e_{q,t}} = \sigma_{\varepsilon}^{2}\,CC_{t} \qquad (13)
\]
where
\[
CC_{t} = \sum_{\tau=0}^{\infty}\delta^{\tau}\,\frac{\sigma_{0}^{2}}{\sigma_{\varepsilon}^{2}\sigma_{\nu}^{2}+(t+\tau)\,\sigma_{0}^{2}\left(\sigma_{\varepsilon}^{2}+\sigma_{\nu}^{2}\right)}.
\]
As noted in Macleod (n.d.), adding explicit performance incentives to career concerns in this model does not affect the career concerns incentives. Ability enters output additively and does not affect the marginal cost of effort, so the optimal bonus b_t does not depend on m_t, as will be shown below. Future income risk is therefore unaffected by effort, and career concerns incentives are unaffected by the introduction of explicit performance incentives. The first order conditions for optimal effort are thus
\[
\frac{\partial c(\vec{e}_{t})}{\partial e_{p,t}} = \left(e_{p,t}-\bar{e}_{p}\right) + d\left(e_{q,t}-\bar{e}_{q}\right) = b_{p,t} + \sigma_{\nu}^{2}\,CC_{t} \qquad (14)
\]
\[
\frac{\partial c(\vec{e}_{t})}{\partial e_{q,t}} = \left(e_{q,t}-\bar{e}_{q}\right) + d\left(e_{p,t}-\bar{e}_{p}\right) = b_{q,t} + \sigma_{\varepsilon}^{2}\,CC_{t} \qquad (15)
\]
36 This vector of ones implies that principals' return to the output quantity and quality (signal) is equal to this output (signal). It is straightforward to allow for other rates of return and all the model's results would continue to hold.
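The career-concerns term CC_t that enters these marginal incentives can be explored numerically. The following Python sketch uses purely illustrative parameter values (not estimates from the paper, and with s0, se, sv standing in for the variances σ_0², σ_ε², σ_ν²); it checks that the posterior variance in (10) amounts to standard normal precision addition, with two independent signals per period, and that CC_t fades as beliefs about ability harden over the career.

```python
# Illustrative sketch of equations (10) and the CC_t sum; all parameter
# values are assumptions made for this example only.

def post_var(t, s0, se, sv):
    """Posterior variance of ability after t periods, equation (10)."""
    return (s0 * se * sv) / (se * sv + t * s0 * (se + sv))

def cc(t, s0, se, sv, delta, horizon=10_000):
    """Career-concerns term CC_t, truncating the infinite sum at `horizon`."""
    return sum(
        delta ** tau * s0 / (se * sv + (t + tau) * s0 * (se + sv))
        for tau in range(horizon)
    )

s0, se, sv, delta = 1.0, 2.0, 3.0, 0.9  # illustrative values of the variances and discount factor

# Equation (10) is standard Bayesian normal updating: posterior precision
# equals prior precision plus t pairs of signal precisions.
for t in range(5):
    lhs = 1.0 / post_var(t, s0, se, sv)
    rhs = 1.0 / s0 + t * (1.0 / se + 1.0 / sv)
    assert abs(lhs - rhs) < 1e-9

# Career concerns incentives decline with tenure t: beliefs harden, so a
# given output moves the market's assessment of ability less.
assert cc(0, s0, se, sv, delta) > cc(5, s0, se, sv, delta) > cc(20, s0, se, sv, delta)
```

The precision-addition identity makes the comparative statics transparent: CC_t is largest early in the career, when the market's beliefs are still diffuse.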
Substituting (14) into (15) and rearranging yields the following expressions for optimal effort:
\[
e_{p,t}^{*} = \bar{e}_{p} + \frac{1}{1-d^{2}}\left[\left(b_{p,t}+\sigma_{\nu}^{2}CC_{t}\right) - d\left(b_{q,t}+\sigma_{\varepsilon}^{2}CC_{t}\right)\right] \qquad (16)
\]
\[
e_{q,t}^{*} = \bar{e}_{q} + \frac{1}{1-d^{2}}\left[\left(b_{q,t}+\sigma_{\varepsilon}^{2}CC_{t}\right) - d\left(b_{p,t}+\sigma_{\nu}^{2}CC_{t}\right)\right] \qquad (17)
\]
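As a sanity check on the algebra, a short Python sketch (with placeholder values for d and for the combined incentive terms, which are not taken from the paper) confirms that the closed forms (16) and (17) satisfy the first order conditions (14) and (15):

```python
# Verify that the closed-form efforts (16)-(17) solve the FOC system
# (14)-(15). Values below are illustrative placeholders only.

d = 0.4   # effort substitution parameter, |d| < 1 (assumed)
A = 1.2   # stands in for b_{p,t} + sigma_nu^2 * CC_t
B = 0.7   # stands in for b_{q,t} + sigma_eps^2 * CC_t

# Closed forms (16)-(17), expressed in deviations from minimum effort:
x = (A - d * B) / (1 - d ** 2)   # e*_{p,t} - e_bar_p
y = (B - d * A) / (1 - d ** 2)   # e*_{q,t} - e_bar_q

# FOCs (14)-(15): marginal cost equals the total (explicit plus career
# concerns) incentive on each task.
assert abs((x + d * y) - A) < 1e-9
assert abs((y + d * x) - B) < 1e-9
```

With d > 0 the tasks are substitutes in the cost function, which is why a stronger incentive on quality (a larger B) crowds out quantity effort x in (16).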
To derive the optimal bonus b_t*, I again use that b_t affects only the effort and income risk in period t and is therefore chosen to maximize agent utility in that period. This amounts to optimizing the certainty equivalent with respect to b_t:
\[
\vec{b}_{t}^{*} = \arg\max_{\vec{b}_{t}}\left\{\left(m_{t}+e_{p,t}(\vec{b}_{t})\right) + \left(m_{t}+e_{q,t}(\vec{b}_{t})\right) - c\left(\vec{e}_{t}(\vec{b}_{t})\right) - \frac{r}{2}\,\vec{b}_{t}^{T}\Sigma_{t}^{2}\,\vec{b}_{t}\right\}
\]
where
\[
\Sigma_{t}^{2} = \begin{bmatrix}\sigma_{t}^{2}+\sigma_{\varepsilon}^{2} & \sigma_{t}^{2}\\ \sigma_{t}^{2} & \sigma_{t}^{2}+\sigma_{\nu}^{2}\end{bmatrix}.
\]
The resulting first order conditions are
\[
\left(1-\frac{\partial c(\vec{e}_{t})}{\partial e_{p,t}}\right)\frac{\partial e_{p,t}(\vec{b}_{t})}{\partial b_{p,t}} + \left(1-\frac{\partial c(\vec{e}_{t})}{\partial e_{q,t}}\right)\frac{\partial e_{q,t}(\vec{b}_{t})}{\partial b_{p,t}} = r\left[b_{p,t}\left(\sigma_{t}^{2}+\sigma_{\varepsilon}^{2}\right) + b_{q,t}\,\sigma_{t}^{2}\right] \qquad (18)
\]
\[
\left(1-\frac{\partial c(\vec{e}_{t})}{\partial e_{q,t}}\right)\frac{\partial e_{q,t}(\vec{b}_{t})}{\partial b_{q,t}} + \left(1-\frac{\partial c(\vec{e}_{t})}{\partial e_{p,t}}\right)\frac{\partial e_{p,t}(\vec{b}_{t})}{\partial b_{q,t}} = r\left[b_{q,t}\left(\sigma_{t}^{2}+\sigma_{\nu}^{2}\right) + b_{p,t}\,\sigma_{t}^{2}\right] \qquad (19)
\]
Using (14) and (15) to substitute for the terms on the left-hand side, both directly and via the implicit function theorem, we get
\[
b_{p,t}^{*} = \frac{1-\sigma_{\nu}^{2}CC_{t}-r\,b_{q,t}^{*}\,\sigma_{t}^{2}}{1+r\left(\sigma_{t}^{2}+\sigma_{\varepsilon}^{2}\right)}
\]
\[
b_{q,t}^{*} = \frac{1-\sigma_{\varepsilon}^{2}CC_{t}-r\,b_{p,t}^{*}\,\sigma_{t}^{2}}{1+r\left(\sigma_{t}^{2}+\sigma_{\nu}^{2}\right)}
\]
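These two best-response expressions determine the optimal bonuses jointly. A brief Python sketch (with purely illustrative parameter values; st, se, sv abbreviate σ_t², σ_ε², σ_ν²) solves the implied 2×2 linear system by Cramer's rule and verifies that both expressions hold at the solution:

```python
# Solve the two bonus equations above as a joint 2x2 linear system and
# check both best-response expressions. Parameters are illustrative.

r, st, se, sv, CC = 0.5, 0.3, 2.0, 3.0, 0.1  # risk aversion, sigma_t^2, sigma_eps^2, sigma_nu^2, CC_t

# Rewriting each expression as a linear equation in (b_p, b_q):
a11, a12, c1 = 1 + r * (st + se), r * st, 1 - sv * CC
a21, a22, c2 = r * st, 1 + r * (st + sv), 1 - se * CC

# Cramer's rule for the 2x2 system.
det = a11 * a22 - a12 * a21
bp = (c1 * a22 - a12 * c2) / det
bq = (a11 * c2 - a21 * c1) / det

# Each bonus satisfies its best-response expression given the other.
assert abs(bp - (1 - sv * CC - r * bq * st) / (1 + r * (st + se))) < 1e-9
assert abs(bq - (1 - se * CC - r * bp * st) / (1 + r * (st + sv))) < 1e-9
```

The sketch makes the economics visible: stronger career concerns (a larger CC_t) or more risk aversion and noise lower the optimal explicit bonus on the corresponding task.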
Substituting one into the other yields