Which Workers’ Wages Track Productivity? The Role of Position Specificity and On-the-Job Search

Justin Bloesch & Bledi Taska

September 30, 2020

Abstract

Which workers’ wages track productivity regardless of the state of the labor market? Evidence suggests that wages for workers in high-wage occupations (i) have been relatively stable across the business cycles over the last two recessions and recoveries, and (ii) covary systematically with firm characteristics such as productivity or firm size. Conversely, wages in low-wage occupations have comoved more with labor market slack over the last two business cycles, and a broad literature suggests that firm characteristics, such as size or productivity, no longer have a significant relationship with wages in low-wage occupations. In this paper, we introduce a concept of position specificity to explain these differential wage patterns across occupations. Using online job posting data from Burning Glass Technologies, we construct a measure of position specificity based on jobs’ posted skill requirements, using a clustering algorithm to identify occupations in which jobs are highly differentiated in their tasks from other jobs within the firm. Next, we develop a model with worker-position skill specificity and on-the-job search that can rationalize the above observations. This is achieved without imposing exogenous heterogeneity in worker bargaining power, while remaining consistent with a range of recent findings about wage setting and recruiting behavior that have challenged other popular models. We document the differential sensitivity of wages in low-wage occupations to labor market slack over the last two business cycles, and we offer preliminary evidence that position specificity may better account for these wage patterns than an occupation’s relative wage.

1 Introduction

The evolution of the wage distribution in the US is one of the most thoroughly researched subjects in economics. In this paper, we hope to highlight two thus-far unrelated patterns in the labor market and propose a new concept of position specificity to jointly explain these observations. In doing so, we hope to contribute to understanding the dynamics of the wage distribution over the past two business cycles, while offering a novel theoretical foundation for heterogeneity in wage determination across workers.

The first of these patterns is the greater sensitivity of wages in low-wage occupations to aggregate labor market slack. This pattern is well illustrated by the wage dynamics of high- and low-wage occupations during the current recovery. Using wage data from the Occupational Employment Statistics (OES), Figure 1 plots a 6-digit occupation’s log median wage in 2010 against that occupation’s nominal wage growth in two periods, 2010-14 and 2014-18. These periods were chosen to avoid the large and sudden compositional changes of employment during the Great Recession. The periods differ most notably in that throughout the earlier period the labor market was very slack, whereas in the second period the labor market was substantially tighter. Figure 1 shows that wage growth in high-wage occupations was nearly constant over these two periods, whereas wage growth in low-wage occupations was substantially lower in the early period and showed nearly perfect catch-up growth in the later period.

As shown in Figure 2, it is difficult to rationalize these wage patterns of low- and high-wage occupations between 2010 and 2018 with changes in demand alone. Figure 2 plots employment growth by occupation over the same periods, showing that employment growth in the two periods is essentially identical for both high- and low-wage occupations.

In addition to differences within business cycles, there is evidence that low-wage occupations’ relative sensitivity to slack holds across business cycles as well. Autor (2014) shows that despite the acceleration of employment in non-routine manual jobs from the 1990s to the early 2000s, wage growth suddenly decelerated as the labor market became slack in the new millennium.
Over the same period, the growth of high-wage jobs slowed, but wage growth remained high in these high-wage jobs. Taken together, it appears that wages in high-wage occupations are driven upwards regardless of labor market conditions, while wages in low-wage occupations are severely affected by labor market slack.

The second pattern we seek to rationalize is that wages for workers in high-wage occupations appear to covary more with firm characteristics, such as firm size or output per worker, and we hypothesize that these patterns may reflect true firm premia. On the other hand, wages in low-wage occupations appear to covary much less with firm characteristics, suggesting less variation in firm premia. Evidence on differential firm premia falls generally into two camps. First are recent studies that measure the passthrough of firm productivity shocks to wages. Garin & Silvério (2018) show using Portuguese data that shocks to firm value added per worker are passed on to workers in low- industries, indicating that jobs with large barriers between internal and external labor markets facilitate passthrough of productivity shocks to wages. Kline et al. (2019) show that firms with high-value patents raise wages, primarily for top quartile employees, specifically men. Lastly, Friederich et al. (2019), using Swedish data, find that while temporary shocks to firm productivity pass on more to the wages of low-education workers, permanent shocks to productivity are passed on much more to high-education workers. This finding is important because, unlike Garin & Silvério (2018) and Kline et al. (2019), it says that even in the long term, skilled workers at more productive firms receive a larger firm premium than do less skilled workers.

In addition to differential pay premia linked to productivity, other studies show that large and high-paying firms offer differentially higher wage premia for “higher” type workers or workers in high-wage occupations in the cross section, such as Bloom et al. (2018) and Mueller et al. (2017). Bloom et al. show that prior to 1990, the firm size premium for workers with a high school diploma or less was large, nearly 20%, but has now all but disappeared. However, for workers with a college degree or more, the wage premium for being at a large firm has held steady at around 10%. Mueller et al. find that the highest paying occupations at large firms pay substantially more than at small firms. Using data from the UK between 2004 and 2013, they detail that within-firm upper tail inequality (e.g. 90/50 ratios) grows substantially with firm size, while lower tail inequality (50/10 ratios) does not. In the sorting/worker and firm fixed effects literature, Mogstad et al. (2019) find in the US that moving a 20th percentile worker from the lowest to the highest type of firm raises wages by 22%, while an equivalent move for an 80th percentile worker yields a 78% wage increase.
While these cross-sectional studies do not establish a causal link between firm productivity and heterogeneous wage premia, they would be consistent with such a finding if workers in high-wage occupations differentially received firm productivity premia.

Together, these two patterns can be loosely described with the reduced-form relationship:

$$w_{jt} = \gamma_y y_{jt} + \gamma_\theta \theta_t,$$

where $y_{jt}$ is output per worker at firm $j$ at time $t$, $\theta_t$ is a measure of labor market tightness at time $t$, and $\gamma_y$ and $\gamma_\theta$ are the weights determining the wage. In summary, our brief survey of the literature suggests that $\gamma_y^{high} > \gamma_y^{low}$ and $\gamma_\theta^{high} < \gamma_\theta^{low}$. We seek to provide an explanation for both of these inequalities with one mechanism: heterogeneity in position specificity.

This paper offers three contributions to the literature. First, we derive a model in which the wage patterns described above are determined by the specificity of skills required by a position, rather than the level of skills. The intuition for this model is that highly skilled workers are, on average, more likely to be differentiated in their tasks within the firm. Not only might this imply that workers are less substitutable within the firm, but also that their horizontal differentiation increases the difficulty of finding a replacement in the case of a vacancy. Drawing on Lazear’s (2009) “Skill-Weights Approach”, firms with positions that require unique combinations of skills search in thinner labor markets. Hence, firms with positions defined by high specificity must have both (i) costly reallocation of workers across positions in the firm and (ii) time-consuming search (or ) processes to fill vacant positions.

The second contribution is to provide a measure of position specificity. To do this, we will use data from Burning Glass Technologies, which collects and cleans nearly the entire universe of online vacancy postings since 2010. These vacancy postings provide information on required skills, as well as the company name, location, and occasionally the wage of the postings. We will use these data to construct average position specificity measures by occupation. Third, we will document that some of these specificity measures successfully account for the different relationship between slack and wages for low-wage and high-wage occupations in aggregate panel regressions. In future work, we plan to develop identified tests of differential causal effects of slack on wages.

A key question arises from thinking of workers as horizontally differentiated: how should their labor inputs be aggregated into a production function? To answer this, we introduce a novel production function with two inputs: the number of positions and the fraction of positions filled. The crucial feature is that the lost product of having a position go unfilled is greater than the marginal product of adding a new position. Depending on the functional form, this means that having fewer positions costs the firm marginal product, while unfilled positions cost the firm average product.
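
The distinction between marginal and average product can be checked numerically. The following is a minimal sketch under an assumed functional form, Y = A * N^alpha * f^eta, where N is the number of positions and f the fraction filled. This form, the parameter values, and the names `output`, `alpha`, and `eta` are illustrative assumptions and are not taken from the model in this paper.

```python
def output(n_positions, frac_filled, A=1.0, alpha=0.7, eta=1.0):
    """Assumed illustrative form Y = A * N**alpha * f**eta (not the paper's)."""
    return A * n_positions ** alpha * frac_filled ** eta


N = 100  # hypothetical number of positions, all filled at baseline

# Case eta = 1: the filled fraction scales output one-for-one.
full = output(N, 1.0)
loss_unfilled = full - output(N, 1 - 1 / N)  # one of the N positions is vacant
gain_new = output(N + 1, 1.0) - full         # one additional filled position
average_product = full / N

# Corner case eta = alpha: Y = A * (f * N)**alpha, i.e. homogeneous labor.
loss_homog = output(N, 1.0, eta=0.7) - output(N, 1 - 1 / N, eta=0.7)

print(f"vacancy loss (eta=1):      {loss_unfilled:.4f}")
print(f"average product:           {average_product:.4f}")
print(f"gain from a new position:  {gain_new:.4f}")
print(f"vacancy loss (eta=alpha):  {loss_homog:.4f}")
```

With eta = 1 in this assumed form, a vacancy costs the firm average product, which exceeds the gain from adding a position whenever alpha < 1. With eta = alpha, labor is homogeneous and a vacancy costs approximately the marginal product.
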
This feature can be seen as an intermediate case of O-Ring production as in Kremer (1993). Under O-Ring production, if a single worker fails to do their task, the entire output of the firm is lost. In that case, a worker failing to do their job destroys total product, even though an additional worker adds only marginal product if they successfully perform their task. The production function we offer here is a more general form that includes fully homogeneous labor and O-Ring production as corner cases.

With unfilled vacancies and the need for replacement hires playing a distinct role, the process through which firms retain workers and find new hires becomes especially important. We embed these new firms in a frictional labor market where, because positions themselves are differentiated and require their own unique combination of skills, firms recruit for each position separately, position by position. Workers have preferences over wages as well as idiosyncratic, time-varying, non-wage preferences over employers, which are private information to the worker, and workers are allowed to search on the job. Firms, knowing this, set wages to optimize the tradeoff between wage costs and turnover costs. This is where market thinness matters: firms that search in thinner markets face more steeply diminishing returns to search effort, constraining them to settle for longer vacancy durations.¹

For firms with the production function described above, having higher average product directly increases the cost of unfilled vacancies, increasing the marginal value of retention. Thus, the fact that unfilled vacancies cost the firm average product, combined with high-specificity firms being constrained in how fast they can fill a vacant position, means that firms with higher average product will offer incumbent workers higher wages, even if the marginal product of a new position is the same as at a firm with lower average productivity. This problem of long, costly vacancies is less prominent for firms with low-specificity positions. Among firms with less convex recruiting costs, those with higher productivity can more easily choose higher intensity and fill vacancies faster, mitigating the higher turnover costs due to higher average product.

This exposition reveals the core mechanism of this model: when firms have upward-sloping retention functions with respect to the wage, firms set the marginal cost of retention, the wage, equal to the marginal benefit of retention, the avoidance of turnover costs. Therefore, slack will have a smaller effect on the wages of high-specificity positions if turnover costs respond less elastically to slack. This is achieved in this model by giving the replacement costs of high-specificity firms a quasi-fixed component: if, even in perfectly slack labor markets, firms with high-specificity positions are constrained from quickly filling vacancies with replacements, then the cyclical component of replacement costs is smaller as a share of total replacement costs.
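
One stylized way to see this logic is the following one-period sketch. The functional forms and symbols here ($r$, $C$, $\bar{C}$, $c$) are illustrative assumptions and not the model developed later in the paper. Let $r(w)$ be the probability that a worker paid $w$ stays, with $r' > 0$ and $r'' < 0$, and let a separation cost the firm $C(\theta) = \bar{C} + c(\theta)$, where $\bar{C}$ is a quasi-fixed component and $c(\theta)$ rises with tightness $\theta$.

```latex
\begin{align*}
\max_{w}\; & y - w - \bigl(1 - r(w)\bigr)\,C(\theta) \\
\text{FOC:}\quad & 1 = r'(w)\,C(\theta) \\
\frac{dw}{d\theta} &= -\frac{r'(w)\,C'(\theta)}{r''(w)\,C(\theta)} > 0 \\
\frac{d\ln C}{d\ln\theta} &= \frac{c(\theta)}{\bar{C} + c(\theta)}\cdot\frac{d\ln c}{d\ln\theta}
\end{align*}
```

In this sketch the first-order condition equates the marginal wage cost of retention (one dollar) with its marginal benefit, the reduction in expected turnover costs. Other things equal, a larger quasi-fixed share $\bar{C}/C$ lowers the elasticity of turnover costs with respect to tightness, and with it the response of the wage to slack.
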
In this sense, the markets are truly thin: every period, firms with high-specificity positions face a chance that there is no worker who is qualified and currently wants the position, given expectations about that position’s wage. This thinness creates an unavoidable turnover cost, regardless of labor market conditions. This generates a similar dynamic as in Oi (1962), who argues that firms will want to retain workers with larger hiring costs during temporary declines of productivity to avoid those hiring costs. This model achieves a similar motive through the search process, which will be explored in more detail in the model section of the paper.

¹ What matters here for the firm’s wage setting decision boils down to how convex the costs are for filling a position quickly. Instead of diminishing returns to search expenditure, we could as easily have modeled diminishing returns to training expenditure, where trying to train an employee to become fully productive becomes more costly as the firm tries to do it quickly. However, modeling turnover costs as coming from search frictions allows us to derive the firm’s recruiting and retention problems all at once as one coherent problem.

This paper is closely related to Manning (2006), who emphasizes that it is the convexity of recruiting costs that distinguishes a monopsonistic labor market from a competitive one. This study differs in that while recruitment costs are convex for each position individually, i.e., on the intensive margin, recruitment costs are linear on the extensive margin: large firms do not systematically face higher recruitment costs per worker. This is consistent with evidence from Davis, Faberman, and Haltiwanger (2010), who show that the returns to recruiting expenditure on the extensive margin are linear to modestly increasing. Further, this structure breaks the link between firm premia and firm size, which is shown to hold only weakly in the service sector (Berlingieri et al., 2018).

Breaking the link between firm size and wage premia also distinguishes this model from other models that produce only temporary firm premia in response to a shock, rather than persistent firm premia. This includes models with one-period setups such as Garin and Silvério (2018) and Kline et al. (2019), or multi-worker firm dynamic bargaining models such as Acemoglu and Hawkins (2014), where firm premia are generated only when firms are below their target level of employment and are temporarily constrained from hiring quickly by convex extensive margin hiring adjustment costs.

Lastly, we believe that this model is consistent with recent evidence on what factors determine wages in imperfectly competitive labor markets. Models that rely on the value of unemployment in determining wages have been shown to have little empirical support (Jäger et al., 2018). Meanwhile, Moscarini and Postel-Vinay show that it is the EE (employment to employment) transition rate that determines the wage growth of not just job movers, but also job stayers. This strongly suggests that firms’ avoidance of turnover is much more important than the worker’s value of unemployment in wage setting. In our model, the firm’s retention decision is the key determinant of wages, whereas the value of unemployment plays no role.

The rest of the paper is organized as follows. Section 2 will describe the related literature. In Section 3, we will document the relationship between labor market slack and measures of position specificity. In Section 4, we will describe the full model. Section 5 will include a calibration and comparative statics. Section 6 will discuss next steps, and Section 7 will conclude.

2 Related Literature

This paper is related to a large number of literatures. The first and most obvious is the literature on wage dispersion and on-the-job search. Mortensen (2003) shows that under a random search environment with ex-ante homogeneous firms and workers, there must be wage dispersion in equilibrium. This means that a firm setting wages is choosing between the flow rate of profits and turnover costs. In the model we present here, we introduce idiosyncratic worker preferences over workplaces, which eliminates the necessity of wage dispersion in equilibrium for ex-ante homogeneous workers and firms. However, from a given firm’s perspective, the wage setting decision is the same: firms set wages to optimize the tradeoff between wage costs and turnover costs. Accordingly, this model also shares features with Delacroix and Shi (2006), who in a directed search environment find a wage ladder resulting from search frictions and wage posting.

This research is also connected to attempts to quantify the degree of wage dispersion that can arise from search frictions, such as Hornstein, Krusell, and Violante (2011), who document that search frictions alone cannot account for observed differences in pay. However, in this paper, we take a different approach by exploring how heterogeneity in search frictions enables differences in pay between workers.

Since this is a model of take-it-or-leave-it wage offers from firms, it shares a lot in common with the monopsony models of Manning (2003) and Manning (2006). In his 2006 article, Manning articulates that what differentiates a monopsonistic market from a competitive one is whether recruiting has diminishing returns to scale, which is another way of saying convex recruiting costs. However, our model differs in that recruiting costs are convex for an individual position but linear in the number of positions. Put another way, recruiting costs are convex on the intensive margin but linear on the extensive margin. This means that wage premia and firm size are no longer necessarily connected. This feature is favored by evidence from Davis, Faberman, and Haltiwanger (2010), who show that returns to recruiting expenditure on the extensive margin are linear to modestly increasing.

In modeling choices, this paper closely resembles a combination of Faberman and Nagypál (2008), who model positions as needing replacement hires in the event of a quit and as needing to be recruited for independently of each other, plus Nagypál (2004), who uses non-wage idiosyncratic match utility values to justify on-the-job search that doesn’t rely on wage dispersion alone. Together, these papers have two of the three core elements of the model we present here, but there are three main differences. First, in their 2008 paper, these authors model positions as having a sunk cost to create, while in our model, the cost of vacant positions is implicit in the production function. Second, we use a novel production function that aggregates labor inputs of the number of positions and fraction filled. Third, we model heterogeneity in employer recruiting costs.

The research most similar in the question being asked consists of papers that explore wage determination and rent sharing in a setting with on-the-job search. Cahuc, Postel-Vinay, and Robin (2006) estimate what fraction of ‘observed’ worker bargaining power is in fact due to intra-firm competition and what fraction is due to underlying worker bargaining power, finding that professional class workers tend to have higher inherent bargaining power. Postel-Vinay and Robin (2002) also estimate a model with on-the-job search and heterogeneous workers and firms.

This paper both complements and contrasts with models of large firms which bargain with their workers, such as Stole & Zwiebel (1996) and Acemoglu and Hawkins (2014). All of these models attempt to tackle how firms set wages under diminishing returns to labor and imperfect labor market competition. They all also share the feature that firms lack the ability to commit to long-term wage contracts. However, this model differs in important ways. Here, firms have complete wage setting power, yet workers earn more than their immediate outside option. The wage is pinned down by the firm trading off between profits per worker and retention, rather than by an exogenous bargaining parameter. It seems more plausible that a worker’s credible outside option comes from draws from other employment opportunities, rather than a threat to quit into unemployment. Therefore, this paper lies more in the literatures on monopsony and on-the-job search.

This research also draws on the literature on the cyclical behavior of vacancies and unemployment, such as the traditional Shimer (2005). The recent Gavazza, Mongey, and Violante (2018) shares a very similar function for the cost of recruiting effort, as in both papers the firm’s recruiting effort decision is crucial to each paper’s respective results.

The literature on job-to-job quits and wage growth heavily influenced this paper. Notably, Moscarini and Postel-Vinay find that the EE (employment to employment) job transition rate best predicts wage growth of job stayers. In our model, it is the threat of quitting that pins down firms’ wage offers.
This paper also draws heavily from Faberman and Justiniano (2015), who similarly find a tight link between job-to-job transitions and wage growth. This paper is also related to Beaudry and DiNardo (1991), where the type of wage setting (i.e., implicit contracts) by the firm depends on the type of commitment in the employment relationship as well as the costs of mobility for the worker.

The literature on replacement hires, such as Faberman and Nagypál (2008) and Mercan and Schoefer (2019), is connected as well. In our model, filled positions are the core input in production, meaning that whenever a quit occurs, a replacement hire must be found.

As mentioned in the introduction, this paper relates deeply to the literature on firms and wages, such as Card et al. (2018), Song et al. (2018), Mogstad et al. (2019), Garin and Silvério (2018), and Friederich et al. (2018), among others.

Lastly, this research intersects loosely with the assignment model literature such as Sattinger (1975) and Tervio (2009), in the sense that workers are assigned to a particular set of tasks, or position, within the firm. While assignment models are typically used to describe top labor markets such as the labor market for CEOs, it is arguable that highly skilled workers operate in a labor market with assignment-like qualities.

3 Position Specificity and Labor Market Slack

3.1 Data and Measurement of Position Specificity

In this section, we will analyze the relationship between labor market slack and wages by occupational position specificity from 1999 to the present. While the motivating plots in Figures 1 and 2 plotted growth rates of wages, the model will predict a level-level relationship between labor market tightness and wages, akin to a heterogeneous wage curve (Blanchflower and Oswald, 1994). The goal of this section is (i) to identify measures of or proxies for position specificity that account for the differential wage patterns, and (ii) to quantify the magnitudes with which the wages in low-specificity occupations move more with slack than those in high-specificity occupations. We will use the term “tightness” as the inverse or opposite of “slack.”

To construct measures of position specificity, we use the vacancy data from Burning Glass Technologies, which has nearly the entire universe of online job postings from 2010-2018 with cleaning to remove duplicate postings. The job postings include information such as location, employer, and 6-digit SOC occupation codes. Further, the job postings include detailed information about the skill requirements posted with every vacancy. Burning Glass then cleans and categorizes the skill requirements into broad skill categories called “skill clusters”. For example, if a job posting has skill requirements called “Authentication” and “Intrusion Detection”, both of these skills are categorized under the “Cybersecurity” skill cluster. Table 1 provides an example snapshot of different skills, their skill clusters, and skill cluster families as categorized by Burning Glass. Over 13,000 skills are categorized into over 700 skill clusters, which are further categorized into approximately 30 skill cluster families.

We create two measures of position specificity using the Burning Glass data. The first measure we construct is the average number of unique skill clusters listed in a job posting by occupation.
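
As a minimal sketch of this first measure, the snippet below averages the number of distinct skill clusters per posting within each occupation. The postings are made up for illustration, and apart from “Cybersecurity” the skill cluster names are hypothetical rather than actual Burning Glass categories.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical postings: (occupation, skill clusters listed in the posting).
# A cluster can appear more than once if several listed skills map to it.
postings = [
    ("Dishwashers", ["Food Service", "Cleaning"]),
    ("Dishwashers", ["Food Service", "Food Service"]),
    ("Database Administrators", ["SQL", "Cybersecurity", "Cloud", "SQL"]),
]


def mean_unique_skill_clusters(postings):
    """Average count of distinct skill clusters per posting, by occupation."""
    counts = defaultdict(list)
    for occupation, clusters in postings:
        counts[occupation].append(len(set(clusters)))  # de-duplicate clusters
    return {occupation: mean(n) for occupation, n in counts.items()}


skill_counts = mean_unique_skill_clusters(postings)
print(skill_counts)
```
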
While simple, we believe that the skill count measure will have informational content about the difficulty of replacing a worker, beyond what the wage level of the worker would imply. The first row of Table 2 provides summary statistics of the mean number of unique skills by occupation: a job posting will on average have 4.2 unique skill clusters posted, with a standard deviation of 1.49. Many low-wage service jobs fall near the bottom of this range: job postings for “Dishwashers” list a mean of 1.86 unique skill clusters, while “Database Administrator” job postings list a mean of 8.18 unique skill clusters.

Next, we develop our main empirical measure of position specificity. Motivated by theory, we want to capture a notion of uniqueness: for a position to have high specificity, it should be highly differentiated in its skill requirements from other jobs posted within a firm. Therefore, we use the “ROCK” clustering algorithm (short for “RObust Clustering using linKs”), which clusters categorical data (Guha, Rastogi, & Shim, 2000). This algorithm is traditionally used to analyze transaction data, and it is well suited to this problem because the algorithm takes as inputs long, sparse vectors of 0’s and 1’s.² We use this algorithm to identify clusters of job postings within an establishment over the years available (2010-2017). We do this because we want a direct measure of how different a particular job may be from any other job within that establishment. We focus exclusively on firms that post more than 1000 job postings over this time period.

The ROCK clustering algorithm works as follows: the similarity between two data points is computed using the Jaccard coefficient, which is defined as sim(A, B) = |A ∩ B|/|A ∪ B|. Suppose job A requires skills {a, b} and job B requires skills {b, c, d}. Then A ∩ B = {b} and A ∪ B = {a, b, c, d}. Therefore in this example, sim(A, B) = 0.25. The main variable set by the user of this algorithm is the threshold θ ∈ (0, 1) for defining two data points as “neighbors”. We use θ = 0.8.³ Once a threshold is defined, the ROCK algorithm computes the similarity of all points, and the points become neighbors if their similarity exceeds the threshold θ. Next, the ROCK algorithm creates links, defined as the number of common neighbors between points. Then, using this measure of links across points, the algorithm categorizes observations into clusters.
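
The similarity, neighbor, and link steps described above can be sketched in a few lines. This is an illustration only: it stops before the cluster-merging step of ROCK, the postings and skill names are hypothetical, and treating a similarity exactly equal to θ as a neighbor is a boundary assumption made here.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| for two sets of skills."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbor_sets(postings, theta):
    """Map each posting id to the ids whose similarity is at least theta."""
    nbrs = {pid: set() for pid in postings}
    for p, q in combinations(postings, 2):
        if jaccard(postings[p], postings[q]) >= theta:
            nbrs[p].add(q)
            nbrs[q].add(p)
    return nbrs


def links(nbrs, p, q):
    """Number of neighbors that postings p and q have in common."""
    return len(nbrs[p] & nbrs[q])


# The worked example from the text: A = {a, b}, B = {b, c, d}.
sim_ab = jaccard({"a", "b"}, {"b", "c", "d"})

# Hypothetical postings; the skill names are made up for illustration.
core = {"patient care", "surgery", "medical support", "triage", "pharmacology"}
postings = {
    "job1": core | {"charting"},
    "job2": core,
    "job3": core | {"scheduling"},
    "job4": {"budgeting", "scheduling"},
}
nbrs = neighbor_sets(postings, theta=0.8)
print(sim_ab, nbrs, links(nbrs, "job1", "job3"))
```

In this toy example job1 and job3 are not neighbors of each other (their similarity is 5/7), but both are neighbors of job2, so they share one link. Counting links rather than relying on pairwise similarity alone is what allows such postings to end up in the same cluster.
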
Table 3 gives a sample of output from the ROCK clustering algorithm for job postings categorized in the 6-digit SOC occupation code “Medical and Health Services Managers.” Each column represents a single job posting, each with its unique BGT job ID. The rows are different skills that were listed with these jobs, and a cell is set to 1 and highlighted yellow if that job posting listed that particular skill. The first four job postings listed here are grouped together in a cluster (the cluster number 12 is arbitrary), as many of these jobs appear to have the same skills, tending to deal with direct patient care: “advanced patient care”, “medical support”, and “surgery”, for example. This cluster also contains many job postings that are categorized as registered nurses (not shown here). The next few columns are jobs grouped into cluster 141, which tend to have primarily administrative duties. There is only one job posting grouped into cluster 1180: this occurred because the other jobs grouped into cluster 1180 are listed under different occupations, so this job posting had no other postings with the same occupation categorization in its cluster. Any posting listed with cluster 9999 was not grouped into a cluster at all, and these jobs list a heterogeneous set of skills.

² Other commonly known algorithms such as K-means are poorly suited for this environment, as K-means typically takes a more limited number of categories where each observation has a value for every category.

³ Using the value of θ = 0.8 for the similarity threshold, we find that 70% of jobs are successfully categorized into a cluster. We find that for thresholds that are too low, most jobs all get put into one large cluster. If we set the threshold too high, few postings are put in a cluster at all. Setting θ = 0.8 appears to achieve both a desirable number of clusters and a high but sensible fraction of jobs falling into a cluster.

There are a few points worth noting. First, we believe that this sample of clusters highlights the benefit of having the ROCK algorithm not take into account the Burning Glass occupation categorization. If a job is listed as a “Medical and Health Services Manager,” but its tasks in the firm can reasonably be performed by a registered nurse, then such a posting would not suggest that that position is particularly costly to replace in the event of a quit. Second, it does appear that the clustering algorithm groups job postings into quite intuitive groups. It is also reassuring to see which jobs are not put in a cluster at all: the combinations of skills required in jobs not put into a cluster (9999) appear quite specialized.

Going forward, there are possibly many ways to check the robustness of these results. We could run the algorithm on more aggregated skill categories (recall the 30 “skill cluster families”), divide postings across time periods, or cluster jobs within occupation groups but not across, among other possibilities. One major shortcoming of this method is that we run this algorithm on all of the jobs ever posted within an establishment from 2010-2017. In the ideal data, we would observe the real-time composition of workers and their tasks within the firm. Relative to that ideal data, we miss on two fronts.
First, job postings in the Burning Glass data overrepresent high-wage occupations, and so clusters with many jobs of low-wage occupations will likely be too small relative to the data. Second, however, we are unable to see job tenure, and since low-wage jobs tend to have shorter duration, jobs of low-wage occupations could be overrepresented in these clusters.

Computing a “Cluster Score”

We compute our “cluster score” measure of position specificity in three steps. First, we take all the jobs posted at the establishment (here defined at the “employer name”-county level), turning each job posting into a vector of 0’s and 1’s based on whether that job requires a particular skill. Second, we input all of the jobs ever posted at that establishment for the entire time period of 2010-2017 into the clustering algorithm. Note that we do not take into account occupation code at this step: the algorithm is blind to the occupation of a job posting, since we only care about replaceability to the extent that jobs are similar in their required skills. Third, we compute two measures that we believe may capture our notion of position specificity: first, the fraction of jobs by occupation that fall in a cluster, and second, conditional on being in a cluster, the average number of other jobs also in that cluster. This second metric can be described as “mean cluster size”. Occupations with on average “high-specificity” positions should be less likely to be categorized into a cluster, and conditional on being in a cluster, the cluster should be smaller.

Summary statistics for these two measures are provided in rows 2 and 3 of Table 2. Across occupations, the average fraction of jobs in a cluster is .72, with a standard deviation of .13. The mean cluster size conditional on being in a cluster is 194, with a standard deviation of 223. The correlation of these two measures is .42. Lastly, we take the first principal component of these two measures to create our “cluster score”, which is our preferred measure of position specificity. Since this is a principal component, the cluster score will have mean 0 and standard deviation 1.

Looking at our cluster score across occupations, a few patterns emerge. First, low-wage, in-person service jobs tend to have the lowest cluster scores: “Cooks, Fast Food” (-2.77), “Waiters and Waitresses” (-2.73), “Bartenders” (-2.66), “Retail Sales Persons” (-2.55). Occupations with many workers in health and education also tend to have low cluster scores, including “Registered Nurses” (-2.01) and “Elementary School Teachers, Except Special Education” (-2.10). Construction occupations tend to have high cluster scores relative to education, coming in the middle of the range: “Construction ” (0.08), “Carpenters” (0.36).
Managerial, financial, and computer occupations typically have the highest cluster scores: “Marketing Managers” (2.58), “Database Administrators” (2.37). Interestingly, advanced medical occupations typically have low scores: “Surgeons” (-1.35), “Pharmacists” (-0.36). In general, it appears that healthcare and education jobs tend to have low cluster scores relative to the occupation’s wages. Also, while we have not looked at the share of women and men in each occupation, it appears that position specificity is much higher in male-dominated occupations and much lower in female-dominated occupations.

In Figures 3-6, we plot the occupation’s cluster score against various occupational characteristics. Figure 3 plots an occupation’s 2017 mean posted wage in the Burning Glass data on the x-axis and the cluster score on the y-axis. We distinguish healthcare occupations, in red, from all other occupations, in blue. Excluding health care occupations, the cluster score is strongly and positively related to mean posted wages. As for health care occupations, we believe that the cluster score is less correlated with wages because these occupations tend to be highly credentialized and the expected skills associated with the job come primarily through the education for the occupation. For example, in many surgeon job postings, “Surgery” was listed as the only required skill. While such a skill is clearly highly complex, the combination of skills required to do that job may not be reflected in the job posting.

Figure 4 plots the cluster score against the log of the average number of unique skill clusters by occupation. Perhaps unsurprisingly, these two measures are highly related, as having more required skills on average increases the scope for heterogeneity of skill requirements. Figure 5 plots the cluster score against the mean posted number of years of experience in these online job postings. As with the number of skills, these two variables are strongly positively correlated. Lastly, Figure 6 plots the cluster score against the posted required years of education. The points tend to cluster around 12, 16, and 22 years of education, corresponding to a high school diploma, a 4-year college degree, and a PhD or professional school experience. While occupations that typically require a 4-year degree seem to have higher cluster scores on average than occupations requiring a high school diploma, there appears to be substantial heterogeneity within these groups. Additionally, occupations that require PhDs tend to have lower cluster scores than jobs that require a college degree. Like medical occupations, the requirements for these jobs are likely defined by the educational credentials, making it less important to specify the exact list of skills needed.
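The final step of the cluster-score construction (taking the first principal component of the two occupation-level measures) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the synthetic data, and the sign convention (higher score means fewer and smaller clusters) are all assumptions. It uses the fact that, for two standardized variables, the first principal component loads equally on both.

```python
import math
import random
import statistics


def standardize(xs):
    """Rescale a list to mean 0 and population standard deviation 1."""
    mu = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return [(x - mu) / sd for x in xs]


def cluster_score(frac_in_cluster, mean_cluster_size):
    """First principal component of the two measures, rescaled to sd 1."""
    z1 = standardize(frac_in_cluster)
    z2 = standardize(mean_cluster_size)
    # Correlation of the two standardized measures.
    rho = statistics.fmean([a * b for a, b in zip(z1, z2)])
    # With two standardized variables, the first principal component has
    # loadings (1, 1)/sqrt(2) if rho >= 0 and (1, -1)/sqrt(2) otherwise.
    sign = 1.0 if rho >= 0 else -1.0
    pc1 = [(a + sign * b) / math.sqrt(2.0) for a, b in zip(z1, z2)]
    # ASSUMED orientation: flip the sign so that a higher score means
    # fewer, smaller clusters; then rescale to unit standard deviation.
    return standardize([-p for p in pc1])


# Synthetic occupations (hypothetical numbers, not the Burning Glass data).
rng = random.Random(0)
frac = [rng.uniform(0.40, 0.95) for _ in range(200)]
size = [400.0 * f + rng.uniform(0.0, 200.0) for f in frac]
scores = cluster_score(frac, size)
```

Under this orientation, an occupation whose postings rarely fall into clusters receives a high score.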

3.2 Vacancy Duration

One of the predictions of position specificity is that high-specificity positions should be more costly or time-consuming to fill. Therefore, we may be interested to see if our measure of position specificity is predictive of vacancy duration. In a separate data set, Burning Glass produces measures of average vacancy duration by occupation. While we do not explore how these data are constructed in this paper, these data generally reflect the number of days between a job being posted and being taken down from online job boards. Figures 7 and 8 show the relationship between various occupational characteristics and vacancy duration. For every occupational characteristic (the cluster score, the number of skills, average posted wage, and average posted experience) there is a U-shaped relationship with vacancy duration. This is likely because job postings for low-wage jobs are held open as rolling job postings, intended to produce more than one hire. We exclude education and health occupations: education occupations because their vacancy durations are unusually long and may be affected by hiring procedures outlined in public sector collective bargaining agreements, and health care occupations because the importance of licensing requirements diminishes the informational content in job postings.

Comparing the plots in Figure 8 against each other, no occupational metric appears to have a markedly different relationship with vacancy duration, except that the two metrics generated from Burning Glass data (the cluster score and log mean number of skills) appear to create cleaner U-shapes. Accounting for the likely mismeasurement in low-wage occupations, we conclude that vacancies for high-specificity occupations likely do take longer to fill for the hiring firm.

3.3 National Panel Regression

To see if either of our measures of position specificity has explanatory power in accounting for the differential relationship between labor market tightness and wages in low-wage occupations, we run a national panel regression on occupational median wages from 1999-2018. We use a levels specification of the following form:

$$\ln\left(\frac{w_{i,t}}{Y_t/N_t}\right) = \beta_0 + \beta_1\,\mathit{tight}_t + \beta_2\,\mathit{tight}_t \times \mathit{spec}_i + \beta_3\,\mathit{spec}_i + \gamma_0\,\mathit{year} + \gamma_1 \zeta_{it} + x_i + u_{it}, \tag{1}$$

where occupations are indexed by $i$, time is indexed by $t$, $\zeta_{it}$ are controls, and $x_i$ are occupation fixed effects.

On the left hand side of the regression is the log of the median hourly wage in occupation $i$ divided by aggregate output per worker. We use the Occupational Employment Statistics (OES) for yearly occupational wage data. One concern with regressing occupational log wages on slack is possible compositional changes, particularly during recessions. To control for possible composition issues, we divide by aggregate output per worker. This is useful for two reasons: first, output per worker and median wages should be impacted in the same direction by compositional issues, as firms shedding their least productive workers likely also means shedding some of their lowest-wage workers. Second, how wages move with respect to total output per worker is the variable of interest, as the question is how slack affects aggregate passthrough of productivity to wages. The coefficient on the interaction term, which is the coefficient of interest, will be unaffected by dividing the left hand side variable by an aggregate term. This specification assumes that the degree of cyclical composition changes within an occupation is uncorrelated with measures of specificity.

On the right hand side of the equation is the national measure of labor market tightness, plus an interaction term of tightness and position specificity by occupation. We measure labor market tightness at the national level with the prime-age (25-54) employment-to-population ratio (EPOP). We choose the 25-54 employment-to-population ratio because, due to lagged procyclicality of labor force participation, the unemployment rate will underestimate the degree of labor market slack in the years immediately following the downturn. Additionally, the secular trend of women’s labor force participation appeared to stop around 2000, so this measure will be unaffected by that large secular change.

Column (1) of Table 4 reports the relationship between a percentage point change in the prime-age EPOP and wages, including as the interaction term the occupation’s cluster score multiplied by the change in EPOP. The results suggest that a one percentage point increase in the prime-age employment-to-population ratio is associated with an increase in wages over total productivity of 0.93 percent. The -0.38 coefficient on the interaction term indicates that the log wage over total productivity, for an occupation with a cluster score one standard deviation below the mean, will change by 1.31 percent in response to a one percentage point change in the prime-age EPOP. The equivalent association for an occupation with a cluster score one standard deviation above the mean is then 0.55 percent. In the simplest regression specification, wages in occupations with cluster scores one standard deviation below the mean are more than twice as sensitive to labor market slack as occupations one standard deviation above the mean. Standard errors are clustered at the 2-digit occupation level.

Column (2) reports the same regression, but including a shift-share industry demand variable constructed using OES data. These changes in demand are computed at the 4-digit SOC occupation level. Unfortunately, employment by occupation and industry is available in the OES beginning only in 2004, so the regressions from column (2) onwards reflect a 2004-2018 sample.
After controlling for demand, both the level and interaction terms fall nearly by half, suggesting that a one percentage point increase in the prime-age employment-to-population ratio is associated with an increase in wages over total productivity of .51 percent. For occupations one standard deviation below and above the mean of cluster scores, the increase in the occupational mean wage is .71 percent and .31 percent respectively. Columns (3), (4), and (5) run the same regression as column (2) but add interaction terms with other occupational characteristics: respectively, the 1999 occupational mean wage, the log number of unique skill clusters by occupation, and the mean posted years of education by occupation. In each case, the term on the cluster score remains significant. Only when pitted against mean posted experience by occupation does the coefficient on the interaction term with the cluster score fall substantially and become statistically insignificant.

Since these estimates are not causal, we cannot draw any firm conclusions. However, it appears that an occupation’s mean wage is not the best predictor of how that occupation’s wages covary with labor market slack, and the measures of position specificity that we constructed may have promise in explaining the differential relationship between slack and wages by occupation.
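The arithmetic behind the column (1) comparison can be made explicit. The sketch below only combines the two coefficients quoted in the text (0.93 on tightness and -0.38 on the interaction); the function name is hypothetical, and the cluster score is measured in standard deviations from the mean.

```python
def tightness_sensitivity(beta_level, beta_interaction, cluster_score):
    """Implied percent change in wage over productivity per one
    percentage point rise in prime-age EPOP, at a given cluster score."""
    return beta_level + beta_interaction * cluster_score


# Coefficients as reported in the text for column (1) of Table 4.
BETA_1, BETA_2 = 0.93, -0.38

low_spec = tightness_sensitivity(BETA_1, BETA_2, -1.0)   # one sd below mean
high_spec = tightness_sensitivity(BETA_1, BETA_2, 1.0)   # one sd above mean
ratio = low_spec / high_spec

print(round(low_spec, 2), round(high_spec, 2), round(ratio, 2))
```

The ratio of the two implied sensitivities is what underlies the "more than twice as sensitive" statement.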

4 Model

In this section, we outline the model that produces both the heterogeneous effect of slack on wages, $\gamma_\theta^{low} > \gamma_\theta^{high}$, and the differential effect of firm output per worker on wages, $\gamma_y^{low} < \gamma_y^{high}$.

4.1 Preferences and Production Technology

Time is discrete. There is a fixed, large measure of firms and measure 1 of workers. Workers and firms are risk neutral. Workers have additively separable preferences over wages and time-varying, idiosyncratic non-wage utility match values and discount at factor $\delta$. The worker’s lifetime utility takes the form

$$U_{i,s} = \sum_{t=s}^{\infty} \delta^{t-s} (w_{ijt} + \iota_{ijt}),$$

where $w_{ijt}$ is the wage that worker $i$ earns at firm $j$ in period $t$, and $\iota_{ijt}$ is the utility that worker $i$ derives from working at firm $j$ in period $t$.

Firms are large and produce using only labor, but production takes a novel form. Firms take as two inputs the number of positions $\bar{N}$ and the fraction of positions filled $X$, with the total number of workers $N = \bar{N} X$. Production takes the form

$$F = F(\bar{N}, X)$$

with $F_{\bar{N}}(\bar{N}, X) > 0$, $F_{\bar{N}\bar{N}}(\bar{N}, X) < 0$, and $F_X(\bar{N}, X) \ge 0$. Workers are symmetric in the sense that the decline in output from a position going vacant is the same as for any other position, and each position is equally “marginal”.

4.2 Search and Matching Environment

Workers and positions are matched through a centralized, frictional matching process. A vacancy is an unfilled position. Let the measure of unfilled positions, or vacancies, be $V$. Every worker can search, so the measure of searchers $S$ is equal to the measure of workers in the economy. Firms must recruit position-by-position, and a firm can increase the probability of filling vacancy $k$ by spending on recruitment effort $h_k$. The matching process allocates a number of viewings to workers each period, allocating the views of vacancies to workers who are most likely to be able to fill the position. Workers can search on the job, and both firms and workers lose one period of production when workers change jobs. Let the total number of vacancy viewings in the market $v_t$ be a constant returns to scale function of the measure of searchers and total recruiting activity $V^*$:

$$\text{viewings}_t = v_t(V_t^*, S_t) = m (V_t^*)^{\phi} S_t^{1-\phi},$$

where $m$ is a constant, $S$ is the measure of searchers, and $V^* = \int_0^V h_k \, dk$, which adds up the recruiting effort over all the vacancies $V$. Plugging in, and writing $\bar{h}$ for average recruiting effort per vacancy, we get

$$\text{viewings} = v_t = m (V^*)^{\phi} S^{1-\phi} = m \left( \int_0^V h_k \, dk \right)^{\phi} S^{1-\phi} = m (\bar{h} V)^{\phi} S^{1-\phi}.$$

Then, if recruiting effort is the same across vacancies, the average number of viewings per vacancy is

$$\frac{\text{viewings}}{\text{vacancy}} = \frac{v_t}{V_t} = \frac{m (\bar{h} V_t)^{\phi} S^{1-\phi}}{V_t}.$$

The number of viewings for vacancy $k$ will be proportional to the ratio of the hiring expenditure $h_k$ to the average expenditure $\bar{h}$:

$$\text{viewings}_{kt} = v_{kt} = \frac{h_k}{\bar{h}} \, \frac{m (\bar{h} V)^{\phi} S^{1-\phi}}{V} = h_k \, \frac{m (\bar{h} V)^{\phi} S^{1-\phi}}{\bar{h} V} = m h_k \left( \frac{S}{\bar{h} V} \right)^{1-\phi} = m q(\theta^*) h_k,$$

with $\theta^* = \bar{h} V / S$ and $q(\theta^*) = (\theta^*)^{\phi - 1}$. Since $q'(\theta^*) < 0$, the number of viewings per unit of effort is lower in tighter labor markets.
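The viewing technology above can be checked numerically. The following sketch uses hypothetical parameter values (none are taken from the paper's calibration) and verifies that, when every vacancy exerts the average effort, viewings per vacancy summed over vacancies reproduce aggregate viewings.

```python
import math


def q(theta, phi):
    """Viewings per unit of recruiting effort: q(theta) = theta**(phi - 1)."""
    return theta ** (phi - 1.0)


def viewings_for_vacancy(h_k, h_bar, V, S, m, phi):
    """Viewings of vacancy k: m * q(theta*) * h_k, with theta* = h_bar*V/S."""
    theta_star = h_bar * V / S
    return m * q(theta_star, phi) * h_k


def aggregate_viewings(h_bar, V, S, m, phi):
    """Total viewings: m * (h_bar*V)**phi * S**(1 - phi)."""
    return m * (h_bar * V) ** phi * S ** (1.0 - phi)


# Hypothetical values chosen only for illustration.
m, phi, S, V, h_bar = 1.0, 0.5, 1.0, 0.05, 2.0

per_vacancy = viewings_for_vacancy(h_bar, h_bar, V, S, m, phi)
total = aggregate_viewings(h_bar, V, S, m, phi)
```

Because viewings are linear in $h_k$, a vacancy that doubles its effort relative to the average receives twice the viewings, holding the market fixed.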

While the number of viewings of vacancy $k$ is linear in recruiting expenditure $h_k$, the number of viewings with feasible matches is diminishing in recruiting expenditure. For each vacancy-worker pair, there is a match probability score $p$. At low values of recruitment effort $h_k$, the matching technology matches the vacancy to the measure of workers who have the highest match score $p$. Once matched, match quality is realized as productive or not productive, according to the probability $p$. Realizations of productive matches are sparse, which translates a measure of probable matches to a discrete number of successful matches.

$$\text{feasible viewings}_k(h_k) = v_k^f(h_k) = \int_1^{m q(\theta^*) h_k + 1} f(p) \, dp,$$

where $f(p)$ is the distribution of match probabilities across workers. By construction, the distribution of match probabilities $f(p)$ will be chosen such that, compared to firms with low-specificity positions, firms with high-specificity positions have:

$$v_k^{f,high}(h_k) \le v_k^{f,low}(h_k)$$
$$v_{k,h}^{f,high}(h_k) \le v_{k,h}^{f,low}(h_k)$$
$$v_{k,hh}^{f,high}(h_k) \le v_{k,hh}^{f,low}(h_k)$$

$\forall h_k$. This is saying that for any level of recruiting expenditure, the measure of feasible matches for a higher-specificity position will be lower, have a lower slope, and have more concavity with respect to recruitment effort $h$, compared to a position with lower specificity. This means that the marginal returns to recruitment expenditure will be lower and more quickly diminishing for high-specificity positions. However, assume for now that positions of the same degree of specificity are all posted in the same market, and positions of different degrees are found in separate submarkets.

4.3 Worker Quit Decision and the Firm’s Retention Function

The total number of feasible viewings $v^f$ for a given submarket is equal to the sum of feasible views over all vacancies, $\int_k v_k^f(h_k) \, dk$. Since the quality of matches is evenly distributed across vacancies and searchers within a submarket, each worker will have $v^f/S$ opportunities to apply to a position that the worker likes more than their current position. For simplicity, assume that the worker’s non-wage utility value $\iota_{jkt}$ is a random variable for the first period of the match and is equal to zero for all remaining periods. Assume also that a worker pays cost $c$ to switch jobs. Workers also miss out on one period of production and subsequently one period of wages when changing jobs. This means that a worker will apply if the non-wage utility during the first period of the new job exceeds the switching cost plus one period’s worth of wages at their current job (see footnote 4):

$$\iota_{j,t+1} > w_{i,t} + c.$$

Given the distribution of $\iota$, there is a probability $P(w_i)$ that the worker applies to a feasible vacancy, with $P'(w_i) < 0$. The worker has the opportunity to apply to $v^f/S$ vacancies, so the probability that the worker applies to at least one vacancy is

$$Pr[\text{apply}] = 1 - (1 - P(w_i))^{v^f/S}.$$

Then, conditional on applying to one vacancy, there is a chance that that vacancy received one or more other applications, in which case the match is randomly assigned. In total, this process approximates urn-ball matching, where the total measure of applications to feasible matches $A^f$ is equal to the measure of searchers $S$ multiplied by the probability that a searcher submits an application: $A^f = S \cdot Pr[\text{apply}]$. Given a total of $V$ vacancies and $A^f$ applicants, the probability that a vacancy finds a match is approximated by $1 - e^{-A^f/V}$. This implies that there are $V(1 - e^{-A^f/V})$ total successful matches, so the probability of a match conditional on applying is $\frac{V}{A^f}(1 - e^{-A^f/V})$. In total, the probability that a worker quits can be described by the following equation:

$$Pr[\text{quit} \mid w_{kt}] = Pr[\text{apply} \mid w_{kt}] \cdot Pr[\text{quit} \mid \text{apply}] = \left( 1 - (1 - P(w_{kt}))^{v_t^f/S_t} \right) \cdot \frac{V_t}{A_t^f} \left( 1 - e^{-A_t^f/V_t} \right).$$

This equation subtracted from 1 is the firm’s retention function with respect to the wage $w_{kt}$. It is straightforward to see that the derivative of the quit function with respect to the wage has the same sign as $P'(w)$, which is negative: with higher wages, workers are less likely to apply to any vacancy, making the workers less likely to quit in general.

4 Since we abstract from idiosyncratic shocks to the firm, firms should choose the same wages in the future. Workers know this, and so future expected wages at the poaching firm exactly equal expected future wages at their current firm and so drop out of the application decision.
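A small numerical sketch of the quit function may help fix ideas. The function names and all numbers are hypothetical. The per-vacancy application probability is passed in as a number; for the example it is filled in with the Pareto form $(w + c)^{-\sigma}$ that the paper adopts later in its parameterization, which is an assumption at this point in the text. The aggregate $A^f$ is held fixed here, whereas in equilibrium it depends on all workers' application decisions.

```python
import math


def quit_probability(p_single, vf_over_S, A_f, V):
    """Pr[quit | w] = Pr[apply | w] * Pr[match | apply].

    p_single  : P(w), probability of applying to one feasible vacancy
    vf_over_S : feasible viewings per searcher
    A_f, V    : aggregate feasible applications and vacancies (taken as given)
    """
    p_apply = 1.0 - (1.0 - p_single) ** vf_over_S
    p_match = (V / A_f) * (1.0 - math.exp(-A_f / V))
    return p_apply * p_match


def pareto_apply_prob(w, c, sigma):
    """ASSUMED functional form: P(w) = (w + c)**(-sigma), needs w + c >= 1."""
    return (w + c) ** (-sigma)


# Hypothetical values, for illustration only.
c, sigma, vf_over_S, A_f, V = 0.5, 3.0, 0.5, 0.04, 0.05

quit_low_wage = quit_probability(pareto_apply_prob(1.0, c, sigma), vf_over_S, A_f, V)
quit_high_wage = quit_probability(pareto_apply_prob(1.2, c, sigma), vf_over_S, A_f, V)
retention_high_wage = 1.0 - quit_high_wage
```

Raising the wage lowers the quit probability, which is the sense in which the firm faces an upward-sloping retention function.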

4.4 The Firm’s Recruiting Function

To derive the firm’s recruiting function with respect to recruiting effort $h_k$, we will return to the urn-ball matching problem. If all firms recruit with the same intensity for every position, then the probability that a vacancy receives any one application is $\frac{1}{V}$. If a firm exerts effort different from the effort of other firms, the probability that the vacancy receives any one applicant changes in proportion to the ratio of feasible viewings: $\frac{1}{V} \frac{v_k^f(h_k)}{v_k^f(\bar{h})}$. So, the probability that a firm doesn’t receive an application from a particular applicant is $1 - \frac{1}{V} \frac{v_k^f(h_k)}{v_k^f(\bar{h})}$. The firm has $A^f$ chances, so the probability that a firm gets no applications is $\left( 1 - \frac{1}{V} \frac{v_k^f(h_k)}{v_k^f(\bar{h})} \right)^{A^f}$, which is approximated by $e^{-\frac{A^f}{V} \frac{v_k^f(h_k)}{v_k^f(\bar{h})}}$. So, the probability that the firm receives at least one applicant is

$$R_k(h_k) = 1 - e^{-\frac{A^f}{V} \frac{v_k^f(h_k)}{v_k^f(\bar{h})}}.$$

Recall that high-specificity positions have functions for the number of feasible viewings $v^f(h)$ that are both less steep and more concave with respect to $h$ than for firms with low-specificity positions. Since $1 - e^{-x}$ is a monotonically increasing function, the relative slope and concavity properties of the feasible viewings function also hold for the firm’s recruiting function. Analogous to above, we have:

$$R^{high}(h_k) \le R^{low}(h_k)$$
$$R_h^{high}(h_k) \le R_h^{low}(h_k)$$
$$R_{hh}^{high}(h_k) \le R_{hh}^{low}(h_k).$$
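The exponential approximation used in the recruiting function can be verified numerically. The sketch below compares the exact probability of receiving no applications, $(1 - r/V)^{A^f}$, with $e^{-A^f r / V}$, where $r$ stands for the ratio $v_k^f(h_k)/v_k^f(\bar{h})$. The counts are hypothetical, and treating $V$ and $A^f$ as large integers is an assumption made only for this check.

```python
import math


def prob_no_application_exact(A_f, V, rel_views):
    """Exact urn-ball probability that none of A_f applications arrive."""
    return (1.0 - rel_views / V) ** A_f


def recruiting_probability(A_f, V, rel_views):
    """Approximation used in the text: R = 1 - exp(-(A_f / V) * rel_views)."""
    return 1.0 - math.exp(-(A_f / V) * rel_views)


# Hypothetical market size and relative viewing ratio.
A_f, V, rel_views = 8_000, 10_000, 1.0

exact_R = 1.0 - prob_no_application_exact(A_f, V, rel_views)
approx_R = recruiting_probability(A_f, V, rel_views)
gap = abs(exact_R - approx_R)
```

With markets of this size the two expressions agree to several decimal places, and the gap shrinks as $V$ grows with $A^f/V$ held fixed.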

4.5 Firm’s Problem

With the recruiting and retention functions for each position in hand, we can now solve the firm’s problem. However, with firms being large and having a continuum of positions, the problem could be potentially very complex if firms chose different wages and recruiting intensities for each position. To address this, we impose an additional assumption that every position is equally “marginal”. What this means in practice is that if a firm wants to decrease the number of positions $\bar{N}$, then the firm cannot know in advance which positions it will need. To operationalize this, we define a variable $X^d$, or the “downsizing” fraction of positions filled, as the following:

$$X_t^d = X_{t-1} \quad \text{if } \bar{N}_t < \bar{N}_{t-1}$$
$$X_t^d = X_{t-1} \frac{\bar{N}_{t-1}}{\bar{N}_t} \quad \text{if } \bar{N}_t > \bar{N}_{t-1}.$$

When a firm wants to decrease the total number of positions, i.e., $\bar{N}_t < \bar{N}_{t-1}$, the fraction of filled positions remains the same. This means that the firm is unable to increase the filled position rate via layoffs. Crucially, this also means that firms are not able to know ahead of time which workers they will need in the case of downsizing. In the other direction, when firms want to expand the number of positions, they do not want to engage in any layoffs and will need all of their current workers. However, since the new number of positions is greater than before but the number of incumbent workers is the same, the fraction of filled positions falls.

Now that each position is equally marginal, the firm faces an identical problem for each incumbent worker and each position, meaning that the firm will endogenously choose the same wage $w_t = w_{kt}$ for each worker and the same recruiting effort $h_t = h_{kt}$ $\forall k$ for all vacant positions. This drastically reduces the complexity of the firm’s problem to just three choice variables, $\bar{N}_t$, $w_t$, and $h_t$, each period.

Once the firm has chosen its desired number of positions $\bar{N}_t$ and determined its “post-downsizing” fraction of positions $X_t^d$, the number of vacancies is equal to the number of vacant positions in the economy. Next, the search process commences, and quits and poaches occur at the same time. Newly hired workers cannot produce in the same period that they were hired, and they are onboarded at the end of the period and are considered incumbent workers at the beginning of the next period. Since firms only produce with the workers they retained, we define $X_t^p$ to be the fraction of positions filled when the firm produces, with $X_t^p = X_t^d (1 - q(w_t))$. Together, the timing of the firm’s problem can be written as follows:

[Timeline within a period: begin period, inherit $\bar{N}_{-1}, X_{-1}$ → choose $\bar{N}$ (determines $X_t^d$) → choose $w, h$ → quits and poaches (determines $X_t^p$) → produce → onboard hires (determines $X_t$) → next period.]

If the market is in steady state, we can then write the firm’s problem in recursive form:

$$V(\bar{N}_{-1}, X_{-1}) = \max_{\bar{N}, w, h} \; F(\bar{N}, X^p) - w \bar{N} X^p - h \bar{N} (1 - X^d) + \delta V(\bar{N}, X),$$

subject to

$$X^d = X_{-1} \quad \text{if } \bar{N} < \bar{N}_{-1}$$
$$X^d = X_{-1} \frac{\bar{N}_{-1}}{\bar{N}} \quad \text{if } \bar{N} > \bar{N}_{-1}$$
$$X^p = (1 - q(w)) X^d$$
$$X = (1 - q(w)) X^d + R(h)(1 - X^d).$$
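The constraints of the firm's problem define a simple within-period law of motion for the fraction of filled positions. The sketch below implements those constraints directly; the function names and the numbers in the example are hypothetical. The steady-state expression follows from setting $X = X_{-1}$ with the number of positions held constant, which gives $X = R/(q + R)$.

```python
import math


def within_period_fill_rates(N_prev, X_prev, N_new, quit_rate, recruit_prob):
    """Return (X_d, X_p, X_next) implied by the constraints.

    X_d    : fraction filled after the firm chooses N_new
    X_p    : fraction filled when the firm produces (after quits)
    X_next : fraction filled after new hires are onboarded
    """
    if N_new > N_prev:
        # Expansion: same incumbents spread over more positions.
        X_d = X_prev * N_prev / N_new
    else:
        # Downsizing (or no change): the fill rate is unchanged.
        X_d = X_prev
    X_p = (1.0 - quit_rate) * X_d
    X_next = X_p + recruit_prob * (1.0 - X_d)
    return X_d, X_p, X_next


def steady_state_fill_rate(quit_rate, recruit_prob):
    """Fixed point of X = (1 - q) X + R (1 - X) with N constant."""
    return recruit_prob / (quit_rate + recruit_prob)


# Hypothetical example: a firm expanding from 100 to 120 positions.
X_d, X_p, X_next = within_period_fill_rates(100.0, 0.9, 120.0, 0.02, 0.5)
X_star = steady_state_fill_rate(0.02, 0.5)
```

In the example the expansion lowers the fill rate before recruiting partly restores it.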

4.6 Discussion

One of the key motivating facts for this model setup is the documented relationship between quits and wage growth in the US economy. Faberman & Justiniano (2015) show that the quit rate and wage growth are very tightly linked. However, from Moscarini and Postel-Vinay (2017), it is clear that the cyclicality of wages is not solely coming from a cyclical job ladder where people make more transitions to higher paying jobs during booms. Rather, it is also because the quit rate raises the growth rate of wages of job stayers. In light of Jäger et al. (2018), who show that the value of unemployment benefits has almost no effect on wages in Austria, the data show convincingly that it is workers’ alternative job options and employers’ fear of turnover, not the worker’s position when unemployed, that pins down wages.

Given that a worker’s willingness to change jobs is central in wage determination, workers having idiosyncratic preferences over workplaces can play a significant role in wage setting. Sorkin (2015) and Hall & Mueller (2016) document that the variance of the idiosyncratic values of employment is large. Hall & Mueller (2016) state, “... the dispersion in the idiosyncratic part of non-wage values is larger than the dispersion in offered wages alone, and thus non-wage values tend to dominate wages in the acceptance decision. In other words, employed workers in our model transition frequently from one job to the next, but mostly because new jobs offer higher non-wage values rather than higher wages...”. Idiosyncratic match values also play an important role in Nagypál (2004).

While adding idiosyncratic non-wage values can seem to increase the complexity of the model, it in fact makes the environment much simpler. If workers have substantial variance of idiosyncratic non-wage values over workplaces, that means that firms face an upward sloping labor supply even if there is no wage dispersion in equilibrium. This has two important consequences. First, it makes it much simpler to derive retention functions in general equilibrium, as we intend to do in future work, without having to keep track of and solve for the distribution of wages. Second, it means that it is possible to get wage dispersion from firm premia from reasonable heterogeneity in firm marginal product. This stands in contrast to Burdett and Mortensen (2003), where workers can search on the job and quit only to firms that offer higher wages. However, since theirs is a random search environment, firms with high productivity need only offer wages slightly higher than other firms in order to successfully poach workers. Consequently, in order to generate meaningful wage dispersion, firms need unrealistically large heterogeneity in worker marginal product, despite workers being ex-ante identical. In the end, including significant variance of non-wage values in the model enables the existence of steady-state firm premia derived from reasonable heterogeneity in worker marginal products, while also capturing a realistic mechanism through which workers switch jobs and firms face upward sloping labor supplies.

One last feature of this model is that, because hired workers cannot produce in the period that they are poached and are treated like incumbent workers at the beginning of the next period, there is no difference in wage setting for new hires and job stayers. This is in part intentional, as it reduces the complexity of the firm problem and the possible dimensionality of wages. It also means that the empirical application of this model would be the cyclicality of the wages of job stayers, which Moscarini and Postel-Vinay (2017) show do in fact vary strongly with the quit rate.

4.7 Parameterization

For the production function, we will introduce technology parameter $A$, which affects the firm’s productivity in a way that biases the ratio of the average to marginal product. Let the production function be

$$F(\bar{N}, X, A) = \left( (1 + A) \bar{N}^{\alpha} + A \right) X^{\psi},$$

with $0 < \alpha < 1$ and $\psi \ge \alpha$. When $A = 0$, this production function collapses to

$$F(\bar{N}, X, 0) = \bar{N}^{\alpha} X^{\psi}.$$

By construction, $X \in [0, 1]$, and $\psi = \alpha$ is the limiting case of homogeneous labor: $\bar{N}^{\alpha} X^{\alpha} = N^{\alpha}$, while as $\psi \to \infty$ the production function converges to a large-firm analogue of O-ring production. For simplicity, we will always use $\psi = 1$.

We use this functional form because changing $A$ alters the ratio of average to marginal product. Without the additional term within the parentheses, the production function is Cobb-Douglas, where the ratio of average to marginal product with respect to $\bar{N}$ is always the same. In that case, higher productivity would not lead to higher vacancy costs. With the form we propose, firms with higher $A$ face a higher ratio of average to marginal product, meaning that letting the fraction of filled vacancies $X$ go below 1 is additionally costly.

For the distribution of feasible matches, we assume that the match probability scores $p$ are distributed Pareto, so

$$\text{feasible viewings}_k(h_k) = v_k^f(h_k) = \int_1^{m q(\theta^*) h_k + 1} f(p) \, dp = \int_1^{m h_k q(\theta^*) + 1} p^{-\zeta} \, dp = \frac{1 - (1 + m h_k q(\theta^*))^{1-\zeta}}{\zeta - 1},$$

where $\zeta \in [0, \infty)$ determines the degree of position specificity. As $\zeta \to 0$, this function approaches linear, and the function becomes more concave as $\zeta$ increases. Interestingly, the value $\zeta = 1$ has a particular meaning: when $\zeta > 1$, this function is bounded above by $1/(\zeta - 1)$, while the function is unbounded when $\zeta \le 1$. While the properties of the firm’s decision rule around this point will be continuous, this model is saying that in a given period, for firms with $\zeta > 1$, there is a finite number of workers who could feasibly fill the firm’s position, and with sufficient search effort, this pool of workers will be exhausted.
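The closed form for feasible viewings is easy to explore numerically. The sketch below evaluates it at the two specificity values the paper later uses for illustration ($\zeta = 5$ and $\zeta = .2$); the function name, the default $m q(\theta^*) = 1$, and the effort levels are assumptions. The special case at $\zeta = 1$, where the formula reads 0/0, is handled with its logarithmic limit.

```python
import math


def feasible_viewings(h, zeta, mq=1.0):
    """(1 - (1 + mq*h)**(1 - zeta)) / (zeta - 1), with mq = m * q(theta*)."""
    x = mq * h
    if abs(zeta - 1.0) < 1e-12:
        # Limit of the closed form as zeta -> 1: integral of 1/p is log.
        return math.log1p(x)
    return (1.0 - (1.0 + x) ** (1.0 - zeta)) / (zeta - 1.0)


# zeta = 0 reproduces the linear case: feasible viewings equal mq * h.
linear_case = feasible_viewings(3.0, 0.0)

# High specificity (zeta = 5): viewings level off below 1/(5 - 1) = 0.25.
high_spec_large_effort = feasible_viewings(100.0, 5.0)

# Low specificity (zeta = 0.2): still concave, but keeps growing with effort.
low_spec = [feasible_viewings(h, 0.2) for h in (0.0, 1.0, 2.0)]
```

At $\zeta = 5$ even very large recruiting effort cannot push feasible viewings past 0.25, which is the pool-exhaustion property in numerical form.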

For the worker’s non-wage utility, we assume that $\iota_{ikt}$ is also distributed Pareto$(1, \sigma)$, implying that the probability that $\iota_j$ is larger than some value $x$ is equal to:

$$Pr[\iota_{j,t+1} > x] = \left( \frac{1}{x} \right)^{\sigma}.$$

Since the worker will apply if $\iota_{j,t+1} > w_{i,t} + c$, the probability that a worker applies to any one vacancy, conditional on viewing the vacancy and the match being feasible, is

$$Pr[\iota_{j,t+1} > w_{i,t} + c] = P(w_i) = (w_{i,t} + c)^{-\sigma}.$$

Suppressing the time subscripts, the firm’s quit function for position $k$ is

$$q(w_k) = \left( 1 - \left( 1 - (w_k + c)^{-\sigma} \right)^{v^f/S} \right) \cdot \frac{V}{A^f} \left( 1 - e^{-A^f/V} \right),$$

where $A^f = A^f(S, v^f(m, V, \bar{h}))$, and the firm’s recruiting function is

$$R_k(h_k) = 1 - e^{-\frac{A^f}{V} \frac{v_k^f(h_k)}{v_k^f(\bar{h})}},$$

with $v_k^f(h_k) = (1 - (1 + m h_k q(\theta^*))^{1-\zeta})/(\zeta - 1)$. To get a sense of the shape of this function, Figure 9 plots the recruiting functions for firms with high specificity ($\zeta = 5$) and low specificity ($\zeta = .2$), using parameter values from the following calibration. One can see that the function for both the number of views and the probability of successfully recruiting is more concave for the higher specificity.
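To illustrate how specificity shapes the return to recruiting effort, the sketch below evaluates the parameterized recruiting function at $\zeta = 5$ and $\zeta = .2$. The ratio $A^f/V = 0.5$, the normalization $m q(\theta^*) = 1$, and the effort levels are hypothetical values chosen for illustration; they are not the paper's calibration.

```python
import math


def feasible_viewings(h, zeta, mq=1.0):
    """Closed form for Pareto match scores; assumes zeta != 1."""
    return (1.0 - (1.0 + mq * h) ** (1.0 - zeta)) / (zeta - 1.0)


def recruiting_prob(h, h_bar, zeta, A_over_V, mq=1.0):
    """R_k(h) = 1 - exp(-(A_f/V) * v_f(h) / v_f(h_bar))."""
    rel_views = feasible_viewings(h, zeta, mq) / feasible_viewings(h_bar, zeta, mq)
    return 1.0 - math.exp(-A_over_V * rel_views)


A_OVER_V = 0.5   # hypothetical applications per vacancy
H_BAR = 1.0      # hypothetical average recruiting effort

# Gain in the filling probability from doubling effort relative to average.
gain_high_spec = (recruiting_prob(2.0, H_BAR, 5.0, A_OVER_V)
                  - recruiting_prob(1.0, H_BAR, 5.0, A_OVER_V))
gain_low_spec = (recruiting_prob(2.0, H_BAR, 0.2, A_OVER_V)
                 - recruiting_prob(1.0, H_BAR, 0.2, A_OVER_V))
```

Under these assumed values, doubling effort raises the filling probability far more for the low-specificity position, consistent with returns to recruiting that diminish faster when specificity is high.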

5 Calibration and Numerical Results

In this section, we will compute two comparative statics: how wages change in response to changes in labor market tightness $\theta = \bar{h} V / S$, and how wages change in response to changes in the firm productivity parameter $A$, holding labor market variables fixed. For the purposes of this calibration, we are interested in changes in the tightness of the market that the firm recruits from, while holding constant the tightness in the market that workers quit to. Since we are breaking the link between the quit market and the recruiting market, and firms take aggregate labor market variables A, $V$, $S$, and $v^f$ as given, we do not need the aggregate variables to be consistent across the quit and recruiting markets. This allows us to see how the firm’s decision changes by simply directly changing the aggregate variables in the firm’s recruiting market.

Our calibration strategy is to pick $v^f/S$, $\sigma$, and $c$ to match the steady state quit rate, the elasticity of quits to tightness, and the two firms’ vacancy durations for $q(\theta)$. The way this works is that for any $m q(\theta^*)$, there is a choice of $v^f/S$, $\sigma$, and $c$ that can recreate any other value for $m$: this is because we have four parameters to match the level, slope, and concavity of the retention function, allowing us to set $m = 1$. The main logic is that $m$ determines how many vacancy viewings there are, thereby influencing how recruiting effort scales into number of views. If there are more views, to keep the number of applicants the same, we can adjust $\sigma$, which determines the variance in non-wage utility values and determines the likelihood a worker will apply after viewing a vacancy. Though our calibration targets are in monthly data, we set $\delta = .95$ for computational speed. Calibration of all the parameters can be found in Table 5.

According to the Job Openings and Labor Turnover Survey (JOLTS), the monthly quit rate has ranged between 1.2 and 2.5 percent, averaging around 2 percent. The elasticity of aggregate quits with respect to aggregate job openings is roughly 0.5. In our calibration, firms have a quit rate around 3 to 4 percent for standard calibrations, and the elasticity of quits to $v^f/S$ is around 0.9. However, recall that we constructed the number of viewings as $v = (\bar{h} V)^{\phi} S^{1-\phi}$. In order to vary tightness $q(\theta) = q(\bar{h} V / S)$, we did not have to specify the elasticity of viewings with respect to vacancies $\phi$. We think of this parameter in the range of 0.5, in line with Gavazza, Mongey, and Violante (2018), who have a similar elasticity of matches to aggregate recruiting intensity. Together, this would plausibly give an elasticity of feasible viewings with respect to total vacancies in the range of 0.5.

Our final calibration target is vacancy durations. There currently is not a measure of average vacancy duration by occupation for the US economy. However, to get an estimate of the possible range, we look at the range of vacancy durations provided by DHI Hiring Indicators by industry.
On the lower end of vacancy durations is the leisure and hospitality industry, varying from 10.4 working days in 2009 to 20.7 working days in 2016, or 2-4 weeks. The sector with the longest vacancy duration is health services, varying from 29.8 working days in 2009 to 47.7 working days in 2016, or 6-10 weeks. This roughly corresponds to monthly vacancy filling probabilities of 60% and 25%. In our baseline calibration, the vacancy filling rates are 47% and 14% for low- and high-specificity positions respectively, as shown in the $R(h)$ row in Table 6. This would slightly exaggerate the cost of turnover due to vacancy durations, but the relative durations are roughly in line.

To alter labor market tightness, we need only directly vary the value of $q(\theta)$. To assess how wages change with respect to labor market tightness, we will solve for the firm’s decision in steady state for tightness $q(\theta) \in \{.7, 1, 1.4\}$, holding productivity $A$ constant at 0. This was chosen to correspond roughly to a doubling of the ratio of vacancies to searchers over the course of a business cycle. To measure the effect of changes in productivity $A$ on wages, we choose values of $A \in \{0, .2, .4\}$, holding tightness $q(\theta)$ constant at 1. The results are shown in Tables 6 and 7 and Figures 10 and 11.

In our baseline comparative static, moving the level of tightness $q(\theta)$ from .7 to 1.4 raises wages 12.5% in low-specificity positions but only 8.5% in high-specificity positions, reflected in the first row of Table 5. The second to last row, which lists the recruiting probability $R(h)$, shows that tighter labor markets significantly decrease the probability that firms with low-specificity positions find workers: it falls from 56.3% to 35.7%. Notice also that as labor markets get tighter, firms choose to operate with fewer positions. Since it is more difficult to replace workers and so firms find it desirable to pay higher wages, the marginal cost of a position is now higher, leading firms to choose fewer positions.

Figure 10 shows these results in more detail, plotting every combination of wages for productivity $A \in \{0, .2, .4\}$ and tightness $q(\theta) \in \{.7, .85, 1, 1.2, 1.4\}$. The bold lines are for $\zeta = 5$ high-specificity firms, and the dotted lines are for $\zeta = .2$ low-specificity firms. For each line, the wage is normalized to 1 for $q(\theta) = 1.4$. Regardless of the value of productivity $A$, changes in tightness $q(\theta)$ from 1.4 to .7 all lead to about a 10% decline in wages for low-specificity firms and a 7% decline for high-specificity firms.

Table 6 reports the wage and average product by specificity for productivity $A \in \{0, .2, .4\}$. The third and fourth rows of the table measure the passthrough relative to the case of productivity $A = 0$. The third row measures passthrough by taking a ratio of the differences in wages and differences in productivity. The fourth row takes the percent change in wage over percent change in average product. In either measure, passthrough of productivity is roughly 30% higher for high-specificity, $\zeta = 5$, firms. Figure 11 plots the relationship between average output and wages across all combinations of productivity $A$ and tightness $q(\theta)$.
The first panel plots the levels, and the second panel normalizes the wage and average product to 1 for the case of productivity A = 0. One can see that as productivity A increases, output per worker also increases. Notably, the slope of the wage line is always higher for high-specificity, ζ = 5, firms, which keep the roughly 30% passthrough premium regardless of the state of the labor market.
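As a rough check on the slack comparative statics, the following Python sketch recomputes the total wage change and the step-by-step elasticity of wages with respect to tightness from the rounded wage entries in Table 6. The function and variable names are illustrative assumptions and are not part of the model code; because the inputs are rounded to three decimals, the results are only approximate.

```python
# Wages by tightness q(theta), copied from the "w" row of Table 6.
TIGHTNESS = [0.7, 1.0, 1.4]
WAGES = {
    "low-specificity (zeta = 0.2)": [0.549, 0.585, 0.618],
    "high-specificity (zeta = 5)": [0.600, 0.625, 0.651],
}


def pct_change(new, old):
    """Simple percent change from old to new, as a fraction."""
    return (new - old) / old


def step_elasticities(wages, tightness):
    """Percent change in wage over percent change in tightness, for each
    pair of adjacent tightness values."""
    return [
        pct_change(wages[i + 1], wages[i])
        / pct_change(tightness[i + 1], tightness[i])
        for i in range(len(wages) - 1)
    ]


def total_wage_change(wages):
    """Percent change in the wage between the slackest and tightest market."""
    return pct_change(wages[-1], wages[0])


if __name__ == "__main__":
    for label, w in WAGES.items():
        steps = ", ".join(f"{e:.2f}" for e in step_elasticities(w, TIGHTNESS))
        print(f"{label}: total change {total_wage_change(w):.1%}, "
              f"step elasticities {steps}")
```

Rounded to two decimals, the step elasticities match the last row of Table 6 (.15 and .14 for ζ = 0.2; .10 and .10 for ζ = 5), and the total wage changes are roughly 12.6% and 8.5%.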

5.1 Discussion

In total, the calibration of our model delivers about 30% extra passthrough of average output to wages and 30% less effect of slack on wages for high-specificity workers. An interesting feature is that there appears to be little interaction between differential passthrough of productivity by specificity and differential sensitivity to slack by specificity. For example, in Figure 11, there is little systematic pattern of productivity passthrough across different labor market slack regimes: the difference in passthrough seems to come entirely from the difference in specificity. Similarly, in Figure 10, the effect of slack on wages is largely unaffected by different values of productivity A, and the relative wage slopes are mostly governed by position specificity. This supports a modified version of the linear reduced-form conceptual framework we introduced above:

$$\ln(w) = \gamma_y \ln(y) + \gamma_\theta \ln(\theta),$$

with the wage and productivity variables, w and y, put in logs.

One surprising feature of this model is that even low-specificity positions have meaningful productivity passthrough. One would expect that, in the limit, firms with the least replacement frictions would have no passthrough of productivity: if firms can instantly replace a worker for a known price, average product would have no effect on the wage. There are two reasons in this model why low-specificity positions still get some passthrough. First, even though the number of feasible viewings is almost linear in recruiting effort for low-specificity firms, the recruiting probability function is still concave, due to a combination of features such as limiting each worker to one application, the inability of firms to commit to wages, and coordination frictions. Second, even for the lowest-specificity position, a quit means at least two periods of non-production in that position: one period immediately when the worker quits, and another after a new hire the next period, as new hires are ‘onboarded’ after production. This imposes a minimum cost of turnover that is not affected by the shape of the recruitment probability function. Nonetheless, the qualitative point still holds: firms with lower position specificity have lower passthrough of average product to wages.
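To make the additive structure of the reduced-form relation in this section concrete, the Python sketch below evaluates ln(w) = γy ln(y) + γθ ln(θ) and shows that the log wage response to a change in tightness does not depend on the level of productivity. The coefficient values are illustrative assumptions chosen loosely from the elasticities in Tables 6 and 7; they are not estimates, and the argument called theta simply stands in for whichever tightness measure is used.

```python
import math


def log_wage(y, theta, gamma_y, gamma_theta):
    """Reduced-form rule: ln(w) = gamma_y * ln(y) + gamma_theta * ln(theta)."""
    return gamma_y * math.log(y) + gamma_theta * math.log(theta)


def log_wage_response_to_tightness(y, theta_low, theta_high, gamma_y, gamma_theta):
    """Change in ln(w) when tightness moves from theta_low to theta_high,
    holding productivity y fixed."""
    return (log_wage(y, theta_high, gamma_y, gamma_theta)
            - log_wage(y, theta_low, gamma_y, gamma_theta))


# Hypothetical coefficients (assumptions for illustration only).
ILLUSTRATIVE = {
    "low-specificity": {"gamma_y": 0.15, "gamma_theta": 0.15},
    "high-specificity": {"gamma_y": 0.21, "gamma_theta": 0.10},
}

if __name__ == "__main__":
    for label, coefs in ILLUSTRATIVE.items():
        for y in (1.0, 1.2, 1.4):
            d = log_wage_response_to_tightness(y, 0.7, 1.4, **coefs)
            print(f"{label}, y = {y}: change in ln(w) = {d:.4f}")
```

Under this functional form the response equals γθ times the log change in tightness for every value of y, which is the no-interaction property described in the discussion; any interaction between slack and productivity in the model output would show up as a departure from this benchmark.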

6 Conclusion

In this paper, we introduce a concept of position specificity, where skilled workers are horizontally differentiated in the firm. In a frictional labor market with upward-sloping retention functions, high-specificity workers are more costly to replace because employers search in thinner labor markets for a specific combination of skills. This leads high-specificity firms to offer higher wages as average product at the firm rises, while also insulating workers in high-specificity positions from changes in labor market conditions. We believe that this model provides insight into heterogeneous wage setting decisions where it appears that different kinds of workers have different levels of “bargaining power”, and that it has the potential to help explain unresolved questions about the change in the distribution of wages over the last two business cycles. We conclude that the concept of position specificity can be useful for a wide range of questions in labor economics and may have important policy implications for understanding and addressing wage inequality.

Figure 1: Wage Growth by Occupation, 2010-2018

[Plot: Wage Growth by Occupation, 2010−14 and 2014−18, Sorted by Occ. Median Wage. Horizontal axis: 2010 log wage. Vertical axis: 100*dlog(wage). Series: 2010−2014 and 2014−2018.]

Note: Data from Occupation Employment Statistics (OES). On the horizontal axis is the median log wage by 6-digit SOC occupation code. On the vertical axis is the percent nominal wage growth by occupation. The size of the circle represents the number of full-time equivalent workers in that occupation in 2010.

Figure 2: Employment Growth by Occupation, 2010-2018

[Plot: Employment Growth by Occupation, 2010−14 and 2014−18, Sorted by Occ. Median Wage. Horizontal axis: 2010 log wage. Vertical axis: 100*dlog(emp). Series: 2010−2014 and 2014−2018.]

Note: Data from Occupation Employment Statistics (OES). On the horizontal axis is the median log wage by 6-digit SOC occupation code. On the vertical axis is the percent nominal employment growth by occupation. The size of the circle represents the number of full-time equivalent workers in that occupation in 2010.

Figure 3: Correlation of Cluster Score and Occupational Mean Posted Wage

Figure 4: Correlation of Cluster Score and Number of Skills by Occupation

Figure 5: Correlation of Cluster Score and Mean Posted Experience

Figure 6: Correlation of Cluster Score and Mean Posted Education

Figure 7: Vacancy Duration by Occupational Cluster Score

Figure 8: Vacancy Duration by Occupation Characteristics

Figure 9: Feasible Viewings and Recruiting Functions by Specificity

Note: In the left plot, we plot the number of feasible viewings based on recruiting effort hk for high-specificity (ζ = 5) positions and low-specificity (ζ = .2) positions, for the calibration with m = 1, σ = 16, vf/S = 20, and c = .8.

Figure 10: Wages and Slack, Normalized q(θ) ∈ [.7 .85 1 1.2 1.4]

Note: q(θ) is a measure of labor market slack, with q(θ) = 1.4 corresponding to the tightest labor market and q(θ) = 0.7 corresponding to the slackest. Wages are divided by the corresponding value of the wage when q(θ) = 1.4. Here $\theta = \bar{h}V/S$, where $\bar{h}$ is average recruitment effort, V is the measure of vacancies, and S is the measure of searchers. The bold lines are for high-specificity (ζ = 5) positions, and the dotted lines are for low-specificity (ζ = .2) positions. Orange indicates productivity parameter A = 0, blue indicates A = .2, and green indicates A = .4.

Figure 11: Wages and Average Product, A ∈ [0 .2 .4]

Note: The bold lines are for high-specificity (ζ = 5) positions, and the dotted lines are for low-specificity (ζ = .2) positions. Green indicates the tightest labor markets (q(θ) = 1.2), blue intermediate (q(θ) = 1), and orange the slackest labor markets (q(θ) = .85). The bottom panel normalizes the data from the top panel by dividing the wage and output per worker by the wage and output per worker, respectively, when A = 0.

Table 1: Example of Burning Glass Skill Categorization

This table provides a random sample of skills that appear in the Burning Glass vacancy dataset. In the rightmost column, the variable labeled “skill” is the qualification that appears in the raw data. The other two columns are the outcomes of algorithms that Burning Glass uses to categorize data. There are over 700 “skill clusters” and approximately 30 “skill cluster families.”

Table 2: Calibrated Values

                                   Mean     St. Dev
Number of Unique Skill Clusters    4.20     1.49
Fraction in Cluster                .72      .13
Mean Cluster Size                  194.4    223.5

Table 3: Example Clusters: Medical and Health Services Managers

This table gives an example of the output of the ROCK clustering algorithm. Each column represents a single job posting, each with its unique Burning Glass job ID. The rows are different skill clusters that were listed with these jobs, and a cell is set to 1 and highlighted yellow if that job posted that particular skill. The first four job postings (columns) listed here are grouped together in a cluster (the cluster number 12 is arbitrary), as many of these jobs appear to have the same skills. Notice how the job grouped into cluster 1180 is by itself: this occurred because the other jobs grouped into cluster 1180 are listed under different occupations, so this job posting had no other postings with the same occupation categorization in the same cluster. Any posting listed with cluster 9999 was not grouped into a cluster at all.

Table 4: State by Occupation Panel Regressions: State Unemployment Rate

                          (1)       (2)       (3)       (4)       (5)       (6)
                          ln(w/Y)   ln(w/Y)   ln(w/Y)   ln(w/Y)   ln(w/Y)   ln(w/Y)
EPOP                      0.93∗∗∗   0.51∗∗∗   0.51∗∗∗   0.52∗∗∗   0.51∗∗∗   0.50∗∗∗
                          (0.065)   (0.061)   (0.059)   (0.054)   (0.058)   (0.053)

EPOP*Cluster Score        -0.38∗∗   -0.20∗∗∗  -0.15∗∗   -0.17∗∗   -0.16∗∗   -0.046
                          (0.11)    (0.046)   (0.049)   (0.053)   (0.042)   (0.068)

EPOP*1999 Occ Wage                            -0.069
                                              (0.058)

EPOP*BG N Posted Skills                                 -0.039
                                                        (0.065)

EPOP*BG Mean Educ                                                 -0.092
                                                                  (0.065)

EPOP*BG Mean Exp                                                            -0.21∗∗
                                                                            (0.055)

year                      -1.06∗∗∗  -1.06∗∗∗  -1.06∗∗∗  -1.05∗∗∗  -1.05∗∗∗  -1.06∗∗∗
                          (0.073)   (0.066)   (0.066)   (0.066)   (0.067)   (0.066)

Observations              6347      4888      4888      4888      4888      4888
OES Shift-Share Demand    No        Yes       Yes       Yes       Yes       Yes

Standard errors in parentheses
∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001

Note: Standard errors are clustered at the 2-digit occupation level. Table corresponds to regression (2). “dU” is the change in the unemployment rate at the state level.

Table 5: Calibrated Values

Parameter               Description                                      Value     Data Target
α                       diminishing returns to labor                     .6
ψ                       exponent on X                                    1
δ                       discount factor                                  .95
A                       productivity parameter                           {0, .4}
ζ                       concavity of recruiting expenditure              {.2, 5}   Vacancy durations: .5, 2 months
σ                       shape parameter on ι                             16        Quit rate: .02
vf/S                    number of vacancy views for searching workers    20        Elasticity of quits to vacancies: .54
c                       switching cost                                   .8        Quit rate: .02
m                       number of vacancy views in recruiting            1
V/A                     vacancies per applicant                          1.34
$(V/A)(1 - e^{-A/V})$   labor market congestion                          .7

Table 6: Slack and Wages

Specificity         ζ = 0.2                             ζ = 5.0
Variable            q(θ) = .7   q(θ) = 1    q(θ) = 1.4  q(θ) = .7   q(θ) = 1    q(θ) = 1.4
w                   .549        .585        .618        .600        .625        .651
h                   .620        .660        .644        .249        .261        .249
N̄                   1.045       .831        .704        .834        .703        .599
X                   .919        .933        .940        .852        .869        .875
q(w)                .070        .047        .032        .040        .030        .225
R(h)                .563        .465        .357        .162        .140        .113
%∆ w / %∆ q(θ)                  .15         .14                     .10         .10

Table 7: Productivity and Wages

Specificity                   ζ = 0.2                       ζ = 5.0
Variable                      A = 0     A = .2    A = .4    A = 0     A = .2    A = .4
w = wage                      .585      .595      .600      .625      .638      .651
F(N̄, X)/(N̄X) = avg product    1.15      1.30      1.35      1.32      1.46      1.57
∆ w / ∆ avg prod                        .07       .08                 .10       .10
%∆ w / %∆ avg prod                      .13       .16                 .21       .22

Works Cited

Abowd, John, Francis Kramarz, and David Margolis. “High Wage Workers and High Wage Firms.” Econometrica 67.2 (1999): 251-333.
Abowd, John, and Francis Kramarz. “The Costs of Hiring and Separations.” Labour Economics 10.5 (2003): 499-530.
Acemoglu, Daron, and David Autor. “Skills, Tasks and Technologies: Implications for Employment and Earnings.” Handbook of Labor Economics. Vol. 4. Elsevier, 2011. 1043-1171.
Acemoglu, Daron, and William Hawkins. “Search with Multi-Worker Firms.” Theoretical Economics 9.3 (2014): 583-628.
Autor, David. “Polanyi’s Paradox and the Shape of Employment Growth.” Vol. 20485. Cambridge, MA: National Bureau of Economic Research, 2014.
Autor, David. “Work of the Past, Work of the Future.” Richard T. Ely Lecture, American Economic Association: Papers and Proceedings, May 2019, 109(5), 1–32.
Baker, Dean, and Jared Bernstein. “Getting Back to .” (2013).
Beaudry, Paul, and John DiNardo. “The Effect of Implicit Contracts on the Movement of Wages over the Business Cycle: Evidence from Micro Data.” Journal of Political Economy 99.4 (1991): 665-688.
Blanchflower, David, and Andrew Oswald. The Wage Curve. MIT Press, 1994.
Burdett, Kenneth, and Dale Mortensen. “Wage Differentials, Employer Size, and Unemployment.” International Economic Review (1998): 257-273.
Card, David, et al. “Firms and Labor Market Inequality: Evidence and Some Theory.” Journal of Labor Economics 36.S1 (2018): S13-S70.
Cahuc, Pierre, Fabien Postel-Vinay, and Jean-Marc Robin. “Wage Bargaining with On-the-Job Search: Theory and Evidence.” Econometrica 74.2 (2006): 323-364.
Cobb, J. Adam, and Ken-Hou Lin. “Growing Apart: The Changing Firm-Size Wage Premium and its Inequality Consequences.” Organization Science 28.3 (2017): 429-446.
Davis, Steven, Jason Faberman, and John Haltiwanger. “The Flow Approach to Labor Markets: New Data Sources and Micro-Macro Links.” Journal of Economic Perspectives 20.3 (2006): 3-26.
Davis, Steven J., R. Jason Faberman, and John C. Haltiwanger. “The Establishment-Level Behavior of Vacancies and Hiring.” The Quarterly Journal of Economics 128.2 (2013): 581-622.
Dube, Arindrajit, Eric Freeman, and Michael Reich. “Employee Replacement Costs.” (2010).
Edmond, Chris, and Simon Mongey. “Unbundling Labor.” Slides (2019).
Faberman, Jason, and Guido Menzio. “Evidence on the Relationship between Recruiting and the Starting Wage.” Labour Economics (2017).
Faberman, Jason, and Alejandro Justiniano. “Job Switching and Wage Growth.” Chicago Fed Letter 337 (2015).
Faberman, R. Jason, and Éva Nagypál. “Quits, Worker Recruitment, and Firm Growth: Theory and Evidence.” (2008).
Friedrich, Benjamin, Lisa Laun, Costas Meghir, and Luigi Pistaferri. Earnings Dynamics and Firm-level Shocks. No. w25786. National Bureau of Economic Research, 2019.
Garin, Andrew, and Filipe Silvério. How Responsive are Wages to Demand within the Firm? Evidence from Idiosyncratic Export Demand Shocks. No. w201902. 2019.
Gavazza, Alessandro, Simon Mongey, and Giovanni L. Violante. “Aggregate Recruiting Intensity.” American Economic Review 108.8 (2018): 2088-2127.
Guha, Sudipto, Rajeev Rastogi, and Kyuseok Shim. “ROCK: A Robust Clustering Algorithm for Categorical Attributes.” Information Systems 25.5 (2000): 345-366.
Haanwinckel, Daniel. “Supply, Demand, Institutions, and Firms: A Theory of Labor Market Sorting and the Wage Distribution.” Working Paper (2018).
Hall, Robert, and Andreas Mueller. “Wage Dispersion and Search Behavior: The Importance of Non-Wage Job Values.” No. w21764. National Bureau of Economic Research, 2016.
Hornstein, Andreas, Per Krusell, and Giovanni Violante. “Frictional Wage Dispersion in Search Models: A Quantitative Assessment.” American Economic Review 101.7 (2011): 2873-98.
Katz, Lawrence, and Kevin Murphy. “Changes in Relative Wages, 1963 - 1987: Supply and Demand Factors.” The Quarterly Journal of Economics 107.1 (1992): 35-78.
Kremer, Michael. “The O-Ring Theory of Economic Development.” The Quarterly Journal of Economics 108.3 (1993): 551-575.
Lazear, Edward. “Firm-Specific Human Capital: A Skill-Weights Approach.” Journal of Political Economy 117.5 (2009): 914-940.
Jäger, Simon. “How Substitutable are Workers? Evidence from Worker Deaths.” (2016).
Jäger, Simon, Benjamin Schoefer, Samuel Young, and Josef Zweimüller. “Wages and the Value of Nonemployment.” No. w25230. National Bureau of Economic Research, 2018.
Kline, Patrick, Neviana Petkova, Heidi Williams, and Owen Zidar. “Who Profits from Patents? Rent-Sharing at Innovative Firms.” No. w25245. National Bureau of Economic Research, 2018.
Manning, Alan. “Imperfect Competition in the Labor Market.” Handbook of Labor Economics 4 (2011): 973-1041.
Manning, Alan. “A Generalised Model of Monopsony.” The Economic Journal 116.508 (2006): 84-100.
Mercan, Yusuf, and Benjamin Schoefer. “Jobs and Matches: Quits, Replacement Hiring, and Vacancy Chains.” Working Paper. (2019).
Modestino, Alicia Sasser, Daniel Shoag, and Joshua Ballance. “Downskilling: Changes in Employer Skill Requirements over the Business Cycle.” Labour Economics 41 (2016): 333-347.
Montgomery, James D. “Equilibrium Wage Dispersion and Interindustry Wage Differentials.” The Quarterly Journal of Economics 106.1 (1991): 163-179.
Mortensen, Dale, and Christopher Pissarides. “New Developments in Models of Search in the Labor Market.” Handbook of Labor Economics 3 (1999): 2567-2627.
Mueller, Holger, Paige Ouimet, and Elena Simintzi. “Wage Inequality and Firm Growth.” American Economic Review 107.5 (2017): 379-83.
Nagypál, Éva. “Amplification of Productivity Shocks: Why Vacancies Don’t Like to Hire the Unemployed?” (2004).
Oi, Walter. “Labor as a Quasi-Fixed Factor.” Journal of Political Economy 70.6 (1962): 538-555.
Postel-Vinay, Fabien, and Jean-Marc Robin. “Equilibrium Wage Dispersion with Worker and Employer Heterogeneity.” Econometrica 70.6 (2002): 2295-2350.
Shimer, Robert. “The Cyclical Behavior of Equilibrium Unemployment and Vacancies.” American Economic Review 95.1 (2005): 25-49.
Sorkin, Isaac. “Ranking Firms using Revealed Preference.” No. w23938. National Bureau of Economic Research, 2017.
Stole, Lars A., and Jeffrey Zwiebel. “Intra-Firm Bargaining Under Non-Binding Contracts.” Review of Economic Studies 63 (1996): 375–410.
Terviö, Marko. “The Difference that CEOs Make: An Assignment Model Approach.” American Economic Review 98.3 (2008): 642-68.
Yamaguchi, Shintaro. “Tasks and Heterogeneous Human Capital.” Journal of Labor Economics 30.1 (2012): 1-53.
