
Distributional Effect of International and in Labor Markets∗

Rodrigo Adão MIT

September 20, 2015

Abstract

This paper investigates the distributional consequences of shocks in the context of the Brazilian labor market. First, I document a new set of facts about how differential exposure to commodity shocks across educational groups and regions leads to differential outcomes in terms of sectoral employment and wages. Second, I show that such facts are qualitatively consistent with a two-sector Roy model where worker heterogeneity regarding comparative and absolute advantage in sector-specific tasks determines the structure of employment and wages. Third, I establish that the schedules of comparative and absolute advantage are nonparametrically identified from cross-regional variation in sectoral responses of employment and wages induced by sector demand shocks. Lastly, I build on this result to structurally estimate the model in the sample of Brazilian local labor markets. My structural estimates indicate that a 10% decrease in commodity prices causes counterfactual increases of 1.2% in the skill premium and of 5% in wage dispersion.

∗Author contact information: [email protected]. I am extremely grateful to Daron Acemoglu, Arnaud Costinot and Dave Donaldson for invaluable guidance and support. I also thank Pol Antras, David Autor, Arthur Braganca, Ariel Burstein, Dejanir Silva as well as seminar participants at the MIT Labor Lunch and the MIT Macro-International Lunch. All errors are my own.

1 Introduction

In an integrated world economy, shocks in a particular country have the potential to exert different effects across different workers within another country located on the other side of the globe. For example, if China implements a large-scale plan of infrastructure expansion, what is the effect on workers employed in the mining industry relative to those in the auto industry within the U.S.? Alternatively, if Saudi Arabia aggressively reduces oil production, what is the effect on workers residing in oil-rich regions relative to those in oil-poor regions within Brazil? More generally, if such shocks affect the relative world price of various products, which workers stand to gain or lose within a country? And how large are these distributional effects?

Such distributional concerns would not be relevant if homogeneous workers were able to perfectly switch their sector of employment, their region of residence, and their level of skill. However, this is unlikely to be true. Consider how the negative shock to the world oil supply caused by Saudi Arabia’s policy affects different workers in Brazil. In order to expand production in response to higher oil prices, firms compete for the residents of oil-rich coastal regions with the necessary skills to perform the daily tasks in the offshore drilling industry, pushing their wages upwards. In contrast, workers without these skills and those unable to move to the oil-rich areas are not directly affected by the higher labor demand in the Brazilian oil industry. This simple example suggests that heterogeneous workers are differentially exposed to international trade shocks due to their limited ability to respond to changes in sectoral labor demand.
In this article, I propose a unified empirical and theoretical framework to investigate how differential exposure to sectoral shocks across educational groups and regional markets leads to differential outcomes in terms of employment and wages in the context of the Brazilian labor market.

The first contribution of the paper is to provide a new set of empirical facts that connect changes in the international price of agriculture and mining commodities to changes in the Brazilian structure of wages and employment between 1980 and 2010.1 First, I establish that aggregate movements in Brazilian wage dispersion are negatively related to the evolution of international commodity prices, with the correlation being mainly driven by changes in wage differentials associated with workers’ level of education, sector of employment and region of residence. Second, I explore the variation in the importance of different commodity-oriented industries in employment across educational groups and regional markets to establish a causal relation between commodity prices and labor market outcomes. In each region, I measure exposure to international commodity price shocks for two worker groups: High-School Graduates (HSG) and High-School Dropouts (HSD). The exposure of a group-region pair is defined as the interaction between the change in commodities’ world prices and the initial participation of corresponding commodities in the group’s labor payroll in the region. Armed with this measure, I document that regional economies more exposed to positive price shocks experienced stronger employment expansions in the commodity sector among both HSG and HSD. This employment expansion is associated with mixed responses in the commodity sector wage differential: it increases for HSG, but remains stable for HSD. Lastly, higher exposure to positive shocks is

1 Production of basic products constitutes an important part of the Brazilian economy. In 2010, agriculture and mining industries represented 58.5% of total exports and 19.9% of total employment.

related to stronger growth in the group’s average wage which, in turn, translates into a reduction in the HSG-HSD wage premium because of the relatively higher employment share in the commodity sector among Brazilian HSD.

In order to explain this rich response pattern in labor market outcomes across groups and regions, I propose a Two-Sector Roy Economy as in Heckman and Honore (1990). Within each group and region, individuals are heterogeneous with respect to a skill bundle that, ultimately, determines their efficiency in the sector-specific tasks demanded by the commodity and the non-commodity sectors. Conditional on the price of these sector-specific tasks, the heterogeneity in sector efficiency translates into heterogeneity in potential sector wages, generating endogenous selection of an individual to the sector yielding her the highest labor income. To be more precise, the sector employment decision is intrinsically related to the worker’s comparative advantage, defined as the worker’s efficiency in the commodity sector task relative to her efficiency in the non-commodity sector task. Whenever the worker’s comparative advantage is higher than the relative task price offered in the non-commodity sector, the worker decides to be employed in the commodity sector. The worker’s wage, given her employment decision, also depends on the worker’s absolute advantage, defined as her efficiency in the task specific to the non-commodity sector.

In this environment, an individual’s exposure to sectoral demand shocks is centrally dependent on her level of comparative advantage. To see this, consider a partial equilibrium comparative statics exercise in which a shock increases the task price in the commodity sector while not affecting the task price in the non-commodity sector. The shock increases the wage of all commodity sector employees, but it only affects the wage of those non-commodity sector employees that decide to switch into the commodity sector.
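The selection rule and the partial equilibrium exercise described above can be sketched numerically. The snippet below is only an illustration: the bivariate normal distribution of log efficiencies, the sample size, and the 0.1 log-point price shock are hypothetical choices, not the specification estimated in the paper.

```python
import numpy as np

def simulate_roy(log_price_c, log_price_n, log_eff_c, log_eff_n):
    """Sector choice and log wage for each worker in a two-sector Roy model."""
    # Potential log wage in each sector: log task price plus log efficiency.
    log_wage_c = log_price_c + log_eff_c
    log_wage_n = log_price_n + log_eff_n
    # Comparative advantage: commodity efficiency relative to non-commodity efficiency.
    comp_adv = log_eff_c - log_eff_n
    # Choose the commodity sector when comparative advantage exceeds
    # the relative task price offered in the non-commodity sector.
    in_commodity = comp_adv > (log_price_n - log_price_c)
    log_wage = np.where(in_commodity, log_wage_c, log_wage_n)
    return in_commodity, log_wage

rng = np.random.default_rng(0)
n_workers = 100_000
# Hypothetical skill distribution (assumption): correlated normal log efficiencies.
log_eff = rng.multivariate_normal([0.0, 0.5], [[1.0, 0.3], [0.3, 1.0]], size=n_workers)
log_eff_c, log_eff_n = log_eff[:, 0], log_eff[:, 1]

# Shock: the commodity task price rises by 0.1 log points; the other price is fixed.
before_sector, before_wage = simulate_roy(0.0, 0.0, log_eff_c, log_eff_n)
after_sector, after_wage = simulate_roy(0.1, 0.0, log_eff_c, log_eff_n)

stayers_c = before_sector & after_sector      # commodity sector stayers
switchers = ~before_sector & after_sector     # move into the commodity sector
stayers_n = ~before_sector & ~after_sector    # non-commodity sector stayers
wage_change = after_wage - before_wage

print("employment share in commodity sector:", before_sector.mean(), "->", after_sector.mean())
print("mean wage change, commodity stayers:", wage_change[stayers_c].mean())
print("mean wage change, switchers:", wage_change[switchers].mean())
print("mean wage change, non-commodity stayers:", wage_change[stayers_n].mean())
```

In this sketch, commodity sector stayers receive the full price increase, non-commodity stayers receive nothing, and switchers receive a partial gain that depends on how close their comparative advantage was to the pre-shock relative task price.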
These sector-switchers are the non-commodity sector employees whose level of comparative advantage is similar to the pre-shock ratio of sector-specific task prices. Accordingly, the magnitude of the change in sector employment is determined by the amount of such workers in the economy as controlled by the comparative advantage distribution. To the extent that sector-switchers are distinct from sector-stayers in terms of sector-specific efficiency, the change in employment composition affects the average efficiency in the sector with the potential to attenuate or reinforce the direct effect of the task price change. The magnitude of this compositional effect is directly related to the difference between the average absolute advantage of sector-switchers and sector-stayers. In a group and region, the average wage response combines the gains of workers employed in the two sectors, depending on both the sector employment composition and the comparative advantage distribution.

In line with this discussion, I show that two functions are sufficient to evaluate the impact of sectoral shocks on the average and variance of the log-wage distribution in a group and region: (i) the distribution of comparative advantage; and (ii) the average absolute advantage conditional on the level of comparative advantage. The second contribution of this paper is to provide a novel result that establishes the nonparametric identification of such functions using cross-regional variation in sector labor demand. For each worker group, I assume that regions are segmented labor markets with parallel schedules of comparative and absolute advantage; yet, I allow regions to have different skill shifters to accommodate regional variation in the level of sector-specific task supply. Under this assumption, the schedules of comparative and absolute advantage are nonparametrically identified, respectively, from responses of sector employment and average sector wage to changes in

sector-specific task prices induced by exogenous shocks to sector labor demand in the cross-section of regions. The identification result is valid for an arbitrary number of observable worker groups and it imposes no parametric restrictions on the skill distribution.

This nonparametric identification result constitutes a mapping between population moments and the unknown schedules of comparative and absolute advantage. Its main importance lies in identifying the source of variation in the data that uncovers the economic relation of interest without restrictions beyond those implied by the theory.2 In practice, limited availability of exogenous variation in the data imposes the need for auxiliary functional form assumptions. With this in mind, the empirical application of the model builds on a parsimonious log-linear system with two structural parameters that capture separately dispersion in comparative and absolute advantage. The proposed system represents a strict generalization of the system implied by a skill distribution of the Frechet family that entails a single parameter to control both comparative and absolute advantage.3

The combination of the nonparametric identification result and the auxiliary functional form assumption leads to an empirical strategy to estimate the structural parameters of comparative and absolute advantage using Generalized Method of Moments (GMM). As in the reduced-form analysis, I consider HSG and HSD that are differentially exposed to international commodity price shocks across Brazilian regional labor markets. The implementation of such strategy requires measurement of sector-specific task prices for all groups and regions.4 An additional contribution of this paper is to provide a new methodology to estimate task prices based on the Roy model’s predicted relation between wage growth and initial sector employment across quantiles of the wage distribution for each triple of group-region-period.
This methodology can be easily implemented as a first-step regression using repeated cross-section data on wage and employment at the individual level.

The structural estimation delivers two main insights. First, between-sector worker mobility is low, with the elasticity of relative sector employment to relative sector task price estimated to be around one for both HSG and HSD. Second, the pattern of selection is qualitatively different in the two groups. For HSD, an expansion in sector employment reduces average efficiency in both sectors, generating the stability in sector wage premium documented in the reduced-form analysis. However, a sector employment expansion among HSG is associated with a decrease in average efficiency in the commodity sector and an increase in the non-commodity sector, creating the strong responses in the sector wage premium obtained in the reduced-form regressions.

I conclude the paper by applying the framework to answer one counterfactual question: “If commodity prices fall by 10%, how would Brazilian wage inequality be affected?” To answer this question, I combine the estimated schedules of comparative and absolute advantage with the sectoral allocation of workers across regions and groups in 2010. Since employment in the commodity sector is higher among HSD residing in poor regions, the model predicts an increase in wage inequality following the

2 As noted by Matzkin (2007), such identification results represent an important first step in the empirical analysis of almost any model as the credibility of empirical results would be significantly hindered if identification could only be achieved under restrictive parametric assumptions.

3 The Frechet distribution has been the basis of numerous recent applications of the Roy Model; see e.g., Hsieh, Hurst, Jones, and Klenow (2013); Burstein, Morales, and Vogel (2015); and Galle, Rodriguez-Clare, and Yi (2015).

4 Sector-specific task prices are not immediately available in survey datasets that only contain information on individual labor income. In the model, observable labor income corresponds to the product of the unit task price and the individual’s endowment of efficiency units.

shock. Precisely, the fall in commodity prices triggers a 1.2% increase in the HSG-HSD wage premium that is accompanied by a 5% increase in the between-component of log-wage variance in the country.

The rest of the paper is organized as follows. Section 2 reviews the related literature. Section 3 presents the empirical facts about the adjustment of the Brazilian labor market following shocks to the international price of agriculture and mining commodities. Section 4 presents the Two-Sector Roy Economy and its implications for the equilibrium structure of employment and wages. Section 5 establishes the nonparametric identification of comparative and absolute advantage in a sample of segmented labor markets. Section 6 implements the identification strategy for High-School Graduates and High-School Dropouts differentially exposed to commodity price shocks across Brazilian regions. Section 7 presents the counterfactual exercise that quantifies the effect of changes in commodity prices on changes in Brazilian wage inequality. Section 8 offers some concluding remarks.

2 Related Literature

Research on the distributive effects of international trade has traditionally built upon the stark predictions regarding the changes in relative wages across worker groups provided by neoclassical models of trade in Stolper and Samuelson (1941) and Jones (1975). Yet, empirical studies have failed to provide support for the predictions of these canonical models. For instance, a number of authors have documented (i) movements in wage inequality correlated in both developed and developing countries (Goldberg and Pavcnik, 2007), (ii) movements in the skill wage premium uncorrelated with changes in the price of skill-intensive products (Lawrence and Slaughter, 1993) while correlated with changes in the skill intensity of production within industries (Berman, Bound, and Machin, 1998), and (iii) diminished between-sector responses in employment and wages following trade shocks (Wacziarg and Wallack, 2004; and Goldberg and Pavcnik, 2007). Such evidence gave rise to a consensus that international trade shocks were, at best, secondary drivers of the changes in wage inequality between the early-1980s and the mid-1990s.

However, the last decade has witnessed a renewed interest in the distributive effects of trade shocks motivated by, on the one hand, recent changes in the wage distribution within several countries and, on the other hand, the integration into the world economy of China and other developing countries.5 Due to the disconnection between empirical evidence and predictions of neoclassical models, this emerging literature has expanded the scope of analysis by examining broader dimensions of workers’ exposure to international trade shocks. The literature has evolved in several directions.
5 Acemoglu and Autor (2011) document a rich transformation pattern in the wage distribution of the U.S. and other OECD countries. As noted by Hanson (2012), low- and middle-income countries experienced not only an increase in international trade volumes but also pronounced changes in their profile of traded goods. In fact, emerging countries increased their participation in world trade from 13.5% in 1990 to 29.6% in 2012.

More closely related to this paper is the recent reduced-form literature that establishes the impact on labor market outcomes of heterogeneous exposure to import competition in terms of sector of employment (Autor, Dorn, Hanson, and Song, 2014), level of education (Dix-Carneiro and Kovak, 2015b), and region of residence (Topalova, 2010; Kovak, 2013; Autor, Dorn, and Hanson, 2013; and Costa, Garred, and Pessoa, 2014). In this paper, I build on their reduced-form strategy to provide new evidence regarding the effect of differential exposure to commodity price shocks on sectoral employment and wages across educational groups and regional labor markets. This reduced-form evidence is then combined with the structure of the Two-Sector Roy Economy to provide a methodology to quantitatively investigate the impact of trade shocks on the wage distribution.

Following the explosion of firm-level work in the field of international trade, another strand of the literature has investigated the effect of international trade shocks on workers employed in different firms within industries. These papers focus on how market integration affects the within-industry matching of workers to firms and the subsequent implications for wage inequality (see e.g. Verhoogen, 2008; Helpman, Itskhoki, and Redding, 2010; Frias, Kaplan, and Verhoogen, 2012; Helpman, Itskhoki, Muendler, and Redding, 2015; and Burstein and Vogel, 2015). Another approach has been to investigate dynamic effects of international trade shocks, paying particular attention to the welfare costs implied by the transitional dynamics in the reallocation of workers across sectors (Kambourov, 2009; Artuç, Chaudhuri, and McLaren, 2010; Dix-Carneiro, 2014; and Dix-Carneiro and Kovak, 2015a) and, more recently, regions (Caliendo, Dvorkin, and Parro, 2015).

This paper is closely related to the growing body of literature connecting changes in the economic environment to self-selection of heterogeneous workers. On the theory side, recent papers demonstrate the potential of the Roy model to generate a rich response pattern in wage inequality: see Ohnsorge and Trefler (2007), Costinot and Vogel (2010), Acemoglu and Autor (2011), and, for a comprehensive review, Costinot and Vogel (2014). On the empirical side, the Roy model has been applied to investigate the consequences of selection for aggregate productivity and output (Lagakos and Waugh, 2013; Hsieh, Hurst, Jones, and Klenow, 2013; and Young, 2014).
Closer in spirit, two papers impose a Frechet distribution of worker skills to quantify the portion of changes in between-group wage inequality associated with technological changes in the USA (Burstein, Morales, and Vogel, 2015) and import competition in Germany (Galle, Rodriguez-Clare, and Yi, 2015). I complement these recent quantitative applications by providing a characterization of wage inequality responses in a generic Two-Sector Roy Model where the skill distribution is completely unrestricted. The generality of the model sheds light on the different roles played by comparative and absolute advantage in determining the evolution of labor market outcomes both within and between groups. This increases the ability of the model to capture features of the data that are meaningful in the study of distributional effects of trade shocks.6

Lastly, this paper is related to the large literature on selection models in labor markets. Inspired by the seminal paper of Roy (1951), this literature investigates the consequences of self-selection based on unobservable characteristics for observable components of labor income. Evidence in favor of selection driven by multivariate skills is provided by Gibbons and Katz (1992), Neal (1995), Parent (2000), and Gibbons, Katz, Lemieux, and Parent (2005). In order to establish the Roy model’s identification in a single market, the most traditional approach was to impose parametric restrictions on the skill distribution as in Heckman and Sedlacek (1985). Yet, this approach relies heavily on the functional form imposed on the skill distribution since Heckman and Honore (1990) show that the Roy model cannot

6 Although remaining tractable in the multiple sector setting, the Frechet distribution has features that are unappealing in labor market analysis. Not only does it require a high degree of between-sector mobility, but it is also unable to generate responses in between-sector wage differentials. In contrast, these restrictive properties are not implied by the log-linear system used in the empirical application of this paper.

be nonparametrically identified in a single cross-section of individuals. Following this negative result, they propose two main alternatives. The first assumes observable shifters in the location of the skill distribution for individuals in the same market; the so-called identification at infinity reviewed in French and Taber (2011). The second relies on the variation of skill prices across markets with an identical skill distribution. For the most part, the selection literature builds on the first approach to identify hedonic wage regressions (e.g., Mulligan and Rubinstein, 2008); in contrast, this paper builds upon the second approach to investigate distributional consequences of sectoral labor demand shocks.

To this end, I provide a nonparametric identification result that complements the elegant result of Heckman and Honore (1990). By allowing for regional skill shifters, I introduce task supply shocks that affect sector employment and sector wages across markets while endogenously correlated with sector-specific task prices in equilibrium. For this reason, identification requires an instrumental variable that generates exogenous variation in sector-specific task prices. Accordingly, the assumption in this paper is weaker than the existence of a single skill distribution in all markets assumed by Heckman and Honore (1990) to establish their nonparametric identification result. In this sense, I provide a generalization of Heckman and Honore’s (1990) result when regions are subject to parallel shocks in the skill distribution.

3 International Commodity Prices and the Brazilian Labor Market

3.1 Data and Background

Brazil, as primarily an exporter of basic commodities, is particularly exposed to shocks in the world price of agricultural and mining goods. Such exposure pattern is captured by the relative price of six commodity categories: Grains, Livestock, Soft Agriculture, Metals, Precious Metals, and Energy. Together, these commodity groups account for a large share of Brazilian exports, representing 53.8% of total exports in 1980 and 60.4% in 2012. As described in Appendix A, I construct price indices for each category using data on commodity transactions in the main exchange markets of the United States between 1980 and 2012. To replicate relative prices faced by producers in Brazil, international commodity prices are converted to Brazilian currency and deflated by the Brazilian consumer price index.

The analysis of labor market outcomes relies on wage and employment data from the Brazilian Census performed by the Brazilian Institute of Geography and Statistics (IBGE) for 1980, 1991, 2000 and 2010. To focus on individuals with strong labor force attachment mainly affected by market-oriented forces, I consider a benchmark sample of full-time white male employed individuals aged 16-64.7 Individuals are divided into two observable groups: High-School Dropouts (HSD) and High-School Graduates (HSG). These two groups reflect the relevant margin of education in Brazil; among male white workers, the high-school graduation rate was 16.0% in 1980 and 51.4% in 2010.

7 I restrict the benchmark sample to include only white and male individuals because of the strong declines in gender and race wage differentials between 1995 and 2010; see e.g. Ferreira, Firpo, and Messina (2014). The model presented below is not intended to speak directly to the behavior of these components of the wage structure and, therefore, I exclude their behavior from the analysis in this section. Appendix B provides robustness exercises including female and non-white individuals, indicating that all results are essentially the same when these additional worker groups are considered.

Figure 1: International Commodity Prices and Wage Dispersion in Brazil, 1980-2010

[Left panel: International Relative Commodity Prices, 1980-2012. Vertical axis: log of deflated international commodity prices (BRL). Series: Commodity Price Index; Grains (corn, soybeans, wheat); Soft (coffee, cocoa, sugar, and others); Livestock (cattle, hogs, and others); Energy (crude oil); Metals (copper, lead, steel, tin, zinc); Precious Metals (gold and silver).]

[Right panel: Wage Dispersion in Brazil, 1980-2010. Vertical axis: variance of log wages. Series: Residual; Between (education, sector, region, experience).]

Note. Left panel: six-month moving average of log international product price converted to Brazilian currency and deflated by the Brazilian consumer price index (Sept-1994 = 1). Right panel: variance of log wages among male white full-time workers in the Brazilian Census of 1980, 1991, 2000 and 2010. Wage decomposition implied by the regression of log real wage on a full set of dummies for high school graduation, employment in the commodity sector, microregion of residence, and years of experience (0-39 years). Details in Appendix A.

Throughout the analysis, I classify individuals into a sector and a regional labor market. According to their self-reported industry of employment, I map workers either to the commodity sector or to the non-commodity sector. Industries specialized in the production of agricultural and mining products are included in the commodity sector, covering 13.9% of the national labor income in 1980 and 12.9% in 2010. All manufacturing and service industries are included in the non-commodity sector. In addition, I allocate individuals to a regional labor market based on their microregion of residence. Between 1991 and 2010, the Census contains a balanced panel of 558 microregions where each microregion corresponds to a set of economically integrated municipalities. In the 1980 Census, only a subset of these microregions can be constructed due to changes in administrative boundaries of municipalities and states between 1980 and 1991. Appendix A discusses details on the construction and measurement of labor market outcomes.8

Let us now turn to the aggregate trends in international commodity prices and Brazilian wage dispersion presented respectively on the left and right panels of Figure 1. Between 1980 and 2012, the evolution of relative commodity prices is U-shaped: the strong price decline of the 1980s was followed by a recovery of equivalent magnitude from the late 1990s until 2012. Notice that, while the price drop was homogeneous across commodity categories, the price recovery exhibited great between-category heterogeneity. In the opposite direction, wage dispersion, after increasing in the 1980s, began a compression process in the 1990s that accelerated in the 2000s. This movement in wage dispersion can be decomposed into observable and residual components by regressing log wages on a full set of dummies for years of experience, level of education, sector of employment and microregion of residence.
The decomposition indicates that observable worker attributes account for a sizeable part of the wage dispersion movement — i.e., 44.5% of the change in 1980-1991 and 78.1% in 1991-2010.
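The between/residual split described above can be illustrated with a small regression on dummies. This is a minimal sketch on synthetic data: the function name, the simulated wage equation, and the number of regions are hypothetical, and experience dummies are omitted for brevity. It relies on the fact that, in an OLS regression with an intercept, the variance of the fitted values and the variance of the residuals add up to the total variance.

```python
import numpy as np

def between_residual_decomposition(log_wage, categories):
    """Split Var(log wage) into a part explained by group dummies and a residual."""
    n_obs = log_wage.shape[0]
    columns = [np.ones(n_obs)]  # intercept
    for cat in categories:
        levels = np.unique(cat)
        # Drop the first level of each categorical variable to avoid collinearity.
        for level in levels[1:]:
            columns.append((cat == level).astype(float))
    design = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(design, log_wage, rcond=None)
    fitted = design @ beta
    residual = log_wage - fitted
    # Population variances (ddof=0) so that the two parts sum to the total.
    return fitted.var(), residual.var()

# Synthetic sample (assumption): education, sector, and region dummies only.
rng = np.random.default_rng(1)
n_obs = 5_000
hsg = rng.integers(0, 2, n_obs)
commodity = rng.integers(0, 2, n_obs)
region = rng.integers(0, 10, n_obs)
log_wage = 0.6 * hsg - 0.2 * commodity + 0.05 * region + rng.normal(0.0, 0.5, n_obs)

between, residual = between_residual_decomposition(log_wage, [hsg, commodity, region])
share_between = between / log_wage.var()
print("between-component share of log-wage variance:", share_between)
```

Applying the same function to two Census years and differencing the components would give the share of the change in dispersion attributable to observables, in the spirit of the percentages reported in the text.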

8 Table A1 presents the allocation of industries to sectors in each year of the Census. Also, Appendix A describes both the construction of microregions in the 1980 Census and the consequences of the changing municipality boundaries to the panel of microregions across years.

Figure 2: Components of Wage Dispersion in Brazil, 1980-2010

[Left panel: Decomposition of Between-Component of Wage Dispersion, 1980-2010. Vertical axis: variance of log wages. Series: HSG Wage Premium; Region Wage Premium; Sector Wage Premium; Covariance.]

[Right panel: Return to Observable Worker Characteristics, 1980-2010. Vertical axis: wage differentials. Series: HSG Wage Premium; P90/P10 Region Wage Premium; Sector Wage Premium.]

Note. Left panel: decomposition of the between-component of the log-wage variance associated with the set of dummies for high school graduation, employment in the commodity sector, and microregion of residence. Right panel: estimated return to observable worker attributes implied by the log-wage regression. The dispersion in the 558 regional wage differentials is represented by the difference between 90th and 10th percentiles of the estimated regional fixed effects.

To investigate further the negative aggregate correlation between commodity prices and the between-component of wage dispersion, Figure 2 presents the full decomposition of the portion of log-wage variance associated with worker’s affiliation in terms of education-sector-region. The left panel indicates that the substantial contribution of this component to the 1991-2010 decline in wage dispersion is distributed across all its terms: 34.0% is related to the education dummy, 10.7% to the sector dummy, 23.9% to regional dummies, and 31.4% to the covariance of these terms. Importantly, the right panel shows that this fall in log-wage variance was driven by strong reductions in the differential return to observable worker attributes. This discussion can be summarized as follows.

Fact 1. Between 1980 and 2010, international commodity prices and Brazilian wage dispersion are negatively correlated. A large fraction of the change in wage dispersion in the period is related to observable worker attributes with substantial movements in wage differentials associated with the worker’s level of education, sector of employment and region of residence.

3.2 Exposure to Commodity Price Shocks and Labor Market Outcomes

The similarity between the aggregate trends in wage dispersion and commodity prices is suggestive of their interconnection. Yet, this correlation is potentially driven by confounding shocks affecting the Brazilian economy in the period. This section addresses the causal effect of commodity prices on labor outcomes separately for High-School Dropouts and High-School Graduates. To this end, I treat microregions as local subeconomies subject to differential price shocks according to their initial profiles of industry specialization. By comparing worker groups in regions differentially exposed to commodity price shocks, the reduced-form exercise sheds light on the adjustment pattern of wages and employment following sectoral labor demand shifts. In the exercise, I consider a balanced panel

of 518 microregions in the years of 1991, 2000 and 2010.9

3.2.1 Heterogeneity in Exposure to Commodity Price Shocks

To capture heterogeneous exposure to fluctuations in international commodity prices, I construct a Bartik instrument for each microregion and group. Specifically, the shock exposure corresponds to the interaction between the commodity participation in the group-microregion wage bill in the initial year and the change in the international commodity price over the period. Precisely, the exposure of group g in microregion r to the commodity price shock is given by

$$\Delta Z_{g,r,t} = \sum_{j \in J^C} \phi_{g,r}(j) \cdot \Delta \ln p_t(j) \qquad (1)$$

where $\Delta \ln p_t(j)$ is the log-change in the international price of product $j$ between years $t-1$ and $t$; and $\phi_{g,r}(j)$ is the share of industry $j$ in total labor payments of the commodity sector to individuals of group $g$ in microregion $r$ in the initial year of 1991.

Intuitively, expression (1) entails a stronger response in the regional demand for worker groups specialized in the production of commodities experiencing stronger international price gains.10 Consequently, $\Delta Z_{g,r,t}$ embeds two dimensions of exposure heterogeneity. First, cross-regional variation in initial industry composition creates exposure heterogeneity for individuals of the same group that reside in different microregions. Second, cross-industry variation in skill intensity generates exposure heterogeneity between groups in the same region. As shown in Table A4, these two sources of variation are important in Brazil, giving rise to great exposure heterogeneity across both regions and groups.

Figure 3 exhibits the pattern of spatial exposure to the commodity price shock among microregions for HSG (left panel) and HSD (right panel). Notice that shock exposure differs significantly for the two groups, with a cross-region correlation of only 0.493. Moreover, Figure 3 indicates that HSG tend to have lower shock exposure than HSD: the median exposure is 4.7 log-points for HSG and 6.3 log-points for HSD.
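As a minimal illustration of expression (1), the Python sketch below computes the exposure measure for a single group-microregion cell. The commodity names, wage-bill shares and prices are hypothetical values chosen for illustration only, and the sketch assumes that the shares are normalized to sum to one within the commodity sector.

```python
import math

def exposure(shares, prices_prev, prices_curr):
    """Delta Z for one group-microregion cell: sum_j phi(j) * dlog p(j)."""
    total = sum(shares.values())
    # Assumption: shares are defined within the commodity sector and sum to one.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("shares must sum to one within the commodity sector")
    return sum(
        phi * (math.log(prices_curr[j]) - math.log(prices_prev[j]))
        for j, phi in shares.items()
    )

# Hypothetical initial-year wage-bill shares for two groups in the same region
shares_hsd = {"soy": 0.7, "oil": 0.3}
shares_hsg = {"soy": 0.2, "oil": 0.8}

# Hypothetical international prices at t-1 and t
p_prev = {"soy": 100.0, "oil": 100.0}
p_curr = {"soy": 110.0, "oil": 150.0}

dz_hsd = exposure(shares_hsd, p_prev, p_curr)
dz_hsg = exposure(shares_hsg, p_prev, p_curr)
print(f"exposure HSD: {dz_hsd:.3f}, exposure HSG: {dz_hsg:.3f}")
```

In this made-up example the two groups face the same price changes but differ in exposure because their initial industry composition differs, which is the between-group source of heterogeneity described above.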

9 I restrict the sample to only include those microregions with positive employment in the commodity sector for all years and groups. As a result, the final sample contains 518 microregions that represented 98.4% of the country's population in 1991. As discussed in Appendix A, there are two reasons to exclude the 1980-1991 period from the baseline sample. First, changing municipality borders only allows the construction of a subset of the microregions included in the 1991-2010 Censuses. Second, the 1980-1991 period was marked by severe economic turbulence in Brazil that has the potential to weaken the connection between international and domestic relative commodity prices; these events include hyperinflationary episodes, suspension of foreign currency convertibility, and the adoption of restrictive internal controls on prices and wages. Nevertheless, Appendix B shows that similar results are obtained with a restricted sample of microregions spanning the entire 1980-2010 period.

10 The initial industry composition is indicative, as in Costinot (2009), of the region's comparative advantage in production, reflecting local availability of natural resources like soil fertility and oil reserves. Accordingly, the intuition behind the labor demand response in expression (1) follows directly from the comparative static exercises in Costinot and Vogel (2010, 2014). Alternatively, Kovak (2013) connects production specialization to comparative advantage implied by the local endowment of industry-specific factors, deriving an expression similar to (1) for the response of regional labor demand to small price shocks. Recently, reduced-form strategies based on related measures of local exposure to trade shocks have become popular in the literature; see e.g. Topalova (2010), Kovak (2013), and Autor, Dorn, and Hanson (2013).

Figure 3: Heterogeneity in Exposure to Commodity Price Shock, 1991-2010

3.2.2 Reduced-Form Evidence

In order to investigate the effect of exposure to the commodity price shock on labor market outcomes, I consider the following reduced-form specification:

$$\Delta Y_{g,r,t} = \beta_g \cdot \Delta Z_{g,r,t} + X_{g,r,t}\gamma_g + v_{g,r,t} \qquad (2)$$

where $\Delta Y_{g,r,t}$ is the change in a labor market outcome for individuals of group $g$ in microregion $r$ between years $t-1$ and $t$; and $X_{g,r,t}$ is a control vector of group-microregion characteristics potentially correlated with the exposure measure. In the baseline specification, the control vector includes macroregion-period dummies, the initial share of group labor income in the commodity sector, and the initial share of workers earning at most the federal minimum wage. Also, microregions are weighted by their share in the national population of 1991, and standard errors are clustered by microregion to account for serially correlated errors.

Conditional on the initial share of group labor income in the commodity sector, specification (2) relies exclusively on the variation in relative product prices within the commodity sector. Accordingly, the causal interpretation of model (2) requires that shocks in Brazilian regions are not large enough to affect the world price of basic commodities. This requirement is especially plausible given the strong growth in Chinese imports of agriculture and mining products during the period, which is arguably an exogenous demand shock to relative prices of raw materials.11
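To make the estimation strategy behind specification (2) concrete, the following Python sketch computes a population-weighted slope with standard errors clustered by microregion. It is a deliberately simplified version of the specification: the only control is an intercept, no small-sample correction is applied, and all data values are hypothetical. The function name `wls_slope_clustered` is introduced here for illustration and does not correspond to any routine used in the paper.

```python
from collections import defaultdict
import math

def wls_slope_clustered(y, x, w, cluster):
    """Weighted slope of y on x (with intercept) and a cluster-robust s.e."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Partial out the intercept by weighted demeaning.
    xt = [xi - xbar for xi in x]
    yt = [yi - ybar for yi in y]
    sxx = sum(wi * a * a for wi, a in zip(w, xt))
    beta = sum(wi * a * b for wi, a, b in zip(w, xt, yt)) / sxx
    # Sum the weighted scores x * residual within each cluster, then square.
    score = defaultdict(float)
    for wi, a, b, c in zip(w, xt, yt, cluster):
        score[c] += wi * a * (b - beta * a)
    se = math.sqrt(sum(s * s for s in score.values())) / sxx
    return beta, se

# Hypothetical stacked panel: 3 microregions observed in 2 periods
dz = [0.05, 0.10, 0.20, 0.15, 0.30, 0.25]          # exposure, Delta Z
dy = [0.02, 0.03, 0.07, 0.04, 0.10, 0.07]          # outcome change, Delta Y
pop_share = [0.5, 0.5, 0.3, 0.3, 0.2, 0.2]         # initial population weights
region = ["r1", "r1", "r2", "r2", "r3", "r3"]      # cluster identifiers

beta_hat, se_hat = wls_slope_clustered(dy, dz, pop_share, region)
print(f"beta = {beta_hat:.3f}, clustered s.e. = {se_hat:.3f}")
```

Clustering by microregion sums the score contributions of the two periods of each region before squaring, which allows the errors of a given region to be correlated over time.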

Sectoral Wages and Employment. Columns (1)-(3) of Table 1 estimate equation (2) with the dependent variable being the commodity sector employment share of HSG in Panel A and HSD in Panel B. The positive significant coefficients in column (1) indicate that, for both HSG and HSD, higher commodity prices induce workers to reallocate from the non-commodity to the commodity sector. However, estimates suggest very limited between-sector mobility: a 10% increase in commodity prices causes the commodity sector employment share to increase by 0.31 p.p. for HSG and 0.77 p.p. for HSD. Compared to a region in the 10th percentile of shock exposure, these estimates imply that a region in the 90th percentile had a differential commodity sector expansion of 0.6 p.p. for HSG and 1.0 p.p. for HSD.

To test robustness and potentially eliminate confounding effects, I augment the model with period dummies interacted with a set of initial labor market conditions in column (2) and a quadratic polynomial of the initial commodity sector size in column (3). These controls represent period-specific effects projected on initial region characteristics, capturing for example effects related to the introduction of cash transfer programs and secular differences in sector productivity growth. Although these additional controls absorb a large part of the cross-section variation in sector employment change, they only strengthen estimated coefficients.

Turning to the impact of commodity prices on sectoral wages, columns (4)-(6) estimate model (2) with the commodity sector average wage premium as dependent variable. In column (4), the two groups exhibit different qualitative responses. The price shock triggers a significant positive response

Table 1: Exposure to Commodity Price Shocks and Sector Employment and Wages

Dependent Variable:                                       Change in Commodity Sector      Change in Commodity Sector
                                                          Employment Share                Average Log Wage Premium
                                                          (1)       (2)       (3)         (4)       (5)       (6)
A. High-School Graduates
Commodity price shock                                     0.031**   0.038**   0.038**     0.298**   0.437**   0.416**
                                                          (0.009)   (0.010)   (0.010)     (0.083)   (0.095)   (0.098)
R2                                                        0.283     0.343     0.405       0.143     0.198     0.220

B. High-School Dropouts
Commodity price shock                                     0.077*    0.101**   0.099**     -0.110    0.061     0.059
                                                          (0.031)   (0.029)   (0.029)     (0.144)   (0.161)   (0.161)
R2                                                        0.508     0.557     0.559       0.160     0.218     0.218

Observations                                              1036      1036      1036        1036      1036      1036
Baseline controls                                         Yes       Yes       Yes         Yes       Yes       Yes
Initial labor market conditions x period dummies          No        Yes       Yes         No        Yes       Yes
Initial commodity sector size controls x period dummies   No        No        Yes         No        No        Yes

Note. Stacked sample of 518 microregions in 1991-2000 and 2000-2010. All regressions are weighted by the microregion share in national population in 1991. All models include ten macroregion-period dummies and the following baseline controls in 1991: share of group labor income in the commodity sector and share of workers earning at most the federal minimum wage. Labor market conditions: cubic of per-capita income, quadratic of share of white individuals, share of employed individuals, share of individuals in urban areas, share of workers earning at most the federal minimum wage, share of social security dependents (only HSD). Commodity sector size controls: quadratic polynomial of commodity sector share in group labor income. Standard errors clustered by microregion. ** p<0.01, * p<0.05

11 Between 1992 and 2010, the average annual growth rate of Chinese imports was 17.2% for all products, 16.2% for Agriculture, and 28.3% for Mining. Hanson (2012) provides a careful discussion of the transformation over the period in the profile of international trade of emerging economies and, in particular, China. To the extent that this transformation was mainly driven by internal changes in the production structure of China, this large demand shock represented an exogenous impulse to world commodity prices in the period. In Table B4 of Appendix B, I investigate the sensitivity of results when the exposure to commodity price shocks is instrumented with the exposure to Chinese commodity imports growth.

of the commodity sector wage differential for HSG; in contrast, there is a nonsignificant response for HSD. These conclusions are robust to the inclusion of additional controls in columns (5) and (6). In light of the results in Table 1, I state the following conclusion.

Fact 2. Exposure to commodity price shocks is positively related to the commodity sector employment of both HSG and HSD. In addition, shock exposure is ambiguously related to the commodity sector wage premium, exhibiting a positive relation for HSG and a nonsignificant relation for HSD.
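The magnitudes quoted for the employment response follow from simple arithmetic on the column (1) coefficients of Table 1. The short Python sketch below reproduces that calculation under the assumption that a 10% price increase is approximated by a change of 0.10 log points; the helper name `pp_effect` is introduced here for illustration only.

```python
def pp_effect(coef, log_price_change):
    """Percentage-point change in an employment share implied by a coefficient."""
    return 100.0 * coef * log_price_change

# Column (1) of Table 1, Panels A and B
effect_hsg = pp_effect(0.031, 0.10)
effect_hsd = pp_effect(0.077, 0.10)
print(f"HSG: {effect_hsg:.2f} p.p., HSD: {effect_hsd:.2f} p.p.")
```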

Group Wages. Table 2 investigates the consequences of the commodity price shock for the average log wage of HSG in Panel A and HSD in Panel B. Columns (1)-(3) indicate that exposure to higher commodity prices is associated with higher average wages for the two worker groups. In columns (4)-(6), I assess the robustness of this relation to the use of an alternative exposure measure based on the interaction of ∆Zg,r,t with the initial commodity sector size. As in Autor, Dorn, and Hanson (2013), this alternative measure explicitly incorporates differences in the overall sectoral allocation of workers across microregions and groups.12 Again, estimates indicate a positive response of average wages to higher commodity prices. This result suggests that, despite the mixed wage response at the sector level, the commodity price shock has a positive impact on the microregion's labor demand, putting upward pressure on wages of HSG and HSD. Results in Table 2 imply a sizeable effect of the commodity price shock on the behavior of region

Table 2: Exposure to Commodity Price Shocks and Group Wages

Dependent Variable: Change in Average Log Wage
                                                          (1)       (2)       (3)         (4)       (5)       (6)
A. High-School Graduates
Commodity price shock                                     0.296**   0.283**   0.288**
                                                          (0.069)   (0.066)   (0.064)
Commodity sector size x Commodity price shock                                             0.694*    0.510*    0.636*
                                                                                          (0.342)   (0.260)   (0.292)
R2                                                        0.399     0.433     0.440       0.374     0.415     0.422

B. High-School Dropouts
Commodity price shock                                     0.271**   0.280**   0.268**
                                                          (0.103)   (0.097)   (0.099)
Commodity sector size x Commodity price shock                                             0.628**   0.557**   0.558*
                                                                                          (0.204)   (0.208)   (0.221)
R2                                                        0.555     0.624     0.636       0.552     0.621     0.634

Observations                                              1036      1036      1036        1036      1036      1036
Baseline controls                                         Yes       Yes       Yes         Yes       Yes       Yes
Initial labor market conditions x period dummies          No        Yes       Yes         No        Yes       Yes
Initial commodity sector size controls x period dummies   No        No        Yes         No        No        Yes

Note. Stacked sample of 518 microregions in 1991-2000 and 2000-2010. All regressions are weighted by the microregion share in national population in 1991 and include ten macroregion-period dummies. Baseline controls as in Table 1. Standard errors clustered by microregion. ** p<0.01, * p<0.05

12In this case, exposure to the price shock of each commodity is proportional to its overall participation in the regional wage bill of the worker group. In the model presented below, the response of group average wage is proportional to the overall sector allocation of group workers, being more closely related to the alternative exposure measure. In contrast, the model predicts that sectoral wages and employment within each group respond to sectoral demand shocks, being more closely related to ∆Zg,r,t. In Table B6, I attest that qualitative conclusions of the sector-level analysis remain valid if the alternative exposure measure is considered.

and education wage gaps in Brazil. To see this, consider the differential wage growth of a microregion in the 90th percentile of shock exposure relative to another in the 10th percentile. Estimates in column (4) predict that this differential growth is 1.9 p.p. for HSG and 4.2 p.p. for HSD. Higher relative wage gains for HSD follow from their higher exposure to commodity price shocks and, in this case, translate into a 2.2 p.p. reduction in the high-school wage premium.

Fact 3. Exposure to commodity price shocks is positively related to the average wage of both HSG and HSD. Estimated coefficients suggest a sizeable impact of commodity prices on educational and regional wage differentials.

Sensitivity Analysis. Having established Facts 2 and 3, Appendix B investigates the robustness of these results. In particular, I obtain similar conclusions if the baseline specification is extended to include additional sector composition controls, microregion-specific time trends, additional periods and additional worker groups (Tables B1-B3). To address concerns regarding the effect of supply conditions in Brazil on world commodity prices, I also show that similar results are obtained when the exposure to commodity price shocks is instrumented with the exposure to Chinese commodity imports growth in 1991-2010 (Table B4). Lastly, I show that exposure to commodity price shocks is not related to the change in the labor supply of either native or migrant workers (Table B5).

Following an increase in the world price of commodities, Facts 2 and 3 indicate that the adjustment pattern of wages and employment depends on the shock exposure across regions and groups. To be more precise, regional economies more exposed to the price shock experience a stronger employment expansion in the commodity sector for both HSG and HSD. Although shock exposure is related to an increase in the average wage at the group level, there are mixed within-group responses in sector average wages. In the next section, I propose a Two-Sector Roy Economy that, following commodity price shocks, generates such an adjustment pattern through the self-selection of heterogeneous individuals into sectors. This structural framework delivers a methodology to evaluate the impact of commodity price shocks on the wage distribution, allowing a quantitative investigation of the aggregate relation between the Brazilian wage structure and the international commodity prices emphasized in Fact 1.

4 Two-Sector Roy Economy

4.1 Environment

Consider a segmented labor market populated by workers of multiple groups, $g$, that self-select into either the commodity sector ($k = C$) or the non-commodity sector ($k = N$). In sector $k$, production depends on the sum of sector-specific tasks performed by all sector employees of a particular worker group. Following a long tradition in Roy-like models, assume that workers can only be employed in a single sector. Employees in each sector are paid a salary in exchange for each completed unit of the sector-specific task assigned to their group. Denote by $w_g^k$ the unit price in sector $k$ of the task performed by workers of group $g$. This notation allows, but does not require, sector-specific task prices to differ across groups.13

Within each group, there is a continuum of heterogeneous individuals indexed by $i \in I_g$. Individual $i$ is endowed with a bivariate skill vector, $(T_g^C(i), T_g^N(i))$, that determines her effective supply of sector-specific tasks. Specifically, individual $i$, if employed in sector $k$, performs $T_g^k(i)$ units of the task specific to sector $k$ and earns a wage of $w_g^k T_g^k(i)$. The analysis is simplified by working with a log-linear transformation of earnings so that the potential log-wage of individual $i$ of group $g$ in sector $k$ is given by $\omega_g^k + \ln T_g^k(i)$, where $\omega_g^k \equiv \ln w_g^k$. Assume that individuals choose to be employed in the sector that gives them the highest wage.14

Accordingly, individual $i$ self-selects into sector $K_g(i)$, where

$$K_g(i) = \arg\max \left\{ \omega_g^C + \ln T_g^C(i);\ \omega_g^N + \ln T_g^N(i) \right\}. \qquad (3)$$

To characterize the structure of employment and wages in the Two-Sector Roy Economy, let us define individual $i$'s comparative advantage as $s_g(i) \equiv \ln(T_g^C(i)/T_g^N(i))$, and absolute advantage as $a_g(i) \equiv \ln T_g^N(i)$. In a given group, suppose individuals independently draw their skill vector from a common bivariate distribution such that

$$s_g(i) \sim F_g(s) \quad \text{and} \quad \{a_g(i) \mid s_g(i) = s\} \sim H_g(a \mid s) \qquad (4)$$

where $\alpha_g(q) \equiv (F_g)^{-1}(q)$ is the quantile function of comparative advantage.

The skill distribution in (4) implies that, in a particular quantile $q \in [0, 1]$, there is a set of individuals in group $g$ whose level of comparative advantage is $\alpha_g(q)$. Among these individuals, there is a conditional distribution of absolute advantage, $H_g(a \mid \alpha_g(q))$, with average and variance respectively denoted by $A_g(q) \equiv E[a_g(i) \mid s_g(i) = \alpha_g(q)]$ and $V_g(q) \equiv \mathrm{Var}[a_g(i) \mid s_g(i) = \alpha_g(q)]$.
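The environment in (3)-(4) can be simulated directly. The Python sketch below draws comparative and absolute advantage for a single group and applies the sorting rule in (3). The distributional choices (standard normal comparative advantage, and absolute advantage that declines linearly with comparative advantage plus normal noise) and the task prices are hypothetical assumptions made only for illustration; the model itself imposes no such parametric structure.

```python
import random
from statistics import NormalDist

def simulate_group(n, omega_c, omega_n, seed=0):
    """Draw (s, a) for n workers and assign each to the higher-paying sector."""
    rng = random.Random(seed)
    workers = []
    for _ in range(n):
        s = rng.gauss(0.0, 1.0)              # comparative advantage s_g(i)
        a = -0.5 * s + rng.gauss(0.0, 0.3)   # absolute advantage a_g(i) given s
        log_wage_c = omega_c + s + a         # omega^C + ln T^C, with ln T^C = s + a
        log_wage_n = omega_n + a             # omega^N + ln T^N, with ln T^N = a
        sector = "C" if log_wage_c >= log_wage_n else "N"
        workers.append((s, a, sector, max(log_wage_c, log_wage_n)))
    return workers

workers = simulate_group(20000, omega_c=0.0, omega_n=0.5)
share_n = sum(1 for w in workers if w[2] == "N") / len(workers)

# With s ~ N(0, 1), sorting on s implies a non-commodity share of F(omega^N - omega^C).
share_n_theory = NormalDist().cdf(0.5 - 0.0)
print(f"simulated share in N: {share_n:.3f}, implied by F: {share_n_theory:.3f}")
```

Because absolute advantage enters both potential log wages symmetrically, the sectoral choice in the simulation depends only on whether comparative advantage exceeds the relative task price.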

4.2 Sectoral Wages and Employment: Comparative and Absolute Advantage

The most direct implication of the structure in (3) is the endogenous sorting of workers to sectors. Due to the heterogeneity in sector-specific task efficiency of workers, this endogenous sorting decision has strong implications for the pattern of employment and wages in the two sectors of the economy. To analyze such a pattern, I consider a graphical representation of the economy in terms of quantiles of the comparative advantage distribution.

Figure 4 exhibits the average potential log wage in each sector for individuals distributed across different comparative advantage quantiles. If employed in the non-commodity sector, workers in quantile q perform, on average, Ag(q) units of the sector-specific task, receiving in return an average potential

13 This particular formulation follows closely the environment in the extensive literature inspired by the seminal work of Roy (1951); see e.g. Heckman and Sedlacek (1985), Heckman and Honore (1990), and, more recently, Ohnsorge and Trefler (2007), Lagakos and Waugh (2013) and Young (2014). In Appendix C, I provide a detailed description of the production and market structures that generate such sector-specific task prices in general equilibrium.

14 Equivalently, one could assume that individuals derive utility solely from the optimal consumption bundle acquired with their labor income. By introducing within-group heterogeneity entirely on sector-specific labor efficiency, the distributive impact of a trade shock is completely captured by the behavior of observable labor income. In this environment, identical homothetic preferences imply that changes in the welfare distribution are directly related to changes in the wage distribution. Notice that this preference structure abstracts from private employment benefits and mobility costs. In Appendix D.3, I explore an extension that incorporates such features into the model.

log wage of $\bar{Y}_g^N(q) = \omega_g^N + A_g(q)$. Alternatively, if employed in the commodity sector, they perform an average of $\alpha_g(q) + A_g(q)$ units of the sector-specific task and earn an average potential log wage of $\bar{Y}_g^C(q) = \omega_g^C + \alpha_g(q) + A_g(q)$. In a particular quantile, the unique source of dispersion in potential sector wages is the dispersion of absolute advantage, $V_g(q)$, illustrated by the hump-shaped curves in quantile $q_1$.15

In Figure 4, $A_g(q)$ is strictly decreasing in $q$, implying that individuals in high quantiles of comparative advantage have, on average, lower levels of absolute advantage. However, the skill distribution in (4) does not impose any restriction on the relation between comparative and absolute advantage. Consequently, the behavior of $A_g(q)$ is free to change in an arbitrary way across comparative advantage quantiles. Figure 5 illustrates the other extreme case in which $A_g(q)$ is strictly increasing in $q$. As discussed below, these two cases have very distinct implications for the adjustment of sector average wages following sectoral demand shocks.

Figure 4: Case I - Ag(q) decreasing.

Figure 5: Case II - Ag(q) increasing.

Sector Employment and Comparative Advantage. The preference structure in (3) implies that individuals self-select into the sector yielding the highest wage. For all individuals in a given quantile of comparative advantage, the sectoral employment choice can be determined from the comparison between the two sectoral curves of average potential wage in Figures 4-5.16 In high quantiles of comparative advantage, the relatively higher efficiency in the commodity sector task yields a relatively higher wage in that sector, implying self-selection into the commodity sector. In contrast, individuals

15 The two potential wage curves exhibit the single-crossing property because, by construction, $\alpha_g(q)$ is increasing in $q$. Hence, they cross at most once. To simplify the analysis, I assume that task prices are such that these curves cross at least once. This is the case if $F_g(\cdot)$ has full support on $\mathbb{R}$ so that $\lim_{q \to 0} \alpha_g(q) = -\infty$ and $\lim_{q \to 1} \alpha_g(q) = \infty$. In general equilibrium, this is also implied by restrictions on the production technology guaranteeing that, in both sectors, the marginal product of a task unit goes to infinity as the total task quantity goes to zero.

16 To formalize this claim, consider individual $i$ with comparative advantage $s_g(i) = \alpha_g(q)$. This worker has a potential log wage of $\omega_g^N + a_g(i)$ in the non-commodity sector and $\omega_g^C + \alpha_g(q) + a_g(i)$ in the commodity sector. For this individual, potential sector wages correspond to vertical shifts of those of a worker with the same level of comparative advantage but a different level of absolute advantage. Consequently, the sectoral choice of individual $i$ with $s_g(i) = \alpha_g(q)$ is identical to that of a hypothetical individual $i'$ with $s_g(i') = \alpha_g(q)$ and $a_g(i') = A_g(q)$.

in low quantiles of comparative advantage obtain a relatively lower wage in the commodity sector, finding it optimal to self-select into the non-commodity sector. In equilibrium, the sector employment composition is determined by marginal individuals with comparative advantage equal to the relative task price, $\omega_g^N - \omega_g^C$. These marginal workers have exactly the same potential wage in the two sectors, being indifferent between them.

Formally, individuals in quantiles of comparative advantage $Q_g^N \equiv \{q : q < l_g^N\}$ are employed in the non-commodity sector and those in quantiles $Q_g^C \equiv \{q : q \geq l_g^N\}$ are employed in the commodity sector, where the employment share in the non-commodity sector, $l_g^N$, is determined by the unique intersection of the sectoral schedules of average potential wage,

$$\omega_g^N - \omega_g^C = \alpha_g\left(l_g^N\right). \qquad (5)$$

This discussion highlights the central feature of the Two-Sector Roy Economy: workers are not readily substitutable between sectors. Because of the efficiency differential in sector-specific tasks, workers do not switch sectors to arbitrage the differential in sector-specific task prices in equilibrium. The sector employment composition in (5) reflects this heterogeneity in sector efficiency, depending on the shape of the schedule of comparative advantage, αg(.).
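Since the comparative advantage schedule is increasing, the employment share defined by (5) can be found numerically for any candidate schedule. The Python sketch below solves (5) by bisection. The logistic quantile function used for the schedule is a hypothetical choice, convenient because it has full support and a closed-form inverse against which the numerical solution can be checked.

```python
import math

def alpha_logistic(q):
    """Hypothetical comparative-advantage schedule: logistic quantile function."""
    return math.log(q / (1.0 - q))

def noncommodity_share(alpha, rel_price, tol=1e-12):
    """Solve alpha(l) = omega^N - omega^C for l in (0, 1) by bisection."""
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if alpha(mid) < rel_price:
            lo = mid   # marginal quantile lies above mid
        else:
            hi = mid   # marginal quantile lies at or below mid
    return 0.5 * (lo + hi)

# Hypothetical relative task price omega^N - omega^C = 0.5
l_n = noncommodity_share(alpha_logistic, rel_price=0.5)
l_n_closed_form = 1.0 / (1.0 + math.exp(-0.5))
print(f"bisection: {l_n:.6f}, closed form: {l_n_closed_form:.6f}")
```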

Sector Average Wage and Average Sector-Specific Efficiency. The endogenous selection of individuals to sectors implies that the sector average wage embodies only the average labor income of individuals employed in the sector: $\bar{Y}_g^k = E[\bar{Y}_g^k(q) \mid q \in Q_g^k]$. As a result, the average log wage in the non-commodity sector is given by

$$\bar{Y}_g^N = \omega_g^N + \bar{A}_g^N(l_g^N) \quad \text{s.t.} \quad \bar{A}_g^N(l) \equiv \frac{1}{l} \int_0^l A_g(q)\, dq; \qquad (6)$$

and the average log wage in the commodity sector is

$$\bar{Y}_g^C = \omega_g^C + \bar{A}_g^C(l_g^N) \quad \text{s.t.} \quad \bar{A}_g^C(l) \equiv \frac{1}{1-l} \int_l^1 \left[\alpha_g(q) + A_g(q)\right] dq. \qquad (7)$$

In expressions (6)-(7), there are two determinants of the average wage in each sector. The first is the sector-specific task price, $\omega_g^k$, which directly affects the wage of sector employees. The second is the sector employment composition, $l_g^N$, which affects the average efficiency in the sector-specific task through the term $\bar{A}_g^k(l_g^N)$. Because of the heterogeneity in average sector-specific efficiency illustrated in Figures 4-5, the range of quantiles allocated to each sector has a direct effect on that sector's average wage. In other words, the pattern of variation in average sector-specific task efficiency across comparative advantage quantiles ($A_g(\cdot)$ in the non-commodity sector and $\alpha_g(\cdot) + A_g(\cdot)$ in the commodity sector) determines the average efficiency of sector employees summarized in the shape of $\bar{A}_g^k(\cdot)$.
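Expressions (6)-(7) are straightforward to evaluate numerically once the two schedules are specified. The Python sketch below does so with a midpoint rule. The linear schedules and the task prices are hypothetical and were chosen so that the integrals can be verified by hand; they correspond to Case I, in which absolute advantage is decreasing across quantiles.

```python
def midpoint_mean(f, lo, hi, n=20000):
    """Average of f over [lo, hi] using the midpoint rule."""
    h = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * h) for k in range(n)) / n

def sector_average_wages(alpha, A, omega_c, omega_n, l_n):
    """Average log wages in (6) and (7) for a non-commodity share l_n."""
    a_bar_n = midpoint_mean(A, 0.0, l_n)                          # Abar^N(l)
    a_bar_c = midpoint_mean(lambda q: alpha(q) + A(q), l_n, 1.0)  # Abar^C(l)
    return omega_n + a_bar_n, omega_c + a_bar_c

def alpha(q):
    """Hypothetical comparative-advantage schedule (increasing in q)."""
    return 2.0 * q - 1.0

def A(q):
    """Hypothetical absolute-advantage schedule (Case I: decreasing in q)."""
    return 1.0 - q

# l_n = 0.75 satisfies (5): alpha(0.75) = 0.5 = omega^N - omega^C
y_n, y_c = sector_average_wages(alpha, A, omega_c=0.0, omega_n=0.5, l_n=0.75)
print(f"average log wage: N = {y_n:.3f}, C = {y_c:.3f}")
```

With these schedules, the non-commodity sector retains the quantiles with the highest absolute advantage, so its average efficiency term exceeds the value of the schedule at the marginal quantile.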

4.3 Average and Variance of Group Log Wages

In the Two-Sector Roy Economy, the average and variance of the log-wage distribution of workers in group $g$ are determined by their sectoral allocation decision, $Q_g^k$, as well as their schedules of comparative and absolute advantage, $\alpha_g(q)$ and $A_g(q)$. By construction, the average wage in group $g$ is $\bar{Y}_g = l_g^N \cdot \bar{Y}_g^N + l_g^C \cdot \bar{Y}_g^C$ and, therefore, expressions (6)-(7) imply

$$\bar{Y}_g = \omega_g^C \cdot l_g^C + \omega_g^N \cdot l_g^N + \int_{l_g^N}^{1} \alpha_g(q)\, dq + e_g \qquad (8)$$

where $e_g \equiv \int_0^1 A_g(q)\, dq$.

Since individuals are not readily substitutable between sectors, they are solely exposed to the task price in their sector of employment. As a result, the average wage in (8) depends on the sector employment composition as captured by $\omega_g^C \cdot l_g^C + \omega_g^N \cdot l_g^N$. The other two terms reflect the average efficiency of workers in the two sector-specific tasks. While the term $e_g$ captures the average absolute advantage that affects efficiency in both sectors, the term $\int_{l_g^N}^{1} \alpha_g(q)\, dq$ captures the comparative advantage that affects only the efficiency of commodity sector employees.

Among individuals of group $g$, there are two sources of wage dispersion: the between-sector average wage differential and within-sector wage dispersion. To be more precise, the law of total variance implies that

$$V_g = l_g^N l_g^C \cdot \left(\bar{Y}_g^C - \bar{Y}_g^N\right)^2 + l_g^N \cdot V_g^N + l_g^C \cdot V_g^C \qquad (9)$$

where $V_g^k$ corresponds to the log-wage variance of individuals employed in sector $k$. As indicated in Figures 4-5, this within-sector wage variance combines the variation in average wage of individuals distributed across comparative advantage quantiles employed in the sector, $\bar{Y}_g^k(q)$, and the variation in absolute advantage in any particular comparative advantage quantile, $V_g(q)$. Consequently, the log-wage variance in sector $k$ is $V_g^k = E[V_g(q) \mid q \in Q_g^k] + \mathrm{Var}[\bar{Y}_g^k(q) \mid q \in Q_g^k]$. By applying this expression to (9), the log-wage variance in group $g$ can be written as

$$V_g = l_g^N l_g^C \cdot \left(\bar{Y}_g^C - \bar{Y}_g^N\right)^2 + l_g^N \cdot \mathrm{Var}\left[A_g(q) \mid q < l_g^N\right] + l_g^C \cdot \mathrm{Var}\left[\alpha_g(q) + A_g(q) \mid q \geq l_g^N\right] + \nu_g \qquad (10)$$

where the variance is taken in the conditional uniform distribution of quantiles allocated to each sector. In equation (10), the term $\nu_g \equiv \int_0^1 V_g(q)\, dq$ captures the portion of wage dispersion related to the absolute advantage dispersion within comparative advantage quantiles. This term is independent of the endogenous worker sorting because it has symmetric effects on potential wages in the two sectors.
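The between-sector and within-sector decomposition in (9) holds as an identity in any sample once population variances are used, which can be verified on simulated data. The Python sketch below does this for one group. The skill distribution and task prices are hypothetical assumptions made for illustration only.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(1)
wages_n, wages_c = [], []
for _ in range(5000):
    s = rng.gauss(0.0, 1.0)              # comparative advantage (hypothetical law)
    a = -0.5 * s + rng.gauss(0.0, 0.3)   # absolute advantage given s (hypothetical)
    y_c = 0.0 + s + a                    # potential log wage in C, omega^C = 0.0
    y_n = 0.5 + a                        # potential log wage in N, omega^N = 0.5
    # Each worker is observed only in the sector paying the higher wage.
    (wages_c if y_c >= y_n else wages_n).append(max(y_c, y_n))

n_workers = len(wages_n) + len(wages_c)
l_n, l_c = len(wages_n) / n_workers, len(wages_c) / n_workers

between = l_n * l_c * (fmean(wages_c) - fmean(wages_n)) ** 2
within = l_n * pvariance(wages_n) + l_c * pvariance(wages_c)
total = pvariance(wages_n + wages_c)
print(f"total = {total:.4f}, between + within = {between + within:.4f}")
```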

4.4 Comparative Statics: Sector Demand Shocks

To better understand the mechanics of the Two-Sector Roy Economy, let us analyze the adjustment of labor market outcomes following a positive shock to task demand in the commodity sector. To simplify the analysis, I consider a demand-driven increase in $\omega_g^C$ while holding constant both $\omega_g^N$ and the skill distribution. With this exercise, I show how the model is capable of rationalizing the response of the Brazilian labor market to the commodity price shock (i.e., Facts 1-3 in Section 3).

As illustrated in Figures 6-7, the increase in the commodity sector task price implies an upward shift in the commodity sector curve of potential average wage. Following the shock, individuals initially indifferent between the two sectors find it optimal to be employed in the commodity sector, reducing the employment share in the non-commodity sector by $\Delta l_g^N < 0$. The magnitude of the between-sector employment flow is directly related to the slope of the comparative advantage schedule: for the same variation in the relative task price, the sector employment response is smaller if the slope of the comparative advantage schedule, $\alpha_g(\cdot)$, is higher. This is immediately implied by equation (5):

$$\Delta\left[\omega_g^N - \omega_g^C\right] = \int_{l_g^N}^{l_g^N + \Delta l_g^N} \frac{\partial \alpha_g(u)}{\partial q}\, du. \qquad (11)$$
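The link between the slope of the comparative advantage schedule and the size of the employment response in (11) can be illustrated with a parametric example. The Python sketch below assumes, purely for illustration, that comparative advantage is normally distributed with standard deviation `sigma`, so that a larger `sigma` corresponds to a steeper schedule; the task prices and the size of the shock are hypothetical.

```python
from statistics import NormalDist

def share_n(rel_price, sigma):
    """l^N = F(omega^N - omega^C) when comparative advantage is N(0, sigma^2)."""
    return NormalDist(0.0, sigma).cdf(rel_price)

def employment_response(rel_price, d_rel_price, sigma):
    """Change in the non-commodity employment share after a relative price change."""
    return share_n(rel_price + d_rel_price, sigma) - share_n(rel_price, sigma)

# The commodity task price rises by 0.1, so omega^N - omega^C falls by 0.1.
flat = employment_response(0.5, -0.1, sigma=0.5)    # flatter alpha schedule
steep = employment_response(0.5, -0.1, sigma=2.0)   # steeper alpha schedule
print(f"change in l^N: flat schedule {flat:.4f}, steep schedule {steep:.4f}")
```

In both cases the non-commodity share falls, but the reallocation is smaller under the steeper schedule, since fewer workers have comparative advantage close to the relative task price.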

Figure 6: Positive selection into non-commodity sector.

Figure 7: Negative selection into non-commodity sector.

Although the non-commodity sector task price is constant, the implied outflow of non-commodity sector employees affects the sector's employment composition and, consequently, the sector's average wage. To the extent that the absolute advantage of sector-switchers differs from that of sector-stayers, the change in sector employment affects the sector average efficiency. If $A_g(q)$ is decreasing as illustrated in Figure 6, then sector-stayers have a higher level of absolute advantage than sector-switchers (i.e., blue area). In this case, the outflow of workers leaves the non-commodity sector with employees whose average absolute advantage is relatively higher, raising the average wage in the sector. However, this is not the only possibility. In Figure 7, $A_g(q)$ is increasing so that the outflow of workers lowers the average wage in the non-commodity sector. To formalize this discussion, notice that expression (6) yields

$$\Delta \bar{Y}_g^N - \Delta \omega_g^N = \int_{l_g^N}^{l_g^N + \Delta l_g^N} \frac{\partial \bar{A}_g^N(u)}{\partial q}\, du = \int_{l_g^N}^{l_g^N + \Delta l_g^N} \frac{1}{u} \left[ A_g(u) - \bar{A}_g^N(u) \right] du. \qquad (12)$$

By expression (12), the implied response in the non-commodity sector average efficiency, $\Delta \bar{Y}_g^N - \Delta \omega_g^N$, depends on the slope of $\bar{A}_g^N(\cdot)$, which is directly related to the difference between the absolute advantage of sector-switchers, $A_g(l_g^N)$, and sector-stayers, $\bar{A}_g^N(l_g^N)$. Following the reduction in non-commodity sector employment, this compositional effect has an ambiguous sign; for example, it is positive if $A_g(\cdot)$ is decreasing (i.e., Figure 6) or negative if $A_g(\cdot)$ is increasing (i.e., Figure 7).
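The sign ambiguity in (12) can be seen in a small numerical example. The Python sketch below evaluates the compositional effect in the non-commodity sector for a decreasing and for an increasing absolute advantage schedule. Both linear schedules and the employment shares before and after the shock are hypothetical values chosen for illustration.

```python
def avg_abs_advantage_n(A, l, n=20000):
    """Abar^N(l): mean of A over quantiles [0, l], using the midpoint rule."""
    h = l / n
    return sum(A((k + 0.5) * h) for k in range(n)) / n

def composition_effect_n(A, l_before, l_after):
    """Change in the non-commodity average log wage net of the task price."""
    return avg_abs_advantage_n(A, l_after) - avg_abs_advantage_n(A, l_before)

def A_decreasing(q):
    """Hypothetical Case I schedule: absolute advantage falls with q."""
    return 1.0 - q

def A_increasing(q):
    """Hypothetical Case II schedule: absolute advantage rises with q."""
    return q

# The non-commodity share falls from 0.75 to 0.70 after the shock.
case_1 = composition_effect_n(A_decreasing, 0.75, 0.70)
case_2 = composition_effect_n(A_increasing, 0.75, 0.70)
print(f"Case I: {case_1:+.3f}, Case II: {case_2:+.3f}")
```

The same outflow of workers raises the non-commodity average wage in the first case and lowers it in the second, because the workers who leave are below-average in absolute advantage in Case I and above-average in Case II.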

In the commodity sector, the shock has two effects on the average wage. There is an increase in the log-wage of commodity sector employees implied by the growth of the commodity sector task price. The magnitude of this increase is represented by the vertical shift of the curve of potential average wage in the commodity sector. In addition, there is a compositional effect driven by the inflow of new employees whose sector-specific efficiency differs from that of original employees in the commodity sector. As discussed above for the non-commodity sector, the sign of this effect is ambiguous, being controlled by the slope of $\bar{A}_g^C(\cdot)$.17

As a consequence of the shock, the upper envelope of the two curves of potential sector wage shifts upwards and, therefore, the average group wage increases. Formally, expression (8) yields

$$\Delta \bar{Y}_g = \Delta \omega_g^C \cdot l_g^C + \Delta \omega_g^N \cdot l_g^N + \left[ \alpha_g\left(l_g^N + \Delta l_g^N\right) \cdot \Delta l_g^N - \int_{l_g^N}^{l_g^N + \Delta l_g^N} \alpha_g(q)\, dq \right]. \qquad (13)$$

The first two terms correspond to the positive wage gain implied by the shock if workers were to remain in their initial sector of employment. Yet, the change in sector composition associated with the shock introduces an additional source of average wage variation: the compositional effect captured by the term in brackets. For the positive shock considered, the compositional effect is also positive because individuals only move into the commodity sector to take advantage of its higher wage growth.18 Finally, notice that this compositional effect is second-order: for small price shocks, sector-switchers are the marginal individuals with the same potential wage in the two sectors, so their reallocation does not affect the log-wage distribution of the group. Proposition 1 summarizes the predictions of the model following an exogenous increase in the commodity sector task price.
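The step from (8) to (13), and the second-order nature of the bracketed term, can be spelled out as follows. This is a sketch of the algebra rather than a reproduction of a derivation given in the text: primes denote post-shock values (a notation introduced here), and the last line assumes that the comparative advantage schedule is differentiable at the initial employment share.

```latex
\begin{align*}
\Delta \bar{Y}_g
  &= \omega_g^{C\prime}\,(1 - l_g^{N\prime}) + \omega_g^{N\prime}\, l_g^{N\prime}
     + \int_{l_g^{N\prime}}^{1} \alpha_g(q)\,dq
     - \omega_g^{C}\,(1 - l_g^{N}) - \omega_g^{N}\, l_g^{N}
     - \int_{l_g^{N}}^{1} \alpha_g(q)\,dq \\
  &= \Delta\omega_g^{C}\, l_g^{C} + \Delta\omega_g^{N}\, l_g^{N}
     + \left(\omega_g^{N\prime} - \omega_g^{C\prime}\right)\Delta l_g^{N}
     - \int_{l_g^{N}}^{l_g^{N} + \Delta l_g^{N}} \alpha_g(q)\,dq \\
  &= \Delta\omega_g^{C}\, l_g^{C} + \Delta\omega_g^{N}\, l_g^{N}
     + \left[\alpha_g\!\left(l_g^{N} + \Delta l_g^{N}\right)\Delta l_g^{N}
     - \int_{l_g^{N}}^{l_g^{N} + \Delta l_g^{N}} \alpha_g(q)\,dq\right]
     && \text{by (5) after the shock} \\[4pt]
\left[\,\cdot\,\right]
  &= \int_{l_g^{N} + \Delta l_g^{N}}^{l_g^{N}}
     \left[\alpha_g(q) - \alpha_g\!\left(l_g^{N} + \Delta l_g^{N}\right)\right] dq
     \;\geq\; 0
     && \text{since } \Delta l_g^{N} < 0 \text{ and } \alpha_g \text{ is increasing} \\
  &\approx \tfrac{1}{2}\,\alpha_g'\!\left(l_g^{N}\right)\left(\Delta l_g^{N}\right)^{2}
     && \text{for small } \Delta l_g^{N}.
\end{align*}
```

The quadratic dependence on the employment change in the last line is what makes the compositional effect negligible for small shocks relative to the first two terms of (13), which are linear in the task price changes.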

Proposition 1. Suppose an increase in the commodity sector task demand such that $\Delta\omega^C_g > \Delta\omega^N_g \geq 0$. Then,

i. There is an increase in the employment share of the commodity sector whose magnitude is controlled by the slope of the schedule of comparative advantage, $\alpha_g(\cdot)$.

ii. There is an ambiguous change in the average wage of both sectors due to the existence of compositional effects controlled by the slope of the schedule of sector average efficiency, $\bar{A}^k_g(\cdot)$.

iii. There is an increase in the average group wage whose magnitude depends on the initial sector allocation of workers in the group.
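As a numerical illustration of Proposition 1 and expression (13), the following Python sketch computes the employment response and the two components of the average group wage change. It is only a sketch under stated assumptions: the comparative advantage schedule is taken to be $\alpha_g(q) = a[\ln q - \ln(1-q)]$ with a hypothetical slope `a`, market-specific shifters are normalized to zero, and the initial allocation is interior so that the marginal worker is indifferent, $\omega^N_g - \omega^C_g = \alpha_g(l^N_g)$. The function names and parameter values are illustrative and are not taken from the paper.

```python
import math

def alpha(q, a=0.5):
    """Assumed comparative-advantage schedule: a * log-odds of q (increasing in q)."""
    return a * (math.log(q) - math.log(1.0 - q))

def alpha_inverse(x, a=0.5):
    """Employment share l^N at which alpha(l^N) equals the relative task price x."""
    return 1.0 / (1.0 + math.exp(-x / a))

def alpha_integral(lo, hi, a=0.5):
    """Closed-form integral of alpha between lo and hi."""
    def antiderivative(q):
        return a * (q * math.log(q) + (1.0 - q) * math.log(1.0 - q))
    return antiderivative(hi) - antiderivative(lo)

def group_wage_change(l_n, d_omega_c, d_omega_n, a=0.5):
    """Return (direct effect, compositional effect, change in l^N) as in expression (13)."""
    new_relative_price = alpha(l_n, a) + d_omega_n - d_omega_c
    new_l_n = alpha_inverse(new_relative_price, a)
    d_l_n = new_l_n - l_n
    # Wage gain if every worker stayed in the initial sector.
    direct = d_omega_c * (1.0 - l_n) + d_omega_n * l_n
    # Term in brackets in expression (13).
    composition = alpha(new_l_n, a) * d_l_n - alpha_integral(l_n, new_l_n, a)
    return direct, composition, d_l_n

big = group_wage_change(l_n=0.7, d_omega_c=0.10, d_omega_n=0.0)
small = group_wage_change(l_n=0.7, d_omega_c=0.05, d_omega_n=0.0)
print("direct, composition, change in l^N:", big)
print("composition ratio when the shock doubles:", big[1] / small[1])
```

Under these assumptions, the non-commodity employment share falls, the compositional term is positive but small relative to the direct term, and it roughly quadruples when the shock doubles, which is what a second-order effect implies.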

17 In the commodity sector, the average efficiency in the sector-specific task combines the schedules of comparative and absolute advantage. From expression (7),

$$\Delta\bar{Y}^C_g - \Delta\omega^C_g = \int_{l^N_g}^{l^N_g + \Delta l^N_g} \frac{\partial \bar{A}^C_g(u)}{\partial q}\, du = -\int_{l^N_g}^{l^N_g + \Delta l^N_g} \frac{1}{1-u}\left[\alpha_g(u) + A_g(u) - \bar{A}^C_g(u)\right] du.$$

According to this expression, changes in the employment composition have ambiguous consequences for the commodity sector average efficiency. The magnitude of this compositional effect is related to the difference between the sector-specific efficiency of sector-switchers, $\alpha_g(l^N_g) + A_g(l^N_g)$, and sector-stayers, $\bar{A}^C_g(l^N_g)$. In Figures 6 and 7, the average efficiency in the commodity sector is increasing in $q$. Thus, an employment expansion triggers the absorption of individuals with lower sector-specific efficiency, reducing the average wage in the commodity sector.

18 This follows immediately from $\Delta l^N_g < 0$ and the fact that $\alpha_g(\cdot)$ is increasing.

Taking the commodity price increase as a demand-driven shock to sector-specific task prices such that $\Delta\omega^C_g > \Delta\omega^N_g \geq 0$,19 Proposition 1 can be readily used to interpret Facts 2 and 3 in Section 3. Consistent with Fact 2, the model predicts a positive response in the commodity sector employment share and an ambiguous response of the commodity sector wage premium. Through the lens of the Two-Sector Roy Model, this pattern of adjustment is related to the existence of compositional effects implied by the between-sector worker reallocation. This compositional effect may reinforce or diminish the direct effect of the task price change. If there is positive selection into the non-commodity sector (Figure 6), then the compositional effect increases the average wage in the non-commodity sector, as, for example, in the weak response of the commodity sector wage premium for HSD. However, to the extent that workers are negatively selected into the non-commodity sector (Figure 7), the compositional effect reinforces the direct price effect, as, for example, in the strong response of the commodity sector wage premium for HSG. Despite these mixed wage responses at the sector level, the model also predicts, as in Fact 3, a gain in the average wage of all worker groups. At the group level, the change in the average wage combines the changes in sector-specific task prices using the sectoral allocation of workers. This creates a link between the magnitude of the average wage response and the sector employment composition.

In order to move from this qualitative analysis to a quantitative examination of the impact of the commodity price shock on the Brazilian wage structure, it is necessary to recover the schedules of comparative advantage, $\alpha_g(\cdot)$, and absolute advantage, $A_g(\cdot)$.
To be more precise, the knowledge of these schedules allows the counterfactual decomposition of the log-wage variance based on equation (10), permitting the quantification of the commodity price shock contribution to the wage dispersion behavior in Brazil (Fact 1).

5 Identification of Comparative and Absolute Advantage

This section investigates the nonparametric identification of the schedules of comparative and absolute advantage in the Two-Sector Roy Economy. This result establishes a mapping between population data and the unknown functions $\alpha_g(\cdot)$ and $A_g(\cdot)$ without imposing restrictions beyond those implied by the theory. In this sense, the result represents an important first step in the empirical analysis of the model. It indicates the source of variation in the data that uncovers the main economic relation of interest, focusing on robust predictions of the model. Indeed, as noted by Matzkin (2007), the credibility of empirical results would be significantly hindered if identification could only be achieved under restrictive parametric assumptions.

5.1 Assumptions

In order to establish identification of comparative and absolute advantage in the Two-Sector Roy Economy of Section 4, it is necessary to make additional assumptions regarding observable labor market outcomes as well as their relation to unobservable variables.

19 For a particular worker group $g$ in microregion $r$, assume that changes in sector-specific task prices are proportional to the exposure to the commodity price shock: $\Delta\omega^k_{g,r,t} = \beta^k_g \cdot \Delta Z_{g,r,t}$ such that $\beta^C_g > \beta^N_g \geq 0$. If the commodity price shock affects sector task demand, then this assumption can be interpreted as the linear projection of the change in sector-specific task prices, $\Delta\omega^k_{g,r,t}$, on the shock exposure, $\Delta Z_{g,r,t}$.

Panel of Segmented Labor Markets. Consider a country constituted of regions, $r$, in different years, $t$. Each region-year pair is a segmented labor market equivalent to the Two-Sector Roy Economy of Section 4. Assume that sector-specific task prices are observable variables denoted by $(\omega^C_{g,r,t}, \omega^N_{g,r,t})$.20 In equilibrium, these prices are jointly determined by the supply of and demand for sector-specific tasks. Let $D_{g,r,t}$ denote a vector of task demand conditions in the market, including, among other variables, final product prices and shifters of production technology and trade costs. In addition, assume that task supply is determined by a skill distribution satisfying the following regularity conditions.

Assumption 1. [Skill Distribution] Suppose individual $i$ of group $g$ in region $r$ at year $t$, $i \in I_{g,r,t}$, independently draws $(s_g(i), a_g(i))$ from a bivariate distribution such that the following decomposition holds.

i. Comparative Advantage:

$$s_g(i) = \tilde{s}_g(i) + \tilde{u}_{g,r,t} \quad \text{and} \quad \{\tilde{s}_g(i)\} \sim F_g(s)$$

where $\tilde{u}_{g,r,t}$ is a group-region-time shifter of comparative advantage.

ii. Absolute Advantage:

$$\{a_g(i) \mid \tilde{s}_g(i) = s\} \sim H_{g,r,t}(a \mid s) = \mu H^a_g(a \mid s) + (1-\mu) H^e_{g,r,t}(a)$$

where $H^e_{g,r,t}(a) \equiv H^e(a \mid \tilde{u}_{g,r,t}, \theta_{g,r,t})$ is a group-region-time mixing distribution of absolute advantage such that $\tilde{v}_{g,r,t} \equiv (1-\mu) \int a \, dH^e_{g,r,t}(a)$.

The regularity conditions in Assumption 1 guarantee that the pattern of sector selection is identical in every market. Intuitively, they impose parallel curves of potential sector wage across both regions and years, while leaving the shape of these curves completely unrestricted. In terms of the notation introduced in Section 4, Assumption 1 implies that the schedules of comparative and absolute advantage are respectively given by

$$\alpha_{g,r,t}(q) = \alpha_g(q) + \tilde{u}_{g,r,t} \qquad (14)$$

$$A_{g,r,t}(q) = A_g(q) + \tilde{v}_{g,r,t} \qquad (15)$$

where $\alpha_g(q) \equiv (F_g)^{-1}(q)$ and $A_g(q) \equiv \mu \int a \, dH^a_g\!\left(a \mid \alpha_g(q)\right)$.

Equations (14)-(15) attest to the importance of Assumption 1 for its dimensionality reduction implication: cross-market heterogeneity in labor market outcomes is generated by shocks to sector task supply in $(\tilde{u}_{g,r,t}, \tilde{v}_{g,r,t})$ and sector task demand in $Z_{g,r,t}$. In this context, $\tilde{u}_{g,r,t}$ is a shifter of comparative advantage, representing in the model a supply-driven shock to sector employment across markets. In addition, $\tilde{v}_{g,r,t}$ is a shifter of the overall task efficiency implied by the mixing distribution $H^e_{g,r,t}(a)$.

20In this Section, I treat sector-specific task prices as, for all purposes, observable variables determined in the equilibrium of the Two-Sector Roy Economy. Section 6.1 provides a methodology to estimate task prices based on the model’s predicted relation between wage growth and initial sector employment across quantiles of the wage distribution.

Notice that this market-specific component of the absolute advantage distribution is independent of comparative advantage, capturing supply shocks to labor efficiency not directly related to sector employment.21

To move towards an empirical application of the model in Section 4, I make an additional assumption regarding the structure of the supply shocks $(\tilde{u}_{g,r,t}, \tilde{v}_{g,r,t})$. In particular, I assume that they combine observable and unobservable components as follows.

Assumption 2. [Error Structure] Assume that the shifters of comparative and absolute advantage, $(\tilde{u}_{g,r,t}, \tilde{v}_{g,r,t})$, can be written as

$$\tilde{u}_{g,r,t} = X_{g,r,t}\gamma^u_g + u_{g,r,t} \quad \text{and} \quad \tilde{v}_{g,r,t} = X_{g,r,t}\gamma^v_g + v_{g,r,t}$$

where $X_{g,r,t}$ is an observable vector of group-region-year variables, and $(u_{g,r,t}, v_{g,r,t})$ is an unobservable vector of group-region-year supply shocks. Also, normalize the shifters so that $E[\tilde{u}_{g,r,t}] = E[\tilde{v}_{g,r,t}] = 0$.

Instrument: Sector Demand Shifter. As discussed above, labor market outcomes in equilibrium are determined by both supply and demand conditions in each region-year pair. Accordingly, in order to recover the task supply structure represented by the schedules of comparative and absolute advantage in $\alpha_g(\cdot)$ and $A_g(\cdot)$, an observable shifter of sector task demand that varies across regions and years is necessary. Hence, I assume the existence of an observable subvector $Z_{g,r,t}$ in the vector of sector demand shifters, $D_{g,r,t}$, that satisfies the following exogeneity restriction.

Assumption 3. [Exogeneity] $E\left[u_{g,r,t} \mid Z_{g,r,t}, X_{g,r,t}\right] = E\left[v_{g,r,t} \mid Z_{g,r,t}, X_{g,r,t}\right] = 0$.

Assumption 3 is the usual exogeneity restriction of an instrument: conditional on $X_{g,r,t}$, $Z_{g,r,t}$ is mean independent of unobservable shocks to sector task supply, $(u_{g,r,t}, v_{g,r,t})$. In this case, the instrument must be a regional shock to sector demand that is uncorrelated with regional shifters of comparative and absolute advantage. Following the intuition of the comparative statics exercise in Section 4, $Z_{g,r,t}$ generates exogenous variation in sector-specific task prices whose quantitative implications for sector employment and wages are directly connected to the schedules of comparative and absolute advantage. As is standard in the nonparametric identification literature, I impose an additional completeness condition on the instrument vector, $Z_{g,r,t}$.

Assumption 4. [Completeness] For any $f(\cdot)$ with finite expectation, $E\left[f(l^N_{g,r,t}, X_{g,r,t}) \mid Z_{g,r,t}, X_{g,r,t}\right] = 0$ implies that $f(l^N_{g,r,t}, X_{g,r,t}) = 0$ almost surely.

Assumption 4 is the equivalent of a rank requirement in the context of nonparametric models. Newey and Powell (2003) show that it is necessary and sufficient for identification of nonparametric instrumental variable models of the class considered in this paper. Intuitively, this condition guarantees that the instrument $Z_{g,r,t}$ induces enough variation in the endogenous sector composition $l^N_{g,r,t}$ to uniquely discriminate the function consistent with the underlying data generating process.

21 For group $g$, the heterogeneity in the skill distribution across markets is controlled by $\tilde{u}_{g,r,t}$ and $\theta_{g,r,t}$. The term $\tilde{u}_{g,r,t}$ affects market-specific components of both comparative and absolute advantage, allowing for correlated supply-driven movements in sector employment and overall labor efficiency. The vector $\theta_{g,r,t}$ allows for a rich pattern of variation in labor efficiency across markets, capturing differences in the mass of workers at various ranges of absolute advantage through the distribution $H^e_{g,r,t}(a)$. Hence, at a particular part of the income distribution, variation in $\theta_{g,r,t}$ introduces supply-driven shocks to income growth across markets.

5.2 Identification of Comparative and Absolute Advantage

Under Assumptions 1 and 2, the conditions determining the employment share and the average wage in the non-commodity sector (i.e., equations (5)-(6)) immediately imply that

$$\left[\omega^N_{g,r,t} - \omega^C_{g,r,t}\right] = \alpha_g\!\left(l^N_{g,r,t}\right) + X_{g,r,t}\gamma^u_g + u_{g,r,t} \qquad (16)$$

$$\left[\bar{Y}^N_{g,r,t} - \omega^N_{g,r,t}\right] = \bar{A}^N_g\!\left(l^N_{g,r,t}\right) + X_{g,r,t}\gamma^v_g + v_{g,r,t} \qquad (17)$$

where, by construction, $A_g(q) = \frac{\partial}{\partial q}\left[q \cdot \bar{A}^N_g(q)\right]$.

The variation in the instrument vector $Z_{g,r,t}$ creates variation in the sector employment composition that is orthogonal to variation in the shifters of the skill distribution across markets. As in the labor market adjustment to demand shocks described by Proposition 1 of Section 4, this exogenous demand variation gives rise to responses in sector employment and sector average wage that are quantitatively connected to the economy’s skill distribution. To be more precise, the schedule of comparative advantage, $\alpha_g(\cdot)$ in equation (16), controls the magnitude of the change in the relative task price associated with the change in sector employment implied by the exogenous demand shock. The subsequent change in the composition of workers employed in the non-commodity sector triggers a change in the sector’s average efficiency as measured by $[\bar{Y}^N_{g,r,t} - \omega^N_{g,r,t}]$. In equation (17), this compositional effect is controlled by the function $\bar{A}^N_g(\cdot)$, which is directly related to the schedule of absolute advantage, $A_g(\cdot)$. In line with this discussion, an instrument satisfying Assumptions 3-4 identifies $\alpha_g(\cdot)$ and $A_g(\cdot)$ respectively from equations (16) and (17).

Proposition 2. Consider a panel of segmented labor markets such that Assumptions 1-2 hold. If there is an instrument vector $Z_{g,r,t}$ satisfying Assumptions 3-4, then the schedules of comparative advantage, $\alpha_g(\cdot)$, and absolute advantage, $A_g(\cdot)$, are nonparametrically identified respectively from equations (16) and (17).

Proof. As a direct application of Lemma 1 in Appendix D.1, the functions $\alpha_g(\cdot)$ and $\bar{A}^N_g(\cdot)$ are identified from equations (16) and (17) with the instrument $Z_{g,r,t}$. Using the definition of $\bar{A}^N_g(\cdot)$, we immediately identify $A_g(\cdot)$ such that $A_g(q) = \frac{\partial}{\partial q}\left[q \cdot \bar{A}^N_g(q)\right]$.
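To illustrate the logic of Proposition 2 in the simplest possible case, the sketch below simulates a cross-section of markets and recovers the slope of a log-odds comparative advantage schedule from equation (16) with an instrumental variable estimator. It is a stylized example under stated assumptions, not the paper’s estimator: the linear relative task demand curve, all parameter values, and the function names are hypothetical, controls $X_{g,r,t}$ are omitted, and the schedule is restricted to a single slope parameter rather than left nonparametric.

```python
import random

def covariance(x, y):
    """Sample covariance (population normalization)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / len(x)

def simulate_markets(n, alpha=0.5, delta=0.5, beta=1.0, seed=0):
    """Simulate (relative task price, log-odds of l^N, demand shifter) across markets.

    Supply, equation (16) without controls:  price = alpha * logodds + u.
    Demand, hypothetical and assumed linear: price = beta * z - delta * logodds.
    """
    rng = random.Random(seed)
    prices, logodds, shifters = [], [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)   # exogenous relative task demand shifter
        u = rng.gauss(0.0, 0.3)   # unobserved supply shock, independent of z
        x = (beta * z - u) / (alpha + delta)  # equilibrium log-odds of l^N
        prices.append(alpha * x + u)
        logodds.append(x)
        shifters.append(z)
    return prices, logodds, shifters

prices, logodds, shifters = simulate_markets(n=5000)
ols_slope = covariance(prices, logodds) / covariance(logodds, logodds)
iv_slope = covariance(prices, shifters) / covariance(logodds, shifters)
print("OLS slope:", round(ols_slope, 3))
print("IV slope: ", round(iv_slope, 3))
```

In this simulated example, the least-squares slope is biased because supply shocks move the relative task price and the employment composition in opposite directions, whereas the instrumental variable estimate uses only the demand-driven variation in employment and lies close to the assumed slope of 0.5.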

In the Two-Sector Roy Economy, comparative and absolute advantage immediately determine efficiency in the task specific to the commodity sector and, consequently, the average efficiency in that sector. As a result, there is one overidentification restriction in the model. To see this, notice that, under Assumptions 1 and 2, equation (7) yields

$$\left[\bar{Y}^C_{g,r,t} - \omega^C_{g,r,t}\right] = \bar{A}^C_g\!\left(l^N_{g,r,t}\right) + X_{g,r,t}(\gamma^u_g + \gamma^v_g) + (v_{g,r,t} + u_{g,r,t}) \qquad (18)$$

where, by definition, $\alpha_g(q) + A_g(q) = -\frac{\partial}{\partial q}\left[(1-q) \cdot \bar{A}^C_g(q)\right]$.

From equation (18), the instrument $Z_{g,r,t}$ satisfying Assumptions 3-4 identifies $\bar{A}^C_g(\cdot)$, which is directly related to the sum of the schedules of comparative and absolute advantage, $\alpha_g(\cdot) + A_g(\cdot)$.

Proposition 3. Consider a panel of segmented labor markets such that Assumptions 1-2 hold. If there is an instrument vector $Z_{g,r,t}$ satisfying Assumptions 3-4, then the sum of the schedules of comparative and absolute advantage, $\alpha_g(\cdot) + A_g(\cdot)$, is nonparametrically identified from equation (18).

Proof. As a direct application of Lemma 1 in Appendix D.1, the function $\bar{A}^C_g(\cdot)$ is identified from equation (18) with the instrument $Z_{g,r,t}$. Using the definition of $\bar{A}^C_g(\cdot)$, we immediately identify $A_g(\cdot) + \alpha_g(\cdot)$ such that $\alpha_g(q) + A_g(q) = -\frac{\partial}{\partial q}\left[(1-q) \cdot \bar{A}^C_g(q)\right]$.

6 Empirical Application

The above results establish identification of the schedules of comparative and absolute advantage using cross-market variation in sectoral labor demand. Armed with these theoretical results, I now turn to the estimation of comparative and absolute advantage from the adjustment of labor market outcomes in Brazilian microregions differentially exposed to international commodity price shocks between 1991 and 2010. The empirical application builds directly upon the reduced-form results of Section 3: precisely, the impact of the commodity price shock on between-sector responses of employment and wages. A detailed description of the data used in this section is given in Appendix A.

6.1 Sector-Specific Task Prices

In order to apply the results of Section 5, it is first necessary to measure sector-specific task prices in each region and year. In this section, I propose a methodology to estimate task prices using readily available information on labor income and employment at the individual level.

In the Two-Sector Roy Economy, individuals are not readily substitutable between sectors, implying that their wages are only exposed to changes in the task price of their own sector of employment. Following task price shocks, this observation implies that, across different parts of the wage distribution, variation in the pre-shock sector employment composition translates into variation in the growth of wages. Intuitively, if all individuals at the bottom of the distribution are employed in the commodity sector, then the wage gain at the bottom is entirely attributed to the change in the task price of the commodity sector. In such a case, an increase in the task price of the non-commodity sector has no impact on the wage of individuals at the bottom of the wage distribution.

To formalize this intuition, let $Y_{g,r,t}(\pi)$ denote the $\pi$-quantile of the log-wage distribution of group $g$ in region $r$ at year $t$. For small shocks, I show in Appendix D.2 that the wage growth between periods $t_0$ and $t$ in quantile $\pi$ of the log-wage distribution is given by

$$\Delta Y_{g,r,t}(\pi) = \Delta\omega^C_{g,r,t} + \left[\Delta\omega^N_{g,r,t} - \Delta\omega^C_{g,r,t}\right] \cdot l^N_{g,r,t_0}(\pi) + \mu_{g,r,t} \cdot \tilde{X}_{g,r,t}(\pi) + \Delta v_{g,r,t}(\pi) \qquad (19)$$

where, at quantile $\pi$ of the log-wage distribution, $l^N_{g,r,t_0}(\pi)$ is the initial employment share of the non-commodity sector, $\tilde{X}_{g,r,t}(\pi)$ is a set of observable controls, and $\Delta v_{g,r,t}(\pi)$ is a productivity shock.22

For each triple of group-region-period, equation (19) implies that changes in task prices can be consistently estimated from the relation across wage distribution quantiles between the initial sector composition, $l^N_{g,r,t_0}(\pi)$, and the wage growth, $\Delta Y_{g,r,t}(\pi)$. In this context, an estimator based on equation (19) relies on the assumption that, conditional on the set of controls $\tilde{X}_{g,r,t}(\pi)$, pre-shock variation in sector employment composition is uncorrelated with variation in labor efficiency shocks among individuals with different levels of labor income in a particular group-region-period. Additionally, such an estimator hinges on a central feature of the Roy Model embedded in equation (19): the indifference of marginal individuals between the two sectors. For small price shocks, sector-switchers are the marginal individuals with the same potential wage in the two sectors, so their reallocation has no first-order impact on the group’s log-wage distribution.23

Armed with the model’s prediction in equation (19), I proceed to estimate task price changes by regressing wage growth between two years on the initial year’s sector employment composition in a set of wage distribution quantiles for each group-region-period. Towards this end, I use Census data to construct $l^N_{g,r,t_0}(\pi)$ and $\Delta Y_{g,r,t}(\pi)$ in various quantiles of the log-wage distribution. In the baseline specification, I divide workers into bins of 1 p.p. width computed from the log-wage distribution quantiles for a group-region-period.24 The implementation of expression (19) allows for a vector of observable variables that vary with the position in the wage distribution. Accordingly, the baseline specification includes the following dummy variables as nonparametric controls: (i) an indicator that the wage percentile is at the bottom, middle or top of the log-wage distribution; (ii) an indicator that the wage percentile is below the federal minimum wage (pre-year and post-year). These dummies capture, for example, differential efficiency gains for workers in distant parts of the wage distribution, and income gains generated by bunching around the minimum wage. In this specification, sector-specific task prices are identified from the variation in pre-shock sector employment in small neighbourhoods of the log-wage distribution of workers in the same group-region-period. As in Section 3, I focus on the sample of male white full-time workers divided in two educational groups, High-School Graduates (HSG) and High-School Dropouts (HSD).

22 In Appendix D.2, I show that equation (19) is generated by a first-order expansion of the implicit equation defining $Y_{g,r,t}(\pi)$. In this context, $\Delta v_{g,r,t}(\pi)$ is a shock to the absolute advantage of individuals spread across quantiles of the log-wage distribution. It is introduced by shocks to $(\tilde{u}_{g,r,t}, \theta_{g,r,t})$ that affect the market-specific mixing distribution of absolute advantage, $H^e_{g,r,t}(a) \equiv H^e(a \mid \tilde{u}_{g,r,t}, \theta_{g,r,t})$ in Assumption 1. The change in the mixing distribution of absolute advantage has consequences for the labor efficiency of individuals at different income levels. As a result, the process generating innovations in $(\tilde{u}_{g,r,t}, \theta_{g,r,t})$ creates, through $\Delta v_{g,r,t}(\pi)$, idiosyncratic shocks to wage growth across quantiles of the wage distribution.

23 Expression (19) is modified whenever there exists a wedge in the potential sector wage of sector-switchers. This is the case in the presence of non-monetary benefits of employment. To the extent that sector-switchers are spread over the wage distribution, the wedge affects wage gains across quantiles. Consistent with this intuition, there is a new term in equation (19) that is proportional to the fraction of sector-switchers among individuals at the $\pi$-quantile of the log-wage distribution. For a detailed analysis of this case, see Appendix D.3.

24 In principle, equation (19) can be implemented with any division of individuals into quantiles; in practice, however, this choice entails a tradeoff. On the one hand, a coarse discretization yields a low number of quantiles with potentially little variation in initial sector employment to precisely estimate task prices. On the other hand, a refined discretization exacerbates measurement error of sector employment in each quantile because of the low number of sampled individuals in each sector. Also, extreme values of income could be generated by measurement error, affecting wage growth in the tails of the wage distribution. With these considerations in mind, the estimation is implemented with 88 percentile bins of 1 p.p. width between the 6th and the 94th percentiles. Below, I show that similar results are obtained if bins of 2 p.p. width are used. See Appendix A for more details on data construction.
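The estimation step implied by equation (19) can be sketched as a within-market regression across percentile bins. The Python example below uses synthetic data and omits the controls $\tilde{X}_{g,r,t}(\pi)$, so it is only an illustration of the idea: the employment-share profile, the noise level, and the "true" task price changes are made-up values, and the function name is hypothetical.

```python
import random

def estimate_task_prices(initial_share_n, wage_growth):
    """OLS of quantile wage growth on the initial non-commodity employment share.

    Under equation (19) without controls, the intercept estimates the change in
    the commodity sector task price and the slope estimates the change in the
    relative task price of the non-commodity sector.
    """
    n = len(initial_share_n)
    mean_x = sum(initial_share_n) / n
    mean_y = sum(wage_growth) / n
    sxx = sum((x - mean_x) ** 2 for x in initial_share_n)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(initial_share_n, wage_growth))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Synthetic market with 88 percentile bins (P6 to P94); all values are assumed.
rng = random.Random(1)
true_d_omega_c, true_d_relative = 0.30, -0.15
shares = [0.20 + 0.70 * k / 87 for k in range(88)]   # initial l^N by bin
growth = [true_d_omega_c + true_d_relative * s + rng.gauss(0.0, 0.02)
          for s in shares]

d_omega_c_hat, d_relative_hat = estimate_task_prices(shares, growth)
print("estimated change in commodity task price:", round(d_omega_c_hat, 3))
print("estimated change in relative task price: ", round(d_relative_hat, 3))
```

The sketch makes the source of identification visible: the estimates are only as precise as the variation in the initial employment share across bins allows, which is the tradeoff behind the choice of bin width.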

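Table 3: Summary Statistics: Estimated Sector-Specific Task Prices, 1991-2010

| | Log change in commodity sector task price, Mean (1) | Log change in commodity sector task price, SD (2) | Log change in non-commodity sector relative task price, Mean (3) | Log change in non-commodity sector relative task price, SD (4) | R2, Mean (5) |
|---|---|---|---|---|---|
| A. High-School Graduates | | | | | |
| 1991 - 2000 | 0.320 | 0.370 | -0.151 | 0.347 | 55.5% |
| 2000 - 2010 | 0.150 | 0.645 | -0.306 | 0.609 | 75.8% |
| B. High-School Dropouts | | | | | |
| 1991 - 2000 | 0.524 | 0.579 | -0.364 | 0.619 | 71.5% |
| 2000 - 2010 | 0.440 | 0.579 | -0.360 | 0.634 | 83.0% |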

Note. Sample of 518 microregions in 1991-2000 and 2000-2010. Statistics are weighted by the microregion share in national population in 1991. Worker sample of male white individuals distributed in 1 p.p. percentile bins of hourly wage (N = 88). Baseline specification includes the following dummies: (i) indicator that income percentile is below the federal minimum wage (pre and post years); (ii) indicator that income percentile belongs to range P6-P30 and P30-P75.
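As a quick arithmetic check on Table 3, the period means in column (1) can be cumulated over 1991-2010, since both periods use the same microregions and the same 1991 population weights. The short Python snippet below only reuses the numbers printed in the table; the variable names are illustrative.

```python
# Mean log changes in the commodity sector task price, Table 3, column (1).
mean_change_commodity = {
    "HSG": {"1991-2000": 0.320, "2000-2010": 0.150},
    "HSD": {"1991-2000": 0.524, "2000-2010": 0.440},
}

# Cumulative 1991-2010 change in log points (100 x log change).
cumulative_log_points = {
    group: round(100 * sum(changes.values()), 1)
    for group, changes in mean_change_commodity.items()
}
print(cumulative_log_points)

# Number of group-region-period triples: 518 microregions, 2 groups, 2 periods.
n_triples = 518 * 2 * 2
print(n_triples)
```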

Table 3 presents the summary statistics of estimated task prices implied by the baseline specification in 2,072 group-region-period triples. Columns (1)-(2) display statistics of the estimated task price in the commodity sector, $\Delta\omega^C_{g,r,t}$, and columns (3)-(4) of the estimated relative task price of the non-commodity sector, $\Delta\omega^N_{g,r,t} - \Delta\omega^C_{g,r,t}$. The commodity sector task price grew robustly in both periods. Between 1991 and 2010, the average increase was 47.0 log-points for HSG and 96.4 log-points for HSD. Simultaneously, the relative task price in the non-commodity sector decreased sharply. Lastly, column (5) reports the average R2 of the estimation in the sample of microregions. A large fraction of the variation in wage growth across quantiles of the earnings distribution is captured by equation (19); in the two periods, the average R2 is above 55% for HSG and 71% for HSD.

To address robustness to implementation choices, Table 4 presents the correlation between task price estimates implied by different specifications of equation (19) and those implied by the baseline specification. Columns (1)-(3) and (5)-(7) indicate a high correlation between estimates obtained with different control sets. Notice that, when minimum wage controls are omitted, estimated task prices are extremely similar to those of the baseline specification. This suggests that quantile range controls absorb much of the variation captured in the minimum wage dummies. Columns (4) and (8) attest that the particular choice of bin width has little impact on estimates: the correlation is above 0.88 between baseline estimates and those obtained with a coarser discretization of 2 p.p. bins.

6.2 Parametric Restrictions: Log-Linear System

The nonparametric identification in Section 5 is an asymptotic result. In practice, limitations in the availability of exogenous variation in the data may prevent the implementation of a fully flexible estimator capable of nonparametrically recovering the functions of interest. In such cases, auxiliary functional form assumptions on $\alpha_g(\cdot)$ and $A_g(\cdot)$ are particularly useful to increase estimation precision. It is important, however, that these parametric assumptions do not impose artificial restrictions on the model. In the Two-Sector Roy Model, it is particularly relevant that functional forms allow for separate roles for comparative and absolute advantage since they are related to distinct predictions of the model.

Table 4: Estimated Sector-Specific Task Prices, Correlation with Benchmark Specification

Columns (1)-(4): commodity sector task price. Columns (5)-(8): non-commodity sector relative task price.

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
|---|---|---|---|---|---|---|---|---|
| A. High-School Graduates: correlation with baseline estimates | 0.776 | 0.855 | 0.973 | 0.926 | 0.807 | 0.874 | 0.969 | 0.916 |
| B. High-School Dropouts: correlation with baseline estimates | 0.802 | 0.914 | 0.960 | 0.886 | 0.812 | 0.912 | 0.960 | 0.893 |
| Baseline controls | | | | | | | | |
| Percentile below federal minimum wage | No | Yes | No | Yes | No | Yes | No | Yes |
| Percentile in bottom, middle or top of wage distribution | No | No | Yes | Yes | No | No | Yes | Yes |
| Discretization of wage distribution | | | | | | | | |
| Bins of 1 p.p. (N = 88) | Yes | Yes | Yes | No | Yes | Yes | Yes | No |
| Bins of 2 p.p. (N = 44) | No | No | No | Yes | No | No | No | Yes |

Note. Sample of 518 microregions in 1991-2000 and 2000-2010. Statistics are weighted by the microregion share in national population in 1991. Baseline estimates based on the discretization of the wage distribution in 88 bins of 1 p.p. width, including indicator dummies of percentile bins below the federal minimum wage (pre and post years) and percentile bins in bottom, middle or top of the wage distribution (P6-P30 and P30-P75).

Accordingly, the benchmark specification in the empirical application below is based on the following log-linear version of system (16)-(17).

Assumption 5. [Log-Linear System] Suppose schedules of comparative and absolute advantage in system (16)-(17) are given by

$$\alpha_g(q) = \alpha_g \cdot \left[\ln(q) - \ln(1-q)\right] \quad \text{and} \quad A_g(q) = \tilde{A}_g + A_g \cdot \ln(q)$$

where $\alpha_g \geq 0$ and $A_g \in \mathbb{R}$.

Assumption 5 imposes constant-elasticity schedules of comparative and absolute advantage. Following the discussion in Section 4, the positive parameter $\alpha_g$ controls the dispersion of comparative advantage, whereas the parameter $A_g$ controls the pattern of variation in average absolute advantage of individuals distributed across quantiles of comparative advantage. In the empirical application, these parametric restrictions are useful for two main reasons. First, they simplify the instrument’s rank requirement since, as discussed below, they allow the use of standard instrumental variable estimators. Second, parameters $\alpha_g$ and $A_g$ can be interpreted as the average elasticity of the comparative and absolute advantage schedules. In other words, these parameters correspond to the local treatment effect induced by the labor demand instrument in the cross-section of markets.25
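The log-linear schedules in Assumption 5 map into simple forms for the sector average efficiency functions that enter equations (17) and (18). The Python sketch below checks numerically that $\bar{A}^N_g(q) = \tilde{A}_g - A_g + A_g \ln q$ and $\bar{A}^C_g(q) = \tilde{A}_g - A_g - (\alpha_g + A_g)\frac{q}{1-q}\ln q - \alpha_g \ln(1-q)$ satisfy the two defining relations $A_g(q) = \partial[q\,\bar{A}^N_g(q)]/\partial q$ and $\alpha_g(q) + A_g(q) = -\partial[(1-q)\,\bar{A}^C_g(q)]/\partial q$. These closed forms are derived here from those relations rather than quoted from the text, and the parameter values and function names are arbitrary.

```python
import math

ALPHA, A_TILDE, A_SLOPE = 0.6, 1.0, -0.3   # arbitrary illustrative parameters

def alpha(q):
    """Comparative advantage schedule under Assumption 5."""
    return ALPHA * (math.log(q) - math.log(1.0 - q))

def absolute(q):
    """Absolute advantage schedule under Assumption 5."""
    return A_TILDE + A_SLOPE * math.log(q)

def abar_n(q):
    """Candidate average efficiency in the non-commodity sector."""
    return A_TILDE - A_SLOPE + A_SLOPE * math.log(q)

def abar_c(q):
    """Candidate average efficiency in the commodity sector."""
    return (A_TILDE - A_SLOPE
            - (ALPHA + A_SLOPE) * (q / (1.0 - q)) * math.log(q)
            - ALPHA * math.log(1.0 - q))

def derivative(f, q, h=1e-6):
    """Central finite-difference approximation to f'(q)."""
    return (f(q + h) - f(q - h)) / (2.0 * h)

def max_gaps(grid=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Largest violation of each defining relation over a grid of q values."""
    gap_n = max(abs(derivative(lambda x: x * abar_n(x), q) - absolute(q))
                for q in grid)
    gap_c = max(abs(-derivative(lambda x: (1.0 - x) * abar_c(x), q)
                    - (alpha(q) + absolute(q)))
                for q in grid)
    return gap_n, gap_c

print(max_gaps())
```

Both gaps are zero up to finite-difference error. Taking first differences of these forms gives a slope of $A_g$ on $\Delta \ln l^N_{g,r,t}$ for the non-commodity sector, and the regressors $\Delta[(l^N_{g,r,t}/l^C_{g,r,t}) \ln l^N_{g,r,t}]$ and $\Delta \ln l^C_{g,r,t}$ for the commodity sector, consistent with equations (21) and (22).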

25 In the spirit of the series estimator proposed by Newey and Powell (2003), the log-linear system could be augmented to include higher-order polynomials. In the limit, such an expansion would nonparametrically recover the functions $\alpha_g(\cdot)$ and $A_g(\cdot)$. Yet, the implementation of this estimator is not feasible in most applications. As pointed out by Newey (2013), the estimation of nonlinearities tends to be accompanied by sharp increases in standard errors, requiring multiple strong instruments. The application in this paper is no exception and, for this reason, the constant-elasticity specification in Assumption 5 is particularly attractive.

It is important to notice that the system in Assumption 5 is a strict generalization of the system obtained under a Frechet distribution of skills, which forms the basis of numerous recent empirical applications of the Roy Model; see, e.g., Hsieh, Hurst, Jones, and Klenow (2013); Burstein, Morales, and Vogel (2015); and Galle, Rodriguez-Clare, and Yi (2015). As discussed in Appendix D.4, the Frechet distribution leads to a similar log-linear system, but it entails a single parameter to control both comparative and absolute advantage. In terms of the system above, the Frechet distribution requires that $\alpha_g = -A_g$ where $\alpha_g < 1$. These restrictions have strong consequences for the model’s predictions regarding the responses of employment and wages following sector demand shocks. Namely, they impose constraints not only on the magnitude of the between-sector reallocation but also on the pattern of selection into both sectors. In fact, $\alpha_g = -A_g$ rules out responses in sector wage differentials, leaving the model unable to replicate the positive impact of commodity prices on the commodity sector wage premium documented in Section 3. In contrast, the more general log-linear system in Assumption 5 contains parameters that separately control comparative and absolute advantage. Specifically, $\alpha_g$ is positive to reflect a positively sloped comparative advantage schedule and $A_g$ is free to capture both positive and negative selection in the absolute advantage schedule.26

6.3 Estimation Procedure

I now propose an estimator for the schedules of comparative and absolute advantage that is directly related to the identification result in Proposition 2. Specifically, I take advantage of the parametric restrictions in Assumption 5 to construct a consistent GMM procedure with moment conditions that use the differential exposure of Brazilian microregions to the variation in international commodity prices in the two period windows of 1991-2000 and 2000-2010. To this end, I combine equations (16)-(17) with the functional forms in Assumption 5 to write the following first-difference system:

$$\Delta\omega^N_{g,r,t} - \Delta\omega^C_{g,r,t} = \alpha_g \cdot \Delta\ln\!\left(l^N_{g,r,t} \big/ l^C_{g,r,t}\right) + \Delta X_{g,r,t}\gamma^u_g + \Delta u_{g,r,t} \qquad (20)$$

$$\Delta\bar{Y}^N_{g,r,t} - \Delta\omega^N_{g,r,t} = A_g \cdot \Delta\ln\!\left(l^N_{g,r,t}\right) + \Delta X_{g,r,t}\gamma^v_g + \Delta v_{g,r,t} \qquad (21)$$

where $X_{g,r,t}$ is a control vector of group-microregion-period variables that include group-microregion fixed effects; and $\Delta\omega^k_{g,r,t}$ is the change in the task price of sector $k$ estimated with the procedure described in Section 6.1.

26 Appendix D.4 provides a detailed discussion on the pattern of sector selection implied by the Frechet distribution. While the restriction of $\alpha_g = -A_g$ is a direct implication of assuming a Frechet distribution, the restriction of $\alpha_g < 1$ is necessary to guarantee a finite supply of tasks in each sector. In Appendix D.4, I propose an alternative skill distribution that delivers the log-linear model in Assumption 5 with the sole restriction of a positively sloped comparative advantage schedule (i.e., $\alpha_g$ positive). Appendix D.4 also discusses the system implied by normally distributed skills, as in Roy (1951), Heckman and Sedlacek (1985), and Ohnsorge and Trefler (2007). Although the normal distribution leads to distinct functional forms, the implied system also entails two parameters that parametrize the slopes of the schedules of comparative and absolute advantage, allowing for a flexible selection pattern.

Also, the overidentification restriction in Proposition 3 delivers an additional equation to the system. Specifically, equation (18) under Assumption 5 implies that

$$\Delta\bar{Y}^C_{g,r,t} - \Delta\omega^C_{g,r,t} = -\left(\alpha_g + A_g\right) \cdot \Delta\!\left[\frac{l^N_{g,r,t}}{l^C_{g,r,t}} \ln l^N_{g,r,t}\right] - \alpha_g \cdot \Delta\ln l^C_{g,r,t} + \Delta X_{g,r,t}\gamma^e_g + \Delta e_{g,r,t} \qquad (22)$$

where $X_{g,r,t}$ is the same control vector as above.

Conditional on the parameter vector $\Theta_g \equiv (\alpha_g, A_g, \gamma^u_g, \gamma^v_g, \gamma^e_g)$, equations (20)-(22) immediately allow the computation of the vector of structural errors: $e_g(\Theta_g) \equiv \left[\Delta u_{g,r,t}, \Delta v_{g,r,t}, \Delta e_{g,r,t}\right]_{r,t}$. Combined with a matrix of instruments $W_g \equiv \left[\Delta Z_{g,r,t}, \Delta X_{g,r,t}\right]_{r,t}$ satisfying Assumption 3, the error vector $e_g(\Theta_g)$ provides moment conditions for the consistent estimation of $\Theta_g$ with the following GMM estimator:

$$\hat{\Theta}_g = \arg\min_{\Theta_g} \; e_g(\Theta_g)' \, W_g \Phi W_g' \, e_g(\Theta_g) \tag{23}$$

where $\Phi$ is a matrix of moment weights.27 As in Section 3, microregions are weighted by their share in the national population of 1991 and standard errors are clustered by microregion to account for serially correlated errors.

To build instruments for the estimation of the parameter vector, I rely on a generalization of the regional exposure to commodity price shocks of Section 3. Precisely, I consider the following set of instruments:

$$\Delta Z_{g,r,t} = \left\{ \Delta F\left( \phi_{g,r}(j) \cdot \ln p_t(j) \right) \right\}_{j \in J^C} \tag{24}$$

where $F(\cdot)$ is a quadratic polynomial function. As in Section 3, $\ln p_t(j)$ is the international log-price of commodity category j in year t; and $\phi_{g,r}(j)$ is the share of industry j in total labor payments of the commodity sector to individuals of group g in microregion r in the initial year of 1991. In the empirical application, the instrument vector contains five major commodity groups: Grains, Soft Agriculture, Livestock, Mining (Metals and Precious Metals), and Energy.28 As in Section 3, the control vector $\Delta X_{g,r,t}$ contains ten macroregion-period dummies, initial commodity sector size controls, and initial labor market conditions interacted with period dummies.29
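The estimator in equation (23) and the instruments in equation (24) can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not in the paper: a single linear equation instead of the three-equation system (20)-(22), no population weights, no clustering, and synthetic data. The function names are hypothetical, and reading $\Delta F$ as the change in the linear and squared terms of $\phi \cdot \ln p$ is one interpretation of the quadratic polynomial.

```python
import numpy as np

def delta_quadratic_exposure(phi, lnp_start, lnp_end):
    """Instruments in the spirit of equation (24) (hypothetical helper).

    phi:       (R, J) initial payroll shares of commodity category j in region r
    lnp_start: (J,) log international prices at the start of the window
    lnp_end:   (J,) log international prices at the end of the window
    Returns an (R, 2J) array: the change in the linear term and in the squared
    term of phi * ln p, i.e. two instruments per commodity category.
    """
    x0 = phi * lnp_start
    x1 = phi * lnp_end
    return np.hstack([x1 - x0, x1**2 - x0**2])

def gmm_solve(y, X, W, Phi):
    """Minimizer of e' W Phi W' e for the linear residual e = y - X theta."""
    A = X.T @ W @ Phi @ W.T
    return np.linalg.solve(A @ X, A @ y)

def two_step_gmm(y, X, W):
    """Two-step GMM: 2SLS weights first, then weights robust to heteroskedasticity."""
    theta1 = gmm_solve(y, X, W, np.linalg.inv(W.T @ W))
    e1 = y - X @ theta1
    S = (W * (e1**2)[:, None]).T @ W          # sum_i e_i^2 w_i w_i'
    return gmm_solve(y, X, W, np.linalg.inv(S))

# Synthetic illustration (not the paper's data): one endogenous regressor x,
# instrumented by a quadratic polynomial of an exogenous shifter z.
rng = np.random.default_rng(0)
R = 20_000
z = rng.normal(size=R)
v = rng.normal(size=R)
u = 0.8 * v + 0.6 * rng.normal(size=R)        # error correlated with x through v
x = z + v
y = 1.0 * x + u                               # true slope is 1.0
theta_hat = two_step_gmm(y, x[:, None], np.column_stack([z, z**2]))
```

On this synthetic sample the estimate should land close to the true slope of 1.0, whereas least squares would be pushed upward by the correlation between x and u.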

6.4 Results

Table 5 presents the structural estimates obtained with the procedure described in Section 6.3. Column (1) reports the structural parameters implied by the estimation of equations (20)-(21) under the

27 In the main specification, I use the two-step optimal GMM weights. The sensitivity analysis in Appendix E shows that similar results are obtained with other matrices of moment weights.

28 In principle, one could use any non-linear function of the six commodity categories. Yet, the inclusion of too many instruments may lead to the well-known weak instrument problem. To maintain a parsimonious number of instruments, I adopt two simplifications. First, I aggregate the two categories with lowest employment participation, Metals and Precious Metals, into a single category, Mining. Second, I consider a quadratic polynomial function, yielding two instruments per commodity category. In Appendix E, I show that similar results are obtained with alternative specifications for the instrument vector.

29 Labor market conditions: quadratic polynomial of per-capita income, share of white individuals, share of employed individuals, share of formal sector employees, share of individuals earning less than the federal minimum wage, and share of social security earners (only HSD). Commodity sector size controls: quadratic polynomial of commodity sector share in group labor income and dummy for commodity sector share in group labor income in the bottom and top deciles of national distribution.

parameter restriction imposed by a skill distribution of the Frechet family (i.e., $\alpha_g = -A_g$). For both groups, estimated parameters indicate a steep schedule of comparative advantage that is associated with a low degree of between-sector employment reallocation.30 Evaluated at the national sector employment composition of 2010, estimates imply that, following a 10% increase in the relative task price in the commodity sector, the share of workers employed in the commodity sector increases from 5.14% to 5.63% for HSG and from 19.94% to 21.65% for HSD.

Table 5: Structural Parameters: Log-Linear System

                                          Frechet model     Log-linear model   Log-linear model
                                          (α_g = −A_g)                         with overidentification
                                          (1)               (2)                (3)

A. High-School Graduates
  α_HSG                                   1.040**           1.051**            0.835**
                                          (0.195)           (0.192)            (0.212)
  A_HSG                                   -1.040**          2.483*             1.966*
                                          (0.195)           (0.976)            (0.935)
  Test of Frechet restriction (p-value)   -                 0.000              0.005

B. High-School Dropouts
  α_HSD                                   0.963**           1.873*             0.916*
                                          (0.151)           (0.773)            (0.399)
  A_HSD                                   -0.963**          -0.888**           -0.727**
                                          (0.151)           (0.146)            (0.142)
  Test of Frechet restriction (p-value)   -                 0.208              0.644

Note. Stacked sample of 518 microregions in 1991-2000 and 2000-2010. Two-Step GMM estimator with microregions weighted by their share in the 1991 national population. Excluded instruments: quadratic polynomial of local exposure to international product prices. Control vector includes macroregion-period dummies, initial commodity sector size controls, and initial labor market conditions interacted with period dummies (full description in Section 6.3). Standard Errors clustered by microregion. ** p<0.01, * p<0.05
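The employment responses quoted before Table 5 follow from equation (20) once controls and structural errors are held fixed. The sketch below is a hedged illustration: the function name is hypothetical, and it assumes that a 10% increase in the relative task price of the commodity sector corresponds to a change of 0.10 log points in $\omega^C - \omega^N$.

```python
import math

def commodity_share_after_shock(l_c, alpha, d_rel_price_c):
    """New commodity-sector employment share implied by equation (20).

    l_c:           initial commodity-sector share of the group's employment
    alpha:         slope of the comparative advantage schedule (alpha_g)
    d_rel_price_c: change in omega^C - omega^N, in log points (assumed convention)
    Controls and structural errors are held fixed.
    """
    log_odds_n = math.log((1.0 - l_c) / l_c)             # ln(l^N / l^C) before the shock
    new_log_odds_n = log_odds_n - d_rel_price_c / alpha  # equation (20) in first differences
    return 1.0 / (1.0 + math.exp(new_log_odds_n))        # back out l^C from the odds

# Column (1) estimates of Table 5 and the 2010 national shares from the text.
hsg = commodity_share_after_shock(0.0514, 1.040, 0.10)
hsd = commodity_share_after_shock(0.1994, 0.963, 0.10)
```

With these inputs the two calls return roughly 0.0563 and 0.2165, in line with the 5.63% and 21.65% reported in the text.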

As discussed above, the Frechet distribution entails a very particular pattern of selection that rules out responses in sector wage differentials. Such a feature is inconsistent with the reduced-form evidence for HSG presented in Section 3. To allow for a more flexible behavior of sector average wages, column (2) presents the estimation of the log-linear model in equations (20)-(21). Indeed, this additional degree of freedom is important for HSG, as the estimated absolute advantage parameter has the opposite sign and the Frechet restriction is rejected at usual significance levels. Among HSG, the estimated parameters indicate negative selection into the non-commodity sector with curves of potential sector wages similar to those in Figure 5. In contrast, the estimated parameters for HSD imply qualitative predictions similar to those of the Frechet model, yielding positive selection into both sectors with curves of potential sector wages similar to those in Figure 4. In this case, the Frechet restrictions cannot be rejected at usual significance levels.

Lastly, column (3) presents the implementation of the full estimation procedure that includes the overidentifying equation (22). The overidentification equation yields more precise estimates for HSD and has little effect on estimates for HSG. In this case, comparative advantage parameters indicate between-sector employment reallocation whose magnitude is similar to that of the Frechet model in column (1). Nevertheless, the slope of the absolute advantage schedule is similar to that reported in column (2), indicating a response in the sector wage premium that is positive for HSG and almost zero for HSD.

30 In fact, it is not possible to reject the hypothesis that $\alpha_g = 1$ at the usual significance levels. Thus, estimated parameters suggest higher dispersion of comparative advantage than that allowed by the Roy model with a Frechet distribution of skills.

7 Counterfactual Exercise: Effect of International Commodity Prices on the Wage Distribution

To conclude, I use the estimated schedules of comparative and absolute advantage to investigate the consequences of shocks in international commodity prices for the Brazilian wage distribution. Precisely, I ask: "In 2010, how would the wage structure change if commodity prices fell by 10%?" The counterfactual exercise takes into account the variation in shock exposure across microregions and groups to compute the implied change in the national dispersion of wages.

7.1 First-Step: From Changes in Sector-Specific Task Prices to Changes in the Log-Wage Distribution

As a first step to answer this question, I consider responses in the average and the variance of the log-wage distribution among workers of group g in region r following a change in sector-specific task prices, $(\Delta\omega^C_{g,r}, \Delta\omega^N_{g,r})$, triggered by a sector demand shock (in this case, the change in international commodity prices). To this end, I consider the estimated schedules of comparative and absolute advantage, obtained in Section 6, while holding constant the structure of task supply (i.e., $\Delta\tilde{v}_{g,r} = \Delta\tilde{u}_{g,r} = 0$).

In this environment, equations (20)-(22) immediately deliver the changes in sector employment, $\Delta l^k_{g,r}$, and sector average wages, $\Delta\bar{Y}^k_{g,r}$, implied by the counterfactual change in sector-specific task prices. Together with equation (8) in Section 4, we obtain the counterfactual change in the average log-wage:

$$\Delta\bar{Y}_{g,r} = \underbrace{\Delta\omega^C_{g,r} + \Delta\omega_{g,r} \cdot l^N_{g,r}}_{\text{Direct Effect}} + \underbrace{\alpha_g \log\left(l^N_{g,r}/l^C_{g,r}\right)\Delta l^N_{g,r} - \int_{l^N_{g,r}}^{l^N_{g,r}+\Delta l^N_{g,r}} \alpha_g\left[\log(q) - \log(1-q)\right] dq + \Delta\omega_{g,r}\,\Delta l^N_{g,r}}_{\text{Compositional Effect}} \tag{25}$$

where $\Delta\omega_{g,r} \equiv \Delta\omega^N_{g,r} - \Delta\omega^C_{g,r}$ is the counterfactual change in relative task prices; and $l^N_{g,r}$ is the observed employment share of the non-commodity sector among workers of group g in region r.

In equation (25), the first term is the direct effect on the wage of sector employees if they are unable to reallocate between sectors. Notice that this direct effect depends solely on the pre-shock employment composition, being independent of the estimated parameters. The second term is the compositional effect generated by the reallocation of workers in response to the shock which, intuitively, depends on the slope of the comparative advantage schedule. As noted in Section 4, this compositional effect is negligible for small price shocks since, in such cases, only workers with very similar potential wages in the two sectors choose to switch their sector of employment.
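The two terms of equation (25) can be evaluated for a single group-region cell with the sketch below. It is an illustration under stated assumptions rather than the paper's implementation: the function names are hypothetical, the employment change is taken from equation (20) with task supply held fixed, and the integral uses the closed-form antiderivative $q\log q + (1-q)\log(1-q)$. The inputs are the 2010 national HSD employment share, the column (3) estimate of $\alpha_{HSD}$ in Table 5, and task price changes obtained from the pass-through values of Section 7.2 applied to a change of -0.10 in log commodity prices.

```python
import math

def h(q):
    """Antiderivative of log(q) - log(1 - q)."""
    return q * math.log(q) + (1.0 - q) * math.log(1.0 - q)

def avg_log_wage_change(l_n, alpha, d_omega_c, d_omega_n):
    """Direct and compositional terms of equation (25) for one group-region cell."""
    d_omega = d_omega_n - d_omega_c                 # change in relative task price
    log_odds = math.log(l_n / (1.0 - l_n))          # ln(l^N / l^C) before the shock
    new_log_odds = log_odds + d_omega / alpha       # equation (20), task supply held fixed
    new_l_n = 1.0 / (1.0 + math.exp(-new_log_odds))
    d_l_n = new_l_n - l_n
    direct = d_omega_c + d_omega * l_n
    compositional = (alpha * log_odds * d_l_n
                     - alpha * (h(new_l_n) - h(l_n))
                     + d_omega * d_l_n)
    return direct, compositional

# National HSD cell: l^N = 1 - 0.1994, alpha = 0.916, task prices fall by 0.11 and 0.025.
direct, compositional = avg_log_wage_change(l_n=0.8006, alpha=0.916,
                                            d_omega_c=-0.11, d_omega_n=-0.025)
```

For this cell the direct effect is about -0.042 and the compositional effect is positive but below 0.001, which illustrates why the parameter-free direct effect dominates. The sum, roughly -0.041, is in the neighborhood of the HSD entry in column (1) of Table 7, although that table aggregates microregion-level responses rather than evaluating a single national cell.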

To compute the change in log-wage variance among workers of group g in region r, we rely on the model's predicted response obtained from equation (10) in Section 4:

$$\begin{aligned}
\Delta V_{g,r} ={}& \left(l^N_{g,r} + \Delta l^N_{g,r}\right)\left(l^C_{g,r} + \Delta l^C_{g,r}\right) \cdot \left(\bar{Y}^C_{g,r} - \bar{Y}^N_{g,r} + \Delta\bar{Y}^C_{g,r} - \Delta\bar{Y}^N_{g,r}\right)^2 - l^N_{g,r}\, l^C_{g,r} \cdot \left(\bar{Y}^C_{g,r} - \bar{Y}^N_{g,r}\right)^2 \\
&+ \int_{l^N_{g,r}+\Delta l^N_{g,r}}^{l^N_{g,r}} \left\{ \left[(\alpha_g + A_g)\log(q) - \alpha_g\log(1-q)\right]^2 - \left[A_g\log(q)\right]^2 \right\} dq - \sum_{k=C,N} \left\{ \left(l^k_{g,r} + \Delta l^k_{g,r}\right) \bar{A}^k_g\left(l^N_{g,r} + \Delta l^N_{g,r}\right)^2 - l^k_{g,r}\, \bar{A}^k_g\left(l^N_{g,r}\right)^2 \right\}
\end{aligned} \tag{26}$$

where $\bar{Y}^C_{g,r} - \bar{Y}^N_{g,r}$ is the commodity sector wage premium that is observed among workers of group g in region r; and $\bar{A}^k_g(\cdot)$ is the schedule of average efficiency in sector k estimated in Section 6 (i.e., the right-hand side of equations (21)-(22)).

The log-wage variance has a between-sector component that responds to the change in the sector wage premium, as captured by the term in the first row of equation (26). The term in the second row captures the variance in log-wages within the two sectors of the economy. This is directly related to the estimated schedules of comparative and absolute advantage that govern the dispersion of average sector-specific efficiency across the comparative advantage quantiles allocated to each sector.

7.2 Second-Step: Pass-Through from Changes in Commodity Prices to Changes in Sector-Specific Task Prices

In order to implement equations (25)-(26), it is necessary to evaluate how the change in commodity prices translates into changes in sector-specific task prices for each group and microregion. In this section, I focus on the reduced-form pass-through of the shock exposure to sector-specific task prices as captured by the following equation:

$$\Delta\omega^k_{g,r,t} = \beta^k_g \cdot \Delta Z_{g,r,t} + \Delta X_{g,r,t}\gamma^k_g + \Delta e_{g,r,t} \tag{27}$$

where $X_{g,r,t}$ is the control vector of group-microregion-period variables described in Section 6.3. As above, microregions are weighted by their share in the national population of 1991 and standard errors are clustered by microregion.

Table 6 reports the estimation of equation (27) in the sample of Brazilian microregions in 1991-2000 and 2000-2010. Estimates in columns (1)-(4) indicate that exposure to the commodity price shock is associated with a higher task price in the commodity sector. With the full control vector used in the structural estimation, the estimated pass-through is 0.9-1.2 for HSG and 1.2-1.3 for HSD. In addition, columns (5)-(8) investigate the relation between shock exposure and non-commodity sector task prices, implying a pass-through of 0.25-0.35 for HSG and 0-0.10 for HSD. In response to higher commodity prices, these estimates suggest stronger gains in the commodity sector task price which, in turn, imply an increase in the relative task price of the commodity sector. Combined with the commodity sector employment expansion, this change in sector-specific task prices yields the structural estimates above.

In the counterfactual analysis, I calibrate the pass-through from commodity prices to sector-specific task prices using the estimates in Table 6. To focus on supply-driven movements in inequality, I assume that the pass-through is identical for HSG and HSD. Taking an intermediate value of the estimates for the two groups, the pass-through from international commodity prices to task prices is set to 1.10 in the commodity sector and 0.25 in the non-commodity sector.

Table 6: Exposure to Commodity Price Shocks and Sector-Specific Task Prices

Dependent variable:                                       Change in task price of commodity sector     Change in task price of non-commodity sector
                                                          (1)       (2)       (3)       (4)            (5)       (6)       (7)       (8)

A. High-School Graduates
  Commodity price shock                                   1.407**   1.120**   1.096**   0.962**        0.354**   0.323**   0.324**   0.282**
                                                          (0.394)   (0.390)   (0.392)   (0.359)        (0.071)   (0.069)   (0.069)   (0.062)
  R2                                                      0.524     0.570     0.585     0.598          0.558     0.585     0.586     0.592

B. High-School Dropouts
  Commodity price shock                                   1.599*    1.292*    1.268*    1.274*         -0.056    0.028     0.008     -0.014
                                                          (0.730)   (0.571)   (0.574)   (0.571)        (0.112)   (0.092)   (0.091)   (0.089)
  R2                                                      0.644     0.689     0.690     0.690          0.461     0.555     0.576     0.576

Structural Estimation Controls
  Initial commodity sector size controls                  Yes       Yes       Yes       Yes            Yes       Yes       Yes       Yes
  Initial labor market conditions x period dummies        No        Yes       Yes       Yes            No        Yes       Yes       Yes

Additional Controls
  Initial commodity sector size controls x period dummies No        No        Yes       Yes            No        No        Yes       Yes
  Initial size of manufacturing sector x period dummies   No        No        No        Yes            No        No        No        Yes

Note. Stacked sample of 518 microregions in 1991-2000 and 2000-2010. All regressions are weighted by the microregion share in the national population in 1991 and include ten macroregion-period dummies. Structural estimation controls described in Section 6.3. Standard errors clustered by microregion. ** p<0.01, * p<0.05
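Equation (27) is a population-weighted regression of task price changes on shock exposure. A minimal sketch of the weighted least squares step is given below; it uses hypothetical toy data, replaces the control vector with a simple intercept, and does not compute the clustered standard errors reported in Table 6.

```python
import numpy as np

def weighted_least_squares(y, X, w):
    """Coefficients minimizing sum_i w_i * (y_i - x_i'b)^2."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

# Hypothetical toy data: five regions, exposure dz, intercept in place of controls.
dz = np.array([0.00, 0.05, 0.10, 0.20, 0.40])
pop_share = np.array([0.40, 0.30, 0.15, 0.10, 0.05])   # population weights
d_omega = 0.02 + 1.10 * dz                              # exact line with slope 1.10
X = np.column_stack([np.ones_like(dz), dz])
beta = weighted_least_squares(d_omega, X, pop_share)    # intercept and pass-through
```

Because the toy outcome lies exactly on a line, the second coefficient recovers the slope of 1.10 regardless of the weights; with noisy data the weights would tilt the fit towards populous microregions.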

7.3 Counterfactual Predictions

For any microregion r in 2010, we are interested in the counterfactual change of the wage distribution implied by a 10% decrease in international commodity prices. Using the pass-through obtained in Section 7.2, I compute the change in sector-specific task prices triggered by the price shock:

$$\Delta\omega^k_{g,r} = \beta^k_g \cdot (-0.10)$$

where $\beta^C_g = 1.10$ and $\beta^N_g = 0.25$.

For every group g and region r, equations (25)-(26) yield the counterfactual change in the average and the variance of log-wages conditional on the estimated parameters of comparative and absolute advantage, $(\alpha_g, A_g)$, and the counterfactual changes in sector-specific task prices, $(\Delta\omega^C_{g,r}, \Delta\omega^N_{g,r})$. Table 7 reports these counterfactual responses at the national level, where the aggregate log-wage variance is computed with the total variance formula and the microregion's employment share in 2010. Table 7 also reports the bootstrapped 90% confidence intervals associated with such changes.31
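As a rough check on magnitudes, the calibration can be worked through by hand. The lines below assume that the 10% price decrease enters as a change of -0.10 in log commodity prices, and they use the 2010 national commodity-sector employment shares reported in Section 6.4 (5.14% for HSG and 19.94% for HSD). Only the direct effect in equation (25) is evaluated, at national shares, so this is an approximation to the microregion-level aggregation and not a replication of it.

$$\begin{aligned}
\Delta\omega^C &= 1.10 \times (-0.10) = -0.110, \qquad \Delta\omega^N = 0.25 \times (-0.10) = -0.025, \\
\Delta\omega &\equiv \Delta\omega^N - \Delta\omega^C = 0.085, \\
\text{HSG:}\quad \Delta\omega^C + \Delta\omega \cdot l^N &= -0.110 + 0.085 \times 0.9486 \approx -0.029, \\
\text{HSD:}\quad \Delta\omega^C + \Delta\omega \cdot l^N &= -0.110 + 0.085 \times 0.8006 \approx -0.042.
\end{aligned}$$

The gap between the two direct effects, $0.085 \times (0.9486 - 0.8006) \approx 0.013$, comes entirely from the difference in sectoral employment composition between the two groups.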

As shown in column (1), the fall in commodity prices causes average wage losses for the two worker groups. Yet, the wage loss is more pronounced among HSD due to their higher employment share in the commodity sector: the 2010 national employment share of the commodity sector was 5.1% for HSG and 19.9% for HSD. Consequently, the 10% decrease in commodity prices leads to an increase in the HSG wage premium of approximately 1.2%. Notice that confidence intervals are extremely tight, reflecting the quantitative dominance of the parameter-free direct effect in equation (25).

Columns (2)-(4) present the percentage change in the between component of the log-wage variance implied by the shock.32 Column (2) shows the increase in wage dispersion associated with the change in average regional wages. Because areas specialized in commodity production had lower wages in 2010, the negative shock triggers an increase in the log-wage dispersion within both groups. At the national level, this effect is reinforced by the higher shock exposure of low-wage workers in the group of HSD. Column (3) shows the effect of the shock on the log-wage variance within groups and regions computed with equation (26). For HSG, the negative impact of the shock on the commodity sector wage differential leads to an increase in the log-wage variance. In contrast, the weaker response for HSD leads to a decrease in the log-wage variance. Lastly, column (4) combines these components to obtain the counterfactual change in total wage dispersion: a 10% decrease in commodity prices implies a 4.5% increase in the Brazilian log-wage variance.

Table 7: Effect of 10% decrease in commodity prices, benchmark estimates

                                Change in group       Percentage change in the between component of log-wage variance
                                average wage          between-region      between-sector         Total
                                (national average)    within-group        within region-group
                                (1)                   (2)                 (3)                    (4)

Panel A. High-School Graduates  -2.94                 1.09                11.09                  12.18
                                [-2.94; -2.92]        [1.09; 1.09]        [-2.35; 48.31]         [-1.26; 49.40]

Panel B. High-School Dropouts   -4.13                 3.26                -3.72                  -0.47
                                [-4.15; -4.02]        [3.10; 3.34]        [-12.88; 13.65]        [-9.54; 16.45]

Panel C. All Workers            -3.52                 2.76                1.71                   4.48
                                [-3.52; -3.51]        [2.57; 2.80]        [-2.14; 12.22]         [0.62; 14.91]

Note. Initial labor market outcomes in 518 microregions in 2010. Estimated parameters of the log-linear model in column (3) of Table 5. For each worker group, columns (2)-(4) report the percentage change in the between component of log-wage variance implied by the decomposition presented in Figure 1. Pass-through of commodity prices to task prices equal to 1.10 in the commodity sector and 0.25 in the non-commodity sector. Each response is accompanied by its 90% confidence interval computed with the bootstrap procedure described in the main text.

31 To compute confidence intervals, I take R draws from the asymptotic normal distribution of the GMM estimator implied by the estimates in Table 5. With each parameter draw, I compute counterfactual changes in the average and the variance of the log-wage distribution using equations (25)-(26). The confidence interval bounds correspond to the 5th and the 95th percentiles of the counterfactual changes obtained from the R parameter draws. In Table 7, I set R = 300.
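The confidence-interval procedure of footnote 31 (draw parameters from their asymptotic normal distribution, recompute the counterfactual, take the 5th and 95th percentiles) can be sketched generically. The function name and the example statistic are hypothetical: for illustration the statistic is the post-shock commodity employment share of HSD implied by equation (20), and only a single parameter with its reported standard error is drawn, which ignores the covariance across parameters used in the paper.

```python
import math
import numpy as np

def parametric_bootstrap_ci(theta_hat, cov, statistic, n_draws=300, level=0.90, seed=0):
    """Percentile interval for statistic(theta) with theta ~ N(theta_hat, cov)."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(theta_hat, cov, size=n_draws)
    values = np.array([statistic(theta) for theta in draws])
    tail = 100.0 * (1.0 - level) / 2.0
    lower, upper = np.percentile(values, [tail, 100.0 - tail])
    return lower, upper

def hsd_commodity_share(theta):
    """Illustrative statistic: commodity share after a 0.10 rise in omega^C - omega^N."""
    alpha = theta[0]
    log_odds_n = math.log(0.8006 / 0.1994) - 0.10 / alpha
    return 1.0 / (1.0 + math.exp(log_odds_n))

# alpha_HSD and its standard error from column (1) of Table 5; R = 300 as in the paper.
lower, upper = parametric_bootstrap_ci([0.963], [[0.151**2]], hsd_commodity_share)
```

The same function applies to any scalar counterfactual: pass the full parameter vector and covariance matrix, and let the statistic evaluate equations (25)-(26) at each draw.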

8 Conclusion

This paper contributes to our understanding of the role of comparative advantage in labor markets in the response of within- and between-wage inequality following international trade shocks. This paper establishes a reduced-form relation between exposure to international price shocks and labor market outcomes that calls for a structure allowing for differential effects depending on the worker's affiliation to groups, regions and sectors. Such a structure is provided by a two-sector Roy model where the distribution of worker heterogeneity in terms of sector-specific efficiency governs the sectoral allocation in various groups and regions. Using robust predictions of the model, I show that this distribution is nonparametrically identified from cross-regional variation in employment and wages induced by exogenous variation in sector labor demand. The application of this novel strategy to Brazilian regions suggests a rich estimated pattern of worker selection that cannot be captured by traditional models. In the counterfactual analysis, the estimates of comparative and absolute advantage attest to the importance of selection forces in the behavior of wage differentials associated with educational groups, regions and sectors.

32 The between component of the log-wage variance is obtained from the log-wage decomposition presented in Figure 2 of Section 3.

References

D. Acemoglu and D. H. Autor. Skills, tasks and technologies: Implications for employment and earnings. Handbook of Labor Economics, 4:1043–1171, 2011.

E. Artuç, S. Chaudhuri, and J. McLaren. Trade Shocks and Labor Adjustment: A Structural Empirical Approach. The American Economic Review, pages 1008–1045, 2010.

D. H. Autor, D. Dorn, and G. H. Hanson. The China Syndrome: Local Labor Market Effects of Import Competition in the United States. The American Economic Review, 103(6):2121–2168, 2013.

D. H. Autor, D. Dorn, G. H. Hanson, and J. Song. Trade Adjustment: Worker-Level Evidence. The Quarterly Journal of Economics, 129(4):1799–1860, 2014.

E. Berman, J. Bound, and S. Machin. Implications of Skill-Biased Technological Change: International Evidence. The Quarterly Journal of Economics, 113(4):1245–1279, 1998.

G. J. Borjas. Self-Selection and the Earnings of Immigrants. The American Economic Review, 77(4):531–553, 1987.

A. Burstein and J. Vogel. International trade, technology, and the skill premium. Manuscript, Columbia University and UCLA, 2015.

A. Burstein, E. Morales, and J. Vogel. Accounting for Changes in Between-Group Inequality. Technical report, Working Paper, 2015.

L. Caliendo, M. Dvorkin, and F. Parro. The Impact of Trade on Labor Market Dynamics. Technical report, National Bureau of Economic Research, 2015.

F. Costa, J. Garred, and J. P. Pessoa. Winners and Losers from a Commodities-for-Manufactures Trade Boom. Technical report, Centre for Economic Performance, LSE, 2014.

A. Costinot. An elementary theory of comparative advantage. Econometrica, 77(4):1165–1192, 2009.

A. Costinot and J. Vogel. Beyond Ricardo: Assignment Models in International Trade. Technical report, National Bureau of Economic Research, 2014.

A. Costinot and J. E. Vogel. Matching and Inequality in the World Economy. Journal of Political Economy, 118(4):747–786, 2010.

R. Dix-Carneiro. Trade liberalization and labor market dynamics. Econometrica, 82(3):825–885, 2014.

R. Dix-Carneiro and B. K. Kovak. Trade reform and regional dynamics: evidence from 25 years of Brazilian matched employer-employee data. Technical report, National Bureau of Economic Research, 2015a.

R. Dix-Carneiro and B. K. Kovak. Trade liberalization and the skill premium: A local labor markets approach. Technical report, National Bureau of Economic Research, 2015b.

F. H. Ferreira, S. Firpo, and J. Messina. A more level playing field? Explaining the decline in earnings inequality in Brazil, 1995-2012. 2014.

E. French and C. Taber. Identification of Models of the Labor Market. Handbook of Labor Economics, 4:537–617, 2011.

J. A. Frias, D. S. Kaplan, and E. Verhoogen. Exports and within-plant wage distributions: Evidence from Mexico. American Economic Review, 102(3):435–40, 2012.

S. Galle, A. Rodriguez-Clare, and M. Yi. Slicing the Pie: Quantifying the Aggregate and Distributional Effects of Trade. UC Berkeley, mimeo, 2015.

R. Gibbons and L. Katz. Does unmeasured ability explain inter-industry wage differentials? The Review of Economic Studies, 59(3):515–535, 1992.

R. Gibbons, L. F. Katz, T. Lemieux, and D. Parent. Comparative Advantage, Learning, and Sectoral Wage Determination. Journal of Labor Economics, 23(4):681–724, 2005.

P. K. Goldberg and N. Pavcnik. Distributional Effects of Globalization in Developing Countries. Journal of Economic Literature, 45(1):39–82, 2007.

G. H. Hanson. The Rise of Middle Kingdoms: Emerging Economies in Global Trade. The Journal of Economic Perspectives, 26(2):41–63, 2012.

J. J. Heckman and B. E. Honore. The Empirical Content of the Roy Model. Econometrica, pages 1121–1149, 1990.

J. J. Heckman and G. Sedlacek. Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self- Selection in the Labor Market. The Journal of Political Economy, 93(6):1077, 1985.

E. Helpman, O. Itskhoki, and S. Redding. Inequality and unemployment in a global economy. Econometrica, 78(4):1239–1283, 2010.

E. Helpman, O. Itskhoki, M.-A. Muendler, and S. J. Redding. Trade and inequality: From theory to estimation. Technical report, National Bureau of Economic Research, 2015.

C.-T. Hsieh, E. Hurst, C. I. Jones, and P. J. Klenow. The Allocation of Talent and US Economic Growth. Technical report, National Bureau of Economic Research, 2013.

R. W. Jones. Income distribution and effective protection in a multicommodity trade model. Journal of Economic Theory, 11(1): 1–15, 1975.

G. Kambourov. Labour market regulations and the sectoral reallocation of workers: The case of trade reforms. The Review of Economic Studies, 76(4):1321–1358, 2009.

B. K. Kovak. Regional Effects of Trade Reform: What is the Correct Measure of Liberalization? The American Economic Review, 103(5):1960–1976, 2013.

D. Lagakos and M. E. Waugh. Selection, Agriculture, and Cross-Country Productivity Differences. The American Economic Review, 103(2):948–980, 2013.

R. Z. Lawrence and M. J. Slaughter. International trade and American wages in the 1980s: giant sucking sound or small hiccup? Brookings Papers on Economic Activity. Microeconomics, pages 161–226, 1993.

R. L. Matzkin. Nonparametric Identification. Handbook of Econometrics, 6:5307–5368, 2007.

C. B. Mulligan and Y. Rubinstein. Selection, investment, and women’s relative wages over time. The Quarterly Journal of Economics, 123(3):1061–1110, 2008.

D. Neal. Industry-specific human capital: Evidence from displaced workers. Journal of Labor Economics, pages 653–677, 1995.

W. K. Newey. Nonparametric Instrumental Variables Estimation. The American Economic Review, 103(3):550–556, 2013.

W. K. Newey and J. L. Powell. Instrumental Variable Estimation of Nonparametric Models. Econometrica, 71(5):1565–1578, 2003.

F. Ohnsorge and D. Trefler. Sorting It Out: International Trade with Heterogeneous Workers. Journal of Political Economy, 115(5):868–892, 2007.

D. Parent. Industry-specific capital and the wage profile: Evidence from the national longitudinal survey of youth and the panel study of income dynamics. Journal of Labor Economics, 18(2):306–323, 2000.

A. D. Roy. Some Thoughts on the Distribution of Earnings. Oxford Economic Papers, 3(2):135–146, 1951.

W. Stolper and P. A. Samuelson. Protection and Real Wages. Review of Economic Studies, 9(1):58–73, 1941.

P. Topalova. Factor Immobility and Regional Impacts of Trade Liberalization: Evidence on Poverty from India. American Economic Journal: Applied Economics, 2(4):1–41, 2010.

E. A. Verhoogen. Trade, quality upgrading, and wage inequality in the Mexican manufacturing sector. The Quarterly Journal of Economics, 123(2):489–530, 2008.

R. Wacziarg and J. S. Wallack. Trade liberalization and intersectoral labor movements. Journal of International Economics, 64(2):411–439, 2004.

A. Young. Structural Transformation, the Mismeasurement of Productivity Growth, and the Cost Disease of Services. The American Economic Review, 104(11):3635–3667, 2014.

A Data Construction and Measurement

Commodity Price Data. To capture Brazil’s exposure to world prices of basic products, I build price indices for six commodity categories. The first source of international commodity prices is the Commodity Research Bureau, which publishes price indices by commodity group based on product spot prices in the main exchange markets in the United States. In the paper, I use those groups with sizeable employment participation in Brazil: Grains (corn, soybeans, and wheat), Soft Agriculture (cocoa, coffee, sugar, orange juice and others), Livestock (hides, hogs, lard, steers, tallow), and Metals (copper scrap, lead scrap, steel scrap, tin, zinc). In addition, I build price indices for two commodity groups using futures prices in the New York Mercantile Exchange: Precious Metals (gold and silver) and Energy (crude oil). These series of international nominal prices were converted into local currency using the nominal exchange rate and deflated by the Brazilian consumer price index (IPCA).33 To avoid short-term price volatility, the reduced-form analysis relies on the average price in the six months preceding the process of data collection of the Census. Specifically, I use the average price between March and August of each year.
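The price construction described above (conversion to local currency, deflation by the IPCA, and a March-August average) can be sketched as follows. The function names are hypothetical, and the sketch assumes that the exchange rate is quoted in local currency per dollar and that monthly real prices are stored by calendar month.

```python
def real_local_price(price_usd, exchange_rate, ipca):
    """Nominal dollar price converted to local currency and deflated by the IPCA index."""
    return price_usd * exchange_rate / ipca

def census_window_average(monthly_real_prices):
    """Average of the March-August values; keys are calendar months 1..12."""
    window = [monthly_real_prices[m] for m in range(3, 9)]
    return sum(window) / len(window)

# Hypothetical monthly series for one commodity group in a Census year.
monthly = {m: real_local_price(100.0 + m, 2.0, 4.0) for m in range(1, 13)}
avg_price = census_window_average(monthly)
```

Averaging over the six months before the Census collection smooths out short-lived price spikes that would otherwise enter the exposure measure.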

Labor Market Data. Data on labor market outcomes come from publicly available long versions of the Brazilian Census performed by the Brazilian Institute of Geography and Statistics (IBGE) for the years of 1980, 1991, 2000, and 2010. From the Census, I extract a sample of full-time workers aged between 16 and 64. Full-time workers are defined as those reporting more than 35 weekly worked hours. I restrict the sample to workers with calculated experience between 0 and 39 years. Experience is defined as the individual’s age minus a predicted initial working age that equals 23 for college graduates, 18 for high-school graduates and 15 for those with only primary education. The benchmark sample is further restricted to include only white male workers. This restriction allows us to focus on individuals with strong labor force attachment who are not directly subject to changes in employers' prejudiced attitudes towards female and non-white individuals. In fact, Ferreira, Firpo, and Messina (2014) show that gender and race wage differentials were important drivers of recent changes in Brazilian wage inequality. In robustness exercises, I extend the benchmark sample to also include female and non-white workers.

Individuals in the sample are allocated to sectors according to their self-reported industry of employment. Table A1 shows the industry classification used in this paper together with corresponding industry codes in each year of the Census. I use crosswalk tables publicly provided by the IBGE to link the different activity codes in the Censuses of 1980-1991, 2000 and 2010. The division of industries in the commodity sector accommodates available information on international prices as described above.
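The age, hours and experience restrictions described above can be written as a small filter. This is a sketch with hypothetical names; it assumes that the age and experience bounds are inclusive and leaves out the white-male restriction of the benchmark sample.

```python
# Predicted initial working age by education level, as stated in the text.
INITIAL_WORKING_AGE = {"college": 23, "high_school": 18, "primary": 15}

def potential_experience(age, education):
    """Age minus the predicted initial working age for the education level."""
    return age - INITIAL_WORKING_AGE[education]

def in_sample(age, weekly_hours, education):
    """Full-time workers aged 16-64 with 0-39 years of calculated experience."""
    experience = potential_experience(age, education)
    return 16 <= age <= 64 and weekly_hours > 35 and 0 <= experience <= 39
```

For example, a 30-year-old college graduate working 40 hours has 7 years of calculated experience and is kept, while a 20-year-old college graduate has negative calculated experience and is dropped.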

Table A1: Industry Classification and Census Activity Codes

Industry                                      1980-1991 Census (Atividades)                   2000 Census (CNEA-Dom)                          2010 Census (CNEA-Dom 2.0)

Commodity Sector
Grains (corn, soybeans, wheat)                20; 21; 22                                      1102; 1103; 1107                                1102; 1103; 1107
Soft (coffee, cocoa, sugar, and others)       11; 12; 14-17; 23; 24                           1104; 1105; 1110-1116; 2001; 2002;              1104; 1105; 1110-1116; 10022; 10093
                                                                                              15022; 15042
Livestock (cattle, hogs, and others)          26; 27; 41; 42                                  1201-1205; 1208; 1209; 1300; 1402; 5001;        1201-1205; 1208; 1209; 1402; 1999; 3001;
                                                                                              5002; 15010; 15030                              3002; 10010; 10030
Metals (copper, lead, steel, tin, and zinc)   58                                              13002                                           7002
Precious Metals (gold and silver)             55                                              13001                                           7001
Energy (crude oil)                            51                                              11000                                           6000
Other agriculture and mining                  13; 18-19; 25; 28; 29; 31-37; 50; 52-54; 56;    1101; 1106; 1108; 1109; 1117; 1118; 1206;       1101; 1106; 1108; 1109; 1117-1119; 1206;
                                              57; 59; 581                                     1207; 1401; 10000; 12000; 14001-14004           1207; 1401; 5000; 8001-9000

Non-Commodity Sector
Manufacturing                                 100-300                                         15021; 15041; 15043; 15050; 16000-37000         10021; 10091; 10092; 10099-32999; 38000
Non-Tradable                                  340-901                                         1500; 40010-99000                               1500; 33001-37000; 39000-99000

Regional Labor Markets. As a regional labor market unit, I use the microregion concept created by the IBGE in the 1991 Census. Each of the 558 microregions corresponds to a set of economically integrated municipalities with interconnected labor markets. This definition of regional labor markets in Brazil is similar to Commuting Zones in the United States and has recently been used in a series of papers analyzing the response of local labor markets to aggregate trade shocks (e.g., Kovak, 2013, and Dix-Carneiro and Kovak, 2015a,b). Despite the sharp increase in the number of municipalities between 1991 and 2010, the IBGE maintained the same microregion definition in the Censuses of 1991, 2000 and 2010.34 In the 1980 Census, the microregion variable does not exist, so I create it from existing municipalities in 1980. Because of the change in municipality borders between 1980 and 1991, it is only possible to replicate a subset of the microregions using historical administrative borders. To be more precise, I recover 540 microregions in the 1980 Census compared to the 558 microregions in the 1991 Census.35

33 All commodity price series were downloaded from the Global Financial Database. At the end of 2008, the soft and grains indices were unified under the foodstuff index. Thus, I build these series for 2009-2010 using each index description. Series of nominal exchange rate and IPCA were downloaded from IPEADATA.

Wage Distribution. As a measure of individual wage, I use the self-reported monthly labor income in the main job. Table A2 presents the trends in log wage inequality among male white full-time workers. As discussed in the main text, the variance of log wages has an inverted U-shape in the period: it increased in the 1980s and decreased in the 1990s and 2000s. To verify that the behavior of the log wage variance is not entirely driven by individuals with extreme wage values, I consider the change in the differential between percentiles of the wage distribution. Specifically, the P90-P10 wage gap increased by 40 log-points in 1980-1991 and decreased by 66.3 log-points in 1991-2010. In terms of observable worker characteristics, the wage premia associated with two attributes were particularly important in the wage compression process of 1991-2010: educational and sectoral wage differentials fell respectively by 32.1 and 19.0 log-points in the period. To further investigate the movement in wage inequality, I perform a wage variance decomposition with a regression of log wages on a full set of dummies for years of experience, level of education, sector of employment and microregion of residence. Table A3 presents the result of this decomposition

34 There were 4,491 municipalities in 1991 and 5,565 in 2010. Out of the 1,074 municipalities created in the period, 998 municipalities had parent municipalities in a single microregion and, therefore, they were allocated to this microregion. The other 76 municipalities had parent municipalities in more than one microregion. These municipalities, which represented 0.33% of employment in 2000, were allocated to the microregion of the parent municipality that ceded the highest population share to the new municipality. This procedure adopted by the IBGE minimizes any measurement error implied by the border change. In fact, all results in the paper are robust to using a sample of 491 microregions built by aggregating microregions so as to keep borders unchanged in the period.

35 Out of the 3,991 municipalities in the 1980 Census, I am able to link 3,938 municipalities to at least one of the 4,491 municipalities in the 1991 Census. With these linked municipalities, I construct microregions in 1980 using the microregion assigned to the corresponding municipalities in 1991. The main problem of this method is the existence of new municipalities in 1991 that belonged to a different microregion than their parent municipalities in 1980. This is the case for 85 of the 500 municipalities created between 1980 and 1991, accounting for 0.67% of total employment in 2000.

Table A2: Trends in Brazilian Log Wage Inequality, 1980-2010

| | 1980 | 1991 | 2000 | 2010 |
| --- | --- | --- | --- | --- |
| Variance | 0.893 | 1.095 | 1.017 | 0.781 |
| P90th - P50th | 1.378 | 1.476 | 1.492 | 1.386 |
| P50th - P10th | 0.944 | 1.247 | 1.092 | 0.673 |
| P90th - P10th | 2.322 | 2.722 | 2.584 | 2.060 |
| High-school wage differential | 1.161 | 1.046 | 0.965 | 0.725 |
| Non-commodity sector wage differential | 0.388 | 0.422 | 0.358 | 0.231 |

Note. Sample of male white full-time workers extracted from the Brazilian Census of 1980, 1991, 2000 and 2010. Education and sector wage differentials implied by a regression of log real wage on a full set of dummies for high-school degree, employment in the commodity sector, microregion of residence, and experience (0-39yrs).

between 1980 and 2010. The education-sector-region component is the main driver of wage dispersion, accounting for 49.8% of the 1980-1991 growth and 69.1% of the 1991-2010 decline. Notice that this component's contribution is distributed across all three of its terms. Conclusions in Table A3 are related to results reported elsewhere in the literature. In particular, Ferreira, Firpo, and Messina (2014) highlight the importance of falling educational and state wage gaps for the 1995-2012 decrease in wage inequality observed among workers in the PNAD survey. Using administrative data for formal sector employees, Helpman, Itskhoki, Muendler, and Redding (2015) conclude that observable worker attributes account for roughly half of the increase in log wage variance between 1986 and 1995.

Table A3: Decomposition of the Variance of Log Wages, 1980-2010

| | 1980 | 1991 | 2000 | 2010 |
| --- | --- | --- | --- | --- |
| Overall | 0.893 | 1.095 | 1.017 | 0.781 |
| Residual | 0.481 | 0.593 | 0.581 | 0.524 |
| Between | 0.412 | 0.502 | 0.436 | 0.257 |
| Education-Region-Sector | 0.356 | 0.457 | 0.387 | 0.240 |
| Education | 0.181 | 0.205 | 0.209 | 0.131 |
| Region | 0.059 | 0.107 | 0.073 | 0.055 |
| Sector | 0.027 | 0.029 | 0.017 | 0.006 |
| Covariance (education, region, sector) | 0.089 | 0.117 | 0.088 | 0.049 |
| Experience | 0.095 | 0.077 | 0.099 | 0.059 |
| Covariance (experience, education-region-sector) | -0.040 | -0.033 | -0.050 | -0.042 |

Note. Sample of male white full-time workers extracted from the Brazilian Census of 1980, 1991, 2000 and 2010. Wage decomposition implied by a regression of log real wage on a full set of dummies for high-school degree, employment in the commodity sector, microregion of residence, and experience (0-39yrs).
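The contribution shares quoted in the text can be checked against the rounded entries of Table A3. The short sketch below recomputes the share of the change in overall log wage variance that is attributable to the education-region-sector component; the function and variable names are illustrative and not part of the original analysis.

```python
# Rounded entries of Table A3 (variance of log wages).
overall = {1980: 0.893, 1991: 1.095, 2010: 0.781}
edu_region_sector = {1980: 0.356, 1991: 0.457, 2010: 0.240}


def contribution_share(component, total, start, end):
    """Share of the change in total variance explained by one component."""
    return (component[end] - component[start]) / (total[end] - total[start])


share_1980_1991 = contribution_share(edu_region_sector, overall, 1980, 1991)
share_1991_2010 = contribution_share(edu_region_sector, overall, 1991, 2010)

print(f"1980-1991: {100 * share_1980_1991:.1f}%")
print(f"1991-2010: {100 * share_1991_2010:.1f}%")
```

With the rounded table entries, the shares come out at roughly 50% and 69%, close to the 49.8% and 69.1% reported in the text; the small gap for 1980-1991 presumably reflects rounding in the published table.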

Sample of Microregions in the Reduced-Form Exercise. In the reduced-form exercise, I select a baseline sample of 518 microregions with positive employment in the commodity sector for all groups in the 1991-2010 period, covering 98.4% of the country's population in 1991. The 1980-1991 period is excluded from the baseline sample mainly because of the turbulent economic environment in Brazil during the 1980s. The decade was marked by hyperinflationary episodes, suspension of foreign currency convertibility, and the adoption of restrictive internal controls on prices and wages. In this environment, it is not clear that relative international prices were very informative about the relative prices faced by domestic producers when deciding resource allocation. More normal economic conditions returned after the series of structural reforms implemented in 1993-1994 that brought monetary stabilization, eliminated price controls, and restored full currency convertibility. In any case, robustness exercises attest that similar results hold in the extended sample spanning the entire 1980-2010 period.

Bartik-Instrument Weights. To build the group-region exposure to international commodity prices, I compute total labor income by industry as the weighted sum of monthly wages of individuals reporting to have their main job in the industry, using Census sampling weights. Denote by $Y^k_{g,r,t}(j)$ the total labor payments of industry $j$ to workers of group $g$ in microregion $r$ in year $t$. The initial industry participation in sector labor payments to group $g$ in microregion $r$ is

$$\phi_{g,r}(j) \equiv \frac{Y^k_{g,r,1991}(j)}{\sum_{j' \in J^k} Y^k_{g,r,1991}(j')}$$

where $J^k$ is the set of industries in sector $k$.

Table A4 reports summary statistics of industry composition in the sample of 518 microregions in 1991. Columns (1) and (3) indicate that regions, on average, have a large fraction of their work force allocated to the commodity sector, with agriculture accounting for the bulk of the sector's labor expenditure. Importantly, columns (2) and (4) document great heterogeneity in industry composition across microregions. Comparing columns (1) and (3), it is possible to identify different exposure patterns for the two groups. While HSD are more likely to be employed in the production of grains and soft agricultural items, HSG are more likely to be employed in the production of livestock and crude oil.
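As a minimal sketch of the weight construction described above, the snippet below turns base-year labor payments by industry into within-sector shares for a single group-microregion cell. The industry labels and payment values are hypothetical, and the function name is illustrative rather than taken from the paper.

```python
def industry_weights(payments):
    """Map {industry: base-year labor payments} to shares that sum to one."""
    total = sum(payments.values())
    if total <= 0:
        raise ValueError("sector has no labor payments in the base year")
    return {industry: value / total for industry, value in payments.items()}


# Hypothetical 1991 commodity-sector wage bill for one group-microregion cell,
# already aggregated with Census sampling weights.
payments_1991 = {"grains": 120.0, "soft": 300.0, "livestock": 480.0, "energy": 100.0}

phi = industry_weights(payments_1991)
print(phi)
```

Because the weights are fixed at their 1991 values, later changes in a region's industry mix do not feed back into its measured exposure.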

Table A4: Summary Statistics: Labor Income Share by Industry in Brazil, 1991

| Industry | High-School Graduates: Mean (1) | High-School Graduates: SD (2) | High-School Dropouts: Mean (3) | High-School Dropouts: SD (4) |
| --- | --- | --- | --- | --- |
| 1. Commodity Sector | 9.0% | 9.6% | 21.6% | 19.7% |
| Grains (corn, soybeans, wheat) | 4.7% | 11.6% | 10.1% | 16.3% |
| Soft (coffee, cocoa, sugar and other) | 13.0% | 16.1% | 19.5% | 18.0% |
| Livestock (cattle, hogs, and others) | 35.5% | 21.1% | 26.8% | 15.6% |
| Metals (copper, lead, steel, tin, and zinc) | 3.0% | 7.2% | 1.6% | 4.2% |
| Precious Metals (gold and silver) | 1.0% | 4.0% | 1.8% | 4.7% |
| Energy (crude oil) | 8.4% | 17.1% | 2.3% | 6.2% |
| Other agriculture and mining | 34.3% | 20.7% | 37.9% | 19.8% |
| 2. Manufacturing | 16.1% | 10.7% | 18.1% | 11.3% |
| 3. Non-Tradable Goods and Services | 74.9% | 10.8% | 60.4% | 14.8% |

Note. Sample of male white full-time workers extracted from the Brazilian Census of 1991. Statistics are weighted by the microregion share in national population in 1991.

Regional Labor Market Outcomes. For each triple of group-region-year, I compute the labor market outcomes used in the reduced-form exercise: sector employment, sector average log wage and average log wage. To calculate wage outcomes, I estimate wage regressions separately for each year using the entire sample of workers in the country. Specifically, I regress the log monthly wage on a full set of experience dummies (0-39yrs) interacted with dummies for female and white workers. The residual of this regression corresponds to a wage measure adjusted for variation in these demographic characteristics across groups and microregions. For each triple of group-region-year, the sector average log wage is the weighted average of the adjusted wage among individuals reporting to have their main job in the sector. Lastly, the average log wage is the weighted average of the adjusted wage among all individuals in the triple of group-region-year. In both cases, the computation uses Census sampling weights.

Sector employment corresponds to the sum of efficiency-adjusted hours supplied by sector employees in a group-region-year. First, the efficiency adjustment is performed by multiplying individual weekly hours by a time-invariant measure of relative wage for each cell of sex-race-education-experience.36 Second, the total sector employment, $H^k_{g,r,t}$, is the weighted sum of efficiency-adjusted hours of individuals reporting to have their main job in the sector, computed with Census sampling weights. Thus, the sector employment share is defined as $l^k_{g,r,t} \equiv H^k_{g,r,t} / (H^C_{g,r,t} + H^N_{g,r,t})$.

To obtain total labor supply, I use the aggregate amount of efficiency-adjusted hours in a group-region-year: $\bar{H}_{g,r,t} \equiv H^C_{g,r,t} + H^N_{g,r,t}$. Lastly, the labor supply of immigrants is computed exactly as above in a restricted sample of individuals identified as non-native residents of each microregion.37
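The construction of sector employment can be sketched as follows for one group-region-year cell. This is only an illustration under stated assumptions: the record layout (keys `sector`, `weekly_hours`, `cell_efficiency`, `census_weight`) and the numbers are hypothetical, with `cell_efficiency` standing in for the time-invariant relative wage of the worker's sex-race-education-experience cell.

```python
def sector_hours(workers, sector):
    """Weighted sum of efficiency-adjusted hours of workers whose main job is in `sector`."""
    return sum(
        w["census_weight"] * w["weekly_hours"] * w["cell_efficiency"]
        for w in workers
        if w["sector"] == sector
    )


def employment_shares(workers):
    """Sector employment shares and total labor supply for one group-region-year."""
    h_c = sector_hours(workers, "C")  # commodity sector
    h_n = sector_hours(workers, "N")  # non-commodity sector
    total = h_c + h_n
    return {"C": h_c / total, "N": h_n / total, "total_hours": total}


# Hypothetical micro-records for a single group-region-year cell.
workers = [
    {"sector": "C", "weekly_hours": 40.0, "cell_efficiency": 1.0, "census_weight": 10.0},
    {"sector": "N", "weekly_hours": 44.0, "cell_efficiency": 1.5, "census_weight": 10.0},
    {"sector": "N", "weekly_hours": 30.0, "cell_efficiency": 2.0, "census_weight": 5.0},
]

shares = employment_shares(workers)
print(shares)
```

Restricting `workers` to non-native residents before calling `employment_shares` would give the corresponding immigrant labor supply.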

Task Price Estimation Data. In the structural exercise, the estimation of sector-specific task prices requires initial sector employment and wage growth across percentiles of the wage distribution for each triple of group-region-period. To this end, I compute percentiles of the log hourly wage from the sample of individuals in a group-region-period using Census sampling weights. Individuals are distributed across percentile bins according to their log hourly wage; in the benchmark specification, I consider bins with a width of 1pp. The sector employment share in each percentile bin corresponds to the fraction of efficiency-adjusted hours reported by sector employees in that percentile bin. Additionally, the wage growth in each percentile corresponds to the difference in the average hourly wage of individuals in that percentile bin between two consecutive years. Since extreme wage values are more likely to be generated by measurement error, I ignore the tails of the wage distribution by restricting the estimation to bins between the 6th and the 94th percentiles. Thus, the task price estimation is based on 88 percentile bins for each of the 2,072 group-region-period triples.
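The binning step can be illustrated with a small sketch. It assigns each individual to a 1pp percentile bin using sampling weights and then keeps the bins between the 6th and 94th percentiles. The midpoint-rank convention used to place an individual in a bin is an assumption made for this example, as are the function name and the toy data.

```python
def percentile_bins(log_wages, weights):
    """Assign each individual to a 1pp percentile bin (0-99) of the weighted wage distribution."""
    order = sorted(range(len(log_wages)), key=lambda i: log_wages[i])
    total = sum(weights)
    bins = [0] * len(log_wages)
    cumulative = 0.0
    for i in order:
        # Rank of the individual, measured at the midpoint of their own weight.
        midpoint_rank = (cumulative + 0.5 * weights[i]) / total
        bins[i] = min(int(100 * midpoint_rank), 99)
        cumulative += weights[i]
    return bins


# Bins [6, 7), ..., [93, 94): the tails of the distribution are dropped.
KEPT_BINS = range(6, 94)

# Toy data: 100 individuals with equal weights and distinct wages.
log_wages = [0.01 * k for k in range(100)]
weights = [1.0] * 100

bins = percentile_bins(log_wages, weights)
kept = [b for b in bins if b in KEPT_BINS]
print(len(KEPT_BINS), len(kept))
```

The count of 2,072 triples quoted in the text is consistent with 518 microregions, two groups and two periods.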

Chinese Imports Data. The series of Chinese imports by industry used in the robustness exercise below is constructed from import information at the 2-digit SITC (rev3) industry level in the Comtrade database. I use China's imports from the rest of the world in Food & live animals, Crude material, Mineral fuel, and Animal/Veg Oil. I map the 26 industries in these groups to the six commodity categories in Table A1 (crosswalk available upon request).

36 I consider 48 cells based on two sex groups, two race groups (white and non-white), three educational groups (high-school dropouts, high-school graduates and college graduates) and four experience groups (0-9, 10-19, 20-29 and 30-39 years). For each cell, the relative wage is the average hourly wage divided by the average wage of female non-white high-school dropouts with 0-9 years of experience. The cell weight is the average relative wage across regions and years (1991, 2000, 2010). In the 1980 Census, weekly hours are only reported in ranges. To compute efficiency-adjusted hours, I assign 45 and 54 weekly hours to individuals reporting respectively 40-48 and 49+ hours.

37 Given the information in the Census, I can only identify as microregion natives those individuals satisfying one of two conditions. First, they were born in the same municipality that they currently live in. Second, if they were born in a different municipality, then I also consider as microregion natives those who moved into the current municipality from another municipality in the same microregion during the previous ten years.

B Reduced-Form Evidence: Sensitivity Analysis

Having established the main results in Section 3, I turn to an empirical investigation of the robustness of these results, taking as a starting point the detailed specification with period dummies interacted with macroregion dummies, initial labor market conditions and initial commodity sector size controls.

Additional Controls. Table B1 extends the baseline model with additional demographic controls in the initial period. In columns (2), (6) and (10), I investigate the effect of the increase in employment formalization in Brazil in 1991-2010: the share of formalized employees increased from 72.3% in 1991 to 82.3% in 2010. To the extent that formalization was lower in rural areas, this increase is potentially correlated with regional exposure to commodity price shocks. I address this concern by including the initial share of formal sector employees in the microregion.38 Yet, this control has almost no impact on estimated coefficients. The other two variables refine the set of initial sector composition controls, capturing more flexibly confounding effects related to heterogeneity in regional exposure to sectoral shocks. I introduce the initial share of group labor income in manufacturing industries (columns 3, 7 and 11) and in agriculture and mining products without international prices (columns 4, 8 and 12). By interacting these variables with period dummies, I allow for an arbitrary change in the price of these products relative to the price of internationally traded commodities. In both cases, coefficients are essentially the same for HSG, but they are slightly lower for the wage and sector employment responses of HSD, which nevertheless remain statistically significant at the 5% level.

Additional Period. Table B2 estimates the model in the extended sample spanning the entire period of 1980-2010. As argued above, the peculiar economic conditions in Brazil could potentially weaken the connection between domestic and international commodity prices during the 1980s. Yet, columns (2) and (5) indicate very similar responses in terms of group wages and commodity sector employment. Differences arise for the response of the commodity sector wage differential in column (8). In this case, the coefficient for HSG falls by roughly 40% (from 0.416 to 0.234), moving towards the lower bound of the baseline confidence interval. For HSD, we obtain a higher and more precise coefficient compared to the nonsignificant coefficient implied by the baseline specification. Compared to the period of 1991-2010, the 1980s exhibit another important difference: commodity prices experienced strong losses in the decade. Taking advantage of this qualitatively different price behavior, columns (3), (6) and (9) estimate the model using only within-microregion variation across time; that is, I include microregion-specific time trends in the model. This specification addresses concerns that shock exposure is picking up secular trends in microregions specialized in the commodities with larger price gains in 1991-2010. Although the microregion fixed effects absorb much of the cross-section variation in labor market outcomes, they have little effect on estimated coefficients.39

38 Restricting the sample to formal sector employees creates a selection problem because low-wage individuals are more likely to drop out of the sample, which attenuates wage gains in the period. To avoid this issue, I control for the pre-sample formal sector size, which captures any correlation between production specialization and employment formalization.

39 In omitted exercises, I have estimated the model with microregion fixed effects in the baseline sample of 1991-2010. In this case, standard errors become five to ten times higher. This increase is related to the high correlation in shock exposure between 1991-2000 and 2000-2010: the autocorrelation of shock exposure is 0.734. Consequently, there is little within-

Additional Worker Groups. In Table B3, I extend the sample to include female and non-white individuals. With this exercise, I evaluate whether these additional worker groups exhibit similar behavior in the labor market. This possibility is especially relevant given the large changes in gender and race wage gaps in the period (Ferreira, Firpo, and Messina, 2014). In columns (2), (6) and (10), I include female white individuals without significant changes in estimated coefficients. The inclusion of non-white male individuals entails a more intricate change in estimated coefficients, as shown in columns (3), (7) and (11). Responses of sector employment and wages become weaker for HSG in Panel A, but the opposite is true for HSD in Panel B. These different estimated responses are likely related to differences between white and non-white individuals in terms of unobservable characteristics driving their sectoral allocation. This is particularly important among HSG because of the extremely low high-school graduation rate among non-white individuals in Brazil.

IV: Exposure to Chinese Imports Growth. I now address concerns regarding the endogeneity of relative commodity prices that arises if Brazilian regions are large enough to affect the world price of basic commodities. For example, a negative shock to a region's commodity sector labor supply would simultaneously imply a decrease in sector employment and an increase in world prices, introducing a negative bias in the OLS estimates. Motivated by the strategy in Autor, Dorn, and Hanson (2013), I propose an instrument based on the regional exposure to Chinese imports growth. That is, I instrument $\Delta Z_{g,r,t}$ with the following variable:

$$\Delta M_{g,r,t} = \sum_{j \in J^C} \phi_{g,r}(j) \cdot \Delta \ln M_t(j)$$

where $M_t(j)$ is the total value of Chinese imports of product $j$ in year $t$.

Table B4 presents the estimation of this model by 2SLS. Estimated coefficients for HSG are essentially the same, albeit less precise. For HSD, there is an increase in point estimates, but this change is not significant given the even stronger increase in standard errors.
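The instrument is a weighted sum of log changes in Chinese imports, with weights given by the initial industry shares. A minimal sketch of that computation for one group-microregion cell follows; the weights, import values and function name are all hypothetical and serve only to make the formula concrete.

```python
from math import log


def import_exposure(phi, imports_start, imports_end):
    """Sum over industries j of phi(j) times the change in log Chinese imports of j."""
    return sum(
        weight * (log(imports_end[j]) - log(imports_start[j]))
        for j, weight in phi.items()
    )


# Hypothetical initial industry weights for one group-microregion cell.
phi = {"grains": 0.2, "livestock": 0.5, "energy": 0.3}

# Hypothetical values of Chinese imports at the start and end of a period.
imports_2000 = {"grains": 10.0, "livestock": 4.0, "energy": 20.0}
imports_2010 = {"grains": 20.0, "livestock": 4.0, "energy": 80.0}

delta_M = import_exposure(phi, imports_2000, imports_2010)
print(delta_M)
```

In this toy cell, the exposure is driven entirely by grains and energy, since imports of livestock do not change.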

Shock Exposure and Labor Supply. The labor demand shock captured in the exposure variable has the potential to affect the total quantity of hours supplied by workers in a microregion. Such a response can be related to changes in the labor supply of native workers and/or changes in the flow of immigrant workers. To investigate the labor supply response to commodity price shocks, Table B5 estimates model (2) for the log change in total hours supplied by all workers in columns (1)-(2) and by immigrant workers in columns (3)-(4). Among HSD, shock exposure is not related to the labor supply of either native or non-native workers. For HSG, columns (1) and (3) indicate a weak negative relation between their labor supply and the commodity price shock. Yet, this negative relation disappears once the initial educational composition of the labor force is included in columns (2) and (4). These controls capture the convergence in educational achievement across Brazilian microregions in the period, eliminating the relation between labor supply growth and shock exposure.40 Overall, Table B5 suggests limited responses in regional labor supply following commodity price shocks.

microregion exposure variation to precisely estimate the coefficient of interest. When the 1980-1991 period is included, there is a significant increase in exposure variation within microregions, leading to the more precise results in Table B2.

40 In omitted exercises, I attest that educational composition controls do not affect the results in Tables 1 and 2.

Alternative Measure of Shock Exposure. Table B6 investigates the robustness of sector-level responses to the use of the alternative exposure measure computed with the overall industry participation in the group-region wage bill. Columns (1)-(3) indicate that employment in the commodity sector is positively related to this alternative measure of price shock exposure. Also, predicted elasticities are similar to those in the baseline specification: using the average commodity sector size for each group, estimated coefficients in column (1) imply that a 1% increase in commodity prices causes the commodity sector employment share to increase by 0.029 p.p. for HSG and 0.069 p.p. for HSD. Columns (4)-(6) perform the same exercise for the change in the commodity sector wage premium. Similar to the baseline specification, there is a significant response in the sector wage differential for HSG. However, the alternative measure also predicts a significant positive response in the sector wage premium for HSD, although the coefficient remains lower than its counterpart for HSG.

Table B1: Exposure to Commodity Price Shocks and Labor Market Outcomes, Additional Controls

Dependent variable: Change in Average Log Wage in columns (1)-(4); Change in Commodity Sector Employment Share in columns (5)-(8); Change in Commodity Sector Average Log Wage Premium in columns (9)-(12).

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A. High-School Graduates | | | | | | | | | | | | |
| Commodity price shock | 0.288** | 0.286** | 0.277** | 0.283** | 0.038** | 0.039** | 0.039** | 0.036** | 0.416** | 0.420** | 0.410** | 0.414** |
| | (0.064) | (0.064) | (0.060) | (0.067) | (0.010) | (0.010) | (0.010) | (0.010) | (0.098) | (0.098) | (0.100) | (0.100) |
| R2 | 0.440 | 0.441 | 0.443 | 0.440 | 0.405 | 0.406 | 0.407 | 0.406 | 0.220 | 0.223 | 0.220 | 0.221 |
| B. High-School Dropouts | | | | | | | | | | | | |
| Commodity price shock | 0.268** | 0.268** | 0.201* | 0.217* | 0.099** | 0.097** | 0.089** | 0.067* | 0.059 | 0.062 | 0.080 | 0.000 |
| | (0.099) | (0.098) | (0.086) | (0.101) | (0.029) | (0.029) | (0.029) | (0.030) | (0.161) | (0.162) | (0.163) | (0.171) |
| R2 | 0.636 | 0.637 | 0.645 | 0.639 | 0.559 | 0.565 | 0.560 | 0.566 | 0.218 | 0.221 | 0.221 | 0.221 |
| Baseline Controls | | | | | | | | | | | | |
| Initial labor market conditions x period dummies | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Initial commodity sector size controls x period dummies | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Additional Controls | | | | | | | | | | | | |
| Initial size of formal sector x period dummies | No | Yes | No | No | No | Yes | No | No | No | Yes | No | No |
| Initial size of manufacturing sector x period dummies | No | No | Yes | No | No | No | Yes | No | No | No | Yes | No |
| Initial size of other agriculture and mining in the commodity sector x period dummies | No | No | No | Yes | No | No | No | Yes | No | No | No | Yes |

Note. Stacked sample of 518 microregions in 1991-2000 and 2000-2010. All regressions are weighted by the microregion share in national population in 1991 and include ten macroregion-period dummies. Baseline controls as in Table 1. Standard errors clustered by microregion. ** p<0.01, * p<0.05

Table B2: Exposure to Commodity Price Shocks and Labor Market Outcomes, Additional Period, 1980-2010

Dependent variable: Change in Average Log Wage in columns (1)-(3); Change in Commodity Sector Employment Share in columns (4)-(6); Change in Commodity Sector Average Log Wage Premium in columns (7)-(9).

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A. High-School Graduates | | | | | | | | | |
| Commodity price shock | 0.288** | 0.214** | 0.231** | 0.038** | 0.051** | 0.062** | 0.416** | 0.234* | 0.218+ |
| | (0.064) | (0.049) | (0.069) | (0.010) | (0.013) | (0.018) | (0.098) | (0.099) | (0.132) |
| R2 | 0.440 | 0.500 | 0.597 | 0.405 | 0.319 | 0.417 | 0.220 | 0.223 | 0.322 |
| B. High-School Dropouts | | | | | | | | | |
| Commodity price shock | 0.268** | 0.235** | 0.285** | 0.099** | 0.071** | 0.068* | 0.059 | 0.339** | 0.417** |
| | (0.099) | (0.054) | (0.079) | (0.029) | (0.021) | (0.028) | (0.161) | (0.095) | (0.125) |
| R2 | 0.636 | 0.624 | 0.706 | 0.559 | 0.490 | 0.603 | 0.218 | 0.259 | 0.426 |
| Baseline Controls | | | | | | | | | |
| Initial labor market conditions x period dummies | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Initial commodity sector size controls x period dummies | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Additional Controls | | | | | | | | | |
| Microregion-specific linear time trend | No | No | Yes | No | No | Yes | No | No | Yes |
| Sample Period | | | | | | | | | |
| Baseline sample: 1991-2000 and 2000-2010 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Extended sample: including 1980-1991 | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes |

Note. Stacked sample of 518 microregions in baseline sample and 503 microregions in extended sample. All regressions are weighted by the microregion share in national population in 1991 and include macroregion-period dummies and initial share of group labor income in the commodity sector. Labor market conditions as in Table 1. Industry composition measured in the initial period of 1991 for baseline sample and of 1980 for extended sample. Commodity sector size controls in extended sample: share in group labor income of other agriculture and mining industries in the commodity sector, and quadratic polynomial of the share in group labor income of the commodity sector (columns 2-3). Standard errors clustered by microregion. ** p<0.01, * p<0.05, + p<0.1

Table B3: Exposure to Commodity Price Shocks and Labor Market Outcomes, Additional Worker Groups

Dependent variable: Change in Average Log Wage in columns (1)-(4); Change in Commodity Sector Employment Share in columns (5)-(8); Change in Commodity Sector Average Log Wage Premium in columns (9)-(12).

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A. High-School Graduates | | | | | | | | | | | | |
| Commodity price shock | 0.288** | 0.240** | 0.303** | 0.260** | 0.038** | 0.032** | 0.028** | 0.022** | 0.416** | 0.469** | 0.359** | 0.431** |
| | (0.064) | (0.064) | (0.062) | (0.063) | (0.010) | (0.008) | (0.009) | (0.008) | (0.098) | (0.100) | (0.099) | (0.101) |
| R2 | 0.440 | 0.539 | 0.561 | 0.617 | 0.405 | 0.420 | 0.410 | 0.441 | 0.220 | 0.280 | 0.276 | 0.359 |
| B. High-School Dropouts | | | | | | | | | | | | |
| Commodity price shock | 0.268** | 0.295** | 0.317** | 0.338** | 0.099** | 0.113** | 0.113** | 0.121** | 0.059 | 0.091 | 0.234* | 0.227* |
| | (0.099) | (0.099) | (0.110) | (0.108) | (0.029) | (0.031) | (0.031) | (0.034) | (0.161) | (0.149) | (0.099) | (0.090) |
| R2 | 0.636 | 0.650 | 0.653 | 0.664 | 0.559 | 0.591 | 0.620 | 0.652 | 0.218 | 0.254 | 0.258 | 0.285 |
| Baseline Controls | | | | | | | | | | | | |
| Initial labor market conditions x period dummies | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Initial commodity sector size controls x period dummies | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Additional worker groups | | | | | | | | | | | | |
| Baseline sample: male / white | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Including female | No | Yes | No | Yes | No | Yes | No | Yes | No | Yes | No | Yes |
| Including non-white | No | No | Yes | Yes | No | No | Yes | Yes | No | No | Yes | Yes |

Note. Stacked sample of 518 microregions in 1991-2000 and 2000-2010. All regressions are weighted by the microregion share in national population in 1991 and include ten macroregion-period dummies. Baseline controls as in Table 1. Standard errors clustered by microregion. ** p<0.01, * p<0.05

Table B4: Exposure to Commodity Price Shocks and Labor Market Outcomes, 2SLS using Exposure to Chinese Imports Growth

Dependent variable: Change in Average Log Wage in columns (1)-(2); Change in Commodity Sector Employment Share in columns (3)-(4); Change in Commodity Sector Average Log Wage Premium in columns (5)-(6).

| | (1) | (2) | (3) | (4) | (5) | (6) |
| --- | --- | --- | --- | --- | --- | --- |
| A. High-School Graduates | | | | | | |
| Commodity price shock | 0.286** | 0.262** | 0.039** | 0.035+ | 0.420** | 0.398+ |
| | (0.064) | (0.095) | (0.010) | (0.020) | (0.098) | (0.205) |
| F-stat of excluded variable | | 36.1 | | 36.1 | | 36.1 |
| B. High-School Dropouts | | | | | | |
| Commodity price shock | 0.268** | 0.746** | 0.097** | 0.261* | 0.062 | 0.575 |
| | (0.098) | (0.271) | (0.029) | (0.109) | (0.162) | (0.364) |
| F-stat of excluded variable | | 28.6 | | 28.6 | | 28.6 |

Note. Stacked sample of 518 microregions in 1991-2000 and 2000-2010. 2SLS estimator where the instrument is the microregion's exposure to growth in Chinese imports. All regressions are weighted by the microregion share in national population in 1991. Regressions include period dummies interacted with macroregion dummies, initial formal sector size and baseline controls described in Table 1. Standard errors clustered by microregion. ** p<0.01, * p<0.05, + p<0.1

Table B5: Exposure to Commodity Price Shocks and Labor Supply

Dependent variable: Change in Log of Total Labor Supply in columns (1)-(2); Change in Log of Immigrants' Labor Supply in columns (3)-(4).

| | (1) | (2) | (3) | (4) |
| --- | --- | --- | --- | --- |
| A. High-School Graduates | | | | |
| Commodity price shock | -0.529+ | 0.051 | -0.535+ | 0.055 |
| | (0.271) | (0.114) | (0.279) | (0.131) |
| R2 | 0.686 | 0.811 | 0.649 | 0.753 |
| B. High-School Dropouts | | | | |
| Commodity price shock | 0.006 | 0.209 | 0.005 | 0.226 |
| | (0.173) | (0.156) | (0.221) | (0.202) |
| R2 | 0.847 | 0.877 | 0.845 | 0.869 |
| Baseline Controls | | | | |
| Initial labor market conditions x period dummies | Yes | Yes | Yes | Yes |
| Initial commodity sector size controls x period dummies | Yes | Yes | Yes | Yes |
| Initial educational composition of labor force x period dummies | No | Yes | No | Yes |

Note. Stacked sample of 518 microregions in 1991-2000 and 2000-2010. All regressions are weighted by the microregion share in national population in 1991 and include ten macroregion-period dummies. Baseline controls as in Table 1. Educational composition of labor force: share of college graduates and share of high-school graduates and some college. Standard errors clustered by microregion. + p<0.1

Table B6: Exposure to Commodity Price Shocks and Sector Labor Outcomes, Alternative Measure

Dependent variable: Change in Commodity Sector Employment Share in columns (1)-(3); Change in Commodity Sector Average Log Wage Premium in columns (4)-(6).

| | (1) | (2) | (3) | (4) | (5) | (6) |
| --- | --- | --- | --- | --- | --- | --- |
| A. High-School Graduates | | | | | | |
| Commodity sector size x Commodity price shock | 0.193* | 0.196* | 0.158+ | 1.971* | 2.032* | 1.365+ |
| | (0.094) | (0.089) | (0.087) | (0.858) | (0.870) | (0.703) |
| R2 | 0.282 | 0.341 | 0.402 | 0.143 | 0.195 | 0.215 |
| B. High-School Dropouts | | | | | | |
| Commodity sector size x Commodity price shock | 0.183* | 0.175+ | 0.161+ | 0.683** | 1.075** | 1.068** |
| | (0.088) | (0.092) | (0.092) | (0.227) | (0.244) | (0.241) |
| R2 | 0.507 | 0.555 | 0.557 | 0.164 | 0.227 | 0.228 |
| Baseline Controls | | | | | | |
| Initial labor market conditions x period dummies | No | Yes | Yes | No | Yes | Yes |
| Initial commodity sector size controls x period dummies | No | No | Yes | No | No | Yes |

Note. Stacked sample of 518 microregions in 1991-2000 and 2000-2010. All regressions are weighted by the microregion share in national population in 1991 and include ten macroregion-period dummies. Baseline controls as in Table 1. Standard errors clustered by microregion. ** p<0.01, * p<0.05, + p<0.1

C Two-Sector Roy Economy: Competitive Equilibrium

In this section, I describe the production and market structures that generate the demand for sector-specific tasks with log-prices given by $(\omega_g^C, \omega_g^N)$. To this end, I assume that each market is a small open economy with a segmented competitive labor market. Specifically, goods are freely traded in the world market but factors of production are not. Therefore, product prices are exogenously determined internationally and factor prices are endogenously determined locally.

C.1 Demand of Sector-Specific Tasks

Suppose sector $k$ consists of multiple competitive industries, $j \in J^k$. In each industry $j$ of sector $k$, the production technology requires workers of group $g$ to perform a sector-specific task. The production function combines the sum of the sector-specific task produced by all industry employees, $\{T_g^{k,j}\}_g$, and other industry-specific inputs, $X^{k,j}$. Formally, the production function is

$$q^{k,j} = Q^{k,j}\left(\{T_g^{k,j}\}_g, X^{k,j} \mid B^{k,j}\right) \quad \text{s.t.} \quad T_g^{k,j} = \int_{i \in I_g} T_g^k(i) \cdot L_g^{k,j}(i) \, di \qquad (28)$$

where $L_g^{k,j}(i)$ is an indicator function that equals one if individual $i$ is employed in industry $j$ of sector $k$; and $X^{k,j}$ is the vector of industry-specific inputs. In equation (28), the central restriction is that production in every industry $j$ of sector $k$ requires the same sector-specific task. Thus, individuals are perfect substitutes across industries within the sector. Notice that the production function allows, but does not require, tasks of different groups to be imperfect substitutes.

Firms take as given the unit log-price of the sector-specific task, $\omega_g^k$. Hence, the demand for the sector-specific task of group $g$ in industry $j$ of sector $k$ is $T_g^{k,j}\left(\{\omega_g^k\}_g \mid D^{k,j}\right) = T_g^{k,j}$ such that

$$\omega_g^k = \ln p^{k,j} + \ln \frac{\partial Q^{k,j}\left(\{T_g^{k,j}\}_g, \bar{X}^{k,j} \mid B^{k,j}\right)}{\partial T_g^{k,j}} \quad \text{for all } g$$

where $D^{k,j} \equiv \left(\ln p^{k,j}, B^{k,j}, \bar{X}^{k,j}\right)$ is the vector of industry-sector demand shifters, with $\bar{X}^{k,j}$ denoting the endowment of the industry-specific factor. Lastly, the sector-specific task demand combines the task demand of all industries in the sector:

$$T_g^k\left(\{\omega_g^k\}_g \mid D^k\right) = \sum_{j \in J^k} T_g^{k,j}\left(\{\omega_g^k\}_g \mid D^{k,j}\right) \qquad (29)$$

where $D^k \equiv \{D^{k,j}\}_{j \in J^k}$ is the vector of sector demand shifters.41

41 The simplicity of this production structure sheds light on the distributive channel implied by worker heterogeneity in the Two-Sector Roy Economy. Accordingly, one could consider alternative environments with a generic demand for sector-specific tasks as in the case of non-competitive product markets and additional mobile factors of production.

C.2 Supply of Sector-Specific Tasks

In the analysis of Section 4, I have focused on the structure of employment and wages in the Two-Sector Roy Economy. Here, I describe the consequences of this structure for the supply of sector-specific tasks. Recall that preferences in (3) imply that individuals with comparative advantage below $\alpha_g(l_g^N)$ are employed in the non-commodity sector. As a result, the total supply of tasks in the non-commodity sector is

\[
\bar{T}_g^N\left(l_g^N\right) \equiv \bar{I}_g \cdot \int_0^{l_g^N} E\left[\exp(a_g(i)) \mid s_g(i) = \alpha_g(q)\right] dq \tag{30}
\]

where $\bar{I}_g$ is the mass of individuals of group g; and $E[\exp(a_g(i)) \mid s_g(i) = \alpha_g(q)]$ is determined by the conditional distribution of absolute advantage in (4).

Analogously, individuals with comparative advantage above $\alpha_g(l_g^N)$ are employed in the commodity sector. Thus,

\[
\bar{T}_g^C\left(l_g^N\right) \equiv \bar{I}_g \cdot \int_{l_g^N}^1 \exp(\alpha_g(q)) \cdot E\left[\exp(a_g(i)) \mid s_g(i) = \alpha_g(q)\right] dq. \tag{31}
\]

The supply of sector-specific tasks depends on both the skill distribution and the endogenous sector employment composition. As discussed in Section 4, the sector employment composition, $l_g^N$, is determined by the free mobility condition (5), which I repeat here to facilitate exposition:

\[
\omega_g^N - \omega_g^C = \alpha_g\left(l_g^N\right). \tag{32}
\]

C.3 Competitive Equilibrium

Market Clearing. The competitive equilibrium requires the equalization of supply and demand for tasks in each sector. For group g in sector k, this is equivalent to

\[
T_g^k\left(\{\omega_g^k\}_g \mid D^k\right) = \bar{T}_g^k\left(l_g^N\right) \tag{33}
\]

where $T_g^k(\{\omega_g^k\}_g \mid D^k)$ is the task demand in (29), and $\bar{T}_g^k(l_g^N)$ is the task supply in (30)-(31). The following definition outlines the competitive equilibrium of the Two-Sector Roy Economy.

Definition 1. [Competitive Equilibrium] The competitive equilibrium in the Two-Sector Roy Economy is determined by the vector of sector-specific task prices, $\{(\omega_g^C, \omega_g^N)\}_g$, such that, for all g and k, (i) sector-specific task demand, $T_g^k(\{\omega_g^k\}_g \mid D^k)$, is given by (29); (ii) sector-specific task supply, $\bar{T}_g^k(l_g^N)$, is given by (30)-(31); (iii) sector employment composition, $l_g^N$, is given by (32); and (iv) market clearing is given by (33).
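The four conditions in Definition 1 can be illustrated numerically. The sketch below is not the estimation procedure of the paper: it assumes a single worker group, unit-scale Frechet skills (so that the schedule in equation (40) applies and mean efficiency in a sector with share l is Gamma(1 - 1/kappa) * l^(-1/kappa)), and a hypothetical iso-elastic task demand, log T_k = d_k - eta * omega_k. All function names and parameter values are illustrative assumptions. It bisects on the non-commodity share until free mobility (32) holds at market-clearing prices (33); the fixed bracket is adequate only for moderate demand shifters.

```python
import math


def alpha(q, kappa):
    """Comparative-advantage schedule: logistic quantile function, as in (40)."""
    return math.log(q / (1.0 - q)) / kappa


def task_supply(l_n, kappa, mass=1.0):
    """Task supplies (non-commodity, commodity) at non-commodity share l_n.

    Assumes unit-scale Frechet skills (an assumption of this sketch).
    """
    scale = mass * math.gamma(1.0 - 1.0 / kappa)
    power = 1.0 - 1.0 / kappa
    return scale * l_n ** power, scale * (1.0 - l_n) ** power


def clearing_prices(l_n, d_n, d_c, eta, kappa):
    """Log task prices that clear each sector, as in (33).

    Assumed demand: log T_k = d_k - eta * omega_k (hypothetical form).
    """
    t_n, t_c = task_supply(l_n, kappa)
    return (d_n - math.log(t_n)) / eta, (d_c - math.log(t_c)) / eta


def solve_equilibrium(d_n, d_c, eta=2.0, kappa=3.0, tol=1e-12):
    """Bisect on l_n until free mobility (32) holds at market-clearing prices."""

    def gap(l_n):
        omega_n, omega_c = clearing_prices(l_n, d_n, d_c, eta, kappa)
        return (omega_n - omega_c) - alpha(l_n, kappa)

    lo, hi = 1e-9, 1.0 - 1e-9  # gap is decreasing in l_n for eta > 0, kappa > 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    l_n = 0.5 * (lo + hi)
    omega_n, omega_c = clearing_prices(l_n, d_n, d_c, eta, kappa)
    return l_n, omega_n, omega_c


# Symmetric demand, then a hypothetical increase in commodity-sector demand.
l_n, omega_n, omega_c = solve_equilibrium(d_n=0.0, d_c=0.0)
l_boom, _, _ = solve_equilibrium(d_n=0.0, d_c=0.5)
```

Under these assumed forms, a higher commodity demand shifter lowers the equilibrium non-commodity share, which is the reallocation channel the definition is meant to capture.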

D Theory: Auxiliary Results and Model Extensions

D.1 Lemma 1

Lemma 1. Consider a model of the form:

\[
e_m = y_m - \Phi(l_m) - X_m \gamma.
\]

Assume there is an instrumental variable $Z_m$ satisfying Assumption 4. If $E[e_m \mid Z_m, X_m] = 0$, then the function $\Phi(.)$ is identified up to a constant. With the normalization $E[X_m \gamma] = 0$, the constant in $\Phi(.)$ is also identified.

Proof. First, $E[e_m \mid Z_m, X_m] = 0$ implies that $E[\Phi(l_m) + X_m \gamma \mid Z_m, X_m] = E[y_m \mid Z_m, X_m]$. Now let us proceed by contradiction. Suppose there exist $\tilde{\Phi}(l_m)$ and $\tilde{\gamma}$ such that $E[\tilde{\Phi}(l_m) + X_m \tilde{\gamma} \mid Z_m, X_m] = E[y_m \mid Z_m, X_m]$. By Assumption 4,

\[
\Phi(l_m) - \tilde{\Phi}(l_m) = X_m(\tilde{\gamma} - \gamma) \quad \text{almost surely.}
\]

Take $m$ and $m'$ such that $X_m = X_{m'}$. The condition above implies that, for all $l_m$ and $l_{m'}$, we must have $\Phi(l_m) - \tilde{\Phi}(l_m) = \Phi(l_{m'}) - \tilde{\Phi}(l_{m'})$. Thus, $\Phi(.)$ is identified up to a constant. To determine this constant, we can use the normalizations $E[e_m] = E[X_m \gamma] = 0$, which imply that $E[\tilde{\Phi}(l_m)] = E[y_m]$. □

D.2 Derivation of Equation (19)

The analysis in this section considers a shock to endogenous and exogenous variables in an arbitrary region and year; accordingly, I drop indices r and t to simplify the notation. Define individual i’s log-wage as yg(i) such that

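\[
y_g(i) = \max\{\omega_g^C + s_g(i) + a_g(i);\ \omega_g^N + a_g(i)\}.
\]

Under Assumption 1, this implies that the log-wage distribution of group g is given by

\[
\begin{aligned}
\Pr[y_g(i) \le y] &= \Pr[y_g(i) \le y;\ s_g(i) \le \omega_g^N - \omega_g^C] + \Pr[y_g(i) \le y;\ s_g(i) > \omega_g^N - \omega_g^C] \\
&= \int_0^{l_g^N} \Pr\left[a_g(i) \le y - \omega_g^N \mid \alpha_g(q)\right] dq + \int_{l_g^N}^1 \Pr\left[a_g(i) \le y - \omega_g^C - \alpha_g(q) - \tilde{u}_g \mid \alpha_g(q)\right] dq
\end{aligned} \tag{34}
\]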

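where $\Pr[a_g(i) \le a \mid \tilde{s}_g(i) = s] \equiv \mu H_g^a(a \mid s) + (1 - \mu) H^e(a \mid \tilde{u}_g, \theta_g)$.

Recall that, by construction, $Y_g(\pi)$ is implicitly defined as the solution of $\Pr[y_g(i) \le Y_g(\pi)] = \pi$. Taking a first-order expansion of this equation,

\[
\left[f_g^N(Y_g(\pi)) + f_g^C(Y_g(\pi))\right] \Delta Y_g(\pi) = f_g^N(Y_g(\pi)) \cdot \Delta\omega_g^N + f_g^C(Y_g(\pi)) \cdot \Delta\omega_g^C - \Delta v_g'(\pi)
\]

where $f_g^N(y) \equiv \partial \Pr[y_g(i) \le y;\ s_g(i) \le \omega_g] / \partial y$, $f_g^C(y) \equiv \partial \Pr[y_g(i) \le y;\ s_g(i) > \omega_g] / \partial y$, and $\omega_g \equiv \omega_g^N - \omega_g^C$.

In this expression, the first-order impact of the endogenous change in $l_g^N$ is eliminated by the employment condition (5), reflecting the fact that marginal workers are indifferent between the two sectors. The term $\Delta v_g'(\pi)$ incorporates changes to other exogenous parameters of the skill distribution:

\[
\begin{aligned}
\Delta v_g'(\pi) \equiv{} & (1 - \mu) \left[ \int_0^{l_g^N} \nabla_\theta H^e\left(Y_g(\pi) - \omega_g^N \mid \tilde{u}_g, \theta_g\right) dq + \int_{l_g^N}^1 \nabla_\theta H^e\left(Y_g(\pi) - \omega_g^C - \alpha_g(q) - \tilde{u}_g \mid \tilde{u}_g, \theta_g\right) dq \right] \cdot \Delta\theta_g \\
& + \left[ f_g^C(Y_g(\pi)) + (1 - \mu) \left( \int_0^{l_g^N} \nabla_u H^e\left(Y_g(\pi) - \omega_g^N \mid \tilde{u}_g, \theta_g\right) dq + \int_{l_g^N}^1 \nabla_u H^e\left(Y_g(\pi) - \omega_g^C - \alpha_g(q) - \tilde{u}_g \mid \tilde{u}_g, \theta_g\right) dq \right) \right] \Delta\tilde{u}_g.
\end{aligned}
\]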

To obtain equation (19), notice that, by definition, $l_g^N(\pi) \equiv P[s_g(i) \le \omega_g^N - \omega_g^C \mid y_g(i) = Y_g(\pi)]$, which is equivalent to

\[
l_g^N(\pi) = \frac{P\left[y_g(i) = Y_g(\pi);\ s_g(i) \le \omega_g^N - \omega_g^C\right]}{P\left[y_g(i) = Y_g(\pi)\right]} = \frac{f_g^N(Y_g(\pi))}{f_g^N(Y_g(\pi)) + f_g^C(Y_g(\pi))}.
\]

Thus,

\[
\Delta Y_g(\pi) = \Delta\omega_g^C \cdot l_g^C(\pi) + \Delta\omega_g^N \cdot l_g^N(\pi) + \Delta v_g''(\pi)
\]

where $\Delta v_g''(\pi) \equiv -\Delta v_g'(\pi) / \left[f_g^N(Y_g(\pi)) + f_g^C(Y_g(\pi))\right]$.

Finally, equation (19) is obtained by assuming a particular structure of observable and unobservable components of $\Delta v_g''(\pi)$:

\[
\Delta v_g''(\pi) \equiv \mu_g \cdot \tilde{X}_g(\pi) + \Delta v_g(\pi).
\]
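The decomposition above can be evaluated directly once the two sector densities at a quantile are known. The following is a minimal sketch with hypothetical inputs; the function names are illustrative, and it assumes that the commodity share at the quantile is the complement of the non-commodity share, l^C(π) = 1 - l^N(π).

```python
def noncommodity_share_at_quantile(f_n, f_c):
    """l^N(pi) = f^N / (f^N + f^C): share of non-commodity workers at the quantile."""
    return f_n / (f_n + f_c)


def quantile_wage_change(f_n, f_c, d_omega_n, d_omega_c, d_v=0.0):
    """First-order change in the log-wage quantile Y(pi).

    Assumes l^C(pi) = 1 - l^N(pi); d_v stands for the residual term.
    """
    l_n = noncommodity_share_at_quantile(f_n, f_c)
    return d_omega_c * (1.0 - l_n) + d_omega_n * l_n + d_v


# Hypothetical numbers: three quarters of workers at this quantile are in sector N.
share = noncommodity_share_at_quantile(f_n=0.3, f_c=0.1)
change = quantile_wage_change(f_n=0.3, f_c=0.1, d_omega_n=0.02, d_omega_c=0.10)
```

With these inputs the quantile moves by a weighted average of the two task price changes, with weights 0.75 and 0.25.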

D.3 Model Extensions: Non-monetary Employment Benefits

This section extends the Two-Sector Roy Economy of Section 4 by incorporating non-monetary employment benefits, a reduced form for working conditions and switching costs. The environment of Section 4 remains the same except that now I allow workers to derive utility from both consumption and employment. In particular, assume that, if employed in sector k, a consumption bundle c yields utility $\tau_g^k \cdot u(c)$ where u(.) is homogeneous of degree one. Thus, individual i’s payoff of employment in sector k is given by

\[
U_g^k(i) = \tau_g^k \cdot \frac{w_g^k T_g^k(i)}{P} \tag{35}
\]

where $\tau_g^k$ is the private benefit of being employed in sector k for workers of group g; and P is the price index (i.e., the cost of achieving one utility unit).

Without loss of generality, normalize $\tau_g^C \equiv 1$ and denote $\tau_g \equiv \ln \tau_g^N$. Assuming that workers maximize their total utility, individual i chooses to be employed in sector $K_g(i)$ such that

\[
K_g(i) = \arg\max\left\{\omega_g^C + \ln T_g^C(i);\ \tau_g + \omega_g^N + \ln T_g^N(i)\right\}.
\]

Following the same steps of Section 4, the employment share in the non-commodity sector is determined by the following condition:

\[
\omega_g^N - \omega_g^C = \alpha_g\left(l_g^N\right) - \tau_g. \tag{36}
\]
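To make condition (36) concrete, the sketch below inverts it for the non-commodity employment share. It assumes the logistic schedule of equation (40), α(q) = ln(q/(1-q))/κ, purely for illustration; the function names and numerical values are hypothetical.

```python
import math


def alpha(q, kappa):
    """Assumed comparative-advantage schedule (logistic quantile function)."""
    return math.log(q / (1.0 - q)) / kappa


def noncommodity_share(omega_n, omega_c, tau=0.0, kappa=3.0):
    """Solve omega_n - omega_c = alpha(l_n) - tau for l_n."""
    threshold = omega_n - omega_c + tau  # workers with s below this choose sector N
    return 1.0 / (1.0 + math.exp(-kappa * threshold))


# Same task prices, without and with a non-monetary benefit in sector N.
base = noncommodity_share(omega_n=0.0, omega_c=0.1)
with_amenity = noncommodity_share(omega_n=0.0, omega_c=0.1, tau=0.2)
```

Holding task prices fixed, a positive wedge raises the comparative-advantage threshold and therefore the share of workers who choose the non-commodity sector.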

Intuitively, the choice of employment balances potential earnings against amenities of the sector. These amenities are captured by the wedge $\tau_g$. Those individuals with strong comparative advantage in the commodity sector (i.e., high $s_g(i)$) obtain higher income in the commodity sector compared to the non-commodity sector. To the extent that the higher income compensates for any non-monetary benefit, these individuals choose to be employed in the commodity sector. Hence, every individual with $s_g(i)$ above the threshold $\omega_g^N - \omega_g^C + \tau_g$ self-selects into the commodity sector.

The private benefit only affects the allocation of individuals to sectors. Conditional on the sector allocation, workers have the same sector-specific task efficiency. Hence, the remaining equilibrium conditions are not modified, and the average sector wage is still given by (6)-(7).

The main modification implied by private employment benefits is the introduction of compositional effects in the aggregate wage distribution. To see this, notice that expression (34) still determines $Y_g(\pi)$, but its first-order approximation contains one extra term:

\[
\begin{aligned}
\left[f_g^N(Y_g(\pi)) + f_g^C(Y_g(\pi))\right] \Delta Y_g(\pi) ={} & f_g^N(Y_g(\pi)) \cdot \Delta\omega_g^N + f_g^C(Y_g(\pi)) \cdot \Delta\omega_g^C - \tilde{v}_g(\pi) \\
& + \left[ H_g^a\left(Y_g(\pi) - \omega_g^N - \tau_g \mid \alpha_g(l_g^N)\right) - H_g^a\left(Y_g(\pi) - \omega_g^N \mid \alpha_g(l_g^N)\right) \right] \Delta l_g^N.
\end{aligned}
\]

The extra term in this equation is directly implied by equation (36). It is zero if $\tau_g = 0$; however, it cannot be ignored if $\tau_g \neq 0$. To obtain a clear interpretation for this term, consider the fraction of marginal workers with a wage of $Y_g(\pi)$: $\psi_g(\pi) \equiv P[\tilde{s}_g(i) = \alpha_g(l_g^N) \mid y_g(i) = Y_g(\pi)]$. These marginal workers are the sector-switchers for small shocks. Using Bayes’ rule,

\[
\psi_g(\pi) = \frac{P\left[y_g(i) = Y_g(\pi) \mid \tilde{s}_g(i) = \alpha_g(l_g^N)\right]}{\int_0^1 P\left[y_g(i) = Y_g(\pi) \mid \alpha_g(q)\right] dq} = \frac{P\left[y_g(i) = Y_g(\pi) \mid \tilde{s}_g(i) = \alpha_g(l_g^N)\right]}{f_g^N(Y_g(\pi)) + f_g^C(Y_g(\pi))}
\]

where the numerator is the wage density conditional on a comparative advantage equal to $\alpha_g(l_g^N)$. Notice that $P[y_g(i) = Y_g(\pi) \mid \tilde{s}_g(i) = \alpha_g(l_g^N)] = h_g(Y_g(\pi) - \omega_g^N \mid \alpha_g(l_g^N))$ because the wage distribution among marginal workers is $H_g(y - \omega_g^N \mid \alpha_g(l_g^N))$.$^{42}$ Thus,

\[
\psi_g(\pi) = \frac{h_g\left(Y_g(\pi) - \omega_g^N \mid \alpha_g(l_g^N)\right)}{f_g^N(Y_g(\pi)) + f_g^C(Y_g(\pi))}.
\]

Using this expression, we have that

\[
\frac{H_g\left(Y_g(\pi) - \omega_g^N - \tau_g \mid \alpha_g(l_g^N)\right) - H_g\left(Y_g(\pi) - \omega_g^N \mid \alpha_g(l_g^N)\right)}{f_g^N(Y_g(\pi)) + f_g^C(Y_g(\pi))} \approx -\tau_g \cdot \psi_g(\pi).
\]

42 This expression relies on the assumption that marginal workers are allocated to the non-commodity sector since they are indifferent between the two sectors. If instead they are allocated to the commodity sector, we would have $H_g(y - \omega_g^N - \tau_g \mid \alpha_g(l_g^N))$ and the same steps lead to an identical final expression.

Hence, the wage gain at quantile π is, up to a first-order approximation, given by:

\[
\Delta Y_g(\pi) = \Delta\omega_g^C \cdot l_g^C(\pi) + \Delta\omega_g^N \cdot l_g^N(\pi) - \Delta l_g^N \cdot \left[\tau_g \psi_g(\pi)\right] + \tilde{v}_g(\pi) \tag{37}
\]

where $\psi_g(\pi)$ is the share of sector-switchers among workers at the π-percentile of the earnings distribution.

The additional term in (37) captures the compositional effect introduced by the difference in sector potential wage of marginal workers in the presence of non-monetary employment benefits. Accordingly, the compositional effect is proportional to the fraction of sector-switchers among individuals at wage quantile π, $\psi_g(\pi)$, and their between-sector wage wedge, $\tau_g$.

D.4 Parametric Restrictions on the Distribution of Comparative and Absolute Advantage

This section discusses prominent distributional assumptions that determine the form of $\alpha_g(.)$ and $\bar{A}_g^k(.)$. To simplify notation, I omit subscripts for groups, regions and years.

D.4.1 Normal Distribution

Particularly important in the selection literature is the case of log-normally distributed sector-specific skill (Roy, 1951; Heckman and Sedlacek, 1985; Borjas, 1987; Ohnsorge and Trefler, 2007; and Mulligan and Rubinstein, 2008). That is, assume that sector-specific skill is independently drawn from a bivariate normal distribution:

\[
\left(\ln T^C(i), \ln T^N(i)\right) \sim N\left( \begin{bmatrix} \mu_C \\ \mu_N \end{bmatrix} ; \begin{bmatrix} \sigma_C^2 & \sigma_{CN} \\ \sigma_{CN} & \sigma_N^2 \end{bmatrix} \right).
\]

Because the comparative advantage of individual i is defined as $s(i) = \ln T^C(i) - \ln T^N(i)$, it is straightforward to conclude that $s(i) \sim N(\mu, \sigma^2)$ where $\mu \equiv \mu_C - \mu_N$ and $\sigma^2 \equiv \sigma_C^2 + \sigma_N^2 - 2\sigma_{CN}$. As a result, $(s(i), a(i))$ is jointly normal with covariance $Cov(s(i), a(i)) = \sigma_{CN} - \sigma_N^2$. Thus, the distribution of a(i) conditional on s(i) = s is normal with conditional mean given by

\[
E[a(i) \mid s(i) = s] = \tilde{\mu} + \rho \cdot s \quad \text{s.t.} \quad \tilde{\mu} \equiv (1 + \rho)\mu_N - \rho\mu_C, \quad \rho \equiv \frac{\sigma_{CN} - \sigma_N^2}{\sigma_C^2 + \sigma_N^2 - 2\sigma_{CN}};
\]

and the variance is

\[
V[a(i) \mid s(i) = s] = \sigma_2^2 \equiv \sigma_N^2 - \frac{\left(\sigma_{CN} - \sigma_N^2\right)^2}{\sigma_C^2 + \sigma_N^2 - 2\sigma_{CN}}.
\]

By definition, $F(s) \equiv \Phi\left(\frac{s - \mu}{\sigma}\right)$ where Φ(.) is the CDF of the standard normal distribution. Thus,

\[
\alpha(q) \equiv F^{-1}(q) = \mu + \sigma \cdot \Phi^{-1}(q). \tag{38}
\]

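Also, notice that

\[
\bar{A}^N(l^N) \equiv \frac{1}{l^N} \int_0^{l^N} E\left[a(i) \mid s(i) = \alpha(q)\right] dq = \tilde{\mu} + \frac{\rho}{l^N} \int_0^{l^N} \alpha(q)\, dq = (\tilde{\mu} + \rho\mu) + \frac{\rho\sigma}{l^N} \int_0^{l^N} \Phi^{-1}(q)\, dq.
\]

Because $\int_0^{l^N} \Phi^{-1}(q)\, dq = -\phi\left(\Phi^{-1}(l^N)\right)$,

\[
\bar{A}^N(l^N) = \bar{\mu} - (\rho\sigma) \cdot \frac{\phi\left(\Phi^{-1}(l^N)\right)}{l^N} \tag{39}
\]

where $\bar{\mu} \equiv (\tilde{\mu} + \rho\mu) = \mu_N$. For completeness, consider the average task efficiency in the commodity sector:

\[
\bar{A}^C(l^N) \equiv \frac{1}{1 - l^N} \int_{l^N}^1 \left( \alpha(q) + E\left[a(i) \mid s(i) = \alpha(q)\right] \right) dq = (\mu + \bar{\mu}) + \sigma(1 + \rho) \cdot \frac{\phi\left(\Phi^{-1}(l^N)\right)}{1 - l^N}.
\]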

Equations (38)-(39) illustrate the connection between the parameters governing the skill distribu- tion and the schedules of comparative and absolute advantage. First, the dispersion of comparative advantage, σ, controls the magnitude of the between-sector reallocation of individuals in response to changes in the relative task price. Second, the sensitivity of the mean absolute advantage to the comparative advantage, ρ, controls the compositional effect of employment on sector average wage.
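The closed forms for the normal case can be checked against direct numerical integration of the conditional mean over quantiles. The sketch below uses hypothetical moments for the two log skills and a simple midpoint rule; all names and values are illustrative, and only the Python standard library is used.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def alpha(q, mu, sigma):
    """Quantile function of comparative advantage, equation (38)."""
    return mu + sigma * STD_NORMAL.inv_cdf(q)


def avg_a_noncommodity(l_n, mu_bar, rho, sigma):
    """Closed form (39) for mean absolute advantage in the non-commodity sector."""
    z = STD_NORMAL.inv_cdf(l_n)
    return mu_bar - rho * sigma * STD_NORMAL.pdf(z) / l_n


def avg_efficiency_commodity(l_n, mu, mu_bar, rho, sigma):
    """Closed form for average task efficiency in the commodity sector."""
    z = STD_NORMAL.inv_cdf(l_n)
    return (mu + mu_bar) + sigma * (1.0 + rho) * STD_NORMAL.pdf(z) / (1.0 - l_n)


def midpoint_mean(f, a, b, n=20000):
    """Average of f over (a, b) by the midpoint rule."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h / (b - a)


# Hypothetical moments of (ln T^C, ln T^N).
mu_c, mu_n = 0.2, 0.0
var_c, var_n, cov_cn = 1.0, 0.64, 0.3
mu = mu_c - mu_n
sigma = (var_c + var_n - 2.0 * cov_cn) ** 0.5
rho = (cov_cn - var_n) / (var_c + var_n - 2.0 * cov_cn)
mu_tilde = (1.0 + rho) * mu_n - rho * mu_c
mu_bar = mu_tilde + rho * mu  # equals mu_n


def cond_mean(q):
    """E[a | s = alpha(q)] under joint normality."""
    return mu_tilde + rho * alpha(q, mu, sigma)


l_n = 0.3
numeric_n = midpoint_mean(cond_mean, 0.0, l_n)
numeric_c = midpoint_mean(lambda q: alpha(q, mu, sigma) + cond_mean(q), l_n, 1.0)
```

The quadrature and the closed forms should agree up to discretization error, which confirms the sign of the truncated-normal term in (39).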

D.4.2 Extreme Value Distribution

Recent papers have adopted a skill distribution of the Frechet family (Hsieh, Hurst, Jones, and Klenow, 2013; Burstein, Morales, and Vogel, 2015; Galle, Rodriguez-Clare, and Yi, 2015). The main advantage of this distribution is its tractability in the multi-dimensional problem of sectoral choice, allowing for an analytical characterization of the equilibrium with an arbitrary number of sectors. As discussed below, this tractability comes at a price: it imposes a restrictive pattern of selection across sectors. Specifically, assume that skill in each sector is independently drawn from a Frechet distribution:

\[
\left(T^C(i), T^N(i)\right) \sim \exp\left[ -\sum_{k=C,N} \left(T^k\right)^{-\kappa} \right]
\]

where I assume that κ > 1 to guarantee finiteness of first-order moments. First, consider the distribution of comparative advantage:

\[
F(s) \equiv \Pr[s(i) < s] = \int_{-\infty}^{\infty} e^{-e^{-\kappa(a+s)}} \kappa e^{-\kappa a} e^{-e^{-\kappa a}}\, da = \int_{-\infty}^{\infty} \kappa e^{-\kappa a} e^{-(1 + e^{-\kappa s}) e^{-\kappa a}}\, da.
\]

Define $x \equiv (1 + e^{-\kappa s}) e^{-\kappa a}$ such that $dx = -\kappa (1 + e^{-\kappa s}) e^{-\kappa a}\, da$. Thus, $F(s) = \frac{1}{1 + e^{-\kappa s}}$ and, therefore,

\[
\alpha(q) \equiv F^{-1}(q) = \frac{1}{\kappa} \ln\left(\frac{q}{1 - q}\right). \tag{40}
\]

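Second, consider the joint distribution of absolute and comparative advantage:

\[
\Pr[a(i) < \bar{a};\ s(i) < s] = \int_{-\infty}^{\bar{a}} \kappa e^{-\kappa a} e^{-(1 + e^{-\kappa s}) e^{-\kappa a}}\, da = \frac{1}{1 + e^{-\kappa s}}\, e^{-(1 + e^{-\kappa s}) e^{-\kappa \bar{a}}}.
\]

To obtain the average task efficiency, notice that the skill distribution in the non-commodity sector is

\[
\Pr\left[a(i) < a \mid s(i) < \alpha(l^N)\right] = e^{-\left(1 + e^{-\kappa \alpha(l^N)}\right) e^{-\kappa a}} = e^{-\frac{1}{l^N} e^{-\kappa a}} = e^{-e^{-\kappa\left(a + \frac{1}{\kappa} \ln l^N\right)}}
\]

where the second equality follows from the definition of α(.).

Since this is a Gumbel distribution with parameters $\beta \equiv 1/\kappa$ and $\mu \equiv -\frac{1}{\kappa} \ln l^N$, the average task efficiency in the non-commodity sector is

\[
\bar{A}^N(l^N) = \frac{\gamma}{\kappa} - \frac{1}{\kappa} \ln l^N
\]

where γ is the Euler-Mascheroni constant. Because $\bar{A}^N(l^N) = \frac{1}{l^N} \int_0^{l^N} A(q)\, dq$, differentiating $l^N \bar{A}^N(l^N)$ with respect to $l^N$ recovers

\[
A(q) = \frac{\gamma - 1}{\kappa} - \frac{1}{\kappa} \ln q. \tag{41}
\]

Analogously, the skill distribution in the commodity sector is

\[
\Pr\left[\ln T^C(i) < a^C \mid s(i) > \alpha(l^N)\right] = e^{-e^{-\kappa\left(a^C + \frac{1}{\kappa} \ln(1 - l^N)\right)}} \quad \Rightarrow \quad \bar{A}^C(l^N) = \frac{\gamma}{\kappa} - \frac{1}{\kappa} \ln\left(1 - l^N\right).
\]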

The schedules of comparative and absolute advantage in equations (40)-(41) are fully characterized by the skill dispersion parameter, κ. If skill dispersion is low (i.e., κ is high), then a small variation in the relative task price is associated with a large response of sector employment. In addition, a sector employment expansion causes a decrease in the average sector efficiency whose magnitude is also controlled by the skill dispersion. In other words, the extreme value distribution only allows for positive selection in both sectors. This very particular pattern of selection has strong implications for the wage distribution, implying that both sectors exhibit the same distribution of labor earnings. Specifically, the log-wage distribution in sector k is

\[
G^N(y) = e^{-e^{-\kappa\left(y - \omega^N + \frac{1}{\kappa} \ln l^N\right)}} = e^{-e^{-\kappa\left(y - \omega^C + \frac{1}{\kappa} \ln(1 - l^N)\right)}} = G^C(y)
\]

where the second equality follows from the employment equation in (5).
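The equality of the two sector wage distributions can be verified numerically. The sketch below evaluates both Gumbel CDFs at an arbitrary wage level after imposing the employment condition with the logistic schedule in (40); function names and parameter values are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def alpha(q, kappa):
    """Comparative-advantage schedule under Frechet skills, equation (40)."""
    return math.log(q / (1.0 - q)) / kappa


def avg_efficiency(share, kappa):
    """Mean log efficiency in a sector with employment share `share` (Gumbel mean)."""
    return (EULER_GAMMA - math.log(share)) / kappa


def sector_wage_cdfs(y, omega_n, l_n, kappa):
    """Log-wage CDFs (G^N(y), G^C(y)) once free mobility pins down omega_c."""
    omega_c = omega_n - alpha(l_n, kappa)  # employment condition (5)
    g_n = math.exp(-math.exp(-kappa * (y - omega_n + math.log(l_n) / kappa)))
    g_c = math.exp(-math.exp(-kappa * (y - omega_c + math.log(1.0 - l_n) / kappa)))
    return g_n, g_c


g_n, g_c = sector_wage_cdfs(y=1.0, omega_n=0.5, l_n=0.7, kappa=3.0)
```

The same helper shows the selection pattern discussed above: average efficiency falls as a sector's employment share rises, in either sector.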

D.4.3 Log-Linear System: A Skill Distribution Example

In this section, I describe a distribution that delivers the log-linear functional forms in Assumption 5. To avoid infinite moments, assume that comparative advantage has a modified logistic distribution with an upper bound. In particular,

\[
s(i) \sim F(s) = \begin{cases} \dfrac{1}{1 + e^{-\frac{1}{\alpha} s}} & \text{if } s < \bar{S} \\ 1 & \text{if } s \ge \bar{S} \end{cases}
\]

where α > 0 and $s \in (-\infty, \bar{S}]$ such that $\bar{S} \equiv -\alpha \ln[\varepsilon/(1 - \varepsilon)]$ and ε ≥ 0. Immediately,

\[
\alpha(q) = \begin{cases} \alpha \ln\left[q/(1 - q)\right] & \text{if } q < 1 - \varepsilon \\ \alpha \ln\left[(1 - \varepsilon)/\varepsilon\right] & \text{if } q \ge 1 - \varepsilon. \end{cases}
\]

Although the comparative advantage distribution has finite moments for every ε ≥ 0 and α > 0, this is not necessarily true for its moment generating function. The upper bound in the support, however, implies a well-defined moment generating function for all ε > 0 and, therefore, a finite supply of sector-specific tasks. Notice that positive employment in both sectors implies that $l_{g,r}^N < 1 - \varepsilon$ and, therefore, the empirically relevant portion of the quantile function is that in Assumption 5.

Also, assume that the conditional distribution of absolute advantage is normal with a linear conditional mean:

\[
\{a(i) \mid \tilde{s}(i) = \alpha(q)\} \sim N\left(A(q), \sigma^2\right) \quad \text{s.t.} \quad A(q) \equiv \bar{A} + A \cdot \ln q.
\]

Thus,

\[
\bar{A}^N(l^N) \equiv \frac{1}{l^N} \int_0^{l^N} A(q)\, dq = A^N + A \cdot \ln l^N
\]

where $A^N = (\bar{A} - A)$. With positive employment in both sectors, $l_{g,r,t}^N < 1 - \varepsilon$ and

\[
\bar{A}^C(l^N) \equiv \frac{1}{1 - l^N} \int_{l^N}^1 \left(\alpha(q) + A(q)\right) dq = A^C - (\alpha + A) \cdot \frac{l^N}{1 - l^N} \ln l^N - \alpha \cdot \ln(1 - l^N)
\]

where $A^C \equiv (\bar{A} - A) + \alpha\varepsilon \cdot \ln\left[(1 - \varepsilon)/\varepsilon\right]$.
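The log-linear averages can be checked by integrating the two schedules numerically. The sketch below takes the limiting case ε = 0 (an assumption made only to keep the check simple, so that the commodity-sector constant reduces to the difference between the intercept and the slope of A(q)) and uses hypothetical parameter values and function names.

```python
import math


def midpoint_mean(f, a, b, n=100000):
    """Average of f over (a, b) by the midpoint rule."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h / (b - a)


def avg_a_noncommodity(l_n, a_bar, a_slope):
    """Closed form: (a_bar - a_slope) + a_slope * ln(l_n)."""
    return (a_bar - a_slope) + a_slope * math.log(l_n)


def avg_efficiency_commodity(l_n, a_bar, a_slope, alpha):
    """Closed form for the commodity sector in the limiting case epsilon = 0."""
    return ((a_bar - a_slope)
            - (alpha + a_slope) * l_n / (1.0 - l_n) * math.log(l_n)
            - alpha * math.log(1.0 - l_n))


# Hypothetical parameter values.
a_bar, a_slope, alpha, l_n = 1.0, -0.5, 0.8, 0.4


def abs_adv(q):
    """A(q) = a_bar + a_slope * ln q."""
    return a_bar + a_slope * math.log(q)


def comp_adv(q):
    """alpha(q) = alpha * ln(q / (1 - q)) on the untruncated range."""
    return alpha * math.log(q / (1.0 - q))


numeric_n = midpoint_mean(abs_adv, 0.0, l_n)
numeric_c = midpoint_mean(lambda q: comp_adv(q) + abs_adv(q), l_n, 1.0)
```

Agreement between the quadrature and the closed forms confirms the coefficients on the two log terms in the commodity-sector average.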

E Structural Estimation: Sensitivity Analysis

In Table E1, I evaluate the robustness of the structural results of Section 6 to specific choices of the estimation procedure regarding the moment weighting matrix and the set of instruments. In this investigation, I focus on the log-linear model with the overidentification restriction. Column (1) replicates the baseline specification with the Two-Step GMM estimator and the vector of instruments built from the quadratic polynomial of the exposure to international prices in Grains, Soft Agriculture, Livestock, Mining and Energy.

Columns (2)-(3) estimate the model with alternative moment weighting matrices. Specifically, column (2) imposes that structural errors in the three equations are independent and, in addition, column (3) imposes that structural errors are homoskedastic (i.e., 2SLS weights). In both cases, point estimates are similar albeit, as expected, more imprecise.

Table E1: Structural Parameters, Specification Robustness

Columns: (1) Baseline; (2)-(3) Choice of Moment Weight Matrix; (4)-(6) Choice of Instrument Vector.

                                                   (1)       (2)       (3)       (4)       (5)       (6)
A. High-School Graduates
  α_HSG                                          0.835**   0.879**   0.997**   0.924**   0.698**   0.933**
                                                 (0.212)   (0.185)   (0.237)   (0.359)   (0.219)   (0.110)
  A_HSG                                          1.966*    1.759*    2.032*    1.501+    1.605+    2.595*
                                                 (0.935)   (0.834)   (0.978)   (0.873)   (0.897)   (0.904)
B. High-School Dropouts
  α_HSD                                          0.916*    1.302*    1.475+    0.394     0.823     1.030*
                                                 (0.399)   (0.701)   (0.879)   (0.536)   (0.568)   (0.460)
  A_HSD                                         -0.727**  -0.640**  -0.795**  -0.811**  -0.552**  -0.617**
                                                 (0.142)   (0.134)   (0.147)   (0.190)   (0.212)   (0.146)

Matrix of Moment Weights
  Optimal GMM weights                             Yes       No        No        Yes       Yes       Yes
  Optimal weights under independent equations     No        Yes       No        No        No        No
  2SLS weights under independent equations        No        No        Yes       No        No        No

Vector of instruments (exposure to international prices in Grains, Soft, Livestock, Mining and Energy)
  Exposure                                        Yes       Yes       Yes       Yes       Yes       Yes
  Squared exposure                                Yes       Yes       Yes       No        No        Yes
  Sum of squared exposure                         No        No        No        No        Yes       No
  Sum of cubic and quartic of exposure            No        No        No        No        No        Yes

Note. Stacked sample of 518 microregions in 1991-2000 and 2000-2010. GMM estimator with microregions weighted by their share in the 1991 national population. All columns correspond to the log-linear model with the overidentification equation. Control vector includes macroregion-period dummies, initial commodity sector size controls, and initial labor market conditions interacted with period dummies (full description in Section 6.3). Standard errors clustered by microregion. ** p<0.01, * p<0.05, + p<0.1

In columns (4)-(6), I estimate the model with different instrument sets. The instrument vector in column (4) is restricted to contain only the exposure to commodity price shocks by category. In this case, estimates are similar for HSG, but the comparative advantage parameter for HSD is lower and more imprecise. Similar conclusions are obtained with the inclusion of the sum of the squared exposure to commodity price shocks in column (5). Thus, the disaggregated squared exposure is the main source of identification of the comparative advantage parameter for HSD. Lastly, column (6) augments the instrument vector to also include the cubic and quartic of the microregion’s total exposure to the price shock. In this case, results are very similar, indicating that there is little additional gain from the use of higher order polynomials in (24).
