From: Clayton Petree
To: [email protected]; Council
Cc: [email protected]
Subject: Testimony for upcoming Comprehensive Plan Update
Date: Tuesday, July 05, 2016 11:32:45 AM
Attachments: growthlostopportunities.pdf; why_has_regional_income_convergence_in_the_us_declined_01.pdf

Dear County and Bellingham Council,

Using Boulder, CO as an example, this article and the studies it links to show how poor regulation and overregulation lead to the exclusion of poorer segments of the population, much as is happening here. It also illustrates the widening gap between the wealthy and the working class. As I read the article, I see Whatcom County and Bellingham doing many of these exact things that harm our community. I certainly hope the City and County councils take this issue very seriously.

Link to article: http://www.nytimes.com/2016/07/04/business/how-anti-growth-sentiment-reflected-in-zoning-laws-thwarts-equality.html?smid=fb-share

Study 1: http://scholar.harvard.edu/files/shoag/files/why_has_regional_income_convergence_in_the_us_declined_01.pdf
Study 2: http://eml.berkeley.edu//~moretti/growth.pdf

Regards, Clayton

How Anti-Growth Sentiment, Reflected in Zoning Laws, Thwarts Equality
By CONOR DOUGHERTY, JULY 3, 2016

[Photo caption: A construction site in Boulder marks a new Google campus that will allow the company to more than triple its local work force. Credit: Matthew Staver for The New York Times]

BOULDER, Colo. — The small city of Boulder, home to the University of Colorado’s flagship campus, has a booming local economy and a pleasantly compact downtown with mountain views. Not surprisingly, a lot of people want to move here.

Something else is also not surprising: Many of the people who already live in Boulder would prefer that the newcomers settle somewhere else.

“The quality of the experience of being in Boulder, part of it has to do with being able to go to this meadow and it isn’t just littered with human beings,” said Steve Pomerance, a former city councilman who moved here from Connecticut in the 1960s.

All of Boulder’s charms are under threat, Mr. Pomerance said as he concluded an hourlong tour. Rush-hour traffic has become horrendous. Quaint, two-story storefronts are being dwarfed by glass and steel. Cars park along the road to the meadow.

These days, you can find a Steve Pomerance in cities across the country — people who moved somewhere before it exploded and now worry that growth is killing the place they love.

But a growing body of economic literature suggests that anti-growth sentiment, when multiplied across countless unheralded local development battles, is a major factor in creating a stagnant and less equal American economy.

It has even to some extent changed how Americans of different incomes view opportunity. Unlike past decades, when people of different socioeconomic backgrounds tended to move to similar areas, today, less-skilled workers often go where jobs are scarcer but housing is cheap, instead of heading to places with the most promising job opportunities, according to research by Daniel Shoag, a professor of public policy at Harvard, and Peter Ganong, also of Harvard.

One reason they’re not migrating to places with better job prospects is that rich cities like San Francisco and Seattle have gotten so expensive that working-class people cannot afford to move there. Even if they could, there would not be much point, since whatever they gained in pay would be swallowed up by rent.

In Boulder, for instance, the median home price has risen 60 percent over the last five years, to $648,200. Today, someone who makes the typical Boulder salary would have to put about 40 percent of their monthly income toward payments on a new mortgage or about half toward rent, according to Zillow.

“We’ve switched from a world where everybody educated and uneducated was moving from poorer parts of the country to the richer parts of the country,” said Professor Shoag, “to a world where the higher-educated people move to San Francisco and lower educated people move to Vegas.”

Zoning restrictions have been around for decades but really took off during the 1960s, when the combination of inner-city race riots and “white flight” from cities led to heavily zoned suburbs.

They have gotten more restrictive over time, contributing to a jump in home prices that has been a bonanza for anyone who bought early in places like Boulder, San Francisco and New York City. But for latecomers, the cost of renting an apartment or buying a home has become prohibitive.

In response, a group of politicians, including Gov. Jerry Brown of California and President Obama, are joining with developers in trying to get cities to streamline many of the local zoning laws that, they say, make homes more expensive and hold too many newcomers at bay.

To most people, zoning and land-use regulations might conjure up little more than images of late-night City Council meetings full of gadflies and minutiae. But these laws go a long way toward determining some fundamental aspects of life: what American neighborhoods look like, who gets to live where and what schools their children attend.

And when zoning laws get out of hand, economists say, the damage to the American economy and society can be profound. Studies have shown that laws aimed at things like “maintaining neighborhood character” or limiting how many unrelated people can live together in the same house contribute to racial segregation and deeper class disparities. They also exacerbate inequality by restricting the housing supply in places where demand is greatest.

The lost opportunities for development may theoretically reduce the output of the economy by as much as $1.5 trillion a year, according to estimates in a recent paper by the economists Chang-Tai Hsieh and Enrico Moretti. Regardless of the actual gains in dollars that could be achieved if zoning laws were significantly cut back, the research on land-use restrictions highlights some of the consequences of giving local communities too much control over who is allowed to live there.

“You don’t want rules made entirely for people that have something, at the expense of people who don’t,” said Jason Furman, chairman of the White House Council of Economic Advisers.

So far, the biggest solution offered comes out of California, where Governor Brown has proposed a law to speed up housing development by making it harder for cities to saddle developers with open-ended design, permit and environmental reviews. The Massachusetts State Senate passed similar reforms. And President Obama has taken a more soft-touch approach, proposing $300 million in grants to prod local governments to simplify their building regulations.

Mr. Brown’s proposal has inflamed local politicians, environmentalists and community groups, who see it as state overreach into how they regulate development. But it has drawn broad support from developers, not surprisingly, as well as a number of business leaders, including many from the technology industry.

To understand how a bunch of little growth laws make their way into existence, consider how well Boulder is doing — and why it has made people wary of promoting local growth.

The city, about a 45-minute drive from Denver, is surrounded by postcard views of the Rockies. The list of yuppie-friendly amenities includes streets full of bike lanes and a walkable downtown full of bars, restaurants and marijuana shops. It has the charms of a resort town with the bonus that, unlike a resort town, it has more than ski lift and bartending jobs.

The result is a virtuous economic cycle in which the university churns out smart people, the smart people attract employers, and the amenities make everyone want to stay. Twitter is expanding its offices downtown. A few miles away, a big hole full of construction equipment marks a new Google campus that will allow the company to expand its Boulder work force to 1,500 from 400.

Looking to slow the pace of development, last year a citizens’ group called Livable Boulder — Mr. Pomerance was involved in the effort — pushed a pair of local ballot measures that would have increased fees on new development and given neighborhood groups a vote on zoning changes in their area.

Will Toor, Boulder’s former mayor, who worked for the “no” campaign, said, “You would never build affordable housing, you would never build community facilities like homeless shelters — those things would just never happen if you had all veto power at the neighborhood level.”

Voters shot the measures down, but development is almost certain to remain the city’s most contentious issue.

“We don’t need one more job in Boulder,” Mr. Pomerance said. “We don’t need to grow anymore. Go somewhere else where they need you.”

A version of this article appears in print on July 4, 2016, on page A1 of the New York edition with the headline: When Cities Spurn Growth, Equality Suffers.

Why Do Cities Matter? Local Growth and Aggregate Growth

Chang-Tai Hsieh, University of Chicago
Enrico Moretti, University of California, Berkeley

May 2015

Abstract. We study how growth of cities determines the growth of nations. Using a spatial equilibrium model and data on 220 US metropolitan areas from 1964 to 2009, we first estimate the contribution of each U.S. city to national GDP growth. We show that the contribution of a city to aggregate growth can differ significantly from what one might naively infer from the growth of the city’s GDP. Despite some of the strongest rates of local growth, New York, San Francisco and San Jose were only responsible for a small fraction of U.S. growth in this period. By contrast, almost half of aggregate US growth was driven by growth of cities in the South. We then provide a normative analysis of potential growth. We show that the dispersion of the conditional average nominal wage across US cities doubled, indicating that worker productivity is increasingly different across cities. We calculate that this increased wage dispersion lowered aggregate U.S. GDP by 13.5%. Most of the loss was likely caused by increased constraints to housing supply in high productivity cities like New York, San Francisco and San Jose. Lowering regulatory constraints in these cities to the level of the median city would expand their work force and increase U.S. GDP by 9.7%. We conclude that the aggregate gains in output and welfare from spatial reallocation of labor are likely to be substantial in the U.S., and that a major impediment to a more efficient spatial allocation of labor is housing supply constraints. These constraints limit the number of US workers who have access to the most productive of American cities. In general equilibrium, this lowers income and welfare of all US workers.

We are grateful to Klaus Desmet, Rebecca Diamond, Daniel Fetter, Cecile Gaubert, Ed Glaeser, Eric Hurst, Pat Kline, Steve Redding and seminar participants at the AEA Meetings, Brown, Chicago Booth, Chicago Fed, Minneapolis Fed, NBER Summer Institute, Simon Fraser University, Stanford SIEPR, University of Arizona, Università Cattolica di Milano, University of British Columbia, UC Berkeley Haas, UCLA Anderson, University of Venice, and Yrjo Jahnsson Foundation for useful suggestions. This paper previously circulated under the title "Growth in Cities and Countries."

1. Introduction

Economists have long been fascinated by the vast differences in economic activity between countries and between cities within each country. However, with a few exceptions, research on growth at the aggregate country level has not paid much attention to the distribution of economic activity across cities within a country. At the same time, a large urban literature identifies local forces that explain differences in wages and economic activity across cities but has paid comparatively less attention to how these forces aggregate to affect growth for the country as a whole. In this paper we bridge this gap.

We study how economic growth of cities determines the growth of nations. We use data on 220 US cities over the past five decades and a spatial equilibrium model to address two related questions: a positive one and a normative one. First, we estimate the contribution of each US metropolitan area to aggregate output growth between 1964 and 2009. We show that our model-based calculation of a given city’s contribution to aggregate growth differs significantly from what one might naively infer from the growth of the city’s GDP. We then turn to a normative analysis of potential growth. We document a significant increase in the spatial dispersion of wages between 1964 and 2009, indicating that worker productivity is increasingly different across American cities.
We argue that these productivity differences reflect an increasingly inefficient spatial allocation of labor across US cities, and that much of this inefficiency is caused by restrictive housing policies of municipalities with high productivity, like New York and San Francisco.

We base our analysis on a Rosen-Roback model where workers can freely move across cities and geographical differences in wages reflect differences in local labor demand and supply. In turn, local labor demand reflects forces that affect the TFP of firms in a city (infrastructure, industry mix, agglomeration economies, human capital spillovers, access to non-tradable inputs and local entrepreneurship), while local labor supply reflects amenities and housing supply. We analyze how these local forces aggregate in the Rosen-Roback model to affect national output and welfare. We show that aggregate output increases in local TFP in each city but decreases in the dispersion of wages across cities.1 The reason is that wage dispersion across cities reflects variation in the marginal product of labor: the wider the dispersion of marginal products across cities, the lower aggregate output, everything else constant. Intuitively, if labor is more productive in some areas than in others, then aggregate output may be increased by reallocating some workers from low productivity areas to high productivity ones.

1 Formally, we show that aggregate output growth can be decomposed into the contribution of the weighted average of the growth rate of local TFP and into the change in the dispersion of the marginal product of labor across cities. The growth rate of aggregate output is higher when the weighted average of the growth rate of local TFP is higher and is lower when the weighted average of wage dispersion across cities increases (for a given distribution of TFP).

In this setting, geography matters in the sense that the same localized shock can have profoundly different aggregate effects depending on where it takes place. An increase in local labor demand caused by a TFP increase will have a large effect on aggregate output if the TFP increase generates an increase in local employment, but the same increase in local TFP in another city can have a much smaller aggregate effect if it largely results in higher nominal wages in the city and only a small increase in local employment.

Empirically, we begin by calculating the contribution of each US city to aggregate growth and compare it with an accounting measure based solely on the growth of the city’s GDP. We show there are large differences between these two numbers. For example, growth of New York’s GDP was 12 percent of aggregate output growth from 1964 to 2009. However, viewed through the lens of the Rosen-Roback model, New York was only responsible for less than 5 percent of aggregate output growth. The difference is because much of the output growth in New York was manifested as higher nominal wages, which increased the overall spatial misallocation of labor. On the other extreme, Detroit’s GDP fell dramatically from 1964 to 2009, but its net contribution to aggregate output growth was actually positive. In the case of Detroit, the decline in its nominal wage from 1964 to 2009 lowered the overall wage dispersion.

We then turn from a positive analysis of the local forces underlying aggregate growth to a normative analysis of potential growth. We focus on the effects of the growing dispersion in the marginal product of labor across cities.
We show that after conditioning on worker characteristics, the geographical distribution of nominal wages is significantly wider today than in 1964. In particular, the standard deviation of conditional wages across US cities in 2009 is twice as large as in 1964, indicating that differences in worker productivity across cities are growing. When we quantify the output and welfare cost of this increase in dispersion of the marginal product of labor, we find that aggregate output in 2009 would have been significantly higher if the dispersion of nominal wages had not increased. Holding the distribution of local TFP fixed at 2009 levels, we hypothetically reallocate labor from high wage to low wage cities such that the hypothetical wage in each city (relative to the average wage) is equal to the relative wage in 1964. Intuitively, this scenario involves setting amenities and housing supply at their 1964 level, while keeping labor demand constant at its 2009 level, and allowing workers to reallocate across cities in response. Under this scenario, aggregate yearly GDP growth from 1964 to 2009 would have been 0.3 percentage points higher. In levels, U.S. GDP in 2009 would be 13.5% or $1.95 trillion higher. This amounts to an annual wage increase of $6,958 for the average worker.

This output effect is driven to a large extent by three cities (New York, San Francisco and San Jose), which experienced some of the strongest growth in labor demand over the last four decades, thanks to growth of human capital intensive industries like high tech and finance (Moretti, 2012). But most of the labor demand increase was manifested as higher nominal wages instead of higher employment. The resulting increase in overall wage dispersion negatively impacted aggregate growth. In contrast, Southern cities also experienced rapid output growth, but much of this growth showed up as employment growth and only a small amount as an increase in the nominal wage.
The resulting decrease in overall wage dispersion fostered aggregate growth, although the impact was smaller than that in New York, San Francisco and San Jose.

Of course, the potential output gains from spatial reallocation of labor do not necessarily translate into welfare gains. The effect on aggregate welfare depends on why wages are not equalized across cities in the first place.2 If the relative increase in nominal wages in high TFP cities such as San Francisco and New York is due to restrictions to housing supply, then the aggregate output loss due to differences in the marginal product of labor also implies welfare losses. In this case, removing constraints to housing supply in cities like San Francisco and New York would allow more workers to move there and take advantage of their higher productivity, increasing both aggregate output and welfare. In contrast, if labor supply in New York and San Francisco is low because of increasingly undesirable local amenities, then the loss in aggregate output from the gaps in the marginal product of labor does not necessarily reflect a loss in welfare. For example, if equilibrium wages in New York are high because people dislike congestion, noise and pollution and need to be compensated for it, then moving more people to New York will increase aggregate output, but will lower welfare.

2 Formally, we show that aggregate welfare in the Rosen-Roback model is simply aggregate output divided by a weighted average of the ratio of local housing prices to local amenities. Holding aggregate output constant, higher housing prices lower aggregate welfare and better local amenities increase welfare.

When we decompose the increase in wage dispersion into the changes due to housing supply and amenities, we find that the increase is almost entirely driven by the former. Setting amenities back to their 1964 levels slightly decreases the overall wage dispersion and increases aggregate output, but the effect is quantitatively small. In contrast, we find that constraints to housing supply in cities with high TFP are a major driver of our findings. We use data from Saiz (2010) to separate the overall elasticity of housing supply in each U.S. city into the availability of land and municipal regulations. We estimate that holding land constant but lowering land use regulations in New York, San Francisco and San Jose to the level of the median city would increase U.S. output by 9.7%. In essence, more housing supply would allow more American workers to access the high productivity of these high TFP cities. We also estimate that increasing regulations in the South would be costly for aggregate output. In particular, we estimate that increasing land use regulations in the South to the level of New York, San Francisco and San Jose would lower U.S. output by 3%.

We conclude that the aggregate gains in output and in welfare from spatial reallocation of labor are likely to be substantial in the U.S., and that a major impediment to a more efficient spatial allocation of labor is the growing set of constraints to housing supply in high wage cities. These constraints limit the number of US workers who can work in the most productive of American cities. In general equilibrium, this lowers income and welfare of all US workers and amounts to a large negative externality imposed by a minority of cities on the entire country.

This paper builds on two bodies of work. First, we build on the large empirical work, beginning with Rosen (1979) and Roback (1982), on local labor supply and labor demand. The effect of stringent land use regulations on local housing prices is well documented (Glaeser, Gyourko and Saks, 2005 and 2006; Gyourko and Glaeser, 2005; Saiz, 2010), and our paper highlights the aggregate negative impacts of such regulations (and the positive effect of the relative absence of such regulations in the US South). Our findings on the importance of housing supply constraints are consistent with those in Ganong and Shoag (2013).
Second, we draw on the theoretical work on systems of cities in spatial equilibrium. In particular, Henderson (1981, 1982), Au and Henderson (2006a and 2006b), Behrens et al. (2014), Eeckhout et al. (2014), Desmet and Rossi-Hansberg (2013) and Redding (2014) model the equilibrium allocation of resources across cities. Our approach is most closely related to Desmet and Rossi-Hansberg (2013), Redding (2014) and Gaubert (2014). Desmet and Rossi-Hansberg (2013) analyze the effect of the heterogeneity of local TFP, amenities and “local frictions” in the US and China, Redding (2014) analyzes the effect of internal trade frictions, and Gaubert (2014) analyzes optimal city size. We abstract from trade frictions and heterogeneity in local TFP to focus on the effect of local housing supply on wage dispersion, aggregate output and welfare. Another closely related paper is Duranton et al. (2015), who quantify the misallocation of manufacturing output in India caused by misallocation of land.

The paper is organized as follows. In Section 2 we present the model. In Section 3 we describe the data and the key changes in wage dispersion. Empirical findings are in Section 4. Section 5 discusses policy implications.
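The headline counterfactual figures quoted in the introduction can be cross-checked with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the compounding step deliberately ignores baseline growth, so it gives a rough order of magnitude rather than the authors' calculation.

```python
# Figures quoted in the introduction: 13.5%, $1.95 trillion, 0.3 points, 1964-2009.
GAIN_SHARE = 0.135        # counterfactual gain as a share of 2009 GDP
GAIN_TRILLIONS = 1.95     # the same gain in trillions of dollars
EXTRA_GROWTH = 0.003      # extra yearly growth (0.3 percentage points)
YEARS = 2009 - 1964       # length of the sample period


def implied_gdp_level(gain_trillions, gain_share):
    """GDP level (in trillions) of which the dollar gain is the stated share."""
    return gain_trillions / gain_share


def cumulative_gain(extra_growth, years):
    """Level effect of a constant growth increment compounded over `years`.

    Simplifying assumption: baseline growth is ignored, so this is only
    an approximation of the level effect.
    """
    return (1 + extra_growth) ** years - 1


if __name__ == "__main__":
    print(round(implied_gdp_level(GAIN_TRILLIONS, GAIN_SHARE), 2))
    print(round(cumulative_gain(EXTRA_GROWTH, YEARS), 3))
```

Under these assumptions the dollar and percentage figures imply a 2009 GDP level of about $14.4 trillion, and 0.3 extra points of growth compounded over 45 years gives a level effect of roughly 14 percent, in line with the 13.5% reported once rounding of the growth figure is allowed for.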

2. Model

This section examines the channels by which local forces in a city affect aggregate output and welfare. The model is a standard Rosen-Roback model with a spatial equilibrium. Cities differ by local labor demand and local labor supply. Specifically, city $i$ produces a traded good sold at a fixed price in the national market with the following technology:

$$Y_i = A_i L_i^{\alpha} K_i^{\eta} \tag{1.1}$$

Here, $A_i$ denotes total factor productivity, $L_i$ employment, and $K_i$ capital. We assume $\alpha + \eta < 1$.

We interpret $A_i$ as capturing forces such as cost advantages enjoyed by firms in the city (access to waterways, railways, airports, topography, nature of the terrain, weather, local institutions, labor and environmental regulations), demand for products made by the city, ease of entry, agglomeration economies or technological spillovers that benefit all firms in the city. Workers can freely move across cities and their indirect utility is given by

$$V = \frac{W_i Z_i}{P_i^{\beta}} \tag{1.2}$$

Here $W_i$ denotes the nominal wage, $Z_i$ amenities, $P_i$ the price of housing in city $i$, and $\beta$ is the share of expenditures on housing.3 We assume that capital is supplied with infinite elasticity at an exogenously given rental price.

3 While different cities have different income and different prices, the share of expenditures on housing does not vary with income (Davis and Ortalo-Magné, 2010; Lewbel and Pendakur, 2008), which suggests that $\beta$ is roughly constant.

We make several simplifying assumptions. First, the expression for indirect utility implicitly assumes workers do not own the housing stock, but rent from an absentee landlord. Second, we assume that workers have homogeneous tastes over locations and are perfectly mobile across locations. This makes labor supply to a local labor market infinitely elastic. Third, we assume that TFP and amenities can vary across cities but are exogenous. Fourth, we assume all cities produce the same product and do not specialize. Finally, we assume no heterogeneity in labor demand elasticity. We relax all these assumptions later.

We now solve for the equilibrium allocation of employment and wage across cities. First, equating the marginal product of labor to the cost of labor in each city and the cost of capital to an exogenously determined interest rate, employment is:

$$L_i \propto \left(\frac{A_i}{W_i^{1-\eta}}\right)^{\frac{1}{1-\alpha-\eta}} \tag{1.3}$$

Employment is increasing in local TFP and decreasing in the nominal wage, with an elasticity that depends on the slope of the labor demand curve. After substituting (1.2) into (1.3), employment can also be expressed as

$$L_i \propto \left(\frac{A_i Z_i^{1-\eta}}{P_i^{\beta(1-\eta)}}\right)^{\frac{1}{1-\alpha-\eta}}.$$

Not surprisingly, cities with more employment are those with high local TFP, low housing prices, or high quality amenities.

We assume housing prices reflect local demand and supply conditions. Specifically, we assume $P_i = L_i^{\gamma_i}$, where $\gamma_i$ is a parameter that governs the elasticity of housing supply with respect to the number of workers. An increase in the number of workers has a larger effect on housing prices when $\gamma_i$ is large. We think of heterogeneity in $\gamma_i$ as capturing differences in both land availability and housing regulations (such as land use regulations). Cities with limited amount of land and stringent land use regulations have a large $\gamma_i$; cities with abundant land and permissive land use regulations have a small $\gamma_i$ (Glaeser, Gyourko and Saks, 2005 and 2006; Saiz, 2010). We can now write the equilibrium wage as a function of three exogenous factors:

$$W_i \propto \left(\frac{A_i^{\beta\gamma_i}}{Z_i^{1-\alpha-\eta}}\right)^{\frac{1}{(1-\eta)(1+\beta\gamma_i)-\alpha}} \tag{1.4}$$

The equilibrium wage is increasing in local TFP and decreasing in amenities with an elasticity that depends on the local elasticity of housing supply $\gamma_i$. The first factor, local TFP, reflects labor demand. Higher local TFP implies stronger demand for labor and therefore higher equilibrium nominal wages, ceteris paribus. The other two factors (amenities and housing supply) reflect labor supply. In equilibrium, better amenities imply larger supply and therefore lower nominal wages. Intuitively, the utility stemming from the amenities makes workers willing to live in a city even if their nominal wages are lower. More elastic housing supply also implies lower wages, but for a different reason. More elastic labor supply means that in cities with positive demand or amenity shocks, the cost of housing increases by less. The spatial variation of wages reflects the variation in TFP, amenities, and housing supply and the covariance between these variables.
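The equilibrium described above can be illustrated numerically. The sketch below is not the authors' code: the parameter values, the rental price `R`, the function names and the bisection solver are all assumptions made for illustration. It combines labor demand (1.3), the indifference condition (1.2) and the housing price rule, then searches for the common utility level at which total employment equals one.

```python
import math

# Illustrative parameter values (assumptions, not taken from the paper).
ALPHA, ETA, BETA, R = 0.65, 0.25, 0.30, 0.05


def demand_const():
    """Constant alpha^(1-eta) * (eta/R)^eta implied by the two first-order conditions."""
    return ALPHA ** (1 - ETA) * (ETA / R) ** ETA


def employment_given_v(A, Z, gamma, V):
    """City employment when workers obtain utility V in every city.

    Combines labor demand L^(1-alpha-eta) = c * A / W^(1-eta) with the
    indifference condition W = V * L^(beta*gamma) / Z.
    """
    d = (1 - ETA) * (1 + BETA * gamma) - ALPHA
    return (demand_const() * A * (Z / V) ** (1 - ETA)) ** (1 / d)


def solve_equilibrium(A, Z, gamma, tol=1e-12):
    """Bisect on log V until total employment equals one."""
    def total(logv):
        v = math.exp(logv)
        return sum(employment_given_v(a, z, g, v) for a, z, g in zip(A, Z, gamma))

    lo, hi = -50.0, 50.0          # total employment is decreasing in V
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total(mid) > 1.0:
            lo = mid              # too many workers: utility must be higher
        else:
            hi = mid
    V = math.exp(0.5 * (lo + hi))
    L = [employment_given_v(a, z, g, V) for a, z, g in zip(A, Z, gamma)]
    W = [V * l ** (BETA * g) / z for l, z, g in zip(L, Z, gamma)]
    return V, L, W


if __name__ == "__main__":
    # Three hypothetical cities: baseline, high TFP, high TFP with poor amenities.
    V, L, W = solve_equilibrium([1.0, 1.5, 1.5], [1.0, 1.0, 0.8], [1.0, 1.0, 1.0])
    for l, w in zip(L, W):
        print(round(l, 4), round(w, 4))
```

In this toy example the high-TFP city has both more workers and a higher nominal wage than the baseline city, and the city with worse amenities pays a higher wage than an otherwise identical city, which is the pattern equation (1.4) describes.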

2.1 Aggregate Output and Welfare

We now solve for aggregate output and welfare. First, we use (1.2) and (1.3) to express welfare as:

$$V = Y \sum_i L_i \frac{Z_i}{P_i^{\beta}} \tag{1.5}$$

where $Y \equiv \sum_i Y_i$ denotes aggregate output. Intuitively, welfare is aggregate output in units of utility and $\left(\sum_i L_i \frac{Z_i}{P_i^{\beta}}\right)^{-1}$ is the cost minimizing price of a unit of utility (the price of goods is normalized to one).4

4 Equation (1.5) only considers the effect of labor income on welfare. If we instead assume that firm profits accrue to the workers, the sum of labor income and profits is proportional to aggregate output. Therefore, welfare would still be proportional to aggregate output divided by the price of utility.

Second, we solve for aggregate output by imposing the condition that aggregate labor demand is equal to aggregate labor supply (normalized to one):

$$Y = \sum_i Y_i \propto \left(\sum_i \left[A_i \left(\frac{\overline{W}}{W_i}\right)^{1-\eta}\right]^{\frac{1}{1-\alpha-\eta}}\right)^{\frac{1-\alpha-\eta}{1-\eta}} \tag{1.6}$$

where $\overline{W} \equiv \sum_i W_i L_i$ denotes the employment-weighted average nominal wage and $W_i$ is determined by (1.4). Aggregate output is a harmonic mean of the product of local TFP and the inverse of the wage gap of the city relative to the mean wage. Housing supply restrictions affect welfare through their effect on the average price of housing and on aggregate output by changing the dispersion of nominal wages. We can now decompose the sources of aggregate growth in output and welfare. The growth of aggregate output is:

$$\frac{Y_{t+1}}{Y_t} = \left(\frac{\sum_i A_{i,t+1}^{\frac{1}{1-\alpha-\eta}}}{\sum_i A_{i,t}^{\frac{1}{1-\alpha-\eta}}}\right)^{\frac{1-\alpha-\eta}{1-\eta}} \left(\frac{\sum_i \overline{L}_{i,t+1}\left(\frac{\overline{W}_{t+1}}{W_{i,t+1}}\right)^{\frac{1-\eta}{1-\alpha-\eta}}}{\sum_i \overline{L}_{i,t}\left(\frac{\overline{W}_{t}}{W_{i,t}}\right)^{\frac{1-\eta}{1-\alpha-\eta}}}\right)^{\frac{1-\alpha-\eta}{1-\eta}} \tag{1.7}$$

where $\overline{L}_i \equiv A_i^{\frac{1}{1-\alpha-\eta}} \Big/ \sum_j A_j^{\frac{1}{1-\alpha-\eta}}$ denotes the hypothetical city size when wages are the same in all cities.

Equation (1.7) suggests that aggregate output growth can be decomposed into the effect of local TFP (the first term in (1.7)) and into the effect of changes in the spatial dispersion of wages (the second term in (1.7)). The effect of spatial dispersion is given by $\sum_i \overline{L}_i \left(\overline{W}/W_i\right)^{\frac{1-\eta}{1-\alpha-\eta}}$ measured in the two years. Intuitively, this term measures the ratio of aggregate output observed in each year to the hypothetical output when wages were the same in all cities in that year (and labor and capital are reallocated in response to the change in the wage distribution). Because the exponent on the relative wage is greater than one, aggregate output rises when wage dispersion falls (holding local TFP fixed).5

5 We assume decreasing returns to scale. With constant or increasing returns to scale, the distribution of employment would be degenerate as the city with the highest TFP would attract all economic activity.

The growth of aggregate welfare depends on the same two forces as well as on changes in the price of utility, because we have seen in equation (1.5) that aggregate welfare is equal to aggregate output divided by the price of utility (i.e., multiplied by the weighted average of the ratio of local amenities to the local housing price). Thus there are three channels via which local shocks affect aggregate welfare: the price of utility, the weighted average of local TFP, and the weighted dispersion of wages across cities.

To illustrate these mechanisms, consider how changes in local TFP or local amenities affect aggregate output and welfare. First, suppose that local TFP rises in a city. This raises the weighted average of local TFP, which increases aggregate output and welfare (holding the price of utility constant). The increase in local TFP also raises the local housing price by increasing the local demand for housing. This increases the price of utility in all cities, which lowers welfare (holding aggregate output fixed), and this effect is larger when the local housing supply is inelastic. Finally, higher local housing prices increase the local wage, but the aggregate effect of a higher local wage is ambiguous. If the high local housing price increases the gap between the local wage and the average wage, aggregate output (and welfare) falls. When this is the case, the growth rate of local GDP overstates the contribution of the local growth to aggregate output growth. But if the TFP increase occurs in a low wage city, the increase in the local wage potentially lowers the overall wage dispersion, which increases aggregate output. In this case, the growth rate of local GDP understates the local contribution to aggregate GDP.

Second, consider the effect of a decline in local TFP. Low TFP lowers the average of local TFP, which lowers aggregate output and welfare. In addition, Glaeser and Gyourko (2005) show that housing prices drop sharply in cities that suffer from an adverse labor demand shock. In our framework, the decline in housing prices has two additional effects. First, lower housing prices lower the price of utility, which offsets the effect of lower aggregate output on welfare. The drop in housing prices also lowers the local nominal wage, but as before the aggregate effect depends on whether the local nominal wage was above or below the nationwide mean. If the local wage is above the mean, the decline in the nominal wage potentially narrows the marginal product gap, which increases aggregate output and welfare. In this case, local GDP falls because of the direct effect of the decline in local TFP and the fall in the local wage. However, even when local output growth is negative, the net effect on aggregate output growth may well be positive if the effect of the narrowing wage dispersion is larger than the direct effect of the decline in local TFP. In the empirical results, we will show that this appears to have been the case in many US cities where local TFP fell.

Third, consider the effect of an improvement in amenities.
When amenities improve in high wage cities, this increases the average level of amenities and lowers overall wage dispersion. Here the decline in wage dispersion unambiguously improves welfare, and local output growth understates the contribution of the local economy to aggregate output. On the other hand, when amenities improve in low wage cities, this also increases the average level of amenities, but increases the overall wage dispersion. Here, although local GDP increases, the improvement in amenities lowers aggregate output. The output decline due to increased wage dispersion offsets some of the direct effect of the improvement in the average level of amenities.

In the empirical section of the paper, we use this framework to provide two calculations. First, we measure the contribution of each US city to aggregate US growth. We show that the model-based calculation of the contribution of each city to aggregate growth is empirically quite different from a naïve accounting-based calculation based on the measured growth of local output. Second, we use this framework to calculate the counterfactual output and welfare growth in the US under different assumptions on wage dispersion. We ask how much faster output and welfare growth would have been if wage dispersion had not increased in the US but had remained constant, and we link the increase in wage dispersion to specific housing supply policies on the part of US cities.
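
The link between gaps in marginal products and aggregate output can be illustrated with a deliberately simplified two-city sketch. It is not the paper's model: it assumes output A_i * L_i**alpha in each city, no capital, no housing and no amenities, and the TFP values and function names are hypothetical. It only shows that, with decreasing returns, total output is highest at the allocation of workers that equalizes marginal products.

```python
# Toy two-city illustration (an assumption-laden sketch, not the paper's model):
# output in city i is A_i * L_i**alpha with alpha < 1, and a fixed labor force
# is split between the two cities.

def aggregate_output(share_in_city_1, tfp, alpha=0.65, total_labor=1.0):
    """Total output when the given share of workers is located in city 1."""
    l1 = share_in_city_1 * total_labor
    l2 = total_labor - l1
    return tfp[0] * l1 ** alpha + tfp[1] * l2 ** alpha


def marginal_products(share_in_city_1, tfp, alpha=0.65, total_labor=1.0):
    """Marginal product of labor in each city at the given allocation."""
    l1 = share_in_city_1 * total_labor
    l2 = total_labor - l1
    return (alpha * tfp[0] * l1 ** (alpha - 1.0),
            alpha * tfp[1] * l2 ** (alpha - 1.0))


def efficient_share(tfp, alpha=0.65):
    """Share in city 1 that equalizes marginal products:
    L1 / L2 = (A1 / A2) ** (1 / (1 - alpha))."""
    ratio = (tfp[0] / tfp[1]) ** (1.0 / (1.0 - alpha))
    return ratio / (1.0 + ratio)


TFP = (2.0, 1.0)  # hypothetical: city 1 is twice as productive
best = efficient_share(TFP)

# Brute-force check: search a grid of allocations for the output maximum.
grid = [k / 1000 for k in range(1, 1000)]
grid_best = max(grid, key=lambda s: aggregate_output(s, TFP))

if __name__ == "__main__":
    print("share equalizing marginal products:", round(best, 4))
    print("output-maximizing share on the grid:", grid_best)
    print("output at a distorted 60% share:", aggregate_output(0.6, TFP))
    print("output at the efficient share:", aggregate_output(best, TFP))
```

Any allocation that keeps too few workers in the high-TFP city leaves a gap between the two marginal products and lowers total output relative to the efficient split, which is the intuition behind treating wage dispersion as a measure of misallocation.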

2.2 Extensions

We now consider the effect of several extensions of our basic model.

Ownership of Housing Stock: We have assumed that workers do not own the housing stock so that an increase in average housing prices lowers welfare holding aggregate output fixed. Suppose we assume instead that the housing stock is owned by the workers in equal proportions, irrespective of where they live. Think of workers as owning equal shares in a mutual fund that own all the housing in the US. All the equations are the same, except that welfare is given by   Z V Y L h P L i     i i i  i   where hi denotes per-capita housing consumption in city i. After  i  i Pi imposing the condition that the share of nominal expenditures on housing is equal to  , the change in housing prices has the same effect on nominal income as on the average price of housing. In this case, changes in housing prices only affect welfare through the effect of the dispersion of the nominal wage on aggregate output, but changes in the average price of housing has no effect on welfare. The most realistic case is of course the one where workers own housing in the city where they live. In this case, changes in house prices induced by our counterfactuals have redistributive effects: workers in some areas are made better off, while workers in other areas are made worse off. But in the aggregate, the conclusions are identical to the case in which the housing stock is owned by the workers in equal proportions, irrespective of where they live: housing prices only affect welfare through the effect of the dispersion of the nominal wage on aggregate output. Thus, estimates of the effects based on the baseline model remains valid in the aggregate.

Specialization by Cities: Our baseline model assumes that the output of a city is a perfect substitute for the products made by other cities. Suppose instead that each city makes a differentiated product with a production function given by $Y_i = A_i L_i$. The demand for the

10  (1 )   1   1   product of each city is determined by utility defined as U j   Yij  hj Z j where Uj  i  denotes utility in city j, Yij denotes consumption of city i's output in city j, and hj is per-capita

 1  Ai  housing in city j. The labor demand in each city is given by Li    and aggregate output Wi 

1   1  1  1  W  by Y   Ai    . These last two equations are identical to (1.3) and (1.6) when we   W   i  i   1 substitute with   1 .6 In words, a model with constant returns to scale and where cities 1   are specialized in production is isomorphic to a model where cities produce identical products and with a decreasing returns to scale production function. Finally, assuming that the output good is available in all cities at the same price, we can normalize the cost-minimizing price of

   1  1  one unit of the CES aggregate of the output good Yi  to one. With this normalization,  i  welfare is still given by (1.5).

Imperfect Labor Mobility: We can also relax the assumption of infinite labor mobility. Suppose that workers differ in preferences over locations. Specifically, suppose the indirect utility of worker j in city i is given by $V_{ji} = \epsilon_{ji} \frac{W_i Z_i}{P_i^{\beta}}$, where $\epsilon_{ji}$ is a random variable measuring the taste of individual j for city i as, for example, in Moretti (2010). A larger $\epsilon_{ji}$ means that worker j is particularly attached to city i for idiosyncratic reasons. We assume that each worker locates in the city where her utility $V_{ji}$ is maximized. In this case, workers tend to move toward cities with high real wages and good amenities, but they are not infinitely sensitive to small wage differences. The implication is that only marginal workers are indifferent across cities and all the other workers are infra-marginal.

6 Since labor is the only factor of production in the differentiated products model, we set the capital share to zero in the baseline model for comparability.

To make this model tractable, we assume that the $\epsilon_{ji}$ are independently distributed and drawn from a multivariate extreme value distribution. Specifically, we follow Kline and Moretti (2013) and assume the joint distribution of $\epsilon_{ji}$ is given by $F_g(\epsilon_1, \ldots, \epsilon_N) = \exp\left(-\sum_{i=1}^{N} \epsilon_i^{-\theta}\right)$, where the parameter $1/\theta$ governs the strength of idiosyncratic preferences for location and therefore the degree of labor mobility. If $1/\theta$ is large, many workers require large real wage or amenity differences to be compelled to move. On the other hand, if $1/\theta$ is small, most workers are not particularly attached to one community and will be willing to move in response to small differences in real wages or amenities.7 In this model, employment in a city is still given by (1.3) and aggregate output by (1.6). What is new is that the Rosen-Roback condition that differences in wages across cities are directly proportional to the ratio of housing prices to amenities (equation (1.5)) no longer holds. Instead, the (inverse) labor supply equation of a city is given by:

(1.8) $W_i \propto \frac{P_i^{\beta} L_i^{1/\theta}}{Z_i}$

This says that even when housing prices and amenities are the same in all cities, wages will differ between large and small cities with an elasticity that depends on the heterogeneity in preferences for location. Intuitively, higher wages in large cities are needed to compensate marginal individuals for living in the city. When we endogenize the housing price as a function of city size and the local housing supply elasticity and impose the condition that labor demand is equal to labor supply, the equilibrium nominal wage is given by:

(1.9) $W_i \propto \left( \frac{A_i^{\beta\gamma_i + 1/\theta}}{Z_i^{1-\alpha-\eta}} \right)^{\frac{1}{(1-\alpha-\eta) + (1-\eta)(\beta\gamma_i + 1/\theta)}}$

Finally, while utility differs across workers, average utility is the same in all cities and given by:

1 1 Z VY L  i (1.10)   i   i Pi In sum, conditional on the observed changes in the wage distribution, the implications for city size and aggregate output is the same as before and does not depend on  . But the effect of local TFP, amenities, and the local housing supply elasticity on the wage distribution (and by extension on aggregate output and average welfare) depends critically on  .
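
As a rough numerical check on how labor demand and the inverse labor supply curve pin down the local wage, the sketch below solves for the wage by bisection and compares it with a closed form. Everything here is an assumption made for illustration: labor demand is taken to be proportional to (A_i / W_i**(1 - eta))**(1 / (1 - alpha - eta)), the housing price is taken to be P_i = L_i**gamma_i, all proportionality constants are set to one, and the parameter values are hypothetical rather than taken from the paper.

```python
# Sketch under stated assumptions (constants normalized to one):
#   labor demand:          L = (A / W**(1 - eta)) ** (1 / (1 - alpha - eta))
#   housing price:         P = L ** gamma
#   inverse labor supply:  W = P**beta * L**(1 / theta) / Z

def labor_demand(wage, tfp, alpha, eta):
    """Employment demanded at a given wage (assumed functional form)."""
    return (tfp / wage ** (1.0 - eta)) ** (1.0 / (1.0 - alpha - eta))


def inverse_labor_supply(labor, amenity, beta, gamma, theta):
    """Wage needed to attract a given city size (assumed functional form)."""
    housing_price = labor ** gamma
    return housing_price ** beta * labor ** (1.0 / theta) / amenity


def closed_form_wage(tfp, amenity, alpha, eta, beta, gamma, theta):
    """Wage that equates the two curves above, solved by hand."""
    kappa = beta * gamma + 1.0 / theta
    d = 1.0 - alpha - eta
    return (tfp ** kappa / amenity ** d) ** (1.0 / (d + (1.0 - eta) * kappa))


def numeric_wage(tfp, amenity, alpha, eta, beta, gamma, theta,
                 lo=1e-6, hi=1e6):
    """Bisection (in logs) on the gap between the wage and the supply wage."""
    for _ in range(200):
        mid = (lo * hi) ** 0.5
        employment = labor_demand(mid, tfp, alpha, eta)
        supply_wage = inverse_labor_supply(employment, amenity,
                                           beta, gamma, theta)
        if mid < supply_wage:
            lo = mid  # wage too low: firms want more workers than will come
        else:
            hi = mid
    return (lo * hi) ** 0.5


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    params = dict(tfp=2.0, amenity=1.3, alpha=0.65, eta=0.25,
                  beta=0.4, gamma=0.5, theta=3.0)
    print("bisection:  ", numeric_wage(**params))
    print("closed form:", closed_form_wage(**params))
```

Under these assumptions, a larger gamma (a housing price that rises faster with city size) or a smaller theta (stronger attachment to place) raises the wage that a high-TFP city must pay, which is how the housing supply and mobility parameters feed into wage dispersion.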

7 None of the substantive results here hinge on the extreme value assumption. See Kline (2010) and Busso, Gregory, and Kline (2013) for analyses with a nonparametric distribution of tastes.
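
To see what the extreme value assumption implies for location choices, the following sketch simulates workers who each draw an independent taste shock for every city and pick the city with the highest utility. It assumes each shock follows the distribution F(e) = exp(-e**(-theta)) and that the systematic part of utility (the real wage times amenities) is a given number per city. The utility values, theta, and function names are hypothetical. Under these assumptions the share of workers choosing city i should be close to u_i**theta divided by the sum of u_k**theta over all cities.

```python
import math
import random


def predicted_shares(systematic_utility, theta):
    """Choice probabilities implied by independent Frechet taste shocks."""
    weights = [u ** theta for u in systematic_utility]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_city_shares(systematic_utility, theta, n_workers=20000, seed=0):
    """Simulate location choices and return the share of workers per city."""
    rng = random.Random(seed)
    counts = [0] * len(systematic_utility)
    for _ in range(n_workers):
        best_city, best_value = 0, -1.0
        for city, u in enumerate(systematic_utility):
            # If E is exponential(1), then E**(-1/theta) has the CDF
            # exp(-x**(-theta)). The max() guards against an exact zero draw.
            exp_draw = max(rng.expovariate(1.0), 1e-300)
            taste = exp_draw ** (-1.0 / theta)
            value = taste * u
            if value > best_value:
                best_city, best_value = city, value
        counts[best_city] += 1
    return [c / n_workers for c in counts]


if __name__ == "__main__":
    utilities = (1.0, 1.2, 1.5)  # hypothetical W*Z/P**beta for three cities
    for theta in (1.0, 3.0, 10.0):
        print(theta,
              [round(s, 3) for s in simulate_city_shares(utilities, theta)],
              [round(p, 3) for p in predicted_shares(utilities, theta)])
```

With a small theta (large 1/theta) workers spread fairly evenly across cities despite the utility differences, while a large theta concentrates them in the most attractive city, matching the description of 1/theta as the strength of idiosyncratic attachment.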

Heterogeneity in Labor Demand Elasticity: Our basic model assumes that the output elasticity with respect to labor is constant. We can relax this assumption. Specifically, suppose that total output in a city is the sum of the output produced in different industries indexed by j:

$Y_i = \sum_j Y_{ij}$

where $Y_{ij} = A_{ij} L_{ij}^{\alpha_j} K_{ij}^{\eta_j}$ denotes output of industry j in city i. Note that the labor and capital shares are now indexed by industry. In this case, there are two changes in the key endogenous variables. First, total employment in a city is given by:

1

1j   j   j Aij  Li     1 j  j Wi  Second, aggregate output Y is implicitly defined by

1 j 1   1j   j 1j   j j W 1 Aij    i j YWi 

Yij where     j is the aggregate labor share. All the other equations are the same. i j Y

Endogenous TFP and Amenities: We can also relax the assumption that TFP and amenities are exogenous. In practice, it is plausible to think that both TFP and amenities are endogenous to changes in city size. For example, a large literature in urban and regional economics posits that in the presence of agglomeration economies, $A_i$ depends positively on $L_i$ as denser cities are more productive. This would make our counterfactual exercise conceptually more complicated, as changes in city size would induce an endogenous feedback effect through the agglomeration economies. In practice, our estimates of aggregate effects are not affected if the elasticity of agglomeration is constant across cities. With constant elasticity, reallocation of workers across cities has no aggregate impact, because the increases in agglomeration economies experienced by cities that grow in size are exactly offset by the losses in agglomeration economies experienced by cities that shrink in size. The assumption of constant elasticity appears consistent with the empirical evidence on US manufacturing (Kline and Moretti, forthcoming).

In terms of amenities, a large literature posits that amenities might depend on city size and/or density. Our baseline assumption of exogenous amenities does not require that amenities are necessarily fixed (as in the case of weather). It allows amenities -- in particular public services like schools, public transit or police -- to expand (contract) as the counterfactual population of the area expands (contracts), as long as the per-capita availability remains stable at current levels. While this is realistic for many public services, it is possible that the per capita amount of other amenities depends on city size. This could happen, for example, if congestion is an increasing function of city size, i.e., if more people in a city mean more noise, traffic and pollution. It could also happen for the opposite reason, if more people improve urban amenities such as the variety of restaurants and cultural events. Glaeser (2010) has argued that cities like London, New York and San Francisco are attractive precisely because of their urban amenities stemming from high density of residents. Thus higher population density can create both negative and positive externalities. Irrespective of the sign, the possibility of this type of endogenous amenities makes our counterfactual exercise more complicated because changes in the number of workers induce an endogenous feedback effect on residents’ welfare through changes in amenities.8

Note that what matters is the aggregate effect. Our counterfactual exercise increases the size of some cities and reduces the size of other cities. If amenities decline in the first group and improve in the second group (or vice versa), the question that matters for us is the net effect in the aggregate. To see this more clearly, consider the following extension of our model. Suppose that the production function is still given by (1.1) and welfare by (1.2) but amenities are now given by

 Zi  ZLi i . Here, Zi denotes the component of per capita amenities exogenous to city size and

 Li the component that varies endogenously with the size of the city. City size is given by

1 1 (1 )(1 )  AZi i  Li    (1 )  and aggregate output and welfare by  Pi 

(1 )(1 )  1  1 (1 )(1 )  1     ZP  Y   A (1 )(1 ) i i   i   1   Z P  L  i    j j  j    j    

8 If the elasticity of endogenous amenities with respect to city size is constant across cities, then the net effect on aggregate welfare is zero, as gains in some cities are offset by losses elsewhere.

14  1 V Y Z j Pj  Lj . j

In the end, the size and sign of the parameter governing how amenities respond to city size is an empirical question. If amenities decline with city size, then our counterfactual will imply welfare losses that will reduce the welfare benefits stemming from increased output, as it increases the size of cities that are already large, further exacerbating congestion. On the other hand, if amenities improve with city size, then our counterfactual will imply welfare gains that will magnify the welfare benefits stemming from increased output. If amenities do not vary with city size, then our counterfactual will be measuring welfare gains correctly.

The existing evidence indicates that endogenous amenities either increase with city size or do not depend on it. Ahlfeldt, Redding, Sturm and Wolf (2014) and Diamond (2014) find that urban amenities slightly increase with density in Germany and the US. The most direct estimate of this parameter for the US is found in Albouy (2012). He shows that quality of life in a city is positively correlated with the city population when no controls are included. But when natural amenities such as weather and coastal location are controlled for, Albouy (2012) finds no relationship between city population and quality of life. This suggests that cities with better natural amenities are bigger (just as predicted by the equilibrium expression above for city size), but endogenous amenities are not significantly better or worse in large cities compared to small cities.9 If Albouy's estimates are correct, then allowing for endogenous amenities should not change our estimates of aggregate impacts very much.

Finally, it is worth highlighting an important caveat. It is in principle possible that inelastic housing supplies may contribute to the high TFP in cities like San Francisco and New York. This could happen, for example, if productivity is endogenous to college share (as in Moretti 2004 and Diamond 2013) and college workers are more willing to pay high house prices. In this case, TFP would be endogenous with respect to housing supply, and our framework would not be adequate to estimate counterfactual output.

9 This is true within the range of city sizes observed in the data. There is of course no guarantee that if one were to significantly expand the largest cities in the US, this would remain true.

3. Data and Key Facts About the Spatial Dispersion of Wages in the U.S.

3.1 Data

The ideal data for this project would have three features: they go back in time as much as possible; they have detailed and consistently defined geocodes; and they have detailed industry definitions. To approximate them, we use a combination of data sources taken from the 1964, 1965, 2008 and 2009 County Business Patterns (CBP); the 1960 and 1970 Census of Population; the 2008 and 2009 American Community Survey (ACS); and the 1964 and 2009 Current Population Survey (CPS). Since the earliest year for which we could find city-industry level data on wages and employment is 1964, we focus on changes between 1964 and 2009.

Employment, Wages and Rents: Data on employment and average wages are available at the county and county-industry level from the CBP and are aggregated to the MSA and MSA-industry level. The main strength of the CBP is its fine geographical-industry detail and the fact that data are available as far back as 1964.10 The main limitation of the CBP is that it does not provide worker level information, but only county aggregates, and it lacks information on worker characteristics. Obviously, differences in worker skill across cities can be an important factor that affects average wages. In addition, union contracts may create a wedge between the marginal product of labor and the wage, as union wages may contain economic rents. We augment CBP data with MSA-level information on worker characteristics from the Census of Population, the ACS and CPS: three levels of educational attainment (high school drop-out, high school, college); race; gender; age; and union status. To purge the average wage of differences in worker characteristics across cities, we calculate a residual wage that conditions on geographical differences in the composition of the workforce. Specifically, we use nationwide individual level regressions based on the CPS in 1964 and 2009 to estimate the coefficients on worker characteristics, and use those coefficients to compute residual wages based on city averages.11 We end up with a balanced sample of 220 MSAs with non-missing values in 1964 and 2009.12 The Data Appendix provides additional information on how we defined the variables, discusses the limitations of the data, and presents summary statistics. Appendix Figure A1 shows that in 2009 the estimated average residual wage obtained from MSA-level data correlates well with the average residual wage obtained from individual level data. (We cannot do the same for 1964, which is why we rely on MSA-level data.)

10 The published tabulations of the Census of Population provide MSA level averages of worker characteristics, but the individual level data on employment and salary with geocodes are not available from the public version of the Census of Population on a systematic basis until 1980. Only a third of metro areas are identified in the 1970 Census.

11 Residual wage is defined as $W - X'b$, where W is the average wage in the MSA, X is the vector of average worker characteristics in the MSA, and b is a vector of coefficients on worker characteristics from individual level regressions estimated on nationwide samples.

12 These MSAs account for 71.6% and 72.8% of US employment in 1964 and 2009, respectively, and 74.3% and 76.3% of the US wage bill in 1964 and 2009.
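
The residual wage adjustment is simple arithmetic once the nationwide coefficients are in hand. The sketch below is a minimal illustration of the definition W - X'b for a single MSA; the characteristics, coefficient values and function name are hypothetical and are not taken from the paper's data.

```python
def residual_wage(avg_wage, avg_characteristics, coefficients):
    """Return W - X'b for one MSA.

    avg_wage            -- average wage in the MSA (W)
    avg_characteristics -- MSA averages of worker characteristics (X)
    coefficients        -- nationwide regression coefficients (b)
    """
    if len(avg_characteristics) != len(coefficients):
        raise ValueError("X and b must have the same length")
    predicted = sum(x * b for x, b in zip(avg_characteristics, coefficients))
    return avg_wage - predicted


if __name__ == "__main__":
    # Hypothetical example: two characteristics (college share, union share).
    b = [0.5, 0.2]            # made-up nationwide coefficients
    college_town = residual_wage(3.0, [0.4, 0.3], b)
    other_city = residual_wage(2.9, [0.1, 0.3], b)
    print(college_town, other_city)
```

In this made-up example the city with the higher raw wage ends up with the lower residual wage, because much of its raw wage advantage is attributed to its more educated workforce.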

Housing Supply: Data on housing supply are from Saiz (2010). For each MSA, these data provide the overall elasticity of housing supply $\gamma_i$ as well as its two main determinants: land availability and land use regulations. Land use regulations are measured using the Wharton Residential Land Use Regulatory Index, originally obtained by Wharton researchers through a detailed survey of municipalities in 2007 and aggregated up to the MSA level by Saiz. It is the best available measure of differences in land use restrictions. We follow his estimates (Table 5, column 2) to divide overall supply elasticity into the part that reflects land use regulations and the part that reflects land availability.

Technology: Finally, to take the model to the data, we need to specify the technology parameters. In our baseline estimates, we assume a labor share $\alpha$ of 65 percent and a capital share $\eta$ of 25 percent, which imply that the profit share $1 - \alpha - \eta$ is 10 percent. The assumption that the labor share is 65 percent is consistent with BEA data (BEA, 2013), data in Piketty (2014), and Karabarbounis and Neiman (2014). The assumption that the profit share is 10 percent is consistent with Basu and Fernald's (1997) estimates of the returns to scale in U.S. manufacturing as well as with estimates in Atkeson, Khan and Ohanian (1997). In additional estimates, we relax this assumption. First, we provide various alternative estimates under different assumptions on $\alpha$ and $\eta$. In these models, we either vary $\alpha$ and $\eta$ individually or we vary the degree of returns to scale $\alpha + \eta$. Second, in separate models, we relax the assumption that technology is the same across all cities and years by allowing the technology parameters to vary by industry and over time. Because the industry mix differs across cities, this allows different cities to have different technologies. In practice, we use a dataset that is analogous to the one used in the baseline analysis, but that includes separate observations (and a separate technology) for each 1-digit industry in each city in each year. We use data on the labor share by industry in 1964 from Close and Shulenberg (1971) and similar data for 2009 from BEA (2013).13

13 No historical data exist on capital share by industry or by city. In both years, we retain the assumption of a 10% profit share (Basu and Fernald, 1997).

3.2 Changes in the Spatial Dispersion of Nominal Wages 1964-2009

The model in the previous section highlights the importance of wage differences across cities for aggregate output. It indicates that larger wage differences result in lower output, everything else constant. Intuitively, wage dispersion across cities reflects variation in the marginal product of labor. If labor is more productive in some areas than in others, then aggregate output may be increased by reallocating some workers from low productivity areas to high productivity ones. For example, in 2009 average nominal wages in San Jose, CA were twice as large as nominal wages in Brownsville, TX, presumably because the marginal product of labor in San Jose is twice as large. If some workers were moved from Brownsville to San Jose, aggregate GDP would increase because more workers would have access to whatever productive factor generates high productivity in San Jose. In principle, aggregate output is maximized when the marginal product of labor is equalized across locations.

Empirically, the spatial distribution of nominal wages across US metropolitan areas is significantly more dispersed in 2009 than it was in 1964, suggesting a negative effect on output growth. Figure 1a plots the weighted distribution of the unconditional average wage in an MSA in 1964 and 2009 (after removing the mean US wage in each year), where the weights are MSA employment in the relevant year. It is clear that the 2009 distribution is significantly more dispersed. It is also clear that the right tail -- which includes cities with average wages that are 50% above the mean -- has become thicker.14 The bump in the right tail includes New York, San Francisco and San Jose.

Table 1a quantifies the change in the dispersion in the average nominal wage. Panel A indicates that the employment-weighted standard deviation (column 1), interquartile range (column 2), and range (column 3) of the log average MSA wage increased significantly from 1964 to 2009 (by .07 log points, .10 log points, and .38 log points respectively).
Panel B controls for the average wage in nine Census divisions and suggests that the increase in wage dispersion is not just a regional phenomenon but occurs even within Census divisions. Indeed, controlling for regional wage differences generates a larger increase in wage dispersion, so regional wage differences have declined over time. Panel C shows that a non-trivial part of the increase in dispersion is due to three large cities with high output growth over the last 50 years: New York, San Francisco, and San Jose. Dropping these three cities has a significant effect on the right tail of the distribution. When we exclude these three cities, the standard deviation and the range of the log average wage increase by much less from 1964 to 2009.
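
For readers who want to reproduce this kind of dispersion table on their own data, the sketch below computes an employment-weighted standard deviation, a weighted interquartile range and the range of log wages. It is only an illustration: the quantile rule (smallest value whose cumulative employment share reaches q) is one convention among several and is not necessarily the one used for Table 1a, and the wage and employment numbers are made up.

```python
import math


def weighted_std(values, weights):
    """Standard deviation of values, weighting each by its employment."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return math.sqrt(variance)


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (assumed rule)."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative / total >= q:
            return value
    return pairs[-1][0]


def dispersion_summary(log_wages, employment):
    """Weighted std, weighted interquartile range, and range of log wages."""
    return {
        "std": weighted_std(log_wages, employment),
        "iqr": (weighted_quantile(log_wages, employment, 0.75)
                - weighted_quantile(log_wages, employment, 0.25)),
        "range": max(log_wages) - min(log_wages),
    }


if __name__ == "__main__":
    # Hypothetical log wages (relative to the mean) and employment for 4 MSAs.
    log_wages = [-0.20, -0.05, 0.10, 0.45]
    employment = [100, 300, 400, 200]
    print(dispersion_summary(log_wages, employment))
```

Weighting matters because a high-wage city only widens the weighted distribution to the extent that many workers are employed there; the range, by contrast, ignores employment entirely.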

14 Weighting is empirically important. The unweighted distribution shows a more limited increase in dispersion (see Appendix Figure A2). As our model makes clear, the weighted distribution is the relevant one for our purposes.

These findings are not driven by differences in observable worker characteristics across cities. Figure 1b and Table 1b present the spatial dispersion of the average of the residual wage in each MSA.15 Controlling for changes in worker composition does not alter the picture of increased spatial dispersion between 1964 and 2009. The picture that emerges indicates that (i) spatial dispersion has increased significantly; (ii) the increase is not concentrated in one specific region; and (iii) New York, San Francisco and San Jose account for an important part of the increase.

These findings are generally robust. First, all the results are identical if we use 2007 data (pre-recession) instead of 2009.16 Second, our approach of controlling for worker characteristics assumes that the effect of worker characteristics is the same everywhere in the country, but it is possible that the return to these characteristics (such as education) varies across cities (Dahl, 2003). To see whether this matters empirically, we estimate models where we allow the effect of worker characteristics on wages to vary by region or by state. When we do this, the resulting spatial distribution of wage residuals is very similar to that shown in Table 1b.

A potential concern is that we cannot control for unobserved differences in worker ability. It is possible that average unobserved ability differs between cities, and that some of the documented wage differences across cities are not differences in the marginal product of labor, but differences in the quality of labor. We cannot completely rule out the possibility of unobserved worker heterogeneity. However, three considerations are worth mentioning. First, the fact that the unconditional distribution (Figure 1a) is basically the same as the distribution conditional on observable worker characteristics (Figure 1b) should alleviate the concern at least in part.
Second, recent evidence based on longitudinal data that follow workers moving from low wage cities to high wage cities indicates that this problem may be limited once education is controlled for. Baum-Snow and Pavan (2012), for example, find that sorting on unobserved ability within education group contributes little to observed differences in wages across cities of different size. Similarly, De La Roca and Puga (2012) find that workers in cities that are bigger and have higher wages do not have higher unobserved initial ability, as reflected in individual fixed-effects. These findings are consistent with Glaeser and Mare (2001), who show that workers who move from low wage areas to high wage areas experience significant wage increases and that this is not just the result of sorting by ability. We also point out that what matters for our analysis is not merely the possibility of differences in unobserved ability in a cross-section of cities. Rather, it is whether these differences have changed differentially over time. Third, we have explored the relationship between worker ability and nominal wages. Specifically, we have used NLSY data to relate the average AFQT scores to the nominal wage across metropolitan areas. These data indicate that workers in high nominal wage MSAs tend to have higher AFQT scores, but the correlation attenuates and becomes statistically insignificant once we introduce controls for education, race, and ethnicity.17

The flipside of the increase in the dispersion in wages is an increase in the dispersion in housing costs, since in equilibrium workers need to be compensated for housing costs. Panel A in Appendix Table A2 shows that the dispersion in average rent has increased between 1964 and 2009. Rents are a good approximation to the user cost of housing. In panel B we show the corresponding figures for housing prices. The increase in the spatial dispersion of housing prices is larger than that of housing rents.

15 The five cities for which the difference between unconditional and conditional wages is the largest are Madison, WI; Ann Arbor, MI; Boston, MA; Champaign-Urbana, IL; and State College, PA. In other words, controlling for education and other worker characteristics has the largest impact in university towns and other cities with a very high density of college educated workers. By contrast, the five cities for which the difference between unconditional and conditional wages is the smallest are McAllen, TX; Brownsville, TX; Visalia, CA; Yakima, WA; and Bakersfield, CA. Controlling for education and other worker characteristics has the smallest impact in cities that have a labor force with low levels of schooling and high levels of minority workers.

16 As explained in the Appendix, to increase the sample size, our 2009 data actually include 2008 and 2009.

4. Empirical Findings

We now take the model to the data. First, we decompose aggregate GDP growth into the contribution of each US city and compare it with a naïve “accounting” calculation (section 4.1). Second, we turn to the increased dispersion of wages and calculate how much larger US GDP would be in the counterfactual where the spatial dispersion of wages is fixed from 1964 to 2009 (section 4.2). We then turn to the causes of increased dispersion of wages to discuss its welfare implications (section 4.3). Finally, we discuss limitations of our approach (section 4.4).

17 We first regress log AFQT scores on log nominal wages. We then replicate the same regression controlling for the same vector of controls used in panel B of Table 1b. Both regressions are weighted by MSA employment. While the coefficient is positive in the first regression, it is statistically indistinguishable from zero in the second regression. However, the small sample size precludes definitive conclusions.

4.1 Local Growth and Aggregate Growth

Equation (1.7) allows us to calculate the contribution of each city to aggregate growth between 1964 and 2009. This calculation is presented in Table 2 and Figures 2a-2e. Figure 2a plots the percentage contribution of the 220 cities to aggregate growth from 1964 to 2009 (on the y-axis) against the growth of local GDP as a percentage of aggregate GDP growth over the same period (on the x-axis). To be clear, the calculation on the y-axis is based on the model (specifically on equation (1.7)) while the x-axis is the growth in local GDP as a ratio of aggregate GDP growth.18 We call the latter the naïve “accounting” calculation. The solid line is the 45 degree line, so cities that lie above the 45 degree line contribute more to growth than is apparent from their measured GDP growth, and cities below the 45 degree line contribute less to growth than suggested by their output growth. If all the observations were to lie on the 45 degree line, the growth rate of aggregate GDP would simply be given by the weighted average of local GDP growth.

The first feature that is apparent in Figure 2a is that the dispersion of the accounting measure of the contribution of each city is much wider than that of the actual contribution. The range of the accounting calculation of the contribution of a city to aggregate growth is 20 percent while the range of the model based calculation is only 5 percent.

The second and most important feature of Figure 2a is that there are sizable and systematic differences between local growth and local contribution to aggregate growth. For example, growth of New York’s GDP was 12 percent of aggregate output growth from 1964 to 2009. However, viewed through the lens of the Rosen-Roback model, New York was responsible for less than 5 percent of aggregate output growth. The difference arises because much of the output growth in New York was manifested as higher nominal wages, which increased the overall spatial misallocation of labor. At the other extreme, Detroit’s GDP fell dramatically from 1964 to 2009.
Although one might expect the contribution of Detroit to be negative because of the decline in measured local output, the net contribution of Detroit to aggregate output growth is positive. The difference in the case of Detroit arises because much of the decline in local GDP in Detroit was driven by a decline in nominal wages. In Detroit, nominal wages in 1964 were significantly higher than the nationwide mean, so the decline in the nominal wage from 1964 to 2009 lowered the overall wage dispersion, which increases aggregate output. Other cities display large differences: Chicago and Los Angeles, for example, are well above the 45 degree line.

18 yt 1 yi, t 1 yi, t The "accounting" calculation is based on the accounting identity     Li, t where the t and t 1 yt i yi, t yi subscripts denote time, y denotes aggregate GDP per worker (in the country), yi is GDP per worker in city i, and

yi, t 1 yi, t Li the employment share in city i. The "contribution" of city i is measured by   Li, t . yi, t yi

While their contribution to aggregate growth as calculated from equation (1.7) is not unlike New York’s contribution, the growth of their GDP as a fraction of overall growth is much smaller. Overall, Figure 2a shows that the relation between local growth and local contribution to aggregate growth is positive, but with an elasticity that is much less than one. A regression of the variable on the y-axis on the variable on the x-axis yields a coefficient (standard error) of .295 (.018), with an intercept equal to .320 (.033). The slope is statistically different from one and the intercept is statistically different from zero. Cities with large positive shocks to their local economy tend to contribute less to aggregate growth than their local gains would suggest. At the same time, cities with large negative shocks tend to contribute more than their local losses would suggest. This discrepancy between local growth and local contribution reflects changes in each city's relative wage. Figures 2b-2e and Table 2 separately present the contribution of four groups of cities. Figure 2b presents the actual vs. the accounting calculation of the contribution of New York, San Francisco, and San Jose to aggregate output growth from 1964 to 2009. All three cities lie significantly below the 45 degree line. Although local output grew rapidly in all three cities, so did the gap between local wages and the nationwide wage. The first row in Table 2 indicates that although local GDP growth was almost 20 percent of aggregate US output growth, the actual contribution of these three cities was much lower, at 6 percent of US output growth. Figure 2c shows the contribution of 37 cities in the Rust Belt. As can be seen, all the Rust Belt cities lie above the 45 degree line: the actual contribution of Rust Belt cities is larger than suggested by observed changes in local GDP. What is driving this discrepancy is that nominal wages in Rust Belt cities were typically above the nationwide mean in 1964.
Since 1964, wages there have fallen and have thus narrowed the gap between wages in the Rust Belt and the nationwide mean. What is perhaps more surprising is that although local GDP growth is negative in every Rust Belt city, the actual contribution of every Rust Belt city to aggregate growth is positive. Although the decline in labor demand caused by the decline of manufacturing presumably implies that the contribution of the Rust Belt cities to aggregate growth would be negative, the allocative effect of the sharp decline in the wage gap has a larger effect on aggregate growth. Table 2 shows that the Rust Belt cities contributed as much as New York, San Jose, and San Francisco (taken together) to aggregate output, despite the sharp decline in GDP in the Rust Belt cities. Figure 2d presents the contribution of 86 Southern cities. In the period under consideration, the South of the US has grown more rapidly than the rest of the country. Washington, DC, Houston, Atlanta, and Dallas are among the five fastest growing cities in the US (the fastest growing city is New York). All cities lie significantly below the 45 degree line because the gap between local wages and the nationwide wage increased in all these cities. Therefore, the contribution of the large Southern cities to aggregate growth is less than suggested by their output growth. The fact that relative wages increased in the large Southern cities also suggests that the standard narrative, in which growth in these cities was driven by improved amenities (hot weather became more tolerable with air conditioning) and cheap housing, is not the entire story. If the only change in the South was that amenities improved or housing became cheaper, then relative wages should have fallen in these cities, whereas the opposite is true. Taken together, Southern cities were responsible for 42 percent of aggregate growth in the US (Table 2).
This is sizeable to be sure, but 20 percentage points lower than what one might infer from the observed growth of GDP in the Southern cities. Figure 2e presents the contribution of the remaining large US cities. This group includes 19 large cities with 2009 employment above 600,000 that are not in any of the previous three groups. Here, the story is more mixed. There are cities where the observed local growth almost exactly measures the actual contribution. These are cities such as Boston, Portland, and Salt Lake City. There are also cities where the growth contribution is larger than suggested by local growth. These are cities such as Chicago, Los Angeles, and Philadelphia, where relative wages have fallen and the gap in the marginal product of labor relative to the rest of the country has narrowed. Finally, there are also cities where the growth contribution is smaller than suggested by the local output growth numbers. For example, Phoenix is one of the fastest growing metro areas in the country; based on the accounting measure, GDP growth in Phoenix “accounts” for six percent of aggregate US growth. Yet much of this growth has been accompanied by a decline in wages in Phoenix, which in the framework of the Rosen-Roback model must be driven by a decline in relative housing prices or an improvement in relative amenities. And since wages in Phoenix were already below the nationwide mean in 1964, the further decline in wages increases the wage gap. Las Vegas and Riverside have similar experiences. Essentially, Phoenix, Las Vegas and Riverside have attracted many residents because of good weather and an abundant supply of cheap housing, but this reallocation results in a loss in aggregate output because it has brought more people to work in cities where the marginal product of labor is low. This effect is reminiscent of the Dutch disease in two-sector models of growth.
The bottom line is that almost three quarters of aggregate US output growth from 1964 to 2009 was driven by local forces in Southern US cities and the group of 19 "large" cities. And despite the large difference in local GDP growth between New York, San Jose, and San Francisco and the Rust Belt cities, both groups of cities had roughly the same contribution to aggregate output growth (about 6 percent). In Table 3 we probe the robustness of our estimates using different assumptions on technology. Recall that our baseline estimates assume that α = .65 and η = .25 in all cities in both years (column 1). In columns 2 and 3, we keep the returns to scale constant and alter α or η. The estimates are almost identical to the baseline estimates. In columns 4 to 7 we alter the labor or capital share to vary the returns to scale. In columns 4 and 5, we increase the returns to scale, as α + η increases from .9 to .95. In columns 6 and 7 we decrease the returns to scale, as α + η decreases from .9 to .85. Entries are virtually unchanged. So far we have constrained the technology to be the same in all cities and industries. Next, we relax our assumptions on technology by allowing technology to vary across cities and years. Specifically, we allow labor and capital shares in 1964 and 2009 to be different in different industries. Because the geographical locations of industries are not the same, this allows different cities to have different technologies. In practice, we use a dataset that is analogous to the one used in the baseline analysis, but that includes separate observations for each 1-digit industry in each city in each year. We assume that workers can move freely across industries within each city, so that the wage is the same. The entries in column 8 indicate that the results are not very sensitive to this generalization. We have performed several additional checks, and found our results to be generally robust.
For example, in some specifications the residual wage is estimated using models where the coefficients on worker characteristics are allowed to vary not just by year, but also by state. Results did not change significantly. We have also re-estimated our models dropping the two cities that are outliers in Figure A1, and found similar estimates.19
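The regression reported for Figure 2a can be read as a simple fitted line. The sketch below is only an illustration of that line, not part of the paper's estimation: it assumes both axes are measured in percent of aggregate growth, and the function names are hypothetical. It evaluates the predicted model-based contribution for a given accounting share and finds where the fitted line crosses the 45 degree line.

```python
# Coefficients quoted in the text for the Figure 2a regression.
# Assumption: both variables are expressed in percent of aggregate growth.
INTERCEPT = 0.320
SLOPE = 0.295


def predicted_contribution(accounting_share: float) -> float:
    """Model-based contribution predicted by the fitted line (hypothetical helper)."""
    return INTERCEPT + SLOPE * accounting_share


def crossing_point() -> float:
    """Accounting share at which the fitted line meets the 45 degree line."""
    return INTERCEPT / (1.0 - SLOPE)


if __name__ == "__main__":
    # Illustrative inputs: a shrinking city, a stagnant city, the crossing
    # point, and New York's accounting share of 12 percent from the text.
    for share in (-1.0, 0.0, crossing_point(), 12.0):
        print(f"accounting {share:6.2f} -> predicted {predicted_contribution(share):5.2f}")
```

Under these assumptions, cities whose accounting share lies above the crossing point are predicted to fall below the 45 degree line, and cities with negative accounting shares are predicted to contribute more than their local losses suggest, which matches the pattern described in the text.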

4.2 Wage Dispersion and Aggregate Growth

Equation (1.7) decomposes the growth rate of aggregate output into two components: growth of local TFP and change in the spatial dispersion of wages. It indicates that increases in the spatial dispersion of wages negatively affect aggregate growth: for a given local TFP growth,

19 We have also re-estimated our 2009 model dropping the restaurant sector, as one where minimum wage workers are particularly prevalent and therefore the assumption that equates wages with marginal product of labor may be violated. The correlation of the share that each city contributes to 2009 output with and without the restaurant sector is .99. We can’t do the same for 1964, since industry definition in 1964 is less disaggregated.

a more dispersed spatial wage distribution results in slower growth. Empirically, we have seen that the spatial dispersion of wages across US cities increased significantly from 1964 to 2009: the standard deviation, for example, is now double what it was in 1964. We now quantify the effect of this increase in wage dispersion on the rate of growth of aggregate output between 1964 and 2009 and on the level of output in 2009. We estimate counterfactual output under the scenario where the dispersion of wages across cities remained constant between 1964 and 2009. Specifically, we calculate the counterfactual where the relative wage of a city in 2009 is equal to the relative wage of the same city in 1964. We take local TFP in each city as fixed and allow labor and capital to endogenously reallocate across cities in response to the change in the distribution of local housing supply and amenities. Clearly wages are an endogenous variable. As we have seen, they are determined by local TFP, amenities and elasticity of housing supply (equation (1.4)). But the effect of changes in the wage dispersion on aggregate output growth does not depend on the sources of wage dispersion. (The effect on welfare does depend on the source of wage dispersion. We take up the question of the exact mechanism underlying the change in the spatial wage dispersion in the next section.) In terms of output growth, when we take equation (1.7) to the data, we find that the growth of local TFP boosts aggregate GDP by 2.5 percent a year from 1964 to 2009, holding the spatial dispersion of wages fixed. The increased spatial dispersion of wages lowers aggregate GDP growth by 0.3 percent a year, holding constant local TFP. The net effect of these two forces is that aggregate GDP grew by 2.2 percent a year from 1964 to 2009.
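As a rough consistency check on this decomposition, the sketch below compounds the two growth rates over the 45 years between 1964 and 2009. It assumes simple annual compounding, which is an assumption of this illustration and not necessarily how the paper's estimates are constructed; the function and constant names are hypothetical.

```python
YEARS = 2009 - 1964        # 45 years
TFP_ONLY_GROWTH = 0.025    # annual growth with wage dispersion held fixed (from the text)
DISPERSION_DRAG = 0.003    # annual growth lost to rising wage dispersion (from the text)


def level_gap(years: int, g_counterfactual: float, g_actual: float) -> float:
    """Proportional gap in the final output level implied by two growth rates.

    Assumption: both rates compound annually over the same number of years.
    """
    return (1.0 + g_counterfactual) ** years / (1.0 + g_actual) ** years - 1.0


actual_growth = TFP_ONLY_GROWTH - DISPERSION_DRAG
gap = level_gap(YEARS, TFP_ONLY_GROWTH, actual_growth)

if __name__ == "__main__":
    print(f"actual annual growth: {actual_growth:.3%}")
    print(f"implied 2009 level gap: {gap:.1%}")
```

Under this assumption, a 0.3 point annual shortfall accumulates to a level gap on the order of 14 percent by 2009, the same order of magnitude as the level effect reported from Table 4.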
In other words, under the counterfactual scenario where wage dispersion did not increase in the U.S., aggregate yearly GDP growth from 1964 to 2009 would have been 0.3 percentage points higher. In terms of output level, the increase in the spatial dispersion of wages resulted in a significantly lower level of output in 2009. This effect is quantified in Table 4. The first row indicates that if the spatial dispersion of relative wages had not changed, 2009 U.S. GDP would be 13.5% higher. Given that US GDP in 2009 was $14.5 trillion, this implies an additional annual aggregate income of $1.95 trillion. Given a labor share of .6, this amounts to an increase of $1.1 trillion in the wage bill, or $6958 additional salary per worker (if the number of workers was fixed).20 More than half of US workers would move under this scenario (column 2). In the second row of Table 4, we set the distribution of nominal wages in 2009 equal to its 1964 level only in New York, San Francisco and San Jose. Remember that the increase in

20 The salary increase would be smaller if more workers decide to enter the labor market in response to the higher salary.

relative wages from 1964 to 2009 was particularly pronounced in these cities. In addition, these cities are among the largest cities in the US in terms of TFP, so the effect on aggregate output growth of the change in the wage in these three cities is likely to be large. Aggregate output would increase by 13.2% if the relative average wage in only these three cities is set to its 1964 level. 54% of U.S. workers would relocate.21 The third row illustrates the effect on aggregate output when the distance from the mean wage in the Rust Belt cities is set to its 1964 gap. As can be seen, the effect is small, as aggregate 2009 output increases by 0.5% and only 9% of workers relocate. The last row shows the effect on aggregate output when the distance from the mean wage in Southern cities is set to its 1964 gap: aggregate 2009 output would fall by 0.4%. The changes in the economic geography of the US implied by Table 4 are massive and probably not realistic. Changing the geographical location of American workers to the point that brings wages back to their 1964 level would likely take several decades. One way to see how extreme the changes implied by this scenario are is to compare the implied mobility rate with the one observed in reality. Consider that less than 20% of workers change MSA every 10 years. By comparison, the scenario in row 1 of Table 4 involves the relocation of more than half of the US work force. Table 5 shows the equivalent of Table 4, but for partial adjustment. We scale partial adjustment based on the fraction of movers. For example, the second row in the table shows that if 2009 wages were set so that only 50% of workers were to relocate, the output gain in 2009 is 13.2%.
The other rows show that if 2009 wages were set so that only 40%, 30%, 20% or 10% of workers were to relocate, the output gain would be respectively 11.8%, 9.4%, 6.5%, and 3.4%. We consider the scenario where 20% of workers change MSA, corresponding to the counterfactual shown in the fifth row of Table 5, as our benchmark scenario, as it is the closest to the typical mobility rate that we observe over a decade. Table 6 shows counterfactual employment for selected cities under full adjustment and partial adjustment. In particular, in column 1, counterfactual employment is computed setting 2009 relative wages to their 1964 levels in all cities (first row of Table 4). In column 2, counterfactual employment is computed moving 2009 relative wages toward their 1964 levels in all cities up to the point where 20% of U.S. workers change MSA (row 5 of Table 5).
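The partial adjustment figures quoted from Table 5 imply that the output gain per relocated worker shrinks as more workers move. The sketch below tabulates this from the numbers in the text only; the helper names are hypothetical and nothing here reproduces the paper's model.

```python
# (share of workers relocating, in percent; 2009 output gain, in percent),
# as quoted in the text for the partial adjustment rows of Table 5.
PARTIAL_ADJUSTMENT = [(10, 3.4), (20, 6.5), (30, 9.4), (40, 11.8), (50, 13.2)]


def average_gain_per_point(rows):
    """Output gain per percentage point of movers, for each scenario."""
    return [(movers, gain / movers) for movers, gain in rows]


def marginal_gain_per_point(rows):
    """Extra output gain per extra percentage point of movers between scenarios."""
    result = []
    prev_movers, prev_gain = 0, 0.0
    for movers, gain in rows:
        result.append((movers, (gain - prev_gain) / (movers - prev_movers)))
        prev_movers, prev_gain = movers, gain
    return result


if __name__ == "__main__":
    for (movers, avg), (_, marg) in zip(
        average_gain_per_point(PARTIAL_ADJUSTMENT),
        marginal_gain_per_point(PARTIAL_ADJUSTMENT),
    ):
        print(f"{movers:2d}% movers: average {avg:.3f}, marginal {marg:.3f}")
```

Both the average and the marginal gain decline as the share of movers rises, which is one way to see why the 20% benchmark captures about half of the gain from relocating 50% of workers.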

21 The size of these three cities would grow. It is important to understand, however, that in general equilibrium the spatial relocation of labor would affect not only these three cities, but all cities in the U.S.

By a vast margin, New York is the city that would experience the largest percentage increase in employment: a staggering 787% increase in the case of full adjustment. San Jose and San Francisco would grow by more than 500%, while Austin would increase by 237%. All these cities are important innovation clusters and have experienced rapid wage growth since 1964, mostly driven by human capital intensive industries. Surprisingly, Fayetteville is also in the top group. What distinguishes this MSA is the fact that its economy has changed enormously over the past three decades due to the location of Walmart headquarters. The median city, Sheboygan, WI, would lose 80% of its employment. The bottom of the table reports the cities that would experience the largest decline in employment. This group includes Rust Belt former manufacturing centers, like Mansfield, OH, Muncie, IN and Flint, MI. Under our counterfactual scenario, virtually all of Flint’s workers would move and relocate to other cities. Column 2 shows the counterfactual employment for selected cities under the more plausible intermediate scenario where 20% of workers change city of residence. New York remains the city that would experience the largest percentage increase in employment, but the increase is only 179%. San Jose, San Francisco, Fayetteville and Austin would grow by 149%, 147%, 118% and 102%, respectively. The median city, Sheboygan, would lose a third of its employment. The bottom of the table indicates that 78% of Flint’s workers would move and relocate to other cities. Three considerations are worth keeping in mind. First, these are intended to be long term benchmarks.
They are based on the assumption that as the population expands in an area, local services also expand to keep the per-capita availability of schools, parks, public transit and other public amenities stable at their current levels. Thus, one should not think of these counterfactuals as taking place overnight with public services held fixed. Rather, one should think of these counterfactuals as taking place slowly over the long run, matched with a steady increase in the supply of public services so that the per-capita level of public services is unchanged. Second, while the counterfactual employment levels for the top group of cities in column 2 imply city sizes that are very large, they are not completely implausible. For example, the Association of Bay Area Governments (which is made up of all municipalities in the San Francisco Bay Area) has recently adopted a formal economic development plan for the region that calls for the addition of enough housing units to increase the region’s population by 80% in 2030 (ABAG, 2013). This increase is smaller than the one estimated in column 2 of Table 6 for the San Francisco MSA, but not too far off.

Third, these estimates are obtained assuming that the total number of workers in the US is fixed. In reality, if wages were to rise on average, total employment is likely to increase due to international migration and increased domestic labor supply. This would further increase counterfactual output. Thus, our estimates of output gains are to be interpreted as a lower bound. In Appendix Table A3 we probe the robustness of our estimates using different assumptions to calibrate the model parameters. In rows 2 and 3, we keep the returns to scale constant and alter α or η. In rows 4 to 7 we alter the labor or capital share to vary the returns to scale. We find that the results are not sensitive to changes in labor or capital share for a given degree of returns to scale. But they are quantitatively sensitive to the degree of decreasing returns to scale. The closer the sum α + η is to 1, the larger the output gain. This makes intuitive sense, because the sum α + η governs the returns to scale. With α + η close to 1, our technology approaches constant returns to scale and the most productive cities attract an increasingly larger share of the economic activity of the country. Finally, in the bottom row, we allow labor and capital shares in 1964 and 2009 to be different in different industries and years. Since cities have different shares of each industry, this model allows technology to vary across cities and years, as a function of their industry mix.
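One way to see why the results are sensitive to α + η is sketched below. It assumes a Cobb-Douglas technology with labor share α, capital share η, a fixed local factor with share 1 − α − η, and capital supplied elastically at a common price; this section does not restate the production function, so that setup is an assumption of the illustration. Under it, local labor demand responds to local TFP with elasticity 1/(1 − α − η). The parameter splits used for α + η = .95 and .85 are hypothetical, since the text only reports the sums.

```python
def labor_demand_elasticity(alpha: float, eta: float) -> float:
    """Elasticity of local labor demand with respect to local TFP.

    Assumption: Cobb-Douglas output with a fixed local factor and elastically
    supplied capital, which gives an elasticity of 1 / (1 - alpha - eta).
    """
    returns_to_scale = alpha + eta
    if not 0.0 < returns_to_scale < 1.0:
        raise ValueError("alpha + eta must lie strictly between 0 and 1")
    return 1.0 / (1.0 - returns_to_scale)


# Baseline from the text, plus two hypothetical splits for the other sums.
SCENARIOS = {
    "baseline (alpha + eta = .90)": (0.65, 0.25),
    "higher returns (.95, hypothetical split)": (0.70, 0.25),
    "lower returns (.85, hypothetical split)": (0.60, 0.25),
}

if __name__ == "__main__":
    for label, (alpha, eta) in SCENARIOS.items():
        print(f"{label}: elasticity {labor_demand_elasticity(alpha, eta):.1f}")
```

In this sketch the elasticity doubles when α + η moves from .9 to .95, so a given TFP advantage pulls in far more labor, which is consistent with the larger output gains reported as the technology approaches constant returns.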

4.3 Sources of Wage Dispersion: Housing Supply vs. Amenities

We have shown that the spatial dispersion of nominal wages has increased significantly over the past 50 years and, as a consequence, aggregate growth and aggregate output are lower than what they could have been. However, we have been silent on what has caused the increase in wage dispersion and on the implications for welfare. Formally, we have shown that the difference between welfare and output is simply the weighted average of the ratio of housing prices to local amenities. Understanding how changes in housing prices and amenities have affected wages is thus crucial to understanding the implications of changes in wages for welfare. In other words, we need to determine why U.S. labor is not flowing to high wage cities to a larger degree. Our calculations of the counterfactual output in the previous section did not depend on the specific reason for the increased spatial dispersion in wages. But to understand the implications for welfare, we need to understand what has been increasingly constraining labor supply to high wage cities in the U.S. In our setting, labor supply to a city depends on two exogenous factors, amenities and the elasticity of housing supply, with opposite implications for worker welfare.

Intuitively, if labor is not moving to high wage cities like San Francisco or New York because of undesirable amenities (for example, workers may find these cities crowded, noisy and polluted), then increasing their size will increase aggregate output but not aggregate welfare. On the other hand, if labor is not moving to cities like San Francisco or New York due to housing supply constraints caused by land use regulations, then increasing their size will increase aggregate output and aggregate welfare. This possibility is consistent with anecdotal evidence on the evolution of land use regulations over the past half century. Glaeser (2014), among others, points out that since the 1960s, expensive coastal U.S. cities have gone through a property rights revolution which has significantly reduced the elasticity of housing supply: “In the 1960s, developers found it easy to do business in much of the country […]. In the past 25 years, construction has come to face enormous challenges from any local opposition. In some areas it feels as if every neighbor has veto rights over every project.” 22 We now examine which of these two factors, amenities or housing supply restrictions created by land use regulations, has contributed the most to the output losses uncovered above.

(A) Amenities: The effect of the distribution of amenities on aggregate output depends on whether amenities have improved more in high wage cities or in low wage cities. If amenities have improved by more in high wage cities, this lowers the dispersion of the nominal wage across cities and, ceteris paribus, increases aggregate output. Consistent with the urban economics literature, we use the spatial equilibrium condition (equation (1.2)) to measure amenities: $Z_i \propto P_i^{\beta} / W_i$. This condition indicates that local amenities are proportional to the difference between properly weighted housing rents and nominal wages, where the weight on housing rents $\beta$ reflects the share of housing in total expenditures. We set the housing share $\beta$ equal to 0.32 from Albouy’s (2012) estimates.23 Albouy (2012) shows that this measure of local amenities is highly correlated with available measures of specific amenities (such as weather and crime) and with existing indices of the quality of life.
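The amenity measure described above can be sketched in logs, where it becomes the housing-share-weighted log rent minus the log of the (tax-adjusted) wage. The example below uses the 0.32 housing share from the text and the 0.52 wage adjustment mentioned in the accompanying footnote; the city names, rents, and wages are made up for illustration, and the function names are hypothetical. Because amenity levels are not identified, only differences across cities are reported.

```python
import math

HOUSING_SHARE = 0.32   # housing share of expenditure, from the text
NET_WAGE_FACTOR = 0.52 # wage adjustment for taxes and transfers, from the footnote


def log_amenity(rent: float, wage: float,
                beta: float = HOUSING_SHARE,
                net: float = NET_WAGE_FACTOR) -> float:
    """Log amenity index up to an unknown constant: beta*log(rent) - log(net*wage)."""
    return beta * math.log(rent) - math.log(net * wage)


def relative_amenities(cities: dict) -> dict:
    """Log amenity of each city minus the unweighted mean across cities."""
    raw = {name: log_amenity(rent, wage) for name, (rent, wage) in cities.items()}
    mean = sum(raw.values()) / len(raw)
    return {name: value - mean for name, value in raw.items()}


# Hypothetical data: (annual rent, annual wage) in dollars.
EXAMPLE_CITIES = {
    "High-rent city": (30000.0, 60000.0),
    "Mid-rent city": (15000.0, 50000.0),
    "Low-rent city": (9000.0, 48000.0),
}

if __name__ == "__main__":
    for name, value in relative_amenities(EXAMPLE_CITIES).items():
        print(f"{name}: {value:+.3f}")
```

A city where rents are high relative to wages is inferred to have good amenities, since workers accept a lower real wage to live there. The constant wage adjustment shifts every city by the same amount, so it drops out of the cross-city comparison.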

22 Glaeser also points to political economy causes of this trend: “To most residents, a new project is nothing but a bother. They don’t care about the welfare received by the new resident, or the benefits earned by the builders or by the employers who have to pay lower wages when housing costs are lower. Moreover, unaffordable housing isn’t a problem to most homeowners — it represents an increase in the value of their biggest asset.” (Glaeser, 2014)

23 Following Albouy (2012) we multiply wages by 0.52 to account for taxes and transfers. Note that amenity levels are not identified because we do not know the absolute value of welfare.

Table 7 quantifies the role played by changes in amenities.24 In the top row, we compute counterfactual output under the assumption that the level of amenities in 2009 is set equal to its 1964 level. To obtain this counterfactual, we proceed in two steps. We first use equation (1.4) and compute what wages would be in 2009 had amenities in each city stayed at 1964 levels (holding TFP and housing supply constant). We then allow workers and capital to reallocate and compute counterfactual employment and output. The results in the first row of Table 7 show that counterfactual output is higher than observed output, but only marginally. If the level of amenities in 2009 were equal to its 1964 level, 2009 output would grow by only 1.6% and less than 10% of workers would move. In rows 2 to 4, we repeat the same exercise changing amenity levels only in selected cities. Row 2 shows that changes in amenities in New York, San Francisco and San Jose between 1964 and 2009 had a positive impact on aggregate output, but the effect is quantitatively small.25 Row 3 performs the same exercise for the Rust Belt. Our estimates indicate that, unsurprisingly, amenities in Rust Belt cities worsened from 1964 to 2009. Changing amenities back to their 1964 level would further lower wages in the Rust Belt and slightly increase the overall wage dispersion. Row 3 shows that aggregate output would fall under this scenario, although the magnitude is trivial. In row 4 we look at the South. Empirically, amenities have improved in Southern cities from 1964 to 2009. This is plausible, and likely reflects air conditioning and the general improvement in quality of life in the South. Rolling amenities back to their 1964 level would increase wages in the South and slightly reduce the overall wage dispersion. Aggregate output would increase under this scenario, although the estimate in row 4 indicates that the effect is very small.
Here the improvement in amenities experienced by Southern cities increases aggregate welfare, but this effect is slightly offset by the decline in aggregate output.

24 Appendix Table A4 shows that the spatial dispersion of amenities has increased between 1964 and 2009, although the increase in the spatial dispersion is significantly less than that observed for wages.

25 While crime, cultural amenities and quality of life in general are generally thought to have improved in New York, San Francisco and San Jose since the 1990s, the evidence in row 2 suggests that the post-1990s improvements in amenities have offset the decline in amenities prior to the 1990s. So here, the change in amenities in New York, San Francisco and San Jose has two effects on welfare. First, it directly lowers the average level of amenities. Second, it increases the nominal wage in these cities, increases the overall wage dispersion, and lowers aggregate output.

In sum, we conclude that amenities have changed differentially across US cities. But the overall effect across all cities of changes in the distribution of amenities is limited and can explain only a small fraction of our counterfactual output gains.

(B) Housing Supply: In our model, the equilibrium housing price $P_i$ is determined by local TFP, local amenities $Z_i$, and the inverse elasticity of housing supply (the displayed equation is garbled in this copy). This says that higher housing prices can be driven by higher local TFP, better amenities, and more inelastic housing supply (a higher inverse elasticity). Based on Saiz's estimates, New York, San Francisco and San Jose have some of the most inelastic housing supplies in the country (a high inverse elasticity). Specifically, San Francisco is at the 99th percentile of the inverse elasticity distribution, while New York and San Jose are at the 96th percentile. Saiz shows that this is due to a combination of geographical features and restrictions to housing supply due to land use regulations, as measured by the Wharton Residential Land Use Regulatory Index. We cannot measure land use restrictions in 1964, because the Wharton survey does not go back in time. Instead, in Table 8, we estimate counterfactual output under the assumption that land use regulations in New York, San Francisco and San Jose are set equal to the level of regulations in the median US city. Thus, our counterfactual takes as given geographical factors that can affect housing supply, and only changes factors that are set by policy. To obtain this counterfactual, we proceed in three steps. First, we use Saiz (2010) coefficients (Table 5 column 2 in his paper) to estimate the elasticity of housing supply in New York, San Francisco and San Jose if land use regulations in these three cities were equal to the level of regulations in the median US city, holding constant geography. The resulting counterfactual elasticity of housing supply is mechanically higher in these three cities. Second, we use this counterfactual elasticity to estimate the counterfactual levels of housing prices and wages in New York, San Francisco and San Jose holding local TFP and amenities constant at 2009 levels, along with the counterfactual employment levels. Empirically, we find that counterfactual wages are on average 25% lower in the three cities and employment is higher.
This is not surprising: because counterfactual housing supply is more accommodating, in equilibrium more workers can move to these three cities from the rest of the US. Empirically, San Francisco is the city that grows the most in this counterfactual, followed by New York and San Jose. Third, we compute the counterfactual output that is generated by this new allocation of labor. The first row in Table 8 indicates that this would significantly speed up growth. The difference between the actual and counterfactual annualized output growth rate between 1964 and 2009 is .21%. This would induce 30% of workers to relocate, and would increase the 2009 output level by 9.7%. Average earnings would increase by $5024 (if the number of workers was fixed). Comparing the increase in output with the corresponding estimate in Table 4 (13.5%), we conclude that this change in supply elasticities accounts for more than two thirds of the overall output gains. The second row of the table focuses on the role played by land use regulations in the South. Housing supply is generally rather elastic in Southern cities. This reflects abundant land and permissive land use regulations. We estimate counterfactual output under the assumption that land use regulations in the South are set to the level of New York, San Francisco and San Jose, holding constant land availability in the South. More stringent regulations would result in higher wages and lower employment in the South. The entry shows that, in turn, US output would be 3% lower in this counterfactual scenario. We note that our estimates are sensitive to the assumption of perfect mobility. In the theory section, we have shown how preferences for location may reduce the effect of changes in amenities or housing supply, although they do not alter the estimates of the overall effect of changes in relative wages. The key parameter in this case is the dispersion parameter, which governs the strength of preference for location.
Stronger preferences for location induce some individuals to optimally choose cities where real wages net of amenities are low. To our knowledge, there are only two empirical estimates of this parameter based on MSA-level data, although neither fits our setting perfectly. Serrato and Zidar (2014, Table 5) estimate this parameter to be in the range .47 - .75, while Diamond (2013, Table 3) estimates the parameter to be .57 for college graduates and .27 for workers with lower education. If we use the largest value of the parameter in Serrato and Zidar's range, .75, we find output gains that are significantly smaller. For example, the estimate in row 1 of Table 8 drops to 1.6%. In this case, employment in New York, San Francisco and San Jose increases only by 54%, 50%, and 31% respectively. We note, however, that both Serrato and Zidar's and Diamond’s parameters are likely to be conservative for our setting, as they are obtained using changes over 10 years or less. A longer time horizon would likely imply more mobility and yield larger estimates.
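The headline numbers for the housing supply counterfactual can be cross-checked with simple arithmetic. The sketch below uses only figures quoted in the text for row 1 of Table 8 and row 1 of Table 4; treating the .21% annual growth difference as compounding annually over 45 years is an assumption of this illustration, and the helper names are hypothetical.

```python
YEARS = 2009 - 1964          # 45 years
ANNUAL_GROWTH_GAP = 0.0021   # .21% per year, Table 8 row 1 (from the text)
TABLE8_LEVEL_GAIN = 0.097    # 9.7% higher 2009 output (from the text)
TABLE4_LEVEL_GAIN = 0.135    # 13.5% higher 2009 output (from the text)
HIGH_DISPERSION_GAIN = 0.016 # 1.6% gain with the .75 dispersion parameter (from the text)


def compounded_gain(annual_gap: float, years: int) -> float:
    """Level gain from an annual growth gap, assuming simple annual compounding."""
    return (1.0 + annual_gap) ** years - 1.0


def share_explained(part: float, total: float) -> float:
    """Fraction of the total output gain accounted for by one counterfactual."""
    return part / total


if __name__ == "__main__":
    print(f"compounded gain: {compounded_gain(ANNUAL_GROWTH_GAP, YEARS):.1%}")
    print(f"share of Table 4 gain: {share_explained(TABLE8_LEVEL_GAIN, TABLE4_LEVEL_GAIN):.1%}")
    print(f"gain retained with strong location preferences: "
          f"{share_explained(HIGH_DISPERSION_GAIN, TABLE8_LEVEL_GAIN):.1%}")
```

Under the compounding assumption the annual gap is broadly in line with the reported level gain, the ratio of the two level gains exceeds two thirds, and the estimate under the largest dispersion parameter retains less than a fifth of the baseline gain.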

4.4 Caveats and Limitations.

This paper highlights the possibility of output and welfare losses stemming from an inefficient geographical allocation of labor. The numbers we present should not be taken as precise estimates of the losses but rather as guidance on the general order of magnitude of the losses, as they are based on a number of untestable assumptions. First, our findings depend on specific assumptions on technology. While our estimates are qualitatively robust to alternative technology parameters, we have shown that they are quantitatively sensitive to the assumed degree of returns to scale (Appendix Table A3). Second, we use residual wages as a measure of the marginal product of labor. This requires that differences across cities in unobserved worker characteristics have not changed over time, or, if they have changed, they have changed in ways that are uncorrelated with nominal wages. While this might not be true, there is little we can do to relax this assumption, as detailed data on worker cognitive ability are not available at a scale large enough to allow for a city-level analysis. Failure of this assumption may lead us to overestimate potential benefits of geographical reallocation of labor. In particular, if workers in MSAs with high nominal wages have higher IQ than workers in MSAs with low nominal wages after conditioning on education and other characteristics, then the documented spatial dispersion in nominal wages overestimates the true degree of dispersion. If, in addition, the amount of unobserved ability has increased more in MSAs with high nominal wages than in MSAs with low nominal wages, then the estimated counterfactual output gains reported in the paper are too large. Third, we have made restrictive assumptions on the relationship between TFP and city size, and on the relationship between amenities and city size. A large literature in urban economics indicates that TFP might not be exogenous, but could depend on the size or the density of a city.
Similarly, it has long been posited that local amenities can depend on city size and/or density. Our assumptions don’t rule out these possibilities, but restrict the relationship between TFP and employment and the relationship between amenities and employment. Recall that we have assumed that the elasticity of agglomeration and the elasticity of amenities are constant across cities. With constant elasticity, reallocation of workers across cities has no impact on aggregate productivity or aggregate amenities, because the changes experienced by cities that grow in size are exactly offset by changes experienced by cities that shrink in size. As noted above, the assumption of constant elasticity for TFP is consistent with Kline and Moretti (forthcoming); the assumption of constant elasticity for amenities is consistent with Albouy (2012). However, we stress that the estimates in both Kline and Moretti (forthcoming) and Albouy (2012) are based on ranges of city size historically observed in the U.S. data. There is no guarantee that the same estimates extend to city sizes that are significantly larger than the ones observed in the data. Fourth, we have assumed that workers can freely move across industries. This assumption matters because cities have distinct industry specializations, so spatial reallocation of labor also implies industry reallocation. For example, scaling up employment in New York, San Francisco and San Jose implicitly requires increasing the number of workers in finance and high tech, since tradable sector employment in these three cities is heavily concentrated in finance and high tech. The assumption of inter-industry mobility is clearly false in the short run. For example, it would be hard to relocate a Detroit car manufacturing worker to a San Francisco high tech firm overnight.
On the other hand, the assumption is more plausible in the long run, as workers’ skills (especially the skills of new workers entering the labor market) can adjust. In this respect, it is important to note that not all workers need to adjust, because not all workers are spatially reallocated in our counterfactual exercises. In addition, not all workers are employed in the tradable sector. While wages are set in the tradable sector, two thirds of the labor force is employed in the non-tradable sector, which is arguably much less specialized.
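The sensitivity to returns to scale noted in the first caveat can be read directly off Appendix Table A3. The following is a minimal sketch that groups the table's rows by the sum α + η. It assumes that "returns to scale" in that table corresponds to α + η, which is consistent with how the table groups its rows but is an interpretation rather than something stated in this section. The names `A3_ROWS` and `group_by_factor_share` are hypothetical, and the numbers are copied from the table.

```python
# Rows of Appendix Table A3 as reported in the paper:
# (alpha, eta, counterfactual output gain in %, percent of workers who move)
A3_ROWS = [
    (0.65, 0.25, 13.5, 52.5),  # baseline
    (0.70, 0.20, 14.8, 55.9),
    (0.60, 0.30, 12.2, 49.1),
    (0.70, 0.25, 29.9, 85.9),
    (0.60, 0.35, 28.2, 83.9),
    (0.60, 0.25, 7.2, 34.4),
    (0.65, 0.20, 8.0, 37.0),
]


def group_by_factor_share(rows):
    """Group output gains by alpha + eta (assumed proxy for returns to scale)."""
    groups = {}
    for alpha, eta, gain, _moved in rows:
        # Round to two decimals so that 0.70 + 0.20 and 0.65 + 0.25 share a key.
        key = round(alpha + eta, 2)
        groups.setdefault(key, []).append(gain)
    return groups


groups = group_by_factor_share(A3_ROWS)
for share in sorted(groups):
    gains = groups[share]
    print(f"alpha + eta = {share:.2f}: output gains from {min(gains)}% to {max(gains)}%")
```

Holding α + η fixed at .90, the gain stays within a narrow band around the baseline, while moving the sum to .85 or .95 shifts it substantially, which is the pattern the caveat describes.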

5. Policy Implications

We find that three quarters of aggregate U.S. growth between 1964 and 2009 was due to growth in Southern U.S. cities and a group of 19 other cities. Although labor productivity and labor demand grew most rapidly in New York, San Francisco, and San Jose thanks to a concentration of human-capital-intensive industries like high tech and finance, growth in these three cities had limited benefits for the U.S. as a whole. The reason is that the main effect of the fast productivity growth in New York, San Francisco, and San Jose was an increase in local housing prices and local wages, not in employment. In the presence of strong labor demand, tight housing supply constraints effectively limited employment growth in these cities. In contrast, the housing supply was relatively elastic in Southern cities. Therefore, TFP growth in these cities had a modest effect on housing prices and wages and a large effect on local employment. Constraints to housing supply reflect both land availability and deliberate land use regulations. We estimate that holding land availability constant, but lowering regulatory constraints in New York, San Francisco, and San Jose to the level of the median city, would expand their work force and increase U.S. GDP by 9.5%. Our results thus suggest that local land use regulations that restrict housing supply in dynamic labor markets have important externalities on the rest of the country. Incumbent homeowners in high wage cities have a private incentive to restrict housing supply. By doing so, these voters de facto limit the number of U.S. workers who have access to the most productive of American cities. For example, Silicon Valley, the area between San Francisco and San Jose, has some of the most productive labor in the world. But, as Glaeser (2014) puts it, “by global urban standards, the area is remarkably low density” due to land use restrictions.
In a region with some of the most expensive real estate in the world, surface parking lots, 1-story buildings and underutilized pieces of land are still remarkably common due to land use restrictions. While the region’s natural amenities (its hills, beaches and parks) are part of the attractiveness of the area, there is enough underutilized land within its urban core that the number of housing units could be greatly expanded without any reduction in natural amenities. Our findings indicate that in general equilibrium, this would raise the income and welfare of all U.S. workers. In principle, one possible way to minimize the negative externality created by housing supply constraints in high TFP cities would be for the federal government to constrain U.S. municipalities’ ability to set land use regulations. Currently, municipalities set land use regulations in almost complete autonomy, since the effects of such regulations have long been thought of as only local. But if such policies have meaningful nationwide effects, then the adoption of federal standards intended to limit negative externalities may be in the aggregate interest. An alternative is the development of public transportation that links local labor markets characterized by high productivity and high nominal wages to local labor markets characterized by low nominal wages. For example, a possible benefit of the high speed train currently under construction in California is to connect low-wage cities in California’s Central Valley (Sacramento, Stockton, Modesto, Fresno) to high productivity jobs in the San Francisco Bay Area. This could allow the labor supply to the San Francisco economy to increase overnight without changing San Francisco housing supply constraints. An extreme example is the London metropolitan area. A vast network of trains and buses allows residents of many cities in Southern England, including far away cities like Reading and Bristol, to commute to high TFP employers located in downtown London.
Another example is the Tokyo metropolitan area. While London and Tokyo wages are significantly above the UK and Japan averages, they would arguably be even higher in the absence of these rich transportation networks. Our argument suggests that UK and Japan GDP are significantly larger due to the transportation network.
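The "three quarters" figure that opens this section can be checked against Table 2. The sketch below simply adds the column 2 entries for Southern cities and the 19 other large cities, and compares each group's share of GDP growth (column 1) with its contribution to aggregate growth (column 2). The dictionary and function names are hypothetical; the numbers are copied from Table 2.

```python
# Table 2, column 1: growth of the group's GDP as a percentage of aggregate GDP growth.
GDP_GROWTH_PCT = {
    "NY, San Francisco, San Jose": 19.3,
    "Rust Belt Cities": -28.5,
    "Southern Cities": 66.8,
    "Other Large Cities": 31.4,
}

# Table 2, column 2: percentage contribution of the group to aggregate growth.
CONTRIBUTION_PCT = {
    "NY, San Francisco, San Jose": 6.1,
    "Rust Belt Cities": 6.1,
    "Southern Cities": 42.0,
    "Other Large Cities": 32.1,
}


def combined_share(groups):
    """Sum the model-driven contributions of the named groups."""
    return sum(CONTRIBUTION_PCT[g] for g in groups)


south_plus_other = combined_share(["Southern Cities", "Other Large Cities"])
print(f"Southern + other large cities: {south_plus_other:.1f}% of aggregate growth")

# Gap between a group's GDP growth share and its contribution to aggregate growth.
gaps = {g: GDP_GROWTH_PCT[g] - CONTRIBUTION_PCT[g] for g in CONTRIBUTION_PCT}
for group, gap in gaps.items():
    print(f"{group}: GDP growth share minus contribution = {gap:+.1f} points")
```

The sum is roughly 74%, in line with the "three quarters" statement, while New York, San Francisco and San Jose contribute far less to aggregate growth than their share of GDP growth would suggest.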

References

ABAG (2013), “Bay Area Plan.”

Ahlfeldt, Redding, Sturm and Wolf, “The Economics of Density: Evidence from the Berlin Wall”, Econometrica, 2014.

Albouy, David, “Are Big Cities Bad Places to Live? Estimating Quality of Life across Metropolitan Areas”, mimeo, 2012.

Atkeson, Andrew, Aubhik Khan and Lee Ohanian, "Are Data on Industry Evolution and Job Turnover Relevant for Macroeconomics?", Carnegie Rochester Conference Series on Public Policy, vol. 44, June (1996), pp. 215-250.

Au Chun-Chung and Vernon Henderson (2006a), "Are Chinese Cities Too Small?" Review of Economic Studies, 73 (3): pp. 549-576.

Au Chun-Chung and Vernon Henderson, “How Migration Restrictions Limit Agglomeration and Productivity in China,” Journal of Development Economics, 2006b, 80, 350-388.

Avent, Ryan (2011), The Gated City, Kindle Edition.

Basu, Susanto & Fernald, John G, 1997. "Returns to Scale in U.S. Production: Estimates and Implications," Journal of Political Economy, vol. 105(2), pages 249-83, April.

Baum-Snow, Nathaniel and Ronni Pavan “Understanding the City Size Wage Gap”, Review of Economic Studies, 2012, 79(1): 88-127.

Behrens, Kristian, Gilles Duranton, and Frederic Robert-Nicoud (2014), "Productive Cities: Sorting, Selection, and Agglomeration," Journal of Political Economy 122 (3): 507-553.

Close, Frank and David E. Shulenburger, Industrial and Labor Relations Review, Vol. 24, No. 4 (Jul., 1971), pp. 588-602.

Davis, Morris, Francois Ortalo-Magne “Household Expenditures, Wages and Rents” Review of Economic Dynamics 14-2, 2011.

De La Roca, Jorge and Diego Puga, 'Learning by working in big cities', CEPR discussion paper 9243, December 2012.

Desmet, Klaus, and Esteban Rossi-Hansberg. 2013. "Urban Accounting and Welfare." American Economic Review, 103(6): 2296-2327.

Diamond, Rebecca, “The Determinants and Welfare Implications of US Workers' Diverging Location Choices by Skill: 1980-2000”, Stanford University , 2013.

Diamond Rebecca, “Housing Supply Elasticity and Rent Extraction by State and Local Governments”, Stanford University, 2014.

Duranton, Gilles, Ejaz Ghani, Arti Grover Goswami and William Kerr, “The Misallocation of Land and Other Factors of Production in India”, World Bank Working Paper, 2015.

Eeckhout, Jan, Roberto Pinheiro, and Kurt Schmidheiny (2014), "Spatial Sorting," Journal of Political Economy 122 (3): 554-620.

Ganong Peter and Daniel Shoag, “Why Has Regional Convergence in the U.S. Stopped?”, Harvard Kennedy School mimeo, 2013.

Glaeser, Edward: “The Rise of the Sunbelt,” Southern Economic Journal, 74(3) (2008): 610-643

Glaeser, Edward, “The Triumph of the City”, Penguin Books, 2011.

Glaeser, Edward “Land Use Restrictions and Other Barriers to Growth”, Cato Institute, 2014.

Glaeser, Edward, Joseph Gyourko and Raven Saks (2006), “Urban Growth and Housing Supply,” Journal of Economic Geography, 6: 71-89.

Glaeser, Edward, Joseph Gyourko and Raven Saks (2005), “Why is Manhattan So Expensive?: Regulation and the Rise in House Prices”, Journal of Law and Economics, 48(2): 331-370.

Glaeser, Edward, Joseph Gyourko and Raven Saks (2005), “Why Have House Prices Gone Up?”, American Economic Review, Vol. 95, no. 2: 329-333.

Gaubert, Cecile, mimeo, UC Berkeley, 2014.

Gyourko, Joseph and Edward Glaeser (2005), “Urban Decline and Durable Housing”, Journal of Political Economy, Vol. 113, no. 2: 345-375.

Henderson, Vernon "Systems of Cities in Closed and Open Economies," Regional Science and Urban Economics, 1982, 12, 325-350.

Henderson and Ioannides "Aspects of Growth in a System of Cities," Journal of Urban Economics, 1981, 10, 117-139.

Hirsch, Barry T., David A. Macpherson, and Wayne G. Vroman (2001), "Estimates of Union Density by State," Monthly Labor Review, Vol. 124, No. 7, pp. 51-55.

Hsieh Chang-Tai and Peter J. Klenow, "Misallocation and Manufacturing TFP in China and India," Quarterly Journal of Economics (2009) 124 (4): 1403-1448.

Karabarbounis, Loukas and Brent Neiman (2014), "The Global Decline of the Labor Share," Quarterly Journal of Economics, forthcoming.

Kline, Patrick and Enrico Moretti, “People, Places and Public Policy: Some Simple Welfare Economics of Local Economic Development Programs”, Annual Review of Economics, 2014.

Lewbel, Arthur and Krishna Pendakur, 2008, “Tricks with Hicks: The EASI Implicit Marshallian Demand System for Unobserved Heterogeneity and Flexible Engel Curves”, American Economic Review.

Moretti, Enrico, “The New Geography of Jobs” Houghton Mifflin Harcourt (2012).

Morris A. Davis & Francois Ortalo-Magne, 2011. "Household Expenditures, Wages, Rents," Review of Economic Dynamics, Elsevier for the Society for Economic Dynamics, vol. 14(2), pages 248-261, April.

Redding, Stephen (2014), "Goods Trade, Factor Mobility and Welfare," Princeton University working paper.

Redding, Stephen "Economic Geography: a Review of the Theoretical and Empirical Literature", Chapter 16 in The Palgrave Handbook of International Trade, 2011

Redding, Stephen and Matt Turner “Transportation Costs and the Spatial Organization of Economic Activity", Handbook of Urban and Regional Economics, forthcoming, 2015.

Saiz, Albert 2010. "The Geographic Determinants of Housing Supply," Quarterly Journal of Economics, Vol. 125(3), pages 1253-1296, August.

Serrato, Juan Carlos Suarez and Owen Zidar, “Who Benefits from State Corporate Tax Cuts? A Local Labor Market Approach with Heterogeneous Firms”, mimeo, 2014.

Data Appendix

In this appendix we describe where each variable used in the paper comes from. We begin by measuring average wages in a county or in a county-industry cell by taking the ratio of the total wage bill in private sector industries to the total number of workers in private sector industries, using County Business Patterns (CBP) data for 1964-65 (referred to as 1964) and 2008-2009 (referred to as 2009). To increase sample size and reduce measurement error, we combine 1964 with 1965 and 2008 with 2009. 1964 is the earliest year for which CBP data are available at the county-industry level. Data on total employment by county are never suppressed in the CBP. By contrast, data by county and industry are suppressed in the CBP in cases where the county-industry cell is too small to protect confidentiality. In these cases, the CBP provides not an exact figure for employment, but a range. We impute employment in these cases based on the midpoint of the range. We aggregate counties into MSA’s using a crosswalk provided by the Census based on the 2000 definition of MSA. The main strength of the CBP is its fine geographical and industry detail and the fact that data are available as far back as 1964 (see footnote 26). But the CBP is far from ideal. Its main limitation is that it does not provide worker level data on salaries, but only a county aggregate, and therefore does not allow us to control for changes in worker composition. We augment CBP data with information on worker characteristics from the Census of Population and the ACS. Specifically, we merge the 1964 CBP average wage by MSA to a vector of worker characteristics from the 1960 US Census of Population; we also merge the 2009 CBP average wage by MSA to a vector of worker characteristics from the 2008 and 2009 ACS. These characteristics include: three indicators for educational attainment (high school drop-out, high school, college); indicators for race; an indicator for gender; and age. We drop all cases where education is missing.
In the small number of cases where one of the components of the vector other than education is missing, we impute it based on the relevant state average. Because the Census does not report information on union status, we augment our merged sample using information on union density by MSA from Hirsch, Macpherson, and Vroman (2001). Their data represent the percentage of each MSA’s nonagricultural wage and salary employees who are covered by a collective bargaining agreement. Their estimates for 1964 and 2009 are based on data from the Current Population Survey Outgoing Rotation Group (ORG) earnings files and the now discontinued BLS publication Directory of National Unions and Employee Associations (Directory), which contains information reported by labor unions to the Federal Government. The exact methodology is described in Hirsch, Macpherson and Vroman (2001) (see footnote 27). This allows us to estimate the average residual wage in each MSA, defined as the average wage conditional on worker characteristics. Specifically, we estimate residual wages as W_ic - X_i’b, where W is the average wage in the MSA, X is the vector of average worker characteristics in the MSA and b is a vector of coefficients on worker characteristics from individual level regressions estimated on a nationwide sample in 1964 and 2009 based on CPS data. The

26 Unfortunately, individual level data on employment and salary with geocodes are not available from the Census of Population on a systematic basis until 1980. A third of metro areas are identified in the 1970 Census.
27 For 1964, estimates are calculated based on figures in the BLS Directories, scaled to a level consistent with CPS estimates using information on years in which the two sources overlap. Only state averages are estimated in 1964. Thus, in 1964 we assign union density to each MSA based on the state average.

coefficients for 1964 are: high-school or more .44; college or more .34; female: -1.13; non white: -.44; age: .004; union .14. The coefficients for 2009 are: high-school or more .50; college or more .51; female: -.45; non white: -.07; age: .007; union .14. Because a union identifier is not available in the 1964 CPS, the 1964 regression assumes that the coefficient on union is equal to the coefficient from 2009, which is estimated to be .14. For 2009, we can compare the wage residuals estimated with our approach with those that one would obtain from individual level data. (Of course we can’t do this for 1964, because we don’t have micro data in that year.) Appendix Figure A1 shows that, while noisy, our imputed wage residuals do contain signal. The two measures have correlation .75. In some models the residual wage is defined as W_ic - X_i’b_s, where b_s is a vector of coefficients on worker characteristics from individual level regressions which is allowed to vary across states. The correlation in 2009 increases only marginally to .78. Housing costs are measured as median annual rent from the 1960 and 1970 US Census of Population and the 2008 and 2009 American Community Survey. For 1964, we linearly interpolate Census data between 1960 and 1970. Because rents may reflect a selected sample of housing units, in some models we use average housing prices. Data for 2009 are from individual level data from the American Community Survey. To obtain more precise estimates, we combine 2008 and 2009. Our sample consists of 220 MSA’s with non-missing values in 1964 and 2009. These cities account for 71.6% of US employment in 1964 and 72.8% in 2009. They account for 74.3% of the US wage bill in 1964 and 76.3% in 2009. The average city employment is 144,178 in 1964 and 377,071 in 2009. Appendix Table A1 presents summary statistics. Data on housing supply elasticities, land use regulations and land availability are from Saiz (2010).
They are intended to measure variation in elasticity that arises both from political constraints and geographical constraints. In 19 cities, Saiz data are missing. In those cases, we impute elasticity based on the relevant state average.
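The three mechanical steps described in this appendix (midpoint imputation of suppressed employment cells, linear interpolation of 1960 and 1970 rents to 1964, and the residual wage W - X’b) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the 2009 coefficients are those listed in the text, the MSA characteristics are the 2009 sample averages from Appendix Table A1, and the wage level of 10.0 is an invented placeholder because the units of W are not specified here.

```python
# 2009 coefficients on worker characteristics, as listed in the Data Appendix.
COEF_2009 = {
    "high_school_or_more": 0.50,
    "college_or_more": 0.51,
    "female": -0.45,
    "non_white": -0.07,
    "age": 0.007,
    "union": 0.14,
}


def impute_midpoint(low, high):
    """Impute a suppressed employment cell as the midpoint of its reported range."""
    return (low + high) / 2


def interpolate_to_1964(value_1960, value_1970):
    """Linearly interpolate between the 1960 and 1970 Census values to 1964."""
    weight = (1964 - 1960) / (1970 - 1960)  # 0.4 of the way from 1960 to 1970
    return value_1960 + weight * (value_1970 - value_1960)


def residual_wage(avg_wage, avg_characteristics, coefficients):
    """Average wage net of the part predicted by average characteristics (W - X'b)."""
    predicted = sum(coefficients[name] * avg_characteristics[name] for name in coefficients)
    return avg_wage - predicted


# Average characteristics taken from the 2009 column of Appendix Table A1.
example_msa = {
    "high_school_or_more": 0.90,
    "college_or_more": 0.26,
    "female": 0.51,
    "non_white": 0.22,
    "age": 39.9,
    "union": 0.11,
}

employment = impute_midpoint(100, 249)       # hypothetical suppressed range
rent_1964 = interpolate_to_1964(4000, 5000)  # hypothetical 1960 and 1970 rents
residual = residual_wage(10.0, example_msa, COEF_2009)  # 10.0 is a placeholder wage
print(employment, rent_1964, round(residual, 4))
```

The state-specific variant mentioned in the text would use the same `residual_wage` function with a different coefficient dictionary for each state.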

Table 1a: Spatial Dispersion of Nominal Wages in 1964 and 2009

                                             Std. Deviation   Interquartile Range   Range
                                                  (1)                 (2)            (3)
Panel A
  Log Wage in 1964                                .132                .163           .793
  Log Wage in 2009                                .205                .268          1.179

Panel B: Diff. with 9 Census Division mean
  Log Wage in 1964                                .090                .132           .670
  Log Wage in 2009                                .198                .270          1.076

Panel C: Drop NY, San Francisco, San Jose
  Log Wage in 1964                                .133                .171           .793
  Log Wage in 2009                                .157                .256           .842

Notes: The sample includes 220 metropolitan areas observed in both 1964 and 2009. All figures are weighted by employment in the relevant metropolitan area and year. Wage is the unconditional average wage in the metropolitan area. Panel B shows the distribution of the difference between log nominal wages and the average log nominal wage in each census division.

Table 1b: Spatial Dispersion of Residual Wages in 1964 and 2009

                                             Std. Deviation   Interquartile Range   Range
                                                  (1)                 (2)            (3)
Panel A
  Log Wage in 1964                                .102                .123           .702
  Log Wage in 2009                                .185                .247           .920

Panel B: Diff. with 9 Census Division mean
  Log Wage in 1964                                .095                .127           .612
  Log Wage in 2009                                .181                .221           .883

Panel C: Drop NY, San Francisco, San Jose
  Log Wage in 1964                                .105                .138           .702
  Log Wage in 2009                                .140                .219           .686

Notes: The sample includes 220 metropolitan areas observed in both 1964 and 2009. All figures are weighted by employment in the relevant metropolitan area and year. Residual wage is the average wage in the metropolitan area after controlling for three levels of educational attainment (high school drop-out, high school, college); race; gender; age; and union status. Panel B shows the distribution of the difference between log nominal wages and the average log nominal wage in each census division.

Table 2: City GDP Growth and City Contribution to Aggregate Growth, by Group

                                  Accounting Estimates          Model-Driven Estimates
                                  Growth of City GDP            Growth of City Contribution
                                  As a Fraction of              to Aggregate Growth
                                  Aggregate Growth              As a Fraction of Aggregate Growth
                                        (1)                           (2)
NY, San Francisco, San Jose            19.3%                          6.1%
Rust Belt Cities (N=37)               -28.5%                          6.1%
Southern Cities (N=86)                 66.8%                         42.0%
Other Large Cities (N=19)              31.4%                         32.1%

Notes: Entries in column 1 are the growth of the city’s GDP as a percentage of aggregate GDP growth over the period 1964-2009. Entries in column 2 are the percentage contribution of each city to aggregate growth from 1964 to 2009. We measure the contribution of a city to aggregate growth as the change in local TFP adjusted by the change in the gap between the local wage and the average wage as a share of the change in aggregate GDP. The group “Other Large Cities” includes 19 MSA with 2009 employment above 600,000 that are not in the other three groups. The sample includes 220 metropolitan areas observed in both 1964 and 2009.

Table 3: City Contribution to Aggregate Output 2009, by Group – Robustness to Different Assumptions on Technology

                               (1)        (2)    (3)    (4)    (5)    (6)    (7)    (8)
                               Baseline                                             α and η vary by
                                                                                    industry and year
α                              .65        .70    .60    .70    .60    .60    .65
η                              .25        .20    .30    .25    .35    .25    .20
NY, San Francisco, San Jose    6.1        6.1    6.1    6.1    6.1    6.1    6.1    5.8
Rust Belt Cities (N=37)        6.1        6.1    6.1    6.1    6.1    6.1    6.1    6.2
Southern Cities (N=86)         42.0       42.0   42.0   42.0   42.0   42.0   42.0   43.2
Other Large Cities (N=19)      32.1       32.1   32.1   32.1   32.1   32.1   32.1   31.7

Notes: Entries are the percentage contribution of each city to aggregate growth from 1964 to 2009. We measure the contribution of a city to aggregate growth as the change in local TFP adjusted by the change in the gap between the local wage and the average wage as a share of the change in aggregate GDP. Entries in column 1 are based on our baseline assumption on technology and are reproduced from Table 2, column 2. Entries in other columns vary assumptions on technology. The group “Other Large Cities” includes 19 MSA with 2009 employment above 600,000 that are not in the other three groups. The sample includes 220 metropolitan areas observed in both 1964 and 2009.

Table 4. Counterfactual Output – The Effect of Changes in the Spatial Dispersion of Relative Wages

                                        2009 Counterfactual   Percent Who Have
                                              Output           Moved by 2009
                                               (1)                 (2)
1) In All Cities                              13.5%               52.5%
2) In NY, San Francisco, San Jose             13.2%               54.0%
3) In Rust Belt Cities                         0.5%                8.7%
4) In Southern Cities                         -0.4%               21.2%

Notes: Entries in column 1 are the percent difference between counterfactual output level in 2009 and actual output level. Entries in column 2 are the percent of workers who in the counterfactual scenario reside in a MSA different from their actual MSA of residence. The counterfactual involves setting 2009 relative wage equal to their 1964 level in selected cities. The sample includes 220 metropolitan areas observed in both 1964 and 2009.

Table 5. Counterfactual Output – The Effect of Changes in the Spatial Dispersion of Relative Wages - Partial Adjustment

         Percent Who Have     2009 Counterfactual
          Moved by 2009             Output
               (1)                   (2)
(1)           52.5%                 13.5%
(2)           50%                   13.2%
(3)           40%                   11.8%
(4)           30%                    9.4%
(5)           20%                    6.5%
(6)           10%                    3.4%
(7)            0                     0

Notes: Entries in column 1 are the percent of workers who in the counterfactual scenario reside in a MSA different from their actual MSA of residence in 2009. Entries in column 2 are the percent difference between counterfactual output level in 2009 and actual output level. Row 1 reproduces Table 4, row 1. We scale partial adjustment based on the fraction of movers. For example, row (2) shows counterfactual output gains if 2009 relative wages were set so that 50% of workers relocate to a different MSA. The counterfactual involves setting 2009 relative wage equal to their 1964 level in all cities. The sample includes 220 metropolitan areas observed in both 1964 and 2009.

Table 6: Counterfactual Employment – The Effect of Changes in the Spatial Dispersion of Relative Wages

                                      Full Adjustment          Partial Adjustment
                                      (52.5% of US             (20% of US
                                      Workers Move)            Workers Move)
                                      Percent Change in        Percent Change in
                                      MSA Employment           MSA Employment
                                           (1)                      (2)
Cities with Largest Increases
  NEW YORK-NEWARK, NY-NJ-PA               787.7%                   179.8%
  SAN JOSE, CA                            522.4%                   149.2%
  SAN FRANCISCO, CA                       509.9%                   147.9%
  FAYETTEVILLE-SPRINGDALE, AR             320.2%                   118.1%
  AUSTIN-SAN MARCOS, TX                   237.7%                   102.7%

City with Median Change
  SHEBOYGAN, WI                           -79.7%                   -32.4%

Cities with Largest Decreases
  KENOSHA, WI                             -97.3%                   -74.7%
  MANSFIELD, OH                           -97.6%                   -75.7%
  MUNCIE, IN                              -97.8%                   -76.9%
  GADSDEN, AL                             -97.9%                   -76.1%
  FLINT, MI                               -97.9%                   -77.4%
  SHARON, PA                              -98.1%                   -78.3%

Note: Entries represent the percent difference between counterfactual employment and actual employment. In column 1, counterfactual employment is computed setting 2009 relative wages to 1964 levels in all cities (Panel B, first row of Table 4). In column 2, counterfactual employment is computed moving 2009 relative wages toward their 1964 levels in all cities up to the point where 20% of U.S. workers change MSA (row 5 of Table 5). The sample includes 220 metropolitan areas observed in both 1964 and 2009.

Table 7: Counterfactual Output – The Effect of Changes in Amenities

                                        2009 Counterfactual   Percent Who Have
                                              Output           Moved by 2009
                                               (1)                 (2)
1) In All Cities                               1.6%                9.3%
2) In NY, San Francisco, San Jose              1.5%                3.1%
3) In Rust Belt Cities                        -0.2%                0.8%
4) In Southern Cities                          0.3%                3.7%

Notes: Entries in column 1 are the percent difference between counterfactual output level in 2009 and actual output level. Entries in column 2 are the percent of workers who in the counterfactual scenario reside in a MSA different from their actual MSA of residence. The counterfactual involves setting 2009 amenities equal to their 1964 level in selected cities. The sample includes 220 metropolitan areas observed in both 1964 and 2009.

Table 8: Counterfactual Output – The Effect of Changing Housing Supply Regulations

                                                       2009 Counterfactual   Percent Who Have
                                                             Output           Moved by 2009
                                                              (1)                 (2)
1) Regulations in New York, San Francisco and                9.70%               30.0%
   San Jose are set equal to regulations of
   the median city
2) Regulations in South are set equal to                     -3.0%               33.5%
   regulations in New York, San Francisco
   and San Jose

Notes: Entries in column 1 are the percent difference between counterfactual output level in 2009 and actual output level. Entries in column 2 are the percent of workers who in the counterfactual scenario reside in a MSA different from their actual MSA of residence. The counterfactual involves changing 2009 housing supply regulations in selected cities, holding land availability constant. Housing supply regulations vary at the MSA level and are measured using Saiz (2010) data, which in turn are based on the Wharton Index aggregated at the MSA level. The sample includes 220 metropolitan areas observed in both 1964 and 2009.

Figure 1a: Spatial Dispersion of Demeaned Log Nominal Wages in 1964 and 2009.

Note: The distribution is weighted by MSA employment in the relevant year.

Figure 1b: Spatial Dispersion of Demeaned Log Residual Nominal Wages in 1964 and 2009

Note: The distribution is weighted by MSA employment in the relevant year.

Figure 2a: City GDP Growth and City Contribution to Aggregate Growth

Notes: The Figure plots the percentage contribution of each city to aggregate growth from 1964 to 2009 (on the y-axis) against the growth of the city’s GDP as a percentage of aggregate GDP growth over the same period (on the x-axis). We measure the contribution of a city to aggregate growth as the change in local TFP adjusted by the change in the gap between the local wage and the average wage as a share of the change in aggregate GDP. The solid line is the 45 degree line. The sample includes 220 cities observed in 1964 and 2009.

Figure 2b: City GDP Growth and City Contribution to Aggregate Growth – New York, San Francisco, San Jose

Notes: The Figure plots the percentage contribution of each city to aggregate growth from 1964 to 2009 (on the y-axis) against the growth of the city’s GDP as a percentage of aggregate GDP growth over the same period (on the x-axis). We measure the contribution of a city to aggregate growth as the change in local TFP adjusted by the change in the gap between the local wage and the average wage as a share of the change in aggregate GDP. The solid line is the 45 degree line.

Figure 2c: City GDP Growth and City Contribution to Aggregate Growth – Rust Belt Cities

Notes: The Figure plots the percentage contribution of each city to aggregate growth from 1964 to 2009 (on the y-axis) against the growth of the city’s GDP as a percentage of aggregate GDP growth over the same period (on the x-axis). We measure the contribution of a city to aggregate growth as the change in local TFP adjusted by the change in the gap between the local wage and the average wage as a share of the change in aggregate GDP. The solid line is the 45 degree line.

Figure 2d: City GDP Growth and City Contribution to Aggregate Growth – Southern Cities

Notes: The Figure plots the percentage contribution of each city to aggregate growth from 1964 to 2009 (on the y-axis) against the growth of the city’s GDP as a percentage of aggregate GDP growth over the same period (on the x-axis). We measure the contribution of a city to aggregate growth as the change in local TFP adjusted by the change in the gap between the local wage and the average wage as a share of the change in aggregate GDP. The solid line is the 45 degree line.

Figure 2e: City GDP Growth and City Contribution to Aggregate Growth – Other Large Cities

Notes: The Figure plots the percentage contribution of each city to aggregate growth from 1964 to 2009 (on the y-axis) against the growth of the city’s GDP as a percentage of aggregate GDP growth over the same period (on the x-axis). We measure the contribution of a city to aggregate growth as the change in local TFP adjusted by the change in the gap between the local wage and the average wage as a share of the change in aggregate GDP. The solid line is the 45 degree line. This group, called “Other Large Cities”, includes 19 MSA with 2009 employment above 600,000 that are not in the other three groups.

Appendix Table A1: Summary Statistics

                                                     1964 Average          2009 Average
                                                         (1)                   (2)
Average Annual Salary - Private Sector Workers       25,538 (3,868)        29,018 (5,278)
Average Annual Rent                                   4,770 (932)           6,553 (1826)
Private Sector Employment                           144,178 (294,016)     377,071 (604,448)
Private Sector Wage Bill (billion)                     4.04 (8.95)          13.04 (25.5)
High School Drop Out                                   0.59 (0.11)           0.10 (.05)
High School or More                                    0.40 (0.08)           0.90 (0.04)
College or More                                        0.07 (0.02)           0.26 (0.07)
Hispanic                                               0.03 (0.05)           0.10 (0.10)
Non White                                               .09 (0.11)           0.22 (0.15)
Age                                                    28.1 (3.3)            39.9 (0.9)
Female                                                 0.51 (0.01)           0.51 (0.01)
Union                                                  0.26 (0.12)           0.11 (.06)

Number of Cities                                        220                   220

Note: The unit of analysis is a MSA. The sample includes 220 metropolitan areas observed in both 1964 and 2009. All monetary figures are in 2000 dollars.

Appendix Table A2: Spatial Dispersion of Cost of Housing in 1964 and 2009.

                                             Std. Deviation   Interquartile Range   Range
                                                  (1)                 (2)            (3)
Panel A: Median Rent
  Log Rent in 1964                                .205                .306           .975
  Log Rent in 2009                                .279                .427          1.380

Panel B: Median Housing Price
  Log Annual Cost in 1964                         .278                .421          1.142
  Log Annual Cost in 2009                         .464                .691          2.093

Notes: Median housing price is annualized using a discount factor of 7.85% (Peiser and Smith, 1985). All figures are weighted by employment in the relevant metropolitan area and year.

Appendix Table A3: Robustness - The Effect of Changes in the Spatial Dispersion of Relative Wages Under Alternative Assumptions on Production Technology

                                                             2009 Counterfactual   Percent Who Have
                                                                   Output           Moved by 2009
                                                                    (1)                 (2)
Baseline
  1) α = .65; η = .25                                              13.5%               52.5%

Different Labor and Capital Shares, Same Returns to Scale
  2) α = .70; η = .20                                              14.8%               55.9%
  3) α = .60; η = .30                                              12.2%               49.1%

Different Returns to Scale
  4) α = .70; η = .25                                              29.9%               85.9%
  5) α = .60; η = .35                                              28.2%               83.9%
  6) α = .60; η = .25                                               7.2%               34.4%
  7) α = .65; η = .20                                               8.0%               37.0%

Technology Parameters Vary Across Industries and Years              7.4%               53.9%

Notes: Entries in column 1 are the percent difference between counterfactual output level in 2009 and actual output level. Entries in column 2 are the percent of workers who in the counterfactual scenario reside in a MSA different from their actual MSA of residence. The counterfactual involves setting 2009 relative wage equal to their 1964 level in all cities. The sample includes 220 metropolitan areas observed in both 1964 and 2009.

Appendix Table A4. Spatial Dispersion of Amenities in 1964 and 2009

                       Std. Deviation    Interquartile Range    Range
                            (1)                 (2)              (3)
Amenities in 1964          1223.7              1737.7           6563.8
Amenities in 2009          1601.7              2304.3           7607.3

Notes: All figures are weighted by TFP^(1/(1-α-η)). The sample includes 220 metropolitan areas observed in both 1964 and 2009.

Appendix Figure A1: Estimated 2009 Average Wage Residual vs Actual 2009 Average Wage From Individual Level Data

Note: Each dot is a MSA. The x axis reports average residuals by MSA from an individual level regression based on individual level data from the Census of Manufacturers. The y axis has residuals based on CBP data used in the main analysis. The employment weighted correlation is .75.

Appendix Figure A2: Spatial Dispersion of Demeaned Log Nominal Wages in 1964 and 2009 - Unweighted

Why Has Regional Income Convergence in the U.S. Declined?

Peter Ganong and Daniel Shoag*
January 2015

Abstract

The past thirty years have seen a dramatic decline in the rate of income convergence across states and in population flows to wealthy places. These changes coincide with (1) an increase in housing prices in productive areas, (2) a divergence in the skill-specific returns to living in those places, and (3) a redirection of unskilled migration away from productive places. We develop a model in which rising housing prices in wealthy areas deter unskilled migration and slow income convergence. Using a new panel measure of housing supply regulations, we demonstrate the importance of this channel in the data. Income convergence continues in less-regulated places, while it has mostly stopped in places with more regulation.

JEL Codes: E24, J23, J24, R14, R23, R52
Keywords: Convergence, Regulation, Land Use, Migration, Housing Prices

*Email: [email protected] (Harvard University) and [email protected] (Harvard Kennedy School). We would like to thank Marios Angeletos, Robert Barro, George Borjas, Gary Chamberlain, Raj Chetty, Gabe Chodorow-Reich, David Dorn, Bob Ellickson, Emmanuel Farhi, Bill Fischel, Dan Fetter, Edward Glaeser, Claudia Goldin, Joe Gyourko, Larry Katz, and seminar participants at Harvard, Tel Aviv, Bar Ilan, Dartmouth and the NBER Summer Institute for their valuable feedback. Shelby Lin provided outstanding research assistance. We thank Erik Hurst for alerting us to the end of regional convergence and spurring us to work on an explanation. Peter Ganong gratefully acknowledges residence at MDRC when working on this project and funding from the Joint Center for Housing Studies and the NBER Pre-Doctoral Fellowship in Aging and Health. Daniel Shoag and Peter Ganong gratefully acknowledge support from the Taubman Center on State and Local Government. Animations illustrating the changes in income convergence, directed migration, and regulations can be found at http://www.people.fas.harvard.edu/~ganong/motion.html.

1 Introduction

The convergence of per-capita incomes across US states from 1880 to 1980 is one of the most striking patterns in macroeconomics. For over a century, incomes across states converged at a rate of 1.8% per year.1 Over the past thirty years, this relationship has weakened dramatically (see Figure 1).2 The convergence rate from 1990 to 2010 was less than half the historical norm, and in the period leading up to the Great Recession there was virtually no convergence at all.

During the century-long era of strong convergence, population also flowed from poor to rich states. Figure 2 plots “directed migration”: the relationship between population growth and income per capita across states. Prior to 1980, people were moving, on net, from poor places to richer places. Like convergence, this historical pattern has declined over the last thirty years.

We link these two fundamental reversals in regional economics using a model of local labor markets. In this model, changes in housing regulation play an important role in explaining the end of these trends. Our model analyzes two locations that have fixed productivity differences and downward-sloping labor demand. When the population in a location rises, the marginal product of labor (wages) falls. When the local housing supply is unconstrained, workers of all skill types will choose to move to the productive locations. This migration pushes down wages and skill differences, generating income convergence. Unskilled workers are more sensitive to changes in housing prices. When housing supply becomes constrained in the productive areas, housing becomes particularly expensive for unskilled workers. We argue that these price increases reduce the labor and human capital rebalancing that generated convergence.

The model’s mechanism can be understood through an example. Historically, both janitors and lawyers earned considerably more in the tri-state New York area (NY, NJ, CT) than

1 See Barro and Sala-i Martin [1992], Barro and Sala-i Martin [1991], and Blanchard and Katz [1992] for classic references.
2 Figure 1 plots convergence rates (change in log income on initial log income) for rolling twenty-year windows. The standard deviation of log per capita income across states also fell through 1980 (sigma convergence), and then held steady afterward. The end of this type of convergence demonstrates that the estimated decline in convergence rates is not due to a reduction in the variance of initial incomes relative to a stationary shock process. The strong rate of convergence in the past as well as the decline today do not appear to be driven by changes in measurement error. When we use the Census measure of state income to instrument for BEA income, or vice-versa, we find similar results. The decline also occurs at the Labor Market Area level, using data from Haines [2010] and U.S. Census Bureau [2012]. We report additional results connected to these measures in the Appendix. The decline of convergence has been observed at the metro-area level in Berry and Glaeser [2005]. See also chapter 2 of Crain [2003] and Figure 6 of DiCecio and Gascon [2008].

their colleagues in the Deep South (AL, AR, GA, MS, SC). This was true in both nominal terms and after adjusting for differences in housing prices.3 Migration responded to these differences, and this labor reallocation reduced income gaps over time. Today, though nominal premiums to being in the NY area are large and similar for these two occupations, the high costs of housing in the New York area have changed this calculus. Though lawyers still earn much more in the New York area in both nominal terms and net of housing costs, janitors now earn less in the NY area after housing costs than they do in the Deep South.4 This sharp difference arises because for lawyers in the NY area, housing costs are equal to 21% of their income, while housing costs are equal to 52% of income for NY area janitors. While it may still be “worth it” for skilled workers to move to productive places like New York, for unskilled workers, New York’s high housing prices offset the nominal wage gains.

We build on research showing that differences in incomes across states have been increasingly capitalized into housing prices (Van Nieuwerburgh and Weill [2010], Glaeser et al. [2005b], Gyourko et al. [2013]). In this paper, we show that the returns to living in productive places net of housing costs have fallen for unskilled workers but have remained substantial for skilled workers. In addition, we show that skilled workers continue to move to areas with high nominal income, but unskilled workers are now moving to areas with low nominal income but high income net of housing costs. Each of these stylized facts represents the aggregate version of the lawyers and janitors example above.

To better understand the causes and consequences of housing price increases, we construct a new panel measure of land use regulation. Our measure is a scaled count of the number of decisions for each state that mention “land use,” as tracked through an online database of state appeals court records. We validate this measure of regulation using existing cross-sectional survey data. To the best of our knowledge, this is the first national panel measure of land use regulations in the US.5

Using differential regulation patterns across states, we report five empirical findings that

3 In 1960, wages were 42% and 85% higher in NY than in the Deep South for lawyers and janitors respectively. After adjusting for housing costs (12 times monthly rent or .05 of home value), these premia were 41% and 68%.
4 In nominal terms, the wages of lawyers and janitors are 45% and 32% higher in NY respectively in 2010. After adjusting for housing prices, these premia are 37% and -6%.
5 Prior work has examined housing price and quantity changes to provide suggestive evidence of increasing supply constraints (Sinai [2010], Glaeser et al. [2005a], Glaeser et al. [2005b], Quigley and Raphael [2005], and Glaeser and Ward [2009]).

connect housing supply limits to declines in migration and income convergence. Tight land use regulations weaken the historic link between high incomes and new housing permits. Instead, income differences across places become more capitalized into housing prices. With constrained housing supply, the net migration of workers of all skill types from poor to rich places is replaced by skill sorting. Skilled workers move to high cost, high productivity areas, and unskilled workers move out. Finally, income convergence persists among places unconstrained by these regulations, but it is diminished in areas with supply constraints.
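The lawyers-and-janitors comparison from the introduction reduces to simple arithmetic on wage premia and housing cost shares. The sketch below is a back-of-the-envelope illustration, not the authors' calculation: it takes the 2010 figures quoted in the text (nominal NY premia of 45% and 32%, NY housing cost shares of 21% and 52%, and net premia of 37% and -6%) and backs out the Deep South housing cost shares those figures would imply. The function names are hypothetical, and the implied shares are only as accurate as the rounding of the reported numbers.

```python
def net_premium(nominal_premium, share_high, share_low):
    """Premium in income net of housing costs, high-cost vs. low-cost area.

    nominal_premium: nominal wage premium in the high-cost area (0.45 = 45%)
    share_high / share_low: housing cost as a share of income in each area
    """
    net_high = (1.0 + nominal_premium) * (1.0 - share_high)
    net_low = 1.0 - share_low  # low-cost-area wage normalized to 1
    return net_high / net_low - 1.0


def implied_low_share(nominal_premium, share_high, net):
    """Housing share in the low-cost area implied by a reported net premium."""
    return 1.0 - (1.0 + nominal_premium) * (1.0 - share_high) / (1.0 + net)


# 2010 figures quoted in the text (NY area vs. Deep South)
lawyer_south_share = implied_low_share(0.45, 0.21, 0.37)
janitor_south_share = implied_low_share(0.32, 0.52, -0.06)

print(f"implied Deep South housing share, lawyers:  {lawyer_south_share:.3f}")
print(f"implied Deep South housing share, janitors: {janitor_south_share:.3f}")
```

Under these assumptions, the janitor's nominal premium is wiped out because more than half of NY income goes to housing, while the lawyer's much smaller housing share leaves most of the premium intact.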

To assess whether these patterns reflect a causal relationship, we conduct three tests designed to address omitted variable bias and possible reverse causality. First, we repeat our analysis using a placebo measure of all court cases, not just those restricted to the topic of land use. In contrast to our results for land use cases, we find no impact on the outcomes of interest using this measure. Second, we use a state’s historical tendency to regulate land use as measured by the number of cases in 1965 and study the differential impact of broad national changes in the regulatory environment after this date.6 We find that income convergence rates fell after 1985, but only in those places with a high latent tendency to regulate land use. We repeat this exercise using another predetermined measure of regulation sensitivity based on geographic land availability from Saiz [2010] at both the state and county levels. Again, we find income convergence declined the most in areas with supply constraints.

In this paper, we highlight a single channel – labor mobility – which can help explain both income convergence through 1980 and its subsequent disappearance from 1980 to 2010. Much of the literature on regional convergence has focused on the role of capital, racial discrimination, or sectoral reallocations.7 We build on an older tradition of work by economic historians (Easterlin [1958] and Williamson [1965]) as formalized by Braun [1993], in which directed migration drives convergence. Similarly, much of the existing literature on recent regional patterns in the US emphasizes changes in labor demand from skill-biased technological change and its place-based variants (Autor and Dorn [2013], Diamond [2012], Moretti [2012b]). Our explanation, which is complementary to these other channels, emphasizes the role of housing supply constraints. In Section 5, we discuss these alternate channels and their inability to fully account for the data in the absence of housing supply constraints.
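As a rough illustration of the convergence rates discussed in the introduction, the sketch below regresses annualized log income growth on initial log income across a handful of hypothetical states. The data are synthetic and constructed so that the slope is exactly -0.018 (mirroring the 1.8% per year figure cited in the text); the function name `convergence_slope` and the annualization are assumptions of this sketch, and the authors' actual specification may differ.

```python
import math


def convergence_slope(initial_income, final_income, years):
    """OLS slope of annualized log income growth on initial log income.

    A negative slope means poorer places grew faster (convergence).
    """
    x = [math.log(v) for v in initial_income]
    y = [(math.log(f) - math.log(i)) / years
         for i, f in zip(initial_income, final_income)]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Synthetic example: four hypothetical states over a twenty-year window.
years = 20
initial = [10_000.0, 15_000.0, 20_000.0, 30_000.0]
true_slope, intercept = -0.018, 0.20  # illustrative values only
final = [math.exp(math.log(i) + years * (intercept + true_slope * math.log(i)))
         for i in initial]

slope = convergence_slope(initial, final, years)
print(f"estimated convergence slope: {slope:.4f}")
```

A slope near zero, as the text reports for the years before the Great Recession, would mean that initially poorer states no longer grow faster than richer ones.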

The remainder of the paper proceeds as follows. In Section 2, we develop a model to explore the role of labor migration and housing supply in convergence. Section 3 demonstrates that this model is consistent with four stylized facts about migration and housing prices. Section 4 introduces a new measure of land use regulation and directly assesses its impact on convergence, Section 5 considers alternative forces at work during this period, and Section 6 concludes.

6 Many authors use a region’s historical features interacted with national changes. For example, Bartik [1991] uses historical industry shares, Card [2009] uses historical ethnicity shares, and Autor and Dorn [2013] use historical occupation shares.
7 See Barro and Sala-i Martin [1992], Caselli and Coleman [2001], Michaels et al. [2012], and Hsieh et al. [2013].

2 A Simple Model of Regional Migration, Housing Prices, and Convergence

In this section, we develop a simple model to structure our study of the interaction between directed migration, housing markets, and income convergence. The model builds upon a long line of papers in urban economics following the spatial equilibrium framework of Rosen [1979], Roback [1982], and Blanchard and Katz [1992]. It combines elements from Braun [1993] and Gennaioli et al. [2013b], who solve a dynamic model of migration and regional convergence, and Gennaioli et al. [2013a], who study a static regional model with heterogeneous skill types.

Our model considers two locations within a national market: a more productive North and a less productive South. Tradable production employs the local labor supply and has decreasing returns to scale.8 As a consequence of this assumption, more workers in a location drive down average wages. We solve a similar model without decreasing returns in production in Appendix B. Workers are endowed with a skill level, and skilled and unskilled labor are imperfect substitutes in the production of tradables.

Workers in each location consume two goods: non-tradable housing and a tradable numeraire. All workers must consume a baseline, non-utility producing amount of housing in their respective location. This non-homotheticity, which we implement using a Stone-Geary utility function, ensures that housing accounts for a smaller share of skilled workers’ consumption baskets.

Next, we consider the interregional allocation of labor. We begin from initial productivity levels such that real wages are lower in the South. Once we allow migration, labor inflows into the North drive down wages for all skill types due to decreasing returns in production. Conversely, wages rise in the South as labor becomes more scarce. The positive impact on

8 We view this assumption as a reduced form representation of a more complicated process. An alternative way of motivating downward-sloping labor demand could use constant returns to scale in production, each region producing a unique good, and a taste for variety in consumption.

wages in the South and negative impact in the North generate interregional convergence in incomes. If there is a shock that causes the cost of new construction to rise, however, housing prices rise in the North, and migration flows become smaller and biased towards skilled workers. Because fewer people move to the North – and because the people who move there are more skilled – income convergence slows. We demonstrate these effects in an illustrative simulation below and in calibration exercises in Appendix A.

Our interpretation of the data relies on two crucial features of the model:

1. Regional labor demand slopes downward. A few examples from the economic history literature help illustrate this concept. First, Acemoglu et al. [2004] study labor supply during and after World War II. States which had more mobilization of men had increased female labor force participation. After the war, both males and females in these places earned lower wages. Second, Hornbeck [2012] studies the impact of a major negative permanent productivity shock, the Dust Bowl. He finds that out-migration is the primary factor adjustment which allowed wages to partially recover. Third, Margo [1997] studies the impact of a positive productivity shock: the Gold Rush. At first, wages soared, but as people migrated into California, wages declined. We present two methods of deriving this downward sloping labor demand in the paper (Appendix B contains a version without decreasing returns in production), and while our results do not depend on the derivation, they do rely upon the concept. While the extent of this effect is an open question, many papers find evidence for downward-sloping labor demand and our interpretation of the data is consistent with this view.9

2. Housing is an inferior good within a city, meaning that within a labor market, low-skill workers spend a disproportionate share of their income on housing.10 Many studies have estimated Engel curves for housing, and some find elasticities slightly below one.11 These estimates generally differ from the parameter of interest in our model in two ways. First, they often express housing as a share of consumption rather than as a share of income (Diamond [2012]). Second, they estimate Engel curves across labor markets rather than within labor markets. These differences mute the non-homotheticity of housing demand due to the positive correlation between income and savings rates, and

9 See Iyer et al. [2011], Boustan et al. [2010], Cortes [2008], and Borjas [2003].
10 In fact, our model requires the weaker assumption that land within a labor market is an inferior good. The structural value of housing can be treated as non-housing consumption in our framework. The literature that has estimated the income elasticity of land consumption robustly shows income elasticities below 1 even in the national cross-section (Glaeser et al. [2008]). Glaeser and Gyourko [2005] and Notowidigdo [2013] provide indirect evidence of non-homotheticity in migration patterns.
11 See Harmon [1988] for an example. Similarly, Davis and Ortalo-Magné [2011] demonstrate that expenditure shares on housing are relatively flat when not adjusting for skills.

due to the positive correlation between incomes and house prices across cities. Below we plot the relevant within-city Engel curve using housing as a share of household income and instrumenting for household income with education to address measurement error (Ruggles et al. [2010]). As is evident in the figure, there is a considerable degree of non-homotheticity within labor markets when measuring housing as a share of income. We calibrate our model to match this degree of non-homotheticity in Appendix A.

[Figure: Household Level: Housing Share of Income, MSA Fixed Effects. Vertical axis: Fraction of Household Income Spent on Housing. Horizontal axis: Household Income (Instrumented with Education).]

Note: This figure plots the relationship between the share of household income spent on housing and average household income in the 2010 ACS, conditional on MSA-level fixed effects. Annual income is volatile, meaning that the baseline non-homothetic cross-sectional relationship between housing share and annual income might not reflect the true relationship between housing share and permanent income. To address this issue, we instrument for household income using the education level of its prime age members (25-65). We construct predicted income for each household by summing the average wages associated with the detailed education level of all the household’s prime age members. To make this non-homothetic relationship easier to see, we then divide the sample into 50 bins based on household predicted income and plot the average housing share for each bin, controlling for the MSA fixed effects. This data presentation technique is widely used (see Chetty, Friedman, and Rockoff 2013 for an example). Housing expenditure is computed as twelve times monthly rents or 5% of housing costs. Housing shares above 100% and below zero are excluded.

We now describe our model for each state’s economy, before turning to the model’s interregional dynamics.

2.1 Within-state equilibrium

Each location consists of three markets: a market for labor, a housing market, and a goods market that clears implicitly.

Individual Decisions: Goods Demand and Indirect Utility

There are $n_{jkt}$ agents, each endowed as either skilled or unskilled in production, $k \in \{s, u\}$, who have utility in state $j \in \{N, S\}$ at every date $t$ of

$$\max_{\{c_{jkt},\, h_{jkt}\}} \sum_t e^{-rt} \ln(u_{jkt}), \quad \text{where } u_{jkt} = c_{jkt}^{\beta} \left(h_{jkt} - \bar{H}\right)^{1-\beta}, \quad \text{subject to } c_{jkt} + p_{jt} h_{jkt} = w_{jkt} + \pi_t \qquad (1)$$

Workers’ preferences take the Stone-Geary functional form with a baseline housing requirement $\bar{H}$ that is common for both skilled and unskilled workers. This functional form generates non-homothetic housing demand.12 To keep things simple, we assume inelastic labor supply and abstract from intertemporal markets by imposing a static budget constraint.13
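To see why a baseline housing requirement makes housing a larger budget share for lower-income workers, the sketch below evaluates the textbook Stone-Geary housing demand for a worker with income `income` facing housing price `price`: baseline housing plus a fixed fraction of whatever income remains after paying for that baseline. The function names, the `housing_weight` parameter (the exponent on housing above the baseline), and the numerical values are assumptions chosen for illustration; they are not the paper's calibration, and the paper's own expression may be normalized differently.

```python
def housing_demand(income, price, h_bar, housing_weight):
    """Textbook Stone-Geary housing demand.

    h_bar is the baseline (subsistence) housing requirement and
    housing_weight is the exponent on housing above that baseline.
    """
    discretionary = income - price * h_bar
    if discretionary < 0:
        raise ValueError("income cannot cover the baseline housing requirement")
    return h_bar + housing_weight * discretionary / price


def housing_share(income, price, h_bar, housing_weight):
    """Share of income spent on housing."""
    return price * housing_demand(income, price, h_bar, housing_weight) / income


# Illustrative values only: same prices and preferences, different incomes.
low_income_share = housing_share(income=1.0, price=1.0, h_bar=0.25, housing_weight=0.25)
high_income_share = housing_share(income=4.0, price=1.0, h_bar=0.25, housing_weight=0.25)

print(f"housing share at income 1: {low_income_share:.3f}")
print(f"housing share at income 4: {high_income_share:.3f}")
```

With `h_bar = 0` the preferences collapse to Cobb-Douglas and both shares equal `housing_weight`, so the gap between the two printed shares comes entirely from the baseline requirement.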

Workers receive the local wage $w_{jkt}$ for their skill type $k$, and the price of housing relative to tradables is $p_{jt}$. Profits from both the housing sector and the tradable sector in North and South ($\pi_t$) are rebated lump-sum nationally. We can therefore write each agent’s indirect utility as a function of the wage, price and preference parameters:

$$v_{jkt}(w_{jkt}, p_{jt}) = \ln\left( \left(w_{jkt} + \pi_t - p_{jt}\bar{H}\right) \left(\frac{1}{p_{jt}}\right)^{1-\beta} \right)$$

Labor Market

Next, we turn to the production of tradables. State-level production is given by

$$Y_{jt} = A_j \left( n_{jut}^{\rho} + \theta n_{jst}^{\rho} \right)^{\frac{1-\alpha}{\rho}}$$

where $n_{jk}$ is the number of people of type $k$ residing in state $j$.14 We normalize $A_S = 1$ throughout, and assume $A_N > 1$. This term can encompass capital differences, natural advantages, institutional strengths, different sectoral compositions, amenities, and agglomeration benefits. Assuming labor earns its marginal product, we have:

12 See Mulligan [2002] and Kongsamut et al. [2001] for other examples of papers using Stone-Geary preferences.
13 We allow for endogenous labor supply in a calibration exercise in Appendix A.
14 This widely used form of imperfect substitution ensures an interior solution for skill ratios in equilibrium.

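$$w_{jut} = A_j (1-\alpha) \left( n_{jut}^{\rho} + \theta n_{jst}^{\rho} \right)^{\frac{1-\alpha-\rho}{\rho}} (n_{jut})^{\rho-1} \qquad (2)$$

$$w_{jst} = A_j (1-\alpha)\, \theta \left( n_{jut}^{\rho} + \theta n_{jst}^{\rho} \right)^{\frac{1-\alpha-\rho}{\rho}} (n_{jst})^{\rho-1} \qquad (3)$$

Equilibrium in each of these markets is given by the wage such that $l^{demand}_{jkt}(w_{jkt}) = n_{jkt}$.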
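The sketch below evaluates the production function and the unskilled wage in equation (2) numerically. It checks that the closed-form wage matches a finite-difference marginal product of output, and that the wage falls as more unskilled workers arrive. It assumes the form Y = A (n_u^ρ + θ n_s^ρ)^((1-α)/ρ) as read from the text, and borrows the illustrative values θ = 1.7, α = 0.33, ρ = 0.9, and A = 2 from the authors' simulation footnote; the function names and population values are hypothetical.

```python
def output(n_u, n_s, A=2.0, alpha=0.33, rho=0.9, theta=1.7):
    """Tradable output Y = A * (n_u^rho + theta * n_s^rho)^((1 - alpha) / rho)."""
    return A * (n_u ** rho + theta * n_s ** rho) ** ((1.0 - alpha) / rho)


def unskilled_wage(n_u, n_s, A=2.0, alpha=0.33, rho=0.9, theta=1.7):
    """Marginal product of unskilled labor, as in equation (2)."""
    aggregate = n_u ** rho + theta * n_s ** rho
    return (A * (1.0 - alpha)
            * aggregate ** ((1.0 - alpha - rho) / rho)
            * n_u ** (rho - 1.0))


# Check the closed form against a central finite difference of output.
n_u, n_s, step = 0.5, 0.5, 1e-6
numeric_wage = (output(n_u + step, n_s) - output(n_u - step, n_s)) / (2 * step)

# Downward-sloping labor demand: the wage falls when n_u rises.
wage_before = unskilled_wage(0.5, 0.5)
wage_after = unskilled_wage(0.6, 0.5)

print(f"closed form {wage_before:.4f}, finite difference {numeric_wage:.4f}")
print(f"wage after inflow of unskilled workers: {wage_after:.4f}")
```

The decline in the wage after the inflow is the downward-sloping labor demand on which the convergence mechanism relies.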

Housing Market

Define the quantity of housing in place $j$ at time $t$ as $H_{jt}$. Every state is endowed with a housing supply at time zero equal to the demand of the initial population. Regulations can only affect new construction. Because they are designed to minimize the amount of cumulative development, we model them as imposing a convex cost as a function of the existing housing stock, where $\eta$, the measure of regulatory constraints, governs the elasticity of supply in growing regions. The marginal cost per unit of construction is

$$p_{jt} = \begin{cases} 1 & \text{if } H_{jt} \le H_{jt-1} \\ 1 + H_{jt}^{1/\eta} & \text{if } H_{jt} > H_{jt-1} \end{cases} \qquad (4)$$

Regulations affect the dynamics of the system only in places where the population would otherwise be increasing. Demand for housing for each individual is equal to $\bar{H} + (1-\beta)\left(\frac{w_{jkt} + \pi_t}{p_{jt}}\right)$, and therefore aggregate demand is

$$H_{jt} = n_{jut}\left(\bar{H} + (1-\beta)\left(\frac{w_{jut} + \pi_t}{p_{jt}}\right)\right) + n_{jst}\left(\bar{H} + (1-\beta)\left(\frac{w_{jst} + \pi_t}{p_{jt}}\right)\right) \qquad (5)$$

We model regulations as affecting the elasticity of supply rather than as a direct cost shock. This choice is motivated by empirical evidence that regulations affect the relationship between income and prices and not merely the price itself (see Figure 8 and Table 2). This choice is also consistent with the existing empirical work on regulations and housing (Saiz [2010] and Saks [2008]), and the dominant interpretation in the legal literature (Ellickson [1977]).

Equilibrium

Taking $\{n_{jut}, n_{jst}\}$ as given, prices $\{w_{jut}, w_{jst}, p_{jt}\}$ and allocations $\{c_{jkt}, h_{jkt}\}$ that satisfy equations 1-5 constitute an equilibrium in the housing and labor markets. This equilibrium also allows us to write indirect utility as a function of the local population ($v_{jkt}(n_{jut}, n_{jst})$).

2.2 Migration and Dynamics

Having characterized the equilibrium within a location, we turn to cross-location dynamics. Normalizing the national population of each skill type to 1, we define $\Delta_{kt} = v_{Nkt}(n_{Nut}, n_{Nst}) - v_{Skt}((1 - n_{Nut}), (1 - n_{Nst}))$ as the flow utility gains to living in the North. Note that when land supply is perfectly elastic ($\eta \to \infty$), $\Delta_{kt}$ does not depend on the skill type $k$.15 We can now define the present discounted value of migrating from South to North as:

$$q_k(t) = \sum_{\tau=t}^{\infty} e^{-r\tau} \Delta_{k\tau} \qquad (6)$$

These expressions depend upon exogenous parameters and shocks, as well as two state variables $n_{Nut}$ and $n_{Nst}$.

Given these gains to migration, how many people migrate each period? We follow Braun [1993] in assuming that the migration rate is proportional to the present-discounted value of migrating:

$$\Delta \ln(n_{Nkt}) - \Delta \ln(n_{Skt}) = \psi\, q_k(t) \qquad (7)$$

This equation holds exactly for i.i.d. migration cost draws from a specific distribution derived in Appendix C, or can be viewed as a linear approximation of a more general class of processes. The equations represented in (6) and (7) constitute a dynamic system in terms of two endogenous variables and exogenous shocks and parameters.

To illustrate the dynamics of the system, we consider a numeric example. We plot the dynamics in a simulation where (1) the populations of skilled and unskilled workers are evenly divided between North and South, (2) the housing supply in the North is completely elastic ($\eta \to \infty$), and (3) the productivity parameter $A_N$ is significantly greater than 1. Given these assumptions, the initial population in the South exceeds the steady-state population values. The figure below illustrates the dynamics of the system from these conditions until time $t_1$.16 When the housing supply in the North is completely elastic, the relative gains to migration are independent of skill type, and hence both high and low productivity workers migrate away from the South at the same constant rate. This directed migration makes labor

15 This holds under the normalization that $\bar{H} = \pi$.
16 This graph is meant to illustrate the model’s dynamics. To do this, we set $\theta = 1.7$, $\alpha = 0.33$, $\rho = 0.9$, $\beta = 0.25$, $\bar{H} = 0.25$, $A_N = 2$, $\psi = 0.005$, and $r = 0.05$. We then simulated a falling housing supply elasticity by having $1/\eta$ ascend from a value near zero to 0.25.

more scarce in the South and more plentiful in the North, which yields a constant rate of convergence in per capita incomes between the regions. Additionally, if there were a larger fraction of unskilled workers in the South, then migration would have driven convergence by equating average human capital levels as well.

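[Figure: Migration rate from South to North (skilled and unskilled) and income convergence rate over time. Vertical axis: Rate of Convergence / Migration. Horizontal axis: Time, marking t0, t1 (unanticipated increase in regulation begins), and t2 (increase in regulation complete). Series: Mig Rate (Skilled), Mig Rate (Unskilled), Income Convergence Rate.]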

At date $t_1$, the elasticity of housing supply, $\eta$, begins to fall and reaches a new, permanently lower level at time $t_2$. This unanticipated shock increases housing prices in the growing North, and alters the value of living in the North in the future. Both skilled and unskilled migration rates fall, but they do not fall to the same degree. Skilled workers continue to find it worthwhile to move from South to North, but the increase in housing prices actually makes the North relatively unattractive to unskilled workers, who begin to move in the opposite direction. The joint effect is that, by $t_2$, there is no more net migration from South to North and no further convergence in incomes per capita. Instead, migration flows lead to skill-sorting and segregation by skill type.

This model lays out a theory that can account for the changing migration and convergence patterns reported in the beginning of the paper. We assess the validity of this explanation in two ways: we first present stylized facts that suggest housing markets have played a key role in altering migration patterns, and then we introduce a new measure of housing supply restrictions to test this model directly.
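The migration block of the model, equations (6) and (7), can be sketched numerically. The code below sums discounted flow utility gains over a finite horizon (the sum in the text runs to infinity) and scales the result by a migration-speed coefficient. The names `present_value_of_gains`, `migration_rate`, and `speed`, the constant gain path, the truncation, and the value of `speed` are assumptions made for illustration; only r = 0.05 is taken from the authors' simulation footnote.

```python
import math


def present_value_of_gains(flow_gains, r, start=0):
    """Discounted sum of flow utility gains: sum over tau of exp(-r * tau) * gain.

    flow_gains[i] is the gain at date start + i. The horizon is finite here,
    whereas the sum in the text runs to infinity.
    """
    return sum(math.exp(-r * (start + i)) * g for i, g in enumerate(flow_gains))


def migration_rate(flow_gains, r, speed, start=0):
    """Migration rate taken to be proportional to the present value of gains."""
    return speed * present_value_of_gains(flow_gains, r, start)


# Illustrative path: a constant flow gain of 0.1 for 200 periods.
r, speed, gain, horizon = 0.05, 0.005, 0.1, 200
q = present_value_of_gains([gain] * horizon, r)

# Closed form of the same finite geometric sum, as a sanity check.
closed_form = gain * (1 - math.exp(-r * horizon)) / (1 - math.exp(-r))

print(f"q = {q:.4f}, closed form = {closed_form:.4f}")
print(f"migration rate = {migration_rate([gain] * horizon, r, speed):.5f}")
```

If rising housing prices push the flow gains for one skill group to zero or below, the same function returns a zero or negative migration rate for that group, which is the skill-sorting outcome described in the text.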

3 Motivating Facts on Housing Prices and Migration

In this section, we highlight four stylized facts on the evolution of the flows of and returns to migration in the U.S. These facts motivated the model laid out in the previous section and its emphasis on the elasticity of housing supply.

Fact 1: Differences in Housing Prices Have Grown Relative to Differences in Incomes

In the last fifty years, there has been a shift in the relationship between prices and incomes across states. Figure 3 plots the relationship between log income and log housing prices in 1960 and 2010. Each observation is a state’s mean income and median house value from the Census. In 1960, housing prices were 1 log point higher in a state with 1 log point higher income. By 2010, the slope had doubled, with housing prices 2 log points higher in a state where income was 1 log point higher.
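The slopes above translate directly into price gaps. As a worked illustration (the 20% income gap is a hypothetical example, not a figure from the paper), a log-log slope of 1 implies that a state with 20% higher income has 20% higher house prices, while a slope of 2 implies prices that are 44% higher. The function name is an assumption of this sketch.

```python
def implied_price_ratio(income_ratio, slope):
    """House price ratio implied by a log-log slope of prices on income."""
    return income_ratio ** slope


ratio_1960 = implied_price_ratio(1.2, slope=1.0)  # slope reported for 1960
ratio_2010 = implied_price_ratio(1.2, slope=2.0)  # slope reported for 2010

print(f"1960: prices {ratio_1960 - 1:.0%} higher")
print(f"2010: prices {ratio_2010 - 1:.0%} higher")
```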

Fact 2: Housing Prices Have Lowered the Returns to Living in Productive Places For Unskilled Workers

We test for changing returns by examining the relationship between unconditional average income in a state and skill-group income net of housing prices (Ruggles et al. [2010]).17 With i indexing households and j indexing state of residence, we regress:

$$\underbrace{Y_{ij} - P_{ij}}_{\text{Income-Housing Cost}} = \alpha + \beta_{unskilled} \underbrace{Y_j}_{\text{Nominal Income}} \times (1 - S_{ij}) + \beta_{skilled}\, Y_j \times S_{ij} + \eta S_{ij} + \gamma X_{ij} + \varepsilon_{ij}$$

where $Y_{ij}$ is household wage income, $P_{ij}$ is a measure of housing costs defined as 12 times the monthly rent or 5% of house value for homeowners, $S_{ij}$ is the share of the household that is skilled, and $Y_j$ is the mean nominal wage income in the state.18

Figure 4 shows the evolution of $\beta_{skilled}$ and $\beta_{unskilled}$ decade by decade. These coefficients measure the returns by skill to living in a state that is one dollar richer. For example,

17 Ideally, we would have a cost index for the price of all goods and services and use this to deflate income. Moretti [2012a] finds a strong positive correlation between housing prices and the price of other consumer goods. Unfortunately, we are unaware of any regional price indices going back to 1940.
18 Income net of housing cost is a household-level variable, while education is an individual-level variable. We conduct our analysis at the household level, measuring household skill using labor force participants ages 25-65. A person is defined as skilled if he or she has 12+ years of education in 1940, and 16+ years or a BA thereafter. The household covariates $X_{ij}$ are the size of the household, the fraction of household members in the labor force who are white, the fraction who are black, the fraction who are male, and a quadratic in the average age of the adult household members in the workforce.

$\beta_{unskilled}$ is 0.88 in 1940, meaning that for unskilled workers, income net of housing costs was \$0.88 higher in states with \$1.00 higher nominal income. $\beta_{unskilled}$ shows a secular decline from 1970 forward. The decade-specific $\beta_{skilled}$ coefficients show a different pattern. In 1940 and 1960, skilled and unskilled households had similar returns to migrating. By 2010, income net of housing costs is three times more responsive to nominal income differences by state for skilled households than for unskilled households. The returns to living in high income areas for unskilled households have fallen dramatically as housing prices rose, even as they have remained stable or grown for skilled households.19
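The skill-specific coefficients discussed above can be illustrated on a toy data set. The sketch below is a deliberate simplification: it uses synthetic households, treats skill as a binary household attribute, omits the controls, and fits a separate OLS line of income net of housing cost on state mean income for each skill group (with a binary skill indicator and no controls, this reproduces the two slopes of the interacted specification). Housing cost follows the definition in the text, 12 times monthly rent or 5% of house value; all names and numbers are hypothetical.

```python
def annual_housing_cost(monthly_rent=None, house_value=None):
    """12 times monthly rent for renters, 5% of house value for owners."""
    if monthly_rent is not None:
        return 12.0 * monthly_rent
    if house_value is not None:
        return 0.05 * house_value
    raise ValueError("need monthly_rent or house_value")


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Synthetic households:
# (state mean income, household income, monthly rent, house value, skilled)
households = [
    (20_000, 17_000, 500, None, False),
    (30_000, 26_000, 1_000, None, False),
    (40_000, 35_000, 1_500, None, False),
    (20_000, 34_000, 500, None, True),
    (30_000, 47_000, None, 200_000, True),
    (40_000, 64_000, 1_500, None, True),
]

slopes = {}
for skilled in (False, True):
    group = [h for h in households if h[4] == skilled]
    state_income = [h[0] for h in group]
    net_income = [h[1] - annual_housing_cost(h[2], h[3]) for h in group]
    slopes[skilled] = ols_slope(state_income, net_income)

print(f"unskilled slope: {slopes[False]:.2f}, skilled slope: {slopes[True]:.2f}")
```

The toy numbers are chosen so that the skilled slope is three times the unskilled slope, echoing the 2010 pattern the text describes; they carry no empirical content of their own.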

Fact 3: Migration Flows Respond to Skill-Specific Gains Net of Housing Prices

Next, we examine the extent to which people moved from low to high income places. We estimate income in both nominal terms and using the income net of housing cost measure developed above. We estimate net migration using the Census question “where did you live 5 years ago?”, which was first asked in 1940 and last asked in 2000. We use the most detailed geographies available in public use microdata: State Economic Areas in 1940 (467 regions) and migration PUMAs in 2000 (1,020 regions). In Figure 5, we examine migration patterns from 1935 to 1940. As is evident from the graphs, both skilled and unskilled adults moved to places with higher nominal income.20 The same relationship holds true for income net of housing cost.21 In Figure 6, we examine migration patterns from 1995 to 2000. Although skilled adults are still moving to high unconditional nominal income locations, unskilled adults are actually weakly migrating away

19 In the Appendix, we report the results of two robustness checks. First, to reduce the bias arising from the endogeneity of state of residence, we also provide instrumental variables estimates using the mean income level of the household workers’ state of birth as an instrument. To be precise, we estimate

\[ Y_{ij} - P_{ij} = \alpha + \beta_{unskilled}\, \hat{Y}_j \times (1 - S_{ij}) + \beta_{skilled}\, \hat{Y}_j \times S_{ij} + \eta S_{ij} + \gamma X_{ij} + \varepsilon_{ij}, \]

using \(Y_{j,birth}\) and \(Y_{j,birth} \times S_{ij}\) as instruments for the two endogenous variables \(\hat{Y}_j \times (1 - S_{ij})\) and \(\hat{Y}_j \times S_{ij}\). Second, we demonstrate that housing costs have differentially changed housing prices in high nominal income places for low-skilled workers.

20 Migration and education are person-level variables, while income net of housing cost is a household-level variable. We conduct our analysis at the individual level, merging on area-by-skill measures of income net of housing cost. To construct area-by-skill measures, we define households as skilled if the adult labor force participants in said household are skilled, and as unskilled if none of them are skilled. See notes to Figure 5 for details. The specifications shown in Figures 5 and 6 involve some choices about how to parameterize housing costs and which migrants to study. In the Appendix, we report four robustness checks: doubling housing costs for the income net of housing cost measure, excluding migrants within-state, using only whites, and using a place of birth migration measure. In 1940, all slopes are positive, and most are statistically significant. In 2000, all slopes are positive and statistically significant for skilled workers. For unskilled workers, the coefficients broadly fit the patterns in Figure 6, although they are only sometimes statistically significant.

21 These results are similar to work by Borjas [2001], who finds that immigrants move to places which offer them the highest wages.

from these locations.22 This finding sharply contrasts with the results from the earlier period in which there was directed migration for both groups to high nominal income areas. It is an apparent puzzle that unskilled households would be moving away from productive places. However, this seeming contradiction disappears when we adjust income to reflect the group-specific means net of housing prices. High housing prices in high nominal income areas have made these areas prohibitively costly for unskilled workers. Changes in observed migration patterns are consistent with the changes in the returns to migration shown above.

Fact 4: Migration Used to Generate Substantial Human Capital Convergence Across Regions

We now examine the effect of migration flows on aggregate human capital levels. We present evidence that the transition from directed migration to skill sorting appears to have substantially weakened human capital convergence due to migration. We follow the growth-accounting literature (e.g. Denison [1962], Goldin and Katz [2001]) and estimate a Mincer regression in the IPUMS Census files. Under the assumption of a fixed national return to schooling, a state’s skill mix and these coefficients can be used to estimate its human capital.23 We construct predicted income as \(\widehat{Inc}_k\) for each education level \(k\) and \(Share_{kj}\) as the share of people in human capital group \(k\) living in state \(j\). A state-level index is

\[ \text{Human Capital}_j \equiv \sum_k \widehat{Inc}_k \times Share_{kj}. \]

Our research design exploits the fact that the Census asks people about both their state of residence and their state of birth. We can then compute the change in the human capital index due to migration as

\[ \Delta HC_j \equiv \underbrace{\sum_k \widehat{Inc}_k\, Share_{kj,residence}}_{\text{Realized Human Capital Allocation}} - \underbrace{\sum_k \widehat{Inc}_k\, Share_{kj,birth}}_{\text{No-Migration Counterfactual}} \]

Next, we take the baseline measure of what human capital would have been in the absence of migration (\(HC_{j,birth}\)) and examine its relationship with how much migration changed the

22 Young et al. [2008] similarly show that from 2000 to 2006, low-income people migrated out from New Jersey, while high-income people migrated in.

23 Formally, we estimate the specification \(\log Inc_{ik} = \alpha_k + X_{ik}\beta + \varepsilon_{ik}\) where \(Inc_{ik}\) is an individual’s annual income, and \(X_{ik}\) includes demographic covariates, using data from the 1980 Census. We construct predicted income as \(\widehat{Inc}_k = \exp(\hat{\alpha}_k)\). Skill level \(k\) is defined as seven possible completed schooling levels (0 or NA, Elementary, Middle, Some HS, HS, Some College, College+). \(X_{ik}\) includes a dummy for Hispanic, a dummy for Black, a dummy for female and four age bin dummies. There is a substantial literature showing that the South had inferior schooling quality conditional on years attained (e.g. Card and Krueger [1992]). Thus this measure is, if anything, likely to underestimate the human capital dispersion across states.

skill composition of the state (\(\Delta HC_j\)). Specifically, we regress

\[ \Delta HC_j = \alpha + \beta\, HC_{j,birth} + \varepsilon_j \]

Figure 7 shows the results of this regression for different years in the U.S. Census. We restrict our analysis to people ages 25 to 34, who have completed their education but are likely to have migrated recently.24 We estimate a slope of \(\hat{\beta} = 0.33\) in the 1960 Census. Of the human capital dispersion by state of birth, migration of low human capital workers to high human capital places was sufficient to eliminate 33% of the disparities in human capital. By 2010, migration would have eliminated only 8% of the remaining disparity.25
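The index and the regression above can be illustrated with a minimal Python sketch under stated assumptions; it is not the authors' code. The two education groups, the predicted incomes, the state shares and all function names are hypothetical, chosen only to show how the realized allocation is compared with the no-migration counterfactual.

```python
from statistics import mean


def human_capital_index(shares, predicted_income):
    """Sum over education groups k of predicted_income[k] * shares[k]."""
    return sum(predicted_income[k] * shares[k] for k in predicted_income)


def ols_slope(x, y):
    """Slope from a one-regressor OLS fit of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical predicted income by education group (two groups, not seven).
predicted_income = {"HS": 20.0, "College": 40.0}

# Hypothetical skill shares by state of birth and by state of residence.
states = {
    "A": {"birth": {"HS": 0.8, "College": 0.2},
          "residence": {"HS": 0.7, "College": 0.3}},
    "B": {"birth": {"HS": 0.5, "College": 0.5},
          "residence": {"HS": 0.5, "College": 0.5}},
    "C": {"birth": {"HS": 0.2, "College": 0.8},
          "residence": {"HS": 0.3, "College": 0.7}},
}

# No-migration counterfactual and realized allocation for each state.
hc_birth = [human_capital_index(s["birth"], predicted_income) for s in states.values()]
hc_resid = [human_capital_index(s["residence"], predicted_income) for s in states.values()]

# Change in the index due to migration, regressed on the birth-state baseline.
delta_hc = [r - b for r, b in zip(hc_resid, hc_birth)]
slope = ols_slope(hc_birth, delta_hc)
print(round(slope, 3))
```

In this toy example the low-index state gains human capital and the high-index state loses it, so the fitted slope is negative as computed here. Its magnitude, about one third, is the share of the birth-state gap that migration offsets.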

4 A Panel Measure of Housing Regulations

These stylized facts suggest that changes in housing prices were an important contributor to changing migration and convergence patterns. The model formalized this idea and highlighted the importance of changes in the elasticity of housing supply in growing regions. In this section, we explore the role of regulations directly. We develop a new measure of housing supply regulations based on state appeals court records. Past empirical work has shown tight links between prices and measures of land use regulation in the cross-section, and these regulations are a good proxy for the parameter \(\eta\) in the model.26 This new measure is, to the best of our knowledge, the first panel of housing supply regulations covering the United States and we validate it against existing cross-sectional regulation measures.27 We use this measure to test for the entire causal chain of the model by showing that housing supply constraints reduce permits for new construction, raise prices, lower net migration, slow human capital convergence and slow income convergence.

24 To the extent that people migrate before age 25 (or their parents move them somewhere else), we may pick up older migration flows. Nevertheless, this statistic still has a well-defined interpretation as the amount of human capital convergence due to migration within a cohort.

25 This figure shows that migration contributed to convergence in human capital levels. Looking at convergence in average human capital levels, including native-born residents’ human capital investment decisions, we do not see the same decline in human capital convergence for the same aged sample. This occurs in part because the fraction of natives completing high school rose sharply among low human capital Southern states in the 1970’s and 1980’s, while this fraction was already high for the rest of the country.

26 Examples include Glaeser et al. [2005a], Katz and Rosen [1987], Pollakowski and Wachter [1990], Quigley and Raphael [2005], and Rothwell [2012] using US data. See Brueckner and Sridhar [2012] for work on building restrictions in India.

27 In a similar spirit, Hilber and Vermeulen [2013] analyze a panel of land use regulations in the UK.

4.1 Measuring Land Use Regulations

Our measure of land use regulations is based upon the number of state appellate court cases containing the phrase “land use” over time. The phrase “land use” appears 42 times in the seminal Mount Laurel decision issued by the New Jersey Supreme Court in 1975. We also show similar results for the phrase “zoning” in the Appendix. Municipalities use a wide variety of tactics for restricting new construction, but these rules are often controversial and any such rule, regardless of its exact institutional origin, is likely to be tested in court. This makes court decisions an omnibus measure that captures many different channels of restrictions on new construction. We searched the state appellate court records for each state-year using an online legal database and produced counts of land use cases in per capita terms.

One immediate result from constructing this measure is that land use cases have become increasingly common over the past fifty years. Figure 8 displays the national regulation measure over time, which exhibits strong secular growth. Growth is particularly rapid from 1970, when it stood at about 25% of its current level, to 1990, when it reached about 75% of its present day level.

We validate our measure against the existing cross-sectional measures that focus on supply constraints. The first survey, from the American Institute of Planners in 1975, asked 21 land use-related questions of planning officials in each state (The American Institute of Planners [1976]).28 To build a summary measure, we add up the total number of yes answers to the 21 questions for each state. As can be seen in Figure 8, the 1975 values of our measure are strongly correlated with this measure. Similarly, our measure is highly correlated with the 2005 Wharton Residential Land Use Regulation Index (WRLURI).29 Finally, state-years with high levels of regulation show increased capitalization of income into housing prices.

4.2 Why Did Land Use Regulations Change?

Since Ellickson [1977]’s seminal article, it has been widely accepted that municipalities’ land use restrictions serve to raise property values for incumbent homeowners.30 In this section, we examine the institutional and demographic factors which may have led such regulations

28 Saks [2008] also uses this survey as a measure of land use regulations.

29 To construct state-level measures, we weighted the metro estimates in Gyourko et al. [2008] by 1960 population and imputed from neighbors where necessary.

30 Blanchflower and Oswald [2013] demonstrate the link between homeownership and land use regulation empirically.

to become more widespread and more effective in constraining supply across an entire region.

Many land use scholars point to a landmark shift toward new stringencies in regulations in the 1960’s and 1970’s. Fischel [2004] argues that in the wake of racial desegregation, land use restrictions allowed suburban residents to keep out minorities using elevated housing prices, and that environmentalism provided a sanitized language for this ideology. He writes “I submit that neighbour empowerment and double-veto systems, in conjunction with local application of environmental laws, changed metropolitan development patterns after 1970.” In a book on land use regulation, Garrett [1987] writes

A changing public attitude toward growth and development within many local communities emerged in the early 1960s. Two factors were simultaneously responsible for this change. First, there was an increasing concern over environmental issues, and it was apparent that certain types of economic development were detrimental to the environment. Second, economic analysis began to demonstrate that all forms of economic development did not generate a positive fiscal impact in every community.

Along similar lines, the American Land Planning Law textbook (Taylor and Williams [2009]) writes that, after a period in the 1900’s during which courts typically held the application of restrictions to particular tracts of land to be invalid, the courts “went to the other extreme, tending to uphold anything for which there was anything to be said.” Our statistical regulation measure is broadly consistent with this argument, although the change in the intellectual climate described above somewhat preceded the run-up in our measure: the flow of new land use cases rose sharply from 1970 to 1990.

Because land use rules are administered at the local level, there are no seminal Supreme Court cases which marked this new era of jurisprudence. Among state cases, scholars typically cite Mount Laurel vs. National Association for the Advancement of Colored Persons (NAACP) as among the most important. Philadelphia suburb Mount Laurel, at the time composed primarily of single family houses, adopted rules which required that developers of multi-family units provide in leases that (1) no school-age children may occupy a one-bedroom unit and (2) no more than two children may occupy a two-bedroom unit. In addition, should a development have more than 0.3 children per unit on average, the developers were required to pay any additional tuition costs. The NAACP sued, and in 1975, the New Jersey Supreme Court ruled in its favor, finding that each community had to provide its “fair share” of “low- and moderate-income housing.”

While the NAACP won the case, Mount Laurel and like-minded suburbs won the war. Mount Laurel’s new planning ordinance rezoned only 20 of its 14,300 acres, choosing locations such that “the new zones had serious physical difficulties and restrictions created by the ordinance that rendered their actual development for low-cost housing virtually impossible” (Garrett [1987]). In 1977, the state Supreme Court issued a new ruling in the Oakwood at Madison decision, which substantially rolled back its prior decision, finding instead that courts were not competent to determine what constituted a “fair share”. These cases led to the “Mount Laurel Doctrine,” wherein judges began to play a continuing role in monitoring local zoning policies, but the sea change had already occurred in New Jersey. From 1970 to 2010, its urban population grew at an annual rate of 0.4%, less than half the national average for this period.31

New state and regional environmental restrictions on land use, detailed in a White House report titled “The Quiet Revolution in Land Use Control”, added another constraint on new construction. These restrictions played a crucial role in preventing construction on a metro-wide level, an argument highlighted by Ellickson [1977]. In a Tiebout model where consumers choose locations, if some municipalities restrict construction as Mount Laurel did, and other places respond by issuing more permits, then the aggregate impact on new units and average prices could be zero. For example, in the East Bay region in California, while many municipalities restricted construction, the coastal city of Emeryville adopted developer-friendly policies, yielding much higher-density units. In 1969, the California Legislature gave the San Francisco Bay Conservation and Development Commission the power to require permits from anyone seeking to develop land along the shoreline (Bosselman and Callies [1971]). The Commission then blocked a plan by Emeryville to fill the Bay and construct large developments there.32 The East Bay has remained an attractive place to live, but with no municipality willing to allow new construction, housing prices across the East Bay have soared in recent years.

Local variation in regulations is not randomly assigned; it is the product of substantial work by local governments and regulatory bodies. There is some recent work on the political economy of the regulations. Kahn [2011] shows that in California, cities which vote Democratic tend to issue fewer housing permits. Hilber and Robert-Nicoud [2013] and Schleicher [2013] develop political economy stories where changes in the share of developed land, and in the structure of city politics, respectively, cause changes in land use policies.

31 Urban population is defined as population living in a Primary Metropolitan Statistical Area.

32 A change in town leadership in the election of 1987 also led to a slowdown in new development. Nevertheless, Emeryville today still has some of the highest-density construction in the East Bay and this new regional authority further limited Tiebout competition.

In our empirical analysis, we first examine the relation between regulation and regional economic outcomes. Then, cognizant of the fact that regulations do not arise randomly, we address concerns about causality by studying the heterogeneity of states’ responses to the national change in the regulatory environment described above. We test whether this aggregate change had a different impact on the convergence rates of states with larger or smaller historical tendencies to regulate land use, and for states with more or less severe geographic limits on development. We also consider the main alternative interpretations of the data in Section 5, and find that housing supply constraints are required to make sense of the data.

4.3 Testing the Model using a Panel Measure of Regulations

Having established that our regulation measure is a good proxy for housing supply constraints, we test its direct effect on the convergence relationship. Before turning towards regressions, we first demonstrate the effect of land-use regulations on convergence graphically. Figure 9 shows differential convergence patterns among the high and low regulation states. The convergence relationship within the low regulation states remains strong throughout the period. Conceptually, we can think of this group of states as reflecting the model prior to the change in regulations, with within-group reallocations of people from low-income states to high-income states. In contrast, the convergence coefficients among states with tight regulations display a pronounced weakening over time (although convergence reappears briefly among high-regulation states during the recent recession). As a robustness check, we divide the states according to a measure of their housing supply elasticity based upon land availability and the WRLURI constructed by Saiz [2010]. Again, we find that convergence continues among states without supply constraints, but has stopped primarily in states with constraints.

We now turn towards regressions and explore the effect of regulations more rigorously on the entire convergence mechanism described above. It is not obvious what functional form should be used to scale court cases into a regulation measure. We adopt a flexible and transparent specification, ranking state-years by their land use cases per capita:

\[ Reg_{s,t} = \text{Rank}\left\{ \frac{LandUseCases_{st}}{Pop_{st}} \right\} \]

We rescale these values to create a variable ranging from zero for the least regulated state-year to one for the most regulated state-year.33 Regulations are rising over time, from an average of 0.15 in 1950 to 0.64 in 1990. Our baseline specifications are of the following form:

\[ Y_{s,t} = \alpha_t + \alpha_t Reg_{s,t} + \beta\, Inc_{s,t} + \beta_{high\ reg}\, Inc_{s,t} \times Reg_{s,t} + \varepsilon_{s,t} \tag{8} \]

The coefficients of interest, \(\beta\) and \(\beta_{high\ reg}\), measure the effect of lagged income in low and high regulation state-years and are reported in Table 2.34 First, we examine housing supply. Absent land use restrictions, places with higher income will face greater demand for houses and will permit at a faster rate. Accordingly, the base coefficient on income in column 1 is positive, indicating that places with 10% higher incomes had a 0.5% higher annual permitting rate. The interaction term \(\beta_{high\ reg}\) is negative and similar in size: in the high-regulation regime there is no correlation between income and permits for new construction. This reduction in housing supply in high-income places means that housing prices should rise in those places. In column 2, we show that at baseline there is a positive correlation between income and housing prices (with 1% higher income associated with 0.8% higher prices), but that the slope of the relationship doubles in high regulation state-years. Income differences are increasingly capitalized into prices.35 Columns 3 and 4 explore migration responses to this change in prices. In our model, states with high income per capita will draw migrants when regulation is low, consistent with the baseline coefficient in column 3 that shows 0.17% higher annual population growth

33 We conduct robustness tests on alternate scaling of the regulation measure in the appendix. We also explore the robustness of the relationship between declining income convergence and regulations in alternate regression models in Appendix Table 6. Specifically, this table reports the following specifications in the correspondingly numbered columns: (1) Our baseline convergence relationship; (2) A specification where the regulation variable is interacted with a dummy for greater than median income. This follows our model in assuming that regulations only bind in growing locations; (3) A specification that controls for the percent of the population with a BA and the interaction of this share with initial income. This specification, like Section 5.1, is designed to show the robustness of the regulation result to controls for skill-biased technological change; (4) A specification with log income squared, accounting for potential nonlinearity in convergence; (5) A specification that includes Census division fixed effects interacted with regulations to account for differential regulation growth across regions; (6) A specification that includes year fixed effects interacted with initial income, which allows for different baseline convergence rates across time. In all of these models, the relationship between tighter regulation and slower convergence remains statistically significant.

34 This specification follows the literature in not including state fixed effects. See Barro [2012] for a discussion of how state/country fixed effects can lead to misleading convergence results in short panels.

35 Our findings that increases in regulation raise capitalization are similar to those by Hilber and Vermeulen [2013] for the UK. Similarly, Saks [2008] and Glaeser et al. [2006] find in the US that employment demand shocks are capitalized into prices rather than quantities in the high regulation regime. However, see Davidoff [2010] for a dissenting view about the impact of regulations on housing prices using cross-sectional data. Davidoff writes “Unfortunately, a panel of regulations is not available, so there is no way to determine if time series changes in regulations are associated with changes in supply.”

in places with 10% higher incomes. When income differences are capitalized into prices, the incentive to move is diminished, and directed migration slows. The positive interaction coefficient shows that directed migration almost completely disappears in the state-years with high regulation. We also examine how the composition of migration responds to income, using the change in the log of the human capital measure from Section 3. When housing supply is elastic, the negative baseline coefficient in column 4 indicates that migration undoes any initial human capital advantage held by productive places. The interaction coefficient is positive, indicating that human capital convergence slows among high regulation observations.

Finally, Column 5 brings this analysis full circle by directly looking at the effect of high regulations on the convergence relationship. The uninteracted coefficient (-2.0) captures the strong convergence relationship that exists absent land use restrictions shown in the early years in Figure 1. However, the interaction coefficient is large and positive (1.3). This finding indicates that the degree of convergence among states in periods of high regulation is significantly diminished.

One potential concern is that our measure is picking up changes in the overall regulatory or legal climate, rather than a change which is specific to land use. As a placebo test, we repeat the analysis above, substituting the placebo measure

\[ RegPlacebo_{s,t} = \text{Rank}\left\{ \frac{Cases_{st}}{Pop_{st}} \right\} \]

This measure also exhibits secular growth, from an average of 0.30 in 1950 to 0.66 in 1990. This means that if our results above were due to changes in the overall state-level regulatory climate or due to time trends, then we should expect them to also appear as part of this placebo test. Instead, however, we find that the interaction coefficients on \(RegPlacebo_{s,t}\) are small in magnitude and not statistically significant.

Table 2 tightly links the theory from Section 2 to the observed data. The first row of coefficients describes a world where population flows to rich areas, human capital converges across places, and regional incomes converge quickly as in the model before the regulatory shock. The second row of coefficients is consistent with the high regulation regime described in the model after the shock, with increased capitalization, no net migration, and much less income convergence.
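Both the land use measure and the placebo measure in this section apply the same transformation to per capita case counts: rank the state-years, then rescale the ranks to run from zero to one. A minimal Python sketch of that transformation follows. It is an illustration under stated assumptions, not the authors' procedure: the function names and the case and population figures are hypothetical, and ties are simply left in sort order.

```python
def regulation_rank(cases, population):
    """Rank state-years by cases per capita and rescale ranks to [0, 1].

    cases, population: dicts keyed by (state, year). The least regulated
    state-year maps to 0.0 and the most regulated to 1.0. Requires at least
    two state-years; ties are not averaged (an assumption of this sketch).
    """
    per_capita = {key: cases[key] / population[key] for key in cases}
    ordered = sorted(per_capita, key=per_capita.get)
    top = len(ordered) - 1
    return {key: position / top for position, key in enumerate(ordered)}


def mean_by_year(measure):
    """Average the rescaled measure across states within each year."""
    by_year = {}
    for (_state, year), value in measure.items():
        by_year.setdefault(year, []).append(value)
    return {year: sum(values) / len(values) for year, values in by_year.items()}


# Hypothetical counts of appellate cases and populations (in millions).
cases = {("NJ", 1950): 2, ("NJ", 1990): 40, ("TX", 1950): 1, ("TX", 1990): 10}
population = {("NJ", 1950): 5.0, ("NJ", 1990): 8.0,
              ("TX", 1950): 8.0, ("TX", 1990): 17.0}

reg = regulation_rank(cases, population)
avg = mean_by_year(reg)
print(reg)
print(avg)
```

Because the ranking pools all state-years, a national rise in litigation shows up as a rising yearly average of the measure, which is how the text can report averages for 1950 and 1990.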

4.4 Identification from National Changes and Preexisting Regional Differences

This section analyzes evidence in favor of a causal relationship between land use regulations and convergence. In the 1970s there was a dramatic change in the prevalence of land use regulations in the US, as described by land use scholars in Section 4.2. Though our regulation measure is lower across the board prior to the 1970s, states nevertheless differed in their legal cultures regarding land use and in their natural supply constraints. This heterogeneity made some states more likely to be affected by the change in the national climate towards land use regulations. Many other authors use a similar identification strategy of using historical differences across places and studying national changes in industry, ethnic composition or occupations (Bartik [1991], Card [2009], and Autor and Dorn [2013]). We estimate specifications of the form

\[ \Delta Inc_{s,t} = \alpha_t + \alpha_t LatentConstraint_s + \beta\, Inc_{s,t} + \beta_{constrained}\, Inc_{s,t} \times LatentConstraint_s + \varepsilon_{s,t} \]

where \(LatentConstraint_s\) are measures of a state’s susceptibility to regulations that are fixed across time. We split the sample into a pre-period, with twenty year windows from 1940-1960 through 1965-1985, and a post-period, with twenty year windows from 1965-1985 through 1990-2010. Statistically, this takes the form of testing whether \(\beta_{constrained}\) is the same in the pre and the post period.

Before turning to preexisting measures, we first demonstrate the result of this test when using a recent cross section of regulations. Columns 1 and 2 demonstrate that states with high and low regulation in 2005 had similar convergence rates in the first half of the sample, but that convergence slowed in high-regulation states after these restrictions were enacted. A potential concern raised above is that changes in skill composition, demographics or industrial patterns raised regulations and independently affected migration and convergence patterns. To gauge the importance of this bias, Columns 3 and 4 re-estimate this relationship controlling for a wide variety of state level measures of industry and skill composition from Autor and Dorn [2013] and show similar results.36

Controlling for potentially confounding covariates does not address the possibility of reverse causality through unobserved channels. Although regulation was low across the board in 1965, there is still cross-sectional variation in our measure for that year. This variation in permissiveness to laws regarding land use is predictive of subsequent increases in regulation,

36 Specifically, we control for their measures of the share of workers in routine occupations, the college to non-college population ratio, immigrants as a share of the non-college population ratio, manufacturing employment share, the initial rate, the female share, the share age 65+, and the share earning less than the 10 year ahead minimum wage. We aggregate their data to the state level via population weighting.

and the correlation between the measures in 1965 and 2005 is 0.47. Though this measure is correlated with eventual regulation outcomes, variation in this measure cannot be plausibly explained by a subsequent shock affecting migration and convergence. Nevertheless, we find that states with low and high regulation values displayed similar convergence behavior in the first half of the sample. In the second half, once these latent tendencies had been activated in the form of high regulations, these states experience a sizeable drop in their degree of income convergence.

Finally, we classify counties based upon the geographic availability of developable land using data from Saiz [2010].37 This measure cannot be affected by any shock altering migration or convergence, yet it too should predict the severity of supply constraints after a nationwide rise in building restrictions. Again, the table demonstrates that counties with low geographic land availability did not display different convergence behavior in the past. In the period with tight building restrictions, however, these counties also experience a reduction in their rates of income convergence.

We interpret these results as consistent with a change in housing supply constraints over time, with a latent tendency to regulate that was higher among states with more land use cases in 1965.38 Table 3 shows that if housing supply restrictions did not affect income convergence, then regulations must be correlated with a non-related convergence-ending shock, and this new shock must also be correlated with both states’ geography and historical legal structures. Moreover, such an explanation would have to explain why neither feature influenced convergence rates prior to the period of high land use regulation. Although it is possible to generate such an explanation, articulating such a story is sufficiently complicated that we feel the weight of the evidence supports a role for housing supply restrictions.

5 Other Factors Affecting Convergence

Our analysis thus far has explored the role that housing regulations have played in changing skill-specific labor mobility and income convergence. Of course, other factors are likely to

37 Saiz [2010] produces a metro-area level measure of developable land. Using data from the Census, we build a consistent series for median household income at the county-level. While the unit of observation is the county, we cluster our standard errors at the metro area level.

38 One alternative interpretation is that our 1965 empirical measure detects fixed, heterogeneous elasticities across places. This interpretation is inconsistent with the secular increase in land use cases shown in Figure 8. It is also inconsistent with sustained income convergence and directed migration observed in the data from 1880 to 1980. If the North had substantial barriers to new construction before 1980, then its population could not have grown so rapidly beforehand.

affect both patterns, and in this section, we consider how these forces relate to the results in the previous section.

5.1 Skill Biased Technological Change

Conceptually, skill-biased technological change (SBTC) could slow the rate of convergence for several reasons. Consider an increase in the skill premium. This change would have two effects on convergence rates. It mechanically widens the income gaps between richer, more educated states and poorer, less educated ones. Additionally, in our model, it raises the returns to migration for skilled workers living in low-income states. The change in the returns to migration is complementary to our supply constraints story: both forces serve to make migration to rich places more heavily weighted towards skilled workers. As for the magnitude of the mechanical effect, Autor et al. [2008] estimate that the college-high school premium rose from 0.40 in 1980 to 0.64 in 2000. The share of people with a BA (henceforth “share BA”) in 1980 had a standard deviation of 3 percentage points across states, and the mechanical increase in the skill premium would have reduced the annual convergence rates by roughly 0.18. The observed change in annual convergence rates was 1.11, meaning that the mechanical effect of SBTC provides a partial but incomplete account for the change.

Finally, it is possible that as skills have become more important, incomes of everyone in high share BA places would rise due to agglomeration externalities. We know from the work of Gennaioli et al. [2013a] that human capital levels play a central role in determining the level of regional development (see also Moretti [2012b], Glaeser and Saiz [2004], Berry and Glaeser [2005]). Under this theory, incomes would grow more quickly in these places, slowing convergence.

One testable prediction which differentiates this story (a demand shock in productive areas) from our housing supply constraints story comes from skill-specific migration patterns. A positive demand shock should raise in-migration rates for all workers. If this demand shock mostly affected skilled workers, then it should raise the migration rate for skilled workers. In contrast, a negative housing supply shock predicts sharply falling in-migration by low-skill workers and a smaller decline in in-migration for skilled workers. Although information-economy cities such as San Francisco, Boston and New York offer high nominal wages to all workers (typically in the top quintile nationally), after adjusting for housing costs all three cities offer below average returns to low-skill workers (typically in the bottom decile). In Table 4, we examine the flows of unskilled and skilled workers in 1980 and 2010 to high skilled states as measured in 1980. This period and independent variable

24 were chosen to be consistent with the literature on skill agglomerations. There has been a marked shift in the composition of migration to high share BA places. From 1980 to 2010, there was a large decrease in the in-migration rate of low-skilled workers to high share BA states, and no change or a small decline in the in-migration rate of skilled workers to high share BA states. These results suggest that rising share BA in areas with ahighinitialshareofBAsdocumentedbyotherresearchersmaypartiallybetheresult of out-migration by unskilled workers and increased domestic human capital production, rather than increasing in-migration by skilled workers. Overall, SBTC and its place-specific variants are complementary with the supply constraints story developed here. When supply is constrained, increases in demand for skilled labor serve to further slow convergence.
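As a rough check on the magnitudes quoted in this subsection, the share of the convergence slowdown that the mechanical SBTC effect could account for follows directly from the two figures in the text (0.18 and 1.11). The sketch below is only illustrative arithmetic; the variable and function names are ours and do not come from the original analysis.

```python
# Figures quoted in Section 5.1 of the text.
mechanical_effect = 0.18  # reduction in annual convergence rate from the higher skill premium
observed_change = 1.11    # observed change in annual convergence rates


def share_explained(part: float, total: float) -> float:
    """Fraction of a total change accounted for by one component."""
    return part / total


sbtc_share = share_explained(mechanical_effect, observed_change)
print(f"Mechanical SBTC share of the slowdown: {sbtc_share:.1%}")
```

Under these figures the mechanical effect covers roughly one-sixth of the observed change, which is consistent with the description of it as a partial but incomplete account.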

5.2 Different Steady States: Convergence Has Already Happened

Income gaps across states are smaller today than they were in the past. Perhaps differences in incomes today reflect steady-state differences. While possible, two pieces of evidence are inconsistent with this suggestion. First, a close examination of Figure 1 shows that from 1940 to 1960 there was within-group convergence among the rich states as well as among the poor states. The income differences between Connecticut and Illinois or Mississippi and Tennessee in 1940 are smaller than the differences between Connecticut and Mississippi in 1990, and yet there was substantial within-group convergence from 1940 to 1960 and much less from 1990 to 2010. Second, our analysis with the regulation measure (e.g. Figure 9) shows substantial within-group convergence in the low regulation group, suggesting that existing income differences today are sufficiently large and transitory as to make convergence possible.

5.3 Racial Migration Patterns

In parts of the previous analysis, we did not distinguish between the income convergence and migration patterns of different racial groups. A possible interpretation of the migration patterns we observe over this period might attribute them to black mobility for non-economic motives. If changes in racial discrimination were correlated over time and across places with changing land use regulations, then our results may falsely attribute a causal role to housing prices in ending convergence. To check this possibility, we re-create the top two panels of Figures 1, 2, and 9 using income and population growth rates for whites only. These results (presented in Appendix A.3) show that outcomes for whites closely follow the aggregate pattern.

5.4 Land Constraints, Productive Land and Physical Capital

Our analysis abstracted from considerations about the role of land and physical capital and in this section, we consider these factors briefly. While there are certainly technological and physical constraints to urban growth, we believe that regulatory constraints have been the primary barrier to new construction. Our view is based on two sets of facts: growth has fallen in some wealthy areas with very heterogeneous densities, and there is a strong correlation between growth slowdowns and our measure of regulations. Perhaps the most striking example of a growth slowdown comes from the Primary Metropolitan Statistical Area (PMSA) formed by Bergen and Passaic counties in New Jersey, which are located directly across the Hudson River from New York City. Starting from a density of about 1,700 people per square mile in 1940, this area’s population grew at a rate of over 2% a year. Then, having reached a density of about 3,200 people per square mile in 1970, its population grew at an annual rate of only 0.04% over the next thirty years. Perhaps 3,200 people per square mile is a technological cutoff to feasible density, or Americans have a strong preference for density to be less than this value. However, the data show a pattern of low population growth rates among urban areas with very heterogeneous densities. Annual population growth from 1990 to 2010 was 0.5% or lower in the PMSAs of Jersey City (with density of 11,800 people per square mile in 1990), San Francisco (density: 1,600), and Boston (density: 1,600). If Bergen-Passaic’s density were the natural limit, then we would have expected to see continued growth in San Francisco and Boston. Further, while there might be heterogeneity in natural density limits across places, it seems unlikely that these limits would be naturally correlated with both the time and cross-sectional pattern of regulations. Thus, while the baseline migration and convergence facts might be consistent with heterogeneous, fixed supply curves, this evidence suggests policy-driven supply changes.
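The Bergen-Passaic growth figures can be sanity-checked with a compound annual growth rate calculation. The sketch below assumes the land area of the PMSA is fixed, so that density growth equals population growth; that assumption, and the function name, are ours rather than the paper’s.

```python
def annual_growth_rate(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two levels."""
    return (end / start) ** (1.0 / years) - 1.0


# Densities quoted in the text (people per square mile), 1940 and 1970.
rate_1940_1970 = annual_growth_rate(1700, 3200, 30)

# Cumulative growth implied by 0.04% a year over the following thirty years.
cumulative_1970_2000 = (1.0 + 0.0004) ** 30 - 1.0

print(f"1940-1970 annual growth: {rate_1940_1970:.2%}")
print(f"1970-2000 cumulative growth: {cumulative_1970_2000:.2%}")
```

The first rate comes out slightly above 2% a year, in line with the text, while the later period implies total growth of only about 1% across three decades.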

Our analysis also abstracted from the role of land in production, but it is straightforward to incorporate this factor as a complement in production by setting $Y_{jt} = A_j \left( n_{jut}^{\rho} + \theta n_{jst}^{\rho} \right)^{\frac{1-\alpha}{\rho}} \times \text{Land}$. If regulations reduced the availability of residential and productive land, then the marginal product of labor would fall in areas with tighter restrictions. Given that the rise in regulations is correlated with income, this would increase the speed of convergence. We have shown that convergence has actually slowed considerably, meaning that the countervailing forces described in our model must be sufficient to overcome this channel.

Past work, most notably Barro and Sala-i Martin [1992], has also explored the role of physical capital accumulation in convergence. Empirical measures of the state-level capital stock are quite difficult to obtain.39 One alternative measure of the returns to capital comes from regional interest rates. Landon-Lane and Rockoff [2007] report that regional interest rates largely converged by the end of World War II, relatively early in the time period of our study. This makes changes in the accumulation of physical capital a less likely candidate to explain the changes in post-war convergence we study.

5.5 Amenities

In addition to differing in their productivity and housing supply, locations also differ in the non-productive amenities they offer workers. The value of these amenities has surely changed over time (Diamond [2012]), yet in the absence of housing supply constraints, amenity shocks alone are unlikely to explain the changing convergence patterns we observe. To see this, note that the model in Section 2 can be modified to accommodate these differences or shocks to these consumption amenities by rewriting the per-period utility function as $u_{jkt} = c_{jkt}^{1-\beta} \left( h_{jkt} - \bar{H} \right)^{\beta} + \text{amenity}_{jt}$. The model can then map changes in a region’s amenities into changes in migration patterns, housing prices, and rates of income convergence.40 Consider, first, a positive amenity shock in the more productive North. Such a shock raises the benefit of migrating from South to North. While this shock would raise housing prices in the North, it would also increase migration and speed income convergence, which is inconsistent with the data in our paper. Alternately, consider a positive amenity shock in the less productive South. This shock would indeed reduce migration rates from South to North and do so disproportionately for unskilled workers. By reducing the population in the North, however, it would predict a relative decline in housing prices in that region, rather than the increase that we see in the data. Therefore, while amenities are certainly important for understanding migration patterns, an amenity shock to North or South in our model produces testable predictions inconsistent with the data.

There is also little evidence that weather-related amenities can explain the changes in migration patterns documented here. Research by Glaeser and Tobio [2007] suggests that population growth in the South since 1980 is driven by low housing prices rather than good weather. Though average January temperature is predictive of population growth, it is not correlated with high housing prices. Moreover, the relationship between temperature and population growth has remained stable or declined in the post-war period.

39 Garofalo and Yamarik [2002] constructed indirect state-level capital estimates by combining state-level industry employment composition with national industry-level capital-labor ratios.
40 These dynamics are presented in an illustrative simulation in Appendix A.4.

6 Conclusion

For more than 100 years, per-capita incomes across U.S. states were strongly converging and population flowed from poor to wealthy areas. In this paper, we claim that these two phenomena are related. By increasing the available labor in a region, migration drove down wages and induced convergence in human capital levels.

Over the past thirty years, both the flow of population to productive areas and income convergence have slowed considerably. We show that the end of directed population flows, and the decline of income convergence, can be explained in part by a change in the relationship between income and housing prices. Although housing prices have always been higher in richer states, housing prices now capitalize a far greater proportion of the income differences across states. In our model, as prices rise, the returns to living in productive areas fall for unskilled households, and their migration patterns diverge from the migration patterns of the skilled households. The regional economy shifts from one in which labor markets clear through net migration to one in which labor markets clear through skill-sorting, which slows income convergence. We find patterns consistent with these predictions in the data.

To identify the effect of these price movements, we introduce a new panel instrument for housing supply. Prior work has noted that land use regulations have become increasingly stringent over time, but panel measures of regulation were unavailable. We create a proxy for these measures based on the frequency of land use cases in state appellate court records. First, we find that tighter regulations raise the extent to which income differences are capitalized into housing prices. Second, tighter regulations impede population flows to rich areas and weaken convergence in human capital. Finally, we find that tight regulations weaken convergence in per capita income. We see this same link between rising regulations and declining convergence using a “shift-share” Bartik-like approach as well. Indeed, though there has been a dramatic decline in income convergence nationally, places that remain unconstrained by land use regulation continue to converge at similar rates.

These findings have important implications not only for the literature on land use and regional convergence, but also for the literature on inequality and segregation. A simple back of the envelope calculation shown in the Appendix finds that cross-state convergence accounted for approximately 30% of the drop in hourly wage inequality from 1940 to 1980 and that had convergence continued apace through 2010, the increase in hourly wage inequality from 1980 to 2010 would have been approximately 10% smaller. The U.S. is increasingly characterized by segregation along economic dimensions, with limited access for most workers to America’s most productive cities and their amenities. We hope that this paper will highlight the role land use restrictions play in supporting this segregation.

References

Daron Acemoglu, David H Autor, and David Lyle. Women, War, and Wages: The Effect of Female Labor Supply on the Wage Structure at Midcentury. The Journal of Political Economy, 112(3), 2004.

David H Autor and David Dorn. The Growth of Low Skill Service Jobs and the Polarization of the U.S. Labor Market. American Economic Review, 2013.

David H Autor, Lawrence F Katz, and Melissa S Kearney. Trends in U.S. Wage Inequality: Revising the Revisionists. Review of Economics and Statistics, 90(2):300–323, 2008.

Robert J Barro. Convergence and Modernization Revisited. 2012.

Robert J Barro and Xavier Sala-i Martin. Convergence across States and Regions. Brookings Papers on Economic Activity, (1):107–182, 1991.

Robert J Barro and Xavier Sala-i Martin. Convergence. Journal of Political Economy, 100(2):223–251, 1992.

Timothy Bartik. Who Benefits from State and Local Economic Development Policies? W.E. Upjohn Institute for Employment Research, Kalamazoo, Michigan, 1991.

Christopher R Berry and Edward Glaeser. The Divergence of Human Capital Levels across Cities. Papers in Regional Science, 84(3):407–444, 2005.

Olivier Jean Blanchard and Lawrence F Katz. Regional Evolutions. Brookings Papers on Economic Activity, 23:1–76, 1992.

David Blanchflower and . Does High Home Ownership Impair the Labor Market? 2013.

George J Borjas. Does Immigration Grease the Wheels of the Labor Market? Brookings Papers on Economic Activity, 1:69–133, 2001.

George J Borjas. The Labor Demand Curve Is Downward Sloping: Reexamining the Impact of Immigration on the Labor Market. Quarterly Journal of Economics, 118(4):1335–1374, 2003.

Fred Bosselman and David Callies. The Quiet Revolution in Land Use Control. White House Council on Environmental Quality, 1971.

Leah Platt Boustan, Price V Fishback, and Shawn Kantor. The Effect of Internal Migration on Local Labor Markets: American Cities during the Great Depression. Journal of Labor Economics, 28(4):719–746, 2010.

Juan Braun. Essays on economic growth and migration. PhD thesis, Harvard, 1993.

Jan K. Brueckner and Kala Seetharam Sridhar. Measuring welfare gains from relaxation of land-use restrictions: The case of India’s building-height limits. Regional Science and Urban Economics, 42(6):1061–1067, November 2012. ISSN 01660462. doi: 10.1016/j.regsciurbeco.2012.08.003. URL http://linkinghub.elsevier.com/retrieve/pii/S0166046212000737.

Bureau of Economic Analysis. State Personal Income, Tables SA1-SA3, 2012.

David Card. Immigration and Inequality. American Economic Review: Papers and Proceedings, 99(2):1–21, 2009.

David Card and Alan B Krueger. Does School Quality Matter? Returns to Education and the Characteristics of Public Schools in the United States. The Journal of Political Economy, 100(1):1–40, 1992.

Francesco Caselli and Wilbur John Coleman. The U.S. Structural Transformation and Regional Convergence: A Reinterpretation. Journal of Political Economy, 109(3):584–616, 2001.

Raj Chetty. Bounds on Elasticities with Optimization Frictions: A Synthesis of Micro and Macro Evidence on Labor Supply. Econometrica, 2013.

Patricia Cortes. The Effect of Low-Skilled Immigration on U.S. Prices: Evidence from CPI Data. Journal of Political Economy, 116(3):381–422, 2008.

W. Mark Crain. Volatile States: Institutions, Policy, and the Performance of American State Economies. University of Michigan Press, Ann Arbor, MI, 2003.

Thomas Davidoff. What Explains Manhattan’s Declining Share of Residential Construction? Journal of Public Economics, 94(7-8):508–514, August 2010. ISSN 00472727. doi: 10.1016/j.jpubeco.2010.02.007. URL http://linkinghub.elsevier.com/retrieve/pii/S0047272710000137.

Morris A. Davis and François Ortalo-Magné. Household expenditures, wages, rents. Review of Economic Dynamics, 14(2):248–261, April 2011. ISSN 10942025. doi: 10.1016/j.red.2009.12.003. URL http://linkinghub.elsevier.com/retrieve/pii/S1094202509000830.

Edward F. Denison. The Sources of Economic Growth in the United States and the Alternatives Before Us. Committee for Economic Development, New York, 1962.

Rebecca Diamond. The Welfare Effects of Changes in Local Wages, Prices and Amenities Across US Cities. Harvard Working Paper, 2012.

Riccardo DiCecio and Charles S Gascon. Income Convergence in the United States: A Tale of Migration and Urbanization. Annals of Regional Science, 45:365–377, 2008. doi: 10.1007/s00168-008-0284-1.

Richard A. Easterlin. Long Term Regional Income Changes: Some Suggested Factors. Papers in Regional Science, 4(1):313–325, 1958.

Robert C. Ellickson. Suburban Growth Controls: An Economic and Legal Analysis. The Yale Law Journal, 86(3):385–511, January 1977. ISSN 00440094. doi: 10.2307/795798. URL http://www.jstor.org/stable/795798?origin=crossref.

Joseph P Ferrie. Internal Migration. In Susan Carter, editor, Historical Statistics of the United States, Millennial Edition, volume 1776. 2003.

WA Fischel. An economic history of zoning and a cure for its exclusionary effects. Urban Studies, pages 317–340, 2004. URL http://usj.sagepub.com/content/41/2/317.short.

Price V Fishback, William Horrace, and Shawn Kantor. The Impact of New Deal Expenditures on Mobility During the Great Depression. Explorations in Economic History, pages 179–222, 2006.

Gasper A Garofalo and Steven Yamarik. Regional Convergence: Evidence from a New State-by-State Capital Series. Review of Economics and Statistics, 84(May):316–323, 2002.

Martin Garrett. Land use regulation: the impacts of alternative land use rights. Praeger Publishers, New York, 1987.

Nicola Gennaioli, Rafael La Porta, Florencio Lopez-de-Silanes, and Andrei Shleifer. Human Capital and Regional Development. Quarterly Journal of Economics, (forthcoming), 2013a.

Nicola Gennaioli, Rafael La Porta, Florencio Lopez-de-Silanes, and Andrei Shleifer. Growth in Regions. mimeo, 2013b.

Edward Glaeser and Bryce Ward. The causes and consequences of land use regulation: Evidence from Greater Boston. Journal of Urban Economics, 65:265–278, 2009.

Edward L Glaeser and Joseph Gyourko. Urban Decline and Durable Housing. Journal of Political Economy, 113(2):345–375, 2005.

Edward L. Glaeser and Albert Saiz. The Rise of the Skilled City. Brookings-Wharton Papers on Urban Affairs, pages 47–105, 2004. ISSN 1533-4449. doi: 10.1353/urb.2004.0005. URL http://muse.jhu.edu/content/crossref/journals/brookings-wharton_papers_on_urban_affairs/v2004/2004.1glaeser.pdf.

Edward L Glaeser and Kristina Tobio. The Rise of the Sunbelt. NBER Working Paper 13071, 2007.

Edward L Glaeser, Joseph Gyourko, and Raven Saks. Why is Manhattan So Expensive? Regulation and the Rise in House Prices. Journal of Law and Economics, pages 1–50, 2005a.

Edward L. Glaeser, Joseph Gyourko, and Raven E Saks. Why Have Housing Prices Gone Up? American Economic Review, 95(2):329–333, 2005b.

Edward L. Glaeser, Joseph Gyourko, and Raven E Saks. Urban growth and housing supply. Journal of Economic Geography, 6:71–89, 2006.

Edward L Glaeser, Matthew E Kahn, and Jordan Rappaport. Why do the Poor Live in Cities? The Role of Public Transportation. Journal of Urban Economics, 63:1–24, 2008.

Claudia Goldin and Lawrence F Katz. The Legacy of U.S. Educational Leadership: Notes on Distribution and Economic Growth in the 20th Century. American Economic Review: Papers and Proceedings, 91(2):18–23, 2001.

Joseph Gyourko, Albert Saiz, and Anita Summers. A New Measure of the Local Regulatory Environment for Housing Markets: Regulatory Index. Urban Studies, 45:693–729, 2008.

Joseph Gyourko, Christopher Mayer, and Todd Sinai. Superstar Cities. American Economic Journal: Economic Policy, forthcoming, 2013.

Michael R. Haines. Historical, Demographic, Economic, and Social Data: The United States, 1790-2002 [Computer file]. ICPSR02896-v3. Inter-university Consortium for Political and Social Research, Ann Arbor, MI, 2010.

Oskar R Harmon. The Income Elasticity of Demand for Housing: An Empirical Reconciliation. Journal of Urban Economics, 24:173–185, 1988.

Christian Hilber and Fredric Robert-Nicoud. On the Origins of Land Use Regulations: Theory and Evidence from US Metro Areas. Journal of Urban Economics, 75:29–43, 2013.

Christian Hilber and Wouter Vermeulen. The Impact of Supply Constraints on House Prices in England. mimeo, 2013.

Richard Hornbeck. The Enduring Impact of the American Dust Bowl: Short- and Long-Run Adjustments to Environmental Catastrophe. American Economic Review, 102(4):1477–1507, 2012.

Chang-Tai Hsieh, Erik Hurst, Charles I Jones, and Peter J Klenow. The Allocation of Talent and U.S. Economic Growth. mimeo, 2013.

Lakshmi Iyer, Xin Meng, Nancy Qian, and Xiaxue Zhao. The General Equilibrium Effects of Chinese Urban Housing Reforms on the Wage Structure. mimeo, 2011.

Matthew E. Kahn. Do Liberal Cities Limit New Housing Development? Evidence from California. Journal of Urban Economics, 69(2):223–228, March 2011. ISSN 00941190. doi: 10.1016/j.jue.2010.10.001. URL http://linkinghub.elsevier.com/retrieve/pii/S0094119010000720.

Lawrence Katz and Kenneth T. Rosen. The Interjurisdictional Effects of Growth Controls on Housing Prices. The Journal of Law and Economics, 30(1):149, January 1987. ISSN 0022-2186. doi: 10.1086/467133. URL http://www.journals.uchicago.edu/doi/abs/10.1086/467133.

Piyabha Kongsamut, Sergio Rebelo, and Danyang Xie. Beyond Balanced Growth. Review of Economic Studies, 68:869–882, 2001.

John Landon-Lane and Hugh Rockoff. The origin and diffusion of shocks to regional interest rates in the United States, 1880 to 2002. Explorations in Economic History, 44:487–500, 2007.

Peter Lindert and Richard Sutch. Consumer Price Indexes for All Items, 1774-2003. In Susan B. Carter, Scott Sigmund Gartner, Michael R. Haines, Alan L. Olmstead, Richard Sutch, and Gavin Wright, editors, Historical Statistics of the United States, pages Table Cc1–2. Cambridge University Press, New York, 2006.

Robert A Margo. Wages in California During the Gold Rush. 1997.

Guy Michaels, Ferdinand Rauch, and Stephen J Redding. Urbanization and Structural Transformation. Quarterly Journal of Economics, 127:535–586, 2012.

Enrico Moretti. Real Wage Inequality. American Economic Journal: Applied Economics, 2012a.

Enrico Moretti. The New Geography of Jobs. Houghton Mifflin Harcourt, Boston, 2012b.

Casey Mulligan. A Century of Labor Leisure Distortions. NBER Working Paper 8774, 2002.

Matthew J Notowidigdo. The Incidence of Local Labor Demand Shocks. NBER Working Paper 17167, 2013.

Henry O Pollakowski and Susan M Wachter. The Effects of Land-Use Constraints on Housing Prices. Land Economics, 66(3), 1990.

John Quigley and Steven Raphael. Regulation and the High Cost of Housing in California. American Economic Review, 95(2):323–328, 2005.

Jennifer Roback. Wages, Rents, and the Quality of Life. The Journal of Political Economy, 90(6):1257–1278, 1982.

Sherwin Rosen. Wages-based Indexes of Urban Quality of Life. In Peter Mieszkowski and Mahlon Straszheim, editors, Current Issues in Urban Economics. Johns Hopkins University Press, 1979.

Jonathan Rothwell. Housing Costs, Zoning, and Access to High-Scoring Schools. Metropolitan Policy Program at Brookings, 2012.

Steven Ruggles, J. Trent Alexander, Katie Genadek, Ronald Goeken, Matthew B. Schroeder, and Matthew Sobek. Integrated Public Use Microdata Series: Version 5.0 [Machine-readable database]. 2010.

Albert Saiz. The Geographic Determinants of Housing Supply. Quarterly Journal of Economics, 125(3):1253–1296, 2010.

Raven E Saks. Job Creation and Housing Construction: Constraints on Metropolitan Area Employment Growth. Journal of Urban Economics, 64:178–195, 2008. doi: 10.1016/j.jue.2007.12.003.

David Schleicher. City Unplanning. Yale Law Journal, 122(7):1670–1737, 2013.

Todd Sinai. Feedback Between Real Estate and Urban Economics. Journal of Regional Science, 50(1):423–448, 2010.

John Taylor and Norman Williams. American Land Planning Law: Land Use and the Police Power. West Group, Eagan, Minnesota, 2009.

The American Institute of Planners. Survey of State Land Use Planning Activity. 1976.

Charles M. Tolbert and Molly Sizer. US Commuting Zones and Labor Market Areas. Rural Economy Division, Economic Research Service, U.S. Department of Agriculture, Staff Paper, 1996.

U.S. Census Bureau. USACounties, 2012.

Stijn Van Nieuwerburgh and Pierre-Olivier Weill. Why Has House Price Dispersion Gone Up? Review of Economic Studies, 77(4):1567–1606, October 2010. ISSN 00346527. URL http://restud.oxfordjournals.org/lookup/doi/10.1111/j.1467-937X.2010.00611.x.

Jeffrey G Williamson. Regional Inequality and the Process of National Development: A Description of the Patterns. Economic Development and Cultural Change, 13(4):1–84, 1965.

Cristobal Young, Charles Varner, and Douglas S Massey. Trends in New Jersey Migration: Housing, Employment, and Taxation. Policy Research Institute at the Woodrow Wilson School, (September), 2008.

FIGURE 1 The Decline of Income Convergence

[Figure omitted: three panels. First panel: “1940-1960, Coef: -2.41 SE: .11”, state-level scatter of Annual Inc Growth Rate, 1940-1960 against Log Income Per Cap, 1940. Second panel: “1990-2010, Coef: -.99 SE: .28”, state-level scatter of Annual Inc Growth Rate, 1990-2010 against Log Income Per Cap, 1990. Third panel: “Convergence Rates Over Time”, Annual Inc Conv Rate (convergence for 20-year windows at annual rate), 1950-2010.]

Notes: The y-axis in the first two panels is the annual growth rate of income per capita. The third panel plots coefficients from 20-year rolling windows. The larger red and purple dots correspond to the coefficients from the top two panels. Income data from the Bureau of Economic Analysis [2012]. Alaska, Hawaii, and DC are omitted here, and in all subsequent figures and tables.
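The convergence coefficients reported in this figure come from regressing annual income growth on initial log income. A minimal sketch of that kind of regression follows, using made-up incomes for four hypothetical states rather than the BEA data. It assumes growth is measured in percent per year from log differences, which is our reading of the axis labels and not something the notes state; the helper names are illustrative.

```python
import math


def annual_log_growth(inc_start, inc_end, years):
    """Average annual growth rate in percent, computed from log differences."""
    return 100.0 * (math.log(inc_end) - math.log(inc_start)) / years


def ols_slope(x, y):
    """Slope from a univariate OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    return cov / var


# Hypothetical 1940 incomes for four states (not BEA data).
income_1940 = [3000.0, 5000.0, 8000.0, 12000.0]

# Construct 1960 incomes so growth falls by 2.41 points per log point of income.
true_slope = -2.41
income_1960 = [
    inc * math.exp(20 * (30.0 + true_slope * math.log(inc)) / 100.0)
    for inc in income_1940
]

log_income = [math.log(inc) for inc in income_1940]
growth = [annual_log_growth(a, b, 20) for a, b in zip(income_1940, income_1960)]
slope = ols_slope(log_income, growth)
print(f"Convergence coefficient: {slope:.2f}")
```

Because the synthetic data are built without noise, the regression recovers the slope used to construct them; a rolling-window version would repeat the same regression for each 20-year span.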

FIGURE 2 The Decline of Directed Migration

[Figure omitted: three panels. First panel: “1940-1960, Coef: 1.59 SE: .37”, state-level scatter of Annual Pop Growth Rate, 1940-1960 against Log Income Per Cap, 1940. Second panel: “1990-2010, Coef: -.47 SE: .63”, state-level scatter of Annual Pop Growth Rate, 1990-2010 against Log Income Per Cap, 1990. Third panel: “Convergence and Directed Migration Rates Over Time”, coefficients for 20-year windows at annual rate for the Annual Inc Conv Rate and Annual Directed Mig Rate series, 1950-2010.]

Notes: The y-axis in the first two panels is the annual growth rate of log population. The third panel plots coefficients from 20-year rolling windows for population changes and income changes. The larger red and purple dots correspond to the coefficients from the top two panels.

FIGURE 3 Rising Prices in High Income States

[Figure omitted: three panels. First panel: “1960, Coef: .95 SE: .08”, state-level scatter of Log Housing Value, 1960 against Log Income Per Cap, 1960. Second panel: “2010, Coef: 2.04 SE: .25”, state-level scatter of Log Housing Value, 2010 against Log Income Per Cap, 2010. Third panel: “Timeseries of Coefs”, 1950-2010.]

Notes: The first two panels regress median housing value on income per capita at the state level. The third panel plots coefficients from 20-year rolling windows. The larger red and purple dots correspond to the coefficients from the first two panels.

FIGURE 4 Returns to Migration: Skill-Specific Income Net of Housing Cost

[Figure omitted: one panel, “Effect of $1 of Statewide Inc on Skill-Specific Inc Net of Housing”, plotting Coef for Unskilled HH and Skilled HH, 1940-2010.]

Notes: This figure plots the relationship between unconditional mean household income and mean skill-specific income net of housing costs for several decades. The regression in each year is $Y_{ij} - P_{ij} = \alpha + \beta_{unskilled} Y_j \times (1 - S_{ij}) + \beta_{skilled} Y_j \times S_{ij} + \eta S_{ij} + \gamma X_{ij} + \varepsilon_{ij}$ for households with at least one labor force participant aged 25-65. See Section 3.2 for details. We report 95% confidence intervals for $\beta_{unskilled}$ and for $\beta_{skilled}$. Housing costs are defined as 5% of house value for homeowners and 12X monthly rent for renters. No coefficient is reported from 1950 because the IPUMS USA sample for this year does not include housing cost data. High-skilled households are defined as households in which all adult workers have 12+ years of education in 1940 or 16+ years of education thereafter and low-skilled households are defined as households in which no adult worker has this level of education. Mixed skill-type households, which range from 2%-14% of households, are dropped from the regression sample, but not from the construction of unconditional state average income. The modest non-linearity amongst high-income places apparent in the 1940 results is due to Chicago and New York, both of which are very large cities that were hit hard by the Great Depression and failed to attract as many migrants as predicted. Standard errors are clustered by state.
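The housing-cost definition in these notes (5% of house value for homeowners, 12 times monthly rent for renters) is simple enough to sketch directly. The function names and the example household figures below are hypothetical and serve only to illustrate the definition.

```python
def annual_housing_cost(is_owner, house_value=0.0, monthly_rent=0.0):
    """Annual housing cost: 5% of house value for owners, 12x monthly rent for renters."""
    if is_owner:
        return 0.05 * house_value
    return 12.0 * monthly_rent


def income_net_of_housing(income, is_owner, house_value=0.0, monthly_rent=0.0):
    """Household income minus the annual housing cost defined above."""
    return income - annual_housing_cost(is_owner, house_value, monthly_rent)


# Hypothetical households (illustrative numbers, not from the IPUMS sample).
owner_net = income_net_of_housing(60000.0, True, house_value=200000.0)
renter_net = income_net_of_housing(40000.0, False, monthly_rent=1000.0)
print(owner_net, renter_net)
```

In this example the owner pays an imputed 10,000 a year and the renter 12,000, so a lower nominal income combined with high rent can leave a household with much less after housing.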

FIGURE 5 Net Migration Flows by Skill Group: Nominal Income vs. Income Net of Housing Cost, 1935-1940

[Figure omitted: four panels, each plotting Net Migration as % Pop. Top panels, against Log Nominal Income: “Low Skill Coef: 1.31 SE: .47” and “High Skill Coef: .61 SE: .39”. Bottom panels: “Low Skill Coef: 1.23 SE: .36”, against Log (Inc-Housing Cost) for Low Skill, and “High Skill Coef: .77 SE: .40”, against Log (Inc-Housing Cost) for High Skill.]

Notes: These panels plot net migration over a five-year horizon as a fraction of the population ages 25-65 for 466 State Economic Areas (SEA) in the 1940 IPUMS Census extract. Each panel stratifies the SEAs into 20 quantiles by income, weighting each SEA by its population, and then computes the mean net migration within each quantile. The two top panels plot net migration as a function of the log household wage income in the destination SEA, for individuals with less than 12 years of education (left) and those with 12+ years (right). The two bottom panels plot the migration rates for these skill groups against the log skill-group mean value of household wage income net of housing costs. Housing costs are defined as 5% of house value for homeowners and 12X monthly rent for renters. All x-axis variables are computed for non-migrating households with at least one labor force participant aged 25-65.
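The binning procedure described in these notes (sort areas by income, split them into population-weighted quantiles, then average net migration within each quantile) could be sketched as follows. This is one possible implementation under our own assumptions: areas are assigned to bins by the midpoint of their cumulative population share, and the within-bin mean is population-weighted. The notes do not specify either detail, and the data are invented.

```python
def binned_means(income, migration, population, n_bins=20):
    """Population-weighted mean migration within income quantile bins."""
    order = sorted(range(len(income)), key=lambda i: income[i])
    total_pop = float(sum(population))
    sums = [0.0] * n_bins
    weights = [0.0] * n_bins
    cumulative = 0.0
    for i in order:
        # Place each area by the midpoint of its cumulative population share.
        midpoint = (cumulative + population[i] / 2.0) / total_pop
        b = min(int(midpoint * n_bins), n_bins - 1)
        sums[b] += migration[i] * population[i]
        weights[b] += population[i]
        cumulative += population[i]
    # Empty bins are reported as None.
    return [s / w if w > 0 else None for s, w in zip(sums, weights)]


# Hypothetical data for four areas, grouped into two bins for readability.
income = [9.0, 9.5, 10.0, 10.5]
migration = [-1.0, 0.0, 1.0, 2.0]
population = [100, 100, 100, 100]
means = binned_means(income, migration, population, n_bins=2)
print(means)
```

With equal populations, the two poorer areas fall in the first bin and the two richer areas in the second, so the bin means are simple averages of each pair.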

FIGURE 6 Net Migration Flows by Skill Group: Nominal Income vs. Income Net of Housing Cost, 1995-2000

[Figure omitted: four panels, each plotting Net Migration as % Pop. Top panels, against Log Nominal Income: “Low Skill Coef: -2.17 SE: 1.00” and “High Skill Coef: 4.07 SE: .69”. Bottom panels: “Low Skill Coef: 4.30 SE: 2.00”, against Log (Inc-Housing Cost) for Low Skill, and “High Skill Coef: 4.71 SE: .89”, against Log (Inc-Housing Cost) for High Skill.]

Notes: These panels plot net migration over a five-year horizon as a fraction of the population ages 25-65 for 1,020 3-digit Public Use Microdata Areas (PUMAs) in the 2000 IPUMS 5% Census extract. Each panel stratifies the PUMAs into 20 quantiles by income, weighting each PUMA by its population, and then computes the mean net migration within each quantile. The two top panels plot migration rates as a function of log household wage income in the PUMA, for individuals with less than a bachelor’s degree (left) and with at least a bachelor’s degree (right). The two bottom panels plot the migration rates for these skill groups against the skill-group mean value of household wage income net of housing costs. Housing costs are defined as 5% of house value for homeowners and 12X monthly rent for renters. All x-axis variables are computed for non-migrating households with at least one labor force participant aged 25-65.

FIGURE 7 The Decline of Human Capital Convergence

[Figure: the two top panels plot Human Cap of Residents - Human Cap of People Born in State (y-axis) against Human Capital of People Born in State (x-axis), with states labeled by postal code: 1960, Coef: -.33 SE: .04; 2010, Coef: -.08 SE: .06. The bottom panel, Extent of Human Capital Convergence due to Migration Over Time, plots Human Capital Convergence, 20-Year Window (y-axis) for 1960-2010.]

Notes: Human capital index is estimated by regressing $\log Inc_{ik} = \alpha_k + X_{ik}\beta + \varepsilon_{ik}$ in the 1980 Census, where $\alpha_k$ is a set of seven education indicators, and then constructing $HumanCapital_j = \sum_k \exp(\hat{\alpha}_k) \times Share_{kj}$. We separately estimate the human capital index by state of residence and by state of birth, to develop a no-migration counterfactual. The top panels show figures from a regression of $HumanCap_{j,res} - HumanCap_{j,birth} = \alpha + \beta HumanCap_{j,birth} + \varepsilon_j$ in 1960 and 2010. Sample is people ages 25-34, see Section 3 for details. The bottom panel plots a time-series of coefficients. The larger red and purple dots correspond to the coefficients from the first two panels.
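The index construction in these notes (an education-weighted sum of exponentiated coefficients) can be illustrated as follows. This is a sketch under stated assumptions: the names `human_capital_index` and `migration_gap` are hypothetical, the coefficients are taken as already estimated, and the two-group example values are made up for illustration rather than taken from the paper.

```python
import math

def human_capital_index(alpha_hat, shares):
    """Sum over education groups k of exp(alpha_hat[k]) * shares[k].
    alpha_hat: dict mapping education group -> estimated log-income coefficient.
    shares:    dict mapping education group -> population share in the state."""
    return sum(math.exp(alpha_hat[k]) * shares[k] for k in alpha_hat)

def migration_gap(alpha_hat, shares_residents, shares_born):
    """Index by state of residence minus index by state of birth,
    i.e. the change in human capital attributable to migration."""
    return (human_capital_index(alpha_hat, shares_residents)
            - human_capital_index(alpha_hat, shares_born))

# Hypothetical two-group example (values are illustrative only).
alpha_hat = {"no_ba": 0.0, "ba": math.log(2.0)}
born = {"no_ba": 0.5, "ba": 0.5}
residents = {"no_ba": 0.4, "ba": 0.6}
gap = migration_gap(alpha_hat, residents, born)
```

A positive gap means the state's residents have more human capital than the people born there, so migration has raised the state's skill level relative to the no-migration counterfactual.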

FIGURE 8 Regulation Measure: Timeseries and Validity

[Figure: four panels. Top left, Land Use Cases Per Million People: Cases Per Million People (y-axis) by year, 1940-2010. Top right, Land Use Cases vs 1975 Survey, Coef: .1 SE: .03: American Institute of Planners Measure (y-axis) against Rank of Land Use Cases Per Capita, 1965-1975 (x-axis), with states labeled by postal code. Bottom left, Land Use Cases vs 2005 Survey, Coef: .03 SE: .007: Wharton Index of Regulation (y-axis) against Rank of Land Use Cases Per Capita, 1995-2005 (x-axis), with states labeled by postal code. Bottom right, Regulations Capitalize Incomes into Prices: Log Housing Value (y-axis) against Log Income (x-axis), shown separately for Low Reg State-Years and High Reg State-Years.]

Notes: The top left panel plots the number of cases containing the phrase “land use” in the state appeals court databases in per capita terms. The top right panel plots the relationship between the 1975 values of the regulation measure introduced in the text and the sum of affirmative answers to the regulation questions asked in the 1975 American Institute of Planners Survey of State Land Use Planning Activities. The lower left panel plots the relationship between the 2005 values of the regulation measure introduced in the text and the 2005 Wharton Residential Land Use Regulatory Index. The lower right panel plots deciles of log income with year fixed effects on the x-axis and conditional means for housing prices for each decile on the y-axis.
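The per capita case counts and the rank-based scaling used for the regulation measure can be sketched as below. This is an illustration, not the authors' code: the helper names (`cases_per_million`, `rank_scaled`) and the three-state inputs are hypothetical, and the tie-breaking rule (alphabetical by state key) is an assumption made here so the ranking is deterministic.

```python
def cases_per_million(cases, population):
    """Land use cases per million residents for each state.
    cases, population: dicts keyed by state."""
    return {s: 1e6 * cases[s] / population[s] for s in cases}

def rank_scaled(values):
    """Rank of each state's value, rescaled to the interval [0, 1]:
    0 for the lowest value and 1 for the highest.
    Ties are broken by state key (an assumption, not from the paper)."""
    ordered = sorted(values, key=lambda s: (values[s], s))
    n = len(ordered)
    return {s: (i / (n - 1) if n > 1 else 0.0) for i, s in enumerate(ordered)}

# Hypothetical inputs for three states (illustrative values only).
cases = {"A": 1, "B": 6, "C": 2}
population = {"A": 1_000_000, "B": 2_000_000, "C": 1_000_000}
per_million = cases_per_million(cases, population)
reg_rank = rank_scaled(per_million)
```

Using ranks rather than raw counts keeps a few states with very high case counts from dominating the measure.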

FIGURE 9 Income Convergence by Housing Supply Elasticity

[Figure: four panels. The two top panels plot ΔInc (Annual Rate) on the y-axis against Log Income Per Cap on the x-axis, for 1940-1960 against 1940 income and for 1980-2000 against 1980 income, with states labeled by postal code and marked Low Reg or High Reg. Bottom left, Split by Land Use Instrument: Coef High Reg States and Coef Low Reg States, 1960-2010. Bottom right, Split by Saiz Housing Supply Elasticity: Coef Low Elasticity States and Coef High Elasticity States, 1950-2010. Y-axis of both bottom panels: Convergence for 20-Year Windows at Annual Rate.]

Notes: The top panels show income convergence for two different twenty-year periods, labeling states according to their estimated regulation levels in 1965. Blue states have below median housing supply regulation and red states above median regulation. The bottom left panel depicts the coefficients $\beta$ from $\Delta Inc_{s,t} = \alpha_t + \beta\, Inc_{s,t-20} + \varepsilon_{s,t}$ over rolling twenty year windows. The regressions are estimated separately for two equally sized groups of states, split by their 1965 measure of land use regulations from the legal database. The bottom right panel splits states by their measure of housing supply elasticity in Saiz [2010]. We weight the time-invariant MSA-level measures from Saiz by population to produce state-level estimates and impute a value for Arkansas based on neighboring states.
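A convergence coefficient of the kind plotted in the bottom panels can be computed with a simple bivariate regression. The sketch below is illustrative only: it assumes the dependent variable is the annualized change in log income in percent over a 20-year window (the convention used in the appendix tables), uses unweighted OLS, and the names `ols_slope` and `convergence_coef` and the three-state inputs are hypothetical.

```python
def ols_slope(x, y):
    """Slope from an unweighted OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def convergence_coef(log_inc_start, log_inc_end, years=20):
    """Regress annualized log income growth (in percent) on initial log income.
    A negative slope means poorer states grew faster, i.e. convergence."""
    growth = [100.0 * (e - s) / years for s, e in zip(log_inc_start, log_inc_end)]
    return ols_slope(log_inc_start, growth)

# Hypothetical three-state example (illustrative values only).
start = [9.0, 9.5, 10.0]
end = [10.4, 10.7, 11.0]
beta = convergence_coef(start, end)
```

Running the same calculation separately for high-regulation and low-regulation states, window by window, would produce two coefficient series of the kind the figure compares.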

TABLE 1 Summary Statistics

                                                          1940            1960            1980            2000
                                                       Mean    SD      Mean    SD      Mean    SD      Mean    SD
Personal Income Per Capita ($000, 2012 $)              8.83   3.18    16.34   3.15    26.63   3.63    38.41   5.95
Population (Million)                                   2.73   2.69     3.72   3.80     4.69   4.76     5.83   6.26
Median House Price ($000, 2012 $)                      39.7   15.4     85.2   18.6    129.4   32.1    152.3   44.5
Regulation Measure (land use cases per capita*10^6)    0.17   0.56     0.32   0.50     2.18   2.59     3.77   6.15

Sources: IPUMS Census extract, BEA Income estimates, and an online database of state appellate court documents. Notes: n=48 states, excluding Alaska, Hawaii, and DC. Dollar amounts are in real 2012 dollars deflated using the Lindert and Sutch price index (2006).

TABLE 2 Impacts of Regulation on Permits, Prices, Migration, and Convergence

Dependent variables by column:
(1) Annual Construction Permits_t, % of Housing Stock
(2) Log House Price_t
(3) ΔLog Population_t,t+20, Annual Rate in %
(4) Δ Log Human Capital
(5) Δ Log Income Per Cap_t,t+20, Annual Rate in %

                               (1)         (2)         (3)         (4)          (5)

Regulation Measure: Rank of Land Use Cases Per Capita scaled [0,1]

Log Inc Per Cap_it             5.039**     0.774***    1.688**     -0.0434***   -2.034***
                               (2.106)     (0.105)     (0.637)     (0.00744)    (0.102)
Log Inc Per Cap_it * Reg_it    -5.868**    0.833***    -1.875***   0.0400**     1.304***
                               (2.290)     (0.255)     (0.608)     (0.0157)     (0.393)
Year*Reg FEs                   Y           Y           Y           Y            Y
R2                             0.217       0.891       0.142       0.249        0.811
N                              1,536       384         2,448       288          2,448

Placebo Measure: Rank of Total Cases Per Capita scaled [0,1]

Log Inc Per Cap_it             1.313       0.984***    1.017       -0.0292*     -1.707***
                               (1.627)     (0.148)     (0.813)     (0.0157)     (0.206)
Log Inc Per Cap_it * Reg_it    -1.029      0.269       0.380       0.000479     0.202
                               (2.396)     (0.267)     (2.616)     (0.0295)     (0.400)
Year*Reg FEs                   Y           Y           Y           Y            Y
R2                             0.164       0.871       0.179       0.191        0.791
N                              1,536       384         2,448       288          2,448

Notes: The table reports the coefficients $\beta$ and $\beta^{reg}$ from regressions of the form: $\ln y_{it} = \alpha_t + \alpha_t reg_{it} + \beta \ln y_{it} + \beta^{reg} \ln y_{it}\, reg_{it} + \varepsilon_{it}$. The regulation measure is rank of land use cases per capita and its construction is described in the text. The dependent variables are new housing permits from the Census Bureau, the median log housing price from the Census, population change, the change in log human capital of people ages 25-34 due to migration, and the change in log per-capita income. Construction of the human capital index is described in Section 3. For columns (1), (3), and (5), where we have annual data, the regulation measure is constructed using cases per capita. For columns (2) and (4), where we have decennial data, the regulation measure is constructed using average cases per capita over the last ten years. Standard errors clustered by state. *** p<0.01, ** p<0.05, * p<0.1

TABLE 3 Latent Tendency to Regulate, Geographic Land Availability, and Convergence

Dependent variable: Δ Log Income Per Cap_t,t+20 (Annual Rate in %)

                                  (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
Year                              Pre        Post       Pre        Post       Pre        Post       Pre        Post

Log Inc Per Cap_t                 -1.93***   -1.80***   -2.47***   -3.06***   -2.05***   -1.97***   -2.49***   -1.20***
                                  (0.11)     (0.33)     (0.20)     (0.57)     (0.15)     (0.47)     (0.06)     (0.08)
Log Inc Per Cap_t * Constraint    0.22       2.01***    0.14       2.00***    0.20       1.91***    -0.09      0.71***
                                  (0.27)     (0.66)     (0.25)     (0.68)     (0.27)     (0.69)     (0.10)     (0.17)
pre interaction = post
interaction (pval)                     0.002                 0.005                 0.003                 <0.001

Year*Constraint Fixed Effects     Y          Y          Y          Y          Y          Y          Y          Y
Autor-Dorn Controls               -          -          Skill Measures        -          -          -          -
R2                                0.84       0.45       0.87       0.60       0.84       0.46       0.72       0.91
N                                 1,248      1,200      1,248      1,200      1,248      1,200      8,413      9,194
Unit of Observation               State      State      State      State      State      State      County     County

Constraint Measure: columns (1)-(2), Land Use Cases Per Capita, 1996-2005; columns (3)-(4), Land Use Cases Per Capita, 1996-2005; columns (5)-(6), Land Use Cases Per Capita, 1956-1965; columns (7)-(8), Share of Land Unavailable (Saiz, 2010).

Notes: This table uses time-invariant measures of the housing supply elasticity, while Table 2 used time-varying measures of the elasticity. The table reports the coefficients $\beta$ and $\beta^{Constraint}$ from regressions of the form

$$\Delta \ln y_{it,t+20} = \alpha_1 + \alpha_2 Constraint_i + \beta \ln y_{it} + \beta^{Constraint} \ln y_{it} \times Constraint_i + \varepsilon_i.$$

The pre period is 20-year windows ending in 1960 through 1984. The post period is 20-year windows ending in 1985 through 2010. The constraint measures are all in quintiles normalized such that 0 means least constrained and 1 means most constrained. The constraint measures are: the number of land use cases per capita 1996-2005 in columns (1)-(4), the number of land use cases per capita 1956-1965 in (5)-(6), and land availability constructed from Saiz (2010) in columns (7)-(8). The availability measure assumes that all land is available for construction in non-urban counties. Columns (3)-(4) control for skill measures in Autor and Dorn (2013): the share of workers in routine occupations, the college to non-college population ratio, immigrants as a share of the non-college population ratio, manufacturing employment share, the initial unemployment rate, the female share, the share age 65+, and the share earning less than the 10 year ahead minimum wage. We aggregate their data to the state level via population weighting. Standard errors clustered by state for columns (1)-(6) and by metro area for columns (7)-(8) in parentheses. *** p<0.01, ** p<0.05, * p<0.1

TABLE 4 Migration By Skill Group and Share BA

Panel A: Total Migration (Extensive + Intensive Margin)

# Residents - # Born in State as % of Total State Pop

                                          Low-Skill    High-Skill   Total Mig    Difference
                                          (1)          (2)          (2) + (1)    (2)-(1)
1980 Census, n=48
  Share BA, 1980                          2.624***     0.762***     3.386***     -1.862***
                                          (0.479)      (0.131)      (0.550)
2010 American Community Survey, n=48
  Share BA, 1980                          0.490**      0.614***     1.104***     0.124
                                          (0.235)      (0.138)      (0.354)
Coef 2010 - Coef 1980                     -2.134***    -0.148       -2.282***

Panel B: Choice of Destination | Decision to Leave Birth State (Intensive Margin)

# Migrants to state j from state of birth j' - Pop j / (Pop National - Pop j')

                                          Low-Skill    High-Skill   Difference
                                          (1)          (2)          (2)-(1)
1980 Census, n=2256
  Share BA, 1980                          0.116*       0.173***     0.057**
                                          (0.0608)     (0.0460)
2010 American Community Survey, n=2256
  Share BA, 1980                          -0.0297      0.129***     0.149***
                                          (0.0400)     (0.0326)
Coef 2010 - Coef 1980                     -0.136**     -0.044

Notes: This table examines differences by skill group and over time in migration to high BA states. Panel A measures net migration of 25-44 year olds relative to state of birth as a share of the state's total population. There is one observation per state, and robust SE are in parentheses. This measure is attractive because it captures both the decision to migrate and the choice of destination, but it is sensitive to differential trends in domestic BA production. Panel B corrects for this issue and focuses on choice of destination among those who choose to migrate within the 48 continental states. Each observation is a state of origin by state of destination pair. We examine whether people who migrate are disproportionately attracted to states with high share BA. We normalize each observation by subtracting the ratio of the population of the destination state to the population of all states (dropping the population of the state of origin). Observations are weighted by the total number of migrants from the origin state, and the standard errors are clustered by destination. Share BA is calculated using people ages 25-65. Low-skill is defined as having less than a BA. High skill is defined as having a BA or higher. *** p<0.01, ** p<0.05, * p<0.1

A Calibration

In this section we extend the model to allow for a more realistic calibration and the simulation of additional shocks. Specifically, we add elastic labor supply and non-productive, time-varying amenities to the individual's decision problem. Given that the remainder of the model matches the model presented in the text, we do not reproduce those equations here. Further variations on the model, such as a setup with regionally differentiated goods and constant returns in production, are available online.

A.1 Individual Decisions

Once again, agents are either skilled or unskilled, $k \in \{u, s\}$, and have utility in state $j \in \{N, S\}$ of

1 1+ ✏ 1 ⇠ljkt U = max cjkt (hjkt H) + amenjt c ,l 1+1/✏ { jkt jkt} subject to cjkt + pjthjkt = wjktljkt + ⇡t

Labor supply is now elastic and governed by the elasticity parameter $\epsilon$. Non-productive amenities,[41] $amen_{jt}$, can vary over time, but are not skill specific. The first order condition on labor supply implies:

1 ✏ 1 l = w jkt p jkt ✓ jt ◆ ! Profits from both the housing sector and the tradable section in North and South are again rebated lump-sum nationally. We can therefore write each moment’s indirect utility as a function of the wage, price and these parameters:

1 1+✏ 1 1 wjkt 1 pjt v (A, n ,n , amenity )=(w + ⇡ p H) ✓ ◆ +amen jkt jlt jht jt jkt t jt p ⇣ 1+1⌘ /✏ jt ✓ jt ◆ A.2 Calibration

Despite the simplicity of the model, there are a large number of parameters to calibrate. Thankfully, many of them can be inferred from the data or sourced from the literature. We set $\theta$, the premium for skilled versus unskilled workers, equal to 1.7. This is representative of the BA/non-BA relative wages in the data, holding race and gender constant. We set the elasticity of substitution between skilled and unskilled workers, $\rho$, equal to 0.6 as in Card [2009]. The initial share of skilled workers living in the North is set to 0.69, and the initial share of unskilled workers is set to 0.63. This matches the population distribution in 1950, when splitting states into “North” and “South” at the median based on per capita incomes. The total population of each skill type is normalized to one.

[41] Recent work, such as Diamond [2012], has looked at the impact of time-varying, skill-specific amenity shocks.

We use the two parameters of the utility function, $\bar{H}$ and $\beta$, to match the Engel-curve for housing estimated in Section 2. This entails setting $\beta = .06$ and $\bar{H} = .25$ in Appendix. This parameter choice means that we can analyze whether the nonhomotheticity we observe for housing within labor markets is large enough to generate the changes we see in migration for the observed change in housing prices. The discount rate $r$, treating each period as one year, and the labor share of production $(1 - \alpha)$ are set to 0.05 and 0.65 as in much of the literature. The elasticity of labor supply $\epsilon$ is set to 0.6 as in Chetty [2013]. We set $A$, the relative productivity parameter, equal to 1.8. This is consistent with a fraction of 85% of the population residing in the North in the steady state given equalized skill distributions.

Finally, we are left to calibrate the moving cost parameter, the elasticity parameter $\eta$, and the size of the elasticity shock. We initially set $\eta$ equal to 0.4, which generates roughly a 1 to 1 relationship between log prices and log per capita income, matching the relationship in the data for 1950 and 1960 as reported in Figure 3. The moving cost parameter is set equal to .002 to match the speed of directed migration observed prior to the explosion of land use regulations.

We simulate a shock that lowers $\eta$ from 0.4 to 0.135 after 10 periods. This drop is calibrated to match the change in the log price to log income ratio, which in the data (Figure 3) rises to 2 from 1. The dynamics of the system in response to this shock are displayed below.

[Figure: Unanticipated Reg Increase. Plots the Rate of Convergence / Migration (y-axis) over Time (x-axis, t0 to t1) for three series: Mig Rate S→N (Unskilled), Mig Rate S→N (Skilled), and Inc Converge Rate.]

The figure shows that, before the shock, total directed migration averaged slightly less than 2% per year as in the data. Both skilled and unskilled workers migrate from South to North, with unskilled workers actually moving at a slightly faster rate due to initial skill imbalances. The convergence rate before the shock is slightly less than 1% per year. The rate in the data is closer to 2% per year, meaning that under this calibration, the migration mechanism can account for roughly 50% of convergence prior to the regulatory shock.

When a shock calibrated to match changing price ratios hits, both directed migration and income convergence cease as in the data. The rate of income convergence falls roughly 1%, similar to the change in the rate of beta-convergence reported in Figure 1. Thus, while the migration channel can only account for half of the level of convergence, changes in migration can account for roughly 100% of the change. The cessation of total directed migration masks different trends for skilled and unskilled workers. Skilled workers continue to move from South to North at a reduced, but still significant rate. Unskilled migration, which had previously exceeded skilled migration, stops completely. Thus net migration has turned into skill-sorting across locations as in the data.

A.3 Income Convergence and Directed Migration of Whites

Income Convergence Whites Only

[Figure: two panels plotting Annual Inc Growth Rate for Whites (y-axis) against Log Income Per Cap (x-axis), with states labeled by postal code. 1940-1960 against 1940 income, Coef: -1.14 SE: .117; 1990-2010 against 1990 income, Coef: -.389 SE: .261.]

Directed Migration Whites Only

[Figure: two panels plotting Annual Pop Growth Rate for Whites (y-axis) against Log Income Per Cap (x-axis), with states labeled by postal code. 1940-1960 against 1940 income, Coef: 1.359 SE: .451; 1990-2010 against 1990 income, Coef: -1.305 SE: .668.]

Income Convergence by Wharton Regulatory Index Whites Only

[Figure: two panels plotting ΔInc (Annual Rate) for 1940-1960 and 1990-2010 (y-axis) against Log Income Per Cap in 1940 and 1990 (x-axis), with states labeled by postal code and marked Low Reg or High Reg.]

Notes: The horizontal axis in each panel is the log of state per capita income reported by the BEA. In the top and bottom panels, the vertical axis plots the average annual per capita income growth rate for whites in the state using data from Census and ACS extracts. We measure annual per capita income using the mean wage income for workers ages 25 through 65. In the middle row of panels, the vertical axis plots the average annual population growth rate for whites in the state. The bottom panel colors states based on the population weighted value of their housing supply elasticity as measured in Saiz [2010]. Blue states have above median elasticity and red states have an elasticity below the median.

A.4 Amenity Changes

This plot shows the impact of an amenity increase in the North, using the model in Section 2. See Section 5.5 for an extended discussion of these results. Other papers cited in notes to appendix tables: Tolbert and Sizer [1996], U.S. Census Bureau [2012], Haines [2010], Ferrie [2003], Fishback et al. [2006], Lindert and Sutch [2006].

B Constant Returns to Scale in Production

B.1 Downward-Sloping Product Demand, Population Flows, and Convergence

In Section 2, we developed a model where downward-sloping labor demand came from the assumption of a production function that had decreasing returns to scale in labor. Here we show that downward-sloping labor demand can also come from a production function with constant returns to scale ($Y = AL$), combined with elastic product demand and monopolistic competition. Previous drafts (available on request from the authors) have derived this result in a model with multiple skill types.

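B.1.1 Individual Decisions: Labor Supply and Product Demand

Individuals $i$ in the region “home” consume a basket of differentiated goods $\{x_j\}$ from each region $j \in [0, 1]$. Individuals solve the following problem, taking the local price for labor $w$ and the national prices for products $\{p_j\}$ as exogenous:

$$U = \left(\int x_{ij}^{\rho}\,dj\right)^{\frac{1}{\rho}} - \frac{l_i^{1+\frac{1}{\varepsilon}}}{1+\frac{1}{\varepsilon}} + \left(w_i l_i - \int p_j x_{ij}\,dj\right)$$

$$\frac{\partial U}{\partial x_j} = \left(\int x_{ij}^{\rho}\,dj\right)^{\frac{1-\rho}{\rho}} x_{ij}^{\rho-1} - p_j = 0$$

$$\frac{\partial U}{\partial l_i} = -l_i^{1/\varepsilon} + w_i = 0$$

$$\Rightarrow \frac{\left(\int x_{ij}^{\rho}\,dj\right)^{\frac{1-\rho}{\rho}} x_{ij}^{\rho-1}}{p_j} = \frac{l_i^{1/\varepsilon}}{w_i}$$

$$l_i^{supply}(w) = w_i^{\varepsilon}\left(\frac{x_{ij}^{\rho-1}}{p_j}\left(\int x_{ik}^{\rho}\,dk\right)^{\frac{1-\rho}{\rho}}\right)^{\varepsilon} \qquad (9)$$

Equation (9) holds for all markets $j \in [0, 1]$. We now apply the standard Dixit-Stiglitz solution techniques to derive the demand for any individual good $j$ in terms of its own price $p_j$, household income $w_i l_i$ and the aggregate price index $P$. The first order conditions imply that an individual’s consumption of two goods must have the following ratio:

$$\frac{x_{ik}}{x_{ij}} = \left(\frac{p_k}{p_j}\right)^{-\sigma} \Rightarrow x_{ik} = x_{ij}\left(\frac{p_k}{p_j}\right)^{-\sigma}$$

$$p_k x_{ik} = p_k x_{ij}\left(\frac{p_k}{p_j}\right)^{-\sigma}$$

Integrating:

$$\int p_k x_{ik}\,dk = \int p_k x_{ij}\left(\frac{p_k}{p_j}\right)^{-\sigma} dk$$

$$w l_i = x_{ij}\, p_j^{\sigma} \underbrace{\int p_k^{1-\sigma}\,dk}_{P^{1-\sigma}}$$

$$x_{ij} = p_j^{-\sigma}\,\frac{w_i l_i}{P^{1-\sigma}} \qquad (10)$$

Recall that $l_i$ is actually $l_i^{*}(w)$ from equation (9), which governed labor supply. We now substitute in for the labor supply elasticity above, to write an individual’s demand for good $x_j$ as:

$$x_{ij}^{Demand}(p, w, \xi, P) = \frac{p_j^{-\sigma}\, w_i^{1+\varepsilon}}{P^{1-\sigma}}\,\xi_i \qquad (11)$$

where $\xi_i$ is a scaling of household marginal utility.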

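B.1.2 Firm Decisions: Product Supply and Labor Demand

We assume that each region has a single firm $j$, which takes the national demand curve and local wages as exogenous. As before, we suppress the notation for the location of the home firm throughout. Firms produce using the constant returns to scale production function $q_j = A L_j$. The firm serves the national market but hires labor locally ($L_j$) at wage $w_j$.

$$\max_{p_j,\, l_j,\, q_j}\; p_j q_j(p_j) - w_j L_j$$

subject to (1) $q_j = \frac{p_j^{-\sigma}}{P^{1-\sigma}} \underbrace{\int w_i^{1+\varepsilon}\xi_i\mu_i\,di}_{\text{national demand}}$ and (2) $L_j = q_j/A$

$$\Longleftrightarrow \max_{p_j}\; p_j^{1-\sigma}\,\frac{\int w_i^{1+\varepsilon}\xi_i\mu_i\,di}{P^{1-\sigma}} - \frac{w_j}{A}\left(p_j^{-\sigma}\,\frac{\int w_i^{1+\varepsilon}\xi_i\mu_i\,di}{P^{1-\sigma}}\right)$$

$$\Longleftrightarrow \max_{p_j}\; \left(p_j^{1-\sigma} - \frac{w_j}{A}\,p_j^{-\sigma}\right)\frac{\int w_i^{1+\varepsilon}\xi_i\mu_i\,di}{P^{1-\sigma}}$$

$$\text{FOC}: \;\Rightarrow\; p_j = \frac{\sigma}{\sigma-1}\,\frac{w_j}{A}$$

Having derived the optimal prices, we can determine output by substituting the price FOC back in to equation (11) for consumer demand:

$$x_{ij}^{demand} = \frac{p_j^{-\sigma}\, w_i^{1+\varepsilon}}{P^{1-\sigma}}\,\xi_i$$

We can integrate over all the individuals $i$ to calculate an aggregate demand curve for good $j$:

$$x_j^{demand} = \left(\frac{\sigma}{\sigma-1}\,\frac{w_j}{A}\right)^{-\sigma}\frac{\int w_i^{1+\varepsilon}\mu_i\xi_i\,di}{P^{1-\sigma}}$$

Inverting the production function $q = AL$ gives a company’s labor demand as a function of wages and downward-sloping demand for their good.

$$L^{Demand}(w) = \frac{\left(\frac{\sigma}{\sigma-1}\,\frac{w_j}{A}\right)^{-\sigma}\frac{\int w_i^{1+\varepsilon}\xi_i\mu_i\,di}{P^{1-\sigma}}}{A} = A^{\sigma-1}\left(\frac{\sigma}{\sigma-1}\,w_j\right)^{-\sigma}\frac{\int w_i^{1+\varepsilon}\xi_i\mu_i\,di}{P^{1-\sigma}} \qquad (12)$$

B.1.3 Labor Market Equilibrium

Recall that labor supply is given by the individual labor supply decision (equation (9)) times the share of individuals $\mu_j$ in the regional market.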

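$$L^{Supply}(w) = \mu_j w_j^{\varepsilon}\xi_j \qquad (13)$$

Now we can equate labor supply from equation (13) and demand from equation (12) to solve for the market-clearing wage

$$L^{D}(w) = L^{S}(w)$$

$$\Rightarrow\; \mu_j w_j^{\varepsilon}\xi_j = A^{\sigma-1}\left(\frac{\sigma}{\sigma-1}\,w_j\right)^{-\sigma}\frac{\int w_i^{1+\varepsilon}\mu_i\xi_i\,di}{P^{1-\sigma}}$$

Recall from equation (9) that

$$\xi = \left(\frac{x_{ij}^{\rho-1}}{p_j}\left(\int x_{ik}^{\rho}\,dk\right)^{\frac{1-\rho}{\rho}}\right)^{\varepsilon}$$

Recall equation (10), that consumer $i$’s demand for good $j$ is $x_{ij} = p_j^{-\sigma}\frac{w l_i}{P^{1-\sigma}}$. Plugging the demand equation into the marginal utility expression gives

$$\xi = \left(p_j^{-\sigma(\rho-1)-1}\left(\frac{w l_i}{P^{1-\sigma}}\right)^{\rho-1}\left(\int \left(p_j^{-\sigma}\,\frac{w l_i}{P^{1-\sigma}}\right)^{\rho} dj\right)^{\frac{1-\rho}{\rho}}\right)^{\varepsilon}$$

$$= \left(p_j^{-\sigma(\rho-1)-1}\left(\frac{w l_i}{P^{1-\sigma}}\right)^{\rho-1}\left(\frac{w l_i}{P^{1-\sigma}}\right)^{1-\rho}\left(\int p_j^{-\sigma\rho}\,dj\right)^{\frac{1-\rho}{\rho}}\right)^{\varepsilon}$$

$$= \left(p_j^{-\sigma(\rho-1)-1}\left(\int p_j^{-\sigma\rho}\,dj\right)^{\frac{1-\rho}{\rho}}\right)^{\varepsilon}$$

This shows that $\xi$ is a function of prices which are exogenous from the perspective of the home region, meaning that it cancels from both sides of the labor-market clearing condition. This means we can solve for the market-clearing wage in terms of exogenous parameters.

$$w_j^{Market\text{-}clearing} = \left(\frac{\sigma}{\sigma-1}\right)^{-\sigma/(\sigma+\varepsilon)} A^{\frac{\sigma-1}{\varepsilon+\sigma}}\, P^{\frac{\sigma-1}{\varepsilon+\sigma}}\, \mu_j^{-\frac{1}{\sigma+\varepsilon}}$$

With the market-clearing wage, we can go back to the individual labor supply condition (equation (9)) to solve for per capita income

$$w_j^{*} l_i^{*} = w^{1+\varepsilon}\xi_i = \left(\frac{\sigma}{\sigma-1}\right)^{-\sigma(1+\varepsilon)/(\sigma+\varepsilon)} A^{\frac{(\sigma-1)(1+\varepsilon)}{\varepsilon+\sigma}}\, P^{\frac{(\sigma-1)(1+\varepsilon)}{\varepsilon+\sigma}}\, \mu^{-\frac{1+\varepsilon}{\sigma+\varepsilon}}\,\xi_i \qquad (14)$$

B.1.4 Comparative Static

We are interested in the impact of a population change in the home region on local per-capita incomes, or mathematically, $\partial w^{*}l^{*}/\partial\mu$. $A$, $P$, $\sigma$, $\xi$ and $\varepsilon$ are exogenous parameters or functions of nation-wide variables. From equation (14) we have an elasticity of per capita income with respect to population of: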

$$\varepsilon^{\text{per cap income}}_{\text{population}} = -\frac{1+\varepsilon}{\sigma+\varepsilon}$$

where $0 < \mu < 1$, $\varepsilon > 0$, and $\sigma > 1$. We can interpret this elasticity intuitively. When the labor supply elasticity is high, inflows have a bigger impact on income because a small increase in labor supply greatly bids down the price of labor. When a monopolistic region faces a less elastic demand curve ($\sigma \sim 1$), then it will not increase production much in response to a migration-induced decrease in the cost of labor. As a result, incomes will fall to a greater degree if the demand curve is more inelastic ($\sigma$ is lower). In this way, monopolistically competitive markets can provide a microfoundation for the result of downward-sloping labor demand.
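The comparative static can be checked numerically. The sketch below takes the elasticity of per capita income with respect to population to be -(1 + eps) / (sigma + eps), where `eps` is the labor supply elasticity and `sigma` is the product demand elasticity; the symbol name `sigma`, the function name, and the parameter values are assumptions made here for illustration and are not calibrated values from this appendix.

```python
def income_elasticity_wrt_population(eps, sigma):
    """Elasticity of per capita income with respect to population,
    -(1 + eps) / (sigma + eps), for eps > 0 and sigma > 1.
    eps:   labor supply elasticity.
    sigma: product demand elasticity (symbol name assumed for this sketch)."""
    if eps <= 0 or sigma <= 1:
        raise ValueError("requires eps > 0 and sigma > 1")
    return -(1.0 + eps) / (sigma + eps)

# Illustrative parameter values only.
base = income_elasticity_wrt_population(eps=0.6, sigma=2.0)
more_elastic_labor = income_elasticity_wrt_population(eps=2.0, sigma=2.0)
less_elastic_demand = income_elasticity_wrt_population(eps=0.6, sigma=1.5)
```

In this example both a higher labor supply elasticity and a lower demand elasticity make the income response to a population inflow more negative, which is the pattern the paragraph above describes.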

C Distribution of Migration Costs

C.1 The Path of Income and Population Over Time

For this exercise, we abstract from different skill types, and focus on a single skill model. As before, output in an area is a function of the local population:

$$Y = A n^{1-\alpha}$$

The parameter $\alpha$ governs the elasticity of both per capita income and the exponential of indirect utility with respect to population. Further, let $A$ be the ratio of productivity in the North relative to the South. Here we use notation $N$ for the Northern rich region and $S$ for the Southern poor region. We then have per capita incomes:

$$y_{Nt} = A n_{Nt}^{-\alpha} \quad \text{and} \quad y_{St} = n_{St}^{-\alpha}$$

Let $x$ be the share of people leaving place $S$ for place $N$. The gap in per capita income growth rates between North and South is

$$d\ln(y) = d\ln(y_{Nt}) - d\ln(y_{St})$$

$$= \left(-\alpha \ln\left(\frac{n_{Nt}}{n_{Nt-1}}\right)\right) - \left(-\alpha \ln\left(\frac{n_{St}}{n_{St-1}}\right)\right)$$

$$= -\alpha\left(\ln\left(\frac{n_{Nt-1} + x\,n_{St-1}}{n_{Nt-1}}\right) - \ln\left(\frac{(1-x)\,n_{St-1}}{n_{St-1}}\right)\right)$$

$$= -\alpha\left(\ln\left(1 + \frac{x\,n_{St-1}}{n_{Nt-1}}\right) - \ln(1-x)\right)$$

The convergence rate is the gap in per capita growth rates divided by the gap in levels. We set this to a negative constant.

xnSt 1 ↵ ln 1+ ln (1 x) dln (y) nNt 1 = n =  ln(y ) ⇣ ⇣ ⌘ Nt 1 ⌘ t 1 ln(A) ↵ln nSt 1 convergence ⇣ ⌘ Given this constant, we can solve for x: |{z}

xnSt 1 ↵ 1+ nNt 1 nSt 1 ↵ ln =  ln(A ) 1 x nNt 1 !! ✓ ✓ ◆ ◆

xnSt 1 ↵ 1+ ↵ nNt 1  nSt 1 e = e A 1 x nNt 1 ! ✓ ◆ ↵ nSt 1 +↵ nSt 1 x = e A (1 x) 1 nNt 1 nNt 1 ✓ ◆ ✏ ✏ nSt 1 +↵ nSt 1 +↵ nSt 1 x + e A = e A 1 nNt 1 nNt 1 nNt 1 ✓ ✓ ◆ ◆ ✓ ◆ ↵ +↵ nSt 1 e A 1 nNt 1 x = ↵ nSt 1 ⇣+↵ ⌘nSt 1 + e A nNt 1 nNt 1 ↵ ⇣ ⇣ ⌘ ⌘ nSt 1 YNt 1 A = Because nNt 1 YSt 1 , we can rewrite this as ⇣ ⌘ +↵ YNt 1 e 1 YSt 1 x = 1 ⇣ ⌘1 ↵ +↵ YNt 1 YNt 1 e + A ↵ YSt 1 YSt 1 ⇣ ⌘ ⇣ ⌘ Finally, define YN0 as income in the North and YS0 as income in the South at t = t0. Then

52 +↵ YN0 (t t ) e 1+ e 0 1 YS0 xt⇤ = x = 1 Y ⇣ ⇣ ⌘ 1 ⌘ Y ✏↵ e+↵ 1+ N0 e(t t0) + A ↵ 1+ N0 e(t t0) YS0 YS0 ⇣ ⇣ ⌘ ⌘ ⇣ ⇣ ⌘ ⌘ We need optimal migration from the South to produce this fraction of the Southern population moving North for each time t. Below, we derive conditions under which this fraction is declining over time. It is intuitive that the share of the Southern population moving would fall over time, because as migration rates should fall as the benefit to moving falls. Still, the ratio between the amount of directed migration and the initial income gap will be constant, so that income convergence continues at constant rate.

C.2 Individual Migration Decisions Consider an agent in the South deciding whether to move to the North today or stay in the South, with the possibility of moving in the future, valued at VT˜+1. This agent discounts the future at rate r. In each 1 period, agents draw i.i.d. moving costs F .Define⇤ = F (x⇤ ). The agent will move if ⇠ T T Gain to Moving at T Flow Cost > Benefit to Moving Later 1 rt YN0 (t t0) ⇢rT e e T >e VT˜+1 YS0 tX=T ✓ ◆

At xT⇤ ,theagentisindifferent between moving and staying. This implies that

1 YN0 1 rt (t t0) rT ˜ T⇤ F (xT⇤ )= e e e VT +1 ⌘ YS0 t=T X ✓ ◆ Benefit to Staying Gains to moving at T | {z } The benefit to waiting is that expected| future migration{z costs} are lower. We know at each period how likely it is that the agent would choose to move in all future periods. So we can integrate up the value the agent gets from eventually winding up in Productiveville. The difference between that and the value of moving today is the expected savings in moving costs. This defines the distribution of moving costs for the part of the distribution hit covered by the sequence x⇤ 1 . { t }t=0

t 1 1 1 YN0 (r+)t+t0 YN0 (r+)t2+t0 1 e (1 xj⇤) xt⇤ e = F (xT⇤ ) E[FutureMC T ] YS0 0 1 YS0 | t=T t=T +1 j=T t =t ✓ ◆ 2 ✓ ◆ Costtomoving now Eventual Moving Cost X X Y X Excess Gains to moving at T @ A Excess Gains From Moving Eventually | {z }

| {z } 1 | {z F (x⇤) } t t 1 1 = F (x⇤ ) 1 x⇤ f() d T j ˆ · · t=T +1 j=T X Y 0 C.3 Finding An Interior Solution

dxt⇤ To finish the proof, we need to show that dt < 0 for t>0.BecauseincomegapsbetweenNorthandSouth are falling, this implies that we need the fraction of Southern residents leaving each period to be declining. This ensures that the dynamic problem described above has an interior solution. Recall from the previous section that +↵ YN0 (t t ) e 1+ e 0 1 YS0 xt⇤ = 1 Y ⇣ ⇣ ⌘ 1 ⌘ Y ↵ e+↵ 1+ N0 e(t t0) + A ↵ 1+ N0 e(t t0) YS0 YS0 ⇣ ⇣ ⌘ ⌘ ⇣ ⇣ ⌘ ⌘

53 +↵ YN0 (t t ) e e 0 dxt⇤ YS0 = 1 dt Y ⇣ ⌘ 1 Y ↵ e+↵ 1+ N0 e(t t0) + A ↵ 1+ N0 e(t t0) YS0 YS0

⇣ ⇣ ⌘ ⌘ P 1 ⇣ ⇣ ⌘ ⌘

xt⇤ | {z }1 Y 1 Y ↵ ⇥ e+↵ 1+ N0 e(t t0) + A ↵ 1+ N0 e(t t0) YS0 YS0 ✓ ⇣ ⇣ ⌘ ⌘ ⇣ ⇣ ⌘ ⌘ ◆ P 2 1 1 ↵ YN0 (t t ) YN0 (t t ) | A{z↵ 1+ e 0 } e 0 +↵ YN0 (t t ) YS0 YS0 e e 0 + 0 ⇣ ⇣ ⌘ Y ⌘ ⇣ ⌘ 1 YS0 ↵ 1+ N0 e(t t0) ✓ ◆ YS0 B ⇣ ⇣ ⌘ ⌘ C @ P 3 A

| dx {z } t⇤ < 0 P 1

1 1 ↵ YN0 (t t ) A ↵ 1+ e 0 +↵ +↵ YS0 e < xt⇤ e + 0 ⇣ ⇣Y ⌘ ⌘ 1 ⇥ ↵ 1+ N0 e(t t0) YS0 B C @ ⇣ ⇣ ⌘ ⌘ A

YN0 (t t ) Define ⌘ =1+ e 0 . YS0 | {z } ⇣ ⌘ 1 A ↵ 1 ↵ +↵ +↵ ↵ e < xt⇤ e + ⌘ ⇥ ↵✏ ! 1 +↵ e ⌘ 1 A ↵ 1 ↵ +↵ +↵ ↵ e > 1 e + (⌘) e+↵⌘ + A ↵ ⌘1/↵ ⇥ ↵ ! 1 1 1 ↵ 1 ↵ ↵ 1 ↵ +↵ 2 +↵ 1/↵ +↵ 2 +↵ +↵ A A e ⌘ + e A ↵ ⌘ > e ⌘ e + e ⌘ (⌘) ↵ (⌘) ↵ ) ↵ ↵ 1 1 1 ↵ 1 ↵ 1 ↵ +↵ 1/↵ A A e A ↵ ⌘ +1 (⌘) ↵ > (⌘) ↵ ) ↵ ! ↵ 1 ↵ 1 ↵ 1 1 1 ↵ 1 1 +↵ A ↵ 1 e > (⌘) ↵ 1 A ↵ (⌘) ↵ = ⌘ ↵ (1 ↵)⌘ ↵ ↵A ↵ ) ↵ \ ↵ \ ✓ ◆ ⇣ ⌘ Plugging back in for ⌘ gives

1 ↵ ↵ YN0 (t t ) 1+ e 0 +↵ YS0 e > 1 ⇣ ⇣ ⌘ ⌘ ↵ 1 YN0 (t t ) (1 ↵) 1+ e 0 ↵A ↵ YS0 ✓ ⇣ ⇣ ⌘ ⌘ ◆ (t t ) 0 We need this to be true at both t = t and t = .Att = t , e 0 = e =1,andatt = , e1 =0. 0 1 0 1

54 This gives us the conditions:

1 ↵ ↵ 1+ YN0 YS0 1 e+↵ > max , 1 1/↵ { ⇣ ⇣ Y ⌘⌘ ↵ 1 1 ↵ ↵A } (1 ↵) 1+ N0 ↵A ↵ YS0 ✓ ◆ ⇣ ⇣ 1 ↵ ⌘⌘ ↵ 1+ YN0 YS0 = 1 ⇣ ⇣ Y ⌘⌘ ↵ 1 (1 ↵) 1+ N0 ↵A ↵ YS0 ✓ ◆ ⇣ ⇣ ⌘⌘ 1 ↵ ↵ 1+ YN0 YS0  > log 1 ↵ ⇣ ⇣ Y ⌘⌘ ↵ 1 (1 ↵) 1+ N0 ↵A ↵ YS0 ✓ ⇣ ⇣ ⌘⌘ ◆ ↵ A 1+ YN0 So as long as this combination of , ,and YS0 are sufficiently small, then there exists some moving cost distribution F such that convergence occurs⇣ at⌘ a constant rate.

APPENDIX TABLE 1 σ Convergence, IV Estimates of Convergence and Labor Market Area Convergence

Panel A: Cross-Sectional Standard Deviation of Income

                          1950     1960     1970     1980     1990     2000     2010
BEA Log Inc Per Cap       0.236    0.199    0.155    0.137    0.150    0.150    0.138

Panel B: Additional Convergence Regressions

Δln yit (Annual Rate in %) = α+βtln yit-1+εit

20 year period ending in…
                          1950     1960     1970     1980     1990     2000     2010
OLS BEA
  Coefficient             -2.38    -2.41    -1.98    -1.85    -0.58    -0.39    -0.99
  Standard Error          0.16     0.11     0.16     0.15     0.31     0.46     0.29

OLS Census
  Coefficient             --       -1.82    -2.33    -2.42    -0.36    -0.26    -1.33
  Standard Error          --       0.13     0.16     0.12     0.33     0.50     0.32

IV BEA with Census
  Coefficient             --       -2.46    -1.65    -1.59    -0.37    -0.22    -1.23
  Standard Error          --       0.12     0.22     0.25     0.32     0.46     0.42

IV Census with BEA
  Coefficient             --       -1.81    -2.42    -2.37    -0.48    -0.27    -0.84
  Standard Error          --       0.12     0.18     0.14     0.38     0.59     0.27

Panel C: Convergence at Labor Market Area Level

Δln varit (Annual Rate in %) = α+βtln yit-1+εit

20 year period ending in…
                          1950     1960     1970     1980     1990     2000     2010
Income Convergence
  Coefficient             --       -0.97    -1.69    -2.13    -0.21    0.23     -0.26
  Standard Error          --       0.19     0.10     0.13     0.18     0.26     0.16

Notes:

Panel A. This panel reports the standard deviation of log income per capita across states. This corresponds to the σ convergence concept in Barro and Sala-i-Martin (1992).

Panel B. Figure 1 calculates convergence coefficients using data on personal income from the BEA. That specification is biased in the presence of classical measurement error. We address the bias issue by instrumenting for the BEA measure using an alternative Census measure and vice versa. The Census measure is log wage income per capita for all earners, except in 1950 where it is only household heads. The first stage F-statistics range from 189 to 739. Classical measurement error is not an issue in these IV regressions, and the convergence coefficients display a similar time-series pattern.

Panel C. This panel replicates the "OLS Census" specification from this table at the Labor Market Area (LMA) level, with each LMA weighted by its population. We construct a panel of income and population at the LMA level. LMAs are 382 groups of counties which are linked by intercounty commuting flows and partition the United States (Tolbert and Sizer, 1996). LMA income is estimated as the population-weighted average of county-level income. The income series uses median family income from 1950-2000 from Haines (2010) and USACounties (2012). In 1940 and 2010, the series is unavailable. In 1940, we use pay per manufacturing worker from Haines (2010), which had a correlation of 0.77 with median family income in 1950, a year when both series were available. In 2010, we use median household income from USACounties (2012), which had a correlation of 0.98 with median family income in 2000, a year when both series were available.

APPENDIX TABLE 2
Directed Migration From Poor to Rich States and Labor Market Areas

Δ Y_{it} (Annual Rate in %) = α + β_t ln y_{it} + ε_{it}
20 year period ending in…

                        1950    1960    1970    1980    1990    2000    2010

Y: Δ Log Pop_{it}, State Level
Baseline, State-Level
  Coefficient           0.56    1.60    2.13    0.75    0.26    1.18   -0.48
  Standard Error        0.27    0.37    0.60    0.78    1.03    1.05    0.64

Y: Net Migration (Birth-Death Method), State Level
  Coefficient           1.16    2.68    2.92    1.14    0.78    1.06   -0.49
  Standard Error        0.19    0.36    0.59    0.77    0.97    1.02    0.58

Y: Net Migration (Survival Ratio Method), State Level
  Coefficient           1.29    2.04    2.20    0.67    0.05      --      --
  Standard Error        0.23    0.35    0.58    0.77    0.92      --      --

Y: Δ Log Pop_{it}, Labor Market Area Level
  Coefficient             --    1.82    1.73   -0.02   -0.88    0.17    0.13
  Standard Error          --    0.31    0.26    0.32    0.42    0.41    0.25

Sources: BEA Income estimates, Ferrie (2003) and Fishback et al. (2006)

Notes: Robust standard errors are shown below coefficients. Birth-death method uses state-level vital statistics data to calculate net migration as ObservedPop_t - (Pop_{t-10} + Births_{t,t-10} + Deaths_{t,t-10}). Survival ratio method computes counterfactual population by applying national mortality tables by age, sex, and race to the age-sex-race Census counts from 10 years prior. Both published series end in 1990, and we use vital statistics to construct the birth-death measure through 2010. See notes to Appendix Table 1 for details on construction of the Labor Market Area sample.

APPENDIX TABLE 3
Returns to Living in a High Income State by Skill

                                             1940       1960       1970       1980       1990       2000       2010

Panel A. Returns to Migration (OLS)
Dependent variable: Income Net of Housing Costs
Average State Income X Unskilled         0.880***   0.736***   0.786***   0.726***   0.657***   0.539***   0.356***
                                         (0.0204)   (0.0257)   (0.0421)   (0.0775)   (0.0347)   (0.0349)   (0.0465)
Average State Income X Skilled           0.700***   0.869***   0.876***   0.766***   0.885***   1.153***   0.967***
                                         (0.0615)   (0.0633)   (0.0620)    (0.124)   (0.0961)    (0.111)   (0.0903)
N                                         255,391    306,576    339,412  2,116,772  2,924,925  3,142,015    694,985

Panel B: Returns to Migration (IV for State of Residence with State of Birth)
Dependent variable: Income Net of Housing Costs
Average State Income X Unskilled         0.932***   0.776***   0.859***   0.772***   0.667***   0.488***   0.258***
                                         (0.0298)   (0.0381)   (0.0559)   (0.0937)   (0.0362)   (0.0358)   (0.0518)
Average State Income X Skilled           0.719***   0.740***   0.775***   0.418***   0.889***   1.196***   0.872***
                                         (0.0622)   (0.0814)   (0.0998)    (0.138)    (0.113)    (0.136)    (0.131)
N                                         255,391    306,576    339,412  2,116,772  2,924,925  3,142,015    694,985

Panel C: Differential Impacts of Housing Costs in High-Income States (OLS)
Dependent variable: Log Housing Costs
Log Average State Income X Unskilled     1.138***   1.076***   1.449***   1.755***   2.632***   2.249***   2.329***
                                         (0.0902)   (0.0957)    (0.160)    (0.437)    (0.284)    (0.281)    (0.284)
Log Average State Income X Skilled       1.657***   0.878***   1.274***   1.347***   2.338***   1.540***   1.802***
                                          (0.139)    (0.103)   (0.0935)    (0.250)    (0.285)    (0.247)    (0.238)
N                                         235,121    296,484    324,017  1,951,058  2,615,879  2,788,921    606,001

Notes: All standard errors are clustered by state. *** p<0.01, ** p<0.05, * p<0.1

Panel A. This panel reports the coefficients β1 and β2 from the regression Y_i - P_i = α + γSkill_i + β1 Y*(1-Skill_i) + β2 Y*Skill_i + θX_i + ε_i, where Y_i and P_i measure household wage income and housing costs respectively, Y measures average state income and X_i are household covariates. Household Skill_i is the fraction of household adults in the workforce who are skilled, defined as 12+ years of education in 1940 and 16+ years thereafter. Household covariates are the size of the household, the fraction of adult workers who are black, white, and male, and a quadratic in the average age of adult household workers. Housing costs P_i are defined as 5% of house value or 12 times monthly rent for renters. 1950 is omitted because household-level rent data are unavailable.

Panel B. The IV regressions replicate panel A, but instrument for average state income and its interaction with household skill using the average income of the state of birth of adult household workers. The first stage F-statistics in these regressions exceed 80.
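The variable definitions in the Panel A note can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, argument names, and the sample household figures are hypothetical and are not taken from the paper's own code or data.

```python
def annual_housing_cost(house_value=None, monthly_rent=None):
    """Annual housing cost P_i as described in the note:
    5% of house value for owners, 12 times monthly rent for renters."""
    if house_value is not None:
        return 0.05 * house_value
    if monthly_rent is not None:
        return 12 * monthly_rent
    raise ValueError("Provide either house_value or monthly_rent")


def income_net_of_housing(wage_income, house_value=None, monthly_rent=None):
    """Dependent variable Y_i - P_i in the Panel A regression."""
    return wage_income - annual_housing_cost(house_value, monthly_rent)


def household_skill(years_of_education, census_year):
    """Fraction of working adults in the household who are skilled:
    12+ years of education in 1940, 16+ years thereafter."""
    if not years_of_education:
        raise ValueError("Household has no working adults")
    threshold = 12 if census_year == 1940 else 16
    skilled = sum(1 for years in years_of_education if years >= threshold)
    return skilled / len(years_of_education)


# Hypothetical households (made-up numbers, for illustration only).
owner_net = income_net_of_housing(50_000, house_value=200_000)
renter_net = income_net_of_housing(30_000, monthly_rent=800)
skill_1940 = household_skill([10, 12], census_year=1940)
skill_2000 = household_skill([10, 12], census_year=2000)
print(owner_net, renter_net, skill_1940, skill_2000)
```

The same two-adult household counts as half skilled under the 1940 threshold and as unskilled under the later one, which is why the skill definition has to be applied year by year.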

Panel C. This panel reports the coefficients β1 and β2 from the regression log(P_i) = α + γSkill_i + β1 log(Y)*(1-Skill_i) + β2 log(Y)*Skill_i + θX_i + ε.

APPENDIX TABLE 4
Migration Flows by Skill Group: Nominal vs. Real Income

Dep Var: 5-Year Net Migration as Share of Total Pop

Columns:
(1) Baseline
(2) Double Housing Cost
(3) Exclude In-State Mig
(4) Only Whites
(5) Mig Measure Birth State

                                                  (1)        (2)        (3)        (4)        (5)

Panel A: Low-Skill People, 1940
Log Nominal Income                           1.313***         --    1.049**    1.007**    1.086**
                                              (0.470)         --    (0.438)    (0.443)    (0.443)
Log Group-Specific Income Net of Housing     1.236***   1.109***   1.017***   0.980***   0.995***
                                              (0.364)    (0.274)    (0.350)    (0.352)    (0.338)

Panel B: High-Skill People, 1940
Log Nominal Income                              0.611         --      0.617      0.585      0.475
                                              (0.392)         --    (0.419)    (0.387)    (0.411)
Log Group-Specific Income Net of Housing       0.773*    0.899**     0.905*     0.821*      0.701
                                              (0.400)    (0.337)    (0.462)    (0.415)    (0.513)

Panel C: Low-Skill People, 2000
Log Nominal Income                           -2.173**         --  -2.456***  -2.377***      0.281
                                              (1.006)         --    (0.792)    (0.757)    (8.453)
Log Group-Specific Income Net of Housing      4.309**   6.042***     -0.357      1.725     -11.99
                                              (2.007)    (2.140)    (1.167)    (1.418)    (11.51)

Panel D: High-Skill People, 2000
Log Nominal Income                           4.077***         --   1.786***   2.894***   19.32***
                                              (0.694)         --    (0.611)    (0.649)    (5.373)
Log Group-Specific Income Net of Housing     4.715***   3.634***   1.937***   3.593***   14.06***
                                              (0.894)    (1.280)    (0.701)    (0.874)    (4.567)

Note: Each cell represents the results from a different regression. The table regresses 5 year net-migration rates on average income and skill-specific income net of housing. Low-skill is defined as having less than 12 years of education in 1940 and less than a BA in 2000. In 1940, the unit of observation is State Economic Area, with n=455 to 466, depending on specification. In 2000, the unit of observation is three-digit Public Use Microdata Areas, with n=1,020. The baseline case reproduces the results in Figures 5 and 6. The second column shows the effect of doubling the housing costs described in the text to control for non-housing price differences across places. The third column excludes intra-state migrants in calculating net-migration rates. The fourth column excludes non-white migrants in calculating net-migration rates. The final measure calculates migrants as the number of residents residing outside their state of birth. Additional details are presented in the text. Standard errors clustered by state. *** p<0.01, ** p<0.05, * p<0.1

APPENDIX TABLE 5
Impacts of Alternate Regulation Measures on Permits, Prices, Migration, and Convergence

Dependent variables by column:
(1) Annual Construction Permits_t, % of Housing Stock
(2) Log House Price_t
(3) ΔLog Population_{t,t+20}, Annual Rate in %
(4) Δ Log Human Capital
(5) Δ Log Income Per Cap_{t,t+20}, Annual Rate in %

                                             (1)         (2)         (3)         (4)         (5)

"Land Use" Cases Per Capita, Continuous & Winsorized @ 90th Percentile, scaled [0,1]
Log Inc Per Cap_t                          2.042    0.907***     1.297**  -0.0370***   -1.804***
                                         (1.232)    (0.0882)     (0.607)   (0.00756)     (0.108)
Log Inc Per Cap_t * Continuous Reg       -2.868*    0.809***    -2.132**      0.0298    1.765***
                                         (1.466)     (0.247)     (0.821)    (0.0218)     (0.563)
N                                          1,536         384       2,448         288       2,448

"Land Use" Cases Per Capita, Above/Below Median
Log Inc Per Cap_t                        3.200**    0.903***     1.381**  -0.0367***   -1.884***
                                         (1.551)    (0.0784)     (0.585)   (0.00715)    (0.0956)
Log Inc Per Cap_t * Binary Reg          -2.984**    0.633***    -1.043**   0.0310***    1.113***
                                         (1.380)     (0.175)     (0.441)    (0.0103)     (0.244)
N                                          1,536         384       2,448         288       2,448

"Zoning" Cases Per Capita, Rank scaled [0,1]
Log Inc Per Cap_t                       5.955***    0.683***    2.507***   -0.0277**   -2.179***
                                         (2.165)     (0.114)     (0.690)    (0.0136)     (0.141)
Log Inc Per Cap_t * Zoning Reg         -7.246***    1.032***   -3.646***    -0.00683    1.294***
                                         (2.456)     (0.255)     (1.064)    (0.0276)     (0.453)
N                                          1,536         384       2,448         288       2,448

Year*High Reg FEs                              Y           Y           Y           Y           Y

Notes: The table reports the coefficients β1 and β2 from regressions of the form:

Δln y_{it} = α_t + α_t reg_{it} + β1 ln y_{it} + β2 ln y_{it} x reg_{it} + ε_{it}.

We use three regulation measures: (1) land use cases per capita (not the rank), scaled from zero to the 90th percentile of positive observations, (2) whether land use cases per capita are above or below median, and (3) the rank of cases mentioning the word "zoning". The dependent variables are new housing permits from the Census Bureau, the median log housing price from the Census, population change, the change in log human capital due to migration, and the change in log per-capita income. Standard errors clustered by state. *** p<0.01, ** p<0.05, * p<0.1

APPENDIX TABLE 6
Robustness Tests

Δ Log Income Per Capt-20,t (Annual Rate in %) (1) (2) (3) (4) (5) (6)

Log Inc Per Capt-20 -2.034*** -1.968*** -2.442*** -11.04*** -1.109*** (0.102) (0.107) (0.0876) (3.108) (0.197)

Log Inc Per Capt-20* Regit 1.304*** 0.640** 0.585* 0.516* 0.370** (0.393) (0.312) (0.313) (0.275) (0.140)

Log Inc Per Capt-20*1(Inc >Med)t-20* Regit 2.002** (0.799)

Share BA t-20 -19.48 (21.54)

Log Inc Per Capt-20 *Share BA t-20 2.400 (2.003)

Log Inc Per Capt-20 ^2 0.478*** (0.165)

Regit -3.451** (1.354)

Fixed Effect: (1) Year x Reg; (2) Year x Reg; (3) Year x Reg; (4) Year x Reg; (5) Year x Reg, Census Division x Reg; (6) Year x Inc
R2 0.811 0.817 0.874 0.817 0.851 0.820
N 2,448 2,448 288 2,448 2,448 2,448

Column 1 reports the baseline convergence relationship from Table 2. Column 2 interacts the regulation variable with a dummy for state per capita income greater than the median. This follows our model in assuming that regulations only bind in growing locations. Column 3 includes controls for the percent of the population with a BA and the interaction of this share with initial income. This specification, like Section 5.1, is designed to show the robustness of the regulation result to controls for skill-biased technological change. Column 4 includes a control for initial log income squared, accounting for potential nonlinearity in convergence. Column 5 includes Census division fixed effects interacted with regulations to account for differential regulation growth across regions. Column 6 includes year fixed effects interacted with initial income, which allows for different baseline convergence rates across time. In all of these models, the relationship between tighter regulation and slower convergence remains statistically significant. Standard errors are clustered by state, and the construction of the variables is discussed in the text. *** p<0.01, ** p<0.05, * p<0.1

APPENDIX TABLE 7
Share of Unavailable Land (Aggregated from Saiz 2010)

UT  0.698    CO  0.202
FL  0.553    MI  0.200
CA  0.532    MD  0.193
WV  0.523    DE  0.188
LA  0.507    OH  0.180
VT  0.447    AL  0.174
OR  0.427    AR  0.170
NV  0.415    AZ  0.162
WA  0.389    NM  0.156
CT  0.376    MT  0.146
ID  0.354    RI  0.139
NY  0.347    WY  0.137
ME  0.346    KY  0.133
NH  0.339    NC  0.122
MA  0.338    GA  0.113
WI  0.333    IN  0.103
IL  0.326    SD  0.101
VA  0.299    TX  0.101
MS  0.279    MO  0.089
NJ  0.274    IA  0.050
SC  0.250    ND  0.043
TN  0.236    OK  0.043
PA  0.211    KS  0.040
MN  0.209

These data are drawn from Saiz (2010). County level estimates were weighted by population in 1960 to arrive at state-level averages. These data are used in Table 3 in the text.

APPENDIX TABLE 8
Inequality Impacts of Convergence and its Demise

Panel A: Inequality Counterfactual without Convergence (1940-1980)

Std Dev of Log Hourly Earnings -- Full-time Males
          State (a)   Total (b)
1940        0.300       0.781
1950        0.227       0.672
1960        0.183       0.580
1970        0.147       0.600
1980        0.106       0.618

Convergence (1940-1980) (c)                                              65%

1980 No Convergence Counterfactual: SD [Y + Ystate1940*(1-0.35)] (d)     0.674

Inequality
1980 Observed - 1940 Observed                                            -0.163
1980 No Convergence Counterfactual - 1940 Observed                       -0.107
Share of Inequality Accounted for By Convergence                         34%

Panel B: Inequality Counterfactual if Convergence Continued (1980-2010)

Std Dev of Log Hourly Earnings -- Full-time Males
          State       Total
1980        0.106       0.618
1990        0.125       0.622
2000        0.098       0.643
2010        0.115       0.678

2010 Convergence Counterfactual: SD [Y - Ystate1980*(1-0.35)] (e)        0.674

Inequality
2010 Observed - 1980 Observed                                            0.060
2010 Convergence Counterfactual - 1980 Observed                          0.056
Share of Inequality Accounted for By End of Convergence                  8%

Sample uses hourly earnings for men ages 18-65 with nonallocated positive earnings, who worked at least 40 weeks last year and at least 30 hours per week in the Census. Sample is winsorized at the 1st and 99th percentile in order to limit the influence of outliers.

a. Population-weighted standard deviation of mean state-by-year log hourly earnings.

b. Standard deviation of log hourly earnings. Conceptually, this measure includes both state-level and residual variation in earnings.

c. Convergence = 1-SDState1980/SDState1940. Note that this measure uses hourly earnings, and is different from the measure of Convergence developed in Appendix Table 1, which uses per capita income.

d. Rather than using observed state income in 1980, we predict state income using 1940 state income and the observed convergence rate of 65% to calculate Ystate1980hat=0.35*Ystate1940. We characterize the counterfactual distribution of earnings in the absence of state income convergence as Y + Ystate1940 - Ystate1980hat.

e. Method follows note (d), except that we calculate the counterfactual with convergence as Y - Ystate1980 + Ystate2010hat.
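The summary rows of Appendix Table 8 can be retraced from its printed entries. The Python sketch below applies the convergence formula from note (c). For the "share accounted for" rows it uses an assumed formula, inferred from the table rather than stated in it: the part of the observed change in inequality that disappears under the counterfactual, divided by the observed change. Variable and function names are illustrative.

```python
# Entries copied from Appendix Table 8 as printed (rounded to three decimals).
SD_STATE = {1940: 0.300, 1980: 0.106}
SD_TOTAL = {1940: 0.781, 1980: 0.618, 2010: 0.678}
NO_CONVERGENCE_1980 = 0.674  # Panel A counterfactual SD
CONVERGENCE_2010 = 0.674     # Panel B counterfactual SD


def convergence(sd_start, sd_end):
    """Note (c): Convergence = 1 - SD_end / SD_start."""
    return 1 - sd_end / sd_start


def share_accounted_for(observed_change, counterfactual_change):
    """Assumed formula (inferred, not stated in the table): share of the
    observed change that is absent under the counterfactual."""
    return (observed_change - counterfactual_change) / observed_change


conv_1940_1980 = convergence(SD_STATE[1940], SD_STATE[1980])

# Panel A: 1940-1980, counterfactual without convergence.
observed_a = SD_TOTAL[1980] - SD_TOTAL[1940]
counterfactual_a = NO_CONVERGENCE_1980 - SD_TOTAL[1940]
share_a = share_accounted_for(observed_a, counterfactual_a)

# Panel B: 1980-2010, counterfactual with continued convergence.
observed_b = SD_TOTAL[2010] - SD_TOTAL[1980]
counterfactual_b = CONVERGENCE_2010 - SD_TOTAL[1980]
share_b = share_accounted_for(observed_b, counterfactual_b)

print(f"Convergence 1940-1980: {conv_1940_1980:.0%}")
print(f"Panel A share: {share_a:.0%}")
print(f"Panel B share: {share_b:.0%}")
```

With the rounded entries, this reproduces the 65% convergence figure and the 34% share in Panel A. For Panel B the rounded entries give roughly 7% rather than the printed 8%, presumably because the table's figure was computed from unrounded values.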