<<

The Birth and Persistence of Cities: Evidence from the ’s

First Fifty Years of Urban Growthú

John C. Brown and David Cuberes

May 2020

Abstract

This paper examines the influence of first- and second-nature forces on the process of city formation in Oklahoma from 1890 through 1930. The natural experiment oered by the opening up of previously unoccupied land for settlement in in 1889 and the rapid initial settlement reveals that population density was highest near railroads and pre-existing sites of (modest) habitation. Persistence or second-nature forces dominate in explaining population growth over the first decade of settlement. The oil boom of 1900 to 1930 altered the productivity potential of many locations in Oklahoma. Oil shocks increased population in those areas most profoundly aected. We also find evidence that the eect of the shock is larger in areas with higher initial population density and places with a higher initial population.

Keywords: city formation; urban agglomerations; locational fundamentals; natural experiment

JEL Codes: N9, R1

1 Introduction

Urban economists have long been interested in understanding the origin of urban agglomerations. As Fujita et al. (1999) show, in a featureless plane, without some sort of increasing returns, there should be no reason why population would optimally choose to agglomerate in a specific location. On the other hand, there is plenty of evidence that human settlements have long been very much influenced by the natural advantages like proximity to the sea or to a river or moderate temperatures.

The literature has defined first nature forces (also known as locational fundamentals) as attributes specific to a location but not associated with human settlements that make some locations more appealing than others (Davis and Weinstein, 2002). From a dierent perspective, the Heckscher-Ohlin model posits that relative factor endowments (e.g. fertile soil, deposits of minerals, or access to waterways) dictate comparative advantage and hence relative city size. Second nature forces have been defined as the advantages to a location that derive from an agglomeration of population and can include external economies of scale arising from localization or urbanization economies. The distinction between first-nature and second-nature forces is a

úBrown: Department of Economics, Clark University. E-mail: [email protected] Cuberes: Department of Economics, Clark University. E-mail: [email protected];. We thank Jesus Fernandez-Villaverde, Paolo Martinelli and seminar partici- pants at the University of and Oklahoma State University. PRELIMINARY, PLEASE DO NOT CIRCULATE.

1 crucial one for urban economists but, as we argue below, previous attempts to separate them empirically have not been very successful.

There is some limited evidence from the European colonization of America that first-nature forces mat- tered for initial settlements. However, this specific natural experiment is not very clean. The process of settlement extended over several centuries, which makes the task of isolating the importance of first-nature factors vis-`a-vis other confounding factors a daunting one.

This paper examines the importance of first and second nature influences on the development of Okla- homa’s urban system during the first 50 years of white settlement. It exploits two unusual features of that history, including the extremely rapid pace of urban development during the period of initial settlement (1889 to 1907) and the dramatic impact of highly localized positive productivity shocks during the oil boom years (1900 to 1930). The final section will examine a third episode: the devastating impact of negative shocks to first nature fundamentals during the era (roughly 1930 to 1940).

Oklahoma’s history of urban development began with the , when as many as 40,000 Euro-Americans competed for a farmstead in the area known as the ”.” Figure 1 indicates that this virtually uninhabited area (about the size of ) of former that at the time of settlement was surrounded by Indian Territory inhabited by native peoples. During the less than two decades following the first Land Rush the remaining area of Indian Territory (about 23 times the initial Unassigned Lands) had been opened up to Euro-American settlers following the allotment of a small share of the territory to native people. Each opening up of territory prompted a rapid in-flow of Euro-American settlers, compared with the more measured pace that had been observed in neighboring states prior to the “closing” of the in 1890. By the late 1890s, the area of the Territory of Oklahoma included over almost 1,000 places (cities, towns, villages, crossroads), where there had been only a handful of isolated stops and railroad depots prior to 1890. The subsequent two historical periods featured spatially-identifiable localized shocks to productivity.

The three periods of Oklahoma history oer several advantages to identification of the importance of first- versus second-nature influences on urban growth. During the first Land Rush period, there was no role for second nature forces of prior agglomerations since the Unassigned Lands were practically uninhabited up until 1889 and almost entirely settled by mid-1890. This natural experiment allows us to estimate in a very clean way the importance of first nature forces on town formation. Second, the shock was very short-lived since most of the territory was claimed a few days after it was given to settlement.1 The second period (the oil boom) features spatial and temporal variation in positive shocks to local productivity in the context of what was still a very young urban system. The boom vaulted Oklahoma to the top oil-producing U.S. state for a period and resulted in remarkable urban growth in some areas, even as others showed signs of stagnation. We can also investigate the interaction of first and second nature forces by examining whether larger agglomerations benefited more in the longer term from the impact of an oil shock. The final section of the paper will examine the impact of a strong negative shock to first nature—–the Dust Bowl of the 1930s—–on Oklahoma’s towns and cities up through 1940.

The rest of the paper is organized as follows. Section 2 provides some historical context for the land

1For example, the settlement of the Americas by Europeans, started in the early 16th century and continued for several more centuries. During that very long period of time they were likely many time-varying factors that aected the process. Bosker and Buringh (2017) consider early urbanization in Europe but again this was a very drawn-out process.

2 rush and the oil boom. Section 3 reviews the literature on attempts to identify the importance of first and second nature forces on the urbanization process. Section 4 describes the data used in the analysis. Section 5 presents results on the role of first nature influences and population persistence on the density of settlement during the first decade (1890 to 1900). Section 6 presents a simple model of urban city size that allows us to study the impact of productivity-enhancing first-nature shocks on dierent types of agglomerations. In section 7 we show the results of oil boom shocks during the 1900 to 1930 period on on urban growth. Section 8 concludes the paper.

2 Historical context

Our analysis of the influence of first and second nature forces on Oklahoma’s urban development over the first half-century of Euro-American settlement takes account of three distinct periods in the state’s history. The first period (1889 to 1907) includes the almost two decades of transformation that took a territory that was once the home of native peoples from across the continental to a state that was admitted to the Union in 1907. The second period (1900 to 1930) involves an oil boom, which vaulted Oklahoma into the ranks of the most important oil-producing states in the union. The third period (1930 to 1940) was one of economic reversal, as drought and the consequences of destructive agricultural practices resulted in soil erosion and rural depopulation. Between the 1820s and 1889, Oklahoma became home to as many as 40 tribes of native people by a process of successive ”removals” and resettlement as Euro-Americans pursued a relentless drive to settle first the area east of the , and then lands to the north of Oklahoma ( and ). The tribes came from as far east as and and as far west as Oregon and Washington. Most prominent among the tribal groups were the five ”civilized” tribes, who had origins in the southeastern United States and who practiced agriculture with settled communities and full systems of self-governance (see Figure 2).2 Pressure by settlers outside of Oklahoma and railroad interests intent on generating revenue from lines running through the Indian territory led to the first of a sequence of Congressional Acts opening up land to settlement by Euro-Americans. The so-called ”Unassigned Lands” in Figure 1 were at the time of settlement unoccupied. They were settled in a very short period of time during the first Oklahoma ”Land Rush.” After 1889, each subsequent opening up of new land to Euro-American settlers involved allotting a share of the land as tracts to the native people inhabiting the land and then a ”” (as in the first land rush) or a process of allocating land by lottery. As Figure 2 indicates, by 1900, most of the and northern part of former Indian Territory was incorporated into the new Territory of Oklahoma. Lands in the southwest and in the northeast were incorporated into the by successive treaties between 1900 and 1906. The final step was the formal incorporation of territory occupied by the civilized tribes that had farmed their territories for several decades.

For the first land rush, Bunky (1890) reports that more than 40,000 waited on the borders of the “promised land” on Monday, April 22, 1889 for their share of the two million acres (8,000 km2) in the Unassigned Lands.3 As was the practice throughout the west and midwest since the enactment of the Ordinance of 1785, the land in the area opened up for settlement had been surveyed and divided into about 12,000 ”quarter sections” of 160 acres (about 64 hectares) each. Thirty-six complete sections made up a surveyed ”township” of thirty-six square miles designated by the range (east or west) and town (north or south). Figure 3 illustrates the results

2These five tribes were the , , , , and Creek. 3The area included all or part of present-day Canadian, Cleveland, Kingfisher, Logan, Oklahoma, and Payne counties.

3 of this practice for the case of Garfield County. Conducted under the terms of the Homestead Act of 1862, a claim for a maximum of 160 acres could be made by any person aged 21 or older, male or female. To ”prove” the claim and receive title, the settler had to build a habitation and well on the land, plow it, and live there for at least the first three years after filing the claim. The settler also had the option to buy the land after six months of residency for $1.25 per acre.4 By the end of April 22, most of the available quarter-sections had been claimed.

In the wake of the land run, a few cities and town centers sprang up overnight. The future site of the state’s largest city, , consisted of only seven buildings on April 21: the train depot, a section house, a post oce, a government building, the home of the railway agent, a boarding house, and an old stockade used by a stage company for an oce. Main Street of what it is today Oklahoma City was the base of operations and in a very short time the best lots on the street were taken. “Every train brought great numbers of new citizens from all parts of the country and the new city was booming by the second night of its existence... While this adjustment of lots was going on in the southern part of the city, hundreds of citizens were settling in the north part upon the lines of the Seminole survey and those who could not find any place to settle, were talking loud, holding meetings, and denouncing everything and everybody.” (Bunky, 1890). Similar scenes played out in the larger towns of Guthrie, Kingfisher, and Norman.

The initial year or two of settlement involved substantial churning of settlers, since the land run occurred with inadequate time for a homesteader to prepare a field and then plant it. Buck (1907) reports that a drought during the summer of 1889 also meant that for the first year, settlers were dependent upon food shipments brought in by the U.S. Army. However, subsequent years were beneficial for agriculture. Bohanon and Coelho (1998) report that the average value of farmland rose from $5 to $25 per acre between 1890 and 1910. They estimate that entering one of the Oklahoma land contests could allow an unskilled laborer to obtain an asset worth between $276 and $388, or between 74% and 103% of a year’s income. So it is fair to say that, while risky, there was a high payofor those who were able to overcome the risks of homesteading.

The next period of Oklahoma’s history with a profound impact on urban development was the 1900 to 1930 oil boom, during which Oklahoma rose rapidly to prominence as a leading oil-producing state (Boyd, 2002 and Northcutt, 1985). The discovery of oil first took place in 1897 in an area under control of the Osage tribe, although the first productive well was not in place until after 1900. Various parts of the then- Indian territory and a few locations elsewhere experienced discoveries. As Figure 4 illustrates, the opening up of new oil fields got underway in the early 1900s, with a remarkable surge of new discoveries of oil fields during the years around World War I. During the period 1907 to 1923, Oklahoma ranked as the number one oil-producing state. Discoveries tapered oafter the mid-1920s.

The map in Figure 5 shows the distribution of oil discoveries during the four intercensal periods for which population data are available. The area just north of the Indian Territory saw the first major new drilling prior to statehood. In the period from 1907 to 1909, new discoveries were primarily in the area of the former Creek nation. The period 1910 to 1919 saw an expansion to much of the former Creek nation and again north into the former Osage territory. Finally, the last wave of discoveries were clustered further to the east in former Osage territory and reached for the first time into the former Seminole nation (Seminole County). Welsh et. al. (1981) provide detailed documentation of the boom in Seminole County and adjacent areas to the east in Pottawatomie County. Many communities saw their population increase up

4The three years’ required habitation was less than the five years typical for areas settled under the Homestead Act.

4 to seven-fold from 1920 to 1930. Other regional studies document similar tales of villages that experienced an extraordinary boom followed by population decline (or even extinction).5 But the pattern of boom and then bust was not uniform. A key issue for our understanding of this period is whether this short-term boost to local productivity interacted with existing productivity advantages to provide more lasting impacts to the productivity of particular locations.

The final period of Oklahoma’s urbanization history (1930 to 1940) was influenced by the collapse in oil prices during the Great Depression and devastation to agricultural production caused by drought and subsequent soil erosion. Cronin and Beers (1937) document the impact of a series of droughts on the Great Plains states and Lee and Gill (2015) provide a discussion of competing explanations for the severity of erosion from wind found in the areas most dramatically aected by the Dust Bowl. According to reports of the period, severe wind erosion was confined to the far western part of the state and the panhandle. Reports of the period note that soils in the central and eastern part of the state were also subject to ”sheet erosion,” which occurs when heavy rains fall on powdery soils that were already dry from extended drought. Along with the direct and indirect impacts of droughts, the ability of rural areas in central and to support large rural populations suered from the mechanization of cotton growing that substantially reduced the requirements for labor (Holzschuh and Mills, 1939, pages 28-29.) These changes in the technology of production as well as the weather shocks contributed equally to the potential for rural depopulation. 6.

3 Related Literature

Our paper is directly related to several strands of the literature. First, it directly links to the literature that has quantified the importance of first-nature forces in determining location choices. Several authors have identified important first-nature variables for both consumers and firms (Rosenthal and Strange, 2004): good weather (Rappaport, 2007), factor endowments (Ellison and Glaeser, 1999; Kim 1995), ports (Bleakley and Lin, 2002), proximity to the sea (Cuberes et al., 2019), proximity to coal mines (Fernihough and O’Rourke, 2014), and locational fundamentals (Davis and Weinstein, 2002) among others. For example, Ellison and Glaeser (1999) use industry data to show that the percentage of agglomeration that is predicted by the natural advantage proxies is roughly 20%. Henderson et al. (2018) use lights data and find that geographical characteristics explain 47% of worldwide variation and 35% of within-country variation in income around the world. Most of these studies focus on firms’ data whereas we use data on individual’s location choices.7 Moreover, almost no study has exploited a natural experiment to analyze the importance of first-nature forces.

The paper is also related to the literature on endogenous growth arising from agglomeration economies that predicts strong persistence and path-dependence in urban agglomerations (Davis and Weinstein, 2002; Bleakley and Lin, 2012; Liu, 2015; Desmet and Rappaport, 2015; Bazzi et al., 2019; Fl¨uckiger et al, 2019).

Our examination of an exogenous positive shock to the persistence of urban agglomerations links our work to papers that have also explored productivity shocks. Their eorts focus on negative shocks to productivity, including the eects of World War II bombings (Davis and Weinstein, 2002; Brakman et al., 2004) and the eects of conflicts that took place in medieval Europe (Cuberes and Gonz´alez-Val, 2016).

5See the series of studies led by geographers at Oklahoma State University: Carney(1981), Carney(1985) and Carney(1986). 6See the results of the Farm Security Administration discussed in Holzschuh and Mills (1939, Fig. 3 page 20). 7See Rosenthal and Strange (2004) for a review of these papers.

5 Our paper takes advantage of a unique quasi-natural experimental circumstances of Oklahoma’s early urban history. The Land Run of 1889 is the first example, since it circumvents two problems associated with work similar to Bosker and Buringh (2017). First, the issue of pre-existing population is solved by studying the location choices of the first settlers of a vast territory (the Unassigned Lands). Since by historical accident there were practically no settlements, institutions or other second-nature features in this territory at the moment of the Land Run, our estimates capture pure first-nature eects. We are not aware of any other paper that can make this claim. Second, our shock, namely the Land Run of 1889 is very short-lived, in the sense that, in a matter of days or weeks at best, the totality of the Unassigned Lands became occupied. This reduces the scope of confounding variables or shocks that can explain this process. The second example are the oil discoveries of 1900 to 1930, which represent positive highly localized shocks to productivity in a setting that roughly holds constant many other features of the Oklahoma urban system. This example oers a contrast to other studies focused on negative shocks to productivity, which typically involve widespread wartime destruction or dislocations that were likely to have aected many aspects of the urban system aside from the productivity benefits of particular locations.

4 Data Description

The data used for this paper include estimates of population for two geographies for the period 1890 to 1930, geo-referenced information on the first nature forces that existed prior to the first wave of settlement in 1889, and geo-referenced data on all discoveries of oil fields for the period through 1927. Since census data for smaller administrative units of Oklahoma Territory are not available until 1900, this study tapped three sources of information that for the first time are compiled into consistent geographies suitable for analysis. The first source draws on the individual observations of from the extant manuscripts of the 1890 territorial census that was conducted prior to establishment of formal territorial government in the Unassigned Lands. In addition, we used information on homesteads and households from the 1890 Smith’s First Directory of Oklahoma Territory. Finally, we consulted the recently digitized the land patent records of the Bureau of Land Management, which record all homestead grants. For subsequent years (1900, 1907, 1910, 1920, and 1930), data for the population of towns (places) and townships (rural unincorporated areas) are available from the territorial censuses of 1900 and 1907, a special census of the Indian Territory in 1907, and the decennial censuses of 1910 through 1930.

This study uses three ways of organizing the geography of population agglomeration. First, all of the population data have been mapped into a spatially-consistent GIS of about 710 ”administrative townships.” The GIS is based upon the 988 civil townships that existed in 1934 and constructed so that the population data applies to the same geographic region across time. Since the GIS allows a calculation of the area of the administrative townships, the most relevant measure that covers all of the state of Oklahoma is the population density of each township.8 The data on population available from incorporated towns and cities are mapped in a separate ”place” GIS. There were about 515 such places ca. 1930. Finally, a third township-based GIS was developed for the unique circumstances of the 1890 census, for which data could only be geo-referenced according to the town-range surveyed townships required by the Ordinance of 1785. As the map of Garfield County found in Figure 3 illustrates, the surveyed townships of 36 sections were

8We note also that for a place to become incorporated and thus included in the census involved a political process and was often a response to population growth. Many small town centers would never appear in the census.

6 often reconfigured into larger or smaller civil townships for reasons of geography or convenience. The map also illustrates one shortcoming of the ”place” data: neither territorial nor decennial censuses provide data on the area of places.

The manuscript 1890 Oklahoma census has information from a comprehensive index of homestead files and was taken in June of 1890. The data in the manuscripts are arranged by the county and then the enumerators notation of the town and range or the place of residence of each enumerated individual. As Womack(1986) documents, this census was not without its shortcomings. Multiple towns and ranges are given on several of the 1200-plus pages of the census, and it appears that manuscripts for two or three survey townships have gone missing. To fill in these gaps, we used two directories published by James W. Smith in 1890 that covered places (the ”urban directory”) and all of the homesteads recorded as of August, 1890. In addition, the homestead patents available from the Bureau of Land Management allowed identification of the town and range of households.9.

Our specification of first-nature forces includes three variables descriptive of the geography and topog- raphy of Oklahoma that may have influenced patterns of settlement and agglomeration. The first is the rivers, which served as both transportation routes (to the extent they were navigable) as well as points of trans-shipment. The second is the soil suitability index for maize developed by Galor and Ozak¨ (2016). The index provides a measure of the potential crop yields for the most important crop planted by settlers in Oklahoma, since it was used for both food and as a feed grain. The third variable is the terrain ruggedness index developed at one kilometer of resolution and published by Amatulli, et. al. (2018).10 The map of the Unassigned Lands found in Figure 6 illustrates the east-west course of the main rivers and as well as the moderate east-west gradient of soil suitability.

First nature for the Unassigned Lands also includes the transportation routes and nodal points that existed prior to April, 1889. The trails and roads in Figure 6 include a major north-south stage coach route along the western edge of the unassigned lands and three well-known cattle trails for driving cattle from pastureland in the northern part of the United States down to Texas: the Abilene trail, , trail).11 In the eastern part of the Unassigned Lands, the Sante Fe railroad had built a rail line connecting Kansas with Texas that was completed in 1887. We georeferenced the roads, trails, and railroads using several contemporary maps based upon the surveys of United States Geological Survey.12 Finally, we include initial minor agglomerations. These were a few railroad depots, stage coach stations, and (which was the site of an Indian Agency and lay just outside of the Unassigned Lands.) These places had at most a few hundred people. Table 1, column (1) includes the means and standard deviations for these variables on the basis of the survey townships. One-half of the Unassigned Land townships had a (shallow) river flowing them. The ruggedness of terrain varied across the area. The variation in maize productivity

9In the small number of cases where the original 104 survey townships could not be directly mapped into the 71 administrative townships for the Unassigned Lands, we prorated the population by area 10These variables were projected onto the maps of the survey and administrative townships. The median value from the projection was assigned to the township. 11Cattle drive trails were routes that were used for moving cattle from pastureland in northern areas of the United States down to Texas; they had been present in Oklahoma well before 1890. We have identified three main cattle trails that cross the state from North to South: the first one (the one to the left in Figure 3) is the Abilene Cattle trail. This trail run from Kansas to Texas and was founded in 1867 by Joseph McCoy. The second one (in the middle), is the 1,200-mile historic Chisholm Trail and is known as the world’s greatest cattle trail. The Chisholm Trail, founded by Jesse Chisholm, a Scottish-Cherokee cattleman, operated between 1867 and 1885 and it ran from Abilene, Kansas to South Valley, Texas. The third trail (the one on the right) is Shawnee Cattle Trail, which was in operation between 1840 and 1872. 12As asserted in Glaeser and Kolhase (2004), “Because roads and rail were rare and costly, every large city in 1900 was located on a waterway.”

7 was relatively limited. Only about ten had access to a railroad depot, although the network of roads and trails apparently served the area well.

5 First- and Second Nature during the First Land Rush Period: 1890-1900

We are interested in exploring two main questions: first, what was the initial pattern of settlement and early town formation for the first wave of settlers in 1889? Did they settle in areas with favorable first nature characteristics (rivers, railroads, cattle trails, etc.), or did they mostly chose their location without taking these factors much into account? Second, how did the urbanization process evolve in the following decade? Did new migrants and towns locate near where the first settlers resided?

Our empirical strategy to answer these questions consists of two stages. First we use cross-section data on the first two available census of population (1890 and 1900) in Oklahoma after the land rush of 1889 to explore two spatial issues: the dispersion of population in 1890 and to what extent it can be explained by specific geographic features. Then we explore the persistence of this dispersion between 1890 and 1900 for the 71 administrative townships of the Unassigned Lands. Our exploration of the role of the positive productivity shock of the oil boom allows us to pursue the question of persistence with the full panel of townships from 1907 onwards.

5.1 The importance of first-nature forces or geography

We begin by documenting the dispersion in population in the Unassigned Lands as of 1890, a few months after the first Oklahoma Land Rush. We explore the importance of being close to a river, a road or an existing railway to determine the location of this sudden wave of immigration. Based upon the GIS, we created three dummy variables for whether a survey or administrative township included a river, a railroad, a railroad depot, or a place. As noted above, we projected the soil suitability index for maize and the terrain ruggedness index onto the survey and administrative townships. We study the eect of first-nature forces by estimating the following regression in 1890:

ln(Densityi,1890)=– + “ÕXi + Ái (1)

We estimate this equation for the survey townships and the ”panel-consistent” administrative townships. Table 1 shows the results of estimating this equation using OLS for the 104 survey townships and and Table 2 shows the results using the 71 administrative townships. As a quick review of the results for both geographies suggests, the physical geography played only a minor role the measured density of population. Access to transportation (a railroad ran through the township) played an important role, as did the presence of a ”place.” Since most places were also railroad depots, the results are consistent with Anas et al (1998), who show that major cities, such as Oklahoma City, , Omaha, and , grew up around rail nodes and developed compact CBDs centered on rail terminals. Since the railroad depots were built in the Unassigned Lands prior to settlement, they can can be considered first-nature and reinforce the notion of causality leading from transportation access to population growth.This result is also consistent with Bairoch (1988), who claims that transportation advantages were by far the most important first nature force to

8 explain the spatial pattern of urbanization in Europe, rather than defensive advantages, climate, or resource abundance. Interestingly, being crossed by a river matters to some extent, but it’s not a robust eect. The associated coecient loses significance when we add the railroad dummy or the maize productivity index. Initial town sites also attracted population. Overall, the first nature variables account for up to 23 percent of the variation in population (density).

5.2 First-nature vs. second-nature influences in 1900

By 1898, the area of the Unassigned Lands included approximately 92 towns, villages, or cities. In this section we study the eect the extent to which first nature forces rather than centers of population existing in 1890 reinforced the tendency for further urban development. To examine this question, we estimate the following regression between 1900 and 1890:

ln(Densityi,1900)=– + —ln(Densityi,1890)+“ÕXi + Ái (2)

We estimated this regression for the all of the 71 administrative townships that had existed since 1890. Those results are in Table 3. We also constrained the sample to exclude the townships that included the five largest towns and cities in 1890 to explore influences on population density outside of the main population centers in the territory. The results of this regression are displayed in Table 4. The results in Table three suggest that none of the first-nature forces found in Table 2 exerted any influence on subsequent growth. Instead, there is evidence of very strong persistence: a 10 percent higher population density in 1890 is asso- ciated with eight percent higher density in 1900. One may argue that these results are driven by the impact of the larger towns from the original wave of settlement in 1889-90, including Oklahoma City, Stillwater, and Guthrie. Table 4 shows that this is not the case: the results are qualitatively similar to those found in Table 3, even when the six administrative townships with a population at or above 1,500 are excluded. Dropping the upper tail of the distribution results in a halving of the R2 and a diminished quantitative impact of the lagged log of population, but the coecient still remains significant at all reasonable levels of Type I error. Our initial examination of the influence of first nature forces on early urban development in the Okla- homa Territory suggests that the main influences were the creation of transport nodes and the provision of transportation services, rather than natural features. The substantial persistence exhibited suggests that the initial decisions on those areas with urban development in 1890 continued to exert a strong influence on subsequent town formation. The focus on the Unassigned Lands is not without drawbacks. The degrees of freedom are limited, and it is possible that the geography displays inadequate variation to reveal impacts on urban development. Fortunately, Oklahoma’s early settlement history was characterized by an identifiable and dramatic series of productivity shocks after 1900. The discovery and exploitation of oil fields in locations across large parts of the south and eastern part of the state created a series of locally-centered productivity shocks that are ably documented in the compendium of with the discovery and then exploitation of oil in much of the south and eastern part of the state after 1900.

9 6 Benchmark model: Localized Productivity Shocks and Urban Growth

The analysis of the oil shock phase of urban development in Oklahoma benefits from Bullard (1928), which provides a compendium of the over three hundred oil fields discovered through 1927. To make the best use of these data on a productivity shock and explore the implications for first- versus second-nature urban growth, we develop a simple theoretical framework for our empirical analysis that extends Glaeser et al.’s (1995) approach. We consider an economy with I>1 cities populated by a large number of atomistic individuals who live an infinite number of periods and who can relocate from one city to another at no cost. Cities are treated as separate economies that share a common pool of mobile labor, so that dierences in city growth are not due to exogenous labor endowments. The assumption of mobile labor means that for cities to dier, there must exist dierences in their levels of technological productivity and quantity of land. Total output in city j at time t is given by the production function:

— “ Yit = Ait (Nit,Nit 1,‰it) NitLit ≠

where Y is output, N is population, and L is land. The function Ait captures the potential for both first- and second-nature forces to influence the city’s productivity, since it depends upon both current population, lagged population, as well as a potential productivity shock, ‰it, in a given location and time. The assumption that the productivity of city i is a function of its previous population draw’s on Allen and Donaldson (2019).

In particular, we assume the following functional form for the productivity term Ait:

¯ –1 –2 Ait = ›iAitNit Nit 1 ≠ where › A¯ ‰ . We further assume that – ,– ,—,“ (0, 1), – + —<1, – + – + — + “>1 and i it © it 1 2 œ 1 1 2 –1 + —<1 so that each city’s production function exhibits diminishing returns to each input but increasing returns to scale. The parameter ›i > 1 measures time-invariant first-nature physical attributes of city i that makes its location more productive (soil quality, proximity to a water body, proximity to a railroad, etc.).

Total utility of potential residents of city i is assumed to be simply a function of their labor income in that city at time t. Assuming that wages are competitively determined, we have:

¯ –1+— 1 –2 “ uit =(–1 + —)›iAitNit ≠ Nit 1Lit (3) ≠

In a spatial equilibrium, if individuals can freely move between cities, it must be the case that each individual’s utility level in any city equals her reservation utility, which may change over time. We denote this utility levelu ˆt.Sowehave

¯ –1+— 1 –2 “ uˆt =(–1 + —)›iAitNit ≠ Nit 1Lit (4) ≠

Taking logs and rearranging:

10 1 –2 “ lnNit = ◊ ÷i ”t + lnA¯it + lnNit 1 + lnLit (5) ≠ ≠ 1 – — 1 – — ≠ 1 – — ≠ 1 ≠ ≠ 1 ≠ ≠ 1 ≠

ln(–1+—) 1 where ◊ = 1 – — , ÷i = ln›i and ”t = 1 – — lnuˆt. ≠ 1≠ ≠ 1≠ We can now model technological shocks more explicitly. Assume that any location can experience a technological shock at any period. In our context we will consider the possibility that oil is found at or near a given city. As explained above, the productivity shock is measured by ‰it and we use the indicator zit that takes a value of one if city i is aected by it at t and zero otherwise. The eect on city i of a shock in city j depends negatively on the distance between the two locations d where we assume d > 0, i = j and ij ij ’ ” dii = 0:

I „zijtdij A¯it = zijte≠ (6) j Ÿ=1

where > 1 and „>0. Under this formulation, if city i itself experiences an oil shock at period t ¯ and no other city experiences an oil shock, city iÕs productivity is Ait = e . Also note that we make the assumption that shocks enter in a multiplicative way. The rationale for this is that the eect of an oil shock on a city’s productivity is larger if it is near other locations that also experienced an oil shock. The standard varieties of localization economies would motivate this benefit of proximity, including input-output linkages and advantages accruing to localized labor markets.

Inserting (6) into (5) we have

I I –2 “ lnNit = ◊ˆ ÷i ”t lnzijt „ zijtdij + lnNit 1 + lnLit (7) ≠ ≠ ≠ ≠ 1 – — ≠ 1 – — j j 1 1 ÿ=1 ÿ=1 ≠ ≠ ≠ ≠

ˆ ln(–1+—)+ln 1 where ◊ = 1 – — and ”t = 1 – — lnuˆt. ≠ 1≠ ≠ 1≠ The formulation in (7) suggests an initial specification for a regression that can be estimated with OLS:

I I

lnNit = Â0 + ÷i + ”t + Â1 zjt + zjtdijt + Â2lnNit 1 + Â3lnLit + Áit (8) ≠ j j ÿ=1 ÿ=1

where Á is a standard error term. A city’s population depends upon first-nature conditions specific to the locality (÷i), the economy-wide utility, the impact of localized shocks to productivity, and the persistence eect of lagged population.

11 7 Productivity Shocks, First-, and Second-Nature During the Oil Boom: 1900 to 1930

In this section we use the structure of the theoretical model of Section 6 to study the eect of a localized technology shock. To do so we exploit cross-sectional and time variation in the discovery of oil in Oklahoma.

There is a wide range of theoretical models showing how transport costs and agglomerations of population interact. The models do not always have the same prediction. In theory, lower transport costs can contribute to further agglomeration and hence divergence in populations in the urban system, or they can work for convergence.13

Yet in the presence of increasing returns to population, cities hit by a large productivity-induced popula- tion shock can reallocate to a dierent equilibrium. Additionally, the role of increasing returns can interact with changing fundamentals. Advantages of a location can lock in cities at locations with long-defunct ad- vantages as Bleakley and Lin (2012) found in their study of portages and urban development in the USA. Alternatively, a shock to first nature fundamentals could facilitate urban populations to spatially reallocate as they adapt to new configurations of locational advantages. One such example is that cities after the collapse of the Western Roman Empire in Britain reconfigured in coastal areas (Michaels and Rauch, 2017).

The historical literature on Oklahoma suggests that such a reallocation may have taken place as localities developed industries and skills complementary to the initial shock of oil discovery, but this hypothesis has not been subject to rigorous testing.14 As the map in Figure 5 reveals, the productivity as well as spatial distribution of these fields varied widely over time and space. This section exploits this spatial and temporal variation to examine the impacts of shocks to productivity of localities and explore the hypothesis that second nature forces (population) may have interacted with shocks to magnify their impact on subsequent growth. The empirical analysis focuses on the period 1910 to 1930 and looks at the experience of the 700+ administrative townships and over 300 places. We estimate regressions of this type:

ln(Di,t)=– + —1ln(Di,t 1)+—2Oili,t 1 + —3Oili,t 2 ln(Di,t 2)+“ÕXi + Á (9) ≠ ≠ ≠ ú ≠

where lnDt is the (log of) the current population density in census year t, lnDt k is the population density ≠ or population of the year t-k (k=1,2), Oili,t k (k=1,2) is the (log of) the spatially averaged productivity ≠ of nearby oil fields, and Xi is the vector of first nature influences defined in section 5. We interpret the discovery of oil as a shock to the first nature productivity potential of a township or place. We measure the presence and intensity of the oil productivity shock for the period prior to the census year (t 1upto ≠ but not including t). We lag the oil shock-population interaction by one period (to t 2) in the interaction ≠ term, ln(Dt 2) Oil, in order to overcome the high correlation of the impact of the shock in t 1with ≠ ú ≠ an interaction term including the same shock and the population density. The lagged value also allows us to test for whether the longer-term impacts of the productivity shock are augmented by the presence of an agglomeration.

13See See Fujita et. al. (1999), Baldwin and Martin (2004), and Desmet and Rossi-Hansberg (2014). 14See most notably Stacey(1956).

12 This specification also allows for a short-run “level” eect of a shock on township population as oil drilling and exploration attracted a population of additional workers. In addition, we test for a separate longer-run productivity eect as the township economy responds to opportunities to create supporting activities. For the first set of regressions, the full specification thus becomes

ln(Di,1910)=– + —1ln(Di,1907)+

—2Oili,1907 1909+ (10) ≠ —3Oili,1900 1906 ln(Di,1900)+“ÕXi + Á ≠ ú

The data available to us allowed us to employ a richer specification of the spatial/temporal productivity shock A¯it. In contrast to the model and regression specification expressed in (8), the specification of the productivity shock used in the estimation also incorporates the size of the shock as well as its geographic location. Bullard(1928) provides the data on the maximum and minimum from the well. As a first approx- imation, we used the average of the two. The quantity is expressed as the log of barrels per day. Mapping the quantities as well as the location in a GIS allowed us to use spatial interpolation techniques to capture the spatial decay of the discovery’s impact.15 Given that a standard survey township is almost 10 kilometers square, it is possible that dierent parts of an administrative township could be aected dierentially by the productivity shock. As a first approximation, we used the value of the shock that aected the largest share of the township as our measure of the productivity shock. For the parallel analysis of place data, we assigned the shock experienced by the survey township within which the place was located. We also note that the specification used departs from (8) in the treatment of land area. We standardized each administrative township by dividing the population reported in the census by the area so that the dependent variable is the log of the population density. For places, we lack information on the area of each place so that the control for area implied by the model is not included. Essentially we are assuming that this control is not strongly correlated with other right hand side variables.16

The period of our analysis is confined to the years 1910 to 1930 for the full model, which incorporates the two-period lags in population(density) and the productivity shock. The first year for which population data are available for all of the 700 administrative townships is 1910; the sample for 1900 includes only 312. Thus there is sample attrition for the full model and sample sizes will vary depending upon the year of analysis. We also chose to estimate (9) using both cross-sectional regressions for 1910 through 1930 as well as panel analysis.17 We also estimated (9) using data on places.

Tables 5 and 6 present the results of the cross-sectional regressions for administrative townships and places respectively. Given the importance of persistence in accounting for population in both geographies, the R2s rise substantially when the specification moves from the first column for each year, with its focus on first-nature influences (Xi), to the second column, when the specification includes the log of population or population density. It is noteworthy that when all 705 administrative townships are analyzed, first-

15We experimented with both kriging and inverse distance weighting to convert the spatial data into a raster surface with varying intensity reflecting the wide variation in oil well yields evident in Figure 5. To ensure that shocks of the same size are treated identically over the period 1900 to 1930, we constrained the radius of impact to 16 kilometers and assumed a decay parameter of 2. Alternative specifications yielded qualitatively similar results. 16Our main concern would be if the presence of a river or terrain ruggedness strongly constrained the land areas of places in Oklahoma. We suspect that is not the case, but this issue needs to be explored further. 17Specification tests suggested fixed eects estimation for places and random eects estimation for administrative townships.

13 nature influences can account for almost one-quarter of the variation in population density. The pattern of coecients on Ruggedness and Maize vary in an expected fashion. The years 1900 to 1919 reflected optimal production and demand conditions, which enhanced the importance of a township’s agricultural productivity (the coecient on Maize) for predicting population. The negative coecient on ruggedness for 1910 is consistent with this interpretation. By 1930, falling agricultural prices and less favorable weather conditions had apparently cancelled out the productivity advantage evident before 1920. At the same time, access to rail (from the configuration in 1887) was much more important.

The persistence eect is evident for all three years, although it is diminished by the regressions for 1930 population density. The oil shock in all three years (cols. (3), (7), and (11)) was strongly positive. Recalling that the oil shock is measured in logs, the point estimates suggest an elasticity of 0.02 to 0.03. The coecients on the interaction of the lagged shock and population density (Oilt 2xLPopDent 2) is suggestive ≠ ≠ that townships with a larger initial population would be more likely to continue to receive benefits from the productivity shock. The only year for which the interaction term is not significant is 1910, which could reflect the fact that the early discoveries for the most part took place in townships for which population data are not available in 1900.18 The results on population for the place dataset presented in Table 6 are in general in line with those presented in Table 5. The variables expressing first nature forces generally perform poorly, although the impact of emerging rural agricultural distress is evident in the results for 1930 with the negative coecient on Maize and positive coecient on Ruggedness. Places benefiting from the ideal demand and production conditions in the ’teens had a reversal of fortune during the 1920s. The coecients on the oil shock terms are very close to those estimated for administrative townships. The interaction term is only significant for population in 1920 and is of a smaller magnitude. This could be reflective of the sample selection issues inherent in using population data for places, many of which would not have been large enough to achieve incorporation prior to a local oil boom.19

The results of panel analysis found in Tables 7 and 8 largely reinforce the conclusions from the cross- sectional regressions. By definition, the four first nature forces employed in the analysis are time invariant. Since the cross-sectional regressions suggested there may be some scope for time-varying influences of these four variables, the panel analysis allowed for some variation in the coecients on these variables over the three years of the panel. Since the composition of the two panels changes with the inclusion of the interaction term (Oilt 2xLPopDent 2), the final three columns are estimated to ensure comparability across the three ≠ ≠ 20 specifications that include the oil productivity shock term Oilt 1. The results in both tables exhibit ≠ the pattern of varying fortunes of the agriculture economy evident in the cross-sectional results. Townships and places with the greatest productivity potential (flat land with a good potential for raising maize) saw a positive impact on population in the 1900s and ’teens with a reversal of fortune in the 1920s. The lagged influence of population and population density was significant in all cases.

In both sets of regressions, the impact of the oil shock is positive and significant. For both townships and places, the magnitude is similar to that reported in Tables 5 and 6 from the cross-sectional regressions. The time-varying coecient on the shock is suggestive. The impact was largest in the first decade of significant oil exploration. Only a few townships (in another part of the oil field) had experienced an oil boom prior

18The only exception appears to be the discoveries in Tulsa. 19The table in Welsh, et. al. (1981, page 12) illustrates several cases. 20Compare column (3) with column (6) in each table. The sample in column (6) excludes those townships in the former Indian Territory and in areas occupied by the Osage tribe, since population data are not available for those areas in 1900.

14 to 1907, so that the impact was quite substantial. By the next period (the shock of 1910-1919), new areas were opening up to drilling in some proximity to areas that had already seen exploration. The impact was much more widespread, but also a bit less likely to aect an area that had yet to experience an impact. The place data show a similar pattern of a fall-oin impact, but with a lag of one period. The coecient on the interaction term (Oilt 2xLPopDent 2) is significant for both townships and places. The size of the coecient ≠ ≠ is roughly consistent with the averaged experience of the cross-sectional regressions. The pattern of time variation is also consistent with the cross-sectional results. The impact was positive but not significant in 1910 (from the oil shock of 1900-1907), and larger and significant in 1920 and 1930.

8 Conclusions

This paper uses two natural experiments that aected the urbanization of Oklahoma to explore the impor- tance of first- and second-nature forces in influencing the pattern of urban settlement and growth. The first was the extraordinary and sudden pattern of settlement that resulted from the 1889 Oklahoma Land Run. It allowed us to examine how the process of town and city formation works in a previously unpopulated area. This natural experiment is an ideal environment to study city formation since, as of 1889, most of Oklahoma’s Unassigned Lands were unoccupied and because the migration to this territory occurred in a very short period of time. We found that second nature forces, or the power of persistence, played a key role in accounting for population growth during the first decade of settlement.

The second natural experiment was the rapid emergence of a highly productive oil industry in the state during the first three decades of the 20th century. Our analysis of highly localized shocks to productivity revealed strong impacts from the oil shocks, even as the evidence also suggested an important, if secondary, role for the first nature forces of Oklahoma’s topography. The analysis also uncovered a statewide influence of persistence. The population of a township or place in one census year was highly predictive of its population in the next year. Finally, and perhaps most intriguing, it appears that second nature forces (a location’s population) enhanced the impact of positive productivity shocks that altered the locational distribution of first nature advantages. We intend to explore this implication of our findings with a closer examination of the impact of the strong negative shock to productivity that resulted from the drought of the early 1930s and the subsequent spread of agricultural distress to the southern and eastern parts of the state.

15 References

[1] Allen, T. and D. Donaldson (2019), “The Geography of Path Dependence,” manuscript.

[2] Anas, A., R. Arnott, and K. Small (1998), “Urban spatial structure.” Journal of Economic Literature 36: 1426–1464.

[3] Allen, T., and D. Donaldson (2018), ”The Geography of Path Dependence,” working paper.

[4] Amatulli, G., Domisch, S., Tuanmu, M.-N., Parmentier, B., Ranipeta, A., Malczyk, J., Jetz, W. (2018). “A Suite of Global, Cross-Scale Topographic Variables for Environmental and Biodiversity Modeling.” Scientific data, 5, 180040-180040.

[5] Bairoch (1988), Cities and Economic Development: from the Dawn of History to the Present.

[6] Bazzi, S., M. Fiszbein, and M. Gebresilasse (2019), “Frontier Culture: The Roots and Persistence of ”Rugged Individualism” in the United States.” NBER Working Paper 23997.

[7] Baldwin, R., and Martin, P. (2004), “Agglomeration and Regional Growth.” In V. Henderson, J. Thisse (eds) Handbook of Regional and Urban Economics.

[8] Bleakley, H., and J. Ferrie (2016), “Shocking Behavior: Random Wealth in Antebellum Georgia and Human Capital Across Generations,” Quarterly Journal of Economics, March.

[9] Bleakley, H., and J. Ferrie (2014), “Land Openings on the Georgia Frontier and the Coase Theorem in the Short and Long Run,” manuscript.

[10] Bleakley, H., and J. Lin (2012), “Portage and Path Dependence.” Quarterly Journal of Economics, 127(2): 587–644.

[11] Bohanon, C. E., and P. R.P. Coelho (1998), “The Costs of Free Land: the Oklahoma Land Rushes,” Journal of Real Estate Finance and Economics, 16(2): 205-221.

[12] Bosker, M., E. Buringh (2017), “City seeds: Geography and the origins of the European City System,” Journal of Urban Economics, Volume 98, Pages 139-157.

[13] Boyd, D. T. (2002), “Oklahoma Oil: Past, Present and Future. Oklahoma Geology Notes,” 62(3), 96-106.

[14] Brakman, S., H. Garretsen, and M. Schramm (2004), “The Strategic Bombing of German Cities During World War II and its Impact on City Growth,” Journal of Economic Geography 4 (2): 201-218.

[15] Buck, S. J. (1907). The Settlement of Oklahoma. (Masters). , Madison, WI.

[16] Bullard, B. M. (1928). “Digest of Oklahoma oil and gas fields.” Oklahoma Geological Survey, Bulletin, 40, 101-276.

[17] Bunky (1890), “The First Eight Months of Oklahoma City,” The McMaster Printing Company.

[18] Carney, G. O. (1981). Cushing Oil Field Historic Preservation Survey. Stillwater, OK, Oklahoma State University.

[19] Carney, G. O. (1985). Energy South Central Oklahaoma 1900-1930. Stillwater, OK, Oklahoma State University.

16 [20] Carney, G. O. (1986). Energy Northwest Oklahoma. Stillwater, OK, Oklahoma State University.

[21] Cronin, F. D. and H. W. Beers (1937). Areas of intense drought distress, 1930-1936, Works Progress Administration.

[22] Cuberes, D, and R. Gonz´alez-Val (2017), “History and Urban Primacy: The Eect of the Spanish Reconquest on Iberian Cities.” Annals of Regional Science, May, vol. 58, Issue 3, pp. 375-416.

[23] Cuberes, D., K. Desmet, and J. Rappaport (2019), “Urban Growth Shadows.” Federal Reserve Bank of Kansas City, Research Working Paper no. 19-08, November.

[24] Davis D. R., and D. E. Weinstein (2002), “Bones, Bombs, and Break Points: The Geography of Economic Activity.” American Economic Review, 92(5): 1269–1289.

[25] Desmet, K., and J. Rappaport (2017), “The Settlement of the United States, 1800-2000: The Long Transition to Gibrat’s Law,” Journal of Urban Economics, 98, 50-68.

[26] Desmet, K., and E. Rossi-Hansberg (2014), “Spatial Development,” American Economic Review, 104, 1211-43.

[27] Ellison, G. and E. L. Glaeser (1999), “The Geographic Concentration of Industry: Does Natural Ad- vantage Explain Agglomeration?,” American Economic Review, 89 (2), 311–316.

[28] Ertan, A., Fiszbein, M., and Putterman, L. (2016), “Who was Colonized and When? A Cross-Country Analysis of Determinants.” European Economic Review, 83, 165-184.

[29] Fernihough, A., and K. H. O’Rourke (2014), “Coal and the European industrial revolution.” NBER Working Paper No. 19802.

[30] Fl¨uckiger, M., E. Hornung, M. Larch, M. Ludwig, A., and A. Mees (2019), “Roman Transport Network Connectivity and Economic Integration,” CESifo Working Paper Series 7740, CESifo Group Munich.

[31] Fujita, M. Krugman, P. and Venables, T. (1999), The Spatial Economy. MIT Press.

[32] Galor, O., Ozak,¨ O.¨ (2016). “The Agricultural Origins of Time Preference.” American Economic Review, 106(10), 3064-3103.

[33] Glaeser E. L., and Kolhase, J. E. (2004), “Cities, Regions and the Decline of Transport Costs.” Papers in Regional Science 83, 197–228.

[34] Glaeser, E. L., J. A. Scheinkman, and A. Shleifer (1995), “Economic Growth in a Cross-Section of Cities.” Journal of monetary economics 36(1): 117-43.

[35] Henderson, J.V. T. Squires, A. Storeygard, and D. Weil (2018), “The Global Distribution of Economic Activity: Nature, History, and the Role of Trade,” Quarterly Journal of Economics, Volume 133, Issue 1, February 2018, Pages 357–406.

[36] Hoig, S. (1984), The Oklahoma Land Rush of 1889. Hardcover. June.

[37] Holzschuh, A. and O. Mills (1939). A study of 6655 migrant households receiving emergency grants, Farm Security Administration California, 1938. .

17 [38] Kim, S. (1999), “Regions, Resources, and Economic Geography: Sources of US Regional Comparative Advantage, 1880–1987,” Regional Science and Urban Economics, 29 (1), 1–32.

[39] Lee, J. A. and T. E. Gill (2015). ”Multiple causes of wind erosion in the Dust Bowl.” Aeolian Research 19: 15-36.

[40] Lin, J. (2015), “The Puzzling Persistence of a Place,” Federal Reserve Bank of Philadelphia Business Review, Q2, 1-8.

[41] Liu, S. (2015), “Spillovers from Universities: Evidence from the Land-Grant Program.” Journal of Urban Economics, Volume 87, 25-41.

[42] Michaels, G. and F. Rauch (2017), “Resetting the urban network: 117–2012,” The Economic Journal, 128 (608), 378–412.

[43] Northcutt, R. A. (1985). ”Oil and Gas Development in Oklahoma, 1891-1984.” Shale Shaker 35(6): 313-322.

[44] Oklahoma Geological Survey. (2019). Field Discovery Wells. Retrieved from: http://www.ou.edu/ogs/data/oil-gas

[45] Rappaport, J. (2007), “Moving to Nice Weather,” Regional Science and Urban Economics, 37(3): 375- 398

[46] Raz, I. T., and R. Mattheis (2019), “There’s No Such Thing As Free Land: The Homestead Act and Economic Development.” Manuscript.

[47] Rosenthal, S. , and W. Strange (2004), “Evidence on the Nature and Sources of Agglomeration Economies,“ Handbook of Regional and Urban Economics, Volume 4, Edited by J.V. Henderson and J.F. Thisse.

[48] Skelton, M. H. B., and Skelton, A. G. (1942), “A Bibliography of Oklahoma Oil and Gas Pools.“ Oklahoma Geological Survey Bulletin, 63.

[49] Steckel, R. H. (1983), “The Economic Foundations of East-West Migration During the 19th Century.” Explorations in Economic History, 20(1), 14-36.

[50] Steckel, R. H. (1989), “Household Migration and Rural Settlement in the United States, 1850–1860.” Explorations in Economic History, 26(2), 190-218.

[51] Stacey, K. (1956), “Petroleum and Gas in the .” PhD dissertation. Clark, Worces- ter, MA.

[52] Welsh, L., Townes, W. M., Morris, J. W. (1981), “A History of the Greater Seminole Oil Field. Oklahoma City, OK: Western Heritage Books.

[53] Womack, J. (1986), “The Maladministration of the 1890 Oklahoma Territorial Census.” The Chronicles of Oklahoma, 63(4), 412-417.

18 Figure 1: The Unassigned Lands

19 Figure 2: Oklahoma in 1900

Notes: The area in white and light red constituted the Oklahoma Territory. Areas in green were tribal lands incorporated into Oklahoma Territory from 1902 to 1907. The areas in blue were the territories of the five civilized tribes, which were incorporated into the State of Oklahoma in 1907 at statehood.

20 Figure 3: Survey and Administrative Townships in Garfield County, Oklahoma

Notes: The map shows the division of the county into survey townships of 36 sections of 640 acres each. A homesteader could lay claim to one-quarter of such a section, or 160 acres or about 64 hectares.

21 Figure 4: The Timing of the Oil Boom in Oklahoma, 1900-1927

Source: Bullard, B. M. (1928) and Oklahoma Geological Survey (2019)

22 Figure 5: The Geography and Timing of Oil Field Discoveries: 1900-1927 23

Source: Bullard, B. M. (1928) and Oklahoma Geological Survey (2019) Figure 6: First nature in the Unassigned Lands

Source: The map shows the railroads, roads or trails, railroad depots, stage stops, and forts in the Unassigned Lands prior to settlement by Euro-Americans in 1889. The soil suitability indices for Maize are from Galor and Ozak¨ (2016).

24 Table 1: First Nature and the Population Density of Surveyed Townships

VARIABLES (1) (2) (3) (4) (5) (6) (7) (8)

River in township 0.46 0.282** 0.127 0.257* 0.206 0.238* 0.167 (0.50) (0.137) (0.127) (0.138) (0.131) (0.128) (0.129) Terrain Ruggedness 2.83 0.096 0.119* 0.084 0.088 0.115* (1.13) (0.061) (0.063) (0.058) (0.057) (0.059) Maize Productivity 19.2 0.055 (1.75) (0.036) Road/trail in township 0.46 0.180 0.078 0.127 (0.50) (0.144) (0.130) (0.133) Railroad in township 0.15 0.675*** 0.455** 0.380* (0.36) (0.187) (0.217) (0.215) RR depot in township 0.10 0.891*** (0.30) (0.217) Place in township 0.11 0.572** 0.593** (0.31) (0.240) (0.237) Constant 1.227*** 0.549 1.091*** 1.202*** 1.187*** 1.462*** 1.044*** (0.194) (0.699) (0.222) (0.183) (0.180) (0.092) (0.205)

N10487104104104104104 R2 0.0638 0.0363 0.0783 0.172 0.199 0.187 0.232 Adj.R2 0.0453 0.0134 0.0506 0.147 0.175 0.162 0.193 F-test 3.443 1.584 2.831 6.913 8.284 7.642 5.930 Standard errors in parentheses

Note: *** p<0.01, ** p<0.05, * p<0.1. The standard deviation is in parentheses in column (1). The dependent variable is the 2 Log(Population Density)1890.Themeanpopulationdensitywas6.9personsperkm.Aplacewasdefinedasastagecoach stop, fort, or railroad depot. For other variable definitions, please see the text.

25 Table 2: First Nature and the Population Density of Civil Townships

VARIABLES (1) (2) (3) (4) (5) (6) (7)

River in township 0.120 0.142 0.106 0.059 0.084 0.049 (0.134) (0.136) (0.135) (0.129) (0.125) (0.125) Terrain Ruggedness 0.001 0.001 0.000 0.001 0.001 (0.001) (0.001) (0.001) (0.001) (0.001) Maize Productivity 0.012 (0.024) Road/trail in township 0.142 -0.040 0.028 (0.145) (0.129) (0.136) Railroad in township 0.494*** 0.268 0.149 (0.170) (0.201) (0.212) RR depot in township 0.689*** (0.197) Place in township 0.483** 0.563** (0.221) (0.225) Constant 1.365*** 1.348*** 1.246*** 1.391*** 1.309*** 1.549*** 1.222*** (0.191) (0.468) (0.227) (0.182) (0.178) (0.086) (0.214)

N71717171717171 R2 0.0397 0.0183 0.0532 0.148 0.188 0.193 0.226 Adj. R2 0.0114 -0.0106 0.0108 0.109 0.152 0.157 0.167 F-test 1.405 0.633 1.255 3.867 5.184 5.339 3.799 Standard errors in parentheses Note: *** p<0.01, ** p<0.05, * p<0.1. Dependent variable: Log(Population Density)1890.Forothervariabledefinitions,please see the text.

26 Table 3: First Nature and Persistence: The Population Density of Administrative Townships in 1900

VARIABLES (1) (2) (3) (4) (5) (6) (7)

Log Pop density 1890 0.796*** 0.814*** 0.792*** 0.764*** 0.768*** 0.751*** 0.736*** (0.087) (0.088) (0.087) (0.093) (0.099) (0.097) (0.098) River 0.018 0.017 0.008 0.006 0.016 0.001 (0.091) (0.092) (0.091) (0.092) (0.091) (0.092) Ruggedness 0.000 0.001 0.000 0.000 0.001 (0.000) (0.000) (0.000) (0.000) (0.000) Maize -0.010 (0.016) Road 0.099 0.028 0.071 (0.097) (0.094) (0.100) Railroad 0.125 0.084 0.021 (0.128) (0.148) (0.156) RR depot 0.100 (0.162) Place 0.114 0.171 (0.167) (0.173) Constant -0.230 0.042 -0.305 -0.160 -0.181 -0.050 -0.207 (0.217) (0.343) (0.229) (0.228) (0.232) (0.215) (0.246)

N71717171717171 R2 0.573 0.569 0.580 0.579 0.576 0.579 0.590 Adj. R2 0.554 0.550 0.554 0.554 0.550 0.553 0.552 F29.9929.5022.7622.7122.3822.6815.37 Standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1

Note: *** p<0.01, ** p<0.05, * p<0.1. The dependent variable is the Log(Population Density)1900.Forothervariable definitions, please see the text.

Table 4: First Nature and Persistence: The Population Density of Administrative Townships in 1900 (Townships<1,500 in population in 1890)

VARIABLES (1) (2) (3) (4) (5) (6) (7)

Log Pop density 1890 0.551*** 0.588*** 0.556*** 0.530*** 0.540*** 0.559*** 0.539*** (0.131) (0.135) (0.131) (0.134) (0.136) (0.134) (0.135) River 0.018 0.021 0.009 0.011 0.020 0.004 (0.091) (0.093) (0.091) (0.092) (0.092) (0.093) Ruggedness 0.000 0.001 0.000 0.001 0.001 (0.000) (0.000) (0.000) (0.000) (0.000) Maize -0.005 (0.016) Road 0.115 0.057 0.101 (0.098) (0.096) (0.101) Railroad 0.113 0.129 0.064 (0.131) (0.155) (0.161) RR depot 0.063 (0.178) Place -0.004 0.059 (0.181) (0.186) Constant 0.247 0.419 0.146 0.290 0.261 0.344 0.180 (0.288) (0.378) (0.300) (0.293) (0.293) (0.290) (0.312)

N66666666666666 R2 0.259 0.241 0.275 0.268 0.260 0.256 0.282 Adj. R2 0.223 0.204 0.228 0.220 0.212 0.207 0.209 27 F7.2196.5695.7875.5775.3695.2533.865 Standard errors in parentheses Note: *** p<0.01, ** p<0.05, * p<0.1. The dependent variable is the Log(Population Density)1900.Forothervariable definitions, please see the text. Table 5: The Impact of Oil Shocks on Township Population Density: 1910-1930

VARIABLES LogDensity1910 LogDensity1920 LogDensity1930 (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)

LPopDent 1 0.822*** 0.826*** 0.912*** 0.986*** 0.982*** 0.970*** 0.801*** 0.792*** 0.771*** ≠ (50.04) (50.79) (48.05) (32.43) (32.44) (30.80) (32.02) (31.65) (30.40) Oilt 1 0.0297*** 0.0746*** 0.0212*** 0.0143* 0.0232*** 0.00539 ≠ (4.162) (6.103) (2.842) (1.848) (3.289) (0.642) Oilt 2LPopDent 2 0.000334 0.0208*** 0.0152*** ≠ ≠ (0.0499) (3.380) (3.848) Rail 0.180*** 0.0456 0.0613** 0.0894*** 0.123 -0.0543 -0.0367 -0.0320 0.184** 0.0853* 0.113** 0.119** (2.953) (1.629) (2.196) (3.281) (1.586) (-1.104) (-0.744) (-0.641) (2.288) (1.665) (2.189) (2.329) River 0.0698 0.0442** 0.0387* 0.0303 0.174*** 0.105*** 0.106*** 0.110*** 0.200*** 0.0608 0.0667* 0.0629* (1.555) (2.141) (1.895) (1.400) (3.043) (2.901) (2.937) (3.028) (3.378) (1.598) (1.764) (1.678) Maize 0.0363*** 0.00655*** 0.00476*** 0.00529** 0.0532*** 0.0174*** 0.0148*** 0.0152*** 0.0435*** 0.000902 -0.00132 -0.00271 (10.35) (3.764) (2.689) (2.096) (11.93) (5.758) (4.681) (4.699) (9.400) (0.279) (-0.401) (-0.827) Ruggedness -0.00115*** -0.000197*** -0.000172*** -0.000173* -0.000965*** 0.000168* 0.000213** 0.000206** -0.000915*** -0.000142 -7.59e-05 -6.96e-05 (-10.95) (-3.826) (-3.351) (-1.958) (-7.233) (1.836) (2.313) (2.212) (-6.607) (-1.551) (-0.816) (-0.756) Constant 1.752*** 0.394*** 0.398*** 0.162*** 1.360*** -0.367*** -0.363*** -0.357*** 1.534*** 0.445*** 0.429*** 0.479*** (25.61) (8.974) (9.171) (3.372) (15.63) (-4.795) (-4.756) (-4.315) (16.99) (6.653) (6.437) (7.127)

N705690690310705705705690706705705705 R2 0.222 0.830 0.834 0.926 0.211 0.685 0.689 0.678 0.164 0.661 0.666 0.673 adj. R2 0.218 0.829 0.833 0.925 0.207 0.683 0.686 0.674 0.160 0.659 0.663 0.670 F50.00667.5572.4542.746.93304.2257.5204.834.46272.8232.3205.2 t-statistics in parentheses 28

Note: *** p<0.01, ** p<0.05, * p<0.1. The dependent variable is the log(population density of the civil township in the respective year. For other variable definitions, please see the text. Table 6: The Impact of Oil Shocks on the Population of Places: 1910-1930

VARIABLES Log Population 1910 Log Population 1920 Log Population 1930 (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)

LPopt 1 0.995*** 0.993*** 1.019*** 1.030*** 1.028*** 1.029*** 1.051*** 1.045*** 1.051*** ≠ (56.38) (56.39) (44.87) (52.42) (53.99) (52.40) (56.93) (56.74) (59.97) Oilt 1 0.0226* 0.0575** 0.0386*** 0.0304*** 0.0224*** 0.0297*** ≠ (1.955) (2.501) (5.077) (3.894) (2.945) (3.491) Oilt 2LPopt 2 0.00114 0.00665*** 0.00120 ≠ ≠ (0.427) (3.203) (0.886) Rail -0.00102 0.000547 0.000481 -0.000868 0.000466 0.00154* 0.00118 0.00223*** 0.000540 0.00136 0.00118 0.00106 (-0.435) (0.732) (0.647) (-0.526) (0.206) (1.717) (1.359) (2.597) (0.233) (1.515) (1.323) (1.231) River 0.00126 0.000707 0.00115 -0.000493 0.00121 -0.000430 -0.000151 -0.000401 -0.00586 -0.00804*** -0.00819*** -0.00761*** (0.212) (0.375) (0.606) (-0.175) (0.208) (-0.188) (-0.0685) (-0.183) (-0.928) (-3.442) (-3.533) (-3.471) Maize 0.00656 -0.00845*** -0.00946*** -0.00701 0.0191** 0.0104*** 0.00565* 0.00832** -0.000731 -0.0264*** -0.0282*** -0.0311*** (0.758) (-2.938) (-3.251) (-1.286) (2.325) (3.110) (1.683) (2.427) (-0.0838) (-8.086) (-8.565) (-9.363) Ruggedness 0.000217 -2.73e-05 -9.79e-06 0.000129 0.000408 0.000152 0.000216 6.88e-05 0.000416 0.000282** 0.000327** 0.000401*** (0.595) (-0.234) (-0.0841) (0.851) (1.152) (1.082) (1.587) (0.512) (1.092) (2.008) (2.333) (2.956) Constant 6.318*** 0.376*** 0.390*** 0.102 6.052*** -0.313** -0.292** -0.338** 6.429*** 0.161 0.185 0.176 (33.24) (2.933) (3.054) (0.522) (33.05) (-2.173) (-2.093) (-2.259) (32.38) (1.214) (1.401) (1.364)

N389336336124468385385333507463463382 2

29 R 0.004 0.907 0.908 0.951 0.022 0.881 0.889 0.900 0.005 0.878 0.880 0.913 adj. R2 -0.00607 0.905 0.906 0.948 0.0131 0.879 0.887 0.898 -0.00337 0.876 0.878 0.911 F0.415640.2538.7320.22.553560.7502.1416.40.575655.9557.2559.2 t-statistics in parentheses *** p<0.01, ** p<0.05, * p<0.1

Note: *** p<0.01, ** p<0.05, * p<0.1. The dependent variable is the population of a town or city in the respective year. For other variable definitions, please see the text. Table 7: The Impact of Oil Shocks on Township Population Density: 1910-1930

VARIABLES (1) (2) (3) (4) (5) (6)

LPopDent 1 0.857*** 0.853*** 0.855*** 0.855*** 0.870*** ≠ (60.00) (60.02) (50.25) (50.32) (51.77) Oilt 1 0.0230*** 0.0119** 0.0231*** ≠ (5.470) (2.290) (4.920) 1910xOilt 1 0.0737** ≠ (2.470) 1920xOilt 1 0.0145** ≠ (1.980) 1930xOilt 1 0.00606 ≠ (0.792) Oilt 2xLPopt 2 0.0140*** ≠ ≠ (4.815) 1910xOilt 2xLPopt 2 0.00391 ≠ ≠ (0.243) 1920xOilt 2xLPopt 2 0.0240*** ≠ ≠ (4.172) 1930xOilt 2xLPopt 2 0.0124*** ≠ ≠ (3.478) 1910xRail 0.180** 0.0398 0.0532 0.0999 0.103 0.100 (2.450) (0.891) (1.196) (1.522) (1.574) (1.516) 1920xRail 0.123* -0.0312 -0.0121 -0.0188 -0.00778 -0.0252 (1.674) (-0.704) (-0.274) (-0.402) (-0.166) (-0.536) 1930xRail 0.184** 0.0785* 0.105** 0.114** 0.105** 0.103** (2.506) (1.771) (2.376) (2.463) (2.258) (2.211) 1910xRiver 0.0699 0.0437 0.0396 0.0409 0.0325 0.0412 (1.292) (1.324) (1.207) (0.776) (0.615) (0.777) 1920xRiver 0.174*** 0.114*** 0.115*** 0.118*** 0.115*** 0.122*** (3.213) (3.483) (3.533) (3.447) (3.363) (3.532) 1930xRiver 0.200*** 0.0511 0.0560* 0.0498 0.0491 0.0530 (3.699) (1.557) (1.720) (1.460) (1.442) (1.545) 1910xMaize 0.0363*** 0.00547** 0.00431 0.00983* 0.00848 0.00928 (8.583) (2.025) (1.604) (1.747) (1.491) (1.640) 1920xMaize 0.0532*** 0.0221*** 0.0192*** 0.0200*** 0.0186*** 0.0195*** (12.58) (8.483) (7.280) (6.950) (6.230) (6.755) 1930xMaize 0.0435*** -0.00208 -0.00454* -0.00756*** -0.00641** -0.00548* (10.29) (-0.779) (-1.693) (-2.638) (-2.209) (-1.920) 1910xRuggedness -0.00115*** -0.000157** -0.000146* -0.000176 -0.000160 -0.000171 (-9.087) (-2.000) (-1.871) (-0.817) (-0.744) (-0.790) 1920xRuggedness -0.000965*** 1.97e-05 6.91e-05 5.73e-05 7.80e-05 8.13e-05 (-7.631) (0.253) (0.885) (0.693) (0.934) (0.979) 1930xRuggedness -0.000915*** -8.79e-05 -1.77e-05 2.57e-05 -1.81e-06 -5.83e-07 (-7.235) (-1.133) (-0.227) (0.312) (-0.0218) (-0.00706) Year=1920 -0.392*** -0.474*** -0.484*** -0.349*** -0.360*** -0.366*** (-6.974) (-6.551) (-6.742) (-2.884) (-2.951) (-3.000) Year=1930 -0.218*** 0.0370 -0.00216 0.153 0.144 0.141 (-3.873) (0.510) (-0.0298) (1.285) (1.199) (1.177) Constant 1.752*** 0.332*** 0.348*** 0.204* 0.220** 0.181* (21.25) (5.699) (6.010) (1.879) (2.014) (1.658)

N2,1162,1002,1001,7051,7051,705 N Townships 706 705 705 705 705 705 R2 within 0.0615 0.00196 0.00229 0.0792 0.0910 0.0950 R2 between 0.218 0.925 0.929 0.928 0.928 0.929 R2 overall 0.197 0.704 0.708 0.700 0.702 0.696 z-statistics in parentheses

Note: *** p<0.01, ** p<0.05, * p<0.1. The dependent variable is the population density of the civil townshipit.Forother variable definitions, please see the text. 30 Table 8: The Impact of Oil Shocks on Place Population: 1910-1930

VARIABLES (1) (2) (3) (4) (5) (6)

LPopt 1 0.363*** 0.357*** 0.241*** 0.209*** 0.297*** ≠ (9.357) (9.343) (4.951) (4.096) (5.954) Oilt 1 0.0292*** 0.0387*** 0.0273*** ≠ (4.953) (5.254) (3.678) 1910xOilt 1 0.0308 ≠ (1.010) 1920xOilt 1 0.0564*** ≠ (4.317) 1930xOilt 1 0.0317*** ≠ (3.598) Oilt 2xLPopt 2 0.00730*** ≠ ≠ (6.162) 1910xOilt 2xLPopt 2 0.00178 ≠ ≠ (0.520) 1920xOilt 2xLPopt 2 0.00611*** ≠ ≠ (2.663) 1930xOilt 2xLPopt 2 0.0106*** ≠ ≠ (5.158) 1910xRail -0.00307*** -0.00340*** -0.00306*** -0.000902 -0.000886 -0.00283 (-2.891) (-3.369) (-3.082) (-0.431) (-0.417) (-1.314) 1910xRail -0.00144 -0.00103 -0.000912 -0.000730 -0.000853 -0.00100 (-1.484) (-1.068) (-0.958) (-0.771) (-0.904) (-1.020) 1910xRiver 0.00808*** 0.00859*** 0.00915*** 0.0134*** 0.0132*** 0.0127*** (2.975) (3.365) (3.637) (3.666) (3.638) (3.342) 1920xRiver 0.00788*** 0.00739*** 0.00766*** 0.00768*** 0.00746*** 0.00719*** (3.119) (3.002) (3.162) (3.214) (3.130) (2.893) 1910xMaize 0.0154*** 0.0138*** 0.0153*** 0.0201*** 0.0222*** 0.0208*** (3.924) (3.593) (4.040) (2.942) (3.222) (2.923) 1920xMaize 0.0255*** 0.0300*** 0.0294*** 0.0322*** 0.0318*** 0.0297*** (7.270) (8.337) (8.291) (8.704) (8.462) (7.769) 1910xRuggedness -0.000456*** -0.000307* -0.000353** -0.000600*** -0.000642*** -0.000477** (-2.747) (-1.957) (-2.290) (-2.981) (-3.178) (-2.289) 1920xRuggedness -0.000285* -0.000293* -0.000310** -0.000390*** -0.000382*** -0.000330** (-1.877) (-1.937) (-2.087) (-2.659) (-2.607) (-2.166) Year=1920 -0.108 -0.268*** -0.266*** -0.180 -0.190 -0.112 (-1.251) (-3.146) (-3.173) (-1.149) (-1.215) (-0.693) Year=1930 0.363*** 0.274*** 0.260*** 0.340** 0.328** 0.425*** (4.210) (3.233) (3.112) (2.173) (2.098) (2.621) Constant 6.219*** 4.009*** 4.008*** 4.705*** 4.910*** 4.351*** (73.41) (15.19) (15.44) (13.16) (13.27) (11.84)

N1,3641,1841,184839839839 NPlaces 515 469 469 385 385 385 R2 within 0.152 0.274 0.299 0.326 0.341 0.268 R2 between 0.00350 0.816 0.790 0.608 0.497 0.808 R2 overall 0.00282 0.774 0.756 0.595 0.498 0.757 F15.0524.1824.9516.4113.3113.48 t-statistics in parentheses

Note: *** p<0.01, ** p<0.05, * p<0.1. The dependent variable is the population of a town or city in the respective year.For other variable definitions, please see the text.

31