Q-exponential City Size Hierarchies, 430 BCE to 1950, Nomadism, and Population Fluctuations1

Douglas R. White,2 Céline Rozenblat,3 and(?) Nataša Keyžar4

Our principal interests here are first to determine city size distributions for the period 430 BCE to 1950 and whether these distributions are variable or constant (scale-free), and, second, to trace the fluctuations of population sizes. Secondarily, are there biases evident in pre-1800 city population survey data? Would these make significant alterations in findings relevant to our principal interest?

One of the crucial historiographic problems is whether city size hierarchies, as evidenced by superlinear size distributions, are found in world civilizations prior to the industrial revolution. The question is especially difficult given the rarity of census data prior to 1800. We are fortunate to have a database of city sizes with which to provide initial tests of hypotheses. Although city sizes over 200K show some possible evidence of a break in these data for world cities after 1800, which display the well-known by superlinear distributions in seven contiguous periods, and those before, which might differ in having linear distributions that characterize constant average group rates at all levels, i.e., these may lack superlinear size distributions. We show, however, that distributions with different distributions at some historical breakpoint (e.g., pre- and post-1800) entail a discontinuity in the total world population before and after the break, which is not supported by the available evidence. Thus, urban size distributions must have distributions that match the historical continuities in world population.

When we take all the available data for to city sizes over 50K (or even down in some periods where data are available down to 35K or lower), the problem is resolved: Comparisons among these distributions show relative uniformity, but these broader-range curves fall somewhere between superlinear and linear/exponential curves in their functional form. (These similarities also argue for the absence of significant biases for the pre-1800 period.)

The problems we must address, then, are these: What is the nature of the city size hierarchies and why would we find distributions that are neither superlinear nor linear/exponential, but something in between? What kind of distributions these might be? We argue from first principles that there are q- exponentials that occur in network growth processes, and we fit each of our 29 datasets for successive historial periods to the q-exponential parameters, back to 430BCE.

Methods

Céline Rosenblat provided the data and will explain how it was acquired. Fig. 1 shows the outcome of the coding procedure using the processes described here. Each of the 29 periods 430 BCE to 1950 was described in a text file with columns for city names, empires, city population, latitude and longitude. Cities were sorted by size and the sorted stacks of city sizes compiled in a single file. Because sizes from 1 This paper is the outcome of a collaboration under the auspices of EU project ISCOM, Information Society as a Complex System. 2 Santa Fe Institute and Institute for Mathematical Behavioral Sciences, UC Irvine. [email protected]. 3 Université de Montpelier, Maison de la Géographie. [email protected] http://mgm.fr. 4 Santa Fe Institute and University of Ljubljana, Department of Mathematics. 200K upwards were coded in all the files from 230BCE, sizes above 199K were log binned for cumulative populations from sizes of 9 million down to 200K in multiples of 1/√2. This resulted in number of people in each of 12 city size bins that were equal interval on a log scale. The X axis in in Fig. 1 represents the log scale for the bins. The top line represents the population distribution across the 12 bins shown by the points along the line. The red line represents the upper bins which, because they are partially empty, are always of lower slope than the full bins.

For measures of oscillation, we use 1 Average city population (for our purposes, >200 because this is lowest city size coded for all of our 29 historical periods. 2 Turnover of cities at the average (up/down, including out-of-rank) 3 Rank-size size migration rates by bins.

(Céline: How the ‘empire’ coded, and why do some of the files cease coding empires below a certain size? Where “China” is given, for example, with 5-8 cities, is there only one empire involved? Might the pre-1800 distributions be biased by the way that the pre-1800 city size data were estimated? If so, China has census data prior to that period, and in the next stage of this research we might test the superlinear distribution hypothesis for various historical periods.)

The q-exponential. Nonextensive or q-entropy Sq (Soares, Tsallis, Mariz and da Silva 2004) is a q generalization of Boltzmann-Gibbs entropy SBG = - ∫ dk P(k) ln (k), where Sq = (1- ∫ dk P(k) )/(q-1). Sq is associated with a plethora of scale-free phenomena for hierarchical and (multi)fractal structures (see extensive references in [15-18] in Soares et al. 2004). Nothing has as yet been discovered about Sq that is inconsistent with the entropy for thermostatistical systems, and Sq = SBG in the case where q=1 for which pairwise interactions are independent. Values of q ≈ 1.33 ± 0.15, for example (which might be 5/4 or 4/3 ratios with geographic or network biases or higher with other nonindependence biases), are consistent with the nonindependent and scale-free processes of preferential attachment models of social networks (Soares et al. 2004; White, Keyžar, Tsallis, Farmer and White 2005). Also of interest are the particular parameters of Sq that govern the q-exponential, which is typically a distribution somewhere between the superlinear (power-law) and the linear/exponential, but which also accounts for the inflections of log-log distributions both at the lower and upper ends of the distribution. One of the parameters of the q- exponential eq (besides q) is kappa (κ), which is an expression of the largest coherent unit of the distribution (or for a network, a characteristic number of links. If k is the variable measured, and P(k) is -k/ κ x the cumulative probability of k or higher values, then P(k) = P(0) eq , where, for x=k, eq ≡ [1 + (1 – q) 1/(1-q) x x x] , and in the case where q = 1, eq = e , for the natural log base e.

If we assume that cities are nodes in networks in which there is preferential attachment towards rural- urban migration, then the q-exponential is consistent with the scale-free characteristics of urban hierarchies.In fitting the q-exponential we solve for q (the exponent of scale-free nonindependence), for α (the slope of the q-exp/log graph), and for κ (the base unit size), where α will be a function of κ and q and possibly other parameters. If there is a preferential attachment for rural-to-urban migration, this will be reflected in the α parameter and thus in q.

Why would we expect, from interaction theory, a preferential attraction to cities proportional (or q- proportional, and thus α-proportional) to city size? The answer is simple: larger cities typically have more functions and thus more attractions. Cities are limited, however, by the productive and trading technologies which set as characteristic scale κ (the base unit size), which will also affect α, the scale- free exponent. We can expect that κ (the base unit size), as a limiting function, will tend to rise with the evolution of urban functional complexity, but there will be limits at which it will stabilize or fall.

Cities grow, however, within the context of the full population, including the noncity population that produces foodstuffs and raw materials. If we find full populations N that are relatively closed, such as China or England during certain historical periods, for example, we should find some interaction between the full population size and the carrying capacity Nc of the system of production and trade (distribution, organized transports for consumption). As the ratio of N/Nc rises  1, structural demographic theory predicts, first, “persistent price inflation, falling real wages, rural misery, urban migration, and increased frequency of food riots and wage protests. Second, rapid expansion of population results in an increased number of aspirants for elite positions. Increased intraelite competition leads to the formation of rival patronage networks vying for state rewards. As a result, elites become riven by increasing rivalry and factionalism. Third, population growth leads to expansion of the army and the bureaucracy and rising real costs. States have no choice but to seek to expand taxation, despite resistance from the elites and the general populace. Yet, attempts to increase revenues cannot offset the spiraling state expenses. Thus, even if the state succeeds in raising taxes, it is still headed for fiscal crisis. As all these trends intensify, the end result is state bankruptcy and consequent loss of the military control; elite movements of regional and national rebellion; and a combination of elite-mobilized and popular uprisings that manifest the breakdown of central authority (Goldstone 1991).” (Turchin 2005:1) Network theory predicts that cities that are multiconnected will have lesser degrees of monopolistic or oligarcic suppliers than those cities or rural areas that are connected in the trading system by a single link (or fewer links). Thus, as N/Nc rises  1, inflation will be higher in the disadvantaged (rural) areas and urban migration will be higher towards more multiconnected cities, which are typically in the larger trading zones. Cities, then, are not only attractors, they are also part of a system in which there are oscillations that displaces people from the land due to pressures of rural inflation. The q- and α- values of a q-exponential urban hierarchy, then, ought to increase with monetization and trade, and fall when monetization and trade decline, and, more specifically, vary with N/Nc ratios in closed areas.

In different historical periods, in consequence, as N/Nc ratios rise, there will be higher urban attraction when the trading systems are viable, but if trading systems crash for whatever reason (which can include the violence resulting from high N/Nc ratios), there will be urban-rural flight for economic reasons balanced by urban attraction in spite of poverty if there are strong protective polities. Thus, we would expect considerable urban hierarchy instability.

Cities as attractors, however, to the extent to which the form multiconnected trade networks, in developing urban hierarchies of specialized functions and skills. In this context they become disproportionate centers for invention and the diffusion of innovation, for education and literacy, of governance, of elite concentration and elite consumption, medical improvements and (up to a limit) increased longevity. There are, then, a considerable number of dimensions on which city hierarchies have fractal scaling patterns, ones which when summed in the aggregate, such as the role of cities in infusing the world population with growth, create not just oscillations but long-term (and presumably reversable) trends. Besides a growing city (and world) population, another seems to be the (increase) in α values (varying inversely to q), in which the city hierarchy becomes ever steeper in slope. The tendencies for long-term growth in q-entropic curves such as urban hierarchies do not seem to have a critical value of α, although in scale-free attachment networks (Barabási 2000), an attachment steepness of α > 3 is the threshold at which disease transmission can become uncontrollable if the item diffusing surpasses an initial frequency threshold. This seems to have happened in human history in examples such as the Black Plague of the Early Renaissance. Another such threshold appears to be the fertility transition. When fertility decline (which is limited by zero) more than offsets longevity increase (which is limited to 90 years or so), then (controlling for migration) urban population will decline. And as rural-urban migration has proportional limits, world population may decline as well. There is, in short, no guarantee of increasing or stabilizing world population, and no secure guarantee of a rising standard of living given rural/urban dynamics, although this is an areas of great contention. Empirical Results for City Sizes 1000000

1950

1925

100000 1900

1875

1850

1825 1800

10000 1750 1700 1650 1600 1550 1500 1450 1400 1350 1300 1250 1200 1150 1100 1000 900 1000 800 622 361 100 -200 -0.0035x -430 y = 15813e R2 = 0.9969

y = 1E+06x-0.9109 y = 10051e-0.0058x R2 = 0.9869 R2 = 0.9621 100 100 1000 10000 100000 totPowerLaw.xls Figure 1: Binned Population Sizes, from 1950 back to 430 BC. Prior to 1800, the distributions are fitted by an exponent (linear growth R2 ≈ .97), although this may be a function of the estimation methods and missing data. From 1800 forward the distributions follow superlinear distributions with coefficients in the range of -.91 to -.63 tending to fall with time (R2 ≈ .98). The population bins of 200 282 400 566 800 1131 1600 2263 3200 4526 6400 9052 are for number of population supported at or above that level.

When the regression lines are projected back to an axis of N=1 the intercept, as shown in Figure 1a (axes in thousands), gives a good estimation of the global population, e.g., circa 1 billion in 1950. In this figure, some of the upper bins with incomplete data are deleted. Now, however, we see a major problem of discontinuity – if there were a major break between the pre- and post- 1800 curves, there would be a huge discontinuity between the estimate world population figures for the pre-1800 period as well. We can conclude that both sets of data are governed by the same curve.

1 0 0 0 0 0 0 0 0 0

1 0 0 0 0 0 0 0 0

1 0 0 0 0 0 0 0

1 0 0 0 0 0 0

1 0 0 0 0 0

y = 5 E + 0 6 x - 0 . 6 3 0 2 1 0 0 0 0 R 2 = 0 . 9 6 8 4

y = 6 E + 0 6 x - 0 . 7 6 8 7 R 2 = 0 . 9 7 6 2 y = 3 E + 0 6 x - 0 . 8 1 2 7 R 2 = 0 . 9 7 5 7 1 0 0 0 y = 2 E + 0 6 x - 0 . 8 7 0 1 R 2 = 0 . 9 8 3 7 y = 8 1 9 2 9 1 x - 0 . 7 9 6 6 R 2 = 0 . 9 7 4 7 y = 1 E + 0 6 x - 0 . 9 1 0 9 1 0 0 R 2 = 0 . 9 8 6 9 0 . 0 0 1 0 . 0 1 0 . 1 1 1 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0

Figure 1a.

When more of the lower-sized city data are added, as in Figure 1b, the two sets of curves resemble one another. Clearly, these are a family of curves that are not power-law or superlinear. Nor, however, are they linear/exponential. 1000000

100000

10000

1000

100 0.001 0.01 0.1 1 10 100 1000 10000 Figure 1b. City size data down to sizes of 50K where available and excluding bins with incomplete data (tot.xls)

Fig. 2 (not yet constructed) shows the previous data (Fig. 1b) fitted with q-exponentials that are extrapolated back to the Y axis at .001 (estimated cumulative population including units down to 1). 1000000

100000

10000

1000

100 0.001 0.01 0.1 1 10 100 1000 10000 Figure 2. (tot.xls)

(One question at the end of q-exp curve fitting is how well the extrapolated total populations for each time period match the actual (independently estimated) total world population)

Macro-Theory and Indices of Oscillation

Estimating total population from fitted city size distributions makes the unwarranted assumption that human population is governed by urban dynamics. To look at urban and population dynamics from the perspective of other processes, we look first at indices of population oscillations, both urban and nonurban.

Fig. 3 compares the time series for total size of urban population in over 200K and World Population (top and bottom graphs, with graphs for two oscillatory variables in the middle: Average World City Size over 200K (upper middle), and Changes in World Population (lower middle). The upper and the lower graphs show what appear to be similar superlinear growth trajectories, but with a flattening after 1950. The middle two graphs show strong oscillations, as expected for urbanism. These two graphs – one for city and one for total population oscillations – are very different. The downward arrow at the year 630CE for average world city size shows a depression that is not reflected in total population oscillations. An second such arrow at 1200CE shows a second depression of this type that runs directly opposite to oscillations in total world population. What is occurring to make these graphs so different?

Similarly, the upward arrows at the years 800 (also 1000 although not an independent peak) and 1350CE for average world city size show a high points that are not reflected in total population oscillations. Historically, these two downward deviations are the result of invasions by nomads of urban Eurasian heartlands, one after the fall of classical empires (e.g., Rome, China?), and the other with the Mongol conquests of the MidEast and China, which aimed at an Empire (nearly the largest in history) in which cities were crushed if they resisted submission. The upward arrows show periods of urban recuperation following nomad conquest, when trade also flourished.

In the lower middle graph, the two downward and one upward arrows between 1400 and 1700 represent world population oscillations that are not reflected in urban hierarchy shifts – a period (after 1350) when further nomad conquests (such as the Qing conquest of China) are symbiotic with urbanism rather than destructive of urban resistance. > 2 0 0

3 5 0 0 0 0

> 2 0 0 3 0 0 0 0 0

2 5 0 0 0 0

2 0 0 0 0 0

1 5 0 0 0 0

1 0 0 0 0 0

5 0 0 0 0

0 -500 -400 -300 -200 -100 0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500 1600 1700 1800 1900

8 0 0

7 0 0

6 0 0

5 0 0

4 0 0

3150 0 0 35002 0 0 1100 0 0

30000 - 5 0 0 - 4 0 0 - 3 0 0 - 2 0 0 - 1 0 0 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 7 0 0 8 0 0 9 0 0 1 0 0 0 1 1 0 0 1 2 0 0 1 3 0 0 1 4 0 0 1 5 0 0 1 6 0 0 1 7 0 0 1 8 0 0 1 9 0 0 50 2500

2000 0 - 2 0 0 - 1 0 0 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 7 0 0 8 0 0 9 0 0 1 0 0 0 1 1 0 0 1 2 0 0 1 3 0 0 1 4 0 0 1 5 0 0 1 6 0 0 1 7 0 0 1 8 0 0 1 9 0 0 2 0 0 0 1500 -50 1000 -100 500

-1500 -200 0 200 400 600 800 1000 1200 1400 1600 1800 2000 -200 Fig. 3: Total and Average World City Size over 200K (top two graphs), Changes in World Population (lower middle) and Total World Population (bottom) Check that the Mongols lopped off several large cities before 1150 -250(e.g., Baghdad)

In sum, then, nomadic-300 invasions and empires need to be considered as part of urban dynamics. What of the more detailed changes of relative numbers of cities at different levels in the urban hierarchy? Are these affected by the seven events marked by arrows in fig. 3? Fig. 4-6 provide different views of these oscillations. Oscillations are evident in fig. 4 but cannot be clearly distinguished. The same is true for fig. 5, which shows the same data as percentages. One might argue that the oscillations are a function of noise from having binned our city sizes to narrowly by multiples of √2. We will see that this is not the case. 25000 >200 >282

20000 >400 >566 >800 15000 >1131 >1600 10000 >2263 >3200

5000 >4526 >6400

0 -500 0 500 1000 1500 2000

Figure 4: Historical Changes for the Binned Data. Percents in Logged Population Bins for 200K, 282K, 400K, 565K, and 800K are the relevant categories up to 1800. 1.000 >200 >282 0.900 >400 >566 0.800 >800

0.700 >1131 >1600 0.600 >2263 >3200 0.500 >4526 >6400 0.400 >9052

0.300

0.200

0.100

0.000 -500 -300 -100 100 300 500 700 900 1100 1300 1500 1700 1900 Figure 5: Historical Changes for the Binned Data. Percents in Logged Population Bins for 200K, 282K, 400K, 565K, and 800K are the relevant categories up to 1800. Fig. 6 enhances fig. 4 by raising each of the log-binned population to reflect numbers of the population supported at or above that level, as in fig. 1. All the arrows in fig. 3 are now reflected in urban dynamics, and the data are not binned too narrowly: The red arrows in fig. 6 correspond to the two major Eurasian nomad conquest events identified in fig. 3 (upper middle graph). The green arrows show the subsequent urban renaissance periods (upper middle graph, fig.3 ). The blue arrows correlate with the world population shifts in the lower middle graph of fig. 3. All of these effect are in the predicted direction.

100 >200

>282

90 >400

>566 80 >800

>1131 70 >1600

>2263 60 >3200

>4526 50 >6400

>9052 40

30

20

10

0 -500 -400 -300 -200 -100 0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500 1600 1700 1800 1900 Figure 6: Historical Changes for the Binned Data. Percents in Logged Population Bins for 200K, 282K, 400K, 565K, and 800K are the relevant categories up to 1800. Fluctations reflect peaks in the period of Early Empires (230BCE-300CE), Urban renaissance (1000CE-1150CE), Early and Late Renaissance (1300-1550) and Industralization (post 1800); low points follow the fall of the Roman and Chinese empires (600CE) and the Mongol Conquest 1200CE). Growth periods are long and have unusual wave patterns; declines are more precipitous. Key to colored arrows: red = two major Eurasian nomad conquest events identified in fig. 3, upper middle graph; green = subsequent urban renaissance periods (upper middle graph, fig. 3); blue = world population shifts in the lower middle graph of fig. 3. Still need to do 2 Turnover of cities at the average (up/down, including out-of-rank) 3 Rank-size size migration rates by bins. Run pairs of these variables by one another. Do Phase Diagrams.

Conclusion

Phenomenologically, all the variables we have used in this study are world aggregates, and hence macro variables for urban hierarchies and population sizes, plus several external shocks to the urban system (such as nomad invasions) and presumed responses to those shocks (such as urban recovery), as well as observed oscillations that have been studied in terms of dynamical interactions with population variables. The latter dynamics have typically been studied in smaller and, by design, relatively closed regions (Turchin 2005). In our period of study, there are also network interactions within two regions (Old and New World) that are interactionally separable up to the 16th century. What we have shown is that the macro variables we have studied, including the world population size, should not be considered at any time as a closed system that follows its own developmental laws, such as dN/dt ≈ K (relatively constant growth rate, growing exponentially, which does not fit the observed data for world population in this period) or dN/dt ≈ KN (relative growth increase proportional to existing population, i.e., superlinear or power-law growth). The second of these closed-system models is descriptively accurate (it does fit the observed data for world population in this period, up to 1963, fairly well, as might be evident from fig. 3), but is not a valid model of the interactions that affect our macro-variables. Phenomenologically, our attempt to model urban dynamics at the macro level is not a closed-system or endogenous model, as with many of the power-law or Zipfian regularites proposed for urban hierarchies, but one that is oriented toward indentifying interaction dynamics.

Our study of urban dynamics and city size hierarchies in relation to world population changes across 29 historical periods is indicative of two contrasting rural populations that are relevant to urban dynamics: peasants and farmers who provide the raw materials for city growth; and nomads or others outside urban political orbits who can mobilize the resources for invasion of urban regions and establish empires. The first of these peri-urban populations, along with urban wage-workers and proletariats, are involved in the oscillatory dynamics of population and sociopolitical violence in agrarian polities (Turchin 2005). The second is involved in longer-term oscillatory dynamics of warfare, conquest and the administration of empires (Kradin 200?).

Of these three population subsystems, each has a sociopolitical organization that can be characterized as fractal. The q-exponentials that we have fitted to urban hierarchies reflect principles that are common to hierarchical and fractal organization. Fractal properties are evident in the food riots, wage protests, rebellions, and revolutions of peasantries in periods of population pressure (high ratios of population to carrying capacity). Fractal properties are found in two forms among nomads. One is the Middle Eastern social organization characterized by White and Houseman (2002) and White and Johansen (2005) associated with the segmentary systems of endoconical clan organization. The other is the fluidity of leadership in Eurasian horse nomadism that is associated with exogamous patriclans and interclan alliances.

The larger part of the world-system where the tripartite dynamics of these population subsystems is evident is found in Eurasia. It is perhaps not an exceptionalism of Europe that allowed it to become a core region of this system in the lattermost centuries, overtaking China and India, but the exceptionalism of China in having twice been invaded by horse nomads who were successful in establishing empires that in many respects crushed China’s earlier innovations and potentials for commercial hegemony (which was present in the Early Renaissance, for example).

Our conclusion is that it is not urban characteristics per se such as technological innovation that extends carrying capacity for urban populations, or modern medicine, that drives the expansion of world population up to a point where slowdown is necessary through transitions in demographic behavior (Kremer 199?) or education. What we have shown here is that urban populations are embedded in at least a tripartite dynamics among the different types of populations that share our common habitat.

Nonetheless, the city-size hierarchy does have scale-free properties that reflect these three, and perhaps other, nonindependent components of interacting populations. The scale-free model that best fits the urban size data over all historical periods since 430BCE, however, is not that of the a power-law or Zipfian hierarchy. Rather, it is q-exponential, with several parameters that change over time. One is the manageable scale of the system of interaction (κ). Another is the attachment inequality (α), the other is the deviation from BG equipartition of energy principles, q itself, which is a function of α and κ. We have yet to learn how these parameters themselves might influence urban, population, and world-system dynamics and co-evolve with other dynamic variables.

References

Barabási 2000

Goldstone 1991.

Kradin, N. 200?

Kremer 199

Soares, D. J. B., C. Tsallis, A. M. Mariz and L. R. da Silva. 2004. Preferential attachment growth model and nonextensive statistical mechanics.

Turchin 2005.

White, D. R., and M. Houseman. 2002.

White, D. R., and U. Johansen. 2005

White, D. R., N. Keyžar, C. Tsallis, D. Farmer and S. D. White. 2005.