
The Pennsylvania State University

The Graduate School

Department of Economics

ESSAYS ON RUSSIAN ECONOMIC GEOGRAPHY: MEASURING SPATIAL INEFFICIENCY

A Thesis in

Economics

by

Tatiana N. Mikhailova

© 2004 Tatiana N. Mikhailova

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

August 2004

The thesis of Tatiana Mikhailova has been reviewed and approved* by the following:

Barry W. Ickes
Professor of Economics
Thesis Advisor
Chair of Committee

N. Edward Coulson
Professor of Economics

Eric Bond
Professor of Economics

Regina Smyth
Assistant Professor of Political Science

Robert Marshall
Professor of Economics
Head of the Department of Economics

*Signatures are on file in the Graduate School.

Abstract

Compared with other transition countries, Russia faces the burden of extreme cold. This is not, however, strictly a function of geography. Soviet location policy directly affected the average (population-weighted) temperature of the Russian economy. Soviet policy moved industry and population from the western part of the country to the east, effectively making Russia even colder than it was in the pre-Soviet era. Movements to the east have a significant impact on aggregate temperature because the isotherms on the Eurasian continent resemble lines of longitude, not latitude. Thus, the Russian economy entering transition faces not only the usual burden of an inhospitable climate, but also an extra disadvantage due to the legacy of Soviet location policy.

In this thesis I estimate the cost to the Russian economy of the inefficient spatial allocation of its productive resources. Spatial inefficiency results not only in added production and distribution costs (when regional comparative advantages are not exploited and unnecessary transportation and communication expenditures are incurred); the wrong allocation of labor also brings inefficiency in consumption: there are extra costs associated with people living in unsuitable places. In the Russian context, “unsuitable” usually means “too far” and “too cold.” I focus on the “cost of cold,” that is, the cost of production being wrongly located in places with too cold a climate.

The first essay is a counterfactual exercise. To obtain a benchmark of spatial efficiency, I construct an allocation of industry and population that would have resulted in Russia in the absence of Soviet location policy. To design such an allocation, I impose Canadian behavior on Russian initial conditions. I estimate a spatial dynamic model on Canadian regional panel data in a multinomial logit framework. I then project the estimated relationship onto Russia. The result is a hypothetical allocation of population and industry, specific to Russia’s endowment and initial conditions, but free of any disadvantages stemming from Russian historical circumstances.

This procedure, however, ignores the effect of WWII, a major exogenous shock to the Russian economy with no precedent in Canada. We should expect that war would have an impact on industry allocation irrespective of the economic or political system. To account for the possible effects of WWII I conduct a separate simulation exercise, taking into account the fact that the war was fought primarily in the west. The results of this exercise show that the eastern part of Russia is still significantly overdeveloped; in other words, WWII explains only a small part of the misallocation.

I construct an index of Temperature Per Capita (TPC) to capture the effect of location on aggregate temperature. Unlike other temperature indicators widely used in the empirical growth literature, TPC is obtained by aggregating temperature readings not over the territory, but over the population distribution. Thus, it provides a more informative measure of the temperature-related comparative advantages or disadvantages of the economy, especially in the short run. I use the estimated allocation of population to construct the counterfactual TPC. The comparison with the actual TPC reveals that, due to Soviet location policy, Russia has become about 1.5°C “colder.”

In the second essay I estimate the cost of cold directly. The most profound consequences of cold are extra energy use, added construction costs, health effects and productivity loss. Using Russian regional data on energy use, health and production I estimate the elasticity of each of these factors with respect to temperature. I then use these elasticities together with the TPC indices of the actual and the projected allocation to estimate the extra burden of cold that resulted from the Soviet location policy.

The results show a strong and significant effect of cold on all the factors examined. An increase in TPC of 1.5°C would have improved aggregate health indicators: the average infant mortality rate would have been 1.5% lower, and the country-wide aggregate mortality rate at least 0.8% lower. The estimates for the construction industry reveal a 3.5% productivity loss in the actual allocation compared to the counterfactual.

The most significant impact of cold is on energy consumption. Cross-sectional analysis reveals that consumption of various kinds of energy by manufacturing producers increases by 2.5 to 4% when January temperature drops by 1°C. Thus, the 1.5°C TPC difference between the actual and counterfactual allocations translates into a 3.5-6% increase in industrial energy consumption country-wide. Similar results were obtained for residential energy consumption: various estimates point to a 6 to 9% total energy loss due to the misallocation of population. The cost of extra energy consumption and the loss of construction productivity amount to 1.2% to 2.1% of Russian GDP yearly.

Contents

List of Tables

List of Figures

Acknowledgments

1 Introduction

2 Where Russians Should Live
    2.1 Stylized facts
    2.2 The idea
    2.3 The theoretical framework
    2.4 Data description
    2.5 The procedure and the estimation results
        2.5.1 Estimating the Canadian panel dynamic model
        2.5.2 Estimation issues
        2.5.3 Projecting Canadian behavior onto the Russian data
        2.5.4 Accounting for WWII
        2.5.5 Correction for the exogenous cross-regional fertility differences
        2.5.6 Temperature per capita dynamics
        2.5.7 Alternative criteria for model selection and robustness checks
    2.6 Conclusions

3 The Cost of the Cold
    3.1 The Role of Climate
    3.2 Energy
        3.2.1 Energy use in producing sectors
        3.2.2 Residential energy consumption
        3.2.3 Cost
    3.3 Productivity
        3.3.1 Aggregate Production
        3.3.2 Construction
    3.4 Health
        3.4.1 Mortality and morbidity
        3.4.2 What is the cost of the excess mortality?
    3.5 Conclusions

Appendices

A Details of dataset construction
    A.1 Dependent variables: population and manufacturing employment
    A.2 Regional characteristics

B Algorithm for choosing the optimal model

C Tables

Bibliography

List of Tables

2.1 The results of the Monte-Carlo simulations for the projected Siberian population: select distribution quantiles
2.2 Excess population in Siberia and Far East, according to alternative forecast models

3.1 Heating degree-days for select cities and Russia as a whole
3.2 Savings of energy in counterfactual relative to actual allocation
3.3 Counterfactual “savings” of energy
3.4 Regional production function estimates (second stage)
3.5 Regional production function estimates, GRP deflated by subsistence minimum index (second stage)
3.6 Relative prices of construction output
3.7 Construction industry production function IV estimation. First stage
3.8 Construction industry production function IV estimation. Second stage
3.9 Production function estimates. Test of the robustness to the weighting
3.10 Aggregate morbidity rate as a function of temperature. Robustness to specification
3.11 Aggregate mortality rate as a function of temperature. Robustness to specification
3.12 Standardized mortality rate as a function of temperature
3.13 Infant mortality rate as a function of temperature

A.1 Regional characteristics

C.1 Results of the restricted system estimation. Equations for population
C.2 Results of restricted system estimations. Equations for industry
C.3 Projected vs actual population. Canada
C.4 Projected vs actual population. Russia
C.5 The industrial structure control series
C.6 Electricity in 1991
C.7 Electricity in 1992
C.8 Thermal energy in 1991
C.9 Thermal energy in 1992
C.10 Fuels in 1991
C.11 Fuels in 1992

List of Figures

2.1 Isotherms: average January air temperature
2.2 Change in TPC index in Russia, Canada and USA
2.3 Administrative divisions in the Russian Empire and the borders of the Soviet Union and the Russian Federation
2.4 Major mineral resources in Russia and Canada
2.5 Projected vs. actual population. Canada
2.6 Projected vs. actual population. Russia
2.7 Projected vs. actual population. Russia. Accounted for the impact of WWII
2.8 Projected vs. actual population. Russia. Corrected for fertility differences
2.9 Projected vs. actual population. Russia. Accounted for the impact of WWII and corrected for fertility differences
2.10 Projected and actual TPC dynamics in Canada
2.11 Projected and actual TPC dynamics in Russia
2.12 Projected and actual TPC dynamics in Canada. Alternative models compared

3.1 Energy sectors: consumption structure and counterfactual savings
3.2 Gross regional product, prices and temperature

Acknowledgments

I would like to thank my advisor Barry W. Ickes for his guidance, invaluable advice and encouragement. I am indebted to Herman Bierens, Eric Bond, Edward Coulson, Alexei Deviatov, Clifford Gaddy, Susumu Imai, Vijay Krishna, Joris Pinkse, Regina Smyth and many workshop participants at Cornell University, SMYE ’02, Midwest Economics Meeting, CEFIR, and Penn State for many helpful comments and discussions. I am grateful to Marjory Winn for her help with climatic data, and to Yuri Andrienko, Yevgeniya Bessonova, and CEFIR for help and advice on Russian regional data. All errors are mine.

Chapter 1

Introduction

More than a decade has passed since the collapse of the communist regimes in the countries of Eastern Europe and the USSR. The subsequent years of transition saw examples of both success and difficulty in overcoming the legacy of economic distortions inherited from the planned system. To what extent is the variation in economic performance among the transition countries determined by the initial conditions? One of the stylized facts of transition is that the countries where the legacy of the Soviet period was the deepest – the former USSR – experienced the largest drops in GDP and longer recessions. A part of the difference in performance is without a doubt attributable to the differences in policies. However, the role of initial conditions should not be discounted.1

Among all transition economies, the Russian economy is among the most (if not the most) affected by the legacy of the Soviet regime. The focus of most researchers, however, has been mainly on structural and institutional distortions in Russia: the allocation of production among sectors and industries and the absence of institutions conducive to the market environment. This work focuses on another feature of the initial conditions that sets Russia apart – geography. Russian geography is unique in two respects: physical characteristics (size, climate, location) on the one hand, and the extent of Soviet distortions in the spatial dimension on the other. Not only does Russia have an unfavorable geographical endowment, but it also uses this endowment badly.

Russia’s position on the globe can hardly be characterized as favorable. Empirical cross-country studies of growth generally find a positive role for such characteristics of geographical location as proximity to other markets, access to the seashore, land quality and a mild climate.2 Russian climatic conditions are harsh; resources and population are dispersed over a vast territory; and the few natural transportation routes (rivers, seas) are located unfavorably for both internal and international trade, which leads to the prevalence of costly land transport. Natural resources, though abundant, are located primarily far from population centers, in the regions with the most severe climate and the least developed infrastructure. Russia’s size itself is a source of higher transaction costs: transportation and communications have to reach over larger distances. All these factors drive production costs up, leaving Russia at an absolute disadvantage. The unfortunate geography of Russia and its impact on economic performance have been noted recently by Lynch (2002) and Parshev (1999).

Russia’s problems, however, do not end with the poor overall geographical endowment. The spatial distribution of economic activity inside present-day Russia carries the legacy of Soviet investment decisions – the geographical endowment is not being used efficiently.

In the field of economic geography, two fundamental factors are named as the driving forces behind the spatial allocation of economic activity – increasing returns to scale and location fundamentals.3 How important each of these factors is in reality is still unclear (Davis & Weinstein (2001)). It is normally presumed, however, that whenever these two forces are allowed to work freely, in the long run the comparative advantages of the favorable locations are indeed exploited.4

It is quite obvious that the Russian economy presents a case in which neither of the two forces was actually at work. In fact, in no other country in the world is the spatial allocation of industry as close to being exogenous as in Russia. The industrial structure of the Soviet Union was produced by central planning, with investment decisions made in the absence of market prices. Taking into account that the economic efficiency objectives (if any) of the Soviet planning system were distorted by ideology, geopolitical concerns, military doctrines, historical circumstances, and other issues specific to the Soviet period, it is safe to expect that the existing spatial allocation of productive resources in Russia is far from optimal. More importantly, with Russia’s large size comes a wider margin of error: if economic activity is misallocated, it is more likely to be significantly misallocated, and, given Russia’s unfavorable geography, such misallocations can be costly.

The present-day Russian economy bears not only the usual burden of an unfortunate geographical location and climate, but also, as a direct result of Soviet location policy, suffers additional losses due to the inherited spatial inefficiency. But while Russian geography cannot be changed, the spatial allocation of industry and people inside the country can be. Thus, there is room for policy.

1 Ultimately, policy design is endogenous to the initial conditions, because the very role of any policy in transition is to bridge the gap between the Soviet inheritance and the free market, subject to the (social, economic) constraints set by the same initial conditions.
2 Gallup, Sachs & Mellinger (1999) conclude that both hot climate and location away from the seashore hinder economic performance. Bloom & Sachs (1998) point out the hot climate of Central Africa, responsible for disease transmission, as one of the factors hindering economic development. Rappaport & Sachs (2001) find a present-day positive productivity effect for the population growth in US counties located near seashores and navigable waterways.
3 See Fujita, Krugman & Venables (2000) (FKV) for a collection and comprehensive analysis of the models of spatial economy to date.
4 According to theoretical models in spatial economics (see FKV, for example), increasing returns may result in a multiplicity of spatial equilibria, and therefore even a market economy may be locked in a sub-optimal allocation. As far as reality is concerned, to the best of my knowledge the phenomenon of spatial suboptimality has not been observed and studied by researchers, save in the case of the former Soviet bloc countries – in this instance being obviously due to policy intervention.

The challenge for policy analysis is associated with the following problem. While we can almost certainly infer that the Soviet system deviated from the optimal path of spatial development, the extent of the distortion is not known until a counterfactual path – a spatial development pattern that market forces would have produced – has been derived.5 And if this counterfactual path of spatial development is indeed significantly different from Soviet history, then the next logical question is: “What is the economic cost of the Soviet distortions?”

This thesis is organized as follows. In chapter 2, I conduct a counterfactual exercise. I obtain the hypothetical counterfactual allocation of industry and population that would have resulted in Russia under market conditions and in the absence of Soviet location policy. Chapter 3 answers the second question: using the results of chapter 2, I estimate the specific costs associated with Soviet spatial misallocations.

5 The most prominent example of counterfactual analysis in economics is the classic work by Fogel (1964) on the importance of railroads for the American economy.

Chapter 2

Where Russians Should Live: A Counterfactual Alternative to Soviet Location Policy.

The history of Soviet location decisions can be characterized by a simple motto: “Go East.” During the Soviet period, Siberia and the Far East developed at a faster pace than the rest of the country, gaining in both share of population and share of industrial production. The massive movement of people to these regions, places with the most hostile climates and far from existing population centers, has no precedent either in earlier Russian history or in any other country. As a direct result of Soviet policy, a large share of the present-day Russian economy operates in unfavorable conditions: too far from the markets, too cold. Was the aggressive development of Siberia and the Far East inefficient from a market economy point of view? The analysis presented in this chapter aims to answer this question by building a market-based counterfactual pattern of spatial development in Russia.

This chapter is organized as follows. Section 2.1 discusses the historical background: the features of Soviet location policy and the facts from the transition period. Section 2.2 describes the general idea and methodology. Section 2.3 gives the setup for the empirical part. Section 2.4 provides the data description. Section 2.5 outlines the estimation procedure and the results. Section 2.6 concludes.

2.1 Stylized facts

Throughout the course of the 20th century both population and industry in the USSR were moved en masse from the west to the east, from the European part of the country to the regions east of the Urals. The share of the total population living in Siberia and the Far East increased from 5.5% in the Russian Empire in 1910 to about 13% in the USSR in 1989. Besides being remote and less developed, these regions are famous for their extremely cold climates. A peculiar fact of the climatic geography of the Eurasian continent is that isotherm lines resemble lines of longitude rather than latitude. Thus, in the process of populating Siberia, and even the Urals, people were moving across isotherm lines: from warmer to colder places (see Figure 2.1).

Figure 2.1: Isotherms: average January air temperature.

After the breakup of the Soviet Union the new Russian state received all of the regions with the most extreme climate under its jurisdiction, together with all the associated problems. As a result, the economy inherited by Russia is probably the most distorted among the Newly Independent States, not only structurally but also spatially.1 At the time of the collapse of the Soviet Union in 1991 the population of Siberia and the Far East reached 25% of the Russian Federation’s total. No other country in the world has such a high share of its population living in climates so cold. For example, the population of the Canadian Yukon and Northwest Territories, comparable in climate to Siberia, is only 0.3% of the total (1991).2

1 The majority of the defense industry was located in Russia. Gaddy gives Russia’s share of Soviet Union defense complex employment in 1985 as 71.2%, and its share of population as 51.8%.
2 Canada is similar to Russia not just in climate, but also in the share of its territory that lies north of the Arctic Circle.

Temperature and economic growth

Does climate matter for economic performance? A common finding in the empirical growth literature is that hot climates are bad for economic growth, although the causes of this empirical regularity are still being debated. For obvious reasons, one would expect cold climates to have a similar effect. Extreme cold is dangerous for human survival, so it is bound to be bad for economic activity.3 Although it is possible to adapt to cold, this adaptation is costly: more energy is required for heating; the productivity of both labor and machinery is decreased; construction costs are multiplied due to extra material requirements, lower productivity, and quicker wear; and the effect on people’s health is enormous.

The adverse effect of cold on economic performance (either on level or growth) has not been studied very closely to date. There are several possible reasons why. First, data are limited: most growth studies either exclude Russia (or the Soviet Union) from the dataset, or use only the hardly reliable Soviet-era data. However, missing or mismeasuring this observation can prove crucial. Russia is not just a cold country; it is the cold country. By territorial temperature aggregations Russia stands out as very cold even among other cold countries. A map of the isotherms in Europe shows that much of Russia has colder winters than even Norwegian ports well above the Arctic Circle (see Figure 2.1).

Second, all of the empirical studies of temperature and growth use territorial aggregations of climate variables (temperature, or number of frost days, as in Masters & McMillan (2000)). But what is important for studies of economic activity is where the people actually live. According to the territorial temperature aggregations, the countries of Northern Europe, Sweden among them, appear to be cold. In fact, in these countries the population is concentrated along the seashores and in the south, where temperatures are not significantly different from the rest of Europe. The same is true for Canada, where people mainly live along the southern border.

As an alternative to the territorial temperature aggregations, Gaddy & Ickes (2001) propose the Temperature Per Capita (TPC) index.4 We can define the TPC of country k as:

$$TPC_k = \sum_j \eta_j \tau_j, \qquad (2.1)$$

where $\eta_j$ is the share of the country’s total population that resides in region j, and $\tau_j$ is the mean temperature in region j. TPC is typically measured for a given month – in most cases, January, the coldest month of the year. Regions are usually basic administrative units, such as provinces or states. In essence, TPC is a countrywide average temperature aggregated over the spatial distribution of population.

Clearly, TPC for a country is not constant over time. If people migrate from colder to warmer places, TPC rises. In this way, a change in TPC serves as an index measure of the climate-related effect of spatial economic policy. This measure is especially useful in the case of Russia since, given Russia’s geographical endowment, (alleged) spatial inefficiency is synonymous with cold.

3 Evolutionary theory seems to be in accord with this: “Climate plays an important part in determining the average number of a species, and periodical seasons of extreme cold or drought, I believe to be the most effective of all checks ... In going northward, or in ascending a mountain, we far oftener meet with stunted forms, due to the directly [emphasis in original] injurious action of climate, than we do in proceeding southwards or in descending a mountain.” – Charles Darwin, 1859, The Origin of Species, Oxford World’s Classics ed., pp. 57-58.
4 In a way, territorial temperature aggregations are measures of a country’s climatic endowment. TPC describes how this endowment is used. Given the great degree of inertia in the spatial distribution of population, the latter is a more useful measure of climate-related comparative advantages or disadvantages, especially in the short run.

Figure 2.2: Change in TPC index in Russia, Canada and USA.

In terms of TPC, especially its dynamics, Russia stands out even more. Not only was Russia colder than other countries at the beginning of the century, but TPC in Russia also fell even further through the Soviet years. While market economies were gradually warming up, with capital and labor free to migrate to more favorable locations, Russia got even colder (see Figure 2.2).
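Because equation (2.1) is simply a population-weighted average, the index and its response to migration can be illustrated with a few lines of Python. This is a minimal sketch, not the procedure used in the thesis: the function name `temperature_per_capita` and the two-region numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def temperature_per_capita(population, temperature):
    """Population-weighted mean temperature, as in equation (2.1).

    population  -- regional population counts (any common unit)
    temperature -- regional mean January temperatures, in degrees C
    """
    total = sum(population)
    if total <= 0:
        raise ValueError("total population must be positive")
    # population[j] / total is region j's population share (eta_j)
    return sum(p * t for p, t in zip(population, temperature)) / total


# Hypothetical two-region country: a mild west and a cold east.
temps = [-5.0, -25.0]
before = temperature_per_capita([80, 20], temps)  # 80% live in the west
after = temperature_per_capita([70, 30], temps)   # 10% have moved east

print(before, after, after - before)
```

In this toy case, moving a tenth of the population east lowers TPC from -9°C to -11°C even though no regional temperature has changed, which is the sense in which location policy can make a country "colder."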

Soviet spatial policy: inefficient?

Of course, the fact that the eastern parts of Russia were so aggressively developed during Soviet times does not by itself prove that this was economically inefficient. One should not expect the regional distribution of industry to stay the same over time. Remote regions should develop if and when technology makes it cost-effective.

Migration trends during the transition period exhibit a pattern opposite to that of Soviet times. This is further evidence that the eastern regions were “overdeveloped” during Soviet times, and that they host too many people and too much production from a market economy point of view. But even though the negative net migration picture can pinpoint the most evident regional problems, it cannot show the degree of the misallocation. A positive net migration flow alone does not prove that a region is economically viable in the long run, but only shows that in the short run there exists a region that performs even worse.5

5 Moreover, in the short run a poorly performing region may be locked in a poverty trap: people are credit constrained and poor, and therefore unable to move out.

Direct comparisons with other countries – Canada, Northern Europe – are also at best illustrative. Though the direction and scale of population movements in the Soviet Union have no precedent anywhere in the world, there are some important issues to take into account.

First, the eastern regions of Russia are rich in natural resources. It could be efficient to establish production near the primary inputs. Second, even in the times of the Russian Empire, Siberia was already populated, with several major cities established. Even if the location fundamentals were not favorable, the increasing returns argument could validate the further development of the Russian East. Third, it may be strategically wise to populate the regions bordering historically unfriendly neighbors. Fourth, WWII was responsible for the destruction of infrastructure in the western parts of the Soviet Union and for the shift of major defense industries to the Urals. Partly, this shift was due to political decisions of the Soviet authorities, but even without the political pressure we could expect to see a similar effect in any kind of economy.

These considerations imply that the endowments and the historical circumstances specific to Russia must be factored in to make an interesting counterfactual. We must therefore simulate how Russia might have developed if market forces had operated, while incorporating the special considerations mentioned above.

2.2 The idea

The idea for the simulation exercise is to use Canadian behavior as a benchmark of the spatial dynamics in a market economy, but to apply it to Russian initial conditions and endowment. Using Canadian data, I estimate a model that characterizes the dynamic links between the spatial structure of the economy, on the one hand, and initial conditions and regional characteristics, on the other. This model can then be applied to Russia to produce the counterfactual “market” allocation.6

6 My concern is with the impact of Soviet location policy on the economic geography of the Russian Federation, primarily on Siberia and the Far East. However, for most of the century Russia was a part of the common market of the Soviet Union. In order to correctly account for the possibility of interregional migration, the projections must be applied to the whole territory of the Soviet Union. Thus, the dataset covers not only the present-day Russian Federation but also some of the territory that now belongs to other Newly Independent States. All regions that were part of both the Soviet Union and the Russian Empire are included. I apologize for using the term “Russia” to refer to this artificial territorial entity throughout. I do this purely for the purpose of simplicity. Figure 2.3 shows the administrative borders on the territory of the former Russian Empire and the former USSR.

Figure 2.3: Administrative divisions in the Russian Empire and the borders of the Soviet Union and the Russian Federation.

Why Canada? Applying the spatial dynamic relationships from one country to another is most justified when the countries are similar in their endowments and stages of development. Thus, the choice of Canada as a benchmark is relatively obvious: there is no other country in the world closer to Russia in climate and size. Both economies possess and export abundant natural resources. (Figure 2.4 shows the geography of major mining operations in Canada and Russia.) Less obvious, but also important, is the fact that both Russia and Canada at the beginning of the century had (and still have) vast amounts of undeveloped land. Russia was still effectively expanding east, and Canada was colonizing its west. Neither country seemed to be in long-run spatial equilibrium, but they were “moving in similar directions.”

Figure 2.4: Major mineral resources in Russia and Canada.

At the same time, Canada is diverse enough to generate the needed variance in the data. For example, in Russia the coldest regions are also the most remote ones. In Canada, this is not true to the same degree. The city of Vancouver has the warmest climate but is located rather far from the most populous city, Toronto. Hopefully, this will allow us to effectively separate the two factors: cold and distance.

On the other hand, Canada has the benefit of much better access to oceans, and a better natural water transportation network than Russia. Hence, due to the lower costs of transportation, Canada is in a better position for both internal and international trade. Canada also enjoys sharing its (only) international border with its major trading partner, the USA – a friendly neighbor and a large market. Russia, in contrast, shares borders with an extremely diverse set of countries, and its relationships with the neighbors were uneasy at times. Among the major trading partners of Russia, at the beginning of the century as well as now, are Germany and the United Kingdom, and, strikingly, none of the bordering states. While this situation might partly have developed endogenously, either due to comparative advantage patterns or (later) due to particular Soviet political choices, any model estimated on Canadian data is still likely to somewhat overstate the importance of geographical proximity for trade if applied to Russia.

Another challenge in applying the Canadian model to the Russian data lies in the fact that Canadian regions are much more homogeneous than Russian ones with respect to ethnic composition, human capital, and culture.7

I develop an empirical model of spatial population dynamics that is estimated on Canadian regional data. The equations, in general functional form, link regional population growth to past population, industry, and various location characteristics. Then the equations with the fitted coefficient values are applied to the Russian regional data on initial (beginning of the century, before the revolution) population and industry and on the same set of regional characteristics. The result of the projections is the counterfactual allocation: specific to the Russian starting point and geographical characteristics, but obtained using the dynamic relationship fitted on a market economy. The procedure is described in greater detail in Section 2.5.

Thus, the assumption behind the procedure is not that the spatial structures of different market economies should be similar, but rather that the dynamic forces that act on location should be similar. In other words, we do not just compare the existing spatial allocations in Russia and Canada; we look at the changes in structure over time: initial conditions matter.
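The two-step logic of this section (fit a dynamic relationship on the benchmark country, then apply the fitted coefficients to another country's initial conditions) can be sketched as follows. This is an illustrative simplification under stated assumptions: it uses ordinary least squares on a log-growth equation rather than the multinomial logit system estimated in the thesis, and every number, coefficient, and function name (`design_matrix`, `fit_growth_model`, `project_shares`) is hypothetical.

```python
import numpy as np


def design_matrix(pop, temp):
    """Columns: intercept, log of lagged population, January temperature."""
    pop = np.asarray(pop, dtype=float)
    temp = np.asarray(temp, dtype=float)
    return np.column_stack([np.ones(len(pop)), np.log(pop), temp])


def fit_growth_model(pop, temp, log_growth):
    """Least-squares fit of regional log population growth (benchmark country)."""
    coef, *_ = np.linalg.lstsq(design_matrix(pop, temp), log_growth, rcond=None)
    return coef


def project_shares(coef, pop, temp):
    """Apply fitted coefficients to another country's initial conditions
    and return the projected regional population shares."""
    pop = np.asarray(pop, dtype=float)
    projected = pop * np.exp(design_matrix(pop, temp) @ coef)
    return projected / projected.sum()


# Step 1: hypothetical benchmark regions. Growth is generated from known
# (made-up) coefficients so that the fit can be checked exactly.
true_coef = np.array([0.5, -0.05, 0.02])
bench_pop = [50.0, 200.0, 800.0, 120.0]
bench_temp = [-5.0, -10.0, -20.0, -25.0]
bench_growth = design_matrix(bench_pop, bench_temp) @ true_coef
coef = fit_growth_model(bench_pop, bench_temp, bench_growth)

# Step 2: project onto two hypothetical target regions that start with
# equal populations but differ in climate.
shares = project_shares(coef, [100.0, 100.0], [-10.0, -30.0])
print(coef, shares)
```

With a positive temperature coefficient, the warmer of two otherwise identical regions ends up with the larger projected share. The point of the sketch is only the separation of roles: behavior (the coefficients) comes from the benchmark economy, while starting points and regional characteristics come from the country being projected.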

2.3 The theoretical framework

Consider a discrete-time setting with an infinite horizon where, in each period, infinitely many small agents (people) choose a location (i.e., a place to live) that maximizes their utility. Simultaneously, firms choose locations that maximize expected profit.8 The people who have chosen a particular location compose the population of this location. The “amount” of industry in a location is characterized by the total number of people employed in manufacturing.

I model the spatial distribution of population and industry using the framework of the multinomial discrete choice (logit) model.9 Both indirect utility and profit functions are defined over a finite set of alternatives – locations. The location choices are interdependent: people choose where to live taking into account the industry locations and other people’s decisions, and firms choose where to locate taking into account the locations of both population and other firms.

I use a reduced-form model. Utility in a location and potential profit depend on the past values of population and of industry employment, as well as on other exogenous characteristics of this location. Among these characteristics

7 For example, we might expect, based on cultural reasons alone, a higher than average population growth in the Central Asian parts of the USSR. On average, family size has traditionally been larger in those regions.
8 Assume firms are economically small, so that their location decisions do not affect regional profitability.
9 The application of the multinomial logit model to problems of location choice is a fairly standard approach, pioneered by Carlton (1983).

are: temperature as a climate proxy, agricultural potential, natural resources, sea and rivers as natural transportation routes, infrastructure (man-built transportation network), international trade routes as a proxy for the size of potential markets, etc. Assuming utility and profit functions are linear, write

\[
u_{ijt} = \beta^{\mathrm{pop}}_t \ln \mathrm{POP}_{j,t-1} + \beta^{\mathrm{ind}}_t \ln \mathrm{IND}_{j,t-1} + \sum_{k=1}^{n} \beta^{k}_t x^{k}_{j,t-1} + \delta^{u}_{jt} + \varepsilon^{u}_{ijt}, \tag{2.2}
\]

\[
\pi_{ijt} = \alpha^{\mathrm{pop}}_t \ln \mathrm{POP}_{j,t-1} + \alpha^{\mathrm{ind}}_t \ln \mathrm{IND}_{j,t-1} + \sum_{k=1}^{n} \alpha^{k}_t x^{k}_{j,t-1} + \delta^{\pi}_{jt} + \varepsilon^{\pi}_{ijt}, \tag{2.3}
\]

where $u_{ijt}$ is the utility of agent $i$ in location $j$ in period $t$; $\mathrm{POP}_{j,t-1}$ and $\mathrm{IND}_{j,t-1}$ are, correspondingly, population and industrial employment in location $j$ in period $t-1$; $x^{1}_{j,t-1}, \ldots, x^{n}_{j,t-1}$ are various characteristics of location $j$, possibly time-dependent; $\delta^{u}_{jt}$ and $\delta^{\pi}_{jt}$ are normally distributed shocks to utility or profit specific to location $j$ and time $t$; $\varepsilon^{u}_{ijt}$ and $\varepsilon^{\pi}_{ijt}$ are Weibull-distributed agent-location-time specific shocks to utility or profit; and the $\alpha$’s and $\beta$’s are parameters. The parameters of the utility and profit functions are time-dependent, so that the role of the different factors may change over time. Denote

\[
u_{jt} = \beta^{\mathrm{pop}}_t \ln \mathrm{POP}_{j,t-1} + \beta^{\mathrm{ind}}_t \ln \mathrm{IND}_{j,t-1} + \sum_{k=1}^{n} \beta^{k}_t x^{k}_{j,t-1}, \tag{2.4}
\]

where $u_{jt}$ is the deterministic component of the utility function in location $j$ at time $t$. It is common to all agents. Also,

\[
\pi_{jt} = \alpha^{\mathrm{pop}}_t \ln \mathrm{POP}_{j,t-1} + \alpha^{\mathrm{ind}}_t \ln \mathrm{IND}_{j,t-1} + \sum_{k=1}^{n} \alpha^{k}_t x^{k}_{j,t-1}, \tag{2.5}
\]

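where $\pi_{jt}$ is the deterministic component of the profit function, common to all agents. The probability that person $i$ chooses location $j$ at time $t$ can be expressed as

\[
\begin{aligned}
P^{\mathrm{pop}}_{ijt} &= P\left[ u_{jt} + \delta^{u}_{jt} + \varepsilon^{u}_{ijt} > u_{lt} + \delta^{u}_{lt} + \varepsilon^{u}_{ilt} \mid \forall l \right] \\
&= \prod_{l \neq j} P\left[ u_{jt} + \delta^{u}_{jt} + \varepsilon^{u}_{ijt} > u_{lt} + \delta^{u}_{lt} + \varepsilon^{u}_{ilt} \right],
\end{aligned} \tag{2.6}
\]

and the probability that firm $i$ chooses location $j$ at time $t$ is

\[
\begin{aligned}
P^{\mathrm{ind}}_{ijt} &= P\left[ \pi_{jt} + \delta^{\pi}_{jt} + \varepsilon^{\pi}_{ijt} > \pi_{lt} + \delta^{\pi}_{lt} + \varepsilon^{\pi}_{ilt} \mid \forall l \right] \\
&= \prod_{l \neq j} P\left[ \pi_{jt} + \delta^{\pi}_{jt} + \varepsilon^{\pi}_{ijt} > \pi_{lt} + \delta^{\pi}_{lt} + \varepsilon^{\pi}_{ilt} \right].
\end{aligned} \tag{2.7}
\]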

Under the assumption that the $\varepsilon$’s are independent Weibull-distributed random variables, these probabilities take the following form:10

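\[
P^{\mathrm{pop}}_{jt} = \frac{e^{u_{jt} + \delta^{u}_{jt}}}{\sum_{l=1}^{L} e^{u_{lt} + \delta^{u}_{lt}}}, \tag{2.8}
\]

and

\[
P^{\mathrm{ind}}_{jt} = \frac{e^{\pi_{jt} + \delta^{\pi}_{jt}}}{\sum_{l=1}^{L} e^{\pi_{lt} + \delta^{\pi}_{lt}}}. \tag{2.9}
\]

One can show that the observed shares of population and employment converge to the above values (equations (2.8) and (2.9)) as the number of agents in the economy increases. Let $S^{\mathrm{pop}}_{jt}$ be the observed share of population and $S^{\mathrm{ind}}_{jt}$ the observed share of employment; then, setting $S^{\mathrm{pop}}_{jt} = P^{\mathrm{pop}}_{jt}$, $S^{\mathrm{ind}}_{jt} = P^{\mathrm{ind}}_{jt}$ and taking natural logarithms of both sides, we get

\[
\ln S^{\mathrm{pop}}_{jt} = u_{jt} + \delta^{u}_{jt} - \ln\left( \sum_{l=1}^{L} e^{u_{lt} + \delta^{u}_{lt}} \right), \tag{2.10}
\]

and

\[
\ln S^{\mathrm{ind}}_{jt} = \pi_{jt} + \delta^{\pi}_{jt} - \ln\left( \sum_{l=1}^{L} e^{\pi_{lt} + \delta^{\pi}_{lt}} \right). \tag{2.11}
\]

Now consider the difference between log-shares in any two locations $j$, $l$: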

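\[
\ln S^{\mathrm{pop}}_{jt} - \ln S^{\mathrm{pop}}_{lt} = u_{jt} - u_{lt} + \xi^{u}_{jt}, \tag{2.12}
\]

and

\[
\ln S^{\mathrm{ind}}_{jt} - \ln S^{\mathrm{ind}}_{lt} = \pi_{jt} - \pi_{lt} + \xi^{\pi}_{jt}, \tag{2.13}
\]

where $\xi^{u}_{jt} = \delta^{u}_{jt} - \delta^{u}_{lt}$ and $\xi^{\pi}_{jt} = \delta^{\pi}_{jt} - \delta^{\pi}_{lt}$ are zero-mean normal variables.

Taking into account the loss of one degree of freedom due to the fact that $\sum_{l=1}^{L} S^{\mathrm{pop}}_{lt} = \sum_{l=1}^{L} S^{\mathrm{ind}}_{lt} = 1$, assume that $S_L$ is fixed and exclude location $L$ from the sample. Ignoring the common denominator, and substituting the expressions (2.4) and (2.5) for $u_{jt}$ and $\pi_{jt}$, we get the equations to be estimated:

\[
\ln \mathrm{POP}_{jt} = \beta^{0}_t + \beta^{\mathrm{pop}}_t \ln \mathrm{POP}_{j,t-1} + \beta^{\mathrm{ind}}_t \ln \mathrm{IND}_{j,t-1} + \sum_{k=1}^{n} \beta^{k}_t x^{k}_{j,t-1} + \xi^{u}_{jt}, \tag{2.14}
\]

\[
\ln \mathrm{IND}_{jt} = \alpha^{0}_t + \alpha^{\mathrm{pop}}_t \ln \mathrm{POP}_{j,t-1} + \alpha^{\mathrm{ind}}_t \ln \mathrm{IND}_{j,t-1} + \sum_{k=1}^{n} \alpha^{k}_t x^{k}_{j,t-1} + \xi^{\pi}_{jt}. \tag{2.15}
\]

The system (2.14) and (2.15) is estimated on the panel dataset.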
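The step from the share formulas (2.8) and (2.9) to the log-share differences (2.12) and (2.13) can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original estimation: the function name `logit_shares` and the four location values are hypothetical and chosen only for illustration.

```python
import numpy as np

def logit_shares(v):
    """Multinomial logit shares exp(v_j) / sum_l exp(v_l), the form of (2.8)-(2.9)."""
    v = np.asarray(v, dtype=float)
    w = np.exp(v - v.max())  # subtracting the max leaves the shares unchanged
    return w / w.sum()

# Hypothetical values of u_jt + delta_jt for four locations (illustrative only).
v = np.array([1.2, 0.3, -0.5, 2.0])
shares = logit_shares(v)

# The common denominator drops out of the difference in log-shares,
# which therefore equals the difference in the underlying values.
log_share_gap = np.log(shares[0]) - np.log(shares[1])
value_gap = v[0] - v[1]
```

Because the denominator is common to all locations, it is absorbed by the intercept once the equations are written in levels, which is why (2.14) and (2.15) can be estimated as linear regressions.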

10 The assumption that the $\varepsilon_{ijt}$ are independent is necessary for the probabilities to have a simple functional form. In reality, we could expect the individual unobservable shocks to be correlated in the spatial dimension: if a person prefers a particular location, locations near it are possibly also more attractive to him on average.

2.4 Data description

I assembled a panel dataset to use in the estimations for Canada, and for the projections onto Russia, from various Canadian and Russian population census publications, statistical yearbooks, and maps.11 The panel dataset for Canada contains population data for 9 time points, one in every decade, from 1911 to 1991. Data for industrial employment are available for 1911 and for 1941-1991; for 1921 and 1931 data are missing. For 1921-1931, census publications give industry data only for cities with a population greater than 15,000, but not for census districts. These years were dropped from the sample for the industry equation, or replaced by 1911 data in the population equations for 1931 and 1941. Data for 1981 are not included in census publications, but are available from Statistics Canada.12 Unfortunately, due to confidentiality issues, several small monoindustrial census districts are not listed. Thus, the quality of the data available for 1981 is substantially lower.

For each year the sample consists of 37 or 38 (upon inclusion of Newfoundland into Canada) observations. The dependent variables are population ($\mathrm{POP}_{j,t}$) and manufacturing industry employment ($\mathrm{IND}_{j,t}$) in a region. The set of independent variables $\{x^{k}_{jt}\}$ includes the following regional characteristics (time and location subscripts are omitted): area (AREA), mean January temperature (TEMP), distance to the largest city (DISTCAP), number of railroads (RR), natural resources dummy variables (OIL, , METALS, TIMBER), and geographical characteristics dummies such as presence of a trade route (R ABROAD), access to waterways (PORT), quality of land (FARMING), and the urbanization rate at the beginning of the century (URBAN). Appendix A has detailed information on the data sources.

For Russia, the dependent variables (population, industry employment) are collected for only two years: the starting year, 1910, and the final year, 1989. 
The 1910 data will be used as the starting point for the projections, and the 1989 data will be used for the actual vs. counterfactual comparison. The set of Russian regional characteristics is the same as for Canada, and data are collected for all the years present in the Canadian dataset. The sample size (number of regions included) for Russia is 79 in each year. Appendix A gives further details on the data.

2.5 The procedure and the estimation results

This section describes the empirical procedure. The course of action can be outlined in four steps.

11 For more information on data sources and the details of the dataset construction, see Appendix A.
12 “Manufacturing Industries of Canada: Geographical Distribution,” Regional and Small Business Statistics Section, Manufacturing and Primary Industries Division, Statistics Canada, 1982.

Step 1 Estimate the dynamic panel model for Canada to obtain the fitted values of the model parameters.

Step 2 Project the estimated relationship onto Russian data. The result is the counterfactual spatial population distribution (not accounting for WWII).

Step 3 Incorporate the WWII effect into the projections.

Step 4 Correct the projections for the exogenous inter-regional fertility differences.

The counterfactual spatial population distributions obtained in steps 2, 3 and 4 can be used to construct counterfactual TPC indices. The following subsections describe the above steps in greater detail and report the results.

2.5.1 Estimating the Canadian panel dynamic model

The system of equations (2.14) and (2.15) is estimated on the Canadian panel data. The subscript (t−1) indicates the value taken in the preceding time period in the dataset (a 10-year lag).13

All parameters (α’s and β’s) in equations (2.14) and (2.15) have a time subscript t attached, indicating time-dependence. In general, the parameters can change over time, reflecting possible changes in technology, world prices, tastes, or other factors that can affect location choice. Since the coefficients are allowed to vary freely with time, the system should be estimated period by period rather than pooled as a panel.14

However, equations (2.14) and (2.15) represent, in essence, seemingly unrelated regressions. The error terms for industry and population in the same region and the same period are likely to be correlated: a positive shock to a region’s population is likely to coincide with a similar shock to manufacturing employment. Thus, equations (2.14) and (2.15) have to be estimated jointly by the Generalized Least Squares method.15

13 Missing industry series for 1921 and 1931 were substituted by the 1911 series.
14 A serious shortage of degrees of freedom might arise when estimating 15 parameters on a sample of 36 or 37 observations for each year. The number of degrees of freedom in each individual regression is 22 at best. Thus, the standard error estimates might not be reliable. It is desirable to impose restrictions on the model: drop some variables from the regression to reduce the number of parameters and free up several degrees of freedom.
15 As long as the set of explanatory variables is the same for the industry and population regressions, GLS estimation is equivalent to simple OLS estimation of separate year-by-year regressions. The equivalence of GLS and OLS no longer holds if (non-matching) restrictions are imposed on the parameters in these equations.
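Footnote 15 implies that, in the unrestricted case, each time period reduces to two ordinary least squares fits on the same regressor matrix. A minimal sketch of one such period in Python with NumPy is given below. The function name `fit_period`, the synthetic data, and the coefficient values are assumptions made for illustration and are not taken from the thesis dataset.

```python
import numpy as np

def fit_period(X, y_pop, y_ind):
    """OLS for one time period; both equations share the regressor matrix X."""
    Y = np.column_stack([y_pop, y_ind])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef[:, 0], coef[:, 1]  # population and industry coefficient vectors

# Synthetic, noise-free example with 37 regions (illustrative values only).
rng = np.random.default_rng(0)
n = 37
# Columns: constant, lagged ln POP, lagged ln IND, one location characteristic.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
beta_true = np.array([0.5, 0.9, 0.1, -0.2])
alpha_true = np.array([-1.0, 0.3, 0.8, 0.4])
beta_hat, alpha_hat = fit_period(X, X @ beta_true, X @ alpha_true)
```

Once different variables are dropped from the two equations, the regressor matrices no longer coincide and this shortcut does not apply, as the footnote points out.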

2.5.2 Estimation issues

The unobservable location-specific shocks

What if the set of explanatory variables $\{\mathrm{POP}_{t-1}, \mathrm{IND}_{t-1}, x_{t-1}\}$ is not exhaustive? What if there exist other (unobservable) factors that affect location?

Formally, if the true relationship between population at time $t$ and the explanatory variables includes an unobservable location-specific shock $\eta^{u}_{jt}$, equation (2.14) becomes

\[
\ln \mathrm{POP}_{j,t} = \beta^{0}_t + \beta^{\mathrm{pop}}_t \ln \mathrm{POP}_{j,t-1} + \beta^{\mathrm{ind}}_t \ln \mathrm{IND}_{j,t-1} + \sum_{k=1}^{n} \beta^{k}_t x^{k}_{j,t-1} + \eta^{u}_{jt} + \xi^{u}_{jt}, \tag{2.16}
\]

where $\eta^{u}_{jt}$ may be correlated with other explanatory variables. In particular, if the $\eta^{u}_{jt}$ terms are persistent over time, then they are inevitably correlated with past population levels $\ln \mathrm{POP}_{j,t-1}$. If the correctly specified relationship is given by equation (2.16), but equation (2.14) is estimated by OLS instead, the estimated coefficients carry the omitted variable bias:

\[
E[\hat{\beta}_t] = \beta_t + \gamma_t, \tag{2.17}
\]

where $\beta_t$ is the vector of coefficients and $\gamma_t$ is the vector of coefficients in a (cross-sectional) regression of $\eta_t$ on the explanatory variables:

\[
\eta_t = \gamma^{0}_t + \gamma^{\mathrm{pop}}_t \mathrm{POP}_{t-1} + \gamma^{\mathrm{ind}}_t \mathrm{IND}_{t-1} + \sum_{k=1}^{n} \gamma^{k}_t x^{k}_{j,t-1} + e_t. \tag{2.18}
\]

The presence of location-specific unobservables is common for cross-country or regional panel models.16 Normally, when the goal is to estimate the structural parameters in a panel data setting, the unobservables are integrated into location-specific (fixed or random) effects. In our case, however, the goal is different. We need to find a forecast model that would be relevant for counterfactual Russia. Obviously, the location-specific effects estimated on Canadian data cannot be transferred onto Russia. Moreover, if a forecast is to be unbiased, equation (2.14) estimated by OLS is the one to be used. Formally, the conditional expectation of population tomorrow as a function of the explanatory variables is

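\[
\begin{aligned}
E[\mathrm{POP}_t \mid \mathrm{POP}_{t-1}, \mathrm{IND}_{t-1}, x_{t-1}] ={}& \beta^{0}_t + \beta^{\mathrm{pop}}_t \mathrm{POP}_{t-1} + \beta^{\mathrm{ind}}_t \mathrm{IND}_{t-1} \\
&+ \sum_{k=1}^{n} \beta^{k}_t x^{k}_{j,t-1} + E[\eta^{u}_t \mid \mathrm{POP}_{t-1}, \mathrm{IND}_{t-1}, x_{t-1}],
\end{aligned} \tag{2.19}
\]

where

\[
E[\eta^{u}_t \mid \mathrm{POP}_{t-1}, \mathrm{IND}_{t-1}, x_{t-1}] = \gamma^{0}_t + \gamma^{\mathrm{pop}}_t \mathrm{POP}_{t-1} + \gamma^{\mathrm{ind}}_t \mathrm{IND}_{t-1} + \sum_{k=1}^{n} \gamma^{k}_t x^{k}_{j,t-1}. \tag{2.20}
\]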

16See Islam (1995) for the discussion of the common estimation issues in panel models of growth.

Substituting (2.20) into (2.19), we get

\[
\begin{aligned}
E[\mathrm{POP}_t \mid \mathrm{POP}_{t-1}, \mathrm{IND}_{t-1}, x_{t-1}] ={}& (\beta^{0}_t + \gamma^{0}_t) + (\beta^{\mathrm{pop}}_t + \gamma^{\mathrm{pop}}_t)\,\mathrm{POP}_{t-1} \\
&+ (\beta^{\mathrm{ind}}_t + \gamma^{\mathrm{ind}}_t)\,\mathrm{IND}_{t-1} + \sum_{k=1}^{n} (\beta^{k}_t + \gamma^{k}_t)\, x^{k}_{j,t-1}.
\end{aligned} \tag{2.21}
\]

Under the assumption that equation (2.18) correctly specifies the relationship between the unobservable location characteristics and the observable explanatory variables, the parameters of the conditional expectation function given by equation (2.21) are exactly what the OLS estimation of (2.14) gives. Under the key premise that (pre-communist and later counterfactual) Russia and Canada share all features of spatial dynamics – i.e. regional characteristics, observable or unobservable, have the same impact on location choices – this relationship gives an unbiased forecast for Russia.
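The statement that OLS recovers the sum of the structural and auxiliary coefficients, as in (2.17) and (2.21), can be illustrated with a small simulation. This is a sketch with a single regressor in Python with NumPy. The values of `beta` and `gamma`, the noise scales, and the sample size are hypothetical and serve only to make the effect visible.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000                 # large sample so the estimate sits close to its expectation
beta, gamma = 0.8, 0.3      # hypothetical structural and auxiliary coefficients

x = rng.normal(size=n)                               # observed regressor
eta = gamma * x + rng.normal(scale=0.5, size=n)      # unobserved shock, linear in x as in (2.18)
y = beta * x + eta + rng.normal(scale=0.5, size=n)   # outcome, as in (2.16)

# OLS of y on a constant and x, with eta left out of the regression.
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
slope = coef[1]             # close to beta + gamma rather than beta
```

The fitted slope is a biased estimate of the structural coefficient, but it is the right coefficient for predicting `y` from `x` alone, which is the property the forecasting argument relies on.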

Choosing the best model

Different restrictions imposed on the model coefficients in equations (2.14) and (2.15) can potentially lead to different results when projecting the Russian counterfactual. In essence, we are faced with the problem of choosing the best forecast model.

There are two objectives in the model choice problem. First, an essential feature of a good model is the absence of systematic spatial bias. If one group of regions is systematically over- or underestimated in the Canadian model, the projected Russian allocation will most likely be biased as well. Second, we would like to minimize the unsystematic error, that is, to produce the highest quality forecast.

A model with too few explanatory variables would not fit the Canadian dynamics closely enough and, consequently, would give poor forecasts. A model with too many variables would give a better fit for every time period, but it may not give a better fit for the 1911-1991 period as a whole. A good model should explain the long-term and general trends well and ignore transitory and localized shocks. With too many variables included there is a danger of overfitting the individual time periods. Overfitting not only makes estimators inefficient, it also records all the local and transitory variations as if they were permanent and part of the long-term trend.

Intuition suggests that the best forecast will result neither from the most parsimonious nor from the most exhaustive model, but from an “intermediate” case in which some variables are dropped from the regressions. A careful choice of variables for the earlier years is more crucial, as a projection error introduced early in the process might grow in magnitude with each step (time period) as the forecast equations are applied recursively. 
Since it is unclear whether the errors from different time periods would accumulate or cancel each other out, we cannot simply apply one of the widely used forecast model selection criteria (adjusted R-squared, Akaike, or Schwarz) to each time period individually: this would not guarantee the best forecast for the final year.

Instead, the fit of any model may be evaluated in the following way. First, estimate the model. Second, apply the same procedure as planned for the Russian projections,

only to Canada itself, working from the 1911 data on. Then compare the projected values with the actual Canadian data for 1991 using some criterion. This exercise shows how well the particular model fits the Canadian spatial dynamics overall.17

First and foremost, the Canadian projection errors should be examined (subjectively, at least) for any visible spatial bias. If the differences between projected and actual population values do not appear spatially random, the model is likely biased and should not be used for the Russian projections.

Second, the in-sample (Canadian) forecast error must be evaluated numerically. Several criteria can be proposed to compare different models. I use the sum of relative (weighted by the actual population) squared differences between the actual and projected populations in 1991:

\[
\sum_{l} \frac{\left( \mathrm{POP}^{\mathrm{Actual}}_{l,1991} - \mathrm{POP}^{\mathrm{Proj}}_{l,1991} \right)^{2}}{\mathrm{POP}^{\mathrm{Actual}}_{l,1991}}. \tag{2.22}
\]
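As a concrete reading of criterion (2.22), the following Python sketch computes the weighted sum of squared errors for a toy set of regions. The function name `weighted_sse` and the population figures are made up for illustration.

```python
import numpy as np

def weighted_sse(actual, projected):
    """Sum over regions of (actual - projected)^2 / actual, the form of (2.22)."""
    actual = np.asarray(actual, dtype=float)
    projected = np.asarray(projected, dtype=float)
    return float(np.sum((actual - projected) ** 2 / actual))

# Made-up populations (in thousands) for three regions.
actual = [1000.0, 50.0, 200.0]
projected = [900.0, 60.0, 200.0]
score = weighted_sse(actual, projected)  # 100^2/1000 + 10^2/50 + 0 = 12
```

In this toy case the error in the small region contributes 2 out of 12, whereas in an unweighted sum of squares it would contribute only 100 out of 10,100. Dividing by the actual population therefore keeps sparsely populated regions from being swamped by large ones.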

Of all possible models (i.e. subsets of explanatory variables) I choose the one that minimizes the criterion (2.22). My motivation was to choose a criterion that is focused on the terminal period distribution and does not overlook the “small” low-populated regions – the Canadian North – as this work is focused mostly on the implications for its Russian counterpart, Siberia and the Far East. Thus, weighting the errors by the region’s population is necessary to guarantee equal treatment of all regions. Only final year errors are taken into account.18

Intuitively, the criterion (2.22) looks for the model with negative intertemporal correlation in the error terms. If the errors for the same region in different time periods are negatively correlated, then in the projection process, when equations are applied sequentially, errors in different time periods tend to cancel each other out rather than add up over time. Hence, the final year error is smaller. In our case, the errors for the same region and different time periods turn out to be positively correlated (at the level of 0.5 to 0.6 for different time period pairs in the unrestricted model).19 Thus, the procedure is in fact looking for the model with the lowest positive intertemporal correlation in the residuals.

Ideally, the search for the best model would involve estimating the projected Canadian population for all possible models, and then selecting the model that minimizes (2.22). However, with 14 explanatory variables for each period and for each dependent variable, the number of possible combinations is astronomical – $2^{14 \cdot (9+7)} = 2^{224}$

17 Essentially, I am choosing the model that gives the best in-sample forecast. It might not be true in general that the best in-sample (Canadian) model would also give the best out-of-sample (Russian) projections. But this is the nature of a counterfactual exercise.
18 Alternatively, other criteria could have been used instead of equation (2.22). For example, other weighting schemes (or none) could have been used, or the forecast error could be evaluated along the whole time path, not only in the final period. Section 2.5.7 further discusses other criteria for model selection, compares the performance of different models, and checks the robustness of the main results.
19 This is due, of course, to the presence of the persistent unobservable location characteristics discussed above.

possible models. Estimating such a number of panel models is infeasible. To find the best model, instead of a direct search I use an algorithm that searches through models, eliminating variables one by one. The algorithm is described in Appendix B.
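For orientation only, a generic greedy backward elimination can be sketched in Python as follows. This is not the algorithm of Appendix B, whose details are not reproduced here. The function `backward_eliminate`, the toy criterion `toy_score`, and the set `useful` are all hypothetical. In the actual application the score would be the criterion (2.22) evaluated on the Canadian in-sample projection.

```python
from typing import Callable, FrozenSet, Iterable

def backward_eliminate(variables: Iterable[str],
                       score: Callable[[FrozenSet[str]], float]) -> FrozenSet[str]:
    """Greedy search: repeatedly drop the variable whose removal lowers `score` most."""
    current = frozenset(variables)
    best = score(current)
    while current:
        # Score every model obtained by removing exactly one variable.
        candidates = [(score(current - {v}), v) for v in sorted(current)]
        cand_score, cand_var = min(candidates)
        if cand_score >= best:  # no single removal improves the criterion
            break
        current, best = current - {cand_var}, cand_score
    return current

# Toy criterion (hypothetical): penalise missing "useful" variables and any extras.
useful = {"POP_LAG", "AREA", "URBAN"}

def toy_score(subset):
    return 10 * len(useful - subset) + len(subset - useful)

chosen = backward_eliminate(["POP_LAG", "AREA", "URBAN", "OIL", "PORT"], toy_score)
```

A greedy search of this kind evaluates on the order of the square of the number of variables rather than all subsets, at the cost of possibly stopping at a local rather than a global minimum of the criterion.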

Results

Tables C.1 and C.2 (Appendix C) present the estimation results for the best model (by the weighted SSE criterion (2.22)).

In the population equation, the lagged population coefficients are significantly positive and stable over time; they have values slightly less than one, except for two periods: 1921 and 1991.20 The AREA coefficient is mostly positive as well. This suggests that the spatial population distribution is, ceteris paribus, a diffusing process: people tend to spread over the territory over time rather than concentrate in large agglomeration points. This pattern is quite plausible for countries going through territorial expansion or, as in the cases of Russia and Canada, settling sparsely populated territories.

Lagged industrial employment is also positive and significant for the middle part of the century. The coefficient in 1921 is, in contrast, negative, while the coefficients in 1931, 1941, and 1991 are close to zero. The zero coefficient in 1991 may be explained by the poor quality of the 1981 employment data. The 1931 and 1941 equations use 1911 data for past industry instead of the missing 1921 and 1931 data, so the variation in coefficients is expected. As an alternative explanation, the 1911 to 1941 period also covers two major historical events: the massive migration to the Western provinces and the Great at the beginning of the century, and the Great Depression of the 1930s. Thus, the near-zero or negative coefficients for lagged industrial employment and temperature might reflect population movement away from industrial centers and toward regions with colder climates.21

The natural resources variables might have either a positive or a negative influence on population growth. The presence of natural resources may draw people and industry into the region by offering cheaper production inputs. Alternatively, monoindustrial resource-oriented regions may grow more slowly, ceteris paribus. 
The initial population boom after the discovery of a resource might be followed by a period of relatively slow population growth, giving a negative sign to the lagged resource variable. As the results show, the presence of natural resources seems to be of relatively

20 The near-unity coefficient of past population is expected. In the presence of the unobserved time-persistent regional idiosyncrasies evident in my case, the coefficient of past population tends to be biased upward from its true structural value and (in the case of a diffusing process) toward unity. The coefficient of past population reflects not only dependence on past population per se, but also on all the unobserved factors that determined population in the past. The unobservables are “built into” the past population. When interpreting the coefficients, one should bear in mind that the estimated coefficients are not true structural parameter values and thus do not reflect true causality.
21 In addition to these considerations, it is not always possible to correctly compare the coefficient values between time periods, simply because different sets of variables are included in the equations for different time periods.

little influence on the population distribution. Timber in 1921 and Metals in 1941 have positive coefficients; these are the only exceptions.

The variables that proxy for trade possibilities, communications, and the size of the reachable market (number of railroads, route abroad, and ports) – presumably advantageous characteristics – do not appear to have a significant impact on the population distribution dynamics either. Most likely, the positive influence of these factors is already built into the past population levels and does not warrant higher than average ceteris paribus population growth during the XXth century. The urbanization rate in 1911 has a consistently positive coefficient, suggesting that areas that were settled and urbanized prior to the beginning of the century continued to attract people at a higher than average rate. Interestingly, in many time periods temperature itself is not a significant explanatory variable: the movement of population toward warmer areas is explained well enough by other factors.

The results for the industry regression follow roughly the same pattern, except for a few key distinctions. First, the roles of the lagged population and industry variables are reversed: lagged industry is now the more important factor. Second, infrastructure and/or access to markets seems to be slightly more important for industry, as the number of railroads has a positive and significant coefficient in two time periods. Third, area does not always have a strongly positive coefficient, and the lagged industry coefficients for different years are both below and above 1. Thus, it is not clear whether industrial employment follows the same diffusion-type spatial dynamics as population. 
The natural resources variables have predominantly negative or near-zero coefficients (the coefficients for metal mining operations are significantly negative), suggesting that primary industries in a region are not a magnet for other industrial production, and may even crowd it out.

In general, the results suggest that by far the most important factor determining the spatial distribution of population (or industry) today is the previous population (or industry) distribution. The spatial patterns of both population and industry look very stable.22 Among the less important factors, infrastructure, communication, transportation routes, and access to markets tend either to promote faster population and industry growth or to have an insignificant dynamic impact. Natural resources do not make a region significantly more attractive for either population or manufacturing industries.

Projecting the estimated model onto Canadian data from 1911 on produces Figure 2.5 (and Table C.3). As the map shows, the positive and negative errors are distributed fairly evenly across the territory. There is no immediately visible bias against or for any particular part of the country.

The model does worse at predicting the population in large metropolitan areas. Several regions with major cities (Vancouver, Toronto, Montreal, Edmonton) grew fast during the century and have strongly positive unexplained errors: actual population is substantially higher than projected. The other large cities (Winnipeg,

22Most likely, because attractive features of locations manifested themselves through history: best locations are the ones most densely settled.

Numbers show the absolute difference between projected and actual population, in thousands.

Figure 2.5: Projected vs. actual population. Canada.

Halifax, Ottawa, etc.) have either negative or near-zero errors. On average, however, the population of the biggest metropolitan areas is underestimated.

2.5.3 Projecting Canadian behavior onto the Russian data

I now use the estimated model to produce the counterfactual. Using the data on the initial conditions and the exogenous characteristics for Russian regions, we apply equations (2.14) and (2.15) to Russian population (POP) and industry (IND) data in 1910, and all exogenous information, to get the projected regional values of population and industry employment in the next year in the panel. Formally,

\[
\ln \mathrm{POP}^{\mathrm{RusProj}}_{1921} = \beta^{\mathrm{Rus}}_{1921} + \hat{\beta}^{\mathrm{pop}}_{1921} \ln \mathrm{POP}_{1911} + \hat{\beta}^{\mathrm{ind}}_{1921} \ln \mathrm{IND}_{1911} + \sum_{k} \hat{\beta}^{k}_{1921} x^{k}_{1911}, \tag{2.23}
\]

and

\[
\ln \mathrm{IND}^{\mathrm{RusProj}}_{1941} = \alpha^{\mathrm{Rus}}_{1941} + \hat{\alpha}^{\mathrm{pop}}_{1941} \ln \mathrm{POP}_{1931} + \hat{\alpha}^{\mathrm{ind}}_{1941} \ln \mathrm{IND}_{1911} + \sum_{k} \hat{\alpha}^{k}_{1941} x^{k}_{1911}. \tag{2.24}
\]
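A single projection step of the form (2.23) can be sketched in Python as follows. The sketch assumes that the intercept is not taken from the Canadian estimates but is chosen so that the projected regional populations sum to a given national total. The function name `project_log_pop`, the three regions, and all coefficient values are hypothetical.

```python
import numpy as np

def project_log_pop(ln_pop_prev, ln_ind_prev, x, b_pop, b_ind, b_x, total):
    """One projection step in logs, with the intercept set so that levels sum to `total`."""
    # Everything on the right-hand side except the constant term.
    core = b_pop * ln_pop_prev + b_ind * ln_ind_prev + x @ b_x
    # Constant that makes the projected levels add up to the national total.
    const = np.log(total) - np.log(np.sum(np.exp(core)))
    return const + core

# Three hypothetical regions with two characteristics each (illustrative numbers only).
ln_pop = np.log(np.array([500.0, 200.0, 50.0]))
ln_ind = np.log(np.array([40.0, 10.0, 1.0]))
x = np.array([[-10.0, 1.0], [-15.0, 0.0], [-25.0, 0.0]])
ln_proj = project_log_pop(ln_pop, ln_ind, x, 0.95, 0.02, np.array([0.01, 0.05]), total=900.0)
proj = np.exp(ln_proj)
```

Applying the function repeatedly, with each period's output fed in as the next period's lagged population, gives a recursive projection. Because the constant shifts all log-populations equally, it changes the total but not the relative weights of the regions.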

Note that the constant terms $\beta^{\mathrm{Rus}}$ and $\alpha^{\mathrm{Rus}}$ are not equal to the intercepts in equations (2.14) and (2.15), respectively, estimated on the Canadian data. Since the multinomial logit model describes the relationship between the relative shares of different regions in the total population of a country, and the shares have to sum to 1, one degree of freedom is lost to that additional condition. In the same way, when projecting the model onto Russia, to properly calculate the relative shares of all the provinces we need to account for the loss of one degree of freedom. Therefore, the values of $\beta^{\mathrm{Rus}}_t$ and $\alpha^{\mathrm{Rus}}_t$ can be obtained from the condition $\sum_{l=1}^{79} \mathrm{POP}^{\mathrm{RusProj}}_{l,t} = \mathrm{POP}^{\mathrm{RusActual}}_t$, where $\mathrm{POP}^{\mathrm{RusActual}}_t$ is the actual total population of the Soviet Union in year $t$.23

In the next step I use the projected 1921 values of population in Russian regions and the 1910 industry data as inputs for the 1931 population equation. The results are the population estimates for 1931. Then population in 1941 is projected in a similar way. The projections for industry start with the year 1941 as a function of the 1931 projected population and the 1910 actual industry data. Then the forecast equation for the year 1951 takes both the industry and the population projected values for 1941 as arguments. The procedure is repeated until the year 1991.

The 1991 results present an alternative spatial allocation for Russia that would have occurred if its development had followed the Canadian path. This is the counterfactual allocation sought: it is free of all the shocks and disadvantages specific to Russian history.

The counterfactual regional population levels are reported in Table C.4, Appendix C. The difference between actual and counterfactual population levels is plotted on the map in Figure 2.6.

The east-west divide is evident on the map. As a rule, the projected values of population in the western provinces are higher than the actual values, and in the eastern part they are lower than the actual values. 
The degree of spatial autocorrelation is astonishing: all of the underdeveloped regions are located in the western part of the country. In the European part of the Former Soviet Union, the only four observations with a predicted population distinctly lower than actual are Moscow, Estlyandskaya province (now the independent state of Estonia), Ekaterinoslavskaya province (home of the Donbass coal mining region), and Tavricheskaya province (the Crimean peninsula).

In the European part of the country, generally, the provinces around larger cities (St. Petersburg, and also the capitals of the Union Republics: Kiev, Minsk) experience less of a population deficit in per capita terms. The population of Moscow is underpredicted. The fast growth of Moscow in the XXth century is due to its status as the capital of the Soviet Union. At the turn of the century Moscow was only the second largest city in the Russian Empire.

The fact that Central Asian regions show a lot of excess population can be explained by cultural reasons: the fertility rates in Central and parts of are historically higher than in the rest of the country. I attempt to correct for these differences later.

23 It is not necessary to obtain the value of the constant term for each year. Since the relationship between past and present population (or industry) is linear in logs, a constant added to all the observations does not change the relative “weight” of different regions. Only the terminal-year constant term is of interest.

The eastern regions, Siberia and especially the Far East, are noticeably overpopulated. Even though the predicted number for the Siberian population is very high – about 19 million people (compare with 80,000 in the Canadian Yukon and North-West Territories) – it still leaves an astonishing 14.6 million of excess population east of the Urals. Moreover, the situation in the neighboring regions of the Urals is no better: they are overpopulated as well.

Numbers show the absolute difference between projected and actual population, in thousands.

Figure 2.6: Projected vs. actual population. Russia.

Of course, the mere fact that the predicted population for individual regions appears to be above or below the actual level does not necessarily imply the general inefficiency of the allocation. The forecast model is not 100% precise. Values for most of the regional characteristics (temperature, distance, railroads, etc.) used in the estimations and projections are taken at the point of the regional (provincial) population center, but population is spread over the territory. Population levels for the individual regions are very sensitive to the location of the borders. If a large city is located near a

regional border, a small change in administrative division might lead to big changes in population numbers. Thus even a spatial allocation reasonably close to efficient might have individual regions seemingly over- or underpopulated. We should expect, however, that the differences between actual and predicted values for neighboring regions approximately cancel each other out, and that the errors would appear spatially random rather than systematic. Neither of these is true for Russia.

Evaluating the forecast error

Is the excess population of 14.6 million in Siberia and the Far East statistically different from zero? Put differently, what is the probability that the existing population of 34 million in the eastern part of the country could have been generated by the (estimated) market-based model? Evaluating the forecast error in this case is not a trivial task. It is theoretically possible (although extremely impractical) to derive the statistical distribution of the projected population in each of the 79 Russian provinces. However, to determine whether the excess population of Siberia as a whole is above the boundary of statistical significance, we need to consider the joint distribution of the predicted population in all nine of the Eastern provinces. The values of the predicted population for the different provinces are not independent, and the joint distribution function is prohibitively complex.

Instead, I conduct a Monte-Carlo experiment. Using a Normal random number generator, I draw 1000 sets of random model coefficients (α's and β's), according to their estimated means and variance-covariance matrix. For each set of coefficients, I conduct the projection exercise on Russian data and record the total projected population of the nine Siberian and Far Eastern regions. I then examine the sample of 1000 projections and record the distribution quantiles. The procedure is asymptotically equivalent to evaluating the forecast error analytically.

In 950 cases (or 95%), the projected population of Siberia and the Far East is below 24,465,000, in contrast to the existing population of more than 34 million people. That makes the 95% lower bound on excess population about 9.8 million. It is highly implausible that the estimated 14.6 million of excess population in my counterfactual is a random error. The maximum projected Siberian population that occurred in the sample of 1000 random models is about 30 million, vs. 34 million actual. Since none of the 1000 simulated models reproduces the actual population, the simulated probability that actual spatial population dynamics in Russia followed the Canadian model is less than 0.001, and is likely far smaller. Statistically, the difference between predicted and actual population values is significant at a very high level of confidence (see Table 2.1).
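The Monte-Carlo procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the thesis: the two-coefficient model, the coefficient means, the covariance matrix and the function `project_east_population` are hypothetical placeholders standing in for the full set of estimated α's and β's and the region-by-region projection. The actual eastern population of 34,248 thousand is the sum of the two columns of Table 2.1.

```python
import math
import random

def cholesky(a):
    """Lower-triangular factor L with L * L^T = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def draw_coefficients(mean, chol, rng):
    """One draw from N(mean, cov), where chol is the Cholesky factor of cov."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]

def project_east_population(coef):
    """Hypothetical stand-in for the full projection exercise:
    maps one coefficient draw to total projected eastern population ('000s)."""
    a, b = coef
    return math.exp(a + b * math.log(15000.0))

def quantile(sorted_xs, q):
    """Empirical quantile by nearest rank on an already sorted sample."""
    idx = min(len(sorted_xs) - 1, max(0, math.ceil(q * len(sorted_xs)) - 1))
    return sorted_xs[idx]

# Hypothetical coefficient means and variance-covariance matrix (assumptions).
mean = [0.27, 1.0]
cov = [[0.04, -0.003], [-0.003, 0.0004]]
actual = 34248.0  # '000s, from Table 2.1 (19,672 + 14,576)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
chol = cholesky(cov)
draws = sorted(project_east_population(draw_coefficients(mean, chol, rng))
               for _ in range(1000))

q95 = quantile(draws, 0.95)                        # 95% quantile of projections
lower_bound_excess = actual - q95                  # 95% lower bound on the excess
share_above_actual = sum(d >= actual for d in draws) / len(draws)
```

Drawing the coefficients jointly matters here: because the draws respect the estimated covariance, the simulated total for the nine eastern regions reflects the dependence between regional projections that makes the analytical joint distribution intractable.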

% level    Projected Siberian population, '000s    Difference between actual and projected, '000s
50%        19,672                                  14,576
95%        24,465                                  9,783
97.5%      25,451                                  8,797

Table 2.1: The results of the Monte-Carlo simulations for the projected Siberian population: select distribution quantiles.

2.5.4 Accounting for WWII

It is important to examine to what extent the Soviet-produced inefficiency is due to policy decisions and to what extent it can be explained by exogenous factors: the circumstances beyond the control of Soviet authorities. The single most important historical event that had an impact on the spatial pattern of Russia's economy is WWII. WWII disproportionately affected the western regions of the country. The regions of the European part of the USSR suffered the destruction of infrastructure and the loss of many lives. In addition, during the war, a substantial number of strategically important enterprises, together with essential personnel, were evacuated to safer places – mostly to the Urals, Siberia and .24

If detailed information on the population losses, infrastructure destruction and evacuation efforts were available, it would be possible to account for the consequences of the war directly. Unfortunately, lack of relevant data is a major obstacle. Detailed information on the evacuation efforts by the Soviet Government has not been published openly. The industry employment data at a low level of geographical aggregation were not published even for peaceful times. The first post-war census of population in the Soviet Union took place in 1959. There is no way to obtain - level data either on population loss, or on loss of industrial capacity due to the war.

Thus, instead of tracking the actual impact of the war, I try to estimate an upper bound on its long-run consequences. According to the scattered evidence from various publications, as a whole lost about 20% of its population, and Belorussia lost about 25%.25 The percentage loss of productive capabilities during WWII was not publicized in the Soviet Union, but it can be inferred from the published figures on gross industrial production relative to 1913 that actual production fell by about 75% in the worst cases.26 The Center and

24“From July to November 1941, the equipment and machinery for more than 1,500 industrial enterprises (including 1,360 defense enterprises) were shipped eastward in 1.5 million train-car loads. To build and then stuff the Soviet defense plants, 10 million people – plant workers and their families – were relocated to the East.” – Gaddy (1996), p. 133.
25The following data were reported in statistical publications during the Soviet period. total population in 1939 was 8.9 million, losses of Belarus population during the war were more than 2.2 million people. Source: “Belorusskaya SSR za 20 let (1944-1963)” (“Belorussia during 20 years” – a statistical publication), Central Statistical Unit with the Government of Belorussian USR, “Belarus,” Minsk 1964. Loss of only civilian population in Ukraine is reported as 16% of total. Source: “Ukraina za 50 rokiv” (“Ukraine during 50 years”), Central Statistical Unit with the Government of Ukrainian USR, Kiev 1967.
26In Leningrad region, the reported production levels for 1940 and 1945 were, respectively, 8.9 and 2.3 times higher than in 1913. The loss of production capabilities due to the war, therefore, is about 74%. Similar figures are reported for Smolensk region. For and Estonia the losses are

South of the European part of the Russian Federation were occupied for a shorter period of time, and people (as well as enterprises) had more time to evacuate. As a result, the loss of lives and productive capabilities there was on average not as massive as in the westernmost regions of the Soviet Union.

To account for the war, I take the projected population and industry values for 1941 and, instead of using them directly in the 1951 projections, alter the values for the regions that were occupied during the war in the following way. I reduce the population levels of all affected regions by 25% and industry employment levels by 75%. That is, I assume that 75% of productive infrastructure was destroyed and 25% of population was lost due to the war.27 To be on the safe side, these figures are deliberately taken to be higher than actual losses. Then, I use the altered 1941 values for the 1951 projections. The equations (2.23) and (2.24) for the year 1951 become:

\[
\ln POP^{RusProj}_{1951} = \beta^{Rus}_{1951} + \hat{\beta}^{pop}_{1951} \ln\left(0.75^{WAR}\, POP^{RusProj}_{1941}\right) + \hat{\beta}^{ind}_{1951} \ln\left(0.25^{WAR}\, IND^{RusProj}_{1941}\right) + \sum_k \hat{\beta}^{k}_{1951} x^{k}_{1941}, \tag{2.25}
\]

and

\[
\ln IND^{RusProj}_{1951} = \alpha^{Rus}_{1951} + \hat{\alpha}^{pop}_{1951} \ln\left(0.75^{WAR}\, POP^{RusProj}_{1941}\right) + \hat{\alpha}^{ind}_{1951} \ln\left(0.25^{WAR}\, IND^{RusProj}_{1941}\right) + \sum_k \hat{\alpha}^{k}_{1951} x^{k}_{1941}, \tag{2.26}
\]

where WAR is a dummy indicator equal to one if the region was occupied during WWII. The rest of the process is unchanged.

The assumption implicitly embedded in this procedure is that any shock due to the war is permanent. Since the coefficients of the dynamic relationship for the years since 1941 were not changed, I am imposing the equilibrium path onto a Russian economy that has just been shaken by a major shock. The results of this procedure overdramatize the situation and overestimate the effect of war on a counterfactual market economy. Although the shock of war was substantial, it was not entirely a permanent shock.28 On the contrary, according to the work of Davis & Weinstein (2001), such drastic shocks to the population distribution are likely to be completely transitory: people tend to rebuild destroyed cities and population levels tend to rebound. Even in the absence of pure economic incentives, people tend to return home, even if home was destroyed. This effect is completely ignored in my WWII counterfactual. So, the hypothetical spatial allocation resulting from this procedure is likely skewed toward the eastern part of the country.29

The projected 1991 allocation is the counterfactual that would result in Russia under Canadian-style development, given the initial conditions, endowments, and accounting for WWII. Figure 2.7 presents the results of the projections.30 With the artificial shock included, the difference between actual and projected population in Siberia and the Far East does decrease substantially: down to 9.6 million. Thus, the long-term effect of the war generously allows for about 5 million more people in Siberia.31 However, the difference between actual and projected population remains statistically significant (at about the 99.5% level, according to Monte-Carlo simulations). Even with the (probably grossly exaggerated) war effect built in, the estimated excess population in Siberia and the Far East is too large to be generated by random error alone; it has to be the result of deliberate policy by the Soviet authorities.

near 50%. Unfortunately, these data are not given for all regions. Source: “Atlas SSSR,” Glavnoe upravlenie geodezii i kartografii, Moscow, 1962.
27It has to be noted that, due to the relative nature of the multinomial logit model, reducing the (population or industry) shares of one region automatically raises the shares of the unaffected regions. Thus, the composite effect of this artificial shock is even larger than the nominal 25% and 75%.
28But the assumed permanence of the artificial war shock is quite consistent with the actual Soviet policy: most of the evacuated defense enterprises were never moved back.
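The war adjustment in equations (2.25) and (2.26), and the share effect noted in footnote 27, can be illustrated with a small Python sketch. The four regions and their 1941 levels are hypothetical numbers chosen for illustration only; the 0.75 and 0.25 surviving shares are the ones stated in the text.

```python
import math

def war_adjusted_log(level, war, surviving_share):
    """Shocked input term ln(surviving_share**WAR * level), as in (2.25)-(2.26).
    `war` is the occupation dummy (1 if occupied, 0 otherwise)."""
    return math.log(surviving_share ** war * level)

def apply_war_shock(levels, occupied, surviving_share):
    """Scale the levels of occupied regions by the surviving share."""
    return [x * (surviving_share if occ else 1.0)
            for x, occ in zip(levels, occupied)]

def shares(levels):
    """Regional shares of the national total."""
    total = sum(levels)
    return [x / total for x in levels]

# Hypothetical 1941 projections for four regions (not thesis data).
pop_1941 = [400.0, 300.0, 200.0, 100.0]
ind_1941 = [40.0, 30.0, 20.0, 10.0]
occupied = [True, True, False, False]

pop_shocked = apply_war_shock(pop_1941, occupied, 0.75)  # 25% of population lost
ind_shocked = apply_war_shock(ind_1941, occupied, 0.25)  # 75% of industry lost

# Combined share of the two unoccupied regions before and after the shock.
east_share_before = sum(shares(pop_1941)[2:])
east_share_after = sum(shares(pop_shocked)[2:])
```

In this toy example the unoccupied regions keep their absolute levels, yet their combined population share rises once the occupied regions are scaled down. This is the sense in which the composite effect of the shock in a share-based model exceeds the nominal 25% and 75%.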

2.5.5 Correction for the exogenous cross-regional fertility differences

Fertility rates in the USSR varied from an average of 1.93 children per woman in Estonia to 5.03 in Turkmenistan. With perfect labor mobility, birth rate differences would not affect the spatial distribution of population in the long run: people would instantly re-allocate (migrate) according to economic incentives only. This is no longer true if mobility is imperfect. For example, if people are more likely to migrate to parts of the country with a similar culture and/or ethnic composition, differences in fertility rates can have a long-term effect on the spatial population structure.

29As another way to describe this issue, recall the discussion (in Section 2.5.1) on the unobservable factors affecting location choices. The estimated model provides an unbiased forecast of future population (industry) levels if current population levels, the forecast input, are endogenous to the unobservables. Here, however, we are trying to predict the result of an exogenous shock. In other words, we are treating the regions that suffered from war and lost population as if they were inherently undesirable (and therefore had less population). The forecast from 1941 on is therefore biased toward regions that were spared in the war, i.e. it exaggerates the war impact.
30The results are not particularly sensitive to the magnitude of the shock. I used a range of percentage decrease values, from 20% to 50% for population and from 60% to 100% for industry. Only at the level of 50% loss of population and 100% loss of industry did I get projected population for Eastern regions close to the actual level.
31On the other hand, according to the findings in Davis & Weinstein (2001), there might be no long-term effect of war whatsoever. However, given Russia's larger size, and possibly higher transportation costs, it is possible that in Russia the WWII shock had some long-term consequences. In Japan, people moved out of cities to escape the war; in Russia, people moved across the country. It would be more costly for Russians than for the Japanese to move back to the previous spatial allocation once the war was over. Probably neither the projection with zero shock nor the one with a fully permanent shock produces the perfect counterfactual allocation. The truth is somewhere in between.

Numbers show the absolute difference between projected and actual population, in thousands.

Figure 2.7: Projected vs. actual population. Russia. Accounted for the impact of WWII.

Obviously, ignoring these differences would skew the population projection results away from the Central Asian republics and toward the regions with lower natural population growth. Historically, Russia had one of the lowest birth rates in the former USSR. The Siberian regions had roughly the same low (age-standardized) birth rates as the European part of the Russian Federation. Thus, correcting for the fertility differences would on average decrease the estimated population deficit in the European part of the country and increase the estimated surplus of population in Siberia and the Far East.

The mechanism I propose to account for these differences is as follows: before plugging the past population levels into the projection equations, multiply them by population growth coefficients specific to a geographical location and time period. Essentially, the dynamic relationship is then applied not to the static (initial) spatial population distribution, but to the “natural” population growth path.

The same critique as with the war exercise applies here: I am introducing exogenous changes into the equilibrium dynamics. The growth of regions where birth rates are higher is going to be over-predicted. The regions with higher population levels due to high birth rates are going to be treated as if they were inherently attractive. Therefore, one should think of the results of this exercise as an upper bound on what could be the result of fertility (and mortality) differentials.

The procedure requires data on natural population growth rates by region (province). The data on births, deaths and net population growth in the USSR are generally available by oblast. However, the differences in the age composition of regional populations in the USSR are so profound that their effect on the differences in birth rates cannot be ignored. The age composition depends on birth rates: there are more young people in the regions with traditionally higher fertility levels. The birth rate, in turn, also depends upon the age composition: there are more children in the regions where there are more young people. Finally, labor migration and, critically for Russia, location policy affect the age composition of the regions' population directly.

It is not exactly clear what type of population growth data is better to use in this exercise: standardized or raw (non-standardized). Raw data on births carry the effect of location policy: the migrant population of the Siberian and Far Eastern regions is younger on average, which leads to higher per capita birth rates. The age-standardized births data do not account for the fact that regions with traditionally higher birth rates tend to be “younger,” and for this reason alone have higher per capita birth rates. Using raw data we would over-predict the growth of regions that accepted migrants during the Soviet period. Using age-standardized data we would under-predict the growth of regions with high birth rates. Either type of data has its shortcomings.

To the best of my knowledge, the standardized birth rates in the USSR were published only by Union Republics, starting in 1959.32 Thus, only the grand average was published for the whole Russian Federation, while the differences in birth rates among parts of Russia are sometimes profound. Unfortunately, there is no way to explore this issue given the data limitations.

For this section, I used annual non-standardized per capita natural population growth rates (births minus deaths) by republic, compounded over each 10-year projection period (the annual growth factor raised to the power 10). The population levels of all the provinces in the same Union Republic were adjusted using the same coefficient. This way, I am accounting for the birth rate differences between Union Republics, but ignoring the variance inside each Republic. I am filtering out the effect of Soviet policy manifested in intra-republican migration, but not the possible effect of Soviet policy on inter-Republic migration and its consequences for the inter-Republic age composition differences. The result likely underestimates population in the parts of the Russian Federation with birth rates traditionally higher than the Russian average: , and possibly other autonomous republics. The projection equations (2.23) and (2.24) now change

32The Goskomstat of Russian Federation started publishing the standardized birth rates by oblast only in 1993.

to

\[
\ln POP^{RusProj}_{1921} = \beta^{Rus}_{1921} + \hat{\beta}^{pop}_{1921} \ln\left((1+g)^{10}\, POP_{1911}\right) + \hat{\beta}^{ind}_{1921} \ln IND_{1911} + \sum_k \hat{\beta}^{k}_{1921} x^{k}_{1911}, \tag{2.27}
\]

and

\[
\ln IND^{RusProj}_{1921} = \alpha^{Rus}_{1921} + \hat{\alpha}^{pop}_{1921} \ln\left((1+g)^{10}\, POP_{1911}\right) + \hat{\alpha}^{ind}_{1921} \ln IND_{1911} + \sum_k \hat{\alpha}^{k}_{1921} x^{k}_{1911}, \tag{2.28}
\]

where g is the yearly natural population growth rate (in a given Union Republic: the number of births minus the number of deaths, divided by the population) during the corresponding time period.33

The projection results accounting for the birth rate differences are shown in Table C.4 and in Figures 2.8 (without WWII simulations) and 2.9 (with WWII simulations). The actual population of the Central Asian republics in 1991 was about 38 million in total. With the fertility correction, instead of a population surplus the Central Asian republics now have a population deficit of about 8 million without the war, and a substantial deficit (23 million) with the war. Central Asian regions were actually growing in population faster than the rest of the Soviet Union all through the 20th century, so predicting even higher growth for them in a counterfactual world seems rather implausible. As expected, the procedure most likely does overestimate growth in the high-fertility regions.

The estimated surplus of population in Siberia and the Far East is now 17.6 million not accounting for WWII, and about 14 million with the war adjustment, both highly significant statistically. However, because the procedure is strongly biased toward high-fertility regions, these estimates should rather be treated as upper bounds. The true fertility-adjusted surplus of people in the Russian East is likely to be somewhat lower, somewhere in between the adjusted and unadjusted estimates.
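The fertility correction in equations (2.27) and (2.28) amounts to compounding the annual natural growth rate over the 10-year period before taking logs. A minimal Python sketch follows; the growth rates of 1% and 3% and the population level are hypothetical values chosen for illustration, not Soviet data.

```python
import math

def growth_factor(g, years=10):
    """Compound natural growth factor (1 + g)**years over one projection period."""
    return (1.0 + g) ** years

def fertility_adjusted_log_pop(pop, g, years=10):
    """Adjusted input term ln((1 + g)**years * pop), as in (2.27)-(2.28)."""
    return math.log(growth_factor(g, years) * pop)

# Hypothetical annual natural growth rates for a low- and a high-fertility republic.
g_low, g_high = 0.01, 0.03
pop_1911 = 1000.0  # hypothetical past population level, '000s

factor_low = growth_factor(g_low)
factor_high = growth_factor(g_high)

# The adjustment is additive in logs: ln pop + years * ln(1 + g).
adjusted_low = fertility_adjusted_log_pop(pop_1911, g_low)
adjusted_high = fertility_adjusted_log_pop(pop_1911, g_high)
log_gap = adjusted_high - adjusted_low
```

Because the adjustment is additive in logs, a two-percentage-point gap in annual natural growth feeds a persistent gap into the population input of every later projection step, which is consistent with the text's warning that the procedure over-predicts growth in high-fertility regions.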

2.5.6 Temperature per capita dynamics

Several observations can be made about the dynamics of TPC in both Russia and Canada. The trajectories of the actual and projected TPC in Canada and in Russia are shown in Figures 2.10 and 2.11.

The plot of the Canadian TPC trajectory (Figure 2.10) shows that the forecast model is somewhat “colder” than reality: the projected TPC lies below the actual one by 0.4°C. Thus, the model probably over-predicts

33Soviet data on natural population increase by Republics are available only for 1940 and from 1960 on. Thus, I had to take 1940 growth rates for the 1921 to 1951 projections.

Numbers show the absolute difference between projected and actual population, in thousands.

Figure 2.8: Projected vs. actual population. Russia. Corrected for fertility differences.

population in colder regions.34 If this bias were corrected, the forecast for Russia would shift towards an even “warmer” allocation.

Generally, the trajectory of counterfactual Russian TPC mirrors the Canadian dynamics. In Canada, TPC dips around 1920-1940, then rises steadily until 1990. At the beginning of the century, after the construction of the Trans-Canadian Railroad, settlers rushed to Alberta and Saskatchewan, drawn by the abundance of fertile land. Aggregate temperature dropped slightly as colder areas were populated. Later, as agricultural technologies became less labor intensive and the share of agriculture in Canadian GDP fell, population shifted towards manufacturing centers, away from agricultural areas, and TPC went up. In counterfactual Russia, the model predicts the same phenomenon with respect to Western and Southern Siberia after the Trans-Siberian Railroad was completed.

34This is probably an artifact of the large negative error for several big cities. The population of Vancouver is severely underestimated; the Vancouver error alone is enough to account for a 0.15 degree actual-projected TPC difference.

Numbers show the absolute difference between projected and actual population, in thousands.

Figure 2.9: Projected vs. actual population. Russia. Accounted for the impact of WWII and corrected for fertility differences.

Climatically, South- is roughly equivalent to the Canadian Plains, and at the beginning of the XXth century it was also a major agricultural producer (most famously, of wheat). Like the Canadian Plains, these regions experienced an influx of migrants from the European part of Russia, where land was traditionally scarce. Migration to Siberia from 1897 to 1926 was quite significant, according to census data.35 During 1910-1940 the actual and counterfactual trajectories for Russia are practically the same. The divergence starts at the time when the counterfactual model predicts a TPC “reversal” à la Canada. Actual Russia never started to “warm up.” While in Canada (and in the counterfactual Russian model) productive resources after the 1930s concentrated in regions already established as manufacturing and service centers, Soviet Russia

35This time segment overlaps the period of Bolshevik rule that started after the October revolution of 1917. It is difficult to say with little data what part of migration happened before 1917, and what after. Probably, most of migration was voluntary and for economic reasons and would have occurred similarly in a counterfactual world. However it is impossible to say for sure given the atmosphere of political unrest and three wars during that period.

Figure 2.10: Projected and actual TPC dynamics in Canada.

continued to develop its frontier. Figure 2.11 is another illustration of how significant the difference between the actual and counterfactual allocations is. The shaded area represents the 95% confidence interval around the predicted TPC trajectory (without either fertility or war corrections). The permanence of the artificial WWII shock is also evident from the graph: the TPC trajectories “with” and “without” war do not converge; rather, the gap widens with time. But, however grossly exaggerated, the effect of WWII still does not explain the entire actual-predicted TPC gap. Actual TPC is still well below all the counterfactual estimates.
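Since TPC is a population-weighted aggregate temperature, the mechanism behind these trajectories can be shown with a short Python sketch. The two-region populations and January temperatures below are hypothetical and serve only to show the direction of the effect, assuming TPC is the population-weighted mean of regional temperatures.

```python
def tpc(populations, temperatures):
    """Temperature per capita: population-weighted mean of regional temperatures."""
    total = sum(populations)
    return sum(p * t for p, t in zip(populations, temperatures)) / total

# Hypothetical two-region country: a milder west and a colder east (degrees C).
temperatures = [-8.0, -20.0]

pop_before = [800.0, 200.0]  # '000s, before relocation
pop_after = [700.0, 300.0]   # 100 thousand people moved from west to east

tpc_before = tpc(pop_before, temperatures)
tpc_after = tpc(pop_after, temperatures)
tpc_change = tpc_after - tpc_before  # negative: the economy became "colder"
```

No regional temperature changes in this example; the index falls only because population weight shifts toward the colder region, which is the sense in which location policy can make an economy colder.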

2.5.7 Alternative criteria for model selection and robustness checks

As discussed in Section 2.5.2, the explanatory variables for the forecast model were selected to minimize the population-weighted sum of squared errors (equation (2.22)) of the in-sample (Canadian) forecast. Are the results robust to the model specification? Other criteria could have been used instead of (2.22). For example, the minimum of the absolute SSE in 1991:

\[
\sum_l \left( POP^{Actual}_{l,1991} - POP^{Proj}_{l,1991} \right)^2 ; \tag{2.29}
\]

or the minimum of the absolute value of the TPC difference:

\[
\left| TPC^{Actual} - TPC^{Proj} \right| ; \tag{2.30}
\]

or a combination of the above could be used.36
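The two alternative criteria, (2.29) and (2.30), can be sketched in Python as follows. The regional data and the two candidate projections labelled "A" and "B" are hypothetical; in the thesis the candidates are models with different sets of explanatory variables estimated on Canadian data.

```python
def absolute_sse(actual, projected):
    """Criterion (2.29): sum over regions of squared projection errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, projected))

def tpc(populations, temperatures):
    """Population-weighted mean temperature."""
    total = sum(populations)
    return sum(p * t for p, t in zip(populations, temperatures)) / total

def tpc_gap(actual, projected, temperatures):
    """Criterion (2.30): absolute difference between actual and projected TPC."""
    return abs(tpc(actual, temperatures) - tpc(projected, temperatures))

# Hypothetical terminal-year data for three regions (not thesis data).
actual_pop = [500.0, 300.0, 200.0]
temperatures = [-5.0, -12.0, -20.0]
candidates = {
    "A": [480.0, 330.0, 190.0],  # projection from hypothetical model A
    "B": [520.0, 250.0, 230.0],  # projection from hypothetical model B
}

# Pick the candidate that minimizes each criterion.
best_by_sse = min(candidates, key=lambda m: absolute_sse(actual_pop, candidates[m]))
best_by_tpc = min(candidates,
                  key=lambda m: tpc_gap(actual_pop, candidates[m], temperatures))
```

In this toy example both criteria happen to pick the same candidate, but they need not: the TPC criterion only penalizes errors that shift population between warmer and colder regions, while the SSE criterion penalizes every regional error.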

36Another option is to measure SSE, absolute or relative, not only in 1991, but along the whole time path, for example: $\sum_{l,t} w_t \left( POP^{Actual}_{l,t} - POP^{Proj}_{l,t} \right)^2$, where $w_t$ is the weight of year $t$. However,

Grey area shows 95% confidence interval around basic forecast trajectory.

Figure 2.11: Projected and actual TPC dynamics in Russia.

Incidentally, criteria (2.29) and (2.30) choose the same set of explanatory variables, and this set is notably different from the one picked by (2.22). Thus, the choice of criterion does matter for model selection. The results for counterfactual Russia generally differ (quantitatively) depending upon the model chosen. However, overpopulation of Siberia appears as a robust result irrespective of which specific model is used for estimations and projections.

I repeated the procedure of Sections 2.5.3-2.5.6 for two alternative models: the unrestricted model (all variables included) and the model that minimizes criteria (2.29) and (2.30). Figure 2.12 plots the time trajectories of temperature per capita for all models. The unrestricted model diverges from the true trajectory the most. A model picked on the basis of any of the criteria mentioned performs better.37 Criteria (2.29) and (2.30), naturally, produce the TPC trajectory closest to the actual one. The model chosen by criterion (2.22) performs somewhat worse in terms of TPC fit. Table 2.2 compares the results for the predicted Siberian population according to the different models.

The only instance where the counterfactual population of the Russian East is not statistically different from the actual one is when the forecast uses the unrestricted model together with the WWII correction. This is not entirely unexpected, however. First, as the Canadian projections suggest, the unrestricted model is (most strongly of all models) biased toward colder places (Figure 2.12). Second, the unrestricted model contains many insignificant variables, so its forecast error is larger. Third, the WWII correction probably overestimates the real WWII impact, as discussed earlier.38 All of the factors that tend to bias the P-value upward are present at the same time.

In all other cases, Siberia and the Far East appear significantly overpopulated no matter which model is used for the forecast. The most obvious alternative, a model that minimizes the absolute SSE (equation (2.29)), predicts even fewer people in eastern Russia: 17 million fewer than now. Compared to the whole spectrum of possible model specifications, the one I chose to use gives rather conservative predictions.

Figure 2.12: Projected and actual TPC dynamics in Canada. Alternative models compared.

in our particular case, the same models would have been selected.
37Models chosen in the “traditional” way, by eliminating insignificant variables using the Akaike or Schwarz criterion, produce a trajectory very close to that of the unrestricted model.

2.6 Conclusions

We show that the present allocation of population and industry in Russia, inherited from the Soviet system, is far different from the one that would have occurred in the absence of Soviet location policy. It is colder and further to the east. Namely, the Eastern part of the country is noticeably overpopulated compared to the counterfactual market allocation, while the Western part experiences a relative population deficit. The excess population in the Siberian and Far Eastern regions ranges from 9.6 to 17.6 million people according to various estimates.

The impact of WWII, however drastic in the case of Russia, explains the east-west imbalance only partly. Even according to the most liberal estimates, the excess population of Siberia and the Far East remains at about 9.6 million after the war adjustment, and is statistically significant.

38In addition, unrestricted model may also suffer from degrees of freedom deficit - 14 variables in a regression with 37 data points are too many. Therefore, the coefficient standard errors I used in Monte-Carlo simulations, and hence the simulations themselves, may be unreliable.

Excess population in Siberia and Far East, '000

Model                 No correction      Correction         Correction         Correction for
                                         for WWII           for fertility      WWII and fertility
Main result
Min. weighted SSE     14,577 (<0.001)    9,563 (0.005)      17,583 (<0.001)    14,012 (<0.001)
Alternative models
Min. absolute SSE     17,291 (<0.001)    12,471 (<0.001)    19,677 (<0.001)    16,172 (<0.001)
Unrestricted          13,030 (0.008)     6,260 (0.131)      15,914 (0.001)     10,882 (0.028)

P-values in parentheses.

Table 2.2: Excess population in Siberia and Far East, according to alternative forecast models.

With the transition to a market economy, as agents gain the ability to respond freely to market stimuli, there might have been hope that the spatial inefficiency would eventually correct itself: people would migrate to more favorable places. Indeed, starting from the early 1990s, internal migration data for post-communist Russia show a pattern opposite to that of Soviet times. People are leaving the North, the Far East and Eastern Siberia. The European part of Russia, the Urals, as well as some industrial centers in Western Siberia (e.g. Omsk, ) see a migration inflow.39

The rate of this migration, however, is slow.40 Heleniak's (1999) estimate of the absolute decrease in population of the Far North and Equivalent Territories (these correspond roughly to Northern, Western and Eastern Siberia and the Far East in “usual” regional terminology, but do not include Novosibirsk, Omsk, Kemerovo, and Kurgan oblasts) during the period from 1989 to 1998 is around 726,000 people41 (or about 3.3% of the population of these regions). While this out-migration flow may seem significant in absolute magnitude, it is nowhere near the rate that it would take to bring the population distribution closer to what appears optimal in the foreseeable future.42 On average, Eastern regions are overpopulated by a factor of 1.5. At the current rate, a return to the optimal distribution would take about 180 years. Thus, the costs of spatial inefficiency are likely to be an extra burden for the new Russian state for years to come.
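The text does not spell out how the figure of about 180 years is obtained. One back-of-the-envelope reading that reproduces it is sketched below in Python, under two assumptions that are mine rather than the thesis's: the 726,000 outflow is spread evenly over nine years (1989 to 1998), and the gap to be closed is the 14.576 million median excess from Table 2.1. The territorial definitions behind the two numbers do not coincide exactly, so this is an order-of-magnitude check only.

```python
# Figures quoted in the text.
excess_population = 14_576_000   # median excess east of the Urals, Table 2.1
outflow_1989_1998 = 726_000      # Heleniak (1999) estimate of population decrease

# Assumption: the outflow is spread evenly over nine years.
years_observed = 9
annual_outflow = outflow_1989_1998 / years_observed

# Years needed to eliminate the excess at the observed pace.
years_to_close_gap = excess_population / annual_outflow
```

Under these assumptions the result is a little over 180 years, in line with the figure in the text. Using the smaller war-adjusted excess of about 9.6 million instead would shorten the horizon proportionally, but it would still exceed a century.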

39Goskomstat, 1995. “Migratsia naselenia v Rossii.” (“Population Migration in Russia”)
40Some territories are exceptions. For example, Magadan oblast and Chukotka republic lost 30-40% of their population during the transition years. It should be noted that these are the most remote territories, with an unbearable climate. Still, even at this rate of out-migration it would take a minimum of 50-60 years to revert to the counterfactual population levels in these territories.
41Data from Heleniak (1999), Table 4.
42Heleniak also points out that some of the Western Siberian oblasts are popular destinations for the migrants from the Siberian North and Far East. So, the total migration outflow from Siberia as a whole is actually lower than the numbers given.

Chapter 3

The Cost of the Cold

According to the results of the analysis presented in chapter 2, the spatial allocation of people and industry in Russia is highly distorted. If not for many decades of communism, the spatial allocation of economic activity in Russia would have been drastically different. At present the Siberian and Far Eastern regions are severely overpopulated. The simulated counterfactual market allocation is warmer: the actual population-weighted aggregate January temperature index (TPC) is 1.5°C lower than the counterfactual one. The next question is evident: since the allocation is found to be inefficient, what are the consequences? Are the extra 1.5°C of cold indeed costly for the economy, and if so, what is the cost?

The chapter is organized as follows. Section 3.1 surveys the existing work on climate-related costs of economic activity in relation to Russia's circumstances. The following sections are devoted to estimating the costs of excess cold in Russia directly. To calculate the cold-related cost of spatial inefficiency, I first investigate the relationship between temperature and various regional characteristics. I estimate the temperature elasticities of energy consumption, health indicators, and productivity. Then, I use the estimated elasticities together with the measure of extra cold resulting from Soviet investment decisions (the 1.5°C TPC difference) to calculate the cost in terms of present-day Russian GDP. Section 3.2 deals with additional energy consumption, section 3.3 evaluates the loss of productivity, section 3.4 explores health issues with respect to climate, and section 3.5 concludes.

3.1 The Role of Climate

The role of climate in economic performance is a widely debated question, mainly due to the extensive literature on growth in tropical vs temperate countries. The well-known empirical regularity that tropical countries grow more slowly has attracted conflicting explanations. North (1990) and Acemoglu, Johnson & Robinson (2002) point to the role of institutions as the main factor in growth during the period after the industrial revolution. Others (see for example Bloom & Sachs (1998)) argue that factors directly connected to a country's climate and geographical endowment, such as

disease transmission possibilities, being landlocked, land quality, etc., are responsible for the poor economic performance.1 The lack of consensus stems from an obvious puzzle in the growth literature: even though the adverse effect of higher temperatures (or of other factors highly correlated with them) is easily observed, it is not immediately clear what exact mechanisms are at play. Generally, warmer climates provided a more comfortable environment for human civilization in its early periods; in fact, tropical areas used to be the only places where the survival of the human species was possible in prehistoric times. Intuitively, warmer climates should reduce the cost of production and the cost of living, and therefore be conducive to economic growth ceteris paribus, not the opposite. Other factors that directly influence productivity must be at play, and it is not yet clear whether they are really climate-related.

In contrast, the role of extremely cold climates is much less controversial. Beyond any doubt, cold climates increase production costs and the cost of living. Both residential housing and production facilities require extra energy for heating. People require extra clothing. Constructing buildings and roads requires more materials. Agriculture in colder areas of the globe is limited or simply impossible. Transportation is more difficult. Cold is detrimental to people’s health. With modern technology it is possible to partially mitigate the adverse effects of cold, i.e. to adapt to it. Yet adaptation itself is costly.

Several studies have addressed the issue of cold-related economic costs before, and the adverse effects of cold on productivity and quality of life are well documented. Hill & Gaddy (2003), chapter 3, discusses the work done to date on Canada and the USA.2 The general conclusion is unambiguous: cold is costly. Abele (1986) finds that between 1 and 2% of labor productivity is lost when the temperature drops by 1°C in the range between 0°C and -20°C, and more at lower temperatures. Herbert & Burton (1994) find that Canada spends about 1.6% of GDP annually to combat the effects of cold weather. Even in the relatively warm climate of the USA, a 1°C drop in temperature would result in extra costs amounting to 1.04 to 1.46% of GDP.3 Finally, Rosenthal, Gruenspecht & Moran (1995) find that in the USA the extra costs of cooling brought by global warming of 1°C would be more than offset by lower costs of heating. All these studies suggest that for Russia the economic burden of cold could be just as severe, if not more so.

While there is overwhelming evidence that operating in cold climates does indeed carry a significant economic cost, the second motivating question remains: what is the magnitude of the costs in the Russian case? The studies for the US provide an informative

1 The case of Russian regions and the variation in their economic performance is particularly interesting in light of this debate. Russia represents a rare natural experiment in which population was distributed (by the Soviet authorities, and in a manner largely exogenous to true economic incentives) across a vast geographical space with diverse climatic conditions, yet under the same set of institutions.
2 Including Abele (1986), Herbert & Burton (1994) and a series of studies by the U.S. Department of Transportation. For a summary, see Hill & Gaddy (2003).
3 U.S. D.O.T. study, cited in Hill & Gaddy (2003).

benchmark, but cannot be directly applied to Russia. The differences in technology between Russia and the US are profound: it is conceivable that adaptation to cold weather is significantly better in one of these economies, either Russia (due to the massive Soviet investments) or, more likely, the US (due to market-economy incentives). Furthermore, what if the burden of cold is not simply proportional to the temperature differential, i.e. what if the cost of cold is non-linear? The US January TPC is +4°C vs. -12.9°C for Russia. It is therefore important to estimate the costs of cold specifically for Russian conditions.

Canadian studies of adaptation costs (Herbert & Burton (1994)) are more relevant for the Russian case, as climatic conditions in Canada are closer to Russian ones. However, they answer a different question: they estimate the total costs of adaptation, i.e. of battling the cold weather. First, any adaptation to cold is partial: some of the adverse effects of cold cannot possibly be mitigated. Thus, Herbert & Burton (1994) provide only a lower bound on the total costs of cold. Second, my work concerns not the total costs of cold per se, but rather the costs of extra cold in Russia, i.e. the climate-related costs of spatial misallocations by the Soviet system. Russia has always borne, and will continue to bear, the costs of an unfavorable location. Climate cannot be changed; the costs of a poor climate in general cannot be prevented. Spatial misallocations, however, can be corrected. How costly these misallocations are is the question directly relevant for policy design.

3.2 Energy

This section studies energy consumption in Russia as a function of temperature. In section 3.2.1 I look at the consumption of energy in the producing sectors and calculate the temperature elasticity of the energy/labor ratio. In section 3.2.2 I estimate how different residential and commercial energy consumption for heating would be in the counterfactual allocation. Finally, section 3.2.3 combines the estimates for all sectors and calculates the total extra energy costs.

For the producing sectors, looking only at the energy/labor ratio does not reveal all of the extra energy costs due to cold. Cold may be detrimental to the productivity of all production factors: not only energy, but labor and capital as well. Therefore, it is possible (and likely) that producing the same amount of output in a colder climate requires not only more energy, but also more labor and more capital. The estimates based on the energy/labor ratio reflect the cost of cold only partially, and should be treated as a lower bound on the true cost. Section 3.3, dedicated to productivity, addresses these questions further.

For the residential and commercial sectors, due to data limitations I can only calculate the change in energy consumed for heating purposes. Most probably, extra heating costs are the most pronounced consequence of cold, but they are definitely not the only one. It is possible that non-heating use of energy (lighting, cooking, etc.) also increases with cold. As in the case of the producing sectors, the calculated energy cost in the residential and commercial sectors is just a lower bound on the true cost.

3.2.1 Energy use in producing sectors

The amount of energy used in production per unit of labor depends, in general, on production technology and, possibly, on environmental factors such as climate. The goal of this exercise is to test whether low temperatures indeed lead to higher industrial energy consumption in Russia, and to quantify the effect, if any. I test the hypothesis on Russian regional data on aggregate consumption in the producing sectors: agriculture and industry.4 Let the energy-to-labor requirement ratio in region i depend on temperature in the following way:

\[ \ln\frac{E_i}{L_i} = \alpha_0 + \alpha_t t_i^{\circ} + \epsilon_i \tag{3.1} \]

where $E_i$ is the amount of energy (electric, thermal or fossil fuel) used for production in region $i$, $L_i$ is total industrial and agricultural employment in region $i$, $t_i^{\circ}$ is region $i$’s average January temperature, and $\epsilon_i$ is an error term.

The error term $\epsilon_i$ reflects all regional idiosyncrasies pertaining to energy consumption. If the error term is orthogonal to temperature (random across the geographical dimension), equation (3.1) can be estimated by OLS.

One of the factors that can affect a region’s energy consumption is industry structure. A region specializing in energy-intensive industries is bound to use more energy in any climate. If the regional industrial structure in Russia follows a particular geographic pattern, the $\epsilon_i$ are not independent of temperature, and we risk getting biased estimates of the temperature elasticity. To account for interregional specialization, I include industry structure controls in equation (3.1):

\[ \ln\frac{E_i}{L_i} = \alpha_0 + \sum_{j=1}^{m} \alpha_j \frac{L_{ij}}{L_i} + \alpha_t t_i^{\circ} + \epsilon_i \tag{3.2} \]

where $L_{ij}$ is total employment in industry $j$ in region $i$ ($\sum_j L_{ij} + \mathrm{Agriculture}_i = L_i$). This way, the aggregate energy-to-labor ratio depends on the industrial structure of regional production and, additively, on temperature. Equation (3.2) is estimated by OLS.

The data on electricity, thermal energy and fossil fuels used for production come from Goskomstat publications.5 Data on industrial and agricultural employment come from the Industrial Census of the Soviet Union, 1989, and from Goskomstat.6 To create the industry structure controls ($L_{ij}$), I grouped employment data by region (oblast) and by SIC code.
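As an illustration of the estimation step, the following is a minimal sketch of an OLS fit of a specification shaped like equation (3.2) with a single industry-share control. It is not the code used for the thesis: the regional values are synthetic, the "true" coefficients are made up so that the fit can be checked, and the helper `ols` is a hypothetical stand-in for any standard regression routine.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[a] * row[b] for row in X) for b in range(k)]
        + [sum(row[a] * yi for row, yi in zip(X, y))]
        for a in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic regions (assumption): January temperature in degrees Celsius and
# the employment share of one energy-intensive industry.
temps = [-5.0, -10.0, -15.0, -20.0, -25.0, -8.0]
shares = [0.10, 0.30, 0.05, 0.25, 0.15, 0.20]

# Made-up coefficients used only to generate noise-free data.
TRUE_A0, TRUE_AJ, TRUE_AT = 1.0, 2.0, -0.037
log_energy_per_worker = [
    TRUE_A0 + TRUE_AJ * s + TRUE_AT * t for s, t in zip(shares, temps)
]

# Regressors: constant, industry share L_ij / L_i, temperature.
X = [[1.0, s, t] for s, t in zip(shares, temps)]
alpha_0, alpha_share, alpha_t = ols(X, log_energy_per_worker)
```

Because the dependent variable is in logs and temperature is in levels, `alpha_t` reads as the proportional change in the energy/labor ratio per degree Celsius, which is how the elasticities are quoted in this section.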

4 In 1991 and 1992 the Russian State Committee for Statistics (Goskomstat) published regional data on energy “used for production of goods”. That includes the sectors producing material goods (industry and agriculture), and excludes the commercial (services and government) and residential sectors.
5 “Materialno-Technicheskoe Obespechenie v Rossiiskoi Federatsii”, 1992 and 1993.
6 “Regiony Rossii” (“Regions of Russia”), 1997.

The industry structure controls consist of employment shares in 2-digit SIC industries and in some 4-digit SIC industries. To choose them, I first looked at data on the energy-to-labor ratio for 2-digit and 4-digit SIC industries in order to pinpoint which 4-digit industries are most likely to consume significantly more or significantly less energy than the 2-digit average. As a guide, I took the data for industry in the US.7 If a 4-digit industry had an energy-labor requirement ratio significantly higher or lower than the 2-digit average, it was made into a separate series. For example, Primary Aluminum (SIC 3334) and Pulp and Paper Mills (SIC 2611 and 2621) are among the “suspected” energy-intensive outliers and were separated from the SIC 33 and SIC 26 groups. This way, I constructed 33 series of shares of 2-digit and 4-digit industries in total employment. (See Table C.5 for the list.)

All 33 industry series cannot be used at the same time in a regression with a sample size of 79 (regions). To ease the degrees-of-freedom deficit, the SIC groups that have roughly the same energy-labor ratios were combined. The groups that have energy-labor ratios close to the economy-wide average were dropped. In the final regressions, 10 to 12 separate control series were used.8

The Soviet Industrial Census gives data on civilian enterprises only. A large portion of the economy (the so-called “VPK”, a Russian acronym for “Military-Industrial Complex”) involved at least partially in the production of military goods is not included in the database. To make up for the omitted VPK employment, I use the estimates given by Horrigan (1992). The estimated VPK employment was added to the regional civilian employment total (so that $L_i = \mathrm{Agriculture}_i + \mathrm{VPK}_i + \sum_j L_{ij}$), and the VPK series was included in the set of industry controls.

Equation (3.2) models the dependence on temperature in an additive way, and thus essentially assumes that temperature affects all industries uniformly. Per equation (3.2), the extra energy required to operate in an environment one degree colder does not depend on industrial structure. In reality, the “sensitivity” of the energy/labor requirement ratio to temperature may differ across industries. However, to test for this in our environment one would need to include a set of interaction terms $t_i^{\circ} \cdot L_{ij}/L_i$, which seems impossible with so many industry controls and only 79 regional observations in the data sample: the degrees of freedom would be effectively exhausted.

The results are reported with and without the VPK correction. Tables C.6-C.7 show the results for electricity consumption, Tables C.8-C.9 for thermal energy, and Tables C.10-C.11 for fuels (see appendix C). The estimated elasticities with respect to temperature are highly significant and negative in both years and for all types of energy. The absolute values of the elasticity range from 2.5% to 3.9% per degree Celsius. Including industry structure controls has little effect on either the electricity or the thermal energy elasticity estimates. For fuels, the temperature

7 Per the Energy Information Administration (EIA), in “Manufacturing Consumption of Energy 1991”, on-line at .
8 As the regression results show, neither the coefficient of interest (the elasticity with respect to temperature) nor its standard error changes significantly even when too many controls are added.

coefficient drops from 4.4% in 1991 (4.9% in 1992) to 2.5% (1991) and 3.4% (1992).

The elasticity for electricity consumption is virtually the same (3.7-3.8% per °C) for 1991 and 1992. For thermal energy and for fossil fuels the estimates differ significantly between 1991 and 1992. It is possible that the change in coefficients accurately reflects real structural changes in energy consumption patterns. At the end of 1991 and the beginning of 1992 Russia went through a price liberalization reform: state control of consumer prices was abandoned. A period of falling GDP and high inflation followed. Different temperature elasticities in 1991 and 1992 could reflect a change in producers’ behavior during these years, before and after the reform. Alternatively, of course, the discrepancy in coefficients might be due to poor data quality.

Including VPK in the industrial totals and in the controls does not seem to affect the estimates. Including VPK yields a marginally better $R^2$ in the model with industrial controls, but not in the model without controls. The estimated coefficients of the VPK variable were not statistically different from those of other metal and machinery industries (SIC 32-39), which prompted me to combine the series.

The elasticity estimates of 2.5% to almost 4% per degree translate into serious energy consumption differences across the Russian climate range. For example, most of the Western Siberian cities (Omsk, Tomsk and Novosibirsk, to name a few) have January temperatures 8 to 9 degrees C below that of Moscow. In the case of electricity, for example, this difference translates into a 27-30% higher requirement per unit of labor for similar production processes. In order to finance this difference, about 10% of all output (revenue) in the city of Omsk has to be spent!9 Of course, production in Eastern Siberia requires even more extra energy. The extreme example is a city of , where production requires two times more electric energy per unit of labor than in Moscow.

3.2.2 Residential energy consumption

The most significant consequence of cold is, probably, extra heating costs. For residential and commercial structures, the relationship between the amount of energy required for heating and the outside temperature may be adequately approximated by a linear function. The amount of energy required to heat a building with particular structural characteristics over a period of time when the outside temperature varies is approximately proportional to the number of degree-days below the building’s natural balance point.10 A balance point is the minimum outside temperature such that no

9 Assuming the “extra” (compared to Moscow) energy consumption is all , sold for the average export price. If converted into oil at the export price, the “extra” energy would cost up to 18% of output.
10 From Rosenthal et al. (1995): “...In the case of direct combustion of fossil fuels the proportionality assumption is quite accurate... For electric heat pumps... the coefficient of performance is a (positive) function of the outside temperature...” Thus, the energy requirement as a function of temperature is somewhat nonlinear, with a decreasing slope. Therefore, by using a linear approximation we are, if anything, understating the amount of energy used for heating in the coldest Russian regions, and understating the possible savings of this energy.

additional heating is required: the secondary inside sources of heat (people’s body heat, lighting fixtures or other appliances) and the building’s insulation are enough to keep the indoors warm.11 Given a balance point of $b$ degrees Celsius, the yearly number of degree-days in location $i$ is calculated in the following way:

\[ \mathrm{HDD}_i = \sum_{k=1}^{365} \max(b - t_{ik}^{\circ}, 0) \tag{3.3} \]

where $t_{ik}^{\circ}$ is the mean temperature in location $i$ on day $k$.

To calculate the amount of energy required for residential heating in a city (or region), I assume that total energy used is proportional to the population of that city (region). In other words, I am assuming that there are no economies or diseconomies of scale in residential heating.

In fact, economies of scale in heating are very probable. The well-known “city warmth” effect is an excellent example. A large city generates a warm “pocket”: the average temperature in the center of the city is higher than in the surrounding areas. This effect is easy to account for, though, by using the actual temperature readings in the city in question. The other effect is due to higher residential population density: per capita heating costs are inversely related to the number of floors in a building and directly related to the amount of living space per capita. Thus more densely populated areas tend to use less energy per capita.

On the other hand, the facts about existing Russian technology suggest that some diseconomies of scale may be present. In most Russian urban areas buildings are heated through a centralized system: a central boiler produces hot water and/or steam that is distributed through pipes to the residential and commercial buildings in the district. The losses of thermal energy in transmission must be directly related to the length of the pipe network, possibly making areas with more population less energy efficient. It is difficult to say whether the positive or the negative effect of agglomeration on energy consumption actually prevails in the Russian case.
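The degree-day sum in equation (3.3) is simple enough to sketch directly. The snippet below is only an illustration: the function name `heating_degree_days` and the one-week temperature series are hypothetical, whereas the actual calculation uses a full year of daily station data.

```python
def heating_degree_days(daily_mean_temps_c, base_c=10.0):
    """Sum of max(base - t, 0) over daily mean temperatures, as in equation (3.3)."""
    return sum(max(base_c - t, 0.0) for t in daily_mean_temps_c)


# Hypothetical week of daily mean temperatures, in degrees Celsius.
week = [-12.0, -8.5, -3.0, 0.0, 4.0, 10.0, 15.5]

hdd_10 = heating_degree_days(week, base_c=10.0)  # 10 C balance point
hdd_18 = heating_degree_days(week, base_c=18.3)  # 65 F (18.3 C) balance point
```

Days at or above the balance point contribute nothing, so raising the base temperature both lengthens the implied heating season and adds degree-days on every day already counted.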
With no economies or diseconomies of scale, the total amount of energy for residential heating in region $i$ is:

\[ E_i^R = C \cdot \mathrm{HDD}_i \cdot \mathrm{Popul}_i \tag{3.4} \]

where $\mathrm{Popul}_i$ is the population of region $i$ and $C$ is a proportionality coefficient. Then, if we assume that the structural characteristics of residential buildings in different regions are homogeneous with respect to the balance point, the country-wide average residential heating consumption per capita can be calculated as:

\[ \frac{\sum_i E_i^R}{\sum_i \mathrm{Popul}_i} = C \cdot \mathrm{HDDPC} \tag{3.5} \]

where $\mathrm{HDDPC}$ is the countrywide average number of heating degree-days per capita, calculated in the following way:

\[ \mathrm{HDDPC} = \frac{\sum_i \mathrm{HDD}_i \cdot \mathrm{Popul}_i}{\sum_i \mathrm{Popul}_i} \tag{3.6} \]

11 A base point for residential housing used by the US Energy Information Administration and by Rosenthal et al. (1995) is 65 degrees Fahrenheit (18.3 degrees Celsius).

Simply put, under our assumptions the heating costs are proportional to the number of heating degree-days per capita. The number of HDDs per capita is a population-weighted average of degree-days in all the regions (equation (3.6)).12 To compare the costs of residential heating in the actual and counterfactual allocations, one simply has to compare the actual and counterfactual per capita HDDs.

To construct an HDD per capita measure I use temperature data from the National Climatic Data Center (CLIMVIS database).13 The CLIMVIS database provides daily observations of various climate indicators for about 8000 weather stations worldwide. I collected the mean temperature data for each day during the 1994 to 2002 period for 43 Russian stations. The stations chosen are located in or near former province capitals of the Russian Empire that now lie within the borders of the Russian Federation. Most of the cities are now oblast centers. I then calculated the average annual number of heating degree-days according to equation (3.3).14 I used three base temperatures: 10°C, 5°C and 18.3°C.15

It is the practice of the US Energy Information Administration to calculate heating and cooling degree-days from a base of 65°F (18.3°C). The EIA divides all US territory into climate zones according to the number of 65°F-base heating and cooling degree-days. In the USA, the places with more than 7000 heating degree-days (Fahrenheit, 65°F base), classified as the “cold” climate zone, host 8.7% of the total population. In Russia, 87% of the population lives in the “cold” climate zone, and the remaining 13% in the second coldest zone, according to the EIA classification.

The EIA standard of an 18.3°C heating base temperature was also used in the study by Rosenthal et al. (1995). However, in my opinion a lower base temperature has to be adopted for the analysis of Russia. Accepting the 65°F heating base for Russia would imply that Moscow, for example, requires heating for 11 months a year, which does not happen in reality. The normal heating period in Moscow is from the 15th of September to the 15th of May, and the target inside temperature must be at or above 18 degrees Centigrade.16 For the analysis I use a base temperature of 10°C, which seems to be the most realistic in the Russian case.

Table 3.1 presents the number of heating degree-days in selected Russian cities. The first column shows the HDD for the 10°C base. For comparison purposes, numbers for the other base temperatures are also given. The last column shows heating degree-days calculated by the EIA standard (HDD for the 65°F/18.3°C base); the Fahrenheit equivalent is shown in parentheses.

The southern city of Makhachkala, the capital of has the lowest number of heating degree-days. By far the coldest city is Yakutsk, with an average of 7549 HDD.

12 The way the HDD per capita measure is calculated is perfectly analogous to TPC, the temperature per capita index discussed in Mikhailova (2003).
13 The database is on-line at .
14 All calculations are done in degrees Celsius.
15 Corresponding to 50°F, 41°F and 65°F.
16 The fact that Russia constructed a housing stock with (on average, but not universally) better insulation than in the US suggests that some degree of adaptation to cold happened even under the Soviet system.

City                                 HDD 10°C   HDD 5°C   HDD 18.3°C (in °F)
Makhachkala                            1076       339       2783 (5009)
Moscow                                 2505      1417       4726 (8506)
St. Petersburg                         2510      1378       4856 (8740)
Omsk                                   3829      2671       6078 (10942)
Irkutsk                                4295      3031       6718 (12095)
Yakutsk                                7549      6167       9998 (17996)
Russian actual per capita average      2985      1883       5189
Counterfactual per capita average      2817      1727       5010
Difference, %                           5.7       8.3        3.5

Table 3.1: Heating degree-days for select cities and Russia as a whole.

Moscow and St. Petersburg both have close to 2500 HDD annually. The number of HDD in Siberian cities ranges from about 3908 in Krasnoyarsk to about 4755 in Chita. The average Siberian city has 60% more HDD than Moscow, and hence requires 60% more energy per resident for heating! To get the actual and counterfactual HDDs per capita, I plugged the actual and counterfactual population values into equation (3.6). The actual number of (10°C) HDD per capita is about 2985. The counterfactual HDD per capita is 2817: slightly more than the Moscow average, and 5.7% less than the actual figure. Thus, if market forces had allocated the population in Russia, the country as a whole would require at least 5.7% less energy for residential heating.17

Furthermore, similar implications can be derived for commercial energy use. The commercial building stock hosts various non-manufacturing firms, mostly in the service sector, and administrative functions. There is little regional variation in the share of people employed in the administrative and service sectors. Therefore, it is natural to expect that in a counterfactual allocation the commercial sector, like the residential one, would use about 5.7% less energy.
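The population weighting in equation (3.6) can be sketched as follows. The city HDD values are the 10°C-base figures from Table 3.1, but the two population vectors are hypothetical numbers chosen only to show the mechanics; they are not the actual or counterfactual populations used in the thesis, and `hdd_per_capita` is an illustrative helper name.

```python
def hdd_per_capita(hdd_by_region, population_by_region):
    """Population-weighted average of heating degree-days, as in equation (3.6)."""
    total_population = sum(population_by_region.values())
    weighted_sum = sum(
        hdd_by_region[region] * population_by_region[region]
        for region in hdd_by_region
    )
    return weighted_sum / total_population


# HDD at the 10 C base, taken from Table 3.1.
hdd = {"Moscow": 2505, "Omsk": 3829, "Yakutsk": 7549}

# Hypothetical populations in millions (assumption, for illustration only).
actual_pop = {"Moscow": 8.0, "Omsk": 1.5, "Yakutsk": 0.5}
counterfactual_pop = {"Moscow": 9.0, "Omsk": 0.8, "Yakutsk": 0.2}

hddpc_actual = hdd_per_capita(hdd, actual_pop)
hddpc_counterfactual = hdd_per_capita(hdd, counterfactual_pop)

# Under equations (3.4)-(3.5), heating energy per capita is proportional to
# HDDPC, so the relative saving does not depend on the coefficient C.
relative_saving = 1.0 - hddpc_counterfactual / hddpc_actual
```

Shifting population weight from the colder cities toward Moscow lowers the weighted average, which is the mechanism behind the per capita difference reported in Table 3.1.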

17 An interesting topic for future work is the hypothesis that the predominant structural characteristics of the Russian housing stock would have been different in a market economy environment. For such a cold climate, insulation in the average Russian housing unit is well below adequate. (“...as much as 800 kilowatt-hours per square meter per year are expended to heat multistory apartment blocks - roughly 10 times what is spent in developed countries.” A quote from “Foreign Insulation Gets a Warm Welcome”, The St. Petersburg Times, January 10, 2003, by Robin Munro.) The aggressive subsidization of energy in the Soviet Union discouraged the use of more energy-efficient but more expensive materials and heating technologies. It is natural to expect that the situation in a free market environment would be drastically different. Although the goal of this paper is to isolate the effect of spatial misallocations alone, one needs to bear in mind that the inherited Soviet distortions of another nature (structural, technological) also add to the burden on the Russian economy. More research is needed on how to mitigate the adverse effects of cold. Is it cheaper to insulate buildings, or to move people away from cold places (or both)?

                                         Production sector energy             Residential and commercial
Energy savings                           Electric   Thermal   Fossil fuels    sector, energy for heating
“optimistic” version (1991 estimates)     5.55%      3%        3.75%           5.7%
“pessimistic” version (1992 estimates)    5.7%       6%        5.25%           5.7%

Table 3.2: Savings of energy in counterfactual relative to actual allocation.

3.2.3 Cost

The next step is to calculate the cost of all the extra energy the Russian economy uses because of its inefficient spatial allocation. First, the total amounts of “excess” energy used in production can be calculated using the estimated temperature elasticities together with the difference between the actual and counterfactual TPCs:

\[ \Delta E = \alpha_t \cdot \Delta \mathrm{TPC} \tag{3.7} \]

The $\alpha_t$ coefficients were estimated from a cross-sectional sample as described in section 3.2.1. I calculate the cost for two sets of estimated elasticities: the “optimistic” 1991 values, 2% and 2.5% for thermal energy and fossil fuels, respectively; and the “pessimistic” 1992 values, 4% and 3.5%. Chapter 2 establishes ∆TPC to be 1.5°C. Plugging this value together with the elasticities into equation (3.7) gives the total percentage savings of energy in the counterfactual allocation (see Table 3.2). Residential and commercial energy consumption would have been 5.7% lower in the counterfactual allocation, as established by the degree-days analysis in section 3.2.2.

The next step is to combine the estimated percentage savings with the actual amounts of each kind of energy consumed in each sector. To the best of my knowledge, only two sources exist for energy balances by sector: Goskomstat (partial data) and the Asia-Pacific Economic Cooperation (APEC) energy balance reports (drawing on data supplied directly to APEC by the Russian Fuel and Energy Ministry).18,19 The data from these two sources sometimes disagree drastically. I used data on electricity consumption from Goskomstat, data on the industrial-residential breakdown of thermal energy from Olkhovsky (2001), and data on the sectoral breakdown of fossil fuels from the APEC Energy Balance Tables. The diagram in Figure 3.1 summarizes the information on energy sectors and shows which parts of total energy consumption did or did not go into the cost estimates.20 For the production sector, I use two values for each elasticity parameter: low (estimated from 1991 data) and high (from 1992).
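The production-sector entries of Table 3.2 follow mechanically from equation (3.7). A minimal sketch, assuming the thermal and fossil-fuel elasticities quoted above and electricity elasticities of 3.7% (1991) and 3.8% (1992) per °C; the year-by-year assignment of the electricity values is inferred from the 3.7-3.8% range in section 3.2.1 and from Table 3.2, and all names in the code are illustrative.

```python
DELTA_TPC = 1.5  # degrees Celsius, actual minus counterfactual TPC gap (chapter 2)

# Temperature elasticities, in percent per degree Celsius.
elasticities = {
    "optimistic (1991)": {"electric": 3.7, "thermal": 2.0, "fossil fuels": 2.5},
    "pessimistic (1992)": {"electric": 3.8, "thermal": 4.0, "fossil fuels": 3.5},
}

# Equation (3.7): percentage saving = elasticity * temperature difference.
savings_pct = {
    version: {kind: round(alpha * DELTA_TPC, 2) for kind, alpha in values.items()}
    for version, values in elasticities.items()
}
```

With these inputs the sketch yields 5.55%, 3% and 3.75% in the optimistic case and 5.7%, 6% and 5.25% in the pessimistic case, matching the production-sector columns of Table 3.2.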

18 Goskomstat source: “Statistical Yearbook of Russia”, 2000.
19 APEC source: Asia Pacific Energy Research Centre, “APEC Energy Supply and Demand Outlook, 2002”, Energy Balance Tables.
20 I calculate the cost given the limited data on energy consumption I could collect from available sources. If and when more and better regional data become available, they would make it possible to fill in the missing estimates, conduct a more thorough analysis, and resolve the discrepancies between 1991 and 1992.

Figure 3.1: Energy sectors: consumption structure and counterfactual savings.

The shares of the productive (industrial plus agricultural), residential, and commercial sectors in electricity consumption were calculated from Goskomstat data on the electric energy balance for the years 1995-2001.21 The share of the industrial sector in total energy consumption has been steadily declining through these years; in my calculations I used a simple average of the yearly shares. The shares of the other sectors remained relatively constant. The residential and commercial share is somewhat more complicated to infer, since the data on consumption in these sectors are pooled together with transmission losses. To subtract the losses, I calculated the share of electricity losses in total consumption in 2000 from Table 14.23 in “Statistical Yearbook of Russia”. According to Table 14.23, transmission losses in the electricity sector (35.0 million t.c.e.) account for about 12% of total electricity consumption (297.5 million t.c.e.). So I subtracted 12% of the total from the average “other sectors...” consumption percentage. The remainder is the 21% average share of the residential and commercial sectors.

21 “Statistical Yearbook of Russia”, Table 14.25 on page 358, gives data on total consumption together with consumption for the industrial, agricultural, and transportation sectors and “other sectors and losses”. (The “other sectors and losses” category is the sum of consumption in the residential and commercial sectors plus total transmission losses in all sectors combined.)

I assume that residential consumption of electricity (mostly for lighting and appliances) does not change with temperature, so that the temperature elasticity is zero, which is possibly unreasonable. A small part of the housing stock is heated with electricity exclusively, and many households use electrical space heaters as a backup. Thus, the electricity savings are somewhat underestimated.

According to my estimates, fossil fuels used in the production process have a temperature elasticity of 2.5-3.5% per °C. I do not have regional data broken down by the type of use: how much fossil fuel was used as an input for (thermal and electric) energy generation, and how much was used in final consumption for other purposes. Because of these data limitations, I have to assume that the consumption of fossil fuels for all purposes has the same temperature elasticity. In the production sector, many firms use plant-size thermal stations that run on fuel; the generated energy is used both for heating and, if required, in technological processes. Most of the fuel delivered to final consumers (households and firms) is used in roughly the same way as in separate energy-generating facilities. Only a small part is used as a raw material for chemical and allied industries (see the Goskomstat energy balance, Table 14.25 in “Statistical Yearbook of Russia”). Therefore, I do not believe the assumption distorts the results significantly.

To approximate the residential share of fossil fuel consumption, I took APEC data on residential and commercial consumption of three major components: natural gas, oil and coal. Then I calculated its share in the total consumption of gas, oil and coal. The production share is the remainder minus losses (from “Yearbook of Russia”, Table 14.25) and minus the share of the transportation sector.

For thermal energy, I first subtracted the share of losses (Goskomstat), then divided the remainder between the residential/commercial and production sectors in the proportion given by Olkhovsky (2001).

Sector                             Consumption,   Counterfactual   Counterfactual
                                   mln t.c.e.     savings, %       savings, mln t.c.e.
Fossil fuels, residential            85.5          5.7               4.9
Fossil fuels, production            149.6          3.8-5.3           5.7-7.9
Fossil fuels, energy generation     408.6          3.8-5.3          15.5-21.7
Electricity, residential             62.5          0                 0
Electricity, production             178.5          5.6-5.7          10.0-10.2
Thermal energy, residential          92.3          5.7               5.3
Thermal energy, production          113.7          3-6               3.4-6.8
Total for the sectors above        1090.7          4.1-5.2          44.8-56.8

Table 3.3: Counterfactual “savings” of energy.
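The absolute savings in Table 3.3 are the product of each sector's consumption and its percentage saving. The sketch below reproduces that arithmetic from the table's own inputs; the variable names are illustrative, and because it multiplies unrounded values, its totals can differ from the table's sums of rounded entries by about 0.1 million t.c.e.

```python
# (sector, consumption in mln t.c.e., low savings %, high savings %), from Table 3.3.
sectors = [
    ("Fossil fuels, residential", 85.5, 5.7, 5.7),
    ("Fossil fuels, production", 149.6, 3.8, 5.3),
    ("Fossil fuels, energy generation", 408.6, 3.8, 5.3),
    ("Electricity, residential", 62.5, 0.0, 0.0),
    ("Electricity, production", 178.5, 5.6, 5.7),
    ("Thermal energy, residential", 92.3, 5.7, 5.7),
    ("Thermal energy, production", 113.7, 3.0, 6.0),
]

total_consumption = sum(consumption for _, consumption, _, _ in sectors)
low_total = sum(consumption * low / 100.0 for _, consumption, low, _ in sectors)
high_total = sum(consumption * high / 100.0 for _, consumption, _, high in sectors)

# Overall savings as a share of the consumption covered by the table.
low_share_pct = 100.0 * low_total / total_consumption
high_share_pct = 100.0 * high_total / total_consumption
```

The resulting totals are close to the 44.8 to 56.8 million t.c.e. range, or roughly 4.1 to 5.2% of the 1090.7 million t.c.e. covered by the table.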
Table 3.3 combines the total energy consumption data (“Yearbook of Russia”, Table 14.25) and the inferred shares of each sector to give the absolute levels (in tons of coal equivalent, or t.c.e.) of energy consumption in all sectors and to calculate the corresponding absolute savings.

The next step is to determine the cost of the 44.8 to 56.8 million t.c.e. of lost energy. The price of one t.c.e. differs depending on the type of energy. I calculate the cost in terms of natural gas, one of the cheapest of the most common energy types. An average representative heat content of 1000 cubic meters of natural gas is 36 GJ or 1.23 t.c.e. Thus, the 44.8-56.8 million t.c.e. of extra energy are equivalent to 36.4-46.2 billion cubic meters of natural gas. The average export price of natural gas22 for year 2000 was 87.5 $US per 1000 cubic meters. The would-be savings of energy in 2000 translate into 3.1 to 4.1 billion dollars, or about 0.91 to 1.15% of 2000 Russian nominal GDP23 of 349.4 billion $US.24

We must bear in mind that the energy costs are incurred on a yearly basis. The cost of lost investment opportunities accumulates over time. If the Russian economy could have grown 1 percentage point faster over the last 30 years (this is what would have happened had there been no substantial increase in Siberian in-migration in the 1970s, hence no sharp decrease in TPC), Russian GDP in 2000 would have been about 35% higher!25

I must add that many components of the total extra energy costs are still unaccounted for. In reality, electricity use by residential customers must have a positive temperature elasticity. Energy consumption by the transportation sector is, for the purposes of this research, assumed to be independent of climate. In reality it mostly increases with cold. More importantly, in Russia burning various kinds of fuel is the most common way of generating energy. Fuel has to be transported from the place of production to consumers. If fewer people lived in severe climates, not only would less fuel be spent, but less fuel would also have to be transported across the vast Russian space. For many remote Russian regions transportation costs are comparable with the cost of the fuel itself.26 In the counterfactual allocation, all those transportation costs would not be incurred. The cost of extra energy consumed is only a lower bound on the true excess energy-related costs.
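The arithmetic behind the natural gas valuation and the compounded growth loss can be retraced with a short script. This is only a sketch using the figures quoted in the text (the conversion factor, export price, GDP and growth assumptions); the function names are illustrative, and small differences from the rounded numbers in the text are to be expected.

```python
# Figures quoted in the text; function names are illustrative only.
TCE_PER_1000_M3 = 1.23      # t.c.e. per 1000 cubic meters of natural gas
GAS_PRICE_USD = 87.5        # export price per 1000 cubic meters, year 2000
GDP_2000_USD = 349.4e9      # Russian nominal GDP, year 2000


def energy_cost(extra_tce):
    """Return (billion m3 of gas, cost in billion USD, cost as % of GDP)."""
    thousand_m3 = extra_tce / TCE_PER_1000_M3       # units of 1000 m3
    cost_usd = thousand_m3 * GAS_PRICE_USD
    return thousand_m3 * 1e3 / 1e9, cost_usd / 1e9, 100 * cost_usd / GDP_2000_USD


def lost_gdp_share(extra_growth_pp=1.0, base_growth_pct=0.0, years=30):
    """Percent by which GDP would be higher after `years` of faster growth."""
    fast = 1 + (base_growth_pct + extra_growth_pp) / 100
    base = 1 + base_growth_pct / 100
    return 100 * ((fast / base) ** years - 1)


for tce in (44.8e6, 56.8e6):
    gas, cost, share = energy_cost(tce)
    print(f"{tce / 1e6:.1f} mln t.c.e.: {gas:.1f} bln m3, {cost:.2f} bln USD, {share:.2f}% of GDP")

print(f"GDP gain, zero base growth: {lost_gdp_share():.1f}%")
print(f"GDP gain, 2.5% base growth: {lost_gdp_share(base_growth_pct=2.5):.1f}%")
```

The second function treats the loss as the ratio of two compounded growth paths, which is why the answer barely moves when the base growth rate changes from zero to 2.5%.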

3.3 Productivity

In this section I investigate how total factor productivity interacts with cold. Does TFP significantly decrease with extremely low temperatures? First, in subsection 3.3.1 I examine how total gross regional product changes with temperature. Then, in section 3.3.2 I look at the construction industry, which is probably one of the sectors that exhibit the most interregional heterogeneity along the temperature dimension.

22Source: Goskomstat, “Statistical Yearbook of Russia”, 2000 23Source: Energy Information Administration, . 24Same calculations for more expensive crude oil (1.46 t.c.e. per ton, 175$US export price in 2000) give counterfactual energy savings of 1.54 to 1.94% GDP. 25Calculated under assumption of zero growth. Because of ruble devaluation during transition, nominal GDP per capita growth (in currency exchange terms) in USSR/Russia from 1970 to 2000 is near zero. If instead we assume average annual growth of 2.5% (typical for pre-transition decade), the result does not change significantly - lost GDP amounts to about 34%. 26The example of Kamchatka region is discussed in Hill & Gaddy (2003): for the region’s electric utility plants, the price of fuel transportation is about twice the price of fuel itself.

3.3.1 Aggregate Production

Suppose the regional production function is of a Cobb-Douglas form and production requires two inputs, capital and labor:

$$Y_i = F_i(K_i, L_i) = A_i K_i^{\alpha} L_i^{\beta} \tag{3.8}$$

where Ki is the capital stock in region i and Li is region i’s labor supply. Total Factor Productivity Ai varies across regions i and is a loglinear function of temperature (TEMPi), regional industrial structure, resource endowment and other controls (Xij), and the unobserved regional component εi:

$$\ln A_i = A_0 + A_1 TEMP_i + \sum_j B_j X_{ij} + \epsilon_i \tag{3.9}$$

Plugging (3.9) into (3.8) and taking logs, we obtain:

$$\ln Y_i = A_0 + \alpha \ln K_i + \beta \ln L_i + A_1 TEMP_i + \sum_j B_j X_{ij} + \epsilon_i \tag{3.10}$$

Equation (3.10) cannot be estimated directly by OLS because Li and Ki are not exogenously given, but are choice variables. When the production function given by (3.10) is estimated with cross-sectional data, the factor with higher short-run mobility - labor - is usually instrumented for. Contrary to common intuition, in Russia labor is less, not more, mobile across regions than capital. Apparently, investment and disinvestment decisions produce more variation in regional capital stock over time than interregional migration does in the regional labor force. Regional labor supply is extremely stable, far more so than regional capital.27 The cross-sectional series of regional capital stock in 1994 and in 2000 are 92.6% correlated,28 while the series of regional labor force are correlated at the 99.8% level. Thus, I instrument capital with the regional pre-transition (1990) population.

As industry structure controls, I use the shares of selected industries in total employment. As it turns out, the most effective controls are the share of the oil and gas extracting industry (SIC13) and the share of agricultural employment. Specialization in oil and gas makes regions appear more productive, while dominance of agriculture negatively affects the estimated TFP.
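The instrumenting step can be illustrated with a minimal two-stage least squares sketch. Everything below is an assumption made for illustration: the data are synthetic (not the Goskomstat series), the coefficient values are invented, labor is exogenous by construction, and the helper names `ols`, `two_sls` and `simulate` are hypothetical. The point is only to show why OLS overstates the capital coefficient when capital responds to the unobserved productivity shock, and how an instrument such as 1990 population removes that bias.

```python
import random


def ols(X, y):
    """OLS coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        scale = A[c][c]
        A[c] = [v / scale for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [row[k] for row in A]


def two_sls(y, exog, endog, instrument):
    """2SLS with one endogenous regressor and one excluded instrument.

    `exog` rows must include the constant.
    Returns [exog coefficients..., endogenous-regressor coefficient].
    """
    Z = [row + [z] for row, z in zip(exog, instrument)]      # first-stage regressors
    gamma = ols(Z, endog)
    fitted = [sum(g * v for g, v in zip(gamma, row)) for row in Z]
    X_hat = [row + [f] for row, f in zip(exog, fitted)]      # second-stage regressors
    return ols(X_hat, y)


def simulate(n, seed=0):
    """Synthetic regions (NOT thesis data); capital responds to the TFP shock."""
    rng = random.Random(seed)
    y, exog, ln_k, ln_pop90 = [], [], [], []
    for _ in range(n):
        z = rng.gauss(7.0, 1.0)                  # log population in 1990 (instrument)
        temp = rng.gauss(-5.0, 6.0)              # temperature, deg C
        ln_l = 0.8 * z + rng.gauss(0.0, 0.5)     # log labor, exogenous by construction
        shock = rng.gauss(0.0, 0.3)              # unobserved TFP component
        k = 1.0 + 0.9 * z + shock + rng.gauss(0.0, 0.3)
        y.append(0.5 + 0.4 * k + 0.6 * ln_l + 0.01 * temp + shock)
        exog.append([1.0, ln_l, temp])
        ln_k.append(k)
        ln_pop90.append(z)
    return y, exog, ln_k, ln_pop90


y, exog, ln_k, ln_pop90 = simulate(5000)
naive = ols([row + [k] for row, k in zip(exog, ln_k)], y)
iv = two_sls(y, exog, ln_k, ln_pop90)
print("capital coefficient: OLS %.3f, 2SLS %.3f (true value 0.4)" % (naive[-1], iv[-1]))
```

The sketch uses a large synthetic sample so that the two estimators separate clearly; with 72 regions, as in the text, the 2SLS estimates would be much noisier. Standard errors from the second-stage regression are not valid as computed here and would need the usual 2SLS correction.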

27This is not to say that labor mobility is non-existent in Russia. At the level of individual firms labor has become increasingly mobile during the transition period. However, most labor movement is intersectoral, not interregional. The level of labor migration across regions is still low. 28Note that capital stock is measured as the base (purchase) value of capital adjusted for depreciation at the official rate, and does not truly represent the capital stock used for production. If a region is economically successful and the investment level is high, all new capital correctly shows up in the data. If a region is in recession, capital is underutilized, and the measured capital stock is actually higher than the utilized capital stock. Probably, capital is even more mobile across regions than it appears to be.

Dependent variable is the Log of GRP in 2000

Independent variables
Capital                       0.95* (0.28)     0.38 (0.26)      0.86* (0.31)     0.34 (0.26)
Labor                         0.23 (0.30)      0.70* (0.26)     0.29 (0.32)      0.74* (0.28)
Temperature                   -0.023* (0.006)  -0.013* (0.006)  -0.013 (0.007)   -0.011 (0.006)
Resource endowment index                                        0.25 (0.13)      0.09 (0.08)
Industry structure controls   no               yes*             no               yes*
R2                            0.85             0.92             0.86             0.95

Heteroskedasticity-consistent standard errors are in parentheses. Number of observations is 72. * denotes significance at 95% level.

Table 3.4: Regional production function estimates (second stage).

As an alternative (or additional) control variable I also used the regional resource endowment index compiled by the World Bank. The index value varies from zero for Moscow city to 2.2 for oil-rich Tuymen’. The results are shown in Table 3.4.29

Evidently, the temperature coefficient is not of the expected sign, and it is statistically significant. Low temperatures seem not to hurt production, but to help it! The coefficient loses some of its significance if industry structure controls are included, but it is far from being positive, as our hypothesis would predict.

The reason for the negative sign on temperature may be interregional income and price level differences. Cold regions tend to have higher nominal incomes and a higher cost of living. Higher local nominal incomes would imply higher prices for non-traded goods in colder regions. Even in the case of traded goods, substantial transportation costs (in a large country like Russia regions are spatially dispersed) would ensure that some degree of interregional price differential exists. Moreover, Gluschenko (2002) shows that the law of one price does not hold for Russian regions even when transportation costs are accounted for. Thus, price-income patterns in Russia exhibit a Balassa-Samuelson-type effect: richer regions have higher prices, even for traded goods. Higher wages in resource-extracting industries lead to higher prices for non-traded goods and services and, as a consequence, pump up the prices of traded goods through the service component in the final price. Concurrently, lower temperatures are associated with higher prices. Thus, even if a firm in a cold region is producing the same amount of real output as a comparable firm in a warm region, chances are

29The estimated shares of labor and capital are very sensitive to specification. Including industry controls changes the results drastically. This is one more indication that production functions are highly heterogeneous across industries. Additional work is needed to estimate the production function, perhaps using data disaggregated at the industry level.

Figure 3.2: Gross regional product, prices and temperature.

that the “cold” firm would record higher output in monetary terms. Since labor is measured in real terms - number of people employed - the estimates of TFP in high-price regions are inflated.30

Overcoming this problem requires data on real regional output, i.e. we need to deflate the nominal output by a regional producer price index. The only cross-sectional price indices available from Goskomstat are the costs of the 19- and 25-good consumer baskets31 and the local subsistence minimum. The results with the dependent variable - gross regional product - deflated by the subsistence minimum are in Table 3.5.

With deflated GRP data, the temperature elasticity estimates have increased. However, it is as yet unclear whether the result is robust to the choice of deflator. Using the price of either consumer basket as a deflator yields virtually the same results, since both are highly correlated with the subsistence minimum. Both indices are based on the most necessary consumer items and are a poor proxy for a producer price index. It is possible that producer prices vary less (or more) across regions than the subsistence minimum and consumer prices, so that the temperature elasticity in Table 3.5 is overestimated (or underestimated).

Further research on the relationship between productivity and climate is needed. One possible direction is to estimate the production function at the level of

30It is not clear whether price differentials among regions affect capital valuation in Russia. Theoretically, if the value of capital depends on the local price level, the coefficients α and β would also be biased. However, I know of no evidence that would link capital valuation to the consumer price index, and cannot conclusively argue one way or the other. 31The consumer baskets represent the most necessary products, mostly food.

Dependent variable is the Log of GRP/Subsistence in 2000

Independent variables
Capital                       0.22 (1.33)      -0.04 (0.20)     0.14* (0.25)     -0.08 (0.21)
Labor                         1.03* (0.23)     1.25* (0.20)     1.08* (0.25)     1.29* (0.22)
Temperature                   -0.008 (0.004)   -0.004 (0.006)   -0.0005 (0.005)  -0.0005 (0.004)
Resource endowment index                                        0.20* (0.07)     0.09* (0.06)
Industry structure controls   no               yes*             no               yes*
R2                            0.91             0.94             0.92             0.95

Heteroskedasticity-consistent standard errors are in parentheses. Number of observations is 72. * denotes significance at 95% level.

Table 3.5: Regional production function estimates, GRP deflated by subsistence minimum index (second stage).

industry. The production process is more homogeneous within a particular industry than at the level of aggregate production, so we should be able to estimate the coefficients on labor and capital (β and α) and the TFP better. Moreover, producer prices within the same industry are more comparable across regions, especially for traded goods. Therefore, even without using a regional price level deflator, we can analyze output data for an industry that produces traded goods and still estimate the temperature elasticity with little or no bias.

I would expect the temperature elasticity of TFP to be positive and highest in industries where the production process is not climate controlled, such as construction, forestry and logging. In industries where production is conducted indoors, I would expect little or no effect of climate on productivity itself, but higher costs of extra energy consumption.

3.3.2 Construction

This subsection analyzes the effect of climate on the productivity of the construction industry. There are three reasons for construction costs to be higher in cold climates. First, the material requirements must be higher in cold climates: comparable structures, for example residential housing, must provide a higher degree of insulation. Second, the productivity of machinery and labor must be negatively affected by cold.32 Finally, equipment, buildings and roads deteriorate faster in cold climates, require more maintenance and have to be replaced more often.

32See for example, the table of the cost thresholds for standard equipment in Siberia, by Mote, cited in Hill & Gaddy (2003) and in Gaddy & Ickes (2001).

Dependent variable              ROADS          BUILDINGS      Number of obs.   R2
Construction output, mln rub    2.19 (1.48)    1.29 (0.08)    74               0.77

Heteroskedasticity-consistent standard errors are in parentheses.

Table 3.6: Relative prices of construction output.

The data available to date on construction in Russian regions allow us to measure only one of the costs: the loss of productivity. Goskomstat33 publishes both industry-specific data on labor and capital and data on production in real terms. It tells exactly how many cubic meters of buildings were constructed and how many kilometers of roads and communications were built. The availability of real-terms data is welcome because of the difficulty of calculating interregional price level differences in Russia, as discussed in section 3.3.1. The temperature elasticity estimates based on nominal output data are bound to be biased downward.

Using data from Goskomstat,34 I estimate the construction industry production function and test whether temperature has a significant effect on total factor productivity. As in section 3.3.1, consider a Cobb-Douglas production function given by (3.8) with Total Factor Productivity being a function of temperature. The estimated equation is (3.10), where Yi is the real output of the construction industry in region i, and Ki and Li are capital and labor employed in construction.

Yi is supposed to represent the total amount of real output in construction. However, construction output is not fully homogeneous. Goskomstat gives separate data on the length of roads and the volume of buildings constructed.35 Different varieties of construction product have to be aggregated into a composite measure.

How should we aggregate output from different kinds of construction projects? I sum the data on buildings and on roads weighted by their relative prices. Prices are estimated from equation (3.11): I run a regression of total construction output (monetary value) on the physical volumes of buildings and roads constructed. The ratio of the estimated coefficients represents the relative price ratio sought.

$$P_i Y_i = P_{roads} ROADS_i + P_{build} BUILDINGS_i + \delta_i \tag{3.11}$$

where PiYi is the monetary value of construction output in region i. The most likely outliers - Moscow, Moscow region, St. Petersburg, ’ region - were dropped from the sample. The results are presented in Table 3.6.

33“Stroitel’stvo v Rossii”(“Construction in Russia”), 2000. 34“Construction in Russia”, 2000. 35Data on length of water, sewer and other types of communications constructed are also given, but their average monetary share in output is relatively small and the amounts are highly correlated with volume of buildings construction. Because of its high correlation with buildings construction volume, omitting this category of output would not change the results. I did not use these data.

Dependent variable                   Const             Log of Labor 1990   Number of obs.   R2
Log of Construction labor, 1999      -4.957 (0.308)    1.242 (0.045)       72               0.91

Heteroskedasticity-consistent standard errors are in parentheses.

Table 3.7: Construction industry production function IV estimation. First stage.

Under the assumption that relative prices reflect relative costs of different construction products, the aggregate index of real output Yi (up to a multiplicative constant) is calculated as a weighted sum of real outputs in two sectors:

$$C\hat{Y}_i = \hat{P}_{roads} ROADS_i + \hat{P}_{build} BUILDINGS_i \tag{3.12}$$
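The two-step aggregation in (3.11) and (3.12) can be sketched as follows. The data are synthetic and the units, price values and function names are hypothetical; the regression has no intercept, matching the form of (3.11) as written.

```python
import random


def price_weights(value, roads, buildings):
    """No-intercept OLS of output value on physical volumes, as in (3.11).

    Solves the 2x2 normal equations by Cramer's rule; returns (p_roads, p_build).
    """
    s_rr = sum(r * r for r in roads)
    s_bb = sum(b * b for b in buildings)
    s_rb = sum(r * b for r, b in zip(roads, buildings))
    s_rv = sum(r * v for r, v in zip(roads, value))
    s_bv = sum(b * v for b, v in zip(buildings, value))
    det = s_rr * s_bb - s_rb ** 2
    return (s_rv * s_bb - s_bv * s_rb) / det, (s_bv * s_rr - s_rv * s_rb) / det


def real_output_index(roads, buildings, p_roads, p_build):
    """Price-weighted sum of physical outputs, as in (3.12)."""
    return [p_roads * r + p_build * b for r, b in zip(roads, buildings)]


# Synthetic regions (NOT Goskomstat data): hypothetical units and prices.
rng = random.Random(1)
roads = [rng.uniform(10, 200) for _ in range(74)]         # km of roads
buildings = [rng.uniform(100, 3000) for _ in range(74)]   # volume of buildings
value = [2.0 * r + 1.3 * b + rng.gauss(0, 20) for r, b in zip(roads, buildings)]

p_roads, p_build = price_weights(value, roads, buildings)
index = real_output_index(roads, buildings, p_roads, p_build)
print("estimated prices: roads %.2f, buildings %.2f" % (p_roads, p_build))
print("relative price roads/buildings: %.2f" % (p_roads / p_build))
```

Only the ratio of the two estimated prices matters for the index, since the production function is estimated in logs and a common scale factor is absorbed by the constant.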

Now, using the constructed weighted index of real construction output CŶi, we can estimate equation (3.10). As in the case of total production, equation (3.10) cannot be estimated directly by OLS because Li and Ki are endogenous. In the case of construction, however, it is labor that has to be instrumented for. Even though interregional labor migration in Russia is almost nonexistent, inside a region labor is sufficiently mobile between sectors and industries. Labor employed in a particular industry is capable of short-run adjustment and surely endogenous, especially in the construction industry, where people are often hired for specific projects on an on-demand basis.

As an instrument for construction labor I take the regional data on total labor force in 1990. While the total number of employed is highly correlated with employment in construction (even though the series are separated by 9 years), it is likely to be orthogonal to present-day region-specific productivity shocks, because the spatial allocation of the labor force in 1990 is a legacy of Soviet planning, not a response to market conditions. The results of the first-stage IV regression are in Table 3.7.

Table 3.8 presents the results of the second-stage IV estimates, along with an OLS regression for comparison. Magadan and Yakutia regions, both bitterly cold (but not the only bitterly cold regions), proved to be outliers. When these two regions are dropped from the sample, the estimated temperature effect on output rises from 1.6% per °C to 2.3% per °C; that is, the implied productivity loss from one degree of cooling grows from 1.6% to 2.3%. In both cases, the temperature coefficient is strictly significant statistically.

To check whether the estimated temperature coefficient is robust to the weighting used to construct the real output index, I also estimate IV regressions with the dependent variable being only roads constructed and only buildings constructed. The results are in Table 3.9.

As the results show, the temperature coefficient remains virtually unchanged if we measure construction output taking into account only buildings or only roads. Thus, the dependence on temperature is not induced by the particular weighting method I used.

Dependent variable is Log of real construction output weighted index

Independent variables           OLS               IV                OLS               IV
Constant                        4.765 (0.345)     5.285 (0.291)     4.579 (0.384)     5.172 (0.319)
Log of Capital                  -0.055 (0.076)    0.166 (0.063)     -0.053 (0.078)    0.213 (0.074)
Log of Labor                    0.953 (0.097)     0.770 (0.072)     0.981 (0.109)     0.767 (0.086)
Temperature                     0.022 (0.010)     0.023 (0.009)     0.014 (0.011)     0.016 (0.011)
Magadan and Yakutia included    no                no                yes               yes
Number of observations          76                70                78                72
R2                              0.88              0.82              0.84              0.77

Heteroskedasticity-consistent standard errors are in parentheses.

Table 3.8: Construction industry production function IV estimation. Second stage.

Dependent variable:       Log of building      Log of real output     Log of roads
Independent variables     space, m3            weighted index         length, km
Constant                  4.859 (0.313)        5.285 (0.291)          2.349 (0.513)
Log of Capital            0.154 (0.070)        0.166 (0.063)          0.179 (0.172)
Log of Labor              0.795 (0.078)        0.770 (0.072)          0.563 (0.176)
Temperature               0.022 (0.010)        0.023 (0.009)          0.024 (0.020)
Number of obs.            70                   70                     66
R2                        0.81                 0.82                   0.31

Heteroskedasticity-consistent standard errors are in parentheses.

Table 3.9: Production function estimates. Test of the robustness to the weighting.

To summarize, the findings do support the hypothesis that cold harms construction productivity.36 The estimated productivity loss is about 2.3% per degree Centigrade. An average Siberian city requires 23-25% more labor and capital resources than Moscow to complete a comparable task. The inherited spatial inefficiency, the extra -1.5 °C of TPC according to chapter 2, costs the Russian construction industry 3.4% of total output country-wide in productivity loss alone. Value added in the construction industry accounts for about 6.4% of Russian GDP, hence lost productivity in the construction industry translates into a GDP loss of about 0.22%, due simply to lower efficiency of work. Recall that productivity loss is only a part of total costs, as the costs of extra materials and of premature depreciation of buildings and communications are not accounted for.
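The chain of numbers in this summary can be retraced in a few lines. The sketch below assumes a linear approximation (elasticity times temperature gap) and uses only the figures quoted above; the function name is illustrative.

```python
# Figures quoted in the text; a linear approximation is assumed.
LOSS_PCT_PER_DEG = 2.3             # productivity loss per deg C of cooling
EXTRA_COLD_DEG = 1.5               # counterfactual-actual TPC gap (chapter 2)
CONSTRUCTION_SHARE_OF_GDP = 6.4    # value added in construction, % of GDP


def construction_loss():
    """Return (loss as % of construction output, loss as % of GDP)."""
    industry_loss = LOSS_PCT_PER_DEG * EXTRA_COLD_DEG
    gdp_loss = industry_loss * CONSTRUCTION_SHARE_OF_GDP / 100
    return industry_loss, gdp_loss


industry_loss, gdp_loss = construction_loss()
print("construction output loss: %.2f%%" % industry_loss)
print("GDP loss: %.2f%%" % gdp_loss)
```

The product of the elasticity and the temperature gap is 3.45%, which the text reports as 3.4%; the GDP figure rounds to 0.22% either way.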

3.4 Health

I now turn to the impact of temperature on health. First, I estimate the elasticity of health indicators to temperature (I look at regional mortality and morbidity). Then, I use the counterfactual-actual TPC difference of 1.5 °C to predict the aggregate health indicators in the counterfactual allocation.

3.4.1 Mortality and morbidity

Mortality and morbidity data are probabilistic in nature: aggregate per capita measures are confined between zero and one. A non-linear probability model has to be employed. I use a logit setup. Assume the probability of an event (either illness or death) for a person i in region j takes the form:

$$P_{ij} = \mathrm{Prob}\{x_j\beta + \delta_j + \epsilon_i > 0\} \tag{3.13}$$

36Rosen (1979), chapter 6 provides evidence on the effect of the Winter House Building Initiative in Canada - a program subsidizing construction of housing during the December-March period. A moderate subsidy provided by WHBI was enough to significantly shift the seasonal pattern of construction toward the winter months, which seems to suggest that the added costs of building in winter are relatively small, contrary to my own findings. However, three factors must be considered when comparing the Canadian WHBI experience with the Russian evidence. First, as Rosen points out, the main effect of the subsidy was to effectively decrease the downpayment requirement for winter-built housing, and hence to significantly increase demand. The increased demand might have brought an increase in producer surplus above the nominal value of the subsidy. Second, winter weather in the most densely populated areas of Canada - southern Ontario and Quebec - is far milder than in Russian Siberia and the Far East. Perhaps the loss of productivity due to cold is non-linear, accelerating as temperatures drop. Thus, my estimated elasticity is relevant in the Russian temperature range, but not in the Canadian one. Finally, even if the effect of temperature itself on productivity were indeed small, my results would imply that some other factors correlated with temperature affect construction productivity. However, this does not change the main conclusion of the paper: Soviet spatial misallocations result in productivity loss, be it due to cold or some unobserved regional heterogeneity.

where xj are observable characteristics of region j, δj is an unobserved region-specific shock, εi is a person-specific shock, and β is a parameter vector. Assume there is an infinite number of people in each region and that the εi are Weibull-distributed. Integrating over individuals, we get the expression for the aggregate event probability in region j:

$$P_j = \frac{e^{x_j\beta + \delta_j}}{1 + e^{x_j\beta + \delta_j}} \tag{3.14}$$

Substituting the observed mortality or morbidity rates Sj for the theoretical probability Pj and applying the logit transformation, we get:

$$\ln\frac{S_j}{1 - S_j} = x_j\beta + \delta_j \tag{3.15}$$

Assuming the region-specific shocks are independent and normally distributed, equation (3.15) can be estimated by OLS.

The set of control variables xj includes temperature and various social and economic characteristics, such as average income and various proxies for the level of development of regional infrastructure and health services. I use regional income levels, number of doctors per capita and (for adult mortality) the amount of alcohol products sold per capita. Some, if not all, of these variables may be endogenous to health indicators. For example, the number of doctors in a region may have a direct positive effect on health, and/or at the same time reflect reverse causality: a region with poor health indicators may choose to attract more doctors (with or without federal help). High levels of alcohol consumption may be associated with (or even caused by) low overall quality of life, health notwithstanding. However, since my interest is in the temperature coefficient, I will not address the possible endogeneity of the other control variables. My purpose in using these controls is only to make sure that the temperature coefficient is robust to specification.

The results for aggregate (not age-standardized) morbidity and mortality are given in Tables 3.10 and 3.11. Note that mortality seems to increase with temperature, contrary to common sense. The temperature elasticity of morbidity has the expected negative sign. To explain the results, note that these mortality and morbidity data are not age-standardized. The real reason why temperature may seem either harmful or irrelevant lies in Russian migration patterns. Higher wages in Siberia and the Far East have traditionally attracted young, healthy people. Young migrants “bring good health” into Siberia. In contrast, retirees have always tried to relocate back to the European part of the country, especially to the warm southern regions of the North Caucasus. Poor health of long-term Siberian residents was dissipated over the whole country. Analogous to the “Florida effect”, such self-selection of people over climate zones biases the temperature coefficient up. Young people are less likely to get sick than the elderly, and surely they are much less likely to die. Because of the life-cycle migration patterns the elasticities for both aggregate mortality and morbidity are likely to be biased

Dependent variable (S) is average morbidity rate 1992-2000

Model                                  Logistic (ln(S/(1−S)))                            Linear (ln(S))
Temperature (°C)                       -0.025 (0.01)   -0.017 (0.08)   -0.017 (0.08)     -0.009 (0.02)   -0.006 (0.03)   -0.006 (0.12)
Doctors per capita, average                                            0.000 (0.92)                                      0.001 (0.77)
Alcohol sales per capita, ’91 (l)                      0.093 (0.04)    0.093 (0.04)                      0.040 (0.02)    0.040 (0.02)
Income, ’90 (rub)                                      0.002 (0.01)    0.003 (0.02)                      0.001 (0.01)    0.001 (0.03)
Elasticity to t°, % per °C             -0.8            -0.6            -0.6              -0.9            -0.6            -0.6
Number of obs                          79              73              73                79              73              73
R2                                     0.11            0.26            0.27              0.11            0.28            0.28

Heteroskedasticity-consistent P-values are in parentheses.

Table 3.10: Aggregate morbidity rate as a function of temperature. Robustness to specification.
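The logit transformation in (3.15) and the conversion of a logit coefficient into a percentage elasticity can be sketched as below. The source does not state how the elasticity row of the table was computed; the formula used here, d ln S / dx = β(1 − S), is the standard derivative of the logistic form (3.14) and is offered as one plausible reading. The data are synthetic, the rate of 0.68 is an assumed value chosen for illustration, and the function names are hypothetical.

```python
import math
import random


def logit(s):
    """Logit transformation of a rate S, the left-hand side of (3.15)."""
    return math.log(s / (1.0 - s))


def ols_slope(x, y):
    """Slope and intercept of a simple OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def semi_elasticity_pct(beta, s):
    """Percent change in S per deg C implied by a logit coefficient beta.

    Differentiating ln S with S = e^z / (1 + e^z), z = x*beta + delta,
    gives d ln S / dx = beta * (1 - S).
    """
    return 100.0 * beta * (1.0 - s)


# Synthetic regions (NOT thesis data): rates generated from the logistic form.
rng = random.Random(2)
temps = [rng.gauss(-8.0, 6.0) for _ in range(79)]
rates = [1.0 / (1.0 + math.exp(-(0.5 - 0.025 * t + rng.gauss(0.0, 0.1))))
         for t in temps]

beta, intercept = ols_slope(temps, [logit(s) for s in rates])
mean_rate = sum(rates) / len(rates)
print("estimated logit coefficient on temperature: %.4f" % beta)
print("implied elasticity at the mean rate: %.2f%% per deg C"
      % semi_elasticity_pct(beta, mean_rate))
```

For a rare event such as death, 1 − S is close to one and the elasticity is roughly 100 times the coefficient; for a frequent event such as illness the elasticity is noticeably smaller in absolute value than the coefficient suggests.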

Dependent variable (S) is average mortality rate 1992-2000

Model                                  Logistic (ln(S/(1−S)))                            Linear (ln(S))
Temperature (°C)                       0.009 (0.00)    0.016 (0.01)    0.016 (0.00)      0.010 (0.00)    0.004 (0.00)    0.004 (0.19)
Doctors per capita, average                                            -0.001 (0.80)                                     -0.001 (0.81)
Alcohol sales per capita, ’91 (l)                      0.070 (0.00)    0.070 (0.00)                      0.069 (0.00)    0.069 (0.00)
Income, ’90 (rub)                                      -0.002 (0.00)   -0.002 (0.00)                     -0.002 (0.00)   -0.002 (0.00)
Elasticity to t°, % per °C             +0.9            +0.4            +0.4              +1.0            +0.4            +0.4
Number of obs                          78              72              72                78              72              72
R2                                     0.11            0.13            0.23              0.10            0.13            0.45

Heteroskedasticity-consistent P-values are in parentheses.

Table 3.11: Aggregate mortality rate as a function of temperature. Robustness to specification.

Dep. var. is standardized    Independent variables
mortality rate               Temp. (% per °C)   Doctors per capita   Alcohol sales p/capita, ’91   Income ’95 (th rub)   Number of obs   R2
’97-’98 average total        -0.57 (0.00)       0.000 (0.89)         0.027 (0.00)                  0.002 (0.97)          77              0.40
’97-’98 average male         -0.54 (0.00)       -0.001 (0.73)        0.038 (0.00)                  -0.072 (0.24)         77              0.39
’97-’98 average female       -0.63 (0.00)       0.000 (0.83)         0.019 (0.03)                  0.044 (0.53)          77              0.41

Dependent variable is logit-transformed. Heteroskedasticity-consistent P-values are in parentheses.

Table 3.12: Standardized mortality rate as a function of temperature.

upward. Looking at the age-standardized levels of mortality and morbidity instead of the simple aggregate levels may partially cure the problem.

Data on standardized mortality, based on mortality and age data collected by Goskomstat, are published in the bimonthly journal of the Russian Health Ministry. Unfortunately, to the best of my knowledge, no standardized morbidity rates by region are published by Goskomstat or other official sources. Therefore, in further analysis I concentrate on mortality only.

Indeed, if instead of the simple aggregate mortality we look at age-standardized values, the temperature coefficient becomes negative and statistically significant. Standardization by age removes the inter-generational selection bias, but the bias due to self-selection inside the same age group is still present. Among young people, the healthier ones are more likely to migrate to colder regions. The less healthy people of any age are also more likely to leave the colder regions for more comfortable places. Note that if selective migration is more pronounced among men, women would have a higher estimated elasticity to temperature, which is not inconsistent with the findings.37 The temperature elasticity for female mortality is consistently higher over the years. This pattern is highly suggestive of the hypothesis that, at least in Russia, mostly men choose the place to work, and women follow.

Since self-selection bias is present in the estimates, it is fair to treat the estimated temperature elasticity as an upper bound (a lower bound in absolute value) on the true elasticity. However, given the low overall levels of labor migration in Russia, the selection for most regions is likely small, possibly negligible (Tuymen’ might be the only exception).38

37The elasticity estimates are not statistically different for men and women, each having a standard error of 0.17% per °C. 38One might argue that the elevated levels of mortality in colder regions are not the consequence of cold per se, but instead owe to other factors, such as remoteness, lack of infrastructure, poorer qualifications of medical personnel and an overall lower standard of living, since all these factors are highly correlated with (or sometimes endogenous to) temperature in the Russian case. The argument

Looking at infant mortality can shed some light on the size of the remaining selection bias. Infants are the only age group for which climate selection does not yet come into play (parents can decide to move to a more comfortable environment, or decide against moving to a cold place if their child’s poor health could be affected by climate, but moving takes time).39 The estimated elasticity of infant mortality to temperature is indeed slightly higher in absolute value than the elasticity of the (even standardized) mortality rate of adults, and this is true consistently over time.40 On the one hand, a higher elasticity might be caused by physiological differences: maybe cold climates are more harmful for infants than for adults.41 On the other hand, if cold is equally bad for children and adults, then the elasticity of mortality for infants is representative of all age groups and free of self-selection bias. In either case, to the hypothesis that “men choose, women follow” we may now add “and children pay the price”.

The selection bias in adult data can be neither proved nor fully corrected for using only aggregate regional information. More research is needed on the influence of climate on health, if and when a sufficiently large and regionally representative sample of individual-level data becomes available.

It is also important to conduct separate estimations for rural and urban infant mortality, using the data available for recent years. The purpose is to investigate whether temperature proves to be a more significant factor for either urban or rural infant mortality. The results are rather uninformative. Splitting the data turns both coefficients insignificant. For rural data, income suddenly appears to be the main explanatory factor, and the urban regression is generally poor. Vassin & Costello (1997) found that Russian mortality patterns, just like the overall state of the economy, society and health, differ much more between urban and rural areas than between regions. There exists unobserved heterogeneity between urban and rural Russia. Perhaps better control variables are needed: temperature and health infrastructure data aggregated separately over urban and rural areas might be more informative. Even the geographic centers of the urban and rural population of the same region might be different, and hence have different average temperatures (TPC). More careful treatment of urban-rural differences is needed in order to satisfactorily investigate the role of temperature.

Dep. variable is        Temperature     Doctors         Income          Number   R2
infant mortality rate   (°C)            per capita      (th rub)        of obs
1985 total              -0.9 (0.01)     0.001 (0.16)    -0.52 (0.22)    72       0.34
1990 total              -0.8 (0.00)     0.000 (0.82)    -0.005 (0.97)   72       0.44
1993 total              -0.8 (0.00)     -0.001 (0.53)   -0.66 (0.11)    72       0.30
1995 total              -0.8 (0.02)     -0.005 (0.06)   0.037 (0.73)    79       0.20
1997 total              -0.8 (0.01)     -0.001 (0.60)   -0.098 (0.34)   79       0.23
1998 total              -0.9 (0.03)     -0.003 (0.42)   -0.115 (0.40)   79       0.15
1995 urban              -0.7 (0.10)     -0.007 (0.11)   0.143 (0.64)    79       0.14
1997 urban              -0.4 (0.28)     -0.002 (0.48)   -0.011 (0.93)   79       0.11
1998 urban              -0.7 (0.20)     -0.003 (0.57)   -0.178 (0.24)   79       0.11
1995 rural              -0.3 (0.48)     -0.004 (0.19)   0.341 (0.10)    77       0.14
1997 rural              -0.3 (0.57)     0.002 (0.47)    0.398 (0.02)    77       0.24
1998 rural              -0.7 (0.13)     0.001 (0.82)    0.228 (0.26)    77       0.14
’95-’98 rural average   -0.4 (0.34)     0.001 (0.73)    0.347 (0.02)    77       0.22
’95-’98 urban average   -0.6 (0.11)     -0.005 (0.24)   -0.023 (0.81)   79       0.15
’85-’98 total average   -0.7 (0.00)     -0.000 (0.86)   -0.33 (0.17)    72       0.43

Dependent variable is logit-transformed. Heteroskedasticity-consistent P-values are in parentheses.

Table 3.13: Infant mortality rate as a function of temperature.

has a lot of merit, it does not however change the main conclusion: whether because of cold or because of distance or other factors, overpopulation of the East is costly to Russian economy.

39Infants’ health could be subject to some degree of selection bias: if parents are healthier on average, so are the children; but hopefully this effect is relatively negligible.

40Though not statistically different in any given year.

41As an obvious example, respiratory diseases, dangerous for infants and the elderly, spread faster in cold climates, simply because people spend more time together indoors.

3.4.2 What is the cost of the excess mortality?

The next logical question is: what are the economic consequences of the excess mortality brought about by the Soviet misallocations? We can consider two estimates. One is the estimate of the direct cost of mortality: the reduction in economic growth and the loss of output and utility because of shorter life expectancy. If we obtain an estimate of the economic value of life in Russia, we can calculate the aggregate economic impact of decreased life expectancy based on the actual-counterfactual mortality differential. The other estimate is the amount of money it would take to mitigate the cost of extra cold: to invest in and develop medical infrastructure enough to bring mortality down to the counterfactual levels. In a competitive decentralized economy (or with a benevolent social planner), we can assume that if a reduction in mortality were economically efficient, it would have happened. This is not necessarily so in the Soviet system. Therefore, the opportunity cost of excess mortality in Russia is really the lower of the two estimates mentioned above.

The studies of growth in a cross-country setting so far show that higher mortality is weakly associated with slower growth. Barro & Sala-i-Martin (1995) present supporting evidence, but only for countries with per capita GDP somewhat below $1000. For richer countries (Russia included) no relationship between mortality and growth was observed. However, it is still unclear where the real causality lies. Is it lower mortality that causes higher growth? Or does economic success lead to better health? The issue is far from conclusively resolved, and any extrapolation of the international results to the Russian case should be taken with a grain of salt.42 The consequences of higher mortality for Russia’s economic performance are an important direction for future research. But even in the most modest scenario, if the pure economic costs of excess mortality in Russia are close to zero, the social cost of decreased life expectancy is hardly a factor to ignore.

3.5 Conclusions

The purpose of this paper was to determine how costly the cold is for Russia. My answer is: very. Lower temperatures lead to higher energy consumption and higher mortality. More research is needed to investigate the effect on productivity in various sectors of the economy. If the construction industry is at all representative, it is reasonable to expect a negative effect on productivity in many if not all sectors.

The legacy of the Soviet system leaves Russia 1.5°C colder in per capita terms. In the most modest of estimates, this difference in TPC costs no less than 1% of GDP annually in extra energy costs, and 0.2% of GDP annually in lost productivity in the construction sector alone. If all manufacturing industries had the same temperature elasticity of TFP as construction, the loss of another 1.3% of GDP yearly could be attributed to cold. An additional 0.85% of aggregate mortality is also a direct consequence of Soviet spatial policy. These are annual costs, but compounded over the last 30 years of the Soviet era, a time when the spatial evolution of the Russian economy took its sharpest detour from the optimal trajectory, they lead to a GDP loss in excess of 35% (or 97% in the worst case scenario). Every person in Russia gave up at least one fourth (or maybe a half) of his income for Siberian development!

Two directions for future research are evident. First, the analysis could be further improved and expanded, especially with respect to productivity. The elasticity of TFP to temperature in different sectors or industries could be further investigated with appropriate data on regional output, capital and labor disaggregated by industry. The analysis of energy consumption could benefit from better and more complete data. In particular, due to data constraints residential electricity consumption was assumed to have zero temperature elasticity. In reality, in Russia (where air conditioning is much less common than electrical space heating) it is probably negative. If we had the proper data, it is likely that the estimates would add to the total assessment of the costs of cold.

A second direction for future research deals with designing the appropriate policy. The cold is costly, and the extra cold due to the Soviet misallocations is an extra unnecessary burden, but what could be done about it? Given the spatial structure of the Russian economy formed by the Soviet system, more research is needed on the optimal policies that would help to restore or improve spatial efficiency. Is it better to adapt to cold or to move people to warmer places?

42Intuition suggests, however, that higher mortality must indeed have an adverse effect on GDP, ceteris paribus. Higher mortality must be associated with higher morbidity. A higher morbidity rate necessarily leads to lower labor productivity and reduces the labor force.
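The step from annual costs to a cumulative 30-year loss in the conclusions can be made concrete with a small Python sketch. The text does not spell out the compounding formula or which cost components enter each scenario, so the geometric compounding rule and the groupings below are assumptions, and the output need not reproduce the 35% and 97% figures exactly. The function name `cumulative_loss` is hypothetical.

```python
def cumulative_loss(annual_share, years):
    """Cumulative loss when an annual loss share compounds geometrically."""
    return (1.0 + annual_share) ** years - 1.0


# Annual shares of GDP quoted in the conclusions.
energy = 0.01
construction = 0.002
other_manufacturing = 0.013

# Assumed groupings, for illustration only.
scenarios = [
    ("energy only", energy),
    ("energy + construction", energy + construction),
    ("all three components", energy + construction + other_manufacturing),
]
for label, share in scenarios:
    print(f"{label}: {cumulative_loss(share, 30):.1%}")
```

Under this assumed rule, an annual cost of 1% of GDP alone compounds to roughly 35% over 30 years, which is of the same order as the lower figure quoted above.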

Appendix A

Details of dataset construction

A.1 Dependent variables: population and manufacturing employment

Russia

The population and manufacturing employment data for Russia (POPULt and INDt) for the pre-revolutionary period were taken from the “Yearbook of Russia, 1910.” The data on population by administrative units are quite accurate and extensive. For manufacturing employment I used two alternative sources. First, I used the reports of the 1908 census of industry, which give the number of people employed in manufacturing enterprises (with 5 or more employees). Second, I used the report on people’s occupations by province.1 The number of people reported to be in manufacturing occupations (group VII occupations: manufacturing, mining and crafts) is normally higher than the number of employed counted by the census of industry, because small and family establishments (fewer than 5 people) were not covered by the industrial census. The choice of data series did not prove important, as the final results differed only by fractions of 1%.

There were 98 administrative units – “gubernias” and “oblasts” (provinces) – in the Russian Empire in 1910. Only the territories that later belonged to the Soviet Union are included in the sample. Eighteen provinces were located on the present-day territories of and Finland. Data for Kamchatka oblast are included in Primorskaya province, and data for Zakatalskii in Tiflisskaya province. Karsskaya province now belongs to Turkey. The resulting sample size is 79. Figure 2.3 shows a map of the administrative divisions of the Russian Empire together with the borders of the USSR and the present-day borders of the Russian Federation.

The data for the pre-transition Soviet Union were taken from the results of the population census of 1989 and the census of industry of 1989. Both employment and population data are available at a low level of geographical aggregation. The data were reclassified according to the boundaries of the pre-revolutionary administrative units.

1Also published in “The Yearbook of Russia, 1910.”

Canada

The dependent variable series POPULt and INDt (population and manufacturing employment) for Canada were obtained from the Census of Canada publications for the corresponding years. The census publications provide data at a low level of geographical aggregation – by census district. For the estimations, the census districts have to be combined into bigger geographical units. The purpose is twofold: to maintain a panel dataset, and to make sure the Canadian geographical units are roughly equivalent to Russian provinces.

Historically, as Canadian territory was gradually developed and populated, the number of census districts increased over time. Single-district territories that were large and sparsely populated at the beginning of the century later experienced an increase in population and were divided into several districts for the subsequent censuses. The year 1911 remains largely the limiting case, as the districts in subsequent years are generally more compact spatially. Wherever the bigger 1911 districts were divided up later on, I had to combine districts for later years in order to maintain the same geographical breakdown for all time periods in the sample. Most of the sparsely populated territories – northern parts of the Prairie Provinces, Yukon, the North-West Territories, of British Columbia – were single-district territories in 1911.

The administrative units in the densely populated South were also occasionally revised. On several occasions, new district boundaries overlapped significantly with the old boundaries.2 In these cases, I had to merge the overlapping districts. The exception to this rule is Northern Quebec, where the districts were revised several times throughout the century. The boundary changes were significant there, but the overlaps covered territory with extremely low population density. In these cases, I assigned a new district to the geographical unit where the largest city of the district was previously located. Since the ambiguity usually existed only with respect to a small fraction of the district population – distant villages – I do not believe that significant error was introduced in the process.

Next, I further merged the geographical units in order to construct a set of regions that would closely resemble Russian administrative units (provinces) in size and spatial pattern. Provinces in Russia are mostly formed around an urban center and sometimes, especially in Eastern Siberia and the Far East, have borders that follow the topography of the terrain. Provinces in the European part of the Russian Empire were smaller in area and had a higher population density. In Central Asia, Siberia, and the Far East, provinces were large in area and sparsely populated. Where possible, I tried to apply similar principles to Canadian districts. Generally, small districts (counties) in southern Ontario, southern Quebec, Nova Scotia and New Brunswick are combined such that, where possible, each division covers a large or mid-size city and the counties around it. Districts of the Prairie Provinces are often rectangular in shape and can be divided naturally into southern, middle and northern parts, with a large or mid-size city in each of them.3 As a result, 38 regions were constructed from 227 census districts in 1911 Canada.

2For example, Vancouver Island was a single census district in 1961, but in 1911 the northern part of the island was included in the same district with all of British Columbia’s Pacific coast. I had to combine these districts into a single observation.
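The district-merging step described above amounts to applying a year-specific crosswalk from census districts to a fixed set of regions and summing within regions. A minimal Python sketch follows; the district names, region names and counts are invented for illustration, and `aggregate_to_regions` is a hypothetical helper, not the procedure actually coded for the thesis.

```python
from collections import defaultdict


def aggregate_to_regions(district_values, crosswalk):
    """Sum district-level counts into fixed regions.

    district_values: {district_name: count} for one census year.
    crosswalk:       {district_name: region_name} for the same year.
    """
    totals = defaultdict(float)
    for district, value in district_values.items():
        totals[crosswalk[district]] += value
    return dict(totals)


# Hypothetical example: District A is a single unit in 1911 and is
# split into two districts by 1961; both halves map back to Region 1.
pop_1911 = {"District A": 10.0, "District B": 25.0}
map_1911 = {"District A": "Region 1", "District B": "Region 2"}

pop_1961 = {"District A-north": 8.0, "District A-south": 12.0, "District B": 40.0}
map_1961 = {
    "District A-north": "Region 1",
    "District A-south": "Region 1",
    "District B": "Region 2",
}

panel = {
    1911: aggregate_to_regions(pop_1911, map_1911),
    1961: aggregate_to_regions(pop_1961, map_1961),
}
print(panel)
```

Keeping one crosswalk per census year makes the judgment calls (for example, assigning a redrawn district to the region that holds its largest city) explicit and easy to revise.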

A.2 Regional characteristics

The regional characteristics variables that were used in the estimations (for Canadian regions) and in the projections (for Russian regions) are summarized in Table A.1.4

Variables such as area, temperature, agricultural land quality (FARMING), access to waterways (PORT), and distance to the largest city (DISTCAP) represent inherent characteristics of a region: size, climate, quality of soils, accessibility, and remoteness. (If the region is not landlocked, the largest city is usually a port, so the PORT variable can be treated as inherent to the geographical location.)

Variables such as the number of railroads (RR) and the presence of a trade route (R ABROAD) characterize not only the location (accessibility) of a region, but also the level of infrastructure development. Since the structure of the railroad network is highly endogenous to the population distribution, it is imperative to use the lagged number of railroads (RRt−1) in the regressions. The trade route variable (R ABROAD) is treated as exogenous, since all the routes connecting Canada and the USA are traditional transport routes and were formed prior to the beginning of the century.

Natural resource endowments are characterized by four variables: presence of coal mining operations (COAL), presence of oil extracting operations (OIL), presence of any metal mining operations (METALS), and presence of timber resources (TIMBER). All of these are dummy variables indicating only the presence of active resource-extracting operations in a region. To track changes in mining operations, the dummy variables vary with time; to avoid possible endogeneity, their lagged values are used.

The choice of dummies is a compromise. The factors truly relevant for population or industry growth in a region are the potential economic profit that could be obtained from natural resources and/or land, and the value of the positive externalities provided by the existing infrastructure or a favorable location. Characterizing these factors by dummy variables is an extreme simplification of reality. Unfortunately, it is a necessary choice. I had to choose regional characteristics bearing in mind that data for both Canada and Russia had to be collected. Using more informative measures of a region’s natural resource potential (the amount of extractable resources, for example), land value, or infrastructure was possible for Canada. Unfortunately, comparable information on Russian regions may not exist in the public domain, often due to security issues,5 or may not exist at all. Only simple dummy variables can be constructed using open sources – maps and statistical publications.

Variable    Description
AREA        Area of the division, sq. km. For divisions containing more than one census district, the areas of these districts were added up. Source: Census of Canada, 1991.
TEMP        Average January temperature, °C. For large regions where the average temperature varies across the region, the temperature in the largest city was taken.
DISTCAP     Direct (straight line) distance from the largest city in a region to Toronto or Moscow.
RRt         Number of railroad branches leading from the largest city in year t. For any given region this characteristic can increase over time as new railroads are built.
COALt       Dummy=1 if a significant amount of coal was mined in a region in year t and the region is a net exporter of coal. Mining sufficient for local needs only is ignored. This characteristic can change over time, as new coal mines are opened or old mines are closed.
METALSt     Dummy=1 if a significant amount of any metal ores was mined in a region in year t. The same definition of “significant operation” as with coal is applied. Can vary with time.
OILt        Dummy=1 if a significant amount of oil was extracted in a region in year t. The same definition of “significant operation” as with coal is applied. Can vary with time.
TIMBERt     Dummy=1 if at least 1/3 of the territory is covered with forest and a significant amount of logging is taking place. The same definition of “significant operation” as with coal is applied. Can vary with time.
PORT        Dummy=1 if the largest city in a region is a port. Included are Canadian ports on the Atlantic and Pacific oceans, on the Great Lakes and on the St. Lawrence River, and Russian ports on all seas and Caspian lake.
R ABROAD    Dummy=1 if there is a direct (not through some other region) transportation route abroad from the largest city in a region. Transportation routes are railroads, conventional roads, or waterways.
FARMING     Dummy=1 if at least 1/3 of the land is classified as “having no major obstacles for agriculture.” For Canada, corresponds to land types A and B in the agricultural lands classification.

Table A.1: Regional characteristics

3Of course, there exist many possible sets of spatial divisions that can be constructed on these principles: rural districts can be attached to either of the nearby cities. However, the estimation results did not appear to be sensitive to the particular choices.

4The distance to largest city (DISTCAP) variable, the number of railroads (RR) variable, and all dummy variables were constructed using various maps dated from 1921 to present.

5For example, the estimated amount of extractable natural resources in a region was a USSR state secret.
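Since the railroad count and the resource dummies enter the regressions with a one-period lag, the regressors have to be shifted by one census year before estimation. The Python sketch below shows one way to do this for a region-by-year panel; the region names and dummy values are invented, and `lag_by_one_period` is a hypothetical helper rather than anything taken from the thesis.

```python
def lag_by_one_period(series_by_region, years):
    """Return {region: {year: value observed in the previous census year}}.

    The first census year has no predecessor and is therefore omitted.
    """
    lagged = {}
    for region, series in series_by_region.items():
        lagged[region] = {
            year: series[prev] for prev, year in zip(years, years[1:])
        }
    return lagged


# Hypothetical COAL dummy for two regions over three census years.
years = [1911, 1921, 1931]
coal = {
    "Region 1": {1911: 0, 1921: 1, 1931: 1},
    "Region 2": {1911: 0, 1921: 0, 1931: 0},
}

coal_lagged = lag_by_one_period(coal, years)
print(coal_lagged)
```

In this layout the value attached to 1921 is the 1911 observation, so an equation for 1921 only uses information that predates the outcome it explains.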

Appendix B

Algorithm for choosing the optimal model

The algorithm for choosing the best model works as follows:

Step 1 Choose the “core” set of variables, i.e. the ones that are definitely going to be included in the model. In our case, the past values of population and industry, area, and the constant term have to be included. This narrows down the number of variables “in question” from 224 to 168.

Step 2 Start with the model with all variables included. Drop one of the 168 “questionable” variables from the regressions and estimate the restricted model. Check whether the overall fit of the dynamic forecast (according to a chosen criterion among (1)-(4), equations (2.22) - (2.30)) improves. If the fit is better without the variable, drop it. If the fit is worse, keep it. Repeat the procedure for all 168 variables consecutively.

Step 3 Take the model that resulted from step 2. Now try to add explanatory variables back and check whether the inclusion of any of them improves the overall fit. If it does, put the variable back into the regressions.

Step 4 Repeat steps 2 and 3 until no more inclusions or exclusions can be made that would improve the fit of the dynamic forecast. A local minimum for the chosen criterion is found.

Of course, there is no guarantee that the procedure finds the global minimum of SSE. If there exist several local minima that correspond to different non-nested models, it is possible that the algorithm finds one of those models, not necessarily the best one. The procedure is sensitive to the starting point (model) and to the order in which the variables are examined. To ensure that the model chosen is indeed the best local minimum, at least among those easy to find, the next step can be conducted.

Step 5 Repeat steps 2 to 4, examining the variables “in question” in a different order. If the procedure finds a different local minimum (a different set of variables), compare it with the one found previously and pick the better one. Repeat several times.

Repeat steps 2 to 4, but with a different starting point. For example, start with the model that includes the “core” variables only. As there are no more variables to drop, go directly to step 3, then proceed as the algorithm requires. If a different model results, compare it with the one found previously and pick the better one. Repeat with various starting points.1

1In my experience, Step 5 did not uncover a better alternative, i.e. in this case the minimum found in Step 4 is likely global.
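The selection procedure in Steps 1 to 5 can be sketched in Python as follows. This is an illustration of the search logic only: the fit criterion is passed in as an arbitrary function to be minimized, whereas in the thesis it is the fit of the dynamic forecast from the estimated system. The names `stepwise_select` and `toy_criterion`, and the toy scoring rule, are hypothetical.

```python
def stepwise_select(core, candidates, criterion, start=None):
    """Alternate drop and add passes until the criterion stops improving.

    core:       variables that always stay in the model (Step 1).
    candidates: the variables "in question", examined in the given order.
    criterion:  callable mapping a list of variables to a number to minimize.
    start:      initial subset of candidates; defaults to all of them.
    """
    selected = list(candidates) if start is None else list(start)
    best = criterion(core + selected)
    improved = True
    while improved:  # Step 4: repeat until nothing changes
        improved = False
        for var in candidates:  # Step 2: try dropping each variable
            if var in selected:
                trial = [v for v in selected if v != var]
                score = criterion(core + trial)
                if score < best:
                    selected, best, improved = trial, score, True
        for var in candidates:  # Step 3: try adding variables back
            if var not in selected:
                trial = selected + [var]
                score = criterion(core + trial)
                if score < best:
                    selected, best, improved = trial, score, True
    return selected, best


# Toy criterion (hypothetical): TEMP and PORT each lower the forecast
# error by 1; every other non-core variable raises it by 0.1.
def toy_criterion(variables):
    useful = {"TEMP", "PORT"}
    extra = [v for v in variables if v not in useful and v != "AREA"]
    return 10.0 - len(useful & set(variables)) + 0.1 * len(extra)


core = ["AREA"]
candidates = ["TEMP", "RR", "PORT", "OIL"]

# Start from the full model, then from the core-only model (Step 5).
from_full, score_full = stepwise_select(core, candidates, toy_criterion)
from_core, score_core = stepwise_select(core, candidates, toy_criterion, start=[])
print(from_full, score_full)
print(from_core, score_core)
```

Reordering `candidates` or changing `start` corresponds to the restarts in Step 5; comparing the returned scores across restarts gives the best local minimum among those found.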

Appendix C

Dependent               POP  POP  POP  POP  POP  POP  POP  POP
Variable (ln)           1921  1931  1941  1951  1961  1971  1981  1991
Lagged population       1.097  0.997  0.982  0.811  0.861  0.861  0.957  1.076
ln POPt−1               (0.045)  (0.035)  (0.035)  (0.041)  (0.044)  (0.052)  (0.058)  (0.045)
Lagged industry         -0.143  -0.007  0.001  0.117  0.093  0.079  0.010  -0.041
ln INDt−1               (0.033)  (0.030)  (0.022)  (0.036)  (0.034)  (0.034)  (0.043)  (0.031)
Area                    0.048  0.055  0.031  0.045  0.048  0.066  0.043  0.011
AREA                    (0.016)  (0.015)  (0.017)  (0.016)  (0.012)  (0.015)  (0.019)  (0.013)
Temperature             0.007  0.009  0.001  0.008  0.012  0.000
TEMP                    (0.007)  (0.004)  (0.004)  (0.004)  (0.004)  (0.004)
Distance to Toronto     0.000  -0.066  0.008  -0.011
DISTCAP                 (0.000)  (0.026)  (0.025)  (0.023)
Lagged railroads        0.030  -0.010  -0.005  -0.010  -0.014
RRt−1                   (0.017)  (0.013)  (0.012)  (0.011)  (0.013)
Lagged coal mining      0.043  0.002  0.029
COALt−1                 (0.046)  (0.043)  (0.043)
Lagged metal mining     -0.021  0.005  0.113  -0.020  -0.046
METALSt−1               (0.055)  (0.054)  (0.055)  (0.047)  (0.049)
Lagged oil extraction   -0.042  -0.060  0.092  -0.036  0.087
OILt−1                  (0.076)  (0.068)  (0.056)  (0.054)  (0.044)
Timber cutting          0.150  0.030  -0.005
TIMBER                  (0.049)  (0.048)  (0.038)
Access to waterways     -0.043  -0.050  -0.040  -0.035  -0.069  -0.054  -0.016
PORT                    (0.041)  (0.041)  (0.038)  (0.039)  (0.041)  (0.040)  (0.039)
Trade route             -0.016  0.010  -0.014  -0.065  -0.072  -0.029
ROUTE ABROAD            (0.045)  (0.049)  (0.041)  (0.042)  (0.043)  (0.043)
Agricultural land       -0.035  0.060
quality, FARMING        (0.051)  (0.044)
Urbanization rate       0.212  0.343  0.172  0.263  0.276  0.251  0.228
(1911), URBAN           (0.142)  (0.172)  (0.121)  (0.096)  (0.129)  (0.136)  (0.139)
Number of observations=279
R2=0.99

Standard errors are in parentheses. * indicates significance at 90% level.

Table C.1: Results of the restricted system estimation. Equations for population.

Dependent                  IND  IND  IND  IND  IND  IND
Variable (ln)              1941  1951  1961  1971  1981  1991
Lagged population          0.735*  0.070  -0.077  0.139  -0.089  0.552*
ln POPt−1                  (0.077)  (0.080)  (0.096)  (0.114)  (0.124)  (0.095)
Lagged industry            0.248*  0.938*  1.095*  0.875*  1.173*  0.509*
ln INDt−1                  (0.073)  (0.068)  (0.073)  (0.075)  (0.085)  (0.059)
Area                       0.167*  0.041  0.095*  0.041  0.105*  -0.062*
AREA                       (0.040)  (0.036)  (0.037)  (0.034)  (0.037)  (0.024)
Temperature                0.030*  0.033*  0.019*
TEMP                       (0.011)  (0.009)  (0.010)
Distance to Toronto        -0.326*  -0.006  0.231*  -0.206*
DISTCAP                    (0.067)  (0.061)  (0.051)  (0.047)
Lagged railroads           0.075*  0.020  0.065*  -0.013
RRt−1                      (0.030)  (0.026)  (0.024)  (0.026)
Lagged coal mining         0.045  0.040  0.106  -0.067
COALt−1                    (0.101)  (0.091)  (0.105)  (0.092)
Lagged metals mining       -0.315*  -0.211*  -0.402*
METALSt−1                  (0.128)  (0.107)  (0.101)
Lagged oil extracting      -0.004  0.010  -0.207  0.014
OILt−1                     (0.158)  (0.131)  (0.133)  (0.111)
Timber cutting             -0.022  0.200*  -0.183*  -0.059
TIMBER                     (0.109)  (0.085)  (0.097)  (0.098)
Access to waterways        -0.093  -0.128  -0.038*
PORT                       (0.094)  (0.083)  (0.089)
Trade route                -0.061  -0.131  -0.076  -0.200*
ROUTE ABROAD               (0.103)  (0.073)  (0.101)  (0.090)
Agricultural land quality  -0.113  -0.199  0.051  -0.275*
FARMING                    (0.099)  (0.111)  (0.100)  (0.091)
Urbanization rate, 1911    1.435*  0.091  0.561*
URBAN                      (0.353)  (0.316)  (0.269)
Number of observations=180
R2=0.98

Standard errors are in parentheses. * indicates significance at 90% level.

Table C.2: Results of restricted system estimations. Equations for industry.

Region                    Projected    Actual       Difference,  Actual to
                          population,  population,  ’000s        projected
                          ’000s        ’000s                     ratio
Alberta north             1115.9       1268.6       152.7        1.14
Alberta central           1021.2       1063.5       42.3         1.04
Alberta south             234.0        213.4        -20.6        0.91
BC coast                  965.8        679.6        -286.2       0.70
Kootenay                  314.2        176.3        -137.9       0.56
Vancouver area            1174.7       1833.0       658.3        1.56
Yale and Cariboo          455.0        593.2        138.2        1.30
Manitoba center           126.1        80.5         -45.6        0.64
Manitoba sw               109.2        47.9         -61.2        0.44
Manitoba south-center     253.7        152.0        -101.7       0.60
Manitoba south-east       786.9        747.3        -39.5        0.95
Manitoba north            61.1         64.2         3.1          1.05
New Brunswick north west  179.8        165.7        -14.1        0.92
New Brunswick south       411.3        346.7        -64.6        0.84
New Brunswick east coast  186.3        211.5        25.2         1.14
Nova Scotia west          901.5        657.7        -243.9       0.73
Nova Scotia east          594.2        242.3        -351.9       0.41
Toronto area              4564.4       5897.1       1332.7       1.29
Ontario south             1594.8       1235.9       -358.9       0.77
Ottawa area               1493.0       1219.0       -274.0       0.82
Ontario center            1177.8       910.5        -267.3       0.77
Ontario north             925.2        581.9        -343.3       0.63
Ontario north-west        601.3        240.6        -360.7       0.40
PEI                       120.7        129.8        9.1          1.08
Montreal and around       2245.6       3442.3       1196.7       1.53
Quebec south              752.5        861.5        108.9        1.14
Quebec city and around    596.0        842.8        246.8        1.41
Quebec center             415.4        377.8        -37.5        0.91
Quebec east               430.8        331.0        -99.8        0.77
Quebec north east         616.7        417.0        -199.7       0.68
Quebec west               141.5        203.4        61.9         1.44
Quebec south-west         211.0        332.3        121.3        1.57
Saskatchewan south        786.4        426.4        -360.1       0.54
Saskatchewan center       366.8        337.8        -29.0        0.92
Saskatchewan north        352.6        224.8        -127.9       0.64
Yukon                     26.9         27.8         0.9          1.03
N-W Territ                37.3         57.6         20.4         1.55
Newfoundland              861.5        568.5        -293.0       0.66

Table C.3: Projected vs actual population. Canada.
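The last two columns of Table C.3 follow from the first two by simple arithmetic: the difference is actual minus projected population, and the ratio is actual divided by projected. The short Python check below reproduces two rows of the table; the helper name `compare` is hypothetical, and the rounding to one and two decimals is an assumption about how the published figures were prepared.

```python
def compare(projected, actual):
    """Return (difference, ratio) in the layout of the table's last two columns."""
    return round(actual - projected, 1), round(actual / projected, 2)


# Projected and actual population in thousands, taken from Table C.3.
print(compare(1115.9, 1268.6))  # Alberta north
print(compare(1174.7, 1833.0))  # Vancouver area
```

A ratio above 1 marks a region that ended up more populous than the model projects, and a ratio below 1 marks one that ended up less populous.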

Province                 Actual      Projected   Projected   Projected   Projected
                         population  population  population, population, population,
                                                 WWII        fertility   WWII and
                                                 adjusted    adjusted    fertility
                                                                         adjusted
Arkhangelskaya           2699.1      1088.1      1365.4      921.8       1119.3
Astrakhanskaya           1426.5      2383.9      2991.5      2019.6      2452.4
Bessarabskaya            4345.8      3682.0      2866.9      3615.5      2724.1
Vilenskaya               1977.2      2868.3      2233.3      2363.7      1780.9
Vitebskaya               2868.4      3628.2      2825.0      3443.3      2594.4
Vladimirskaya            1089.7      3350.7      4204.6      2838.5      3446.8
Vologodskaya             1970.2      5544.7      6957.9      4697.3      5703.9
Volynskaya               4037.9      5362.0      4175.1      4490.3      3383.3
Voronezhskaya            2734.1      3989.2      3106.1      3379.4      2546.3
Vyatskaya                3431.5      5469.3      6863.3      4633.4      5626.4
Grodnenskaya             1394.9      2991.8      2329.5      2839.3      2139.3
Oblast’ Voiska Donskogo  5424.8      5918.4      4608.3      5013.9      3777.8
Ekaterinoslavskaya       11286.0     4856.2      3781.2      4066.7      3064.1
Kazanskaya               3806.9      3860.2      4844.0      3270.2      3971.0
Kaluzhskaya              1066.8      2248.7      1750.9      1905.0      1435.4
Kievskaya                6170.8      7808.5      6080.0      6539.1      4927.0
Kovenskaya               2314.9      3145.7      2449.4      2592.3      1953.2
Kostromskaya             1425.2      2901.5      3641.0      2458.0      2984.8
Kurlyandskaya            917.0       1078.9      840.1       613.1       462.0
Kurskaya                 2564.5      3452.7      2688.4      2925.0      2203.9
Liftlyandskaya           2006.3      2406.3      1873.7      1367.5      1030.4
Minskaya                 4601.6      5890.1      4586.2      5589.9      4211.8
Mogilevskaya             2145.7      3409.6      2654.8      3235.8      2438.1
Moskovskaya              15682.4     12044.3     9378.1      10203.4     7687.9
Nizhegorodskaya          3336.9      2863.6      3593.5      2426.0      2945.8
Novgorodskaya            1454.6      3230.7      2515.5      2736.9      2062.2
Olonetskaya              892.4       907.2       706.4       768.6       579.1
Orenburgckaya            6421.7      3683.1      4621.8      3120.2      3788.9
Orlovskaya               2084.7      3534.4      2752.0      2994.2      2256.0
Penzenskaya              1773.2      1983.9      2489.5      1680.7      2040.8
Permskaya                7755.0      7153.5      8976.7      6060.1      7358.8
Podol’skaya              2660.9      5155.3      4014.1      4317.2      3252.9
Poltavskaya              2487.5      4866.4      3789.1      4075.3      3070.6
Pskovskaya               771.7       2377.6      1851.3      2014.2      1517.6
Ryazanskaya              1432.0      3592.5      4508.2      3043.4      3695.7
Samarskaya               4267.3      5170.0      6487.6      4379.8      5318.4
St. Peterburgskaya       6446.6      7855.9      6116.9      6655.2      5014.5
Saratovskaya             2117.7      4562.3      5725.1      3865.0      4693.3
Simbirskaya              2441.0      2191.4      2749.9      1856.4      2254.3
Smolenskaya              1211.2      3668.7      2856.6      3108.0      2341.7
Tavricheskaya            4916.7      3263.3      2540.9      2732.8      2059.1
Tambovskaya              2502.7      4294.4      5389.0      3638.1      4417.7
Tverskaya                1581.4      4129.1      3215.1      3498.0      2635.6
Tul’skaya                1891.4      2825.5      2200.0      2393.6      1803.5
Ufimskaya                4533.7      3990.1      5007.1      3380.3      4104.7
Khar’kovskaya            4830.8      5352.8      4167.8      4482.6      3377.4
Khersonskaya             4524.1      4397.6      3424.1      3682.7      2774.8
Chernigovskaya           2317.2      4715.1      3671.3      3948.6      2975.1
Estlyandskaya            1037.9      487.7       379.7       264.2       199.1
Yaroslavskaya            1384.1      2207.7      2770.4      1870.3      2271.1
Bakinskaya               4158.6      1674.7      2101.5      2761.1      3352.8
Batumskaya               392.7       133.3       167.2       156.6       190.2
Dagestanskaya            1514.1      864.3       1084.5      1424.9      1730.3
Elisavetpol’skaya        2536.2      2301.0      2887.4      3793.7      4606.7
Kubanskaya               4929.3      6013.9      4682.7      5094.8      3838.7
Kutaisskaya              1441.9      1536.8      1928.5      1806.1      2193.1
Stavropol’skaya          1685.1      1986.0      1546.3      1682.4      1267.6
Terskaya                 4209.6      3367.4      2622.0      5551.9      4183.2
Tiflisskaya              2898.9      3456.8      4337.8      4062.4      4933.0
Chernomorskaya           585.7       157.3       197.4       133.3       161.9
Erivanskaya              3582.8      1715.6      2152.9      3498.5      4248.3
Amurskaya                1333.4      498.0       625.0       421.9       512.3
Eniseiskaya              4244.4      3234.7      4059.2      2740.3      3327.6
Zabaikal’skaya           2496.1      1180.7      1481.7      1000.3      1214.6
Irkutskaya               2966.9      1675.9      2103.0      1419.7      1724.0
Primorskaya              5009.5      731.9       918.4       620.0       752.9
Sakhalinskaya            709.6       22.5        28.2        19.0        23.1
Tobol’skaya              6245.8      5688.5      7138.4      4819.1      5851.8
Tomskaya                 10160.8     6112.0      7669.8      5177.8      6287.5
Yakutskaya               1081.4      527.3       661.6       446.7       542.4
Akmolinskaya             5449.1      2272.0      2851.0      3611.3      4385.3
Zakaspiiskaya            2424.1      1765.1      2214.9      3810.5      4627.0
Samarkandskaya           2290.9      2181.8      2737.9      5060.4      6144.8
Semipalatinskaya         2576.7      1866.8      2342.6      2967.4      3603.3
Semirechenskaya          6489.2      2211.0      2774.5      3920.6      4760.8
Syr-Dar’inskaya          7971.0      6433.8      8073.5      14922.1     18120.0
Turgaiskaya              1629.8      1522.6      1910.7      2420.2      2938.9
’skaya                   1102.0      2125.1      2666.7      3377.8      4101.7
Ferganskaya              10466.4     5948.8      7465.0      12310.0     14948.0

Table C.4: Projected vs actual population. Russia.

SIC code  Description
10        Metal mining
12        Coal mining
13        Oil and gas extraction
14        Nonmetallic minerals, except fuels
20        Food and kindred products
2063      Beet Sugar
21        Tobacco manufactures
22        Textile mill products
23        Apparel and other textile products
24        Lumber and wood products
25        Furniture and fixtures
26        Paper and allied products
2611      Pulp Mills
2621      Paper Mills
2631      Paperboard Mills
27        Printing and publishing
28        Chemicals and allied products
2812      Alkalies and Chlorine
2822      Synthetic Rubber
2869      Industrial Organic Chemicals, NEC
29        and coal products
30        Rubber and miscellaneous plastics products
31        Leather and leather products
32        Stone, clay, glass, and concrete products
3241      Cement, Hydraulic
3274      Lime
33        Primary metal industries
3334      Primary Production of Aluminum
34        Fabricated metal products
35        Industrial machinery and equipment
36        Electrical and electronic equipment
37        Transportation equipment
38        Instruments and related products
39        Miscellaneous manufacturing industries

Table C.5: The industrial structure control series.

Dependent variable is log of industrial electricity consumption per unit of labor in 1991
Independent variables  VPK  no VPK  VPK  no VPK  VPK  no VPK
Temperature            -0.039*  -0.041*  -0.037*  -0.40*  -0.037*  -0.040*
                       (0.006)  (0.006)  (0.007)  (0.006)  (0.006)  (0.006)
SIC10                  -0.27  -2.59
                       (2.07)  (2.05)
SIC12,14               -2.39*  -5.20*  -2.28*  -1.72
                       (1.01)  (1.93)  (0.82)  (1.08)
SIC13                  5.30*  2.08  5.41*  4.71*
                       (1.23)  (1.84)  (0.67)  (0.65)
SIC20                  -0.69  -4.06*
                       (1.04)  (2.03)
SIC2063                -1.04  -3.26  -1.84  -1.25
                       (4.39)  (3.17)  (3.84)  (3.12)
SIC22-25               -2.76*  -4.63*  -2.49*  -1.69*
                       (1.23)  (1.79)  (0.72)  (0.84)
SIC26,2631             -10.28*  -10.28*  -12.28*  -10.56*
                       (5.02)  (4.36)  (4.37)  (4.42)
SIC2611                9.59  2.34  9.21  6.91
                       (6.00)  (5.69)  (4.83)  (4.11)
SIC2621                -0.91  -5.77
                       (1.59)  (3.12)
SIC27                  1.97  1.39
                       (8.62)  (7.15)
SIC28                  2.26  -0.84  3.09*  2.11
                       (1.67)  (1.92)  (1.42)  (1.19)
SIC2822                20.35*  12.11*  18.48*  12.10*
                       (5.45)  (3.10)  (4.85)  (2.34)
SIC2869                2.60  -1.27
                       (3.63)  (3.08)
SIC29,30               -3.96*  -6.36*  -2.85*  -2.49*
                       (1.30)  (2.11)  (0.97)  (1.09)
SIC31                  0.52  -2.53
                       (5.23)  (4.38)
SIC33                  0.09  -2.65
                       (1.23)  (1.86)
SIC3334                16.65*  10.25*  16.04*  13.55*
                       (4.91)  (4.02)  (4.08)  (3.71)
SIC3241                14.23  9.48  11.90  8.42
                       (9.36)  (8.17)  (8.54)  (7.37)
SIC32,34-38            -  -4.42*  -1.24
                       (1.83)  (0.75)
SIC32,34-38,           -2.60*  -2.41*
VPK                    (1.16)  (0.61)
SIC39                  3.63  1.11
                       (4.06)  (3.45)
R2                     0.22  0.26  0.79  0.74  0.77  0.66
No. of obs             75  75  75  75  75  75

Heteroskedasticity-consistent standard errors are in parentheses, * denotes significance on the 95% level.

Table C.6: Electricity in 1991.

Dependent variable is log of industrial electricity consumption per unit of labor in 1992
Independent variables  VPK  no VPK  VPK  no VPK  VPK  no VPK
Temperature            -0.039*  -0.040*  -0.038*  -0.040*  -0.038*  -0.041*
                       (0.007)  (0.007)  (0.007)  (0.006)  (0.007)  (0.006)
SIC10                  0.17  -2.30
                       (1.51)  (1.71)
SIC12,14               -0.93  -4.11*  -1.28  -1.23
                       (1.04)  (1.81)  (0.86)  (1.04)
SIC13                  6.65*  2.91  6.04*  4.95*
                       (1.48)  (1.84)  (0.72)  (0.69)
SIC20                  -0.34  -3.75
                       (1.24)  (2.16)
SIC2063                8.87*  3.61  6.45  4.01
                       (4.26)  (3.55)  (4.06)  (3.70)
SIC22-25               -1.63  -3.98*  -1.96*  -1.58*
                       (1.32)  (1.79)  (0.55)  (0.72)
SIC26,2631             -7.07  -8.86*  -8.66  -8.54*
                       (6.14)  (4.16)  (5.09)  (3.46)
SIC2611                6.10  -0.49  6.10  3.78
                       (6.40)  (5.37)  (5.41)  (4.19)
SIC2621                0.25  -4.83
                       (2.05)  (2.66)
SIC27                  7.79  5.46
                       (9.94)  (7.37)
SIC28                  2.61  -0.64  3.09*  2.10
                       (1.96)  (1.95)  (1.50)  (1.09)
SIC2822                16.40*  9.54*  13.19*  8.53*
                       (6.28)  (3.60)  (5.59)  (2.73)
SIC2869                1.05  -2.41
                       (4.72)  (3.10)
SIC29,30               -1.21  -4.19*  -1.05  -1.06
                       (1.74)  (2.06)  (1.23)  (1.10)
SIC31                  -3.57  -6.05
                       (4.24)  (4.07)
SIC33                  1.39  -1.77
                       (1.43)  (1.86)
SIC3334                20.00*  12.04*  18.05*  14.59*
                       (4.88)  (3.70)  (4.60)  (3.84)
SIC3241                0.089  -4.61  -4.87  -6.12
                       (6.17)  (6.24)  (5.29)  (6.32)
SIC32,34-38            -3.80*  -1.17
                       (1.80)  (0.68)
SIC32,34-38,           -1.51  -1.87
VPK                    (1.26)  (0.48)
SIC39                  4.96  1.76
                       (4.62)  (3.68)
R2                     0.21  0.24  0.74  0.72  0.71  0.64
No. of obs             79  79  79  79  79  79

Heteroskedasticity-consistent standard errors are in parentheses; * denotes significance at the 5% level.

Table C.7: Electricity in 1992.

Dependent variable: log of industrial thermal energy consumption per unit of labor in 1991

| Independent variables | VPK | no VPK | VPK | no VPK | VPK | no VPK |
|---|---|---|---|---|---|---|
| Temperature | -0.025* (0.007) | -0.027* (0.007) | -0.019* (0.009) | -0.23* (0.008) | -0.026* (0.007) | -0.024* (0.007) |
| SIC10 | | | 2.13 (2.24) | -0.16 (2.71) | | |
| SIC12,14 | | | 1.04 (1.48) | -0.69 (2.46) | | |
| SIC13 | | | 3.73* (1.85) | 2.10 (2.58) | 3.72* (0.82) | 2.44* (0.63) |
| SIC20 | | | -1.07 (0.93) | -2.23* (2.57) | -1.13* (0.40) | -1.49 (0.55) |
| SIC21 | | | 45.36* (22.78) | 16.79 (19.74) | | |
| SIC22-24 | | | 0.15 (1.53) | -0.96 (2.55) | | |
| SIC25,2063 | | | 1.51 (2.61) | 0.42 (3.06) | 2.81 (1.79) | 1.99 (0.84) |
| SIC26 | | | -45.65* (17.39) | -41.48* (18.24) | -41.44* (13.53) | -39.08* (15.99) |
| SIC2611 | | | 12.22 (6.36) | 5.92 (7.18) | 8.32 (5.03) | 7.20 (3.97) |
| SIC2621 | | | 3.33 (3.41) | 1.70 (4.00) | 3.55 (2.43) | 2.02 (2.02) |
| SIC2631 | | | 1.09 (8.63) | -0.94 (6.42) | | |
| SIC27 | | | 1.62 (7.07) | 3.78 (5.95) | 6.00 (5.04) | 8.30 (4.79) |
| SIC28,29 | | | 5.09* (2.23) | 2.85 (2.59) | 5.33* (1.45) | 3.55* (1.19) |
| SIC2822,2869 | | | 17.68* (3.60) | 12.00* (2.92) | 15.36* (3.21) | 13.66* (2.59) |
| SIC30-32 | | | 0.10 (2.27) | 1.10 (3.07) | | |
| SIC33 | | | 1.25 (1.65) | 0.19 (2.58) | 1.63 (0.59) | 0.90 (0.53) |
| SIC3334 | | | 4.89 (4.17) | 2.55 (4.30) | 7.69 (4.41) | 3.53 (3.61) |
| SIC3241 | | | -0.01 (11.34) | 3.30 (9.56) | | |
| SIC34-35,37-38 | | | - | -0.95* (2.46) | - | |
| SIC34-35,37-38,VPK | | | 0.08 (1.51) | - | | - |
| SIC36 | | | 1.09 (1.62) | -1.87 (2.44) | -1.35 (0.66) | -1.02 (0.64) |
| SIC39 | | | 3.11 (3.86) | 1.32 (3.54) | | |
| R2 | 0.11 | 0.11 | 0.69 | 0.64 | 0.65 | 0.61 |
| No. of obs | 75 | 75 | 75 | 75 | 75 | 75 |

Heteroskedasticity-consistent standard errors are in parentheses; * denotes significance at the 5% level.

Table C.8: Thermal energy in 1991.

Dependent variable: log of industrial thermal energy consumption per unit of labor in 1992

| Independent variables | VPK | no VPK | VPK | no VPK | VPK | no VPK |
|---|---|---|---|---|---|---|
| Temperature | -0.035* (0.007) | -0.036* (0.007) | -0.039* (0.007) | -0.041* (0.007) | -0.044* (0.006) | -0.042* (0.006) |
| SIC10 | | | 0.59 (1.20) | -1.29 (1.74) | | |
| SIC12,14 | | | -1.05 (0.87) | -2.94 (1.84) | | |
| SIC13 | | | 2.43 (1.35) | 0.60 (1.89) | 3.26* (0.67) | 2.26* (0.58) |
| SIC20 | | | -0.32 (0.66) | -2.37 (1.74) | 0.55 (0.37) | 0.10 (0.44) |
| SIC21 | | | 5.44 (24.96) | -3.17 (18.42) | | |
| SIC22-24 | | | -1.15 (1.05) | -2.64 (1.78) | | |
| SIC25,2063 | | | -1.73 (2.17) | -3.98* (1.79) | -3.44* (1.55) | -2.42 (1.71) |
| SIC26 | | | 9.72 (11.99) | -1.87 (11.89) | -1.39 (8.05) | -1.28 (9.67) |
| SIC2611 | | | 7.59 (4.60) | -1.73 (5.18) | 7.02 (4.62) | 6.14 (3.87) |
| SIC2621 | | | 3.49 (2.08) | 0.72 (2.65) | 4.51* (1.18) | 3.00 (1.12) |
| SIC2631 | | | -2.48 (7.38) | -1.93 (4.65) | | |
| SIC27 | | | -13.45* (5.90) | -6.84 (4.81) | -7.34 (5.60) | -2.59 (4.32) |
| SIC28,29 | | | 2.86 (1.78) | 0.96 (1.84) | 4.12* (1.42) | 2.86* (1.11) |
| SIC2822,2869 | | | 12.39* (3.58) | 7.31* (2.80) | 12.95* (3.13) | 11.18* (2.39) |
| SIC30-32 | | | -0.19 (2.02) | -0.17 (1.98) | | |
| SIC33 | | | 0.26 (1.22) | -1.13 (1.77) | 1.34* (0.54) | 0.83 (0.49) |
| SIC3334 | | | 2.26 (3.54) | -0.86 (3.28) | 5.22 (4.11) | 2.07 (3.55) |
| SIC3241 | | | 6.13 (5.80) | 1.18 (4.67) | | |
| SIC34-35,37-38 | | | - | -2.61 (1.74) | - | |
| SIC34-35,37-38,VPK | | | -1.13 (1.00) | - | | - |
| SIC36 | | | -2.35 (1.13) | -3.47* (1.74) | -1.04 (0.72) | -0.90 (0.58) |
| SIC39 | | | 0.05 (3.59) | -1.42 (3.23) | | |
| R2 | 0.27 | 0.26 | 0.73 | 0.66 | 0.71 | 0.58 |
| No. of obs | 79 | 79 | 79 | 79 | 79 | 79 |

Heteroskedasticity-consistent standard errors are in parentheses; * denotes significance at the 5% level.

Table C.9: Thermal energy in 1992.

Dependent variable: log of industrial consumption of fuel per unit of labor in 1991

| Independent variables | VPK | no VPK | VPK | no VPK | VPK | no VPK |
|---|---|---|---|---|---|---|
| Temperature | -0.042* (0.008) | -0.044* (0.008) | -0.025* (0.008) | -0.026* (0.008) | -0.029* (0.007) | -0.026* (0.007) |
| SIC10 | | | 0.23 (1.86) | -1.00 (1.92) | | |
| SIC12,14 | | | -0.50 (0.73) | -1.83 (1.23) | | |
| SIC13 | | | 7.07* (1.10) | 4.74* (1.17) | 7.17* (0.63) | 5.94* (0.49) |
| SIC20,21,24 | | | 0.51 (0.62) | -0.96 (1.19) | 0.67 (0.52) | 0.32 (0.49) |
| SIC2063 | | | -12.66* (5.26) | -11.44* (3.58) | -9.55* (2.77) | -8.51* (1.84) |
| SIC22-23 | | | -1.35* (0.69) | -2.40 (1.14) | -0.99* (0.46) | -1.12* (0.47) |
| SIC25 | | | -7.00* (2.58) | -6.43* (2.69) | -5.30* (1.69) | -4.95* (2.09) |
| SIC26,2631 | | | 5.41 (4.61) | 0.68 (3.47) | | |
| SIC2611,2621 | | | -1.55 (1.94) | -2.49 (2.20) | | |
| SIC27 | | | 10.62 (8.19) | 7.30 (5.98) | 9.65 (5.54) | 8.55 (4.75) |
| SIC28-29 | | | 6.47* (2.09) | 3.61* (1.63) | 6.52* (1.88) | 4.58* (1.23) |
| SIC2822 | | | 17.62* (7.06) | 10.18* (4.74) | 19.68* (7.39) | 13.11* (4.42) |
| SIC2869 | | | -4.48 (9.16) | -2.91 (5.24) | | |
| SIC30-31 | | | 1.11 (2.53) | 0.05 (1.80) | | |
| SIC32 | | | 1.37 (2.32) | 1.28 (1.46) | 1.34 (1.97) | 2.39 (1.48) |
| SIC3241 | | | 21.61* (10.64) | 13.89 (8.38) | 22.93* (10.20) | 15.74* (7.61) |
| SIC3274 | | | -72.61 (61.83) | -39.68 (48.43) | | |
| SIC33 | | | 2.79* (0.84) | 0.99 (1.08) | 2.91* (0.63) | 2.06* (0.49) |
| SIC3334 | | | -6.82* (3.45) | -7.88* (3.02) | -5.90* (2.89) | -5.79* (2.71) |
| SIC34,36,38 | | | -3.25* (0.95) | -3.62* (1.20) | -2.69* (0.69) | -2.14* (0.53) |
| SIC35,37,VPK | | | -1.04 (0.68) | - | -0.69* (0.35) | - |
| SIC35,37 | | | - | -1.95 (1.16) | - | -0.35 (0.39) |
| SIC39 | | | -4.46 (6.09) | -3.57 (3.84) | | |
| R2 | 0.20 | 0.23 | 0.77 | 0.76 | 0.76 | 0.74 |
| No. of obs | 75 | 75 | 75 | 75 | 75 | 75 |

Heteroskedasticity-consistent standard errors are in parentheses; * denotes significance at the 5% level.

Table C.10: Fuels in 1991.

Dependent variable: log of industrial consumption of fuel per unit of labor in 1992

| Independent variables | VPK | no VPK | VPK | no VPK | VPK | no VPK |
|---|---|---|---|---|---|---|
| Temperature | -0.049* (0.009) | -0.049* (0.008) | -0.034* (0.006) | -0.034* (0.008) | -0.037* (0.006) | -0.036* (0.006) |
| SIC10 | | | 0.75 (0.64) | -0.10 (0.78) | | |
| SIC12,14 | | | -0.67 (0.62) | -0.61 (0.78) | | |
| SIC13 | | | 6.95* (0.97) | 4.99* (0.78) | 6.67* (0.57) | 5.63* (0.41) |
| SIC20,21,24 | | | 0.58 (0.36) | -0.68 (0.82) | 0.07 (0.45) | -0.39 (0.39) |
| SIC2063 | | | -9.56* (4.11) | -8.57* (3.50) | -8.83* (3.21) | -6.56* (2.89) |
| SIC22-23 | | | -1.11* (0.43) | -1.94* (0.77) | -1.47* (0.34) | -1.39* (0.36) |
| SIC25 | | | -3.83 (2.10) | -3.98* (1.79) | -4.34* (1.73) | -3.42* (1.71) |
| SIC26,2631 | | | -0.83 (3.69) | -3.48 (2.80) | | |
| SIC2611,2621 | | | -0.82 (1.17) | -1.88 (1.73) | | |
| SIC27 | | | 9.96 (7.61) | 6.97 (5.50) | 10.65 (5.50) | 7.74 (4.14) |
| SIC28-29 | | | 6.94* (1.89) | 4.26* (1.41) | 6.66* (1.54) | 4.61* (1.13) |
| SIC2822 | | | 20.53* (6.80) | 10.56 (4.38) | 21.26* (6.50) | 11.41* (3.66) |
| SIC2869 | | | 1.08 (8.06) | 1.04 (4.18) | | |
| SIC30-31 | | | 0.90* (1.94) | -0.05 (1.49) | | |
| SIC32 | | | 2.38 (1.79) | 1.37 (1.24) | 2.26 (1.61) | 1.91 (1.01) |
| SIC3241 | | | 3.95 (5.21) | 0.71 (4.88) | 3.94 (4.44) | 1.52 (4.31) |
| SIC3274 | | | -39.71 (58.64) | -14.71 (45.26) | | |
| SIC33 | | | 3.46* (0.66) | 1.67* (0.80) | 3.03* (0.55) | 2.07 (0.44) |
| SIC3334 | | | -8.54* (3.07) | -8.94* (2.53) | -9.49* (2.53) | -8.31* (2.28) |
| SIC34,36,38 | | | -2.80* (0.72) | -3.13* (0.77) | -3.23* (0.58) | -2.70* (0.40) |
| SIC35,37,VPK | | | -1.00* (0.48) | - | -1.44* (0.27) | - |
| SIC35,37 | | | - | -1.74* (0.83) | - | -1.20* (0.37) |
| SIC39 | | | -0.40 (5.59) | -1.27 (3.56) | | |
| R2 | 0.26 | 0.28 | 0.81 | 0.79 | 0.81 | 0.78 |
| No. of obs | 79 | 79 | 79 | 79 | 79 | 79 |

Heteroskedasticity-consistent standard errors are in parentheses; * denotes significance at the 5% level.

Table C.11: Fuels in 1992.

Bibliography

Abele, G. (1986), Effect of Cold Weather on Productivity, in ‘Technology Transfer Opportunities for the Construction Engineering Community. Proceedings of Construction Seminar.’, US Army Cold Regions Research and Engineering Laboratory.

Acemoglu, D., Johnson, S. & Robinson, J. (2002), ‘Reversal of Fortune: Geography and Institutions in the Making of the Modern World’, Quarterly Journal of Economics 117.

Barro, R. J. & Sala-i-Martin, X. (1995), Economic Growth, McGraw-Hill, New York, NY.

Bloom, D. & Sachs, J. D. (1998), ‘Geography, Demography and Growth in Africa’, Brookings Papers on Economic Activity 2.

Carlton, D. W. (1983), ‘The Location and Employment Choices of New Firms: An Econometric Model with Discrete and Continuous Endogenous Variables’, The Review of Economics and Statistics 65, 440–449.

Davis, D. R. & Weinstein, D. E. (2001), ‘Bones, Bombs, and Break Points: The Geography of Economic Activity’, NBER working paper 8517.

Fogel, R. W. (1964), Railroads and American Economic Growth: Essays in Econometric History, Johns Hopkins University Press, Baltimore, Maryland.

Fujita, M., Krugman, P. & Venables, A. J. (2000), The Spatial Economy: Cities, Regions, and International Trade, The MIT Press, Cambridge, Massachusetts.

Gaddy, C. G. (1996), The Price of the Past: Russia’s Struggle with the Legacy of a Militarized Economy, Brookings Institution Press, Washington, D.C.

Gaddy, C. G. & Ickes, B. W. (2001), ‘The Cost of the Cold’, The Pennsylvania State University working paper, unpublished.

Gallup, J. L., Sachs, J. D. & Mellinger, A. (1999), ‘Geography and Economic Development’, International Regional Science Review 22(2), 179–232.

Gluschenko, K. (2002), ‘Common Russian Market: Myth rather than Reality’, Economic Education and Research Consortium Working Paper Series.

Heleniak, T. (1999), ‘Out-Migration and Depopulation of the Russian North during the 1990s’, Eurasian Geography and Economics 40, 155–205.

Herbert, D. & Burton, I. (1994), Estimated Costs of Adaptation to Canada’s Current Climate and Trends Under Climate Change. Atmospheric Environment Service, unpublished.

Hill, F. & Gaddy, C. (2003), The Siberian Curse: How Communist Planners Left Russia Out in the Cold, Brookings Institution Press, Washington, DC. Forthcoming.

Horrigan, B. (1992), ‘How Many People Worked in Soviet Defense Industry?’, RFE/RL Report 1(33).

Islam, N. (1995), ‘Growth Empirics: A Panel Data Approach’, Quarterly Journal of Economics 110.

Lynch, A. C. (2002), ‘Roots of Russia’s Economic Dilemmas: Liberal Economics and Illiberal Geography’, Europe-Asia Studies 154, 31–49.

Masters, W. & McMillan, M. (2000), ‘Climate and Scale in Economic Growth’, CID working paper No. 48.

Mikhailova, T. (2003), Where Russians Should Live: a Counterfactual Alternative to Soviet Location Policy. The Pennsylvania State University, unpublished.

North, D. C. (1990), Institutions, Institutional Change, and Economic Performance, Cambridge University Press, New York, NY.

Olkhovsky, G. (2001), ‘Combined Electricity and Heat Generation in Russia: Efficient Way of Greening Fossil Fuels’, Towards Local Energy Systems: Revitalizing District Heating and Co-Generation in Central and . World Energy Council conference report.

Parshev, A. P. (1999), Why Russia is not America, Forum, Moscow.

Rappaport, J. & Sachs, J. D. (2001), ‘US as a Coastal Nation’, Research Division, Federal Reserve Bank of Kansas City.

Rosen, K. T. (1979), Seasonal Cycles in the Housing Market, The MIT Press, Cambridge, MA.

Rosenthal, D. H., Gruenspecht, H. & Moran, E. A. (1995), ‘Effect of Global Warming on Energy Use for Space Heating and Cooling in the United States’, The Energy Journal 16(2).

Vassin, S. A. & Costello, C. A. (1997), Spatial, Age, and Cause-of-Death Patterns of Mortality in Russia, 1988-1989, in ‘Premature Death in the New Independent States’, The National Academy of Science.

Vita

Date & Place of Birth: 22 Nov 1973; Moscow, Russia

Citizenship: Russian Federation

Education:
2004, Ph.D. in Economics, Pennsylvania State University
1997, M.A. in Economics (cum laude), New Economic School, Moscow, Russia
1996, Diplom in Electronic Engineering, Moscow State Institute of Electronic Engineering, Russia

Fields:
Primary: Economics of Transition
Secondary: Industrial Organization, International Trade

Presentations:
“Regional Migration in Russia,” presented at GET Conference, New Economic School, Moscow, Russia, 1997
“Where Russians Should Live: A Counterfactual Alternative to Soviet Location Policy,” presented at Cornell-PSU Macro Workshop, 2001; the VIIth SMYE Conference, 2002; Midwest International Economics Meeting, 2002; Comparative Economics Seminar at the Davis Center for Russian and Eastern European Studies, Harvard University, 2003; CEFIR open seminar, Moscow, 2003

Experience:
Instructor (visiting), Intermediate Macroeconomics, Fall(I) 2003, New Economic School, Moscow
Instructor, Intermediate Microeconomics, Summer 2000
Research Assistant for Professor Kala Krishna, Summer 1998
Teaching Assistant for Microeconomic Theory (graduate), Mathematics for Economists (graduate), International Finance, Labor Economics, Monetary Theory and Policy, Intermediate Microeconomics, Introductory Microeconomics, Fall 1997–Spring 2001
Member of the research group, Government and Economy in Transition (GET) project, New Economic School, Moscow, Russia, 1996-1997