The long-run effects of R&D place-based policies: Evidence from Russian Science Cities*

Helena Schweiger,† Alexander Stepanov‡ and Paolo Zacchia§

February 2021

Abstract

We study the long-run effects of historical place-based R&D policies: the creation of Science Cities in Soviet . We compare current demographic and economic characteristics of Science Cities with those of localities that were similar to them at the time of their establishment. We find that in present-day Russia, Science Cities host more highly skilled workers, are more innovative and productive. We interpret these findings as the result of the interaction between persistence and agglomeration forces; we rule out explanations related to the differential use of public resources.

JEL Classification Codes: N94, O38, P25, R58

Keywords: Science Cities, Place-based policies, Agglomeration, Russia

*We would like to thank Charles Becker, Ekaterina Borisova, Vincenzo Bove, Julian Cooper, Ralph de Haas, Sergei Guriev, Maria Gorban, Mark Harrison, Denis Ivanov, Sergei Izmalkov, Patrick Kline, Olga Kuzmina, Andrei Markevich, Enrico Moretti, Gérard Roland, Mauro Sylos Labini and Natalya Volchkova for helpful discussions, as well as the Coeditor, Lucas W. Davis, and two anonymous referees, the participants at the SITE Academic Conference: 25 years of transi- tion, 8th MEIDE conference, 2015 PacDev conference, 17th IEA World Congress, 16th Uddevalla Symposium, 2nd World Congress of Comparative Economics, XVIII HSE April International Aca- demic Conference, 2017 Barcelona Workshop on Regional and Urban Economics, 2018 Meeting of the American Economic Association, 13th Meeting of the Urban Economics Association and seminars at the EBRD, Higher School of Economics, New Economic School, U.C. Berkeley, U. of Genoa, U. of Cergy-Pontoise, École Polytechnique, U. of Nottingham, U. of Warwick and CERGE- EI for their comments and suggestions. Irina Capita, Jan Lukšiˇcand Maria Vasilenko provided excellent research assistance. The views expressed in this paper are our own and do not neces- sarily represent those of the institutions of affiliation. †European Bank for Reconstruction and Development (EBRD). E-mail: [email protected]. ‡European Bank for Reconstruction and Development (EBRD). E-mail: [email protected]. §CERGE-EI and IMT School of Advanced Studies. E-mail: [email protected]. 1 Introduction

The effectiveness of public support for science and research and development (R&D) is a long-standing issue in the economics of innovation. Both direct sub- sidies and indirect incentives for research and science typically rest on positive externalities (or other types of market failures) which, in the absence of public intervention, cause under-investment in R&D. Some specific innovation poli- cies, such as the top-down creation of local R&D clusters, are characterized by a geographical dimension. To evaluate the overall impact of such interventions, it is necessary to assess the spatial extent of knowledge spillovers – one of the three forces of spatial agglomeration first identified by Marshall (1890). The de- bate about localized innovation policies mixes with the one about broader place- based policies: a key question is whether place-based policies can generate self- reinforcing economic effects that persist after their termination, possibly be- cause of agglomeration forces at work. Absent any long-run effects, the net wel- fare effect of place-based policies is as likely to be negative as it is to be positive (Glaeser and Gottlieb, 2008; Kline and Moretti, 2014b). Localized innovation policies are a popular policy intervention, but evidence about their effectiveness is scarce even in the short-run, let alone the long-run. This paper is one of the few analyzing such policies and their long-run impact. We do so by examining the legacy of Russian Science Cities: 95 middle-sized urban centers that were created or developed by the Soviet government in the territory of modern Russia with the purpose of concentrating highly specialized, strategic R&D facilities. These cities were typically shaped around a specific tech- nological purpose; the Soviet government relocated scientists, researchers and other high-skilled workers from elsewhere in the to work in the newly created establishments. The creation of Science Cities was prompted by the technological and military competition between geopolitical blocks during the ; unsuprisingly, most of them were specialized in military-related fields, such as nuclear physics, ballistics, aerospace and chemistry. Russia main- tains a comparative technological advantage in these sectors to this day.

1 While one may question whether the institutional context of Russian Science Cities is comparable to that of other industrialized countries, this historical expe- rience stands out in two respects. First, it greatly diminishes selection bias con- cerns due to unobserved determinants of future development, which typically affect studies about innovative clusters in other countries. The allocation of re- sources in the Soviet command economy was managed according to suboptimal, often erratic rules of thumb – especially so for highly secretive projects, which were managed by a handful of bureaucrats lacking the advice of experts (Gre- gory and Harrison, 2005; Harrison, 2018). Soviet science was, in fact, a largely secretive business managed by the internal police and the army (Siddiqi, 2008, 2015). A rule of thumb appears most evident in the choice of Science Cities’ lo- cations (Agirrechu, 2009), which was based on a secrecy-usability trade-off: the Soviet leaders prioritized places that offered better secrecy and safety from for- eign interference (in the form of R&D espionage), or that were otherwise easy to control by governmental agencies, by virtue of geographical proximity. The potential for economic development and local human capital accumulation was arguably not, at the margin, a determinant of location choice. Second, the transition to a market economy that followed the dissolution of the USSR resulted in a large negative shock to Russian R&D: direct governmental R&D expenditure as a percentage of GDP fell by about 63 per cent, causing half of the scientists and researchers of post-1991 Russia to lose their jobs. Conse- quently, state support for Science Cities was abruptly suspended for obviously exogenous reasons; only in the 2000s was it partially resumed for 14 of the for- mer towns, which today bear the official name of Naukogrady (Russian for “Sci- ence Cities”). These historical developments provide us with a unique opportu- nity to study the long-run consequences of an exogenous spatial reallocation of highly-skilled workers, decades after the termination of the program that orig- inally motivated such reallocation. In addition, by analyzing historical Science Cities separately from modern Naukogrady, we are able to evaluate to what ex- tent the modern characteristics of the former depend on the long-run effects due to the Soviet-era policy, rather than on current government support.

2 These distinctive institutional features motivated us to build a unique, rich dataset covering geographical, historical and present characteristics of Russian municipalities to answer more general questions about the effects of innovation- centered place-based policies. With it, we estimate the effect of the past estab- lishment of a Science City on present-day municipal-level human capital (mea- sured as the share of the population with either graduate or postgraduate qual- ifications), innovation (evaluated in terms of patent applications) and various proxies of economic development. To give our estimates a causal interpretation, we match Science Cities to other localities that, at the time of selection, were similar to them in terms of characteristics that could affect both their probabil- ity of being chosen and their future outcomes, and were on a similar popula- tion growth trend. Our main identifying assumption is that, conditional on our matching variables, the choice of a locality was determined at the margin by fac- tors that would be independent from future, post-transition outcomes. Our findings can be summarized as follows. At present, Soviet-era Science Cities still host a more educated population, are more economically developed, employ a larger number of workers in R&D and ICT-related jobs, and apply for more patents than the localities comparable to Science Cities at the time of the program’s inception. In addition, workers in former Science Cities receive sub- stantially higher salaries, especially in high-skilled occupations. The estimated treatment effect is typically lower than the raw sample difference for all outcome variables except those related to patent applications, for which no ex-ante bias can be attested from our estimates. The results remain largely unchanged when modern Naukogrady are excluded from the analysis, but the point estimates rela- tive to patent filings and the share of workers employed in the R&D or ICT sectors decrease substantially. A more in-depth analysis of demographic outcomes and economic development (proxied by night lights) reveals little to no evidence of mean reversion. We interpret the findings in light of a spatial equilibrium model, inspired by Glaeser and Gottlieb (2009), Moretti (2011), and Allen and Donaldson (2020), that incorporates both path-dependence and agglomeration forces. We estimate the

3 equilibrium predictions of our model on the matched sample; this reveals large estimates of the parameters that embody path-dependence as well as robust ag- glomeration elasticities – in a neighbourhood of 0.3 – stemming from a higher concentration of high-skilled workers. This finding contrasts with the study by von Ehrlich and Seidel (2018, henceforth ‘vES’) of the formerly subsidized West German municipalities which used to border the Iron Curtain. While their em- pirical analysis rules out agglomeration effects, they propose persistence in pub- lic goods investment as the explanation of their measured long-run effects. Our paper provides one of the first assessments of the vES hypothesis through the analysis of municipal budget data. We observe no evidence of differences in the overall expenditure or patterns in the use of resources between Science Cities and their matched counterparts. Our paper adds to the growing body of studies about the evaluation of place- based policies.1 It is most directly related, conceptually and methodologically, to the studies by Kline and Moretti (2014a) about the Tennessee Valley Author- ity, Fan and Zou (2021) on China’s “Third Front” state-driven industrialization of inner China, and Heblich et al. (2020) about the “Million Roubles Plants” built in China with Soviet support during the early Cold War. While these contribu- tions also exploit unique historical circumstances of political, geographical and military kind to uncover long-run effects from historical place-based policies, the Science Cities program stands out as it specifically concerned investments in knowledge, rather than in physical capital. We also contribute to the more general search of agglomeration effects – and in particular, localized knowledge spillovers – in urban and regional economics. This has long been a traditional field of investigation for economic geographers, with a particular interest in innovation clusters. Following the seminal contri- butions by Jaffe (1989), Glaeser et al. (1992), Audretsch and Feldman (1996) and others, a large literature has developed. The issue has caught the attention of economists working in other fields, with several papers focusing on local pro-

1For a recent survey of the empirical research see Neumark and Simpson (2015).

4 ductivity spillovers; for example: Moretti (2004), Ellison et al. (2010), Greenstone et al. (2010), and Bloom et al. (2013). This paper’s institutional setting relates it to existing research on the con- sequences of historically massive forms of government intervention on long- run economic and technological development, in Russia or elsewhere. Chere- mukhin et al. (2017) argue that the “Big Push” industrialisation policy enacted in the USSR under Stalin did not succeed in shifting Russia onto a faster path of economic development. Mikhailova (2012a,b) evaluates negative welfare effects due to the regional demographic policies enacted in the Soviet Union; she also finds that locations hosting camps grew significantly faster than similar places without camps.2 In the more specific case of R&D policies, Ivanov (2016) finds that Russian regions with more R&D personnel before the onset of tran- sition do better today at expanding employment in high-tech sectors. Outside Russia, Moretti et al. (2020) show that increases in government-funded R&D for military purposes have positive net effects on the TFP of OECD countries, despite crowding out private expenditures in R&D. This paper is organized as follows. Section 2 summarizes the history and ge- ographical patterns of Science Cities. Section 3 outlines the empirical method- ology. Section 4 describes our data. Section 5 discusses the empirical results. Section 6 outlines our spatial equilibrium model and its empirical implications. Lastly, section 7 concludes.

2 Historical and institutional background

The former Soviet Union was among the pioneers of public investment in sci- ence and in place-based policies focusing on R&D. In the context of the Cold War competition, the Soviet leadership prioritized the allocation of the best resources – including human – to sectors considered vital to the country’s national secu- rity. Around two-thirds of all Soviet R&D spending was set for military purposes,

2Toews and Vézina’s (2020) study about the legacy of gulag camps that hosted highly-skilled political prisoners echoes this result and corroborates many of our findings.

5 and almost all of the country’s high-technology industry was in sectors directly or indirectly related to defense (Cooper, 2012). Moreover, science was mostly the army’s responsibility (Siddiqi, 2015). Science Cities emerged in this environ- ment. We identify 95 middle-sized urban centers which the Soviet government endowed with a high concentration of research and development facilities, each devoted to a particular scientific and technical specialization. Science Cities be- gan to develop around strategically important (military) research centers from the mid-1930s;3 however, the majority of them were established after the World War II, especially in the 1950s. As they specialized in industries with high technological intensity, Science Cities needed access to suitable equipment, machinery, intermediate inputs and qualified personnel. With the objective of co-locating scientific research centers, training institutes and manufacturing facilities, the Soviet government estab- lished about two-thirds of Science Cities by “repurposing” existing settlements, while the rest were built from scratch in sparsely populated areas. Researchers, scientists and supporting personnel were relocated to Science Cities in order to contribute to the R&D projects. To incentivize them, the Soviet authorities strove to provide better than standard living conditions in these localities, by making available a wider choice of retail goods, more comfortable apartments as well as more abundant cultural opportunities than elsewhere in the country. Typi- cally, the urban characteristics of Science Cities were better than those of other contemporary settlements, as the former were developed according to the best urban planning criteria at the time (Agirrechu, 2009). Starting in the 1940s, with the need to protect the secrecy of nuclear weapons in the Cold War environment (Rowland, 1996), many Soviet municipalities of mil- itary importance were “closed” to external access to maintain security and pri- vacy. Non-residents needed explicit permission to travel to closed cities and were subject to document checks and security checkpoints; relocating to a

3The model of innovation followed by the Soviet authorities since the early 1930s was the creation of “special-regime enclaves intended to promote innovation” (Cooper, 2012). These en- claves first appeared as secret research and development laboratories (so-called Experimental Design Bureaus or sharashki) in the Soviet Gulag labor camp system (Siddiqi, 2008).

6 required security clearance by the KGB; foreigners were prohibited from entering them at all; and inhabitants had to keep their place of residence secret. Science Cities whose main objective was to develop nuclear weapons, missile technology, aircraft and electronics were closed as well; some of them were located in remote areas situated deep in the Urals and – out of reach of enemy bombers – and were represented only on classified maps. Note that the sets of “Science Cities” and “closed cities” overlap only partially; we take this into account in our empirical analysis. Following the dissolution of the USSR, Russia underwent a difficult transfor- mation from a planned to a market economy. The withdrawal of the state from many sectors of the economy dramatically affected R&D as well. In Russia, gross R&D expenditures as a fraction of GDP fell from the 1990 level of about 2 per cent to a mere 0.74 per cent in 1992, a fact made even more dramatic as Russian GDP shrank by about 50 per cent in the early years of the transition. Wages plum- meted, and consequently, total employment in R&D fell by about 50 per cent.4 Over the transition, there was little to no recovery from these initial shocks (see Online Appendix A for more details). This has inevitably affected Science Cities; while detailed information about their government funding in the 1990s is not available, anecdotal evidence speaks of an effective discontinuation of the mili- tary research programs that Science Cities were responsible for, at least until the government re-established direct support for the 14 modern Naukogrady in the early 2000s (though this time without explicit military focus). Our analysis of re- cent municipal budgets in section 5 confirms that Science Cities today receive, if anything, lower governmental transfers than comparable towns.

The choice of locations that hosted Science Cities. Since Science Cities were created in secret and in a staggered fashion, typically to respond to perceived

4In the USSR, the wages of scientists were about 10-20 per cent above average. They dropped to about 65 per cent of the average wage in 1992, following the state’s withdrawal from the R&D sector (Saltykov, 1997). Even worse, during the 1990s many scientists did not even receive their salary, or received only a fraction of it (sometimes in kind) over extended periods (Ganguli, 2015). Low remuneration was not the only reason for researchers to leave the R&D sector: with the removal of previous restrictions to individual mobility, scientists were allowed to migrate abroad.

7 military-technological threats coming from the west, there exists no detailed, comprehensive account of how their locations were chosen.5 To inform our em- pirical analysis, we draw on the historical meta-analysis by Gregory and Harri- son (2005) which examines the archives produced by the Soviet state between its inception and 1960. The authors note that in the USSR, the allocation of re- sources was based on very imperfect and informal “rules of thumb” rather than efficiency and optimality criteria.6 This was ultimately caused by a combina- tion of Hayekian information problems, and a failure of the Soviet political econ- omy model – at all levels of its bureaucratic apparatus – to credibly commit to a set of contingent, efficient rules for the management of a planned economy. In the case of secretive decisions, these problems were exacerbated by a security- usability trade-off (Harrison, 2018). In short, the Soviet leaders commonly faced a dilemma: while sharing secret information and choices with competent agents might jeopardize the secrecy, not doing so would entail the opposite risk of tak- ing ineffective if not harmful decisions. The geographical pattern of Science Cities’ locations is described in detail by Agirrechu (2009) and depicted in a chloropleth map of modern Russian regions, distinguished by their population density (Figure 1). Following Agirrechu (2009, p. 21), Science Cities can be split in two groups of approximately equal size by type of location. The first group is composed of localities situated in urbanized areas (e.g. in the Moscow region) – they “hosted mainly organizations focusing

5The Russian Ministry of Defense started working on a series of publications documenting the activities of its Soviet predecessor until 1960. However, those available so far only cover the period up to 1941. In the future, these might allow historians and economists to better recon- struct the process. 6This is represented by the concept of planning by feel by Gregory and Harrison (2005, p. 751). In their writing: “[p]lanners were supposed to distribute materials according to engineering norms, but the first allocations took place before norms were compiled (Gregory and Markevich, 2002; Gregory, 2004). Supply agencies used intuition, trial and error, and historical experience. According to one supply official: ‘We give 100 units to one branch administration, 90 to another. In the next quarter we’ll do the reverse and see what happens. You see, we do this on the basis of feel; there is no explanation’ (Gregory and Markevich, 2002, pp. 805–06). According to another: ‘Our problem is that we can’t really check orders and are not able to check them. . . We operate partially on the basis of historical material we are supposed to give so and so much in this quarter, and at the same time you are supposed to give us this much.’ (cited by Gregory, 2004, p. 172).” Citations and quotes are by Gregory and Harrison, emphasis is ours.

8 Figure 1: Location of Science Cities and regional population density

Sources: Table B.1 and ROSSTAT. on theoretical research.”7 By contrast, “[c]ities of the second group were located in the most remote areas of the country (although in densely populated regions), far away from large urban centers, highways, industrial facilities, and production fields. The majority of them were surrounded by forests which served as a natural protection from espionage. In these Science Cities, the core enterprises were military-related R&D institutes, design bureaus, pilot plants, and test sites.” This pattern can be interpreted as the consequence of a rule of thumb de- termined by the security-usability trade-off. The specific R&D to be conducted in a perspective Science City would determine the dominant side of the coin; accordingly, Soviet leaders would either choose some remote, highly secretive (but perhaps not too usable) location,8 or an easy to control (but less secluded)

7This group included so-called academic towns – semi-isolated neighborhoods of a larger city, endowed with R&D facilities, housing for R&D staff and their families, as well as basic local infrastructure. Because they were part of a larger city, we exclude them from our analysis (see also section 4). 8This bears similarities to China’s “Third Front” industrialization policy that is examined by Fan and Zou (2021). Military considerations also determined the locations of the “Million Rou- bles Plants” built in China and studied by Heblich et al. (2020).

9 place.9 Agirrechu (2009) also cites other technological and geographical factors that constrained the choice of location and that, again, depended on the specific R&D specialization: heavy industry and nuclear technology need large amounts of water for their operations, therefore Science Cities focused on those areas were typically built close to rivers or lakes; Science Cities devoted to military ship- building and design had to be located on the coast. Science Cities necessitat- ing timely access to production inputs had to be placed closer to transportation links, such as railroads. The locations chosen to host Science Cities were not random; they typically belonged to the more densely populated and urbanized areas of Russia. The quality of our empirical analysis, however, depends on the extent to which the chosen locations embed unobservable factors that made them more (or less) likely to embark on a path of faster demographic and economic development, relative to other places that were located in the same areas and were otherwise observationally identical to the chosen locations at the time of their selection. It is impossible to provide a definitive response to this question with the currently available archival documentation. Both the historical analysis and the anecdo- tal evidence, however, clearly point to a negative answer. Harrison (2018) cites a series of shortcomings of the decision process taken by Soviet leaders under secrecy: the relevant information was limited to a handful of trusted bureau- crats, usually selected on the basis of their loyalty to the régime, rather than on their competence; expert advice was typically absent; decisions were often taken with information limited by other state secrets. In this context, it appears un- likely that choices taken in a planned economy for military and strategic reasons might have been informed by subtle economic factors.10 The anecdotal evidence speaks of very idiosyncratic criteria that often determined the exact locations of

9The majority of Science Cities in the second group – about one third of the total – are located close to Moscow; according to Agirrechu (2009, p. 21), this is so by virtue of their spatial proximity to “the Academy of Science, the All-Union Academy of Agricultural Sciences, the Academy of Medical Sciences, and some institutes subordinate to ministries.” 10In personal correspondence with us, the Emeritus Professor Mark Harrison concluded that the Soviet leaders would choose a Science City location “[a]s a child with a blindfold would pin the tail on the donkey, I would guess.”

10 certain Science Cities; examples include Sarov and Snezhinsk.11

3 Empirical methodology

We compare the long-run outcomes Yi q of municipalities hosting Science Cities with those of other municipalities which were similar in terms of geographical

and socio-economic characteristics Xik in the years following the World War II, when the majority of Science Cities were established. i 1,...,N indexes mu- = nicipalities; q 1,...,Q our long-run outcomes of interest; and k 1,...,K the = = geographical and historical characteristics we control for. For each long-run out- come, we estimate the Average Treatment Effect on the Treated (ATT), with the treatment being the historical establishment of a Science City in a municipality. To this end, we employ matching techniques. The central identifying assumption is motivated by our earlier historical dis- cussion. We assume that the long-run socio-economic outcomes of both Science Cities and their matched counterparts are orthogonal to the treatment, condi- tional on relevant geographical and historical characteristics and on the paired locations belonging to the same (type of) geographical region. To lend credibil- ity to our approach, we include a number of indicators (detailed below) about the military, scientific and economic importance of each municipality in post- war Russia, in addition to matching on cities of equal size and the geographical constraints described by Agirrechu. We take measures to ensure that cities are matched sufficiently close in space, especially in the densely populated parts of

11These two places provide an indicative example of idiosyncratic factors affecting the loca- tion of Science Cities: sometimes, this was determined by the presence of other Science Cities, or lack thereof. Snezhinsk (Chelyabinsk region) was established as a double of Sarov (Nizhny Novgorod region) with the main purpose of keeping the industry working even if one of the two places were destroyed, but also to create inter-City competition. Since Sarov is located in a rela- tively remote location in the European part of Russia, Snezhinsk had to be placed in a similarly out-of-reach area, but to the east of the Urals. Officials reportedly considered other locations in different regions, but ultimately decided on Snezhinsk because of its proximity to another Sci- ence City, Ozyorsk, which could supply inputs to Snezhinsk. This pattern of interplay between decisions affecting different Science Cities was not unique; for example, the four places special- ized in production of enriched uranium were also located far from each other.

11 Russia such as the Moscow region, but not excessively so, to help ensure that our results are not driven by violations of the Stable Unit Treatment Value Assump- tion (SUTVA).12 In particular, we include in the conditioning covariates both geo- graphical coordinates (latitude and longitude) and the density of historical pop- ulation and factories within a radius of 200km, to control for potentially spa- tially correlated local economic conditions. Moreover, we restrict the minimal distance between matches to 50km. Thanks to all these measures, we obtain a matched sample that conforms to our identifying assumptions and allows us to test for possible violation of the SUTVA, as detailed in section 5 (see Online Ap- pendix E for additional methodological discussion). Another concern with our matching approach is that historical characteris- tics observed at a particular moment in time are not sufficient to account for the dynamic development path of different localities: a Science City might look like a matched control municipality at some point in postwar times even if it was already enjoying faster population and economic growth. If pre-trends are likely to continue, they would threaten our causal evaluation, and would moti- vate approaches in the spirit of the “synthetic control matching” (SCM) method, for example by matching on each municipality’s time series of urban population. Applying SCM to our many outcome variables is unworkable; furthermore, it would not be devoid of shortcomings. As illustrated by Ben-Michael et al. (2019), SCM is biased with short panel dimensions; they propose an “augmented” SCM through a covariates-based bias correction or equivalently, the inclusion of both covariates and pre-treatment outcomes in a “hybrid” multivariate matching pro- cedure. Inspired by the latter option, we implement a Mahalanobis matching algorithm in which the vector of covariates includes the observation of all mu- nicipalities’ urban population at several points in the past. More specifically, a Science City s is matched to the ordinary municipality z

12Violations of the SUTVA might occur in both ways. On the one hand, the economic effect of Science Cities could extend beyond their borders, so matching two municipalities too close to each other would lead to downward biased ATT estimates. On the other hand, Science Cities that are too close to matched control cities could drain resources from the latter, leading to upward biases. If the SUTVA does not hold, these two mechanisms could partly offset one another.

12 with the lowest value of the following extended Mahalanobis Distance msz :

T 1 1 m (x ,x ) (x x ) Σ− 2 WΣ− 2 (x x ), sz i s i z = i s − i z i s − i z

where xic is the vector of the K observable covariates for municipality i of type 1 c s,z; Σ− 2 is the inverted Cholesky decomposition of the empirical variance- ∈ covariance matrix of the covariates, Σ, while W is a matrix of weights obtained via a “genetic” algorithm aimed at optimizing covariate balance (Diamond and Sekhon, 2013). Matching is performed with replacement, so that a control mu- nicipality can be linked to multiple treated cities. We impose exact matching on selected dummy variables;13 importantly, we match Science Cities subject to the “closed city” status described in section 2, to non-Science Cities that experienced similar restrictions (typically, these are places hosting military bases but lacking an R&D content). In order to improve on balance over the entire distributions of covariates, we calculate Mahalanobis distances taking the logs of continuous covariates with asymmetric empirical distributions.14 In our main analysis, we match Science Cities s to control municipalities z one-to-one, conservatively accepting a higher variance for our estimates in ex- change for a lower bias. We derive a unique association of treated-control ob- servations which is based on the original set of 84 Science Cities in our dataset. However, most ATT estimates are performed on a subset of this matched sample, either because for some Science Cities the information about outcomes of inter- est is not publicly available, or because we remove the current Naukogrady from the analysis. For all our outcomes, we estimate the ATT with and without the correction for the multiple covariates bias, perform statistical inference by cal- culating standard errors based on conventional formulae (Abadie and Imbens, 2006, 2011) and adjust p-values using Holm’s method to control the family-wise error rate (FWER). Since our coverage of Russian municipalities equals or ap- proximates the universe, we do not apply sampling weights.

13This means that we pair municipalities with the smallest Mahalanobis distance conditional on them having the same observed values of these dummy variables. 14We calculate logs of (X 1), thus x log(X 1). ick + ick = ick +

13 4 Data

We evaluate the long-run effects of Science Cities at the municipal level by em- ploying a unique dataset, which contains information previously unavailable in electronic format. Specifically, it combines: (i) a Science Cities database and (ii) municipal-level data that aggregate various sources of information on historical and current characteristics of Russian cities. Our unit of observation is a Russian municipality;15 in total, our dataset includes 2,338 such municipalities (the two large cities of Moscow and St. Petersburg are excluded). We used GIS software to merge municipal-level and geographical information from different sources. Be- low, we describe our data and the different sources, introducing the municipal- level variables by type. Additional information, sources and references are pro- vided in Appendices B (for the Science Cities database) and C (for the municipal- level information).

List of Science Cities. The Science Cities database is based on various pub- licly available sources. Since Science Cities were established in secret, an official and definitive list does not exist; the extant lists are not exhaustive, having been put together following the dissolution of the USSR. Most of the 95 middle-sized urban centers on our list appear either in Agirrechu (2009) or in the other sources listed in Appendix B. The database contains information on the location of each Science City, the year the locality was founded, the year in which it became a Science City in the Soviet Union (and the year it became a Naukograd in Rus- sia, where applicable), the type of Science City, whether it was a closed city in the past or is still closed now, and its priority areas of specialization (Table B.1 in the Appendix). We manually assign Science City status – our treatment – to each municipality; in total, the data include 88 municipalities with at least one

15In this paper, we use the English term “municipality” to denote the municipal formations (“municipal’nye obrazovaniya”) of Russia, that is, units at the second administrative level (akin to US counties; also called “rayons”). Most urban localities in Russia (places denominated “gorod”) are assigned to separate municipalities. We use the word “region” to refer instead to federal sub- jects (“oblast’”, “kray” or “respublika”), that is, units at the first administrative level.

14 Science City.16 We exclude four Siberian regional capitals that hosted academic towns from our analysis. While these are “archetypical” Science Cities, the aca- demic towns in question were incorporated in the municipalities of the respec- tive regional capitals, and we are unable to collect statistical information that is suburb-specific. Hence, keeping these municipalities in the treatment or in the control group would contaminate either. Ultimately, we end up with 84 munici- palities hosting at least one Science City.

Socio-economic outcomes. In order to measure differentials in the skill level of local inhabitants, we use data from the 2010 Russian census on the overall mu- nicipal population, the share of the population whose highest attained education are graduate degrees, and the share of the population that completed any form of postgraduate education.17 We proxy innovation by the total count of local in- ventor addresses that appear on patent applications filed to the European Patent Office (EPO) with the priority years falling into periods 1978-1999 and 2000-13. Each address is weighted by the inverse of the number of inventors that appear on the relevant patent application; we call this measure (local) fractional patents. We also collect information from the Russian Statistical Office (ROSSTAT) on total employment and per capita wages in two sectors: R&D, ICT and related “high-skilled” services sector, and the less skill-intensive wholesale and retail sector.18 Lastly, since accurate GDP data at the municipal level are unavailable in Russia, we use several proxies for economic activity: the night lights inten- sity (standardized in z-scores) observed by satellites between 2009 and 2011,19

16Our list contains four Science Cities for which only their Soviet-era nomenclature is pub- licly available: Krasnodar-59, Novosibirsk-49, Omsk-5 and Perm-6. Since their exact location is still unclear, we exclude them from the analysis as they cannot be matched to any municipality. In addition, three pairs of Science Cities are located within the same municipalities. Hence, 91 Science Cities are mapped to 88 municipalities with at least one Science City. 17Graduate education in Russia refers to achieving a bachelor’s or master’s degree or their Russian equivalent “specialist,” while postgraduate education refers to academic or professional degrees, academic or professional certificates, academic or professional diplomas, or other qual- ifications for which a graduate education is generally required. 18Note that ROSSTAT data sources of any kind are typically never available for closed cities, due to national security considerations. 19Night lights can be used as a proxy for economic activity under the assumption that lighting

15 as well as the density and labor productivity of local SMEs across all sectors of the economy from the 2010 SME census by ROSSTAT.

Municipal budgets and local public goods. We obtain data on the budgets of Russian municipalities for 2006-16 through ROSSTAT. Once again, the infor- mation is missing for all closed cities in the sample. On the revenue side, we are able to differentiate between direct revenues (for example, from local taxes) and transfers to the municipalities from both the federal and regional govern- ments.20 In addition, we are able to distinguish local expenditures by category, such as education, health care, local infrastracture, and similar. All measures are converted to 2010 prices using ROSSTAT’s official CPI indices and averaged over 2006-16. ROSSTAT also allows us to access data about certain public goods avail- able in Russian municipalities in 2010. Specifically, we calculate the length of lo- cal roads that is lit during the night, as well as the number of museums, theaters and libraries in a municipality: services most amenable to the better educated.

Geographical characteristics. We collect or calculate information about sev- eral geographical characteristics of Russian municipalities: their area, average altitude, as well as average temperatures in January and July. Since locating close to large amounts of water was necessary for Science Cities of certain specializa- tions, we collect data on each municipality’s direct access to the coast or fresh water (major river or lake).

Historical characteristics. We collect information about historical social and economic characteristics that could affect both Science City status and current outcomes. To account for historical differences in city size and population den- sity within a controlled radius, we use population data from the first post-World War II census held in the Soviet Union, conducted in January 1959, which pro- vides figures for all urban and large rural localities of that time.21 To control is a normal good; see Donaldson and Storeygard (2016). 20These transfers do not include direct government grants to R&D facilities and companies lo- cated within a municipality for various purposes, including salaries. Collecting this information across all municipalities in Russia would be close to impossible. 21We match those locations to modern municipalities. Note that the population of small rural

16 for pre-trends in population and economic growth as discussed in section 3, we complement this data with the population of Russian cities in 1897, 1926, and 1939 as reconstructed by Mikhailova (2012a,b) using historical census data from each year. Unfortunately, the 1959 census does not provide a population break- down by educational achievement at the municipal level; to proxy for the pre- existing level of human capital of an urban area we use data on the number of higher education institutions located in a municipality in 1959, as well as the number of local R&D institutes in 1947. We also collect population data from the 1989 census for use in the structural analysis in section 6. To control for the existing level of industrial development in a municipality, we use two pieces of information. The first is the number of the Soviet defense industry plants (factories, research and design establishments) located in each municipality and its surroundings (within 200km) in 1947. The second is the number of local branches of the State Bank of the USSR in 1946, obtained from its archives. This institution was an instrument of the Soviet economic policy; the geographical dispersion of its branches was indicative of an area’s importance for the Soviet development strategies (Bircan and De Haas, 2020). To account for the fact that some Science Cities needed access to good transportation links, while others had to be located in remote areas far from espionage threats, we use GIS data to measure municipalities’ distance from Russian railroads in 194322 and from the post-World War II USSR borders.

Summary statistics. Summary statistics for the outcomes, the geographical and historical characteristics illustrate that Science Cities were typically located in more populous and warmer places, with a higher historical concentration of industrial plants, universities and R&D institutes (see Table D.1 in Online Ap- pendix D). Moreover, mean differences between Science Cities and all other or- settlements is not reported in the 1959 census. We account for this by calculating the residual rural population of each Russian region in 1959, and assigning it to the region’s municipalities proportionally to their area. 22In the Soviet economy, railroads were the workhorse of the transportation network; road transport played only a secondary role (Ambler et al., 1985). Most of the railroads’ construction took place in tsarist Russia.

17 dinary municipalities are positive and statistically significant for most outcomes.

5 Empirical results

Our empirical estimates are obtained from a matched sample consisting of 78 municipalities that include a Science City and 59 matched municipalities which do not host any Science City. In Online Appendix E we provide an additional de- scription of the matched sample, including the analysis of covariate balance and a visual representation on the map of Russia. The matched sample is obtained by imposing exact matching on five dummy variables: closed city status, direct access to the coast, access to rivers or lakes, being a mountainous municipality (with a average altitude higher than 1000km) and has R&D institutes in 1947. Out of 84 Science City municipalities in our dataset, 6 are not matched to any control observations (there are no suitable matches for them under our exact matching constraints) while most control observations are matched to, at most, two Sci- ence Cities (a few more in a couple of cases). In what follows, we discuss our main results as well as a number of addi- tional estimates that explore potential drivers of the main effects. Note that our Average Treatment on the Treated (ATT) effects estimates can be interpreted as causal effects of Science Cities only if the assumed counterfactual scenario is one where the government had not trained extra scientists as part of the Science Cities program, assuming negligible general equilibrium effects. In a counter- factual scenario where the government had trained extra scientists but decided to distribute them uniformly across similar locations, our control municipalities would no longer be an appropriate counterfacutal (we explore alternative coun- terfactual scenarios with the aid of a model in section 6). At the same time, if the USSR had established a much higher number of Science Cities than it actually did, general equilibrium effects would be much harder to neglect.

ATT estimation: All Science Cities. The main estimates of the ATT for our 11 outcomes of interest are reported in Table 1. Science Cities seem to be, on av-

18 erage, more populated than their matched counterparts, by about 26,000-28,000 people. This difference is driven, for the most part, by the more educated seg- ments of the population. The share of inhabitants who attained graduate edu- cation is about 6-7 percentage points higher in Science Cities. Similarly, Science Cities still host more people with postgraduate qualification (by about 0.3 per- centage points). These estimates are substantially smaller than the correspond- ing raw differences in the full sample, but are generally statistically significant at the 1 per cent level (5 per cent for the total population estimates). Among the innovation outcomes, the fractional patent applications estimate is positive and statistically significant (at the 1 per cent level) both before and af- ter 2000. These results indicate that, before 2000, Science Cities have applied to the EPO, on average, for about 3 additional fractional patents; the number rises to around 9 in the period from 2000 onwards.23 ATT estimates are statistically indistinguishable from the raw differences for both patent measures, which is ar- guably because R&D is very spatially concentrated, in Russia and elsewhere. In- deed, ROSSTAT data indicate that high-tech sectors of the economy are more de- veloped in Science Cities, since both measures of employment share and salaries in the combined R&D-ICT sector register differences that are positive and statis- tically significant (at the 1 per cent level). The share of jobs in this sector is 2 percentage points higher in Science Cities, and these jobs pay a 7,500 roubles (in 2010 prices; roughly US$250) higher gross monthly salary. Interestingly, we observe a similar but smaller effect on wholesale-retail sector wages, too. Lastly, we examine our proxies for economic activity. The 2009-11 average standardized night lights indicator registers a statistically significant difference in favor of Science Cities, although it appears much smaller than the raw differ- ence and is only significant at the 5 per cent level in the bias-adjusted case. The difference amounts to about 45-55 per cent of the indicator’s standard deviation (see Table D.1). Raw differences suggest that Science Cities are characterized by a lower SME density (number of SMEs divided by municipal population); the cor-

23We obtain similar results if we use absolute, as opposed to fractional, measures of patent applications, or per capita measures.

19 Table 1: Municipal-level matching, main analysis: all Science Cities

Whole sample Matched sample (1 nearest neighbor)

Outcome Raw difference TC ATT ATT b.a. Γ∗ 35.524*** 27.256** 26.367** Population in 2010 (thousands) 78 59 2.15 (9.934) (10.869) (9.803) 0.113*** 0.067*** 0.059*** Graduate share in 2010 78 59 4.25 (0.009) (0.010) (0.010) 0.003*** 0.003*** 0.003*** Postgraduate share in 2010 78 59 2.80 (0.000) (0.001) (0.001) 2.610*** 3.014*** 2.822*** Frac. patents before 2000 78 59 3.40 (1.002) (0.713) (0.713) 8.197*** 9.260*** 8.608*** Frac. patents since 2000 78 59 3.40 (2.596) (2.160) (2.170) 0.028*** 0.023*** 0.021*** Empl. share in R&D and ICT 67 52 3.50 (0.004) (0.004) (0.004) 9.675*** 7.903*** 7.513*** Avg. salary in R&D and ICT 67 52 3.10 (1.406) (1.349) (1.390) 9.506*** 3.795** 3.014* Avg. salary in wholesale-retail 67 52 1.30 (1.450) (1.407) (1.600) 1.547*** 0.533*** 0.431** Night lights (2009-2011) 78 59 1.70 (0.155) (0.155) (0.151) -3.106*** 0.243 0.757* No. SMEs per 1,000 people 65 50 1.00 (1.082) (1.369) (1.335) 0.813*** 0.492*** 0.439*** SME labor productivity 65 50 2.15 (0.088) (0.104) (0.102)

Notes: ∗, ∗∗ and ∗∗∗ denote significance at the 10, 5, and 1 per cent level, respectively, of statistical tests performed with a FWER adjustment using Holm’s method that groups all the outcomes listed in this table together with the “de- mographic dynamics” outcomes listed in Table 3. Standard errors are reported in parentheses. For raw differences in the whole sample, the standard errors are the denominators of t-statistics of group differences. In the matched sam- ple, T is the number of matched treated observations; C is the number of matched controls; ‘ATT’ and ‘ATT b.a.’ are two estimates of the ATT respectively excluding and including a bias-adjustment term (Abadie and Imbens, 2011). In both cases, standard errors are computed following Abadie and Imbens (2006). Γ∗ is the minimum value of param- eter Γ 1, selected from a grid spaced by intervals of 0.05 length, such that in a sensitivity analysis à la Rosenbaum ≥ (2002) the set of Wilcoxon signed-rank tests associated with Γ∗ do not simultaneously reject the null hypothesis that the outcome variable is not different across the treated and control samples, for tests with α .05 type I error. Empl. = – employment; Frac. – fractional; Avg. – average; No. – number. responding ATT estimates indicate the opposite, but are only significant (at the 10 per cent level) when adjusting for the matching bias. The SME labor produc- tivity ATT estimate is, however, positive and statistically significant at the 1 per cent level. This suggests that the economic effect of Science Cities operates on

20 the intensive (productivity) margin of the local economy. We also perform a sensitivity analysis of our ATT estimates. Following Rosen- baum (2002), we simulate unobserved factors that would affect both the out- comes and the probability that a municipality hosts a Science City. We assess to what extent this would influence our conclusions about the presence of statisti- cally significant differences in Yi q between treated and (matched) control obser- vations, for all outcomes q 1,...,Q. The size of the simulated unobserved factor = is given by parameter Γ 1, which measures the hypothesized odds of receiving ≥ the treatment (Γ 1 is the experimental benchmark). In Table 1 we report, for = each outcome variable, the lower value Γ∗ selected from a grid spaced by inter- vals of 0.05 length that would lead to insignificant Wilcoxon signed-rank tests of 24 the differences between treated and control observations. The values of Γ∗ are quite high (around 3 or more) for the patent applications, the employment share and average salary in R&D and ICT and the graduate and postgraduate shares. They are satisfactorily high (1.7 or higher) for total population, night lights, and SME labor productivity.25 These values are in line with our statistical inference on the estimated ATT parameters, and show that our estimates are robust to pos- sible threats to causal identification. Lastly, the values of Γ∗ are smaller – be- tween 1 and 1.3 – for average salary in wholesale-retail and SME density: the conclusions about these outcomes appear less robust, although this could also be due to a reduced sample size thanks to ROSSTAT’s incomplete coverage.

ATT estimation: Historical Science Cities. Our interpretation of these esti- mates in terms of long-run effects would be threatened if, on average, Science Cities still receive a preferential treatment from the Russian government, in the form of direct or indirect support to local R&D or general purpose expenditure,

24The Wilcoxon signed-rank tests are based on the plain differences between matched pairs for every outcome; we select Γ∗ as the smallest values of Γ in the grid such that any p-value of the test is higher than α .05. Full results of the sensitivity analysis are available on request. = 25Γ 2 indicates a simulated unobserved factor that doubles the probability of receiving treat- = ment relative to that of not receiving it, or vice versa; such a high value of Γ would be realistic only in the presence of very serious threats to our conditional independence assumption. Con- sequently, very high “critical” values of Γ∗ associated with a certain outcome – close to 2 or higher – indicate that the results are likely to be robust to such threats.

21 such as infrastructure. In order to assess to what extent our results depend on current governmental support, we repeat the above analysis, excluding Science Cities with the official status of Naukogrady in today’s Russia. For these 14 Sci- ence Cities, the Russian government has resumed the Soviet-era program in the early 2000s, although with a less military and more civil focus. We call the re- maining Science Cities “historical.”

Table 2: Municipal-level matching, main analysis: “historical” Science Cities

Whole sample Matched sample (1 nearest neighbor)

Outcome Raw difference TC ATT ATT b.a. Γ∗ 38.419*** 33.570*** 33.472*** Population in 2010 (thousands) 64 51 2.10 (10.740) (9.458) (7.696) 0.110*** 0.056*** 0.052*** Graduate share in 2010 64 51 3.45 (0.009) (0.008) (0.009) 0.003*** 0.002*** 0.003*** Postgraduate share in 2010 64 51 2.30 (0.000) (0.000) (0.000) 2.031** 2.369*** 2.229*** Frac. patents before 2000 64 51 2.85 (1.046) (0.638) (0.631) 5.868** 6.664*** 6.243*** Frac. patents since 2000 64 51 2.75 (2.551) (1.874) (1.865) 0.018*** 0.012*** 0.011*** Empl. share in R&D and ICT 53 44 2.45 (0.003) (0.002) (0.002) 8.926*** 7.860*** 7.409*** Avg. salary in R&D and ICT 53 44 2.85 (1.271) (1.144) (1.353) 8.853*** 3.649*** 3.420*** Avg. salary in wholesale-retail 53 44 1.15 (1.646) (1.212) (1.255) 1.313*** 0.494*** 0.428*** Night lights (2009-11) 64 51 1.50 (0.167) (0.140) (0.135) -4.084*** -1.381 -0.366 No. SMEs per 1,000 people (All) 52 43 1.00 (1.117) (1.094) (1.147) 0.751*** 0.453*** 0.388*** SME labor productivity (All) 52 43 2.20 (0.102) (0.083) (0.081)

Notes: See the notes accompanying Table 1. The reported significance levels are based on a FWER adjustment using Holm’s method that groups all the outcomes listed in this table together with the “demographic dynamics” outcomes listed in Table F.2 of the Online Appendix.

Table 2 shows the estimates based on the matched sample restricted to his- torical Science Cities. The estimated ATT is, for most outcomes of interest, very

22 similar to the corresponding estimates in Table 1. Statistical inferences and sen- sitivity analyses à la Rosenbaum generally confirm this assessment. The removal of Naukogrady results in a dramatic change of the estimated effects for patenting activity and employment share in R&D and ICT. The measure relative to pre-2000 patent applications is about 20 per cent smaller than the initial estimates in Ta- ble 1, while the measure relative to later patent applications is about 28 per cent smaller. Employment share in R&D and ICT is about one percentage point lower than the initial estimate - this is substantial, considering that the raw mean em- ployment share in R&D and ICT is 3.6 per cent in Science Cities. Nevertheless, all estimates remain significant at the 1 per cent level and robust, as evidenced by the values of Γ∗ from the sensitivity analysis. While the difference in the drop in patent applications before and after 2000 is not large, it suggests that current government support is an important stim- ulant for patenting activity in the Russian context, where innovation is still pre- dominantly driven by the government sector. This is underlined by the estimate for the employment share in R&D and ICT. In either case, we keep observing a positive differential in favor of historical Science Cities for most demographic and economic outcomes of interest. Such differentials are even more surprising as they are independent of the extent to which the government currently sup- ports local R&D, and thus can only be interpreted as long-run effects. Our initial interpretation of the results is, if anything, reinforced by this restricted analysis.

ATT estimation: Demographic dynamics. It is possible that our results are merely transient. In our conceptual framework (section 6) we postulate the ex- istence of “persistence forces,” independent of other endogenous mechanisms, that induce path-dependence from the Soviet-era allocation of the labor force. In the real world, however, workers are slowly replaced by new generations of younger workers. If the latter are increasingly less anchored to prior locations, spatial equilibrium can over time lead to mean reversion – even in the presence of agglomeration forces, thanks to the action of random shocks. In such a sce- nario, our results could not be interpreted as true long-run effects, but rather as

23 snapshots of a long transition back to a steady state. We investigate the possibility that the advantage of Science Cities wanes over time by exploiting information on the number of residents in each municipal- ity by type of attained education within each cohort of birth. This lets us assess the extent to which our results on urban educational levels are driven mainly by older cohorts. We split the population of each municipality observed in the 2010 census into the “young” and the “old”, using two different threshold years of birth: 1965 for the graduate share and 1955 for the postgraduate share (tak- ing into account the fact that in Russia, postgraduate education is characterized by a longer average duration). At the time of the dissolution of the USSR (1991- 1992), the older individuals in the “young” group who had obtained a univer- sity degree were starting their professional careers and presumably could move more easily. Furthermore, those who were underage at the time of the transition might have pursued less education than their ancestors (mean reversion). Both factors would predict a more equal distribution of young graduates between Sci- ence Cities and their matched counterparts.

Table 3: Municipal-level matching, demographic dynamics: all Science Cities

Whole sample Matched sample (1 nearest neighbor)

Outcome Raw difference TC ATT ATT b.a. Γ∗ 0.122*** 0.079*** 0.071*** Graduate share: b.y. 1965 78 59 4.40 ≤ (0.010) (0.012) (0.012) 0.107*** 0.056*** 0.046*** Graduate share: b.y. 1965 78 59 3.10 > (0.008) (0.010) (0.010) 0.003*** 0.003*** 0.003*** Postgraduate share: b.y. 1955 78 59 2.80 ≤ (0.001) (0.001) (0.001) 0.003*** 0.003*** 0.003*** Postgraduate share: b.y. 1955 78 59 2.25 > (0.000) (0.001) (0.001)

Notes: See the notes accompanying Table 1. The reported significance levels are based on a FWER adjustment using Holm’s method that groups all the outcomes listed in this table together with the main outcomes listed in Table 1. b.y. – birth year.

Using our matched sample, we estimate the Science Cities ATT on the grad- uate population share separately for the “old” and “young” groups. The results

24 in Table 3 show that while the differences are indeed larger for the older group, they are positive and statistically significant for the younger one as well, albeit amounting to about 65 per cent of the former. The estimates are qualitatively similar for the postgraduate share of the population26 and are not sensitive to the choice of the threshold. Thus, this analysis provides no evidence in favor of the mean reversion hypothesis: it appears that the children of Soviet inhabitants of Science Cities pursue educational and locational choices that are largely sim- ilar, albeit not identical, to those of their ancestors. Estimates for the restricted “historical” subsample (see Online Appendix F) support these conclusions.

ATT estimation: Economic dynamics. A logical next step is to assess mean reversion in economic outcomes. If the relative skill levels of Science Cities and comparable municipalities are equalized over time, we would expect economic convergence as well. Our data do not allow us to track the evolution of our prox- ies of economic activity over the post-transition years, with the exception of the night lights measures. Figure 2 depicts averages of the standardized night lights indicator, separately for Science Cities and their matched controls, for every year from 1992 to 2010. While the two groups share similar annual fluctuations, Sci- ence Cities appear to constantly outperform their counterparts, with hardly any catch-up by the control group. To check that this pattern is not due to an extreme path dependence of some random unobserved factors that are not explained by Science City status, we perform formal regression-based tests, allowing for tem- poral persistence in the unobservable factors driving each municipality’s night light measures (see Online Appendix F).

ATT estimation: Municipal budgets and local public goods. We now turn our attention to the analysis of municipal budgets of Science Cities. Its objective is twofold. First, we directly test whether Science Cities, be they historical or cur-

26We observe a secular increase in the attainment of postgraduate education in Russia follow- ing the transition, which is opposite to the general trend observed for tertiary education. Across all municipalities, the unweighted average share of graduates in the old group is about 12.5 per cent, compared with about 11 per cent among the young (24.5 versus 21.5 per cent in Science Cities). Conversely, the postgraduate share is 0.15 per cent in the old group and 0.33 per cent in the young group (0.50 versus 0.63 per cent in Science Cities).

25 Figure 2: Time series of the average standardized night lights indicators, 1992-2010

140 Treated Control 120

100

80 Average std. night lights

1992 1998 2004 2010 Year rent Naukogrady, receive a differential amount of direct governmental transfers or local tax earnings (itself a function of local economic activity). In addition, we see this as an opportunity to test the hypothesis by von Ehrlich and Seidel (2018). They explain their results by the persistence of municipal spending in certain, presumably productivity-enhancing, infrastructure, rather than by agglomera- tion forces. A parallel mechanism could be at work in our setting: for example, since Science Cities used to be inhabited by relatively more university graduates than other similar localities, their population might have kept a stronger prefer- ence for the provision of certain public goods, such as those related to education or even to local physical infrastructure, whose returns are deferred in time. Russian municipalities collect their resources from both local taxes (property taxes, merchant fees, fees for the provision of local services) and from a portion of federal taxes (income tax, business tax and similar) that are paid by local resi- dents. In addition, municipalities receive discretionary transfers from the federal and the regional governments. In our data, we can identify the source of munic- ipal revenues as well as the allocation of expenditures by category (education, health services, local infrastructure and so on) for all Russian municipalities, ex- cept closed cities. Our measures of interest are 2006-16 averages of selected bud-

26 get items for each municipality, divided by the 2010 municipal population. We estimate the Science City ATT for each of these per capita measures, comparing the fiscal and expenditure patterns of Science Cities with those of their matched counterparts.27

Table 4: Municipal-level matching, budgets analysis: all Science Cities

Whole sample Matched sample (1 nearest neighbor)

Outcome Raw difference TC ATT ATT b.a. Γ∗ -5.271*** -0.283 -1.183 Total revenues per capita 64 51 1.00 (1.375) (0.865) (1.077) 3.467*** 1.986*** 1.520** Tax income per capita 64 51 1.95 (0.726) (0.473) (0.527) -8.738*** -2.269*** -2.703** All transfers per capita 64 51 1.00 (0.869) (0.688) (0.850) -5.158*** 0.236 -0.340 Total expenditures per capita 64 51 1.00 (1.357) (0.871) (1.063)

Notes: See the notes accompanying Table 1. The reported significance levels are based on a FWER adjustment using Holm’s method that groups all the outcomes listed in this table together with the detailed budget entry outcomes and those about local public goods that are listed in Table F.1of the Online Appendix.

Table 4 summarizes the key estimates. In raw differences, Science Cities col- lect more taxes per capita than ordinary municipalities; however, they receive disproportionately lower total transfers per capita. As a result, both their total revenues and expenditures per capita are smaller. This is only partly mitigated by the fact that Science Cities obtain higher earnings from local taxes. In the matched sample, the ATT estimates for average revenues and expenditures are equal to zero, those for average tax income are positive and statistically signifi- cant – as one would expect if Science Cities were indeed richer – and those for average transfers are negative and also statistically significant, with and without bias adjustment. The values of Γ∗ are equal to 1 for all outcomes except tax in- come, in which case it is close to 2.

27In this part, we control for the FWER within a family of variables that comprises both munic- ipal budget and local public goods. We keep this family of “potential mechanisms” separate from the main outcome variables in Table 1, because they are meant to address different hypotheses (we could find differences in the potential mechanisms and not in the main outcomes, or vice versa). Inferences are not fundamentally altered if we group all outcomes in one family.

27 Our interpretation of these results is based on our understanding of the insti- tutional context: we argue that political economy mechanisms operate to redis- tribute federal resources in order to achieve approximately similar levels of gov- ernmental expenditures per capita across Russia. Since Science Cities are typ- ically richer and thus obtain higher local taxes, this often results in lower total transfers in their favor. Governmental support for Science Cities may also exist in the form of direct expenditures appearing only in the federal budget: unfor- tunately, such data are not available to us. Yet, if Science Cities were still strate- gically important for the federal government, we would expect – if anything – to observe less of a symmetry between revenues and transfers per capita. In other words, the government may want to complement direct intervention with more indirect subsidies. We do not observe this in the data. To test the vES hypothesis in more detail, we examine whether Science Cities still differ from their matched localities in terms of per capita expenditures on a number of entries of their municipal budget. We find no evidence of differential patterns in any of the budget sub-categories we examine; none of the ATT esti- mates are statistically significant and the corresponding values of Γ∗ are equal to 1. In addition, we test if Science Cities differ from their matches in terms of the existing stock of selected public goods, like libraries, museums, theaters (fa- cilities that are more likely to attract the better educated) or the shares of streets lit during the night. Once again, we find no robust differences. The detailed re- sults can be found in Online Appendix F, which also reports the results from the analysis of municipal budgets restricted to the subsample of “historical” Science Cities – an exercise that leaves our main conclusions unchanged.

Robustness: Effects by Science City categories. Lastly, we explore whether our results are driven by specific features of Science Cities or of the matched con- trols. In particular, we consider the following categorizations. Reminiscent of the security-usability trade-off discussed in section 2, we first inquire whether Science Cities originally serving top secret military purposes – those in avia- tion/rocket or in nuclear science – have embarked on a different pattern of eco-

28 nomic development thanks to, say, more robust past investments. Second, we look at Science Cities that were built from scratch vis-à-vis those created in pre- existing settlements; the pattern of demographic and economic development might have been different in the two groups.28 Third, we examine whether the long-run outcomes of the Science Cities identified by Agirrechu (2009), whose list we consider incomplete and best amended through other sources, are in any way different from those of the remaining cities in our list.29 Fourth, we provide a test about possible violation of the SUTVA by splitting the matched sample ac- cording to the presence of other Science Cities within a radius of 50km from the matched control, or lack thereof.30 Finally, since the creation of Science Cities was staggered in the postwar decades, we analyze whether the timing of their creation has any bearing, using 1950 or 1960 as cutoff dates. To provide an assessment about differences between the cited categories, we perform t-tests of a difference in the treated-control means in the matched sam- ple. Given the small size of some of the subsamples in question, we restrict our attention to key outcomes for which we have complete information for all mu- nicipalities: the post-2000 fractional patent applications, the night lights mea- sure averaged over 2009-11, the 2010 population, and the share of that popula- tion with at least a graduate degree. Table 5 summarizes the results of these t- tests, for each category-outcome combination, showing the t-statistic value and adjusted p-value controlling for the false discovery rate (FDR) for the outcomes in each category, using Hochberg’s method.31

28For example, Science Cities built from scratch may have initially grown faster, but have stag- nated following the transition in case they had been placed in inconvenient locations. Note that Science Cities built from scratch were newly urbanized settlements located in a municipality which included other, possibly sparsely populated, settlements; under our matching approach, these Science Cities are paired to municipalities with similar characteristics and pre-trends. 29The original list by Agirrechu (2009) includes 75 Science Cities out of our total of 95. 30Our matched sample is constructed by imposing a constraint of a 50km minimal distance between treated and control cities. Therefore, under the hypothesis that interference only affects control cities and it does not affect places farther than 50km, SUTVA violation is not a concern for matched pairs where no other Science Cities are too close to the control municipality (39 out of 78 in our matched sample). See Online Appendix F for additional discussion. 31We control for the FDR rather than FWER to make our tests more powerful at detecting true positives.

29 Table 5: Robustness t-tests: differences between types of Science Cities

Fr. patents Night Total Graduate Criterion since 2000 lights population share -0.007 1.861 2.235 1.445 Secrecy vs. usability [0.995] [0.136] [0.125] [0.205] 0.178 -0.560 -0.520 -0.654 Built from scratch vs. others [0.859] [0.808] [0.808] [0.808] -1.267 0.636 1.980 -0.075 Agirrechu’s list vs. others [0.425] [0.706] [0.213] [0.941] 1.261 -0.715 -0.697 -0.044 Other close Science Cities, vs. none [0.651] [0.651] [0.651] [0.965] 1.495 -0.088 0.677 -0.187 Founded before 1950, vs. later [0.568] [0.931] [0.931] [0.931] 2.649 0.173 0.711 0.312 Founded before 1960, vs. later [0.042] [0.864] [0.864] [0.864]

Notes: For every combination given by a Science City category (rows) and an outcome variable (columns) the ta- ble displays the t-statistic of the average treated-control mean differences in the outcome between the two parts of the matched sample, and (in brackets) the corresponding adjusted p-value that controls for the FDR across all four outcomes for each criterion, using Hochberg’s method. Online Appendix G provides a visual representation of the differences being tested. Fr. – fractional.

Only one null hypothesis can be rejected at conventional levels when con- trolling for the FDR: the raw mean difference in the fractional patent applica- tions since 2000 between Science Cities founded before 1960 and those founded in 1960 or later. Nine out of 14 current Naukogrady were founded before 1960, and as we discussed earlier, they have had a higher patenting activity since 2000 thanks to current government support. Overall, we conclude that the specific categories of Science Cities we examined and issues of SUTVA violation are un- likely to drive our main results.

6 Model and mechanisms

To enhance the interpretation of the long-run effect of Science Cities we esti- mate a spatial equilibrium model suited to our particular setting. We adapt a simplified version of the framework by Allen and Donaldson (2020), while intro-

30 ducing workers of two different skill types. We also draw inspiration from Moretti (2011, 2013), Glaeser and Gottlieb (2008, 2009), and prior tradition (Rosen, 1979; Roback, 1982). We first describe the model and its equilibrium properties and then discuss the resulting estimates and their implications. A detailed analysis of the model and its extensions can be found in Online Appendix H.

Model Setup. Consider a set C of Russian cities, of dimension C and indexed as c 1,...,C. These cities are inhabited by two different types of workers: those = of high skill, and those of relatively lower skill. This classification is convention- ally interpreted in terms of differences in the educational attainment. In our con- text, highly skilled workers can be identified as researchers engaged in R&D – a subset of all the university-educated – while low-skilled workers represent all other workers in the remaining sectors. The model allows for both interpreta- tions. We denote the mass of highly skilled workers employed in city c at time t

as Hct , while Lct is the corresponding notation for low-skilled workers. We use lower-case letters, respectively hct and `ct , to denote their logarithms. At time t 0, all cities are part of the Soviet Union which, for idiosyncratic = reasons, allocates the values of H and L to each city. At time t 1, all cities ct ct = are part of modern Russia, a market economy, and workers of both types self-

select into either location. We express the logarithmic indirect utility unic of an individual i of type n h,` living in city c, as a linear function of wages, local ∈ characteristics such as amenities, past and present population of either skill type, and idiosyncratic preferences:

un w n βn` βn` δnh δnh an en , (1) ic = c + 0 c0 + 1 c1 + 0 c0 + 1 c1 + c + ic where w n is the log-wage earned by workers of type n in city c at t 1; an rep- c = c resents factors specific to city c (such as amenities or public goods) that make it a more enjoyable location, and whose specific value might vary by skill group; n and eic denotes the idiosyncratic taste of individual i for city c - a random shock assumed to follow a Type I Extreme Value (Gumbel) distribution with zero loca- tion parameter and scale parameter equal to σn. In this expression, parameters

31 β1 and δ1 represent current population externalities, which could be negative if

“divergence” effects dominate. Instead, β0 and δ0 express the strength of those “persistence” forces that make a location desirable by virtue of its past popula- tion.32 If σh δh 0 and σ` β` 0, the model generates positive labor supply − 1 > − 1 > elasticities at the city pair level. To close the model, we introduce two types of firm: one that employs highly skilled workers, and one that relies on lower skilled workers.33 The log-output y of type n h,` firms in city c follows a Cobb-Douglas technology: nc =

h h h h h yc gc θ hc ϕ `c λhc (1 λ)kc = + + + + − (2) y` g ` θ`h ϕ`` λ` (1 λ)k`, c = c + c + c + c + − c

n n where gc is the city- and type-specific total factor productivity, and kc is the log- capital employed by the firms of type n in city c. The supply of capital is in- finitely elastic and its cost is the same for all firms in the two cities s and z.34 For simplicity, the elasticity of labor is equal to λ (0,1) for both types of firm in ∈ both cities. Firms of type ` do not hire workers of type h, but take hc as given; h ` vice versa, firms of type h take `c as given. The parameters θ and θ repre- sent agglomeration forces, such as knowledge spillovers, that stem from a higher high-skilled population and affect firms employing high- and low-skilled work- ers, respectively. Symmetrically, ϕh and ϕ` represent agglomeration forces due to a higher low-skilled population.35 This specification nests a more restricted

32Typical examples of divergence forces are the “congestion” effects due to traffic or pollution, and higher rents due to a higher demand for housing. Persistence forces can be micro-founded on better housing stocks and/or wider product variety that are due to a higher past population. 33In Moretti (2011), this was largely a simplification meant to abstract from the degree of sub- stitutability between skills. This characteristic of the model can be given a contextual interpreta- tion here: if workers of type h are researchers, type h firms correspond to the R&D sector, while type ` firms represent the rest of the local economy. 34See the discussion in Online Appendix H about the realism of this hypothesis and the impli- cations of its violation. 35While Allen and Donaldson (2020) do not distinguish agglomeration forces due to different skill types, they include agglomeration forces that depend upon past population (although they find these to be very small in their empirical analysis). We omit them as we are unable to identify the implied additional paramaters with the data available to us.

32 one where ϑn θn ϕn for n h,`, in which case, agglomeration effects are a ≡ = − = function of the share of high-skilled workers in city c, as in Moretti (2004).

Spatial Equilibrium. The spatial equilibrium is most conveniently expressed as the log-difference in selected outcomes between any two cities s and z in C . The key equilibrium equations are the following, for n h,`: =

(n n ) ¡pn pn¢ µnh (h h ) µn` (` ` ) (3) s1 − z1 = s − z + s0 − z0 + s0 − z0 n hh n `h n h` n `` ¡ n n¢ ¡ n n¢ θ µ ϕ µ θ µ ϕ µ w w r r + (hs0 hz0) + (`s0 `z0), s − z = s − z + λ − + λ − (4) where pn, and r n are functions of g n and an for n h,`, while the parameters c c c c = nn expressed as µ 0 – where n0 h,` – are functions of all the parameters listed in = (1) and (2), including σh,σ`. We call these parameters “history multipliers:” they express the equilibrium elasticities of present (t 1) population of type n on the = past (t 0) population of type n0. As such, they enclose all the interaction effects = of agglomeration and divergence forces that lead to the spatial dispersion of pop- ulation and economic activity. As detailed in the Online Appendix, to achieve a non-degenerate equilibrium it is necessary that none of these forces is too preva- lent, implying some restrictions on the values of the primitive parameters.

n n Estimation of the model. We augment pc and rc by making them both func- tions of observable city characteristics (like local expenditures in public goods, or amenities) and error terms. Hence, the four equations given by (3) and (4) for n h,` are reshaped as the reduced forms of two pairs of triangular equations = with two endogenous right-hand side variables (the differences in t 1 popula- = tions for both skill types) and two exogenous ones (the differences in t 0). We = estimate the model on our matched sample, so that s denotes a Science City and z its matched counterpart. This aids identification: under our maintained hy-

potheses, within the matched sample the Soviet-era allocation of hc and `c for c s,z is orthogonal to secular determinants of socio-economic outcomes. Note = that the municipal-level population data from 1989 that we collected cannot be

33 distinguished by level of education. To obviate this problem, we adopt two com- plementary approaches. In the first, we define “past” and “present” populations as the individuals from the 2010 census who were born before and since 1965, respectively (Approach 1). In the second, we proxy the two exogenous variables by the fitted values obtained from regressing 1989 log-population on the “old” 2010 log-population of either skill type (Approach 2). The assumption implicit in both approaches is that older individuals are immobile. Lastly, since λ is not identified in our model, we calibrate it as λ 0.66. = The results are displayed in Table 6. The two approaches deliver very sim- ilar estimates. In the baseline estimates in columns 1 and 3, the high-to-high spillover elasticity θh is estimated at around 0.27 and is significant at the one per cent level; the estimates for the other parameters are less robust. The baseline estimates cannot reject the restriction that ϑn θn ϕn for n h,`; conse- ≡ = − = quently, in columns 2 and 4 we also show estimates that incorporate this restric- tion. Under both approaches, the estimates of parameter ϑh – the elasticity of high-skilled wages on the share of high-skilled workers in a city – are around 0.33 and statistically significant. The corresponding estimates ϑ` for the low-skilled are smaller in magnitude and less precisely estimated. The estimates of the his- tory multipliers are significant, with the exception of the high-on-low elasticity µh`. This suggests that while the sizes of both skill groups depend on their past values, a higher high-skilled population attracts more low-skilled workers in the future, but the converse is not true.

Extrapolations. The estimates from the model help to enhance the interpre- tation of our main results from section 5 in two major ways. First, they allow us to evaluate the relative effects of alternative counterfactual allocations of labor from t 0 on two pairs of matched cities s and z. In particular, given an alloca- = tion (h ,h ,` ,` ), one can calculate the following quantity, for n h,`: s0 z0 s0 z0 =

n ¡¡ n ¢ ¡ n ¢¢ W exp w n w n (5) sz = s + s1 − z + z1 which represents the ratio between the total wages of type n paid in city s and

34 Table 6: Estimation of the spatial equilibrium model

Approach 1 Approach 2 Parameter (1) (2) (3) (4) θh (ϑh): high-to-high spillovers 0.269*** 0.335*** 0.249*** 0.323*** (0.061) (0.059) (0.053) (0.063)

ϕh: high-to-low spillovers -0.211* -0.191* (0.087) (0.079)

θ` (ϑ`): high-to-low spillovers 0.155* 0.175* 0.142* 0.165 (0.077) (0.089) (0.070) (0.092)

ϕ`: low-to-low spillovers -0.1364 -0.1237 (0.1043) (0.0961)

µhh: history: high-on-high 0.976*** 0.942*** 1.683*** 1.609*** (0.031) (0.044) (0.029) (0.074)

µh`: history: high-on-low 0.056 0.123 0.031 0.127 (0.051) (0.079) (0.036) (0.094)

µ`h: history: low-on-high 0.110*** 0.145*** 0.108*** 0.181* (0.018) (0.042) (0.020) (0.075)

µ``: history: low-on-low 0.902*** 0.835*** 1.212*** 1.116*** (0.025) (0.068) (0.021) (0.090) Spillover constraints test: p-value 0.378 0.385 Full set of covariates YES YES YES YES Constrained model NO YES NO YES No. of observations 55 55 55 55

Notes: *, ** and *** denote significance at the 10, 5, and 1 per cent level, respectively. The table displays estimates of the model expressed by equations (3) and (4) for n h,`, where g n and an are expressed as linear functions of a set = c c of city characteristics (“covariates”) and an error term as discussed in Online Appendix H, and for both approaches described in the text. Estimation is performed on the matched sample via one-step GMM using an identity weighting matrix. In constrained models, the parameter restrictions ϑn θn ϕn for n h,` are applied; in unconstrained ≡ = − = models, the p-value from a joint test of these restrictions (“spillover constraints test”) is reported. Standard errors are clustered by grouping matched pairs that share the same control city; they are reported in parentheses.

n those paid in city z. To substantiate, we provide values of Wsz for both skill types, calculated using the estimates from column 4 of Table 6, for six alternative allo- cations. The results from this exercise are displayed in Table 7. Column 1 represents a baseline scenario where Science Cities had never been established: H 14 and N 68 (in thousands of inhabitants) for c s,z. Col- c0 = c0 = =

35 Table 7: Wage ratios defined in equation (5) for alternative allocations at t 0 =

(1) (2) (3) (4) (5) (6)

h Wsz 1.000 1.606 1.131 2.223 0.974 1.442 ` Wsz 1.000 1.183 1.098 1.328 1.112 1.827

(Hs0,Ls0) (14,68) (22,68) (18,68) (30,68) (14,76) (22,88) (Hz0,Lz0) (14,68) (14,68) (14,68) (14,68) (14,68) (14,56) h ` Notes: In each column, the table provides the values of Wsz and Wsz that are predicted by the estimates from column 4 of Table 6 given the allocation (Hs0, Hz0,Ls0,Lz0), expressed in thousands of inhabitants, specified at the bottom. See Online Appendix H for the formulae used to produce these values. umn 2 corresponds to a representative allocation of high-skilled workers from other, non-matched locations to Science Cities, with Hsz increasing to 22 relative 36 n to column 1. A way to interpret the predicted values of Wsz is by taking “ratios of ratios:” for example, the total wages paid to high-skilled workers in a typical Sci- ence City (column 2) are about 60 per cent higher relative to the non-intervention scenario (column 1), which is close to the actual difference that we observe in the matched sample. One can perform similar exercises using other figures from Ta- ble 7.37 Note that these calculations cannot reveal the loss that other Russian cities might have incurred relative to the counterfactual. To provide a compre- hensive answer about the overall welfare effects of such a policy it is necessary to develop and estimate a more extensive model, such as Rossi-Hansberg et al. (2020). They study a hypothetical policy intervention in the U.S. that focuses on concentrating cognitive non-routine occupations in cognitive hubs and argue that it would improve welfare by optimally harnessing the externalities. Assess- ing the welfare effects of an actually tried policy that is similar in spirit, such as the Science Cities evaluated in this paper, would be a good direction for future work.

36We choose these numbers because they are roughly consistent with the average values of the 1989 population in Russia (90 in Science Cities and 82 in their matched controls), with a graduate share among old generations of about 0.24 in Science Cities and of 0.16 in their matched controls h ` (as seen in the data), and with the actually observed values of Wsz and Wsz in the 2010 data. 37For example, column 3 predicts the consequences of an intervention of half the size relative to the baseline; column 4, of double size; column 5 describes an alternative intervention of in- creasing the allocation of low-skilled workers, rather than high-skilled ones, to city s; and finally column 6 describes an allocation where the s-z skill type ratios are identical.

36 Second, the parameters of the model can be used to calculate the subsidy tN∗ (expressed in units of log-wage) that the government should provide as an incen- tive to workers of type N H,L for them to move to some city s so as to reproduce = ex-post the population ratio N ∗ N /N , where s and z are two ex ante identical = s z cities. This value is lower than one implied by the city-level labor supply elastic- ity, because the effect of a relatively small incentive is amplified in equilibrium through agglomeration economies. However, the parameter estimates from Ta- ble 6 are not enough to perform this calculation: it is also necessary to know the parameters βn, δn and σn, for t 0,1 and n h,`. In Online Appendix H we dis- t t = = cuss an attempt to use external information – drawn from the literature – to cal- culate the subsidy tH∗ that the Russian government should provide to reproduce the relative allocation typical of Soviet Science Cities: (h h ) log(22/14), in 0s − 0z = today’s market economy. In summary, we calculate that the government should provide a subsidy equal to about 150 per cent of the ex-ante wage; this figure is based on the city-level labor supply elasticities of about 0.26. In future work we plan to obtain elasticity estimates based on cross-municipality migration data; this would allow us to provide more accurate values for the subsidy tN∗ .

7 Conclusion

In this paper we have analyzed the long-run effects of a unique historical place- based policy: the creation of R&D-focused Science Cities in Soviet Russia. Both the initial establishment and the eventual suspension of this program was largely guided by political factors that are arguably exogenous to drivers of current so- cial and economic conditions of Russian cities. We compare Science Cities with other localities that were observationally similar to them at the time of their se- lection, and we compute differences in the current characteristics between the two groups. We find that former Science Cities are bigger today and they host a higher number of well-educated individuals. Moreover, they file a higher num- ber of patent applications at the EPO; their R&D and ICT sectors are more devel-

37 oped, and pay higher salaries. Lastly, Science Cities host more productive small businesses in the services sector. Because our results are largely unchanged after we exclude Science Cities that currently receive support from the Russian government, we conjecture that they are driven by the interaction between persistence (path-dependence) and agglomeration forces. In more mundane terms, highly skilled individuals who have remained in their former cities of residence have contributed to the emer- gence of more productive businesses in the new market economy. This insight is substantiated by the estimates of agglomeration elasticities based on a spatial equilibrium model. We rule out alternative explanations that have to do with dif- ferential governmental transfers and find no support for the hypothesis of rapid mean reversion of socio-economic outcomes. Our contribution extends previous findings about long-run effects of place- based policies to a unique historical program that focused on human capital and R&D. More generally, our results are also informative for science and innovation policy, both in the context of emerging economies such as Russia and in those of traditionally capitalist countries. We hope that these results will be invoked to motivate similar R&D policies but with a civil, instead of military, purpose.

References

Abadie, Alberto and Guido W. Imbens (2006) “Large sample properties of match- ing estimators for average treatment effects,” Econometrica, Vol. 74, No. 1, pp. 235–267.

(2011) “Bias-corrected matching estimators for average treatment ef- fects,” Journal of Business and Economic Statistics, Vol. 29, No. 1, pp. 1–11.

Agirrechu, Aleksandr A. (2009) Russian science cities: History of formation and development (Naukogradi Rossiyi: Istoriya formirovaniya i razvitiya), Moscow: Moscow University Press. Available in Russian only.

Allen, Treb and Dave Donaldson (2020) “Persistence and Path Dependence in the Spatial Economy.” NBER Working Paper no. 28059.

38 Ambler, John, Denis J. B. Shaw, and Leslie Symons eds. (1985) Soviet and East European transport problems, London: Croom Helm.

Audretsch, David B. and Maryann P.Feldman (1996) “R&D spillovers and the ge- ography of innovation and production,” American Economic Review, Vol. 86, No. 3, pp. 630–640.

Ben-Michael, Eli, Avi Feller, and Jesse Rothstein (2019) “The Augmented Syn- thetic Control Method.” mimeo.

Bircan, Ça˘gatay and Ralph De Haas (2020) “The limits of lending? Banks and technology adoption across Russia,” The Review of Financial Studies, Vol. 33, No. 2, pp. 536–609.

Bloom, Nicholas, Mark Schankerman, and John Van Reenen (2013) “Identifying technology spillovers and product market rivalry,” Econometrica, Vol. 81, No. 4, pp. 1347–1393.

Cheremukhin, Anton, Mikhail Golosov, Sergei Guriev, and Aleh Tsyvinski (2017) “The Industrialization and Economic Development of Russia through the Lens of a Neoclassical Growth Model,” Review of Economic Studies, Vol. 84, No. 2, pp. 613–649.

Cooper, Julian M. (2012) “Science-technology policy and innovation in the USSR,” slides, CEELBAS Workshop "Russia’s Skolkovo in Comparative and His- torical Perspective", held on 12 June 2012 at UCL SSEES.

Diamond, Alexis and Jasjeet S. Sekhon (2013) “Genetic Matching for Estimating Causal Effects: A General Multivariate Matching Method for Achieving Balance in Observational Studies,” The Review of Economics and Statistics, Vol. 95, No. 3, pp. 932–945.

Donaldson, Dave and Adam Storeygard (2016) “Thew view from above: Applica- tions of satellite data in economics,” Journal of Economic Perspectives, Vol. 30, No. 4, pp. 171–198. von Ehrlich, Maximilian and Tobias Seidel (2018) “The persistent effects of place- based policy: Evidence from the West-German Zonenrandgebiet,” American Economic Journal: Economic Policy, Vol. 10, No. 4, pp. 344–374.

Ellison, Glenn, Edward L. Glaeser, and William R. Kerr (2010) “What causes in- dustry agglomeration? Evidence from coagglomeration patterns,” American Economic Review, Vol. 100, No. 3, pp. 1195–1213.

39 Fan, Jingting and Ben Zou (2021) “Industrialization from scratch: The “Construc- tion of Third Front” and local economic development in China’s hinterland.” Mimeo.

Ganguli, Ina (2015) “Immigration & ideas: What did Russian scientists ’bring’ to the US?” The Journal of Labor Economics, Vol. 33, No. S1, pp. S257–S288. Part 2, July.

Glaeser, Edward L., Hedi D. Kallal, José A. Scheinkman, and Andrei Schleifer (1992) “Growth in cities,” Journal of Political Economy, Vol. 100, No. 6, pp. 1126–1152.

Glaeser, Edward L. and Joshua D. Gottlieb (2008) “The economics of place- making policies,” Brookings Papers on Economic Activity, pp. 155–239, Spring.

(2009) “The wealth of cities: Agglomeration economies and spatial equi- librium in the United States,” Journal of Economic Literature, Vol. 74, No. 4, pp. 983–1028.

Greenstone, Michael, Richard Hornbeck, and Enrico Moretti (2010) “Identify- ing agglomeration spillovers: Evidence from winners and losers of large plant openings,” Journal of Political Economy, Vol. 118, No. 3, pp. 536–598.

Gregory, Paul R. (2004) The Political Economy of : Evidence from the Soviet Secret Archives: Cambridge University Press.

Gregory, Paul R. and Mark Harrison (2005) “Allocation under Dictatorship: Re- search in Stalin’s Archives,” Journal of Economic Literature, Vol. 43, No. 3, pp. 721–761.

Gregory, Paul R. and Andrei Markevich (2002) “Creating Soviet Industry: The House That Stalin Built,” Slavic Review, Vol. 61, No. 4, pp. 787–814.

Harrison, Mark (2018) “Secrecy and State Capacity: A Look Behind the Iron Cur- tain.” mimeo.

Heblich, Stephan, Marlon Seror, Hao Xu, and Yanos Zylberbeg (2020) “Indus- trial Clusters in the Long Run: Evidence from Million-Rouble Plants in China.” mimeo.

Ivanov, Denis S. (2016) “Human capital and knowledge-intensive industries lo- cation: Evidence from Soviet legacy in Russia,” Journal of Economic History, Vol. 76, No. 3, pp. 736–768.

40 Jaffe, Adam B. (1989) “Real effects of academic research,” American Economic Review, Vol. 79, No. 5, pp. 957–970.

Kline, Patrick M. and Enrico Moretti (2014a) “Local economic development, ag- glomeration economies, and the Big Push: 100 years of evidence from the Ten- nessee Valley Authority,” The Quarterly Journal of Economics, Vol. 121, No. 1, pp. 275–331.

(2014b) “People, places, and public policy: Some simple welfare eco- nomics of local economic development programs,” Annual Review of Eco- nomics, Vol. 6, pp. 629–662.

Marshall, Alfred (1890) Principles of economics, London: Macmillan, 8th edition.

Mikhailova, Tatiana (2012a) “Gulag, WWII and the Long-run Patterns of Soviet City Growth,” in MPRA Paper 41758: University Library of Munich, Germany.

(2012b) “Where Russians should live: A counterfactual alternative to So- viet location policy,” in MPRA Paper 35938: University Library of Munich, Ger- many.

Moretti, Enrico (2004) “Workers’ education, spillovers and productivity: Evi- dence from plant-level production functions,” American Economic Review, Vol. 94, No. 3, pp. 656–690.

(2011) “Local labor markets,” in Handbook of Labor Economics: Elsevier.

(2013) “Real wage inequality,” American Economic Journal: Applied Eco- nomics, Vol. 5, No. 1, pp. 65–103.

Moretti, Enrico, Claudia Steinwender, and John Van Reenen (2020) “The intel- lectual spoils of war? Defense R&D, productivity and international spillovers,” NBER Working Paper 26483, National Bureau of Economic Research.

Neumark, David and Helen Simpson (2015) “Place-based policies,” in Gilles Du- ranton, J. Vernon Henderson, and William Strange eds. Handbook of Regional and Urban Economics, Vol. 5: Elsevier, Chap. 18, pp. 1197–1287.

Roback, Jennifer (1982) “Wages, rents and the quality of life,” Journal of Political Economy, Vol. 90, No. 6, pp. 1257–1278.

Rosen, Sherwin (1979) “Wage-based indexes of urban quality of life,” in P. N. Miezkowski and M. R. Straszheim eds. Current Issues in Urban Economics: Johns Hopkins University Press, Baltimore, MD.

41 Rosenbaum, Paul R. (2002) Design of observational studies: Springer.

Rossi-Hansberg, Esteban, Pierre-Daniel Sarte, and Felipe Schwartzman (2020) “Cognitive Hubs and Spatial Redistribution,” mimeo.

Rowland, Richard H. (1996) “Russia’s Secret Cities,” Post-Soviet Geography and Economics, Vol. 37, No. 7, pp. 426–462.

Saltykov, Boris G. (1997) “The reform of Russian science,” Nature, Vol. 388, No. 6637, pp. 16–18.

Siddiqi, Asif A. (2008) “Soviet Space Power During the Cold War,” in Paul G. Gillespie and Grant T. Weller eds. Harnessing the Heavens: National Defense Through Space: Colorado Springs: United States Air Force Academy, pp. 134– 150.

(2015) “Scientists and Specialists in the Gulag: Life and Death in Stalin’s .,” Kritika: Explorations in Russian and Eurasian History, Vol. 16, No. 3, pp. 557–588.

Toews, Gerhard and Pierre-Louis Vézina (2020) “Enemies of the people,” mimeo.

42

Online Appendix for “The long-run effects of R&D place-based policies: Evidence from Russian Science Cities”

Helena Schweiger,* Alexander Stepanov† and Paolo Zacchia‡

February 2021

Contents

A Russian R&D and scientists over the transition A.1

B Science Cities in the Soviet Union and Russia A.2

C Detailed data description and sources A.9

D Municipal-level summary statistics A.13

E Construction and analysis of the matched sample A.16

F Additional empirical results A.20

G Robustness checks: visual representation A.26

H Detailed analysis of the structural model A.35

*European Bank for Reconstruction and Development (EBRD). E-mail: [email protected]. †European Bank for Reconstruction and Development (EBRD). E-mail: [email protected]. ‡CERGE-EI and IMT School for Advanced Studies. E-mail: [email protected]. A Russian R&D and scientists over the transition

Figures A.1 and A.2 illustrate the magnitude of the shock that the dissolution of the USSR has represented for Russian science, and the later developments. Our data sources are: Gokhberg (1997), the Russian Statistical Yearbooks for various years, and the OECD Main Science and Technology Indicators (MSTI) database.

Figure A.1: Gov. spending in R&D as a share of GDP,Russia vs. OECD, 1989-2010

Figure A.2: Share of scientists in the labor force, Russia vs. OECD, 1989-2010

A.1 B Science Cities in the Soviet Union and Russia

Table B.1: Science Cities

Naukograd Closed cityd Priority specialization areasa f e e a

a No. Location Oblast Founded Year Soviet status Year Russian status Type Past Now Military base Air rocket and space research Electronics and radio engineering Automation, IT and instrumentation Chemistry, chemical physics and new materials Nuclear complex Energetics Biology, biotechnology and agricultural sciences A.2 1 Biysk Altai Krai 1718 1957 2005 1 No No No No No No Yes No No Yes 2 Mirny Arkhangelsk 1957 1966 2 Yes Yes No Yes No No No No No No 3 Severodvinsk Arkhangelsk 1936 1939 2 Yes No Yes No No Yes No No No No 4 Znamensk Astrakhan 1948 1962 2 Yes Yes Yes Yes No No No No No No 5 Miass Chelyabinsk 1773 1955 1 No No No Yes No Yes No No No No 6 Ozyorsk Chelyabinsk 1945 1945 2 Yes Yes No No No No No Yes No No 7 Snezhinsk Chelyabinsk 1957 1957 2 Yes Yes No No No No No Yes No No 8 Tryokhgorny Chelyabinsk 1952 1952 2 Yes Yes No No No Yes No Yes No No 9 Ust-Katav Chelyabinsk 1758 1942 1 No No No Yes No Yes No No No No 10 Kaspiyskc Dagestan 1932 1936 2 No No No No No Yes No No No No 11 Irkutsk 1949 1988 5 No No No Yes Yes Yes Yes Yes Yes Yes 12 Angarskc Irkutsk 1948 1957 2 No No No No No No Yes Yes No No

a Based on Agirrechu’s list, unless specified otherwise. Type: 1 - Science Cities, “scientific core” established in existing cities, which often had a particular historical significance; 2 - Science Cities that received city status simultaneously (or a few years later) with the creation of a scientific or scientific-industrial complex at a practically new location (often in the “open field”); 3 - Science Cities that have arisen in existing settlements and received city status after obtaining scientific functions; 4 - Science Cities that do not have city status; 5 - academic town. b Based on NAS (2002). c Based on Lappo and Polyan (2008). d Russian Wikipedia article on closed cities (ZATO), 28 September 2016. e Wikipedia articles for each city, 28 September 2016. f Russian Wikipedia article on Science Cities, 28 September 2016. Table B.1 – continued from previous page Naukograd Closed cityd Priority specialization areasa f e e a

a No. Location Oblast Founded Year Soviet status Year Russian status Type Past Now Military base Air rocket and space research Electronics and radio engineering Automation, IT and instrumentation Chemistry, chemical physics and new materials Nuclear complex Energetics Biology, biotechnology and agricultural sciences

13 Obninsk Kaluga 1946 1946 2000 2 No No No No No Yes No Yes Yes No 14 Sosenskyc Kaluga 1952 1973 3 No No No No No Yes No No No No 15 Komsomolsk-on-Amur Khabarovsk 1932 1934 1 No No No Yes No Yes No No No No 16 Krasnodar-59b Krasnodar No No No No No No No No No A.3 17 Akademgorodok Krasnoyarsk 1944 1965 5 No No No Yes Yes Yes Yes Yes Yes Yes 18 Zelenogorsk Krasnoyarsk 1956 1956 2 Yes Yes No No No No Yes Yes No No 19 Zheleznogorsk Krasnoyarsk 1950 1954 2 Yes Yes No Yes No No No Yes No No 20 Kurchatovc Kursk 1968 1976 2 No No No No No No No Yes Yes No 21 Gatchina Leningrad 1928 1956 1 No No No No No Yes No Yes No No 22 Primorsk Leningrad 1268 1948 1 No No No Yes No No No No No No 23 Sosnovy Bor Leningrad 1958 1962 3 Yes Yes No No Yes No No Yes Yes No 24 Zelenograd Moscow City 1958 1958 2 No No No No Yes Yes No No No No 25 Avtopoligon 1964 1964 4 No No No No No Yes No No No No 26 Balashikha Moscow Oblast 1830 1942 1 No No No Yes No Yes No No No No 27 Beloozersky Moscow Oblast 1961 1961 4 No No No Yes No No No No No No

a Based on Agirrechu’s list, unless specified otherwise. Type: 1 - Science Cities, “scientific core” established in existing cities, which often had a particular historical significance; 2 - Science Cities that received city status simultaneously (or a few years later) with the creation of a scientific or scientific-industrial complex at a practically new location (often in the “open field”); 3 - Science Cities that have arisen in existing settlements and received city status after obtaining scientific functions; 4 - Science Cities that do not have city status; 5 - academic town. b Based on NAS (2002). c Based on Lappo and Polyan (2008). d Russian Wikipedia article on closed cities (ZATO), 28 September 2016. e Wikipedia articles for each city, 28 September 2016. f Russian Wikipedia article on Science Cities, 28 September 2016. Table B.1 – continued from previous page Naukograd Closed cityd Priority specialization areasa f e e a

a No. Location Oblast Founded Year Soviet status Year Russian status Type Past Now Military base Air rocket and space research Electronics and radio engineering Automation, IT and instrumentation Chemistry, chemical physics and new materials Nuclear complex Energetics Biology, biotechnology and agricultural sciences

28 Moscow Oblast 1710 1956 2008 3 No No No No No Yes Yes No No No 29 Dolgoprudny Moscow Oblast 1931 1951 2 No No No No No Yes Yes No No No 30 Moscow Oblast 1956 1956 2001 2 No No No Yes No Yes No Yes No No 31 Dzerzhinsky Moscow Oblast 1938 1956 3 No No No Yes No Yes Yes No No No A.4 32 Moscow Oblast 1584 1953 2003 3 No No No Yes Yes Yes No No No No 33 Istra Moscow Oblast 1589 1946 1 No No No Yes No Yes No No Yes No 34 Khimki Moscow Oblast 1850 1950 1 No No No Yes No Yes No No No No 35 Klimovsk Moscow Oblast 1882 1940 1 No No No No No Yes No No No No 36 Korolyov Moscow Oblast 1938 1946 2001 1 No No No Yes No Yes Yes No No No 37 Krasnoarmeysk Moscow Oblast 1928 1934 1 No No No No No Yes Yes No No No 38 Krasnogorskc Moscow Oblast 1932 1942 3 No No No No Yes No No No No No 39 Krasnoznamensk Moscow Oblast 1950 1950 2 Yes Yes No Yes No No No No No No 40 Lukhovitsy Moscow Oblast 1594 1957? 1 No No No Yes No No No No No No 41 Lytkarino Moscow Oblast 1939 1957 1 No No No Yes No Yes No No No No 42 Lyubertsic Moscow Oblast 1623 1948 1 No No No No No Yes No No No No

a Based on Agirrechu’s list, unless specified otherwise. Type: 1 - Science Cities, “scientific core” established in existing cities, which often had a particular historical significance; 2 - Science Cities that received city status simultaneously (or a few years later) with the creation of a scientific or scientific-industrial complex at a practically new location (often in the “open field”); 3 - Science Cities that have arisen in existing settlements and received city status after obtaining scientific functions; 4 - Science Cities that do not have city status; 5 - academic town. b Based on NAS (2002). c Based on Lappo and Polyan (2008). d Russian Wikipedia article on closed cities (ZATO), 28 September 2016. e Wikipedia articles for each city, 28 September 2016. f Russian Wikipedia article on Science Cities, 28 September 2016. Table B.1 – continued from previous page Naukograd Closed cityd Priority specialization areasa f e e a

a No. Location Oblast Founded Year Soviet status Year Russian status Type Past Now Military base Air rocket and space research Electronics and radio engineering Automation, IT and instrumentation Chemistry, chemical physics and new materials Nuclear complex Energetics Biology, biotechnology and agricultural sciences

43 Mendeleyevo Moscow Oblast 1957 1965 4 No No No No No Yes No No No No 44 Mytishchic Moscow Oblast 1460 1935 1 No No No No No Yes No No No No 45 Obolensk Moscow Oblast 1975 1975 4 Yes No No No No No No No No Yes 46 Orevo Moscow Oblast 1954 1954 4 No No No Yes No No No No No No A.5 47 Peresvet Moscow Oblast 1948 1948 2 No No No Yes No No No No No No 48 Protvino Moscow Oblast 1960 1960 2008 3 Yes No No No No Yes No Yes No No 49 Moscow Oblast 1956 1966 2005 3 No No No No No No No No No Yes 50 Remmash Moscow Oblast 1957 1957 4 No No No Yes No No No No No No 51 Moscow Oblast 1492-1495 1940 2003 1 No No No Yes No Yes No No No No 52 Tomilino Moscow Oblast 1894 1961 4 No No No Yes No Yes No No No No 53 Troitsk Moscow Oblast 1617 1977 2007 3 No No No No No Yes No Yes No No 54 Yubileyny Moscow Oblast 1939 1950 3 Yes No No Yes No No No No No No 55 Zheleznodorozhny Moscow Oblast 1861 1952 3 No No No No No Yes No No No No 56 Zhukovsky Moscow Oblast 1933 1947 2007 2 No No No Yes No No No No No No 57 Zvyozdny gorodok Moscow Oblast 1960 1960 4 Yes Yes No Yes No No No No No No

a Based on Agirrechu’s list, unless specified otherwise. Type: 1 - Science Cities, “scientific core” established in existing cities, which often had a particular historical significance; 2 - Science Cities that received city status simultaneously (or a few years later) with the creation of a scientific or scientific-industrial complex at a practically new location (often in the “open field”); 3 - Science Cities that have arisen in existing settlements and received city status after obtaining scientific functions; 4 - Science Cities that do not have city status; 5 - academic town. b Based on NAS (2002). c Based on Lappo and Polyan (2008). d Russian Wikipedia article on closed cities (ZATO), 28 September 2016. e Wikipedia articles for each city, 28 September 2016. f Russian Wikipedia article on Science Cities, 28 September 2016. Table B.1 – continued from previous page Naukograd Closed cityd Priority specialization areasa f e e a

a No. Location Oblast Founded Year Soviet status Year Russian status Type Past Now Military base Air rocket and space research Electronics and radio engineering Automation, IT and instrumentation Chemistry, chemical physics and new materials Nuclear complex Energetics Biology, biotechnology and agricultural sciences

58 Apatity (Akademgorodok) Murmansk 1926 1954 2 No No No Yes Yes Yes Yes Yes Yes Yes 59 Polyarnye Zoric Murmansk 1968 1973 2 No No No No No No No Yes Yes No 60 Balakhna (Pravdinsk) Nizhny Novgorod 1932 1941 1 No No No No Yes No No No No No 61 Dzerzhinsk Nizhny Novgorod 1606 1930 2 No No No No No No Yes No No No A.6 62 Sarov Nizhny Novgorod 1310 1947 1 Yes Yes No No No No No Yes No No 63 Akademgorodok Novosibirsk 1957 1957 5 No No No Yes Yes Yes Yes Yes Yes Yes 64 Koltsovo Novosibirsk 1979 1979 2003 4 Yes No No No No No No No No Yes 65 Krasnoobsk Novosibirsk 1970 1978 4 No No No No No No No No No Yes 66 Novosibirsk-49b Novosibirsk No No No No No No No No No 67 Omsk-5b Omsk No No 68 Zarechny Penza 1954 1958 2 Yes Yes No No No No No Yes Yes No 69 Perm-6b Perm No No 70 Bolshoy Kamenc Primorsk Krai 1947 1954 2 Yes Yes No No No Yes No No No No 71 Volgodonskc Rostov 1950 1976 3 No No No No No No No Yes Yes No 72 Zernograd Rostov 1929 1935 1 No No Yes No No No No No No Yes

a Based on Agirrechu’s list, unless specified otherwise. Type: 1 - Science Cities, “scientific core” established in existing cities, which often had a particular historical significance; 2 - Science Cities that received city status simultaneously (or a few years later) with the creation of a scientific or scientific-industrial complex at a practically new location (often in the “open field”); 3 - Science Cities that have arisen in existing settlements and received city status after obtaining scientific functions; 4 - Science Cities that do not have city status; 5 - academic town. b Based on NAS (2002). c Based on Lappo and Polyan (2008). d Russian Wikipedia article on closed cities (ZATO), 28 September 2016. e Wikipedia articles for each city, 28 September 2016. f Russian Wikipedia article on Science Cities, 28 September 2016. Table B.1 – continued from previous page Naukograd Closed cityd Priority specialization areasa f e e a

a No. Location Oblast Founded Year Soviet status Year Russian status Type Past Now Military base Air rocket and space research Electronics and radio engineering Automation, IT and instrumentation Chemistry, chemical physics and new materials Nuclear complex Energetics Biology, biotechnology and agricultural sciences

73 Petergof Saint Petersburg 1711 1960 2005 1 No No No No No Yes Yes No No No 74 Desnogorskc Smolensk 1974 1982 2 No No No No No No No Yes Yes No 75 Lesnoy Sverdlovsk 1947 1954 2 Yes Yes No No No No No Yes No No 76 Nizhnyaya Salda Sverdlovsk 1760 1958 1 No No No Yes No No No No No No A.7 77 Novouralsk Sverdlovsk 1941 1949 2 Yes Yes No No No No Yes Yes No No 78 Verhnyaya Saldac Sverdlovsk 1778 1933 3 No No No No No No Yes No No No 79 Zarechny Sverdlovsk 1955 1955 2 No No No No No No No Yes Yes No 80 Michurinsk Tambov 1635 1932 2003 1 No No No No No Yes No No No Yes 81 Zelenodolskc Tatarstan 1865 1949 1 No No No No No Yes No No No No 82 Akademgorodok Tomsk 1972 1972 5 No No No Yes Yes Yes Yes Yes Yes Yes 83 Seversk Tomsk 1949 1949 2 Yes Yes No No No No Yes Yes No No 84 Redkino Tver 1843 1950 4 No No No No Yes No Yes No No No 85 Solnechny Tver 1947 1951 4 Yes Yes Yes No No Yes No No No No 86 Udomlyac Tver 1478 1984 3 No No No No No No No Yes Yes No 87 Glazovc Udmurtia 1678 1948 1 No No No No No No No Yes No No

a Based on Agirrechu’s list, unless specified otherwise. Type: 1 - Science Cities, “scientific core” established in existing cities, which often had a particular historical significance; 2 - Science Cities that received city status simultaneously (or a few years later) with the creation of a scientific or scientific-industrial complex at a practically new location (often in the “open field”); 3 - Science Cities that have arisen in existing settlements and received city status after obtaining scientific functions; 4 - Science Cities that do not have city status; 5 - academic town. b Based on NAS (2002). c Based on Lappo and Polyan (2008). d Russian Wikipedia article on closed cities (ZATO), 28 September 2016. e Wikipedia articles for each city, 28 September 2016. f Russian Wikipedia article on Science Cities, 28 September 2016. Table B.1 – continued from previous page Naukograd Closed cityd Priority specialization areasa f e e a

a No. Location Oblast Founded Year Soviet status Year Russian status Type Past Now Military base Air rocket and space research Electronics and radio engineering Automation, IT and instrumentation Chemistry, chemical physics and new materials Nuclear complex Energetics Biology, biotechnology and agricultural sciences

88 Votkinskc Udmurtia 1759 1957 1 No No No Yes No Yes No No No No 89 Dimitrovgrad Ulyanovsk 1698 1956 1 No No No No No No No Yes Yes No 90 Kovrov Vladimir 1778 1916 1 No No No No No Yes No No No No 91 Melenki Vladimir 1778 1 No No No No No Yes No No No No A.8 92 Raduzhny Vladimir 1971 1971 2 Yes Yes No No Yes Yes No No No No 93 Novovoronezhc Voronezh 1957 1964 2 No No No No No No No Yes Yes No 94 Borok Yaroslavl 1807 1956 4 No No No No No No No No No Yes 95 Pereslavl-Zalessky Yaroslavl 1152 1964 1 No No No No No Yes Yes No No No

a Based on Agirrechu’s list, unless specified otherwise. Type: 1 - Science Cities, “scientific core” established in existing cities, which often had a particular historical significance; 2 - Science Cities that received city status simultaneously (or a few years later) with the creation of a scientific or scientific-industrial complex at a practically new location (often in the “open field”); 3 - Science Cities that have arisen in existing settlements and received city status after obtaining scientific functions; 4 - Science Cities that do not have city status; 5 - academic town. b Based on NAS (2002). c Based on Lappo and Polyan (2008). d Russian Wikipedia article on closed cities (ZATO), 28 September 2016. e Wikipedia articles for each city, 28 September 2016. f Russian Wikipedia article on Science Cities, 28 September 2016. C Detailed data description and sources

Table C.1: Municipal level data sources and variables

Data type Data sub-type Data source Description

Factors guiding the selection of location of Science Cities

Administrative Various identification information OpenStreetMap, available through GIS- Unique municipality, federal district and region for municipality, region and fed- LAB (http://gis-lab.info/qa/osm- (oblast, krai, republic) identificators, codes and eral district adm.html) names

Population 1959 census data January 1959 , avail- All population in municipality in 1959, estimates for able through Demoscope (http: some municipalities //demoscope.ru/weekly/ssp/census.

A.9 php?cy=3)

Geography Area Calculated in QGIS based on Municipality area calculated in QGIS, measured in OpenStreetMap square kilometers

Coordinates of the municipality GPS coordinates of the center of municipality calcu- center lated in QGIS

Altitude CGIAR.a Consortium for Spatial Informa- Altitude of municipality in meters (mean, median, tion (CGIAR-CSI) SRTM 90m Digital Eleva- SD, min and max value) tion Data, version 4, available at http:// srtm.csi.cgiar.org/

Temperatures in January and July WorldClim version 1 (http://www. Monthly temperature data, for the period 1960- worldclim.org/version1), developed 90, assigned to municipalities in QGIS. Average, by Hijmans et al. (2005) median, standard deviation, minimum, and maxi- mum.

Continued on next page Table C.1 – Continued from previous page Data type Data sub-type Data source Description

Railroad Vernadsky State Geological Museum and Data on railroads were constructed using railroads United States Geological Survey, 20010600 shapefile describing the railroads of the former So- (2001) and Central management unit of the viet Union as of the early 1990s prepared by Ver- military communications of the Red Army nadsky State Geological Museum and United States (1943) Geological Survey, 20010600 (2001), along with a map of railroads from 1943 from Central manage- ment unit of the military communications of the Red Army (1943) to manually remove any differ- ences between the situation depicted in the shape- file and the 1943 map. Indicator equal to 1 if munic- ipality has access to railroad, and 0 otherwise. Rail- roads are as of late 1940s. A.10

Coastline/major river/lake Natural Earth, 1:10m Physical Vectors ver- Indicator equal to 1 if municipality has access to sion (http://www.naturalearthdata. coast/major river/lake and 0 otherwise. com/downloads/10m-physical- vectors/)

Distances Calculated in QGIS based on the sources Distance (in km) from the center of municipality to specified above the nearest railroad, coast, major river, lake, USSR border, plant (of any type), and HEI (of any type).

Level of industrial devel- Data on the factories, research and Dexter and Rodionov (2016). The dataset Number of factories (zavod) and subordinated or- opment design establishments of the So- contains almost 30,000 entries and in- ganizations. viet defense industry in 1947 cludes the name, location, main branch of defense production, establishment type as well as the start and end date for the estab- lishment’s military work.

Continued on next page Table C.1 – Continued from previous page Data type Data sub-type Data source Description

R&D institutes Number of Scientific Research & Design Institutes (NII, TsNII, and GSPI), design bureaus, and test sites.

Graduate share institutes HEIs in the municipality in 1959 De Witt (1961) Number of all HEIs, HEIs specialisng in technical (HEI) sciences, and HEIs specialising in biology and med- ical sciences

State Bank State bank State Bank branches in Bircan and De Haas (2020), originally in Number of State Bank branches 1946 State Bank (1946)

Long-term outcomes of interest

Patents Applications to EPO European Patent Office. Patents are Number of patent applications to EPO in 1978-2013, A.11 matched to municipalities via inventors’ by inventor (simple and fractional counting) addresses.

Population 2010 census data 2010 Russian census, available at All population, population with higher education, http://www.gks.ru/free_doc/new_ and population with PhD or doctoral degrees in mu- site/perepis2010/croc/perepis_ nicipality in 2010 itogi1612.htm

SMEs Results of the 2010 SME census Rosstat (Federal State Statistics Ser- The dataset contains information on the number of vice) (http://www.gks.ru/free_doc/ firms, revenue, number of employees, fixed assets, new_site/business/prom/small_ and fixed capital investment by size, 1- and 4-digit business/itog-spn.html) ISIC sector. We use SMEs per capita, SME sales per worker (labor productivity), and SME sales per unit of fixed labor, all sectors and manufacturing only. The SME census does not cover ZATOs, so 16 Sci- ence Cities, which are also ZATOs, are not covered.

Continued on next page Table C.1 – Continued from previous page Data type Data sub-type Data source Description

Night time lights Average stable night lights Version 4 DMSP-OLS Nighttime Lights Average night lights for 1992-94 and 2009-11, Time Series, National Oceanic and cleaned of gas flares Atomspheric Administration (NOAA) (http://ngdc.noaa.gov/eog/dmsp/ downloadV4composites.html)

Municipal budget Average municipal budget rev- Rosstat (Federal State Statistics Service) Average annual municipal budget revenues and ex- enues and expenditures over penditures over 2006-16, in 2010 prices. Total values 2006-16 and breakdown by major categories.

Amenities Availability of public goods in 2010 Rosstat (Federal State Statistics Service) Length of local roads that are lit during the night, number of museums, theatres and libraries in a mu- nicipality in 2010. A.12 a CGIAR is a global partnership of research organizations dedicated to reducing poverty and hunger, improving human health and nutrition, and enhancing ecosys- tem resilience through agricultural research. CGIAR-CSI is spatial science community that facilitates CGIAR’s international agricultural development research using spatial analysis, GIS, and remote sensing: http://www.cgiar-csi.org/. D Municipal-level summary statistics

Table D.1 below displays summary statistics for all variables featured in our em- pirical analysis (in five panels, by variable category), separately for Science Cities and other municipalities. For each variable, the table also displays the p-value obtained from a t-test for the differences in means between the two groups.

Table D.1: Municipal-level data: descriptive statistics

Science Cities Other municipalities

Obs. Mean (s.d.) Obs. Mean (s.d.) p-value

Panel A: Socio-economic outcomes

Population in 2010 (th.) 84 95.175 2250 58.324 0.000 (71.773) (278.488) Graduate share in 2010 84 0.224 2250 0.110 0.000 (0.078) (0.044) Postgraduate share in 2010 84 0.006 2250 0.003 0.000 (0.004) (0.002) Fractional patents before 2000 84 3.743 2250 1.090 0.007 (6.512) (32.153) Fractional patents after 2000 84 11.878 2250 3.539 0.001 (15.640) (89.093) Night lights (standardized): 2009-2011 84 1.505 2250 -0.062 0.000 (1.411) (0.926) Employment share in R&D-ICT 67 0.036 2176 0.007 0.000 (0.034) (0.009) Average salary in R&D-ICT (th.) 67 25.047 2127 15.363 0.000 (11.423) (7.750) Average salary in retail (th.) 67 22.744 2060 13.137 0.000 (11.807) (6.713) No. of SMEs (per 1,000 inhabitants) 65 24.119 2159 27.492 0.003 (8.568) (9.617) SME labor productivity 65 1.615 2153 0.794 0.000 (0.709) (0.427)

Panel B: Budget measures

Avg. total revenues per capita 64 19.468 2173 24.761 0.000 (5.984) (51.671) Avg. transfers per capita 64 9.538 2173 18.380 0.000 (3.786) (32.718) Avg. tax income per capita 64 9.930 2173 6.380 0.000 (4.136) (22.790) Avg. total expenditures per capita 64 9.577 2173 10.842 0.013 (3.059) (15.374) Continued on next page

A.13 Science Cities Other municipalities

Obs. Mean (s.d.) Obs. Mean (s.d.) p-value

Avg. exp. on roads per capita 64 0.626 2173 0.731 0.106 (0.476) (1.164) Avg. exp. on security per capita 64 0.149 2173 0.208 0.009 (0.126) (0.739) Avg. exp. on transportation per capita 64 0.164 2173 0.237 0.012 (0.156) (0.981) Avg. exp. on utilities per capita 64 2.230 2173 3.057 0.006 (1.118) (12.431) Avg. exp. on education per capita 64 9.577 2173 10.842 0.013 (3.059) (15.374) Avg. exp. on health per capita 64 2.181 2173 2.095 0.673 (1.569) (2.305) Avg. exp. on social services per capita 64 1.441 2173 2.087 0.000 (1.301) (3.165)

Panel C: Local public goods

Share of streets lit in the night 63 0.752 1962 0.477 0.000 (0.264) (0.278) Libraries per 1,000 inhabitants 64 0.185 2030 0.792 0.000 (0.207) (0.564) Museums per 1,000 inhabitants 64 0.019 2000 0.048 0.000 (0.020) (0.072) Theaters per 1,000 inhabitants 64 0.003 1964 0.001 0.009 (0.005) (0.007)

Panel D: Geographical characteristics

Latitude 84 55.698 2250 53.981 0.000 (3.739) (5.110) Longitude 84 47.794 2250 59.955 0.000 (20.860) (29.410)

Average temperature in January (◦C) 84 -11.350 2250 -13.559 0.000 (3.699) (7.045)

Average temperature in July (◦C) 84 18.528 2250 18.755 0.251 (1.729) (2.675) Average altitude higher than 1,000km 84 0.000 2250 0.040 0.000 (0.000) (0.196) Has access to rivers or lakes 84 0.369 2250 0.445 0.161 (0.485) (0.497) Direct access to the coast 84 0.095 2250 0.061 0.295 (0.295) (0.239) (Log) area in km2 84 5.196 2250 7.398 0.000 (1.933) (1.741) Is a closed city 84 0.202 2250 0.011 0.000 (0.404) (0.105) Continued on next page

A.14 Science Cities Other municipalities

Obs. Mean (s.d.) Obs. Mean (s.d.) p-value

Panel E: Historical characteristics

Has R&D institutes in 1947 84 0.333 2250 0.055 0.000 (0.474) (0.227) (Log) distance from railroads in 1943 84 1.240 2250 2.582 0.000 (1.328) (1.953) (Log) no. of plants in 1947 84 0.937 2250 0.363 0.000 (1.001) (0.756) (Log) no. of plants within 200km in 1947 84 6.027 2250 3.959 0.000 (1.529) (1.765) (Log) distance from the USSR border in 1946 84 6.148 2250 6.134 0.923 (1.272) (1.206) No. of State Bank branches in 1946 84 0.631 2250 1.096 0.000 (0.788) (0.987) No. of universities in 1959 84 0.143 2250 0.196 0.409 (0.415) (2.205) (Log) urban population in 1897 84 2.476 2250 1.996 0.291 (4.079) (3.735) (Log) urban population in 1926 84 3.772 2250 2.738 0.043 (4.534) (4.205) (Log) urban population in 1939 84 5.698 2250 3.127 0.000 (5.002) (4.604) (Log) urban population in 1959 84 7.390 2250 5.629 0.001 (4.784) (4.894) (Log) total population in 1959 84 9.632 2250 10.227 0.011 (2.079) (1.311) (Log) total population within 200km in 1959 84 15.494 2250 14.123 0.000 (1.068) (2.413) (Log) total population in 1989 82 11.019 2232 10.374 0.000 (1.187) (1.058)

Notes: The p-values in the rightmost columns are obtained from two-sided t-tests on the equality of each variable’s mean between Science Cities and other municipalities. “Average altitude higher than 1,000km,” “Has access to rivers or lakes,” “Direct access to the coast,” “Is a closed city,” and “Has R&D institutes in 1947” are coded as dummy vari- ables. Distances are expressed in kilometers. th. – thousands; avg. – average; exp. – expenditures. Refer to Appendix C for detailed data sources.

A.15 E Construction and analysis of the matched sample

This appendix describes the strategy we adopted for constructing the matched sample, with special attention to possible issues of SUTVA violation.

Construction of the matched sample. When constructing the matched sam- ple following our empirical strategy (motivated by the historical and institutional features discussed in section 2 of the paper), we must take into account two pos- sibly conflicting requirements. On the one hand, we would preferably match cities as close in space as possible, to control for spatially correlated unobserv- ables: local characteristics that are shared within a geographical area (such as, for instance, proximity to a major city like Moscow) that predict current economic outcomes. At the very least, we must match Science Cities located in “usable”, more urbanized areas to control municipalities from similar places, and Science Cities located in “secret”, more isolated locations to places alike. On the other hand, we must account for possible violations of the SUTVA (“interference”): the assignment of the Science City treatment to one city might also affect the cities nearby, either through positive spillovers, negative spillovers (draining their re- sources), or a combination of both. This leads to a trade-off: we want to match cities that are “close, but not too close”. We can obtain matches that are close in space, at least in the more densely populated and urbanized areas of Russia, by including geographical coordinates as well as local demographic and economic intensity measures (historical popu- lation and plants within 200km) in our set of matching covariates Xik . In order to mitigate concerns of SUTVA violation, a common approach is that of removing cities that are “too close” to Science Cities from the set of potential control mu- nicipalities. We believe that a distance of 50km between the two municipalicities’ geographical centroids is a suitable threshold for enforcing a restriction of this sort. In our setting, however, we found that enforcing restriction of this sort can lead to a matched sample that is undesirable in two respects. First, the matching covariates would not be very well balanced. Second, matches would be too far away in space, contradicting our requirement of controlling for spatially corre- lated unobservables. The contradiction is largely due to those Science Cities that

A.16 are clustered in some especially densely populated areas: the Moscow region and (to a much lesser extent) the Urals. In these areas, such a strong restriction would remove most local municipalities from the set of potential controls. Our approach is thus as follows. First, we make the following assumption.

No-SUTVA hypothesis. The effect of Science Cities only spills over other close control cities – specifically, control cities that are not more than 50km apart.

The key assumption is that the effect of a Science City does not spill over other Science Cities. We find this hypothesis tenable in our setting, given the unique specialization typical for each Science City and the fact that those few connec- tions that used to exist between some pairs of Science Cities (e.g. input-output linkages) were disrupted following the collapse of the USSR. Thus, we construct our matching sample by imposing a milder restriction: that Science Cities and their matches must be at least 50km apart. This leads to a matched sample with adequate balance properties, at the cost of making some matched pairs farther apart in geographical space. By the No-SUTVA hy- pothesis above, no treated city can be subject to interference, while the sample of matched control cities has the following characteristics.

• In some matched pairs, the control unit is not subject to interference: it is at least 50km away from its paired Science City and any other Science City.

• In other matched pairs, the control unit may be subject to interference, due to the presence of other Science Cities within a 50km radius.

This allows us to test possible instances of SUTVA violation, by comparing the average differences between treated and control cities across these two compo- nents of the matched sample. The results in Table 5 in the paper do not allow us to rule out that interference occurs, but they indicate that if they do, positive and negative effects likely offset one another. Our approach is inspired by recent advances from the statistical literature. In particular, Forastiere et al. (2020) propose a strategy to perform causal infer- ence when treated and control units are related through a network and SUTVA is violated. Under a modified Conditional Independence Assumption that allows

A.17 for interference, they propose a sample subclassification approach based on the propensity score that allows them to both estimate and test interference, since the observations of each subsample in their classification has a similar number of treated network “neighbors.” While our approach is not as refined, we believe that the key idea by Forastiere et al. (2020) – that is, grouping observations with a similar “neighborhood” of treated units – can be transferred to our setting. Ulti- mately, a geographical space is a particular kind of network.

Analysis of the matched sample. We now examine the matched pairs more closely: Figure E.1 displays them on the map of Russia. Thanks to our choice of covariates and our balanced approach described above, Science Cities and their counterparts are matched fairly close in space, especially so in the more densely populated and developed areas of Russia. In particular, municipalities very close to Moscow are typically matched to other municipalities that are also very close to Moscow, which mitigates concerns about the proximity of many Science Cities to the capital of Russia and the possible beneficial implications of it.

Figure E.1: Mapping Science Cities and their matches

A.18 Table E.1 reports the analysis of covariate balance; it displays the standard- ized mean difference and the variance ratio between treated and control obser- vations, in both the original and the matched samples. The table shows that matching achieves a remarkable degree of balance in both the first and the sec- ond moment, despite the rigidity of the Mahalanobis algorithm. Recall that the 78 matched pairs also satisfy exact matching constraints imposed on five dummy variables, including that for closed city status. In our matched sample, exactly 39 control units have at least one other treated city that is less than 50km away (the average minimal distance is 28.52km) while the remaining 39 control units are not as close to other Science Cities (the average minimal distance is 158.54km). This perfect split is ideal towards our approach to testing for instances of SUTVA violation described above.

Table E.1: Covariate balance: Science Cities and their matched municipalities

Standardized bias Variance ratio Raw Matched Raw Matched Latitude 0.383 0.061 0.535 1.272 Longitude -0.477 0.032 0.503 1.235 January average temperature (◦C) 0.393 0.033 0.276 0.820 July average temperature (◦C) -0.101 0.011 0.418 1.250 (Log) area in km2 -1.197 0.019 1.232 0.808 (Log) distance from the USSR border 0.011 -0.090 1.113 1.095 (Log) distance from railroads in 1943 -0.803 0.050 0.462 1.098 (Log) urban population in 1897 0.123 -0.029 1.193 0.969 (Log) urban population in 1926 0.236 -0.026 1.163 0.944 (Log) urban population in 1939 0.535 0.072 1.180 0.965 (Log) urban population in 1959 0.364 -0.036 0.956 1.088 (Log) total population in 1959 -0.343 0.045 2.514 0.776 (Log) total population within 200km in 1959 0.735 0.027 0.196 1.045 (Log) no. of plants in 1947 0.648 0.053 1.751 0.957 (Log) no. of plants within 200km in 1947 1.253 0.061 0.751 0.991 No. of universities in 1959 -0.034 -0.060 0.035 0.555 No. of State Bank branches in 1946 -0.521 0.000 0.638 1.000

Notes: For each variable listed in the first column, this table reports the difference in the variance-standardized mean (the “standardized bias,” reported in percentage points) and the variance ratio between treated and control observations, for both the raw sample and the matched sample. The matched sample is obtained by “genetic” Ma- halanobis matching on the variables reported above, forcing exact matching on the five dummy variables included in our list of covariates (see the paper and the notes to Table D.1).

A.19 F Additional empirical results

This appendix reports miscellaneous empirical results that complement those in section 5 in the paper. First, we discuss additional estimates related to potential drivers of our main results: variables that measure flows and stocks of expendi- tures in public goods. Subsequently, we report a set of results that are restricted to the subsample of “historical” Science Cities. Lastly, we elaborate the analysis of the night lights time series in our matched sample, and their persistence over time (Figure 2) with the aid of a regression model.

ATT estimation: Detailed budget outcomes and local public goods. In light of the estimates reported in Table 4 in the main text, we argue that the total size of local municipal expenditures is unlikely to drive the main results. What if Science Cities differed not in the total size, but in the composition of local expenditures? This idea is at the core of the vES hypothesis, which postulates enduring effects from the local investment in specific categories of public goods (such as physical infrastructure). To explore this possibility, we separately analyze the following categories of municipal-level expenditures: roads, security, transportation, util- ities, education, health, and social services. Local expenditures in roads and ed- ucation are prime candidates to act as drivers of our main results. However, the estimates reported in Table F.1, panel A dispel all of these suspects. In fact, the ATT estimates are generally not statistically different from zero and the values of

Γ∗ are consistently equal to 1. Note that the raw differences are in most cases negative and statistically significant. Results in Table 4 in the paper and Table F.1 concern the flows of expendi- tures in public goods. What if Science Cities and their matched counterparts differed in the stock of public goods, instead? As Science Cities were shaped for the purpose of accommodating researchers and scientists, their higher density of skilled workers today may be explained by a larger inherited endowment of cultural public goods, such as libraries, museums and theaters, or other types of public goods. Table F.1, panel B reports our estimates for this hypothesis us- ing ROSSTAT data. To our surprise, former Science Cities appear to possess, if anything, fewer cultural amenities than ordinary municipalities. This difference,

A.20 however, is not statistically significant in the matched sample, except for libraries

(though Γ∗ equals one there). We believe that the equal distribution of cultural amenities can be explained through the emphasis placed on shared education and culture by the communist ideology. Science Cities appear to have a higher share of streets lit in the night, but this evidence is too weak (in magnitude and statistically). Overall, these factors appear unlikely to drive our main results.

Table F.1: Municipal-level matching: additional results, all Science Cities

Whole sample Matched sample (1 nearest neighbor)

Outcome Raw difference TC ATT ATT b.a. Γ∗ Panel A: Detailed budget outcomes -0.110* -0.096 0.027 Exp. on roads p.c. 64 51 1.00 (0.065) (0.107) (0.098) -0.061** -0.006 -0.004 Exp. on security p.c. 64 51 1.00 (0.023) (0.017) (0.016) -0.080** -0.071 -0.049 Exp. on transportation p.c. 64 51 1.00 (0.029) (0.046) (0.044) -0.891** -0.425 -0.646* Exp. on utilities p.c. 64 51 1.00 (0.311) (0.229) (0.258) -1.126** 0.662 0.094 Exp. on education p.c. 64 51 1.00 (0.512) (0.355) (0.447) 0.061 -0.022 -0.316 Exp. on health p.c. 64 51 1.00 (0.203) (0.178) (0.187) -0.631*** -0.260 -0.187 Exp. on social services p.c. 64 51 1.00 (0.177) (0.185) (0.191) Panel B: Local public goods -0.601*** -0.113*** -0.109*** Libraries per 1,000 inhabitants 64 49 1.00 (0.029) (0.028) (0.031) -0.029*** -0.004 -0.005 Museums per 1,000 inhabitants 64 49 1.00 (0.003) (0.003) (0.003) 0.002*** 0.001 0.002 Theaters per 1,000 inhabitants 64 49 1.00 (0.001) (0.001) (0.001) 0.267*** 0.118* 0.045 Share of streets lit in the night 63 49 1.40 (0.034) (0.046) (0.048)

Notes: See the notes accompanying Table 1in the paper. The reported significance levels are based on a FWER adjust- ment using Holm’s method that groups all the outcomes listed in this table together with the main budget outcomes listed in Table 4 in the main text. exp. – expenditures; p.c. – per capita.

A.21 ATT estimation: Other results for the historical Science Cities. While Ta- ble 2 estimates in the paper suggest that our main results – except in part those for patent applications and employment share in R&D and ICT – are not driven by the present-day Naukogrady, the latter may differ in other respects, such as patterns of population growth or use of local resources. To examine these pos- sibilities, we replicate the “demographic dynamics” results from Table 3 in the paper on the sample of historical Science Cities. The estimates reported in Ta- ble F.2 below suggest that the demographic dynamics in the restricted sample is similar to that in the full sample.

Table F.2: Municipal-level matching, demographic dynamics: “historical” Science Cities

Whole sample Matched sample (1 nearest neighbor)

Outcome Raw difference TC ATT ATT b.a. Γ∗ 0.106*** 0.063*** 0.059*** Graduate share: b.y. 1965 64 51 3.40 ≤ (0.011) (0.010) (0.011) 0.097*** 0.048*** 0.043*** Graduate share: b.y. 1965 64 51 2.80 > (0.008) (0.008) (0.009) 0.003*** 0.002*** 0.002*** Postgraduate share: b.y. 1955 64 51 2.20 ≤ (0.001) (0.000) (0.001) 0.003*** 0.002*** 0.002*** Postgraduate share: b.y. 1955 64 51 1.85 > (0.000) (0.001) (0.001)

Notes: See the notes accompanying Table 1 in the paper. The reported significance levels are based on a FWER adjustment using Holm’s method that groups all the outcomes listed in this table together with the main outcomes listed in Table 2 in the paper. b.y. – birth year.

Next, we replicate the budget and “amenity” outcome estimates (Tables 4 and F.1) in the restricted subsample; estimates are reported in Table F.3. The main budget estimates (Panel A) show the same pattern as in the larger sample: in per capita terms, historical Science Cities receive on average more resources from local taxes and less federal transfers, leaving the overall amount available for expenditures unchanged. Some of the ATT estimates for the detailed budget categories (Panel B) and for the stock of public goods (Panel C) are statistically significant, with typically low associated values of Γ∗. Their signs, however, go in opposite directions, suggesting that one cannot extrapolate major conclusions from these findings.

A.22 Table F.3: Municipal-level matching: additional results, “historical” Science Cities

Whole sample Matched sample (1 nearest neighbor)

Outcome Raw difference TC ATT ATT b.a. Γ∗ Panel A: Main budget outcomes -5.976*** -0.974 -1.308 Total revenues per capita 51 44 1.00 (1.400) (0.748) (0.952) -8.840*** -2.780*** -2.773*** All transfers per capita 51 44 1.00 (0.919) (0.630) (0.787) 2.864*** 1.806*** 1.466*** Tax income per capita 51 44 1.75 (0.753) (0.395) (0.413) -5.861*** -0.996 -1.295 Total expenditures per capita 51 44 1.00 (1.357) (0.798) (1.256) Panel B: Detailed budget outcomes -0.117 0.035 0.137 Exp. on roads p.c. 51 44 1.00 (0.075) (0.057) (0.065) -0.071** -0.013 -0.001 Exp. on security p.c. 51 44 1.00 (0.024) (0.014) (0.013) -0.083** -0.037 -0.018 Exp. on transportation p.c. 51 44 1.00 (0.030) (0.025) (0.025) -1.054*** -0.555** -0.751** Exp. on utilities p.c. 51 44 1.00 (0.313) (0.208) (0.242) -1.384** 0.374 -0.027 Exp. on education p.c. 51 44 1.00 (0.521) (0.290) (0.368) -0.070 -0.042 -0.308 Exp. on health p.c. 51 44 1.00 (0.215) (0.143) (0.164) -0.529** -0.329 -0.116 Exp. on social services p.c. 51 44 1.00 (0.208) (0.174) (0.201)

Panel C: Local public goods -0.577*** -0.129*** -0.118*** Libraries per 1,000 inhabitants 51 42 1.00 (0.034) (0.027) (0.030) -0.028*** -0.008** -0.007* Museums per 1,000 inhabitants 51 42 1.00 (0.003) (0.003) (0.003) 0.002** 0.002** 0.002* Theaters per 1,000 inhabitants 51 42 2.90 (0.001) (0.001) (0.001) 0.237*** 0.129*** 0.046 Share of streets lit in the night 50 42 1.40 (0.038) (0.038) (0.038)

Notes: See the notes accompanying Table 1 in the paper. The reported significance levels are based on a FWER adjustment using Holm’s method that groups all the outcomes listed in this table. exp. – expenditures; p.c. – per capita.

A.23 The dynamics of night lights: regression-based test of persistence. In what follows, we provide a regression-based formal test to support the claim that the two trends for night lights in Figure 2 in the paper, the one for the treated and the one for the control municipalities, do not converge over the time interval 1992- 2010. Table F.4 below reports estimates of the following simple regression model:

Y π T π S T τ ε , (F.1) i t = 0 t + 1 i · t + t + i t where i indexes a city from the matched sample, t indexes the year, Yi t is the standardized night lights measure for that year, Tt is a linear trend, Si is a dummy indicating Science City status, τt is a year fixed effect and εi t is an error term which is allowed to be autocorrelated in time. Parameter π1 in (F.1) represents differences in the linear trend between Science Cities and their matched coun- terparts. The estimates in Table F.4are obtained from the sample comprising all matched Science Cities and their paired controls (column 1) and the subsample restricted to the “historical” Science Cities and their matches (column 2). Stan- dard errors are calculated via the Newey-West HAC formula, allowing autocorre- lation of εi t up to 10 years. The estimates of π1, albeit small in magnitude, are positive and statistically significant; this suggests that, if anything, Science Cities are on a divergent trend of economic development (as proxied by the night lights measure). Allowing for a longer autocorrelation time window of εi t delivers vir- tually identical inferences and qualitative conclusions.

Table F.4: Estimates of parameters π0 and π1 from model (F.1)

All Science Cities Historical Science Cities (1) (2)

Linear trend, all (π0) 0.0071*** 0.0063*** (0.0001) (0.0001)

Trend difference for S.C.s (π1) 0.0003*** 0.0002*** (0.0001) (0.0001) Year fixed effects YES YES Number of observations 2964 2527

Notes: *, ** and *** denote significance at the 10, 5, and 1 per cent level, respectively. Standard errors reported in parentheses are estimated with the Newey-West HAC formula, allowing for autocorrelation up to 10 years.

A.24 The dynamics of night lights: “grand ATT” estimation. Having ensured that there is no convergence, we estimate a “grand ATT” on the night lights matched panel by estimating the following simple model:

Y ψS τ0 ε0 . (F.2) i t = i + t + i t

Here, parameter ψ represents the causal effect of Science City status, and iden- tification is ensured by restricting the analysis to the matched sample. It is rea- sonable to expect the estimate(s) of ψ to be positive, but not necessarily statis- tically significant when allowing for autocorrelated disturbances. Note that this exercise cannot be performed along with the estimation of separate linear trends for Science Cities and their matches, because it would result in data overfitting. Once again, the autocorrelation of the error term is allowed for up to 10 years. The results are displayed in Table F.5. The baseline estimate reported in column 1 amounts to about one half of the standard deviation of the night lights mea- sure, which is similar – if not slightly higher – to the ATT estimate of the 2009-11 average from Tables 1-2 in the paper, and is statistically significant at the 1 per cent level. Adding bias adjustment terms à la Abadie and Imbens to the left-hand side measures Yi t (columns 2 and 4) and restricting the analysis to the “histor- ical” Science Cities (columns 3 and 4) are of little consequence for our grand ATT estimates. To summarize, it appears that the long-run effect of Science City status on local economic activity – as proxied by the night lights measures – is statistically robust, of considerable magnitude, and constant over time.

Table F.5: Estimates of parameter ψ from model (F.2)

All Science Cities Historical Science Cities (1) (2) (3) (4) Science City (ψ) 0.5207*** 0.5339** 0.4814*** 0.5174*** (0.1541) (0.1725) (0.1618) (0.1811) Year fixed effects YES YES YES YES Bias-adjusted estimate NO YES NO YES Number of observations 2964 2964 2527 2527

Notes: *, ** and *** denote significance at the 10, 5, and 1 per cent level, respectively. Standard errors reported in parentheses are estimated with the Newey-West HAC formula, allowing for autocorrelation up to 10 years.

A.25 G Robustness checks: visual representation

This appendix provides a visual representation of the treated-control differences used in the t-tests discussed in section 5, Table 5 in the paper. For each cate- gory and outcome variable combination, a bar chart is reported; each bar rep- resents a matched pair, and their (possibly negative) heights represent a pair’s treated-control difference. The bars are arranged in ascending order and colored according to the value of the category. An uneven distribution of colors along the horizontal axis is indicative of differences between categories, and vice versa.

G.1 “Secret” vs. “usable” Science Cities

Figure G.1: Differences in total population, 2010

200 “Secret” activities Other activities

0 : population

C 200

− − T 400 −

Figure G.2: Differences in the graduate share, 2010

“Secret” activities Other activities 0.2

0 : share of degree holders C − T

A.26 Figure G.3: Differences in fractional patents since 2000

“Secret” activities 50 Other activities

0 : fractional patents C − T 50 −

Figure G.4: Differences in night lights (2009-11)

4 “Secret” activities Other activities 2

0 : std. night lights C

− 2

T −

G.2 Science Cities built from scratch versus others

Figure G.5: Differences in total population, 2010

200 Built from scratch Pre-existing settlement

0 : population

C 200

− − T 400 −

A.27 Figure G.6: Differences in the graduate share, 2010

Built from scratch Pre-existing settlement 0.2

0 : share of degree holders C − T

Figure G.7: Differences in fractional patents since 2000

Built from scratch 50 Pre-existing settlement

0 : fractional patents C − T 50 −

Figure G.8: Differences in night lights (2009-11)

4 Built from scratch Pre-existing settlement 2

0 : std. night lights C

− 2

T −

A.28 G.3 Science Cities from Agirrechu’s list versus others

Figure G.9: Differences in total population, 2010

200 Agirrechu’s list Other Science Cities

0 : population

C 200

− − T 400 −

Figure G.10: Differences in the graduate share, 2010

Agirrechu’s list Other Science Cities 0.2

0 : share of degree holders C − T

Figure G.11: Differences in fractional patents since 2000

Agirrechu’s list 50 Other Science Cities

0 : fractional patents C − T 50 −

A.29 Figure G.12: Differences in night lights (2009-11)

4 Agirrechu’s list Other Science Cities 2

0 : std. night lights C

− 2

T −

G.4 Other Science Cities close to the matched control, or none

Figure G.13: Differences in total population, 2010

200 Other Science Cities close to C No other Science Cities close to C

0 : population

C 200

− − T 400 −

Figure G.14: Differences in the graduate share, 2010

Other Science Cities close to C No other Science Cities close to C 0.2

0 : share of degree holders C − T

A.30 Figure G.15: Differences in fractional patents since 2000

Other Science Cities close to C 50 No other Science Cities close to C

0 : fractional patents C − T 50 −

Figure G.16: Differences in night lights (2009-11)

4 Other Science Cities close to C No other Science Cities close to C 2

0 : std. night lights C

− 2

T −

G.5 Science Cities founded before 1950 versus later ones

Figure G.17: Differences in total population, 2010

200 Founded before 1950 Founded on 1950 or later

0 : population

C 200

− − T 400 −

A.31 Figure G.18: Differences in the graduate share, 2010

Founded before 1950 Founded on 1950 or later 0.2

0 : share of degree holders C − T

Figure G.19: Differences in fractional patents since 2000

Founded before 1950 50 Founded on 1950 or later

0 : fractional patents C − T 50 −

Figure G.20: Differences in night lights (2009-11)

4 Founded before 1950 Founded on 1950 or later 2

0 : std. night lights C

− 2

T −

A.32 G.6 Science Cities founded before 1960 versus later ones

Figure G.21: Differences in total population, 2010

200 Founded before 1960 Founded on 1960 or later

0 : population

C 200

− − T 400 −

Figure G.22: Differences in the graduate share, 2010

Founded before 1960 Founded on 1960 or later 0.2

0 : share of degree holders C − T

Figure G.23: Differences in fractional patents since 2000

Founded before 1960 50 Founded on 1960 or later

0 : fractional patents C − T 50 −

A.33 Figure G.24: Differences in night lights (2009-11)

4 Founded before 1960 Founded on 1960 or later 2

0 : std. night lights C

− 2

T −

A.34 H Detailed analysis of the structural model

In this appendix we discuss the structural model presented in section 6 in more detail. We start from a brief discussion on how it relates to other existing models of spatial equilibrium. Next, we analyze the decisions of both workers and firms, and their interplay in the equilibrium. Later, we set out our approach for em- pirically operationalizing and estimating the model, and show the results of an estimation exercise performed on the whole sample of Russian municipalities. Finally, we discuss possible extensions of the model.

H.1 Extended discussion of the setup

The model builds upon a simplified version of the recent framework by Allen and Donaldson (2020). We apply two major simplifications to their setup: we remove both inter-location trade and the distance-based migration costs. This makes the model much easier to empirically operationalize and estimate; with- out them, we would need detailed longitudinal data on cross-municipality trade and migration (at present unavailable in our setting) for estimation. These two gravity-based mechanisms are not the focus of our analysis, and we see little cost in neglecting them. In addition, we remove spillovers that depend on past pop- ulation values from our production functions in equation (2). As we mentioned, we cannot identify these effects with the data available to us. Note that Allen and Donaldson (2020) estimate very small production function elasticities (in the or- der of 0.04) associated with the past population. Unlike Allen and Donaldson, our setup features two distinct types of work- ers, distinguished by their skill level. In this respect, our framework is closer to the one by Moretti (2011, 2013). We also express the equilibrium in terms of log-differences between cities – although we adopt slightly different functional forms and distributional assumptions. In Moretti’s analysis the main objective was to perform a welfare analysis of workers’ choices by looking at the inter- play between forces of agglomeration and congestion (where the latter are due to higher rent prices, in the Rosen-Roback tradition). Our main objective instead is to express equilibrium equations that are well-suited to empirical estimation on

A.35 the matched sample. The dependence of individual utility (1) on current pop- ulation, under the restriction that parameters β1 and δ1 are negative, embeds congestion and other negative population externalities, such as higher rents in Moretti’s model. In this regard, our setup can be seen as an extension of his, one that introduces both path-dependence and skill-specific spillovers.

H.2 Workers’ choices

We analyze the properties of the model by restricting the analysis to a given pair of cities s and z. By the standard properties of multinomial choice models based on generalized extreme value distributions, in a spatial equilibrium the ratio be- tween the populations of the two cities, for each skill type, must be equal to the ratio between the two probabilities that an individual chooses either city. Hence:

1 £ ¡ n n n n n n¢¤ n exp(n ) exp w β `s0 β `s1 δ hs0 δ hs1 a σ s1 s + 0 + 1 + 0 + 1 + s 1 (H.1) exp n = £ ¡ ¢¤ n ( z1) exp w n βn` βn` δnh δnh an σ z + 0 z0 + 1 z1 + 0 z0 + 1 z1 + z for n h,`, as the denominators of both probabilities cancel out thanks to the = removal of distance-based migration costs (while this is an unrealistic assump- tion, it does not affect the main mechanics of the model). Taking logs, the above yields the labor supply equations at the municipal pair level:

¡ h h¢ h h h ¡ h h¢ ws wz β0 (`s0 `z0) β1 (`s1 `z1) δ0 (hs0 hz0) as az (hs1 hz1) − + − + − + − − − − = σh δh − 1 (H.2) ¡ ` `¢ ` ` ` ¡ ` `¢ ws wz β0 (`s0 `z0) δ0 (hs0 hz0) β1 (hs1 hz1) as az (`s1 `z1) − + − + − + − − − . − = σ` β` − 1 (H.3)

The elasticities of labor supply to local wages are given by the two denominators of (H.2) and (H.3), inverted. Similarly to other models of spatial equilibrium, neg- h h ative values of δ1 (β1 ) lead to a more homogeneous distribution of high-skilled (low-skilled) workers in space, due to congestion effects and/or higher rents.

A.36 H.3 Firms’ choices

The analysis of the firm side is standard. Consider first the production function of firms employing high-skilled workers. The assumption of perfectly mobile capital (the larger implications of which are discussed in section H.9) leads to h equalized rates of returns for kc . Hence:

³ h h´ 1 h³ h h´ ³ h ´ h i k k g g θ λ (hs1 hz1) ϕ (`s1 `z1) , (H.4) s − z = λ s − z + + − + −

as follows from equating the marginal product of capital across cities s and z. Labor is not as mobile as capital and wages are location-specific. The standard labor market equilibrium condition, in log-differences, leads to:

³ ´ ³ ´ ³ ´ ³ ´ w h w h g h g h θh λ 1 (h h ) ϕh (` ` ) (1 λ) kh kh , s − z = s − z + + − s1 − z1 + s1 − z1 + − s − z (H.5) and, combining (H.4) with (H.5), we obtain an equation that can be interpreted as the inverse labor demand if one takes low-skilled workers as given:

³ h h´ 1 h³ h h´ h h i w w g g θ (hs1 hz1) ϕ (`s1 `z1) . (H.6) s − z = λ s − z + − + −

A symmetric analysis delivers the low-skilled version of (H.6), again interpretable as the inverse labor demand if one takes high-skilled workers as given:

³ ` `´ 1 h³ ` `´ ` ` i w w g g θ (hs1 hz1) ϕ (`s1 `z1) . (H.7) s − z = λ s − z + − + −

Note that differentiating the labor elasticity parameter λ between firms of differ- ent skill type is straightforward, as it is in the model at large. Given our homoge- neous calibration approach to λ in the empirical analysis, however, we prefer to avoid doing this so as to keep the notation slightly simpler.

H.4 Spatial equilibrium

Equations (3) and (4) in the paper are obtained by combining equations (H.2), (H.3), (H.6), (H.7) and solving them for present-day population and wage ratios.

A.37 The history multipliers discussed in the text are algebraically derived as:

·µ ϕ` ¶ µ ϕh ¶ ¸ µhh M σ` β` δh βh δ` (H.8) = − 1 − λ 0 + 1 + λ 0 ·µ ϕ` ¶ µ ϕh ¶ ¸ µh` M σ` β` βh βh β` (H.9) = − 1 − λ 0 + 1 + λ 0 ·µ θh ¶ µ θ` ¶ ¸ µ`h M σh δh δ` δ` δh (H.10) = − 1 − λ 0 + 1 + λ 0 ·µ θh ¶ µ θ` ¶ ¸ µ`` M σh δh β` δ` βh (H.11) = − 1 − λ 0 + 1 + λ 0 where:

1 ·µ θh ¶µ ϕ` ¶ µ ϕh ¶µ θ` ¶¸− M σh δh σ` β` βh δ` . (H.12) = − 1 − λ − 1 − λ − 1 + λ 1 + λ

Inspecting the above expression, it appears clear that in the equilibrium, more positive values of the productivity spillovers (θn, ϕn), as well as of the contem- n n poraneous externalities (β1 , δ1 ), amplify the effect of the path-dependence pa- rameters (βn, δn), for n h,` – and vice versa. This is a typical feature of spatial 0 0 = equilibrium models in urban economics. Like in those models, some parametric restrictions are necessary in order to guarantee the existence and uniqueness of a non-degenerate equilibrium. Specifically, all the parameters featured in (H.8)- nn (H.11) must be such that µ 0 0 for all n h,` and n0 h,`. ≥ = = The constant terms in (3) are obtained, for c s,z, as: = " Ã ! Ã !# µ ϕ` ¶ g h µ ϕh ¶ g ` ph M σ` β` ah c βh a` c (H.13) c = − 1 − λ c + λ + 1 + λ c + λ " Ã ! Ã !# µ θh ¶ g ` µ θ` ¶ g h p` M σh δh a` s δ` ah s , (H.14) c = − 1 − λ s + λ + 1 + λ s + λ whereas those in (4) are as follows, for n h,`: = 1 ³ ´ r n g n θn ph ϕn p` , (H.15) c = λ c + c + c

h ` that is, they are simple linear functions of pc and pc above. These constant terms

A.38 are functions of all the factors that affect the outcomes of local labor markets, n n such as local amenities ac or productivity-enhancing variables embedded in gc .

H.5 Econometric model

In reshaping the spatial equilibrium equations into an econometric model, we make no attempt to do so in a way that would allow us to identify the persistence n n n n force parameters β0 and δ0 from the contemporaneous externalities β1 and δ1 , or from the Gumbel scale parameter σn, for n h,`. While this would be an = important and worthwhile exercise that could be pursued with additional data, it is not the objective of our analysis: identification of the history multipliers µnn0 suffices for the sake of calculating the wage bill ratios (5) discussed in the paper. n n To build our econometric model, we assume that ac and gc are linear func- tions of observable city characteristics as well as econometric error terms. Since n n the contemporaneous externalities β1 and δ1 are not identified, this implies that equations (H.13) and (H.14) are also linear functions of the same aforementioned objects, and their linear parameters cannot be linked via (identified) structural relationships. Therefore we write, for n h,`: =

F n n n X n n pc γ0 γS Sc γf x f c εc (H.16) = + + f 1 + =

¡ n n ¢ where (x1c ,...,xF c ) are observable city characteristics; γ1 ,...,γF are their asso- ciated parameters; Sc is a dummy variable that equals one if c is a Science City n and is zero otherwise; γS is the associated parameter that measures the addi- tional average attractiveness of Science Cities, for workers of type n, that is not n n captured by the other variables of the model; γ0 is a constant term; and εc is an n n h error term with mean zero. Since rc also depends on gc – in addition to pc and ` n pc – we need to specify a separate functional form for gc . Thus we also write, for n h,`: = F X0 n n n n ˜ n gc κ0 κS Sc κf x f 0c υc (H.17) = + + f 1 + 0= where F 0 F is the dimension of the vector (x˜ ,...,x˜ ), a subset of (x ,...,x ); ≤ 1c F 0c 1c F c

A.39 the new parameters are written as ¡κn,κn,κn,...,κn ¢, where κn represents aver- 0 S 1 F 0 S age productivity advantages of Science Cities for firms of type n not captured by n other features of the model; and the new error term (with zero mean) as υc . By plugging all these specifications into equations (3) and (4), we complete the construction of a model of four simultaneous equations. To specify the set of moment conditions that we use in our GMM approach, it is useful to define the following four error terms – one for each equation of the model:

F n n X n ¡ ¢ nh n` ηsz (ns1 nz1) γS γf x f s x f z µ (hs0 hz0) µ (`s0 `z0) = − − − f 1 − − − − − = (H.18)

n F 0 n ¡ n n¢ κS 1 X n ¡ ¢ ωsz ws wz κf x˜f s x˜f z = − − λ − λ f 1 − 0= n " F # n " F # θ h X h ¡ ¢ ϕ ` X ` ¡ ¢ γS γf x f s x f z γS γf x f s x f z − λ + f 1 − − λ + f 1 − = = θnµhh ϕnµ`h θnµh` ϕnµ`` + (hs0 hz0) + (`s0 `z0), (H.19) − λ − − λ − for n h,`. These four error terms are functions of the underlying errors εh, ε`, = s z h ` υs and υz in (H.16) and (H.17). Thus, the moment conditions are constructed as E£ηn ¤ 0 and E£ωn ¤ 0 for n h,`, and: sz = sz = =

£ n ¡ ¢¤ E η n0 n0 0 for n h,` and n0 h,`, (H.20) sz s0 − z0 = = = E£ηn ¡x x ¢¤ 0 for n h,` and f 1,...,F, (H.21) sz f s − f z = = = £ n ¡ ¢¤ E ω n0 n0 0 for n h,` and n0 h,`, (H.22) sz s0 − z0 = = = £ n ¡ ¢¤ E ω x˜ x˜ 0 for n h,` and f 0 1,...,F 0. (H.23) sz f 0s − f 0z = = =

h ` Observe that the parameters from (H.16), except γ0 and γ0, are identified in the population equations (3) – specifically, by the moments of the form (H.21) that use ¡x x ¢ for f 1,...,F as instruments; similarly, the parameters from f s − f z = h ` (H.17), except κ0 and κ0, are identified in the wage equations (4). Since λ is not identified, we calibrate it as λ 0.66 using a conventional choice for the labor =

A.40 elasticity, which in our setting is corroborated by previous exercises of produc- tion function estimation on Russian firms (Kuboniwa, 2011). Taking a value of λ as given, all the parameters in (H.18) and (H.19) are exactly identified. We estimate the model via one-step GMM with an identity weighting matrix; the key assumptions that enable consistent estimation are:

£ n ¯¡ ¢¤ E η ¯ n0 n0 0 (H.24) sz 0s − 0z = £ n ¯¡ ¢¤ E ω ¯ n0 n0 0 (H.25) sz 0s − 0z = for n h,` and n0 h,`. These conditions state that the model’s error terms are = = mean-independent of differences in the Soviet-era allocation of population of ei- ther skill. This is consistent with the empirical strategy from our main matching analysis, motivated by the idea that conditional on the characteristics that deter- mine the construction of our matching sample, Science City status (and thus, the local allocation of labor in Soviet times) is orthogonal to current outcomes. This makes it natural to estimate the model in our restricted matched sample. Since some matched pairs share the same control city, we cluster standard errors by grouping all such pairs into a single cluster; matched pairs whose control city is unique form one-observation clusters. The right-hand side variables from (H.16) and (H.17) that we chose for pro- ducing our main estimates are referred to as “covariates” in Table 6. They are as

follows. The variables (x˜1c ,...,x˜F 0c ) from specification (H.17) include all the spe- cific items of municipal budget expenditure that are listed in Table 4, Panel A – in logarithms. This is meant to control for the vES hypothesis and the possible im- plications of municipal-level expenditures on local productivity. In addition to them, we include the amenity “stocks” listed in Table 4, Panel B – again in loga- n rithms – in our list of variables (x1c ,...,xF c ) from specification (H.16), because pc n also subsumes other factors, represented by ac , that can affect individual loca- tion preferences. Given the limited sample size we keep the model parsimonious and abstain from adding more covariates, such as local geographical or historical characteristics. In light of how our matching sample is constructed, (H.24) and (H.25) would hold regardless.

A.41 H.6 Estimation on the full sample

To provide external validation for our results in Table 6 in the paper, we estimate the model in the larger sample of Russian municipalities. We perform this ad- ditional analysis by estimating a one-city version of the model summarized by equations (3) and (4) in the paper. Specifically, our four equations are:

n pn µnhh µn`` (H.26) c1 = c + c0 + c0 n hh n `h n h` n `` n n θ µ ϕ µ θ µ ϕ µ w r + hc0 + `c0, (H.27) c = c + λ + λ

for n h,`, maintaining specifications (H.16) and (H.17) for pn and r n alongside = c c the choice of covariates discussed above. Note that this “one-city” specification is not fully consistent with our structural model since it ignores the denominator of the probability of choosing city c for either skill type (and this denominator is a function of the structural parameters). As in analogous empirical applications based on multinomial choice models, we assume this problem away by postu- lating that the endogenous response effects implied by the model bear negligi- ble effects on that denominator, which can thus be treated as a constant to be incorporated into pn (more specifically, into parameter γn) for n h,`. Another c 0 = implication of this one-city version of the model is that instead of clustering stan- dard errors, we calculate them using the heteroscedasticity-consistent formula. The results from this analysis are presented in Table H.1: they are remarkably similar to those in Table 6 in the paper, across both approaches described there. Thanks to the increased sample size, all estimates appear to be more precise (es- pecially those for the spillover parameters ϕh and ϕ`). However, the “high-on- low” spillovers µh` are in most cases not statistically different from zero. This confirms our interpretation of these parameters in the paper: a higher relative number of high-skilled workers in the past attracts more low-skilled workers in the future, while the reverse is not true. The restrictions on the spillover param- eters (ϑn θn ϕn for n h,`) are now rejected; however the absolute mag- ≡ = − = nitudes of estimated values for θn and ϕn are very similar, suggesting that the constrained specification is a good approximation. In the constrained model we

A.42 obtain similar values for both ϑh and ϑ`, around 0.31 for Approach 1 and 0.25 for Approach 2, whereas the estimate for ϑ` is smaller and less precise in the matched sample. While this may be a by-product of increased statistical power, it could be indicative of a difference specific to Science Cities, due to their tradi- tional specialization in sectors characterized by high human capital intensity.

Table H.1: Estimation of the spatial equilibrium model on the full sample

Approach 1 Approach 2 Parameter (1) (2) (3) (4) θh (ϑh): high-to-high spillovers 0.236*** 0.298*** 0.231*** 0.236*** (0.024) (0.013) (0.022) (0.021)

ϕh: low-to-high spillovers -0.202*** -0.189*** (0.033) (0.031)

θ` (ϑ`) : high-to-low spillovers 0.243*** 0.331*** 0.239*** 0.245*** (0.026) (0.016) (0.024) (0.024)

ϕ`: low-to-low spillovers -0.194*** -0.182*** (0.036) (0.033)

µhh: history: high-on-high 0.974*** 0.933*** 1.674*** 1.729*** (0.022) (0.024) (0.021) (0.024)

µh`: history: high-on-low 0.019 0.095** 0.008 0.041 (0.028) (0.034) (0.020) (0.023)

µ`h: history: low-on-high 0.124*** 0.165*** 0.142*** 0.087** (0.023) (0.023) (0.025) (0.029)

µ``: history: low-on-low 0.835*** 0.759*** 1.128*** 1.096*** (0.029) (0.030) (0.024) (0.023) Spillover constraints test: p-value 0.000 0.000 Full set of covariates YES YES YES YES Constrained model NO YES NO YES No. of observations 1806 1806 1806 1806

Notes: *, ** and *** denote significance at the 10, 5, and 1 per cent level, respectively. The table displays estimates of the model expressed by equations (H.26) and (H.27) for n h,`, where pn , r n and g n are expressed as linear = c c c functions of a set of city characteristics (“covariates”) and an error term as in (H.15), (H.16) and (H.17), and for both approaches described in the text. Estimation is performed on the entire sample via one-step GMM using an identity weighting matrix. In constrained models, the parameter restrictions ϑn θn ϕn for n h,` are ≡ = − = applied; in unconstrained models, the p-value from a joint test of these restrictions (“spillover constraints test”) is reported. Heteroscedasticity-consistent standard errors are reported in parentheses.

A.43 H.7 Wage bill ratios

The total wage bill ratios from equation (5) can be calculated as a function of the allocation at t 0, using the history multipliers and the spillovers parame- = ters that we estimate, as follows. The ratio of total wage bills paid to high-skilled workers is derived from the equilibrium equations of the model as:

³h³ ´ i h³ ´ i ´ W h exp 1 θh µhh ϕhµ`h (h h ) 1 θh µh` ϕhµ`` (` ` ) , sz = − + s0 − z0 + − + s0 − z0 (H.28) while the corresponding ratio for low-skilled workers is derived as:

³h ³ ´ i h ³ ´ i ´ W ` exp θ`µhh 1 ϕ` µ`h (h h ) θ`µh` 1 ϕ` µ`` (` ` ) . sz = + − s0 − z0 + + − s0 − z0 (H.29) To clarify how a “ratio of ratios” interpretation can help draw useful implications from the calculation of (H.28) and (H.29), consider two alternative allocations ¡ ¢ ¡ ¢ hs∗0,hz∗0,`∗s0,`∗z0 and hs◦0,hz◦0,`◦s0,`◦z0 . The ratio between the two correspond- h h ing values Wsz∗ and Wsz◦ can also be written as:

h ¡¡ h ¢ ¡ h ¢¢ W ∗ exp w ∗ h∗ w ◦ h◦ sz s + s1 − s + s1 , (H.30) h ¡¡ h ¢ ¡ h ¢¢ W ◦ = exp w ∗ h w ◦ h sz z + z∗1 − z + z◦1 where asterisks and circles denote the respective allocation. Under the hypothe-

sis that the city z allocations are identical (h∗ h◦ and `∗ `◦ ) and that the z0 = z0 z0 = z0 general equilibrium effects due to a change in city s allocation are negligible, the denominator of (H.30) is approximately equal to 1 and the whole expression can thus be interpreted as the actual effect of the city s allocation, equal to the nu- merator of (H.30). This argument supports our claim in the paper that the ratio h between the values of Wsz , respectively given in columns 2 and 1 of Table 7, can be interpreted as reflecting the effect of the additional high-skilled allocation on city s. As remarked in the paper, these calculations cannot provide an answer about the overall welfare effects of Science Cities. We plan to develop a more extensive structural model that would allow us to do so in future work.

A.44 H.8 Governmental subsidies

Our model can extrapolate a subsidy tH∗ that the government needs to provide to the high-skilled inhabitants of city s in order to reproduce, in the spatial equilib- rium, an allocation equal to H ∗ H /H , under the hypothesis that cities s and = s1 z1 z are ex-ante identical. Specifically, tH∗ is derived from (H.13) as: ¡ ¢ hs∗1 hz∗1 tH∗ − . (H.31) = µ ϕ` ¶ M σ` β` − 1 − λ

An analogous expression can also be obtained for low-skilled workers via (H.14). Calculating (H.31) requires the knowledge of the unidentified structural param- eters that are subsumed into the history multipliers. The value proposed in the paper as the appropriate subsidy for the reproduction of a typical Science City ¡ ¢ n allocation: h∗ h∗ log(22/14), is obtained through our estimates of ϑ /λ s1 − z1 = for n h,` from column 4 in Table 6, and plausible values of the unidentified = parameters. Specifically, the latter are chosen as follows.

• We set σh δh and σ` β`, the inverse wage elasticities of the municipal-level − 1 − 1 labor supply, at 3.8. This value is based on an econometric analysis of migra- tion patterns across Russian regions by Vakulenko (2016), which suggests that 0.26 is a realistic magnitude for the elasticity of population to local income. This number is also close to the typical estimates of the extensive margin labor supply elasticity from a structural study by Larin et al. (2016).

• We set βh δ` 0.15, equaling the estimate of contemporaneous population 1 = 1 = − externalities by Allen and Donaldson (2020) in the United States.

H.9 Imperfectly mobile capital

The derivation and consequent estimation of our model are based on the as- sumption that capital is perfectly mobile in Russia, which implies that the re- turns to capital are equal across locations. This is a key assumption that allows us to sidestep our inability to obtain accurate measures of capital at the munici-

A.45 pal level. While fixed assets data are provided by ROSSTAT at the municipal level, we found that they are inconsistent with the values it reports at the regional level, casting doubt on the accuracy and reliability of its capital measurements. While one may question the realism of our key assumption, our reading of the literature on the spatial variation of capital costs in Russia suggests that this is not much worse an approximation than it would be in advanced western economies. Most notably, Ledyaeva (2009) shows that the set of geographical predictors of FDI in Russia partly overlaps with the set of covariates included in our matching strat- egy, suggesting that departures from the assumption are less of a concern if the empirical analysis is restricted to our matched sample. Here we sketch how the key equations of the model would change following alternative assumptions about capital mobility. Suppose first that capital is im- mobile in the short run – an extreme assumption, but a good approximation for a cross-sectional estimation exercise. In this case, the equilibrium population equations (3) would be modified, for n h,`, as: =

(n n ) ¡p˜n p˜n¢ µ˜nh (h h ) µ˜n` (` ` ) s1 − z1 = s − z + s0 − z0 + s0 − z0 ³ ´ ³ ´ χnh kh kh χn` k` k` , (H.32) + s − z + s − z

where tildes denote alternative values of variables and parameters, λh and λ` are differentiated labor elasticity parameters for the two types of firm h and `, while

³ ´³ ´ χhh M˜ 1 λh 2 β` ϕ` λh , (H.33) = − − 1 − − ³ ´³ ´ χh` M˜ 1 λh βh ϕh , (H.34) = − 1 + ³ ´³ ´ χ`h M˜ 1 λ` 2 δh θh λ` , (H.35) = − − 1 − − ³ ´³ ´ χ`` M˜ 1 λ` δ` θ` , (H.36) = − 1 +

and

h³ ´³ ´ ³ ´³ ´i 1 M˜ 2 β` ϕ` λh 2 δh θh λ` βh ϕh δ` θ` − . (H.37) = − 1 − − − 1 − − − 1 + 1 +

The increase in model complexity is clear; the equilibrium wage equations (4)

A.46 would result from plugging both equations expressed in (H.32) into (H.5) and its low-skilled analogue, giving rise to complicated expressions that are functions of the log-differences in both types of capital. The resulting model would allow identification of λh and λ` at the cost of increased complexity, provided that re- liable municipal-level capital data are available. Consider next an intermediate assumption: let capital be imperfectly mobile. Specifically, to change the capital allocation between periods t 0 and t 1, 2 = = firms have to pay a quadratic cost equal to χn ¡kn k ¢ /2 for n h,`, which c1 − c0 = acts as an effective tax on capital returns. This assumption would deliver even more complexity: to illustrate the key implications, we showcase a streamlined version of the model with only one type of labor: the low-skilled one, where θ 0, = and where we drop all n h,` subscripts. The spatial equilibrium equation for = the (low-skilled) population is:

¡ ¢ κ gs gz (as az ) β0 (`s0 `z0) ξ(ks0 kz0) (`s1 `z1) − + − + − + − , (H.38) − = 1 β1 κϕ ξ − − + ¡ ¢ ¡ ¢ ¡ ¢ where κ 1 χ / λ χ and ξ χ(1 λ)/ λ χ . The equilibrium equation = + + = − + for wages is, instead:

¡ ¢¡ ¢ ¡ ¢ ¡ ¢ κ 1 β1 gs gz κϕ ξ (as az ) β0 κϕ ξ (`s0 `z0) (ws wz ) − − + − − + − − − = 1 β1 κϕ ξ ¡ ¢ − − + ξ 2 β1 κϕ ξ (ks0 kz0) − − + − . (H.39) + 1 β1 κϕ ξ − − + The challenge that this model would entail is twofold. On the one hand, a version of the model with two skill types would be considerably more complex, as sug- gested by the one skill type equations (H.38) and (H.39). On the other hand, em- pirical estimation of this model would require data on capital at t 0. If access- = ing reliable municipal-level capital data in today’s Russia is challenging, analo- gous data that would trace back to Soviet times or the early transition years would be altogether impossible for obvious institutional reasons. Overall, we find our main assumption about capital, together with a realistic choice for the calibra- tion of parameter λ, to be the most appropriate choice for estimating our model.

A.47 References

Allen, Treb and Dave Donaldson (2020) “Persistence and Path Dependence in the Spatial Economy.” NBER Working Paper no. 28059.

Bircan, Ça˘gatay and Ralph De Haas (2020) “The limits of lending? Banks and technology adoption across Russia,” The Review of Financial Studies, Vol. 33, No. 2, pp. 536–609.

Central management unit of the military communications of the Red Army (1943) Schemes of railways and waterways of the USSR (Shemi zheleznih dorog i vodnih putey soobsheniya SSSR): Military Publishing of the People’s Commis- ariat of Defence. Available in Russian only.

De Witt, Nicholas (1961) Education and professional employment in the U.S.S.R., Washington, D.C.: National Science Foundation.

Dexter, Keith and Ivan Rodionov (2016) “The factories, research and design es- tablishments of the Soviet defence industry: A guide. Version 17,” working pa- per, The University of Warwick, Department of Economics.

Forastiere, Laura, Edoardo M. Airoldi, and Fabrizia Mealli (2020) “Identifica- tion and Estimation of Treatment and Interference Effects in Observational Studies on Networks,” Journal of the American Statistical Association. DOI: 10.1080/01621459.2020.1768100.

Gokhberg, Leonid (1997) “Transformation of the Soviet R&D System,” in L. Gokhberg, M. J. Peck, and J. Gács eds. Russian Applied Research and De- velopment: Its Problems and its Promise, Laxenburg, Austria: International In- stitute for Applied Systems Analysis, pp. 9–33.

Kuboniwa, Masaaki (2011) “The Russian growth path and TFP changes in light of estimation of the production function using quarterly data,” Post-Communist Economies, Vol. 23, No. 3, pp. 311–325.

Lappo, G. M. and P. M. Polyan (2008) “Naukograds of Russia: Yesterday’s for- bidden and semi-forbidden cities - today’s growth points (Naukogrady Rossii: Vcherashniye zapretnyye i poluzapretnyye goroda - segodnyashniye tochki rosta),” World of Russia (Mir Rossiyi), No. 1, pp. 20–49. Available in Russian only.

Larin, Alexandr V., Andrey G. Maksimov, and Darya V. Chernova (2016) “Elasti- ciy of labor supply on wages in Russia (Elastichnost’ predlozheniya truda po

A.48 zarabotnoy plate v Rossii),” Applied Econometrics (Prikladnaya ekonometrika), Vol. 41, pp. 47–61. Available in Russian only.

Ledyaeva, Svetlana (2009) “Spatial Econometric Analysis of Foreign Direct In- vestment Determinants in Russian Regions,” The World Economy, Vol. 32, No. 4, pp. 643–666.

Moretti, Enrico (2011) “Local labor markets,” in Handbook of Labor Economics: Elsevier.

(2013) “Real wage inequality,” American Economic Journal: Applied Eco- nomics, Vol. 5, No. 1, pp. 65–103.

NAS (2002) Successes and difficulties of small innovative firms in Russian nuclear cities: Proceedings of a Russian-American workshop. Committee on Small In- novative Firms in Russian Nuclear Cities, Office for Central Europe and Eurasia Development, Security, and Cooperation, National Research Council, in coop- eration with the Institute of Physics and Power Engineering, Obninsk, Russia, Washington, D.C.: National Academy of Sciences.

Roback, Jennifer (1982) “Wages, rents and the quality of life,” Journal of Political Economy, Vol. 90, No. 6, pp. 1257–1278.

Rosen, Sherwin (1979) “Wage-based indexes of urban quality of life,” in P. N. Miezkowski and M. R. Straszheim eds. Current Issues in Urban Economics: Johns Hopkins University Press, Baltimore, MD.

State Bank (1946) List of branches of the State Bank of the USSR as of 1st March 1946 (Spisok kontor, otdeleniy i agentstv Gosudarstvennogo banka SSSR po sos- toyaniyu na 1-ye marta 1946 goda), Moscow: State Bank of the USSR. Available in Russian only.

Vakulenko, Elena S. (2016) “Econometric analysis of factors of internal migration in Russia,” Regional Research of Russia, Vol. 6, No. 4, pp. 344–356.

Vernadsky State Geological Museum and United States Geological Survey, 20010600 (2001) “rails.shp. Railroads of the Former Soviet Union,” U.S. Geo- logical Survey, Denver, CO.

A.49