Essays on Infrastructure, Trade, and Politics in Developing Countries John S. Firth

Essays on Infrastructure, Trade, and Politics in Developing Countries
by John S. Firth
B.A., University of Notre Dame (2010)
Submitted to the Department of Economics in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Economics at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2018
© John S. Firth, MMXVIII. All rights reserved.
The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created.

Author: Department of Economics, May 15, 2018
Certified by: Daron Acemoglu, Elizabeth and James Killian Professor of Economics, Thesis Supervisor
Certified by: Benjamin Olken, Professor of Economics, Thesis Supervisor
Accepted by: Ricardo Caballero, Ford International Professor of Economics, Chairman, Departmental Committee on Graduate Theses

Submitted to the Department of Economics on May 15, 2018, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Economics

Abstract

This thesis comprises three essays in empirical development economics. Broadly, the essays provide causal evidence on the effects of various barriers to trade, associated with infrastructure, law, and politics.

Chapter 1 begins from the observation that transportation networks worldwide suffer from heavy congestion. To measure this congestion’s effect on the production side of the economy, I combine firm survey data with traffic data from Indian Railways. Geographic variation in congestion comes from a recent wave of passenger trains which were planned according to certain rigid rules, making it possible to identify the costs the additional traffic imposes on firms using the railways to ship goods. In estimating this “congestion externality”, the empirical strategy accounts for both direct and spillover effects of congestion. It also draws on a traffic model from operations research to disentangle a mean effect (congestion makes the average shipment slower) from a variance effect (congestion makes shipping times less predictable). In response especially to the unpredictability, firms simplify operations in several ways, leading to lower productivity and substantial revenue loss. While affected firms suffer, however, I draw on a general equilibrium model of competition to identify gains to their competitors. Policy implications of these results concern both the management of traffic on existing infrastructure, and the construction of new infrastructure.

Chapter 2 (coauthored with Ernest Liu) provides a long-run perspective on the effects of trade costs on the geography of production. We consider India’s Freight Equalization Scheme (FES), which aimed to promote even industrial development by subsidizing long-distance transport of key inputs such as iron and steel. Many observers speculate that FES actually exacerbated inequality by allowing rich manufacturing centers to cheaply source raw materials from poor eastern regions. We exploit state-by-industry variation in the effects of FES on input costs, in order to show how it affected the geography of production. We find, first, that over the long run FES contributed to the decline of industry in eastern India, pushing iron- and steel-using industries toward more prosperous states. This effect sinks in gradually, however, with the time needed to construct new plants serving as a friction to industry relocation. Finally, we test for the stickiness of these effects, by studying the repeal of FES. Contrary to popular opinions of the policy and to agglomeration-based reasons for hypothesizing stickiness, we find that the effects of repealing FES are equal and opposite to those of its implementation. Still, due to changing locations of the processing of basic iron and steel materials, the resource-rich states suffering under FES never fully recover.

Chapter 3 contributes to the debate on laws against foreign bribery. When governments pass laws to prevent their businesspeople from bribing foreign officials, how does this affect patterns of trade and foreign investment? A literature focusing on the OECD Anti-Bribery Convention claims that these laws direct international business toward less corrupt destination countries, with the effect of diverting business away from developing countries. I rebut this claim, using three empirical tests: (i) a baseline test building on previous work but accounting for the omitted role of OECD-level cooperation trends, (ii) an analysis of an initiative intensifying the Convention’s enforcement, and (iii) a test exploiting product-by-destination level variation in pre-Convention exposure to OECD exports. Together, these tests show that the redirection of trade and investment following the passage of the foreign bribery laws was due not to the laws themselves, but to an underlying trend of increased political cooperation among OECD countries, as indicated by patterns in UN voting affinity. This cooperation is what simultaneously led OECD countries to pass measures such as the Convention, and to do more business with other OECD countries, which happen to be less corrupt on average than non-OECD countries.

Thesis Supervisor: Daron Acemoglu Title: Elizabeth and James Killian Professor of Economics

Thesis Supervisor: Benjamin Olken Title: Professor of Economics

Acknowledgments

I am deeply grateful to those who helped me through the long and sometimes difficult journey which led to this thesis. First and foremost, I thank my advisors, Daron Acemoglu, Ben Olken, and Abhijit Banerjee, for their incredible advice, encouragement, and mentorship. I feel very fortunate that they have invested so much in my development as an economist.

Apart from my official advisors, several other professors at MIT provided invaluable guidance, especially Esther Duflo, Rob Townsend, David Atkin, and Dave Donaldson. It has been a privilege to be part of such a vibrant and supportive group.

My peers also played an integral role in my graduate school experience. For their help and friendship, I thank Sid George, Nick Hagerty, Greg Howard, Peter Hull, Donghee Jo, Gabriel Kreindler, Matt Lowe, Ernest Liu, Ben Marx, Rachael Meager, Yuhei Miyauchi, Scott Nelson, Arianna Ornaghi, Brendan Price, Otis Reid, Ben Roth, Ludwig Straub, Marco Tabellini, and Tite Yokossi.

My inspiration to pursue a PhD and become an academic came during my undergraduate studies, and for this I especially thank Bill Evans, John Roos, Jim Sullivan, and Paul Weithman. I also thank Dan Keniston, Clément Imbert, and everyone at J-PAL South Asia who supported me in my first experiences with fieldwork.

For help with the chapters on Indian Railways, I thank Arvind Subramanian, Rangeet Ghosh, and their colleagues at the Ministry of Finance. I am also grateful for the help and insights of many dedicated people working with the Indian Railways, including Shri Prakash, GK Mohanty, NM Rao, Vishy Shanker, Beji George, Ashwani Kumar, and Rajnish Kumar.

Finally, I thank my parents, my siblings, and Anushka Shah, for the love and support which not only carried me through this thesis, but also kept it in perspective and made it all meaningful.

Contents

1 I’ve Been Waiting on the Railroad: The Effects of Congestion on Firm Production
  1.1 Introduction
  1.2 Context and data
    1.2.1 Indian Railways
    1.2.2 Data sources
  1.3 Reduced form effects of congestion on firm revenue
    1.3.1 Basic empirical strategy
    1.3.2 Spillovers from diversion of traffic onto alternate routes
    1.3.3 Reduced form results
  1.4 The effect of shipping times: mean versus variance
    1.4.1 Model and empirical strategy
    1.4.2 Results of the shipping times IV
  1.5 Explaining the revenue loss: costs versus competition
    1.5.1 Model and empirical strategy
    1.5.2 Empirical application of the model
  1.6 Policy
    1.6.1 Traffic management on existing infrastructure
    1.6.2 New infrastructure
  1.7 Conclusion
  1.8 Tables and Figures
  1.9 Appendix

2 Manufacturing Underdevelopment: India’s Freight Equalization Scheme, and the Long-run Effects of Distortions on the Geography of Production
  2.1 Introduction
  2.2 Background
  2.3 Empirical analysis
    2.3.1 Long-run effect
    2.3.2 Transition path
    2.3.3 Stickiness
  2.4 Conclusion
  2.5 Tables and Figures
  2.6 Appendix

3 Do Anti-Bribery Laws Affect International Trade and Investment?
  3.1 Introduction
  3.2 Background
  3.3 Data and methodology
  3.4 Results
    3.4.1 Strategy 1: Baseline model, accounting for international cooperation
    3.4.2 Strategy 2: Increased enforcement in Phase 3 of the OECD Working Group on Bribery
    3.4.3 Strategy 3: Product-level trade
  3.5 Conclusion
  3.6 Tables and Figures

4 Bibliography

List of Figures

1-1 Histogram of line capacity utilization
1-2 Route-wise average freight shipment times
1-3 Sample from scraped website with data on train routes
1-4 Goods shipped by rail in India
1-5 Reduced form empirical strategy, accounting for spillover effects
1-6 District-wise exposure to Duronto routes
1-7 District-wise exposure to spillover routes
1-8 Event study for effect of Durontos
1-9 Mean and variance response to Durontos, as a function of pre-existing congestion
1-10 Event study for the effect of $D_{dy} \times T_{d,y=y_0}$ on revenue
1-11 Effects of two hypothetical construction projects
1-12 Effects of congestion on travel times

2-1 The economic geography of India at the time of FES implementation
2-2 Historical trends in state manufacturing output
2-3 Stylized description of supply chain, providing institutional details and link with data
2-4 Map of basic iron and steel production
2-5 Share of Bihar and West Bengal in iron- and steel-using manufacturing (Engineering) and in other industries
2-6 Event study showing transition path of FES effects
2-7 Event study on repeal of FES

3-1 Event study checking for pre-trends in exports to corrupt destinations
3-2 Trends in total exports from OECD countries
3-3 Trends in OECD exports, residualized
3-4 Patterns in UN voting alignment among OECD members
3-5 Pre-trends in estimates of import effects

List of Tables

1.1 Descriptive statistics for factories in rail using industries
1.2 Effects of Durontos on railway line traffic patterns
1.3 Reduced form effects of Duronto trains on rail using firms
1.4 Placebo effects on non rail using firms
1.5 First stage effects of Duronto traffic on freight shipment times
1.6 2SLS estimates of mean and variance effects
1.7 Model estimates of cost and competition effects
1.8 Model estimates of cost and competition effects, with elasticity interactions
1.9 Aggregate effects of Duronto congestion on revenue, at state-industry level
1.10 Aggregate effects of Duronto congestion on gross value added, at state-industry level
1.11 The cost of running one Duronto train
1.12 Effects on district railway traffic
1.13 Effects on firm logistics
1.14 Effects on firm product mix
1.15 Heterogeneity by use of rail goods as inputs, and production of rail goods as output
1.16 Heterogeneity by road density
1.17 Reduced form estimates, controlling for distance to cities served by Durontos
1.18 Reduced form estimates, controlling for Duronto and spillover traffic on shipping lines
1.19 Reduced form estimates, controlling for changes in market access
1.20 Reduced form estimates, with sample including all districts in mainland India
1.21 Reduced form estimates, with sample excluding “donut” around Duronto endpoints
1.22 Reduced form estimates, with narrower definition of spillover route
1.23 Reduced form estimates, with wider definition of spillovers, including second-order
1.24 Reduced form estimates, with spillovers restricted to 200km
1.25 Replication of Chen and Harker (1990)

2.1 List of industries in Indian Census of Manufactures
2.2 Descriptive statistics
2.3 Effects of FES on long-run industrial growth, for states with unchanged boundaries
2.4 Effects of FES on long-run industrial growth, for all states
2.5 Effects of FES on long-run location of downstream industries
2.6 Effects of FES on long-run location of iron-using industries
2.7 Effects of repealing FES
2.8 Effects of repealing FES, controlling for pre-trend from 1967 to 1990
2.9 Effects of FES on long-run industrial growth, unweighted
2.10 Effects of repealing FES, controlling for pre-trend from 1989 to 1990

3.1 Countries signing OECD Anti-Bribery Convention
3.2 Difference-in-difference estimates of effects of OECD Anti-Bribery Convention on trade
3.3 Triple-difference estimates controlling for indicators of international cooperation
3.4 Effects of OECD-ABC on FDI
3.5 Effects by enforcement of anti-bribery laws
3.6 Effects of Phase 3 increase in convention implementation
3.7 Non-OECD exporters picking up slack
3.8 Effects of OECD exposure on imports

Chapter 1

I’ve Been Waiting on the Railroad: The Effects of Congestion on Firm Production

1.1 Introduction

Transportation networks worldwide suffer from heavy congestion. In economics, most existing research on congestion treats it as an urban problem, affecting personal commutes. Yet congestion also affects long distance goods shipments, with firms and policymakers alike claiming that this poses a major barrier to firms’ productive efficiency and growth.

To visualize why congestion might affect firm production, consider a manufacturer waiting on its inputs to ship from Mumbai to New Delhi, along one of India’s busiest rail corridors. The distance is 870 miles. With a clear railway line, a freight train running at normal speed could make the trip in less than a day. In practice, it can take two weeks. Walking from Mumbai to Delhi would be faster – 11 days by Google Maps estimates. Financing and depreciation costs might accumulate while the goods are in transit, and the slow shipping might limit the manufacturer’s ability to adapt to changing conditions. But slow shipping is only part of the potential problem, as congestion also makes shipping unpredictable: the goods might find a clear path on the rails and arrive in a couple of days, before the manufacturer has any use for them, or bad traffic might push the wait to weeks, disrupting the manufacturer’s supply chain and slowing productivity. Factors like these provide the basis for firms’ complaints about congestion, and the justification for major infrastructure investments such as India’s Dedicated Freight Corridor, a proposed network of freight-only railroad tracks aiming to boost firm productivity by freeing freight shipments from congestion.

At the same time, two economic factors suggest congestion might not prove very costly after all. First, firms can try to insulate themselves from congestion, for instance by holding inventories to guard against stockouts. The cost of congestion for an individual firm depends on the availability of these insulating measures, and on the extent to which the measures bring costs of their own. Second, even if congestion hurts some firms, its net effect depends on the ability of these firms’ competitors to steal their business and replace the lost revenue. In light of these possibilities, the magnitude of congestion’s cost is empirically ambiguous, and so too are the benefits of policies and infrastructure projects aimed at congestion relief.

To settle these empirical questions, I compile a unique dataset linking firm surveys with detailed measures of congestion and shipping times on the Indian Railways. The data reveal staggering amounts of congestion, with more than half of the major lines running beyond the capacity prescribed by international engineering norms. Consistent with operations research models of railway traffic, travel on these congested lines is slow and arrival times vary widely, both across routes and over time for a given route. The main reason for the congestion is that the Indian Railways, a political Ministry answering to voters’ demands, often introduces new passenger trains, and does so without regard for the effects on overall traffic flows. In countries such as the United States, where freight operators own the tracks, passenger trains often need to stop and wait for freight trains. But India does the opposite: passenger trains almost always get first priority on the rails. As a result, heavy passenger traffic slows freight shipments, and looking at the addition of passenger trains is an ideal way to study the effects of congestion on firms using railway freight.

I focus, in particular, on a major recent passenger train program, the Duronto trains, exhibiting two features crucial for identification. First, Durontos adhere to a rigid rule of taking the shortest possible path between endpoints, ruling out endogenous selection of the path. Second, Durontos are supposed to make no stops between their endpoints, ruling out any effects of the trains on the intermediate rail lines other than through congesting these lines and disrupting freight shipments in the area. To avoid confounds from selection or effects unrelated to congestion, I only focus on these intermediate districts, excluding from my analysis the endpoint cities targeted by the Duronto program. Several pieces of evidence indicate that, conditional on being located between two cities considered for the Duronto program, actually having a Duronto pass through a given district is as good as random.

Even with as-good-as-random shocks to congestion, an important identification challenge remains, having to do with spillover effects. Specifically, when one railway line becomes more congested, some of its traffic moves to neighboring lines, increasing congestion there. Thus, the neighboring lines are not a suitable control group. In the language of Rubin (1980), using the neighbors as controls would violate the stable unit treatment value assumption (SUTVA). Finding “pure controls” which satisfy SUTVA is a challenging problem in spatial economics, especially as relates to infrastructure projects, and the literature offers few convincing solutions (Donaldson, 2015; Redding and Turner, 2015). The fundamental dilemma is that control units need to simultaneously (a) satisfy SUTVA, which is more likely if they are far away or different in kind from the treatment units, and (b) serve as a counterfactual for the treatment units, which is more likely when they are nearby or similar.

I overcome this dilemma by using data on rail traffic patterns to identify exactly which districts will receive spillover traffic from the Durontos. For each Duronto route and each pair of stations along the route, I identify all the paths taken by at least one regularly scheduled train traveling between these stations. I refer to this set of paths as the “spillover routes” for the Duronto in question. To show that the districts on these spillover routes are exactly the ones exposed to spillover traffic, I conduct a “zero-th stage” analysis of traffic patterns. It shows that when a Duronto is introduced on a given rail line, this increases traffic both on that line and on the spillover routes I have identified. The spillovers do not extend any farther, however, as there is no traffic increase on the “second order” spillover-routes-of-the-spillover-routes. I therefore know the set of districts exposed to spillovers, and can account for this in studying the effects on these districts’ firms.
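The construction of spillover routes described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the code used in the thesis: the station labels, the representation of each train as an ordered list of stops, and the function names `segment_between` and `spillover_routes` are all hypothetical. The sketch also assumes that the Duronto's own path between a station pair is excluded from the spillover set.

```python
from itertools import combinations


def segment_between(stops, a, b):
    """Return the stretch of a train's stop list from a to b, or None.

    The train may run in either direction; the result is always
    oriented from a to b.
    """
    if a not in stops or b not in stops:
        return None
    i, j = stops.index(a), stops.index(b)
    if i == j:
        return None
    seg = stops[min(i, j): max(i, j) + 1]
    return tuple(seg) if i < j else tuple(reversed(seg))


def spillover_routes(duronto_route, scheduled_trains):
    """Alternate paths, each used by at least one scheduled train,
    between any pair of stations on the Duronto route."""
    routes = set()
    for a, b in combinations(duronto_route, 2):
        direct = segment_between(duronto_route, a, b)
        for stops in scheduled_trains:
            seg = segment_between(stops, a, b)
            # Assumption: the Duronto's own path is not a spillover route.
            if seg is not None and seg != direct:
                routes.add(seg)
    return routes


# Hypothetical network: one Duronto route and three scheduled trains.
duronto = ["A", "B", "C"]
trains = [["A", "B", "C"], ["C", "X", "A"], ["Y", "B", "Z", "C"]]
routes = spillover_routes(duronto, trains)
```

In this toy network the spillover set contains the detour from A to C via X and the detour from B to C via Z; districts along those detours would be the ones flagged as exposed to spillover traffic.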

Reduced form results show that Duronto traffic leads to increased costs, less efficient production, and ultimately a substantial revenue loss for firms in rail-using industries. For each new line of Duronto service passing through a district, local factory revenue falls by 1.9 percent. The preferred specification also includes a control for each district’s exposure to spillover traffic, which serves two purposes. First, it shows that spillover traffic also causes revenue loss, with each spillover route passing through a district leading to a 1.1 percent loss in factory revenue for rail-using firms. Second, it serves to remove bias in the estimates of the Duronto main effect, by controlling for an omitted variable. Spillover traffic through a district is negatively correlated with Duronto traffic through that district, because the spillover routes tend to run in parallel to the main Duronto routes. Since spillovers themselves have a negative effect, failing to control for them would bias estimates of the Duronto main effect toward zero. In the end, the local revenue loss associated with Duronto traffic is substantial, and we realize its full magnitude only by accounting for spillovers.
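The direction of this bias follows from the standard omitted-variable formula: the coefficient from a regression that leaves out spillovers equals the Duronto effect plus the spillover effect times the slope from regressing spillover exposure on Duronto exposure. The sketch below checks this on a toy dataset. The two coefficients (-1.9 and -1.1) come from the text; the six hypothetical districts and their exposure counts are invented purely for illustration.

```python
def ols_slope(x, y):
    """Slope from a one-regressor OLS of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


BETA_DURONTO = -1.9    # percent revenue change per Duronto route (from the text)
BETA_SPILLOVER = -1.1  # percent revenue change per spillover route (from the text)

# Hypothetical exposure counts for six districts; spillover exposure is
# negatively correlated with Duronto exposure, as the text describes.
duronto = [0, 0, 1, 1, 2, 2]
spillover = [2, 1, 1, 1, 0, 1]

# Noise-free outcome, so any gap between estimates is pure omitted-variable bias.
revenue_change = [BETA_DURONTO * d + BETA_SPILLOVER * s
                  for d, s in zip(duronto, spillover)]

short_estimate = ols_slope(duronto, revenue_change)  # regression omitting spillovers
delta = ols_slope(duronto, spillover)                # slope of spillover on Duronto
predicted = BETA_DURONTO + BETA_SPILLOVER * delta    # omitted-variable formula
```

With these made-up exposures, the regression that omits spillovers recovers roughly -1.35 rather than -1.9: a negative spillover effect multiplied by a negative correlation adds a positive term, pulling the estimate toward zero.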

Why do affected firms lose revenue from Duronto-associated congestion? In settings other than Indian Railways, congestion could affect firms through a variety of mechanisms, including increased shipping prices, reduced availability of freight shipment, or disruptions to passenger travel. In this setting, however, the evidence points toward a single, clear mechanism: the Duronto traffic disrupts freight shipments for rail-using firms, making these shipments slower and less predictable, which raises the effective costs of producing output and delivering it to consumers.

Several institutional details specific to Indian Railways support the claim that this is the mechanism at work. The Indian Railways fixes freight rates as a function of distance, independent of congestion, ruling out costs associated with firms facing higher shipping expenses. It also does not ration freight trains or change the schedules of existing passenger trains as a result of new trains like the Durontos, ruling out effects associated with freight availability or movements of passengers for labor or consumption. Moreover, each of these institutional details finds validation in the empirical evidence, as Duronto traffic has no effect on firms’ reported nominal shipping expenses, the number of freight trains run, or the number of passenger trips. So while congestion, in general, works through a bundle of possible channels, this paper’s institutional setting provides an opportunity to isolate its effects working through freight shipping times.

Apart from these institutional factors, several aspects of the firm’s observed empirical response to Duronto congestion also point toward effects associated with shipping times. First, Duronto traffic leads affected firms to hold larger inventories. The theory of inventory management, in the tradition of Arrow, Harris and Marschak (1951), holds that for a firm trying to guard against stockouts, a key determinant of the optimal inventory level is the time it takes inputs to arrive. Second, Duronto congestion leads firms to alter their product mix, making fewer products per factory, and switching to products which are less time sensitive and which have more predictable demand. These responses are consistent with firms trying to remove uncertainty in the production process, in order to offset the increased uncertainty about shipment times on a congested transport network.

Given that shipping times are the reason Duronto traffic creates problems, I distinguish two specific aspects of this problem: a mean effect (congestion slows average shipping times) and a variance effect (congestion makes shipping times less predictable). To understand the link between congestion and shipping times and, indeed, why congestion is so central to transportation economics, consider a single hypothetical railway line. In principle, arbitrarily many trains can run on the line at arbitrarily high speeds, and with no variance in arrival time, if they are dispatched one after another, running at the same speed and in the same direction. With train speed differentials and different directions of travel, however, trains meet and delay each other. It is because of these potential meetings between heterogeneous trains that congestion becomes a problem: each new train is a possible source of delay for the trains already on the line. Mean travel times increase with congestion due to the higher number of expected train meetings, and variance increases because with more congestion there are more possible meetings, each of which might or might not happen. These effects are, moreover, worse when there is more heterogeneity in train speeds. Averaging 70 to 80 kilometers per hour, the Durontos are among the fastest trains on Indian Railways, while freight trains, typically running at 25 kilometers per hour, are the slowest. It therefore comes as no surprise that the Durontos cause substantial increases in both the mean and the variance of freight shipping times.
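The mean and variance logic can be illustrated with a deliberately stylized calculation. Assume, hypothetically, that a freight train faces `n_meetings` potential meetings with other trains, that each meeting happens independently with probability `p`, and that each one costs a fixed `delay_hours`. This binomial setup is an assumption made for illustration only; it is not the operations research model used later in the chapter, and all numbers are invented.

```python
def meeting_delay_moments(n_meetings, p, delay_hours):
    """Mean and variance of total delay when each potential meeting is an
    independent event with probability p and a fixed delay per meeting."""
    mean = n_meetings * p * delay_hours
    variance = n_meetings * p * (1 - p) * delay_hours ** 2
    return mean, variance


# Adding trains to the line raises the number of potential meetings
# (hypothetical values).
before = meeting_delay_moments(n_meetings=10, p=0.3, delay_hours=2.0)
after = meeting_delay_moments(n_meetings=15, p=0.3, delay_hours=2.0)
```

Under these assumptions both moments rise with the number of potential meetings: each added train contributes one more delay that might or might not materialize.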

In terms of the economic implications for firms, the relative importance of shipping time mean versus variance is, a priori, ambiguous. Slow shipping, in the sense of high mean shipping times, might prove unproblematic if firms simply need to place orders farther in advance. Or it might cause major problems if production and demand are uncertain. For instance, a car manufacturing firm might forecast high demand for red cars and place an order for red paint, only to find that by the time its paint arrives all of its recent orders are for blue cars and it is stuck with the wrong color. Variance of shipping time becomes a problem when, for instance, a firm’s input orders arrive later than anticipated, forcing it to stop production because it lacks a key input. On the other hand, variance might matter less if firms can costlessly guard against stockouts with measures such as inventories, or if they can forecast the arrival time of a particular shipment and plan accordingly.

The empirical challenge is to obtain independent variation in the mean and variance of shipping time, which I accomplish by drawing on an operations research model of railway travel times. Chen and Harker (1990) and Harker and Hong (1990) model travel times on a railway line where trains are dispatched according to a given distribution of departure times and train characteristics. I extend their model to show how travel times change with the introduction of additional trains. Both mean and variance increase with additional traffic, but the model’s key result is that at higher congestion levels the variance diverges from the mean. Intuitively, this divergence comes from “knock-on effects”: on a congested line trains might adhere to schedule on days when none of them are delayed; but when one train gets delayed, this delays other trains which need to wait for it, which is in turn correlated with delays between other pairs of trains, yielding an especially high variance of travel time. I thus instrument for mean and variance using the Duronto shock along with flexible interactions of this shock with pre-Duronto congestion levels.
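One hedged way to see why knock-on effects make the variance pull away from the mean is to let individual delays be positively correlated. The sketch below assumes, purely for illustration, `n` identically distributed delays with mean `mu`, variance `sigma2`, and a common pairwise correlation `rho`. It is not the Chen and Harker (1990) model, and all parameter values are hypothetical. The mean of the total delay grows linearly in `n`, while the variance picks up an extra term proportional to `n * (n - 1) * rho`.

```python
def total_delay_moments(n, mu, sigma2, rho):
    """Mean and variance of the sum of n equicorrelated delays."""
    mean = n * mu
    variance = n * sigma2 * (1 + (n - 1) * rho)
    return mean, variance


def variance_to_mean(n, rho, mu=1.0, sigma2=1.0):
    """Ratio of variance to mean of total delay for n delays."""
    mean, variance = total_delay_moments(n, mu, sigma2, rho)
    return variance / mean


# With independent delays the ratio is flat in n; with correlated
# (knock-on) delays it rises as the line gets busier.
flat = (variance_to_mean(5, rho=0.0), variance_to_mean(20, rho=0.0))
rising = (variance_to_mean(5, rho=0.2), variance_to_mean(20, rho=0.2))
```

This is the sense in which an interaction between a traffic shock and pre-existing congestion can move the variance differently from the mean, which is what the instrumenting strategy in the text relies on.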

Two stage least squares results show differing effects of the slowness and the unpredictability, with unpredictability proving the more costly. Consistent with models of inventory management, increases in both mean and variance of shipping time prompt firms to increase their inventory holdings, as a guard against stockout risk. While these measures may provide firms with some insulation, they do not fully buffer against the costs of unpredictable shipping. For each 10 percent increase in the variance of shipping times, average costs increase by 0.3 percent, and in turn, revenue falls by 1.1 percent.

The magnitude of these revenue losses is substantial, raising a puzzle: how can just a few new passenger trains cause such large losses for affected firms? I distinguish two basic explanations. One possibility is that Duronto congestion has a large “cost effect”: it substantially raises an affected firm’s production costs by making freight shipments slower and less reliable, and thereby disrupting its supply chain. Large cost effects imply that if every firm in the economy suffered an increase in congestion, large losses in output and welfare would follow. Several features of the setting make a large cost effect seem like a plausible explanation. The median district in the sample has only one relatively small railway line, so adding even a single Duronto route through it consumes a large amount of its line capacity. The associated increases in both the mean and variance of freight shipment times force firms into responses such as higher inventory holdings and more conservative production processes. With these movements away from efficient production, firms ultimately exhibit higher average costs and lower revenue productivity.

The alternate possibility, however, is that Durontos’ cost effect is actually small, and firm revenue losses owe more to stiff competition: in competitive markets, firms with even a small cost increase can fall behind their competitors and see revenue plummet. It is possible, moreover, that when these competitors steal the business of congestion-affected firms, this is a relocation of output, but not a large net loss. Distinguishing between these cost and competition based explanations is essential because if the cost effect is in fact small, then increasing congestion for every firm in the economy could lead to negligible aggregate losses.

In distinguishing these explanations, I first turn to evidence isolating the cost effect, based on the finding that Duronto traffic leads to increases in factories’ average cost per unit of production. A simple prediction of firm optimization, for a very general class of production functions, and even in the presence of competition effects, is that observed increases in average cost provide a lower bound on the magnitude of the shift in the firm’s cost function. Intuitively, any competition effects reduce a firm’s output, pushing it down its cost function, which, with decreasing returns, serves to reduce average cost.1 So observed effects of congestion on average cost reflect the actual outward shift in the firm’s cost function, offset by this downward force from competition. The methodological appeal of this result is that it leverages firm data to isolate basic features of the firm problem which are invariant to any competition effect, making it possible to identify the cost effect without strong dependence on general equilibrium model assumptions.
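A small numerical sketch may help fix ideas about the lower bound. Assume, hypothetically, a cost function C(q) = c * q**gamma with gamma > 1 (decreasing returns), so that average cost is c * q**(gamma - 1). Congestion scales c up by `cost_shift`, and competition scales output down by `output_drop`. The functional form and all numbers are illustrative assumptions, not estimates from the thesis.

```python
def average_cost(c, q, gamma):
    """Average cost for the assumed cost function C(q) = c * q**gamma."""
    return c * q ** (gamma - 1)


def observed_ac_ratio(cost_shift, output_drop, gamma):
    """Ratio of post-shock to pre-shock average cost, with the pre-shock
    cost parameter and output both normalized to one."""
    before = average_cost(1.0, 1.0, gamma)
    after = average_cost(1.0 + cost_shift, 1.0 - output_drop, gamma)
    return after / before


true_shift = 1.05  # cost function shifts out by 5 percent (hypothetical)
observed = observed_ac_ratio(cost_shift=0.05, output_drop=0.04, gamma=1.5)
no_competition = observed_ac_ratio(cost_shift=0.05, output_drop=0.0, gamma=1.5)
```

With no loss of output, the observed average cost ratio equals the true shift. Once competition reduces output, the observed ratio falls below the true shift, so the observed increase understates the shift in the cost function.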

Given the cost increase for affected firms, the ultimate implications for firm rev- enue depend on the nature of competition, which is characterized by two main em- pirical results. First, exposure of a firm’s competitors to Duronto traffic leads to increased sales for that firm, indicating ready substitution between the products of the affected and unaffected firms. Second, the negative effect of Durontos onfirm revenue is concentrated in industries with high elasticities of substitution, indicating that the degree of substitutability magnifies the consequences of the cost effect. To interpret these results and quantify the aggregate effects of the congestion shock, net of business stealing, I draw on a model of general equilibrium interactions between firms competing in a given industry. Based on Rotemberg (2017), the model predicts how a cost shock will translate into revenue loss for the affected firm, as a function

1In the presence of fixed costs, this movement might not reduce average cost, but would reduce average variable cost. The effects I find on average variable cost are similar to those on average cost.

of elasticities of substitution and the exposure of the firm’s competitors to similar cost shocks. The associated parameter estimates imply that competitors’ gains from business stealing are almost as large as the losses suffered by Duronto-affected firms, adding up to only a minimal aggregate loss from running the Duronto trains.

In applying these results to policy, I first consider the implications for traffic management on existing infrastructure. Currently, Indian Railways maintains a uniform priority for passenger trains over freight traffic, and does little to increase the speed of lagging freight trains. Since variable shipping times are the main source of cost for freight-using firms, however, the Railways could greatly help these firms by granting higher priority to a freight train which has already met some delay. One such priority scheme would be a backpressure routing algorithm (Neely, 2010), which routes traffic on a network to minimize a sum of squared delays, and so reduces the probability of extreme delays and thereby the variance. More concretely, the Indian Railways is experimenting with running freight trains on fixed time tables, in contrast to the policy of scheduling them on an ad hoc basis, in between the running of passenger trains, and with no promised arrival times. Implementing fixed time tables and more predictable freight shipping is a particular priority for the Dedicated Freight Corridor, and the costs associated with variable shipping times once again indicate that these policies could yield substantial gains for affected firms.

Apart from these implications for traffic management on existing infrastructure, congestion also bears on the construction of new infrastructure. In this vein, a second policy application considers the choice between two hypothetical rail construction projects. The first project adds a new rail line between Mumbai and Delhi, a corridor with several lines serving it already, but suffering from heavy congestion. The second hypothetical project is a new line between Amaravati, the newly planned capital of the state of Andhra Pradesh, and Raipur, the capital of neighboring Chhattisgarh. Currently the route connecting these cities is a circuitous 873 kilometers, even though the straight-line distance is only 406 kilometers. Assuming the Amaravati-Raipur route is congestion-free, how much of a reduction in its length would deliver the same benefits as building a new line to decongest the Mumbai-Delhi corridor?

In the logic of Fogel (1964) and of the least-cost path approach common in the contemporary trade literature (Donaldson, 2017; Donaldson and Hornbeck, 2016), the benefit of the new Mumbai-Delhi line is minimal, because it acts as a substitute for the existing lines. Accounting for congestion, however, two factors point to larger benefits for the Mumbai-Delhi line. First, given the convexity of congestion costs, large benefits result from decongesting an already congested line. Second, decongesting Mumbai-Delhi also relieves congestion on neighboring lines, leading to gains from spillover effects. To model the implications of these factors for welfare, I use a version of the Allen and Arkolakis (2016) framework for characterizing how welfare is affected by infrastructure improvements. Combining this model framework with my empirical estimates, I find that building the new Mumbai-Delhi line would yield the same benefits as shortening the Amaravati-Raipur rail route to the physically impossible distance of 384 kilometers. Decongestion indeed has possible advantages over simply shortening travel distances.

This paper’s analysis of congestion offers an empirical supplement to a recent literature modeling optimal infrastructure investment in the presence of congestion (Fajgelbaum and Schaal, 2017; Allen and Arkolakis, 2016). These papers model trade costs as a function of quantities shipped along a trade link, departing from the conventional assumption of iceberg trade costs. Fajgelbaum and Schaal (2017) show that incorporating congestion in this manner shuts down complementarities in infrastructure investment, convexifying the optimization problem of a social planner choosing infrastructure investments, goods movements, and economic quantities. This convexification ensures a unique solution and simplifies the procedure for finding it. The assumption behind this modeling device is that additional traffic on the transportation network increases costs for other users, and my results provide empirical support for this assumption.

More broadly, this paper adds to a burgeoning literature on the micro-foundations of trade costs. As surveyed in Anderson and Van Wincoop (2004), both domestic and international trade costs depend on a variety of frictions, from nominal freight prices, to policy barriers, among many others. More recent work uses a combination of theory and micro data to show exactly how these frictions depend on the economics of, for instance, imperfect information (Allen, 2014; Startz, 2016), the organization of production networks (Hillberry and Hummels, 2008), and contractual relationships (Macchiavello and Morjaria, 2015). Closest to my paper is the Hummels and Schaur (2013) study using exporters’ revealed preference for shipments by air versus ocean, in order to estimate the value of time in trade. I go beyond their findings by identifying congestion as an important source of the variation in shipping times, then demonstrating the causal effect on firms, the mechanism of the firm response, and how these effects differ, separately, for changes in the mean and variance of shipping time. My ability to take these extra steps owes to advantageous features of my setting, in which freight rates and distances are fixed, enabling me to isolate the effects of shipping times. Characterizing trade costs as a function of shipping times offers a useful way to predict the effects of infrastructure projects, since the planning of most projects involves ready engineering estimates of how the project will affect travel times.

Indeed, my emphasis on congestion bears special relevance to modern infrastructure projects. Existing infrastructure papers tend to focus on historical projects (Fogel, 1964; Donaldson, 2017; Banerjee, Duflo and Qian, 2012), or in any case on the establishment of large-scale transport systems (Baum-Snow, 2007; Duranton and Turner, 2012; Faber, 2014), aiming to speak to the old debate about the importance of railroads and national highway systems in countries’ development. Today, when a developing country like India spends 3 percent of annual GDP on infrastructure, it typically is not constructing new transport systems, but more often widening existing highways, adding links to an already-dense railway network, or otherwise addressing the problem that the existing transport systems are inefficient, unreliable, and indeed, congested. These issues require firms to make complex logistical adjustments, making it essential to understand why and for which firms the adjustments are most costly.

At the broadest level, I contribute to the literature on the determinants of low firm productivity in developing countries. Policy debates cite poor infrastructure as a major impediment to productivity (World Bank, 1994; Bajaj, 2010; Ziobro, 2017), and point specifically to inventory holdings as a symptom (Guasch and Kogan, 2003; Datta, 2012; Li and Li, 2013). I provide causal evidence on the mechanisms of this firm response to poor infrastructure, its origins in the unpredictability as well as the slowness of shipping, and its implications for productivity. Some papers consider the effects of uncertainty on productivity (Allcott, Collard-Wexler and O’Connell, 2016) and misallocation (Asker, Collard-Wexler and De Loecker, 2014; David, Hopenhayn and Venkateswaran, 2016), but disentangling uncertainty from the adverse events that often accompany it is difficult in practice (with one notable effort being Bloom, 2009). I provide some of the first causal microeconomic evidence drawing this distinction.

1.2 Context and data

This section describes the Indian Railways context, and the data used to study the effects of congestion in this setting. The Indian Railways are an important carrier of both passengers and goods, but Railways traffic data shows overwhelming congestion on most of its lines, leading to slow and unreliable freight shipments. A major source of congestion is the indiscriminate adding of new passenger trains, so to study how this hurts freight using firms, I highlight one particular wave of new passenger trains, the Durontos, which were introduced according to certain rigid rules proving useful for identification. A basic contribution of this paper is linking data on these railway traffic patterns, with detailed data on firm outcomes, which I draw from India’s Annual Survey of Industries (ASI).

1.2.1 Indian Railways

The official slogan of the Indian Railways, “Lifeline to the Nation”, speaks to the perceived economic importance of the Railways. India has the world’s third largest railway network by track length, and trails only Japan in passenger volume, handling over 8 billion trips per year. India especially excels at making passenger travel affordable. The average Indian passenger fare amounts to 0.6 US cents per kilometer, compared with 2.4 US cents per kilometer in China, and far higher rates in developed countries: 12.6 cents per kilometer in Germany, for instance, and 19.0 cents per kilometer in Japan.2

The convenience and affordability of passenger travel comes, however, at a cost. Passenger fares are insufficient to cover operating expenses, so the Railways’s financing of passenger travel relies on a cross-subsidy from freight shipments. As a result, Indian freight rates are, in nominal terms, 49 percent higher than those in China, and on par with those in developed countries. Adjusted for PPP, Indian freight rates are approximately twice as high as those in both China and the United States (Ministry of Railways, 2015a). Apart from passengers’ financial burden on freight, passengers also consume the scarce track space shared by the two forms of traffic. Unlike many countries, India does not have separate tracks for passenger and freight trains.

In allocating track space, moreover, India accords highest priority to passenger travel. Passenger trains run on fixed schedules and new trains are frequently introduced by politicians to gain their constituents’ favor. Freight trains, on the other hand, have no fixed schedules, running on an as-needed basis. When a customer wants to make a freight shipment, the customer files a request with the Railways, and a railway manager tries to find a time to dispatch the freight train, in between the scheduled running of the passenger trains (Ministry of Railways, 2008). As a result, freight shipments often need to wait before beginning their journey, then even once they are en route, stop and wait again for passenger traffic to clear. So freight shipments are slow and unreliable, and the key determinant of freight shipping performance in a given area is the amount of passenger traffic there.

The role of passenger trains in affecting freight shipments motivates this paper’s focus on the introduction of new passenger trains as its source of variation in congestion. I focus on one particular set of passenger trains, the Durontos, introduced by Rail Minister Mamata Banerjee in 2009 (Banerjee, 2009; Ministry of Railways, 2009). The Duronto trains aim to provide nonstop service on the shortest possible

2Adjusted for PPP, these countries’ passenger fares, relative to those in India, are 2.7 times higher in China, 6.2 times higher in Germany, and 9.4 times higher in Japan.

routes between 12 of the largest cities in India. The decisions about where to introduce Duronto trains were based on passenger demand for travel between these major cities. The intermediate districts on the Duronto routes receive congestion as a by-product, and this is the identifying variation I use.

Over time, the heavy passenger congestion on the Railways has pushed freight traffic off of the rails, and on to other modes of transportation such as roads. In 1950, Railways carried 89 percent of India’s freight traffic, measured by weight, but by 2016, this share fell to 31 percent. Of the freight traffic remaining on the Railways, an overwhelming majority, 87 percent, comes from just a few “rail goods”: coal, iron, steel, fertilizers, cement, mineral oils, and food grains (see Figure 1-4a). Conversely, these goods rely heavily on the rails, as indicated by the modal shares reported in Figure 1-4b. In particular, the railways carry 80 percent of India’s coal shipments and more than 50 percent of its iron, steel, and cement (Ministry of Railways, 2011). Producers of these goods have little choice but to ship by rail, since the goods’ bulk makes them difficult, or far more costly, or in some cases unsafe, to transport by road. Given this clear specialization of the rails, I focus my analysis on firms in “rail-using” industries, which I define to be those producing a rail good, or with rail goods comprising at least 5 percent of their input cost share, based on the industry input-output structure in the years prior to the introduction of Durontos.
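The definition of a “rail-using” industry can be written as a simple classification rule. The sketch below is illustrative only: the function name, the data layout, and the example industries are hypothetical, and the rule simply encodes the two conditions stated above (producing a rail good, or a rail-goods input cost share of at least 5 percent).

```python
# Rail goods as listed in the text.
RAIL_GOODS = {"coal", "iron", "steel", "fertilizers", "cement",
              "mineral oils", "food grains"}

def is_rail_using(outputs, input_cost_shares, threshold=0.05):
    """Classify an industry as rail-using (hypothetical helper).

    outputs: iterable of goods the industry produces.
    input_cost_shares: dict mapping each input good to its cost share.
    """
    produces_rail_good = any(good in RAIL_GOODS for good in outputs)
    rail_input_share = sum(share for good, share in input_cost_shares.items()
                           if good in RAIL_GOODS)
    return produces_rail_good or rail_input_share >= threshold

# Made-up industries for illustration.
cement_maker = is_rail_using(["cement"], {})
heavy_input_user = is_rail_using(["textiles"], {"coal": 0.03, "steel": 0.03, "cotton": 0.94})
light_input_user = is_rail_using(["textiles"], {"coal": 0.02, "cotton": 0.98})
```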

1.2.2 Data sources

Railways data

To study geographic patterns in these train movements and the associated congestion effects, I collect data from the Indian Railways, consisting of three parts. First, Line Capacity data describes the structure of the railway network and the traffic on each part of the network. It comes from annual Line Capacity Reports, prepared by each of the 17 zonal authorities on Indian Railways. These reports provide traffic data based on a division of the railway network into 1218 track sections, where the median section length is 35 kilometers. For each section and each year,

the reports indicate (i) how many passenger, freight, and other trains run there in an average 24 hour period, (ii) the types of signaling, electrification, and other physical capabilities present on the section, and (iii) the theoretical capacity of the section. The theoretical capacity is an engineering estimate of the number of trains which can safely run on a section in a 24 hour period. It is based on Scott’s Formula, which accounts for the physical features of the track, the type of equipment present, and a range of other factors. Railway operators regard line capacity numbers as a rough guideline and often run trains in excess of these numbers, perhaps causing some loss in safety or travel time efficiency. Indeed, Figure 1-1 shows a histogram of the traffic on each section as a fraction of the section’s line capacity, and its most striking feature is that half of the track sections on Indian Railways operate beyond their prescribed capacity.

Second, shipping times data indicates the mean and variance of railway freight shipments. This data is available starting in 2011, the year in which the Railways adopted its current computerized train database. For this project, annual summary data was extracted on freight shipments for all possible origin-destination pairs from a sample of 179 major stations. These 179 stations consist of the 109 most important freight shipment points, and a random sample of 70 additional stations. For each origin-destination pair, the data reports the annual number of freight trains run, and the mean and variance of the running time for these trains. Figure 1-2 illustrates three key lessons from these data. First, freight shipments are slow in general, with even relatively short shipments often taking five to ten days, and most shipments taking longer than the Railways’s benchmark shipping speed. Second, some routes experience extremely slow shipping, with average times stretching well beyond ten days. Finally, even conditional on track distance, there is considerable cross-route variation in shipping time; the $R^2$ from regressing shipping time on distance in the cross section is only 0.19.3 So shipping times and the associated costs depend on factors other than distance, including, as I will show, congestion.

3There is, likewise, considerable variation across shipments for a given route, as indicated by the annual route-wise data on variance of shipping time.

Finally, geographic data on all railway stations and the routes of all passenger trains is scraped from the IndiaRailInfo website, a sample of which appears in Figure 1-3. The resulting data details a variety of train characteristics, and lists the stations which each train passes, including stations where the train does not stop. I use this data for two main purposes. First, it identifies the districts and track sections crossed by the Duronto trains and exposed to the associated congestion. Second, since it includes the universe of passenger trains, it provides the set of reasonable routes that a train might follow between two stations. As described in Section 1.3, this information on traffic patterns helps me identify the areas subject to spillover congestion when the Durontos are introduced.

Firm data

The main source of outcome data is India’s Annual Survey of Industries (ASI), which has been widely used in the economics literature. ASI data comes from an annual government survey of manufacturers, and includes factory-level measures of output, input use, and a variety of other firm characteristics, such as inventory holdings, which can help explain how firms adapt to congested infrastructure. Output data appears separately for each product, making it possible to observe changes in the factory’s product mix. Input data includes detailed measures of capital and labor, as well as materials use, disaggregated by the commodity category of each material. This disaggregation is useful, both for observing how firms alter their product and input mixes in response to congested railways, and for identifying the firms which use heavy inputs typically shipped by rail and which are therefore most likely to be affected by railway congestion.

The ASI data includes all manufacturing establishments above a certain employment threshold which varies by year, and a random sample of smaller establishments. It is provided to researchers in two forms. The first, ASI Panel data, includes factory identifiers, making it possible to link a factory’s data across years and form a panel. While ASI Panel does not include district identifiers, a separate version of the data, ASI Geo, contains all of the same firms and indicators, with the addition

of district identifiers, but with the factory identifiers excluded. To construct panel data with geographic identifiers, I use observable characteristics to match the entries in the ASI Panel with those in ASI Geo for 2009-10, the year in which ASI Geo was discontinued. I then use the factory identifiers to link across years, ascertaining the district of factories in the ASI Panel data for 2010-11 through 2012-13. Constructing this geographically identified panel dataset enables me to use district-level geographic variation in congestion while still running regressions with factory fixed effects, and enables me to link the geographically identified firm data with Railways shipping time data which is available only starting in 2011. Table 1.1 provides additional descriptive statistics from both the ASI data, and the data on Railway traffic patterns.
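The matching step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the field names (`factory_id`, `district`, `state`, `industry`, `workers`) and the rule of keeping only one-to-one exact matches are hypothetical choices made for the example.

```python
from collections import defaultdict

def match_on_observables(panel_rows, geo_rows, keys):
    """Link panel factory IDs to districts via exact, unique matches on `keys`.

    Hypothetical sketch: a pair is kept only if its combination of observables
    appears exactly once in each dataset.
    """
    def bucket(rows):
        groups = defaultdict(list)
        for row in rows:
            groups[tuple(row[k] for k in keys)].append(row)
        return groups

    panel_groups, geo_groups = bucket(panel_rows), bucket(geo_rows)
    matches = {}
    for signature, p_rows in panel_groups.items():
        g_rows = geo_groups.get(signature, [])
        if len(p_rows) == 1 and len(g_rows) == 1:
            matches[p_rows[0]["factory_id"]] = g_rows[0]["district"]
    return matches

# Toy records (made up): F3 and F4 share observables, so they stay unmatched.
panel = [
    {"factory_id": "F1", "state": "OD", "industry": "steel", "workers": 120},
    {"factory_id": "F2", "state": "JH", "industry": "steel", "workers": 45},
    {"factory_id": "F3", "state": "JH", "industry": "cement", "workers": 30},
    {"factory_id": "F4", "state": "JH", "industry": "cement", "workers": 30},
]
geo = [
    {"district": "Rourkela", "state": "OD", "industry": "steel", "workers": 120},
    {"district": "Bokaro", "state": "JH", "industry": "steel", "workers": 45},
    {"district": "Bokaro", "state": "JH", "industry": "cement", "workers": 30},
    {"district": "Ranchi", "state": "JH", "industry": "cement", "workers": 30},
]
district_of = match_on_observables(panel, geo, keys=["state", "industry", "workers"])
```

Once `district_of` is built for the 2009-10 cross section, the factory identifier carries the district forward to later panel years.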

1.3 Reduced form effects of congestion on firm revenue

This section identifies reduced form effects of Duronto passenger traffic on rail-using firms. I argue that for certain intermediate districts, having a Duronto run through the district is as good as random. This argument rests both on institutional facts about the Duronto trains, which by rule follow the shortest nonstop path between their endpoints, and on empirical checks of balance and parallel trends. To address SUTVA concerns, the empirical strategy accounts for spillover effects arising because Durontos divert traffic onto neighboring lines. After 1.3.1 outlines the basic strategy, subsection 1.3.2 describes the approach to spillovers, and 1.3.3 presents results showing that Duronto traffic disrupts firm operations, raising production costs and leading to revenue loss.

1.3.1 Basic empirical strategy

Figure 1-5a illustrates the empirical strategy, using a comparison between representative “treatment” district Rourkela and representative “control” district Bokaro. Both Rourkela and Bokaro are important steel-producing districts with populations around

500,000. Neither is a major urban center, though, so neither was under consideration to receive Duronto passenger service. Rourkela happens to lie on the shortest rail path between Mumbai and Kolkata, so the Mumbai-Kolkata Duronto passes through. Bokaro lies on a similarly important rail line which is part of the shortest path between Ahmedabad and Kolkata. Other Duronto trains serve Ahmedabad and others serve Kolkata, but the specific Ahmedabad-Kolkata route does not receive Duronto service, so no Duronto passes through Bokaro. The fact that a Duronto passes through Rourkela but not Bokaro is an incidental consequence of the Railways’ intention to provide nonstop service between Mumbai and Kolkata, unrelated to any other differences between Rourkela and Bokaro. This observation supports the empirical strategy’s core identifying assumption, which is that firms in the two districts are comparable via differences-in-differences: in the absence of any Durontos, changes in firm outcomes in Rourkela would have been the same as those in Bokaro.

Some institutional details provide further support for this assumption. First, Durontos make no intermediate stops between their endpoints. This eliminates any possibility that the Duronto routes were chosen to serve or not serve the passengers of places such as Rourkela and Bokaro. A remaining concern is that planners might have chosen Durontos’ paths to avoid congesting Bokaro’s rail lines, for instance because these lines were already too congested or because this congestion would interfere with positive economic trends in Bokaro. A second institutional detail helps allay this concern: the Durontos by rule follow the shortest path between their endpoints. While other trains’ routes might be planned to avoid congesting favored or fast-growing areas, the Durontos’ shortest-path rule ensures that Duronto routes are not chosen based on these characteristics of the intermediate areas. A final remaining concern is endogenous choice of the entire Duronto route, for example because planners favor the firms between Ahmedabad and Kolkata, as a group, over the firms between Mumbai and Kolkata. This possibility is difficult to falsify, but is inconsistent with the motives of the Railways planners, whose explicit goal was to facilitate passenger travel between the target cities, and is also allayed, as I will show, by parallel pre-trends in firm outcomes in the districts with and without Duronto traffic.

The comparison between these districts motivates the basic specification

$$y_{it} = \beta_D D_{dt} + \beta_S S_{dt} + \gamma_i + \gamma_{t \times s} + \gamma_{t \times k} + \epsilon_{it}, \qquad (1.1)$$

where $y_{it}$ is an outcome of interest in year $t$ for factory $i$, which operates in industry $k$ and is located in district $d$ of state $s$, and $D_{dt}$ is the number of Duronto trains passing through $d$ as of year $t$.4 The sample is limited to intermediate districts which lie between the 12 major urban centers served by the Duronto program. This sample definition excludes two types of districts. First, it excludes the 12 urban centers targeted by the Duronto program, so all of the sample’s variation in Duronto traffic depends on which cities a district happens to lie between, not on any explicit intention to target or avoid that city. The main results are robust to also excluding a “donut” of districts bordering the urban centers. Second, the sample excludes remote districts not lying between any of the 12 major cities. Results are, however, robust to including all districts in India as controls.

While the initial Duronto plan involved nonstop service on the shortest paths between endpoints, later adjustments involved certain Durontos making stops or deviating from the shortest path. These changes pose little threat to identification, since most of them happened after the end of my sample period in 2012, and in any case deviations were minimal. As of 2016, the average Duronto makes 2.4 stops, and travels on a route 2.9 percent longer than the shortest possible route. Still, to avoid concerns that these deviations might have been endogenous, I construct all Duronto treatment variables based on the shortest path between the Duronto’s endpoints. Figure 1-6 shows district-wise treatment status, measured as the total number of Duronto routes passing through each district as of 2012.

The final ingredient in (1.1) is $S_{dt}$, a measure of exposure to spillover traffic. It

4Looking at the number of routes actually passing through a district may seem like a simplistic measure of exposure to the added congestion, relative to measures accounting for (a) traffic on the lines a firm uses to ship its goods, or (b) for changes in market access (Redding and Sturm, 2008; Donaldson and Hornbeck, 2016). In Appendix 1.9, however, I consider versions of these alternate measures. Empirically, they do not provide explanatory power beyond the actual passage of Duronto and spillover routes through a given district, so I focus the analysis on the simpler and easily interpretable measures of passage through a district.

serves two purposes. First, controlling for spillover effects is essential for identifying the causal effect of Duronto traffic, relative to a counterfactual of no Durontos. Second, estimating the spillover effects is of inherent interest, since measuring the full cost of the Durontos requires accounting for these effects.
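As a rough illustration of how a specification like (1.1) is estimated, the sketch below simulates a balanced factory-by-year panel and recovers the coefficients on Duronto and spillover exposure after sweeping out fixed effects. It is a simplified stand-in, not the estimation code behind the results: it keeps only factory and year effects (omitting the state-by-year and industry-by-year terms), and every number in it, including the "true" coefficients, is a made-up assumption.

```python
import random

def two_way_demean(x):
    """Sweep out unit and year fixed effects from a balanced panel x[i][t]."""
    n, T = len(x), len(x[0])
    unit_mean = [sum(row) / T for row in x]
    year_mean = [sum(x[i][t] for i in range(n)) / n for t in range(T)]
    grand = sum(unit_mean) / n
    return [[x[i][t] - unit_mean[i] - year_mean[t] + grand for t in range(T)]
            for i in range(n)]

def ols_two_regressors(y, d, s):
    """OLS of y on (d, s) with no intercept, via the 2x2 normal equations."""
    flat = lambda m: [v for row in m for v in row]
    y, d, s = flat(y), flat(d), flat(s)
    dd = sum(a * a for a in d)
    ss = sum(a * a for a in s)
    ds = sum(a * b for a, b in zip(d, s))
    dy = sum(a * b for a, b in zip(d, y))
    sy = sum(a * b for a, b in zip(s, y))
    det = dd * ss - ds * ds
    return (dy * ss - sy * ds) / det, (sy * dd - dy * ds) / det

def simulate(n=200, T=6, beta_d=-0.05, beta_s=-0.02, seed=0):
    """Simulated log outcomes; beta_d and beta_s are hypothetical values."""
    rng = random.Random(seed)
    year_fe = [rng.gauss(0, 1) for _ in range(T)]
    y, d, s = [], [], []
    for _ in range(n):
        unit_fe = rng.gauss(0, 1)
        n_duronto = rng.choice([0, 0, 1, 2])   # Duronto routes through the district
        n_spill = rng.choice([0, 1, 1, 2])     # spillover routes through the district
        d_row = [n_duronto if t >= T // 2 else 0 for t in range(T)]
        s_row = [n_spill if t >= T // 2 else 0 for t in range(T)]
        y_row = [beta_d * d_row[t] + beta_s * s_row[t] + unit_fe + year_fe[t]
                 + rng.gauss(0, 0.05) for t in range(T)]
        y.append(y_row); d.append(d_row); s.append(s_row)
    return y, d, s

y, d, s = simulate()
b_d, b_s = ols_two_regressors(two_way_demean(y), two_way_demean(d), two_way_demean(s))
```

In a balanced panel, demeaning by unit and by year and adding back the grand mean is equivalent to including both sets of dummies, which is why the short normal-equations solve suffices here.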

1.3.2 Spillovers from diversion of traffic onto alternate routes

Spillovers arise because when a Duronto train passes through one district, the congestion it creates there diverts traffic to neighboring districts. Figure 1-5b illustrates this possibility, with Durontos running through Rourkela leading to diversion of traffic and possible congestion effects in Bokaro. Since this spillover traffic flows onto lines other than the main Duronto lines, a reasonable expectation is that spillover traffic through a district is negatively correlated with main Duronto traffic through that district, and that failing to account for the spillovers will therefore lead to downward bias in estimates of the Duronto main effect. In principle, however, the opposite bias is also possible, if the lines with Duronto traffic are geographically concentrated, so that one Duronto’s spillover traffic flows onto the other Duronto lines, leading to positive correlation between Durontos and spillovers. Which type of bias prevails is therefore an empirical matter.

To account for these spillovers, I use information on the Railways’ typical traffic patterns, drawn from the data on the universe of passenger train routes. For each Duronto route and each pair of stations along the route, I identify all of the paths taken by at least one passenger train traveling between these stations. I refer to this set of routes as the “spillover routes” for the Duronto in question. Figure 1-5b illustrates this construction with stations $A$ and $C$ lying on the Mumbai-Kolkata line, and certain non-Duronto trains traveling from $A$ to $C$ via Bokaro. So although Bokaro is not directly affected by Duronto traffic, it is on an alternate route for trains traveling between points on the Duronto route, and is therefore subject to a possible spillover effect.

To validate this definition of the spillover routes, I conduct a “zero-th stage” analysis of how Durontos affect traffic patterns. Unlike the main district-level regressions

in this paper, the zero-th stage analysis is at the level of track section $n$. Specifically, it studies how traffic on the section, $\text{Traffic}_{nt}$, measured as the average daily number of trains of a given type, responds to Duronto and alternate-route spillover traffic running on the section:

$$\text{Traffic}_{nt} = \alpha_D D_{nt} + \alpha_S S_{nt} + \gamma_n + \gamma_{t \times \text{Sample}} + \epsilon_{nt}. \qquad (1.2)$$

Here, $D_{nt}$ is the number of Duronto trains running on section $n$ as of year $t$ and $S_{nt}$ is the number of Duronto trains for which $n$ is on the spillover route. If the outcome is total number of trains running on the section, and all other traffic is held fixed on a line when a Duronto is introduced, we would find $\hat{\alpha}_D = 1$ and $\hat{\alpha}_S = 0$. If, on the other hand, one Duronto train leads to displacement of exactly one train onto each section of its alternate route, we would find $\hat{\alpha}_D = 0$ and $\hat{\alpha}_S = 1$.

Table 1.2 reports results of this regression. Column (1) shows that for each additional Duronto scheduled to run along a line, the total number of passenger trains running on that line increases by 0.61. This increase is less than one, because some traffic is diverted onto the spillover routes. Each spillover route receives 0.22 additional passenger trains, and as Column (2) indicates, 0.23 additional freight trains. Column (3) shows that $\hat{\alpha}_D$ and $\hat{\alpha}_S$ add to one, meaning the total amount of traffic is unchanged.5 Columns (4) through (6) show that while the spillover routes as defined above receive traffic as a result of the Durontos, there is no change in traffic on the “second-order” spillover routes of the spillover routes. Thus, I conclude that the possible traffic-diversion spillover effects extend to the routes I have identified, but no farther.6
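The construction of spillover routes described in this subsection can be sketched in a few lines. This is an illustrative simplification under stated assumptions: routes are represented as ordered lists of station names, the function name is hypothetical, and the toy network (with placeholder stations `A`, `C`, `X`, `Y`) is made up to mirror the Figure 1-5b example rather than taken from the data.

```python
def spillover_stations(duronto_route, other_routes):
    """Stations off the Duronto route that lie on some passenger train's path
    between two stations that are on the Duronto route."""
    on_main = set(duronto_route)
    spill = set()
    for route in other_routes:
        # Positions where this train touches the Duronto route.
        hits = [idx for idx, station in enumerate(route) if station in on_main]
        # Stations strictly between consecutive touches are off the main line,
        # so each such stretch is an alternate path between two main-line stations.
        for a, b in zip(hits, hits[1:]):
            spill.update(route[a + 1:b])
    return spill

# Toy network: one train runs from A to C via Bokaro instead of via Rourkela.
duronto = ["Mumbai", "A", "Rourkela", "C", "Kolkata"]
others = [
    ["A", "Bokaro", "C"],          # alternate path between main-line stations
    ["X", "Y"],                    # never touches the Duronto route
    ["Mumbai", "A", "Rourkela"],   # stays on the main line
]
alternate = spillover_stations(duronto, others)
```

Mapping the resulting stations to districts or track sections would then give the exposure counts used in (1.1) and (1.2).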

5For each kilometer of Duronto route, there is 1.02km of alternate route, so the amount of train-kilometers diverted onto alternate routes is approximately the same as the amount of train-kilometers from the introduced Durontos.

6The empirical results are robust to using alternate definitions of the spillover routes, for example restricting to alternate routes for trains traveling between the same endpoints as the Durontos, and restricting to spillover routes within a 200 kilometer radius of the Duronto main route.

1.3.3 Reduced form results

This subsection describes the reduced form results, beginning with empirical checks of balance and pre-trends, then proceeding to the main reduced form effects of Duronto traffic, and a body of supplemental evidence which helps clarify the mechanism behind these effects.

Throughout the analysis, I focus on four main outcomes: revenue, productivity (TFPR), average cost, and total inventory holding. I consider the natural logarithm of each of these variables, so estimated effects can be interpreted as percent changes. Effects on revenue represent an overall effect of the Duronto congestion, inclusive of any associated increases in production costs or losses in sales to competitors due to poor shipping performance. Effects on TFPR show how Durontos affect productivity: this effect could result from the Duronto congestion disrupting the production processes, though it also includes effects due to changes in the price of the firm’s product.

Studying average cost removes effects of these changes in output price. For single-product factories in the data, measures of average cost come from dividing total costs by the data’s reported quantities. For a multi-product factory $i$ making products $\{1, \ldots, K\}$, the average cost measure is

$$AC_i = \frac{\text{Total Cost}}{\sum_{k=1}^{K} \bar{p}_k q_{ik}}, \qquad (1.3)$$

where $q_{ik}$ is $i$’s quantity of $k$ produced, and $\bar{p}_k$ is the median all-India price of $k$. Using a fixed product price $\bar{p}_k$ acts simply to weight across the factory’s product-level output quantities.7
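A small numerical example may help fix ideas about the measure in (1.3). The function name and all figures below are hypothetical; the sketch only shows how fixed median prices weight a factory's product-level quantities into a single output index.

```python
import math

def average_cost(total_cost, quantities, median_prices):
    """Average cost as in (1.3): total cost over price-weighted output.

    quantities: dict product -> quantity produced by the factory.
    median_prices: dict product -> fixed median price used as a weight.
    """
    weighted_output = sum(median_prices[k] * q for k, q in quantities.items())
    return total_cost / weighted_output

# Made-up factory producing two goods.
ac = average_cost(
    total_cost=1000.0,
    quantities={"steel": 10, "cement": 20},
    median_prices={"steel": 50.0, "cement": 25.0},
)
log_ac = math.log(ac)  # the regressions use the natural logarithm of the outcome
```

Because the price weights are held fixed, a change in the factory's own selling prices leaves the denominator unchanged, which is the sense in which this measure strips out output-price effects.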

7Changes in average cost as measured in (1.3) could be correlated with changes in product quality or with changes in the relative prices of the firm’s products. This correlation could lead to bias in regressions of average cost on Duronto running, if Durontos affect quality or these relative prices. However, the main results on average cost are robust to alternate measures of $\bar{p}$, such as using fixed 2008 median prices to remove the effect of changes in relative product price, or 2008 firm-specific prices to account for fixed firm-specific differences in relative product quality. A disadvantage to using 2008 prices, and the reason I avoid it as the preferred definition of $\bar{p}$, is that the ASI’s product classifications change in 2010 from ASICC to NPCMS, and the ASICC to NPCMS concordance is more exact in some industries than others, leading to differential changes in measured average costs between 2009-10 and 2010-11.

Finally, inventories represent the response firms take to insulate themselves from the costs of congestion. Of course, firms take many insulating measures apart from inventories, and Tables 1.13 and 1.14 detail some of these responses. But inventories appear repeatedly in the literature as a key response to poor infrastructure (Guasch and Kogan, 2003; Datta, 2012; Li and Li, 2013) and to uncertainty more generally (Fafchamps, Gunning and Oostendorp, 2000). Models of optimal inventory management trace their roots to Edgeworth (1888) and the Newsvendor Problem of Arrow, Harris and Marschak (1951). Appendix 1.9 presents a modern version of these models, in which a firm holds inventory to guard against stockout risk arising from uncertain lead times and demand fluctuations. As the model shows, the firm should hold larger inventories in response to increases in either the mean or variance of lead time. The model also predicts larger inventory responses for goods with higher value added, higher penalty of stockout, and higher demand uncertainty. These predictions provide an interpretation for the inventory effects of Duronto congestion, and its associated effects on the mean and variance of shipping time.

Balance and pre-trend checks

In addition to the institutional features supporting the empirical strategy, the data also show evidence of balance and parallel trends. Table 1.1 shows that intermediate districts receiving more Duronto traffic are similar to those receiving less. Districts set to receive more total Duronto and spillover traffic by 2012 do exhibit slightly lower revenue productivity and reliance on rail goods as inputs, though this difference is economically small and significant at only the 10 percent level. More important from the perspective of the difference-in-difference strategy is that Figure 1-8 shows parallel trends across Duronto-affected and unaffected districts, both in terms of congestion levels, and in terms of each of the four main outcomes of interest: revenue, productivity, average cost, and inventory holding. This evidence lends empirical support to the identification strategy’s most basic assumption that these districts would have continued to follow parallel trends in the absence of the Duronto program.

Main results

Table 1.3 presents the main reduced form effects of running Duronto trains. First consider Panel A, showing results from the preferred specification which accounts for the effects of both Durontos and the traffic spillovers. As Column (1) shows, one two-way Duronto route running through a district leads to a 1.9 percent decrease in revenue for the rail-using factories in that district.8 This revenue effect is large. For perspective, one Duronto route amounts to approximately 7 percent of the charted line capacity in the median district. Thus, scaling the revenue effect implies that if a district went from a completely clear railway line, with no passenger traffic, to having Duronto trains use its full line capacity, factories would suffer a revenue loss of 1.9/0.07 = 27 percent. Of course, this calculation perhaps represents an upper bound on the effect of having a line become completely full, since such a large increase in congestion might prompt firms into larger reorganizations to offset the congestion effect. Still, the large revenue effect might appear surprising on its surface: why should some passenger trains speeding through a district lead to such losses for firms? Part of the answer appears in Columns (2) through (4), which show effects on rail-using factories’ productivity, production costs, and inventory holdings. Each Duronto route reduces TFPR by 1.1 percent, a smaller magnitude than the revenue loss, indicating that input use falls, but by less than the decrease in revenue. Whereas the revenue and TFPR effects both depend on the firm’s output price, the effects on average cost reflect cost per unit of output, independent of this price. As Column (3) shows, each Duronto route increases average cost by 0.8 percent. These cost increases could come from a variety of sources, including financing costs as goods shipments become slower, or risk of input stockout with uncertain arrival times.
Inventory stocks, though they bring holding costs of their own, help insulate firms against these costs, and as Column (4) shows, each Duronto route increases firm inventory holding by 1.0 percent. This

8As noted above, the rail-using firms are those in industries which either produce one of the goods typically shipped by rail (coal, iron, steel, fertilizers, food grains, cement, and mineral oils), or have these goods amount to at least 5 percent of their input cost share, as per the 2007-08 input-output table.

absolute increase in inventory holdings comes despite a scaling down in firm revenue, entailing an even larger increase in inventory holdings as a fraction of firm revenue.
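The two back-of-the-envelope calculations in this paragraph can be written out explicitly. The sketch below uses only the point estimates quoted in the text and treats the log-point effects as additive, which is an approximation I am assuming for illustration.

```latex
% Scaling one Duronto route (about 7 percent of charted capacity) to full capacity:
\[
\frac{1.9\ \text{percent}}{0.07} \approx 27\ \text{percent revenue loss}.
\]
% Inventory holdings relative to revenue, per Duronto route:
\[
\Delta \ln\!\left(\frac{\text{Inventory}}{\text{Revenue}}\right)
= \Delta \ln \text{Inventory} - \Delta \ln \text{Revenue}
\approx 1.0 - (-1.9) = 2.9\ \text{percent}.
\]
```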

Panel A also shows evidence of spillover effects. In particular, each alternate route through a district decreases rail-using firms’ revenue by 1.1 percent and increases average cost by 0.7 percent. These effects are smaller in magnitude than the Duronto main effect, though measured with less precision, making them statistically indistinguishable from the main effects.

Apart from the economic importance of their effects, the spillovers also play a role in identifying the main effect of Duronto traffic. As Panel B shows, the estimated magnitudes of the Duronto main effects are far smaller than in Panel A, as a result of omitting the spillover controls. The revenue loss, for instance, is only 1.3 percent. While this estimate is not quite statistically distinguishable at the 10 percent level from the Panel A estimate of 1.9 percent, the difference between these point estimates is economically meaningful. The reason for the difference is that spillover traffic through a district is negatively correlated with Duronto main line traffic, and the spillover effects themselves work in the same direction as the main effects. Thus, omitting the spillover control biases the estimated magnitude of the main effect downward, toward zero.
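The direction of this bias follows from the standard omitted-variable formula. The sketch below is a stylized version with a single regressor and no fixed effects, which is my simplification and not the specification in Table 1.3; $\delta$ is a hypothetical symbol for the slope from regressing spillover traffic $S$ on Duronto traffic $D$.

```latex
\[
y = \beta_D D + \beta_S S + u, \qquad S = \delta D + v
\quad\Longrightarrow\quad
\operatorname{plim}\, \tilde{\beta}_D = \beta_D + \beta_S\,\delta .
\]
% With beta_S < 0 (spillovers hurt revenue) and delta < 0 (spillover traffic is
% negatively correlated with main line traffic), the product beta_S * delta is
% positive, so the short-regression coefficient is less negative than beta_D.
```

Read this way, the gap between the Panel B estimate ($-1.3$) and the Panel A estimate ($-1.9$) has the sign the formula predicts.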

As a placebo test, Table 1.4 shows no effect on firms in non-rail-using industries. Theoretically, these firms might have experienced Duronto effects, either due to congestion spillovers as traffic moves from the congested rail lines to roads, or due to general equilibrium effects, for instance if they compete or do transactions with rail-using firms. While these possibilities make the placebo test imperfect, another way to interpret Table 1.4 is as a falsification of the hypothesis that Duronto-affected districts were, even without the Duronto congestion, set to embark on different economic trends from the unaffected districts. This could happen, as discussed above, if the Duronto running patterns were correlated with planners’ broader policy favoritism of certain districts. In such a case, we would expect to find effects even on the non-rail-using firms in Duronto-affected districts. Yet we see no such effects.

Additional evidence on mechanism

To detail the mechanism behind these reduced form effects, I provide evidence on several additional outcomes (reported in detail in Appendix 1.9) and on heterogeneity (reported in Appendix 1.9). This evidence tells a clear story: the congestion arising from new Duronto passenger trains disrupts freight shipments for rail-using firms, making these shipments slower and less predictable, which raises the effective costs of producing output and delivering it to consumers. In support of this story, we should find, first of all, that Duronto congestion has its largest effects on firms relying most intensively on railway shipment of freight. The contrast between the effects for rail-using firms and lack of effect for non-rail-using firms shows already that the extensive margin of rail use matters for finding congestion effects. The intensive margin also matters, as Table 1.15 shows larger effects for industries with a greater fraction of their input cost share coming from goods shipped by rail and larger effects for industries actually producing these rail goods.9 So the costs associated with railway congestion scale with the firm’s reliance on railway shipments, and this congestion seems to create problems both for the receipt of inputs and for the delivery of output. While firms’ use of railway freight at least partly explains the reduced form effect, it remains possible that another part of the effect comes from changes in passenger movement as a result of the Durontos. In particular, one possibility is that when the Durontos ease passenger movements between the major cities, this has some effect on the intermediate sample districts, either because citizens there can take advantage of the Durontos by traveling via the major cities, or because the advent of Durontos brings economic benefits to the major cities which then spill over to nearby intermediate districts.
Ruling out this possibility, Table 1.20 shows that the main results do not change when we exclude from the sample a 100 kilometer “donut” of districts surrounding the major cities, and Table 1.17 shows that there is no effect of simply being located close to a major city receiving Duronto service.

9Among the rail using industries in the sample, producing a rail good is positively correlated with the industry’s rail good input cost share.

Another possibility related to passenger movements is that the Durontos affect local passenger train movements in the intermediate sample districts, either by creating too much congestion for the local trains, or by causing substitution away from other long distance trains which, unlike the Durontos, make stops in the intermediate districts. In practice, the Indian Railways rarely cancels or significantly modifies the schedules of existing passenger trains, making this possibility unlikely. Empirical results in Table 1.12 show that, indeed, Duronto traffic through an intermediate district affects neither the number of passenger departures from that district’s stations, nor the number of arrivals.10 So the evidence points, once again, toward the conclusion that the Durontos’ reduced form effects arise not from effects on passenger movements, but from disruptions of freight shipment.

It remains to establish why Duronto traffic disrupts freight shipment: is it actually by affecting freight shipping times, as my hypothesis suggests, or via some other channel such as reduced availability of freight trains? Four sets of facts point toward the shipping time story. First, if Duronto congestion led to reduced availability of freight trains, we would expect to see reductions in railway freight shipment volumes. Empirically, however, Columns (3) and (4) of Table 1.12 show that the number of freight train arrivals and departures from a district is unaffected by Duronto traffic.11 This finding is consistent with the institutional detail that congestion does not necessarily lead to rationing of freight train slots (Ministry of Railways, 2008): firms can, and empirically do, still present their goods for shipment when the rail lines become congested, but they simply need to wait longer for these shipments. Second, we might suspect that, apart from any effects on shipping time, congestion raises firms’ freight shipment prices.
While congestion could affect freight prices in many settings, for instance because it is associated with shifts in the supply or demand for freight shipments, none of these effects are likely to occur on Indian

10These results on the total number of arrivals and departures include trips both within and outside the district, but results similarly show no effect when disaggregated by arrivals and departures within and outside the district.

11The data show the number of freight train departures, and not the weight or value of goods aboard the trains. So it is possible that congestion-affected firms are in fact sending and receiving less material. But the continued running of the same number of freight trains shows, at least, that congestion does not make these trains unavailable if the firms want to put goods aboard them.

Railways, where the government fixes freight rates as a function of the good shipped and distance covered. Empirically, Table 1.13 also shows no effects of Duronto traffic on the amounts rail-using firms report paying in distribution costs. Third, we might suspect that congestion leads firms to ship their goods by road instead of rail. Such behavior could lead firms to incur additional costs associated with highway shipment, or provide certain advantages, perhaps without affecting shipment times. Table 1.16 shows, however, that the Duronto effects are no different in states with greater availability of road shipping options, as measured by the density of national highways. Indeed, given that most of the goods shipped by rail are heavy materials for which road shipment is unsafe or impractical, it makes sense that firms shipping these goods cannot use road substitution as an insulator against rail congestion, and are instead at the mercy of the shipping times available on the railways. Finally, the ways in which firms cope with congestion are consistent with an effort to make their production process simpler and more predictable, in the face of slower and less reliable goods shipments. Apart from increasing inventory holdings as discussed already, firms also hedge against uncertainty with adjustments in their product mix. Table 1.14 reports these adjustments. Most basically, the firms make fewer products per factory, as reported in Column (1).12 Column (2) shows that they also substitute toward less time sensitive products, where the measure of time sensitivity comes from the revealed preference of internationally trading firms to pay for fast air shipping, as studied in Hummels and Schaur (2013). Column (3) shows substitution toward products with more predictable demand, based on a standard volatility measure from the literature.13 Column (4) shows evidence, though significant only at the 10 percent level, that firms substitute toward products with less complex production processes, where the measure of complexity, as in Levchenko (2007), is the (inverse) Herfindahl index of the product’s input cost shares according to US input-output tables. Each of these adjustments potentially streamlines the firm’s production process, amounting to exactly the firm response we should expect if the main problem caused by congestion is to make freight shipments slower and less reliable.

12This result indicates that factories exit from certain product markets, though as reported in Table 1.1, the Duronto and spillover traffic do not have large enough effects to prompt exit at the plant level.

13Demand uncertainty, estimated at the product level using pre-2005 ASI data, is measured as the standard deviation of $\nu_{i,t}$, the unpredictable part of log changes in (product-level) sales:

\[
\Delta \ln PY_{i,t} = \rho_{0,i} + \rho_{1,i}\, \Delta \ln PY_{i,t-1} + \nu_{i,t}.
\]

This method of estimating demand uncertainty follows papers such as McConnell and Perez-Quiros (2000) and Blanchard and Simon (2001).
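The demand uncertainty measure described in footnote 13 can be sketched as follows. This is a minimal illustration, assuming a first-order lag and ordinary least squares on one product's sales series; the function name and the degrees-of-freedom correction are my choices, not details taken from the thesis.

```python
import numpy as np


def demand_uncertainty(log_sales: np.ndarray) -> float:
    """Std. dev. of residuals from an AR(1) fitted to changes in log sales.

    log_sales -- time series of ln(PY) for one product, ordered by year.
    """
    growth = np.diff(np.asarray(log_sales, dtype=float))  # delta ln PY_t
    y, lagged = growth[1:], growth[:-1]                   # regress on own lag
    X = np.column_stack([np.ones_like(lagged), lagged])   # intercept rho_0, slope rho_1
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef                              # the unpredictable part nu_t
    return float(residuals.std(ddof=X.shape[1]))


# Hypothetical series (random walk in log sales), for illustration only
rng = np.random.default_rng(1)
example_series = np.cumsum(rng.normal(scale=0.1, size=40))
example_uncertainty = demand_uncertainty(example_series)
```

A product whose sales growth is perfectly predictable from its own lag gets a value of zero, while doubling every log change doubles the measure.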

1.4 The effect of shipping times: mean versus variance

Building on the argument that Duronto effects result from a disruption of freight shipments, this section asks whether the problem is that shipments become slower (mean effect) or that they become less predictable (variance effect). To explain how the Durontos affect shipping times, I draw on an operations research model of railway traffic, and leverage the model’s predictions to instrument separately for the mean and the variance of freight shipping times. Section 1.4.1 details this empirical strategy. As the results in Section 1.4.2 show, the Duronto effects owe primarily to the variance of shipping time, which adds uncertainty to the already uncertain world that firms face.

1.4.1 Model and empirical strategy

We are interested in estimating an equation of the form

\[
y_{it} = \beta_M \ln M_{dt} + \beta_V \ln V_{dt} + \gamma_i + \gamma_{t \times s} + \gamma_{t \times k} + \eta_{it}, \tag{1.4}
\]

where $y_{it}$ is an outcome of interest for factory $i$, $M_{dt}$ is the mean shipping time for shipments to and from district $d$, and $V_{dt}$ is the variance.14 The empirical challenge

14To calculate this mean and variance for the empirical application, I restrict focus to those point-to-point shipping routes with at least two shipments in each of the sample years for which

is to obtain independent empirical variation in this mean and variance which is not correlated with the error term $\eta_{it}$.

To separately identify these mean and variance effects, I draw on Chen and Harker (1990) and Harker and Hong (1990), who model two-way traffic on a single rail line, with trains dispatched according to a given distribution. Trains $i$ and $j$ meet with probability $q_{ij}$, in which case $i$ experiences delay $d_{ij}$, which is random. The mean and variance of travel time are

\[
E(t_i) = FR_i + \sum_{j} q_{ij} E(d_{ij}) \tag{1.5}
\]

\[
Var(t_i) = \sum_{j} \left[ q_{ij} Var(d_{ij}) + q_{ij}(1 - q_{ij}) E^2(d_{ij}) \right] + \sum_{h} \sum_{k} Cov(q_{ih} d_{ih},\, q_{ik} d_{ik}), \tag{1.6}
\]

where $FR_i$ is free-running time. Solving for the expectation and variance of each $t_i$ requires numerical methods, and Appendix 1.9 elaborates on this solution, but equations (1.5) and (1.6) reveal a key prediction. To first order, the effect of adding more trains on $E(t_i)$ is simply that each new train $j$ imposes some expected delay, $q_{ij} E(d_{ij})$, on train $i$. For $Var(t_i)$, however, there is both a direct effect of this additional train $j$, reflected in the first sum in (1.6), and an additional effect arising from the covariance of the meeting times for all possible pairs of trains on the line. The extra dimension of these pairwise interactions makes the covariance term, and ultimately $Var(t_i)$, scale more rapidly when there are many trains on the line. The implication is that the effect on $Var(t_i)$ of adding an additional train to the line, relative to the effect on $E(t_i)$, is greater for lines which already have high congestion, than for low congestion lines. The intuition for this prediction is based on “knock-on effects”. Even on a congested line, all trains might run on schedule and reach their destinations quickly. But once one train is delayed, it meets other trains and makes them delayed, starting a chain

data is available (2011 and 2012). This ensures that changes in mean and variance are due to shipping becoming slower and less predictable for a fixed set of routes, rather than due to changes in the composition of shipping origins and destinations. Having at least two shipments is necessary, because otherwise the variance is undefined. For district-level measures of mean and variance, I average across the routes serving each district, weighting by the number of trains run on that route over all the years for which data is available (2011 to 2015). I also normalize each value by the route’s track distance, to avoid giving more weight to longer routes, though doing the analysis without this normalization does not substantially change the results.

reaction and possibly very slow travel for all trains involved. The variance blows up at high congestion levels because of this difference between everything running on schedule and everything falling into the chain reaction.
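The scaling argument behind (1.5) and (1.6) can be illustrated with a small calculation. The sketch below is not the Chen and Harker solution method: it takes the meeting probabilities and delay moments as given, and it summarizes the covariance term with a single hypothetical parameter `pair_cov`, the covariance assumed for each ordered pair of distinct opposing trains.

```python
from typing import Sequence, Tuple


def travel_time_moments(free_running: float,
                        meet_prob: Sequence[float],
                        delay_mean: Sequence[float],
                        delay_var: Sequence[float],
                        pair_cov: float = 0.0) -> Tuple[float, float]:
    """Mean and variance of travel time in the spirit of (1.5) and (1.6).

    meet_prob[j], delay_mean[j], delay_var[j] describe opposing train j.
    pair_cov is an illustrative stand-in for Cov(q_ih d_ih, q_ik d_ik), h != k.
    """
    n = len(meet_prob)
    mean = free_running + sum(q * m for q, m in zip(meet_prob, delay_mean))
    direct = sum(q * v + q * (1.0 - q) * m ** 2
                 for q, m, v in zip(meet_prob, delay_mean, delay_var))
    knock_on = n * (n - 1) * pair_cov  # one term per ordered pair of trains
    return mean, direct + knock_on


# Two versus four opposing trains with identical (hypothetical) parameters
mean_2, var_2 = travel_time_moments(10.0, [0.5] * 2, [2.0] * 2, [1.0] * 2, pair_cov=0.1)
mean_4, var_4 = travel_time_moments(10.0, [0.5] * 4, [2.0] * 4, [1.0] * 4, pair_cov=0.1)
```

Doubling the number of trains doubles the expected delay and the direct variance term, but the pairwise term grows with the number of pairs, which is the sense in which variance scales faster than the mean on busy lines.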

This prediction serves as a basis for the first-stage equations

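\[
\begin{aligned}
\ln M_{dt} &= \pi_1^M D_{dt} + \pi_2^M (D_{dt} \times T_{d,t=t_0}) + \pi_3^M S_{dt} + \gamma_d + \gamma_y + \epsilon_{dt}^M \\
\ln V_{dt} &= \pi_1^V D_{dt} + \pi_2^V (D_{dt} \times T_{d,t=t_0}) + \pi_3^V S_{dt} + \gamma_d + \gamma_y + \epsilon_{dt}^V.
\end{aligned}
\tag{1.7}
\]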

Here, $D_{dt}$ is the number of Durontos affecting district $d$, $S_{dt}$ is the spillover control, and $T_{d,t=t_0}$ is the amount of traffic on the local railway lines in 2008, the year prior to the introduction of Durontos. So the idea of the identification strategy is that when a Duronto hits a given railway line, it has some effect on mean shipping times which is relatively independent of the amount of pre-existing congestion on that line. For variability of shipping times, however, pre-existing congestion and the associated interaction term matter: a Duronto hitting a low-congestion line has some small effect on shipping variance, while a Duronto hitting a high-congestion line sets off knock-on effects that entail a much greater increase in the variance. Figure 1-9 provides an empirical illustration of this mechanism, binning all track sections by their pre-existing congestion levels, and plotting the bin-specific effects of Durontos on mean and variance of shipping time. The divergence it shows between these two curves represents the source of identifying variation.

Even with random variation in the introduction of Durontos and with the controls for spillovers, this identification strategy requires an exclusion restriction: the Duronto and interaction instruments affect firm outcomes only through their effect on mean and variance of freight shipping times. One possible violation of the restriction would occur if Durontos work through channels other than shipping times. As discussed in Section 1.3.3, both institutional and empirical details help to rule out these possibilities. Because Durontos run non-stop and local passenger train schedules are unaffected, Durontos have no effects on local labor movement. Also, because Indian Railways fixes freight rates based on the type of good and distance traveled, congestion arising from the Durontos has no effect on shipping prices or on the quantities that can be shipped; the shippers simply need to wait longer.

Even if Durontos work only through shipping times, another possible violation of the exclusion restriction would occur if Durontos work through some feature of the shipping time distribution other than the mean and variance. For example, congestion might fatten the tails of this distribution, increasing the probability of extremely long and disruptive delays. Because my shipping time data includes only mean and variance statistics, I cannot directly test for this. However, I construct an over-identification test (Hansen, 1982) to help address the concern. Specifically, assume $D$ and $D \times T$ are valid instruments. Testing whether higher-order interactions such as $D^2 \times T$ are correlated with the residuals $\hat{\eta}$ is a way of testing whether the track conditions and congestion affect the outcomes through a channel other than mean and variance of shipping time. These tests do not reject the hypothesis that the extended set of instruments is uncorrelated with the errors, lending support to the claim that the instruments are not producing effects through higher-order moments of the shipping time distribution.15

Another threat to identification comes from the use of pre-existing congestion in the interaction term. Areas with higher pre-existing congestion are different from less congested areas, and might be on different time trends. Figure 1-10 addresses this concern, showing that high-congestion lines receiving Duronto trains are not on a differential trend.

15I also employ similar tests using $D$ interacted with the district’s pre-Duronto congestion bin, for instance $D \times 1[50\% < C \leq 60\%]$. These tests similarly do not reject the exclusion restriction. In an earlier version of the paper, I estimated linear effects of $M$ and $V$, instead of the log functional form used here. This linear formulation enables a Wald-style specification test (Godfrey, 1988) whose formulation is similar to the over-identification test but which follows a different logic. Suppose, regardless of whether $D$ and $D \times T$ are valid instruments for $M$ and $V$, that $D$, $D \times 1[50 < C \leq 60]$, $D \times 1[60 < C \leq 70]$, and so forth are valid instruments for an extended set of endogenous variables including not only $M$ and $V$, but polynomial terms $M^2$ and $MV$. Estimating this extended model and constructing the Wald test shows that it is actually the mean and variance, and not these polynomial terms, producing the effects on firms. Effects of $M^2$ being the true source of the estimated effects of variance is a particular concern given that the variance scales with the mean or with mean-squared for many random variable distributions. However, that does not seem to be what is happening here.

1.4.2 Results of the shipping times IV

Table 1.5 presents first-stage effects of the Duronto trains on the mean and variance of freight shipment times. Column (1) shows that Durontos increase mean shipping times, but as per the small point estimates on the interaction term, this effect is no greater for Durontos hitting high-congestion lines. Column (2) shows that the effect of Durontos on the variance of shipping times increases with the pre-existing congestion in that district. For each additional 10 percent of pre-existing line capacity utilization, a Duronto route through a district leads to 2.1 percent greater variance of shipping time.

Table 1.6 presents two-stage least squares estimates of the effects of mean and variance of shipping times. Column (1) shows that increasing the variance of shipping times by 10 percent reduces firm revenue by 1.1 percent. Mean shipping times, on the other hand, do not have a statistically significant effect on revenue, and we can easily reject the hypothesis that the mean and variance effects are equal, in favor of the alternative that the variance effect is greater. Estimates for revenue productivity, reported in Column (2), show negative point estimates which are comparable in magnitude, though only the effect of variance is significant at the 10 percent level. Column (3) shows how shipping times affect production costs, with a 10 percent increase in variance leading to a 0.3 percent increase in average cost, compared with almost no effect coming from the mean. Finally, Column (4) indicates that both mean and variance contribute to increases in inventory holdings, consistent with the predictions of canonical inventory models as described in Appendix 1.9.
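The logic of instrumenting for the mean and variance with the Duronto count and its interaction with pre-existing traffic can be sketched on synthetic data. Everything below is illustrative: the data-generating process, the coefficient values, and the function name are my assumptions, fixed effects are omitted for brevity, and none of the numbers correspond to Tables 1.5 or 1.6.

```python
import numpy as np


def two_stage_least_squares(y, endog, instruments, controls):
    """Point estimates for the endogenous regressors from a basic 2SLS."""
    y = np.asarray(y, dtype=float)
    endog = np.asarray(endog, dtype=float).reshape(len(y), -1)
    exog = np.column_stack([np.ones(len(y)), controls])  # constant + controls
    Z = np.column_stack([exog, instruments])             # full instrument set
    X = np.column_stack([endog, exog])                   # second-stage regressors
    # First stage: project every regressor on the instrument set
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress the outcome on the fitted regressors
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    return beta[: endog.shape[1]]


# Synthetic districts (illustration only)
rng = np.random.default_rng(0)
n = 20_000
D = rng.integers(0, 4, size=n).astype(float)  # Duronto routes
T0 = rng.uniform(0.0, 1.0, size=n)            # pre-existing traffic
S = rng.normal(size=n)                        # spillover control
u_m, u_v = rng.normal(size=n), rng.normal(size=n)
ln_M = 0.5 * D + 0.2 * S + u_m                    # mean: no interaction effect
ln_V = 0.3 * D + 0.8 * D * T0 + 0.1 * S + u_v     # variance: interaction matters
eta = 0.5 * u_m + 0.5 * u_v + 0.5 * rng.normal(size=n)  # endogeneity
y = -0.1 * ln_M - 0.4 * ln_V + 0.5 * S + eta
beta_M, beta_V = two_stage_least_squares(
    y, np.column_stack([ln_M, ln_V]), np.column_stack([D, D * T0]), S)
```

Separate identification of the two coefficients comes entirely from the interaction `D * T0` shifting the variance but not the mean, which mirrors the role of the interaction term in (1.7).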

1.5 Explaining the revenue loss: costs versus competition

To determine the economic reason for Durontos’ effect on firm revenue, this section models how firm production and competition are affected by the running of Duronto trains. As the reduced form results in Section 1.3 show, a rail-using firm suffers substantial losses when one of these trains passes through its district. But this revenue loss could occur for two basic reasons. One possibility, which I call a “cost effect”, is that the Duronto traffic greatly disrupts firm operations and increases production costs. Large cost effects entail that if every firm in the economy suffered an increase in congestion, large losses in aggregate output would follow. A second possibility, however, is that Durontos’ revenue effects owe more to simple market competition: the disruption caused by Durontos is, perhaps, only very small, but because the disrupted firms compete with other firms less exposed to traffic, even a small cost increase can force them out of business. Distinguishing between these possibilities is essential both because it bears on the net effect of the Duronto program, and because if the cost effect is in fact small, then increasing congestion for every firm in the economy could lead to negligible aggregate losses, while a large cost effect implies large losses from nationwide congestion.

1.5.1 Model and empirical strategy

The following model serves to isolate the pure cost effect in the presence of competitive forces, and to provide empirical estimating equations. As in Rotemberg (2017), the economy has 퐾 sectors, and a consumer with income 퐼 has utility

\[
U = \sum_{k=1}^{K} Q_k^{\varphi} + c, \tag{1.8}
\]

where $Q_k$ is sectoral output and $c$ is consumption of an outside good, whose price is normalized to one.

Consumer optimization implies that sectoral revenue is

\[
P_k Q_k = \left( \frac{P_k}{\varphi} \right)^{\frac{\varphi}{\varphi - 1}}. \tag{1.9}
\]

Sectoral production is a CES aggregate of the output quantities $q_{jk}$ of each firm $j$ in the sector:

\[
Q_k = \left( \sum_{j=1}^{N} a_{jk} S_{jk}\, q_{jk}^{\frac{\sigma_k - 1}{\sigma_k}} \right)^{\frac{\sigma_k}{\sigma_k - 1}}, \tag{1.10}
\]

where $a_{jk}$ is quality, and $S_{jk}$ is the share of output going to consumption. Sectoral prices come from profit maximization of the sector’s final good producer:

\[
P_k = \left( \sum_{j=1}^{N} p_{jk}^{1 - \sigma_k} \right)^{\frac{1}{1 - \sigma_k}}. \tag{1.11}
\]

The elasticity of substitution across varieties, $\sigma_k$, is an important parameter for determining the effects of exposing some firms in the sector to congestion. Low substitutability $\sigma_k$ will mean that when some firms are exposed, these firms can raise their prices to offset the associated costs, without losing much business to their competitors. Their ability to retain sales could result from their making specialized products, or from the geography of production, for instance because their customers are local and distant competitors have difficulty reaching these customers. High $\sigma_k$, on the other hand, means that if congestion forces affected firms into even a small price increase, customers will switch to the competitors.
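The role of the elasticity of substitution can be seen in a short numerical sketch of CES revenue shares, which follow from the price index in (1.11). For simplicity the sketch assumes equal quality across firms; the prices and elasticity values are hypothetical.

```python
from typing import List, Sequence


def ces_revenue_shares(prices: Sequence[float], sigma: float) -> List[float]:
    """Each firm's share of sectoral revenue, p_j^(1-sigma) / sum_j' p_j'^(1-sigma)."""
    weights = [p ** (1.0 - sigma) for p in prices]
    total = sum(weights)
    return [w / total for w in weights]


# Firm 0 raises its price by 5 percent; its competitor keeps price at 1
shares_low_sigma = ces_revenue_shares([1.05, 1.0], sigma=2.0)
shares_high_sigma = ces_revenue_shares([1.05, 1.0], sigma=8.0)
```

With the higher elasticity, the same 5 percent price increase costs the affected firm a larger part of its revenue share, which is the sense in which a small cost increase can produce a large revenue loss when varieties are close substitutes.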

Production of each firm’s variety is Cobb-Douglas:

\[
q_{jk} = A_{jk} K_{jk}^{\alpha_K} L_{jk}^{\alpha_L} R_{jk}^{\alpha_R} N_{jk}^{\alpha_N}, \tag{1.12}
\]

where $A$ is firm-specific TFP, and production uses capital $K$, labor $L$, “rail good” materials $R$, and “non rail good” materials $N$. In particular, $R$ is a composite of the rail goods specified above (coal, iron, steel, cement, fertilizers, food grains, and mineral oils), while $N$ is a composite of all other materials. For ease of notation, index inputs by $\mathcal{I}$. Returns to scale are reflected by $\gamma \equiv \sum_{\mathcal{I} \in \{K, L, R, N\}} \alpha_{\mathcal{I}}$. For now, assume constant returns to scale ($\gamma = 1$).

As in Hsieh and Klenow (2009), production is subject to firm-specific distortions affecting the marginal product of each input: $\tau_{K,j}$, $\tau_{L,j}$, $\tau_{R,j}$, and $\tau_{N,j}$. The literature proposes many possible sources of these distortions, from credit constraints to political connections; Hopenhayn (2014) provides a useful survey. Taking the pre-existing distortions as given, transport congestion could increase the distortions through a variety of channels. For instance, slow shipping on a congested rail network could force the firm to incur some financing or depreciation costs for each unit of rail input used, or uncertainty in input arrival times could distort another input, such as labor, if workers’ tasks become less productive or more difficult to coordinate as a result of the uncertainty. So I model congestion as potentially affecting each of the distortions, and will show how firm behavior responds to these changes in distortions.16

The firm takes the overall price index as given and maximizes profits

\[
\pi_{jk} = p_{jk} q_{jk} - \sum_{\mathcal{I}} (1 + \tau_{\mathcal{I},j})\, p_{\mathcal{I}}\, \mathcal{I}_j, \tag{1.13}
\]

implying that it sets price at a constant markup over marginal cost:

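\[
p_{jk} = \frac{\sigma_k}{\sigma_k - 1} \cdot \prod_{\mathcal{I}} \left( \frac{p_{\mathcal{I}}}{\alpha_{\mathcal{I}}} \right)^{\alpha_{\mathcal{I}}} \cdot \frac{\prod_{\mathcal{I}} (1 + \tau_{\mathcal{I},j})^{\alpha_{\mathcal{I}}}}{A_{jk}}. \tag{1.14}
\]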

Firm revenue is

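\[
y_{jk} = p_{jk} q_{jk} = \left( p_{jk}^{1 - \sigma_k} \right) \left( P_k^{\sigma_k - 1} \right) \left( \frac{P_k}{\varphi} \right)^{\frac{\varphi}{\varphi - 1}}. \tag{1.15}
\]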

Allow firm productivity to grow according to

\[
\widehat{A}_{jk} = -\epsilon_{jk}, \tag{1.16}
\]

where $\epsilon_{jk}$ is mean-zero and normally distributed. Then, combining (1.15) with (1.14),

16Apart from the channels mentioned here, congestion could also yield effects similar to an “output distortion”, for instance if slow shipping makes consumers buy less of the firm’s product at a given price. As Hsieh and Klenow (2009) note, however, the effects of changing this output distortion are equivalent to the effects of changing all of the input distortions equally. I thus omit an explicit output distortion, though I note for purposes of interpretation that the input distortions I study could also reflect these channels related to output distortion.

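(1.11), and (1.16), changes in firm revenue are

$$\widehat{y}_{jk} = \underbrace{(1 - \sigma_k)\left(\sum_{\mathcal{I}} \alpha_{\mathcal{I}}\, \widehat{(1 + \tau_{\mathcal{I},j})} + \epsilon_{jk}\right)}_{\text{direct effect}} + \underbrace{\left(\sigma_k - \frac{1}{1 - \varphi}\right) \sum_{j'=1}^{N_k} \left[\left(\sum_{\mathcal{I}} \alpha_{\mathcal{I}}\, \widehat{(1 + \tau_{\mathcal{I},j'})}\right) + \epsilon_{j'k}\right] \frac{y_{j'k}}{Y_k}}_{\text{stealing effect}}. \tag{1.17}$$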

As I elaborate below, the first term in (1.17) is the direct effect of exposure to congestion on firm revenue loss, while the second term captures the firm's gains from stealing the business of competitors exposed to congestion. Let $\psi_{\mathcal{I}}$ be the effect of one Duronto route on a firm's input $\mathcal{I}$ distortion:

$$\widehat{(1 + \tau_{\mathcal{I},j})} = \psi_{\mathcal{I}} D_j. \tag{1.18}$$

We can now write a simplified version of equation (1.17):

$$\widehat{y}_{jk} = \beta \Psi D_j + \chi \Psi \mu_k + \tilde{\epsilon}_{jk}, \tag{1.19}$$

where

$$\beta \equiv 1 - \sigma_k, \qquad \Psi \equiv \sum_{\mathcal{I}} \alpha_{\mathcal{I}} \psi_{\mathcal{I}}, \qquad \chi \equiv \sigma_k - \frac{1}{1 - \varphi}, \qquad \mu_k \equiv \frac{\sum_{j'} D_{j'}\, y_{j'k}}{Y_k},$$

$$\tilde{\epsilon}_{jk} \equiv (1 - \sigma_k)\,\epsilon_{jk} + \left(\sigma_k - \frac{1}{1 - \varphi}\right) \sum_{j'=1}^{N_k} \epsilon_{j'k}\, \frac{y_{j'k}}{Y_k}.$$

Here, $\beta$ reflects the direct effect of increasing distortions. This effect is largest if the elasticity of substitution $\sigma_k$ is high, since this means that firm varieties in the sector are close substitutes, so even a small distortion to one firm's costs will cause it to lose a large amount of business to its competitors. While $\beta$ captures the effect of the distortions themselves, $\Psi$ captures how these distortions respond to Duronto routes $D_j$ through the firm's district. In particular, $\Psi$ is a sum of the Durontos' effect, $\psi_{\mathcal{I}}$, on each input distortion, weighted by each input's cost share $\alpha_{\mathcal{I}}$.

Just as congestion can lead to revenue losses for a given firm, it also presents an opportunity for the firm to steal the business of its competitors who experience congestion of their own. The magnitude of this stealing depends on the crowd-out parameter $\chi$. It is largest when the firm's product is a ready substitute for its competitors' products (high $\sigma_k$), and when the sector as a whole is less replaceable by other sectors (low $\frac{1}{1-\varphi}$). The measure of sectoral exposure, $\mu_k$, is an output-weighted average of exposure to Duronto congestion for all the firms in the sector. Finally, the disturbance $\epsilon'_{jk}$ is normally distributed with mean zero.

An additional prediction comes from the observation that the revenue effect increases with the elasticity of substitution. In particular, re-write (1.19) as

$$\widehat{y}_{jk} = \Psi_1 D_j - \Psi_2 (\sigma_k \times D_j) + \chi \Psi \mu_k + \epsilon'_{jk}. \tag{1.20}$$

Here, $\Psi_1$ reflects the cost effect for firms in low-$\sigma$ industries, while $\Psi_2$ reflects that revenue losses become greater for firms in more competitive industries. Below, I use industry-level estimates of $\sigma$ to estimate (1.20). Note that if the model is correct and $\sigma$ is measured perfectly, we should find $\Psi_1 = \Psi_2 = \Psi$.

The aggregate effect on sectoral output comes from summing across all firms in (1.17), yielding

$$\widehat{Y}_k = (\beta + \chi)\Psi \mu_k + \epsilon_k, \tag{1.21}$$

where the disturbance $\epsilon_k \equiv \sum_{j=1}^{N_k} \frac{y_{jk}}{Y_k}\, \tilde{\epsilon}_{jk}$ is a weighted average of the firm-level disturbances. Equation (1.21) nicely breaks the effect of the Duronto congestion shock into three parts. First, firms in the sector face some exposure to the congestion, as measured by $\mu_k$. Second, this disrupts firm operations, leading to some total distortion $\Psi$, which reflects the pure "cost effect" of the Durontos, independent of any output market competition. Finally, $\beta + \chi$ reflects how the previous two components, working through market competition, lead to an ultimate effect on sectoral revenue. The aggregate effect depends on whether the direct losses to firms, reflected by $\beta$, are large relative to the ability of other firms in the sector to replace the lost output, as reflected by $\chi$.17

To isolate the pure cost effect $\Psi$, note that Cobb-Douglas production (1.12) with constant returns to scale entails that the firm's average cost equals marginal cost:

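$$AC_j = MC_j = \prod_{\mathcal{I}} \left(\frac{p_{\mathcal{I}}}{\alpha_{\mathcal{I}}}\right)^{\alpha_{\mathcal{I}}} \cdot \frac{\prod_{\mathcal{I}} (1 + \tau_{\mathcal{I},j})^{\alpha_{\mathcal{I}}}}{A_{jk}}. \tag{1.22}$$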

It follows that

$$\widehat{AC}_j = \Psi D_j + \widehat{\epsilon}_{AC,j}, \tag{1.23}$$

with

$$\epsilon_{AC} \equiv \prod_{\mathcal{I}} \left(\frac{p_{\mathcal{I}}}{\alpha_{\mathcal{I}}}\right)^{\alpha_{\mathcal{I}}} \cdot \frac{1}{A_{jk}}. \tag{1.24}$$

Under the assumption that Duronto running is uncorrelated with changes in factor prices $p_{\mathcal{I}}$ and the physical TFP $A_{jk}$ in the firm's production function, regressions of the form (1.23) identify the effect of Durontos on costs. In other words, observed changes in average cost reflect an actual cost effect, independent of any competition effect.

A more general version of this statement holds for generic production functions with non-increasing returns. Let $C(q)$ be the cost function, which is unknown, but assumed to satisfy $C'(q) > 0$ and $C''(q) \geq 0$. Also assume there are no fixed costs ($\lim_{q \to 0^+} C(q) = C(0) = 0$).18 Suppose costs increase by shifting outward, so the new cost

17It is straightforward to extend the above discussion to account for the effects of spillover traffic. Letting $S_j$ be the amount of spillover traffic in the district of firm $j$, $\Psi^S$ the total cost effect of spillover routes, and $\mu^S_k$ the exposure of other firms in the sector, we obtain analogues of equations (1.19) and (1.21):

$$\widehat{y}_{jk} = \beta(\Psi D_j + \Psi^S S_j) + \chi(\Psi \mu_k + \Psi^S \mu^S_k) + \tilde{\epsilon}_{jk}$$

$$\widehat{Y}_k = (\beta + \chi)\Psi \mu_k + (\beta + \chi)\Psi^S \mu^S_k + \epsilon_k.$$

18Even for a production function with fixed costs, a version of (1.27) holds, with average variable cost, rather than average cost, as the object of interest.

function is $\tilde{C}(q) = (1 + \tilde{\tau})C(q)$. How can we identify $(1 + \tilde{\tau})$? Equating marginal revenue with marginal cost, firm optimization entails

$$\sigma + 1 = (1 + \tilde{\tau})\,C'(q). \tag{1.25}$$

Differentiating with respect to $\tilde{\tau}$, we see that

$$\frac{\partial q}{\partial \tilde{\tau}} = -\frac{C'(q)}{C''(q)} \cdot \frac{1}{1 + \tilde{\tau}} < 0. \tag{1.26}$$

Finally, noting that $AC_j = (1 + \tilde{\tau})C(q)/q$ and considering the effect of changing $\tilde{\tau}$, it follows that

$$\widehat{AC}_j = \widehat{(1 + \tilde{\tau})} + \frac{\partial [C(q)/q]}{\partial q}\, \frac{\partial q}{\partial \tilde{\tau}}\, \widehat{(1 + \tilde{\tau})}. \tag{1.27}$$

So the effect of the cost shift $\tilde{\tau}$ on average costs is, first, a direct increase in costs, $\widehat{(1 + \tilde{\tau})}$. But as the second term reflects, the cost shift also pushes the firm down its cost function ($\frac{\partial q}{\partial \tilde{\tau}} < 0$), which with non-increasing returns has the effect of reducing average costs ($\frac{\partial [C(q)/q]}{\partial q} > 0$). Thus, since the second term in (1.27) is negative, observed changes in average cost $\widehat{AC}_j$ are a lower bound on the cost shift $\widehat{(1 + \tilde{\tau})}$.
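The step from (1.25) to (1.26) can be spelled out as follows. This is a sketch under two assumptions that the text leaves implicit: the left-hand side of (1.25) does not vary with $\tilde{\tau}$, and $C''(q) > 0$ holds strictly, so that the division is well defined.

```latex
\begin{align*}
0 &= \frac{\partial}{\partial \tilde{\tau}}\Big[(1+\tilde{\tau})\,C'(q)\Big]
   = C'(q) + (1+\tilde{\tau})\,C''(q)\,\frac{\partial q}{\partial \tilde{\tau}} \\
\Longrightarrow\quad
\frac{\partial q}{\partial \tilde{\tau}}
  &= -\frac{C'(q)}{C''(q)}\cdot\frac{1}{1+\tilde{\tau}} \;<\; 0 .
\end{align*}
```

With $C'(q) > 0$, the sign is negative: a larger cost shift lowers output, which moves the firm to a point on its cost function with weakly lower average cost, so the observed change in average cost understates the shift.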

1.5.2 Empirical application of the model

To identify the effect of competitors’ exposure to Duronto congestion, I estimate an empirical counterpart of (1.19):

$$y_{it} = a_1 D_{dt} + a_2 S_{dt} + a_3 \mu_{sk} + a_4 \mu^S_{sk} + \gamma_i + \gamma_{t \times s} + \gamma_{t \times k} + \epsilon_{it}, \tag{1.28}$$

where $\mu_{sk}$ and $\mu^S_{sk}$ are the exposure to Duronto and spillover traffic, respectively, of factories in the same state $s$ and four-digit NIC industry $k$ as factory $i$. All exposure measures are calculated based on the Duronto routes in service as of year $t$, but the 2008 district locations of each industry's output. As above, all regressions include fixed effects for each firm, and year-specific effects for each state and industry.

Table 1.7 presents results of this regression. Column (1) shows, first, that the main effect of a Duronto route is a 3.1 percent loss in revenue for rail-using factories. This is greater than the revenue loss estimated in the basic reduced form regression of Table 1.3, because Duronto traffic is positively correlated with the exposure of competitors to Duronto traffic, and this exposure $\mu^D_{sk}$ itself has a positive effect on a firm's own revenue. In particular, if each of a firm's competitors is exposed to one additional Duronto route, that firm gains 2.5 percent in revenue.

In the context of the model, the sum of these revenue coefficients, $\hat{a}_1 + \hat{a}_3$, provides an estimate of $(\beta + \chi)\Psi$, which indicates the aggregate effect of Duronto exposure on firm revenue. I cannot statistically reject the hypothesis that the sum of these coefficients is greater than or equal to zero, against the alternative that it is negative; the p-value on this test is 0.29. So it is not possible to rule out that the competitors replace all, or at least a large portion, of the output lost by congestion-affected firms. Estimates of the spillover and state-industry spillover exposure effects offer less precision, but yield a similar qualitative conclusion.

Columns (2), (3), and (4) of Table 1.7 show that competitors' exposure to congestion does not affect a firm's revenue productivity, average cost, or inventory holding. These results are unsurprising: while competitors' exposure enables a firm to steal the business of these competitors, it does not affect the firm's own logistical operations or production costs. In principle, competitors' exposure to congestion might have affected revenue productivity through price effects, though revenue productivity depends not only on prices but on physical productivity, which is likely to remain unaffected. The main effects on these three variables remain the same as in the reduced form, however, with each Duronto route still leading to a 0.8 percent increase in average costs. As per equation (1.23), this effect on average costs is interpretable as an estimate of the pure cost effect $\Psi$ under Cobb-Douglas production, and more generally as a lower bound on the shift in the cost function as illustrated in (1.27). So Duronto congestion does lead to some disruption of firm production and a pure cost effect which, though magnified by competition, is nontrivial on its own.

Table 1.8 shows support for the additional prediction of equation (1.20) that revenue effects scale with the elasticity of substitution. In an industry with inelastic demand, Duronto congestion causes little revenue loss: the 10th percentile elasticity is $\sigma = 2.9$, implying the Duronto effect on revenue is a 2.0 percent loss. Intuitively, the low elasticity means that when congestion increases costs for these firms, consumers still buy their products. For high elasticity industries, on the other hand, the congestion effect leads customers to substitute to other sellers, and affected firms suffer a larger revenue loss: the 90th percentile industry has $\sigma = 6.2$, implying a 4.2 percent revenue loss. The estimated coefficient on the Duronto main effect, $\hat{\Psi}_1 = -0.0014$, and that on the elasticity interaction, $\hat{\Psi}_2 = -0.0065$, do not explicitly validate the theoretical prediction that $\hat{\Psi}_1 = -\hat{\Psi}_2$, though the confidence interval on $\hat{\Psi}_1$ is wide enough that we also cannot reject this prediction. One likely reason for the difference between $\hat{\Psi}_1$ and $-\hat{\Psi}_2$ is measurement error in the elasticities $\sigma$. The economically relevant elasticity concerns substitution between a firm's variety and the varieties of other firms in the state-industry, but the elasticities in the data reflect substitution patterns between Harmonized System 6-digit products.19 Still, this type of measurement error would attenuate estimates of $\hat{\Psi}_2$. So we should expect the true magnitude of $\Psi_2$ to be larger than estimated, and the basic conclusion still holds: Duronto congestion effects are worst for firms facing stiff competition.

While the estimates so far use firm-level data to estimate the parameters that matter for aggregate revenue effects, a direct test for aggregate effects is also possible, using an empirical counterpart of (1.21):

$$Y_{skt} = b_1 \mu_{skt} + b_2 \mu^S_{skt} + \gamma_{t \times s} + \gamma_{t \times k} + \epsilon_{skt}, \tag{1.29}$$

where $Y_{skt}$ is aggregate output for industry $k$ firms in state $s$ in year $t$ and the exposure measures are calculated as above. The results in Table 1.9 show negative but statistically insignificant effects of exposure to Duronto and spillover traffic. The

19The product categories with the lowest elasticities are those with specialized products: uranium and thorium ores, manufacture of cement plaster, manufacture of electricity distribution and control apparatus, and manufacture of electric motors. Those with higher elasticities include more substitutable products: iron ores, soft drinks, alcohol, and animal feeds. So although the elasticities from Broda, Greenfield and Weinstein (2006) do not measure the relevant cross-firm-variety elasticity, they reflect the interchangeability of products within categories, and in this sense proxy well for the relevant elasticities.

magnitudes of these estimates nevertheless fall within the same range as the implied aggregate effects from the firm-level regression. In particular, the implied value of $(\beta + \chi)\Psi$ from Table 1.7 is −0.006, while the Duronto exposure effects in Table 1.9, which estimate the same parameter, range between −0.002 and −0.014.20

Taken together, the empirical results in this section show that the reduced form revenue loss owes, in large part, to firms losing their edge against competitors, who in turn take advantage of the opportunity and mitigate aggregate revenue loss. Still, congestion-affected firms do experience a genuine disruption to their operations and an increase in production cost, which would imply some losses in aggregate productivity if all firms in an economy experienced a congestion increase.
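The point estimates quoted in this section can be combined in a few lines to check the implied magnitudes. The sketch below assumes the reported coefficients can be read as effects on log revenue, so that multiplying by 100 gives approximate percentage changes; the variable and function names are illustrative and not taken from the thesis.

```python
# Coefficients quoted in the text for Table 1.8: the coefficient on D_j and
# the coefficient on the interaction sigma_k x D_j. Reading them as effects
# on log revenue is an assumption made for this illustration.
psi1_hat = -0.0014
psi2_hat = -0.0065


def revenue_effect(sigma):
    """Implied effect of one Duronto route on log revenue at elasticity sigma."""
    return psi1_hat + psi2_hat * sigma


effect_p10 = revenue_effect(2.9)  # 10th percentile elasticity
effect_p90 = revenue_effect(6.2)  # 90th percentile elasticity

# Implied aggregate coefficient (beta + chi) * Psi from the Table 1.7
# estimates: own-exposure effect plus competitors'-exposure effect.
a1_hat, a3_hat = -0.031, 0.025
aggregate_effect = a1_hat + a3_hat

print(f"10th percentile industry: {100 * effect_p10:.1f} percent")
print(f"90th percentile industry: {100 * effect_p90:.1f} percent")
print(f"implied (beta + chi) * Psi: {aggregate_effect:.3f}")
```

The two elasticity cases come out close to the 2.0 and 4.2 percent losses reported for Table 1.8, and the sum of the Table 1.7 coefficients matches the −0.006 aggregate effect cited above.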

1.6 Policy

Congestion bears on infrastructure policy for two distinct reasons. First, it has implications for traffic management on existing infrastructure. Only with notions of capacity and congestion can we conceptualize the economic benefits from congestion pricing and prioritization of different types of traffic. Second, decisions about how and where to construct new infrastructure need to account for congestion. Doing so overturns some commonly held intuitions about the form of optimal investment.

1.6.1 Traffic management on existing infrastructure

Congestion externality from running additional traffic

The first traffic management issue is how to account for congestion in setting prices or restricting quantities. Currently, Indian Railways does not increase prices with congestion. In calculating how to set congestion pricing, an essential input is a measure of the cost externality the running of one train imposes on other users of the rail network. My estimated Duronto effects provide a measure of this externality. Of course,

20Even if Duronto traffic did not produce aggregate effects on revenue, it might affect gross value added by relocating business to less productive factories which would not have produced as much output in the absence of the congestion increase. Table 1.10 tests for these effects on gross value added, again finding coefficients which are negative but statistically insignificant.

running one Duronto train may impose externalities on the passengers in other trains, in addition to the effects on freight-using firms. My estimates capture the effects on freight alone, and in this sense are a lower bound on the total externality.

A naive way to measure firm losses from introducing one Duronto route is to look at the revenue loss for Duronto-affected firms, relative to firms in districts unaffected by Duronto traffic. To calculate this loss, I sum the 2008 revenue of all rail-using firms in the path of each Duronto train, and multiply by the estimated revenue loss coefficient from Table 1.7. As reported in Column (1) of Table 1.11, the introduction of the average Duronto route leads to a firm revenue loss of INR 461 million in the districts it passes through, plus an additional INR 155 million in districts subject to spillover traffic, for a total loss of INR 616 million (USD 12.7 million at 2008 exchange rates). For comparison, this loss amounts to 60 percent of the estimated INR 1,024 million annual passenger revenue from running one Duronto route.21 Railway passenger services already operate at a loss, with operating costs twice as high as the fare revenue collected (Ministry of Railways, 2015a), and this externality adds a further cost on top.

At the same time, consistent with a central theme of this paper, the negative externality for certain firms leads to a positive externality for the firms which steal their business. As Column (2) of Table 1.11 reports, competitors in the same state and industry as Duronto and spillover affected firms gain a total of INR 567 million for each Duronto route introduced. Thus, the net firm revenue loss as reported in Column (3) is INR 49 million, or only about 5 percent of the route's passenger fare revenue.
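The externality accounting in this passage reduces to a few additions and ratios, sketched below. All figures are the ones quoted from Table 1.11 in the text; the variable names are illustrative.

```python
# Figures quoted in the text from Table 1.11, in INR million per Duronto route.
direct_loss = 461.0          # rail-using firms in districts the route passes through
spillover_loss = 155.0       # rail-using firms in districts receiving spillover traffic
competitor_gain = 567.0      # same state-industry competitors who pick up business
passenger_revenue = 1024.0   # estimated annual fare revenue from one route

# Gross loss to exposed firms, and the net loss after competitors' gains.
gross_loss = direct_loss + spillover_loss
net_loss = gross_loss - competitor_gain

# Both losses expressed relative to the route's passenger fare revenue.
gross_share = gross_loss / passenger_revenue
net_share = net_loss / passenger_revenue

print(f"gross loss: INR {gross_loss:.0f} million ({gross_share:.0%} of fare revenue)")
print(f"net loss:   INR {net_loss:.0f} million ({net_share:.0%} of fare revenue)")
```

Laying the calculation out this way makes clear that the drop from roughly 60 percent to roughly 5 percent of fare revenue comes entirely from netting out the competitors' gains.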
While the thought experiment so far considers the effects of running Duronto trains through some districts but not others, it leaves open an important economic question: what would be the effects of a nationwide increase in congestion? Apart from the economic interest in answering this question, it is also relevant to real policies the Railways might consider, such as uniform limits on the amount of passenger traffic

21I do not have detailed data on fare revenues, but derive estimates by using the limited number of per-journey revenue amounts reported in Ministry of Railways (2015b), and multiplying by the annual number of journeys for each route.

congesting a given line, uniform increases in the track priority of freight relative to passenger traffic, or the construction of the proposed nationwide network of Dedicated Freight Corridors, aiming to improve freight performance for all firms.

The effects of such a nationwide change in congestion depend on the extent to which congestion disrupts firm production, as reflected in the "cost effect" discussed in Section 1.5. The model there shows that if we assume Duronto congestion increases production costs by some proportion $1 + \tilde{\tau}$, then estimated effects on average cost provide a lower bound on $\tilde{\tau}$. Under perfect competition, multiplying each firm's cost function by $1 + \tilde{\tau}$, equivalent to multiplying aggregate supply by $1 + \tilde{\tau}$, will lead to a $100 \cdot \tilde{\tau}$ percent reduction in output, and a $100 \cdot \tilde{\tau}$ percent reduction in total surplus. As Column (4) of Table 1.11 reports, exposing all rail-using firms to this cost shock would lead to an output loss amounting to INR 94,962 million (USD 2.0 billion). Whereas this represents the effect of exposing every rail-using firm to Duronto traffic, Column (5) reports the effect of exposing every manufacturing firm, rail-using or not, to a similar cost shock, resulting perhaps from a Duronto-sized congestion increase on its preferred mode of transportation, whether that be rails, roads, or otherwise. This effect amounts to INR 258,551 million (USD 5.3 billion).

Of course, this extrapolation to non-rail-using firms assumes that these firms are as sensitive to congestion as the rail-using firms studied in my empirical analysis. While it is possible that these non-rail-using industries are less sensitive to congestion, two factors suggest that, in fact, they could be more sensitive. First, in terms of selection, the industries choosing to remain on the rails despite the high congestion are likely industries for which this congestion is less of a problem. Second, the goods that rail-using firms ship on the railways are typically homogeneous commodities like coal, iron, and cement. Whereas these firms might succeed in buffering themselves against congestion by holding large inventories of the homogeneous commodities, we might expect worse effects of congestion for other firms shipping more specialized inputs that need to arrive quickly and predictably. So in both of these regards, my estimates of the rail-specific congestion effects are perhaps lower bounds on the effects of congestion for the productive economy as a whole.

Priority of traffic

A second traffic management issue is how to prioritize different types of traffic. Daily operations on Indian Railways are handled by managers who decide which trains are allowed to run first on an open track, and how to accelerate or decelerate trains so they arrive at certain times. Currently, these managers' protocol is to give the highest priority to passenger trains, making them adhere as well as possible to their schedule. An alternative would be to increase the priority for freight trains, either running the freight trains on fixed schedules, or granting higher priority to a freight train once it has met a certain amount of delay. The latter notion is the idea behind backpressure routing (Neely, 2010), which is an approach to maximizing throughput based on minimizing a sum of squares of units' backlogs. By using backpressure routing or another prioritization objective function which helps lagging traffic catch up, railway managers could reduce the variance of travel times. Whether this strategy yields economic benefits depends on whether the variance of travel times leads to economic costs, and my estimates indicate that it does.
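The mechanism by which a delay-sensitive priority rule lowers the variance of waiting times can be seen in a toy dispatch problem. The sketch below is not backpressure routing as formulated in Neely (2010), and it is not a description of Indian Railways practice; it is a deliberately minimal, hypothetical example with one track slot per period and an invented arrival pattern, comparing a passenger-first rule with a rule that dispatches the longest-waiting train first.

```python
from statistics import mean, pvariance

# Each train is (name, kind, arrival_period); one train is dispatched per period.
# The arrival pattern is hypothetical and chosen only to illustrate the mechanism.
ARRIVALS = [
    ("F1", "freight", 0), ("P1", "passenger", 0),
    ("P2", "passenger", 1), ("P3", "passenger", 2),
    ("F2", "freight", 3),
]


def passenger_first(train, now):
    """Passenger trains always go first; earlier arrivals break ties."""
    _, kind, arrival = train
    return (0 if kind == "passenger" else 1, arrival)


def most_delayed_first(train, now):
    """The train that has waited longest goes first; passengers break ties."""
    _, kind, arrival = train
    return (-(now - arrival), 0 if kind == "passenger" else 1)


def simulate(arrivals, rule):
    """Dispatch one waiting train per period under `rule`; return each train's wait."""
    pending = sorted(arrivals, key=lambda train: train[2])
    waiting, waits, now = [], {}, 0
    while pending or waiting:
        # Move trains that have arrived by `now` into the waiting pool.
        while pending and pending[0][2] <= now:
            waiting.append(pending.pop(0))
        if waiting:
            # The rule returns a sort key; the smallest key is dispatched.
            chosen = min(waiting, key=lambda train: rule(train, now))
            waiting.remove(chosen)
            waits[chosen[0]] = now - chosen[2]
        now += 1
    return waits


waits_passenger = simulate(ARRIVALS, passenger_first)
waits_delay = simulate(ARRIVALS, most_delayed_first)

for label, waits in [("passenger first", waits_passenger),
                     ("most delayed first", waits_delay)]:
    values = list(waits.values())
    print(label, "| mean wait:", mean(values), "| variance:", pvariance(values))
```

Because the track is never left idle while a train is waiting, both rules produce the same mean wait in this example; what changes is how the waiting is distributed across trains, which is the variance channel the text emphasizes.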

1.6.2 New infrastructure

Congestion also factors into planners' decisions about how and where to build new infrastructure. India, with $12 billion in financing from the World Bank, is now in the process of constructing Dedicated Freight Corridors, which will be a set of higher speed railway tracks exclusively for freight shipment. Policymakers see congestion relief as a chief goal of these projects, and argue that this relief will provide great help to manufacturing growth (Ministry of Finance, 2015). One corridor is under construction along the west coast, between Mumbai and Delhi, with another in progress running from Punjab to West Bengal. Several other branches in other parts of the country are under consideration. But which of these lines to actually build remains an open question.

To see the implications of congestion in answering this question, consider a choice between two hypothetical rail construction projects. The first project, like the actual

Dedicated Freight Corridor under construction, adds a new rail line between Mumbai and Delhi, a corridor with several lines serving it already, but suffering from heavy congestion. Figure 1-11a depicts a stylized version of this project. In a least-cost path approach to specifying trade costs, as is typical in the empirical literature on infrastructure (Donaldson, 2017; Donaldson and Hornbeck, 2016), the cost of moving from M to D, $\tau_{MD}$, is a function of the length of the shortest path between M and D.22 If the new line and the existing shortest path between M and D are of similar length and quality, and we have no notion of capacity or congestion, then adding the new line will not reduce the trade cost $\tau_{MD}$. The new line is simply a close substitute for the existing line.

The intuition that comparable links in a transportation network serve as substitutes for one another has a long and influential intellectual tradition, going back to Fogel (1964). Fogel's main insight was that, although the American railways carried large volumes of freight shipments, the railroads in fact made a small contribution to economic growth, because even in the absence of railroads, shippers would have been able to use a close substitute: the waterways. The intuition of substitutability between two different lines or modes of transport is, perhaps, correct in a context like the American railroads, if there is little congestion relative to the level of capacity. In a congested network, however, this intuition breaks down. First, the new line between M and D shares the traffic load with the existing line, reducing congestion and the associated trade cost between M and D. Second, due to traffic spillovers, the new line will reduce trade costs for trips to and from the neighboring city X. In particular, if there are some traders who previously traveled from M to D via X in order to avoid congestion on the short path between M and D, these traders can now move to the new, less congested short path between M and D, reducing congestion along the line passing through X. For these reasons, the new line is not a perfect

22The approach does, often, show sophistication in allowing for different prices per distance traveled on different parts of the network or for different modes, or might incorporate other aspects of trade costs. But the core idea of a least-cost path approach is to find the minimum travel distance between points. Applying such a least-cost path approach to the Indian Railways would almost certainly specify costs as a function of distance, since, on the Railways, the distance determines the price.

substitute for the existing lines, but acts as a sort of complement, in that it helps carry the burden of traffic.
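The contrast between the least-cost path view and the congestion view can be illustrated on a three-node toy network. The sketch below is hypothetical throughout: the distances, the traffic levels, and the linear congestion cost function are invented for illustration and are not estimates from the thesis. It uses a standard Dijkstra search to show that a parallel line of equal length leaves the least-cost distance unchanged, while a cost that rises with utilization falls once traffic is split across two lines.

```python
import heapq


def least_cost(edges, source, target):
    """Dijkstra over an undirected multigraph given as (u, v, cost) triples."""
    adjacency = {}
    for u, v, cost in edges:
        adjacency.setdefault(u, []).append((v, cost))
        adjacency.setdefault(v, []).append((u, cost))
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, cost in adjacency.get(node, []):
            candidate = dist + cost
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return float("inf")


# Hypothetical distances: a direct M-D line and a longer detour through X.
existing = [("M", "D", 100.0), ("M", "X", 70.0), ("X", "D", 70.0)]
with_new_line = existing + [("M", "D", 100.0)]  # parallel line of equal length

cost_before = least_cost(existing, "M", "D")
cost_after = least_cost(with_new_line, "M", "D")


def congested_cost(length, traffic, capacity):
    """Illustrative cost that rises linearly with utilization (an assumption)."""
    return length * (1.0 + traffic / capacity)


# One line carrying all traffic versus two lines each carrying half of it.
congested_before = congested_cost(100.0, traffic=1.0, capacity=1.0)
congested_after = congested_cost(100.0, traffic=0.5, capacity=1.0)

print("least-cost distance before/after:", cost_before, cost_after)
print("congested cost before/after:", congested_before, congested_after)
```

Under the pure distance metric the new line is redundant, which is the substitutability intuition; only once cost depends on load does the new line register as a gain.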

To quantify the possible advantages of adding this link in a congested area, consider the comparison with a more classical infrastructure project connecting previously unconnected cities, as depicted in Figure 1-11b. This project, which in principle could be constructed as another Dedicated Freight Corridor, builds a new line between Amaravati, the newly planned capital of Andhra Pradesh, and Raipur, the capital of neighboring Chhattisgarh. Currently the route connecting these cities is a circuitous 873 kilometers, even though the straight-line distance is only 406 kilometers, leaving considerable scope to build a shorter line. Assume that, in this area with less passenger traffic, there is no congestion, so that, as in the classical approach, trade costs between two points are proportional to the minimum distance between these points on the transport network. How short would we need to make the new distance-reducing Amaravati-Raipur line, in order to achieve the same gains as the new congestion-reducing Mumbai-Delhi line?

To answer this question, I draw on a basic result from the general equilibrium trade model in Allen and Arkolakis (2016), which is that the welfare effects of reducing travel costs along a link $(i, j)$ in a transport network can be expressed as

$$\frac{d \ln W}{d \ln t_{ij}} = \sum_{k=1}^{N} \sum_{l=1}^{N} \frac{d \ln W}{d \ln \tau_{kl}} \times \frac{d \ln \tau_{kl}}{d \ln t_{ij}}. \tag{1.30}$$

Here, $W$ is aggregate welfare, and $\tau_{kl}$ is the average cost of trading between $k$ and $l$, which is determined by the paths that various traders take. Along these paths between $k$ and $l$, traders incur costs $t_{ij}$ of moving between each directly connected pair of cities $i$ and $j$. Accounting for congestion, these costs can depend on the total amount of trader traffic between $i$ and $j$. As (1.30) shows, building new infrastructure between $i$ and $j$ lowers trade costs $\tau_{kl}$ for each pair of trading cities whose routes pass through $i$ and $j$. In turn, these reductions in $\tau_{kl}$ affect welfare $W$ according to standard trade model predictions. Specifically, in the economic geography version of Allen and Arkolakis (2016) with mobile labor, a straightforward application of the envelope theorem shows that the welfare effect of reducing the trade cost between two cities is proportional to the bilateral trade flow between the cities:

$$\frac{d \ln W}{d \ln \tau_{kl}} = -\frac{X_{kl}}{Y^W}, \tag{1.31}$$

where $X_{kl}$ is the bilateral trade flow, and $Y^W$ is world income.

To apply this model to the hypothetical comparison between the two projects, I rely on a stylized version of this comparison, in order to abstract from the many real world differences between Mumbai-Delhi and Amaravati-Raipur, including in particular geographic and political barriers to building between Amaravati and Raipur, and differences in the sizes and composition of the economies in these areas. I instead focus on the conceptual factors relevant to congestion.

This stylized comparison requires several assumptions. First, let $\Delta \ln \tau_{kl}$ be the effect on $(k, l)$ trade costs of building the new project in question, and assume all trade costs are symmetric. Second, assume that the project in the congested area has one "main" effect $\tilde{\tau}_m$ on trade costs between the cities directly connected ($-\tilde{\tau}_m = \Delta \ln \tau_{MD} = \Delta \ln \tau_{DM}$), and a uniform effect $\tilde{\tau}_s$ on trade costs involving the "spillover" city X ($-\tilde{\tau}_s = \Delta \ln \tau_{MX} = \Delta \ln \tau_{XM} = \Delta \ln \tau_{DX} = \Delta \ln \tau_{XD}$). Third, assume that goods traded to and from the neighboring cities X and Y always travel directly on the line between these cities and the endpoints (M, D, A, or R), while goods traveling between the endpoint cities sometimes take the longer route through X or Y.23 It follows that the A-R line does not affect trade costs for Y ($\Delta \ln \tau_{kl} = 0$ if $k = Y$ or $l = Y$). Fourth, normalize world income to one, and assume that the total bilateral flow between the endpoints, $f_m$, is equal in each of the scenarios ($f_m \equiv X_{MD} + X_{DM} = X_{AR} + X_{RA}$), as is the total flow $f_s$ from each of the neighboring spillover cities to each

23In a Wardrop (1952) equilibrium on a congested network, travelers between given endpoints will equalize the cost of travel across routes between these endpoints. The Wardrop equilibrium concept, concerned with decentralized travelers, is perhaps not entirely applicable in the Indian Railways setting of centrally planned traffic, though might be applicable to a model of total travel and congestion between the endpoints, inclusive of rail users and road users who make decentralized travel decisions. In any event, even in optimal centrally planned traffic flows, having some traders take each route typically requires there being more congestion on the shorter route so that its cost of travel is approximately equal to the cost of travel on the longer route.

of the endpoints ($f_s \equiv X_{MX} + X_{XM} = X_{DX} + X_{XD} = X_{AY} + X_{YA} = X_{RY} + X_{YR}$). Fifth, let $\Delta d \equiv 1 - \frac{d_{AR}}{d_{AY} + d_{YR}}$ be the proportional reduction in A-R travel distance from building the new line between A and R; here, $d_{ij}$ is the physical distance between $i$ and $j$. Finally, let the area of each of the projects be a closed economy, so building the project in this area does not lead to business stealing from other areas, and we can abstract from the competitive mechanism studied earlier in this paper, allowing for a more straightforward application of the Allen and Arkolakis (2016) model.24 Based on (1.30) and (1.31), the welfare effects of the two lines are

$$\Delta \ln W_{\text{(M-D line)}} = f_m \tilde{\tau}_m + 2 f_s \tilde{\tau}_s \tag{1.32}$$

$$\Delta \ln W_{\text{(A-R line)}} = f_m \Delta d. \tag{1.33}$$
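The two expressions can be traced back to (1.30) and (1.31) as sketched below. This is a first-order sketch rather than the author's own derivation: it treats each project as a small change in trade costs, uses the normalization $Y^W = 1$ from the fourth assumption, and approximates $\Delta \ln \tau_{AR} \approx -\Delta d$ when trade costs are proportional to distance.

```latex
\begin{align*}
\Delta \ln W &\approx -\sum_{k}\sum_{l} X_{kl}\,\Delta \ln \tau_{kl} \\
\Delta \ln W_{\text{(M-D line)}}
  &= (X_{MD}+X_{DM})\,\tilde{\tau}_m
   + (X_{MX}+X_{XM})\,\tilde{\tau}_s + (X_{DX}+X_{XD})\,\tilde{\tau}_s
   = f_m\tilde{\tau}_m + 2 f_s\tilde{\tau}_s \\
\Delta \ln W_{\text{(A-R line)}}
  &= (X_{AR}+X_{RA})\,\Delta d = f_m\,\Delta d
\end{align*}
```

The factor of two in the spillover term arises because city X trades with both endpoints, M and D, and each of those flows equals $f_s$.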

The effects of building in the congested area depend on both the direct effects on M and D, and the spillover effect onto neighboring X. The effect of building in the uncongested area, on the other hand, depends only on the reduction in travel distance between A and R, with no indirect effect. From (1.32) and (1.33), it follows that benefits of building in the congested M-D area are greater just in case

$$\Delta d < \tilde{\tau}_m + 2 \frac{f_s}{f_m} \tilde{\tau}_s. \tag{1.34}$$

This expression is intuitive. Building a new line in the congested M-D area is more beneficial when this has a large effect on trade costs between M and D (high $\tilde{\tau}_m$), when there are large spillover effects on trade costs in the neighboring city ($\tilde{\tau}_s$), and when there is a relatively high volume of economic activity exposed to these spillover gains (high $f_s/f_m$).

My empirical estimates put magnitudes on the relevant variables in (1.34). First, the effect of Durontos on average costs, as argued above, provides a lower bound estimate of the "cost effect" $\tilde{\tau}$. Such an increase in per-unit production cost will affect a firm in the same way as an increase in trade cost $\tilde{\tau}_m$. Recall that running one

24The equilibrium in this model and the derived welfare effects still, of course, account for com- petition between the firms in each of the cities within the area being treated as a closed economy.

Duronto route leads to a 0.8 percent increase in average cost, and that this one route is 7 percent of the line capacity in the median district. If the new M-D line reduces line utilization between M and D from 100 percent to 50 percent, and the effects of this decongestion are proportional to the effects of adding Durontos, then the effect of the new line on the main line trade cost is $\tilde{\tau}_m = 0.8\left(\frac{100 - 50}{7}\right) = 5.7$ percent. Next, the estimates show that one Duronto route increases costs in the neighboring spillover areas by 0.7 percent. If these lines experience similar decongestion effects, then $\tilde{\tau}_s = 0.7\left(\frac{100 - 50}{7}\right) = 5.0$ percent.25 The comparison in (1.34) depends, at this point, on the relative amount of economic activity exposed to the spillovers, $f_s/f_m$. This quantity could be small if, as is the case with Mumbai-Delhi, the link between the endpoints is an important trade route. It could also be large if, as is also the case with Mumbai-Delhi, there is a great amount of economic activity in the endpoints' neighboring areas which are exposed to spillover traffic. Assuming for instance that $f_s/f_m = 1$, it follows that building the M-D line achieves the same gains as shortening the A-R line by 15.7 percent.
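The back-of-the-envelope calculation above can be reproduced directly. The sketch below uses only the figures quoted in the text, together with the text's own hypothetical scenario of utilization falling from 100 to 50 percent and $f_s/f_m = 1$; the function and variable names are illustrative.

```python
# Estimates quoted in the text (percent per Duronto route) and the scenario
# the text assumes for the hypothetical new Mumbai-Delhi line.
main_cost_per_route = 0.8        # average-cost increase on the Duronto's own line
spillover_cost_per_route = 0.7   # cost increase in neighboring spillover areas
route_share_of_capacity = 7.0    # one route as a percent of median line capacity
utilization_before, utilization_after = 100.0, 50.0

# Decongestion expressed in "Duronto route equivalents", assuming effects
# scale proportionally with utilization as the text supposes.
routes_equivalent = (utilization_before - utilization_after) / route_share_of_capacity
tau_m = main_cost_per_route * routes_equivalent
tau_s = spillover_cost_per_route * routes_equivalent


def breakeven_shortening(tau_m, tau_s, fs_over_fm):
    """Right-hand side of (1.34): the distance reduction that yields equal gains."""
    return tau_m + 2.0 * fs_over_fm * tau_s


threshold = breakeven_shortening(tau_m, tau_s, fs_over_fm=1.0)

print(f"tau_m = {tau_m:.1f} percent, tau_s = {tau_s:.1f} percent")
print(f"break-even shortening of the A-R line = {threshold:.1f} percent")
```

Varying `fs_over_fm` in the call shows how strongly the comparison depends on the volume of activity exposed to spillovers, since that term enters with a coefficient of two.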

While this figure compares, in a stylized vacuum, the effects of de-congesting versus shortening, the numbers are more stark accounting for the actual levels of economic activity in the real-life comparison cities. In particular, letting $f_{m_1}$ be bilateral trade between Mumbai and Delhi and $f_{m_2}$ be bilateral trade between Amaravati and Raipur, the comparison in (1.34) becomes

$$\Delta d \lessgtr \frac{f_{m_1}}{f_{m_2}} \tilde{\tau}_m + 2 \frac{f_s}{f_{m_2}} \tilde{\tau}_s. \tag{1.35}$$

In particular, the economic advantage to building the Mumbai-Delhi line becomes greater if there is more trade between these cities than between Amaravati and Raipur

(high 푓푚1 /푓푚2 ), or if there is more activity in the spillover areas relative to that

25This could also lead to some offsetting of the gains from decongesting the main line, astraffic previously on the long route moves back to this main line. In the extreme, the “fundamental law” of road congestion (Duranton and Turner, 2011) holds that building a new route can have no effect on travel times, as travelers fill the new route and increase its travel times. Such an extreme possibility is unlikely in the case of centrally managed rail traffic, though offsetting mechanism will likely occur, to some extent, depending on how the Railways re-routes traffic.

65 26 between Amaravati and Raipur (high 푓푠/푓푚2 ). Based on rail shipment volumes along these routes and along the set of spillover routes, I obtain estimates of 푓푚1 /푓푚2 = 2.1 and 푓푚1 /푓푚2 = 4.4, implying that building the Mumbai-Delhi line achieves the same gains as shortening the Amaravati-Raipur distance by 56 percent.27 This shortening would require building a 384 kilometer line between Amaravati-Raipur, which is, of course, physically impossible, given that the straight line distance is 406 kilometers. So this comparison does reveal possible benefits to building new lines in already served but congested areas.
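The real-life version of the comparison in (1.35) can be checked the same way. The sketch below is illustrative only: it plugs the reported trade ratios into the right-hand side of (1.35), using the cost effects of 0.8 and 0.7 percent per route and the 7 percent capacity share from the text, and then compares the implied line length with the straight-line distance. Variable names are assumptions of this sketch.

```python
# Right-hand side of comparison (1.35) with the estimates reported in the text.
tau_m = 0.8 * (100 - 50) / 7   # main-line cost effect, about 5.7 percent
tau_s = 0.7 * (100 - 50) / 7   # spillover cost effect, 5.0 percent

fm1_over_fm2 = 2.1  # Mumbai-Delhi trade relative to Amaravati-Raipur trade
fs_over_fm2 = 4.4   # spillover-area activity relative to Amaravati-Raipur trade

equivalent_shortening = fm1_over_fm2 * tau_m + 2 * fs_over_fm2 * tau_s

# Feasibility check: a rail line cannot be shorter than the straight-line distance.
required_line_km = 384.0
straight_line_km = 406.0
shortening_is_feasible = required_line_km >= straight_line_km

print(f"equivalent shortening = {equivalent_shortening:.0f}%")
print(f"feasible: {shortening_is_feasible}")
```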

1.7 Conclusion

The example of the Duronto trains shows that while running additional traffic on a transport network benefits those involved with that traffic, it also imposes externalities on certain other users of the transport system. These externalities work in large part by increasing the variance of shipment times, adding uncertainty to an already uncertain world faced by developing country firms. These uncertainty effects are difficult to disentangle from negative shocks in most other settings, but here I show that they create a significant drag on the productivity of the affected firms. At the same time, one firm’s loss is a competitor’s gain, which helps to offset the affected firms’ losses in terms of congestion’s net effect.

This analysis also points to some interesting further questions. First, a full welfare analysis of the Duronto trains would depend not only on how they affect firms in intermediate districts, but also on how they benefit the passengers riding them between the endpoint districts, and on how they create congestion for other passenger trains. While these effects are beyond the scope of this paper, related work studies the benefits on the passenger side, using Railways data to study patterns of seasonal migration from rural areas to labor markets (Firth, Forster and Imbert, 2017). Second, over the long run, firms can make locational adjustments in response to conditions on the transportation network. For example, Gulyani (2001) reports that Indian automakers respond to transportation problems by clustering geographically, and thus limiting their reliance on transport infrastructure. In this light, another related paper studies how certain distortions in railway freight pricing contributed, over the long run, to agglomeration of closely related industries in certain regions of India (Firth and Liu, 2017).

²⁶ An additional complication in the real-life comparison is that building the Amaravati-Raipur line gives rail connection to previously unconnected districts between these cities. At the same time, the districts between Mumbai and Delhi gain some congestion relief from accessing the Dedicated Freight Corridor.

²⁷ In obtaining these figures, I take the Amaravati area to include the adjoining Krishna district, which contains the city of Vijayawada.

1.8 Tables and Figures

Figure 1-1: Histogram of line capacity utilization

Notes: This figure depicts the capacity utilization of the track sections on Indian Railways. The utilization percentage is measured as the average daily number of trains passing on the section, divided by the prescribed amount of traffic for that section, which is based on an engineering rule known as Scott’s Formula. Source: Indian Railways Line Capacity Charts.

Figure 1-2: Route-wise average freight shipment times

Notes: This figure plots cross-sectional route-wise average run times for freight shipments, against the track distance of the route. Each point reflects the annual average run time, in days, for an origin-destination pair between which freight shipment takes place. These points are compared, first, against the benchmark time it would take if shipments maintained the standard freight shipment speed of 25 kilometers per hour. The other line is a best fit of run time as a function of distance. The $R^2$ from regressing run time on distance is 0.19. Source: Indian Railways freight shipment database.
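As a rough illustration of the benchmark line in Figure 1-2, the sketch below converts track distance into run time at the standard freight speed of 25 kilometers per hour, and compares a 1,000 km route with the mean normalized shipping time of 5.11 days reported in Table 1.1. The names are illustrative, and the assumption that the benchmark train runs 24 hours a day is made for this sketch rather than stated in the figure notes.

```python
STANDARD_FREIGHT_SPEED_KMH = 25.0  # benchmark speed stated in the figure notes
HOURS_PER_DAY = 24.0               # assumption: the benchmark train never stops


def benchmark_run_time_days(distance_km: float) -> float:
    """Days needed to cover distance_km at the standard freight speed."""
    return distance_km / (STANDARD_FREIGHT_SPEED_KMH * HOURS_PER_DAY)


benchmark_1000km = benchmark_run_time_days(1000.0)
observed_mean_1000km = 5.11  # Table 1.1: mean freight ship time per 1000 km, days
slowdown_ratio = observed_mean_1000km / benchmark_1000km

print(f"benchmark: {benchmark_1000km:.2f} days per 1000 km")
print(f"observed mean is {slowdown_ratio:.1f}x the benchmark")
```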

Figure 1-3: Sample from scraped website with data on train routes

Notes: This figure shows a sample of the information about each train which is scraped from the website IndiaRailInfo. For each train, data is collected on the actual route the train follows, and the shortest possible path between its endpoints.

Figure 1-4: Goods shipped by rail in India

(a) Composition of rail freight traffic (b) Modal shares

Notes: This figure shows that a certain set of “rail goods” account for the bulk of railway freight traffic in India, and conversely that these goods rely heavily on the rails rather than other modes of transportation. Panel (a) shows the commodity-wise composition by weight of goods shipped by rail; the composition by value is similar. Panel (b) shows, for each of these commodity categories with available data, the fractions of freight shipped by Rail, by Road, and by Other modes of transport. Source: Ministry of Railways (2011).

Figure 1-5: Reduced form empirical strategy, accounting for spillover effects

(a) Basic reduced form

(b) Spillovers from diversion of traffic

Figure 1-6: District-wise exposure to Duronto routes

Notes: This figure depicts each district’s exposure to the Duronto treatment, defined, as in the text, as the number of two-way Duronto shortest-path routes passing through the district. It shows the cumulative treatment as of 2012, including all trains added between 2009 and 2012. The sample is restricted to districts on the shortest path between the major cities connected by the Duronto program, and which were therefore places that the Durontos conceivably could have run. The out of sample areas, including the endpoint districts actually served by the Durontos, are shaded in black.

Figure 1-7: District-wise exposure to spillover routes

Notes: This figure depicts each district’s exposure to the spillover traffic from the Duronto treatment, defined, as in the text, as the number of Duronto routes for which the district lies on a “diversion” route. It shows the cumulative treatment as of 2012, including all trains added between 2009 and 2012. The sample is restricted to districts on the shortest path between the major cities connected by the Duronto program, and which were therefore places that the Durontos conceivably could have run. The out of sample areas, including the endpoint districts actually served by the Durontos, are shaded in black.

Figure 1-8: Event study for effect of Durontos

(a) Effect on ln(Revenue) (b) Effect on ln(TFPR)

(c) Effect on ln(Average Cost) (d) Effect on ln(Inventory)

(e) Effect on congestion

Notes: In panels (a) to (d), this figure shows event studies for the effect of introducing a Duronto route on each of the four main firm outcomes of interest. Panel (e) presents a “zero-th stage event study”, showing at the track section level, the effect of a new Duronto route on the amount of traffic running on that section.

Figure 1-9: Mean and variance response to Durontos, as a function of pre-existing congestion

Notes: This figure shows how shipping times respond to increased traffic, consistent with the railway model from operations research. Specifically, it plots the $\beta_c$ coefficients from the regressions

$$\ln M_{dy} = \sum_{c=50}^{160} \beta_c^{M} \left( D_{dy} \times 1[c \le C_{d,2008} < c + 10] \right) + \gamma_d + \gamma_y + \epsilon_{dy}$$

$$\ln V_{dy} = \sum_{c=50}^{160} \beta_c^{V} \left( D_{dy} \times 1[c \le C_{d,2008} < c + 10] \right) + \gamma_d + \gamma_y + \epsilon_{dy}.$$

Figure 1-10: Event study for the effect of $D_{dy} \times T_{d,y=y_0}$ on revenue

Notes: This figure shows an event study for the effect on revenue of the interaction of Duronto traffic with district pre-existing congestion. Specifically, it plots the $\beta_y$ coefficients in

$$\ln(\text{Revenue})_{it} = \sum_{y=2006}^{2012} \beta_y \left( D_{d,2012} \times T_{d,2008} \times 1[t = y] \right) + D_{dt} + S_{dt} + \gamma_i + \gamma_{t \times s} + \gamma_{t \times k} + \epsilon_{it}.$$

Figure 1-11: Effects of two hypothetical construction projects

(a) Stylized Mumbai-Delhi corridor (b) Stylized Amaravati-Raipur corridor

(c) Map of cities involved

Table 1.1: Descriptive statistics for factories in rail-using industries

                                                            Δ by eventual treatment
                                    Mean      St. Dev.      Duronto       Spillover
                                    (1)       (2)           (3)           (4)
Firm variables, at factory level
Revenue (million INR)               1251.7    3096.1        29.304        15.31
                                                            (20.634)      (26.368)
ln(TFPR)                            2.389     0.864         -0.022*       0.001
                                                            (0.013)       (0.008)
Average cost                        1.013     1.136         -0.006        0.013
                                                            (0.008)       (0.015)
Total inventory (million INR)       185.5     480.8         3.254         -0.713
                                                            (3.399)       (4.268)
  Inputs                            108.9     278.2         1.713         -1.142
                                                            (2.054)       (2.462)
  Finished goods                    76.3      190.0         1.437         0.776
                                                            (1.308)       (1.759)
Input share of rail goods           0.265     0.189         -0.005*       -0.003
                                                            (0.003)       (0.004)
Makes rail good (dummy)             0.631     0.483         0.003         0.011*
                                                            (0.004)       (0.006)
Survival until 2012                 0.534     0.499         -0.011        0.002
                                                            (0.007)       (0.012)

Rail traffic variables
Line capacity (trains per day)      32.4      28.2          -1.055        0.997
                                                            (2.193)       (2.312)
Line capacity utilization %         95.2      11.8          0.406         0.744
                                                            (0.452)       (0.539)
  % passenger traffic               66.3      16.1          0.263         -0.560
                                                            (0.588)       (0.582)
  % freight traffic                 28.1      15.3          -0.157        0.403
                                                            (0.573)       (0.683)
  % other traffic                   5.5       4.8           -0.110        0.137
                                                            (0.076)       (0.242)
Mean freight ship time, days        5.11      4.10          0.624         -0.255
  (normalized to 1000km)                                    (0.810)       (0.898)
Variance of freight time, days      6.39      11.19         -0.037        -0.402
  (normalized to 1000km)                                    (0.722)       (0.734)

Factories in sample                 8281
Districts in sample                 248

Notes: This table presents descriptive statistics for ASI factories in rail-using industries, defined as those which either (a) produce a good commonly shipped by rail (coal, iron, steel, cement, fertilizers, food grains, mineral oils), or (b) whose input cost share for the median firm in pre-2009 data is at least 5 percent. The rail traffic variables are district-level measures for the rail lines in the districts containing at least one of these rail-using factories. The rail shipping time variables are calculated as a weighted average over all of the shipping routes going to and from the district, weighted by the number of freight trains run on each route. Sources: ASI, Indian Railways Line Capacity data, Indian Railways Freight Shipment data. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$

Table 1.2: Effects of Durontos on railway line traffic patterns

                                Main specification                    With second-order spillovers
                                Passenger   Freight     Total         Passenger   Freight     Total
                                trains      trains      trains        trains      trains      trains
                                (1)         (2)         (3)           (4)         (5)         (6)
Duronto routes                  0.611***    0.0208      0.666***      0.602***    0.0286      0.666***
                                (0.131)     (0.118)     (0.177)       (0.130)     (0.120)     (0.178)

Spillover exposure              0.221**     0.227**     0.408***      0.210**     0.239**     0.411***
(alternate routes)              (0.0917)    (0.114)     (0.152)       (0.0907)    (0.115)     (0.154)

Second-order spillovers                                               0.0485      -0.0499*    -0.00979
                                                                      (0.0389)    (0.0256)    (0.0473)

Mean of dep. var.               27.43       13.6        43.57         25.83       12.71       40.97
$R^2$ (adjusted, within)        0.042       0.006       0.031         0.040       0.007       0.030
Observations                    2198        2198        2198          2494        2494        2494

Section FE                      X           X           X             X           X           X
Yr × Sample FE for {Dur,Alt}    X           X           X             X           X           X
Yr × Sample FE for S-O                                                X           X           X

Notes: This table presents estimates of equation (1.2), showing the “zero-th stage” effect of Duronto trains on railway congestion. It is estimated at the level of the track section, where the dependent variable is the annual daily average number of trains of each type running on the section. The first independent variable is the number of Duronto trains (based on the shortest path between endpoints) scheduled to run on the section as of that year. The next independent variable, spillover exposure, is the number of introduced Duronto trains for which the section lies on a spillover alternate route, as defined in the text. The second-order spillovers variable, considered only in Columns (4) through (6), indicates the exposure of the district to the alternate routes of these alternate routes, showing that traffic spillovers do not extend quite this far. Standard errors in parentheses, clustered by track section. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.3: Reduced form effects of Duronto trains on rail-using firms

                                      ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                      (1)           (2)           (3)           (4)
Panel A. Preferred specification
Duronto routes through district       −0.0194***    −0.0111***    0.0081**      0.0097*
                                      (0.0050)      (0.0041)      (0.0032)      (0.0053)

Spillover routes through district     −0.0110*      −0.0064       0.0072*       0.0006
                                      (0.0063)      (0.0042)      (0.0042)      (0.0064)

Panel B. Without spillover control
Duronto routes through district       −0.0125**     −0.0075*      0.0036        0.0093*
                                      (0.0049)      (0.0041)      (0.0030)      (0.0053)

Observations                          27558         26896         21624         26618
Clusters 1 (factories)                6191          6074          5238          5964
Clusters 2 (district × year)          1932          1914          1866          1928

Notes: This table presents estimates of equation (1.1), for factories in rail-using industries. The dependent variables are the four main outcomes of interest as defined in the text. The regressors are the number of two-way Duronto routes (based on shortest path) passing through the district as of the current year, and the number of introduced Duronto trains for which the district lies on a spillover alternate route, as defined in the text. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year (Cameron, Gelbach and Miller, 2011). * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.4: Placebo effects on non-rail-using firms

                                      ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                      (1)           (2)           (3)           (4)
Panel A. Preferred specification
Duronto routes through district       −0.0006       −0.0031       0.0011        0.0021
                                      (0.0044)      (0.0019)      (0.0032)      (0.0049)

Spillover routes through district     −0.0023       −0.0024       0.0012        −0.0015
                                      (0.0059)      (0.0028)      (0.0041)      (0.0060)

Panel B. Without spillover control
Duronto routes through district       0.0003        −0.0019       0.0004        0.0031
                                      (0.0044)      (0.0019)      (0.0033)      (0.0049)

Observations                          50483         48101         37688         45012
Clusters 1 (factories)                10844         10420         8664          9690
Clusters 2 (district × year)          2329          2293          2248          2305

Notes: This table presents estimates of equation (1.1), for factories in non-rail-using industries. All other details are as in Table 1.3 above. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.5: First stage effects of Duronto traffic on freight shipment times

                                        ln(Mean)      ln(Variance)
                                        (1)           (2)
Duronto routes through district         0.113***      0.039
                                        (0.028)       (0.026)

(Duronto routes)×(2008 congestion)      -0.021        0.211***
                                        (0.031)       (0.042)

Observations                            6896          6896
Clusters (districts)                    174           174
$F$ statistic                           19.22         27.43

Control for spillovers                  X             X

Notes: This table presents estimates of equations (1.7), indicating the first stage effect of Duronto traffic through a district on the (log) mean and (log) variance of annual shipping times to and from the district. The district level shipping time measures are calculated using the set of freight routes which remain in operation, with at least one train running in each year, throughout the sample period. The measure of 2008 congestion is the total amount of traffic on all of the railway lines in the district, divided by the prescribed line capacity. Both regressions include fixed effects for district and year. Robust standard errors in parentheses, with clustering by district. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.6: 2SLS estimates of mean and variance effects

                                        ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                        (1)           (2)           (3)           (4)
Panel A. 2SLS
ln(Mean)                                −0.029        −0.025        0.001         0.032*
                                        (0.026)       (0.016)       (0.018)       (0.019)
ln(Variance)                            −0.107***     −0.033*       0.034*        0.042*
                                        (0.031)       (0.019)       (0.019)       (0.023)

Panel B. Reduced form
Duronto routes through district         −0.007        −0.004        0.001         0.005*
                                        (0.007)       (0.003)       (0.003)       (0.003)

(Duronto routes)×(2008 congestion)      −0.022***     −0.006        0.007*        0.008**
                                        (0.006)       (0.004)       (0.004)       (0.004)

Observations                            6896          6682          6390          6676
Clusters 1 (factories)                  3448          3341          3195          3338
Clusters 2 (district × year)            348           348           344           348

Control for spillovers, exposure        X             X             X             X

Notes: Panel A of this table presents second stage estimates of equation (1.4), showing the effects of mean and variance of shipping time on the four main firm outcomes of interest. Panel B presents reduced form estimates of these outcomes on the instruments specified in (1.7). All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.7: Model estimates of cost and competition effects

                                      ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                      (1)           (2)           (3)           (4)
Duronto routes through district       −0.0308***    −0.0115***    0.0079**      0.0094**
                                      (0.0050)      (0.0037)      (0.0035)      (0.0046)

Spillover routes through district     −0.0104*      −0.0079*      0.0069*       0.0004
                                      (0.0061)      (0.0046)      (0.0037)      (0.0064)

Exposure of (State × Industry)        0.0249**      0.0035        0.0012        0.0011
to Duronto routes                     (0.0106)      (0.0079)      (0.0110)      (0.0131)

Exposure of (State × Industry)        0.0186        −0.0108       −0.0036       −0.0014
to spillover routes                   (0.0125)      (0.0082)      (0.0127)      (0.0147)

Observations                          27558         26896         21624         26618
Clusters 1 (factories)                6191          6074          5238          5964
Clusters 2 (district × year)          1932          1914          1866          1928

Notes: This table presents estimates of equation (1.28), for factories in rail-using industries. The dependent variables are the four main outcomes of interest as defined in the text. The regressors are the number of Duronto and spillover routes passing through the district, along with the exposure of other district competitors in the same state and 4-digit NIC industry to Duronto and spillover traffic, weighted by the 2008 industry revenue in the competing district. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.8: Model estimates of cost and competition effects, with elasticity interactions

                                      ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                      (1)           (2)           (3)           (4)
Duronto routes through district       −0.0014       −0.0081**     0.0080**      0.0089**
                                      (0.0052)      (0.0039)      (0.0037)      (0.0041)

(Duronto routes) × $\sigma$           −0.0065***    −0.0006**     −0.0001       0.0002
                                      (0.0004)      (0.0003)      (0.0004)      (0.0006)

Spillover routes through district     −0.0046       −0.0056       0.0067*       0.0004
                                      (0.0064)      (0.0052)      (0.0039)      (0.0073)

(Spillover routes) × $\sigma$         −0.0012*      −0.0003       0.0002        0.0003
                                      (0.0007)      (0.0005)      (0.0005)      (0.0007)

Exposure of (State × Industry)        0.0231*       0.0039        0.0022        0.0019
to Duronto routes                     (0.0119)      (0.0084)      (0.0117)      (0.0140)

Exposure of (State × Industry)        0.0201        −0.0094       −0.0034       −0.0008
to spillover routes                   (0.0129)      (0.0088)      (0.0140)      (0.0151)

Observations                          27558         26896         21624         26618
Clusters 1 (factories)                6191          6074          5238          5964
Clusters 2 (district × year)          1932          1914          1866          1928

Notes: This table presents estimates of equation (1.28) for factories in rail-using industries, adding regressors capturing the interaction between Duronto and spillover traffic and the industry elasticity of substitution coming from Broda et al. (2006). All regressions include fixed effects for factory, year by state, and year by NIC industry. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.9: Aggregate effects of Duronto congestion on revenue, at state-industry level

Dependent variable: $\ln PY$

                                      (1)           (2)           (3)           (4)
Exposure of (State × Industry)        −0.0132       −0.0020       −0.0139       −0.0028
to Duronto routes                     (0.0159)      (0.0227)      (0.0155)      (0.0224)

Exposure of (State × Industry)                                    −0.0048       −0.0051
to spillover routes                                               (0.0097)      (0.0206)

Observations                          7901          7883          7901          7883
Clusters (state × industry)           1932          1914          1932          1914

State × Industry FE                   X             X             X             X
Year FE                               X                           X
State × Yr FE, Ind × Yr FE                          X                           X

Notes: This table presents estimates of equation (1.29), showing the effect of Duronto and spillover traffic exposure on aggregate sales for each state by 4-digit NIC industry. All regressions include fixed effects for year and state by industry, with Columns (2) and (4) adding effects for year by state and year by industry. Robust standard errors in parentheses, with clustering by state times industry. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.10: Aggregate effects of Duronto congestion on gross value added, at state-industry level

Dependent variable: $\ln GVA$

                                      (1)           (2)           (3)           (4)
Exposure of (State × Industry)        −0.0149       −0.0058       −0.0098       −0.0073
to Duronto routes                     (0.0164)      (0.0231)      (0.0160)      (0.0230)

Exposure of (State × Industry)                                    −0.0056       −0.0043
to spillover routes                                               (0.0100)      (0.0208)

Observations                          7850          7831          7850          7831
Clusters (state × industry)           1909          1901          1909          1901

State × Industry FE                   X             X             X             X
Year FE                               X                           X
State × Yr FE, Ind × Yr FE                          X                           X

Notes: This table presents estimates of equation (1.29), showing the effect of Duronto and spillover traffic exposure on aggregate (log) gross value added for each state by 4-digit NIC industry. All regressions include fixed effects for year and state by industry, with Columns (2) and (4) adding effects for year by state and year by industry. Robust standard errors in parentheses, with clustering by state times industry. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.11: The cost of running one Duronto train

Panel A: Cost of running one Duronto route, imperfect competition

                                  Loss for affected firms   Gain for competitors   Net effect
                                  (1)                       (2)                    (3)
Duronto direct effects            -461.2                    372                    -89.2
Spillover effects                 -154.9                    195.4                  40.5

Total (million INR)               -616.1                    567.4                  -48.7

Panel B: All firms experience congestion increase equivalent to one Duronto, perfect competition

                                  Rail-using firms          All manufacturing
                                  (4)                       (5)
Total output loss (million INR)   94,962                    258,551

Notes: All figures in this table are in millions of 2008 Indian rupees (nominal exchange rate is 48.5 INR per USD; exchange rate at PPP is 12.9 INR per USD). Calculations are as described in the text, with Panel A reporting the estimated revenue loss for rail-using firms of running one Duronto train, inclusive of direct losses to affected firms and gains to their competitors. A point of comparison for these figures is the annual passenger fare revenue from one of these routes, which I estimate at INR 1,024 million. Panel B reports the aggregate effects of exposing all firms to a cost shock equivalent to that estimated for the Duronto-affected firms.
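The accounting in Panel A can be verified directly. The sketch below is illustrative only: it sums the losses and gains reported in the table, and then expresses the net loss relative to the estimated fare revenue and in US dollars at the nominal exchange rate given in the notes. The dictionary layout and variable names are assumptions of this sketch.

```python
# Panel A of Table 1.11, in millions of 2008 INR: (loss for affected firms, gain for competitors).
panel_a = {
    "Duronto direct effects": (-461.2, 372.0),
    "Spillover effects": (-154.9, 195.4),
}

# Net effect per row is the loss plus the offsetting competitor gain.
net_effects = {name: loss + gain for name, (loss, gain) in panel_a.items()}

total_loss = sum(loss for loss, _ in panel_a.values())
total_gain = sum(gain for _, gain in panel_a.values())
total_net = total_loss + total_gain

FARE_REVENUE = 1024.0   # estimated annual passenger fare revenue per route, million INR
INR_PER_USD = 48.5      # nominal exchange rate from the table notes

net_loss_share_of_fares = abs(total_net) / FARE_REVENUE * 100
total_net_usd = total_net / INR_PER_USD

for name, value in net_effects.items():
    print(f"{name}: {value:.1f}")
print(f"Total net: {total_net:.1f} million INR ({total_net_usd:.2f} million USD)")
print(f"Net loss as share of fare revenue: {net_loss_share_of_fares:.1f}%")
```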

1.9 Appendix

Additional firm outcomes

Table 1.12: Effects on district railway traffic

                          Passenger trips               Freight trains
                          Originating   Terminating     Originating   Terminating
                          (1)           (2)             (3)           (4)
Durontos                  0.0062        0.0029          −0.0081       0.0012
                          (0.0113)      (0.0111)        (0.0121)      (0.0169)
Spillovers                0.0014        0.0017          −0.0058       −0.0119
                          (0.0122)      (0.0139)        (0.0151)      (0.0149)

Observations              304           312             224           224
Clusters (districts)      152           156             112           112

Notes: This table presents a district level regression of passenger and freight trips, on the amount of Duronto and spillover traffic introduced through the district. The dependent variables, all in natural logarithms, measure the number of trips of each type either originating or terminating in the district. The regression sample includes only 2011 to 2012, the first years for which passenger and freight traffic data are available. All regressions include fixed effects for district, and year by state. Robust standard errors in parentheses, with clustering by district. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.13: Effects on firm logistics

                                Input         Output        Transport     Transport
                                inventory     inventory     expenses      equipment
                                (1)           (2)           (3)           (4)
Durontos                        0.0173***     0.0023        −0.0081       0.0012
                                (0.0052)      (0.0063)      (0.0124)      (0.0164)
Spillovers                      0.0015        0.0004        −0.0058       −0.0119
                                (0.0066)      (0.0074)      (0.0161)      (0.0199)

Observations                    27558         26896         21624         26618
Clusters 1 (factories)          6191          6074          5238          5964
Clusters 2 (district × year)    1932          1914          1866          1928

Notes: This table presents estimates of equation (1.1), for factories in rail-using industries. The dependent variables, all in natural logarithms, provide measures of the factory’s logistical response to railway congestion. Column (1) shows effects on holdings of input inventory, while Column (2) shows effects on holdings of finished goods inventory. Column (3) shows effects on the firm’s “other distributional expenses”, the expense category into which shipping expense falls. Column (4) shows effects on the amount of transport equipment owned. The regressors are the number of two-way Duronto routes (based on shortest path) passing through the district as of the current year, and the number of introduced Duronto trains for which the district lies on a spillover alternate route, as defined in the text. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.14: Effects on firm product mix

                                Number of       Time          Demand        Product
                                products made   sensitivity   uncertainty   complexity
                                (1)             (2)           (3)           (4)
Durontos                        −0.0114***      −0.0022*      −0.0119***    −0.0057*
                                (0.0030)        (0.0012)      (0.0042)      (0.0029)
Spillovers                      −0.0049         −0.0041**     −0.0055       −0.0041
                                (0.0042)        (0.0018)      (0.0046)      (0.0038)

Observations                    22704           22615         23138         22614
Clusters 1 (factories)          5491            5241          5313          5241
Clusters 2 (district × year)    1825            1876          1885          1876

Notes: This table presents estimates of equation (1.1), for factories in rail-using industries. The dependent variables, all in natural logarithms, provide measures of the factory’s product mix. Column (1) shows effects on the number of distinct products produced. Column (2) shows effects on the average time sensitivity of the products made, weighted across products by output value. As described in the text, product level measures of time sensitivity come from Hummels and Schaur (2013). Column (3) shows effects on the average demand uncertainty of the products made, again weighted by product output value. The measure of demand uncertainty is as described in the text and is as used in Blanchard and Simon (2001). Column (4) shows effects on product complexity, measured as in Levchenko (2007) as the (inverse) Herfindahl index of the inputs used to make the product according to US input-output tables. The regressors are the number of two-way Duronto routes (based on shortest path) passing through the district as of the current year, and the number of introduced Duronto trains for which the district lies on a spillover alternate route, as defined in the text. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Other heterogeneity

Table 1.15: Heterogeneity by use of rail goods as inputs, and production of rail goods as output

                                      ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                      (1)           (2)           (3)           (4)
Durontos                              −0.0040       −0.0021       −0.0029       0.0017
                                      (0.0067)      (0.0044)      (0.0078)      (0.0066)

(Durontos) × (Rail input share)       −0.0351*      −0.0287*      0.0258        0.0291**
                                      (0.0204)      (0.0165)      (0.0191)      (0.0145)

(Durontos) × (Makes rail good)        −0.0173**     −0.0124**     0.0102        −0.0011
                                      (0.0077)      (0.0056)      (0.0092)      (0.0084)

Spillovers                            −0.0099       −0.0050       0.0046        −0.0008
                                      (0.0089)      (0.0055)      (0.0106)      (0.0097)

(Spillovers) × (Rail input share)     −0.0038       −0.0026       0.0061        0.0025
                                      (0.0243)      (0.0221)      (0.0210)      (0.0289)

(Spillovers) × (Makes rail good)      −0.0052       −0.0030       0.0184*       0.0018
                                      (0.0102)      (0.0079)      (0.0107)      (0.0112)

Observations                          27558         26896         21624         26618
Clusters 1 (factories)                6191          6074          5238          5964
Clusters 2 (district × year)          1932          1914          1866          1928

Notes: This table shows heterogeneity in the Duronto and spillover effects, based on whether the factory is likely to rely on the rails for sourcing inputs or for delivering output. The rail input share is the industry’s total input cost share of the goods typically shipped by rail in India (coal, iron, steel, cement, fertilizers, foodgrains, and mineral oils). The mean of this input share is 0.27 in the regression sample. Makes rail good is an indicator of whether the factory produces one of the rail goods. Its mean in the regression sample is 0.53. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.16: Heterogeneity by road density

                                      ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                      (1)           (2)           (3)           (4)
Durontos                              −0.0188***    −0.0113**     −0.0082**     0.0094
                                      (0.0060)      (0.0047)      (0.0041)      (0.0061)

(Durontos) × (Road density)           0.0031        0.0048        −0.0046       0.0025
                                      (0.0075)      (0.0060)      (0.0088)      (0.0090)

Spillovers                            −0.0117*      −0.0079*      0.0076*       0.0009
                                      (0.0061)      (0.0043)      (0.0043)      (0.0062)

(Spillovers) × (Road density)         −0.0023       −0.0073       0.0059        −0.0011
                                      (0.0068)      (0.0047)      (0.0079)      (0.0083)

Observations                          27558         26896         21624         26618
Clusters 1 (factories)                6191          6074          5238          5964
Clusters 2 (district × year)          1932          1914          1866          1928

Notes: This table shows heterogeneity in the Duronto and spillover effects, as a function of state road density, measured as kilometers of national highway per square kilometer of area. This road density variable is standardized so it has mean 0 and standard deviation 1. Its raw mean is 0.036 and raw standard deviation is 0.015. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Robustness

Table 1.17: Reduced form estimates, controlling for distance to cities served by Durontos

                                      ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                      (1)           (2)           (3)           (4)
Duronto routes through district       −0.0181***    −0.0102**     0.0078**      0.0098*
                                      (0.0057)      (0.0043)      (0.0036)      (0.0054)

Spillover routes through district     −0.0109*      −0.0051       0.0074*       0.0017
                                      (0.0065)      (0.0044)      (0.0043)      (0.0067)

Distance to Duronto endpoint          −0.0012       −0.0025       0.0036        −0.0002
(hundred km.)                         (0.0021)      (0.0018)      (0.0029)      (0.0028)

Observations                          27558         26896         21624         26618
Clusters 1 (factories)                6191          6074          5238          5964
Clusters 2 (district × year)          1932          1914          1866          1928

Notes: This table presents estimates of equation (1.1), adding a control for the distance to the nearest city with a new Duronto train serving it in that year, measured in hundreds of kilometers. The dependent variables are the four main outcomes of interest as defined in the text. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.18: Reduced form estimates, controlling for Duronto and spillover traffic on shipping lines

                                      ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                      (1)           (2)           (3)           (4)
Duronto routes through district       −0.0186***    −0.0133***    0.0071*       0.0014
                                      (0.0061)      (0.0046)      (0.0039)      (0.0057)

Spillover routes through district     −0.0124*      −0.0055       0.0079*       0.0025
                                      (0.0071)      (0.0044)      (0.0042)      (0.0069)

Duronto traffic                       −0.2422       −0.3206       0.0094        0.8749*
on shipping lines                     (0.5750)      (0.4421)      (0.3892)      (0.4855)

Spillover traffic                     −0.4519       −0.2110       −0.1002       −0.0439
on shipping lines                     (0.7041)      (0.5698)      (0.5224)      (0.6200)

Observations                          21582         21034         19770         20993
Clusters 1 (factories)                3007          2986          2789          2994
Clusters 2 (district × year)          972           960           908           964

Notes: This table presents estimates of equation (1.1), adding a control for the amount of Duronto and spillover traffic introduced along the tracks used for railway shipments to and from each district. This traffic is measured as a fraction of the tracks’ line capacity, and averaged over all the routes used for the district’s shipments, weighted by the total number of shipments, between 2011 and 2015 (which is all years with available data). The dependent variables are the four main outcomes of interest as defined in the text. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * $p < 0.10$, ** $p < 0.05$, *** $p < 0.01$.

Table 1.19: Reduced form estimates, controlling for changes in market access

                                     ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                     (1)           (2)           (3)           (4)
Duronto routes through district      −0.0178***    −0.0123***    0.0074**      0.0025
                                     (0.0054)      (0.0039)      (0.0033)      (0.0053)
Spillover routes through district    −0.0119*      −0.0056       0.0076*       0.0033
                                     (0.0064)      (0.0041)      (0.0039)      (0.0065)
ln(Market access)                    −0.0834       −0.1036       0.0036        −0.1589
                                     (0.1588)      (0.1624)      (0.1622)      (0.1667)

Observations                         27558         26896         21624         26618
Clusters 1 (factories)               6191          6074          5238          5964
Clusters 2 (district × year)         1932          1914          1866          1928

Notes: This table presents estimates of equation (1.1), adding a control for the log of district market access, computed from the first-order approximation in equation (1.36). The dependent variables are the four main outcomes of interest as defined in the text. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Table 1.20: Reduced form estimates, with sample including all districts in mainland India

                                     ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                     (1)           (2)           (3)           (4)
Duronto routes through district      −0.0193***    −0.0103***    0.0074**      0.0099**
                                     (0.0045)      (0.0037)      (0.0029)      (0.0048)
Spillover routes through district    −0.0103*      −0.0058       0.0074*       0.0006
                                     (0.0057)      (0.0038)      (0.0038)      (0.0058)

Observations                         37651         36708         30412         20993
Clusters 1 (factories)               8311          8150          7179          8012
Clusters 2 (district × year)         2855          2841          2772          2843

Notes: This table presents estimates of equation (1.1), where the sample includes all districts in mainland India as possible controls, not only the districts located between major cities as in the preferred specification. The dependent variables are the four main outcomes of interest as defined in the text. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Table 1.21: Reduced form estimates, with sample excluding “donut” around Duronto endpoints

                                     ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                     (1)           (2)           (3)           (4)
Duronto routes through district      −0.0189***    −0.0106**     0.0081**      0.0096*
                                     (0.0053)      (0.0044)      (0.0035)      (0.0055)
Spillover routes through district    −0.0110*      −0.0062       0.0072        0.0009
                                     (0.0066)      (0.0045)      (0.0045)      (0.0069)

Observations                         25874         25488         19760         25791
Clusters 1 (factories)               5740          5723          5286          5731
Clusters 2 (district × year)         1791          1780          1683          1775

Notes: This table presents estimates of equation (1.1), where the sample excludes all districts within 100km of the districts receiving Duronto passenger train service. The dependent variables are the four main outcomes of interest as defined in the text. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Table 1.22: Reduced form estimates, with narrower definition of spillover route

                                     ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                     (1)           (2)           (3)           (4)
Duronto routes through district      −0.0194***    −0.0109**     0.0080***     0.0097
                                     (0.0052)      (0.0047)      (0.0029)      (0.0063)
Spillover routes through district    −0.0109*      −0.0065       0.0073*       0.0034
                                     (0.0064)      (0.0041)      (0.0044)      (0.0059)

Observations                         27558         26896         21624         26618
Clusters 1 (factories)               6191          6074          5238          5964
Clusters 2 (district × year)         1932          1914          1866          1928

Notes: This table presents estimates of equation (1.1), with spillover routes defined to include only those paths crossed by trains traveling between the Duronto endpoints, instead of between any pair of points on the Duronto route. This traffic is measured as a fraction of the tracks’ line capacity, and averaged over all the routes used for the district’s shipments, weighted by the total number of shipments, between 2011 and 2015 (which is all years with available data). The dependent variables are the four main outcomes of interest as defined in the text. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Table 1.23: Reduced form estimates, with wider definition of spillovers, including second-order

                                     ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                     (1)           (2)           (3)           (4)
Duronto routes through district      −0.0179***    −0.0116***    0.0074**      0.0091*
                                     (0.0046)      (0.0042)      (0.0030)      (0.0055)
Spillover routes through district    −0.0103       −0.0063       0.0068        0.0006
                                     (0.0069)      (0.0043)      (0.0043)      (0.0067)

Observations                         27558         26896         21624         26618
Clusters 1 (factories)               6191          6074          5238          5964
Clusters 2 (district × year)         1932          1914          1866          1928

Notes: This table presents estimates of equation (1.1), with spillover routes defined to include not only the standard spillover routes as described in the text, but also the second-order spillover routes. This traffic is measured as a fraction of the tracks’ line capacity, and averaged over all the routes used for the district’s shipments, weighted by the total number of shipments, between 2011 and 2015 (which is all years with available data). The dependent variables are the four main outcomes of interest as defined in the text. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Table 1.24: Reduced form estimates, with spillovers restricted to 200km

                                     ln(Revenue)   ln(TFPR)      ln(Avg cost)  ln(Inventory)
                                     (1)           (2)           (3)           (4)
Duronto routes through district      −0.0191***    −0.0101**     0.0076**      0.0094
                                     (0.0055)      (0.0046)      (0.0033)      (0.0058)
Spillover routes through district    −0.0099       −0.0059       0.0068        0.0005
                                     (0.0062)      (0.0046)      (0.0045)      (0.0058)

Observations                         27558         26896         21624         26618
Clusters 1 (factories)               6191          6074          5238          5964
Clusters 2 (district × year)         1932          1914          1866          1928

Notes: This table presents estimates of equation (1.1), with spillover routes restricted to include only lines within 200km of the main Duronto route. This traffic is measured as a fraction of the tracks’ line capacity, and averaged over all the routes used for the district’s shipments, weighted by the total number of shipments, between 2011 and 2015 (which is all years with available data). The dependent variables are the four main outcomes of interest as defined in the text. All regressions include fixed effects for factory, year by state, and year by NIC industry. Robust standard errors in parentheses, with clustering by factory and district-year. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Alternate measures of exposure to Duronto congestion

Model-based measure of changes in market access

Another possible channel for the effects of Duronto traffic is through changes in a district’s “market access” (Redding and Sturm, 2008; Donaldson and Hornbeck, 2016), or the cost, net of congestion, of sending and receiving shipments to and from other districts with large product or input markets. Returning to the illustration of Figure 1-5, even if Bokaro and all of its shipping lanes were untouched by the Durontos or by the traffic pushed onto alternate routes, it might still experience gains or losses as a result of the Durontos disrupting production in other areas and prices adjusting accordingly. For example, suppose Rourkela and Bokaro compete to supply goods to some third district, Ranchi. If the Duronto train affects production in Rourkela or disrupts Rourkela-Ranchi shipments, Bokaro firms gain an advantage supplying to Ranchi and might thus enjoy increased sales. On the other hand, these effects might harm Bokaro rather than benefit it. For example, if certain Rourkela firms supply inputs to Bokaro, or to Bokaro firms’ suppliers in towns like Ranchi, then the Durontos could raise the price of these inputs.

To account for these effects, I use a measure of market access in the spirit of Donaldson and Hornbeck (2016). Based on Eaton and Kortum (2002), the Donaldson and Hornbeck (2016) model shows that all of the positive and negative general equilibrium welfare effects of an infrastructure project are captured in a statistic termed market access. Under symmetric trade costs, the market access of a district $d$ is

$$MA_d = k \sum_{d'} \tau_{dd'}^{-\theta} \, MA_{d'}^{-1} \, Y_{d'},$$

where $k$ is a constant, $d'$ indexes the other districts, $\tau_{dd'}$ is a bilateral trade cost, $\theta$ is a trade elasticity, and $Y_{d'}$ is the real output of $d'$. Intuitively, a district $d$ has greater market access when it has lower costs of trading with other districts which have large local economies but relatively little access to other districts which might compete with $d$ for business. Donaldson and Hornbeck (2016) show that this market access term is highly correlated with a first-order approximation which is better suited to empirical applications. Through a similar derivation, I arrive at the approximation

$$MA_d \approx \sum_{d'} \tau_{dd'}^{-\theta} \, Y_{d'}. \tag{1.36}$$

I parametrize $\tau_{dd'}$ by

$$\tau_{dd'} \equiv (\mathit{Traf})_{dd'}^{\psi_T} \, \mathit{Dist}_{dd'}^{\psi_D}, \tag{1.37}$$

where $(\mathit{Traf})_{dd'} \equiv (\text{Traffic}_{dd',2008} + \text{Durontos}_{dd'}) / \text{Traffic}_{dd',2008}$ is the traffic on the railway lines connecting $d$ to $d'$ once the Durontos are added, expressed as a ratio to 2008 traffic, and $\mathit{Dist}_{dd'}$ is the distance from $d$ to $d'$. The elasticity of trade costs with respect to distance is $\psi_D$. Following Ramondo, Rodríguez-Clare and Saborío-Rodríguez (2016), I set $\psi_D = 0.27$ and $\theta = 4$. Finally, I set $\psi_T = 0.02$, so the effect of congestion on trade costs is assumed, conservatively, to be less than one-tenth that of distance. These parameter choices are necessarily somewhat ad hoc, representing a limitation to using the market access approach in my setting. With richer data on the actual shipments firms make or on spatial price gaps, it would be possible to find trade cost equivalents of given amounts of congestion to make these calculations more precise.
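To make the mechanics of (1.36) and (1.37) concrete, the following is a minimal sketch of how the approximate market access term could be computed for one district. The function names, the tuple layout, and all traffic, distance, and output figures are hypothetical and chosen only for illustration; only the three parameter values come from the text.

```python
from math import log

THETA = 4.0   # trade elasticity, as set in the text
PSI_D = 0.27  # elasticity of trade costs with respect to distance
PSI_T = 0.02  # elasticity of trade costs with respect to traffic


def trade_cost(traffic_2008, durontos, dist_km):
    """Bilateral trade cost as in (1.37): traffic ratio^psi_T * distance^psi_D."""
    traf = (traffic_2008 + durontos) / traffic_2008
    return traf ** PSI_T * dist_km ** PSI_D


def market_access(links):
    """First-order approximation (1.36): sum over partners of tau^(-theta) * Y.

    `links` holds one (traffic_2008, durontos, dist_km, output) tuple per
    partner district d'. This layout is an assumption made for the sketch.
    """
    return sum(trade_cost(t, n, dist) ** (-THETA) * y for t, n, dist, y in links)


# Made-up example: two partner districts, before and after two Durontos
# are added on the line to the first partner.
before = market_access([(40, 0, 300, 100.0), (25, 0, 500, 250.0)])
after = market_access([(40, 2, 300, 100.0), (25, 0, 500, 250.0)])
log_change = log(after) - log(before)  # change in ln(MA) for this district
```

Because added traffic enters only through the ratio in (1.37) with a small exponent, the implied change in log market access in this illustration is negative but small, in line with the conservative choice of $\psi_T$.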

Controlling for $\ln MA_{dt}$ captures the effect of Duronto-associated changes in market access, which are greatest for districts which have Durontos running on the lines between them and other districts which are large or growing.28 Table 1.19 shows results of factory regressions with this market access term, in addition to the main Duronto and spillover effects. This in fact does not change the estimates of the Duronto and spillover effects, suggesting that these variables suffice to capture most of the economic effect of the added traffic, without having to turn to models involving notions like market access.

28 An identification concern arises because $MA_{dt}$ and $Y_{d't}$ are endogenously determined. However, the results do not change substantially when following the alternate approach of Donaldson and Hornbeck (2016), which holds district output fixed at 2008 levels.

Traffic in shipping lanes

Another form of exposure occurs if Duronto or spillover traffic, though not traveling through a district, nevertheless travels on the railway lines used to carry goods to or from that district. To account for these effects, I use data on freight shipment patterns. For each district, I identify the set of routes used for its shipments, and calculate the amount of Duronto traffic on these routes. First, for each section of track $n$, let

$$D_{nt} \equiv \frac{\#\text{Durontos}_{nt}}{\text{Capacity}_{n,t=2008}}$$

be the number of Durontos running on that section as of year $t$, expressed as a fraction of the section’s 2008 capacity. This section might be part of one or more origin-destination shipping routes, $r$. A measure of Duronto congestion on $r$ is

$$(\text{Route Duronto Traffic})_{rt} \equiv \sum_{n} \rho_{nr} D_{nt},$$

with weights $\rho_{nr}$ equal to the length of section $n$ divided by the total length of route $r$. So $(\text{Route Duronto Traffic})_{rt}$ is a weighted average of congestion on all of the sections $n$ which make up route $r$. Finally, for a district-level measure of Durontos’ effect on shipments, I take a weighted average of route congestion over all of the routes serving the district:

$$(\text{District's Duronto Shipping Exposure})_{dt} \equiv \sum_{r} \omega_{rd} \, (\text{Route Duronto Traffic})_{rt},$$

with weights $\omega_{rd}$ equal to the total number of route $r$ shipments serving district $d$, divided by the total number of shipments serving $d$. A limitation of these measures is that data on freight shipping patterns are available for only a limited number of districts. Nevertheless, Table 1.18 presents reduced form results incorporating these controls for the available districts, showing again that they do not produce an additional effect conditional on the amount of Duronto and spillover traffic passing directly through the district.
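The two-level weighting above can be sketched in a few lines. This is only an illustration of the averaging, with hypothetical function names and made-up section lengths, capacities, and shipment counts; it is not the code used to build the variables in Table 1.18.

```python
def route_traffic(sections):
    """Length-weighted average of D_nt over the sections of one route.

    `sections` is a list of (length_km, n_durontos, capacity_2008) tuples;
    D_nt is n_durontos / capacity_2008 and rho_nr is the length share.
    """
    total_len = sum(length for length, _, _ in sections)
    return sum((length / total_len) * (n / cap) for length, n, cap in sections)


def district_exposure(routes):
    """Shipment-weighted average of route traffic over a district's routes.

    `routes` is a list of (n_shipments, sections) pairs; omega_rd is the
    share of the district's shipments carried on each route.
    """
    total = sum(shipments for shipments, _ in routes)
    return sum((shipments / total) * route_traffic(secs) for shipments, secs in routes)


# Made-up district served by two routes.
route_a = [(100, 2, 40), (300, 0, 50)]  # two sections, Durontos on the first only
route_b = [(200, 1, 20)]                # a single section
exposure = district_exposure([(30, route_a), (10, route_b)])
```

In this example route A has traffic 0.0125 and route B has 0.05, so with shipment shares of 0.75 and 0.25 the district's exposure is 0.021875.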

Inventory model

Consider a firm solving a classic economic order quantity (EOQ) problem: the firm’s inventories deplete as it meets customer demand, then when the inventory reaches some reorder level $R$, the firm places a restocking order of quantity $Q$. Demand $d$ at any instant is random, with cdf $F$, mean $\mu_d$, and variance $\sigma_d^2$. Time $t$ is measured in years (so $\mu_d$ is the average annual demand). Expected inventory level varies between $s$ and $s + Q$, where safety stock $s$ is defined as the expected inventory level just before a restocking order arrives:

$$s = R - \mu_d \mu_\tau,$$

where $\mu_\tau$ is expected shipping time. The economically relevant portion of the problem is the time between the placement of the restocking order and the arrival of the order, since this is where the stockout risk arises. To focus on this portion of the problem and to simplify, we fix $Q$. A version of the problem without this simplification appears in Nahmias (2001). Note that $Q$ will be approximately $R - s$, since the firm will not want its inventory level when orders arrive to be systematically greater than $R$, nor systematically less. Stockout occurs if inventory on hand at the time of order ($R$) is less than total demand while the order is in transit, denoted $D$. Total demand $D$ depends on both the demand realizations at each instant and shipping time $\tau$, which is random with cdf $G$, mean $\mu_\tau$, and variance $\sigma_\tau^2$. Denote the cdf of $D$ by $H$.

Lemma. Assume 퐹 , 퐺 are independent. Then,

$$\begin{aligned} \mu_D &\equiv E[D] = \mu_d \mu_\tau \\ \sigma_D^2 &\equiv V[D] = \mu_\tau \sigma_d^2 + \mu_d^2 \sigma_\tau^2. \end{aligned} \tag{1.38}$$

Proof. See Hadley and Whitin (1963).

When stockout occurs, the firm pays penalty $p$ for each unit of unmet demand. The expected stockout penalty in one reorder cycle is therefore

$$n(R) = p \int_R^\infty (x - R) \, h(x) \, dx,$$

where $h$ is the density corresponding to $H$. The number of annual cycles is $\mu_d / Q$, so the expected annual penalty is

$$\frac{\mu_d}{Q} \, n(R). \tag{1.39}$$

Inventory holding costs depend on the interest rate $i$. Normalizing by the price of output and letting $v$ be value added, the price of an input unit is $1 - v$, so the cost of holding it in inventory is $i(1 - v)$. Average inventory holdings are

$$\bar{I} = s + \tfrac{1}{2} Q = R - \mu_d \mu_\tau + \tfrac{1}{2} Q.$$

So expected annual holding costs are

$$i(1 - v)\left(R - \mu_d \mu_\tau + \tfrac{1}{2} Q\right). \tag{1.40}$$

Combining expected penalty (1.39) with holding costs (1.40), total costs are

$$C(R) = \frac{\mu_d}{Q} \, n(R) + i(1 - v)\left(R - \mu_d \mu_\tau + \tfrac{1}{2} Q\right). \tag{1.41}$$

The firm chooses $R$ to minimize (1.41), yielding first order condition

$$0 = -\frac{\mu_d}{Q} \, p \, (1 - H(R)) + i(1 - v),$$

and optimum

$$R^* = \rho \sigma_D + \mu_d \mu_\tau, \tag{1.42}$$

where $\rho \equiv \Phi^{-1}\left(1 - \frac{i(1 - v)}{p \mu_d} Q\right)$ is constant with respect to $\mu_\tau$ and $\sigma_\tau$, and $\Phi$ is the standard normal cdf (so that (1.42) treats $D$ as normally distributed).
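As a numerical illustration of (1.38) and (1.42), the sketch below computes the reorder point for given moments of demand and shipping time. It assumes, as the use of $\Phi$ implies, that demand over the shipping window is approximately normal. The function name and every parameter value in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def reorder_point(mu_d, sigma_d, mu_tau, sigma_tau, i, v, p, Q):
    """Optimal reorder level R* = rho * sigma_D + mu_d * mu_tau, as in (1.42).

    sigma_D comes from the lemma (1.38); rho is the standard normal quantile
    implied by the first order condition. Normality of D is an assumption.
    """
    sigma_D = sqrt(mu_tau * sigma_d ** 2 + mu_d ** 2 * sigma_tau ** 2)
    service_level = 1.0 - i * (1.0 - v) * Q / (p * mu_d)
    if not 0.0 < service_level < 1.0:
        raise ValueError("first order condition has no interior solution")
    rho = NormalDist().inv_cdf(service_level)
    return rho * sigma_D + mu_d * mu_tau


# Made-up firm: annual demand 1000 units, shipping time 0.05 years on average.
base = dict(mu_d=1000.0, sigma_d=200.0, mu_tau=0.05, sigma_tau=0.01,
            i=0.10, v=0.30, p=0.50, Q=100.0)
r_base = reorder_point(**base)
r_slower = reorder_point(**{**base, "mu_tau": 0.06})         # higher mean shipping time
r_less_reliable = reorder_point(**{**base, "sigma_tau": 0.02})  # higher variance
```

In this example both a slower and a less predictable shipping time raise the reorder level, which is the comparative static the text draws from (1.42).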

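As (1.42) makes clear, average inventory level is increasing in both the mean and variance of shipping time. Also, these effects of shipping times are larger for goods with higher value added $v$, higher penalty of stockout $p$, and higher demand uncertainty $\sigma_d$:

$$\begin{aligned} \frac{\partial^2 R^*}{\partial \mu_\tau \, \partial v} &> 0, & \frac{\partial^2 R^*}{\partial \sigma_\tau \, \partial v} &> 0, \\ \frac{\partial^2 R^*}{\partial \mu_\tau \, \partial p} &> 0, & \frac{\partial^2 R^*}{\partial \sigma_\tau \, \partial p} &> 0, \\ \frac{\partial^2 R^*}{\partial \mu_\tau \, \partial \sigma_d} &> 0, & \frac{\partial^2 R^*}{\partial \sigma_\tau \, \partial \sigma_d} &> 0. \end{aligned}$$

These results serve as the basis for the predictions of heterogeneous effects, as discussed in section 1.3.3.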

Shipping times model

Consider a single section of railway track, with a given set of $K$ trains running eastbound and $\bar{K}$ trains running westbound. Each train $i$ has scheduled departure and arrival times $D_i$ and $A_i$ and a free-running time $FR_i$, which is the time the train would take to traverse the section without any interference from other trains.

Interference delays arise when trains meet, because one of the trains needs to stop to let the other pass. Let $q_{ij}$ be the probability that trains $i$ and $j$ meet; naturally, $q_{ij} = q_{ji}$. This probability is not exogenously specified, but will need to be solved for, given the departure times and stoppages on the line. Conditional on $i$ and $j$ meeting, let $P_{ij}$ be the probability that train $i$ is delayed, and assume $P_{ij} = 1 - P_{ji}$. Conditional on experiencing such a delay, train $i$ needs to stop for $d_{ij}$ minutes. Thus, the total delay $t_{ij}$ equals $d_{ij}$ if $i$ and $j$ meet, and zero otherwise. It follows that the total travel time for train $i$ is

$$t_i = FR_i + \sum_{j \in K \cup \bar{K}} t_{ij}. \tag{1.43}$$

There are two possible sources of randomness in this expression for travel time. First, allow the actual departure time $d_i$ to be a random variable centered around $D_i$. As in the real world, a given train might leave later than expected, causing disruption to the schedules of other trains. This randomness in departure time is not essential, however, as there is also another source of uncertainty, coming from the random length of delay conditional on two trains meeting. In particular, $d_{ij}$ is a random variable, which can depend, as in the formulation of Petersen (1974), on the length of the track section, the availability of sidings for trains to pull aside, the amount of time taken for trains to make this switch, and the speed differential between meeting trains. There are different ways to specify the distribution of $d_{ij}$ as a function of these factors (Petersen, 1974; Chen and Harker, 1990; Harker and Hong, 1990), though the exact specification is not important for the derivation that follows.

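Taking the expectation and variance on both sides of (1.43), it follows that

$$E(t_i) = FR_i + \sum_{j \in K \cup \bar{K}} q_{ij} E(d_{ij}) \tag{1.44}$$

$$Var(t_i) = \sum_{j \in K \cup \bar{K}} \left[ q_{ij} Var(d_{ij}) + q_{ij}(1 - q_{ij}) E^2(d_{ij}) \right] + \sum_{h, k \in K \cup \bar{K}, \, h \neq k} Cov(t_{ih}, t_{ik}). \tag{1.45}$$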

The derivation of these expressions, related to the derivation in Chen and Harker (1990), requires solving for the mean, variance, and covariance of $t_{ij}$. Since $t_{ij} = d_{ij}$ when $i$ and $j$ meet and zero otherwise, it follows that $E(t_{ij}) = q_{ij} E(d_{ij})$ and $E(t_{ij}^2) = q_{ij} E(d_{ij}^2)$. The expression for the mean in (1.44) follows immediately. For the variance,

$$Var(t_{ij}) = q_{ij} Var(d_{ij}) + q_{ij}(1 - q_{ij}) E^2(d_{ij}), \tag{1.46}$$

which yields the first term in (1.45). The calculation of $Var(t_i)$ also requires solving for the covariance of the $t_{ij}$. These covariances of train meetings are the reason that travel time variance increases by so much with the addition of trains to an already-congested track. In particular, on a congested track, train $i$ meeting with train $h$ makes it more likely that train $i$ will also meet train $k$, since train $i$ will have been stopped and thrown off schedule by the first meeting. To help see this, we solve for these covariances, which can be written as

$$Cov(t_{ih}, t_{ik}) = E(t_{ih} t_{ik}) - E(t_{ih}) E(t_{ik}). \tag{1.47}$$

Expanding this expression requires computing the probability of train $i$ interfering with both of trains $h$ and $k$. Doing so requires, in general, considering three possible cases: (a) $i$ is running in one direction, with $h$ and $k$ in the opposite direction, (b) $i$ and $h$ are running in one direction, with $k$ in the opposite direction, and (c) all three of $i$, $h$, and $k$ are running in the same direction.
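The step from the two moments of $t_{ij}$ to the variance in (1.46) is left implicit above. A short derivation, using only $Var(X) = E(X^2) - E^2(X)$ and the moments stated in the text (which implicitly treat the event that $i$ and $j$ meet as independent of the delay length $d_{ij}$), runs as follows:

```latex
\begin{align*}
Var(t_{ij}) &= E(t_{ij}^2) - E^2(t_{ij}) \\
            &= q_{ij} E(d_{ij}^2) - q_{ij}^2 E^2(d_{ij}) \\
            &= q_{ij} \left[ Var(d_{ij}) + E^2(d_{ij}) \right] - q_{ij}^2 E^2(d_{ij}) \\
            &= q_{ij} Var(d_{ij}) + q_{ij} (1 - q_{ij}) E^2(d_{ij}).
\end{align*}
```

The first term is the delay variance scaled by the meeting probability; the second is the variance contributed by uncertainty over whether the meeting happens at all, which vanishes when $q_{ij}$ is 0 or 1.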

With uncertainty over departure times, the covariance term for each of these cases depends on the distribution of departure times. Let $g_i(\cdot)$, $g_h(\cdot)$, and $g_k(\cdot)$ be the departure time density functions, and let $g_{h-i,k-i}(\cdot)$ be the joint distribution of differences in departure time. It follows, in case (a), that the probability of $i$ interfering with both $h$ and $k$ is the probability that it meets both of them head-on, which is

$$\int_{D_k - \tau}^{D_k + \tau} \int_{D_h - \tau}^{D_h + \tau} \int_{D_i - \tau}^{D_i + \tau} g_i(x) \, g_h(y) \, g_k(z) \, f_1(x, y, z) \, dx \, dy \, dz. \tag{1.48}$$

Here $\tau$ is the cycle window within which all trains depart, perhaps one day if we are considering a daily schedule. The probability of $i$ meeting both other trains given the departure times is $f_1(x, y, z)$. In case (b), the probability of $i$ meeting both other trains is the probability that $i$ overtakes $h$ and meets $k$ head-on, plus the probability that it is overtaken by $h$ and meets $k$ head-on. This yields

$$\begin{aligned} &\int_{0}^{D_h - D_i + \tau} \int_{D_k - D_i - \tau}^{D_k - D_i + \tau} g_{h-i,k-i}(y, z) \left[ f_2(y, z) + f_3(y, z) \right] dz \, dy \\ &\quad + \int_{D_h - D_i - \tau}^{0} \int_{D_k - D_i - \tau}^{D_k - D_i + \tau} g_{h-i,k-i}(y, z) \left[ f_4(y, z) + f_5(y, z) \right] dz \, dy. \end{aligned} \tag{1.49}$$

Here, $f_2$, $f_3$, $f_4$, and $f_5$ give the relevant probabilities of meetings and overtakings given the departure times.

In case (c) the probability of $i$ meeting both other trains is given by the probability that it overtakes both of them, plus the probability both of them overtake it, plus the probability that it overtakes one and is overtaken by the other. In particular, this probability can be written

$$\begin{aligned} &\int_{0}^{D_h - D_i + \tau} \int_{0}^{D_k - D_i + \tau} g_{h-i,k-i}(y, z) \, f_6(y, z) \, dz \, dy \\ &\quad + \int_{0}^{D_h - D_i + \tau} \int_{D_k - D_i - \tau}^{0} g_{h-i,k-i}(y, z) \, f_7(y, z) \, dz \, dy \\ &\quad + \int_{D_h - D_i - \tau}^{0} \int_{0}^{D_k - D_i + \tau} g_{h-i,k-i}(y, z) \, f_8(y, z) \, dz \, dy \\ &\quad + \int_{D_h - D_i - \tau}^{0} \int_{D_k - D_i - \tau}^{0} g_{h-i,k-i}(y, z) \, f_9(y, z) \, dz \, dy. \end{aligned} \tag{1.50}$$

Here again, $f_6$, $f_7$, $f_8$, and $f_9$ give the probabilities of the meetings and overtakings as a function of realized departure times. Explicit calculations of $f_1$ to $f_9$ appear in Chen and Harker (1990) and Harker and Hong (1990), yielding the expression for the covariance term.

A simpler calculation follows from assuming no uncertainty in departure times, and focusing on the first two cases for train meetings.29 To simplify notation, write the difference in train departure times as

$$D_i(j) = \begin{cases} D_j - D_i & \text{if } |D_j - D_i| \le \tau/2 \\ D_j - D_i - \tau & \text{if } D_j - D_i > \tau/2 \\ D_j - D_i + \tau & \text{if } D_j - D_i < -\tau/2. \end{cases} \tag{1.51}$$

Note that a meeting between trains occurs if $t_i > |D_i(j)|$. Thus, $q_{ij} = P(t_i \ge D_i(j))$ if $D_i(j) \ge 0$ and $q_{ij} = P(t_i \ge -D_i(j))$ if $D_i(j) < 0$. Taking (1.47), and plugging in the probabilities from cases (a) and (b) along with these expressions for $q_{ij}$, we obtain an expression for the expected travel time of train $i$:

$$E(t_i) = FR_i + \sum_{j: D_i(j) < 0} P(t_i \ge -D_i(j)) \, E(d_{ij}) + \sum_{j: D_i(j) \ge 0} P(t_i \ge D_i(j)) \, E(d_{ij}). \tag{1.52}$$

Similarly, we obtain an expression for the variance:

$$\begin{aligned} Var(t_i) = {} & \sum_{j: D_i(j) < 0} P(t_i \ge -D_i(j)) \left\{ Var(d_{ij}) + \left[ 1 - P(t_i \ge -D_i(j)) \right] E^2(d_{ij}) \right\} \\ & + \sum_{j: D_i(j) \ge 0} \Big\{ P(t_i \ge D_i(j)) \, Var(d_{ij}) \\ & \qquad + \left\{ P(t_i \ge D_i(j)) \cdot A + \left[ 1 - P(t_i \ge D_i(j)) \right] \cdot B \right\} E(d_{ij}) \Big\}, \end{aligned} \tag{1.53}$$

where

$$\begin{aligned} A &\equiv \sum_{k: D_i(j) \ge D_i(k) \ge 0} \left[ 1 - P(t_i \ge D_i(k)) \right] E(d_{ik}) \\ B &\equiv \sum_{k: D_i(k) \ge D_i(j)} P(t_i \ge D_i(k)) \, E(d_{ik}). \end{aligned} \tag{1.54}$$

29 This restriction does not substantively change the results, and Chen and Harker (1990) provide an adjustment which accounts for this additional case after finding the solution based on only the first two cases.

The equations (1.52) and (1.53) hold for a generic distribution of travel times. To simplify the calculation, assume as in Chen and Harker (1990) and Harker and Hong (1990) that $t_i$ is normally distributed with mean $T_i$ and variance $V_i$. A normal distribution of travel times should be a realistic approximation on a congested line where the number of train interferences is large, though this is not a precise application of the central limit theorem, since it is not the case that all variables here are independent. It is now straightforward to obtain the expression

$$P(t_i \ge D_i(j)) = \int_{D_i(j)}^{\infty} \frac{1}{\sqrt{2 \pi V_i}} \, e^{-(t - T_i)^2 / 2 V_i} \, dt. \tag{1.55}$$

Substituting based on (1.55) in (1.52) and (1.53) and given expressions for $E(d_{ij})$ and $Var(d_{ij})$, we obtain a system of $2K$ nonlinear equations in $2K$ unknowns $\{T_1, \ldots, T_K, V_1, \ldots, V_K\}$.

Algorithms for solving these equations, such as the Newton-Raphson method, require calculating a Jacobian matrix and applying a series of iterative approximations until convergence. Letting $f_i^T(\mathbf{T}, \mathbf{V})$ and $f_i^V(\mathbf{T}, \mathbf{V})$ be the equations for the mean and variance of travel time $i$, it is straightforward to compute the terms of the Jacobian, $\partial f_i^T(\mathbf{T}, \mathbf{V}) / \partial T_j$, $\partial f_i^V(\mathbf{T}, \mathbf{V}) / \partial T_j$, $\partial f_i^T(\mathbf{T}, \mathbf{V}) / \partial V_j$, and $\partial f_i^V(\mathbf{T}, \mathbf{V}) / \partial V_j$, for each $i$ and $j$, by plugging in the derivatives

$$\begin{aligned} \frac{\partial P(t_i \ge D_i(j))}{\partial T_i} &= \frac{1}{\sqrt{2 \pi V_i}} \, e^{-(D_i(j) - T_i)^2 / 2 V_i} \\ \frac{\partial P(t_i \ge D_i(j))}{\partial V_i} &= \frac{D_i(j) - T_i}{2 \sqrt{2 \pi V_i^3}} \, e^{-(D_i(j) - T_i)^2 / 2 V_i}. \end{aligned} \tag{1.56}$$
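The building blocks in (1.51), (1.55), and (1.56) are simple enough to check numerically. The sketch below, with hypothetical function names and made-up times in minutes, evaluates the wrapped departure gap, the normal tail probability, and its two analytic derivatives, and compares the derivatives with central finite differences. It does not solve the full system of $2K$ equations.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def cyclic_gap(D_i, D_j, tau):
    """Departure-time difference D_i(j), wrapped into [-tau/2, tau/2] as in (1.51)."""
    gap = D_j - D_i
    if gap > tau / 2:
        return gap - tau
    if gap < -tau / 2:
        return gap + tau
    return gap


def meet_prob(D, T, V):
    """P(t_i >= D) when t_i is normal with mean T and variance V, as in (1.55)."""
    return 1.0 - NormalDist(mu=T, sigma=sqrt(V)).cdf(D)


def dP_dT(D, T, V):
    """Derivative of the tail probability with respect to the mean T, from (1.56)."""
    return exp(-(D - T) ** 2 / (2 * V)) / sqrt(2 * pi * V)


def dP_dV(D, T, V):
    """Derivative of the tail probability with respect to the variance V, from (1.56)."""
    return (D - T) * exp(-(D - T) ** 2 / (2 * V)) / (2 * sqrt(2 * pi * V ** 3))


# Made-up values: gap of 150 minutes, mean travel time 140, variance 25.
D, T, V, h = 150.0, 140.0, 25.0, 1e-4
fd_T = (meet_prob(D, T + h, V) - meet_prob(D, T - h, V)) / (2 * h)
fd_V = (meet_prob(D, T, V + h) - meet_prob(D, T, V - h)) / (2 * h)
```

A check of this kind is a cheap way to catch sign or scaling mistakes in a hand-coded Jacobian before passing it to a Newton-type solver.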

In solving these equations, I use a quasi-Newton method involving full calculation of the Jacobian only at the first step (Broyden, 1965), which achieves the same results and faster convergence than the Newton-Raphson and successive approximations methods used in Chen and Harker (1990) and Harker and Hong (1990).

Table 1.25: Replication of Chen and Harker (1990)

           Chen & Harker          Replication
Train 푖    퐸(푡푖)     푆퐷(푡푖)     퐸(푡푖)     푆퐷(푡푖)
1          2:33      0:05       2:33      0:05
2          2:35      0:05       2:35      0:05
3          1:58      0:03       1:58      0:03
4          2:18      0:00       2:18      0:00
5          2:22      0:00       2:22      0:00
6          2:31      0:05       2:28      0:05
7          2:52      0:09       2:50      0:09
8          2:39      0:05       2:38      0:04
9          2:47      0:08       2:47      0:08
10         3:18      0:14       3:14      0:13
11         2:55      0:06       2:56      0:06
12         3:00      0:08       2:59      0:08
13         2:20      0:04       2:21      0:04
14         2:27      0:07       2:26      0:07
15         2:12      0:03       2:18      0:03
16         2:19      0:04       2:19      0:04
17         3:33      0:13       3:30      0:12
18         2:31      0:07       2:31      0:07
19         2:01      0:01       2:01      0:02
20         3:11      0:09       3:10      0:09
21         3:11      0:09       3:04      0:09
22         2:19      0:05       2:15      0:05

Following this procedure, I am able to nearly replicate the results of these papers. Specifically, Table 1.25 shows the estimated mean and standard deviation of travel time, in hours and minutes, for a line containing 22 trains studied by Chen and Harker (1990). The estimates in my replication show some minor differences from their original results, likely due to minor differences in the procedure for simplifying the meeting and overtaking probabilities. While I could not discern this exact procedure from the details provided in their paper, the results do show a very close match, which suffices for practical purposes.

Figure 1-12 shows, then, the effect of adding trains to the line. Starting with the first five trains on the list, I add the succeeding trains one by one, and solve the system of equations after adding each new train, to see how this affects the travel times of the other trains. Consistent with the theoretical predictions and with the empirical strategy as described in Section 1.4, there is a divergence between the mean and variance of travel time as the congestion level increases. Specifically, the effect of a new train on mean travel time is the same small amount regardless of whether the new train is the fifth or the twenty-second on the line. For the variance, on the other hand, there is little effect from adding the first few trains, but a far greater effect when traffic is heavy.30 These results provide the basis for the empirical strategy using the congestion levels to instrument separately for the mean and the variance of shipping times.

Figure 1-12: Effects of congestion on travel times

30 While this escalation of the variance happens for almost all of the trains on the line, there are a couple of trains (corresponding to the lowest sequence of red triangles in the graph) for which the additional traffic has little effect on the variance. These are the trains with highest priority on the rails, which do not need to stop and wait when they meet other trains. In reality, of course, the freight trains on Indian Railways receive the lowest priority so they will not exhibit this pattern. Rather, the additional traffic will increase the variance of their travel times in the manner discussed.

Chapter 2

Manufacturing Underdevelopment: India’s Freight Equalization Scheme, and the Long-run Effects of Distortions on the Geography of Production

2.1 Introduction

A common goal of industrial policy is to promote balanced economic development, both across sectors and across regions. India’s Second Five Year Plan in 1956 made this goal explicit: “Only by securing a balanced and co-ordinated development of the industrial and the agricultural economy in each region, can the entire country attain higher standards of living”. Balanced development might prove difficult to attain, however, if some regions are inherently more productive than others. Even if balanced development is feasible, it might prove undesirable, if concentrating economic activity leads to gains from specialization or agglomeration externalities. Indeed, policies such as special economic zones and subsidies for industrial centers aim at precisely the opposite of India’s goal: concentration rather than balance.

This paper provides empirical evidence on the effects of policies aimed at altering the geographic distribution of industry. We focus, in particular, on India’s Freight Equalization Scheme (FES), which was in force from 1956 to 1991 and subsidized the long-distance transport of certain key manufacturing inputs, such as iron, steel, fertilizers, and cement. While FES aimed to promote “the dispersal of industries all over the country” (Ministry of Commerce and Industry, 1957), evidence on its effects is mixed.

First, was FES effective in altering patterns of geographic development over the long term? A 1977 government report claimed that the size of the freight equalization subsidies amounted to only a small proportion of firms’ final output prices, and therefore could not have affected the geography of production (Government of India Planning Commission, 1977). Evidence from Garred et al. (2015) on the implementation of FES is consistent with this claim, showing that in the initial years of FES, it had little effect on the location of industry.

Politicians from eastern India, on the other hand, claim that the FES subsidies were devastating to industry in their home states (Ghosh and Das Gupta, 2009; Krishna, 2017). Even though the subsidies were small, they provided just enough incentive for certain industries to move production away from the resource-rich eastern states and to expand production in prosperous western states such as Maharashtra, Gujarat, and Punjab. These politicians’ gripe becomes all the more plausible in light of the Hirschman (1958) notion of backward and forward linkages: Once certain industries move from eastern to western India, it becomes more attractive for their upstream and downstream neighbors to locate there, driving agglomeration and disproportionate economic growth in the west.
Over the past half-century, the western (and southern) states benefiting from FES have enjoyed India’s highest rates of manufacturing growth, while the resource-rich eastern states have fallen from being manufacturing powerhouses in the 1950s and 1960s to being among India’s poorest states today.

We find evidence consistent with these claims: FES achieved exactly the opposite of its purported goal, exacerbating inequality between western India and the resource-rich east. Specifically, we show that FES led industries using the equalized iron and steel to move farther from the bases of raw materials production in eastern India. The empirical strategy relies on a triple-difference, studying how output in a state-industry is affected by the state’s distance from the sources of iron and steel, interacted with the intensity of iron and steel use in the industry’s production. We also show that input linkages magnify these effects, with relocation away from the east occurring not only for the direct users of the equalized materials, but also for their downstream neighbors and for industries with higher Leontief input shares of these materials.

While these results point to long-term effects of FES, they leave open questions about the dynamics following its implementation: How long does it take for the location of industry to reach its steady state under FES, and what happens along the transition path? A burgeoning literature highlights the importance of transitional dynamics in studying the effects of trade liberalizations and other changes in mobility frictions (Dix-Carneiro, 2014; Caliendo, Dvorkin and Parro, 2017). As Dix-Carneiro and Kovak (2017) show, the impact of Brazilian tariff cuts on regional earnings 20 years after liberalization was more than three times the effect 10 years after liberalization. Our results show that the transition under FES was gradual. Even though the policy had little effect over its first 10 to 15 years, it led to steady movements of iron and steel using industries out of eastern India, and significant overall effects by the time FES reached its culmination in 1990. Tracing the transition paths shows that these movements tightly coincide with movements in the locations of factories. Evidence suggests that the time needed to set up new factories is one of the key frictions slowing the transition.

The data from more recent years also enables a deeper analysis of the mechanisms behind the FES effects, and stronger support for the identification strategy. In particular, 1950s manufacturing data come from the Census of Manufactures (CoM), which divides output into just 28 industry categories. Of these, exactly one corresponds to the production of iron and steel, and one other, “engineering”, includes a broad set of activities which use iron and steel. India’s successor manufacturing survey, the Annual Survey of Industries (ASI), provides a more detailed classification of activity, based essentially on the International Standard Industrial Classification (ISIC). We thus obtain precise measures of each industry’s use of equalized iron and steel in its inputs, and we can map industries to their Leontief shares, reflecting higher-order input-output linkages. This gives us greater precision, and the ability to show that the movements we observe in FES-affected industries actually result from their usage of iron and steel, and not from other industry characteristics. We employ these techniques in studying both the transition path and repeal of FES.

Indeed, after 35 years of accumulated distortion under FES, a final open question concerns the effects of FES’s repeal, which occurred in 1992. On the one hand, repealing FES might restore the natural advantage to eastern India, undoing the policy’s effects in a symmetric fashion. But on the other hand, FES might have contributed to agglomeration externalities (Marshall, 1890), and entrenched the productivity advantage of western India. Some commentators argue that this is exactly what happened. Because FES left eastern states with such a deficit in productive infrastructure, these states were unable to recover after the removal of FES (Das Gupta, 2016). A wide literature in economic geography also provides reasons why a given spatial distribution of production might be dynamically stable and therefore difficult to change (Fujita et al., 1999; Baldwin et al., 2011).

We find that the truth is somewhere in between these two possibilities. Stickiness, first of all, is minimal. When FES is repealed, iron and steel using industries move back toward the sources of their materials, and this effect is equal in magnitude and opposite in sign to the effect of implementing FES. The sources of these materials changed under FES, however, as the government constructed new integrated steel plants (ISPs) for the processing of basic iron and steel materials. Whereas the ISPs were initially located only in West Bengal and Bihar, the fact that they were more dispersed at the time of FES repeal meant that the gains from this repeal were shared among these eastern states and a handful of other locations. Thus the historical narrative of aggrievement bears some truth: FES robbed West Bengal and Bihar of their industrial strength, then its repeal provided them little compensation. Yet stickiness due to agglomeration and entrenched advantage was not the mechanism for this.

The remainder of the paper proceeds as follows. Section 2.2 describes the history and institutional context of FES, clarifying the structure of steel production in India, the terms of the FES policy, and its possible channels of effect given the geography of production in India. The empirical analysis then proceeds in three parts. First, subsection 2.3.1 considers the effects of FES on the long-run steady state distribution of manufacturing activity. Next, subsection 2.3.2 characterizes the transition path as the geography of production gradually changes under FES. Finally, subsection 2.3.3 studies the repeal of FES, testing whether these effects are symmetric with the effects of implementing FES in the first place. Section 2.4 offers concluding remarks to put the results in context.

2.2 Background

India enacted the Freight Equalization Scheme (FES) in 1956, with the goal of achieving balanced industrial development. As Figure 2-1a shows, manufacturing output at the time was heavily concentrated into just a few areas, with West Bengal and the surrounding region being one of the most important centers of production. In 1950, West Bengal and Bihar accounted for 92 percent of all iron and steel production in India and 48 percent of all manufacturing output in engineering-related industries. These areas enjoyed a natural advantage in manufacturing, due to their proximity to raw materials, particularly iron ore, as depicted in Figure 2-1b. They were also rich in coal and other important mineral resources.

FES served to neutralize this geographic advantage. Starting in 1956, the government fixed uniform prices for the transport of iron and steel. This acted as a subsidy to long-distance shipping, with a user located all the way across the country now being able to obtain iron and steel at the same cost as a user located near the sources of materials in West Bengal and Bihar. Along with the equalization of iron and steel to achieve geographic balance in manufacturing, the government also equalized the shipping costs of cement and fertilizers, in order to promote balance in construction and agriculture. Below we find that the cement and fertilizer equalization had little effect on manufacturing activity, and moreover that the data show manufacturing firms making little use of these materials.¹ So we maintain focus on the equalization of iron and steel.²

The administration of FES for iron and steel fell under the authority of the Ministry of Steel, as detailed by Raza and Aggarwal (1986), Singh (1989), and Mohanty (2015). To finance FES, the government calculated an ex-factory “retention price”, which depended on the expected average distance of shipments for the particular type of iron or steel. A self-financing Equalization Fund collected the difference between this price and the actual shipping cost of short shipments, and paid out the associated credits for longer shipments. The fund was administered initially by the government Tariff Commission, then starting in 1964 by a Joint Plant Committee (JPC) established by the Ministry of Steel specifically for price regulation.

The scope of FES was limited to the output of India’s integrated steel plants (ISPs). Figure 2-3 places these ISPs in the context of a stylized depiction of the supply chain for iron and steel products in India. The most basic natural resource needed for these products is iron ore, which as noted above, is located primarily in a handful of states in eastern India. The next step in the supply chain is transforming iron ore into basic iron and steel “materials” such as pig iron, structural steel, coils, sheets, and plates. This transformation generally happens at the ISPs, which are controlled by the Ministry of Steel and operate at tremendous scale, giving them a virtual monopoly on the production of the basic materials. There are only seven ISPs in India, and the newest of them was constructed in 1971. So the empirical analysis will regard the location of the ISPs as exogenously fixed, and study where other, more flexible factories choose to produce given these locations.³

¹ The geographic distributions of cement and fertilizers are also less concentrated ex ante, reducing the scope for the equalization of these materials to lead to large geographic shifts in economic activity.
² Some sources have erroneously stated that coal was subject to FES (Chakravorty and Lall, 2007). This is not quite correct. Coal production is a state enterprise and has been subject to a series of controls ensuring uniform pricing across markets; however, the transportation has essentially always been priced based on distance (Raza and Aggarwal, 1986; Indian Railway Conference Association, 2000).
³ This assumption is plainly innocuous in studying the effect of implementing FES. At the time,

The users of the materials produced by ISPs can be grouped into two categories. First, makers of processed iron and steel products directly use the ISP output, in order to produce somewhat more specialized products, such as more refined forms of iron and steel, or pieces of iron and steel shaped in particular ways, say into a car axle or chassis. Second, there is a set of downstream industries using these specialized products, for example to assemble a car or another final good. Figure 2-3 indicates how we use industry codes to group industries according to their downstreamness in a way that is consistent across years, including years in the 1950s and 1960s when the industry codes observed in data are coarse. For recent years with more detailed industry classifications, measures such as Leontief input shares provide a more sophisticated way to characterize the linkages across industries.

As of 1956, India had two ISPs in operation, one being the current Tata Steel plant at Jamshedpur, Bihar, and the other being the IISCO plant, operated by the state-owned Steel Authority of India (SAIL) at Burnpur in West Bengal. Figure 2-4 plots their locations. The concentration of iron and steel material production at these points makes it clear how FES would have served as a subsidy for downstream iron and steel users to move production away from West Bengal and Bihar. After the implementation of FES, several new steel plants opened and fell under the FES scheme: SAIL Bhilai Steel Plant in Madhya Pradesh (now Chhattisgarh), SAIL Rourkela Steel Plant in Orissa, SAIL Bokaro Steel Plant in Bihar (now Jharkhand), SAIL Durgapur Steel Plant in West Bengal, and Vizag Steel Plant in Andhra Pradesh. Under FES, the location of these new plants affects average shipping distances and therefore affects the all-India prices of iron and steel products. Since FES keeps prices uniform across regions, however, the new plants do not provide any particular advantage to nearby iron and steel users, and therefore should not affect these users’ location choices.

there were only two ISPs and these plants had been in existence since 1907 and 1918. Once FES begins, moreover, the location of ISPs is irrelevant, since the cost of obtaining the ISP materials is equalized across locations. For studying the repeal of FES, there is in principle a concern that the locations of more recently constructed ISPs could be co-determined with the industry growth trends we are interested in studying. Subsection 2.3.3 examines this concern, arguing that it ultimately does not threaten identification.

The products covered by FES included most iron and steel materials produced by the ISPs for domestic use. Excluded were tin plates, pipes, electrical steel, and alloy steel, though these products amount to a small fraction of the plants’ output. The more important products subject to FES included basic materials such as pig iron and steel sheets. These materials serve as inputs to other firms manufacturing more processed or complex products. But the restriction of FES to the ISPs means that the output of these downstream users, even if it contains iron and steel, is still subject to normal, distance-based shipping charges.

FES remained in effect until 1991. The repeal was sudden, with the National Development Council meeting to evaluate FES in December 1991 and announcing its removal with effect from January 1992. In place of FES, the government implemented a “freight ceiling” policy, charging freight based on distance for shorter shipments, but capping the price of longer shipments. The ceiling was, however, set very high: 1125km for pig iron, 1375km for flats, 1400km for bars, and 1500km for semi-steel. In practice, the ceiling did not bind for most users, with only the farthest reaches of northern and southern India lying more than this distance from the nearest steel plant. In 2001, the ceiling was also lifted, marking the complete abolition of FES.
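The three pricing regimes described in this section (ordinary distance-based freight, uniform freight under FES, and the post-1992 freight ceiling) can be illustrated with a short Python sketch. Everything numerical here is hypothetical apart from the 1125 km pig iron ceiling quoted above: the per-kilometer rate, the four buyer distances, and the equal weighting of shipments are assumptions made for illustration, and the function name `freight_charge` is ours rather than anything in the FES rules. The sketch treats the equalized charge as the rate times the average shipping distance, which is one simple way to make the fund self-financing.

```python
def freight_charge(distance_km, rate_per_km, regime,
                   avg_distance_km=None, ceiling_km=None):
    """Freight charged to a buyer located distance_km from the steel plant."""
    if regime == "distance":    # ordinary distance-based pricing
        return rate_per_km * distance_km
    if regime == "equalized":   # FES: every buyer pays as if at the average distance
        return rate_per_km * avg_distance_km
    if regime == "ceiling":     # post-1992: distance-based, capped beyond ceiling_km
        return rate_per_km * min(distance_km, ceiling_km)
    raise ValueError(f"unknown regime: {regime}")


# Hypothetical buyers and freight rate (assumptions, not data from the text).
distances = [100.0, 700.0, 1400.0, 1800.0]   # km from the plant
rate = 1.0                                   # currency units per km
avg = sum(distances) / len(distances)

distance_cost = [freight_charge(d, rate, "distance") for d in distances]
equalized = [freight_charge(d, rate, "equalized", avg_distance_km=avg)
             for d in distances]

# Positive transfer = credit paid out by the fund; negative = payment into it.
transfers = [cost - paid for cost, paid in zip(distance_cost, equalized)]

# Pig iron ceiling of 1125 km, as quoted in the text.
ceiling_charges = [freight_charge(d, rate, "ceiling", ceiling_km=1125.0)
                   for d in distances]

print("transfers under FES:", transfers)
print("charges under the ceiling:", ceiling_charges)
```

Under these assumptions the transfers sum to zero: nearby buyers pay into the fund and distant buyers draw from it, which is the sense in which the Equalization Fund is self-financing. Under the ceiling, only the buyers beyond 1125 km still receive any subsidy.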

Ex-post appraisals of FES have involved vigorous debate over its effects on the geography of production. A 1977 inter-ministerial report calculated the size of the FES subsidies, concluding that they were inconsequential, amounting to a relatively small fraction of firms’ final output prices (Government of India Planning Commission, 1977). Other commentators argue, however, that FES was a driving force for industrial production to move away from eastern India. Over the past half-century, the western (and southern) states benefiting from FES have enjoyed India’s highest rates of economic growth, while the resource-rich eastern states find themselves among India’s poorest states today. Figure 2-2 shows that the divergence has been especially stark in manufacturing, lending plausibility to the idea that FES and the reversal of the manufacturing advantage contributed to these states’ overall reversal of fortune.

2.3 Empirical analysis

This section estimates the empirical effects of FES, focusing on three main questions. First, what was the long-run effect of FES on the geography of production? Second, what was the transition path from the implementation of FES to this long-run steady state? Finally, what was the effect of removing FES, and was it symmetric with the effect of enacting the policy?

2.3.1 Long-run effect

The long-run effect is the main point of contention in debates over the geographic consequences of FES. The government report calculating a small total cost of the FES subsidies might set our priors toward expecting a small total effect of FES on industry location. But it is well known in economic geography that even a small divergence in fundamentals across regions can lead economic activity to concentrate in one area rather than the other (Baldwin et al., 2011), and more specifically that even a narrow advantage in the transportation costs affecting one region can lead to large competitive gains for the firms there (Redding and Turner, 2015; Firth, 2017). Some literature also suggests that subsidies to upstream sectors have the largest effects on aggregate productivity (Liu, 2017), leaving open the possibility that even a small subsidy to raw materials like iron and steel could lead to substantial gains spread throughout linked sectors. To identify the effects of FES, we estimate

$$\Delta \ln Y_{is\tau} = \beta\,(\text{IronSteelUser}_i \times \ln \text{DistanceSteelPlant}_s) + \delta_i + \delta_s + \epsilon_{is\tau}, \tag{2.1}$$

where $Y_{is\tau}$ is an outcome of interest for industry $i$ in state $s$ over the period $\tau$ from 1950-51 to 1990-91. In all specifications, we divide the long-differenced growth rate $\Delta \ln Y_{is\tau}$ by the number of years in $\tau$, so coefficients can be interpreted as effects on average annual growth rates. $\text{IronSteelUser}_i$ indicates whether industry $i$ makes use of iron or steel as an input, and $\text{DistanceSteelPlant}_s$ is the distance from the centroid

of state $s$ to the nearest ISP in existence as of 1956. The fixed effect $\delta_i$ controls for industry-level trends, while $\delta_s$ controls for state-level growth trends affecting all industries in the state. In essence, $\beta$ captures whether iron- and steel-using industries outperform other local industries by a larger margin in the states where FES should give more of a boost to these iron and steel users. Since FES was lifted at the beginning of 1992, the time period considered provides the longest possible horizon for estimating the effects of FES. Studying this long horizon is important, because the results of Garred et al. (2015), as reproduced in Figure 2-5, show a minimal effect of FES in the decade following its implementation, but some indication that iron- and steel-using industries are moving away from West Bengal and Bihar by the late 1960s, which is the end of their sample period. The empirical strategy reflected in (2.1) is similar to that in Garred et al. (2015), but incorporates more data to capture long-run effects.

The data to estimate (2.1) comes from two main sources. The first source is the Census of Manufactures (CoM), an annual survey of manufacturing firms, conducted from 1946 to 1958. For this analysis, data is available for 1950-51. The data indicates, for each manufacturing industry, the details of production happening in each state. The main state-by-industry variables of interest for this analysis are the number of factories in operation, number of workers, and total value of ex-factory output.

One limitation of the CoM arises from its categorization of industries. This categorization is very coarse compared to modern industry classifications, with all production divided into 28 industry categories, listed in Table 2.1. Among these, two are especially relevant for the present analysis: the “Iron and Steel” industry which uses equalized material as an input, and the “Engineering” industry which is one step further downstream. Some of the output attributed to the Iron and Steel industry will itself be subject to equalization, but only that coming from the handful of ISPs.

Another limitation concerns state boundaries. The states listed in the CoM reflect state boundaries prior to the major redrawing which occurred in the late 1950s and early 1960s. This redrawing carved new states along ethno-linguistic lines in ways that often entail many-to-many mappings between the old and new states. A handful of states do remain unchanged from 1950 until 1990: West Bengal, Orissa, Bihar, Delhi, and Uttar Pradesh.⁴ So in looking at long-run patterns, we consider each of these states as existing continuously, and construct a synthetic state, Other, which is a composite of all remaining states. Below we also conduct separate analyses for more limited sets of years within which more states exist continuously.

The second main data source is the successor to the CoM, known as the Annual Survey of Industries (ASI). From 1959 to 1971, we have annual ASI data, at the level of state-by-industry. The industries correspond to three-digit codes in the International Standard Industrial Classification (ISIC), Rev. 1. The set of available outcome measures varies by year, but always includes, for each state-industry, the number of factories, the number of workers employed, and the total value of ex-factory output. Starting in 1988-89, we have annual establishment-level ASI data. The wide coverage of years and the establishment-level data will facilitate more detailed analysis below, but to estimate (2.1) we only need 1990-91 data, with industries and states collapsed to match those in the 1950 data.

Estimates of (2.1) appear in Table 2.3. The estimates $\hat{\beta}$ are positive and significant, meaning that iron- and steel-using activity shifts toward states which are distant from the sources of iron and steel. Specifically, to interpret the magnitude in Column (1), consider a comparison between two states: Uttar Pradesh, whose centroid is 731km from the nearest of the original ISPs, and Punjab, which is roughly twice as far, 1388km. If all industries in Uttar Pradesh are growing at the same rate, the estimate $\hat{\beta} = 0.5$ says that in Punjab the iron and steel industries are growing at an annual average of 0.5 percentage points faster than the rest of the state’s industries. Column (2) shows that this growth is accompanied by growth in the industries downstream from the iron and steel users.
These downstream industries in Punjab would, for example, experience a 0.6 percentage point annual growth advantage.⁵

To clarify the trends underlying these results, Column (3) departs from the preferred specification. In particular, it removes the state and industry fixed effects, and adds controls for state distance to a plant and industry iron and steel use. The average state-industry grows between 1950 and 1990.⁶ Iron and steel industries grow by an average of 4.3 percentage points more than other industries. A state like Punjab grows by 0.478 percentage points more than a state like Uttar Pradesh. Then, as the coefficient of interest shows, FES gives a special advantage to the iron and steel producers located in places like Punjab which receive subsidized shipping under equalization.

Columns (4) through (6) of Table 2.3 present similar results, but with the distance variable replaced by an indicator which equals zero for the states containing one of the original ISPs (West Bengal and Bihar), and one for other states. These results measure the direct loss to the states which claimed aggrievement under FES. As the results show, these states not only lose their share in the iron and steel industry, but also see downstream producers move away.

For robustness, Table 2.4 presents results with all available states, not only those with unchanged boundaries from 1950 to 1990. These results admit some noise because the observed 1950 to 1990 growth rates for each state owe in part to the states’ changing boundaries. Nevertheless, the estimates of interest are essentially unchanged. Table 2.9 shows that results are also not affected by whether the state-industries are weighted by their 1950 size.

The core identifying assumption behind these results is that in the absence of FES, the growth of industries using iron and steel relative to other industries would not have differed in a way that is associated with distance from the ISPs. As basic support for this assumption, we consider pre-FES trends in state-by-industry output. We know from Garred et al. (2015), as reflected in Figure 2-5, that the pre-trends were parallel. The share of West Bengal and Bihar in iron and steel using industries followed the same time path as these states’ share in other industries, lending plausibility to the claim that these trends would have remained parallel in the absence of FES. The remaining threat to identification, then, would need to come from other state-by-industry differences, arising at the same time as FES or shortly after, which push iron and steel using industries to grow relatively more slowly in West Bengal and Bihar than elsewhere.

⁴ In 2000 some of these states split: Bihar split into Bihar and Jharkhand. Uttar Pradesh split into Uttar Pradesh and Uttarakhand. Madhya Pradesh split into Madhya Pradesh and Chhattisgarh.
⁵ While this effect is larger than the effect on industries directly using iron and steel, the two estimates are not statistically distinct.
⁶ The unreported constant term in this regression is 3.8, suggesting that the average state-industry grows at an annual rate of 3.8 percent between 1950 and 1990. This high rate is partly an artifact of the data (the 1990 ASI is more comprehensive than the 1950 CoM), and partly because of genuine growth over this period. The interest, in any event, is not in this secular growth rate, but in how rates differ by state and industry.

One possible confound in this vein is industrial policy enacted around the time of FES. India has tried to favor or protect certain industries at various times in its history, including, indeed, with the Industrial Policy Resolution of 1956 (Government of India, 1956). This resolution categorized industries according to their degree of planned state ownership and provided subsidies to particular industries (Mohan, 2002), possibly setting the course for more prosperous growth in some industries than in others. But even this type of industry-level differential growth is not a threat to the identification strategy, because it is accounted for by the industry effect $\delta_i$. These policies would confound identification only if, for instance, there were cross-state differences in the ability of firms in a particular industry to take advantage of the terms of the policies. But neither the policies’ terms nor the historical record about them points to such differences.

As another possible confound, regional development policies or political conditions have at times led to growth advantages for certain regions, especially with the prominence in India of regional parties whose fortunes wax and wane with the political cycles. Ahluwalia (2002) finds some evidence, albeit limited, for effects of government intervention on these divergences in states’ growth rates. But the identification strategy circumvents this concern too, since the state effect $\delta_s$ partials out differences in regional trends. The strategy’s key advantage is that despite the recurrence of state- and industry-specific development policies, FES is rather unique in differentially affecting state-by-industries. Below, with the advantage of richer data from later years, we conduct additional tests to show the movements we attribute to FES are not attributable to other state-by-industry characteristics.
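A minimal numerical sketch of specification (2.1), on simulated data, may help fix ideas. It is not the estimation code behind Table 2.3: the panel dimensions, the noise level, the number of iron- and steel-using industries, and the function names are all assumptions made for illustration. On a balanced state-by-industry panel, absorbing the industry and state fixed effects is equivalent to subtracting state and industry means (and adding back the grand mean) from both the outcome and the interaction term, so the coefficient of interest can be recovered as a simple slope on the demeaned data.

```python
import numpy as np


def two_way_demean(m):
    """Remove state (row) and industry (column) means from a balanced S x I array."""
    return (m - m.mean(axis=1, keepdims=True)
              - m.mean(axis=0, keepdims=True)
              + m.mean())


def estimate_beta(growth, uses_steel, log_dist):
    """Slope on IronSteelUser_i x ln Distance_s after absorbing both fixed effects."""
    x = np.outer(log_dist, uses_steel)          # S x I interaction term
    x_t, y_t = two_way_demean(x), two_way_demean(growth)
    return float((x_t * y_t).sum() / (x_t ** 2).sum())


# Simulated panel (all values are assumptions, not the thesis data).
rng = np.random.default_rng(0)
n_states, n_industries = 20, 28
log_dist = np.log(rng.uniform(100.0, 1500.0, n_states))   # ln km to nearest ISP
uses_steel = np.zeros(n_industries)
uses_steel[:6] = 1.0                                       # six steel-using industries
delta_s = rng.normal(0.0, 1.0, n_states)                   # state growth trends
delta_i = rng.normal(0.0, 1.0, n_industries)               # industry growth trends
true_beta = 0.5

signal = (true_beta * np.outer(log_dist, uses_steel)
          + delta_s[:, None] + delta_i[None, :])
growth = signal + rng.normal(0.0, 0.1, (n_states, n_industries))

beta_hat = estimate_beta(growth, uses_steel, log_dist)
print(f"true beta = {true_beta}, estimated beta = {beta_hat:.3f}")
```

The state and industry trends in the simulation are large relative to the treatment effect, yet they drop out entirely after demeaning. This mirrors the argument above that state-level and industry-level policies do not threaten identification unless they vary at the state-by-industry level.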

2.3.2 Transition path

To examine the transition path from the implementation of FES to the long-run steady state under the policy, we conduct an event study:

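$$\ln Y_{isy} = \sum_{y=1950}^{2007} \beta_y \left( \mathbf{1}_{\text{Year}=y} \times \text{IronSteelUser}_i \times \ln \text{DistanceSteelPlant}_s \right) + \delta_{is} + \delta_{iy} + \delta_{sy} + \epsilon_{isy}. \tag{2.2}$$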

The coefficients of interest are the $\beta_y$, which indicate how the year $y$ outcomes $Y_{isy}$ differ as a function of the treatment status of industry $i$, state $s$. Some recent work emphasizes transitional dynamics in response to trade liberalization or other changes in frictions (Dix-Carneiro, 2014; Caliendo, Dvorkin and Parro, 2017). Highlighting the importance of these dynamics, Dix-Carneiro and Kovak (2017) show that the impact of Brazilian tariff cuts on regional earnings 20 years after liberalization was more than three times the effect 10 years after liberalization. Estimating (2.2) helps explain the mechanism by which FES arrived at the effects estimated in subsection 2.3.1 and what types of gains or losses were realized along the way.

Figure 2-6 presents the results of the event study. First, Figure 2-6a shows the effects on output. Several years of data are missing due to limitations in the availability of data from the CoM and ASI. But the remaining years provide enough information to characterize the transition path. In the years following the implementation of FES, represented by the blue line at 1956, there was no great movement of iron and steel industries away from the ISPs. Then, for 1962, there is a positive but insignificant coefficient, indicating some movement in the direction of the ultimate FES effect. A slight upward trend continues until 1970. Projecting this trend forward through the 1970s and 1980s leads to the ultimate effect of FES, which is observed around 1990. The fact that these effects of FES appear gradually points to the key difference between our analysis and Garred et al. (2015), the only other econometric analysis of FES. As of 1970, when their data ends, FES shows some hint of an effect, but it is not statistically or economically significant. Only by looking to the longer term do we see the effect.

Figure 2-6b helps to explain this speed of transition, by conducting a similar event study for the effects of FES on the number of factories operating in a state-industry. The time pattern is remarkably similar to that for the output event study in Figure 2-6a. This suggests that the relocation of iron and steel output depends on the closing of factories near the ISPs and the opening of factories in faraway states, not simply the adjustment of output in existing factories. Some institutional details provide further corroboration. As Krishna Moorthy (1984) and Singh (1989) describe, steel plants require extensive planning, taking five to nine years from conception to reaching full operation. Restrictive licensing regulations, as discussed in Krueger (2002), also slowed rates of factory relocation and delayed the time at which we should expect to see effects of FES. If a steel producer decided to move or open a factory around the time of FES in 1956, the new factory would likely become operational only somewhat later, right around the time when we start to see factories move in Figure 2-6b. These moving and setup costs could then create an additional friction which slows the movement of other input-output linked producers who find it worthwhile to move only once a certain number of other factories have moved.

Another advantage of the transition years is that they afford richer data than what is available in the CoM prior to the implementation of FES. By studying long differences from 1959 until the end of FES in 1990, we can examine the “transition effect” starting from the early years of FES. Such estimates of course miss any impact of FES realized immediately upon its implementation, though as the above event studies show, these impact effects were minimal. Starting in 1959, the ASI reports state-by-industry output at the level of three-digit ISIC, Rev. 1, which maps to the four-digit NIC-87 codes (based on ISIC, Rev. 2) and four-digit NIC-98 codes (based on ISIC, Rev. 3) used in later versions of ASI. To link across years, we concord all data to the NIC-98 codes. We then match these industry codes to input-output table measures of each industry’s input shares and Leontief input shares (Ministry of Statistics and Programme Implementation, 1994).⁷

⁷ Results are similar using other measures of input use, such as those from US input-output tables.

This input-output information enables a more structural interpretation of the FES effects. In particular, suppose production is Cobb-Douglas, where $a_{ij}$ is the share of industry $j$’s output used in the production of industry $i$. These shares give rise to the production matrix $\mathbf{A}$. The Leontief inverse matrix is $\mathbf{L} \equiv (\mathbf{I} - \mathbf{A})^{-1}$. An element of this matrix, $l_{ij}$, indicates the effect of a productivity shock in sector $j$ on output in sector $i$, accounting for all of the higher-order linkages between $j$ and $i$ (e.g., $j$ is used in the production of some other good, which is in turn used in the production of $i$). Suppose each state is a closed economy able to obtain iron and steel at an exogenous price. Let $j$ be iron and steel, and $d \ln p_{js}$ the effect of FES on the price, inclusive of transport cost, faced by a producer of $i$ in state $s$. It follows that

$$d \ln y_{is} = -l_{ij}\, d \ln p_{js}. \tag{2.3}$$

We can decompose the price effect into

$$d \ln p_{js} = \kappa\,(d \ln \bar{\tau}_j - \ln D_s), \tag{2.4}$$

where $\kappa$ is the fraction of shipping cost in the price of $j$, $\bar{\tau}_j$ is the per-kilometer freight rate (which potentially increases with FES due to the distortion of subsidizing long-distance shipping), and $D_s$ is the distance from state $s$ to the source of $j$. The state-specific component of the change in shipping cost is proportional to this distance because under FES all states are charged as if they are the average distance from the source of steel, yielding larger benefits for more distant states. Combining (2.3) and (2.4) it follows that

$$d \ln y_{is} = \kappa\, l_{ij} \ln D_s - \kappa\, l_{ij}\, d \ln \bar{\tau}_j. \tag{2.5}$$

The second term, $\kappa\, l_{ij}\, d \ln \bar{\tau}_j$, does not vary across states and is therefore absorbed by secular industry-level trends, leaving the effect of FES to be the product of the Leontief share $l_{ij}$ and the log distance. We therefore estimate a version of (2.1) with the industry iron or steel use indicator replaced by the Leontief share. Appendix 2.6 provides a more general expression for the effects of these distortions, allowing the price of materials to differ for each of an industry’s input suppliers, for instance because each industry uses a different set of materials or because these input suppliers are located in other regions whose prices of materials are differentially affected by FES. The implied estimating equation is, however, similar.
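The role of the Leontief inverse can be seen in a small numerical sketch. The three-sector input matrix below is hypothetical (iron and steel, processed steel products, final assembly), as are the cost share of 0.1 and the variable names; the two distances are the Uttar Pradesh and Punjab centroid distances quoted in subsection 2.3.1. The point is that final assembly uses no iron or steel directly, yet has a positive Leontief share, and so inherits part of the distance effect in (2.5).

```python
import numpy as np

# Hypothetical input shares: A[i, j] is the share of sector j's output used
# in producing sector i. Sectors: 0 = iron and steel, 1 = processed steel
# products, 2 = final assembly.
A = np.array([
    [0.0, 0.0, 0.0],
    [0.4, 0.0, 0.0],   # processed products use iron and steel directly
    [0.0, 0.5, 0.0],   # assembly uses processed products, not raw steel
])

# Leontief inverse L = (I - A)^{-1}: direct plus all higher-order linkages.
L = np.linalg.inv(np.eye(3) - A)
steel_share = L[:, 0]              # l_ij with j = iron and steel

kappa = 0.1                        # assumed share of shipping cost in the steel price
d_near, d_far = 731.0, 1388.0      # km to the nearest original ISP

# State-specific term of (2.5), kappa * l_ij * ln D_s, differenced across states.
relative_effect = kappa * steel_share * (np.log(d_far) - np.log(d_near))

for name, l, eff in zip(["iron and steel", "processed", "assembly"],
                        steel_share, relative_effect):
    print(f"{name:15s} l_ij = {l:.2f}   far-minus-near effect = {eff:.4f}")
```

Because the toy matrix has a simple chain structure, the Leontief share of assembly in steel is just the product of the two direct shares along the chain. An indicator for direct iron and steel use would assign assembly a value of zero, which is why the Leontief share is the natural regressor once detailed input-output data are available.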

Table 2.5 presents results from estimating this specification. Leontief share and distance interact to increase state-industry growth rates, as we predict under FES. In particular, moving from the 10th percentile of industries’ Leontief reliance on iron and steel ($l_{ij} = 0.01$) to the 90th percentile ($l_{ij} = 0.39$) increases the relative growth rate of the industry by 0.9 percentage points annually. The results also show, in Panel B, that factory location is affected, with Leontief shares determining industries’ movement away from the steel sources. The final available industry variable which is consistent across years is the number of workers employed, and this increases slightly as a result of FES, implying a positive but insignificant effect on labor productivity. Table 2.6 presents similar results, but with the raw input shares instead of the Leontief shares. These estimates are noisier but indicate qualitatively the same effects on output and factory location.

Both Table 2.5 and 2.6 also use the input-output structure to show the robustness of the results to other input distortions coinciding with the equalization of iron and steel. As described in Section 2.2, the Indian government also equalized cement and fertilizers around the same time as iron and steel, and implemented price controls on coal which, while not an equalization scheme per se, might also have altered the geography of production. Cement and fertilizers are produced in a more dispersed set of locations, reducing some of the worry that their equalization would confound estimates of the effects for iron and steel. But coal is produced, also as a government enterprise, in the same resource-rich states hosting the ISPs. If the FES for iron and steel was part of a broader set of policies to relocate certain industries between certain states, we might expect to see this working through the controls on coal. Columns (2), (3), and (4) of these tables show, however, that the main results are unaffected by allowing for state-specific effects of using coal, fertilizers, or cement. The driver of the observed patterns is, rather, industries’ iron and steel use, just as predicted if the cause is FES.

2.3.3 Stickiness

While FES contributed to iron and steel users’ movements from eastern to western India, it is unclear whether repealing FES would fully undo these effects. In particular, industry movements toward western India under FES could have led to agglomeration in western industrial centers. Marshall (1890) posits a variety of possible sources of agglomeration externalities, including labor market pooling, knowledge spillovers, and input-output linkages. Recent empirical work provides direct causal evidence on the strength of more specific agglomerative forces, such as intra-urban transportation links (Ahlfeldt et al., 2015) and firm-to-firm matching (Miyauchi, 2018). Ryan (2012) shows, specifically in India in 2005, that agglomeration levels are similar to those observed in the United States (Ellison and Glaeser, 1997). Once economic activity agglomerates in an area, a change such as the lifting of FES might be insufficient to undo that area’s productivity advantage and push production elsewhere.8

To test whether the effects of FES persist even in the absence of the policy, we estimate (2.1) over the time period from 1990, immediately before FES was lifted, until 2013-14, the latest year for which data is available. Results of this test appear in Table 2.7. Columns (1) through (3) show estimates $\hat{\beta}$ which are almost identical in magnitude to those found in subsection 2.3.1, but now negative instead of positive. In other words, repealing FES leads iron and steel using industries to move back toward the ISPs, just as strongly as FES made them move away. Seemingly FES did not lead to irreversible agglomeration.

8 Indeed, a long tradition in economic geography (Baldwin et al., 2011) studies the conditions under which a geographic steady state is stable, and more recent work places these models in a quantitative framework (Allen and Donaldson, 2018).

Columns (4) through (6) show, nevertheless, that repealing FES did not fully compensate the states of West Bengal and Bihar for their losses under FES. The reason these states do not fully recover, despite the movements toward ISPs, is that West Bengal and Bihar are no longer the only states containing ISPs. Recall from Figure 2-4 that under FES the government opened new steel plants outside of these states, partly in order to reduce average shipment distances and limit the total cost of the FES subsidy. When iron and steel using activity moves back toward this set of ISPs, it leads to some gains in West Bengal and Bihar, and therefore estimates in columns (4) through (6) which are negative and nontrivial in magnitude. Yet activity is also moving toward some of the new ISPs, outside West Bengal and Bihar, meaning these states need to share some of the gains from FES’s repeal.

A new identifying assumption specific to this portion of the analysis is that the location of the new ISPs built under FES was not associated with counterfactual industry growth rates. For example, identification would be confounded if the ISPs were built in areas projected to have high relative growth rates in their iron and steel using industries. One institutional factor allaying this concern is that the new ISPs were built primarily to optimize transport costs under FES, without any indication that FES would be repealed in the future. Also, the newest of the ISPs was built in 1971, at which time it would have been difficult to forecast the industry growth trends starting from the repeal of FES, which was still 20 years to come. As an additional test against this concern, we estimate a modified version of (2.1), taking out the state-industry growth rates from the years prior to the FES repeal:

\[ \Delta \ln Y_{is,\tau} - \Delta \ln Y_{is,\tau-1} = \beta \left( \text{IronSteelUser}_i \times \ln \text{DistanceSteelPlant}_s \right) + \delta_i + \delta_s + \epsilon_{is\tau}. \qquad (2.6) \]

Here, $\tau$ is still the period of interest, 1990 to 2013, and $\tau - 1$ is the preceding period containing the pre-trend, 1967 to 1990.9 In addition to addressing the concern that the new ISP locations might be associated with projected growth rates, this specification allows for the possibility that transition to the steady state was still underway at the time of FES’s repeal. In the earlier event study of Figure 2-6 we do not have enough data from the 1980s to reveal the transition path leading up to 1990, and it is possible that this path contains a pre-trend driving the results found in Table 2.7.10

9 This period definition forms pre and post periods of equal length, but we also consider the longest available pre-trend dating back to 1959, and a period dating back only to 1989 in order to capture an instantaneous rate of change.

10 The precise event study relevant to identify such a pre-trend would involve an extension of Figure 2-7 backward in time, which is slightly different from the event study in Figure 2.2, given the different set of ISPs in existence at the time of repeal versus implementation. Either of these tests would, in any event, require additional data from the 1980s.
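As a rough illustration of how a specification like (2.6) can be estimated, the sketch below differences the post-repeal growth rate against the pre-trend and then removes industry and state fixed effects by double-demeaning. It is a minimal sketch under stated assumptions, not the thesis's actual procedure: it assumes a balanced industry-by-state grid (where double-demeaning is equivalent to including both sets of dummies), it is unweighted, and all data and names (`double_demean`, `estimate_beta`, the toy inputs) are hypothetical.

```python
def double_demean(grid):
    """Remove row (industry) and column (state) means from a balanced grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    grand = sum(sum(row) for row in grid) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in grid]
    col_means = [sum(grid[i][s] for i in range(n_rows)) / n_rows
                 for s in range(n_cols)]
    return [[grid[i][s] - row_means[i] - col_means[s] + grand
             for s in range(n_cols)] for i in range(n_rows)]


def estimate_beta(growth_post, growth_pre, steel_user, ln_distance):
    """Two-way fixed-effects estimate of beta in a (2.6)-style regression."""
    n_i, n_s = len(steel_user), len(ln_distance)
    # Outcome: post-period growth minus pre-period growth (the pre-trend).
    y = [[growth_post[i][s] - growth_pre[i][s] for s in range(n_s)]
         for i in range(n_i)]
    # Regressor: industry steel-use indicator times state log distance.
    x = [[steel_user[i] * ln_distance[s] for s in range(n_s)]
         for i in range(n_i)]
    y_t, x_t = double_demean(y), double_demean(x)
    sxy = sum(x_t[i][s] * y_t[i][s] for i in range(n_i) for s in range(n_s))
    sxx = sum(x_t[i][s] ** 2 for i in range(n_i) for s in range(n_s))
    return sxy / sxx


# Hypothetical toy data: 3 industries x 3 states, generated with beta = -0.5.
TRUE_BETA = -0.5
steel_user = [1, 0, 1]
ln_distance = [0.0, 1.0, 2.0]
industry_fe = [0.2, -0.1, 0.4]
state_fe = [0.3, 0.0, -0.2]
growth_pre = [[1.0, 2.0, 0.5], [0.0, 1.5, 1.0], [2.0, 0.5, 1.0]]
growth_post = [[growth_pre[i][s] + TRUE_BETA * steel_user[i] * ln_distance[s]
                + industry_fe[i] + state_fe[s] for s in range(3)]
               for i in range(3)]

beta_hat = estimate_beta(growth_post, growth_pre, steel_user, ln_distance)
print(beta_hat)
```

Because the toy data contain no noise, the estimate recovers the coefficient used to generate them. With real, unbalanced state-industry data one would instead include explicit fixed-effect dummies and apply the output weights described in the table notes.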

Estimates of (2.6) appear in Table 2.8. The results are very similar in magnitude to those in Table 2.7, though with some loss in precision leading to insignificant estimates in columns (1) and (3). One notable difference is the larger estimate for the effects on downstream industries. A possible explanation is that under FES downstream industries were slower to transition from east to west, so that at the time of the FES repeal these industries were still in the process of moving west, and relative to this benchmark they exhibit an especially large movement back east when FES is repealed. The other difference, appearing in columns (3) and (6), is the large negative trend both for iron and steel industries nationwide, and for states distant from ISPs. Possible reasons for this are, first, that the iron and steel industries simply grow slowly relative to the period’s strong manufacturing growth in a variety of other sectors (Bollard, Klenow and Sharma, 2013), and second, that distant states’ growth rate was slow only relative to the fast benchmark they set while they enjoyed the benefits of FES.

To characterize the transition path associated with the repeal of FES, Figure 2-7 presents an event study. The effects on output, depicted in Figure 2-7a, show a steady transition following the repeal of FES, and a leveling off of the effect around 2002. This amounts to a somewhat faster transition than that following the implementation of FES, perhaps because economy-wide liberalization in the 1990s reduced the frictions for factories to move and for owners to establish new factories in the locations that were optimal given the new freight regime. An especially relevant reform is de-licensing, studied by Aghion et al. (2008), which removed a set of barriers to firm entry and expansion. Indeed, Figure 2-7b shows that, once again, the event study for number of firms in a state-industry mirrors that for output, suggesting that factory location is a key margin of adjustment along the transition path.

2.4 Conclusion

These results demonstrate that even small geographic distortions in input prices can help one region to nose ahead of another and exploit this advantage to steal industrial activity. Over the long term, this can result in substantial effects on the geographic distribution of production. The effects can take time to materialize, however, given frictions such as factory moving and setup costs. The cumulative nature of these advantages might lead one to suspect that they cannot easily be reversed. We find in the case of FES, though, that repealing the policy led industry to move back toward the sources of iron and steel just as quickly as it left. Indeed, the results on implementation and repeal also complement one another, with the alignment between these results building confidence that, in both cases, the distortions related to FES cause industries to move across space in the manner described.

An important aspect of this story is the role of input-output linkages, with the FES policy distortion affecting not only firms using the equalized materials but also their downstream neighbors. Input-output linkages potentially also play a role in determining the rate of transition. For instance, if one set of industries faces frictions and is slow to move across space in response to some price change, then the downstream industries will be delayed in when they find it attractive to move, they will in turn face frictions slowing their move once it starts, and so forth down the supply chain. Avenues for future work could involve exploring these channels through which input-output linkages drive geographic movement and determine its ultimate welfare implications.

Figure 2-1: The economic geography of India at the time of FES implementation

(a) Output at start of FES

(b) Sources of raw materials

Notes: These maps depict the geography of production in India, and thus the effects which are likely to follow from the Freight Equalization Scheme (FES). As panel (a) shows, industrial activity in 1960, around the time of FES implementation, was concentrated in two main centers: one around Bombay, and one around West Bengal. These figures come from ASI data. As panel (b) shows, part of the comparative advantage of West Bengal owed to its proximity to natural resources such as iron ore. By subsidizing long-distance shipments of input materials, FES created an incentive for iron and steel using production to move away from West Bengal and toward other centers like Bombay.

Figure 2-2: Historical trends in state manufacturing output

Notes: This figure depicts the broad state-level trends in manufacturing output between 1960 and 2008: growth in resource-rich states such as West Bengal, Orissa, MP, and Bihar has lagged behind growth in distant states such as Punjab, Gujarat, and Tamil Nadu. These latter states stood to benefit most from FES and have indeed enjoyed some of India’s most impressive growth rates in recent decades.

Figure 2-3: Stylized description of supply chain, providing institutional details and link with data

Notes: This figure categorizes activities into steps in the supply chain for products made from iron and steel. It also clarifies the institutional context of production, particularly as this bears on geography. The source locations of iron ore are fixed, and the locations of ISPs for processing this ore are essentially fixed given the sites of major plants established by the government. In our framework, users of this basic iron and steel (and those downstream from them) take the location of the government plants as fixed, and choose their own location based on the shipping costs to get materials from those plants. Each piece of this chain which corresponds to manufacturing output finds an empirical counterpart in our data, as indicated.

Figure 2-4: Map of basic iron and steel production

Notes: This map depicts the locations of India’s integrated steel plants (ISPs). Only the output of these plants was subject to FES. At the time of FES implementation, there were only two ISPs, with each operating at a large scale. Five new plants were established in the ensuing years, in part serving to reduce the government’s distribution costs under FES.

Figure 2-5: Share of Bihar and West Bengal in iron- and steel-using manufacturing (Engineering) and in other industries

Notes: This figure reproduces Figure 3.2 from Garred et al. (2015). It plots the share of Bihar and West Bengal in Engineering, the main iron- and steel-using manufacturing industry (solid line), versus their share in industries not using iron or steel (dashed line). The initial years following implementation of FES show little reduction in these states’ Engineering share relative to their Other share. Toward the end of the sample period, however, the Engineering share begins to fall to the level of the Other share.

Figure 2-6: Event study showing transition path of FES effects

(a) ln(Output)

(b) ln(Number of factories)

Notes: This figure shows the long-run effects of FES. Plotted coefficients for each year are the $\beta_y$ from

\[ \ln Y_{isy} = \sum_{y=1950}^{2007} \beta_y \left( 1_{\text{Year}=y} \times \text{IronSteelUser}_i \times \ln \text{DistanceSteelPlant}_s \right) + \delta_{is} + \delta_{iy} + \delta_{sy} + \epsilon_{isy}. \]

Figure 2-7: Event study on repeal of FES

(a) ln(Output)

(b) ln(Number of factories)

Notes: This figure shows the long-run effects of FES. Plotted coefficients for each year are the $\beta_y$ from

\[ \ln Y_{isy} = \sum_{y=1989}^{2013} \beta_y \left( 1_{\text{Year}=y} \times \text{IronSteelUser}_i \times \ln \text{DistanceSteelPlant}_s \right) + \delta_{is} + \delta_{iy} + \delta_{sy} + \epsilon_{isy}. \]

Table 2.1: List of industries in Indian Census of Manufactures

Wheat flour
Rice milling
Biscuit making
Fruit and vegetable processing
Sugar
Distilleries and breweries
Starch
Vegetable oils
Paints and varnishes
Soap
Tanning
Cement
Glass and glassware
Ceramics
Plywood and tea-chests
Paper and paper-board
Matches
Cotton textiles
Woolen textiles
Jute textiles
Chemicals
Aluminum, copper, and brass
Iron and steel
Bicycles
Sewing machines
Electric lamps
Electric fans
General engineering and electrical engineering (including producer gas plant manufacture)

Notes: This table lists the industry categories found in the annual Census of Manufactures (1946-58). The final category, Engineering, is broad, encompassing a range of manufacturing activities involving some measure of complexity.

Table 2.2: Descriptive statistics

| | 1950 (CoM): Mean | 1950 (CoM): St. Dev. | 1959 to 1970 (ASI): Mean | 1959 to 1970 (ASI): St. Dev. | 1989 to 2013 (ASI): Mean | 1989 to 2013 (ASI): St. Dev. |
|---|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) | (6) |
| All industries | | | | | | |
| Ex-factory value of output | 251 | 911 | .145 | .334 | 10519 | 31899 |
| Number of factories | 121 | 379 | 26 | 45 | 71 | 176 |
| Workers employed | 28556 | 104814 | 8550 | 24381 | 5495 | 17517 |
| Annual growth rate | .182 | 2.857 | .014 | 1.957 | .005 | 2.994 |
| Industries | 15 | | 54 | | 104 | |
| State × industry obs. | 172 | | 411 | | 1551 | |
| Iron and steel using industries | | | | | | |
| Ex-factory value of output | 153 | 202 | .147 | .268 | 11125 | 35369 |
| Number of factories | 147 | 301 | 25 | 33 | 54 | 88 |
| Workers employed | 18008 | 25474 | 8682 | 13774 | 4359 | 9768 |
| Annual growth rate | .164 | 3.065 | .005 | 2.185 | -.013 | 3.131 |
| Industries | 2 | | 11 | | 34 | |
| State × industry obs. | 19 | | 110 | | 520 | |

Notes: This table presents descriptive statistics on the state-by-industry data compiled from Indian firm surveys. The data is divided by distinct firm survey: Census of Manufactures (CoM), old ASI, and modern ASI. Each figure reported reflects a state-by-industry level average for each year, with years then averaged within each dataset. Output is reported in millions of INR, at 2011 constant prices. Because industry definitions can change over time, the size of a state-by-industry unit can vary across datasets.

Table 2.3: Effects of FES on long-run industrial growth, for states with unchanged boundaries

Dependent variable: Δ ln(Revenue)

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| (Ind. uses iron/steel) × ln(State dist. to ISP) | 0.452*** | 0.471*** | 0.481** | | | |
| | (0.128) | (0.134) | (0.191) | | | |
| (Downstream iron/steel) × ln(State dist. to ISP) | | 0.605*** | | | | |
| | | (0.195) | | | | |
| (Ind. uses iron/steel) × (State contains ISP) | | | | 3.001*** | 3.127*** | 3.164** |
| | | | | (0.857) | (0.894) | (1.268) |
| (Downstream iron/steel) × (State contains ISP) | | | | | 4.055*** | |
| | | | | | (1.313) | |
| Ind. uses iron/steel | | | 4.312*** | | | 4.319*** |
| | | | (0.973) | | | (0.970) |
| ln(State dist. to ISP) | | | 0.478*** | | | |
| | | | (0.154) | | | |
| State contains ISP | | | | | | 3.197*** |
| | | | | | | (1.011) |
| Observations | 80 | 80 | 80 | 80 | 80 | 80 |
| Adjusted R² | 0.815 | 0.830 | 0.426 | 0.815 | 0.830 | 0.431 |
| State, Industry FE | X | X | | X | X | |

Notes: This table presents results of the main specification (2.1) for studying the effects of FES. Observations are at the state-by-industry level, weighted by pre-FES output. The dependent variable is change in logged ex-factory value of output between 1950 and 1990, divided by the number of years. The interacted variables of interest include, first, an indicator of whether the industry is a direct user of iron or steel as an input, and an indicator of whether the industry is downstream from an iron or steel user. Precise definitions of these industry categories are in the text and in Figure 2-3. Next, the state-level variables are logged distance from the state’s centroid to the nearest integrated steel plant (ISP), and an indicator of whether the state itself contains an ISP (true for West Bengal and Bihar). The sample is limited to states in continual existence from 1950 to 1990. Robust standard errors are in parentheses. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Table 2.4: Effects of FES on long-run industrial growth, for all states

Dependent variable: Δ ln(Revenue)

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| (Ind. uses iron/steel) × ln(State dist. to ISP) | 0.507** | 0.527*** | 0.747*** | | | |
| | (0.196) | (0.199) | (0.227) | | | |
| (Downstream iron/steel) × ln(State dist. to ISP) | | 0.252* | | | | |
| | | (0.151) | | | | |
| (Ind. uses iron/steel) × (State contains ISP) | | | | 3.524*** | 3.661*** | 5.110*** |
| | | | | (1.333) | (1.357) | (1.634) |
| (Downstream iron/steel) × (State contains ISP) | | | | | 1.779* | |
| | | | | | (1.057) | |
| Ind. uses iron/steel | | | 4.308*** | | | 4.319*** |
| | | | (0.960) | | | (0.958) |
| ln(State dist. to ISP) | | | 0.338* | | | |
| | | | (0.198) | | | |
| State contains ISP | | | | | | 2.378* |
| | | | | | | (1.383) |
| Observations | 152 | 152 | 152 | 152 | 152 | 152 |
| Adjusted R² | 0.778 | 0.783 | 0.217 | 0.778 | 0.783 | 0.218 |
| State, Industry FE | X | X | | X | X | |

Notes: This table presents results of the main specification (2.1), with the sample including all available states. In the case of changing boundaries, a 1950 state is matched to the 1990 state overlapping with it most. All other details are as in Table 2.3 above. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Table 2.5: Effects of FES on long-run location of downstream industries

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Panel A. Effect on Δ ln(Revenue) | | | | |
| (Leontief share: iron/steel) × ln(State dist. to ISP) | 2.324** | 2.345** | 2.443** | 2.395** |
| | (1.148) | (1.152) | (1.166) | (1.173) |
| Observations | 1048 | 1048 | 1048 | 1048 |
| Adjusted R² | 0.623 | 0.975 | 0.974 | 0.975 |
| Panel B. Effect on Δ ln(Number of firms) | | | | |
| (Leontief share: iron/steel) × ln(State dist. to ISP) | 1.078* | 1.102* | 1.104* | 1.204* |
| | (0.609) | (0.613) | (0.620) | (0.628) |
| Observations | 1063 | 1063 | 1063 | 1063 |
| Adjusted R² | 0.729 | 0.906 | 0.901 | 0.902 |
| Panel C. Effect on Δ ln(Labor productivity) | | | | |
| (Leontief share: iron/steel) × ln(State dist. to ISP) | 0.605 | 0.615 | 0.674 | 0.590 |
| | (0.534) | (0.546) | (0.542) | (0.546) |
| Observations | 1048 | 1048 | 1048 | 1048 |
| Adjusted R² | 0.360 | 0.990 | 0.990 | 0.990 |
| Control for State × (Coal use) | | X | | |
| Control for State × (Fertilizer use) | | | X | |
| Control for State × (Cement use) | | | | X |

Notes: This table presents a version of (2.1) where the indicator of industry iron and steel use is replaced by the industry’s Leontief input share of iron and steel use. Underlying this specification is equation (2.3) and the derivation in Appendix 2.6. All regressions include state and industry fixed effects, and all other details are as above. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Table 2.6: Effects of FES on long-run location of iron-using industries

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Panel A. Effect on Δ ln(Revenue) | | | | |
| (Input share: iron/steel) × ln(State dist. to ISP) | 3.445* | 3.773** | 3.709** | 3.671* |
| | (1.841) | (1.851) | (1.873) | (1.882) |
| Observations | 1063 | 1063 | 1063 | 1063 |
| Adjusted R² | 0.631 | 0.975 | 0.975 | 0.975 |
| Panel B. Effect on Δ ln(Number of firms) | | | | |
| (Input share: iron/steel) × ln(State dist. to ISP) | 1.554 | 1.747* | 1.610 | 1.771* |
| | (0.962) | (0.961) | (0.982) | (0.996) |
| Observations | 1078 | 1078 | 1078 | 1078 |
| Adjusted R² | 0.730 | 0.908 | 0.903 | 0.903 |
| Panel C. Effect on Δ ln(Labor productivity) | | | | |
| (Input share: iron/steel) × ln(State dist. to ISP) | 1.182 | 1.229 | 1.315 | 1.192 |
| | (0.858) | (0.871) | (0.872) | (0.879) |
| Observations | 1063 | 1063 | 1063 | 1063 |
| Adjusted R² | 0.366 | 0.990 | 0.990 | 0.990 |
| Control for State × (Coal use) | | X | | |
| Control for State × (Fertilizer use) | | | X | |
| Control for State × (Cement use) | | | | X |

Notes: This table modifies the test of Table 2.5 by using the direct iron and steel input use share, instead of the Leontief share. All regressions include state and industry fixed effects, and all other details are as above. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Table 2.7: Effects of repealing FES

Dependent variable: Δ ln(Revenue)

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| (Ind. uses iron/steel) × ln(State dist. to ISP) | −0.502* | −0.518* | −0.518* | | | |
| | (0.263) | (0.269) | (0.274) | | | |
| (Downstream iron/steel) × ln(State dist. to ISP) | | −0.051 | | | | |
| | | (0.178) | | | | |
| (Ind. uses iron/steel) × (State contains ISP) | | | | −3.085 | −2.781 | −3.300 |
| | | | | (2.251) | (2.305) | (2.165) |
| (Downstream iron/steel) × (State contains ISP) | | | | | 0.927 | |
| | | | | | (1.588) | |
| Ind. uses iron/steel | | | 1.838 | | | 2.036 |
| | | | (1.117) | | | (1.739) |
| ln(State dist. to ISP) | | | 0.342*** | | | |
| | | | (0.097) | | | |
| State contains ISP | | | | | | 3.270*** |
| | | | | | | (0.890) |
| Observations | 1355 | 1355 | 1355 | 1355 | 1355 | 1355 |
| Adjusted R² | 0.347 | 0.347 | 0.009 | 0.347 | 0.347 | 0.011 |
| State, Industry FE | X | X | | X | X | |

Notes: This table studies the effect of repealing FES, by estimating (2.1) over the period 1990 to 2013. All other details are as in Table 2.3. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Table 2.8: Effects of repealing FES, controlling for pre-trend from 1967 to 1990

Dependent variable: Δ ln(Revenue)

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| (Ind. uses iron/steel) × ln(State dist. to ISP) | −0.497 | −0.832** | −0.519 | | | |
| | (0.412) | (0.423) | (0.414) | | | |
| (Downstream iron/steel) × ln(State dist. to ISP) | | −0.912*** | | | | |
| | | (0.323) | | | | |
| (Ind. uses iron/steel) × (State contains ISP) | | | | −0.789 | −1.735 | −0.992 |
| | | | | (3.739) | (3.812) | (3.648) |
| (Downstream iron/steel) × (State contains ISP) | | | | | −2.663 | |
| | | | | | (2.886) | |
| Ind. uses iron/steel | | | −12.121*** | | | −13.873*** |
| | | | (2.029) | | | (3.221) |
| ln(State dist. to ISP) | | | −0.346* | | | |
| | | | (0.193) | | | |
| State contains ISP | | | | | | −3.803** |
| | | | | | | (1.659) |
| Observations | 1118 | 1118 | 1119 | 1118 | 1118 | 1119 |
| Adjusted R² | 0.447 | 0.452 | 0.013 | 0.447 | 0.447 | 0.015 |
| State, Industry FE | X | X | | X | X | |

Notes: This table presents estimates of (2.6). All details are as in Table 2.7, except that the 1967 to 1990 pre-trend is taken out of the growth outcome variable. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

2.6 Appendix

Additional tables

Table 2.9: Effects of FES on long-run industrial growth, unweighted

Dependent variable: Δ ln(Revenue)

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| (Ind. uses iron/steel) × ln(State dist. to ISP) | 0.588** | 0.600** | 0.483*** | | | |
| | (0.246) | (0.251) | (0.166) | | | |
| (Downstream iron/steel) × ln(State dist. to ISP) | | 0.157 | | | | |
| | | (0.195) | | | | |
| (Ind. uses iron/steel) × (State contains ISP) | | | | 4.639** | 4.712** | 4.017*** |
| | | | | (1.809) | (1.845) | (1.228) |
| (Downstream iron/steel) × (State contains ISP) | | | | | 0.996 | |
| | | | | | (1.343) | |
| Ind. uses iron/steel | | | 2.260** | | | 1.696** |
| | | | (0.898) | | | (0.697) |
| ln(State dist. to ISP) | | | 0.675*** | | | |
| | | | (0.111) | | | |
| State contains ISP | | | | | | 4.506*** |
| | | | | | | (0.764) |
| Observations | 152 | 152 | 152 | 152 | 152 | 152 |
| Adjusted R² | 0.354 | 0.349 | 0.202 | 0.356 | 0.351 | 0.193 |
| State, Industry FE | X | X | | X | X | |

Notes: This table presents estimates of (2.1) as in Table 2.3, except that here state-industries are not weighted by 1950 size of state-industry. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Table 2.10: Effects of repealing FES, controlling for pre-trend from 1989 to 1990

Dependent variable: Δ ln(Revenue)

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| (Ind. uses iron/steel) × ln(State dist. to ISP) | −0.702 | −0.990* | −0.816* | | | |
| | (0.510) | (0.524) | (0.492) | | | |
| (Downstream iron/steel) × ln(State dist. to ISP) | | −0.711** | | | | |
| | | (0.303) | | | | |
| (Ind. uses iron/steel) × (State contains ISP) | | | | −1.554 | −2.157 | −2.420 |
| | | | | (4.174) | (4.224) | (3.938) |
| (Downstream iron/steel) × (State contains ISP) | | | | | −1.634 | |
| | | | | | (2.676) | |
| Ind. uses iron/steel | | | −10.856*** | | | −12.946*** |
| | | | (1.934) | | | (3.224) |
| ln(State dist. to ISP) | | | −0.404** | | | |
| | | | (0.183) | | | |
| State contains ISP | | | | | | −4.338*** |
| | | | | | | (1.534) |
| Observations | 1098 | 1098 | 1098 | 1098 | 1098 | 1098 |
| Adjusted R² | 0.444 | 0.447 | 0.019 | 0.443 | 0.443 | 0.022 |
| State, Industry FE | X | X | | X | X | |

Notes: This table presents estimates of (2.6) as in Table 2.7, except that here it takes out the instantaneous pre-trend from 1989 to 1990, instead of the long-run pre-trend going back to 1967. * 푝 < 0.10, ** 푝 < 0.05, *** 푝 < 0.01.

Conceptual framework: effects of changes in materials prices

Deriving Estimating Equations. Consider a Cobb-Douglas model with input-output linkages. The production function takes the form

\[ y_i = z_i\, l_i^{\alpha_i}\, \xi_i^{\gamma_i} \prod_j x_{ij}^{a_{ij}}, \]

where $\xi_i$ is the raw input (iron or steel) and $x_{ij}$ is the input from industry $j$ used by industry $i$. Suppose $i$ indexes industries and there is a representative consumer with Cobb-Douglas preferences

\[ U = \prod_i y_i^{\beta_i}. \]

We have

\[ a_{ij} = \frac{p_j x_{ij}}{p_i y_i}, \qquad \alpha_i = \frac{w l_i}{p_i y_i}, \qquad \gamma_i = \frac{p_i^{\xi} \xi_i}{p_i y_i}, \]

\[ \frac{p_i c_i}{\beta_i} = \frac{p_j c_j}{\beta_j}, \]

\[ p_i c_i = \beta_i w l. \]

Totally differentiating the equations above, we have

\[ d \ln y_i + d \ln p_i = d \ln p_j + d \ln x_{ij} = d \ln l_i = d \ln p_i^{\xi} + d \ln \xi_i, \]

\[ d \ln p_i + d \ln c_i = d \ln p_j + d \ln c_j = 0. \]

To derive an estimating equation, totally differentiate the production function:

\[ d \ln y_i = d \ln z_i + \alpha_i\, d \ln l_i + \gamma_i\, d \ln \xi_i + \sum_j a_{ij}\, d \ln x_{ij}. \]

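We now substitute the equations above and get

\[
\begin{aligned}
d \ln y_i &= d \ln z_i + \alpha_i\, d \ln l_i + \gamma_i\, d \ln \xi_i + \sum_j a_{ij}\, d \ln x_{ij} \\
&= d \ln z_i + \alpha_i \left( d \ln y_i + d \ln p_i \right) + \gamma_i \left( d \ln y_i + d \ln p_i - d \ln p_i^{\xi} \right) + \sum_j a_{ij} \left( d \ln y_i + d \ln p_i - d \ln p_j \right) \\
&= d \ln z_i + \alpha_i \left( d \ln y_i - d \ln c_i \right) + \gamma_i \left( d \ln y_i - d \ln c_i - d \ln p_i^{\xi} \right) + \sum_j a_{ij} \left( d \ln y_i - d \ln c_i + d \ln c_j \right).
\end{aligned}
\]

Using the fact that $\alpha_i + \gamma_i + \sum_j a_{ij} = 1$, we have

\[ d \ln c_i = d \ln z_i - \gamma_i\, d \ln p_i^{\xi} + \sum_j a_{ij}\, d \ln c_j. \]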

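In vector form,

\[ d \ln \mathbf{c} = d \ln \mathbf{z} - \boldsymbol{\gamma}\, d \ln \mathbf{p}^{\xi} + \mathbf{A}\, d \ln \mathbf{c}, \]

where capital bold-font letters indicate matrices and non-capital bold-font letters indicate vectors. We therefore have

\[ d \ln \mathbf{c} = (\mathbf{I} - \mathbf{A})^{-1} \left[ d \ln \mathbf{z} - \boldsymbol{\gamma}\, d \ln \mathbf{p}^{\xi} \right]. \]

Given Cobb-Douglas, we also have $d \ln \mathbf{c} = d \ln \mathbf{y}$, hence

\[ d \ln \mathbf{y} = (\mathbf{I} - \mathbf{A})^{-1} \left[ d \ln \mathbf{z} - \boldsymbol{\gamma}\, d \ln \mathbf{p}^{\xi} \right]. \]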

We therefore have

\[
\begin{aligned}
d \ln y_i &= - \sum_j L_{ij} \left( dz_j + \gamma_j\, d \ln p_j^{\xi} \right) \\
&= \underbrace{- \left( dz_i + \gamma_i\, d \ln p_i^{\xi} \right)}_{\text{own effect}} \; \underbrace{- \sum_j \left( L_{ij} - 1_{j=i} \right) \left( dz_j + \gamma_j\, d \ln p_j^{\xi} \right)}_{\text{network effect}}.
\end{aligned}
\]

We have derived the equation above assuming each $i$ is an industry. Here, we can view $dz_i$ as an industry fixed effect and $d \ln y_i$ is the change in output in industry $i$, while $d \ln p_i^{\xi}$ is the change in cost of the raw input $\xi$ in industry $i$. Mapping to data, we have multiple observations for any given industry $i$ as we have regional variation, hence the correct estimating equation should be

\[ d \ln y_{ir} = \delta_i + \delta_r - \gamma_i\, d \ln p_{ir}^{\xi} - \sum_j \left( L_{ij} - 1_{j=i} \right) \left( \delta_j + \delta_r + \gamma_j\, d \ln p_j^{\xi} \right). \]

One can run the regression in a reduced form way to lump own- and network-effect together:

\[ d \ln y_{ir} = - \sum_j L_{ij} \left( \delta_j + \delta_r + \gamma_j\, d \ln p_j^{\xi} \right). \]

Losing some efficiency, we can relax the restrictions that the fixed effects have to be consistent across equations, and instead run

\[ d \ln y_{ir} = \delta_i + \delta_r - \sum_j L_{ij} \gamma_j\, d \ln p_j^{\xi}. \]
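To make the own versus network decomposition concrete, the following sketch evaluates the vector expression $d \ln \mathbf{y} = (\mathbf{I} - \mathbf{A})^{-1}[d \ln \mathbf{z} - \boldsymbol{\gamma}\, d \ln \mathbf{p}^{\xi}]$ for a two-industry economy. All numbers are hypothetical and chosen only for illustration: industry 0 uses the raw input directly, while industry 1 buys half of its inputs from industry 0 and uses no raw input itself. The function names and the closed-form 2x2 inverse are conveniences of this sketch, not part of the derivation above.

```python
def leontief_inverse_2x2(a):
    """Return L = (I - A)^{-1} for a 2x2 input-share matrix A (closed form).

    a[i][j] is the cost share of industry j's output in industry i.
    """
    m00, m01 = 1.0 - a[0][0], -a[0][1]
    m10, m11 = -a[1][0], 1.0 - a[1][1]
    det = m00 * m11 - m01 * m10
    return [[m11 / det, -m01 / det],
            [-m10 / det, m00 / det]]


def output_response(a, gamma, dln_price, dln_z=(0.0, 0.0)):
    """Split d ln y = L [d ln z - gamma * d ln p] into own and network parts."""
    L = leontief_inverse_2x2(a)
    # Direct shock hitting each industry before any propagation.
    shock = [dln_z[j] - gamma[j] * dln_price[j] for j in range(2)]
    total = [sum(L[i][j] * shock[j] for j in range(2)) for i in range(2)]
    own = list(shock)                                # the j = i term with weight 1
    network = [total[i] - own[i] for i in range(2)]  # remaining (L - I) terms
    return total, own, network


# Hypothetical economy: industry 1 is downstream of industry 0.
A = [[0.0, 0.0],
     [0.5, 0.0]]
gamma = [0.3, 0.0]      # raw-input cost shares
dln_price = [0.1, 0.1]  # 10 log-point rise in the raw input price

total, own, network = output_response(A, gamma, dln_price)
print(total, own, network)
```

In this example the downstream industry has no own effect, since it uses no raw input, but its output still falls through the network term because its supplier is hit directly.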

Chapter 3

Do Anti-Bribery Laws Affect International Trade and Investment?

3.1 Introduction

Two contradictory themes emerge from existing studies of the 1997 OECD Anti-Bribery Convention. On one hand, these studies agree the Convention is poorly implemented. Although signatory countries pledged to prosecute their businesspeople paying bribes to foreign officials, the OECD lacks any formal power to enforce this pledge. As of 2014 Transparency International reports that 30 of 39 Convention signatories do “Limited” or “Little or No” actual enforcement against foreign bribery. Yet despite the lack of implementation, literature on the Convention somehow seems to find real effects: recent papers argue that it redirects trade (D’Souza, 2012) and foreign investment (Cuervo-Cazurra, 2008) away from corrupt destination countries, to the point of functioning as “de facto economic sanctions” against emerging markets (Spalding, 2010). How can such a poorly implemented Convention achieve such results?

To resolve this dilemma, I look closer at the Convention’s apparent effects. Using three distinct empirical strategies, I find consistent evidence that the Convention did not in fact affect trade and FDI patterns in corrupt destinations.

My first strategy is a straightforward modification of D’Souza (2012) and Cuervo-Cazurra (2008). As outlined below, these papers employ a triple difference: Convention signatories which pass anti-bribery laws in a given year are compared against those with no such law at that time, and results show that this differentially affects trade and investment flows if the destination is corrupt. I hypothesize that these estimates are confounded by a broad OECD-level trend in cooperation among its members. At the time of the Anti-Bribery Convention in the late 1990s, the OECD was especially active, adding several new members, launching initiatives against tax havens, nearly reaching a major Multilateral Agreement on Investment, and branching out into other areas such as the Program for International Student Assessment. Even apart from any real effects of these concurrent programs in redirecting economic activity, the launching of so many initiatives suggests that in the late 1990s OECD member countries were succeeding in overcoming their collective action problems and cooperating with one another. Perhaps the Anti-Bribery Convention was a symptom of this broad pattern of cooperation rather than a meaningful policy change with effects of its own. This hypothesis motivates revisiting the analysis of D’Souza (2012) and Cuervo-Cazurra (2008), with an additional control for the interaction of Convention passage and an indicator of destination membership in the OECD. Adding this interaction kills the corruption term. In other words, OECD countries trade more with each other at the time of the Convention, and they themselves happen to be less corrupt, but beyond this there is no residual avoidance of corrupt markets.

My second strategy exploits an apparent increase in the Convention’s force, provided by the Phase 3 initiative of the OECD Working Group on Bribery. In Phase 3, Convention signatories monitor each other, investigating enforcement and publishing reports which praise the countries actually enforcing their foreign bribery laws and pressure the non-enforcers. Jensen and Malesky (2018) argue that the 2010 launch of Phase 3 marked a discrete increase in Convention enforcement. Their difference-in-differences estimates suggest Phase 3 reduced bribery by OECD firms operating in Vietnam. In contrast, I find that Phase 3 had no effect on aggregate trade and FDI flows to corrupt countries.

My final strategy uses product-level trade data. I construct a product-by-destination measure of pre-Convention exposure to imports from Convention countries. I find that non-Convention exporters do gain business in exposed markets, but this effect is no stronger in corrupt destinations. I also test whether aggregate imports decline in exposed destinations, this time finding the evidence inconclusive. Taken together, these results finally resolve the dilemma: consistent with its poor implementation, the Convention achieved little real effect.

3.2 Background

Prior to the 1990s, the U.S. with its Foreign Corrupt Practices Act was the lone country actively prosecuting its businesspeople for bribing foreign officials. As late as 1999, major European corporations openly claimed tax deductions for bribes. Especially brazen in this practice was Siemens, whose telecoms-equipment division set up three “cash desks” to which

160 executives could bring empty suitcases to be filled with up to 1 million Euros in cash, with few questions asked and no documentation required (The Economist, 2008). Perceiving that these practices disadvantaged US multinationals, the US pressured other developed countries to pass similar laws against foreign bribery. Tarullo (2003) describes these laws as a prisoners’ dilemma, since all countries might benefit from a level, non-corrupt playing field for exports and foreign investment, but any individual country benefits from allowing its companies to bribe abroad. After several years of stalled negotiations, the anti-foreign bribery movement suddenly gained support among OECD countries, leading them to finalize the Anti-Bribery Conven- tion in 1997. All Convention signatories agree to make the bribery of foreign public officials a criminal offense, to investigate credible allegations, and where appropriate prosecute and punish those who offer, promise, or give bribes to foreign officials (OECD Working Group on Bribery, 2014). The Convention entered into force in 1999, leading countries to pass compliant laws in the ensuing years. Table 3.1 lists the Convention signatories, along with the year in which each country passed its anti-bribery law. The original signatories include all OECD member countries, in addition to six non-member countries. Most signatories pass the anti-bribery law soon after the Convention enters into force, and all do so by 2004. This study relates to several strands of existing literature. Most directly, it challenges the existing consensus that anti-foreign bribery laws redirect trade and investment. In addition to the above mentioned studies of the OECD Convention, Beck, Maher and Tschoegl (1991) and Hines (1995) conduct similar analyses of the US FCPA, finding similar redirections of trade and foreign investment. 
My results do not necessarily conflict with these findings, since unlike the OECD Convention, the US FCPA is vigorously enforced. In fact, I provide evidence that the Convention might have redirected trade from corrupt markets at least for the few countries which actively comply.

A second strand of literature examines the role of corruption as an impediment to international business. Wei (2000) and Wei and Shleifer (2000) provide cross-country empirical evidence suggesting corruption reduces foreign investment, while Anderson and Marcouiller (2002) provide similar evidence for trade. The proposed mechanisms vary: corruption may create imperfect contractibility (Dixit, 2015), increase uncertainty (Wei, 1997), or advantage local ownership (Javorcik and Wei, 2009). While most of these theories regard corruption as a characteristic of the destination country, the OECD Convention targets the “supply side,” requiring origin countries to restrain their bribe payers. The effects of this intervention are ambiguous. The Convention could hinder trade and investment by constraining the actions of OECD firms. On the other hand, it could facilitate trade and investment if it promotes transparency and thus easier contracting in foreign business, or if it commits multinationals to avoid bribery and thus strengthens their bargaining position vis-a-vis rent-extracting foreign officials.

Finally, this study speaks to the economic importance of multinational organizations such as the OECD, and of international relations more broadly. Rose (2005) finds that OECD accession increases a country’s trade flows, perhaps a surprising result given that the OECD carries little formal power. Supporting this result, Davis (2014) conducts case studies of Japan, Mexico, and Korea, arguing that OECD accession leads to important institutional liberalization. I present an alternative to this view, suggesting that accession to the OECD or participation in an initiative such as the Convention has little real effect, but simply tends to coincide with broader patterns of international cooperation. This view is consistent with a tradition in political science arguing that strategic considerations in international relations act as an important determinant of trade patterns (Gowa, 1995).

3.3 Data and methodology

As a baseline specification, consider the strategy used by existing papers to estimate the effects of anti-bribery legislation. The best implementation of this strategy is the preferred specification of D’Souza (2012):

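$$\ln(\mathit{Exports})^{o \to d}_{t} = \beta_1 (\mathit{Conv}_{ot} \times \mathit{Corruption}_d) + X'_{odt}\gamma_1 + \delta_{od} + \delta_{ot} + \delta_{dt} + \epsilon_{odt}. \qquad (3.1)$$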

The dependent variable is the log dollar value of total exports from origin country $o$ to destination country $d$ in year $t$. Trade data from UN Comtrade includes annual bilateral trade flows in US dollars, as reported by the exporting country. To maintain consistency with D’Souza (2012), I include the same set of 143 exporters and 155 importers. For the years 1992 to 2006, this yields 332,475 pair-by-year observations, of which 173,849 include positive trade flows and available data for the other control variables. As in previous work, I include only these positive trade flows.¹

¹ A natural alternative would be a Tobit model, to include the observations with no trade flows and capture intensive-margin effects. However, Tobit models become inconsistent in the presence of fixed effects, and the semi-parametric solution to this problem (Honoré, 1992) uses a median estimator which is poorly suited to this setting, in which so many observations include zero trade flows. Another alternative is the pseudo maximum-likelihood Poisson method suggested by Silva and Tenreyro (2006). D’Souza (2012) tries this method and finds it does not change the results, so I maintain focus on positive trade flows.

$\mathit{Conv}_{ot}$ is an indicator of whether country $o$ has passed a Convention-related law in year $t$. Since countries implement in different years, $\mathit{Conv}_{ot}$ varies by origin-year. Note that in this regression identification comes both from this difference in timing and from differences between signatories and non-signatories. The fixed effects control for unobserved differences at the country pair level, as well as year-specific differences in average trade for each exporter and importer.
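To make the treatment variable concrete, the following is a minimal sketch of how the interaction term in (3.1) could be built. The implementing years are taken from Table 3.1; the function names, the choice to switch the indicator on from the law year onward, and the example corruption score are illustrative assumptions, not the thesis's actual code.

```python
# Implementing-law years for a few signatories, as listed in Table 3.1.
IMPLEMENTING_YEAR = {"Germany": 1999, "United Kingdom": 2002, "Japan": 1999}


def conv(origin: str, year: int) -> int:
    """Conv_ot: 1 once the origin has passed its law, else 0.

    Assumption: the indicator turns on in the law year itself and stays on.
    Origins missing from the dictionary are treated as non-signatories.
    """
    law_year = IMPLEMENTING_YEAR.get(origin)
    return int(law_year is not None and year >= law_year)


def corruption(wgi_control_of_corruption: float) -> float:
    """Corruption_d: the WGI score multiplied by -1, so higher means more corrupt."""
    return -wgi_control_of_corruption


def treatment(origin: str, year: int, wgi_score_dest: float) -> float:
    """The regressor Conv_ot x Corruption_d for one origin-destination-year."""
    return conv(origin, year) * corruption(wgi_score_dest)


# Hypothetical destination with a WGI score of -1.0 (one unit more corrupt than the mean).
print(treatment("United Kingdom", 2001, -1.0))  # before the 2002 law
print(treatment("United Kingdom", 2003, -1.0))  # after the 2002 law
```

Because the destination's corruption level is fixed at its 1998 value, all within-pair variation in this regressor comes from the origin switching the indicator on.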

The coefficient of interest in (3.1) is $\beta_1$, which estimates the effect of the Convention on trade flows as a function of $\mathit{Corruption}_d$, the 1998 level of corruption in the destination country. As a corruption measure, I use the Control of Corruption Index in the Worldwide Governance Indicators (WGI) provided by the World Bank. The index measures standard deviations in a survey-based corruption score, ranging from approximately -2.5 to 2.5, with a mean of approximately 0. I multiply by -1, so higher values correspond to more corruption.

The specification also controls for a set of variables $X_{odt}$ motivated by gravity model predictions of trade flows. First, this includes a set of variables from Rose (2005) and the CEPII gravity dataset (Head and Mayer, 2014): joint membership in a regional trade agreement, one or both countries being in GATT/WTO, and membership in a common currency union. To extend these variables beyond 2006, I use a hand-coded list of recent WTO accessions and the DESTA database of regional trade agreements. I also collect GDP and population data from the World Bank, and construct the logged product of per capita GDP for each country pair in each year. Finally, I use data from the IMF International Financial Statistics to compute exchange rate volatility, calculated as the standard deviation of the log of each year’s monthly nominal exchange rates.

For FDI, the approach is similar. Bilateral FDI flows come from the OECD database. Since this includes only OECD reporters, the Convention effects in the FDI regressions are identified entirely from differences in the time at which each OECD country implements the Convention.² I run these regressions for 1996 to 2002, again to maintain consistency with Cuervo-Cazurra (2008). I generally employ the same set of fixed effects as in (3.1), but in one initial regression follow the specification of Cuervo-Cazurra (2008) by only including origin fixed effects and adding a wider set of gravity controls from Rose (2005). These controls include distance between countries and indicators of being landlocked, being an island, sharing a border, sharing a common language, having a common colonizer, and ever being in a colonial relationship.

² Most of the main results do not change substantially if the trade sample is restricted to the FDI sample of OECD origins only.

My first departure from this baseline strategy adds indicators of political alignment. I construct the indicators of joint OECD and EU membership using lists of member countries and their accession dates. I also measure political alignment using data on UN voting, from Strezhnev and Voeten (2013). My preferred measure of alignment is their Affinity index, using three-category vote data. This index ranges from -1 for countries with the least similar interests, to 1 for countries which always agree.

For my analysis of Phase 3 and the effects of product-level exposure to OECD imports, I use the same country level variables, but an extended range of trade flow data. The Phase 3 analysis uses data from 2010 to 2013, for which there are 88,660 pair-by-year observations, 53,804 of which include positive trade flows and non-missing values of the control variables. The product-level data for 1992 to 2006 is taken at the three-digit level of SITC Revision 3, and includes a total of 9,721,620 observations with positive trade flows.
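The exchange rate volatility control described earlier in this section can be sketched as follows. This is only an illustration of the verbal definition (the standard deviation of the log of one year's monthly nominal rates); the function name is hypothetical, and the use of the sample rather than population standard deviation is an assumption, since the text does not say which is used.

```python
import math
import statistics


def exchange_rate_volatility(monthly_rates):
    """Standard deviation of the log of one year's monthly nominal exchange rates."""
    if len(monthly_rates) < 2:
        raise ValueError("need at least two monthly observations")
    log_rates = [math.log(rate) for rate in monthly_rates]
    # Sample standard deviation (n - 1 denominator); an assumption of this sketch.
    return statistics.stdev(log_rates)


# A perfectly stable bilateral rate has zero volatility.
print(exchange_rate_volatility([1.25] * 12))
```

Taking logs before computing the dispersion makes the measure scale-free, so it is comparable across currency pairs quoted at very different levels.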

3.4 Results

3.4.1 Strategy 1: Baseline model, accounting for international cooperation

My first empirical strategy starts from the baseline model (3.1), examines its identification requirements, and shows how OECD-level patterns of cooperation act as an omitted variable.

Causation in the baseline model

Table 3.2 presents estimates based on (3.1), reproducing the main results of D’Souza (2012).

The preferred specification in column 3 yields a negative estimate of $\hat{\beta}_1 = -0.063$, suggesting that implementation of the Convention redirects trade away from corrupt countries. Specifically, a corrupt destination country receives on average a 6.3 percent decrease in annual imports from each Convention implementer, relative to a country which is less corrupt by one standard deviation in the WGI corruption index. This result is robust to the inclusion of different sets of controls, and consistent with the coefficient reported in column 2, which includes country-pair and year fixed effects ($\delta_{od}$ and $\delta_t$) but no separate time dummies for each origin and destination country ($\delta_{ot}$ and $\delta_{dt}$). This result shows that at the time of Convention implementation, countries shift their trade away from corrupt countries, although it does not demonstrate that the shift occurs because these countries are more corrupt.
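The 6.3 percent figure reads the coefficient on a log outcome directly as a percentage change, which is the usual approximation for small coefficients. As a hedged check, the sketch below compares that approximation with the exact change implied by a coefficient of -0.063; the function name is illustrative.

```python
import math


def pct_change_from_log_coef(beta: float) -> float:
    """Exact percentage change in the outcome implied by a coefficient on a log outcome,
    for a one-unit increase in the regressor."""
    return 100.0 * math.expm1(beta)


beta_hat = -0.063
approx = 100.0 * beta_hat                    # log-point approximation
exact = pct_change_from_log_coef(beta_hat)   # exp(beta) - 1, in percent
print(f"approximate: {approx:.2f}%, exact: {exact:.2f}%")
```

The exact figure is about -6.1 percent, so the approximation used in the text is close at this magnitude.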

In order to interpret $\hat{\beta}_1$ as the change in trade flows caused by the interaction of the Convention and destination corruption, we require a standard counterfactual assumption: if the Convention had not been implemented in a particular country-year, trade would not have shifted to less corrupt destinations.³

The most basic test of this requirement checks for parallel pre-trends. Figure 3-1 plots Convention signatories’ relative exports to corrupt destinations as a function of the number of years before or after implementation.⁴ The negative coefficient reported in Table 3.2 reflects the post-implementation coefficients in Figure 3-1 being on average lower than the pre-implementation coefficients. If the parallel trends assumption holds, we expect the pre-implementation coefficients in Figure 3-1 to be constant from year to year. In the figure, these coefficients slope downward, casting minor doubt on parallel trends. However, the downward slope is very slight, and there is a clear break, with the post-intervention coefficients falling lower than pre-intervention. So overall the pre-trends support the above result.

Even with parallel pre-trends, the identification requirement is violated if, for instance, there are other characteristics of high-corruption destination countries which make them more likely to lose imports from OECD origin countries at the time these origins implement the Anti-Bribery Convention. As suggested above, the pattern of OECD countries cooperating with each other indicates that joint membership in the OECD might be such an omitted factor. Specifically, implementation of the Convention in a given country coincides with that country cooperating more with the rest of the OECD, which entails more exports to other OECD countries, and because these OECD destinations are typically less corrupt, this explains the redirection of trade from corrupt to less corrupt countries.

To control for these patterns, I augment (3.1) by interacting additional controls with $\mathit{Conv}_{ot}$.⁵ Table 3.3 implements this idea, suggesting that when they implement the Convention, countries redirect their trade toward other OECD members, but otherwise not to less corrupt countries. Columns 1 and 2 add a single additional control, $\mathit{Conv}_{ot} \times \mathit{OECD}_d$, and this leads to an insignificant estimate of $\beta_1$, though with a fairly wide confidence interval. I reject the above estimate of −0.063, although the 95-percent confidence interval extends as low as −0.051. With the additional controls in columns 3 and 4, $\beta_1$ is estimated to be positive and even marginally significant. Focusing for interpretation on column 1, implementing the Convention itself has no effect on the amount of trade between a country pair, unless the destination country is also an OECD member, in which case the year of Convention implementation coincides with an 18 percent increase in exports. This does not necessarily mean that the Convention causes countries to export more to OECD partners, but more plausibly that both Convention implementation and the trade increase follow from a broader pattern of cooperation among these countries.⁶

Table 3.4 presents analogous results for FDI. Columns 1 and 2 reproduce the results of Cuervo-Cazurra (2008), while columns 3 and 4 control for the interaction of Convention implementation with joint OECD membership. Again, the baseline result indicates a redirection of trade away from corrupt countries, but corruption itself is not an important driver of trade once we control for the cooperation among OECD countries.

³ This is just one way to express the relevant conditional independence assumption, which we state formally as follows: For all values of $\mathit{Conv} \times \mathit{Corruption}$,

$$f_{odt}(\mathit{Conv} \times \mathit{Corruption}) \perp \mathit{Conv}_{ot} \times \mathit{Corruption}_d \mid X_{odt}, \delta_{od}, \delta_{ot}, \delta_{dt},$$

where $f_{odt}(\cdot)$ is the value of log exports that would flow from $o$ to $d$ in year $t$ for a given value of $\mathit{Conv} \times \mathit{Corruption}$.

⁴ Specifically, these coefficients come from the regression

$$\ln(\mathit{Exports})^{o \to d}_{t} = \sum_{\tau=2}^{5+} \beta_{-\tau} (\mathit{YearsPrior}^{\tau}_{ot} \times \mathit{Corruption}_d) + \sum_{\tau=0}^{5+} \beta_{+\tau} (\mathit{YearsPost}^{\tau}_{ot} \times \mathit{Corruption}_d) + \delta_{od} + \delta_{ot} + \delta_{dt} + \epsilon_{odt},$$

where $\mathit{YearsPrior}^{\tau}_{ot}$ and $\mathit{YearsPost}^{\tau}_{ot}$ are indicators for year $t$ being $\tau$ years before or after the implementation of the Convention in a country $o$. Each coefficient gives a year-specific estimate of Convention versus non-Convention countries’ relative tilt toward higher-corruption export destinations, relative to what this tilt is one year prior to implementation.

⁵ Specifically, I run

$$\ln(\mathit{Exports})^{o \to d}_{t} = \beta_1 (\mathit{Conv}_{ot} \times \mathit{Corruption}_d) + \beta_2 (\mathit{Conv}_{ot} \times \mathit{OECD}_{odt}) + \beta_3 (\mathit{Conv}_{ot} \times W_{odt}) + X'_{odt}\gamma_1 + \mathit{OECD}'_{odt}\gamma_2 + W'_{odt}\gamma_3 + \delta_{od} + \delta_{ot} + \delta_{dt} + \epsilon_{odt},$$

where $\mathit{OECD}_{odt}$ indicates joint membership in the OECD. I also run this specification with a set of additional controls, $W_{odt}$, that might explain changes in trade flows at the time of Convention implementation. These controls include common membership in other international organizations such as the European Union, political alignment as reflected in United Nations voting alignment, common membership in trade agreements or currency unions, and destination country characteristics such as GDP. If the trade redirection at the time of Convention implementation is driven by cooperation among OECD members and there is no residual redirection resulting from destination corruption, we expect a positive estimate of $\beta_2$ and an estimate of zero for $\beta_1$.

⁶ Another approach to controlling for these patterns is to simply re-estimate (3.1) with OECD countries removed from the sample of destinations. These results are available in the appendix, and they do not alter the conclusions above: even among non-OECD destinations, Convention signatories do not seem to shift toward less corrupt destinations.

The cooperation patterns

To characterize these broader patterns of cooperation among OECD countries, I first consider longer-term trends in trade and FDI flows. Figure 3-2 plots Convention signatories’ total exports to other OECD members alongside exports to non-OECD destinations. Overall, these two series follow similar trends, both in levels and in logs. However, zooming in on the years surrounding Convention implementation (indicated by the dashed lines at 1995 and 2002), we observe a steady increase in exports to OECD partners, accompanied by stagnation or decrease in exports to non-OECD countries. Similar patterns hold looking at plots of residualized trade (Figure 3-3) and longer-term trends in FDI.

UN voting alignment data provides an even more stark indication of trends in international cooperation. Figure 3-4 uses an event study framework to characterize OECD countries’ patterns of political alignment in the years surrounding the Convention. Specifically, I regress bilateral voting affinity scores on year-specific dummies for joint OECD membership, together with country pair and year fixed effects:

$$\mathit{Alignment}_{odt} = \sum_{\tau=1993}^{2012} \beta_{\tau} (\mathit{OECD}_d \times 1_{t=\tau}) + \delta_{od} + \delta_{t} + \epsilon_{odt}.$$

I include dummies for the years 1993 through 2012, with 1992 also included in the sample as the omitted category. The plotted $\beta_\tau$ capture the trend in OECD members’ voting alignment with each other relative to their degree of alignment with non-OECD countries. Consistent with the patterns in trade and FDI, diplomatic cooperation among OECD members increases steadily in the years surrounding the Convention.
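The year-specific dummies in this regression can be illustrated with a short sketch. It builds, for a single observation, the set of interaction terms between the OECD indicator and each year from 1993 to 2012, leaving 1992 as the omitted base year. The function and variable names are hypothetical, and the sketch covers only the construction of regressors, not the estimation.

```python
# Years that receive their own dummy; 1992 is in the sample but omitted as the base year.
EVENT_YEARS = range(1993, 2013)


def oecd_year_dummies(oecd_d: int, year: int) -> dict:
    """Return {tau: OECD_d * 1[t == tau]} for tau = 1993, ..., 2012."""
    return {tau: int(oecd_d == 1 and year == tau) for tau in EVENT_YEARS}


row_2000 = oecd_year_dummies(oecd_d=1, year=2000)
row_1992 = oecd_year_dummies(oecd_d=1, year=1992)
print(sum(row_2000.values()), sum(row_1992.values()))
```

With 1992 omitted, each coefficient on these dummies is measured relative to the OECD versus non-OECD alignment gap in that base year.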

Interpreting these OECD-level trends requires some caution. While the trends confound estimates of the Convention’s effects, they do not necessarily indicate that the OECD as an organization plays a role in changing trade patterns or causing cooperation among its members. The results indicate simply that countries which happen to be in the OECD cooperate effectively at this time, perhaps for international political reasons unrelated to the OECD itself or to the Convention. For instance, the OECD includes primarily Western European and rich nations, which shortly before the Convention had to cooperate in building the newly formed European Union, and shortly after the Convention had to cooperate in the US-led invasion of Afghanistan. Especially given that the OECD has little formal power, we might expect these forces to be far stronger than any actual role of the OECD. I thus re-estimate column 2 of Table 3.2, controlling for $\mathit{Conv}_{ot} \times \mathit{EuropeanUnion}_d$ instead of $\mathit{Conv}_{ot} \times \mathit{OECD}_d$. This still yields an estimate of $\hat{\beta}_1 = -0.039$ (s.e. 0.0156), indicating that intra-EU cooperation at the time of the Convention does not explain the entire role of intra-OECD cooperation in redirecting trade. So it seems that cooperation among at least roughly the set of OECD countries is an important shifter of trade at the time of the Convention, even if the OECD itself plays little role and the Convention is merely a symptom of this cooperation.
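As a rough check on the precision of the estimate of −0.039, a conventional 95-percent confidence interval can be computed from the reported standard error. The normal critical value of 1.96 is an assumption of this calculation; the text does not state which critical value it uses.

```latex
\hat{\beta}_1 \pm 1.96 \times \mathrm{s.e.}
  = -0.039 \pm 1.96 \times 0.0156
  = -0.039 \pm 0.0306
  \;\Longrightarrow\; [-0.0696,\; -0.0084]
```

Under that assumption the interval excludes zero, which is what allows the conclusion that controlling for intra-EU cooperation alone does not eliminate the estimated redirection.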

Apart from these trends around the time of the Convention, notable trends in OECD cooperation appear at other times. For example, Figure 3-3 indicates that, controlling for fixed effects and gravity-related factors, intra-OECD trade dips sharply following the 2008 financial crisis. Figure 3-4 indicates that at this time UN alignment among OECD members was highly volatile from year to year, though overall not increasing or decreasing.

If OECD-level or other cross-national trends often determine patterns of trade, this poses an additional challenge for statistical inference. By default, I follow Cuervo-Cazurra (2008) and D’Souza (2012) in clustering standard errors at the level of country pair. This approach might be insufficiently conservative, however, if trade flows in fact depend on OECD-level shocks such as a pattern of cooperation among member countries. Conceptually, an appropriate correction would be to cluster at the level of OECD membership, although this is of course impractical with only one OECD. As an intermediate correction used in robustness checks, I cluster at the level of origin country. The main results are robust to clustering on origin for the trade regressions, but not for FDI where the sample is smaller. In any event, the impossibility of clustering at a higher level marks a general challenge in using cross-country regressions to estimate the effects of anti-bribery legislation, particularly given the seeming importance of broad patterns of international cooperation.

More generally, the importance of these trends casts doubts on estimates of the effects of accession to trade agreements or other trade policy changes requiring international cooperation. An important paper in this vein is Rose (2005), which estimates the effects on total trade when a country joins the GATT/WTO, IMF, or OECD. Accession to trade agreements often has real effects on tariffs, but if it also signals the intention of member countries to cooperate, this intention rather than the policy change itself might be what drives the estimated trade increases. Indeed, a surprising finding of Rose (2005) is that OECD accession has a consistently large positive effect on trade, even though the OECD, to the extent it lacks formal power, arguably acts as little more than a mechanism for countries to signal their intention to strengthen relations with the other OECD countries.

Differences in enforcement

The one apparent effect comes from restricting focus to countries which actively enforce their anti-bribery laws. With the OECD lacking power to enforce the Convention, many member countries pass laws to nominally comply with the Convention, but in practice never enforce these laws. Transparency International (2006) compiles an annual report in which it assesses whether signatories comply with the Convention by actively prosecuting foreign bribery cases. As of this report’s first edition in 2006, eleven countries were rated as “Significant” enforcers: Belgium, Bulgaria, Denmark, France, Germany, Hungary, South Korea, Norway, Spain, Sweden, Switzerland, and the United States. We thus augment (3.1) with the term $\mathit{Enforce}_o \times \mathit{Conv}_{ot} \times \mathit{Corruption}_d$, where $\mathit{Enforce}_o$ indicates that as of 2006, the origin country was rated a “Significant” enforcer. Table 3.5 presents the results, showing in columns 1 and 2 that for active enforcers, trade does shift to less corrupt destinations. Column 3 shows that the same is true for FDI.

An advantage of this specification is that it provides another avenue to controlling for OECD-level trends in trading patterns. Assuming any OECD-level trends apply equally in enforcing and non-enforcing countries, this specification delivers the effect of active enforcement against foreign bribery relative to having a nominal law but no actual enforcement. The specification’s shortcoming is that it still does not address the timing issue mentioned above, nor other possible differences between enforcers and non-enforcers. Most problematically, it controls for enforcement levels which are arguably not a pre-determined country characteristic but an intermediate outcome by which the Convention affects trade flows. So interpreting these results demands caution.

3.4.2 Strategy 2: Increased enforcement in Phase 3 of the OECD Working Group on Bribery

If the estimates above are driven by patterns of cooperation specific to the late 1990s and early 2000s, one possible remedy is to turn to another time period. Jensen and Malesky (2018) argue that Phase 3 of the OECD Working Group on Bribery led to a discrete increase in the intensity of Convention implementation. Prior to the 2010 launch of Phase 3, many countries slacked in enforcing their bribery laws. Phase 3 involved signatory countries monitoring each other and reporting on their lax enforcement. Jensen and Malesky (2018) provide difference-in-differences estimates showing that Phase 3 reduces self-reported bribery by OECD firms operating in Vietnam. This motivates a country-level version of their empirical strategy:

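$$\ln(\mathit{Exports})^{o \to d}_{t} = \beta_1 (\mathit{Phase3}_{ot} \times \mathit{Corruption}_d) + X'_{odt}\gamma_1 + \delta_{od} + \delta_{ot} + \delta_{dt} + \epsilon_{odt}, \qquad (3.2)$$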

where $\mathit{Phase3}_{ot}$ equals 1 if country $o$ signed the Convention and the year is 2011 or later. In addition to providing estimates of the Convention’s effects in a period other than the high-cooperation period of its initial adoption, this strategy addresses another potential endogeneity problem relating to countries’ choices of implementation timing. If national governments anticipated that their firms were going to continue exporting to corrupt destinations, they might have delayed implementation of the Convention, waiting until these firms announced plans to move out of corrupt markets or stopped lobbying to delay implementation. In this case, the identifying assumption would be violated, with the estimates in Table 3.2 reflecting trade shifts that were happening anyway, not the effects of the Convention. In contrast, because Phase 3 came at the OECD level, its timing was independent of the goals of any particular implementing government, and thus less likely to have anticipated specific changes in trade flows.

Table 3.6 presents the results of this Phase 3 analysis. In all three specifications, $\hat{\beta}_1 > 0$, meaning Phase 3 coincides with a redirection of trade toward corrupt countries. This estimate is, however, insignificant in columns 2 and 3, which include exporter- and importer-by-year fixed effects. Still, the positive or null estimates here deliver the surprising conclusion that although Phase 3 seems to reduce OECD firms’ bribery in corrupt countries as per Jensen and Malesky (2018), this constraint does not in aggregate translate into reduced trade with these countries.⁷ For comparability with the Jensen and Malesky result, the sample includes one pre-Phase 3 year, 2010, and three post-Phase 3 years, 2011 to 2013. The point estimates are somewhat sensitive to the choice of sample years, indicating again that the comparison groups relevant to these regressions often do not follow parallel trends.

3.4.3 Strategy 3: Product-level trade

To assess whether non-OECD exporters fill in for the OECD exporters moving out of corrupt destinations, I use product-level trade data. Product-level data offers more statistical power than the above country pair level analysis, and it allows pair-by-year fixed effects which control for country-level trends such as the OECD cooperation pattern. I run the tests below using exposure at the time of the Convention’s adoption, but I also tried them using Phase 3 implementation, and this also yields null results.

The first product-level regression tests whether the Convention leads to increases in non-OECD exports. Consider a market for exports of good $g$ to destination $d$. If this market includes no OECD exporters prior to the Convention, it should be unaffected by the Convention; if it includes large export flows from the OECD, it is more exposed to the Convention, and non-OECD exporters have larger potential gains from taking the place of the OECD countries. This effect should be strongest in corrupt markets. I thus estimate

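$$\mathit{Exports}^{o \to d}_{gt} = \beta_1 (\mathit{Post}_t \times \mathit{ExposureOECD}_{dg}) + \beta_2 (\mathit{Post}_t \times \mathit{ExposureOECD}_{dg} \times \mathit{Corruption}_d) + \delta_{odg} + \delta_{odt} + \delta_{gt} + \epsilon_{odgt}. \qquad (3.3)$$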

The treatment period for these non-OECD exporters, indicated by $\mathit{Post}_t$, begins in 1999, since this is when the Convention officially came into force and many of the signatories implemented foreign bribery laws. The key variable is $\mathit{ExposureOECD}_{dg}$, which measures OECD countries’ collective pre-Convention exports of good $g$ to destination country $d$. This export level is calculated as the total volume of 1997 imports from OECD countries, normalized by total imports in all goods to destination $d$, and by the number of non-OECD countries exporting this good to $d$ in 1997. The dependent variable $\mathit{Exports}^{o \to d}_{gt}$ is also normalized by total imports.⁸

⁷ Estimating effects on FDI flows would offer a more pertinent comparison with Jensen and Malesky (2018), although this is infeasible, since the FDI data only includes OECD source countries, within which there is no variation in the onset of Phase 3.

Table 3.7 presents the results of this test. The preferred specification in column 1 indicates that non-OECD exporters gain more in markets previously exposed to trade from the OECD. In particular, the average gain for these non-OECD countries is 3.5 percent of what it would be if they had completely captured the pre-Convention market of the OECD. This estimate is robust to several different variations, including the addition of destination-by-year fixed effects, and the inclusion of OECD destinations in the sample.
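The exposure measure and the interpretation of the coefficient can be illustrated with a small numerical sketch. The function names and all trade values below are hypothetical, chosen only to show the arithmetic; the coefficient of 0.035 is how the 3.5 percent figure above is read here, which is an interpretation rather than a number taken from Table 3.7.

```python
def exposure_oecd(oecd_imports_g, total_imports_d, n_non_oecd_exporters):
    """Exposure of good g in destination d, following the verbal definition:
    1997 imports of g from OECD origins, divided by total 1997 imports into d
    (all goods), divided by the number of non-OECD countries exporting g to d."""
    if total_imports_d <= 0 or n_non_oecd_exporters <= 0:
        raise ValueError("total imports and exporter count must be positive")
    return sum(oecd_imports_g) / total_imports_d / n_non_oecd_exporters


def implied_gain(beta_1, exposure):
    """Post-Convention change in one non-OECD exporter's normalized exports."""
    return beta_1 * exposure


# Hypothetical market: two OECD origins ship 30 and 10 units of good g,
# destination d imports 1000 units in total, and 4 non-OECD countries export g.
exposure = exposure_oecd([30.0, 10.0], 1000.0, 4)
print(exposure)                       # each non-OECD flow's share if the OECD vanished
print(implied_gain(0.035, exposure))  # gain under a coefficient of 0.035
```

Dividing by the number of non-OECD exporters is what lets a coefficient of one correspond to those exporters exactly filling the space left by the OECD.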

However, $\hat{\beta}_2$ is a precisely estimated zero, meaning the non-OECD gain is no larger in corrupt destinations. Column 2 adds interactions with origin country corruption. The negative coefficient on $\mathit{Post}_t \times \mathit{ExposureOECD}_{dg} \times \mathit{Corruption}_o$ indicates that less corrupt non-OECD countries pick up more of the slack. On the one hand, this result makes sense since these less corrupt countries are most similar to the low corruption OECD countries whose market they are capturing. On the other hand, we might have expected the more corrupt non-OECD countries to fill in, if the Convention actually forced out the OECD firms most engaged in bribery. This result and the null estimate of $\hat{\beta}_2$ raise the concern that non-OECD countries manage to pick up the OECD countries’ slack for reasons unrelated to corruption and the Convention.

In the appendix, I also conduct several variations on this test. First, as an alternate parameterization, I take log exports as the dependent variable, with the exposure measure being the ratio of OECD to non-OECD imports in good $g$ in 1997.⁹ This yields broadly similar results but with smaller magnitudes. I also estimate a country-pair version of (3.3), using $\mathit{ExposureOECD}_d$, total OECD imports across all products. This country-pair version cannot incorporate pair-by-year effects as in (3.3), but also yields similar results.

⁸ These normalizations entail that we can interpret $\beta_1$ as the fraction of the OECD’s pre-Convention export market which is captured by non-OECD countries at the time of the Convention. Formally, exposure is

$$\frac{\sum_{o \in \mathit{OECD}} \mathit{Trade}^{o \to d}_{1997,g}}{X_{1997,d}} \cdot \frac{1}{N^{\neg \mathit{OECD}}_{1997,dg}},$$

where $X_{1997,d}$ is total 1997 imports to destination $d$ and $N^{\neg \mathit{OECD}}_{1997,dg}$ is the number of non-OECD countries exporting $g$ there. In other words, this expression is the amount by which each non-OECD trade flow would increase if the OECD countries were to disappear from $d$ completely and the non-OECD countries exactly picked up the slack (holding fixed the number of non-OECD countries and the total trade flow into $d$).

⁹ In this parameterization, a coefficient of $\beta_1 = 1$ again means that non-OECD exporters expand by exactly the amount of previous OECD exports in that market.

The second product-level strategy tests whether exposure to the Convention decreases total imports in a product. I estimate

$$\ln(\mathit{Imports})_{dgt} = \beta_1 (\mathit{Post}_t \times \mathit{ExposureOECD}_{dg}) + \beta_2 (\mathit{Post}_t \times \mathit{ExposureOECD}_{dg} \times \mathit{Corruption}_d) + \beta_3 (\mathit{Post}_t \times \mathit{Corruption}_d) + \delta_{dg} + \delta_{gt} + \epsilon_{dgt}, \qquad (3.4)$$

where $\mathit{ExposureOECD}_{dg}$ is OECD imports in good $g$, as a fraction of total imports in $g$. A possible concern is that destination countries with many OECD-dominated import sectors differ from those with fewer, perhaps by being more technologically similar to the OECD, more politically aligned with it, or otherwise likely to experience different rates of growth in the post-Convention years. To help control for differential trends resulting from these destination-level differences, I also run this specification with destination-by-year fixed effects.

Table 3.8 suggests a negative coefficient on $\mathit{Post}_t \times \mathit{ExposureOECD}_{dg}$, and an even stronger effect for corrupt destinations. If we believe these estimates, they mean that for each additional 10 percentage points of pre-Convention exposure, an import market shrinks by 1.3 percent following the Convention. If the destination is one standard deviation more corrupt than average, its import market shrinks by an additional 3.2 percent. However, Figure 3-5 indicates a severe failure of parallel pre-trends: even in the years preceding the Convention, markets highly exposed to OECD imports were shrinking relative to less exposed markets. In fact, the Figure 3-5 point estimates for the post-Convention years are actually higher than those from the two years preceding the Convention, showing that the signs in Table 3.8 reverse if we change the sample years or control for differential linear pre-trends. So still there is no evidence for trade shifting in response to the Convention.

3.5 Conclusion

Each of the three empirical tests I present suggests the OECD Anti-Bribery Convention did not in fact shift trade and FDI away from corrupt destinations. If anything, trade seems to move toward corrupt destinations when we control for patterns of OECD cooperation or look to the results on Phase 3. In my product-level strategy, I find that corruption does not affect the ability of non-OECD exporters to pick up the slack, and that estimates of net import effects are confounded by pre-Convention trends.

A general lesson from this analysis is that even net of gravity equation predictions, trade and FDI patterns depend greatly on trends at the level of countries and country groups. I suggest a link between these trends and patterns of international political cooperation, and this accords with a tradition in political science, although the political determinants of these economic flows are still poorly understood. Whatever their origins, these trends confound both identification and statistical inference in tests using aggregate trade and FDI flows to measure the effects of initiatives such as the Convention. To overcome these problems, a promising avenue would be a firm-level analysis concentrated in a particular country implementing a tough foreign bribery law.

I do find suggestive evidence for trade- and FDI-shifting by countries actively enforcing their anti-bribery laws. This result suggests that although the OECD Convention was not as such an effective policy change, foreign bribery legislation can meaningfully affect trade and investment when it takes the form of the tough laws in countries such as the United States, Germany, Switzerland, and the UK after its 2010 Bribery Act. The importance of enforcement also provides an intuitive resolution to the dilemma presented above. Foreign bribery laws seem to work only when enforced, and because the laws passed following the OECD Convention were often poorly enforced, they did little to redirect foreign business.

3.6 Tables and Figures

Table 3.1: Countries signing OECD Anti-Bribery Convention

Country            Implementing law   OECD member
Argentina          1999               No
Australia          1999               Yes
Austria            1998               Yes
Belgium            1999               Yes
Brazil             2002               No
Bulgaria           1999               No
Canada             1999               Yes
Chile              2002               No
Czech Republic     1999               Yes
Denmark            2000               Yes
Estonia            2004               No
Finland            1999               Yes
France             2000               Yes
Germany            1999               Yes
Greece             1998               Yes
Hungary            1998               Yes
Iceland            1998               Yes
Ireland            2001               Yes
Italy              2000               Yes
Japan              1999               Yes
Korea              1999               Yes
Luxembourg         2001               Yes
Mexico             1999               Yes
Netherlands        2001               Yes
New Zealand        2001               Yes
Norway             1999               Yes
Poland             2001               Yes
Portugal           2001               Yes
Slovak Republic    1999               Yes
Slovenia           1999               No
Spain              2000               Yes
Sweden             1999               Yes
Switzerland        2000               Yes
Turkey             2003               Yes
United Kingdom     2002               Yes
United States      1977               Yes

Table 3.2: Difference-in-difference estimates of effects of OECD Anti-Bribery Convention on trade

(1) (2) (3)

Convention × Dest. corruption -0.0708*** -0.0634*** (0.0113) (0.0136)

Convention implemented 0.0288 0.0346* (0.0182) (0.0180)

Regional trade agreement 0.200*** 0.192*** 0.178*** (0.0237) (0.0236) (0.0280)

Common currency 0.0610** -0.0126 0.0194 (0.0286) (0.0285) (0.0431)

Exchange rate volatility -0.252*** -0.259*** -0.155 (0.0545) (0.0545) (0.193)

One in GATT/WTO 0.138** 0.133** -0.112*** (0.0648) (0.0648) (0.0331)

Both in GATT/WTO 0.375*** 0.378*** (0.0682) (0.0682)

Log origin GDP per capita 3.572*** 5.193*** (1.288) (1.380)

Log dest. GDP per capita 3.800*** 5.424*** (1.295) (1.388)

Log origin population 1.279*** 1.252*** (0.129) (0.130)

Log dest. population 0.673*** 0.716*** (0.109) (0.110)

Year × ln($GDP_o \times GDP_d$) -0.00159** -0.00240*** -2.289 (0.000647) (0.000693) (2.055)

Observations 173849 173849 173849
$R^2$ 0.909 0.909 0.918
Fixed Effects: (1) Pair, Year; (2) Pair, Year; (3) Pair, oXyr, dXyr

Notes: The dependent variable is the logged dollar value of total bilateral export flows. “Convention”, or $Conv_{ot}$, is an indicator of whether country $o$ has implemented an anti-bribery law in accordance with the Convention as of year $t$. Standard errors are adjusted for clustering by exporter-importer pair. * = $p < 0.10$, ** = $p < 0.05$, *** = $p < 0.01$.

Table 3.3: Triple-difference estimates controlling for indicators of international cooperation

(1) (2) (3) (4)

Convention × Dest. corruption -0.0199 -0.0125 0.0313* 0.0395 (0.0149) (0.0196) (0.0188) (0.0258)

Convention × Dest. in OECD 0.183*** 0.174*** 0.135*** 0.0946* (0.0326) (0.0467) (0.0356) (0.0513)

Convention × Dest. in EU 0.0897*** 0.0462 (0.0331) (0.0447)

Convention × RTA -0.0237 0.107*** (0.0260) (0.0323)

Convention × Common currency -0.335*** -0.224*** (0.0469) (0.0566)

Convention × Dest. log GDP/pop 0.0477*** 0.0421*** (0.0134) (0.0163)

Convention implemented 0.00332 -0.361*** (0.0194) (0.104)

Dest. in OECD -0.0294 -0.0223 (0.0547) (0.0549)

Dest. in EU 0.0347 0.0126 (0.0349) (0.0393)

Regional trade agreement 0.185*** 0.178*** 0.186*** 0.134*** (0.0237) (0.0279) (0.0278) (0.0315)

Common currency -0.0751*** -0.0220 0.195*** 0.119** (0.0283) (0.0430) (0.0427) (0.0552)

Log dest. GDP/pop 5.291*** 6.370*** (1.390) (1.470)

Observations 173849 173849 173849 173849
$R^2$ 0.909 0.918 0.909 0.918
Fixed Effects: (1) Pair, Year; (2) Pair, oXyr, dXyr; (3) Pair, Year; (4) Pair, oXyr, dXyr
Gravity Controls Yes Yes Yes Yes

Notes: The dependent variable is the logged dollar value of total bilateral export flows. “Convention”, or $Conv_{ot}$, is an indicator of whether country $o$ has implemented an anti-bribery law in accordance with the Convention as of year $t$. Standard errors are adjusted for clustering by exporter-importer pair. * = $p < 0.10$, ** = $p < 0.05$, *** = $p < 0.01$.

Table 3.4: Effects of OECD-ABC on FDI

(1) (2) (3) (4) (5)

Conv. × Dest. corruption -0.151*** -0.0358 0.00345 -0.00681 0.0207 (0.0519) (0.0462) (0.0824) (0.0572) (0.0971)

Conv. × Dest. in OECD 0.123 0.0714 (0.126) (0.202)

Conv. implemented 0.0866 0.0538 0.0180 (0.0637) (0.0789) (0.0876)

Corruption in dest. -0.970*** (0.0393)

Dest. in OECD 0.220 (0.381)

Dest. GDP PPP 1.09e-13* -2.95e-13** -6.22e-12 -2.90e-13** -6.34e-12 (5.60e-14) (1.36e-13) (4.18e-08) (1.36e-13) (4.18e-08)

Log dest. population 0.585*** -4.865*** -0.144 -4.484*** -0.158 (0.0284) (1.045) (8811.2) (1.090) (8821.0)

Log distance -0.629*** (0.0361)

Landlocked dest. -0.309*** (0.0845)

Island dest. 0.173* (0.0921)

Border 0.966*** (0.141)

Common language 0.204** (0.0993)

Common colonizer 2.677*** (0.415)

Colonial link 1.284*** (0.141)

Observations 4129 4129 4129 4129 4129
$R^2$ 0.557 0.873 0.910 0.873 0.910
Fixed Effects: (1) Origin; (2) Pair, Year; (3) Pair, oXyr, dXyr; (4) Pair, Year; (5) Pair, oXyr, dXyr

Notes: The dependent variable is the logged dollar value of total bilateral FDI flows. Due to constraints on data availability, the sample includes FDI flows only from OECD origin countries. “Convention”, or $Conv_{ot}$, is an indicator of whether country $o$ has implemented an anti-bribery law in accordance with the Convention as of year $t$. Standard errors are adjusted for clustering by exporter-importer pair. * = $p < 0.10$, ** = $p < 0.05$, *** = $p < 0.01$.

Table 3.5: Effects by enforcement of anti-bribery laws

                                    (1)                (2)                (3)
                                    ln(trade)          ln(trade)          ln(FDI)

Enforce × Convention × Corrupt_d    -0.117***          -0.123***          -0.316**
                                    (0.0416)           (0.0414)           (0.155)

Convention × Corrupt_d              0.00773            -0.0845**          0.152
                                    (0.0318)           (0.0363)           (0.174)

Observations                        116264             52691              4718
$R^2$                               0.899              0.926              0.897
Fixed Effects                       Pair, oXyr, dXyr   Pair, oXyr, dXyr   Pair, oXyr, dXyr
Origin Sample                       All countries      Signatories only   Signatories only
Dest. Sample                        Non-signatories    Non-signatories    Non-signatories

Notes: “Convention”, or $Conv_{ot}$, is an indicator of whether country $o$ has implemented an anti-bribery law in accordance with the Convention as of year $t$. “Corrupt” is the measure of corruption in the destination country, as described in the text. Standard errors are adjusted for clustering by exporter-importer pair. * = $p < 0.10$, ** = $p < 0.05$, *** = $p < 0.01$.

Table 3.6: Effects of Phase 3 increase in convention implementation

(1) (2) (3)

Phase 3 × Dest. corruption 0.0392** 0.0284 0.00988 (0.0185) (0.0185) (0.0335)

Post 2010 × Dest. corruption -0.0172 (0.0171)

Phase 3 0.00677 (0.0242)

Phase 3 × Dest. in OECD -0.0152 (0.0636)

Phase 3 × Dest. in EU -0.0203 (0.0569)

Phase 3 × RTA -0.0581 (0.0518)

Phase 3 × Common Currency 0.0370 (0.143)

Phase 3 × Dest. log GDP/capita -0.00317 (0.0247)

Dest. in OECD .

Dest. in EU 0.172 (0.161)

Regional trade agreement 0.0814 0.165** 0.183** (0.0578) (0.0680) (0.0725)

Common currency

Log dest. GDP per capita 10.04 (6.150)

Observations 53804 53804 53804
$R^2$ 0.951 0.954 0.954
Fixed Effects: (1) Pair, Year; (2) Pair, oXyr, dXyr; (3) Pair, oXyr, dXyr
Gravity Controls Yes Yes Yes

Notes: The dependent variable is the logged dollar value of total bilateral export flows. “Phase 3”, or $Phase3_{ot}$, is an indicator of whether country $o$ is a Convention signatory and Phase 3 has commenced (i.e., the year is 2011 or later). It could equivalently be written as $SignedConvention_o \times Post2010_t$. Standard errors are adjusted for clustering by exporter-importer pair. * = $p < 0.10$, ** = $p < 0.05$, *** = $p < 0.01$.

Table 3.7: Non-OECD exporters picking up slack

(1) (2) ln(Exports) ln(Exports)

Post × Exposure 0.0350*** 0.0392*** (0.00496) (0.00517)

Post × Exposure × Corrupt_d 0.00181 0.00411 (0.00612) (0.00635)

Post × Exposure × Corrupt_o -0.0261*** (0.00523)

Post × Exposure × Corrupt_d × Corrupt_o -0.00293 (0.00703)

Observations 1788389 1146148
$R^2$ 0.763 0.805
Gravity controls Yes Yes
Dest. Sample Non-signatories Non-signatories

Notes: The dependent variable is the logged dollar value of total bilateral export flows. $Post_t$ is an indicator that the year is 1999 or later. Exposure is a product-destination-level measure of 1997 pre-Convention exposure to imports from Convention countries. These specifications include fixed effects for pair-by-product, pair-by-year, and product-by-year. Standard errors are adjusted for clustering by exporter-importer pair. * = $p < 0.10$, ** = $p < 0.05$, *** = $p < 0.01$.

Table 3.8: Effects of OECD exposure on imports

(1) (2) ln(Imports) ln(Imports)

Post × Exposure × Corrupt_d -0.315*** -0.125*** (0.0862) (0.0323)

Post × Exposure -0.130* (0.0742)

Post × Corrupt_d 0.369*** (0.0770)

Observations 455017 473940
$R^2$ 0.901 0.912
Gravity Controls Yes Yes
Dest. Sample Non-signatories Non-signatories

Notes: The dependent variable is the logged dollar value of total bilateral export flows. $Post_t$ is an indicator that the year is 1999 or later. Exposure is a product-destination-level measure of 1997 pre-Convention exposure to imports from Convention countries. These specifications include fixed effects for destination-by-product, destination-by-year, and product-by-year. Standard errors are adjusted for clustering by destination country. * = $p < 0.10$, ** = $p < 0.05$, *** = $p < 0.01$.

Figure 3-1: Event study checking for pre-trends in exports to corrupt destinations

Figure 3-2: Trends in total exports from OECD countries

Figure 3-3: Trends in OECD exports, residualized

Notes: These plots are constructed from the residuals from

$$\ln(Exports)^{o \to d}_{t} = X_{odt}' \gamma_1 + \delta_{od} + \delta_{ot} + \delta_{dt} + \epsilon_{odt}.$$

Signatory-to-OECD trade: $E_n[\hat{\epsilon}_{odt} \mid OECD_o = OECD_d = 1, t = \tau]$
All other country pairs: $E_n[\hat{\epsilon}_{odt} \mid OECD_o \cdot OECD_d = 0, t = \tau]$
for $\tau \in \{1992, \ldots, 2013\}$.
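The two series in the residualized plot are year-by-year averages of the fitted residuals within each group of country pairs. The following is a minimal sketch of that averaging step only, assuming the residuals have already been estimated and stored as records with the hypothetical fields `year`, `oecd_o`, `oecd_d`, and `resid`. The toy values are invented for illustration and are not thesis data.

```python
from collections import defaultdict


def mean_residuals_by_group(records):
    """Average residuals by year, separately for pairs where both
    countries are in the OECD and for all other pairs."""
    sums = defaultdict(lambda: [0.0, 0])  # (group, year) -> [sum, count]
    for r in records:
        both = r["oecd_o"] == 1 and r["oecd_d"] == 1
        group = "both_oecd" if both else "other"
        cell = sums[(group, r["year"])]
        cell[0] += r["resid"]
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Toy residuals, invented purely for illustration.
toy = [
    {"year": 1992, "oecd_o": 1, "oecd_d": 1, "resid": 0.2},
    {"year": 1992, "oecd_o": 1, "oecd_d": 1, "resid": 0.4},
    {"year": 1992, "oecd_o": 1, "oecd_d": 0, "resid": -0.1},
    {"year": 1992, "oecd_o": 0, "oecd_d": 0, "resid": -0.3},
]
means = mean_residuals_by_group(toy)
print(means)
```

Each key in `means` is a (group, year) pair, so plotting the two groups against the year would give the two lines described in the notes.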

Figure 3-4: Patterns in UN voting alignment among OECD members

Figure 3-5: Pre-trends in estimates of import effects
