Essays on Financial, Transportation and Savings Investment Technologies

by Tite Yokossi

Submitted to the Department of Economics in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Economics at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

February 2017

© Tite Yokossi, MMXVII. All rights reserved.

The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created.

Author: Signature redacted. Department of Economics, January 19, 2017

Certified by: Signature redacted. Esther Duflo, Abdul Latif Jameel Professor of Poverty Alleviation and Development Economics, Thesis Supervisor

Certified by: Signature redacted. Benjamin Olken, Professor of Economics, Thesis Supervisor

Accepted by: Signature redacted. Ricardo Caballero, Ford International Professor of Economics, Chairman, Departmental Committee on Graduate Studies

Essays on Financial, Transportation and Savings Investment Technologies

by Tite Yokossi

Submitted to the Department of Economics on January 19, 2017, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Economics

Abstract

This thesis investigates the impact and adoption of three types of technologies: mobile money, a leading financial technology in Kenya; the colonial railway, an important transportation technology in Nigeria; and two prevalent savings investment technologies for the provision of retirement income: inter-generational transfers (pay-as-you-go systems) and capital markets investments. Access to mobile money services is shown to have a significant impact on economic activity. Areas with access to mobile money services grow faster, especially when they are initially richer, urban, and connected to roads and to banks. The heterogeneity of the short- and long-run effects of railroads on individual and local development in Nigeria is found to be substantial. Unlike in areas further away from the coast, the railway had no impact in areas that had access to ports of export, and those areas barely adopted the railway as it did not reduce their shipping costs. The cross-country heterogeneity in the adoption of savings investment technologies is shown to be accounted for by rational, welfare-maximizing decisions based on distinct underlying economic characteristics.

Thesis Supervisor: Esther Duflo Title: Abdul Latif Jameel Professor of Poverty Alleviation and Development Economics

Thesis Supervisor: Benjamin Olken Title: Professor of Economics

Acknowledgments

I am deeply grateful to Esther Duflo, Benjamin Olken and Tavneet Suri for their guidance, patience and insightful advice during the course of this doctoral program.

Chapter 2 has greatly benefited from insightful comments of and discussions with Abhijit Banerjee, Stacy Carlson, Esther Duflo, John Firth, Gabriel Kreindler, Ernest Liu, Matthew Lowe, Benjamin Marx, Benjamin Olken, Tavneet Suri, and participants of the MIT Development Lunch. Special thanks go to the Financial Sector Deepening Kenya and its partners for generously sharing the Mobile Money network expansion data and to Tavneet Suri for helping me access it.

Chapter 3 was written jointly with Dozie Okoye¹ and Roland Pongou.² I am grateful to them for their energy, insights and contributions, which made this project possible. My co-authors and I would like to acknowledge insightful remarks and suggestions from David Atkin, Abhijit Banerjee, Esther Duflo, Jason Garred, Talan Iscan, Benjamin Olken, Lars Osberg, and Frank Schilbach, which greatly improved the chapter. We are also grateful for valuable comments from participants at the Development Economics Lunch at MIT, SIER conference at the African School of Economics, the 2016 Canadian Development Study Group Meetings, the Macro and Development Group Lunch at , University of Western Ontario's 50th Anniversary conference, and St. Francis Xavier University Economics Seminar. We are grateful to Remi Jedwab and Alexander Moradi for making their dataset on city growth in Africa publicly available and accessible.

The text in Chapter 4, published in October 2016 in the European Economic Review under the title "A rational, economic model of paygo tax rates",³ was written with Georges de Menil,⁴ Fabrice Murtin⁵ and Eytan Sheshinski.⁶ I am deeply indebted to them for their insights, initiatives and contributions, without which this project would not have been completed. I would like to give heartfelt thanks to Georges de Menil for his intellectual and moral support throughout this doctoral program. My co-authors and I are grateful for the suggestions of Roel Beetsma, Gabrielle Demange, Peter Diamond, Richard Disney, Fritz Kubler, participants in the Netspar 10th International Workshop of Pension, Insurance and Saving and the members of research seminars at the Paris School of Economics and the Stern School, New York University. The chapter has also benefited from the extensive comments of an anonymous referee and the Editor of the European Economic Review. The opinions expressed in this final chapter do not necessarily reflect the official views of the OECD.

¹ Dalhousie University
² University of Ottawa
³ DOI: http://dx.doi.org/10.1016/j.euroecorev.2016.06.002

My doctoral program has been funded in great part thanks to research assistance for Simon Johnson. I would like to thank Professor Johnson not only for the funding but also for the pleasant and intellectually rewarding experience I had working with him.

I would like to acknowledge my colleagues at MIT, from whom I learned a lot during my years in the program. Discussions with them have provided me with insights, clarifications, ideas and solutions that have been essential to the completion of this dissertation. Special thanks to Stacy Carlson, John Firth, Gabriel Kreindler, Matthew Lowe, Yuhei Miyauchi and Benjamin Roth with whom I interacted most.

I am indebted to my parents Esther and Tcharo Yokossi and to my partner Anne Mai Wassermann who supported me, encouraged me and bore with me through the ups and downs of this doctoral program. I am also grateful to my brother Gaius, sister Kyria, and to my extended family and friends for their support.

⁴ Paris School of Economics (EHESS)
⁵ OECD
⁶ Hebrew University of Jerusalem and Brown University

Thank you, everyone.

Contents

1 Introduction
  1.1 Mobile Money and Economic Activity: the Impact of a Financial Technology
  1.2 Colonial Railroads in Nigeria: the Heterogeneous Impact of a Transportation Technology
  1.3 Pay-As-You-Go vs. Capital Markets: A Rational Model of the Adoption of Savings Investment Technologies

2 Mobile Money and Economic Activity: the Impact of a Financial Technology
  2.1 Introduction
  2.2 Background on Mobile Money in Kenya
  2.3 Data
    2.3.1 Lights
    2.3.2 The M-PESA Agent Network Expansion
    2.3.3 Other factors
  2.4 Empirical Strategy
    2.4.1 Basic Specification
    2.4.2 Identification
  2.5 Results
    2.5.1 Main Results
    2.5.2 Extensive and Intensive Margins
    2.5.3 Robustness Checks
  2.6 Heterogeneity and Mechanisms
    2.6.1 Heterogeneity
    2.6.2 Channels
  2.7 Conclusion
  Appendices
  2.A Figures and Tables
  2.B Additional Tables

3 Colonial Railroads in Nigeria: the Heterogeneous Impact of a Transportation Technology
  3.1 Introduction
  3.2 Historical Background
    3.2.1 Alternative Transportation Modes
    3.2.2 Railway Construction
    3.2.3 Growth of Export Agriculture Following the Railway Construction
  3.3 Data and Empirical Strategy
    3.3.1 Data
    3.3.2 Identification Strategies
  3.4 Average Effect of the Railway: Countrywide Estimates
    3.4.1 State Fixed Effects Results
    3.4.2 Instrumental Variable Estimates
    3.4.3 Identification Checks Results
    3.4.4 Additional Identification Checks Results
    3.4.5 Urbanization Outcomes Results
  3.5 North-South Differences in the Impact of the Rail Line
    3.5.1 Estimated Impact of the Railway in the North and in the South
    3.5.2 Differential Impact in the North and South: Robustness Checks
  3.6 Dynamics and Persistence
  3.7 Mechanisms
    3.7.1 Adoption Rates and Benefit of Rail by Key Regional Crops
    3.7.2 Key Factor of Heterogeneity: Distance to Ports of Export
  3.8 Concluding Remarks
  Appendices
  3.A Figures and Tables
  3.B Additional Tables

4 Pay-As-You-Go vs. Capital Markets: A Rational Model of the Adoption of Savings Investment Technologies
  4.1 The Diversity of Effective Paygo Rates
  4.2 The Model
  4.3 The Diversity of National Characteristics
    4.3.1 Data
    4.3.2 Annual Dynamics
  4.4 Life-time income dynamics
    4.4.1 Average wage income
    4.4.2 Average investment income
    4.4.3 Expected utility
  4.5 Results
    4.5.1 Testing the Model
    4.5.2 Decomposition of Explained Variance
  4.6 Further Results
    4.6.1 The integration of financial markets
    4.6.2 Global Crises
  4.7 Conclusion
  Appendices
  4.A Constructing Estimates of Effective Paygo Tax Rates in 2002
  4.B Simulation Algorithm
  4.C Estimations Using Data on Average Annual Earnings
  4.D The Implications of 2008
  4.E Tables and Figures
  4.F Additional Tables

List of Figures

2.A.1 Nighttime Light Density and Population Density
2.A.2 Nighttime Light Density and Household Wealth - Kenya DHS Clusters
2.A.3 Mobile Money Agent Network Expansion
2.A.4 Event Study Graphs

3.A.1 Rail Lines Across Clusters and Local Areas
3.A.2 Straight Lines Between Nodes
3.A.3 Rail Lines, Ports and Placebo Lines
3.A.4 Rail Lines, Roads and Placebo Lines
3.A.5 Railway Adoption for Northern and Southern Main Exports

4.E.1 Linear Model
4.E.2 Poisson Model

List of Tables

2.A.1 Summary Statistics Lights and Access to M-PESA Agents
2.A.2 Summary Statistics Baseline Variables
2.A.3 Main Difference-in-Differences Estimates
2.A.4 Extensive Margin Results
2.A.5 Intensive Margin Results
2.A.6 Areas Under Baseline Grid
2.A.7 Robustness to M-PESA Agent Access Distance Cutoff
2.A.8 Interactions with Distances to Bank and Roads
2.A.9 Interactions with Distances to Cities
2.A.10 Interactions with Distances to Power Transmission Lines and Cell Towers
2.A.11 Interactions with Aggregate Access and Population Density
2.B.1 Event Study Coefficients Estimates

3.A.1 History of Railway Construction in Nigeria
3.A.2 Observables in Areas Within and Outside 20 km of Railway Tracks
3.A.3 Effect of Proximity to Railway on Contemporary Outcomes (State Fixed Effects)
3.A.4 First-Stage Estimates based on Distance to Line Joining Major Nodes
3.A.5 Effect of Proximity to Railway on Contemporary Outcomes (2SLS)
3.A.6 Falsification Exercises: Other Transportation Means and Placebo Lines
3.A.7 Robustness Checks: Various Control Groups
3.A.8 Long-Run Effects of Railway on Urbanization Outcomes
3.A.9 Effect of Proximity to Railway in North and South of Nigeria
3.A.10 Differential Impact by Cohort (Non-Migrants Only)
3.A.11 Short-Run Effects of the Railway on Urbanization Outcomes
3.A.12 Benefits of Shipping by Rail For Key Regional Crops
3.A.13 Effect of Proximity to Railway By Distance to Coastal Port
3.A.14 Effect of Railway By Proximity to Early Cities
3.B.1 Effect of Railway: Robustness to other Measures of Connectedness
3.B.2 Conley Standard Errors
3.B.3 Effect of Railway: Robustness to Various Sub-samples
3.B.4 Effect of Railway: Robustness to Various Sub-samples
3.B.5 Falsification Exercise: Placebo Lines Estimates in North and South
3.B.6 Placebo Lines as Control Group in North and in South
3.B.7 Robustness of No Effect in South Excluding Crude Oil Producers

4.E.1 Effective Paygo Tax Rates
4.E.2 Unit Root Test Applied to Wage Innovations (five lags)
4.E.3 Annual Statistics
4.E.4 Effective and Predicted Paygo Tax Rates
4.E.5 Capital Return Effects
4.E.6 Labor Return Effects
4.E.7 Integration of Financial Markets (RRA=12)
4.F.1 Predicted Rational Paygo and Saving Rates (in percent) for different RRA
4.F.2 Predicted Paygo Tax Rates: Comparison of GDP and Earnings Based Estimates
4.F.3 Predicted Paygo Tax Rates with and without Accounting for the Risk of a Crisis (RRA=12)

Chapter 1

Introduction

This thesis explores the impact and adoption of three types of technologies: mobile money, a leading financial technology in Kenya; the colonial railway, an important transportation technology in Nigeria; and two prevalent savings investment technologies for the provision of retirement income: inter-generational transfers (pay-as-you-go systems) and capital markets investments.

1.1 Mobile Money and Economic Activity: the Impact of a Financial Technology

Chapter 2 investigates the impact on economic activity of M-PESA, one of the most successful mobile phone-based financial innovations in the world. Combining data from the expansion of the mobile money agent network and light density data derived from nighttime satellite imagery, I estimate the impact on local economic performance of the mobile money innovation in Kenya in the six years that followed its launch. The strategies I pursue exploit the variation in local areas that all got access to mobile money services but at different times, as well as the high resolution of the data, which allows for the inclusion of hyper-local fixed effects. My results indicate that areas with access to mobile money services grow faster, especially if they are initially richer, urban, and connected to roads and banks. Access to mobile money services appears to be a complement to, rather than a substitute for, other alternatives that enable people to connect, trade and allocate investments within their networks.

1.2 Colonial Railroads in Nigeria: the Heterogeneous Impact of a Transportation Technology

Chapter 3, written with Dozie Okoye (Dalhousie University) and Roland Pongou (University of Ottawa), uncovers substantial heterogeneity in the long-run effects of colonial railroads in Nigeria. We show that the railway has large, long-lasting impacts on individual and local development in the North, but virtually no impact in the South, either in the short run or in the long run. This heterogeneous impact of the railway can be accounted for by the distance to ports of export. We highlight the fact that the railway had no impact in areas that had access to ports of export, thanks to their proximity to the coast and to their use of waterways, and that those areas barely adopted the railway as it did not reduce their shipping costs. Our analyses rule out the possibility that the heterogeneous impacts are driven by cohort effects, the presence of major roads, early cities, missionary activity, or crude oil production.

1.3 Pay-As-You-Go vs. Capital Markets: A Rational Model of the Adoption of Savings Investment Technologies

Chapter 4, written with Georges de Menil (Paris School of Economics), Fabrice Murtin (OECD), and Eytan Sheshinski (Hebrew University of Jerusalem and Brown University), explores the heterogeneity in the adoption of savings investment technologies for the provision of retirement income. Two important engines chosen by societies for savings investment are inter-generational transfers (pay-as-you-go systems) and capital markets investments. In this chapter, we argue that the effective weight assigned to pay-as-you-go retirement systems (inter-generational transfers) relative to capital markets is determined by a representative agent and a benevolent government jointly maximizing the expected life-time utility of the agent. We leverage distributions of labor and capital income calculated from national data on real GDP, real wages and the real return to capital since 1950. With uniform risk aversion, predicted rational rates explain 83% of the variance of observed pay-as-you-go rates. The globalization of capital markets would lead to convergence of pay-as-you-go rates. Our results are immune to crises like that of 2008.

Chapter 2

Mobile Money and Economic Activity: the Impact of a Financial Technology

2.1 Introduction

There is a long-standing debate about the role and importance of financial innovations in . On one end of the spectrum, famous economists such as Robinson (1952) and Lucas (1988) claim that finance follows and merely responds to changing demands from the real sector. On the other end of the spectrum, Miller (1998) finds the idea that financial markets contribute to economic growth "too obvious for serious discussion". Levine (2005) reviews the empirical literature on finance and growth and concludes that the bulk of the existing research suggests that countries with better banks and financial markets grow faster and that the relationship is not entirely due to a reverse causality of economic growth on finance. However, to assess the impact of finance on economic performance, the existing literature has focused on the development of banks and securities markets, both institutions hardly accessible to the majority of people in developing countries. In this article, I exploit the mobile money revolution in Kenya to show that a financial innovation generated outside the sphere of classic financial institutions can induce economic growth.

Mobile money is a recent financial innovation in developing economies that enables individuals to transfer money through Short Messaging Service (SMS). Kenya's M-PESA is one of the first and arguably the most successful example of a mobile money service to date. M-PESA was adopted by about 70% of Kenya's adult population in just four years after its launch in 2007.¹ Essential to the fast adoption of M-PESA was the growth of a network of agents, small business outlets that provide cash-in and cash-out services. Households go to agents who exchange cash for an electronic balance (e-money) that can be sent by SMS from one mobile money account to another. In 2012, there were over 49,000 mobile money agents in Kenya,² a staggering number given that the country had about a thousand bank branches.³ Thanks to this broad network of agents, many unbanked people have access to M-PESA services and use them to send and receive money, save, invest, and pay bills.

¹ Source: SIM card registration data from the telecommunications firm Safaricom

Using a large household survey in Kenya, Jack et al. (2013) and Jack and Suri (2014) show that access to M-PESA has increased risk-sharing and the ability of households to cope with shocks. They point to the ease of sending remittances, stemming from lower transaction costs, as the main mechanism behind the improved risk-sharing. In the same vein, Suri and Jack (2016) find that access to mobile money services has reduced the number of households in extreme poverty. In this article, I ask: do the new possibilities generated by M-PESA translate into a tangible increase in local economic activity and, if so, whom do they benefit most?

Following recent papers measuring economic growth in Africa, I use light density derived from nighttime satellite imagery as a proxy for local economic activity. Exploiting difference-in-differences specifications, I investigate the impact on local economic performance of access to mobile money services, measured by proximity to mobile money agents. I divide Kenya based on a grid that mirrors the underlying structure of nighttime light density images. A pixel in light digital archives represents a geographic area slightly under 1 km² in Kenya.

² Communications Commission of Kenya, Annual Report for the Financial Year 2013-2014.
³ Kenya National Bureau of Statistics, Economic Survey 2012.

For each pixel, I define access to M-PESA services based on the ever-decreasing distance between pixel cells and the agent network. Year 0 of access is defined as the year in which a pixel cell first comes within 5 km (walking distance) of its closest agent, if it ever does. In order to maximize the likelihood that I am comparing similar pixel cells, I restrict the sample to the set of pixels for which a year +1 and a year +2 of access can be defined. I refer to this sample as the Reference sample, and it is the basis for most of the results in this article. In other words, the Reference sample is the sample of cells that have had access to M-PESA services for at least two years after their year 0.
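The definition of year 0 and of the Reference sample can be illustrated with a minimal sketch. It is not the thesis's code: the function names, the toy distances and the choice of 2013 as the last year of data (taken from the 2000-2013 study period mentioned later in the chapter) are assumptions made here for illustration only.

```python
from typing import Dict, Optional

ACCESS_CUTOFF_KM = 5.0  # access cutoff stated in the text
LAST_DATA_YEAR = 2013   # assumed last year of the panel (study period 2000-2013)


def year_zero(dist_km_by_year: Dict[int, float],
              cutoff_km: float = ACCESS_CUTOFF_KM) -> Optional[int]:
    """Return the first year in which the closest agent is within the cutoff.

    Returns None if the pixel never comes within the cutoff distance.
    """
    for year in sorted(dist_km_by_year):
        if dist_km_by_year[year] < cutoff_km:
            return year
    return None


def in_reference_sample(dist_km_by_year: Dict[int, float],
                        last_year: int = LAST_DATA_YEAR,
                        cutoff_km: float = ACCESS_CUTOFF_KM) -> bool:
    """A pixel is kept only if year +1 and year +2 of access are observed."""
    y0 = year_zero(dist_km_by_year, cutoff_km)
    return y0 is not None and y0 + 2 <= last_year


# Hypothetical pixels: distance (km) to the closest agent in each year.
pixels = {
    "early": {2007: 12.0, 2008: 6.5, 2009: 4.2, 2010: 3.0},
    "late": {2011: 9.0, 2012: 4.0, 2013: 2.0},
    "never": {2007: 30.0, 2013: 8.0},
}

for name, dist in pixels.items():
    print(name, year_zero(dist), in_reference_sample(dist))
```

Because the agent network only expands over the period, the distance to the closest agent does not increase over time, so the first crossing of the cutoff is well defined. In this toy example the "late" pixel gets access in 2012 and is dropped, since its year +2 would fall after the assumed end of the data.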

To identify the effect that access to mobile money services (proximity to agents) has on economic activity proxied by light density, I need to remove any existing endogeneity in the spatial distribution of agents and its variation over time. To understand the spatial distribution of mobile money agents, one has to look at both sides of the expansion of the M-PESA agent network: the supply side with Safaricom, the company that launched M-PESA, offering lucrative commissions to would-be agents and as a result, the demand side with agent applicants filing an extremely large number of applications. I deal with concerns that Safaricom could have rolled out M-PESA agents to richer, urban and densely populated areas by including pixel cell fixed effects.

Concerns that Safaricom could have approved agent applications based on the most promising areas are alleviated because there is circumstantial evidence that Safaricom did not know the exact location of its agent applicants or even of its existing agents (definitely not beyond knowing their district or city of origin). Nonetheless, I include sublocation⁴-year fixed effects, which effectively remove any concern that knowledge of the future trajectory of a city or district could have biased the results. On the demand side of the agent expansion, a threat to identification comes from the possibility that areas that were experiencing a local boom had more agent applicants filing applications and as a result got more agents. Including sublocation-year fixed effects effectively deals with this potential source of bias. My results are immune to scenarios in which an area (large or as small as a sublocation) suddenly takes a different trajectory or receives a shock at the same time that some of its pixel cells get access to M-PESA.

⁴ A sublocation in Kenya has an average area of 9 km × 9 km. It is the lowest administrative level in the country and comes after provinces, districts, divisions and locations.

Despite pixel, year and sublocation-year fixed effects, it is possible that agent applicants knew a lot more than Safaricom about their local economic geography and could file applications when they sensed that a particular area within a sublocation was growing. A first thing to note is that agent applicants could not decide the timing of the opening of their mobile money agent business because of Safaricom's unpredictable and time-consuming approval process. Nevertheless, I investigate pre-trends by looking at the estimates of the impact of access to mobile money services each year relative to year 0. The absence of pre-trends in the years leading to year 0 of access lends credibility to the causal interpretation of the results.

My findings suggest that after areas get access to M-PESA, they grow faster. A back-of-the-envelope calculation indicates that the M-PESA effect is comparable to 7.1% of the increase in average lights during the study "treatment" period. Based on rough assumptions, this translates into an additional growth rate in GDP of 0.5% a year for areas that have access to mobile money services.

I decompose the estimates into the extensive and intensive margins of the light variable and show that the results are driven by areas that are already lit at baseline, rather than by areas that become lit as a result of their access to mobile money services. The results are not driven by an electrification drive, because restricting the sample to areas that were under the grid before M-PESA leaves the findings virtually unchanged. They are robust to changes in the cutoff distance that defines access to mobile money services.
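One way to write down a specification consistent with the fixed-effects design described above is sketched here. The notation is an assumption introduced for illustration and is not taken from the thesis, whose basic specification is presented in Section 2.4.1 and may differ: $y_{ist}$ is light density in pixel $i$ of sublocation $s$ in year $t$, $\mathrm{Access}_{ist}$ equals one once the pixel is within 5 km of an agent, $\alpha_i$ is a pixel fixed effect, $\gamma_{st}$ is a sublocation-year fixed effect (which absorbs common year effects), and $t^{0}_{i}$ is the pixel's year 0 of access. Omitting $k = -1$ as the reference period in the second equation is likewise an assumption.

```latex
\begin{align}
y_{ist} &= \beta \, \mathrm{Access}_{ist} + \alpha_i + \gamma_{st} + \varepsilon_{ist}, \\
y_{ist} &= \sum_{k \neq -1} \beta_k \, \mathbf{1}\{\, t - t^{0}_{i} = k \,\} + \alpha_i + \gamma_{st} + \varepsilon_{ist}.
\end{align}
```

Under this reading, the difference-in-differences estimate is $\beta$ in the first equation, and the pre-trend check amounts to verifying that the event-study coefficients $\beta_k$ for $k < 0$ in the second equation are close to zero.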

Importantly, the effect of M-PESA is stronger in areas that are more connected to banks, roads, cell towers and electricity transmission lines at baseline. The same goes for areas that are closer to towns or cities. This last result is not driven by proximity to the capital, Nairobi, or to the second largest city, Mombasa. Access to mobile money services appears to be a complement to, rather than a substitute for, other alternatives that enable people to connect, trade and allocate investments within their networks.

Finally and unsurprisingly, the local effect of access to M-PESA is stronger when the aggregate access to mobile money services increases. This is consistent with the essence of M-PESA: connecting people for money transfers.

Related Literature. The literature on mobile banking in general and on mobile money in particular is sparse because the phenomenon is recent. Jack et al. (2013) and Jack and Suri (2014) find that the mobile money revolution in Kenya helps households share risks and smooth out shocks. Suri and Jack (2016) show that access to mobile money has reduced the number of households in extreme poverty.

Aker et al. (2011) and Muralidharan et al. (2014) investigate the impact of delivering cash transfers to beneficiaries of a public program through mobile phones or biometric smartcards. Descriptive studies (Jack and Suri, 2011; Higgins et al., 2012; Mas, 2009; Mas and Morawczynski, 2009; Morawczynski and Pickens, 2009; Plyler et al., 2010) analyze the adoption and use of mobile money services in developing countries and particularly in Kenya.

The findings in this article suggest that mobile money, a recent and uniquely successful financial innovation, spurs economic growth. Since Goldsmith's (1969) path-breaking study of finance and growth, a substantial body of empirical work has assessed the impact of financial system operations on economic activity and determined whether banks and stock markets play an important role in promoting growth at different stages of economic development. The early literature has focused on cross-country studies of finance and growth, trying different measures of financial development, and adding more and more controls and stock market variables to the analysis (King and Levine, 1993; Levine and Zervos, 1998). Later on, instrumental variables have been used to go beyond showing that financial development predicts economic growth and to address the issue of causality. Measures of legal origins have been highly exploited to produce financial-development-impacting variables that can plausibly be treated as exogenous (La Porta et al., 1998; Levine et al., 2000). A significant issue with these papers is that they assume that legal origin may affect GDP growth only through financial development variables and the covariates specified in their regressions.

Other papers have used panel datasets and included country fixed effects to avoid biases stemming from cross-country regressions (Levine et al., 2000; Beck et al., 2000; Rousseau and Wachtel, 2000). Case studies in the United States (Jayaratne and Strahan, 1996), in Italy (Guiso et al., 2004), in China (Allen et al., 2005), or in France (Bertrand et al., 2007) have shown that laws or regulations that improve the functioning of financial intermediaries can have a positive effect on growth. Industry-level and firm-level data have been used to explore the different mechanisms through which finance can affect growth, be it by lowering the wedge between external and internal finance, by relaxing constraints to firm investment, or by enhancing the private benefits of controlling firms (Rajan and Zingales, 1998; Demirgüç-Kunt and Maksimovic, 1998; Dyck and Zingales, 2004).

By looking at a specific financial innovation, this article addresses the following question: can adding a financial instrument engender economic growth? This circumvents the complicated and error-prone exercise of measuring financial development in an effort to causally link it to economic growth. The other novelty is that, unlike the literature that tries to measure the impact of finance on growth, I focus on a financial innovation occurring outside the realm of classic financial institutions, banks and securities markets, which are out of reach for large swaths of populations in developing countries.

Finally, I use light density data to measure economic growth, as does a recent strand of the economic literature. Light density at night has been shown to reflect economic activity at the national level (Henderson et al., 2012; Chen and Nordhaus, 2011) but also at subnational levels (Michalopoulos and Papaioannou, 2013; Storeygard, 2016). Using DHS data, I show that light density at night indeed reflects changes in household wealth in Kenya. One of the main advantages of this data is that nighttime light density derived from satellite imagery provides a highly local measure of economic performance that can be aggregated at any administrative level at a constant time frequency in low-income countries where even province-level GDP data is usually unavailable. In a country like Kenya, there is hardly any other measure of local economic activity.

A number of economists have used the light density data as a proxy for economic activity in empirical work. Storeygard (2016) explores the effects of transport costs on the economic growth of cities in Africa; Bleakley and Lin (2012) look at path dependence in the location of economic activity in the US, while Baum-Snow et al. (2012) measure the effect of transportation infrastructure on in China. Michalopoulos and Papaioannou (2013) and Pinkovskiy (2013) explore the impact of pre-colonial ethnic institutions and of national institutions on economic development.

The rest of the article proceeds as follows. In the following section, I describe the context of M-PESA use and adoption in Kenya. Then, I discuss the construction of the dataset and provide summary statistics in Section 2.3. Section 2.4 explains the empirical strategy as well as the robustness checks I conduct. Section 2.5 presents the results. Section 2.6 shows how the results vary based on pre-M-PESA characteristics and discusses the potential underlying mechanisms. The final section concludes.

2.2 Background on Mobile Money in Kenya

In March 2007, Safaricom, the largest mobile network operator in Kenya, launched a mobile phone-based payment and money transfer service, called M-PESA. 5 The service allows users to exchange cash for electronic money (e-money) and deposit electronic balances into a mobile money account stored in their cellphones. They can also send electronic balances to other users including sellers of goods and services via Short Messaging Service (SMS) and convert e-money back to cash. Charges apply when a transfer of electronic money is made or when cash is withdrawn.6 Since 2007, M-PESA has spread quickly in Kenya and is now the most successful mobile phone-based financial service in the world. The average number of daily registrations attained 10,000 in December 2007 and even increased in the subsequent years. In just about four years, M-PESA had reached the 14 million accounts mark and about 70 percent of the adult population in Kenya.

Essential to the rapid adoption of M-PESA in Kenya is its network of agents, small business outlets that provide cash-in and cash-out services to users. Agents typically run small businesses such as airtime distribution stores, cellphone retail shops, grocery stores and gas stations, and receive commissions for both M-PESA deposits and withdrawals. M-PESA agents purchase e-money balances from Safaricom or from customers and hold them on their own mobile phones. They also maintain cash on their premises to fulfill any withdrawal requested by users. The number of M-PESA agents has grown from 4,000 in early 2008 to 33,000 in 2010 and to over 88,000 in 2013.7 Compared to this growth in the agent network, much smaller changes in standard cash-in and cash-out service points (bank branches, ATM network) occurred over the same time period.

5"Pesa" means money in Kiswahili, hence M-PESA for M[obile]-Money.
6These charges are deducted directly from users' mobile money accounts.

This network of agents, maintained and operated by M-PESA management, is crucial to facilitating access to mobile money services. The closer agents are to households, the easier it is for customers to purchase e-balances or redeem cash from e-money sent to them. The tremendous expansion of the agent network brings cash-in and cash-out services within walking distance of a large fraction of households.

Virtually all M-PESA users use the service to send remittances to other users. 75% use it to buy airtime, 42% to save and about 25% to pay bills, wages and services such as taxi rides. By 2013, M-PESA users had made a total of $6.82 billion worth of deposits and transferred from one to another a cumulative amount of $6.36 billion!

Since the inception of M-PESA, there have been other mobile money networks, but M-PESA was the first and most successful network in Kenya and by far the biggest in the study period of this article (2000-2013). Consequently, it is the focus of this study.8

Users pay no fee for depositing funds. Charges apply for transfers and cash withdrawals, and they vary depending on the amount of the transaction.9 Despite these fees, M-PESA is a financial innovation that dominates its money-transfer predecessors on virtually all dimensions. Prior to the advent of mobile money services, the majority of households used to send money through friends or bus drivers. Users find M-PESA faster, cheaper, safer and more reliable than alternatives.10 The innovation also reached a large population of unbanked people, providing them with an accessible technology to save, pay, transfer and receive money.

7Communications Commission of Kenya, Annual Report for the Financial Year 2013-2014.
8The terms M-PESA and mobile money are used interchangeably throughout the article.
9The details on the fee schedule are available at: http://www.safaricom.co.ke/personal/m-pesa/tariffs.

2.3 Data

I combine two main sources of data to explore the effects of mobile money on economic growth in Kenya. I focus on light density at night as a proxy for economic activity and I use the expansion of the mobile money agent network to measure access to M-PESA services.

2.3.1 Lights

Ideally, to explore changes in local economic activity, I would use time-varying subnational measures of GDP, income, wealth or consumption. Unfortunately, such measures do not exist in Kenya, even at the district or provincial level.11 I use instead the light density at night as measured by satellites of the Defense Meteorological Satellite Program's Operational Linescan System (DMSP-OLS).12 The light data is available every year from 1992 to 2013 at a very high resolution. Each global digital image recorded by DMSP satellites is made of pixels, each of which represents a cell of 30 × 30 arcseconds, an area of about 0.9 km² in Kenya. Attached to each pixel is a value that represents the annual average stable intensity of lights emitted at night by the corresponding area. The data is already processed to remove interference from clouds, forest fires, aurorae and other factors that are unrelated to normal human economic activity. In some years, two satellites recorded global nighttime light densities. As a result, 34 satellite-years are available for the 22-year period. In years where more than one digital image is available, pixel values are averaged in order to produce an annual dataset. The study period runs from 2000 to 2013, including roughly the same number of years before and after the launch of M-PESA in 2007.

10See Jack and Suri (2011).
11Up to the last national census in Kenya, which took place in 2009, the country was divided into 8 provinces, then 70 districts, then 506 divisions, then 2456 locations and then 6631 sublocations. These are the official subdivisions that will be used throughout the article.
12The DMSP data is collected by the US Air Force Weather Agency. The image and data processing come from NOAA's National Geophysical Data Center.
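As a rough illustration of the two data facts above (the cell size and the averaging of satellite-years), the following Python sketch approximates the area of a 30 × 30 arcsecond cell and averages same-year images pixel by pixel. It assumes a spherical Earth with a 6371 km radius and represents each image as a plain list of pixel values; the function names and the data layout are hypothetical and are not taken from the DMSP-OLS processing chain.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius


def pixel_area_km2(lat_deg, cell_arcsec=30.0):
    """Approximate area of a cell_arcsec x cell_arcsec cell centred at lat_deg."""
    side_rad = math.radians(cell_arcsec / 3600.0)
    north_south_km = EARTH_RADIUS_KM * side_rad
    # East-west extent shrinks with the cosine of latitude.
    east_west_km = north_south_km * math.cos(math.radians(lat_deg))
    return north_south_km * east_west_km


def annual_composite(images_by_year):
    """Average pixel values across the images available in each year.

    images_by_year maps a year to a list of images; each image is a list of
    pixel values stored in the same pixel order (hypothetical layout).
    """
    composite = {}
    for year, images in images_by_year.items():
        n_images = len(images)
        composite[year] = [sum(values) / n_images for values in zip(*images)]
    return composite
```

Near the equator, where Kenya lies, `pixel_area_km2(0.0)` gives roughly 0.86 km², in line with the "about 0.9 km²" figure quoted in the text.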

A map of lit and unlit pixels is displayed in Figure 2.A.1. As the figure shows, most of Kenya has no visible light at night, but most of the country is also sparsely populated. Unsurprisingly, night lights are visible in the areas where most of Kenya's population and economic activity are concentrated, as both sides of the figure show.

The analysis in this article is done at the pixel level, with a grid that replicates that of the high-resolution light density images in order to extract all the information they contain. Kenya is covered by more than 600 thousand pixels. Each pixel cell is matched to Kenya's official administrative subdivisions. On average, a district is made of 90 thousand pixels and a sublocation (the 5th and lowest administrative level within Kenya) comprises 101 pixels.

Lights as a Proxy for Economic Activity

Many recent papers have established that light density at night correlates well with economic performance at the country level (Chen and Nordhaus, 2011; Henderson et al., 2012; Michalopoulos and Papaioannou, 2013). Since I use a highly local measure of lights in this article, it is important to show that light density at night is a good proxy for economic activity at subnational levels. Few papers provide evidence of this. Using a specification based on long differences (1990/1992-2015), Storeygard (2016) shows that in China, where city- and prefecture-level GDP data exist, there is a significantly positive elasticity of GDP with respect to light density.

Michalopoulos and Papaioannou (2013) provide a cross-validation of light density and subnational economic performance in African countries. Using micro-level data from the Demographic and Health Surveys, they show that nighttime light density has a strong positive correlation with household wealth and electrification in four large African countries representative of different parts of the continent: Nigeria (West Africa), Tanzania (East Africa), Zimbabwe (Southern Africa) and the Democratic Republic of Congo (Central Africa).

Following their exercise, I look at how light density correlates with household wealth and electrification in Kenya. I use the Composite Wealth Index,13 a measure of households' cumulative living standard based on observables such as asset ownership (radio, TV, bicycles, etc.), materials used in housing construction, water access, and sanitation. Averaging the wealth index over households in each DHS enumeration area and comparing it to the average luminosity of pixels within 7 or 10 km of the centroid of the enumeration area results in a strongly positive correlation (75% and 76%, respectively). A linear fit between the average household wealth index of DHS clusters in Kenya and the average light density within a given radius of DHS clusters, along with the corresponding scatterplots, is plotted in Figure 2.A.2. The radius around DHS cluster centroids is 7 km on the left and 10 km on the right. Both graphs are produced by combining data from the two rounds of DHS surveys in Kenya for which relevant nighttime light data and GIS coordinates of DHS cluster centroids are available (2003 and 2008-09). Both graphs depict a strong relationship between light density and household wealth. Although this cross-validation is mostly cross-sectional and uses aggregates of pixel values, while the results in this article are based on variations of pixel values over time, the strong correlation between light density and the household wealth index in Kenyan DHS clusters is reassuring. The same goes for electrification: the correlations of the average light density of pixels within 7 km and 10 km of DHS clusters with the household electricity dummy are 72% and 73%, respectively. This is a sanity check that light densities capture for the most part electric lights generated by households.14

Min (2008) establishes that in areas with densities under 4 ppl/sqkm, lights are not a suitable proxy for economic activity because the data is very noisy. Cogneau and Dupraz (2014) highlight the fact that including these areas might lead to spurious correlations and regressions in cross-sectional settings. Consequently, I exclude all areas with densities below 4 ppl/sqkm in 1999. More generally, throughout the analysis, pixel fixed effects will be included to remove any influence from fixed15 differences such as densely populated areas vs. sparsely populated areas.

I also filter out sublocations16 that have no pixel ever lit in the 2000-2013 period, which spans the pre- and post-access periods of the analysis. Those areas are mostly sparsely populated, with little economic activity. Lights are unlikely to capture changes in economic activity that might be happening there. As a result, these areas are not suitable for the analysis in this article.

13See http://www.dhsprogram.com/topics/wealth-index/Wealth-Index-Construction.cfm for more details on the construction of this index by DHS country teams. The DHS data is downloaded from http://www.dhsprogram.com/Data/, accessed in November 2016.
14These correlations are strong despite the displacement of enumeration area centroids by up to 2 km for urban areas and up to 5 km for rural areas, with a further randomly selected 1% (every 100th) of rural clusters displaced by up to 10 km. This noise is deliberately added to the precise geo-coordinates of DHS cluster centroids for privacy reasons.
15Over the relevant "treatment period", which runs from 2007 to 2013.
16Sublocations are the 5th and lowest administrative level in Kenya. According to the official subdivisions in effect up to the last national census in Kenya (2009), they come after provinces, districts, divisions and locations. Sublocations comprise on average 101 pixels.
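The cross-validation described above can be sketched in a few lines of Python: average the light values of pixels within a given radius of each DHS cluster centroid, then correlate those averages with the cluster-level wealth index. This is only an illustration under stated assumptions: distances use the haversine formula on a spherical Earth, clusters and pixels are plain tuples, and all function names are hypothetical. It is not the procedure actually run for Figure 2.A.2.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def mean_light_within(centroid, pixels, radius_km):
    """Mean light of pixels (lat, lon, light) within radius_km of centroid (lat, lon)."""
    values = [light for lat, lon, light in pixels
              if haversine_km(centroid[0], centroid[1], lat, lon) <= radius_km]
    return sum(values) / len(values) if values else None


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def wealth_light_correlation(clusters, pixels, radius_km):
    """clusters: list of (lat, lon, mean wealth index of the cluster)."""
    wealth, light = [], []
    for lat, lon, mean_wealth in clusters:
        mean_light = mean_light_within((lat, lon), pixels, radius_km)
        if mean_light is not None:  # skip clusters with no pixel in range
            wealth.append(mean_wealth)
            light.append(mean_light)
    return pearson(wealth, light)
```

Calling `wealth_light_correlation(clusters, pixels, 7.0)` and then with `10.0` would mirror the two radii used in the text.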

Finally, I remove from the analysis pixel cells that never get access17 to M-PESA, that is, cells that never get within walking distance of a mobile money agent, during the study period. I look at the sample of areas that got access to M-PESA at some point and use variation in the timing of their access to the financial innovation to understand its impact on economic activity.

After applying these filters, which are meant to remove areas that might differ along dimensions not captured by the controls I use, I am left with 51 thousand pixel cells. As shown in the top panel of Table 2.A.1, the average light density in this sample was 1.47 in 2007 and 3.22 in 2013. In this article, I evaluate how much of this important time variation in lights is due to access to mobile money services.
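The three sample filters described in this subsection can be summarized by the sketch below. The record layout (a dictionary per pixel with a sublocation identifier, the 1999 population density of its sublocation, yearly light values and yearly distances to the closest agent) is a hypothetical one chosen for illustration, and treating "within 5 km" as a weak inequality is an assumption.

```python
def filter_sample(pixels, min_density=4.0, cutoff_km=5.0):
    """Return the ids of pixels that survive the three filters.

    Each pixel is a dict with hypothetical keys:
      id, sublocation, density_1999 (ppl/sqkm),
      lights (yearly values, 2000-2013),
      agent_distance_km (yearly distance to the closest agent).
    """
    # Sublocations with at least one pixel lit at some point in the study period.
    lit_sublocations = {p["sublocation"] for p in pixels
                        if any(value > 0 for value in p["lights"])}
    kept = []
    for p in pixels:
        if p["density_1999"] < min_density:
            continue  # lights are too noisy in very sparsely populated areas
        if p["sublocation"] not in lit_sublocations:
            continue  # no pixel of the sublocation is ever lit
        if min(p["agent_distance_km"]) > cutoff_km:
            continue  # never within walking distance of an agent
        kept.append(p["id"])
    return kept
```

Note that the second filter operates on sublocations, so an unlit pixel is retained when another pixel of its sublocation is lit at some point.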

2.3.2 The M-PESA Agent Network Expansion

I use unique data on financial service provider locations in Kenya collected by Financial Sector Deepening (FSD) Kenya and its partners (Bill and Melinda Gates Foundation and Central Bank of Kenya and FSD Kenya, 2016). I map the M-PESA agent network in the country each year during the 2007-2013 period, from the inception of M-PESA to the latest year of available light data.

The number of mobile money agents in Kenya has expanded rapidly. As of 2013, there were more than 88,000 mobile money agents spread out across the country. This is close to three times the agent count in 2010 and twenty times the agent count in early 2008. Since consumers convert cash to M-PESA e-money and vice versa through agents, who provide cash-in and cash-out services, agent proximity to a given household is a good measure of the ease of access to M-PESA services.18 For each pixel cell, I define access to M-PESA services as having an M-PESA agent within a 5 km distance.

17Access to M-PESA will be defined in more detail in Section 2.3.2.

As shown in Figure 2.A.3, the density of agents increased considerably between 2007 and 2013. However, most districts that were covered in 2013 were already covered in 2007. What changed over time was not new large areas getting access to mobile money services, but rather, within each district, more areas getting within walking distance of an agent. Indeed, the agent network was quickly rolled out to cover most populated areas, and the expansion of access to M-PESA services has come mostly from an increase in agent density rather than from an expansion to new broad areas.

Unsurprisingly, different pixel cells within a sublocation got access to mobile money services at different times, and this variation will be key in the analysis that follows. Using ArcGIS tools, I compute, in each year, the distance of each pixel cell to its closest agent. These distances go down over time as agent density dramatically increases. I record the first year in which this distance falls below walking distance (5 km) as year 0 of access to M-PESA. As shown in Table 2.A.1, the average distance to M-PESA agents in the sample of cells with eventual access drops from 13.3 km in 2007 to 2.4 km in 2013.
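The construction of year 0 can be illustrated with the following Python sketch, which stands in for the ArcGIS computation described above. It assumes haversine distances on a spherical Earth, agents and pixels given as (latitude, longitude) pairs, and a weak inequality at the cutoff; the function names are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_agent_km(pixel, agents):
    """Distance from a pixel (lat, lon) to its closest agent; inf if no agent exists."""
    return min((haversine_km(pixel[0], pixel[1], lat, lon) for lat, lon in agents),
               default=math.inf)


def first_access_year(distance_by_year, cutoff_km=5.0):
    """Year 0: first year in which the closest agent is within cutoff_km.

    distance_by_year maps a calendar year to the distance to the closest agent.
    Returns None for a pixel that never gets access in the period covered.
    """
    for year in sorted(distance_by_year):
        if distance_by_year[year] <= cutoff_km:
            return year
    return None
```

Because the cutoff is a parameter, the same function can be reused with other distances, such as the alternative cutoffs examined in the robustness checks.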

2.3.3 Other factors

The data on administrative divisions that shapes many of the fixed effects used in this analysis comes from the 2009 master shapefile of the Kenya National Bureau of Statistics. Attached to it are population counts and densities at the sublocation level. From the data on financial service provider locations, I have baseline data on bank branch locations at the launch of M-PESA in 2007. I also collect publicly available GIS information on baseline locations of major roads, primary and secondary roads from Hijmans et al. (2016), cell towers,19 major electricity transmission lines,20 and city locations from Henderson et al. (2016).

Using ArcGIS tools, I compute the distances from pixel cells to these infrastructures. As shown in Table 2.A.2, there is a lot of variation in baseline access to the various infrastructures. About half of the sample is under the baseline grid. The average sublocation density averaged over pixels is close to 400 ppl/sqkm.

18See Jack and Suri (2014) for household-level evidence of the positive relationship between agent density and use of M-PESA.

2.4 Empirical Strategy

I test whether the use of mobile money services significantly increases economic performance as proxied by light density at night. I enrich the analysis by looking at how that impact varies based on other factors such as access to roads or bank branches. In this section, I describe my empirical approach and identification assumptions, as well as the robustness checks I conduct.

2.4.1 Basic Specification

The unit of analysis is the pixel cell i and the time frequency t is annual. The key explanatory variable is access to mobile money services, which is defined by the year within which a given pixel cell has its closest M-PESA agent within walking distance (5 km).21 Different pixels get access to mobile money at different times, but time zero (t = 0) is uniformly defined as the year within which they get access to the service for the first time. Year t = 0 corresponds to 2007 for the cells that got access first and to 2013 for the cells that got access last in the study period.

19From OpenCell ID: http://opencellid.org. Accessed in November 2016.
20From the African Infrastructure Country Diagnostic database (AICD; 2009): http://capacity4dev.ec.europa.eu/afretep/minisite/processed-gis-data. Accessed in November 2016.
21Robustness to this cutoff has been tested and the results are presented in Table 2.A.7.

The main specification used to explore the effects of access to mobile money services on economic activity takes the following form:

$$L_{it} = \alpha_i + \gamma_t + \beta\,\mathrm{Access}_{it} + \theta X_{it} + \epsilon_{it} \qquad (2.1)$$

$L_{it}$ is the light density at night of pixel i in year t. $\alpha_i$ and $\gamma_t$ represent, respectively, pixel fixed effects and year fixed effects in the 2000-2013 period. $\mathrm{Access}_{it}$ is defined as a dummy that takes the value 1 for years after year 0. $X_{it}$ indicates time-varying factors that can be added as controls, such as district-year fixed effects or sublocation-year fixed effects.

The main coefficient of interest is $\beta$. It indicates the average residual difference in light density between pixels that have got access to M-PESA and those that have not yet, after accounting for pixel fixed effects, year fixed effects and other time-varying controls. Throughout the analysis, standard errors are conservatively clustered at the district level (3 administrative levels above sublocations) to account for spatial autocorrelation.
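To make the role of the pixel and year fixed effects in specification (2.1) concrete, the sketch below estimates the access coefficient on a balanced toy panel by double demeaning (the two-way within transformation), which is numerically equivalent to including pixel and year dummies in that case. It is a simplified illustration rather than the estimation code behind the tables: it omits the additional controls and the district-level clustering of standard errors, the simulated panel is noise-free, and all names are hypothetical.

```python
def double_demean(values, pixels, years):
    """Two-way within transformation for a balanced panel keyed by (pixel, year)."""
    grand = sum(values.values()) / len(values)
    pixel_mean = {i: sum(values[(i, t)] for t in years) / len(years) for i in pixels}
    year_mean = {t: sum(values[(i, t)] for i in pixels) / len(pixels) for t in years}
    return {(i, t): values[(i, t)] - pixel_mean[i] - year_mean[t] + grand
            for i in pixels for t in years}


def twfe_beta(light, access):
    """OLS slope of demeaned light on demeaned access (no extra controls)."""
    pixels = sorted({i for i, _ in light})
    years = sorted({t for _, t in light})
    y = double_demean(light, pixels, years)
    d = double_demean(access, pixels, years)
    return sum(d[k] * y[k] for k in y) / sum(d[k] ** 2 for k in y)


def simulate_panel(year0_by_pixel, beta, years=range(2000, 2014)):
    """Noise-free toy panel: light = pixel effect + year effect + beta * access."""
    light, access = {}, {}
    for i, year0 in year0_by_pixel.items():
        for t in years:
            # Literal reading of the text: the dummy is 1 for years after year 0.
            access[(i, t)] = 1.0 if t > year0 else 0.0
            light[(i, t)] = 0.5 * i + 0.1 * (t - 2000) + beta * access[(i, t)]
    return light, access
```

For example, `twfe_beta(*simulate_panel({0: 2007, 1: 2008, 2: 2009, 3: 2010}, beta=0.139))` recovers the coefficient used to build the toy data, since the pixel and year effects are removed exactly by the transformation.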

Next, I discuss the threats to identification related to (2.1) and the solutions I propose and exercises I carry out to assuage any related concern.

2.4.2 Identification

Equation (2.1) identifies the causal impact of M-PESA on light density if the access variable is independent of the error term $\epsilon_{it}$ conditional on pixel fixed effects, time fixed effects and time-varying controls. This identification assumption is automatically satisfied if the spatial distribution of mobile money agents is exogenous or independent of any other time-varying factor that systematically affects nighttime light density.

To understand the spatial distribution of mobile money agents, one has to look at both sides of the rollout of M-PESA agents: the supply side, with Safaricom offering lucrative commissions to would-be agents, and, as a result, the demand side, with agent applicants filing an extremely large number of applications. All applications to become agents had to be screened at Safaricom central offices, and this time-consuming approval process led to the rationing of agents.22

On the supply side, it is possible that Safaricom rolled out M-PESA agents to richer, urban and densely populated areas, and we could be merely identifying differences between these areas and other areas. These are fixed area characteristics, at least in the short treatment period of this study. Consequently, including pixel fixed effects appropriately deals with any such concerns.

Another concern is that Safaricom could have approved agent applications based on the most promising areas. In other words, Safaricom might have known which areas were about to grow faster and might have approved more agents in those locations. In that case, my identification strategy would pick up growth that is unrelated to access to mobile money services. Fortunately, this is not a plausible concern in this particular setting. There is circumstantial evidence that Safaricom did not know the exact location of their agent applicants or even existing agents, beyond knowing their district or city of origin. As a consequence, it is not credible to imagine that they could have known which particular area within a district or city was going to grow faster in the following year or two. Nonetheless, I include sublocation-year fixed effects, which effectively remove any concern that knowledge of the future trajectory of a city or district could have biased the results.23

22Safaricom had to verify the identity of agents and make sure that they had purchased an initial quantity of electronic money and that they had a bank account and physical stores.

Another source of concern has to do with the fact that Safaricom installed a few pilot agents just before the launch of M-PESA. These agents, which account for less than 0.5% of the sample of agents, are removed from the analysis.

On the demand side of the agent rollout, a threat to identification could be that areas that were experiencing a local boom had more agent applicants filing applications and, as a result, got more agents. I include sublocation-year fixed effects to remove this potential source of bias. My results are immune to a scenario in which an area (large or as small as a sublocation) is suddenly on a different trajectory or receives a shock at the same time that some of its pixel cells get access to M-PESA.

It is important to note that including pixel and sublocation-year fixed effects rules out the possibility that results are driven by population (density) effects. First, as the treatment period is short (maximum 6 years and for most cells less than 3), it is hard to imagine big population changes that hinge on the treatment. If differences in population density are considered fixed over the treatment period, pixel fixed effects prevent population differences from impacting my results. Besides, even if population is expected to change differentially in the treatment period, sublocation-year fixed effects make the results immune to such changes because they capture any changes in sublocation populations.24

Despite all of this, it is possible that agent applicants knew a lot more than Safaricom about their local economic geography and could file more applications when they sensed that a particular area was growing within a sublocation. A first thing to note is that agent applicants could not decide the timing of their opening a mobile money agent business because of the unpredictable and time-consuming approval process of Safaricom. That said, the best way to assess the validity of this threat to identification is to explore pre-trends.

23A sublocation in Kenya has an average area of 9 km × 9 km. It is the lowest administrative level in Kenya.
24The best available population data in Kenya is at the sublocation level, the lowest administrative subdivision in Kenya in the relevant study period.

Pre-trends

Different pixels got access to mobile money agents at different times, but I can look at the year before the uniformly defined year 0, that is, the year before the year within which a given pixel first got within walking distance (5 km) of its closest M-PESA agent. If agent applicants could indeed target growing areas within each sublocation and send more applications from there, one would expect to see, in the years leading to access to M-PESA, a pre-trend indicating growth before access to M-PESA, especially given that agent applicants had to wait for Safaricom's lengthy approval process.

To explore pre-trends, I modify (2.1) by breaking down the coefficient of interest into year-specific coefficients as shown below:

$$L_{it} = \alpha_i + \gamma_t + \sum_{t} \beta_t\,\mathrm{YearAccess}_{it} + \theta X_{it} + \epsilon_{it} \qquad (2.2)$$

In equation (2.2), instead of a dummy for access, the key variables $\mathrm{YearAccess}_{it}$ represent years since year 0. They are positive post-access and negative pre-access. Pre-trends are detected if the estimates of the coefficients $\beta_t$ are statistically significant in the years leading to access to the service and of comparable magnitude to those in the years following access. It would mean that the "treatment group", which here is not a fixed group, is already experiencing a differential increase in light density prior to the advent of M-PESA.
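The year-specific variables in specification (2.2) can be built as indicators of the number of years elapsed since year 0, as in the sketch below. The function names and the choice of window are hypothetical. In practice, at least one relative year is typically left out as the reference category to avoid collinearity with the fixed effects; which one is omitted here is not stated in this section, so the sketch leaves that choice to the caller.

```python
def relative_year(year, year0):
    """Years since year 0 of access: negative pre-access, positive post-access."""
    return year - year0


def year_access_dummies(year, year0, rel_years):
    """Indicator for each relative year in rel_years, for one pixel-year.

    Returns a dict {k: 1 or 0}, where the entry for k is 1 when the
    observation lies exactly k years from year 0 of access.
    """
    rel = relative_year(year, year0)
    return {k: 1 if rel == k else 0 for k in rel_years}
```

For a pixel with year 0 in 2007, `year_access_dummies(2005, 2007, range(-3, 4))` switches on only the indicator for relative year -2, the kind of pre-access term whose coefficient is inspected for pre-trends.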

In order to make sure that I am comparing pixel cells that got access to M-PESA to other cells that could have gotten access to it at the same time but got it a year or two later, I further restrict the sample to cells that had access to M-PESA for at least 2 years after year 0. I call this sample the Reference sample and it is the basis for most results in this article.

Running specification (2.2) yields the results presented in Figure 2.A.4.25 The 95% confidence intervals are plotted for each year's coefficient in the figure. The difference between the right and the left panels is that the right one is the result of a specification that includes sublocation-year fixed effects. As one can see in both parts of the figure, most coefficients on years leading to access to M-PESA are not significantly different from 0, and they are much smaller than the coefficients related to the years following access to mobile money services, which are positive and statistically significant. This is reassuring and validates the assumption that areas that just got access to M-PESA are not on a different dynamic than areas that will get access to the same services in a year or more. In other words, this exercise validates the key hypothesis that the "treatment" and "control" groups are on parallel trends in the pre-access period.

To summarize, the sources of identification in this analysis are fine-grained fixed effects (pixel fixed effects, year fixed effects and sublocation-year fixed effects) and the fact that different pixels are "treated" at different times even within a sublocation, which assuages concerns about another intervention happening systematically at the same time.

An additional piece of evidence on the validity of the agent rollout as a basis for identifying the mobile money effect comes from Jack and Suri (2014). Using detailed household data, they show that the rollout of agents is uncorrelated with observables including wealth, cell phone ownership, literacy, education, and use of bank accounts and other financial instruments.

25The corresponding coefficient estimates are presented in Table 2.B.1.

Additional Identification and Robustness Checks

I decompose the results into an extensive and an intensive margin of the light variable by looking at differences in baseline lights (zero vs. positive baseline lights). I measure baseline lights as the average lights between year -5 and year -1 of access to M-PESA.

I also look at the possibility that the results might be driven by an electrification drive by restricting the sample to areas that were under the grid before M-PESA, that is, within 10 km of major baseline power transmission lines.

I explore robustness to changes in the cutoff distance that defines access to mobile money services. I then interact the access to M-PESA variable with the log of baseline distances to banks, roads, cell towers, electricity transmission lines, and cities. Finally, I interact the main variable of interest with the log of population density as well as with the national aggregate access to M-PESA, defined as the proportion of the sample that has access to mobile money services.

2.5 Results

2.5.1 Main Results

Table 2.A.3 presents the estimates of the main specification, equation (2.1), on two samples. The first sample, which I will refer to as the Got-Access sample, is the subset of pixels whose distance to the closest M-PESA agent fell under the 5 km cutoff at some point during the study period. The second sample is the subset of the Got-Access sample that had access to mobile money services for at least 2 years. This is what I refer to as the Reference sample. It is the main sample used throughout the analysis, as it pushes further the logic of looking at cells that got access for a sufficiently long period and only exploiting the differences in the timing of their initial access to evaluate the impact of mobile money services. To be precise, since year 0 is defined as the year within which a given cell was within walking distance of an agent, a pixel cell belongs to the Reference sample if a year of access +1 and a year of access +2 can be defined for that cell during the study period.
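The membership rule for the two samples can be written down directly. The sketch below assumes that year 0 has already been computed for each pixel (with `None` standing for a pixel that never gets access) and that the study period ends in 2013; the function names are hypothetical.

```python
STUDY_END = 2013  # last year of available light data


def in_got_access_sample(year0):
    """Got-Access sample: the pixel gets within the cutoff at some point."""
    return year0 is not None


def in_reference_sample(year0, min_years_after=2, last_year=STUDY_END):
    """Reference sample: years 0+1 and 0+2 both fall within the study period."""
    return year0 is not None and year0 + min_years_after <= last_year
```

Under this rule, a pixel whose year 0 is 2011 belongs to the Reference sample, whereas one whose year 0 is 2012 or 2013 belongs only to the Got-Access sample.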

Columns 1 and 2 of Table 2.A.3 pertain to the Got-Access sample and columns 3 and 4 to the Reference sample. In all columns, year fixed effects and pixel fixed effects are included. Columns 2 and 4 show the results when sublocation-year fixed effects are added to the regressions. All estimates of the coefficient of interest, the coefficient on the access variable, are significant at the 1% level. When sublocation-year fixed effects are included, the estimates in columns 1 and 3 are halved. This indicates that sublocations are on different dynamics. It is possible that the sublocation-year variation is not endogenous and should rightfully be attributed to the impact of mobile money services. The fact that there are no pre-trends when no sublocation-year fixed effects are included (only pixel and year fixed effects) suggests that the sublocation-year variation over and above the pixel and year fixed effects might be due to the differential timing in access to mobile money services. These differences in estimates might indicate positive spillovers of the access to mobile money services to other cells within a sublocation. Nevertheless, my focus will be on zeroing in on the most plausibly exogenous source of variation, the result of which is presented in column 4. After including sublocation-year fixed effects and restricting the sample to areas that had access to M-PESA for at least 2 years after the year within which they got access, the estimate of the coefficient of interest is 0.139.

It is not straightforward to establish how big the magnitude of the result is. A back-of-the-envelope calculation based on the difference in means of the light variable between 2007 and 2013 implies that the effect is comparable to 7.1% of that increase in lights. If we make the crude assumption that the increase in the mean of lights follows the increase in GDP per capita over the same period, then areas with access to mobile money services grew faster than other areas by 0.5% a year as a result of their access to mobile money services.

2.5.2 Extensive and Intensive Margins

I decompose the result into an extensive and an intensive margin in order to understand whether the result is coming mostly from pixels that were already lit or from pixels that were never lit. For the extensive margin, in Table 2.A.4, I focus on the subset of the Reference sample for which the average pixel value over the 5 years prior to year 0 of access is zero (columns 3-6). The estimates of the coefficients in columns 3 and 4, which pertain to the pixel value variable, are smaller than the equivalent estimates for the whole Reference sample (columns 1 and 2). When sublocation-year fixed effects are added in column 4, the estimate 0.032 is not significantly different from zero. This suggests that the mobile money effect uncovered in this article is due more to the intensive margin than to the extensive margin. This is confirmed by columns 5 and 6 of Table 2.A.4, which show the effect on the lit dummy to be small and not significant when sublocation-year fixed effects are added.

Another piece of evidence on the importance of the intensive margin in the mobile money effect is presented in Table 2.A.5. The coefficients in columns 3-4 of that table correspond to the subset of the Reference sample for which the average pixel value over the 5 years leading to year 0 of access to a mobile money agent is strictly positive. The magnitude of the estimates in column 4 is striking: when sublocation-year fixed effects are included, the estimate is more than twice the corresponding Reference sample estimate (column 2).
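The split between the two subsamples rests on baseline lights, the average pixel value over the 5 years before year 0. A minimal sketch of that classification follows; the labels and function names are hypothetical, and the lights of a pixel are assumed to be stored in a dictionary keyed by calendar year.

```python
def baseline_light(lights_by_year, year0, first_lag=5, last_lag=1):
    """Average light between year -first_lag and year -last_lag of access."""
    years = [year0 - lag for lag in range(last_lag, first_lag + 1)]
    return sum(lights_by_year[y] for y in years) / len(years)


def margin_subsample(lights_by_year, year0):
    """'extensive' if baseline lights are zero, 'intensive' if strictly positive."""
    return "extensive" if baseline_light(lights_by_year, year0) == 0 else "intensive"
```

A pixel that was dark in each of the 5 years before access falls in the zero-baseline subsample used for the extensive margin, while any positive pre-access value places it in the positive-baseline subsample used for the intensive margin.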

2.5.3 Robustness Checks

Electrification

A potential identification concern is that the results might be picking up an electrification drive in the country. This is unlikely to be a real threat to identification because sublocations are not big divisions and it is likely that when cells in a sublocation get access to electricity, the entire sublocation does. Therefore, including sublocation-year fixed effects assuages such electrification-related concerns.

A more direct check on the immunity of my results to electrification changes is to look at areas that are close to major power transmission lines before the launch of M-PESA in 2007. These are areas that were already under grid before M-PESA.

Restricting the sample to these locations leads to the results presented in columns 3 and 4 of Table 2.A.6. Comparing columns 2 and 4, which both include sublocation-year fixed effects, the point estimates of 0.139 and 0.142 are almost identical and statistically significant (at the 1% and 5% levels, respectively, as reported in Table 2.A.6). That the result does not vanish, and is in fact virtually unchanged, when restricting to areas that were under grid before the advent of M-PESA assuages any concern that the mobile money effect is driven by electrification.

Mobile Money Access Cutoff

Throughout the analysis, I have used a 5 km distance to characterize the ease of access to mobile money agents. This appears to be a natural walking distance beyond which it is arguably cumbersome for mobile money customers to reach agents. It sits in the middle range of the distances I could use since sublocation-year fixed effects are included and a sublocation covers on average a 9 km x 9 km area. I look at variations to this 5 km cutoff and present the corresponding results in Table 2.A.7. Unsurprisingly, the results are bigger with lower distance cutoffs of 2 km and 3 km (see columns 1-4), and the number of pixels used to compute these results is lower since many pixels never get that close to agents in the study period. The results in these 4 columns are statistically significant at the 1% level as well. Using 7 km as the cutoff leads to a statistically significant but smaller estimate when only pixel and year fixed effects are included (column 7 of Table 2.A.7), but when sublocation-year fixed effects are included, the estimate becomes insignificant. This is consistent with the fact that such a distance cutoff leads to circles around agents that are close in size to sublocation areas and, as a consequence, to very little variation left after including sublocation-year fixed effects. Overall, the estimates presented in Table 2.A.7 confirm the robustness of the mobile money effect results to the access cutoff parameter.
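How the access variable changes with the cutoff can be sketched as below. The function and the distance series are hypothetical; the two rules they encode follow the table notes: year 0 is the first year in which the distance to the closest agent falls below the cutoff, and the access dummy equals 1 for all years after year 0.

```python
def access_dummy(dist_by_year, cutoff_km=5.0):
    """Return (year0, {year: 0/1}) for one pixel.

    year0 is the first year in which the distance to the closest agent
    falls below `cutoff_km` (None if it never does). The dummy is 1 only
    for years strictly after year0.
    """
    years = sorted(dist_by_year)
    year0 = next((y for y in years if dist_by_year[y] < cutoff_km), None)
    access = {y: int(year0 is not None and y > year0) for y in years}
    return year0, access

# Hypothetical distances (km) from one pixel to its closest agent.
dist = {2006: 14.0, 2007: 8.2, 2008: 4.1, 2009: 2.6, 2010: 1.9}

for cutoff in (2, 3, 5, 7):
    year0, access = access_dummy(dist, cutoff)
    print(cutoff, year0, access)
```

A tighter cutoff pushes year 0 later (or removes the pixel from the sample altogether), which is why fewer pixels contribute to the 2 km and 3 km columns.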

2.6 Heterogeneity and Mechanisms

2.6.1 Heterogeneity

In order to gain insight into the mechanisms at play, it is useful to explore how the mobile money effect varies with baseline characteristics.

Banks

Columns 3 and 4 of Table 2.A.8 present the results of a variant of the main specification on the Reference sample in which the access variable is interacted with the log of distance to the closest baseline bank branch. It is based on the spatial distribution of branches that prevailed at the launch of M-PESA in 2007. The coefficient on the interaction is negative and significant at the 1% level. This indicates that the mobile money effect on economic activity is stronger the closer an area is to a baseline bank branch.

This is an interesting and non-obvious result because there are countervailing forces in the interplay between banks and mobile money services. Since mobile money is a financial innovation that offers a range of financial services that are more affordable and accessible to the majority of households in Kenya, one could have imagined that mobile money services and bank services would be clear substitutes and as a result, that the mobile money effect would be stronger in areas that are further away from banks.

Conversely, there are complementarities between mobile money and banks. Mobile money agents operate more easily when banks are around because they can more easily manage their cash using bank branches. This eases the complex cash management problem they have to deal with, since they do not know exactly how much electronic money they are going to need in a particular week or month. On the customer side, those who can more easily receive and transfer funds may find it handy to have a bank branch around to deposit money, or may use the money they receive through M-PESA as a means to get loans from banks. The results presented in columns 3 and 4 of Table 2.A.8 clearly speak in favor of the complementarity between banks and mobile money services rather than the opposite force. This also suggests that the opportunities offered by mobile money services might be enabling, allowing households to take greater advantage of other services such as more traditional bank services.
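To make the size of the interaction concrete, the sketch below computes the access effect implied by column 4 of Table 2.A.8 at a given distance to the closest baseline bank branch. The two coefficients are taken from that table. The linear-in-logs reading, the function names, and the neglect of standard errors are simplifying assumptions of this illustration rather than part of the analysis.

```python
import math

BETA_ACCESS = 2.401     # Access_5km, Table 2.A.8, column 4
GAMMA_LN_DIST = -0.999  # Access X ln(Dist Bank), Table 2.A.8, column 4

def implied_effect(dist_km):
    """Implied effect of access on light density at `dist_km` from a bank."""
    return BETA_ACCESS + GAMMA_LN_DIST * math.log(dist_km)

def break_even_distance():
    """Distance (km) at which the implied point estimate crosses zero."""
    return math.exp(-BETA_ACCESS / GAMMA_LN_DIST)

for d in (1, 5, 10, 20):
    print(f"{d:>2} km: {implied_effect(d):+.3f}")
print(f"break-even distance: {break_even_distance():.1f} km")
```

Under these assumptions the implied point estimate is positive for pixels within roughly 11 km of a baseline branch, which is close to the median distance to a bank reported in Table 2.A.2 (10.26 km).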

Roads

The results of other interactions of the access variable are presented in columns 5-8 of Table 2.A.8. The interacting variable is the log of distance to primary roads in columns 5-6 and the log of distance to primary and secondary roads in columns 7-8. As for banks, all estimates point to the fact that areas closer to roads benefit more from access to mobile money services. Again, this is not a straightforward result because one could have expected that areas further away from roads, which arguably had worse alternatives to send and receive money, would benefit more from access to mobile money. This result is again consistent with a reinforcing mechanism where access to mobile money services enables households to benefit more from other infrastructure that allows them to trade and connect with their networks.

Cities

Table 2.A.9 presents the results of the interaction of the access variable with the log of distances to cities in Kenya. The interacting variable is the log of distance to the closest city/town in columns 3-4, the log of distance to Nairobi in columns 5-6, the log of distance to Nairobi or Mombasa in columns 7-8 and the log of distance to the closest Top 5 city in columns 9-10. The first thing to note is that the coefficient on the interaction is not significant or barely significant once sublocation-year fixed effects are included and the interacting variable is Nairobi or Nairobi-Mombasa related (columns 6 and 8). However, those coefficients are negative and statistically significant at the 1% level when the interacting variable is the log of distance to the closest Top 5 city and, with a higher magnitude, when the interacting variable is the log of distance to the closest city/town. This indicates that the mobile money effect differs strongly with distance to a city or town but that this result is not just driven by proximity to Nairobi or proximity to Mombasa, the second largest city in Kenya. The estimates in Table 2.A.9 clearly indicate that urban areas benefit more from access to mobile money services.

Other infrastructure

Table 2.A.10 presents the results of the interaction of the access variable with log of distances to baseline major electricity transmission lines (columns 3 and 4) and log of distances to cell towers in Kenya (columns 5 and 6). Unsurprisingly, being close to a cell tower is associated with a bigger effect of access to mobile money services. In these areas, people can take advantage of mobile money services to a higher degree because of an arguably better phone connection. Areas closer to baseline power transmission lines also see a bigger effect when they have access to M-PESA. This is another piece of evidence that the mobile money effect has more to do with the intensive margin when it comes to lights, which indicates that areas that already had some electricity (arguably richer and more urban areas) benefit more from it.

Aggregate access and Population Density

Columns 3 and 4 of Table 2.A.11 show the results of interacting the access variable with the national percentage of access to M-PESA. As one would expect, an increase in aggregate access leads to a bigger effect on areas that have access. Since the essence of M-PESA has to do with connecting people for money transfers, it is perhaps unsurprising that the local effect of M-PESA increases with the aggregate access to it.
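As a rough illustration of what the interaction in column 4 of Table 2.A.11 implies, the worked expression below combines the two reported coefficients. It assumes (the text does not state this explicitly) that the national access variable, written $A_t$ here, is expressed as a share between 0 and 1. The symbols $A_t$, $\hat{\beta}$ and $\hat{\gamma}$ are introduced only for this sketch, the difference is understood as conditional on the fixed effects, and sampling uncertainty is ignored.

```latex
\[
\mathbb{E}\left[\text{Lights}_{it} \mid \text{Access}_{it}=1\right]
  - \mathbb{E}\left[\text{Lights}_{it} \mid \text{Access}_{it}=0\right]
  \;=\; \hat{\beta} + \hat{\gamma} A_t
  \;=\; -0.388 + 1.808\, A_t ,
\]
\[
\hat{\beta} + \hat{\gamma} A_t > 0
  \iff A_t > \frac{0.388}{1.808} \approx 0.215 .
\]
```

Under that assumption, the implied local effect turns positive once roughly a fifth of the relevant population has access, and it reaches about $-0.388 + 1.808 = 1.42$ when access is universal.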

Finally, I confirm in column 6 of Table 2.A.11 that population density is not an important part of the mobile money effect once sublocation-year fixed effects are included. This is not surprising since the highest resolution of population data in Kenya is at the sublocation level, the lowest administrative level in the country.

2.6.2 Channels

M-PESA can impact economic growth through multiple channels. By reducing transaction costs, it has a positive effect on remittances (Jack and Suri, 2014; Morawczynski and Pickens, 2009). Remittances stimulate local demand and provide other members in the community with a source of credit. Households can use those increased remittances to start or grow businesses. As a result, access to mobile money can improve the allocation across households and businesses of savings and of physical and human capital investments. Households take advantage of the superiority of M-PESA over its money-transfer predecessors to send money further from home if the return to capital warrants it or to invest in skills that are valuable to the broader network of peers they transact with. Such better allocations result in a higher return to investments, making households ultimately richer.

Through easier payment arrangements, access to mobile money services also facilitates the trade of goods and services. Firms can arrange to pay their suppliers with a push of a few buttons instead of traveling long distances. They can also be paid more easily by their customers. Households can use M-PESA to pay for a host of services, from electricity bills to taxi fares. This allows households and firms to save time, energy and money and it makes the business environment safer as it reduces the need to carry large amounts of cash. By providing an easier and safer storage technology, M-PESA can increase household savings, net of losses due to theft.

Without more data, it is not possible to pin down the precise mechanisms or the biggest driving force underlying the positive relationship between the use of mobile money and economic performance. However, there is a clear reinforcing mechanism at play: M-PESA makes it easier for households to benefit from other services and infrastructure available to them to trade or connect in ways that enhance economic activity. My results show that areas that are closer to roads, bank branches or urban areas benefit the most, not the least, from access to mobile money. Access to mobile money services is more of a complement than a substitute to access to traditional banking services, roads or urban amenities.

This enabling effect of M-PESA is more consistent with a growth effect than a mere redistribution effect. In a simple redistribution story, one would expect areas that are poorer and that have less access to potential alternatives to mobile money services (banks, road transport) to benefit more from access to mobile money by receiving remittances from richer areas and by using it to close the wealth gap.

As my results show, it is instead richer and more connected areas that benefit the most from mobile money services. The greater ease of sending remittances, making payments and trading allows them to take better advantage of their access to roads, traditional banking services and urban amenities.

2.7 Conclusion

Combining data from the expansion of the mobile money agent network in Kenya and light density data derived from nighttime satellite imagery, I estimate the impact on local economic performance of the mobile money innovation in Kenya, six years after its launch. My results indicate that the economic activity in areas that enjoy an easier access to mobile money services grows faster than that in less well-served areas. The relationship is stronger in areas that are initially richer, urban, and more connected to roads and banks.

The strategies I pursue exploit the variation of local areas that all got access to mobile money services but at different times. The highly local fixed effects that I use, as well as those that are interacted with time, rule out the possibility that the results are driven by initial differences between areas that just got access to mobile money services and those that got it before or later. I show that the relationship is robust to a number of variations and that it presents interesting heterogeneity based on baseline characteristics.
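The identification idea, comparing areas that all gain access but at different times while removing pixel and year fixed effects, can be illustrated with a toy two-way fixed effects estimator. The sketch below is an assumption-laden simplification: it uses synthetic data with a known effect, a balanced panel, and a plain within transformation, and it leaves out the sublocation-year fixed effects and district-clustered standard errors used in the actual estimates. All names are hypothetical.

```python
from statistics import mean

def two_way_demean(values, units, years):
    """Within transformation for a balanced panel:
    v_it - unit mean - year mean + grand mean."""
    unit_mean = {u: mean(values[u, t] for t in years) for u in units}
    year_mean = {t: mean(values[u, t] for u in units) for t in years}
    grand = mean(values.values())
    return {(u, t): values[u, t] - unit_mean[u] - year_mean[t] + grand
            for u in units for t in years}

def twfe_did(access, outcome):
    """OLS slope of outcome on access after removing unit and year effects."""
    units = sorted({u for u, _ in access})
    years = sorted({t for _, t in access})
    x = two_way_demean(access, units, years)
    y = two_way_demean(outcome, units, years)
    return sum(x[k] * y[k] for k in x) / sum(v * v for v in x.values())

# Synthetic balanced panel: three pixels that gain access in different years.
years = range(2006, 2012)
year0 = {"p1": 2007, "p2": 2008, "p3": 2010}
pixel_fe = {"p1": 1.0, "p2": 4.0, "p3": 0.5}
TRUE_EFFECT = 0.5  # effect built into the synthetic data

access = {(p, t): int(t > year0[p]) for p in year0 for t in years}
outcome = {(p, t): pixel_fe[p] + 0.1 * (t - 2006) + TRUE_EFFECT * access[p, t]
           for p in year0 for t in years}

beta = twfe_did(access, outcome)
```

Because the synthetic outcome is exactly a pixel effect plus a year effect plus a constant access effect, the estimator recovers the built-in effect. Differences in levels between early and late adopters are absorbed by the pixel fixed effects, so only the timing of access identifies the coefficient.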

The mobile money financial innovation facilitates the trade of goods and services through easier payment arrangements and can help households save and better allocate their investments. My results show that it is a good complement to other alternatives that enable people to connect, trade and allocate investments within their network. It enables areas where households have a higher access to those alternatives (banks, roads, urban amenities) to grow faster. A more granular and precise understanding of the strength of the different channels underlying the positive effect of mobile money on economic activity is a great avenue of future research.

Appendix

2.A Figures and Tables

Figure 2.A.1: Nighttime Light Density and Population Density

[Figure: two heat maps of Kenya. (a) Lights; (b) Population Density.]

Note: The left part of the figure shows a heat map of the maximum pixel value of each pixel cell during the 2000-2013 study period. On the right, a heat map of the population density in 1999 is displayed and it highlights densely populated areas where most of the economic activity in Kenya is concentrated.

Figure 2.A.2: Nighttime Light Density and Household Wealth - Kenya DHS Clusters

[Figure: two scatterplots with linear fits. (a) 7 km radius of DHS cluster; (b) 10 km radius of DHS cluster. Horizontal axes: Light Density in a 7km Radius of a DHS Cluster (left) and Light Density in a 10km Radius of a DHS Cluster (right).]

Note: A linear fit between the average household wealth index of DHS clusters in Kenya and the average light density in a given radius around DHS clusters, along with the corresponding scatterplots, are plotted. The radius around the DHS clusters centroids is 7 km on the left and 10 km on the right. Both graphs are produced using the two rounds of DHS surveys in Kenya for which relevant nighttime light data and GIS coordinates of DHS clusters centroids are available (2003 and 2008-09).

Figure 2.A.3: Mobile Money Agent Network Expansion

[Figure: three maps of agent locations. (a) Agents 2007; (b) Agents 2010; (c) Agents 2013.]

Note: Agent locations are shown in years 2007 (leftmost), 2010 (center) and 2013 (rightmost). M-PESA was quickly rolled out to cover most populated areas. The expansion of the network came from a gradual increase in the density of agents in covered areas rather than from the coverage of new broad areas.

Figure 2.A.4: Event Study Graphs

[Figure: two event-study plots of estimates by year relative to year 0 (horizontal axis: Year). (a) No Sublocation-Year Fixed Effects; (b) Sublocation-Year Fixed Effects.]

Note: For each pixel cell, year 0 indicates the year within which the cell's distance to the closest M-PESA agent got below 5 km. Both graphs are based on the Reference sample: the set of pixels for which a year +1 and a year -2 of access can be defined. On the left, after purging pixel and year fixed effects from pixel-level light density, difference-in-differences estimates along with their 95% confidence intervals are plotted for each year, relative to year 0. On the right, sublocation-year fixed effects are also purged. Standard errors are conservatively clustered at the district level. The magnitude and the statistical significance of the estimates in the years leading to year 0 of access do not indicate pre-trends.

Table 2.A.1: Summary Statistics: Lights and Access to M-PESA Agents

Panel A: Pixels Having Eventual Access to Agents (Got Access Sample)

                   Mean     SD       Med      75-pct   95-pct   Max      Obs
Lights 2007        1.47     5.03     0        0        7.5      59.5     51110
Lights 2008        1.84     5.59     0        0        9        62       51110
Lights 2013        3.22     7.54     0        6        13       63       51110
Dist Agent 2007    13.33    20.91    7.45     13.26    40.15    188.49   51110
Dist Agent 2008    8.21     11.17    5.04     8.68     26.23    103.5    51110
Dist Agent 2013    2.38     1.31     2.26     3.42     4.64     5        51110
Access MM 2007     0        0        0        0        0        0        51110
Access MM 2008     .32      .47      0        1        1        1        51110
Access MM 2013     .94      .25      1        1        1        1        51110

Panel B: Pixels Having at least 2 Years of Access to Agents (Reference Sample)

                   Mean     SD       Med      75-pct   95-pct   Max      Obs
Lights 2007        1.67     5.37     0        0        8        59.5     44074
Lights 2008        2.09     5.96     0        0        10       62       44074
Lights 2013        3.62     8        0        6        15       63       44074
Dist Agent 2007    12.12    20.22    6.66     11.83    32.88    187.88   44074
Dist Agent 2008    6.97     9.83     4.5      7.36     21.4     96.64    44074
Dist Agent 2013    2.24     1.27     2.09     3.2      4.5      5        44074
Access MM 2007     0        0        0        0        0        0        44074
Access MM 2008     .37      .48      0        1        1        1        44074
Access MM 2013     1        0        1        1        1        1        44074
Lit 2007           .22      .41      0        0        1        1        44074
Lit 2008           .23      .42      0        0        1        1        44074
Lit 2013           .34      .47      0        1        1        1        44074
Ever Lit?          .52      .5       1        1        1        1        44074

Note: For each pixel cell, year 0 indicates the year within which the cell's distance to the closest M-PESA agent got below 5 km. All pixels that were never within such a distance of M-PESA agents during the 2000-2013 study period are dropped and the remaining sample (Got Access) is the basis for the statistics in the top panel. Statistics in the bottom panel are based on the Reference sample: the sample of pixels for which a year +1 and a year +2 of access can be defined. The access variable is a dummy that takes the value 1 for all years after year 0. Distances to agents are in km. Lit is a dummy for non-zero pixel values.

Table 2.A.2: Summary Statistics: Baseline Variables

                        Mean     SD       Med      75-pct   95-pct   Max      Obs
Dist Bank               16.09    21.74    10.26    17.44    48.87    189      44074
ln(Dist Bank)           2.32     .91      2.33     2.86     3.89     5.24     44074
Dist Power Line         26.44    50.1     10.6     26.2     101.4    479.6    44074
ln(Dist Power Line)     2.37     1.31     2.36     3.27     4.62     6.17     44074
Dist Major Road         27.64    55.93    9.8      23.8     145.8    466.8    44074
ln(Dist Major Road)     2.31     1.34     2.28     3.17     4.98     6.15     44074
Dist Road               3.23     2.78     2.1      3.8      9        29.4     44074
ln(Dist Road)           .94      .62      .74      1.34     2.2      3.38     44074
Dist Cell Tower         7.47     10.65    4.5      8.2      21.8     109      44074
ln(Dist Cell Tower)     1.57     .86      1.5      2.1      3.08     4.69     44074
Dist City               23       23.94    15.9     26.8     80.1     188      44074
ln(Dist City)           2.73     .91      2.77     3.29     4.38     5.24     44074
Dist Nairobi            199.98   129.09   181.8    287.3    432.2    674.7    44074
ln(Dist Nairobi)        5.01     .87      5.2      5.66     6.07     6.51     44074
Dist Top2 City          163.31   109.49   149.5    242      328.5    674.7    44074
ln(Dist Top2 City)      4.8      .88      5.01     5.49     5.79     6.51     44074
Dist Top5 City          84.43    78.64    62.7     101.6    214      645.2    44074
ln(Dist Top5 City)      4.09     .87      4.14     4.62     5.37     6.47     44074
Pixel Subloc Density    393.58   1233.15  217      457.9    962.4    73307.3  44074
ln Pix Subloc Density   5.24     1.24     5.38     6.13     6.87     11.2     44074
Under Grid              .49      .5       0        1        1        1        44074

Note: Variables are observed at the pixel level. Variables that vary during the 2000-2013 study period are measured before the advent of M-PESA (banks in 2007, roads in 2004, electricity transmission lines in 2007 and population density in 1999, the latest census before the launch of M-PESA).

Table 2.A.3: Main Difference-in-Differences Estimates

                Got Access              Reference Sample
                (1)        (2)          (3)        (4)
Access_5km      0.555***   0.253***     0.289***   0.139***
                [0.128]    [0.041]      [0.086]    [0.038]
Year FE         Yes        Yes          Yes        Yes
Pix FE          Yes        Yes          Yes        Yes
Area*time FE    No         Subloc*t     No         Subloc*t
Pixels          51110      51110        44074      44074
Observations    715540     715540       617036     617036

Note: Difference-in-differences estimates are reported. The dependent variable is light density at night observed at the pixel-year level. For each pixel cell, year 0 indicates the year within which the cell's distance to the closest M-PESA agent got below 5 km. All pixels that were never within such a distance of M-PESA agents during the 2000-2013 study period are dropped and the remaining sample (Got Access) is the basis for estimates in columns 1 and 2. Estimates in columns 3 and 4 are based on the Reference sample: the sample of pixels for which a year +1 and a year +2 of access can be defined. The access variable is a dummy that takes the value 1 for all years after year 0. All regressions include pixel and year fixed effects. Columns 2 and 4's regressions include sublocation-year fixed effects. Standard errors conservatively clustered at the district level are reported in brackets. ***, **, and * indicate significance at the 1%, 5% and 10% levels, respectively.

Table 2.A.4: Extensive Margin Results

                Lights                  Lights (Baseline Unlit)  Lit (Baseline Unlit)
                (1)        (2)          (3)        (4)           (5)        (6)
Access_5km      0.289***   0.139***     0.066***   0.032         0.013***   0.005
                [0.086]    [0.038]      [0.016]    [0.021]       [0.003]    [0.004]
Year FE         Yes        Yes          Yes        Yes           Yes        Yes
Pix FE          Yes        Yes          Yes        Yes           Yes        Yes
Area*time FE    No         Subloc*t     No         Subloc*t      No         Subloc*t
Pixels          44074      44074        28062      28062         28062      28062
Observations    617036     617036       392868     392868        392868     392868

Note: Difference-in-differences estimates are reported. The dependent variable is light density at night observed at the pixel-year level in columns 1-4 and a dummy for it being positive in columns 5 and 6. For each pixel cell, year 0 indicates the year within which the cell's distance to the closest M-PESA agent got below 5 km. The Reference sample is the set of pixels for which a year +1 and a year +2 of access can be defined and it is the basis for columns 1 and 2's regressions. Regressions whose results are presented in columns 3-6 are based on the subset of the Reference sample for which light densities were zero from year -5 to year -1. The access variable is a dummy that takes the value 1 for all years after year 0. The study period runs from 2000 to 2013. All regressions include pixel and year fixed effects. Columns 2, 4 and 6's regressions include sublocation-year fixed effects. Standard errors conservatively clustered at the district level are reported in brackets. ***, **, and * indicate significance at the 1%, 5% and 10% levels, respectively.

Table 2.A.5: Intensive Margin Results

                Lights                  Lights (Baseline Lit)
                (1)        (2)          (3)        (4)
Access_5km      0.289***   0.139***     0.325*     0.367***
                [0.086]    [0.038]      [0.166]    [0.114]
Year FE         Yes        Yes          Yes        Yes
Pix FE          Yes        Yes          Yes        Yes
Area*time FE    No         Subloc*t     No         Subloc*t
Pixels          44074      44074        16012      16012
Observations    617036     617036       224168     224168

Note: Difference-in-differences estimates are reported. The dependent variable is light density at night observed at the pixel-year level. For each pixel cell, year 0 indicates the year within which the cell's distance to the closest M-PESA agent got below 5 km. The Reference sample is the set of pixels for which a year +1 and a year +2 of access can be defined and it is the basis for columns 1 and 2's regressions. Columns 3 and 4 results are based on the subset of the Reference sample for which light densities were on average positive between year -5 and year -1. The access variable is a dummy that takes the value 1 for all years after year 0. The study period runs from 2000 to 2013. All regressions include pixel and year fixed effects. Columns 2 and 4's regressions include sublocation-year fixed effects. Standard errors conservatively clustered at the district level are reported in brackets. ***, **, and * indicate significance at the 1%, 5% and 10% levels, respectively.

Table 2.A.6: Areas Under Baseline Grid

                Reference Sample        Under Baseline Grid
                (1)        (2)          (3)        (4)
Access_5km      0.289***   0.139***     0.413***   0.142**
                [0.086]    [0.038]      [0.147]    [0.055]
Year FE         Yes        Yes          Yes        Yes
Pix FE          Yes        Yes          Yes        Yes
Area*time FE    No         Subloc*t     No         Subloc*t
Pixels          44074      44074        21408      21408
Observations    617036     617036       299712     299712

Note: Difference-in-differences estimates are reported. The dependent variable is light density at night observed at the pixel-year level. For each pixel cell, year 0 indicates the year within which the cell's distance to the closest M-PESA agent got below 5 km. The Reference sample is the set of pixels for which a year +1 and a year +2 of access can be defined and it is the basis for columns 1 and 2's regressions. Columns 3 and 4 results are based on the subset of the Reference sample for which cells are within 10 km of existing major transmission lines at the launch of M-PESA in 2007. The access variable is a dummy that takes the value 1 for all years after year 0. The study period runs from 2000 to 2013. All regressions include pixel and year fixed effects. Columns 2 and 4's regressions include sublocation-year fixed effects. Standard errors conservatively clustered at the district level are reported in brackets. ***, **, and * indicate significance at the 1%, 5% and 10% levels, respectively.

Table 2.A.7: Robustness to M-PESA Agent Access Distance Cutoff

                2 km                    3 km                    5 km                    7 km
                (1)        (2)          (3)        (4)          (5)        (6)          (7)        (8)
Access_2km      0.668***   0.166***
                [0.241]    [0.052]
Access_3km                              0.484***   0.186***
                                        [0.153]    [0.046]
Access_5km                                                      0.289***   0.139***
                                                                [0.086]    [0.038]
Access_7km                                                                              0.172**    0.048
                                                                                        [0.066]    [0.032]
Year FE         Yes        Yes          Yes        Yes          Yes        Yes          Yes        Yes
Pix FE          Yes        Yes          Yes        Yes          Yes        Yes          Yes        Yes
Area*time FE    No         Subloc*t     No         Subloc*t     No         Subloc*t     No         Subloc*t
Pixels          15781      15781        26557      26557        44074      44074        55945      55945
Observations    220934     220934       371798     371798       617036     617036       783230     783230

Note: Difference-in-differences estimates are reported. The dependent variable is light density at night observed at the pixel-year level. For each pixel cell, year 0 indicates the year within which the cell's distance to the closest M-PESA agent got below a given cutoff (2 km in columns 1-2, 3 km in columns 3-4, 5 km in columns 5-6 and 7 km in columns 7-8). All regressions are based on the Reference sample: the set of pixels for which a year +1 and a year +2 of access can be defined. The access variable is a dummy that takes the value 1 for all years after year 0. The study period runs from 2000 to 2013. All regressions include pixel and year fixed effects. Columns 2, 4, 6, and 8's regressions include sublocation-year fixed effects. Standard errors conservatively clustered at the district level are reported in brackets. ***, **, and * indicate significance at the 1%, 5% and 10% levels, respectively.

Table 2.A.8: Interactions with Distances to Bank and Roads

                              No Interaction        Banks                 Main Roads            Roads
                              (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
Access_5km                    0.289***   0.139***   2.576***   2.401***   1.641***   1.381***   0.937***   0.999***
                              [0.086]    [0.038]    [0.582]    [0.429]    [0.389]    [0.259]    [0.223]    [0.130]
Access X ln(Dist Bank)                              -1.029***  -0.999***
                                                    [0.241]    [0.195]
Access X ln(Dist Main Road)                                               -0.603***  -0.564***
                                                                          [0.152]    [0.123]
Access X ln(Dist Road)                                                                          -0.716***  -0.922***
                                                                                                [0.171]    [0.123]
Year FE                       Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
Pix FE                        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
Area*time FE                  No         Subloc*t   No         Subloc*t   No         Subloc*t   No         Subloc*t
Pixels                        44074      44074      44074      44074      44074      44074      44074      44074
Observations                  617036     617036     617036     617036     617036     617036     617036     617036

Note: Difference-in-differences estimates are reported. The dependent variable is light density at night observed at the pixel-year level. For each pixel cell, year 0 indicates the year within which the cell's distance to the closest M-PESA agent got below 5 km. All results are based on the Reference sample: the set of pixels for which a year +1 and a year +2 of access can be defined. The access variable is a dummy that takes the value 1 for all years after year 0. In columns 3-8, interactions of the access variable with log distances to the closest bank branch, primary road, and primary or secondary road are successively added. The study period runs from 2000 to 2013. All regressions include pixel and year fixed effects. Columns 2, 4, 6, and 8's regressions include sublocation-year fixed effects. Standard errors conservatively clustered at the district level are reported in brackets. ***, **, and * indicate significance at the 1%, 5% and 10% levels, respectively.

Table 2.A.9: Interactions with Distances to Cities

                              No Interaction        Closest City          Nairobi               Top2 City             Top5 City
                              (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)        (9)        (10)
Access_5km                    0.289***   0.139***   2.814***   2.681***   5.981***   0.686*     7.045***   0.821**    5.211***   1.380***
                              [0.086]    [0.038]    [0.483]    [0.457]    [1.751]    [0.379]    [1.406]    [0.354]    [1.390]    [0.402]
Access X ln(Dist City)                              -0.951***  -0.952***
                                                    [0.162]    [0.180]
Access X ln(Dist Nairobi)                                                 -1.145***  -0.110
                                                                          [0.354]    [0.079]
Access X ln(Dist Top2 City)                                                                     -1.414***  -0.143*
                                                                                                [0.294]    [0.078]
Access X ln(Dist Top5 City)                                                                                           -1.214***  -0.305***
                                                                                                                      [0.332]    [0.107]
Year FE                       Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
Pix FE                        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
Area*time FE                  No         Subloc*t   No         Subloc*t   No         Subloc*t   No         Subloc*t   No         Subloc*t
Pixels                        44074      44074      44074      44074      44074      44074      44074      44074      44074      44074
Observations                  617036     617036     617036     617036     617036     617036     617036     617036     617036     617036

Note: Difference-in-differences estimates are reported. The dependent variable is light density at night observed at the pixel-year level. For each pixel cell, year 0 indicates the year within which the cell's distance to the closest M-PESA agent got below 5 km. All results are based on the Reference sample: the set of pixels for which a year +1 and a year +2 of access can be defined. The access variable is a dummy that takes the value 1 for all years after year 0. In columns 3-10, interactions of the access variable with log distances to the closest city or town, to Nairobi, to Nairobi or Mombasa, and to the closest top 5 city are successively added. The study period runs from 2000 to 2013. All regressions include pixel and year fixed effects. Columns 2, 4, 6, 8 and 10's regressions include sublocation-year fixed effects. Standard errors conservatively clustered at the district level are reported in brackets. ***, **, and * indicate significance at the 1%, 5% and 10% levels, respectively.

Table 2.A.10: Interactions with Distances to Power Transmission Lines and Cell Towers

                              No Interaction        Electricity Lines     Cell Towers
                              (1)        (2)        (3)        (4)        (5)        (6)
Access_5km                    0.289***   0.139***   1.868***   0.626***   2.172***   1.574***
                              [0.086]    [0.038]    [0.435]    [0.182]    [0.427]    [0.186]
Access X ln(Dist Power)                             -0.680***  -0.218***
                                                    [0.167]    [0.077]
Access X ln(Dist Tower)                                                   -1.259***  -0.941***
                                                                          [0.254]    [0.117]
Year FE                       Yes        Yes        Yes        Yes        Yes        Yes
Pix FE                        Yes        Yes        Yes        Yes        Yes        Yes
Area*time FE                  No         Subloc*t   No         Subloc*t   No         Subloc*t
Pixels                        44074      44074      44074      44074      44074      44074
Observations                  617036     617036     617036     617036     617036     617036

Note: Difference-in-differences estimates are reported. The dependent variable is light density at night observed at the pixel-year level. For each pixel cell, year 0 indicates the year within which the cell's distance to the closest M-PESA agent got below 5 km. All results are based on the Reference sample: the set of pixels for which a year +1 and a year +2 of access can be defined. The access variable is a dummy that takes the value 1 for all years after year 0. In columns 3-6, interactions of the access variable with log distances to the closest electricity transmission line and to the closest cell tower are successively added. The study period runs from 2000 to 2013. All regressions include pixel and year fixed effects. Columns 2, 4, and 6's regressions include sublocation-year fixed effects. Standard errors conservatively clustered at the district level are reported in brackets. ***, **, and * indicate significance at the 1%, 5% and 10% levels, respectively.

Table 2.A.11: Interactions with Aggregate Access and Population Density

                              No Interaction        Aggregate Access      Population Density
                              (1)        (2)        (3)        (4)        (5)        (6)
Access_5km                    0.289***   0.139***   -0.833***  -0.388***  -3.025***  0.405**
                              [0.086]    [0.038]    [0.172]    [0.103]    [0.756]    [0.169]
Access X National Access                            3.940***   1.808***
                                                    [0.847]    [0.417]
Access X ln(Pop. Density)                                                 0.619***   -0.054*
                                                                          [0.148]    [0.031]
Year FE                       Yes        Yes        Yes        Yes        Yes        Yes
Pix FE                        Yes        Yes        Yes        Yes        Yes        Yes
Area*time FE                  No         Subloc*t   No         Subloc*t   No         Subloc*t
Pixels                        44074      44074      44074      44074      44074      44074
Observations                  617036     617036     617036     617036     617036     617036

Note: Difference-in-differences estimates are reported. The dependent variable is light density at night observed at the pixel-year level. For each pixel cell, year 0 indicates the year within which the cell's distance to the closest M-PESA agent got below 5 km. All results are based on the Reference sample: the set of pixels for which a year +1 and a year +2 of access can be defined. The access variable is a dummy that takes the value 1 for all years after year 0. In columns 3-6, interactions of the access variable with the national aggregate access and the log of population density are successively added. The study period runs from 2000 to 2013. All regressions include pixel and year fixed effects. Columns 2, 4, and 6's regressions include sublocation-year fixed effects. Standard errors conservatively clustered at the district level are reported in brackets. ***, **, and * indicate significance at the 1%, 5% and 10% levels, respectively.

2.B Additional Tables

Table 2.B.1: Event Study Coefficient Estimates

                            (1)        (2)

YearAccess_5km == -6        0.212***   0.103*
                            [0.059]    [0.055]
YearAccess_5km == -5        0.242***   0.081
                            [0.070]    [0.049]
YearAccess_5km == -4        -0.002     -0.044
                            [0.036]    [0.054]
YearAccess_5km == -3        0.066*     -0.002
                            [0.036]    [0.047]
YearAccess_5km == -2        -0.080*    -0.059*
                            [0.047]    [0.032]
YearAccess_5km == -1        -0.036     -0.016
                            [0.034]    [0.033]
YearAccess_5km == 1         0.343***   0.192***
                            [0.076]    [0.036]
YearAccess_5km == 2         0.680***   0.386***
                            [0.181]    [0.074]
YearAccess_5km == 3         1.628***   0.888***
                            [0.349]    [0.123]
YearAccess_5km == 4         1.948***   1.090***
                            [0.418]    [0.144]
YearAccess_5km == 5         2.874***   1.699***
                            [0.609]    [0.254]
YearAccess_5km == 6         4.321***   2.604***
                            [0.936]    [0.396]

Year FE                     Yes        Yes
Pix FE                      Yes        Yes
Area*time FE                No         Subloc*t
Pixels                      44074      44074
Observations                617036     617036

Note: Difference-in-differences estimates are reported. The dependent variable is light density at night observed at the pixel-year level. For each pixel cell, year 0 indicates the year within which the cell's distance to the closest M-PESA agent got below 5 km. All results are based on the Reference sample: the set of pixels for which a year +1 and a year +2 of access can be defined. Both regressions include pixel and year fixed effects. The regression in column 2 includes sublocation-year fixed effects. Standard errors conservatively clustered at the district level are reported in brackets. ***, **, and * indicate significance at the 1%, 5% and 10% levels, respectively.
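The event-time bookkeeping described in the note can be sketched in a few lines of Python. This is a minimal illustration, not the thesis code: the function names are hypothetical, and the rule used for the Reference sample (year +2 must fall inside the 2000-2013 study period) is one possible reading of "a year +1 and a year +2 of access can be defined".

```python
STUDY_START, STUDY_END = 2000, 2013  # study period stated in the table note


def event_time(year: int, access_year: int) -> int:
    """Years relative to year 0, the year the pixel first came within 5 km of an agent."""
    return year - access_year


def access_dummy(year: int, access_year: int) -> int:
    """1 for all years strictly after year 0, 0 otherwise (as in the note)."""
    return 1 if year > access_year else 0


def in_reference_sample(access_year: int) -> bool:
    """Assumed rule: years +1 and +2 must both be observed within the study period."""
    return STUDY_START <= access_year and access_year + 2 <= STUDY_END


# Hypothetical pixel that gains access in 2009, observed over the 14 study years.
panel = [
    (year, event_time(year, 2009), access_dummy(year, 2009))
    for year in range(STUDY_START, STUDY_END + 1)
]
for row in panel:
    print(row)
```

With 14 yearly observations per pixel, 44,074 pixels give the 617,036 observations reported in the table.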

Bibliography

Aker, Jenny C., Rachid Boumnijel, Amanda McClelland, and Niall Tierney, "Zap it to me: The short-term impacts of a mobile cash transfer program," 2011.

Allen, Franklin, Jun Qian, and Meijun Qian, "Law, finance, and economic growth in China," Journal of Financial Economics, 2005, 77 (1), 57-116.

Baum-Snow, Nathaniel, Loren Brandt, J. Vernon Henderson, Matthew A. Turner, and Qinghua Zhang, "Roads, railroads and decentralization of Chinese cities," 2012.

Beck, Thorsten, Ross Levine, and Norman Loayza, "Finance and the Sources of Growth," Journal of Financial Economics, 2000, 58 (1), 261-300.

Bertrand, Marianne, Antoinette Schoar, and David Thesmar, "Banking deregulation and industry structure: Evidence from the French banking reforms of 1985," The Journal of Finance, 2007, 62 (2), 597-628.

Bill and Melinda Gates Foundation and Central Bank of Kenya and FSD Kenya, "FinAccess geospatial mapping 2015," 2016.

Bleakley, Hoyt and Jeffrey Lin, "Portage and path dependence," The Quarterly Journal of Economics, 2012, 127 (2), 587-644.

Chen, Xi and William Nordhaus, "Using luminosity data as a proxy for economic statistics," Proceedings of the National Academy of Sciences of the United States of America, 2011, 108 (21), 8589-8594.

Cogneau, Denis and Yannick Dupraz, "Questionable Inference on the Power of Pre-Colonial Institutions in Africa," 2014.

Demirgüç-Kunt, Asli and Vojislav Maksimovic, "Law, finance, and firm growth," Journal of Finance, 1998, 53, 2107-2137.

Dyck, Alexander and Luigi Zingales, "Private benefits of control: An international comparison," The Journal of Finance, 2004, 59 (2), 537-600.

Goldsmith, Raymond, "Financial structure and economic development," Yale University Press, New Haven, CT, 1969.

Guiso, Luigi, Paola Sapienza, and Luigi Zingales, "Does Local Financial Development Matter?," The Quarterly Journal of Economics, 2004, pp. 929-969.

Henderson, J. Vernon, Adam Storeygard, and David N. Weil, "Measuring Economic Growth from Outer Space," The American Economic Review, 2012, 102 (2), 994-1028.

___, ___, and Uwe Deichmann, "Has climate change driven urbanization in Africa?," Journal of Development Economics, 2016.

Higgins, Dylan, Jake Kendall, and Ben Lyon, "Mobile Money Usage Patterns of Kenyan Small and Medium Enterprises," Innovations, 2012, 7 (2), 67-81.

Hijmans, Robert, Mariana Cruz, Edwin Rojas, Rachel O'Brien, and Israel Barrantes, "DIVA-GIS, version 1.4. A Geographic Information System for the Management and Analysis of Genetic Resources Data," http://www.diva-gis.org/datadown 2016. Online; accessed November 2016.

Jack, William, Adam Ray, and Tavneet Suri, "Transaction Networks: Evidence from Mobile Money in Kenya," American Economic Review, 2013, 103 (3), 356-361.

___ and Tavneet Suri, "Mobile Money: The Economics of M-PESA," 2011.

___ and ___, "Risk sharing and transactions costs: Evidence from Kenya's mobile money revolution," The American Economic Review, 2014, 104 (1), 183-223.

Jayaratne, Jith and Philip E. Strahan, "The finance-growth nexus: Evidence from bank branch deregulation," The Quarterly Journal of Economics, 1996, 111, 639-670.

King, Robert G. and Ross Levine, "Finance and growth: Schumpeter might be right," The Quarterly Journal of Economics, 1993, 108, 717-738.

La Porta, Rafael, Florencio Lopez de Silanes, Andrei Shleifer, and Robert W. Vishny, "Law and finance," Journal of Political Economy, 1998, 106, 1113-1155.

Levine, Ross, "Finance and growth: theory and evidence," in "Handbook of economic growth," Vol. 1A 2005, pp. 865-934.

___ and Sara Zervos, "Stock markets, banks, and economic growth," American Economic Review, 1998, 88, 537-558.

___, Norman Loayza, and Thorsten Beck, "Financial intermediation and growth: Causality and causes," Journal of Monetary Economics, 2000, 46 (1), 31-77.

Lucas, Robert E., "On the mechanics of economic development," Journal of Monetary Economics, 1988, 22 (1), 3-42.

Mas, Ignacio, "The economics of branchless banking," Innovations, 2009, 4 (2), 57-75.

___ and Olga Morawczynski, "Designing Mobile Money Services: Lessons from M-PESA," Innovations, 2009, 4 (2), 77-91.

Michalopoulos, Stelios and Elias Papaioannou, "Pre-Colonial Ethnic Institutions and Contemporary African Development," Econometrica, 2013, 81 (1), 113-152.

Miller, Merton H., "Financial markets and economic growth," Journal of Applied Corporate Finance, 1998, 11 (3), 8-15.

Min, Brian, "Democracy and light: electoral accountability and the provision of public goods," 2008.

Morawczynski, Olga and Mark Pickens, "Poor people using mobile financial services: observations on customer usage and impact from M-PESA," 2009.

Muralidharan, Karthik, Paul Niehaus, and Sandip Sukhtankar, Payments infrastructure and the performance of public programs: Evidence from biometric smartcards in india, National Bureau of Economic Research, 2014.

Pinkovskiy, Maxim L., "Economic discontinuities at borders: Evidence from satellite data on lights at night," 2013.

Plyler, Megan, Sherri Haas, and Geetha Nagarajan, "Community Level Economic Effects of M-PESA in Kenya: Initial Findings," 2010.

Rajan, Raghuram G. and Luigi Zingales, "Financial Dependence and Growth," American Economic Review, 1998, 88, 559-586.

Robinson, Joan, "The generalisation of the general theory," in "The rate of interest and other essays," Vol. 2, MacMillan, London, 1952, pp. 1-76.

Rousseau, Peter L. and Paul Wachtel, "Equity markets and growth: cross-country evidence on timing and outcomes, 1980-1995," Journal of Banking & Finance, 2000, 24 (12), 1933-1957.

Storeygard, Adam, "Farther on down the road: transport costs, trade and urban growth in sub-Saharan Africa," Review of Economic Studies, 2016, 83, 1263-1295.

Suri, Tavneet and William Jack, "The long-run poverty and gender impacts of mobile money," Science, 2016, 354 (6317), 1288-1292.

Chapter 3

Colonial Railroads in Nigeria: the Heterogeneous Impact of a Transportation Technology

3.1 Introduction

A number of recent empirical studies document the importance of transportation technologies for economic development in a variety of settings.¹ A question that naturally arises is whether these uncovered impacts of transportation technologies are homogeneous across connected areas, and if not, how much they vary based on pre-existing characteristics. In this article, we address these questions in the context of colonial railroads in Nigeria. Exploring heterogeneity in the impact of "new" transportation technologies is a first step towards understanding the conditions under which they lead to sustained economic growth and development. Identifying such conditions is important for the design of optimal policies on new transportation infrastructure.

In Nigeria, the railroads were primarily constructed to enhance export trade with Europe, but various parts of the country differed with respect to the availability of alternative transportation technologies and initial market access to Europe, as revealed by pre-railway trade volumes. The colonial railroads connected the interior of the country to coastal ports, but the South, by virtue of its proximity to the sea, already had viable alternatives, such as waterways and roads, which enabled trade with Europeans, and these alternatives co-existed with the railway. Such alternatives did not exist in the North, and the railroads were essential in opening this region to European trade and in shifting trade from the Sahara to the coast. These important baseline differences motivate our choice of Nigeria to explore the potential heterogeneity of the impact of railroads.

¹ See Banerjee et al. (2012), Faber (2014), and Baum-Snow et al. (2012) on railways and roads in China; Donaldson (2016) on railways in India; and Donaldson and Hornbeck (2016) and Bleakley and Lin (2012) on the impact of the railway and portage sites in the United States. Jedwab et al. (2015), Jedwab and Moradi (2015), Fourie and Herranz-Loncan (2015), and Storeygard (2016) provide evidence on the positive economic impacts of roads and rail networks in Africa.

We proceed in a number of steps. First, we present a framework that enables us to causally estimate the long-run impacts of the railway on local economic development. Based on individual and household data from the 2008 Nigerian Demographic and Health Survey (DHS) and railway data from the Digital Chart of the World (DMA, 1992), the framework involves state fixed effects to explore differences between areas close to the railway and areas further away within the same state, and an instrumental variable approach involving the distance to straight lines between nodes² as an instrument for connection to the railway line. We use the framework to investigate the differential impacts of the railway in the North and the South of Nigeria, areas with significant differences in alternative transportation technologies. Our main finding is that the colonial railway has neither a short-run nor a long-run economic impact in Southern Nigeria, but it has large positive impacts on local development in the North. This is true for broad indicators of economic development measured at the individual and household level. These measures include human capital, occupational characteristics, media access, household wealth, and urbanization.

² Hypothetically, this is the shortest path between railway nodes.

Second, we analyze historical urbanization and city growth data from Jedwab and Moradi (2015) in order to establish that the heterogeneous impacts of the railway we document were also present (or absent) in colonial times. The non-impact of colonial railways in the South and their large positive impact in the North are stable over time and have persisted long after the railways became dysfunctional. These empirical findings are consistent with a theoretical model in which, in the North, railroads improved market access to Europe and encouraged the concentration of production factors in connected localities, inducing a spatial equilibrium that persisted in the long run, even after the demise of railroads. In the South, however, railroads did not significantly change the initial spatial equilibrium; they hardly had any short- or long-run local development impacts.

Third, we present evidence from historical sources that sheds light on the potential underlying mechanisms. We document the fact that prior to the construction of the railway, Southern Nigeria was already involved in large and significant export trade with Europe, and earned a reputation as the "oil palm coast" well before the twentieth century (Law, 2002). Following the introduction of the railway, this trade did not grow significantly, compared to double-digit growth rates in exports of the main Northern crops.

Furthermore, railway adoption rates were significantly lower in the South, with less than 30% of Southern crops being shipped by rail compared to over 80% for most Northern crops. We find that the proportion of the main Southern crops that were railed to the coast significantly declined during the period of railway expansion, which indicates that railways did very little to stimulate economic activities in these areas. Our evidence on shipping volumes and costs suggests that the low adoption rates in the South are due to higher opportunity costs. While the railway decreased transportation costs in the North by more than 65% compared to roads, our calculations reveal that it was significantly more expensive than the alternatives (roads and rivers) in the South.

In addition, we show that railroads have a persistent effect only in areas that were far away from ports of export. The North/South dichotomy in the impact of the railway can be entirely accounted for by the distance to ports of export. The railway had no impact in areas that were close to ports of export, no matter whether they are Northern or Southern areas. We also show that the heterogeneity of the railway impact cannot be explained away by distances to early cities which were more prevalent in the South.

To uncover the long-term impacts of colonial railroads on local economic development, we exploit a number of empirical strategies, falsification exercises and robustness checks. Our first empirical strategy is to compare areas connected to the railway to areas unconnected to the railway within localities, or states, that were targeted. The assumption is not that states were exogenously connected, but that locations within a state were connected exogenously. In other words, the precise location of the railway, within a targeted state, was exogenous to local characteristics related to current development. For example, while the railway was intended to connect the areas surrounding Kano known for being suitable for groundnut production, the precise location of the line within the Kano area was plausibly exogenous to contemporary or future development. We claim that within a state, the railway was not systematically placed in the most developed localities or in localities that had the most potential for growth.³
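The within-state comparison just described can be summarized by a regression of the following form. This is an illustrative sketch: the symbols are assumptions introduced here (y for an outcome of individual i in locality l of state s, Rail for an indicator of being within 20 km of a railway line, X for individual, household and local controls, delta for state fixed effects) and need not match the notation of the chapter's own estimating equation.

```latex
y_{i\ell s} = \beta \,\mathrm{Rail}_{\ell} + X_{i\ell}'\gamma + \delta_s + \varepsilon_{i\ell s}
```

Because the state effects absorb all differences between states, the coefficient on the railway indicator is identified only from variation between connected and unconnected localities inside the same state. Estimating the equation separately for Northern and Southern states is one way to let that coefficient differ across the two regions.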

³ In fact, there are instances where the railway was located in less prosperous areas for a variety of geographic and other local idiosyncrasies. For example, the railway in "Lagos" began in neighboring Iddo because Lagos itself was an island. A second example is the line that terminated in the state of "Oil Rivers", which in fact ended in Port Harcourt, a city built from scratch, as opposed to more prosperous pre-colonial ports such as Bonny, Calabar, New Calabar, and Opobo.

To provide evidence that railway-connected localities and non-connected localities were similar, we compare railroad and non-railroad localities with respect to broad geographic and climatic determinants of development as well as presence (and size) of early cities and Christian mission stations. We do not find railroad and non-railroad locations to differ on these time-invariant geographic characteristics. Furthermore, we do not find that connected localities are more likely to have a Christian mission station, or to be connected to the road network or to have a river running through their local area.⁴ Our estimates are robust to controlling for the aforementioned factors in addition to other individual and household variables, including ethnicity and state fixed effects.

⁴ The local government is the smallest administrative unit in Nigeria, with an average area of 1,020 km² and a median area of 705 km², and serves as the primary measure of the "local area" in which the individual lives. Individuals are identified by their DHS clusters, which we refer to as localities, and we match localities to the local government area that they belong to.

Nevertheless, in the absence of information on all the factors that contribute to local development, we are unable to completely rule out the claim that railroads were endogenously placed within states. In order to address further endogeneity concerns, we use an instrumental variable approach. We compute the distance to straight lines joining major nodes and use it as an instrument for being connected to the railway line. This identification approach has been implemented in Banerjee et al. (2012), Jedwab et al. (2015), and Jedwab and Moradi (2015), among several others. Once again, within a state, the straight line connecting nodes is the hypothetical line that, in theory, would have minimized construction costs, all else equal. Deviation from this line therefore might reveal the extent to which the actual rail trajectory might not have been chosen randomly by the colonial government.⁵ The identification assumption in this empirical strategy is that straight lines between nodes affect economic development only through their correlation with actual lines. In implementing this method, we exclude observations in nodes, as these connected locations might have been endogenously chosen. We do not find the instrumental variable estimates to be very different from the estimates based on state fixed effects. We continue to find that the railway has a positive effect in the North and no effect in the South. Interestingly, the first stage of the instrumental variable results reveals that most geo-climatic and other local area variables are not significant determinants of connectedness to the railway lines.
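The logic of this instrumental variable strategy can be illustrated with a small simulation. The sketch below is not the thesis code or data: every variable name and parameter value is hypothetical, the data are simulated, and the estimator is a bare-bones two-stage least squares with state fixed effects absorbed by within-state demeaning and no further controls. It shows how an unobserved "local potential" that raises both the chance of connection and the outcome biases the within-state OLS estimate, while instrumenting connection with distance to the straight line recovers the assumed effect.

```python
import numpy as np


def demean_within(values, groups):
    """Subtract each group's mean; this absorbs group (state) fixed effects."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = values.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= values[mask].mean()
    return out


def slope(x, y):
    """OLS slope of y on x when both are already demeaned."""
    return float(np.dot(x, y) / np.dot(x, x))


def within_ols(y, d, groups):
    """Within-state OLS of the outcome on the connection dummy."""
    return slope(demean_within(d, groups), demean_within(y, groups))


def within_2sls(y, d, z, groups):
    """2SLS with one instrument z for one endogenous regressor d."""
    y_w, d_w, z_w = (demean_within(v, groups) for v in (y, d, z))
    first_stage = slope(z_w, d_w)   # effect of the instrument on connection
    d_hat = first_stage * z_w       # fitted (instrumented) connection
    return slope(d_hat, y_w), first_stage


# Simulated data only (hypothetical parameters, not the thesis data).
rng = np.random.default_rng(0)
n, n_states, true_effect = 40_000, 10, 1.0
state = rng.integers(0, n_states, size=n)
state_effect = rng.normal(size=n_states)[state]
dist_to_straight_line = rng.exponential(scale=30.0, size=n)  # km; the instrument
local_potential = rng.normal(size=n)                         # unobserved confounder
connected = (1.0 - 0.05 * dist_to_straight_line + local_potential
             + rng.normal(size=n) > 0).astype(float)
outcome = (true_effect * connected + 2.0 * local_potential
           + state_effect + rng.normal(size=n))

beta_ols = within_ols(outcome, connected, state)
beta_iv, first_stage = within_2sls(outcome, connected, dist_to_straight_line, state)
print(f"OLS: {beta_ols:.2f}  2SLS: {beta_iv:.2f}  first stage: {first_stage:.4f}")
```

In this simulation the instrument is valid by construction, because distance to the straight line is drawn independently of local potential. In the actual setting that exclusion restriction is an assumption, which is why the placebo-line checks matter.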

⁵ This argument ignores other geographic and climatic conditions that might call for deviations from straight lines between nodes. Deviations from straight lines might not be endogenous to the local economic development of connected areas if they are motivated by technical or geo-climatic characteristics of the localities the railway passes through.

We perform several identification checks. We show that our results are not driven by alternative transportation means. We also estimate the effect of placebo lines on our outcome variables. These lines are segments that were surveyed and proposed for railway construction, but were not constructed. They were abandoned for a variety of reasons unlikely to be related to short- or long-term economic development, such as the turnover of officials in charge of colonial railways and the conflicting interests of the colonial government (Jackel, 1997). If the effects of the railway we identify using our instrument reflect the developmental impact of the railway on localities closer to a straight line connecting nodes, we would expect the placebo lines to have no impact, especially in areas where placebo lines do not coincide with roads or waterways. Indeed, we do not find the various placebo lines to have any economic effect, whether the effects are estimated for the whole country or separately for the North where the railway had a significant impact.

Additionally, we use localities close to these placebo lines as a control group to analyze the impact of the railway. Precisely, we estimate the effect of being within 20 km of a railway line relative to being within 20 km of a placebo line. We find a large economic effect of the railway in the North. In the South, the effect is close to zero and is not statistically significant. These findings provide further evidence that the impact of the railway in the North and its non-impact in the South are indeed causal, and not merely driven by being close to a "straight line" connecting early urban areas. This is especially true if the placebo lines were not constructed for idiosyncratic reasons, as Jedwab et al. (2015) and Jedwab and Moradi (2015) argue.

Our results are robust to a variety of other confounders. In addition to the geo-climatic variables discussed above, our estimates are robust to controlling for the presence of mission stations within the local area and for distances to rivers and roads (possibly endogenous to railways). The estimated impacts of the railway lines are also found for migrants and non-migrants, and areas with and without mission stations. We detect a positive impact of the railway when we exclude local areas that are run through by railway tracks, nodes or stations. In addition, our results are robust to the exclusion of oil-producing areas of the South that might have altered the post-railway spatial equilibrium. Using a continuous measure of connectedness to the railway line (distance to a railway line) instead of the discrete measure in our main specification (being within 20 km of a railway line) yields quantitatively similar results. The estimated impacts of the railway lines are robust to various definitions of the control group: individuals outside 20 km, between 20 and 40 km, between 40 and 60 km, and so on. We further interpret these findings as evidence that there are no significant negative spillovers to adjoining localities.
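The discrete and continuous connectedness measures, and the distance bands used to vary the control group, can be made concrete with a short sketch. It is an illustration under stated assumptions rather than the procedure actually used: localities and rail lines are given in a hypothetical planar coordinate system measured in kilometers (real DHS coordinates are latitude and longitude and would need projecting first), the function names are invented here, and a locality exactly 20 km from the line is arbitrarily placed in the 20-40 km band.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (planar coordinates, km)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:                      # degenerate segment: a == b
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the segment, clamped to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_line(p, polyline):
    """Continuous measure: distance from p to the nearest segment of a rail line."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(polyline, polyline[1:]))


def within_20km(distance_km):
    """Discrete measure: 1 if the locality lies within 20 km of the line."""
    return 1 if distance_km < 20.0 else 0


def distance_band(distance_km, width_km=20.0):
    """0 for [0, 20) km, 1 for [20, 40) km, 2 for [40, 60) km, and so on."""
    return int(distance_km // width_km)


# Hypothetical rail line with two segments and three hypothetical localities.
rail = [(0.0, 0.0), (100.0, 0.0), (100.0, 80.0)]
localities = {"A": (50.0, 10.0), "B": (130.0, 40.0), "C": (-30.0, -40.0)}
for name, xy in localities.items():
    d = distance_to_line(xy, rail)
    print(name, round(d, 1), within_20km(d), distance_band(d))
```

Comparing band 0 against band 1, then against band 2, and so on, mirrors the alternative control-group definitions listed above.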

A few recent studies have examined the long-run economic impact of infrastructure investments (Huillery, 2009; Banerjee et al., 2012; Baum-Snow et al., 2012; Donaldson and Hornbeck, 2016; Donaldson, 2016; Bleakley and Lin, 2012). In Africa, urbanization is found to have been sustainably impacted by colonial railroads (Jedwab et al., 2015; Jedwab and Moradi, 2015). Similar to the argument we provide to shed light on the long-run economic impact of colonial railroads in Northern Nigeria, Jedwab and Moradi (2015) explain that, in Ghana, colonial railways lowered trade costs and boosted the cultivation of cocoa in railroad locations, fostering the emergence of cities in these locations. This initial spatial equilibrium persisted because railroad locations facilitated the coordination of subsequent investments.⁶

⁶ Also see Storeygard (2016), who underscores the importance of road networks and connection to coastal ports for local economic performance, and Fourie and Herranz-Loncan (2015), who document the importance of the railway in South Africa.

Our study differs from these papers in a number of important respects. First, we are interested in the heterogeneity of the railway impact. In particular, we do not find that colonial railways have had any local economic impact in Southern Nigeria, and areas closer to the coast, in contrast to their positive impacts in the Northern regions of the country. In further contrast to the average impacts of the railway we estimate, and hence to the general conclusions reached by the extant literature on the long-run effect of colonial railways on urbanization in Africa, we find no evidence that colonial railways were the engine of urbanization in Southern Nigeria. In fact, most cities in the South do not lie along the railway line, while almost all large cities in the North are connected to the railway. This is important for understanding the policy implications of recent studies of the impact of transportation investments. Our results suggest that these investments are most worthwhile in areas where they significantly improve market access and stimulate new trade. The contrast between the low adoption rates and impacts of the railway in the South, and the high adoption rates and impacts in the North, is consistent with the views of Fogel (1964) that new transportation technologies have little impact if pre-existing technologies are viable or can readily be improved.

The results on the impacts of railways on individual-level developmental outcomes are also of independent interest to studies of African urbanization. Fay and Opal (2000) and Jedwab and Vollrath (2015) document the poor economic performance of several urban areas in developing countries, compared to historical examples from other regions. We find that, in areas without pre-existing viable transportation technologies, connection to the colonial railway increased urbanization, and that individuals living in these urban areas are more educated, more literate, more likely to work in professional occupations, less likely to work in agriculture, more likely to engage with mass media (TV, radio, newspapers), and live in wealthier households. This suggests that, while urban areas are not industrializing or growing as fast as one would expect from historical examples, urban areas connected to the railway are still generally better off than surrounding countrysides. Furthermore, the fact that, within rural areas, colonial railways have a positive impact on individual-level economic outcomes in the North of Nigeria implies that estimated impacts are not entirely driven by urbanization.

The rest of the article is organized as follows. Section 3.2 depicts the historical context of the construction of the colonial railway in Nigeria. Section 3.3 presents our data and the various identification strategies and robustness checks that we use to assess the local impact of the railroads. The corresponding results are described in section 3.4. Section 3.5 analyzes the North-South differences in the long-run impact of the railway. Section 3.6 discusses the dynamics of the path of the impacts of the railway and compares short- and long-run effects. Section 3.7 documents the mechanisms underlying the heterogeneity of the impact of railroads. The final section concludes.

3.2 Historical Background

So vast an area as Nigeria, comprising in all some 380,000 square miles... cannot be commercially developed except by railways. (Colonial Report of Northern Nigeria, 1900-1901, p. 19, as quoted in Onyewuenyi, 1981, p. 65)

Toward the end of the nineteenth century, after the area now known as Nigeria officially came under British control, the colonial government began to seek out ways of linking the interior of the country to its ports in order to facilitate export trade. The construction of the railway was seen as an effective means of moving goods and services from the interior of the country to the coast. Construction of the railway lines largely occurred between 1898 and 1930, with an additional extension completed in 1964, after independence.⁷ The railway was generally constructed to open up the country to export trade with Europe. Three specific reasons were given for the construction of the various lines: agricultural, mineral exploitation, and political or administrative reasons (Taaffe et al., 1963; Onyewuenyi, 1981).

Table 3.A.1 shows the dominant motivations for each of the lines constructed between 1898 and 1964. It establishes that the export of agricultural products was the main motive for the railway. Of all the segments shown in Table 3.A.1, only Zaria-Jos-Bukuru and Kaduna-Kafanchan were not constructed for agricultural exploitation reasons. In terms of spatial distribution, the colonial railroads were slightly more extensive in the Northern region, which covers four-fifths of the country's area.⁸

3.2.1 Alternative Transportation Modes

Before the railways, transportation of goods was done through head portage, bicycles, animals, carts and inland waterways. In the North, there were caravan routes going through Timbuktu to major agglomerations such as Kano and Sokoto and on to North Africa (Cairo, Tripoli). One consequence of the railway, as we discuss later in the article, was to redirect Northern trade from Trans-Saharan routes to the coastal ports.

⁷ A rail line joining Abuja to Kaduna was built between 2011 and 2014. Since it was constructed after the dates of our outcome measures, it should have no bearing on our results.

⁸ There are on average 3.1 meters of rail per square kilometer of area in the South and 3.4 in the North.

The most important transportation mode for goods before the advent of railways was inland waterways. Many rivers, their tributaries, and creeks traverse the coastal plains of the country. In the South, between the coastal ports of Lagos and Opobo for example, the abundant creeks allow transportation of produce, and many ports were installed along the way: Epe, Sapele, Warri, Forcados, Burutu, Brass, Degema, and Port Harcourt (Onyewuenyi, 1981). These river networks, as well as direct access to roads using bicycles, exposed the South to trade with Europe long before the railway was constructed. While the country's two main rivers, the Niger and the Benue, run through the North, the rivers are navigable only for part of the year and for a fraction of the distance they cover. They are heavily dependent on water levels in the rainy season, and the Niger itself is filled with dangerous rapids. As a result, the only available means of consistent transportation from the Northern parts of the country to the coast was through roads, which were not viable because of the enormous distance and other dangers of road transportation in pre-colonial Nigeria.⁹ For example, Hodder (1959), in a study of tin-mining in Jos, estimated that the journey from the mines to the coast took 35 days by road, and while this was tolerable for mining, it was not conducive to agricultural exports. These problems with river and road transportation meant that most areas of the North were cut off from export trade prior to the construction of the railway.

3.2.2 Railway Construction

⁹ On average, Northern populated areas and coastal ports are more than 600 km apart.

The railway construction was done in three main phases. The first phase consisted of initial penetration lines. The origination points were the ports of Lagos, Zungeru and Baro for the Western line, and Port Harcourt for the Eastern line. The Western line originated in Lagos in 1898 and reached the Niger river at Jebba in 1909. The construction of the Eastern line began in 1913 in Port Harcourt and reached Enugu by 1916. In the second phase of the railway development, more interior centers were linked to the ports with lines such as Baro-Kano and Enugu-Kaduna. By 1927, both main North-South links were established, giving Northern areas access to the ports of Lagos and Port Harcourt. Building branch lines and extensions such as Zaria-Kaura Namoda or Kano-Nguru made up the third stage of railway development. At the end of this phase, in 1931, the railway was 3,067 km long.

New centers of economic activities quickly appeared along the newly constructed railroads. By the time the main lines were built, more than 200 buying and selling stations had emerged along the railway lines (Onyewuenyi, 1981). One of the fastest growing centers was the coastal town of Port Harcourt, which was chosen as a terminal node of the Eastern line in 1913 before it even existed as a town. Because of its deeper harbor and direct access to the hinterland, Port Harcourt had developed, by the 1930s, as the second largest port of the country, at the expense of previously established non-railway ports within the region such as Bonny, Opobo, or Degema. Similarly, in the "Lagos" area, the railway did not begin in Lagos itself but in another town known as Iddo, because Lagos is an island, which would have made construction more expensive. We exploit these local idiosyncrasies to motivate one of the empirical strategies we use to estimate the impact of the railway.

3.2.3 Growth of Export Agriculture Following the Railway Construction

In the Northern Provinces, the history of export cotton production, like that of groundnut has been closely linked with the history of railway expansion, and it was not until the railway reached Kano in 1912 that the export cotton production attained any importance. (Lamb, 1925, p. 19)

The incentives to produce more than what was needed for consumption were weak in remote areas in the North of the country, especially in areas poorly connected to rivers. The advent of the railways dramatically changed the trade opportunities available to these areas. The railways were used almost exclusively for goods transportation, as more than 90% of rolling stock units were devoted to goods service. Over the period 1901-1950, an average of two-thirds of these goods were agricultural products. According to the Colonial Reports of 1913, only a year after the first railways were built in the North of Nigeria, the value of Northern agricultural exports jumped by 150% (groundnuts by 666%, benniseed by 157%, gum arabic by 133%, cotton lint by 45%, hides and skins by 41%, and sheanut products by 20%). Acreage under cultivation increased at all station areas.

The railway stations allowed the concentration of markets along the railroads, making possible the clustering of a traditionally scattered population and agricultural production. The development of export agriculture was initially limited to "the irradiation area of the railways, the inter-regional roads and auxiliary local roads" (Schätzl, 1973, p. 89). As our results indicate, the boom in export agriculture that followed the railway construction had long-run consequences on the human and social development of people living in areas connected to the railways. This impact was concentrated in the North of the country, which was the main beneficiary of the introduction of railways.

Next, we describe the empirical strategies and data we use to estimate the long-term impact of the colonial railway in Nigeria, and how this impact differs according to initial market access.

3.3 Data and Empirical Strategy

3.3.1 Data

Data on colonial railroads in Nigeria come from the Digital Chart of the World (DMA, 1992). These data are combined with individual-level data from the 2008 Nigeria Demographic and Health Survey [DHS] (NBS and ICF International, 2008) to estimate the long-run impacts of railroads. The DHS uses a two-stage probabilistic sampling technique to select clusters in the first stage and households in the second stage. In general, DHS clusters are census enumeration zones, to which we will sometimes refer simply as localities. Using DHS-provided information on the geographical coordinates of each such locality, we match individuals to local areas10 and rail networks. The DHS provides information on each individual's characteristics including age, sex, migration status, religion, ethnicity, and area of residence. Individuals without specific information on ethnicity are dropped from the analysis. Information on each of our individual-level outcome variables - years of schooling, literacy, type of employment (professional or agricultural), and the frequency at which an individual reads newspapers, listens to the radio, and watches TV - is also available in the DHS. Household-level variables include size, the gender and age of the household head as well as a composite wealth index.11

We complement our outcome dataset with panel data on urban population density and city presence in 1900, 1960, and 2010, from Jedwab and Moradi (2015). This allows us to analyze urbanization outcomes and explore short- and long-run effects. We collect detailed information on geographic, climatic, and soil conditions from the FAO GAEZ database (Fischer et al., 2008). Specifically, we gather information on average annual rainfall (in mm), average annual temperature (in degrees Celsius), elevation (in meters), two important soil characteristics (nutrient retention capacity and workability), and suitability for the production of cocoa, cotton, groundnuts, and oil palm.12 We also collect information on the presence of primary roads and major rivers in each local area. These data are supplied by DMA (1992) and are available in Hijmans et al. (2001). Lastly, we collect data on the presence of Christian mission stations in local areas in 1928 by combining maps published in Ayandele (1966) and Roome (1925). Altogether, we have information on over 30,000 individuals living in 22,798 households belonging to 845 clusters (localities) spread out over 550 local government areas in 37 states. These individuals belong to 30 major ethnic groups that make up over 90% of the country's population.

10 The local government area is the smallest administrative unit in Nigeria, with an average area of 1,020 km² and a median area of 705 km², and serves as the primary measure of the "local area" in which the individual lives. Local government area boundaries are obtained from the GADM database of global administrative areas (UC Berkeley, 2014).

11 A measure of households' cumulative living standard based on observables such as asset ownership (radio, TV, bicycles etc.), materials used in housing construction, water access, and sanitation. See http://www.dhsprogram.com/topics/wealth-index/Wealth-Index-Construction.cfm for more details on the construction of this index by DHS country teams.
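The matching of clusters to the rail network relies on distances computed from geographic coordinates. The text does not say how these distances are computed, so the following is only a minimal sketch under stated assumptions: it uses the haversine great-circle formula, a mean Earth radius of 6,371 km, and the nearest digitized rail vertex as a stand-in for the nearest point on the track. The function names and the toy coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_rail_km(cluster, rail_vertices):
    """Distance from a cluster (lat, lon) to the nearest rail vertex.

    Using vertices rather than full line segments is an approximation that
    is reasonable only when the track is densely digitized.
    """
    lat, lon = cluster
    return min(haversine_km(lat, lon, rlat, rlon) for rlat, rlon in rail_vertices)


# Toy coordinates, not taken from the DHS or the Digital Chart of the World.
toy_track = [(0.0, 1.0), (0.0, 2.0)]
toy_distance = distance_to_rail_km((0.0, 0.0), toy_track)
```

A production version would more likely project the coordinates and measure distance to the line geometry itself with a GIS library.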

3.3.2 Identification Strategies

Since railway-connected and railway-unconnected areas might differ on many dimensions that are relevant to economic development, comparing them does not necessarily yield the long-term impact of the railway. We use a mix of strategies to deal with the possible endogeneity of railway placement. We first use a state fixed effects approach that compares connected and unconnected areas within the same state, so that areas in states targeted to host railroads are not compared with areas in non-targeted states. We then exploit an instrumental variable strategy based on straight lines between major railway nodes. We complement these strategies with a host of falsification exercises and identification checks using placebo lines and various definitions of control groups.

12 The soil characteristics are measured on a 4-point scale ranging from no or slight constraints (1) to very severe constraints (4). Crop suitability is the average estimated agro-climatically attainable yield in kg/ha within the local area for rain-fed agriculture using medium or low inputs, because that is the dominant form of agriculture in Nigeria. These measures are provided by the FAO for cells of 30 x 30 arcseconds (approximately 0.9 km² at the equator) (FAO, 2016).

State Fixed Effects

The railway was intended to connect large areas suitable for agricultural and mineral exploitation to the coast. In order to avoid comparing targeted areas with non-targeted areas, we include state fixed effects in our regressions, since states are the administrative level that most closely captures these areas of interest.

Within states, we compare areas that are close to the railway lines to areas that are further away. A concern with this strategy is that even within targeted areas, the railway might be placed in areas that are systematically different on dimensions relevant to economic development. Historical accounts of railway placement tend to indicate otherwise. Engineering decisions that took into account elevation and other considerations based on the cost of the railway construction were central to determining the exact location of the railway within a targeted state. For instance, the railway line that terminates in Lagos state actually terminated in a small locality called Iddo because the city of Lagos was an island and having the terminus in such a city would have been very expensive. A second example is the line that terminates in the "Oil Rivers" which is now the city of Port Harcourt. Despite the fact that the city did not even exist at the time, it was chosen instead of prosperous pre-colonial ports such as Bonny, Calabar, New Calabar, and Opobo, because of its deeper harbor and more direct access to the hinterlands.

Comparing observables across railway-connected and unconnected areas lends further support to the claim that the exact placement of the railway within targeted areas was not systematically in areas that were more developed or more likely to develop in the future. We refer to localities within 20 km of a rail track as connected areas. Table 3.A.2 presents summary statistics for various observable characteristics in connected and unconnected areas. Baseline observables such as crop suitability, geo-climatic factors and soil constraints do not differ significantly between connected and unconnected localities. Strikingly, connected individuals are not more likely to live in local areas with mission stations, nor in local areas with primary roads, nor in areas crossed by a major navigable river. Connected areas had fewer cities in 1900 than unconnected areas and were not more likely to have cities before the introduction of the railway or to have larger urban populations.

Nevertheless, we control for all these geographic, climatic, population-based and other pre-railway observables. For individual and household-level outcomes, we control for additional factors such as age, religion, ethnicity and household size in order to get more precise estimates of the railway effect.13

Specifically, our baseline specification for individual- and household-level outcomes is:

$$Y_{i,h,a,e,s} = \alpha + \beta R_{<20} + X_a \Lambda + X_i \Pi + X_h \Gamma + \sigma_s + \gamma_e + \epsilon_{i,h,a,e,s} \qquad (3.1)$$

The parameter of interest, $\beta$, is the effect of living within 20 km of a railway track on outcome Y. The outcome is measured for each individual i in household h, who lives in local area a in state s, and belongs to ethnic group e. The outcome variables are education (years of schooling, literacy), occupation (professional or agricultural worker), media access (newspaper, radio, TV), the DHS-based household wealth index and the probability of living in an urban area. If the railway has a positive impact on local development today, then individuals who are closer to the railway should be more educated and more literate, and should have greater access to the media. Under this hypothesis, we would also expect individuals in railway areas to be more likely to live in urban areas and in wealthier households and to be non-farm workers.14

Individual, household, and local area observable characteristics are denoted by $X_i$, $X_h$, and $X_a$, respectively. All regressions involving individual- or household-level outcomes include state fixed effects ($\sigma_s$) and ethnic group fixed effects ($\gamma_e$). Individual controls ($X_i$) are age, age-squared, and an indicator for being Christian. At the household level, we control for gender and age of the household head, as well as for the size of the household. At the local area level, we control for average rainfall, temperature, soil nutrient retention capacity and workability, elevation, suitability for key colonial-era cash crops (oil palm, cocoa, cotton, groundnut), and presence of a mission station in the local area as of 1928. In the remainder of the article, these local area controls are referred to as baseline controls.

For urbanization outcomes (city presence and city growth), our baseline specification has the same structure as in (3.1) but the set of controls excludes individual and household-level controls:

$$Y_{a,s} = \alpha + \beta R_{<20} + X_a \Lambda + \sigma_s + \epsilon_{a,s} \qquad (3.2)$$

13 While we use living within 20 km of a rail line as the measure of connectedness, we confirm that observables are also balanced using other measures of connectedness such as an indicator for having the railway pass through the local area. We also divide the country into 40 km x 40 km grid cells and show that observables are balanced between connected and unconnected grid cells. These results are available upon request.

14 We categorize an individual as being literate if they can read some or parts of a sentence. We deem them as utilizing media resources (newspaper, radio, TV) if they use these resources at some point during a month. We adopt broad and inclusive definitions in order to provide conservative estimates, and deal with concerns about arbitrary cutoffs for inclusion into these categories. Our results are in fact stronger if literacy is limited to individuals who can read whole sentences, or if media use is limited to individuals who use media at least once a week.
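To make concrete what the state fixed effects do in these specifications, the sketch below computes the coefficient on the within-20-km indicator after demeaning both the indicator and the outcome within each state. It is a simplified illustration, not the estimation code behind the tables: it omits every control, the ethnic group fixed effects and the clustered standard errors, and the function name and toy data are hypothetical.

```python
from collections import defaultdict


def within_state_effect(rows, cutoff_km=20.0):
    """State-fixed-effects estimate of the connectedness coefficient.

    rows: iterable of (state, distance_to_rail_km, outcome).
    With a single regressor, OLS with state dummies equals the slope of the
    state-demeaned outcome on the state-demeaned indicator.
    """
    by_state = defaultdict(list)
    for state, distance_km, outcome in rows:
        connected = 1.0 if distance_km <= cutoff_km else 0.0
        by_state[state].append((connected, outcome))

    numerator = denominator = 0.0
    for observations in by_state.values():
        mean_r = sum(r for r, _ in observations) / len(observations)
        mean_y = sum(y for _, y in observations) / len(observations)
        for r, y in observations:
            numerator += (r - mean_r) * (y - mean_y)
            denominator += (r - mean_r) ** 2
    if denominator == 0.0:
        raise ValueError("no within-state variation in connectedness")
    return numerator / denominator


# Toy data: state "A" has higher outcomes overall than state "B", but the
# connected/unconnected gap is the same inside each state.
toy_rows = [
    ("A", 5, 7.0), ("A", 10, 9.0), ("A", 50, 6.0), ("A", 70, 6.0),
    ("B", 15, 3.0), ("B", 30, 1.0), ("B", 90, 1.0),
]
toy_beta = within_state_effect(toy_rows)
```

Because only within-state deviations enter the calculation, level differences between states cannot drive the estimate, which is the point of comparing connected and unconnected areas inside the same state.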

Instrumental Variable Approach

To further address endogeneity concerns, we adopt an instrumental variable approach, similar to that used in Banerjee et al. (2012) and Jedwab and Moradi (2015). We exploit the distance to straight lines joining major nodes and use it as an instrument for being connected to the railway line. We also exclude individuals living in railway nodes from the sample. The identification assumption is that, besides its correlation with the railway line, a straight line connecting nodes is unrelated to economic development.

The major nodes are chosen to be major historical cities existing at the time of the introduction of the railway such as Lagos, Abeokuta, Ibadan, Kano, and junctions in the middle of the country such as Kafanchan. We connect these 12 major nodes in the spirit of the railway introduction, that is, by finding the minimal path to connect areas of interest to the coast for each railway line defined by periods of planning/construction (pre-1912, 1916-1930, and 1964). The result of this simple algorithm is shown in Figure 3.A.2.

The instrument for being within 20 km of a rail line is defined as the log of the distance to node-joining straight lines. All observations within the same local government area as a node are dropped. Only "intermediate" observations are used to estimate the specification.
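The construction of the instrument can be sketched as follows. This is an illustration under assumptions that the text does not state: coordinates are taken to be already projected to a planar system measured in km, the logarithm is natural, and observations lying exactly on a line are rejected because their log distance is undefined. All function names and the `area_id` field are hypothetical.

```python
import math


def distance_to_segment(point, start, end):
    """Shortest planar distance from point to the segment start-end (same units)."""
    (px, py), (ax, ay), (bx, by) = point, start, end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate segment: both nodes coincide
        return math.hypot(px - ax, py - ay)
    # Position of the orthogonal projection along the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def log_distance_instrument(point, node_lines):
    """Log of the distance from point to the nearest node-joining straight line."""
    nearest = min(distance_to_segment(point, a, b) for a, b in node_lines)
    if nearest <= 0.0:
        raise ValueError("point lies on a node-joining line; log distance undefined")
    return math.log(nearest)


def drop_node_areas(observations, node_area_ids):
    """Keep only 'intermediate' observations, i.e. those outside node areas."""
    return [obs for obs in observations if obs["area_id"] not in node_area_ids]


# Toy example: two straight lines leaving a node placed at the origin.
toy_lines = [((0.0, 0.0), (100.0, 0.0)), ((0.0, 0.0), (0.0, 100.0))]
toy_instrument = log_distance_instrument((50.0, 10.0), toy_lines)
```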

Identification Checks

We use a number of exercises to confirm the causality of the effects we uncover.

Alternative Transportation Technologies. A natural concern with our empirical strategies is that distances to straight lines connecting nodes could be correlated with roads and other transportation technologies and that the impacts we bring to light might be unrelated to the railway line. We address this concern by demonstrating that the estimated impacts of the railway line are robust to the inclusion of other transportation technologies.

Placebo Lines. Following Donaldson (2016), we use placebo lines to test whether the effects we measure have to do with the railway and not merely with being close to lines joining nodes that could have been linked by the railway or any other important transportation technology. Placebo lines are segments that were surveyed and proposed for railway construction but were never actually constructed. Given the prohibitive costs of railway surveys, these segments were seriously considered. They were ultimately abandoned for a host of plausibly exogenous reasons that have to do with the turnover of officials in charge of the Nigerian railways and the conflicting interests of the British colonial decision-makers (unlikely to be related to local economic development). The data on placebo lines come from Jaekel (1997), who lists the lines that were extensively surveyed but ultimately not constructed.

As Figure 3.A.3 shows, the placebo network that we reconstruct covers an extensive part of the country. In the Southern part of the country, these lines were meant to connect already existing cities. Following independence, these very early cities were finally connected by roads. Thus, in this exercise, we control for the effects of alternative technologies (roads and rivers). We are aware of the fact that the placement of roads, presented in Figure 3.A.4, might be endogenous to the existing railways. Hence, the results of this identification check are only suggestive and should be interpreted with caution. Our hypothesis is that if the effects that we uncover are indeed causal, one would expect them to disappear or reverse once we replace actual railway lines with placebo lines, especially in areas where placebo lines do not coincide with roads or waterways.

Actual Lines versus Placebo Lines. As explained above, it is plausible to assume that surveyed localities were de facto exogenously assigned to two groups: the group of localities connected to the railway, and the group of localities that were surveyed but not connected (placebo lines). This makes the use of areas close to placebo lines a powerful control group to check our identification. We implement this by re-estimating specification (3.1) on the clusters that are within 20 km of rail or placebo lines. This analysis would effectively yield a causal effect of the railway if, as argued above, placebo lines were not constructed for exogenous reasons (or reasons not related to long-run economic development).

Varying Control Groups. Another concern with our identification strategies is that we might be merely identifying differences between localities within 20 km of a railway and those very far away, such as clusters over 80 km away from the railway line, which may not be good control clusters. This is because clusters that are very far away are more likely to be different in dimensions not captured by our control variables. To address this concern, we break down the control group into various distances: clusters within 20-40 km, 40-60 km, 60-80 km, and farther than 80 km of the railway line.15 We then re-estimate equation (3.1) using the disaggregated distances as different control groups, and exclude the indicator for individuals living beyond 80 km of the railway from the regression. The coefficient on living within 20 km of a railway line now represents the impact of the railway relative to individuals living beyond 80 km of a railway. This strategy allows us to compare individual outcomes across multiple distances and to account for potential spatial spillovers beyond 20 km. A similar exercise consists of estimating equation (3.1) on samples limited to areas within 40 km, 60 km and 80 km of the railway. Both exercises yield similar results.

Other Identification and Robustness Checks. We complement our empirical strategies with the following identification and robustness checks. We test the robustness of our results to other measures of connectedness such as continuous measures based on the distance to the rail lines or the distance to the closest railway station or an indicator for being in the same local government area as the rail line.

We also test the robustness of our results to: using Conley standard errors to correct for spatial autocorrelation, limiting the sample to rural areas only, limiting the sample to migrant individuals or to non-migrants, excluding areas with mission stations, excluding local government areas with rail tracks or rail stations, and even excluding areas within 20 km of a railway station.

15 In our dataset, we find that 32% of individuals live within 20 km of a railway track, 12.3% within 20-40 km, 13.57% within 40-60 km, 11.42% within 60-80 km, and about 30.61% farther than 80 km of a rail line.

3.4 Average Effect of the Railway: Countrywide Estimates

3.4.1 State Fixed Effects Results

Table 3.A.3 presents results of the estimation of our main specification (3.1). Standard errors are clustered at the local government area level in order to deal with arbitrary correlation between localities (clusters) within local areas.16 As reported in Column 1, living within 20 km of a rail line increases schooling attainment by 1.37 years on average. This is associated with a 14% increase in the probability of being literate, a 1.8% increase in the probability of working in a professional wage job, and a 7.1% decline in the probability of being an agricultural worker (Columns 2-4). Furthermore, being connected to the rail line is positively associated with media access. Individuals in connected areas are more likely to read newspapers, listen to the radio and watch TV (Columns 5-7). Finally, being connected to the rail line is associated with living in a wealthier household, and it increases the probability of living in an urban area by 18% (Columns 8-9). These results are all consistent with a strong positive impact of proximity to the rail line on individual and household development outcomes.

It is interesting that none of the geographic and climatic variables have a consistent impact on the outcomes. Similarly, we do not find that areas that are more suitable for oil palm, cocoa, cotton, or groundnut are more developed in the present. Importantly, this evidence supports our identification assumption that, within a state, geographic characteristics, and any pre-colonial advantages they might have conferred, are largely unrelated to contemporary development. However, and in accordance with previous studies, we find that missionary activity is strongly associated with development at the local level (Gallego and Woodberry, 2010; Nunn, 2014; Okoye and Pongou, 2014; Cagé and Rueda, 2016; Wantchekon et al., 2015). Local missionary activity has a positive impact on years of schooling, literacy, occupational choices, media access, household wealth and urbanization. The impact of the railway and missionary activity, and the non-impact of geographic and climatic variables, are remarkable and speak to the importance of historical circumstances for development at the local level.

16 We show in Section 3.4.4 that our results are robust to using Conley standard errors to correct for spatial autocorrelation. We find that the Conley standard errors are similar to the cluster-robust standard errors used throughout the article (see Table 3.B.2).

3.4.2 Instrumental Variable Estimates

Before turning to instrumental variable results, we first note that the first-stage estimates (presented in Table 3.A.4) indicate that a doubling (100% increase) of the distance to a line joining nodes lowers the probability of being connected to the rail line by about 29.4%. Also important is the finding that missionary activity and most of the geo-climatic variables are uncorrelated with rail presence. An exception is elevation which is negatively correlated with probability of connectedness, a result consistent with a higher cost of building in elevated areas.

The 2SLS estimates of the impact of being connected to the railway are shown in Table 3.A.5. They are generally not statistically different from OLS estimates. An exception is the estimated impact of being within 20 km of a rail line on the probability of being an agricultural worker, which falls in magnitude from -7.1% (OLS) to -4.0% (IV). The IV approach, along with the robustness check results reported below, lends support to our causal interpretation of the impact of the railway.
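For intuition on how the IV estimate is formed, consider the stripped-down case of one instrument, one endogenous regressor and no controls. In that just-identified case, 2SLS reduces to the ratio of two covariances: the covariance of the instrument with the outcome over its covariance with the regressor. The sketch below implements only this special case; the estimates in Table 3.A.5 additionally include the controls and fixed effects of specification (3.1), and the function name and toy data are hypothetical.

```python
def iv_estimate(instrument, regressor, outcome):
    """Just-identified IV estimate: cov(z, y) / cov(z, x), with no controls."""
    n = len(instrument)
    if not (n == len(regressor) == len(outcome)) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_z = sum(instrument) / n
    mean_x = sum(regressor) / n
    mean_y = sum(outcome) / n
    cov_zy = sum((z - mean_z) * (y - mean_y) for z, y in zip(instrument, outcome))
    cov_zx = sum((z - mean_z) * (x - mean_x) for z, x in zip(instrument, regressor))
    if cov_zx == 0.0:
        raise ValueError("instrument is uncorrelated with the regressor")
    return cov_zy / cov_zx


# Toy data: z is a log-distance-like instrument, x the connectedness
# indicator, and y = 1 + 2 * x exactly, so the estimate should recover 2.
toy_z = [0.0, 1.0, 2.0, 3.0]
toy_x = [1.0, 1.0, 0.0, 0.0]
toy_y = [3.0, 3.0, 1.0, 1.0]
toy_iv = iv_estimate(toy_z, toy_x, toy_y)
```

The denominator is the first-stage relationship: if distance to the straight lines did not predict connectedness, the ratio would be undefined or very noisy.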

3.4.3 Identification Checks Results

Alternative Transportation Technologies. As shown in the top panel of Table 3.A.6, we control for the presence of other transportation technologies, because if, for instance, roads were built close to or along some of the railway lines, their impacts could be picked up by our estimates. When roads and rivers are controlled for,17 we still find a robust long-term impact of the railroads. The fact that the results are not driven by the correlation of the railway with other transportation infrastructure is important as, otherwise, this might call into question the identification assumption behind the IV strategy. We find that increased distance to roads is negatively correlated with living in an urban area, household wealth, and other measures of local development. Rivers, on the other hand, are not positively related to development, consistent with our findings on other geographic variables. However, we do not take the estimated impacts of the road network as causal because roads might have been constructed in response to the rail network. These results suggest nevertheless that the estimated impact of the railway cannot be explained away by the road network.

Placebo Lines. We conduct two different exercises involving placebo lines. First, we replace actual railways by placebo rail lines (the surveyed lines that were eventually not constructed), and exclude railway-connected areas. As a result, most estimates decrease dramatically and lose their statistical significance. As shown in the middle panel of Table 3.A.6, the estimates of the coefficients on outcomes such as schooling, literacy, professional occupation, reading newspapers and being an urban resident are significantly smaller and not statistically different from zero. In the bottom panel, we control for the presence of other transportation technologies, because transportation technologies were built on some of the placebo segments, and their impacts could be picked up by our estimates. Indeed, when roads and rivers are controlled for, the results for the placebo effect, presented in the bottom panel of Table 3.A.6, paint a clear picture. Most of the coefficient estimates are not significant at the 5% level.

17 Data on the presence of major roads and navigable rivers are obtained from DMA (1992) and are available in Hijmans et al. (2001).

Second, we use localities close to these placebo lines as a control group to analyze the impact of the railway. Precisely, we estimate the effect of being within 20 km of a railway line relative to being within 20 km of a placebo line. As shown in the top panel of Table 3.A.7, we find a large economic effect of the railway compared to placebo areas. These results suggest that localities closer to the railway lines have better development outcomes today than localities close to other straight lines joining major cities and initially proposed to be connected by rail.

By providing evidence that straight lines connecting pre-existing cities are not correlated with local development outside of localities connected to the railway line, these placebo results reinforce our causal interpretation of the effect of railroads.

Varying Control Groups. As shown in the bottom panel of Table 3.A.7, the impact of the railway on individuals living within 20 km of the rail line is robust to the use of different control groups. The impact of being connected to the rail line does not change significantly when we compare individuals living within 20 km of the rail line to those living within 20-40 km or to those living farther away. In fact, the estimated impact of the railway is stronger when individuals living within 20 km of a rail line are compared to those living a further 20 km away at most. For example, relative to individuals living beyond 80 km of a railway line, being connected is associated with an additional 1.23 years of schooling. However, when compared to those living just "outside", within 20-40 km, the impact of being connected is an increase of about 1.6 years of schooling (1.231 + .362), although this difference is statistically insignificant. Thus, extending the control group to all distance groups actually produces a conservative estimate of the impact of the railway as there is no evidence of a positive spillover beyond 20 km of a rail line. Overall, our results strongly suggest that being connected to the historical railway line has a positive impact on development in present-day Nigeria, even though the railway has deteriorated since the country's independence.
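The arithmetic behind the 1.6-year figure follows from how distance-band dummies work: each coefficient is measured relative to the omitted band (beyond 80 km), so the gap between two included bands is the difference of their coefficients. The sketch below illustrates this with band means, which coincide with the dummy coefficients only when there are no other controls. The function names and toy data are hypothetical, and the value -0.362 for the 20-40 km band is inferred from the sum reported in the text rather than read from Table 3.A.7.

```python
from collections import defaultdict

# Upper bounds (km) and labels of the distance bands named in the text.
BANDS = [(20.0, "0-20"), (40.0, "20-40"), (60.0, "40-60"), (80.0, "60-80")]
OMITTED = ">80"  # reference category left out of the regression


def distance_band(distance_km):
    """Assign a distance to its band; anything above 80 km is the reference."""
    for upper, label in BANDS:
        if distance_km <= upper:
            return label
    return OMITTED


def band_coefficients(rows):
    """rows: iterable of (distance_km, outcome).

    With band dummies and no other controls, each OLS coefficient equals the
    band mean minus the mean of the omitted band.
    """
    groups = defaultdict(list)
    for distance_km, outcome in rows:
        groups[distance_band(distance_km)].append(outcome)
    base = sum(groups[OMITTED]) / len(groups[OMITTED])
    return {band: sum(vals) / len(vals) - base
            for band, vals in groups.items() if band != OMITTED}


def contrast(coefficients, band_a, band_b):
    """Effect of being in band_a relative to band_b."""
    return coefficients[band_a] - coefficients[band_b]


toy_rows = [(5, 8.0), (10, 9.0), (30, 6.0), (35, 6.0), (100, 7.0), (120, 7.5)]
toy_coefficients = band_coefficients(toy_rows)

# Inferred values: 1.231 - (-0.362) = 1.593, i.e. about 1.6 years.
implied_gap = contrast({"0-20": 1.231, "20-40": -0.362}, "0-20", "20-40")
```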

3.4.4 Additional Identification Checks Results

We also confirm that the estimated impact of the railway is not sensitive to the measure of connectedness. Although the point estimates are not directly comparable, the impact of the railway is robust to the use of a continuous measure of closeness to the rail line.18 The corresponding results are presented in the top panel of Table 3.B.1. Our results are also robust to using other measures of connectedness to the railway such as proximity to railway stations (second panel) or an indicator for being in the same local government area as the rail line (bottom panel of Table 3.B.1). In Table 3.B.2, we show that our results are robust to using Conley standard errors to correct for spatial autocorrelation.19 We find that Conley standard errors are not much different from the cluster-robust standard errors used throughout the article. We continue to find an impact of the railway when we exclude individuals living in urban areas (top panel of Table 3.B.3). This important result indicates that our results are not merely driven by urbanization. We also estimate the differential impacts of railways by migration status (second and third panels of Table 3.B.3). While the impacts are larger for migrants, the estimates are not statistically different from the estimated impact on non-migrants.20 This suggests that the long-term effects of railroads are not driven primarily by migrants who might have higher ability, education or skills.

18 It is defined as -log(1 + clusterdistance), where clusterdistance is the survey cluster's distance to the railway network.

19 We use the methodology described in Conley (1999), and follow the implementation by Rappaport (2007), with a cutoff of 80 km.

The top panel of Table 3.B.4 shows that our results are not driven by missionary activity, since they are robust to excluding areas with mission stations. If anything, the impact of the railway is stronger in localities without mission stations. In areas with missionary activity, missions had a positive impact on schooling, literacy, and media access, possibly attenuating the impact of the railway on these outcomes.

The second panel of Table 3.B.4 indicates that the impact of the railway is sustained when we exclude local areas that contain rail tracks. This is further evidence that the effect that we find is driven by connectedness to the railway line, and not merely by the presence of a railway line or station in the local area. Finally, an important part of the connectedness to the railway line is the proximity to railway stations. The bottom panel of Table 3.B.4 suggests that the impact of the railway is attenuated when we exclude areas within 20 km of railway stations.
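The two main measures of connectedness used in these checks, the 20 km indicator and the continuous closeness measure defined in the text as -log(1 + distance), can be written down directly. The sketch assumes a natural logarithm and distances in km, neither of which the text states explicitly; the function names are hypothetical.

```python
import math


def is_connected(distance_km, cutoff_km=20.0):
    """Binary connectedness: 1 if the cluster lies within the cutoff of the rail line."""
    return 1 if distance_km <= cutoff_km else 0


def closeness(distance_km):
    """Continuous closeness: -log(1 + distance).

    Equals 0 on the rail line and falls as distance grows, so a positive
    coefficient on this measure means outcomes improve with proximity.
    """
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    return -math.log(1.0 + distance_km)
```

Because one measure is a 0/1 indicator and the other is on a log scale, their coefficients are on different scales, which is why the point estimates are not directly comparable.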

3.4.5 Urbanization Outcomes Results

In addition to individual and household-level outcomes, we explore urbanization outcomes, namely urban population density and city presence, using the methodology developed in Jedwab and Moradi (2015). We analyze the long-run effect on city presence and urban population (measured in 2010) of the presence of rail tracks within 20 km of a grid cell, controlling for the 1900 population density Z-score, missionary presence, and state fixed effects. Using the standardized score (Z-score) ensures that we measure changes relative to the mean, and controlling for 1900 Z-score ensures that we capture relative city growth. The results are presented in the first column of Table 3.A.8. We find that the presence of a rail track within 20 km of a cluster has a positive effect on both outcomes. Furthermore, the estimates indicate that controlling for 1960 rates of urbanization (column 4), the railway had no further impact on urbanization, suggesting that the impacts of the railway on urbanization largely occurred before independence in 1960.

20 Non-migrants are defined as individuals who indicate they have never lived anywhere else besides their current place of residence.
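The standardization step can be sketched as follows. The text does not specify whether the population or the sample standard deviation is used, so the choice of the population version here is an assumption, as are the function name and the toy values.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sd = pstdev(values)  # population SD; an assumption of this sketch
    if sd == 0:
        raise ValueError("cannot standardize a constant series")
    return [(v - mu) / sd for v in values]


# Toy urban population densities for eight grid cells (not thesis data).
toy_density = [2, 4, 4, 4, 5, 5, 7, 9]
toy_z = z_scores(toy_density)
```

Standardizing each year's outcome separately puts 1900, 1960 and 2010 values on a common scale, so a coefficient on the 2010 Z-score that conditions on the 1900 Z-score reads as growth relative to other cells.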

3.5 North-South Differences in the Impact of the Rail Line

The North and the South of Nigeria had very different situations at the advent of the railway. The South had access to ports of export, thanks to its proximity to the coast and to its use of waterways, while there was no viable transportation alternative from the Northern areas to access the coast. In addition to having alternative access routes to the European market, the South had established and operated trade routes prior to the construction of the railway, thanks to centuries of pre-railway European trade (Anene, 1966; Crowder, 1980; Falola and Heaton, 2008).21 These pre-railway differences motivate our exploration of the impact of the railway in each of these regions, separately.

21 This is reflected in trade statistics. Between 1900 and 1904, the South was already exporting an annual average of 176,511 tons of palm produce to Europe, while the North was exporting a modest 475 tons of its main crop: groundnuts.

3.5.1 Estimated Impact of the Railway in the North and in the South

Table 3.A.9 presents estimates of the impact of connectedness to the railway line on contemporary individuals living in the North and in the South of Nigeria. The top panel shows the state fixed effects results for both the North and the South, and the bottom panel shows the instrumental variable results estimated by two-stage least squares (2SLS). Both panels suggest that the local impact of the railway in the North is larger than the country-wide average impact. In the North, living within 20 km of a rail line increases schooling attainment by almost 2 years on average. This is associated with a 19% increase in the probability of being literate, a 2% increase in the probability of working in a professional wage job and a lower probability of being an agricultural worker. Furthermore, being connected to the rail line in the North is positively associated with media access, higher household wealth, and a 24% greater probability of residing in an urban area. The railway has virtually no impact on contemporary development outcomes in the South. There is a significant impact on household wealth and on the probability of watching TV when the model is estimated using the IV strategy, but this result is not robust. All coefficients are economically small and insignificantly different from zero in the South when estimated using state fixed effects.

Turning to urbanization outcomes, we find the same pattern. Although overall the railways have had a sustained economic impact in the country, their effects are only visible in the North, as shown in columns 2 and 3 of Table 3.A.8. Indeed, in the North, living within 20 km of a rail line increased the Z-score of city presence by 0.153 in 2010, an estimate significant at the 1% level. Similarly, living within 20 km of a rail line increased the Z-score of urban population by 0.128 in 2010. The equivalent estimates are not significantly different from zero in the South.

3.5.2 Differential Impact in the North and South: Robustness Checks

We carry out several robustness checks to confirm our findings on the differential impacts of the railroads in the North and in the South of Nigeria. First, we find that the placebo results obtained as an identification check for the country-wide analysis are robust to separate estimations for the North and the South. In the top panel of Table 3.B.5, we show that the strong and positive effect of the railway in the North completely disappears once actual lines are replaced by placebo lines. We also analyze the impact of the railway using placebo lines as the control group for the South and the North. For each region, we estimate the effect of being within 20 km of a railway line relative to being within 20 km of a placebo line. The results are presented in Table 3.B.6. The effect of the railway on each outcome is large and statistically significant in the North, but is close to zero in the South. Overall, these results indicate that the positive impact of the railroads in the North and their non-impact in the South are causal.

A possible reason for the non-impact of the railway in the Southern local areas might be that, because of the discovery of oil in the South, oil cities might have eclipsed railway cities. To test for this possibility, we restrict the sample to non-oil-producing areas of the South.22 Table 3.B.7 presents the results of this exercise. As was found with the full sample, there is no impact of connectedness to the railway on local economic development in the South when oil-producing areas are excluded.

22 We exclude localities in the oil-producing states, which are Abia, Akwa Ibom, Bayelsa, Cross River, Delta, Edo, Imo, Ondo, and Rivers.

There are no significant differences between Southern connected and unconnected areas in schooling attainment, literacy, occupational choices, media access, household wealth, and urbanization. This is true regardless of the estimation strategy, which provides evidence that our results in the South are not driven by oil cities. Next, we examine the dynamics of the railway impact both in the North and in the South of Nigeria.

3.6 Dynamics and Persistence

A non-existent long-term effect of the railway in the South of Nigeria could reflect either a short-run effect of the railway that dwindled over time - especially after the demise of railroads - or a non-existent short-run effect that remained stable over time. This raises the question of the stability of the impact of the railway and the comparison of its short- and long-run effects. To explore the path of the railway impacts, we need to examine economic outcomes for which information is available at different points in time. The individual and household outcomes we have examined so far do not meet this criterion, but they allow us to look at how the impact of the railway varies across different cohorts. We focus on individuals who have never changed their place of residence (non-migrants) to ensure that the estimates we find reflect conditions in the locality at time of birth. We look at individuals born during the peak of the railway (1948-1975), those born between 1975 and 1984, and the youngest individuals in our dataset, born between 1985 and 1993. These cohorts were chosen to ensure that observations are roughly of equal number within each category.

The results are shown in Table 3.A.10. We first observe that in both parts of the country, the younger cohort (1985-1993) is generally better educated, and has higher media access than older cohorts. This is consistent with the general upward trend in education in Nigeria, with the Southern education trend having a steeper slope (Csapo, 1981; Ajayi et al., 1996; Dev et al., 2016). We find no support for the hypothesis that the impact of the railway was stronger for the older cohort, be it in the North or in the South. If anything, in the North, the impact of being connected to the railway line on schooling appears larger for the youngest cohort.

For urbanization outcomes, we use panel data from Jedwab and Moradi (2015) on city presence and urban population density in 1900, 1960, and 2010 to analyze the impact of the railway on these outcomes. This allows us to more clearly separate short-term and long-term impacts. In a first exploration of the path of the railway effect, we look at the effect of the railway on urbanization outcomes in 2010 after controlling for the 1960 population density Z-score. The results are presented in the three rightmost columns of Table 3.A.8. Controlling for the 1960 city presence or urban population Z-score, when estimating the effects of the railway on urbanization, renders the effect either not significant or much smaller than before. We view this result as evidence that the effect of the railroad has hardly changed since independence in 1960.

We then explore shorter-run effects of the railway by looking at urbanization outcomes in 1960 and find that the effect of the railway is remarkably stable over time. As shown in Table 3.A.11, living within 20 km of a rail line increased the Z-score of city presence by 0.105 in 1960 (column 1). Similarly, living within 20 km of a rail line increased the Z-score of urban population by 0.175 in 1960 (column 1, bottom panel). Although overall the railways have had a sustained economic impact in the country, their effects are only visible in the North, which is consistent with what we have found for individual-level outcomes. Indeed, in the North, living within 20 km of a rail line increased the Z-score of city presence by 0.127 in 1960, as shown in the top panel of column 2, with these estimates being statistically significant at the 1% level. Similarly, living within 20 km of a rail line increased the Z-score of urban population by 0.124 in 1960 (column 2, bottom panel). It follows that the stability of the railway effect in the overall sample carries through to the North sub-sample.

The estimated impact in the North is robust to the exclusion of all nodes (second to last column). In the South, railways did not have any statistically significant effect on urbanization outcomes, either in the short run or in the long run.

The empirical evidence regarding the North of the country is consistent with a model in which, in the North, railroads encouraged the concentration of production factors in connected areas and induced a spatial equilibrium that persisted in the long run.23 This equilibrium persistence, even after railroads had become dysfunctional, can be rationalized by railroad locations serving as a coordination mechanism for factor investments in each subsequent period.

3.7 Mechanisms

The North and the South of Nigeria differed in many respects before the introduction of the railway. While we cannot rule out the role of all other possible differences between the two regions, key baseline differences as far as the impact of railroads is concerned include the availability of viable transportation alternatives connecting Southern areas to the ports of export at the coast and, related to it, the early existence of a more urban spatial equilibrium in the South involving ancient cities and major trading centers. We present evidence suggesting that railroads were dispensable in the South where viable pre-existing transportation technologies enabled trade with Europe, whereas railroads played a key role in connecting the North to the coast and the European market.

23 See Bleakley and Lin (2012) for more details on the theoretical underpinnings of such path dependence.

3.7.1 Adoption Rates and Benefit of Rail by Key Regional Crops

The Zungeru-Barijuko (Kaduna) to Baro would "traverse the greatest trade route in Nigeria, and render possible the export of cotton and other produce grown in the Nupe province and in Southern Zaria [Northern Nigeria]. Without it cotton cannot ... be profitably exported from those districts." - p. 58 of Colonial Report, Northern Nigeria, 1902, as quoted in (Onyewuenyi, 1981, p.66).

The increase in the export of this valuable product (hides) is most gratifying, and as communication with Northern Nigeria is facilitated, it is expected to divert the greater proportion of this trade, which at present is said to go across to Tripoli. - p. 22 of Colonial Report, Southern Nigeria, Lagos, 1905 as quoted in (Onyewuenyi, 1981, p.66).24

24 The emphasis highlighted in bold is ours.

The need for railroads for agricultural purposes was much greater in the North of Nigeria than in the South. The South had already been trading with Europe for centuries, through the slave trade and the palm oil trade which replaced it, while trade in most of Northern Nigeria was directed towards North Africa. The railway was crucial in diverting the trans-Saharan caravan trade to the Nigerian coast by lowering transportation costs. To see this, we compute the net benefit of shipping agricultural goods by railway relative to other means such as waterways and roads, which were the two other modes of transportation available in colonial Nigeria. The calculation is done over the period 1945-1949 for each of the key regional crops - groundnuts and cotton in the North and palm oil and cocoa in the South. The results are shown in Table 3.A.12. Data on prices, quantities, and distances come from World Bank (1955) and Onyewuenyi (1981). The cost of river shipments from the North is estimated as the cost of railing to Baro and then shipping by river to the Delta ports.25 We use the railing distance as the shipping distance for rivers in the South, although this might be an overestimate given the proximity of the South to several rivers which lead to the coast.26 While an overestimate of shipping distance through rivers, it helps to illustrate the fact that the railway could not compete with pre-existing means of transportation in the South even with implicit and explicit government subsidies.27

25 Hence, the rail prices in parentheses in Table 3.A.12.
26 Using average straight line distances from survey clusters in the DHS to various transportation nodes, we calculate that Southern populated areas are four times closer to rivers than Northern populated areas (23 km vs. 90 km). This discrepancy widens once we take into account the navigability of rivers, which is much better in the South.
27 See discussion in Onyewuenyi (1981, p. 89-93).
28 These figures underestimate the cost reduction brought about by railways in the North, as we do not take into account the navigability of the rivers in the North, which was poor, with navigation impossible for several months of the year.

We estimate the cost reduction from shipping groundnuts and cotton by railway rather than by river to be 1.4% and 51%, respectively.28 The equivalent cost reduction from railing these goods instead of shipping them by road was 65% and 75%, respectively. These estimates are similar to the estimated reduction in Hodder (1959), who finds that the railway reduced the cost of shipping from the Jos mines by about 70% relative to road transportation. By contrast, railing palm oil and cocoa instead of shipping them by river would have increased their cost by 119% and 16%, respectively. Similarly, railing these crops versus shipping them by road would have increased their cost by 58% and 60%, respectively.
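The percentage comparisons above can be read as the change in shipping cost from using rail, expressed relative to the cost of the alternative mode. The following is a minimal sketch of that arithmetic only. The function name `pct_cost_change` and the per-ton costs are hypothetical and are not taken from Table 3.A.12; the choice of the alternative mode's cost as the base is also an assumption.

```python
def pct_cost_change(rail_cost: float, alt_cost: float) -> float:
    """Percentage change in shipping cost from using rail instead of an
    alternative mode (river or road), relative to the alternative's cost.

    Negative values mean rail is cheaper (a cost reduction);
    positive values mean rail is more expensive (a cost increase).
    ASSUMPTION: the base of the percentage is the alternative mode's cost.
    """
    if alt_cost <= 0:
        raise ValueError("alt_cost must be positive")
    return 100.0 * (rail_cost - alt_cost) / alt_cost


# HYPOTHETICAL per-ton costs, chosen only to illustrate the sign convention.
examples = {
    "rail cheaper than alternative": (30.0, 100.0),
    "rail dearer than alternative": (219.0, 100.0),
}

for label, (rail, alt) in examples.items():
    print(f"{label}: {pct_cost_change(rail, alt):+.1f}%")
```

Under this convention, the Northern crops would show negative values (rail lowers cost) and the Southern crops positive values (rail raises cost).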

With regard to shipping goods to the coast, railroads were cheaper than alternative transportation modes in the North, whereas the latter were cheaper in the South. This rationalizes the high adoption rate of railroads in the North and their low adoption rate in the South. Figure 3.A.5 presents percentages of Northern and Southern goods shipped through the railway over the 1931-1949 period. On average, 96% of groundnuts and 81% of cotton were railed to the coast from the North, compared to only 18% of palm kernel, 31% of palm oil, and 26% of cocoa, the three main Southern crops. We also observe that the fraction of goods shipped to the coast through rail increased over time for Northern goods, while it declined for Southern goods as subsidies for the railways declined (except palm oil).30

Consistent with this, the volume of groundnut exports (the main export in the North) grew at an annual rate of 13.8%, from 475 metric tons in 1900-1904 to 268,409 metric tons in 1945-1949. Over the 1900-1949 period, during which the railway expanded, exports of palm produce (the main export in the South) grew at an average annual rate of 1.9% only. A non-trivial volume of palm produce was already being exported in the early twentieth century, which illustrates initial access to export markets.
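The groundnut growth figure can be checked with a standard compound annual growth rate. This is a sketch under one assumption not stated in the text: that the growth is compounded over 49 years (1900 to 1949). With that span, the two export volumes quoted above give roughly the reported 13.8%; a shorter span, such as midpoint to midpoint of the two five-year windows, would give a somewhat higher rate.

```python
def annual_growth_rate(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two levels over `years` years."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end and years must all be positive")
    return (end / start) ** (1.0 / years) - 1.0


START_TONS = 475.0      # groundnut exports, 1900-1904 (from the text)
END_TONS = 268_409.0    # groundnut exports, 1945-1949 (from the text)
YEARS = 49              # ASSUMPTION: compounding span taken as 1900 to 1949

rate = annual_growth_rate(START_TONS, END_TONS, YEARS)
print(f"Implied annual growth rate: {rate:.1%}")
```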

The evidence presented here illustrates two important facts. First, producers in the South had viable transportation alternatives to the railway, and the railway did not substantially lower transportation costs in the region (it actually increased transportation costs, on average). Second, and as a result of the first fact above, adoption rates of the railway were substantially lower in the South. In the North, however, the railway substantially lowered transportation costs and increased market access, and producers adopted it as the primary means of exporting commodities.

29 Figure 3.A.5 plots the ratio of quantities of goods railed to the quantities of goods exported. The outlier above 100% is due to crop spoilage.
30 The cocoa example is striking since the percentage shipped through rail shrank from a third of the production to virtually zero over the period 1931-1950.

3.7.2 Key Factor of Heterogeneity: Distance to Ports of Export

The highlighted heterogeneity of the impact of the railway line, presented so far as a North/South dichotomy, can be accounted for by distance to ports of export. Southern areas happened to be closer to ports of export and, on these short distances, waterways were a viable means of shipping goods to the coast. The railway did little to decrease these shipping costs and actually increased them in most areas closer to ports of export. Areas further away from the coast experienced a big drop in their shipping costs to the coast at the advent of the railway. This effectively allowed the export of goods from areas further away from ports of export to flourish.

To confirm that distance to ports is a key factor of heterogeneity, we measure the impact of the railway by distance to ports of export. For each individual, we compute the distance of her survey cluster to the closest port. We then split the sample into two subsamples: individuals with above-median distance to ports and individuals with below-median distance to ports. We estimate the impact of railroads on the outcome variables for each subsample, separately. The results are presented in Table 3.A.13.
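The sample split described above can be sketched in two steps: compute each cluster's distance to its closest port, then divide the sample at the median. The sketch below uses great-circle (haversine) distance, which is an assumption, since the text does not state how distances were computed. All coordinates are hypothetical placeholders, not the actual port or DHS cluster locations, and the assignment of ties to the below-median group is also an arbitrary choice.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_closest_port(cluster, ports):
    """Smallest distance from one cluster (lat, lon) to any port (lat, lon)."""
    return min(haversine_km(cluster[0], cluster[1], p[0], p[1]) for p in ports)


def split_by_median(distances):
    """Return (cutoff, below_idx, above_idx); ties go to the below-median group."""
    cutoff = median(distances)
    below = [i for i, d in enumerate(distances) if d <= cutoff]
    above = [i for i, d in enumerate(distances) if d > cutoff]
    return cutoff, below, above


# HYPOTHETICAL coordinates for illustration only.
ports = [(6.45, 3.40), (4.78, 7.01)]
clusters = [(6.5, 3.4), (12.0, 8.5), (9.0, 7.5), (5.0, 7.0)]

distances = [distance_to_closest_port(c, ports) for c in clusters]
cutoff, below, above = split_by_median(distances)
print(f"median distance: {cutoff:.0f} km; below: {below}; above: {above}")
```

The regressions would then be estimated separately on the two index sets, with the same controls and fixed effects as in the main specification.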

The effect of the railway is generally larger in local areas that were further away from ports of export. The effect of the railway for individuals in local areas that had higher pre-railway European market access is economically small and mostly not statistically different from zero. Importantly, state fixed effects and ethnicity fixed effects are included in the regressions presented in Table 3.A.13. Hence, these findings are not driven by specific characteristics of a given state or by ancestral exposure to railways.

31 We also divided the sample by mean distance to ports, and into various groups, within 100 km, 100-300 km, and other distances that included a balanced mix of districts in the South and the North. We consistently find that the impacts of the railway begin to emerge beyond 300 km from the coast. These results are omitted for brevity but are available upon request.

In the bottom panel of Table 3.A.13, we interact our main independent variable with the log of distance to coastal ports. The local effects of railways are unequivocally stronger, the farther individuals are from ports. These findings provide a consistent explanation of the non-impact of the railway in the South, based on its geographical proximity to ports, existing waterways connecting it to the different ports and a low change in its market access.
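A generic form of such an interaction specification is written out below purely as an illustrative sketch. The notation ($y_i$, $\mathrm{Rail}_i$, $\mathrm{Dist}_i$, $X_i$, $\mu_s$, $\lambda_e$) is assumed for this example, and whether the log-distance term also enters on its own is likewise an assumption rather than something stated in the text.

```latex
\begin{aligned}
y_i &= \alpha + \beta\,\mathrm{Rail}_i
      + \gamma\,\bigl(\mathrm{Rail}_i \times \ln \mathrm{Dist}_i\bigr)
      + \delta\,\ln \mathrm{Dist}_i
      + X_i'\theta + \mu_s + \lambda_e + \varepsilon_i, \\[4pt]
\mathbb{E}[\,y_i \mid \mathrm{Rail}_i = 1\,] - \mathbb{E}[\,y_i \mid \mathrm{Rail}_i = 0\,]
    &= \beta + \gamma \ln \mathrm{Dist}_i .
\end{aligned}
```

Here $\mathrm{Rail}_i$ indicates residence within 20 km of a rail line, $\mathrm{Dist}_i$ is the distance to the closest coastal port, and $\mu_s$ and $\lambda_e$ stand for state and ethnicity fixed effects. In this notation, an effect that strengthens with distance from ports corresponds to $\gamma > 0$ for outcomes that the railway raises.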

We also show that these results cannot be explained away by proximity to early cities. If the heterogeneity of the effect of the railway that we uncover is driven by the fact that the South had more early cities, one would expect that once we interact proximity to the railway with proximity to early cities, the effect on areas close to both the railway and early cities would be null. Table 3.A.14 shows that this is not the case. For most development outcomes, the effect on areas close to both the railway and early cities is significantly positive. This suggests that the heterogeneity we highlight is not driven by the stronger presence of early cities in Southern parts of Nigeria, or by the fact that Southern Nigeria was more urbanized. This finding, along with the other pieces of evidence shown above, suggests that opportunity costs are important to the transformative power of railroads. Unlike in the South, railroads were vital in the North of Nigeria to enable export trade with Europe and unlike in the South, they had a tremendous effect in the region.

3.8 Concluding Remarks

In this article, we show that colonial railroads did not have a homogeneous impact in all the areas they connected. We find that railroads had very little economic impact in the South of Nigeria which, thanks to its proximity to the coast and to its waterways, already had viable alternatives for the transportation of goods to exporting ports. The North, however, was lacking viable transportation technologies to export goods. The railways were essential in linking these Northern areas to ports of export. Connected areas in the North were transformed by the railway, not only in the short run but also over time and until today.

We find that, in the North, individuals in areas that were connected to the railways are more likely to go to school, to be literate, to have media access, to work in higher-paying professions and to live in wealthier households than individuals in unconnected areas. Connected areas are also more urbanized than unconnected areas. We do not find any of these effects in the South, either in the short run or in the long run.

Exploring potential mechanisms, we find that areas that had more access to ports of export thanks to their proximity to the coast and use of waterways were not transformed by railroads. The converse is true for areas that were effectively connected to exporting ports thanks to the railway.

Our findings also indicate a strong path dependence in the positive effect of railways in the North of Nigeria and in their non-effect in the South. They are consistent with the theoretical argument that, in the North, railroads led to the concentration of production factors in connected areas and these initially advantaged areas helped to coordinate factor investments in each of the subsequent periods. This implied a persistence of the initial spatial equilibrium induced by the railroads, even after they became obsolete. In the South, the railway did not even change the existing spatial equilibrium in the short run. It is perhaps not surprising that the non-impact in this region also persisted over time.

Appendix

3.A Figures and Tables

Figure 3.A.1: Rail Lines Across Clusters and Local Areas

[Map not reproduced. Legend: rail line; cluster (845); state (area) boundary (37); local area (550).]

Figure 3.A.2: Straight Lines Between Nodes

[Map not reproduced. Legend: major network nodes (cities/locations); rail lines with project dates (1898-1912, 1916-1930, 1940-1964); straight line connecting nodes.]

Figure 3.A.3: Rail Lines, Ports and Placebo Lines

[Map not reproduced. Legend: rail line; placebo line; coastal ports; city in 1900.]

Figure 3.A.4: Rail Lines, Roads and Placebo Lines

[Map not reproduced. Legend: rail line; placebo line; coastal ports; city in 1900; major roads.]

Figure 3.A.5: Railway Adoption for Northern and Southern Main Exports

[Line chart not reproduced. Vertical axis ticks run from 0.00 to 140.00; horizontal axis shows the years 1931-1937 and 1946-1949; series: groundnuts, cotton, palm kernel, palm oil, cocoa.]

Note: Figure 3.A.5 plots the ratio of quantities of goods railed to the quantities of goods exported. The outlier above 100% is due to crop spoilage. During the 1931-1949 period, groundnuts and cotton were the main Northern exports and palm kernel, palm oil and cocoa were the main Southern exports.

Table 3.A.1: History of Railway Construction in Nigeria

Link                          Date  Length (km)  Motivation
Lagos - Otta                  1898  32           Administrative & Agricultural
Otta - Abeokuta - Ibadan      1901  165          Administrative & Agricultural
Ibadan - Ilorin               1908  201          Administrative & Agricultural
Ilorin - Jebba                1909  96           Administrative & Agricultural
Zaria - Jos - Bukuru          1911  227          Mineral
Jebba - Zungeru - Minna       1912  233          Administrative & Agricultural
Baro - Kano                   1912  573          Administrative & Agricultural
Port Harcourt - Enugu         1916  243          Agricultural & Mineral
Enugu - Makurdi - Jos         1927  596          Agricultural & Mineral
Kaduna - Kafanchan            1927  201          Administrative & Mineral
Zaria - Gusau - Kaura Namoda  1929  232          Agricultural
Kano - Nguru                  1930  229          Agricultural
Ifo - Ilaro - Idogo           1930  39           Agricultural
Jos - Maiduguri               1964  645          Agricultural

Note: The motivations for the railway construction are classified in three categories: administrative (political or military), agricultural exploitation, and mineral exploitation. Source: Onyewuenyi 1981, p. 39.

Table 3.A.2: Observables in Areas Within and Outside 20 km of Railway Tracks

                                 Within 20 km of Rail            Outside 20 km of Rail
                                 Mean    S.D.    North   South   Mean    S.D.    North   South   Difference
Local Area Variables
Annual Rainfall (mm)             1471    588     912     1788    1503    713     937     2017    -32
Temperature (celsius)            26.36   0.62    26.03   26.54   26.38   0.84    26.40   26.35   -0.02
Soil Nutrients                   1.53    0.69    1.61    1.48    1.38    0.64    1.45    1.31    0.15
Soil Workability                 1.99    0.86    1.49    2.28    1.75    0.77    1.44    2.04    .24**
Elevation (Meters)               217     215     451     84      251     192     379     135     -34
Oil Palm Suitability (kg/ha)     2.86    2.05    0.35    4.29    2.44    2.08    0.5     4.20    0.42
Cocoa Suitability (kg/ha)        0.67    0.47    0.09    1       0.59    0.49    0.14    1       0.08
Cotton Suitability (kg/ha)       0.35    0.48    0.97    0.00    0.43    0.49    0.85    0.04    -0.08
Groundnut Suitability (kg/ha)    1.82    0.39    2       1.72    1.8     0.4     2       1.62    0.02
Mission Station                  0.25    0.43    0.11    0.32    0.21    0.4     0.10    0.30    0.04
Primary Road in Area             0.37    0.48    0.73    0.17    0.44    0.5     0.51    0.38    -0.06
Major River in Area              0.37    0.48    0.27    0.43    0.33    0.47    0.26    0.39    0.05
City in 1890                     0.07    0.26    0.08    0.07    0.07    0.26    0.02    0.12    -0.00
1900 City Population (Z-score)   .21     1.83                    -.03    .74                     .23
Local Areas with Cities in 1900  12              4       8       27              6       21
Household Variables
Male Head                        0.84    0.36    0.91    0.80    0.84    0.36    0.93    0.77    0
Age of Head                      44.2    13.44   44.2    44.2    44.14   14.26   42.4    45.7    0.05
Household Size                   5.83    3.8     7.35    4.96    5.91    3.5     6.68    5.22    -0.09
Individual Variables
Age                              29.6    10.2    29.5    29.7    29.7    10.4    29.8    29.7    -0.07
Christian                        0.52    0.5     0.13    0.75    0.54    0.5     0.19    0.86    -0.02
Observations                     10386           4617    5769    28980           15312   13668   39366
Number of Clusters               208                             637                             845
Number of Local Areas            134                             436                             550

Note: *** p < .01, ** p < .05. Summary statistics of local government areas for clusters within and outside 20 km of a rail track. Statistical significance of differences in means between connected and unconnected areas is obtained from standard errors clustered at the local area level. For categorical variables, 1 = Yes and 0 = No, so the mean represents the proportion in each area.
Table 3.A.3: Effect of Proximity to Railway on Contemporary Outcomes (State Fixed Effects)

                      (1)        (2)        (3)           (4)           (5)         (6)              (7)        (8)           (9)
                      Schooling  Literacy   Professional  Agricultural  Read Paper  Listen to Radio  Watch TV   Wealth Index  Urban
Rail Within 20 km     1.371***   0.136***   0.018***      -0.071***     0.082***    0.068***         0.165***   0.649***      0.185***
                      [0.259]    [0.023]    [0.006]       [0.016]       [0.016]     [0.014]          [0.028]    [0.101]       [0.050]
Rainfall              -0.000     -0.000     -0.000        -0.000        -0.000      -0.000           -0.000     -0.000        0.000
                      [0.000]    [0.000]    [0.000]       [0.000]       [0.000]     [0.000]          [0.000]    [0.000]       [0.000]
Temperature           -0.673**   -0.028     -0.005        0.051**       -0.035*     -0.005           -0.032     -0.227*       -0.000
                      [0.298]    [0.027]    [0.008]       [0.022]       [0.020]     [0.024]          [0.033]    [0.130]       [0.065]
Nutrient Retention    -0.278*    -0.022*    -0.002        0.013         -0.011      -0.003           -0.019     -0.115*       -0.064**
                      [0.148]    [0.012]    [0.004]       [0.012]       [0.011]     [0.010]          [0.015]    [0.063]       [0.031]
Workability           0.298**    0.027**    0.003         -0.008        0.025**     0.010            0.021      0.041         0.034
                      [0.135]    [0.012]    [0.003]       [0.010]       [0.010]     [0.009]          [0.014]    [0.056]       [0.028]
Altitude              -0.000     0.000      0.000         0.000         0.000       0.000**          0.000      0.000         0.000
                      [0.002]    [0.000]    [0.000]       [0.000]       [0.000]     [0.000]          [0.000]    [0.001]       [0.000]
Oil Palm Suit.        0.325      0.030*     0.006         -0.040*       0.031**     0.028**          0.041**    0.130*        -0.016
                      [0.225]    [0.017]    [0.006]       [0.021]       [0.015]     [0.011]          [0.017]    [0.074]       [0.036]
Cocoa Suit.           -1.061     -0.090     -0.022        0.126*        -0.092      -0.060           -0.142**   -0.467*       -0.113
                      [0.798]    [0.063]    [0.024]       [0.066]       [0.061]     [0.043]          [0.061]    [0.250]       [0.149]
Cotton Suit.          -0.814     -0.108**   -0.011        0.090**       -0.013      -0.039           -0.042     -0.288*       -0.116
                      [0.526]    [0.046]    [0.017]       [0.046]       [0.036]     [0.032]          [0.053]    [0.171]       [0.121]
Groundnut Suit.       -0.392     -0.031     -0.001        0.062*        -0.055      -0.023           -0.078**   -0.260*       -0.226**
                      [0.299]    [0.027]    [0.010]       [0.034]       [0.034]     [0.024]          [0.032]    [0.157]       [0.088]
Mission               1.079***   0.075***   0.023***      -0.098***     0.085***    0.038***         0.116***   0.472***      0.217***
                      [0.189]    [0.015]    [0.006]       [0.017]       [0.016]     [0.011]          [0.020]    [0.082]       [0.044]
Observations          39028      38706      38882         38882         38859       38962            38959      39063         39063
Adjusted R-sq.        0.47       0.39       0.05          0.20          0.20        0.15             0.34       0.49          0.29
Control Means         6.014      0.547      0.045         0.240         0.266       0.737            0.505      2.868         0.276

Note: *p < .1, ** p < 0.05, ***p < 0.01. Standard errors clustered at the local government area level in brackets. Table shows estimates of the impact of being within 20 km of a railway line on various individual outcomes. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distances to rail network are computed using DHS data and GIS information on the rail network. Climatic and geographic controls are measured as the average within the local government area (county). Data on Christian mission stations comes from historical maps, as described in the text. All other variables are taken from the 2008 Nigeria DHS.

Table 3.A.4: First-Stage Estimates based on Distance to Line Joining Major Nodes

                          (1)
                          Rail Within 20 km
ln(Dist. Straight Lines)  -0.294***
                          [0.021]
Rainfall                  0.000
                          [0.000]
Temperature               -0.043
                          [0.039]
Nutrient Retention        0.000
                          [0.023]
Workability               0.018
                          [0.019]
Altitude                  -0.000*
                          [0.000]
Oil Palm Suit.            0.002
                          [0.021]
Cocoa Suit.               0.103
                          [0.111]
Cotton Suit.              0.114
                          [0.079]
Groundnut Suit.           0.014
                          [0.028]
Mission                   0.019
                          [0.026]
Observations              37353

Note: *p < .1, ** p < 0.05, ***p < 0.01. Standard errors clustered at the local government area level in brackets. The instrument is the log of distance of DHS clusters to straight lines joining railway nodes of historical importance. All nodes of the railway line, regardless of historical importance, are dropped from regression samples. The first stage regression includes ethnicity and state of residence fixed effects as well as all baseline controls. Distances to rail network are computed using DHS data and GIS information on the rail network. Climatic and geographic controls are measured as the average within the local government area. Data on Christian mission stations comes from maps described in the text. All other variables are taken from the 2008 Nigeria DHS.

Table 3.A.5: Effect of Proximity to Railway on Contemporary Outcomes (2SLS)

                      (1)        (2)        (3)           (4)           (5)         (6)              (7)        (8)           (9)
                      Schooling  Literacy   Professional  Agricultural  Read Paper  Listen to Radio  Watch TV   Wealth Index  Urban
Rail Within 20 km     1.508***   0.142***   0.014*        -0.040*       0.088***    0.072***         0.181***   0.642***      0.115
                      [0.393]    [0.034]    [0.008]       [0.024]       [0.026]     [0.023]          [0.042]    [0.159]       [0.078]
Baseline Controls     Yes        Yes        Yes           Yes           Yes         Yes              Yes        Yes           Yes
State FE              Yes        Yes        Yes           Yes           Yes         Yes              Yes        Yes           Yes
Ethnicity FE          Yes        Yes        Yes           Yes           Yes         Yes              Yes        Yes           Yes
Observations          37319      37018      37180         37180         37157       37259            37250      37353         37353
Centered R-sq.        0.47       0.39       0.06          0.20          0.20        0.15             0.34       0.49          0.29
Control Means         6.014      0.547      0.045         0.240         0.266       0.737            0.505      2.868         0.276

Note: *p < .1, ** p < 0.05, ***p < 0.01. Standard errors clustered at the local government area level in brackets. The instrument is the log of distance of DHS clusters to straight lines joining railway nodes of historical importance. All nodes of the railway line, regardless of historical importance, are dropped from regression samples. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distances to rail network are computed using DHS data and GIS information on the rail network. Climatic and geographic controls are measured as the average within the local government area. Data on Christian mission stations comes from maps described in the text. All other variables are taken from the 2008 Nigeria DHS.

Table 3.A.6: Falsification Exercises: Other Transportation Means and Placebo Lines

Panel A: Other Modes of Transportation

                      (1)        (2)        (3)           (4)           (5)         (6)              (7)        (8)           (9)
                      Schooling  Literacy   Professional  Agricultural  Read Paper  Listen to Radio  Watch TV   Wealth Index  Urban
Rail Within 20 km     0.863***   0.093***   0.011*        -0.034**      0.046***    0.040***         0.106***   0.391***      0.093**
                      [0.222]    [0.020]    [0.006]       [0.015]       [0.014]     [0.013]          [0.024]    [0.084]       [0.047]
ln(Dist. River)       -0.052     -0.006     -0.002        0.005         -0.007      -0.004           -0.019**   -0.036        -0.002
                      [0.070]    [0.006]    [0.002]       [0.006]       [0.006]     [0.005]          [0.007]    [0.027]       [0.019]
ln(Dist. Road)        -0.745***  -0.062***  -0.010***     0.055***      -0.051***   -0.041***        -0.083***  -0.376***     -0.136***
                      [0.076]    [0.007]    [0.002]       [0.006]       [0.006]     [0.005]          [0.009]    [0.030]       [0.017]
Observations          39028      38706      38882         38882         38859       38962            38959      39063         39063

Panel B: Placebo Lines

                      (1)        (2)        (3)           (4)           (5)         (6)              (7)        (8)           (9)
                      Schooling  Literacy   Professional  Agricultural  Read Paper  Listen to Radio  Watch TV   Wealth Index  Urban
Placebo Within 20 km  0.269      0.020      0.009         -0.029*       0.022       0.035***         0.039*     0.240***      0.061
                      [0.179]    [0.015]    [0.006]       [0.015]       [0.015]     [0.013]          [0.021]    [0.085]       [0.041]
Observations          28663      28454      28568         28568         28543       28606            28598      28687         28687

Panel C: Placebo Lines and Other Modes of Transportation

                      (1)        (2)        (3)           (4)           (5)         (6)              (7)        (8)           (9)
                      Schooling  Literacy   Professional  Agricultural  Read Paper  Listen to Radio  Watch TV   Wealth Index  Urban
Placebo Within 20 km  0.110      0.008      0.006         -0.015        0.010       0.025**          0.021      0.151**       0.030
                      [0.173]    [0.014]    [0.005]       [0.015]       [0.014]     [0.012]          [0.019]    [0.075]       [0.039]
ln(Dist. River)       0.001      -0.003     -0.000        -0.001        -0.002      -0.001           -0.019**   -0.031        0.012
                      [0.075]    [0.006]    [0.002]       [0.007]       [0.007]     [0.005]          [0.008]    [0.030]       [0.021]
ln(Dist. Road)        -0.580***  -0.045***  -0.009***     0.050***      -0.044***   -0.035***        -0.064***  -0.321***     -0.115***
                      [0.076]    [0.006]    [0.002]       [0.007]       [0.007]     [0.005]          [0.009]    [0.033]       [0.019]
Observations          28663      28454      28568         28568         28543       28606            28598      28687         28687

Note: *p < .1, ** p < 0.05, ***p < 0.01. Standard errors clustered at the local government area level in brackets. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distances to rail network, rivers and roads are computed using DHS data and GIS information on rail, river and road networks. Distances to placebo lines are computed using GIS information on locations of placebo lines (Jaekel, 1997). Climatic and geographic controls are measured as the average within the local government area. Data on Christian mission stations comes from maps described in the text. All other variables are taken from the 2008 Nigeria DHS.

Table 3.A.7: Robustness Checks: Various Control Groups

Panel A: Placebo Lines as Control Group

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  1.036***  0.135***  0.007  -0.045*  0.065***  0.045**  0.138***  0.510***  0.142**
  [0.329]  [0.029]  [0.011]  [0.024]  [0.022]  [0.018]  [0.041]  [0.143]  [0.070]
Observations  19644  19462  19561  19561  19547  19611  19620  19659  19659

Panel B: Across Multiple Distances

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  1.231***  0.125***  0.022***  -0.062***  0.091***  0.076***  0.179***  0.687***  0.187***
  [0.314]  [0.028]  [0.008]  [0.021]  [0.021]  [0.020]  [0.034]  [0.128]  [0.062]
Rail Within 20-40 km  -0.362  -0.029  0.003  0.018  0.010  0.003  0.003  0.008  -0.068
  [0.245]  [0.021]  [0.007]  [0.021]  [0.019]  [0.019]  [0.027]  [0.107]  [0.053]
Rail Within 40-60 km  0.008  -0.001  0.006  -0.009  0.009  0.014  0.049*  0.120  0.064
  [0.232]  [0.019]  [0.007]  [0.022]  [0.019]  [0.017]  [0.029]  [0.108]  [0.057]
Rail Within 60-80 km  0.020  0.007  0.009  0.027  0.016  0.018  -0.005  0.008  0.069
  [0.218]  [0.016]  [0.009]  [0.019]  [0.017]  [0.014]  [0.021]  [0.092]  [0.051]
Observations  39028  38706  38882  38882  38859  38962  38959  39063  39063

Note: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors clustered at the local government area level in brackets. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distances to rail network and to placebo lines are computed using DHS data and GIS information on rail network and locations of placebo lines (Jaekel, 1997). Climatic and geographic controls are measured as the average within the local government area. Data on Christian mission stations comes from maps described in the text. All other variables are taken from the 2008 Nigeria DHS.

Table 3.A.8: Long-Run Effects of Railway on Urbanization Outcomes

Panel A: Dependent Variable: Z-score of City Presence in 2010

(1) All  (2) North  (3) South  (4) All  (5) North  (6) South
Rail Within 20 km  0.108**  0.153***  -0.056  0.060  0.094**  -0.061
  [0.051]  [0.053]  [0.139]  [0.043]  [0.045]  [0.109]
1900 City Z-score  0.181***  0.201***  0.171***
  [0.013]  [0.016]  [0.019]
1960 City Z-score  0.454***  0.471***  0.430***
  [0.009]  [0.009]  [0.015]
Observations  7510  5985  1525  7510  5985  1525

Panel B: Dependent Variable: Z-score of Urban Population in 2010

(1) All  (2) North  (3) South  (4) All  (5) North  (6) South
Rail Within 20 km  0.180***  0.128**  0.136  0.016  0.013  0.001
  [0.067]  [0.054]  [0.189]  [0.020]  [0.013]  [0.086]
1900 Pop. Z-score  0.404***  1.304***  0.244***
  [0.121]  [0.400]  [0.052]
1960 Pop. Z-score  0.907***  0.945***  0.832***
  [0.055]  [0.070]  [0.117]
Observations  7510  5985  1525  7510  5985  1525

Note: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors clustered at the local government area level in brackets. Table estimates the impact of proximity to the railway line on urbanization in 2010 (measured as city presence and urban population) within 10 km x 10 km local grid cells. The Z-score is the standardized score of the variable of interest, computed as the difference from the mean divided by the standard deviation. We control for the presence of mission stations within the grid cell and update the measure of rail connectedness to reflect the line completed after 1960. All regressions include state of residence fixed effects as well as baseline controls. Climatic and geographic controls are measured as the average within the grid cell.
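As an illustration of the standardization described in the note, the following minimal sketch computes Z-scores for a list of grid-cell values. The function name `z_scores`, the sample values, and the use of the population standard deviation are assumptions made for this example; the note does not say which standard deviation formula was used.

```python
import statistics


def z_scores(values):
    """Return (x - mean) / standard deviation for each x in values."""
    mean = statistics.fmean(values)
    # Assumption: population standard deviation (divides by n, not n - 1).
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("all values are identical; the Z-score is undefined")
    return [(v - mean) / sd for v in values]


# Hypothetical urban populations for five grid cells (not data from the thesis).
urban_population = [0.0, 0.0, 12000.0, 0.0, 48000.0]
scores = z_scores(urban_population)
```

By construction the standardized values have mean zero and unit standard deviation, so coefficients in the table can be read in standard-deviation units of the outcome.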

Table 3.A.9: Effect of Proximity to Railway in North and South of Nigeria

Panel A: North (Estimation Strategy : State Fixed Effects)

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  1.928***  0.192***  0.023***  -0.084***  0.107***  0.106***  0.224***  0.840***  0.236***
  [0.349]  [0.033]  [0.006]  [0.019]  [0.020]  [0.020]  [0.037]  [0.126]  [0.059]
Observations  19683  19514  19617  19617  19572  19652  19641  19713  19713

Panel B: South (Estimation Strategy : State Fixed Effects)

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  0.230  0.033  0.005  -0.029  0.017  0.000  0.049  0.233  0.068
  [0.338]  [0.023]  [0.013]  [0.030]  [0.027]  [0.016]  [0.041]  [0.158]  [0.084]
Observations  19344  19191  19264  19264  19286  19309  19317  19349  19349

Panel C: North (Estimation Strategy: IV)

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  2.267***  0.210***  0.013  -0.050*  0.111***  0.116***  0.245***  0.904***  0.159*
  [0.561]  [0.050]  [0.009]  [0.029]  [0.034]  [0.032]  [0.056]  [0.207]  [0.094]
Observations  18741  18582  18682  18682  18637  18717  18699  18771  18771

Panel D: South (Estimation Strategy : IV)

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  0.591  0.062*  0.012  -0.044  0.061  0.021  0.114*  0.370*  0.111
  [0.459]  [0.034]  [0.017]  [0.044]  [0.041]  [0.028]  [0.059]  [0.221]  [0.121]
Observations  18578  18436  18498  18498  18520  18542  18551  18582  18582

Note: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors clustered at the local government area level in brackets. The instrument is the log of distance of DHS clusters to straight lines joining railway nodes of historical importance. All nodes of the railway line, regardless of historical importance, are dropped from regression samples. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distances to rail network are computed using DHS data and GIS information on the rail network. Climatic and geographic controls are measured as the average within the local government area. Data on Christian mission stations comes from maps described in the text. All other variables are taken from the 2008 Nigeria DHS.

Table 3.A.10: Differential Impact by Cohort (Non-Migrants Only)

Panel A: North (Estimation Strategy: State Fixed Effects)

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  1.267***  0.124***  0.016  -0.111***  0.077***  0.091***  0.131***  0.629***  0.202***
  [0.388]  [0.035]  [0.014]  [0.037]  [0.026]  [0.023]  [0.038]  [0.123]  [0.061]
X 1975-1984  0.572  0.042  0.015  0.013  0.015  -0.050*  0.056**  0.038  0.014
  [0.374]  [0.033]  [0.015]  [0.031]  [0.027]  [0.027]  [0.027]  [0.053]  [0.024]
X 1985-1993  0.936***  0.117***  -0.011  0.061**  0.045  0.029  0.087***  0.143*  0.037
  [0.358]  [0.034]  [0.016]  [0.030]  [0.032]  [0.029]  [0.030]  [0.083]  [0.029]
Born 1975-1984  0.573**  0.023  -0.002  0.039*  0.075***  0.021  0.052**  0.145***  0.030
  [0.223]  [0.025]  [0.008]  [0.021]  [0.021]  [0.020]  [0.021]  [0.052]  [0.019]
Born 1985-1993  1.714***  0.107**  0.009  0.018  0.154***  0.042  0.137***  0.243**  0.060
  [0.389]  [0.044]  [0.017]  [0.034]  [0.040]  [0.035]  [0.034]  [0.100]  [0.040]
Observations  9391  9308  9352  9352  9347  9373  9370  9406  9406

Panel B: South (Estimation Strategy : State Fixed Effects)

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  -0.263  0.026  -0.021  -0.004  0.004  -0.008  0.000  0.040  0.019
  [0.492]  [0.043]  [0.021]  [0.054]  [0.038]  [0.029]  [0.057]  [0.167]  [0.098]
X 1975-1984  0.621*  0.012  0.024  0.000  0.023  0.020  0.038  0.192***  0.064**
  [0.342]  [0.038]  [0.017]  [0.037]  [0.037]  [0.024]  [0.038]  [0.069]  [0.030]
X 1985-1993  0.325  -0.035  0.019  0.029  0.044  0.003  0.017  0.158***  0.048*
  [0.362]  [0.041]  [0.017]  [0.040]  [0.035]  [0.027]  [0.038]  [0.056]  [0.026]
Born 1975-1984  2.010***  0.097***  0.011  -0.037  0.180***  0.061***  0.072***  0.099*  0.013
  [0.254]  [0.023]  [0.014]  [0.024]  [0.028]  [0.022]  [0.025]  [0.052]  [0.021]
Born 1985-1993  3.063***  0.156***  -0.002  -0.044  0.237***  0.116***  0.099***  0.174**  0.044
  [0.327]  [0.033]  [0.020]  [0.032]  [0.048]  [0.035]  [0.038]  [0.074]  [0.034]
Observations  7375  7328  7354  7354  7357  7363  7366  7377  7377

Note: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors clustered at the local government area level in brackets. The table estimates, by cohort, the impact of distance to the railway on Northern (West, East, Central) and Southern (West, East, South) Nigeria. The omitted cohort is made of older individuals born between 1948-1974. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distances to rail and road networks are computed using DHS data and GIS information on rail and road networks. Climatic and geographic controls are measured as the average within the local government area. Data on Christian mission stations come from maps described in the text. All other variables are taken from the 2008 Nigeria DHS.

Table 3.A.11: Short-Run Effects of the Railway on Urbanization Outcomes

Panel A: Dependent Variable: Z-score of City Presence in 1960

Sample: All (columns 1-3), Nodes Excluded (columns 4-6)
(1) All  (2) North  (3) South  (4) All  (5) North  (6) South
Rail Within 20 km  0.105**  0.127***  0.010  0.066  0.085**  -0.015
  [0.048]  [0.043]  [0.159]  [0.046]  [0.040]  [0.155]
1900 City Z-score  0.400***  0.412***  0.389***  0.397***  0.407***  0.385***
  [0.025]  [0.044]  [0.031]  [0.026]  [0.052]  [0.032]
Observations  7510  5985  1525  7487  5971  1516

Panel B: Dependent Variable: Z-score of Urban Population in 1960

Sample: All (columns 1-3), Nodes Excluded (columns 4-6)
(1) All  (2) North  (3) South  (4) All  (5) North  (6) South
Rail Within 20 km  0.175***  0.124**  0.151  0.119*  0.092*  0.046
  [0.065]  [0.053]  [0.159]  [0.062]  [0.052]  [0.112]
1900 Pop. Z-score  0.507***  1.346***  0.355***  0.489***  1.434**  0.353***
  [0.140]  [0.508]  [0.080]  [0.142]  [0.621]  [0.080]
Observations  7510  5985  1525  7487  5971  1516

Note: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors clustered at the local government area level in brackets. Table estimates the impact of proximity to the railway line on urbanization in 1960 (measured as city presence and urban population) within 10 km x 10 km local grid cells. The Z-score is the standardized score of the variable of interest, computed as the difference from the mean divided by the standard deviation. We control for the presence of mission stations within the grid cell. All regressions include state of residence fixed effects as well as baseline controls. Climatic and geographic controls are measured as the average within the grid cell.

Table 3.A.12: Benefits of Shipping by Rail For Key Regional Crops

Crop (region)  Groundnuts (Northern)  Cotton (Northern)  Palm Oil (Southern)  Cocoa (Southern)

Shipping Prices
Rail Price (pence per ton km)  1.95  1.37  3.95  2.08
River Price (pence per ton km)  0.9 (+ 3.1 rail)  2.5 (+ 3.1 rail)  1.8  1.8
Road Price (pence per ton km)  5.6  5.6  2.5  1.3

Shipping Distances
Distance Rail (km)  1127  1159  61  193
Distance River (km)  575 river (552 rail)  575 river (584 rail)  61  193

Cost Reduction from Rail
As % of River Cost  -1.4  -51.1  119.4  15.6
As % of Road Cost  -65.2  -75.5  58  60

Note: Table calculates the benefit of the railway over the period 1945-1949. For river shipments in the North, the cost is estimated as the cost of railing to Baro and then shipping by river to the Delta ports, hence the rail prices and distances in parentheses. We use the railing distance as the shipping distance for rivers in the South, although this might be an overestimate given the proximity of the South to several rivers which lead to the coast.
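The percentages in the last two rows of the table can be reproduced from the prices and distances above. The sketch below is one way to do the arithmetic: it treats each route as a list of (price, distance) legs and reports the change in total cost when shipping by rail instead of by river or road. The names `ROUTES`, `route_cost` and `rail_cost_change` are introduced here for illustration, and the road distance is assumed equal to the rail distance, since the table lists no separate road distance. Under these assumptions the computed figures match the table, with negative values meaning that rail is cheaper than the alternative and positive values meaning that it is more expensive.

```python
# Prices (pence per ton km) and distances (km) copied from Table 3.A.12.
# Each route is a list of (price, distance) legs. Northern river routes
# combine a river leg with a rail leg to Baro, as described in the note.
# Assumption: road distance equals rail distance (not listed in the table).
ROUTES = {
    "Groundnuts": {"rail": [(1.95, 1127)],
                   "river": [(0.9, 575), (3.1, 552)],
                   "road": [(5.6, 1127)]},
    "Cotton": {"rail": [(1.37, 1159)],
               "river": [(2.5, 575), (3.1, 584)],
               "road": [(5.6, 1159)]},
    "Palm Oil": {"rail": [(3.95, 61)],
                 "river": [(1.8, 61)],
                 "road": [(2.5, 61)]},
    "Cocoa": {"rail": [(2.08, 193)],
              "river": [(1.8, 193)],
              "road": [(1.3, 193)]},
}


def route_cost(legs):
    """Total cost in pence per ton: sum of price times distance over legs."""
    return sum(price * km for price, km in legs)


def rail_cost_change(crop):
    """Percentage change in shipping cost from using rail, by alternative mode."""
    routes = ROUTES[crop]
    rail = route_cost(routes["rail"])
    changes = {}
    for mode in ("river", "road"):
        other = route_cost(routes[mode])
        changes[mode] = round(100.0 * (rail - other) / other, 1)
    return changes


for crop in ROUTES:
    print(crop, rail_cost_change(crop))
```

For groundnuts, for example, rail costs 1.95 x 1127 pence per ton against 0.9 x 575 + 3.1 x 552 by river, a difference of about 1.4 percent in favor of rail.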

Table 3.A.13: Effect of Proximity to Railway By Distance to Coastal Port

Panel A: Above Median Distance to Port (192 km)

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  1.807***  0.177***  0.022***  -0.073***  0.103***  0.104***  0.209***  0.779***  0.207***
  [0.326]  [0.030]  [0.005]  [0.019]  [0.019]  [0.019]  [0.035]  [0.120]  [0.057]
Observations  22164  21987  22088  22088  22052  22129  22118  22194  22194

Panel B: Below Median Distance to Port (192 km)

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  -0.080  0.009  -0.007  -0.026  -0.012  -0.017  0.019  0.161  0.131
  [0.346]  [0.023]  [0.015]  [0.031]  [0.028]  [0.016]  [0.044]  [0.171]  [0.089]
Observations  16863  16718  16793  16793  16806  16832  16840  16868  16868

Panel C: Interaction with log(Distance to Ports)

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  -1.504*  -0.194***  0.001  0.020  -0.086  -0.154***  -0.160*  -0.604  0.065
  [0.885]  [0.067]  [0.024]  [0.067]  [0.069]  [0.053]  [0.090]  [0.367]  [0.221]
X ln(Dist. Port)  0.494***  0.058***  0.003  -0.014  0.028**  0.039***  0.056***  0.214***  0.019
  [0.166]  [0.014]  [0.004]  [0.011]  [0.012]  [0.010]  [0.017]  [0.066]  [0.038]
Observations  39028  38706  38882  38882  38859  38962  38959  39063  39063

Note: Standard errors clustered at the local government area level in brackets. Table estimates the impact of distance to the railway by distance to coastal ports (Bonny, Burutu, Calabar, Degema, Lagos, Opobo, Port Harcourt, Sapele, Warri). All railway nodes are dropped from the regressions. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distance to rail network is computed using DHS data and GIS information on the rail network. Climatic and geographic controls are measured as the average within the local government area. Data on Christian mission stations comes from maps described in the text. All other variables are taken from the 2008 Nigeria DHS.

Table 3.A.14: Effect of Railway By Proximity to Early Cities

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  1.198***  0.116***  0.019***  -0.067***  0.078***  0.067***  0.149***  0.611***  0.160***
  [0.260]  [0.024]  [0.005]  [0.017]  [0.017]  [0.014]  [0.030]  [0.107]  [0.052]
X 1900 City Within 20 km  -0.501  -0.021  -0.028*  0.058**  -0.066**  -0.048**  -0.067  -0.393**  -0.116
  [0.458]  [0.038]  [0.016]  [0.028]  [0.030]  [0.023]  [0.049]  [0.169]  [0.092]
1900 City Within 20 km  1.755***  0.138***  0.040***  -0.120***  0.128***  0.087***  0.198***  0.859***  0.329***
  [0.252]  [0.022]  [0.012]  [0.024]  [0.022]  [0.016]  [0.032]  [0.117]  [0.061]
Observations  39028  38706  38882  38882  38859  38962  38959  39063  39063

Note: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors clustered at the local government area level in brackets. Table estimates the impact of the railway, by proximity to a city in 1900. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distances to rail network are computed using DHS data and GIS information on the rail network. Climatic and geographic controls are measured as the average within the local government area. Data on Christian mission stations comes from maps described in the text. All other variables are taken from the 2008 Nigeria DHS.

3.B Additional Tables

Table 3.B.1: Effect of Railway: Robustness to other Measures of Connectedness

Panel A: Closeness to Railway Lines

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Closeness to Rail  0.593***  0.056***  0.008***  -0.029***  0.038***  0.029***  0.072***  0.281***  0.074***
  [0.099]  [0.009]  [0.002]  [0.006]  [0.006]  [0.006]  [0.010]  [0.037]  [0.019]
Observations  39028  38706  38882  38882  38859  38962  38959  39063  39063

Panel B: Proximity to Railway Station

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Station Within 20 km  1.655***  0.155***  0.021***  -0.097***  0.100***  0.057***  0.185***  0.722***  0.232***
  [0.314]  [0.028]  [0.007]  [0.016]  [0.018]  [0.016]  [0.033]  [0.113]  [0.055]
Observations  39028  38706  38882  38882  38859  38962  38959  39063  39063

Panel C: Presence of Rail Tracks in Local Area

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail in Local Area  0.696***  0.072***  0.013**  -0.025  0.047***  0.047***  0.096***  0.383***  0.090*
  [0.266]  [0.023]  [0.005]  [0.016]  [0.017]  [0.014]  [0.029]  [0.103]  [0.049]
Observations  39028  38706  38882  38882  38859  38962  38959  39063  39063

Note: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors clustered at the local government area level in brackets. Closeness to the railway line is defined as the log of the inverse of 1 plus the distance of the individual's cluster to the railway line. Rail in Local Area takes the value 1 if a railway line crosses the individual's local government area. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distances to rail network are computed using DHS data and GIS information on the rail network. Climatic and geographic controls are measured as the average within the local government area. Data on Christian mission stations comes from maps described in the text. All other variables are taken from the 2008 Nigeria DHS.

Table 3.B.2: Conley Standard Errors

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  1.371***  0.136***  0.018***  -0.071***  0.082***  0.068***  0.165***  0.649***  0.185***
  [0.283]  [0.025]  [0.005]  [0.019]  [0.017]  [0.016]  [0.029]  [0.112]  [0.050]
Observations  39028  38706  38882  38882  38859  38962  38959  39063  39063

Note: *p < .1, ** p < 0.05, ***p < 0.01. Table estimates the impact of proximity to the railway adjusting for spatial correlation with Conley standard errors (in brackets). Conley standard errors are computed with a cutoff of 80 km. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distances to the rail network are computed using DHS data and information on the rail network. Climatic and geographic controls are measured as the average within the local government area. Data on Christian mission stations comes from maps described in the text. All other variables are taken from the 2008 Nigeria DHS.
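To make the spatial adjustment concrete, the following is a minimal sketch of Conley-type standard errors for a regression of an outcome on an intercept and a single regressor, such as a rail-within-20-km indicator. It is a simplification, not the specification behind the table, which also includes fixed effects and baseline controls. The function names, the haversine distance, and the uniform kernel (every pair of observations within the cutoff gets weight 1) are assumptions for this example; the note specifies only the 80 km cutoff.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def conley_se_simple(x, y, coords, cutoff_km=80.0):
    """OLS of y on an intercept and x with spatially robust standard errors.

    Returns ((intercept, slope), (se_intercept, se_slope)).
    Assumption: uniform kernel with weight 1 for pairs within cutoff_km.
    """
    n = len(y)
    # OLS using the closed-form inverse of the 2x2 matrix X'X.
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy
    # Score vector (1 * e_i, x_i * e_i) for each observation.
    scores = []
    for xi, yi in zip(x, y):
        e = yi - b0 - b1 * xi
        scores.append((e, xi * e))
    # Middle matrix: sum of s_i s_j' over all pairs within the cutoff,
    # including i == j (distance zero).
    meat = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(n):
        for j in range(n):
            if haversine_km(coords[i], coords[j]) <= cutoff_km:
                for r in range(2):
                    for c in range(2):
                        meat[r][c] += scores[i][r] * scores[j][c]
    # Sandwich (X'X)^-1 * meat * (X'X)^-1; only the diagonal is needed.
    variances = [
        sum(inv[k][r] * meat[r][c] * inv[c][k]
            for r in range(2) for c in range(2))
        for k in range(2)
    ]
    # A uniform kernel does not guarantee non-negative variances.
    ses = tuple(math.sqrt(v) if v >= 0 else float("nan") for v in variances)
    return (b0, b1), ses
```

With a cutoff of zero only each observation's own term enters, so the estimator reduces to heteroskedasticity-robust standard errors; widening the cutoff adds cross-products of residuals from nearby clusters. Kernels that decline with distance, such as the Bartlett kernel, are a common alternative to the uniform weights assumed here.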

Table 3.B.3: Effect of Railway: Robustness to Various Sub-samples

Panel A: Rural Areas

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  0.835***  0.085***  0.011**  -0.044**  0.052***  0.053***  0.105***  0.467***  0.000
  [0.298]  [0.027]  [0.005]  [0.018]  [0.019]  [0.016]  [0.032]  [0.117]  [.]
Observations  25508  25272  25410  25410  25387  25462  25452  25523  25523

Panel B: Migrants Only

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  1.478***  0.148***  0.022***  -0.071***  0.082***  0.082***  0.198***  0.754***  0.202***
  [0.274]  [0.025]  [0.007]  [0.017]  [0.019]  [0.017]  [0.030]  [0.107]  [0.053]
Observations  22168  21977  22082  22082  22063  22132  22129  22186  22186

Panel C: Non-Migrants Only

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  1.225***  0.123***  0.010  -0.061***  0.082***  0.055***  0.126***  0.534***  0.162***
  [0.295]  [0.025]  [0.007]  [0.022]  [0.018]  [0.016]  [0.030]  [0.103]  [0.054]
Observations  16767  16637  16707  16707  16705  16737  16737  16784  16784

Note: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors clustered at the local government area level in brackets. Table shows estimates of the impact of being within 20 km of a railway line on various individual outcomes for different sub-samples. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distances to rail network are computed using DHS data and GIS information on the rail network. Climatic and geographic controls are measured as the average within the local government area (county). Data on Christian mission stations comes from historical maps, as described in the text. All other variables are taken from the 2008 Nigeria DHS.

Table 3.B.4: Effect of Railway: Robustness to Various Sub-samples

Panel A: No Mission Stations in Local Area

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  1.457***  0.142***  0.022***  -0.067***  0.082***  0.063***  0.173***  0.695***  0.196***
  [0.292]  [0.026]  [0.006]  [0.017]  [0.018]  [0.016]  [0.033]  [0.118]  [0.055]
Observations  30706  30436  30588  30588  30562  30653  30643  30733  30733

Panel B: No Rail Tracks in Local Area

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  0.961**  0.093**  0.018*  -0.076***  0.065**  0.026  0.143***  0.614***  0.223**
  [0.410]  [0.039]  [0.011]  [0.022]  [0.028]  [0.022]  [0.042]  [0.161]  [0.092]
Observations  30179  29958  30069  30069  30051  30126  30117  30205  30205

Panel C: No Rail Station in Local Area

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  0.545*  0.062***  0.013**  -0.018  0.037*  0.059***  0.086***  0.345***  0.074
  [0.279]  [0.023]  [0.006]  [0.021]  [0.019]  [0.017]  [0.031]  [0.115]  [0.066]
Observations  32940  32696  32827  32827  32802  32880  32877  32971  32971

Note: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors clustered at the local government area level in brackets. Table shows estimates of the impact of being within 20 km of a railway line on various individual outcomes for different sub-samples. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distances to rail network are computed using DHS data and GIS information on the rail network. Climatic and geographic controls are measured as the average within the local government area (county). Data on Christian mission stations comes from historical maps, as described in the text. All other variables are taken from the 2008 Nigeria DHS.

Table 3.B.5: Falsification Exercise: Placebo Lines Estimates in North and South

Panel A: Placebo Lines in the North

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Placebo Within 20 km  -0.056  -0.001  -0.002  -0.003  -0.016  0.020  0.023  0.143  0.023
  [0.230]  [0.021]  [0.006]  [0.017]  [0.015]  [0.020]  [0.028]  [0.103]  [0.050]
ln(Dist. River)  -0.047  -0.001  -0.007**  -0.001  -0.013*  0.005  -0.019*  -0.042  -0.017
  [0.123]  [0.011]  [0.003]  [0.008]  [0.008]  [0.008]  [0.012]  [0.044]  [0.027]
ln(Dist. Road)  -0.338***  -0.031***  -0.005*  0.030***  -0.016***  -0.030***  -0.043***  -0.208***  -0.094***
  [0.087]  [0.008]  [0.003]  [0.007]  [0.006]  [0.007]  [0.011]  [0.037]  [0.022]
Observations  15099  14978  15053  15053  15022  15072  15057  15120  15120

Panel B: Placebo Lines in the South

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Placebo Within 20 km  0.464*  0.035**  0.012  -0.041*  0.040*  0.033**  0.039  0.233**  0.071
  [0.237]  [0.018]  [0.009]  [0.025]  [0.024]  [0.016]  [0.027]  [0.113]  [0.064]
ln(Dist. River)  0.031  -0.006  0.007**  0.002  0.008  -0.010  -0.014  -0.013  0.043
  [0.093]  [0.007]  [0.003]  [0.011]  [0.010]  [0.007]  [0.010]  [0.044]  [0.030]
ln(Dist. Road)  -0.922***  -0.062***  -0.015***  0.078***  -0.088***  -0.038***  -0.086***  -0.471***  -0.144***
  [0.106]  [0.008]  [0.004]  [0.012]  [0.010]  [0.007]  [0.012]  [0.048]  [0.033]
Observations  13563  13475  13514  13514  13520  13533  13540  13566  13566

Note: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors clustered at the local government area level in brackets. The table estimates the impact of being within 20 km of a placebo line (surveyed lines that were not constructed). All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distances to rail network, rivers and roads are computed using DHS data and GIS information on rail, river and road networks. Climatic and geographic controls are measured as the average within the local government area. Data on Christian mission stations comes from maps described in the text. All other variables are taken from the 2008 Nigeria DHS.

Table 3.B.6: Placebo Lines as Control Group in North and in South

Panel A: North

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  1.603***  0.179***  0.025***  -0.031  0.089***  0.074***  0.183***  0.644***  0.131
  [0.384]  [0.037]  [0.008]  [0.022]  [0.023]  [0.022]  [0.048]  [0.159]  [0.085]
Observations  8617  8530  8582  8582  8554  8602  8605  8628  8628

Panel B: South

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  -0.348  0.024  -0.021  -0.009  -0.002  -0.008  0.014  0.116  0.074
  [0.432]  [0.031]  [0.025]  [0.038]  [0.036]  [0.021]  [0.064]  [0.236]  [0.118]
Observations  11024  10929  10976  10976  10990  11006  11012  11028  11028

Note: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors clustered at the local government area level in brackets. The table estimates separately for Southern and Northern Nigeria the impact of being within 20 km of a railway line relative to being within 20 km of a placebo line. All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Climatic and geographic controls are from Fischer et al. (2008) and Hijmans et al. (2005), and are measured as the average within the local government area. Data on Christian mission stations comes from maps published by Ayandele (1966) and Roome (1925). All other variables are taken from the 2008 Nigeria DHS.

Table 3.B.7: Robustness of No Effect in South Excluding Crude Oil Producers

(1) Schooling  (2) Literacy  (3) Professional  (4) Agricultural  (5) Read Paper  (6) Listen to Radio  (7) Watch TV  (8) Wealth Index  (9) Urban
Rail Within 20 km  0.030  0.025  -0.004  -0.008  0.004  -0.008  0.031  0.178  0.006
  [0.392]  [0.027]  [0.018]  [0.033]  [0.031]  [0.019]  [0.056]  [0.203]  [0.098]
Observations  9329  9239  9292  9292  9301  9317  9320  9333  9333

Note: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors clustered at the local government area level in brackets. Table estimates the impact of proximity to the railway in Southern Nigeria (West, East, South) excluding oil-producing areas. Oil-producing areas are the historically oil-producing states (Abia, Akwa Ibom, Bayelsa, Cross River, Delta, Edo, Imo, Ondo, Rivers). All regressions include ethnicity and state of residence fixed effects as well as all baseline controls. Distance to rail network is computed using data from the 2008 Nigeria DHS and information on rail network taken from DMA (1992). Climatic and geographic controls are from Fischer et al. (2008) and Hijmans et al. (2005), and are measured as the average within the local government area. Data on Christian mission stations comes from maps published by Ayandele (1966) and Roome (1925). All other variables are taken from the 2008 Nigeria DHS.

Bibliography

Ajayi, J.F.A., L.K.H. Goma, and A.G. Johnson, The African Experience with Higher Education, Association of African Universities, 1996.

Anene, J.C., Southern Nigeria in Transition, 1885-1906: Theory and Practice in a Colonial Protectorate, Cambridge University Press, 1966.

Ayandele, E.A., The Missionary Impact on Modern Nigeria, 1842-1914: A Political and Social Analysis, Ibadan History Series, Longmans, 1966.

Banerjee, Abhijit, Esther Duflo, and Nancy Qian, "On the Road: Access to Transportation Infrastructure and Economic Growth in China," NBER Working Papers 17897, National Bureau of Economic Research, Mar 2012.

Baum-Snow, Nathaniel, Loren Brandt, J Vernon Henderson, Matthew A Turner, and Qinghua Zhang, "Roads, Railroads and Decentralization of Chinese Cities," Technical Report, Working paper, 2012.

Bleakley, Hoyt and Jeffrey Lin, "Portage and Path Dependence," The Quarterly Journal of Economics, 2012, 127 (2), 587.

Cage, Julia and Valeria Rueda, "The Long-Term Effects of the Printing Press in Sub-Saharan Africa," American Economic Journal: Applied Economics, July 2016, 8 (3), 69-99.

Conley, T.G., "GMM Estimation with Cross Sectional Dependence," Journal of Econometrics, 1999, 92 (1), 1-45.

Crowder, M., The Story of Nigeria, Faber Paperbacks, Faber & Faber, 1980.

Csapo, Marg, "Religious, Social and Economic Factors Hindering the Education of Girls in Northern Nigeria," Comparative Education, 1981, 17 (3), 311-319.

Dev, Pritha, Blessing U. Mberu, and Roland Pongou, "Ethnic Inequality: Theory and Evidence from Formal Education in Nigeria," Economic Development and Cultural Change, 2016, 64 (4), 603-660.

DMA, Defense Mapping Agency, Digital Chart of the World, Fairfax, Virginia: Defense Mapping Agency, 1992. Available at http://www.diva-gis.org/gdata.

Donaldson, Dave, "Railroads of the Raj: Estimating the impact of transportation infrastructure," American Economic Review, 2016, Forthcoming.

Donaldson, Dave and Richard Hornbeck, "Railroads and American Economic Growth: A Market Access Approach," The Quarterly Journal of Economics, 2016.

Faber, Benjamin, "Trade Integration, Market Size, and Industrialization: Evidence from China's National Trunk Highway System," Review of Economic Studies, 2014, 81 (3), 1046-1070.

Falola, T. and M.M. Heaton, A History of Nigeria, Cambridge University Press, 2008.

FAO, "Nigeria At a Glance," 2016. Available at http://www.fao.org/nigeria/fao-in-nigeria/nigeria-at-a-glance/en/, Accessed: February, 2016.

Fay, Marianne and Charlotte Opal, Urbanization Without Growth: A Not So Uncommon Phenomenon, Policy Research Working Papers, World Bank, Policy Research Dissemination Center, 2000.

Fischer, G., F. Nachtergaele, S. Prieler, H.T. van Velthuizen, L. Verelst, and D. Wiberg, Global Agro-Ecological Zones Assessment for Agriculture (GAEZ 2008), IIASA, Laxenburg, Austria and FAO, Rome, Italy, 2008.

Fogel, R.W., Railroads and American Economic Growth: Essays in Econometric History, Baltimore: Johns Hopkins Press, 1964.

Fourie, Johan and Alfonso Herranz-Loncan, "Growth (and Segregation) by Rail: How the Railways Shaped Colonial South Africa," 2015.

Gallego, Francisco A. and Robert Woodberry, "Christian Missionaries and Education in Former African Colonies: How Competition Mattered," Journal of African Economies, 2010, 19 (3), 294-329.

Hijmans, R. J., M. Cruz, E. Rojas, and L. Guarino, "DIVA-GIS, version 1.4. A Geographic Information System for the Management and Analysis of Genetic Resources Data," http://www.diva-gis.org/gdata, 2001.

Hijmans, Robert J., Susan E. Cameron, Juan L. Parra, Peter G. Jones, and Andy Jarvis, "Very High Resolution Interpolated Climate Surfaces for Global Land Areas," International Journal of Climatology, 2005, 25 (15), 1965-1978.

Hodder, B. W., "Tin Mining on the Jos Plateau of Nigeria," Economic Geography, 1959, 35 (2), 109-122.

Huillery, Elise, "History Matters: The Long-Term Impact of Colonial Public Investments in French West Africa," American Economic Journal: Applied Economics, 2009, pp. 176-215.

Jaekel, Francis, The History of the Nigerian Railway: Opening the Nation to Sea, Air and Road Transportation, Vol. 1, Spectrum Books, 1997.

Jedwab, Remi and Alexander Moradi, "The Permanent Effects of Transportation Revolutions in Poor Countries: Evidence from Africa," CEH Discussion Papers 031, Centre for Economic History, Research School of Economics, Australian National University, Jan 2015.

Jedwab, Remi and Dietrich Vollrath, "Urbanization Without Growth in Historical Perspective," Explorations in Economic History, 2015, 58, 1-21.

Jedwab, Remi, Edward Kerby, and Alexander Moradi, "History, Path Dependence and Development: Evidence from Colonial Railroads, Settlers and Cities in Kenya," The Economic Journal, 2015, Forthcoming.

Lamb, P. H., "Past, Present and Future of Cotton-Growing in Nigeria," Empire Cotton-Growing Review, 1925, pp. 18-23.

Law, Robin, From Slave Trade to 'Legitimate' Commerce: The Commercial Transition in Nineteenth-Century West Africa, African Studies, Cambridge, U.K.: Cambridge University Press, 2002.

NBS and ICF International, 2008 Nigeria Demographic and Health Survey [Data File], Vol. NGIR52FL.DTA and NGMR52FL.DTA, Calverton, Maryland: ICF International: Nigeria Bureau of Statistics and ICF International, 2008. Available at http://dhsprogram.com/data/dataset/NigeriaStandardDHS_2008.cfm?flag=1.

Nunn, Nathan, "Gender and Missionary Influence in Colonial Africa," in Emmanuel Akyeampong, Robert H. Bates, Nathan Nunn, and James Robinson, eds., African Development in Historical Perspective, Cambridge: Cambridge University Press, May 2014.

Okoye, Dozie and Roland Pongou, "Historical Missionary Activity, Schooling, and the Reversal of Fortunes: Evidence from Nigeria," MPRA Paper 58052, University Library of Munich, Germany, Aug 2014.

Onyewuenyi, Remy N, Railway Development and the Growth of Export Agriculture in Nigeria during the 1900-1950 Period, University of Ottawa (Canada), 1981.

Rappaport, Jordan, "Moving to Nice Weather," Regional Science and Urban Economics, 2007, 37 (3), 375-398.

Roome, William R.M., Ethnographic Survey of Africa Showing the Tribes and Languages: Also the Stations of Missionary Societies, Edward Stanford Ltd, 1925.

Schätzl, Ludwig, Industrialization in Nigeria: A Spatial Analysis, Vol. 81, Weltforum Verlag München, 1973.

Storeygard, Adam, "Farther on down the Road: Transport Costs, Trade and Urban Growth in Sub-Saharan Africa," The Review of Economic Studies, 2016, 83 (3), 1263-1295.

Taaffe, Edward J, Richard L Morrill, and Peter R Gould, "Transport Expansion in Underdeveloped Countries: A Comparative Analysis," Geographical Review, 1963, 53 (4), 503-529.

UC Berkeley, Museum of Vertebrate Zoology, GADM Database of Global Administrative Areas, Version 2.7, University of California, Berkeley, 2014. Available at www.gadm.org.

Wantchekon, Leonard, Marko Klašnja, and Natalija Novta, "Education and Human Capital Externalities: Evidence from Colonial Benin," The Quarterly Journal of Economics, 2015, forthcoming.

World Bank, "The Economic Development of Nigeria," Technical Report 11151, The World Bank, Baltimore, MD 1955.

Chapter 4

Pay-As-You-Go vs. Capital Markets: A Rational Model of the Adoption of Savings Investment Technologies

One of the puzzles of macroeconomic policy analysis is the extent to which the importance of pay-as-you-go retirement systems varies across countries in the developed world. Table 4.E.1, which presents effective tax rates for these systems in the largest countries of the OECD, shows that, in 2002, they varied from 5% in Australia to 36% in Italy.

Analysts have sought to explain this diversity in different ways. Some have argued that it reflects historical differences, such as differences in the degree of social conflict and the power of unions.1 Others have suggested that it is the mark of cultural factors, some societies being more risk averse than others, and thus more intent on ensuring the stability of retirement incomes.

We seek to demonstrate that pay-as-you-go (henceforth paygo) effective contribution rates are the consequence of rational, welfare maximizing decisions by individuals and societies in countries whose underlying economic characteristics are themselves different. Our hypothesis builds on the Aaron (1966) proposition that a society's choice between paygo and funded saving should depend on whether the natural rate of growth is greater or less than the rate of return on capital - paygo being optimal in the first case and funded saving in the second. Aaron's criterion was a knife-edge criterion. We suggest that, in the richer world of stochastic dynamics, it is optimal for societies to choose to rely on both forms of provision (paygo and saving), and that it is the balance of the two that is affected by economic conditions.

Simply put, in countries in which labor income is expected to grow slowly and to be subject to recurrent shocks, individuals and society will, ceteris paribus, emphasize personal saving for the provision of retirement income. In countries in which the real return to capital is expected to be low and volatile, they will, on the contrary, put more weight on paygo transfers.

1 In seeking exogenous explanations for different attitudes toward in the United States and Europe, Alesina and Glaeser (2004) point, among other factors, to the presence of racial tensions in the United States. In their discussion of the divergent evolution of social security provisions in different countries, Bruno and Sachs (1985) emphasized the nature of union pressures.

In order to test this approach, we construct a model of how societies and individuals determine the levels of paygo and saving in the provision of retirement income. This is a simple, two-period overlapping generations model in which a benevolent public authority and a representative individual jointly choose the paygo tax rate and saving rate that maximize the representative individual's expected life-time utility. The next step is to construct empirical estimates of the dynamic distributions of labor and capital income in each country. Armed with these distributions, we use the model to generate a cross-section of predicted values of paygo and saving rates for each country in a reference year, which we take to be 2002. The test of the validity of the model is how well it explains actual, effective paygo commitments in that year.2 Though the model describes the dynamic behavior of individuals and societies over time, what it explains is differences in effective paygo contribution rates across countries at a given point in time.

The biggest challenge in this exercise is data construction. In order to analyze a society's paygo and saving choices, we must first estimate the distribution of the labor and capital income of a representative individual for sixty or more years - active life plus retirement. Moreover, the variability and higher moments of this distribution are decisively important for the computation of expected utility. In order to estimate these, we would ideally like to have historical data on as many life-cycles as possible. The available macroeconomic data, which only stretch back one hundred years, do not provide direct evidence of more than two or three lifetimes.

2 We chose 2002 as reference year because it is the first year when comparative and comprehensive data on pension systems in the OECD became available (OECD, 2003). We would like also to test the ability of the model to predict saving rates, but data on life-time saving rates, which are necessarily estimated by cohort rather than by calendar year, exist in only a few countries.

To overcome this obstacle, we resort to a Monte-Carlo approach. Using data since the end of World War II, we estimate simple models of the annual dynamics of labor and capital income. We then use the parameters of these models, and bootstrapped estimates of the distributions of their error terms, to simulate as many life-time histories as we need. With these, we compute the expected life-time utility of the representative individual corresponding to any pair of paygo tax and saving rates. Our model says that the rates that maximize this expected utility are the rates that society will choose.

We test the model on a subset of the countries of the OECD, focusing on high-income countries which have developed financial markets, and which, though open to trade, are not so small that trade dwarfs domestic production. Specifically, we chose from the 23 developed countries in the OECD in 2003, the eight for which the average value of the ratio of exports plus imports to GDP during the decade of the 1990s was less than 55 percent.3

This article builds on a long literature about the implications of risk and uncertainty for the efficiency of paygo schemes. Merton (1983), Gordon and Varian (1988), Gale (1990), Demange and Laroque (1999, 2001), Demange (2002) and others have shown that a paygo system can enhance welfare in dynamically efficient economies because of the unique opportunity it provides for workers to spread the risk of life-time earnings over different generations. Gottardi and Kubler (2011) extend these analyses and note that the result depends on the degree to which the paygo system reduces capital accumulation. In DeMenil et al. (2006), we derived analytical results relating steady state optimal saving and paygo tax rates (when they exist and are unique) to underlying fundamentals. The present article uses the same model to obtain empirical estimates of a cross-section of economically rational tax and saving rates.

3 The countries we selected are Australia, France, Germany, Italy, Japan, Spain, the United Kingdom and the United States. The average ratio of exports plus imports to GDP during the 1990s in these eight countries ranges from 18% in Japan to 53% in the United Kingdom.

More recently, several papers have used models similar to ours to analyze the properties of public policies designed to provide intergenerational risk sharing. Bohn (2009) posits a comprehensive, intergenerational welfare maximization framework, and concludes that social security policies that shift risk from the working years to the years of retirement are preferable to policies that shift risk in the opposite direction. Thogersen (1998) reaches a similar conclusion in an analysis which focuses on the mean and variance of life-time income in both periods combined. Using a model also similar to the one used here, Wagener (2003) argues that this conclusion follows from the use of an ex-ante utility criterion. He shows that the conclusion is reversed if the income of the representative individual in her working years is taken as given, and utility is maximised over expectations of retirement income. Beetsma et al. (2011) extend this conclusion to economies in which there coexist both paygo and mandatory funded pension systems. They argue that a fixed replacement rule in the funded pillar provides risk sharing benefits which a defined contribution rule cannot. Hemert (2005) analyses state-contingent pension transfers. Gollier (2008) abstracts from the uncertainty of labor income in order to focus on the volatility of the return to saving, and shows that pooled pension funds can of themselves provide intergenerational insurance which the kind of individual saving behavior modeled here cannot.

Simulation methods have been increasingly used to estimate the welfare effects of either increasing or down-sizing a paygo system such as that of the United States. Krueger and Kubler (2006) posit an economy populated with representative agents living through a life-cycle of nine stages, who face macroeconomic shocks to labor and capital income. They use Monte-Carlo techniques to compute the expected life-time utility of each cohort of agents. Their focus is on the Pareto optimality of a single experiment - the introduction of Social Security with a 2% tax rate in the United States in 1935. In their most general cases, they conclude that the experiment was not Pareto improving.

Nishiyama and Smetters (2007) emphasize individual diversity (neglected in this study and many of those cited). In their model of the U.S. economy, individuals have different skill levels, and are subject to idiosyncratic shocks. Their model does not allow for macroeconomic shocks to the wage rate and the return to capital. They find that progressively reducing the U.S. paygo system to half the size it began with in 2003 would diminish welfare in the sense of Hicks, unless the down-sizing were accompanied by labor tax reforms that compensated for the insurance against idiosyncratic risk provided by the original paygo system.

Fuster et al. (2007) depart from the non-altruistic framework adopted in much of this research, and posit individuals who incorporate the well being of both children and parents in their own decision making. They simulate the effect of a once-and-for-all elimination of Social Security on the welfare of dynasties with different skill and demographic characteristics, and find a majority benefit from the change. Their result depends critically on the elasticity of life-time labor supply to the taxes required to finance Social Security.

Other authors, Dutta et al. (2000) and Matsen and Thogersen (2004), have used historically observed growth rates and rates of investment return to calculate optimal paygo tax rates and saving rates for different developed countries. However, our Monte-Carlo approach to estimating the relevant moments of these variables is different from theirs. Dutta, Kapur and Orszag (DKO) and Matsen and Thogersen (MT) take decades as representative of life histories.

DKO and MT's "portfolio approach" further reduces the design of a social security system to a decision regarding the optimal weights to be given, in a fixed portfolio of social security assets, to investment in capital and investment in an asset whose rate of return is equal to the growth of the wage bill. They deal only in rates of growth and thus implicitly assume that the real wage bill follows a random walk with a drift. They do not allow for the possibility that real wages may tend to regress to their long-term trend. As MT themselves point out, in such a world, there is no way to insure against the risk of a bad draw in the lottery for life-time labor income. Their approach thus assumes away one of the principal early arguments in favor of paygo retirement schemes. Our approach is to model the real wage bill as a process of stochastic deviations around a deterministic growth path, and test for the stationarity of that process. When non-stationarity is rejected, as it is in our estimates, paygo can provide intergenerational insurance against the variability of real wage income. We calculate the amount of such insurance which would be optimal for each country.
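The contrast between a random walk and a stationary deviation process can be illustrated with a small sketch. It assumes, purely for illustration, that deviations from trend follow a first-order autoregression starting from zero; the function names, the AR(1) form and the parameter values are hypothetical and are not the specification estimated in this chapter. Under a random walk, the variance of the wage deviation grows without bound with the horizon, whereas under a stationary process it remains bounded, which is what leaves room for insurance across generations.

```python
def random_walk_variance(sigma: float, horizon: int) -> float:
    """Variance of a driftless random-walk deviation after `horizon` steps."""
    return sigma ** 2 * horizon


def ar1_variance(sigma: float, phi: float, horizon: int) -> float:
    """Variance after `horizon` steps of x_t = phi * x_{t-1} + e_t, with x_0 = 0.

    Assumes |phi| < 1 (stationary case); sigma is the innovation std. dev.
    """
    return sigma ** 2 * (1.0 - phi ** (2 * horizon)) / (1.0 - phi ** 2)


# Hypothetical values: 2% annual innovations, persistence 0.8.
for years in (1, 10, 30):
    print(years, random_walk_variance(0.02, years), ar1_variance(0.02, 0.8, years))
```

In this sketch the two variances coincide at a one-year horizon and diverge as the horizon lengthens.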

This article is structured as follows: Section 4.1 discusses the empirical evidence of diversity in effective paygo provision in the OECD. Section 4.2 presents a model of how societies and individuals determine paygo tax and saving rates. Section 4.3 presents our data sources and estimates of the annual dynamics of average earnings and the average return to saving made with macroeconomic data from 1950 through 2002. This section includes a test of the stationarity of the wage process. Section 4.4 aggregates these annual income components into summary measures of the labor and capital income of the representative individual in the active and retirement years. Section 4.5 presents the results. We derive the tax and saving rates which maximize the expected utility of the representative individual in each country. We test the ability of these predicted rates to explain the cross-country variation of the effective paygo tax rates in our sample. Our principal result is that our preferred estimates explain 83% of this cross-country variance. This section ends with an attempt to decompose the explained variance, and attribute it to the principal sources of national difference. Section 4.6 addresses two hypothetical questions: What would the effect on paygo tax rates be of a global convergence of national capital markets? How would paygo tax rates have differed if individuals had anticipated the possibility (but not predicted the occurrence) of a crisis like that of 2008? Section 4.7 concludes.

4.1 The Diversity of Effective Paygo Rates

Table 4.E.1 presents, in alphabetical order by country, the effective paygo tax rates which we seek to explain. A comment is in order about the nature of this variable and the way it is measured. It is well known that social security budgets are often out of balance, that paygo pensions are frequently partially financed out of general revenue, and that, as a consequence, wage tax rates are a poor measure of the importance of a paygo system. The answer is to construct alternative, effective paygo tax rates. A country's effective rate measures what the wage tax would have to be for the revenue it generates to finance the paygo disbursements to which that country is committed.4 It measures a country's forward commitment to paygo provision.

Let $b$ be the average pension an employee is entitled to at retirement, relative to the then average economy wide wage, and $d$ be the ratio of the number of pensioners to the number of active employees at that time; then the effective tax rate required to balance the system is

$$\theta = b \, d \tag{4.1}$$

Appendix 4.A discusses how we estimate $b$, the "relative pension level", and $d$, the number of retirees per employee, in order to construct effective paygo tax rates.

4 See Disney (2004).
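As a quick numerical illustration of the balance condition: wage-tax revenue must cover pension outlays, so the effective rate equals the relative pension level times the number of retirees per employee. The sketch below is only an illustration; the function name and the input values are hypothetical and are not taken from Table 4.E.1 or Appendix 4.A.

```python
def effective_paygo_rate(relative_pension: float, retirees_per_employee: float) -> float:
    """Wage tax rate that would balance a paygo system.

    relative_pension: average pension divided by the average wage.
    retirees_per_employee: number of pensioners per active employee.
    """
    if relative_pension < 0 or retirees_per_employee < 0:
        raise ValueError("inputs must be non-negative")
    return relative_pension * retirees_per_employee


# Hypothetical inputs: pensions worth half the average wage,
# and four retirees for every ten employees.
print(effective_paygo_rate(relative_pension=0.5, retirees_per_employee=0.4))
```

With these made-up inputs, a wage tax of about one fifth would be needed to finance the promised pensions.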

The salient characteristic of the distribution presented in Table 4.E.1 is its diversity. The observations form three distinct clusters. Three countries - Australia, the United Kingdom and the United States - have low effective tax rates. At the other end of the spectrum are Italy and Spain - countries whose effective rates are more than double those of the first group. And, in between, is the third cluster - Germany, Japan and France - with effective rates roughly half way between the top group and the bottom group. What explains this pattern? Cultural and historical differences? Or a shared pattern of common response to differences in underlying economic conditions?

4.2 The Model

The central issue individuals and societies face in constructing a pension system is how to allocate income between the active years and the retirement years. The standard, two-period, overlapping generations model describes the basic relationships. We assume that the representative individual saves only for her retirement consumption.5 Where we depart from the standard model is that we treat the paygo tax rate as an endogenous, collective decision. We posit the existence of a benevolent authority which sets the tax rate. The representative individual and the tax authority are assumed both to know the distributions of capital and labor income. Together, acting behind a "veil of ignorance", before lifetime income is known, they jointly choose the tax and saving rates that maximize expected life-time utility. They assume that the economy is on the steady state path towards which it is tending when their decision is made.6

5 We also assume that all individuals are identical, and thus neglect one of the important rationales for social security - its potential effect on intragenerational redistribution. Our representative agent assumption precludes, for instance, exploring the difference between Bismarckian and Beveridgean paygo systems. Disney and Whitehouse (1993) examine the effect of "opting out" in the U.K. on intragenerational distribution. Pestieau (1999) examines the influence of different paygo formulae on distribution within each cohort. Our representative agent assumption facilitates the analysis of cross-country heterogeneity.

6 We model a decision making process of a consensual kind, in which societies and individuals base their choices on the expected utility of a representative individual over her entire life-cycle. Galasso and Profeta (2002) and others focus instead on conflicts between categories of the population, notably between the young and the old. The dynamics in their models depend critically on the demographic structure of the population.

Let $w_A$ be the labor income of the representative individual during her active years, and $w_B$ her pro-rata share of the labor income of the individuals who are working during her retirement years. Each individual provides for her retirement by contributing a portion $\theta$ of her labor income during her active years to a balanced paygo system, and by saving and investing an amount $S$.7 The budget constraint relating consumption during the active and retirement years, $c_A$ and $c_B$, to her labor income, her pension, and her saving is:

$$c_A = (1 - \theta) w_A - S \tag{4.2}$$

$$c_B = \theta w_B + R S \tag{4.3}$$

7 We assume that $0 < \theta < 1$, that liquidity and institutional constraints impose $S > 0$, and that $\theta$ and $S$ remain constant throughout the individual's life. As Hemert (2005) points out, state contingent decisions would improve welfare. Though some governments have enacted state-contingent paygo mechanisms, these are rare. A state contingent paygo tax rate would attenuate the volatility of both paygo payments and paygo benefits. To the extent that the volatility of labor income is an important empirical factor, it could enhance the attractiveness of paygo provision.

$R$ is the gross return on the saving of the working years. The model assumes that the individual plans to consume all her saving during retirement, and does not plan to leave a bequest.

We emphasize the fact that labor income ($w_A$ and $w_B$) and $R$ are random variables. The choice8 of any $S$ and $\theta$ by the individual and society therefore entails evaluating the risks associated with that choice. Let

$$V = E[u(c_A)] + \beta E[u(c_B)] \tag{4.4}$$

be the ex-ante expected utility of the representative individual, where $u(\cdot)$ is the separate utility function applicable to income in each phase of life and $\beta$ is a discount factor reflecting personal time preference. The tax authority and the individual jointly choose $\theta$ and $S$ by maximizing $V$ in (4.4) subject to (4.2) and (4.3). We refer to their choices as $\theta^*$ and $S^*$.
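The joint choice of the tax and saving rates can be sketched numerically. The example below is a minimal illustration and not the estimation procedure used in this chapter: it assumes a utility function with constant relative risk aversion, approximates the expectation by an average over simulated draws of active-years labor income, retirement-years labor income and the gross return, and searches a coarse grid. The lognormal draws, the grid search and all names are hypothetical choices made for the sketch.

```python
import math
import random


def crra(c: float, gamma: float) -> float:
    """Constant-relative-risk-aversion utility (assumed functional form)."""
    if gamma == 1.0:
        return math.log(c)
    return c ** (1.0 - gamma) / (1.0 - gamma)


def expected_utility(theta, s, draws, gamma, beta):
    """Average of u(c_A) + beta * u(c_B) over draws of (w_A, w_B, R)."""
    total = 0.0
    for w_a, w_b, r in draws:
        c_a = (1.0 - theta) * w_a - s   # consumption in the active years
        c_b = theta * w_b + r * s       # consumption in retirement
        if c_a <= 0.0 or c_b <= 0.0:
            return -math.inf            # pair is infeasible for some draw
        total += crra(c_a, gamma) + beta * crra(c_b, gamma)
    return total / len(draws)


def best_pair(draws, gamma=12.0, beta=1.0, steps=20):
    """Grid search over (theta, S); returns (value, theta, S)."""
    grid = [i / steps for i in range(steps)]
    return max(
        (expected_utility(th, s, draws, gamma, beta), th, s)
        for th in grid
        for s in grid
    )


# Hypothetical lognormal draws standing in for simulated life-time histories.
rng = random.Random(0)
draws = [
    (rng.lognormvariate(0.0, 0.1),   # w_A
     rng.lognormvariate(0.3, 0.2),   # w_B
     rng.lognormvariate(0.5, 0.6))   # R
    for _ in range(500)
]
value, theta_star, s_star = best_pair(draws)
print(theta_star, s_star)
```

A finer grid or a continuous optimizer would be natural refinements; the grid is used here only because it makes the logic transparent.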

We leave the possible feedback from S to R and w (through a closed economy aggregate production function, for instance) out of the analysis. One way to interpret this simplification is to argue that free trade determines factor prices up to structural and institutional country-specific effects.9

8 The two-period structure of the model abstracts from much of the complexity of an optimal consumption path. In a dynamic program, the representative agent would choose different levels of saving in each period within her working-age and retirement phases. It is difficult to conjecture how the results would differ within a dynamic setup, as many other factors may intervene. Moreover, the outcome of a dynamic program would not consist of a single saving rate, but of a set of age-specific saving rates. A plausible hypothesis would be that the representative agent would favor high-return, high-risk investments at the beginning of her active life (i.e. higher saving), and conversely, lower-return and lower-risk forms of income subsequently (i.e. lower saving). We see no reason to expect that the average saving rate over her life-time would differ systematically from the one we derive in our simplified model.

9 We assume that differences in product mix, skill composition and financial institutions cause long-run differences in factor price trends. The important simplifying assumption is that these country-specific effects are not affected by domestic capital accumulation.

4.3 The Diversity of National Characteristics

The challenge in applying the theory of the preceding section is an empirical one. We seek, for each of the eight countries in our sample, to construct a representation of the life time distribution of labor and capital earnings, derived from information available to the actors at the time they make their decisions. Because our objective is to explain cross-country differences in paygo rates, we derive these distributions, as much as possible, from specifically national data.

We use national dependency rates, national survival functions, national unemployment rates, national growth rates of labor income, national capital market returns, and national measures of the variability of the latter two.

4.3.1 Data

We posit that for public and private agents setting paygo rates at the beginning of the 21st century, the relevant history is post-War history. Too much structural change separates the post-War from the pre-War. We also posit that these agents use as much information about the period as possible to extrapolate the future. Our earliest data point is 1950.

We identify the annual return to saving in each country with the total real return on stocks, as measured by the broadest stock index incorporating dividends available in each country, which we obtain from Global Financial Data (n.d.). Monthly stock indices are averaged to obtain an annual index. Real total annual returns, $r_t$, are calculated by using the national CPI from IMF (2004) to deflate nominal returns.10 Section 4.3.2 explains how we use this data, from 1950 to 2002, to estimate country-specific, average real rates of return to saving, and to obtain country-specific measures of its variability.

10 We have experimented with using a measure of the return to saving which incorporates returns to bonds as well as to equities. However, the only bond return data we were able to find on a comparable basis for our eight countries was data on the return to government bonds. These are an inadequate measure of the return to corporate bonds. We do not obtain satisfactory results with mixed portfolios of stocks and government bonds.

Paradoxically, data on labor income is not as widely available in our eight countries as data on capital income. The model calls for a measure of average, annual earnings. But comparable data on economy-wide average earnings are only available since 1955 for three of our countries, and since 1960 or up to 1970 for the others. If we were to choose those later starting dates to estimate income expectations, we would lose up to 20 important years of data, and would be limited to a truncated view of income dynamics in each country. Therefore, our preference is to use real annual GDP as a proxy for labor income. GDP data is available from 1950 in all of our eight countries. We let real average annual labor income per employee be $w_t = y_t / e_t$, where $y_t$ is real annual GDP and $e_t$ is total employees.11 Average annual labor income per member of the labor force is $y_t / l_t = (1 - u_t) w_t$, where $l_t$ is the total labor force and $u_t$ is the unemployment rate.

11 We obtain real GDP and total employees data from OECD (2005). We extend both series back to 1950 using historical data from Mitchell (2003, 2005). Dutta et al. (2000), Matsen and Thogersen (2004) and others also use GDP as a proxy for labor income.

We subject the model and the results to a thorough sensitivity analysis by reestimating the relevant distributions with the available data on labor earnings.12 We report in Appendix 4.C that when the models are estimated on identical but shorter sample periods, the GDP based estimations and the wage based estimations both support the central hypothesis, and are each similar to the other. But when GDP data going back to 1950 is used, the explanatory power of the model increases.

12 For seven of the countries in the sample, we take total annual labor earnings and total employment from OECD (2016). In the case of the U.K., we are able to extend the length of the sample by using the employment series in Bank of England (2016). Total annual earnings include both overtime and part-time earnings. Initial dates are 1955 for France, the United Kingdom and the United States; 1960 for Italy and Japan; 1961 for Spain; 1964 for Australia; and 1970 for Germany. Annual nominal earnings are deflated by the CPI series in IMF (2004).

Both the earnings based models and the GDP based models provide country-specific estimates of the trend rate of growth of real labor earnings per employee, and country-specific measures of their variability. Functional forms are discussed in Section 4.3.2. Data on national unemployment rates is taken from OECD (2016).

Rational paygo decisions also depend on projected dependency rates, which vary significantly from country to country as a function of demographic trends, trends in labor participation, and country-specific institutional arrangements.13 We also use country-specific life-tables, taken from United Nations (2015), to derive the national survival functions which condition the estimation of life-time expected utility by the representative individuals in each country (See Eq. 4.17).

13 We calculate expected national dependency rates from OECD (2016) demographic and labor force participation data for the periods 2042-2062. See Appendix 4.A. We treat $d$ as non-random, but, as a referee has pointed out, projections of its future level are, in fact, very uncertain. Estimating a different probability distribution of $d$ for each country would be challenging, but would constitute a useful variant of our model.

There is one important parameter for which we do not have national data, namely the relative rate of risk aversion (RRA). Our strategy is to seek to explain national differences in paygo rates without appealing to differences in tastes. The question nonetheless remains: at what value should we fix the common relative rate of risk aversion (assumed to be constant) in our estimations of national expected utility?14 A value of 5 is frequently used in empirical macroeconomic work. The length of the planning horizon in pension decisions suggests, however, that a higher rate of risk aversion should apply in this case.15 We report results with values of RRA ranging from 5 to 20, but our preferred value is 12. This is half way between the Beetsma and Schotman (2001) survey result of 6 and the value of 18 used by Krueger and Kubler (2006) in their study of U.S. social security.

14 In a study based on portfolio allocation evidence, Friend and Blume (1975) infer values of RRA between 2 and 3. Eisenhauer and Ventura (2003) and Guiso and Paiella (2008), analyzing individual responses to a lottery proposed by the Bank of Italy, report relative rates of risk aversion between 4.5 and 14. Beetsma and Schotman (2001) derive estimates in the neighborhood of 6 from responses to a popular Dutch television quiz program. Campbell (2003) uses national financial data similar to ours to estimate consumption-based asset pricing models, and finds implied values of RRA between 50 and 600 for several of the countries in our sample.

4.3.2 Annual Dynamics

We model average annual earnings per employee wt as varying around a growth path which converges to a country-specific, long-run trend. Differences in these trend growth rates are in principle one of the differences conditioning the relative desirability of paygo and funded saving across countries. In order to estimate these trends, and the variability of earnings around them, without bias and efficiently, we must take account of a possible phenomenon of convergence. We allow for this possibility by including slow-down effects represented by the function f(t) in our equation for the real wage:

Inwt = a + gt + f (t) + xt (4.5)

lim f(t) = 0 t-+oo

15 In empirical studies, the commonly adopted value of the relative risk aversion parameter is based on Arrow's derivation of the certainty equivalent premium paid for a fair lottery with small positive and negative (equal) payoffs (the premium is equal to half the risk aversion parameter times the squared payoff). When a lottery is played n times it can be shown that the premium is approximately n times its value in the one-shot lottery. In our context, the working lifetime phase stretches over many years. An investment with uncertain returns kept for many years is like a lottery with a positive drift (otherwise a risk-averse individual would not invest in such a lottery) played many times. Risk increases in proportion to the number of years. In this context, it is reasonable to adopt a significantly larger risk aversion parameter compared to one used for a one-time investment.

where $g$ is the rate of growth on the long-term exponential growth path toward which $w_t$ converges, $w_1 = 1$, and $x_t$ is the innovation on annual wages.16 Though different specifications of the function $f$ were tested, the simple exponential form pePt was selected for all countries.

We assume that, when predicting her life-time earnings, the representative agent takes average unemployment experience into account, and projects the average annual earnings of persons in the labor force,17 $y_t/l_t = (1 - u_t)w_t$. We therefore complete our annual model with the simplest possible representation of the unemployment rate, $u_t$, which we model as the sum of a constant mean $\bar{u}$ and an innovation $\zeta_t$:

$$u_t = \bar{u} + \zeta_t \tag{4.6}$$

We represent the log-returns to equity simply as the sum of a constant mean return $\bar{r}$ and an innovation $\varepsilon_t$:18

$$\ln(1 + r_t) = \bar{r} + \varepsilon_t \tag{4.7}$$

Though we treat the long-term trend of wages and the long-term return on investment as independent of one another, we allow for cyclical interaction between the deviations from long-term values of both factor incomes. We model this interaction with an estimated vector auto-correlation of errors from each income generating process. The wage income process and the variability of investment returns in "normal" times are estimated with data from 1950 through 2002 (we discuss the implications of low probability crises like that of 2008 in Appendix 4.D).

16 Setting $w_1 = 1$ at the beginning of the active life of each cohort ensures that every cohort in every country, no matter when it is born, views and analyzes its life-time prospects in the same way as every other.

17 This is equivalent to assuming that the earnings of the employed are shared mutually with the unemployed. Empirical tests of an earlier version of our model, which incorporated the volatility potentially introduced by the random nature of individual unemployment experience, concluded that random individual unemployment experience does not have a significant effect on paygo decisions. This version abstracts from that potential volatility.

18 In the equation which follows, we define $\bar{r}$ as the expectation of $\ln(1 + r_t)$.

Table 4.E.2 and the first four columns of Table 4.E.3 describe the results we obtain when we estimate equations (4.5) and (4.7). Table 4.E.2 shows that, with the non-linear trend specification just mentioned, residuals $x_t$ pass stationarity tests for all countries at the 5% significance level and for most countries at the 1% significance level.

This implies, as we emphasized in the introduction, that a paygo system is capable of providing intergenerational insurance against the variability of real wages.

Table 4.E.3 shows that country-to-country differences in the mean and variability of real investment returns dwarf differences in the growth and variability of wages.

The contrast between Italy - where the real return to investment averages 4.1% and has a coefficient of variation of 6.0 - and the United States - where it averages 7.5% and has a coefficient of variation of 1.8 - is striking. The coefficient of variation for Spain resembles that of Italy. The coefficient of variation for the United Kingdom resembles that of the United States. The other countries lie in between.

Our discussion so far has not yet taken account of serial and cross correlations in the dynamics of wage and investment returns. We estimate a first-order VAR process to model the dynamic interaction between $x_t$ and $\varepsilon_t$. Specifically, we suppose that

$$\begin{pmatrix} x_t \\ \varepsilon_t \end{pmatrix} = \begin{pmatrix} \rho_{x_t x_{t-1}} & \rho_{x_t \varepsilon_{t-1}} \\ \rho_{\varepsilon_t x_{t-1}} & \rho_{\varepsilon_t \varepsilon_{t-1}} \end{pmatrix} \begin{pmatrix} x_{t-1} \\ \varepsilon_{t-1} \end{pmatrix} + \begin{pmatrix} \eta_t \\ \nu_t \end{pmatrix} \tag{4.8}$$

where $\eta_t$ and $\nu_t$ are two white noises. Results are presented in columns 5-8 of Table 4.E.3. We find a significant degree of serial correlation of wage innovations - which is consistent with the existence of a business cycle - in all countries (between 0.58 in Spain and 0.85 in the United States). We also find moderate but significant serial correlation of investment returns in four countries with volatile stock markets (Germany, Italy, Japan and Spain). The United States exhibits a significant negative correlation between wage innovations in one year and real stock market returns in the following year. We interpret this lagged, negative feedback as reflecting the dynamics of profit margins over the business cycle.19
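To make the first-order VAR recursion concrete, the sketch below traces how a one-off wage shock propagates to later wages and returns. The coefficient matrix is hypothetical: only the wage persistence of 0.85 echoes the U.S. figure quoted above, and the negative cross coefficient is an illustrative stand-in for the lagged feedback described in the text, not an estimate from Table 4.E.3.

```python
def var1_step(state, coeffs, shocks):
    """One step of a bivariate VAR(1).

    state  = (x_prev, eps_prev): last year's wage and return deviations
    coeffs = ((a_xx, a_xe), (a_ex, a_ee)): rows give the x and eps equations
    shocks = (eta, nu): this year's white-noise innovations
    """
    x_prev, eps_prev = state
    (a_xx, a_xe), (a_ex, a_ee) = coeffs
    eta, nu = shocks
    x = a_xx * x_prev + a_xe * eps_prev + eta
    eps = a_ex * x_prev + a_ee * eps_prev + nu
    return (x, eps)

def impulse_response(coeffs, shock, horizon):
    """Path of (x, eps) after a single initial shock and no further innovations."""
    path = [shock]
    state = shock
    for _ in range(horizon):
        state = var1_step(state, coeffs, (0.0, 0.0))
        path.append(state)
    return path

if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    coeffs = ((0.85, 0.0), (-0.5, 0.0))
    for year, (x, eps) in enumerate(impulse_response(coeffs, (1.0, 0.0), 5)):
        print(year, round(x, 4), round(eps, 4))
```

With these illustrative values, a positive wage shock decays geometrically and depresses the return deviation in each following year.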

If annual capital returns are serially uncorrelated, the variability of their sum benefits from the law of large numbers. Positive serial correlation will, on the other hand, amplify the variability of the sum.
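The size of this amplification can be illustrated with a standard result for a stationary first-order autoregressive process. The sketch below is a simplification: it assumes a univariate AR(1) with coefficient `rho` rather than the bivariate VAR of equation (4.8), and it compares the variance of an n-year sum with the variance that would obtain if the same annual variance were serially uncorrelated.

```python
def variance_inflation(n, rho):
    """Var(x_1 + ... + x_n) for a stationary AR(1), divided by n * Var(x_t).

    Equals 1 when rho == 0; exceeds 1 under positive serial correlation.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Autocovariance at lag k is Var(x_t) * rho**k, and lag k appears (n - k) times.
    cross = sum((n - k) * rho ** k for k in range(1, n))
    return (n + 2.0 * cross) / n

if __name__ == "__main__":
    for rho in (0.0, 0.2, 0.5):
        print(rho, round(variance_inflation(40, rho), 3))
```

Over a 40-year horizon even moderate positive serial correlation raises the variance of the cumulated sum well above its uncorrelated benchmark.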

4.4 Life-time income dynamics

The purpose of this section is to show how the annual income histories which we simulate below can be condensed into summary measures of income in the active and retirement years.

4.4.1 Average wage income

Redefine $W_A$ as the average wage income of the representative individual in a specific life history or trajectory. In a similar manner, redefine $W_B$ as her share of the wages on the same trajectory of the individuals who will be working when she is retired. Though $W_A$ and $W_B$ are given for any specified trajectory, ex-ante, before a specific trajectory has been drawn, they are random variables.

19 Investigating this dynamic further would take us far afield from the focus of our study. Suffice it to say that other scholars analyzing comparable databases have found a similar negative, delayed cross correlation. See Bottazzi et al. (1996) in their macroeconomic study of international portfolio decisions.

For the moment, assume that the maximum length of a working period is $T$ years and the maximum length of retirement is also $T$ (we will introduce survival probabilities later). Then20

T

t=1

Let $p_t$ be the number of retired persons, and recall that $y_t$ represents the wage bill. Then:

$$W_B = \frac{1}{T}\sum_{t=T+1}^{2T} \frac{y_t}{p_t} \tag{4.10}$$

Noting that one can decompose

$$\frac{y_t}{p_t} = \frac{y_t}{e_t}\,\frac{e_t}{l_t}\,\frac{l_t}{p_t} \tag{4.11}$$

where $e_t$ is total employment, $l_t$ the labor force, $w_t = y_t/e_t$ is average annual earnings per employee, $e_t/l_t = 1 - u_t$, and $l_t/p_t = 1/d_t$, with $d_t$ the dependency ratio (or ratio of retirees to the active labor force), one can use (4.11) to rewrite (4.10) as

$$W_B = \frac{1}{T}\sum_{t=T+1}^{2T} \frac{(1 - u_t)\,w_t}{d_t} \tag{4.12}$$

The dependency ratio, $d_t$, differs from country to country because of differences in the growth rate of the population and differences (due to custom, fertility and mortality) in the ratio of retirement years to active years. On any given historical trajectory, $d_t$ varies as demographic patterns change. But, on a steady state path, $d_t$ is necessarily constant.21

20 In the following expression, $w_t$ stands for the real wage of the representative individual in the $t$-th year of her active life. We assume (for lack of better information) that this wage does not depend on seniority, and, therefore, that the representative individual earns the economy-wide average wage rate every year.
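The role of the dependency ratio in equation (4.12) can be illustrated with a short computation. The sketch below averages $(1 - u_t)w_t/d_t$ over the retirement years; the dependency rates are the projected 2042-2062 averages listed in footnote 21, while the flat wage of 1.0 and the 5% unemployment rate are hypothetical placeholders chosen only to isolate the effect of $d$.

```python
# Projected average dependency rates for 2042-2062, as listed in footnote 21.
DEPENDENCY_2042_2062 = {
    "Australia": 0.27, "France": 0.35, "Germany": 0.45, "Italy": 0.47,
    "Japan": 0.39, "Spain": 0.44, "U.K.": 0.29, "U.S.A.": 0.24,
}

def retirement_wage_base(wages, unemployment, dependency):
    """Average of (1 - u_t) * w_t / d_t over the retirement years, as in (4.12)."""
    if not (len(wages) == len(unemployment) == len(dependency)):
        raise ValueError("all three series must cover the same years")
    terms = [(1.0 - u) * w / d for w, u, d in zip(wages, unemployment, dependency)]
    return sum(terms) / len(terms)

if __name__ == "__main__":
    T = 40  # retirement years; wage and unemployment paths below are hypothetical
    for country, d in DEPENDENCY_2042_2062.items():
        w_b = retirement_wage_base([1.0] * T, [0.05] * T, [d] * T)
        print(f"{country:10s} d = {d:.2f}  W_B = {w_b:.2f}")
```

Holding wages and unemployment fixed, the wage base available per retiree is inversely proportional to $d$, which is why the projected dependency rate figures so directly in the implicit return of paygo.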

4.4.2 Average investment income

Consider now the computation of the average, life-time return to saving. We assume, for simplicity, that the purpose of saving is exclusively to provide income during retirement, and we focus on the rate at which annual saving can be transformed into a retirement annuity. If the representative individual saves $S$ every year of her active life, she will have accumulated

$$F = S \sum_{t=1}^{T} \prod_{\tau=t}^{T} (1 + r_\tau) \tag{4.13}$$

by the time she retires.22 That sum will purchase an annuity $A$, which we assume satisfies the following "fair value" constraint:

$$F = A \sum_{t=T+1}^{2T} \frac{\sigma_t}{\prod_{\tau=T+1}^{t} (1 + r_\tau)} \tag{4.14}$$

with $\sigma_t$ being the representative individual's probability of survival to age $t$. We can write the resulting annuity as23

$$A = \left[ \frac{\sum_{t=1}^{T} \prod_{\tau=t}^{T} (1 + r_\tau)}{\sum_{t=T+1}^{2T} \sigma_t \big/ \prod_{\tau=T+1}^{t} (1 + r_\tau)} \right] S \tag{4.15}$$

or

$$A = RS \tag{4.16}$$

with $R$ equal to the expression in brackets in (4.15). Note that $R$ measures both accumulation and annuitization. It is, as before, a random variable.

21 The dependency rates used in our simulations are computed from OECD demographic and labor force participation data for the periods 2042-2062. The projected averages for this period are: Australia - 0.27; France - 0.35; Germany - 0.45; Italy - 0.47; Japan - 0.39; Spain - 0.44; U.K. - 0.29; and U.S.A. - 0.24. See Appendix 4.A.

22 We focus on her potential saving, conditional on survival. We account for the probability of survival below.
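The factor $R$ can be computed directly from a path of annual returns and survival probabilities. The sketch below is one reading of (4.13)-(4.15), in which a contribution made in year $t$ compounds through year $T$ and each annuity payment is discounted back to retirement and weighted by survival; the function name and the list-based inputs are illustrative assumptions.

```python
def annuity_factor(returns, survival):
    """R = accumulation factor / fair annuity price, for one trajectory.

    returns[i]  is r_t for year t = i + 1, covering t = 1..2T
    survival[i] is sigma_t for retirement year t = T + 1 + i (length T)
    """
    if len(returns) % 2 != 0:
        raise ValueError("returns must cover 2T years")
    T = len(returns) // 2
    if len(survival) != T:
        raise ValueError("survival must cover the T retirement years")

    # Numerator: sum over t of prod_{tau=t}^{T} (1 + r_tau), built backwards from year T.
    accumulation = 0.0
    growth = 1.0
    for t in range(T - 1, -1, -1):
        growth *= 1.0 + returns[t]
        accumulation += growth

    # Denominator: survival-weighted discount factors over the retirement years.
    price = 0.0
    discount = 1.0
    for i in range(T):
        discount /= 1.0 + returns[T + i]
        price += survival[i] * discount

    return accumulation / price

if __name__ == "__main__":
    T = 40
    flat_returns = [0.04] * (2 * T)          # hypothetical constant 4% real return
    full_survival = [1.0] * T                # hypothetical: certain survival
    print(round(annuity_factor(flat_returns, full_survival), 2))
```

With zero returns and certain survival the factor equals one, since each year of saving then finances exactly one year of annuity; lower survival probabilities or higher returns raise it.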

4.4.3 Expected utility

With these empirical definitions of $W_A$, $W_B$ and $R$, the budget constraints (4.2) and (4.3) can be interpreted as depicting the relationship between average consumption during the representative individual's active years and average consumption during her retirement years. Our central hypothesis is that, in each country, the representative individual and the tax authority jointly choose the paygo tax rate and the saving rate, $\theta^*$ and $S^*$, which maximize the representative individual's ex-ante, expected life-time utility. In calculating this expected utility, the individual and the tax authority take account of the fact that the individual will enjoy it conditional on her continuing to be alive,24 and that a positive rate of time preference, $\delta$, leads her to discount more distant instantaneous utilities more heavily. The probability of survival, $\sigma_t$, and the rate of time preference will affect the weight attributed to more distant utilities in the same manner. The planner's maximand can therefore be written:25

$$V = E\left[ \sum_{t=1}^{T} \frac{\sigma_t}{(1+\delta)^t}\, u(c_A) + \sum_{t=T+1}^{2T} \frac{\sigma_t}{(1+\delta)^t}\, u(c_B) \right] \tag{4.17}$$

We assume that $u(c)$ is characterised by a constant relative risk aversion (RRA). $\theta^*$ and $S^*$ are the values which maximize $V$ in (4.17).

23 In writing (4.15) and (4.16), we do not mean to imply that actuarial markets are broadly prevalent or widely used in our eight OECD countries. We simply assume that, when projecting her retirement income, the representative agent takes account of the fact that her retirement fund will continue to grow, conditional on her survival, after she stops contributing to it. This also allows us to assume that total accidental bequests are zero.

24 We take the probability of survival as exogenously given. Pestieau et al. (2008) argue that survival rates are a function of spending on health, which in turn depends on pension provisions.
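The maximization can be illustrated with a coarse grid search over the tax and saving rates. The sketch below rests on several assumptions that are not taken from this section: the budget constraints are written as $c_A = (1-\theta)W_A - S$ and $c_B = \theta W_B + RS$ (the constraints (4.2) and (4.3) are not reproduced here), the survival and discount weights are collapsed into two scalars `weight_a` and `weight_b`, and RRA differs from 1. It is not the algorithm of Appendix 4.B.

```python
def crra(c, rra):
    """CRRA utility; assumes rra != 1 and c > 0."""
    return c ** (1.0 - rra) / (1.0 - rra)

def expected_utility(theta, s, draws, rra, weight_a=1.0, weight_b=1.0):
    """Average utility over simulated (W_A, W_B, R) draws.

    Assumed budget constraints (hypothetical stand-ins for (4.2) and (4.3)):
        c_A = (1 - theta) * W_A - s
        c_B = theta * W_B + R * s
    """
    total = 0.0
    for w_a, w_b, r in draws:
        c_a = (1.0 - theta) * w_a - s
        c_b = theta * w_b + r * s
        if c_a <= 0.0 or c_b <= 0.0:
            return float("-inf")      # infeasible in at least one history
        total += weight_a * crra(c_a, rra) + weight_b * crra(c_b, rra)
    return total / len(draws)

def best_policy(draws, rra, steps=100, weight_a=1.0, weight_b=1.0):
    """Grid search for the (theta, s) pair with the highest expected utility."""
    best_value, best_theta, best_s = float("-inf"), 0.0, 0.0
    for i in range(steps):
        for j in range(steps):
            theta, s = i / steps, j / steps
            value = expected_utility(theta, s, draws, rra, weight_a, weight_b)
            if value > best_value:
                best_value, best_theta, best_s = value, theta, s
    return best_theta, best_s

if __name__ == "__main__":
    # Two hypothetical histories: saving pays well in one and badly in the other.
    draws = [(1.0, 1.2, 3.0), (1.0, 1.1, 0.4)]
    print(best_policy(draws, rra=12.0))
```

A finer grid or a numerical optimizer would be needed for precise values; the grid is used here only because it makes the joint choice of the two rates transparent.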

4.5 Results

We now use equations (4.4), (4.5), (4.7) and (4.8) to simulate many life histories of $w_t$, $r_t$ and $u_t$ (we chose to generate 1000). For each history, we draw values at random from empirical distributions of $\eta_t$, $\nu_t$, and the unemployment innovations $\zeta_t$,27 constructed by boot-strapping our estimates of these error terms.

We then proceed to compute $\theta^*$ and $S^*$, using the algorithm described in Appendix 4.B. We focus on our model predictions of $\theta^*$. Appendix 4.B presents a full table of values of $\theta^*$ and $S^*$ based on a range of different assumptions regarding RRA.

25 In order to remain within the framework of a two-period model, we have to make the simplifying assumption that income and consumption in the active years and income and consumption in the retirement years are equal every year to their respective average values.

26 $T$ is the maximum length of the representative agent's active years and retirement years. We lack the institutional, national information which would allow us to attribute a different value for this parameter to each country. Realized, average terms depend on the survival function. The assumption we apply to all countries is that $T = 40$. We also assume $\delta = 0.02$. We have tested that results are not sensitive to these assumptions.

27 Unemployment innovations $\zeta_t$ are the difference between the unemployment rate $u_t$ and its mean (Eq. 4.6). We restrict the estimation of the empirical distribution of unemployment innovations to the period 1980-2002 to avoid difficulties associated with the secular increase in unemployment rates in Europe after the first oil shock.
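The boot-strapping step can be sketched as follows. The example resamples estimated innovation pairs with replacement and feeds them through a first-order VAR to produce one 80-year history of wage and return deviations (two periods of $T = 40$ years). Drawing the two innovations jointly as a pair, the coefficient values, and the residual list are all illustrative assumptions; the unemployment innovations would be resampled in the same manner.

```python
import random

def simulate_deviations(residuals, coeffs, years, rng):
    """One bootstrapped history of (x_t, eps_t) deviations.

    residuals: list of estimated (eta, nu) innovation pairs to resample from
    coeffs:    ((a_xx, a_xe), (a_ex, a_ee)), the VAR(1) coefficient matrix
    rng:       a random.Random instance, so that runs are reproducible
    """
    x, eps = 0.0, 0.0
    path = []
    for _ in range(years):
        eta, nu = rng.choice(residuals)           # draw with replacement
        # Both right-hand sides use last year's values before reassignment.
        x, eps = (coeffs[0][0] * x + coeffs[0][1] * eps + eta,
                  coeffs[1][0] * x + coeffs[1][1] * eps + nu)
        path.append((x, eps))
    return path

def simulate_many(residuals, coeffs, years=80, histories=1000, seed=0):
    """A set of independent histories drawn from one seeded generator."""
    rng = random.Random(seed)
    return [simulate_deviations(residuals, coeffs, years, rng)
            for _ in range(histories)]

if __name__ == "__main__":
    # Hypothetical residual pairs and coefficients, for illustration only.
    residuals = [(0.01, 0.10), (-0.02, -0.15), (0.00, 0.05), (0.015, -0.08)]
    coeffs = ((0.85, 0.0), (-0.5, 0.1))
    histories = simulate_many(residuals, coeffs)
    print(len(histories), len(histories[0]))
```

Resampling pairs rather than each series separately preserves any contemporaneous correlation between the two innovations; whether the original simulations did so is not stated in the text.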

Table 4.E.4 presents the values of $\theta^*$ which we obtain from these simulations when RRA = 12 and RRA = 5, and compares them with the values of effective tax rates, $\theta^e$, reported in Table 4.E.1. We discuss both results in the next section.

4.5.1 Testing the Model

The test of our rational, economic model is the degree to which $\theta^*$ predicts actual values of $\theta^e$ in a cross-section regression,

$$\theta^e_k = \theta^*_k + \varepsilon_k \tag{4.18}$$

where $k$ is a country index. We use two different sets of estimates of $\theta^*$ to test the hypothesis. The set to which we devote our main focus is the one in which GDP is a proxy for labor earnings, and samples in all countries begin in 1950. In the other, we use actual, annual data for labor earnings, and are, therefore, confined to sample periods which begin variously in 1955, 1960, 1961, 1964 and 1970. In both cases, we test the hypothesis twice, once under the assumption that RRA = 12, and once under the assumption that RRA = 5. The GDP based results are presented in this section. Appendix 4.C presents the earnings based results and compares them with the GDP based results. In all cases, the test supports our central hypothesis, but the support is strongest with the GDP based data and longer sample periods.

In Figure 4.E.1, we plot the GDP-based predicted rational rates based on RRA = 12, $\theta^*$, on the horizontal axis and the effective rates, $\theta^e$, on the vertical axis. There is clearly a strong relationship between the variables. $\bar{R}^2$ of the linear regression drawn in the figure is 83%. This result confirms our basic hypothesis that objective, economic differences explain the diverse pattern of effective paygo tax rates in our sample.
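For readers who wish to reproduce this kind of cross-section fit, the sketch below computes the slope, intercept, $R^2$ and degrees-of-freedom adjusted $\bar{R}^2$ of a simple linear regression. The data in the usage example are made up for illustration; they are not the values of Table 4.E.4.

```python
def ols_fit(x, y):
    """Simple linear regression of y on x with an intercept.

    Returns (slope, intercept, r2, adjusted_r2); requires at least 3 points.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two series of equal length with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # One regressor plus an intercept leaves n - 2 residual degrees of freedom.
    adjusted_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return slope, intercept, r2, adjusted_r2

if __name__ == "__main__":
    predicted = [0.0, 1.0, 2.0, 3.0]    # hypothetical rational rates
    effective = [0.0, 1.0, 1.0, 2.0]    # hypothetical effective rates
    print(ols_fit(predicted, effective))
```

With only eight countries the adjustment for degrees of freedom is not negligible, which is presumably why the adjusted statistic is the one reported.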

However, the regression implies that actual rates are too sensitive to predicted rates. The estimated slope, 3.7, far exceeds the theoretical value of 1.0 in equation (4.18). We conjecture that omitted variables cause the underlying relationship to be non-linear. The argument is that high paygo rates and low saving rates are correlated with weak regulatory infrastructure - poor policing of insider trading, inadequate protection of minority stockholder rights, etc. When capital markets provide low and variable returns, these institutional weaknesses also tend to be present, and to cause reliance on paygo provision to be even higher than it otherwise would be.

We therefore estimate the following Poisson relationship:

$$\theta^e_k = \exp(a + b\,\theta^*_k) + \varepsilon_k \tag{4.19}$$

Predicted and actual values are plotted in Figure 4.E.2. In the Poisson model, the estimated slope is 1.0 when $\theta^*$ is low, and increases as it rises.28 The Poisson model is our preferred model.29
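Because the slope of an exponential mean function $\exp(a + b\theta)$ is $b\exp(a + b\theta)$, two reported slopes are enough to back out the implied coefficients. The sketch below does this for the slopes given in footnote 28 (1.0 at a predicted rate of 7 and 2.7 at a predicted rate of 12). The resulting values are only as precise as those rounded slopes and should not be read as the estimates actually obtained for the model.

```python
import math

def implied_exponential_params(theta_lo, slope_lo, theta_hi, slope_hi):
    """Back out (a, b) in exp(a + b * theta) from the slope at two points."""
    # The ratio of slopes equals exp(b * (theta_hi - theta_lo)).
    b = math.log(slope_hi / slope_lo) / (theta_hi - theta_lo)
    # slope_lo = b * exp(a + b * theta_lo), solved for a.
    a = math.log(slope_lo / b) - b * theta_lo
    return a, b

def slope_at(a, b, theta):
    """Derivative of exp(a + b * theta) with respect to theta."""
    return b * math.exp(a + b * theta)

if __name__ == "__main__":
    a, b = implied_exponential_params(7.0, 1.0, 12.0, 2.7)
    print(round(a, 3), round(b, 3))
    print(round(slope_at(a, b, 7.0), 3), round(slope_at(a, b, 12.0), 3))
```

The calculation makes explicit why the relationship steepens: each additional point of predicted rate multiplies the slope by the same factor, $e^{b}$.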

4.5.2 Decomposition of Explained Variance

Though we have only one explanatory variable in our model, it is constructed as a composite of many elements. We now ask how much each element contributes to the explanatory power of the model. To that purpose, we run a number of counter-factual simulations in which we remove these elements, one at a time, and evaluate the resulting reduction in explained variance. At each stage, the way we remove the particular element's contribution to explained variance is by attributing its value in one country to all of the countries. For instance, we consider what happens to the model's ability to explain cross-country variation, when we attribute U.S. capital market characteristics - or French labor market characteristics - to each of the other seven countries. We take the reduction in explained cross-country variance of each counter-factual simulation to be a heuristic measure of the contribution of the removed element to total explained variance.30 The results are presented in Tables 4.E.5 and 4.E.6. Table 4.E.5 analyzes the results of standardizing capital market characteristics, and Table 4.E.6 of standardizing labor market characteristics.31

28 In the Poisson model, the slope is 1.0 at $\theta^* = 7$, and 2.7 at $\theta^* = 12$.

29 Although the F-test of the linear model and the Chi-square test of the Poisson model are not strictly comparable, a RMSE corrected for degrees of freedom, computed for the Poisson case (4.2), is smaller than the RMSE of the linear estimate (5.0).

The tables elicit three general observations, which can be made before we examine the details:

* Understanding the stochastic structure of the environment in which individuals and societies plan for retirement is critical to understanding those decisions. The stochastic nature of our model is not just a refinement; it is its central feature.

* Differences in capital market characteristics account for more of the cross-country variance of paygo levels than any other factor (with the projected dependency rate being a close second). The reason is clear. The volatility of capital market shocks is, in general, two orders of magnitude greater than that of labor market shocks, and it varies dramatically from country to country. The striking feature of this result is that it is very different from what the early theorists of paygo systems expected. They focussed on the volatility of labor earnings. One of their central points was that a paygo system may be Pareto improving, even when the economy is dynamically efficient, because it allows agents to insure against the volatility of labor earnings.

* The magnitude of the effects of different national characteristics depends importantly on what we assume the relative rate of risk aversion to be. This observation follows naturally from the first, above. If stochastic structure is important, then the lens through which individuals evaluate stochastic events - their utility function - is central.

30 The highly non-linear character of the model means that we cannot attribute particular significance to combinations of these individual measures.

31 In both Tables 4.E.5 and 4.E.6, we have set the cross-VAR coefficients of Eq. (4.8) equal to zero before simulating rational paygo rates. We do this because our purpose is to identify separately the effects of capital market innovations and labor market innovations. Non-zero cross-effects would cause them to be commingled in ways that vary from country to country. As it turns out, the estimates used as a base of comparison in these tables are similar to the "rational paygo rates" presented in Table 4.E.4. Eliminating cross effects has only a marginal impact on every country except the United States. For consistency, we use these estimates with no cross-effects also to simulate standardizing the dependency rate, $d$.

Table 4.E.5 analyzes the contribution of capital markets to paygo differences. If we focus on the left half of the table, where RRA = 12, we see that the volatility of capital market shocks accounts for about a quarter of the explained variance.32 Volatility proves to be more important than differences in mean expected return. Fixing that everywhere at its U.S. value reduces explained variance by only 4%. We conjecture that what lies behind this result is a) the striking extent of the differences in the variability of capital returns from country to country, and b) the sensitivity of expected utility to this variability. If we turn to the right side of Table 4.E.5, where RRA = 5, we see that the explanatory importance of capital market variability appears even greater there. In that case, differences in the volatility of returns account for 70% of the explained variance, whereas differences in mean expected return account for barely 1% of explained variance.33 What is perhaps even more striking is the magnitude of the shift away from paygo triggered by improvements in capital market volatility. Whether RRA = 12 or 5, the preference of all countries for saving increases when capital market volatility decreases. But the drop in $\theta^*$ associated with lower volatility is much larger when RRA = 5.34

32 In column 3, the reduction in explained variance when this volatility is set everywhere equal to U.S. values is 22%, which is about a quarter of the total explained variance, 85%.
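The heuristic behind these tables can be sketched in a few lines: compare the share of cross-country variance explained by the baseline predictions with the share explained once one element has been standardized, and express the difference as a fraction of the baseline. In the sketch below, explained variance is measured by the squared correlation between predicted and actual rates, which is an assumption about the statistic used; the numbers in the final line are the 22% and 85% quoted in the accompanying footnote.

```python
def r_squared(predicted, actual):
    """Squared Pearson correlation; 0.0 if either series has no variation."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    spp = sum((p - mean_p) ** 2 for p in predicted)
    saa = sum((a - mean_a) ** 2 for a in actual)
    if spp == 0.0 or saa == 0.0:
        return 0.0
    spa = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    return spa ** 2 / (spp * saa)

def explained_variance_reduction(actual, baseline, counterfactual):
    """Drop in explained variance when one element is standardized."""
    return r_squared(baseline, actual) - r_squared(counterfactual, actual)

def share_of_explained(reduction, total_explained):
    """Reduction expressed as a fraction of the baseline explained variance."""
    return reduction / total_explained

if __name__ == "__main__":
    print(round(share_of_explained(22.0, 85.0), 3))   # about a quarter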

Table 4.E.6 analyzes the contribution to paygo differences of factors which affect implicit paygo returns. The first thing one notices is the insignificance of the variability of labor earnings, whether RRA = 12 or 5. When we attribute the relatively low volatility of these earnings in France to the other countries, explained variance is effectively unchanged, which implies that this factor is not a significant contributor to cross-country variance. Similarly, when we attribute the French trend rate of growth of labor income, $g$, to other countries, explained variance is again effectively unchanged. (This result is not shown in the table.) Admittedly, our prior for that simulation was ambiguous, because the multiplicative nature of the wage rate error term (see Eq. 4.5) implies that increasing $g$, and thus raising the implicit return of paygo, also increases the amplitude of wage shocks, and raises the volatility of that return.

The dominant influence on the labor income side is that of the dependency rate, $d$, which agents expect will prevail when the current generation retires. This parameter, which we treat as non-random, figures directly and importantly in calculations of the implicit return of paygo. Table 4.E.6 shows that its marginal contribution is 29% of explained variance when RRA = 12, and 11% of explained variance when RRA = 5.35 What is particularly striking about the dependency rate is the direction of its effect on rational paygo rates. Reducing $d$ appears to have both an income and a substitution effect. When the substitution effect dominates, an increase in expected $d$ creates an incentive for society to lower $\theta^*$. Agents expect a larger number of claimants for future pensions, and, therefore, other things being equal, reduced pensions. When the income effect dominates, increasing $d$ - though it reduces the expected return of paygo for the representative individual - causes $\theta^*$ to rise. In Table 4.E.6, when risk aversion is high and RRA = 12, the income effect dominates in every country. When RRA = 5, the substitution effect dominates in Australia, the United Kingdom and the U.S., and the income effect continues to prevail, but more moderately, in the remaining countries.

Higher risk aversion appears to cause individuals and society to protect the relative stability of retirement income that paygo provides. The return to saving is highly volatile in all countries, much more so in some than in others. Paygo provides a less volatile alternative. Thus, when individuals are very risk averse, a drop in the implicit return of paygo causes them to want more of it.

Which value of RRA best describes sensitivity to risk in the developed countries in our sample? The argument in Section 4.3.1 that the length of the planning horizon implies greater sensitivity to risk weighs in favor of higher values of RRA. Moreover, the evidence in Appendix 4.C is that higher values of RRA, specifically RRA = 12, contribute to a more powerful explanation of the cross-country variation of effective rates. In our judgment, we understand the logic of pension planning in our countries better if we attribute high relative rates of risk aversion to them. When RRA is on the high end of the acceptable range, we can more clearly identify the pension implications of the factor which most differentiates one developed country from another - the stochastic characteristics of its capital markets. The fact that high values imply under certain circumstances a desire to defend the stability of paygo in spite of its declining internal return does not alter our judgment.

33 Part but not all of the striking drop in variance explained is a reflection of the sharp drop in the rational rates of Germany and Japan. The same calculations excluding Germany and Japan attribute almost 50% of the variance explained to the remaining capital shocks.

34 Lower volatility causes the average rational paygo rate to drop from 9.1 to 4.7 when RRA = 5 (compare col. 6 with col. 5). The drop is more moderate, from 12.6 to 10.5, if RRA = 12 (compare col. 3 with col. 2).

35 We find, in simulations not reported here, that standardizing the survival functions has only a minor effect on rational paygo rates. Differences in life expectancy per se appear to have less effect on pension decisions than fertility and the many institutional and other factors affecting dependency.

4.6 Further Results

In this section, we use the rational, economic model (maintaining RRA = 12), to consider alternative economic scenarios. We ask how paygo rates may be influenced by the integration of national capital markets, and by a new awareness of the low but significant probability of crises like that of 2008. In each case, we modify the relevant assumptions, rerun the algorithm described in Appendix 4.B, and compare the new predictions with our previous "normal" rates.

4.6.1 The integration of financial markets

If the capital markets of the large OECD countries were to become fully integrated, what would the consequences be for national patterns of paygo tax rates? We address the issue by performing the following thought experiment. Suppose that our eight countries continue to have distinct and separate labor markets, but that they share a common, pooled capital market. Assume further, for simplicity, that the stochastic characteristics of the return to investment in that pooled capital market are the same as those we have observed for the United States. The simulations presented in Table 4.E.5 closely approximate this scenario. In Table 4.E.7, col. 3, we go one step further, and completely replace each country's own distribution of capital returns and its own VAR coefficient measuring the serial correlation of investment returns ($\rho_{\varepsilon_t \varepsilon_{t-1}}$ in equation (4.8)) with that of the United States.36 Not surprisingly, the pattern of retirement provision converges in the eight countries. The cross-country standard deviation of the predicted rational rates goes down from 2.8 in the standard set of results (where VAR cross-effects are muted) to 0.5 in the integration experiment results. Access to the relative stability of a global capital market would bring about declines in paygo rates which vary commensurately with the previous volatility of national capital markets.

A completely different exercise involves assuming that the representative agent eliminates financial markets completely from her retirement planning. We do this in Table 4.E.7, col. 4, by imposing the constraint $S = 0$ in all countries. This thought experiment leaves all the other cross-country differences (dependency, wage distributions, long-term wage growth trend) full room to play out. Deprived of saving, agents choose higher paygo rates in all countries. The exercise shows how much capital markets (in this case their absence) influence desired paygo rates.37

36 The radical nature of this hypothetical exercise should be emphasized. It abstracts from all of the institutional differences that characterized OECD capital markets in the second half of the twentieth century. It also assumes away national heterogeneity of capital. For the real rate of return to be identical across countries, they would all have to share a common capital market. In addition, those which had common currencies or were linked by fixed exchange rates would have to share a common inflation rate. Currency depreciation would have to exactly compensate for inflation differences between countries with floating rates.

37 This exercise was suggested by an anonymous referee. There is some convergence of paygo rates, but it is not as stark as in the integration of financial markets case. The coefficient of variation (last row in Table 4.E.7) drops by 76% in the integration of financial markets exercise (compare col. 3 to col. 2) and by 40% when the representative agent forgoes any access to capital markets (compare col. 4 to col. 2).
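The two dispersion measures used to describe convergence, the cross-country standard deviation and the coefficient of variation, can be computed as in the sketch below. The rates in the usage example are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which version was applied. The last line applies the percentage-drop formula to the standard deviations of 2.8 and 0.5 quoted above.

```python
import statistics

def dispersion(rates):
    """Return (standard deviation, coefficient of variation) of a set of rates."""
    mean = statistics.mean(rates)
    sd = statistics.stdev(rates)      # sample standard deviation (n - 1 divisor)
    return sd, sd / mean

def percent_drop(before, after):
    """Percentage reduction from `before` to `after`."""
    return 100.0 * (before - after) / before

if __name__ == "__main__":
    baseline = [8.0, 10.0, 12.0, 14.0]       # hypothetical rational rates
    integrated = [9.0, 9.5, 10.0, 10.5]      # hypothetical post-integration rates
    sd0, cv0 = dispersion(baseline)
    sd1, cv1 = dispersion(integrated)
    print(round(percent_drop(cv0, cv1), 1))
    print(round(percent_drop(2.8, 0.5), 1))  # standard deviations quoted in the text
```

The coefficient of variation can fall by less than the standard deviation when the average rate falls at the same time, which is consistent with the two figures reported for the integration exercise.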

4.6.2 Global Crises

We have assumed that in 2002 individuals and the pension authority did not allow for the possibility of crises like that which unfolded in 2008. If they had, would they have made the pension commitments implicit in the predictions of section 4.5? Clearly, the realization of such a possibility would have reduced expectations of life-time utility. But, even though faced with a more somber future, each society would still have had to choose the best possible balance between paygo and saving. Would awareness of the possibility of a crisis have affected those calculations? Our analysis in Appendix 4.D suggests that paygo rates would not have been fundamentally altered.

4.7 Conclusion

The central result of this article is that a rational, economic, and consensual model of how societies set paygo tax rates replicates the diverse pattern of effective paygo rates in large OECD countries at the beginning of the 21st century. The model is a simple, two-period OLG model, in which a representative individual and a benevolent tax authority jointly choose the tax rate and the saving rate which maximize the expected life-time utility of the representative individual. We assume that the individual and the tax authority both have complete knowledge of the distribution of labor and capital income over the individual's lifetime. We construct this distribution by estimating annual equations for the wage rate and the return to capital, and using them to simulate large numbers of life histories. Taking expectations over these life histories, we compute the expected life-time utility of the representative individual as a function of the paygo tax and saving rates. The model predicts that society will choose the tax and saving rate which maximize the individual's expected life-time utility. Though the model assumes that every agent expects the paygo budget to be balanced over her lifetime, its focus is more on pension provision than pension financing.

We find that societies in which capital markets are relatively stable, and offer rates of return well in excess of the rate of growth of labor income, tend to have moderate paygo tax rates. By contrast, societies in which capital markets are relatively volatile choose higher paygo tax rates. These considerations - the most important of which are differences in the volatility of the return to capital - explain approximately 83% of the cross-section variance of observed effective paygo tax rates in 2002. The results point to the importance for pensions of reducing capital market volatility. Allowing for the possibility of extreme events, like those of 2008, does not change our conclusions. Among the many questions calling for future research, one of the central ones has to do with the effect of the distributional characteristics of paygo systems in a heterogeneous population. Addressing such questions will require the use of substantially more disaggregated models.

Appendix

4.A Constructing Estimates of Effective Paygo Tax Rates in 2002

Following Disney (2004), we write the effective tax rate as

$$\theta^e = \frac{A}{w} \cdot \frac{p}{e}$$

Of the two parameters on the right hand side, the more difficult to estimate is $A/w$, "the relative pension level", because it depends on numerous, detailed, country-specific regulations. We use estimates of this parameter first published in 2003 in a new OECD publication, Pensions at a Glance. Noting that laws in effect at a given moment generally entail future commitments to increases or reductions of benefits spread out over many years, the OECD computes, on the basis of legislation in effect in 2002, the average pension which an employee who entered the labor force that year, and subsequently fulfilled all relevant working requirements, would be entitled to at the statutory retirement age. To the extent that a social security system is contributory, different individuals (all of whom have met all the work requirements) will receive different pensions, because of differences in their life-time earnings. The OECD therefore bases its projection of pension entitlements on what it projects the range of life-time earnings to be. It then averages these entitlements, and presents this average "relative pension level" as a percentage of its projection of the economy-wide average wage during the years when the individual in question will be in retirement.

The forward looking nature of the OECD measure of the relative pension level, λ, requires for consistency that our measure of the dependency ratio, d, also be forward looking. It should reflect the balance between retirees and workers for the duration of the retirement of the cohort whose entitlements we are estimating. We base our estimate of d in each country on OECD projections of the number of retirees and the number of active employees in each country between 2042 and 2062.

We adjust for the statutory nature of the OECD's measure of relative pension levels by recognizing that many individuals who are working at the time they retire will nonetheless not receive a full, potential pension, because they have incurred spells of maternity leave, unemployment, or other interruptions from active work. We estimate that the average, normal pension is 20% lower than the potential level calculated by the OECD.38 As in Disney (2004), we further argue that persons who were not employed during the last decade of their working lives receive half of a normal pension, this additional discount corresponding to average widow and survivor provisions.

The effective tax rates thus obtained are not current tax rates. They reflect the future commitments that legislation currently in effect implies for the cohort entering employment, and measure the burden of those commitments on those who will be employed during this cohort's retirement. Their forward looking nature corresponds to the forward looking nature of the collective decision process hypothesized in our model.

38 OECD (2003) presents projected pension levels for men and not for women. Though, in all the countries in our sample, statutory pension provisions are the same for women as for men, the actual pensions women receive are lower than those received by men, because their annual earnings are lower. Much of our discount corresponds to the lower average career earnings of women.
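The construction described in this appendix can be illustrated with a minimal Python sketch. It assumes that the effective rate is the product of the average relative pension and the ratio of retirees to workers, applies the 20% discount and the half-pension rule from the text, and uses purely hypothetical inputs. The function name, its parameters and the share of retirees not employed late in their working lives are illustrative assumptions, not values from the OECD data.

```python
def effective_paygo_tax(relative_pension, retirees, workers,
                        career_discount=0.20, share_late_nonemployed=0.0):
    """Effective paygo tax rate = average relative pension x dependency ratio.

    relative_pension: potential (statutory) pension relative to the average wage.
    retirees, workers: projected head counts over the retirement window.
    career_discount: gap between the potential and the "normal" pension (20% in the text).
    share_late_nonemployed: hypothetical share of retirees who were not employed in
        the last decade of working life; they receive half of a normal pension.
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    normal_pension = relative_pension * (1.0 - career_discount)
    # Mix of full normal pensions and half pensions.
    average_pension = normal_pension * (1.0 - 0.5 * share_late_nonemployed)
    dependency_ratio = retirees / workers
    return average_pension * dependency_ratio


# Hypothetical inputs, not taken from the OECD projections.
rate = effective_paygo_tax(relative_pension=0.50, retirees=40.0, workers=100.0,
                           share_late_nonemployed=0.10)
print(f"effective paygo tax rate: {rate:.3f}")
```

The sketch only fixes the order of the adjustments; the actual country estimates depend on the OECD entitlement and labor force projections cited above.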

4.B Simulation Algorithm

We construct measures of the rational, economic paygo tax rates which our model predicts by using the annual equations for labor and capital income estimated in section 4.3 to simulate numerous life cycle histories. The algorithm has seven steps:

1. To simulate each life cycle j, we draw a full history, 0 < t < 2T, from the empirical distributions of η_t, ν_t and ζ_t. We use the first two to construct a sample of innovations (x_t^(j), ε_t^(j)) using the vector auto-regression (4.8).

2. Capital returns are simply deduced from (4.7), whereas annual wages in the steady-state regime are given by

InWu4) = +4+ t + (4.20)

The convergence of earnings is assumed completed.

3. Using (4.9) and (4.12), we compute W_A^(j) and W_B^(j). We use (4.15) and (4.16) to compute R^(j).

4. We select values of the tax and saving rates (θ, S), and compute c_A^(j), c_B^(j) and the lifetime utility U^(j) associated with that history and those values of (θ, S).

5. We go to step 1 and loop 1000 times.

6. We compute expected utility V for each pair (θ, S), using (4.17).

7. We scan over the space 0 < θ, S < 1, allowing each variable to increase in steps of 0.001. The values θ* and S* which maximize V are the values which the model predicts that society and the individual will adopt.

The values of S* and θ* depend on RRA. Table 4.F.1 displays the values obtained when RRA varies between 5 and 20. As RRA rises and societies become more risk averse, they increasingly avoid the volatility of capital markets, and rely more on paygo. θ* rises, and S* falls.
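The structure of the seven steps above can be sketched in Python as follows. This is a minimal illustration under loud assumptions, not the thesis code: the lognormal draws in `simulate_histories` stand in for the estimated processes (4.5)-(4.8), the two-period budget constraints and the discount factor `beta` are placeholders for (4.9)-(4.17), and the grid step is 0.05 rather than 0.001 to keep the run short. All function names are hypothetical.

```python
import math
import random


def crra(c, rra):
    """CRRA utility for c > 0 and rra != 1."""
    return c ** (1.0 - rra) / (1.0 - rra)


def simulate_histories(n, seed=0):
    """Stand-in for steps 1-3: draw n histories (W_A, W_B, R).

    The lognormal parameters are placeholders, not the estimated
    processes of equations (4.5)-(4.8).
    """
    rng = random.Random(seed)
    histories = []
    for _ in range(n):
        w_a = 1.0                              # working-life earnings (normalised)
        w_b = math.exp(rng.gauss(0.5, 0.2))    # earnings base that funds the paygo pension
        r = math.exp(rng.gauss(1.0, 0.6))      # gross return on saving up to retirement
        histories.append((w_a, w_b, r))
    return histories


def expected_utility(theta, s, histories, rra, beta=0.5):
    """Steps 4-6: average lifetime utility over all histories for one (theta, S)."""
    total = 0.0
    for w_a, w_b, r in histories:
        c_a = (1.0 - theta - s) * w_a          # consumption while working
        c_b = s * w_a * r + theta * w_b        # consumption in retirement
        if c_a <= 0.0 or c_b <= 0.0:
            return -math.inf                   # infeasible pair
        total += crra(c_a, rra) + beta * crra(c_b, rra)
    return total / len(histories)


def grid_search(histories, rra, step=0.05):
    """Step 7: scan the (theta, S) grid and keep the pair that maximises V."""
    n = round(1.0 / step)
    best_v, best_theta, best_s = -math.inf, None, None
    for i in range(n):
        for k in range(n - i):                 # keeps theta + S < 1
            theta, s = i * step, k * step
            v = expected_utility(theta, s, histories, rra)
            if v > best_v:
                best_v, best_theta, best_s = v, theta, s
    return best_v, best_theta, best_s


histories = simulate_histories(1000, seed=1)
v_star, theta_star, s_star = grid_search(histories, rra=12.0)
print(f"theta* = {theta_star:.2f}, S* = {s_star:.2f}")
```

Reusing the same 1000 histories for every pair (θ, S), as here, keeps the comparison across grid points free of additional sampling noise; whether the original implementation does so is not stated in the text.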

4.C Estimations Using Data on Average Annual Earnings

In sections 4.3 through 4.5, we construct measures of rational paygo rates using GDP as a proxy for earnings. This makes it possible to start our estimates of the underlying annual models in 1950, prior to the availability of comparable earnings series. In this appendix, we compare the results above with the results we obtain when we use actual, average annual earnings data. Working with a range of starting dates increases the difficulty of estimating the convergence function in equation (4.5). In the five of the eight cases in which the sample period starts more than ten years later than previously,40 actual and trend growth are too collinear to allow simultaneous estimation of the trend and the convergence function. Hence, we constrain g in (4.5) to be equal to the actual rate of growth between 1973 (the first oil shock) and 2002 for the three countries for which earnings data are available from 1955 onwards, and between 1980 and 2002 for the other countries. We then estimate the convergence function conditional on that prior, and use the values of the residuals of the wage equation (4.5) along with those of the return to capital equation (4.7) to estimate the coefficients of the VAR (4.8), and to extract the i.i.d. series which drive annual income in each country. All eight real wage equations pass the same stationarity tests to which the GDP based data were subjected.41 The fact that the wage and capital innovations jointly determine the error terms in the wage and capital return equations means that when wage data are only available for a shorter period, capital return data can also only be used for that shorter period. The counterpart to these shortcomings is that the earnings data directly measure labor earnings, rather than a proxy for labor earnings.

39 We also use this simulation algorithm to calculate quasi confidence intervals, conditional on the choice of RRA, for the values of θ* and S* obtained. We take the maximum value of the expected life-time utility of the representative agent in each country, and trace pairs of paygo tax and saving rates that cause expected utility to be 99% of that maximum value. The exercise shows that the estimated utility function is relatively flat, and that variations of +/- 10% of θ* are within this 99% confidence interval.

40 See Table 4.F.2 for starting dates.

In Table 4.F.2, we compare rational paygo rates obtained from this earnings data with the rational paygo rates derived in the text from GDP data. The two series display similar cross-section characteristics. Each one, on its own, explains almost 80% of the cross-section variance of effective rates presented in Table 4.E.1. The last two lines of the table present the adjusted Root Mean Square Error of the linear and Poisson regressions of effective on rational rates.

41 Detailed results are available upon request.

4.D The Implications of 2008

We examine the possible effects on paygo tax rates of incorporating the previously ignored probability of a crisis. Using Barro and Ursúa (2008) and Barro (2009), we infer a binomial variable with a known probability of realization which simultaneously generates two outcomes: a decline of GDP (our proxy for wage income) and a drop in the return to capital.42 In steps 1 and 2 of the algorithm in Appendix 4.B, we add draws from this binomial distribution to the values of x_t and ε_t derived from the population of shocks previously estimated from data from 1950 to 2002. Each of the eight countries is characterised by its own population of "normal" shocks η_t and ν_t. The binomial distribution describing crises, which is added to those "normal" shocks, is the same across countries (we only have one estimate), but we continue to simulate independent life histories for each country.

Examination of Table 4.F.3 suggests that incorporating the expectation of low probability crises does not substantially change the calculus of rational individuals and authorities. The new predicted paygo tax rates are similar to the old, though they are everywhere slightly higher. Though expected utility declines, the trade-off between paygo and saving remains substantially the same.

42 Specifically, we refer to the data in Barro and Ursúa (2008) on macroeconomic crises since 1870. According to Table 8 p.279, 70 crises have taken place in 17 OECD countries in the 136 years of history examined, implying a probability of crisis equal to 3.0%. The corresponding average decline in GDP per capita was 17.4% (Table 9 p.283). We estimate the shock to the stock price to be 22.9%, which is the average decline among OECD countries during GDP crises as calculated from Table C2 p.323.
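The crisis experiment can be sketched in Python as below. The probability and the two declines are the figures quoted in this appendix; everything else is an assumption made for illustration: the innovations are treated as log deviations (so a proportional decline D enters as log(1 - D)), one crisis indicator is drawn per year, and the "normal" shocks are placeholder Gaussian draws rather than the estimated empirical populations. The function name `add_crisis_draws` is hypothetical.

```python
import math
import random

# Figures quoted in the text from Barro and Ursua (2008).
CRISIS_PROB = 70 / (17 * 136)   # 70 crises over 17 countries x 136 years, about 3.0%
GDP_DECLINE = 0.174             # average fall in GDP per capita during a crisis
STOCK_DECLINE = 0.229           # average fall in stock prices during a GDP crisis


def add_crisis_draws(x, eps, rng, crisis_prob=CRISIS_PROB):
    """Add one common crisis indicator per year to the "normal" innovations.

    Assumption of this sketch: innovations are in log units, so a
    proportional decline D enters as log(1 - D).
    """
    x_out, eps_out = [], []
    for x_t, eps_t in zip(x, eps):
        if rng.random() < crisis_prob:          # the same draw hits wages and returns
            x_t += math.log(1.0 - GDP_DECLINE)
            eps_t += math.log(1.0 - STOCK_DECLINE)
        x_out.append(x_t)
        eps_out.append(eps_t)
    return x_out, eps_out


rng = random.Random(0)
normal_x = [rng.gauss(0.0, 0.03) for _ in range(60)]     # placeholder wage innovations
normal_eps = [rng.gauss(0.0, 0.17) for _ in range(60)]   # placeholder return innovations
x_sim, eps_sim = add_crisis_draws(normal_x, normal_eps, rng)
print(f"crisis probability per year: {CRISIS_PROB:.3f}")
```

Using a single indicator for both series reflects the statement that the binomial variable generates the GDP decline and the drop in the return to capital simultaneously.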

4.E Tables and Figures

Table 4.E.1: Effective Paygo Tax Rates

                  Effective Tax Rates (in percent)
Australia           4.8
France             18.4
Germany            19.1
Italy              36.3
Japan              18.8
Spain              33.5
United Kingdom     10.9
United States       8.7

Source: OECD (2003), OECD (2013) labor force participation projections and authors' calculations.

Table 4.E.2: Unit Root Test Applied to Wage Innovations (five lags)

                  Augmented Dickey-Fuller Statistics   Phillips-Perron Statistics
Australia         -3.67***                             -3.81***
France            -3.70***                             -3.37***
Germany           -2.53**                              -2.62***
Italy             -2.55**                              -2.65***
Japan             -2.01**                              -2.67***
Spain             -2.17**                              -4.82***
United Kingdom    -2.93***                             -2.39**
United States     -2.65***                             -2.06***

Note: *** (resp. **) represent significance at the 1% (resp. 5%) level. Critical values at the 1% and 5% levels are respectively equal to -2.61 and -1.95 for augmented Dickey-Fuller statistics and for Phillips-Perron statistics. The stationarity test results presented in this table are based on wage series constructed using GDP as a proxy for annual labor earnings. Wage series using OECD (2016) data on actual labor earnings were also tested for stationarity and are discussed in Appendix 4.C. Source: Authors' calculations.

Table 4.E.3: Annual Statistics

                  g        σ_x      E_r      σ_r/E_r   ρ_{x_t,x_{t-1}}   ρ_{x_t,ε_{t-1}}   ρ_{ε_t,x_{t-1}}   ρ_{ε_t,ε_{t-1}}
Australia         1.40%    0.049    6.38%    2.58      0.59***           0.04              -1.06**           0.18
                                                       (0.12)            (0.03)            (0.46)            (0.14)
France            1.59%    0.015    6.81%    2.74      0.67***           0.00              -3.36*            0.17
                                                       (0.11)            (0.01)            (1.76)            (0.14)
Germany           1.96%    0.030    7.55%    2.58      0.75***           0.02              0.93              0.33**
                                                       (0.11)            (0.02)            (0.92)            (0.14)
Italy             1.93%    0.030    4.12%    5.98      0.72***           0.01              -0.21             0.35**
                                                       (0.11)            (0.01)            (1.20)            (0.14)
Japan             1.90%    0.056    8.21%    2.48      0.76***           0.01              -0.06             0.31**
                                                       (0.10)            (0.03)            (0.52)            (0.14)
Spain             1.60%    0.034    5.59%    3.79      0.58***           0.03              -0.70             0.59***
                                                       (0.12)            (0.02)            (0.80)            (0.12)
United Kingdom    1.94%    0.034    6.78%    2.41      0.84***           0.03              -1.15*            0.20
                                                       (0.08)            (0.02)            (0.66)            (0.14)
United States     1.25%    0.028    7.45%    1.84      0.85***           0.00              -1.88***          0.17
                                                       (0.09)            (0.02)            (0.68)            (0.14)

Note: Standard errors are in parentheses. ***, ** and * represent significance at the 1%, 5% and 10% levels, respectively. This table presents estimates of coefficients in equations (4.5), (4.7) and (4.8). Wage series are constructed using GDP as a proxy for annual labor earnings. g is the estimate of the long-term growth rate of wages. σ_x and σ_r are empirical estimates of the standard deviation of innovations x_t and ε_t. E_r is the empirical estimate of the expectation of ln(1 + r_t). The estimates of the four VAR coefficients are reported in columns 5-8. Source: Authors' calculations.

Table 4.E.4: Effective and Predicted Paygo Tax Rates

                                    Predicted Paygo Tax
                  Effective Tax     RRA=12     RRA=5
Australia          4.8              10.8        7.0
France            18.4              12.4        8.3
Germany           19.1              14.7       11.6
Italy             36.3              16.6       15.2
Japan             18.8              12.5        8.5
Spain             33.5              16.7       14.2
United Kingdom    10.9               9.2        5.5
United States      8.7              11.3        9.4
R2                 -                83%        78%

Note: Predicted paygo tax rates, generated by Monte-Carlo simulations with 1000 replications, assuming a relative risk aversion coefficient of 12 in column 2 and 5 in column 3. The last row reports the R2 of the linear regression of effective tax rates (column 1) on predicted rational paygo tax rates. Source: Authors' calculations.
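As an illustration of how the last row of Table 4.E.4 is obtained, the following Python sketch regresses the effective rates in column 1 on the predicted rates in column 2 (RRA = 12) by ordinary least squares. The helper name `ols_fit_stats` is hypothetical, and the degrees-of-freedom adjustment of the RMSE (division by n - 2) is an assumption about the convention used.

```python
import math

# Effective and predicted (RRA = 12) paygo tax rates from Table 4.E.4, in percent.
effective = [4.8, 18.4, 19.1, 36.3, 18.8, 33.5, 10.9, 8.7]
predicted = [10.8, 12.4, 14.7, 16.6, 12.5, 16.7, 9.2, 11.3]


def ols_fit_stats(y, x):
    """Regress y on a constant and x; return (R2, RMSE, F-statistic)."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    ssr = syy - slope * sxy                  # residual sum of squares
    r2 = 1.0 - ssr / syy
    rmse = math.sqrt(ssr / (n - 2))          # degrees-of-freedom adjusted
    f_stat = (syy - ssr) / (ssr / (n - 2))   # F(1, n - 2) for the slope
    return r2, rmse, f_stat


r2, rmse, f_stat = ols_fit_stats(effective, predicted)
print(f"R2 = {r2:.2f}, RMSE = {rmse:.2f}, F = {f_stat:.1f}")
```

With the rounded table entries as inputs, the statistics should land close to the 83% reported in the table and to the RMSE and F-statistic shown with Figure 4.E.1; small differences can arise from rounding in the published rates.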

Table 4.E.5: Capital Return Effects

                              Predicted Tax - RRA = 12                 Predicted Tax - RRA = 5
                  Effective   Base of      Same ν_t &        Same    Base of      Same ν_t &        Same
                  Tax         Comparison   ρ_{ε_t,ε_{t-1}}   E_r     Comparison   ρ_{ε_t,ε_{t-1}}   E_r     E_r
Australia         4.8         11.1         10.4              10.1    7.8          5.9               5.5     6.38%
France            18.4        12.3         10.0              11.6    8.0          3.4               6.8     6.81%
Germany           19.1        13.6         8.9               13.6    9.2          0.8               9.4     7.55%
Italy             36.3        16.6         15.7              15.0    15.3         12.7              11.6    4.12%
Japan             18.8        12.6         8.5               13.1    8.6          0.6               9.8     8.21%
Spain             33.5        16.8         13.0              16.0    14.7         5.7               12.9    5.59%
United Kingdom    10.9        9.6          9.1               9.0     6.4          5.1               5.1     6.78%
United States     8.7         8.3          8.3               8.3     3.0          3.0               3.0     7.45%
R2                -           85%          64%               82%     82%          25%               81%
R2: Reduction     -           -            22%               4%      -            57%               1%
Average           18.8        12.6         10.5              12.1    9.1          4.7               8.0
Std. Dev.         10.5        2.8          2.4               2.6     3.8          3.6               3.2
Std. Dev./Avg.    0.56        0.23         0.23              0.22    0.42         0.77              0.40

Note: Predicted paygo tax rates, generated by Monte-Carlo simulations with 1000 replications, after setting cross-VAR coefficients to 0, assuming a relative risk aversion coefficient of 12 in columns 2-4 and 5 in columns 5-7. ν_t and ρ_{ε_t,ε_{t-1}} are the return white noise and VAR serial correlation coefficient in Eq. (4.8). E_r is the empirical estimate of the expectation of ln(1 + r_t) in Eq. (4.7). R2 is the R-squared of the linear regression of effective tax rates (column 1) on predicted rational paygo tax rates. The row R2: Reduction reports the drop in explained variance in columns 3-4 relative to column 2 and in columns 6-7 relative to column 5. Std. Dev. and Avg. respectively stand for the standard deviation and the average of the paygo tax rates. Source: Authors' calculations.

Table 4.E.6: Labor Return Effects

                              Predicted Tax - RRA = 12                 Predicted Tax - RRA = 5
                  Effective   Base of      Same η_t &        Same    Base of      Same η_t &        Same
                  Tax         Comparison   ρ_{x_t,x_{t-1}}   d       Comparison   ρ_{x_t,x_{t-1}}   d       d
Australia         4.8         11.1         11.1              12.4    7.8          7.7               7.2     27.4%
France            18.4        12.3         12.3              12.3    8.0          8.0               8.0     35.0%
Germany           19.1        13.6         13.6              11.7    9.2          9.2               9.0     44.8%
Italy             36.3        16.6         16.6              13.5    15.3         15.3              13.2    47.0%
Japan             18.8        12.6         12.5              11.8    8.6          8.6               8.5     39.2%
Spain             33.5        16.8         16.8              14.3    14.7         14.7              13.2    44.4%
United Kingdom    10.9        9.6          9.5               10.5    6.4          6.3               5.9     29.3%
United States     8.7         8.3          8.2               9.2     3.0          3.0               0.5     23.9%
R2                -           85%          85%               56%     82%          83%               72%
R2: Reduction     -           -            1%                29%     -            -1%               11%
Average           18.8        12.6         12.6              12.0    9.1          9.1               8.2
Std. Dev.         10.5        2.8          2.9               1.5     3.8          3.8               3.8
Std. Dev./Avg.    0.56        0.23         0.23              0.13    0.42         0.42              0.47

Note: Predicted paygo tax rates, generated by Monte-Carlo simulations with 1000 replications, after setting cross-VAR coefficients to 0, assuming a relative risk aversion coefficient of 12 in columns 2-4 and 5 in columns 5-7. η_t and ρ_{x_t,x_{t-1}} are the labor white noise and VAR serial correlation coefficient in Eq. (4.8). d is the dependency rate. R2 is the R-squared of the linear regression of effective tax rates (column 1) on predicted rational paygo tax rates. The row R2: Reduction reports the drop in explained variance in columns 3-4 relative to column 2 and in columns 6-7 relative to column 5. Std. Dev. and Avg. respectively stand for the standard deviation and the average of the paygo tax rates. Source: Authors' calculations.

Table 4.E.7: Integration of Financial Markets (RRA=12)

                  Effective   Base of      U.S. Capital
                  Tax         Comparison   Market         S = 0
Australia         4.8         11.1         8.8            14.0
France            18.4        12.3         9.0            15.9
Germany           19.1        13.6         9.1            17.2
Italy             36.3        16.6         9.5            17.9
Japan             18.8        12.6         9.4            16.1
Spain             33.5        16.8         9.6            18.7
United Kingdom    10.9        9.6          8.2            12.4
United States     8.7         8.3          8.3            13.1
Std. Dev.         10.5        2.8          0.5            2.1
Std. Dev./Avg.    0.56        0.23         0.05           0.14

Note: Predicted paygo tax rates, generated by Monte-Carlo simulations with 1000 replications, after setting cross-VAR coefficients to 0, assuming a relative risk aversion coefficient of 12. In column 3, ν_t and ρ_{ε_t,ε_{t-1}} (Eq. (4.8)) and E_r (Eq. (4.7)) are everywhere set equal to their values in the U.S. This simulation combines the assumptions of the simulations of column 3 and column 4 of Table 4.E.5. Std. Dev. and Avg. respectively stand for the standard deviation and the average of the paygo tax rates. Source: Authors' calculations.

Figure 4.E.1: Linear Model

[Figure: effective tax rates of the eight countries plotted against the Predicted Rational Tax Rate (RRA = 12). RMSE = 4.96; F-stat = 29.9; p-value = 0.002.]

Figure 4.E.2: Poisson Model

[Figure: effective tax rates of the eight countries plotted against the Predicted Rational Tax Rate (RRA = 12). RMSE = 4.20; Chi2(1) = 38.9; p-value = 0.000.]

4.F Additional Tables

Table 4.F.1: Predicted Rational Paygo and Saving Rates (in percent) for different RRA

                                               Predicted Tax / Predicted Saving Rate
                                Effective Tax  RRA=5    8       10      12      15      20
Australia         Tax            4.8            7.0     9.4    10.3    10.8    11.2    11.8
                  Saving                        5.1     3.1     2.4     2.0     1.6     1.1
France            Tax           18.4            8.3    10.9    11.8    12.4    13.0    13.4
                  Saving                        4.9     3.0     2.4     1.9     1.5     1.1
Germany           Tax           19.1           11.6    13.7    14.3    14.7    15.1    15.4
                  Saving                        3.6     2.1     1.7     1.4     1.1     0.8
Italy             Tax           36.3           15.2    16.2    16.4    16.6    16.7    16.8
                  Saving                        2.1     1.2     1.0     0.8     0.6     0.4
Japan             Tax           18.8            8.5    11.0    11.9    12.5    13.1    13.7
                  Saving                        4.6     2.9     2.3     1.9     1.5     1.1
Spain             Tax           33.5           14.2    15.9    16.3    16.7    16.9    17.0
                  Saving                        2.8     1.6     1.3     1.0     0.8     0.5
United Kingdom    Tax           10.9            5.5     7.8     8.7     9.2     9.7    10.1
                  Saving                        4.9     3.0     2.3     1.9     1.5     1.1
United States     Tax            8.7            9.4    10.6    11.0    11.3    11.5    11.8
                  Saving                        3.0     1.8     1.4     1.1     0.9     0.6
R2                               -             78%     82%     83%     83%     84%     82%

Note: Predicted paygo tax rates, generated by Monte-Carlo simulations with 1000 replications, assuming a relative risk aversion coefficient varying from 5 in column 2 to 20 in column 7. The last row reports the R2 of the linear regression of effective tax rates (column 1) on predicted rational paygo tax rates. Source: Authors' calculations.

Table 4.F.2: Predicted Paygo Tax Rates: Comparison of GDP and Earnings Based Estimates

                  RRA=12                 RRA=5
                  GDP       Earnings     GDP       Earnings
                  Based     Based        Based     Based
Australia         11.6      13.6         8.8       8.1
  (1964)
France            13.9      14.1         11.9      10.6
  (1955)
Germany           16.5      16.6         16.0      10.1
  (1970)
Italy             17.8      20.1         18.3      19.2
  (1960)
Japan             14.9      14.0         14.1      8.2
  (1960)
Spain             17.7      17.1         16.8      14.3
  (1961)
United Kingdom    9.7       12.4         6.7       10.7
  (1955)
United States     12.1      12.6         11.6      7.4
  (1955)
R2                79%       79%          74%       77%
RMSE, Linear      5.6       5.6          6.1       5.8
RMSE, Poisson     4.3       6.2          4.9       6.6

Note: Predicted paygo tax rates, generated by Monte-Carlo simulations with 1000 replications, assuming a relative risk aversion coefficient of 12 in columns 1-2 and 5 in columns 3-4. The sample period goes from the year indicated under each country to 2002. The estimates in columns 1 and 3 are the smaller-sample counterparts of the predicted, rational rates presented in Table 4.E.4. R2 is the R-squared of the linear regression of effective tax rates on predicted rational paygo tax rates. The last two rows report the Root Mean Square Error in the linear regression (Eq. (4.18)) and in the Poisson regression (Eq. (4.19)). Source: Authors' calculations.

Table 4.F.3: Predicted Paygo Tax Rates with and without Accounting for the Risk of a Crisis (RRA=12)

                  Paygo Rate Assuming   Paygo Tax Rate with
                  no Crisis             Possible Crisis
Australia         10.8                  11.5
France            12.4                  12.9
Germany           14.7                  15.1
Italy             16.6                  16.8
Japan             12.5                  13.0
Spain             16.7                  16.9
United Kingdom    9.2                   9.9
United States     11.3                  11.7

Note: Column 1 reports the predicted rational tax rates shown in Table 4.E.4, column 2. Crises have a binomial probability distribution and are added to the national stochastic processes summarized in Table 4.E.3. The predicted rational rates in the crisis experiment are presented in column 2. Source: Barro and Ursúa (2008) and authors' calculations.

Bibliography

Aaron, Henry, "The Social Insurance Paradox," The Canadian Journal of Economics and Political Science, 1966, 32 (3), 371-377.

Alesina, Alberto and Edward Ludwig Glaeser, Fighting Poverty in the United States and Europe: A World of Difference, Oxford UK: Oxford University Press, 2004.

Bank of England, "Three centuries of data online database," 2016.

Barro, Robert J., "Rare Disasters, Asset Prices, and Welfare Costs," American Economic Review, March 2009, 99 (1), 243-64.

_ and José F. Ursúa, "Macroeconomic Crises since 1870," Brookings Papers on Economic Activity, 2008, 1, 255-350.

Beetsma, Roel M.W.J., A. Lans Bovenberg, and Ward E. Romp, "Funded pensions and intergenerational and international risk sharing in general equilibrium," Journal of International Money and Finance, 2011, 30 (7), 1516-1534.

_ and Peter C. Schotman, "Measuring risk attitudes in a natural experiment: data from the television game show Lingo," The Economic Journal, 2001, 111 (474), 821-848.

Bohn, Henning, "Intergenerational risk sharing and fiscal policy," Journal of Monetary Economics, 2009, 56 (6), 805-816.

Bottazzi, Laura, Paolo Pesenti, and Eric van Wincoop, "Wages, profits and the international portfolio puzzle," European Economic Review, 1996, 40 (2), 219-254.

Bruno, Michael and Jeffrey D. Sachs, Economics of worldwide stagflation, Cambridge, MA: Harvard University Press, 1985.

Campbell, John Y., "Consumption-based asset pricing," in G. Constantinides, M. Harris and R. Stulz, eds., Handbook of the Economics of Finance, 2003.

Demange, Gabrielle, "On optimality in intergenerational risk sharing," Economic Theory, 2002, 20 (1), 1-27.

_ and Guy Laroque, "Social Security and Demographic Shocks," Econometrica, 1999, 67 (3), 527-542.

_ and _, "Social security with heterogeneous populations subject to demographic shocks," The Geneva Papers on Risk and Insurance Theory, 2001, 26 (1), 5-24.

DeMenil, Georges, Fabrice Murtin, and Eytan Sheshinski, "Planning for the optimal mix of paygo tax and funded savings," Journal of Pension Economics and Finance, 2006, 5 (1), 1-25.

Disney, Richard, "Are contributions to public pension programmes a tax on employment?," Economic Policy, 2004, 19 (39), 267-311.

_ and Edward Whitehouse, "Contracting-out and lifetime redistribution in the U.K. state pension system," Oxford Bulletin of Economics and Statistics, 1993, 55 (1), 25-41.

Dutta, Jayasri, Sandeep Kapur, and J. Michael Orszag, "A portfolio approach to the optimal funding of pensions," Economics Letters, 2000, 69 (2), 201-206.

Eisenhauer, Joseph G. and Luigi Ventura, "Survey measures of risk aversion and prudence," Applied Economics, 2003, 35 (13), 1477-1484.

Friend, Irwin and Marshall E. Blume, "The demand for risky assets," The American Economic Review, 1975, 65 (5), 900-922.

Fuster, Luisa, Ayşe İmrohoroğlu, and Selahattin İmrohoroğlu, "Elimination of social security in a dynastic framework," The Review of Economic Studies, 2007, 74 (1), 113-145.

Galasso, Vincenzo and Paola Profeta, "The political economy of social security: a survey," European Journal of Political Economy, 2002, 18 (1), 1-29.

Gale, Douglas, "The efficient design of public debt," in Mario Draghi and Rudiger Dornbusch, eds., Public debt management: theory and history, Cambridge, England: Cambridge University Press, 1990.

Global Financial Data, "Total return series," http://www.globalfindata.com.

Gollier, Christian, "Intergenerational risk-sharing and risk-taking of a pension fund," Journal of Public Economics, 2008, 92 (5), 1463-1485.

Gordon, Roger H. and Hal R. Varian, "Intergenerational risk sharing," Journal of Public Economics, 1988, 37 (2), 185-202.

Gottardi, Piero and Felix Kubler, "Social security and risk sharing," Journal of Economic Theory, 2011, 146 (3), 1078-1106.

Guiso, Luigi and Monica Paiella, "Risk aversion, wealth, and background risk," Journal of the European Economic Association, 2008, 6 (6), 1109-1150.

IMF, "International Financial Statistics," http://ifs.apdi.net/imf/about.asp 2004.

Krueger, Dirk and Felix Kubler, "Pareto-Improving Social Security Reform when Financial Markets are Incomplete?," American Economic Review, June 2006, 96 (3), 737-755.

Matsen, Egil and Oystein Thogersen, "Designing social security - a portfolio choice approach," European Economic Review, 2004, 48 (4), 883-904.

Merton, Robert C., "On the role of social security as a means for efficient risk sharing in an economy where human capital is not tradable," in Zvi Bodie and John Shoven, eds., Financial aspects of the United States pension system, Chicago, Illinois: University of Chicago Press, 1983.

Mitchell, Brian, International Historical Statistics: Europe 1750-2000, Hampshire, New York: Palgrave MacMillan Reference Ltd, 2003.

_, International Historical Statistics: Africa, Asia and Oceania 1750-2005, Hampshire, New York: Palgrave MacMillan Reference Ltd, 2005.

Nishiyama, Shinichi and Kent Smetters, "Does social security privatization produce efficiency gains?," The Quarterly Journal of Economics, 2007, 122, 1677-1719.

OECD, Pensions at a Glance 2003.

_, "Unpublished projections of labor force and participation rates by country," 2013. Online; accessed June 2013.

_, "Online database," http://stats.oecd.org/Index.aspx?lang=fr 2016.

Pestieau, Pierre, "The political economy of redistributive social security," 1999. Working Paper of the IMF, WP/99/180.

_, Gregory Ponthiere, and Motohiro Sato, "Longevity, health spending, and pay-as-you-go pensions," FinanzArchiv: Public Finance Analysis, 2008, 64 (1), 1-18.

Thogersen, Oystein, "A note on intergenerational risk sharing and the design of pay-as-you-go pension programs," Journal of Population Economics, 1998, 11 (3), 373-378.

United Nations, World Population Prospects: The 2015 Revision, DVD Edition, Department of Economic and Social Affairs, Population Division, 2015.

van Hemert, Otto, Optimal intergenerational risk sharing, LSE Library, London, UK: London School of Economics and Political Science, 2005.

Wagener, Andreas, "Pensions as a portfolio problem: fixed contribution rates vs. fixed replacement rates reconsidered," Journal of Population Economics, 2003, 16 (1), 111-134.
